Solar Cycle Length, Sunspot Count, and Temperature – An Insurance “Pricing” Analysis
Posted by The Diatribe Guy on October 7, 2008
Being an actuary, my profession is the butt of many bad jokes. One of my “favorites” is the one about how you can tell the difference between an actuary and an accountant. Answer: Accountants look at the other person’s shoes when they are talking to them.
I’ve always considered myself atypical in a profession known for its geekdom. But, I do have to face a certain reality. I often feign memory-loss when someone asks me what I did the previous evening. That’s because I am somewhat embarrassed to say that I spent a couple hours reading over a research paper on solar cycles, or analyzing temperature anomalies. Even I have to admit that this makes me appear to be a loser. It often gets me in a little trouble at home when the wife notes that the boys need to be roughhoused with, or tomatoes need to be canned, and she could use a little help here or there. I try to point out that I’m trying to save the planet (just not in the way others claim to be) but alas, she doesn’t buy into the importance of understanding the significance of a slowing in the sun’s rotation at different latitudes.
Nonetheless, I press on. And not being a climatologist, but an actuary, I tend to look at the data and conjure up thoughts of how to process it utilizing my actuarial background. There are many ways that the data can be adjusted and analyzed. My interest as of late has been to try and determine a way to test the various elements of the solar cycle and see if there is some relationship to temperature that can be determined. And that is what I have done here.
In actuarialdom, one of the enigmatic things we do is price insurance products. A very simple illustration as to how that is done is to look at age and sex, for example. Suppose we have a large population of people. We decide to split out the ages into 10 groups. We have two groups relating to sex (if that needs explaining, then you must live in California). While it may seem apparent that you can just look at the results of the 20 individual cells defined by those two sets of groups, that is only true because of the simple example here. In reality, we usually have a large number of different rating parameters and the unique cells could literally be in the millions. So, we’ll proceed with this example as if each cell is not credible enough to analyze on its own.
The first thing you can do is look at the experience by age. If you have a base cost per policy, you can apply a rating factor to change the cost as your age adjustment. Then you can look at the experience by sex. If you multiply these two factors together, and then multiply by the base, you get a rate for each particular cell.
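To make the arithmetic concrete, here is a minimal sketch in Python of that rate calculation. The base cost, the group labels, and every factor value are made up for illustration; none of them come from real rating data.

```python
# Hypothetical rating plan: all numbers below are invented for illustration.
BASE_COST = 500.0
AGE_FACTORS = {"16-24": 1.80, "25-34": 1.20, "35-44": 1.00}
SEX_FACTORS = {"F": 0.90, "M": 1.10}


def cell_rate(age_group, sex):
    """Rate for one cell: base cost times the age factor times the sex factor."""
    return BASE_COST * AGE_FACTORS[age_group] * SEX_FACTORS[sex]


# A 16-24 male is charged 500 * 1.80 * 1.10, which is about 990.
young_male_rate = cell_rate("16-24", "M")
```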
The problem with that, though, is that you are not accounting for cross-biases. In other words, if a disproportionate percentage of people in one age class are of a certain sex, then the results of your analysis are skewed. This influence must be eliminated (or at least mitigated to the extent possible). We do this through iterative procedures where the factors are continually adjusted and compared to the known results so that the resulting set of factors is essentially stripped of the other variables’ influences. That way, when the two factors are multiplied together, it’s a true picture of the risk presented by that cell, rather than an understated or overstated picture because of undue influence of other parameters.
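One well-known version of this kind of iterative procedure is the multiplicative "minimum bias" method with balance equations. The sketch below uses it as an illustration only; it is not necessarily the exact procedure any given pricing shop uses. The function name, the 2x2 layout, and all the exposure and cost numbers are hypothetical.

```python
def minimum_bias(exposure, loss, n_iter=100):
    """Fit row and column factors so that exposure * row * col balances to the
    observed losses along every row and every column (multiplicative model)."""
    n_rows, n_cols = len(exposure), len(exposure[0])
    row = [1.0] * n_rows
    col = [1.0] * n_cols
    for _ in range(n_iter):
        # Re-solve each row factor holding the column factors fixed.
        for i in range(n_rows):
            row[i] = sum(loss[i]) / sum(
                exposure[i][j] * col[j] for j in range(n_cols)
            )
        # Re-solve each column factor holding the row factors fixed.
        for j in range(n_cols):
            col[j] = sum(loss[i][j] for i in range(n_rows)) / sum(
                exposure[i][j] * row[i] for i in range(n_rows)
            )
    return row, col


# Toy data: rows are two age groups, columns are the two sexes.
# The exposure is deliberately lopsided to create a cross-bias.
true_row = [100.0, 200.0]   # hypothetical true cost level by age group
true_col = [1.0, 1.5]       # hypothetical true relativity by sex
exposure = [[90.0, 10.0], [10.0, 90.0]]
loss = [
    [exposure[i][j] * true_row[i] * true_col[j] for j in range(2)]
    for i in range(2)
]

row, col = minimum_bias(exposure, loss)
age_relativity = row[1] / row[0]   # fitted cost of age group 2 relative to group 1
sex_relativity = col[1] / col[0]   # fitted cost of sex 2 relative to sex 1
```

In this toy data the simple one-way average costs for the two age groups are 105 and 290, a ratio of about 2.76, even though the age relativity built into the data is 2.0. The gap is the cross-bias: the second age group is mostly made up of the more expensive sex. The iterated factors are only determined up to a scale that can shift between the row and column sets, so it is the relativities (and the products) that are meaningful.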
Why am I talking about this? Because when I think of temperature, I kind of think of it the same way as a pricing problem in insurance. A price is determined because there is an exposure, and the exposure has certain characteristics. These characteristics add or subtract dollars to the price according to the risk they present. The better we get at identifying all the appropriate risk characteristics, the more effective we are in pricing to suit the risk.
Likewise, temperature (at least in my mind) can be thought of as being composed of a number of elements all working in concert with each other. I decided to take a look at the solar cycles, making an assumption (surely an incorrect one) that only the sun matters with regard to temperature. Consider this an initial analysis. As time and data allow, I can incorporate measures of just about anything into the spreadsheet, including measurements of Carbon Dioxide, methane, and the number of pirates seizing Ukrainian warships. Adding factors will help refine the true impacts of each solar measure on temperature. In their absence, the factors are still appropriate for observing the general trend and relative magnitude, but there may well be changes to the factors with the introduction of other parameters.
All that said, let me outline the methodology here, in general terms. If anyone is interested in the more comprehensive details, I’d be happy to provide them:
The data used comes from this source.
Seven Parameters were defined, as follows:
1) Months since the most recent minimum
2) Months since the minimum antecedent to the most recent
Why #2? This is my way of noting the effect of longer cycles, particularly consecutive longer cycles. If length matters, then it stands to reason that it doesn’t only matter within the current cycle, but how consecutive cycles interplay in length.
3) Months since the most recent maximum
4) Months since the maximum antecedent to the most recent
5) Average sunspot number from the most recent 12 months
6) Average sunspot number from months 13-24 prior to current
7) Average sunspot number from months 25-36 prior to current
The purpose of 6 and 7 is to detect a lag effect in sunspot activity.
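As a rough illustration of parameters 5 through 7, here is a minimal Python sketch that pulls the three 12-month averages out of a monthly sunspot series. The function name and the list layout are assumptions, and so is the choice to count the current month as month 1 of the most recent 12.

```python
def lagged_sunspot_averages(monthly_counts, t):
    """Return the average sunspot number for the most recent 12 months,
    for months 13-24 prior, and for months 25-36 prior.

    monthly_counts: monthly sunspot numbers, oldest first (hypothetical layout).
    t: index of the current month. Assumption: the current month is month 1.
    """
    if t < 35:
        raise ValueError("need at least 36 months of history")

    def window_mean(first_back, last_back):
        # Months first_back..last_back counting backwards from the current month.
        window = monthly_counts[t - last_back + 1 : t - first_back + 2]
        return sum(window) / len(window)

    return window_mean(1, 12), window_mean(13, 24), window_mean(25, 36)


# Toy series where the "sunspot number" is just the month index.
toy_series = list(range(36))
recent, lag_1_year, lag_2_years = lagged_sunspot_averages(toy_series, 35)
```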
For each of these parameters, I selected groupings of data points to enhance credibility, while trying to get enough refinement. I tested a few different groups, and settled on 6-month increments for 1-4, and 10-count increments for 5-7.
Parameter 1 ranges from 0 months to 144 months
Parameter 2 ranges from 120 months to 288 months
Parameter 3 ranges from 0 months to 168 months
Parameter 4 ranges from 114 months to 288 months
Parameters 5-7 all range from 0 to 210.
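The grouping itself is simple to sketch in Python. The function names are hypothetical, and I am assuming each group is labeled by its lower bound and includes that bound but not the upper one; the exact edge handling is an assumption.

```python
def month_group(months, width=6):
    """Lower bound of the 6-month group a month count falls in (parameters 1-4)."""
    return (months // width) * width


def sunspot_group(average_count, width=10):
    """Lower bound of the 10-count group a sunspot average falls in (parameters 5-7)."""
    return int(average_count // width) * width


# 112 months since the minimum lands in the 108-113 group;
# a 12-month sunspot average of 57.3 lands in the 50-59.99 group.
example_month_group = month_group(112)
example_sunspot_group = sunspot_group(57.3)
```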
The factors are additive, rather than multiplicative. For example, if a factor is determined to be 0.20, then it is saying that it is expected to add 0.2 degrees to the reference anomaly.
The reference anomaly is the 12-month average HadCrut anomaly at the time of the reference point. HadCrut was used simply because it extends back further than other temperature measures do. I’m at the mercy of the mystical adjustments and measurement issues, but so be it. The reference point is the start of the cycle to which the parameter refers. For example, if the current month is 112 months since the most recent minimum, then the reference anomaly is the 12-month average anomaly ending 112 months ago. But the next parameter of 222 months since the minimum antecedent to the most recent minimum will have a different anomaly as a reference. Likewise, the months from the maximums will reference different anomalies. To account for this, the four reference anomalies were summed and divided by 4, and then the formula determines the amount that needs to be added for each parameter to produce the expected current anomaly. Added to this are the adjustments for sunspot activity for the three different periods. These don’t require a reference period.
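Here is a minimal Python sketch of that averaging step. The function name and the data layout are assumptions: `anom12[k]` is taken to be the 12-month average anomaly ending at month `k`, and the toy series is invented. The 112 and 222 come from the example above; the two months-since-maximum values are made up.

```python
def reference_anomaly(anom12, t, months_since):
    """Average the reference anomalies for the four cycle-length parameters.

    anom12: 12-month average anomalies, where anom12[k] ends at month k
            (hypothetical layout).
    t: index of the current month.
    months_since: months since the most recent minimum, the prior minimum,
                  the most recent maximum, and the prior maximum.
    """
    if any(t - m < 0 for m in months_since):
        raise ValueError("series does not reach back far enough")
    references = [anom12[t - m] for m in months_since]
    return sum(references) / len(references)


# Toy anomaly series that rises 0.01 per month (not real HadCrut data).
toy_anom12 = [0.01 * k for k in range(300)]
toy_reference = reference_anomaly(toy_anom12, 299, (112, 222, 50, 180))
```

The expected current anomaly is then this averaged reference plus the additive factors for the four cycle-length parameters and the three sunspot parameters.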
For each month, the factors for the applicable group of each parameter are added together to produce an expected anomaly. All of the sets of factors are then solved for at once, through a simultaneous iterative process that minimizes the squared error between the expected and actual anomalies.
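One way to carry out a fit like this is cyclic least-squares updating (often called backfitting): each parameter's factors are reset in turn to the average leftover error for each of its groups, and the passes repeat until the factors settle. This is a swapped-in technique for illustration, not necessarily what my spreadsheet does, and the function names and toy data are hypothetical.

```python
def fit_additive_factors(levels, targets, n_iter=200):
    """Fit one additive factor per group of each parameter by least squares.

    levels: one tuple per observation giving the group of each parameter.
    targets: current anomaly minus the averaged reference anomaly.
    Returns a list of dicts, one per parameter, mapping group -> factor.
    """
    n_params = len(levels[0])
    factors = [{} for _ in range(n_params)]
    for obs in levels:
        for p, group in enumerate(obs):
            factors[p][group] = 0.0
    for _ in range(n_iter):
        for p in range(n_params):
            sums, counts = {}, {}
            for obs, y in zip(levels, targets):
                # What is left after the other parameters' factors are applied.
                others = sum(
                    factors[q][obs[q]] for q in range(n_params) if q != p
                )
                sums[obs[p]] = sums.get(obs[p], 0.0) + (y - others)
                counts[obs[p]] = counts.get(obs[p], 0) + 1
            # The squared-error-minimizing factor is the mean leftover per group.
            for group in sums:
                factors[p][group] = sums[group] / counts[group]
    return factors


def predict(factors, obs):
    """Expected change from the reference anomaly for one observation."""
    return sum(factors[p][group] for p, group in enumerate(obs))


# Toy data built from made-up additive factors, with lopsided group counts.
true_a = {0: 0.10, 6: -0.20}
true_b = {10: 0.05, 20: 0.30}
toy_levels = [(0, 10), (0, 10), (0, 20), (6, 10), (6, 20), (6, 20), (6, 20)]
toy_targets = [true_a[a] + true_b[b] for a, b in toy_levels]
fitted = fit_additive_factors(toy_levels, toy_targets)
```

With purely additive factors a constant can shift from one parameter's set to another without changing any prediction, so it is the sums and the differences between groups within a parameter that carry the information.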
In an actuarial process, you would look to see if the factors show a predictive pattern of increasing or decreasing factors, and adjust outliers. Likewise, if a particular factor showed no pattern, it would be an indication that there is no correlation. I didn’t smooth the final results, as I wanted to show the actual numbers produced by the analysis. There is a pretty clear pattern (either increasing or decreasing) for every parameter, except perhaps the last one. I’m not ready to eliminate that yet, because sometimes random noise can be mitigated by introducing other parameters that may be introducing a cross-bias.
Here are the charts that demonstrate the results of my analysis. For the purposes of these charts, “Indicated Influence” is the term I’ve chosen to show the correlation with temperature change that each parameter shows. Just as different risk characteristics in insurance may actually be proxies for other things, the same may be true here. For example, when using age for rating automobile insurance, what is really being measured is maturity, years of experience, physical limitations, etc. Likewise, sunspots may have some direct effect on weather, but may also encompass other solar phenomena that coincide with sunspot cycles. Also, as noted, this analysis does not yet look into other parameters that potentially affect temperature. So, to the extent that some of these other impacts correlate to or are somehow related to the solar parameters examined, there can be an understatement or overstatement of the influence. These limitations are noted and will be further examined at a later time. However, despite these limitations, it is expected that if there were no particular influence from solar activity, we would not see any particular correlation:
Surprisingly enough, there is a negative correlation near the recent maximum, and it takes between 6 and 7 years within the max-to-max cycle to see an increase. Later on, there is a large contribution to temperature evident.
I was surprised to see this pattern emerge, but it appears that sunspot counts have an inverse relationship in contribution to temperature in year one. Also, the magnitude of contribution is much less than some of the length parameters.
There seems to be a lag that shows an effect for up to three years, but the 3rd year impact is a little less certain. There is a strange reversal above a 190 count average that may simply be an issue of credibility, because I’m not sure it makes much sense. Given a couple other reversals, it could simply be that the correlation is less clear, as well. In any case, the magnitude is between +/-0.1 degrees Celsius in most of the cases.
There are numerous issues with the study which need to be stated: (1) reliance on HadCrut anomaly changes; (2) the definition of minimum and maximum by 12-month averages could be investigated; (3) no other parameters are used that could influence these factors, such as Carbon Dioxide levels, Methane levels, volcanic activity, ENSO effects, etc. I understand this, and would like to expand this review to include some of these things. One major problem is the lack of good data extending back to 1850 or further. A credibility issue simply arises when you reduce the points of observation. (4) Are the factors additive? Could some be multiplicative? This could be reviewed and tested to see if different assumptions yield more accurate results.
In any case, I thought the exercise to be an interesting one. I hope you agree.