Digital Diatribes

A presentation of data on climate and other stuff

Solar Cycle Length, Sunspot Count, and Temperature – An Insurance “Pricing” Analysis

Posted by The Diatribe Guy on October 7, 2008

Being an actuary, my profession is the butt of many bad jokes. One of my “favorites” is the one about how you can tell the difference between an actuary and an accountant. Answer: Accountants look at the other person’s shoes when they are talking to them.

I’ve always considered myself atypical in a profession known for its geekdom. But, I do have to face a certain reality. I often feign memory-loss when someone asks me what I did the previous evening. That’s because I am somewhat embarrassed to say that I spent a couple hours reading over a research paper on solar cycles, or analyzing temperature anomalies. Even I have to admit that this makes me appear to be a loser. It often gets me in a little trouble at home when the wife notes that the boys need to be roughhoused with, or tomatoes need to be canned, and she could use a little help here or there. I try to point out that I’m trying to save the planet (just not in the way others claim to be) but alas, she doesn’t buy into the importance of understanding the significance of a slowing in the sun’s rotation at different latitudes.

Nonetheless, I press on. And not being a climatologist, but an actuary, I tend to look at the data and conjure up thoughts of how to process it utilizing my actuarial background. There are many ways that the data can be adjusted and analyzed. My interest as of late has been to try and determine a way to test the various elements of the solar cycle and see if there is some relationship to temperature that can be determined. And that is what I have done here.

In actuarialdom, one of the enigmatic things we do is price insurance products. A very simple illustration as to how that is done is to look at age and sex, for example. Suppose we have a large population of people. We decide to split out the ages into 10 groups. We have two groups relating to sex (if that needs explaining, then you must live in California). While it may seem apparent that you can just look at the results of the 20 individual cells defined by those two sets of groups, that is only true because of the simple example here. In reality, we usually have a large number of different rating parameters and the unique cells could literally be in the millions. So, we’ll proceed with this example as if each cell is not credible enough to analyze on its own.

The first thing you can do is look at the experience by age. If you have a base cost per policy, you can apply a rating factor to change the cost as your age adjustment. Then you can look at the experience by sex. If you multiply these two factors together, and then multiply by the base, you get a rate for each particular cell.
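As a minimal sketch of that arithmetic, the snippet below multiplies a base cost by an age factor and a sex factor to get a rate for each cell. The base cost, group names, and factor values are all made up for illustration; none of them come from an actual rating plan.

```python
# Hypothetical numbers for illustration only; none come from the post.
BASE_RATE = 500.0  # assumed base cost per policy

AGE_FACTORS = {"under_25": 1.40, "25_to_64": 1.00, "65_plus": 1.15}
SEX_FACTORS = {"F": 0.95, "M": 1.05}


def cell_rate(age_group: str, sex: str) -> float:
    """Rate for one cell: base cost times the age factor times the sex factor."""
    return BASE_RATE * AGE_FACTORS[age_group] * SEX_FACTORS[sex]


# One rate per (age group, sex) cell, rounded to cents.
rates = {
    (age, sex): round(cell_rate(age, sex), 2)
    for age in AGE_FACTORS
    for sex in SEX_FACTORS
}
```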

The problem with that, though, is that you are not accounting for cross-biases. In other words, if a disproportionate percentage of people in one age class are of a certain sex, then the results of your analysis are skewed. This influence must be eliminated (or at least mitigated to the extent possible). We do this through iterative procedures where the factors are continually adjusted and compared to the known results so that the resulting set of factors is essentially stripped of the other variables’ influences. That way, when the two factors are multiplied together, it’s a true picture of the risk presented by that cell, rather than an understated or overstated picture because of undue influence of other parameters.
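One well-known iterative procedure of this kind is the multiplicative "minimum bias" balance method. The sketch below is an illustration of that general idea, not necessarily the exact procedure used in practice or in this analysis. The 2x2 exposures and costs are invented: the data are built to be exactly multiplicative, but the exposures are deliberately lopsided so that a simple one-way look at age is skewed by the sex mix.

```python
# Hypothetical 2x2 example: two age groups (rows) and two sexes (columns).
# All numbers are invented for illustration; none come from the post.
TRUE_AGE = [1.5, 1.0]
TRUE_SEX = [1.0, 1.2]
BASE = 100.0

# Unbalanced exposures: the first age group is mostly the first sex.
exposure = [[900.0, 100.0], [100.0, 900.0]]
# Observed average cost per cell, built to be exactly multiplicative.
observed = [[BASE * a * s for s in TRUE_SEX] for a in TRUE_AGE]


def one_way_average(exposure, observed, i):
    """Exposure-weighted average cost for row i, ignoring the column mix."""
    total = sum(exposure[i])
    return sum(w * r for w, r in zip(exposure[i], observed[i])) / total


def minimum_bias(exposure, observed, iterations=200):
    """Alternately rebalance row and column factors against observed totals."""
    n_rows, n_cols = len(exposure), len(exposure[0])
    row = [1.0] * n_rows
    col = [1.0] * n_cols
    for _ in range(iterations):
        for i in range(n_rows):
            num = sum(exposure[i][j] * observed[i][j] for j in range(n_cols))
            den = sum(exposure[i][j] * col[j] for j in range(n_cols))
            row[i] = num / den
        for j in range(n_cols):
            num = sum(exposure[i][j] * observed[i][j] for i in range(n_rows))
            den = sum(exposure[i][j] * row[i] for i in range(n_rows))
            col[j] = num / den
    return row, col


naive_ratio = one_way_average(exposure, observed, 0) / one_way_average(exposure, observed, 1)
row, col = minimum_bias(exposure, observed)
fitted_ratio = row[0] / row[1]
print(f"one-way age relativity:  {naive_ratio:.3f}")
print(f"iterated age relativity: {fitted_ratio:.3f}")
```

In this made-up data the one-way comparison understates the age relativity (roughly 1.30 against a true 1.50), while the iterated factors recover it once the sex mix has been stripped out.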

Why am I talking about this? Because when I think of temperature, I kind of think of it the same way as a pricing problem in insurance. A price is determined because there is an exposure, and the exposure has certain characteristics. These characteristics add or subtract dollars to the price according to the risk they present. The better we get at identifying all the appropriate risk characteristics, the more effective we are in pricing to suit the risk.

Likewise, temperature (at least in my mind) can be thought of as being comprised of a number of elements all working in concert with each other. I decided to take a look at the solar cycles, making an assumption (surely an incorrect one) that only the sun matters with regard to temperature. Consider this an initial analysis. As time and data allow, I can incorporate measures of just about anything into the spreadsheet, including measurements of Carbon Dioxide, methane, and the number of pirates seizing Ukrainian warships. Adding factors will help refine the true impacts of each solar measure on temperature. In their absence, the factors are still appropriate for observing the general trend and relative magnitude, but there may well be changes to the factors with the introduction of other parameters.

All that said, let me outline the methodology here, in general terms. If anyone is interested in the more comprehensive details, I’d be happy to provide them:

The data used comes from this source.

Seven Parameters were defined, as follows:
1) Months since the most recent minimum
2) Months since the minimum antecedent to the most recent

Why #2? This is my way of noting the effect of longer cycles, particularly consecutive longer cycles. If length matters, then it stands to reason that it matters not only within the current cycle, but also in how consecutive cycles interplay in length.

3) Months since the most recent maximum
4) Months since the maximum antecedent to the most recent

5) Average sunspot number from the most recent 12 months
6) Average sunspot number from months 13-24 prior to current
7) Average sunspot number from months 25-36 prior to current

The purpose of 6 and 7 is to detect a lagged effect of sunspot activity.
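The seven parameters above can be sketched as a small function. This is only an illustration under stated assumptions: months are integer positions in a monthly sunspot list, the minimum and maximum dates are supplied as sorted lists of those positions, and "the most recent 12 months" is taken to include the current month. The function and argument names are hypothetical.

```python
from bisect import bisect_right
from statistics import mean


def months_since_last_two(month, events):
    """Months since the most recent event and since the one before it.

    `events` is a sorted list of month indices (assumed input format).
    """
    k = bisect_right(events, month)  # number of events at or before `month`
    if k < 2:
        raise ValueError("need at least two events at or before this month")
    return month - events[k - 1], month - events[k - 2]


def solar_parameters(month, minima, maxima, sunspots):
    """Return parameters 1-7 for one month as a tuple."""
    if month < 35:
        raise ValueError("need 36 months of sunspot history")
    p1, p2 = months_since_last_two(month, minima)  # parameters 1 and 2
    p3, p4 = months_since_last_two(month, maxima)  # parameters 3 and 4
    # Assumption: the "most recent 12 months" window includes the current month.
    p5 = mean(sunspots[month - 11 : month + 1])   # most recent 12 months
    p6 = mean(sunspots[month - 23 : month - 11])  # months 13-24 prior
    p7 = mean(sunspots[month - 35 : month - 23])  # months 25-36 prior
    return (p1, p2, p3, p4, p5, p6, p7)
```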

For each of these parameters, I selected groupings of data points to enhance credibility, while trying to get enough refinement. I tested a few different groups, and settled on 6-month increments for 1-4, and 10-count increments for 5-7.

Parameter 1 ranges from 0 months to 144 months
Parameter 2 ranges from 120 months to 288 months
Parameter 3 ranges from 0 months to 168 months
Parameter 4 ranges from 114 months to 288 months

Parameters 5-7 all range from 0 to 210.
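The groupings described above amount to rounding each value down to the start of its group. A minimal sketch follows, assuming groups are labelled by their lower edge and start at multiples of the increment; the post does not say exactly where its group boundaries fall, so treat that as an assumption.

```python
MONTH_WIDTH = 6     # increment for parameters 1-4 (months)
SUNSPOT_WIDTH = 10  # increment for parameters 5-7 (average sunspot count)


def group_lower_edge(value, width):
    """Lower edge of the group containing `value` (assumed labelling)."""
    return int(value // width) * width


months_group = group_lower_edge(112, MONTH_WIDTH)       # 112 months since minimum
sunspot_group = group_lower_edge(57.3, SUNSPOT_WIDTH)   # 12-month average of 57.3
```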

The factors are additive, rather than multiplicative. For example, if a factor is determined to be 0.20, then it is saying that it is expected to add 0.2 degrees to the reference anomaly.

The reference anomaly is the 12-month average HadCrut anomaly at the time of the reference point. HadCrut was used simply because it extends back further than other temperature measures do. I’m at the mercy of the mystical adjustments and measurement issues, but so be it. The reference point is the start of the cycle to which the parameter refers. For example, if the current month is 112 months since the most recent minimum, then the reference anomaly is the 12-month average anomaly ending 112 months ago. But the next parameter of 222 months since the minimum antecedent to the most recent minimum will have a different anomaly as a reference. Likewise, the months from the maximums will reference different anomalies. To account for this, the reference anomalies were summed and divided by 4, and then the formula determines the amount that needs to be added for each parameter to produce the expected current anomaly. Added to this are the adjustments for sunspot activity for the three different periods. These don’t require a reference period.
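The averaging of the four reference anomalies can be sketched as follows. This is an illustration only: it assumes the anomalies are held in a monthly list indexed the same way as the current month, and that each 12-month average ends at (and includes) the reference month. The function names are hypothetical.

```python
from statistics import mean


def trailing_12_month_mean(anomalies, end_month):
    """Average of the 12 monthly anomalies ending at `end_month` (inclusive)."""
    if end_month < 11:
        raise ValueError("need 12 months of history before the reference point")
    return mean(anomalies[end_month - 11 : end_month + 1])


def reference_anomaly(anomalies, month, months_since):
    """Average the reference anomalies for the four cycle-length parameters.

    `months_since` holds parameters 1-4: months since the most recent
    minimum, the prior minimum, the most recent maximum, the prior maximum.
    """
    refs = [trailing_12_month_mean(anomalies, month - m) for m in months_since]
    return sum(refs) / len(refs)
```

For the example in the text, the first two entries of `months_since` would be 112 and 222, so the first two reference anomalies are the 12-month averages ending 112 and 222 months before the current month.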

The factors for each observation are added together according to the groups its parameters fall into, and all of the factors are solved for simultaneously through an iterative process that minimizes the squared error between the expected and actual anomalies. That process produces the final sets of factors.
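The post does not say which iterative algorithm was used, so the following is only one way such a fit could be done: a simple coordinate-descent ("backfitting") loop that repeatedly resets each group's factor to the average residual left over by the other parameters, which lowers the squared error at every step. The data layout and names are hypothetical, and the target is assumed to be the current anomaly minus the reference anomaly.

```python
def fit_additive_factors(groups, targets, sweeps=100):
    """Fit one additive factor per group of each parameter by least squares.

    groups:  list of tuples, one per observation, holding the group label
             of each parameter for that observation.
    targets: list of values to explain (assumed: anomaly minus reference).
    """
    n_params = len(groups[0])
    factors = [{} for _ in range(n_params)]
    for obs in groups:
        for k, label in enumerate(obs):
            factors[k][label] = 0.0
    for _ in range(sweeps):
        for k in range(n_params):
            sums, counts = {}, {}
            for obs, y in zip(groups, targets):
                # Residual with parameter k's own factor left out.
                partial = y - sum(
                    factors[m][obs[m]] for m in range(n_params) if m != k
                )
                sums[obs[k]] = sums.get(obs[k], 0.0) + partial
                counts[obs[k]] = counts.get(obs[k], 0) + 1
            for label in sums:
                factors[k][label] = sums[label] / counts[label]
    return factors


def expected_change(factors, obs):
    """Sum of the fitted factors for one observation's groups."""
    return sum(factors[k][label] for k, label in enumerate(obs))
```

Note that in an additive model like this the individual factor sets are only pinned down up to an offsetting constant between parameters; it is the sum for each observation that the fit determines.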

In an actuarial process, you would look to see if the factors show a predictive pattern of increasing or decreasing factors, and adjust outliers. Likewise, if a particular factor showed no pattern, it would be an indication that there is no correlation. I didn’t smooth the final results, as I wanted to show the actual numbers produced by the analysis. There is a pretty clear pattern (either increasing or decreasing) for every parameter, except perhaps the last one. I’m not ready to eliminate that yet, because sometimes random noise can be mitigated by introducing other parameters that may be introducing a cross-bias.

Here are the charts that demonstrate the results of my analysis. For the purposes of these charts, “Indicated Influence” is the term I’ve chosen to show the correlation with temperature change that each parameter shows. Just as different risk characteristics in insurance may actually be proxies for other things, the same may be true here. For example, when using age for rating automobile insurance, what is really being measured is maturity, years of experience, physical limitations, etc. Likewise, sunspots may have some direct effect on weather, but may also encompass other solar phenomena that coincide with sunspot cycles. Also, as noted, this analysis does not yet look into other parameters that potentially affect temperature. So, to the extent that some of these other impacts correlate to or are somehow related to the solar parameters examined, there can be an understatement or overstatement of the influence. These limitations are noted and will be further examined at a later time. However, despite these limitations, it is expected that if there were no particular influence from solar activity, we would not see any particular correlation:

Temperature tends to decrease close to the most recent minimum, but increases with length of the current cycle, within that cycle after about 3 years have passed.

Temperature tends to decrease once the second subsequent cycle gets 240-250 months from the minimum antecedent to the most recent minimum.

Surprisingly enough, there is a negative correlation near the recent maximum, and it takes between 6 and 7 years within the max-to-max cycle to see an increase. Later on, there is a large contribution to temperature evident.

There is an offsetting negative contribution in the subsequent cycle. When two consecutive max-to-max cycles are both long, there seems to be a very significant correlation to temperature.

I was surprised to see this pattern emerge, but it appears that sunspot counts have an inverse relationship in contribution to temperature in year one. Also, the magnitude of contribution is much less than some of the length parameters.

A year removed from sunspot activity shows a direct relationship to temperature. This indicates that there is a lag in realization of the impact of a quiet or active sun.

There seems to be a lag that shows an effect for up to three years, but the 3rd year impact is a little less certain. There is a strange reversal above a 190-count average that may simply be an issue of credibility, because I’m not sure it makes much sense. Given a couple other reversals, it could simply be that the correlation is less clear, as well. In any case, the magnitude is within +/-0.1 degrees Celsius in most of the cases.

There are numerous issues with the study which need to be stated: (1) reliance on HadCrut anomaly changes; (2) definition of minimum and max as the lowest 12-month average could be investigated; (3) no other parameters are used that could influence these factors, such as Carbon Dioxide levels, Methane levels, volcanic activity, ENSO effects, etc. I understand this, and would like to expand this review to include some of these things. One major problem is the lack of good data extending back to 1850 or further. There simply becomes a credibility issue when you reduce the points of observation. (4) Are the factors additive? Could some be multiplicative? This could be reviewed and tested to see if different assumptions yield more accurate results.

In any case, I thought the exercise to be an interesting one. I hope you agree.

11 Responses to “Solar Cycle Length, Sunspot Count, and Temperature – An Insurance “Pricing” Analysis”

  1. Jeff Id said

    It is a pretty interesting result. If there weren’t a good correlation you wouldn’t see such clear patterns in your plots. Too bad you don’t have 500 years of data to play with.

  2. Diatribical Idiot said

    The sparseness of the data is the largest issue, especially as I introduce other parameters into the mix. I have PDO data back to 1900, but I’d like to find El Nino data going back further, as well as good historical Carbon Dioxide data (which admittedly I haven’t searched too hard for at this point).

  3. Jeff Id said

    Be careful with the CO2 data, the records aren’t very good. Ice records are contaminated by partial pressure gas issues which make the CO2 disperse. Also, a bunch of CO2 measurements made by reasonable instruments were thrown out so even since 1900 the data is a bit of a mess.

    It’s amazing to me that we humans think we are so smart now yet only 200 years ago we didn’t even keep temp records.

  4. Al said

    This observation is in line with Svensmark and Calder’s book “The Chilling Stars” which relates solar magnetic activity to deflection of interstellar cosmic rays. Solar ejecta, moving at approximately 1 million miles per hour, take between two and three years to reach the edge of the solar system where much of the cosmic ray deflection occurs. The cosmic rays, moving at nearly the speed of light, are deflected more or less depending on the solar wind pressure at the solar system edge. Cosmic rays are important to earth temperature because they produce charged atmospheric particles which nucleate water vapor to form clouds, just like in the cloud chambers in physics labs. More clouds mean less heat reaching the earth’s surface, less clouds the reverse. Thus, high solar magnetic activity means less cosmic rays, less clouds and warmer temperatures. Low solar magnetic activity means more cosmic rays, more clouds and cooler temperatures. Here is a link to the book on Amazon

  5. […] tu, Pluto? | Some fun stats with Sunspots and how the current activity stacks up against recent history | Solar Cycle Length, Sunspot Count, and Temperature – An Insurance “Pricing” Analysis | December 2008 Update on Global Temperature – UAH | A Look at the Atlantic Multidecadal Oscillation […]

  6. Alex Atkinson said

    Do you not think it’s worth trying this with the Central England Temperature Record? The longest standing accurate temperature record (1649-present). I was surprised at the comment above stating we only have records dating back 200 years. CET is well known! Been updated by the Met office over the past century.

    I’m currently doing my thesis on the correlation between sunspot numbers and the CET. Some interesting results, I’m going to follow your method above for a large portion of the data and see what happens. Also taken into account the effects on solar irradiation and a possible amplifier effect through cloud formation. Any other ideas would help 😀 I’ve got two weeks…

  7. Layman Lurker said

    Alex, perhaps you could post your thesis up as a guest post here and take Joe up on his offer.

  8. The Diatribe Guy said

    Alex, sounds interesting. Keep in mind the caveats of my analysis. Without isolating other contributors to temperature, you still get correlation regarding the sunspots and cycle length. But if you truly want to isolate it (or at least isolate it more) it would require that you also account for other contributors to temperature. This should include CO2 and the major ocean indices. It is possible that the ocean oscillations are somehow related to the solar cycle (or “a” solar cycle, anyway). I realize your focus is on the sun, but by introducing these other effects you can isolate their contributive effects and better isolate the solar effects.

    This has been the big to-do on my list for some time. Get all these ocean indices and other effects pooled together and run a correlation analysis. If you are to make progress, I wish you well and would be more than happy to report on it.

    I’ve also considered introducing seasonality as a parameter, to isolate any contribution by month, since axis tilt and distance may cause some predictable variations. It’s difficult to see those without pulling out the other contributing parameters, but the more things you can isolate, the more the actual impact of these things can be seen.

    One suggestion on the iterative process: The more parameters you introduce, the more of an issue credibility becomes. Let it run full out, and then find the data set that seems to make the most sense. You will likely need to make some judgmental adjustments to the data, but you can note those. Then, fix that set of parameters and re-run your iterative process. Then find the next most trusted set, make any reasonable adjustments, and fix that set, and continue to do this until you’ve balanced everything out. This will help you to produce the most reasonable set of factors.

    Good luck to you. Let us know how it goes.

  9. […] I use HadCrut, since we have readings for that back to 1850. I explain the methodology more fully here, and won’t repeat myself in this […]

  10. Interesting article. Where did you get all the information from…

    • The Diatribe Guy said

      Links are to the right of my page. I used the HadCrut data for temperature and the NOAA sunspot data for the sunspots.
