Digital Diatribes

A presentation of data on climate and other stuff

Deconstructing the HadCrut Data

Posted by The Diatribe Guy on February 10, 2009

In this post I took a look at the PDO, AMO, and ENSO data and went through the exercise of fitting a sine wave to see how the fit looked.  On all three, the fit seemed to be a good and reasonable way of estimating the general level of the curves at a given time.  Obviously, there are fluctuations about those curves, but all in all it seemed to be fairly adequate.

So, the other day I was thinking about the implications of hypothesizing the contribution of these periodic elements to the temperature data, and I figured that it may be an interesting exercise to deconstruct the HadCrut data in the same way.  Understanding that there may be multiple oscillations going on, I set out to fit multiple waves to the data to see what I could make of it.

I used HadCrut because its records date back the furthest, and also because the older data has not undergone the continual adjustments that the GISS data has.  Since HadCrut protects its process for determining the anomaly, it is not transparent to the user of the data what kind of spreading and adjustments are made to it.  However, I am inclined to place a little more trust in it than I do in GISS because, over the last 30 years, the data is more in line with the satellite data than GISS is.  (Admittedly, this is based on observation.  I haven’t done a rigorous analysis on that.)

Let me start by introducing the most recent chart that shows the overall linear trend on HadCrut over time. I didn’t do a January update, but the additional data point won’t have much of an effect.

Overall Trend

The overall trend since January 1850 has a slope of 0.0003649, which corresponds to warming of 0.4378 degrees Celsius per Century. Light blue lines are raw anomalies, and the black line is a 12-month smoothed number.
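
The conversion from the monthly slope to a per-century figure is simple arithmetic. A minimal sketch in Python (the variable names are mine for illustration, not anything from the original calculations):

```python
# Slope of the linear fit in degrees Celsius per month (value quoted in the post).
slope_per_month = 0.0003649

# There are 12 * 100 monthly steps in a century.
MONTHS_PER_CENTURY = 12 * 100

# Warming per century implied by the monthly slope.
slope_per_century = slope_per_month * MONTHS_PER_CENTURY
print(round(slope_per_century, 3))
```

The small difference from the 0.4378 quoted above is presumably due to rounding of the displayed slope.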

Here are the required parameters and procedures to get the best fit:

  1. A parameter that assigns the optimal wave length. This is done by determining a degree increment per month that gets added to the previous degree amount. That value is then converted to radians, and the sine value is calculated.
  2. A wave scale factor is used to determine amplitude of the wave.
  3. The starting wave position in January 1850
  4. Linear trend that the wave is centered on
  5. Vertical wave shift – since the anomalies over time aren’t necessarily centered about zero, and even if they were there may still be a shift needed depending on wave cycle weights, the wave needs to be shifted up or down to achieve optimal fit.
  6. Parameters 1-3 are wave specific when optimizing multiple waves, while parameters 4 and 5 are applied after the combined effect of all the waves tested is determined.
  7. To determine the best fit, I set up my calculations using the parameters above to compare against the historical HadCrut anomalies, determine the difference, and square the result. I then minimized the sum of the squares.
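
The steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not the actual calculation setup: month 0 is taken to be January 1850, the function names are hypothetical, and the minimizer itself is left out because the post does not say which tool was used. Any general-purpose minimizer could be pointed at `sum_squared_error`.

```python
import math

def wave_value(month, increment_deg, scale, start_deg):
    """One sine wave: parameters 1-3 (degree increment, scale, start position)."""
    degrees = start_deg + increment_deg * month  # degrees accumulate each month
    return scale * math.sin(math.radians(degrees))

def model(month, waves, trend, shift):
    """Sum of all waves, then the linear trend and vertical shift (parameters 4-5)."""
    combined = sum(wave_value(month, *w) for w in waves)
    return combined + trend * month + shift

def sum_squared_error(anomalies, waves, trend, shift):
    """Squared difference between each monthly anomaly and the model, summed."""
    return sum((obs - model(m, waves, trend, shift)) ** 2
               for m, obs in enumerate(anomalies))

# Usage with synthetic data (NOT HadCrut): the true parameters give zero error.
true_waves = [(0.5, 0.1, -30.0)]
fake = [model(m, true_waves, 0.0004, -0.5) for m in range(240)]
print(sum_squared_error(fake, true_waves, 0.0004, -0.5))
```

Each wave is a tuple of (degree increment per month, scale, start position in degrees), so a two-wave or three-wave fit only needs a longer `waves` list.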

The first run I did was against only one wave. Even this fit provides substantial improvement over a simple linear fit, and starts to demonstrate the fact that there are clear cycles in the data.

Single Sine Wave Fit Against HadCrut

The best-fit single sine wave along a linear trend.

While we can see visually that it is not entirely perfect, we can definitely see the general wave it’s tracing. The parameters on this wave are as follows:

  • Wave length parameter = 0.45658. This corresponds to a 788.5 month complete cycle, or 65.7 years. This fit lines up right along with the cycle lengths determined in the PDO/AMO/ENSO study.
  • Wave scale parameter = 0.13222. Basically, this means that the overall wave doesn’t deviate from the trend line by more than this value.
  • Start Wave position = -31.71316. (0 would be right on the trend line, and +/-90 would be max deviation from the trend line in either direction.)
  • Linear trend = 0.00037. This is spot on with the overall linear trend observed without the sine curve applied.
  • Vertical Wave Shift = -0.53644. This means that the start of the sine wave had to be shifted downward by this amount.
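
The cycle length follows directly from the wave length parameter, since a full cycle is 360 degrees. A small sketch (the function name is hypothetical):

```python
def cycle_length(increment_deg_per_month):
    """Return the full cycle length as (months, years) for a monthly degree increment."""
    months = 360.0 / increment_deg_per_month
    return months, months / 12.0

months, years = cycle_length(0.45658)
print(round(months, 1), round(years, 1))
```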

The sum of squared errors was 56.18. This compares to 72.58 for the linear trend line alone, which is roughly a 23% improvement.

The conclusions of this:

  • There is still a definite linear trend, but much of the fluctuation about that trend can be explained by adding a single sine wave.
  • The most recent decade or two is not satisfactorily explained by the sine wave, and the latest anomalies are above the wave. This could be consistent with the idea that something has changed (e.g. increased Carbon Dioxide has accelerated what had been a linear trend).  Alternatively, it may simply be that a single sine wave is insufficient and there are other periodic influences that need to be examined.

An interesting exercise is to extrapolate the linear trend with the single sine curve forward. Taking this to 2050 shows us the following:

Single Sine Wave Fit Against HadCrut Extrapolated

The best-fit single sine wave along a linear trend, extrapolated to 2050.


  • The graph indicates that we are at or near the peak of the single sine curve fit, and that the next 23 years will cool. 
  • There is still the linear trend line observed at just over 0.4 degrees Celsius per Century, so the anticipated trough won’t be as severe, but we will still be cooler than today for the next 30 years or so.
  • The trough of the curve occurs October 2032, where single-digit anomalies would be the norm.
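
The trough date can be cross-checked from the single-wave parameters listed earlier. Because the wave rides on a rising trend, the low point of the curve comes a little before the low point of the sine itself: it is where the slope of the wave exactly cancels the trend. The sketch below assumes month 0 is January 1850, uses the rounded parameter values from this post, and assumes a positive scale and increment; the function name is hypothetical.

```python
import math

def next_trough_month(increment_deg, scale, start_deg, trend, after_month):
    """First local minimum of scale*sin(angle) + trend*t after a given month."""
    omega = math.radians(increment_deg)      # angular speed in radians per month
    ratio = trend / (scale * omega)
    if abs(ratio) >= 1:
        return None                          # trend dominates, curve never turns down
    # Slope is zero where cos(angle) = -ratio; the minimum lies in (180, 360) degrees.
    angle_min = 360.0 - math.degrees(math.acos(-ratio))
    cycle = 360.0 / increment_deg
    t = (angle_min - start_deg) / increment_deg
    while t - cycle > after_month:           # step back to the earliest candidate
        t -= cycle
    while t <= after_month:                  # step forward past the reference month
        t += cycle
    return t

# February 2009 is month (2009 - 1850) * 12 + 1 = 1909 under the assumed indexing.
t = next_trough_month(0.45658, 0.13222, -31.71316, 0.00037, 1909)
trough_year = 1850 + t / 12.0
print(round(trough_year, 1))
```

With these rounded inputs the minimum lands in the second half of 2032, in line with the date read off the chart.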

While the single sine wave fit provides interesting information, from observing the AMO/PDO information, it seemed clear that there would likely be at least one additional periodic wave in the data. So I added an additional wave and the following chart ensued:

Double Sine Wave Fit Against HadCrut

The best-fit double sine wave along a linear trend.

The interesting thing about this fit is that the cycles of both waves are shorter than that of the single combined wave (59.1 years and 58.5 years), and the amplitude of both waves is around 1.5. The combined wave is only 0.13222 because the phases of these two waves at the start of the period are almost perfectly offset (182 degrees apart). The linear trend is still apparent, though just a shade less (0.42 degrees per century). The vertical wave shift is nearly the same as for the single wave. The overall least squares result isn’t remarkably better, at 54.118.

All in all, the waves do seem to do a pretty good job of fitting the curve. In the early 1900s, the wave seemed to ride above the curve a bit and in the last decade or two the wave rides below a bit. So, it’s not perfect. It is possible that a linear curve is not the best approximation to have the sine curve fluctuate around. I may do further tests with other alternatives, such as geometric or exponential approaches.

The extrapolated chart, though, shows a little more severity in projected cooling. (The title is wrong: it should read “Double Sin”.  I would have corrected it, except that I need to re-create the chart and don’t have time at the moment.  My apologies for the oversight.)

Double Sine Wave Fit Against HadCrut Extrapolated

The best-fit double sine wave along a linear trend, extrapolated to 2050.


  • The shifting of the wave phases over time leads to more fluctuation in the peaks and troughs, which explains why the right hand side of the chart fluctuates more than the left.  Since the amplitudes of these waves are 1.5, one can imagine a time centuries from now where there would be astonishing swings in the temperature trends over 50-year periods of time.
  • The projected trough in temperature is expected to be June 2030 according to this chart.   Average anomalies at that time would be slightly negative.   The last half of next century would warm considerably.
  • All this fluctuation occurs over a linear trend that is less than a half-degree Celsius per Century.
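
The way two large, nearly opposed waves produce a small combined wave can be sketched with a phasor sum. This is only an illustration using the rounded values quoted above (amplitudes of 1.5, cycles of 59.1 and 58.5 years, phases 182 degrees apart), not the fitted calculations themselves, and the function names are hypothetical.

```python
import math

def combined_amplitude(a1, a2, phase_diff_deg):
    """Amplitude of a1*sin(x) + a2*sin(x + phase_diff) from the phasor sum."""
    d = math.radians(phase_diff_deg)
    squared = a1 ** 2 + a2 ** 2 + 2 * a1 * a2 * math.cos(d)
    return math.sqrt(max(0.0, squared))      # guard against tiny negative round-off

def beat_period_years(p1_years, p2_years):
    """Time for two waves to drift through a full 360 degrees relative to each other."""
    return p1_years * p2_years / abs(p1_years - p2_years)

# Nearly opposed waves almost cancel; aligned waves add up to 3.0.
for diff in (180, 182, 190, 270, 360):
    print(diff, round(combined_amplitude(1.5, 1.5, diff), 3))

print(round(beat_period_years(59.1, 58.5)))
```

With these rounded inputs the combined amplitude near a 182 degree offset is a small fraction of either wave, and it is very sensitive to the exact amplitudes and phases, so it should not be expected to reproduce 0.13222 exactly. The relative drift between the two waves is slow (a full beat takes thousands of years), which is why the growth in the swings is gradual.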

I didn’t stop there. I tested three waves. However, the results that provide the best fit don’t make a lot of sense. The fit of existing data is certainly impressive, so let’s look at that first:

Triple Sine Wave Fit Against HadCrut

The best-fit triple sine wave along a linear trend.

Adding a third sine wave certainly appears to put our temperature observations spot on with the generated curve. One may feel inclined to get all puffy and declare that the case has been solved.

But looks can be deceiving. The best fit with three waves – a reduction to 45.47 – uses parameters that are not sensible. And this becomes a case study in trying to be “too accurate” with model forecasting.

When the third wave is introduced, I was able to generate a number of “best fit” scenarios all around the same value of 45. One such scenario shows that we will go into dramatic cooling by 2051, and the anomaly will be -2 at that time. The next scenario had a best fit of 46.43, with dramatically different parameters. That model shows no downturn whatever in temperature after the current period, and suggests unabated warming through 2050, where the anomaly will be around 1.

This is common when adding multiple parameters to models. In the quest to get more accurate, you actually introduce so much additional uncertainty that the range of reasonable projections becomes meaningless.

My conclusion is that the best representation using a sine wave analysis is a simpler 2-wave representation.

The conclusions are basically in line with the PDO/AMO analysis, as well. These drive the longer-term cycles about a trend. Whether it is a linear trend or something else is worth looking into. Also, is the trend related to the sun or carbon dioxide, or other things? The cycles still do not explain the overall trend, but they do help explain why the recent linear trends cannot simply be extrapolated into the future.


11 Responses to “Deconstructing the HadCrut Data”

  1. Carrick said

    the most recent decade or two is not satisfactorily explained by the sine wave, and the latest anomalies are above the wave.

    Similarly the period around 1950 is poorly explained by the model…. it appears you have large negative anomalies on the same order of magnitude as the positive ones from the most recent decade.

    (I like to plot the residuals of the fit, it makes these kinds of problems stand out.)

    Also…. if we believe that one of the drivers of warming is increased CO2 content, wouldn’t it make sense to fit to a smoothed version of that function, or that function plus a sine wave?

  2. The Diatribe Guy said

    Yes, it would. I’m not suggesting the model is entirely complete. I have a grand idea that I’ll probably never actually get to where I want to do a simultaneous correlation analysis across a number of factors, including CO2, to try and determine the relative contribution to temperature of each. I have read different correlation studies that have attempted to do this, and quite honestly I have found them quite lacking.

    Unfortunately, there’s a real time issue that I have in being able to do this.

    Back to a CO2 graph, though, an increasing exponential function doesn’t really make any sense. I know there are hypothetical positive feedbacks, but there are also negative feedbacks. And I would have to believe the negative feedbacks overwhelm the positive.

    I have used an example in the past of a house with no insulation. You heat it. Measure the temperature. Now add insulation, all around it, one inch thick at a time. At first, this insulation will have a huge impact. Eventually, though, the incremental insulation value will be negligible.

    In our planet, we already come with that first layer of insulation that has the largest impact. Adding more CO2 may well prevent heat from escaping as fast as in the past, but there reaches some maximum level where it will be negligible. And that’s if all the heat comes from the inside. Since the sun is our source, it stands to reason that more CO2 blocks/reflects more solar energy, offsetting this effect. Thus, I cannot fathom how the contribution of CO2 is anything more than a linear curve, and more likely a logarithmic curve with some horizontal asymptotic value.

    I’m not saying it isn’t worth measuring, I just don’t see how that could be a good explanation for recent anomalies rising above the curve.

    However, since I have not done the analysis, it’s conjecture at this point.

  3. […] Is it just coincidence? I suppose it could be. I consider that doubtful, however, because of the identification of waves in the HadCrut data that I […]

  4. […] over the next Century in the midst of 12 years where no warming has occurred. My own analysis here, here, and here leads me to believe that cooling is on the way. But in each of those analyses, it […]

    • Tonyb said

      Very interested in your work as I am writing an article on the Historic thermometers pre 1850 Hadcrut.

      Could you send me a private email and I will send you some details. You are the first person I have come across who has deconstructed Hadcrut. Have I seen you over at EM Smiths site on Gistemp?


  5. […] […]

  6. […] […]

  7. Phil said

    The Diatribe Guy, you have an interesting thing going here. I don’t wish to say too much, as it is something I am working on, but would you mind using your sine wave to go backwards in time to fit the temperature record as it stands for the last few hundred thousand years and posting it here? I have a feeling what you have done here, with its clear pattern, will fit almost perfectly with what I am working on.

    Would like to point out that while you show the curve still rising toward 2050, there is clearly a wave within a wave.

    Will watch this with interest 🙂

  8. Phil said

    Should have said which temp data set to use, I think it will be sufficient to use the HadCrut temp data, and to be able to show it clearly due to stretching restrictions would think the last 20,000 years would suffice, not 100-200 thousand.

    If it does fit clearly with what I am working on will work out a way to get in touch, am glad you have done this and seen the pattern within the data 🙂

  9. Fluffy Clouds (Tim L) said

    Any way to use a saw tooth wave for 1 wave and a sin/cos wave for #2???
    looking forward to your up date

  10. […] appears to be quite steep after 97/98, would be enough to make the "triple-fit sine wave" here seem even more convincing? I doubt it. Fitting a sine wave to data is terrible at the ends. I […]
