Deconstructing the HadCrut Data
Posted by The Diatribe Guy on February 10, 2009
In an earlier post I took a look at the PDO, AMO, and ENSO data and went through the exercise of fitting a sine wave to see how the fit looked. On all three, the fit seemed to be a reasonable way of estimating the general level of the curves at a given time. Obviously, there are fluctuations about those curves, but all in all it seemed to be fairly adequate.
So, the other day I was thinking about the implications of hypothesizing the contribution of these periodic elements to the temperature data, and I figured that it may be an interesting exercise to deconstruct the HadCrut data in the same way. Understanding that there may be multiple oscillations going on, I set out to fit multiple waves to the data to see what I could make of it.
I used HadCrut because their records date back the furthest, and because the older data has not undergone the continual adjustments that the GISS data has. Since HadCrut protects their process in determining the anomaly, it is not transparent to the user of the data what kind of spreading and adjustments they make to it. However, I am inclined to place a little more trust in it than I do in GISS, because over the last 30 years it is more in line with the satellite data than GISS is. (Admittedly, this is based on observation. I haven’t done a rigorous analysis on that.)
Let me start by introducing the most recent chart that shows the overall linear trend on HadCrut over time. I didn’t do a January update, but the additional data point won’t have much of an effect.
Here are the required parameters and procedures to get the best fit:
- A parameter that assigns the optimal wave length. This is done by determining a degree increment per month that gets added to the previous degree amount. That value is then converted to radians, and the sine value is calculated.
- A wave scale factor is used to determine amplitude of the wave.
- The starting wave position in January 1850
- Linear trend that the wave is centered on
- Vertical wave shift – since the anomalies over time aren’t necessarily centered about zero, and even if they were there may still be a shift needed depending on wave cycle weights, the wave needs to be shifted up or down to achieve optimal fit.
- Parameters 1-3 are wave specific when optimizing multiple waves, while parameters 4 and 5 are applied after the combined effect of all the waves tested is determined.
- To determine the best fit, I set up my calculations using the parameters above to compare against the historical HadCrut anomalies, determine the difference, and square the result. I then minimized the sum of the squares.
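As a rough sketch of the calculation described in the list above, the following Python combines the five parameters into a predicted anomaly and sums the squared differences. The function names, the tuple layout for each wave, and the exact way the pieces are added together are assumptions for illustration, not the actual calculations behind the charts. The series used here is synthetic, not HadCrut data.

```python
import math

def wave_model(month, waves, trend, shift):
    """Predicted anomaly for a month index (0 = January 1850).

    waves: list of (degrees_per_month, scale, start_degrees), one per wave.
    trend: linear slope per month; shift: vertical offset for the whole curve.
    """
    total = 0.0
    for degrees_per_month, scale, start_degrees in waves:
        # Accumulate degrees month by month, then convert to radians.
        angle = start_degrees + degrees_per_month * month
        total += scale * math.sin(math.radians(angle))
    return shift + trend * month + total

def sum_of_squares(anomalies, waves, trend, shift):
    """Sum of squared differences between observations and the model."""
    return sum(
        (observed - wave_model(month, waves, trend, shift)) ** 2
        for month, observed in enumerate(anomalies)
    )

# Synthetic check: a series generated from known parameters scores zero
# at those parameters and worse when the starting position is wrong.
true_waves = [(0.45, 0.13, -30.0)]
synthetic = [wave_model(m, true_waves, 0.0004, -0.5) for m in range(240)]
exact = sum_of_squares(synthetic, true_waves, 0.0004, -0.5)
wrong_phase = sum_of_squares(synthetic, [(0.45, 0.13, 60.0)], 0.0004, -0.5)
print(exact, wrong_phase)
```

Any general-purpose minimizer could then be pointed at `sum_of_squares` to search for the parameters; which one to use is left open here.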
The first run I did was against only one wave. Even this fit provides substantial improvement over a simple linear fit, and starts to demonstrate the fact that there are clear cycles in the data.
While we can see visually that it is not entirely perfect, we can definitely see the general wave it’s tracing. The parameters on this wave are as follows:
- Wave length parameter = 0.45658. This corresponds to a 788.5 month complete cycle, or 65.7 years. This fit lines up right along with the cycle lengths determined in the PDO/AMO/ENSO study.
- Wave scale parameter = 0.13222. Basically, this means that the overall wave doesn’t deviate from the trend line by more than this value.
- Start Wave position = -31.71316. (0 would be right on the trend line, and +/-90 would be max deviation from the trend line in either direction.)
- Linear trend = 0.00037. This is spot on with the overall linear trend observed without the sine curve applied.
- Vertical Wave Shift = -0.53644. The start of the sine wave had to be shifted downward by this amount.
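The conversions behind those figures are simple enough to check by hand. This short sketch assumes the linear trend is expressed in degrees Celsius per month, which is what the per-century figure quoted later in the post implies; the variable names are just for illustration.

```python
DEGREES_PER_MONTH = 0.45658   # wave length parameter quoted above
TREND_PER_MONTH = 0.00037     # linear trend quoted above (assumed per month)

# A full cycle is 360 degrees, so the period is 360 / increment.
period_months = 360.0 / DEGREES_PER_MONTH
period_years = period_months / 12.0

# 12 months per year, 100 years per century.
trend_per_century = TREND_PER_MONTH * 12 * 100

print(round(period_months, 1), round(period_years, 1), round(trend_per_century, 2))
```

This gives a cycle of about 788.5 months (65.7 years) and a trend of about 0.44 degrees per century.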
The least squares result was 56.18. This compares to the least squares fit on the linear trend line only of 72.58. This is roughly a 23% improvement.
The conclusions of this:
- There is still a definite linear trend, but most of the fluctuation about that trend can be explained by adding a single sine wave.
- The most recent decade or two is not satisfactorily explained by the sine wave, and the latest anomalies are above the wave. This could be consistent with the idea that something has changed (e.g. increased carbon dioxide has accelerated what had been a linear trend). Alternatively, it may simply be that a single sine wave is insufficient and there are other periodic influences that need to be examined.
An interesting exercise is to extrapolate the linear trend with the single sine curve forward. Taking this to 2050 shows us the following:
- The graph indicates that we are at or near the peak of the single sine curve fit, and that the next 23 years will cool.
- There is still the linear trend line observed at just over 0.4 degrees Celsius per Century, so the anticipated trough won’t be as severe, but we will still be cooler than today for the next 30 years or so.
- The trough of the curve occurs October 2032, where single-digit anomalies would be the norm.
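The trough date can be roughly checked from the single-wave parameters quoted earlier. The sketch below assumes the curve is the vertical shift, plus the trend times the month count, plus the scaled sine, with month 0 being January 1850. That form is an assumption about how the parameters combine, and the parameters themselves are the rounded values from the post.

```python
import math

# Single-wave parameters as quoted in the post (rounded).
DEG_PER_MONTH = 0.45658
SCALE = 0.13222
START_DEG = -31.71316
TREND = 0.00037
SHIFT = -0.53644

def single_wave(month):
    """Fitted curve value for a month index (0 = January 1850)."""
    angle = math.radians(START_DEG + DEG_PER_MONTH * month)
    return SHIFT + TREND * month + SCALE * math.sin(angle)

def month_label(month):
    """Convert a month index into (year, month number 1-12)."""
    years, month_of_year = divmod(month, 12)
    return 1850 + years, month_of_year + 1

# Scan January 2009 through December 2050 for the lowest point on the curve.
first = (2009 - 1850) * 12
last = (2050 - 1850) * 12 + 11
trough = min(range(first, last + 1), key=single_wave)
trough_year, trough_month = month_label(trough)
print(trough_year, trough_month)
```

Under these assumptions the scan lands in October 2032. Because the trend keeps rising underneath the wave, the low point of the combined curve comes a few years before the low point of the sine wave on its own.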
While the single sine wave fit provides interesting information, from observing the AMO/PDO information, it seemed clear that there would likely be at least one additional periodic wave in the data. So I added an additional wave and the following chart ensued:
The interesting thing about this fit is that the cycles of both waves are shorter than that of the single combined wave (59.1 years and 58.5 years), and the amplitude of both waves is around 1.5. The amplitude of the combined wave is only 0.13222 because the phases of these two waves at the start of the period are almost perfectly offset (182 degrees apart). The linear trend is still apparent, though just a shade less (0.42 degrees per century). The vertical wave shift is nearly the same as for the single wave. Overall least squares fit isn’t remarkably better, at 54.118.
All in all, the waves do seem to do a pretty good job of fitting the curve. In the early 1900s, the wave seemed to ride above the curve a bit and in the last decade or two the wave rides below a bit. So, it’s not perfect. It is possible that a linear curve is not the best approximation to have the sine curve fluctuate around. I may do further tests with other alternatives, such as geometric or exponential approaches.
The extrapolated chart, though, shows a little more severity in projected cooling (the title is wrong; it should read “Double Sin”. I would have corrected it, except that I need to re-create it and don’t have time at the moment. My apologies for the oversight):
- The shifting of the wave phases over time leads to more fluctuation in the peaks and troughs, which explains why the right-hand side of the chart fluctuates more than the left. Since the amplitudes of these waves are 1.5, one can imagine a time centuries from now where there would be astonishing swings in the temperature trends over 50-year periods of time.
- The projected trough in temperature is expected to be June 2030 according to this chart. Average anomalies at that time would be slightly negative. The last half of next century would warm considerably.
- All this fluctuation occurs over a linear trend that is less than a half-degree Celsius per Century.
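To see why two large waves can nearly cancel now and yet swing widely later, it helps to look at the envelope of their sum. For two sines of equal amplitude A with a phase gap d, the sum has amplitude 2A|cos(d/2)|. The sketch below makes several assumptions: both amplitudes are exactly 1.5, the gap in January 1850 is exactly 182 degrees, and the shorter-period wave pulls ahead over time. The post gives only rounded figures and does not say which wave leads, so the numbers are illustrative only.

```python
import math

AMPLITUDE = 1.5            # "around 1.5" in the post; assumed equal for both waves
PERIOD_A_YEARS = 59.1
PERIOD_B_YEARS = 58.5
START_GAP_DEG = 182.0      # phase gap in January 1850

def phase_gap_degrees(years_since_1850):
    """Phase gap between the waves, assuming the shorter-period wave gains."""
    drift_per_year = 360.0 / PERIOD_B_YEARS - 360.0 / PERIOD_A_YEARS
    return START_GAP_DEG + drift_per_year * years_since_1850

def envelope(years_since_1850):
    """Amplitude of the sum of two equal-amplitude sines with this phase gap."""
    half_gap = math.radians(phase_gap_degrees(years_since_1850)) / 2.0
    return abs(2.0 * AMPLITUDE * math.cos(half_gap))

for year in (1850, 2009, 2100, 2300):
    print(year, round(envelope(year - 1850), 3))
```

With these inputs the envelope starts near 0.05 in 1850, is roughly 0.3 by 2009, and keeps growing as the gap drifts further from 180 degrees. That is the same order of magnitude as the single-wave amplitude of 0.13222, and it is the mechanism behind the larger swings on the right-hand side of the chart.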
I didn’t stop there. I tested three waves. However, the results that provide the best fit don’t make a lot of sense. The fit of existing data is certainly impressive, so let’s look at that first:
Adding a third sine wave certainly appears to put our temperature observations spot on with the generated curve. One may feel inclined to get all puffy and declare that the case has been solved.
But looks can be deceiving. The best fit with three waves – a reduction to 45.47 – uses parameters that are not sensible. And this becomes a case study in trying to be “too accurate” with model forecasting.
When the third wave is introduced, I was able to generate a number of “best fit” scenarios all around the same value of 45. One such scenario shows that we will go into dramatic cooling by 2051, and the anomaly will be -2 at that time. The next scenario had a best fit of 46.43, with dramatically different parameters. That model shows no downturn whatever in temperature after the current period, and suggests unabated warming through 2050, where the anomaly will be around 1.
This is common when adding multiple parameters to models. In the quest to get more accurate, you actually introduce so much additional uncertainty that the range of reasonable projections becomes meaningless.
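Putting the quoted sum-of-squares values side by side with the number of fitted parameters makes the trade-off easier to see. The parameter counts below are inferred from the setup described earlier (three per wave, plus the trend and the vertical shift); counting the linear-only fit as two parameters is an assumption.

```python
# (label, number of waves, sum of squares quoted in the post)
fits = [
    ("linear only", 0, 72.58),
    ("one wave", 1, 56.18),
    ("two waves", 2, 54.118),
    ("three waves", 3, 45.47),
]

def parameter_count(waves):
    """Three parameters per wave, plus linear trend and vertical shift."""
    return 3 * waves + 2

baseline = fits[0][2]
summary = []
for label, waves, sse in fits:
    reduction = 100.0 * (baseline - sse) / baseline
    summary.append((label, parameter_count(waves), round(reduction, 1)))

for row in summary:
    print(row)
```

The third wave buys a much larger drop in the sum of squares than the second, but it also takes the model to eleven free parameters, which is where the competing "best fit" scenarios start to appear.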
My conclusion is that the best representation using a sine wave analysis is a simpler 2-wave representation.
The conclusions are basically in line with the PDO/AMO analysis, as well. These drive the longer-term cycles about a trend. Whether it is a linear trend or something else is worth looking into. Also, is the trend related to the sun or carbon dioxide, or other things? The cycles still do not explain the overall trend, but they do help explain why the recent linear trends cannot simply be extrapolated into the future.