How Important Was NASA’s Change to Historical Climate Data Last Week?
Last week's revelation by Climate Audit's Steve McIntyre of a serious mistake and subsequent changes made by NASA's Goddard Institute for Space Studies in the temperature history of America has created quite a debate in the new media.
While conservative bloggers were quick to point out the hypocrisy regarding the lack of an official announcement from GISS chief James Hansen as well as the possible significance to the entire global warming debate, alarmists such as RealClimate and TNR's The Plank viewed McIntyre's discovery and GISS's alterations as less than earth-shattering.
With that in mind, McIntyre published a response at Anthony Watts' "Watts Up With That?" Saturday (Climate Audit is undergoing a server change) with his take on the issue (emphasis added throughout):
The Hansen error is far from trivial at the level of individual [weather] stations. Grand Canyon was one of the stations previously discussed at climateaudit.org in connection with Tucson urban heat island. In this case, the Hansen error was about 0.5 deg C. Some discrepancies are 1 deg C or higher.
[A]s you can see from the distribution, the impact on the majority of stations is substantially higher than 0.15 deg. For users of information regarding individual stations, the changes may be highly relevant.
GISS recognized that the error had a significant impact on individual stations and took rapid steps to revise their station data (and indeed the form of their revision seems far from ideal indicating the haste of their revision.) GISS failed to provide any explicit notice or warning on their station data webpage that the data had been changed, or an explicit notice to users who had downloaded data or graphs in the past that there had been significant changes to many U.S. series. This obligation existed regardless of any impact on world totals.
Readers should certainly be aware that this was what I specifically took issue with - the lack of disclosure that this had occurred:
GISS has emphasized recently that the U.S. constitutes only 2% of global land surface, arguing that the impact of the error is negligible on the global averagel [sic]. While this may be so for users of the GISS global average, U.S. HCN stations constitute about 50% of active (with values in 2004 or later) stations in the GISS network (as shown below). The sharp downward step in station counts after March 2006 in the right panel shows the last month in which USHCN data is presently included in the GISS system. The Hansen error affects all the USHCN stations and, to the extent that users of the GISS system are interested in individual stations, the number of affected stations is far from insignificant, regardless of the impact on global averages.
McIntyre then pointed out the hypocrisy in the lack of official reporting of these changes:
In my opinion, it would have been more appropriate for Gavin Schmidt of GISS (who was copied on the GISS correspondence to me) to ensure that a statement like this was on the caption to the U.S. temperature history on the GISS webpage, rather than after the fact at realclimate.
Obviously much of the blogosphere delight in the leader board changes is a reaction to many fevered press releases and news stories about year x being the "warmest year". For example, on Jan 7, 2007, NOAA announced that
The 2006 average annual temperature for the contiguous U.S. was the warmest on record.
This press release was widely covered as you can determine by googling "warmest year 2006 united states". Now NOAA and NASA are different organizations and NOAA, not NASA, made the above press release, but members of the public can surely be forgiven for not making fine distinctions between different alphabet soups. I think that NASA might reasonably have foreseen that the change in rankings would catch the interest of the public and, had they made a proper report on their webpage, they might have forestalled much subsequent criticism.
In addition, while Schmidt describes the changes atop the leader board as "very minor re-arrangements", many followers of the climate debate are aware of intense battles over 0.1 or 0.2 degree (consider the satellite battles.) Readers might perform a little thought experiment: suppose that Spencer and Christy had published a temperature history in which they claimed that 1934 was the warmest U.S. year on record and then it turned out that [there] had been a computer programming error opposite to the one that Hansen made, that Wentz and Mears discovered there was an error of 0.15 deg C in the Spencer and Christy results and, after fixing this error, it turned out that 2006 was the warmest year on record. Would realclimate simply describe this as a "very minor re-arrangement"?
Not a chance. In fact, this would have been announced with great enthusiasm, and likely would have been the lead report on all of the evening news programs, as well as making front-page headlines the following day:
So while the Hansen error did not have a material impact on world temperatures, it did have a very substantial impact on U.S. station data and a "significant" impact on the U.S. average. Both of these surely "matter" and both deserved formal notice from Hansen and GISS.
Yet, something that has been lost in the fight over this issue is that as a result of identifying this Y2K error by Hansen et al, McIntyre has grown more concerned about the veracity of other data being collated and disseminated by GISS, as well as the lack of transparency concerning adjustments to raw data to compensate for the heat island effect:
In the course of reviewing quality problems at various surface sites, among other things, I compared these different versions of station data, including a comparison of the Tucson weather station shown above to the Grand Canyon weather station, which is presumably less affected by urban problems. This comparison demonstrated a very odd pattern discussed here. The adjustments show that the trend in the problematic Tucson site was reduced in the course of the adjustments, but they also showed that the Grand Canyon data was also adjusted, so that, instead of the 1930s being warmer than the present as in the raw data, the 2000s were warmer than the 1930s, with a sharp increase in the 2000s.
Now some portion of the post-2000 jump in adjusted Grand Canyon values shown here is due to Hansen's Y2K error, but it only accounts for a 0.5 deg C jump after 2000 and does not explain why Grand Canyon values should have been adjusted so much. In this case, the adjustments are primarily at the USHCN stage. The USHCN station history adjustments appear particularly troublesome to me, not just here but at other sites (e.g. Orland CA). They end up making material changes to sites identified as "good" sites and my impression is that the USHCN adjustment procedures may be adjusting some of the very "best" sites (in terms of appearance and reported history) to better fit histories from sites that are clearly non-compliant with WMO standards (e.g. Marysville, Tucson). There are some real and interesting statistical issues with the USHCN station history adjustment procedure and it is ridiculous that the source code for these adjustments (and the subsequent GISS adjustments - see bottom panel) is not available/ [sic]
Adding it up, data from seemingly good weather stations is being adjusted upward for reasons that McIntyre can't explain, and Hansen and company refuse to provide the procedure and the source code so that folks like McIntyre - and policymakers - can review the methodology.
Why is all this a big secret, and why should any American citizen or politician just blindly accept data from an agency that refuses to make transparent what the station history adjustment procedure is?
If one views the above assessment as a type of limited software audit (limited by lack of access to source code and operating manuals), one can say firmly that the GISS software had not only failed to pick up and correct fictitious steps of up to 1 deg C, but that GISS actually introduced this error in the course of their programming.
According to any reasonable audit standards, one would conclude that the GISS software had failed this particular test. While GISS can (and has) patched the particular error that I reported to them, their patching hardly proves the merit of the GISS (and USHCN) adjustment procedures. These need to be carefully examined. This was a crying need prior to the identification of the Hansen error and would have been a crying need even without the Hansen error.
One practical effect of the error is that it surely becomes much harder for GISS to continue the obstruction of detailed examination of their source code and methodologies after the embarrassment of this particular incident. GISS itself has no policy against placing source code online and, indeed, a huge amount of code for their climate model is online. So it's hard to understand their present stubbornness.
Finally, McIntyre addressed how the Y2K changes might impact global data (ROW):
In the U.S., despite the criticisms being rendered at surfacestations.org, there are many rural stations that have been in existence over a relatively long period of time; while one may cavil at how NOAA and/or GISS have carried out adjustments, they have collected metadata for many stations and made a concerted effort to adjust for such metadata. On the other hand, many of the stations in China, Indonesia, Brazil and elsewhere are in urban areas (such as Shanghai or Beijing). In some of the major indexes (CRU,NOAA), there appears to be no attempt whatever to adjust for urbanization. GISS does report an effort to adjust for urbanization in some cases, but their ability to do so depends on the existence of nearby rural stations, which are not always available. Thus, ithere [sic] is a real concern that the need for urban adjustment is most severe in the very areas where adjustments are either not made or not accurately made.
In its consideration of possible urbanization and/or microsite effects, IPCC has taken the position that urban effects are negligible, relying on a very few studies (Jones et al 1990, Peterson et al 2003, Parker 2005, 2006), each of which has been discussed at length at this site. In my opinion, none of these studies can be relied on for concluding that urbanization impacts have been avoided in the ROW sites contributing to the overall history.
Moreover, Keenan's report last week cast grave doubt about the veracity of Jones et al's 1990 paper on urban effects being negligible.
In sum, though this Y2K error and the subsequent changes to America's climate history are not necessarily a smoking gun, the lack of reporting and the consistent refusal on the part of Hansen and Schmidt to share methodologies and source code surrounding statistical formulae remain a grave concern, as does how much all this impacts the global numbers.
Of course, I'm sure when Hansen and Schmidt get around to seeing how this does indeed relate to world temperatures, they'll be quick in alerting the media.
Alas, unless the changes to global data are deemed minuscule, that could be irrelevant, for with the exception of Fox News, it appears that not one major American press organization felt the revelation of GISS's Y2K error, and how it related to U.S. climate history, was at all newsworthy.