Sunday, 13 December 2009

The CRU hack




As many of you will be aware, a large number of emails from the Climate Research Unit (CRU) at the University of East Anglia were recently hacked from the university’s webmail server. (Despite some confusion generated by Anthony Watts, this has absolutely nothing to do with the Hadley Centre, which is a completely separate institution.) As people are also no doubt aware, breaking into computers and releasing private information is illegal, and regardless of how they were obtained, posting private correspondence without permission is unethical. We therefore aren’t going to post any of the emails here. We were made aware of the existence of this archive last Tuesday morning when the hackers attempted to upload it to RealClimate, and we notified CRU of their possible security breach later that day.

Nonetheless, these emails (a presumably careful selection of (possibly edited?) correspondence dating from as far back as 1996 to as recently as Nov 12) are being widely circulated, and therefore require some comment. Some of them involve people here (and the archive includes the first RealClimate email we ever sent out to colleagues) and include discussions we’ve had with the CRU folk on topics related to the surface temperature record and some paleo-related issues, mainly to ensure that posts were accurate.

Since emails are normally intended to be private, people writing them are, shall we say, somewhat freer in expressing themselves than they would be in a public statement. For instance, we are sure it comes as no shock to know that many scientists do not hold Steve McIntyre in high regard. Nor that a large group of them thought that the Soon and Baliunas (2003), Douglass et al (2008) or McClean et al (2009) papers were not very good (to say the least) and should not have been published. These sentiments have been made abundantly clear in the literature (though possibly less bluntly).

More interesting is what is not contained in the emails. There is no evidence of any worldwide conspiracy, no mention of George Soros nefariously funding climate research, no grand plan to ‘get rid of the MWP’, no admission that global warming is a hoax, no evidence of the falsifying of data, and no ‘marching orders’ from our socialist/communist/vegetarian overlords. The truly paranoid will put this down to the hackers also being in on the plot though.

Instead, there is a peek into how scientists actually interact, and the conflicts show that the community is a far cry from the monolith that is sometimes imagined. We see people working constructively to improve joint publications; scientists who are friendly and agree on many of the big picture issues, disagreeing at times about details and engaging in ‘robust’ discussions; scientists expressing frustration at the misrepresentation of their work in politicized arenas and complaining when media reports get it wrong; and scientists resenting the time they have to take out of their research to deal with over-hyped nonsense. None of this should be shocking.

It’s obvious that the noise-generating components of the blogosphere will generate a lot of noise about this, but it’s important to remember that science doesn’t work because people are polite at all times. Gravity isn’t a useful theory because Newton was a nice person. QED isn’t powerful because Feynman was respectful of other people around him. Science works because different groups go about trying to find the best approximations of the truth, and are generally very competitive about that. That the same scientists can still all agree on the wording of an IPCC chapter, for instance, is thus even more remarkable.

No doubt, instances of cherry-picked and poorly-worded “gotcha” phrases will be pulled out of context. One example is worth mentioning quickly. Phil Jones, in discussing the presentation of temperature reconstructions, stated that “I’ve just completed Mike’s Nature trick of adding in the real temps to each series for the last 20 years (ie from 1981 onwards) and from 1961 for Keith’s to hide the decline.” The paper in question is the Mann, Bradley and Hughes (1998) Nature paper on the original multiproxy temperature reconstruction, and the ‘trick’ is just to plot the instrumental records along with the reconstruction so that the context of the recent warming is clear. Scientists often use the term “trick” to refer to “a good way to deal with a problem”, rather than something that is “secret”, and so there is nothing problematic in this at all. As for the ‘decline’, it is well known that Keith Briffa’s maximum latewood tree ring density proxy diverges from the temperature records after 1960 (this is more commonly known as the “divergence problem”; see e.g. the recent discussion in this paper) and has been discussed in the literature since Briffa et al in Nature in 1998 (Nature, 391, 678-682). Those authors have always recommended not using the post-1960 part of their reconstruction, and so while ‘hiding’ is probably a poor choice of words (since it is ‘hidden’ in plain sight), not using the data in the plot is completely appropriate, as is further research to understand why this happens.

The timing of this particular episode is probably not coincidental. But if cherry-picked out-of-context phrases from stolen personal emails are the only response to the weight of the scientific evidence for the human influence on climate change, then there probably isn’t much to it.

There are of course lessons to be learned. Clearly no-one would have gone to this trouble if the academic object of study was the mating habits of European butterflies. That community’s internal discussions are probably safe from the public eye. But it is important to remember that emails do seem to exist forever, and that there is always a chance that they will be inadvertently released. Most people do not act as if this is true, but they probably should.

It is tempting to point fingers and declare that people should not have been so open with their thoughts, but who amongst us would really be happy to have all of their email made public?

Let he who is without PIN cast the first stone.

Update: The official UEA statement is as follows:

“We are aware that information from a server used for research information
in one area of the university has been made available on public websites,”
the spokesman stated.

“Because of the volume of this information we cannot currently confirm
that all of this material is genuine.”

“This information has been obtained and published without our permission
and we took immediate action to remove the server in question from
operation.”

“We are undertaking a thorough internal investigation and we have involved
the police in this enquiry.”

This is a continuation of the last thread, which is getting a little unwieldy. The emails cover a 13-year period in which many things happened, and very few people are up to speed on some of the long-buried issues. So to save some time, I’ve pulled a few bits out of the comment thread that shed light on context that is missing from some of the discussion of various emails.

  • Trenberth: You need to read his recent paper on quantifying the current changes in the Earth’s energy budget to realise why he is concerned about our inability currently to track small year-to-year variations in the radiative fluxes.
  • Wigley: The concern with sea surface temperatures in the 1940s stems from the paper by Thompson et al (2007) which identified a spurious discontinuity in ocean temperatures. The impact of this has not yet been fully corrected for in the HadSST data set, but people still want to assess what impact it might have on any work that used the original data.
  • Climate Research and peer-review: You should read about the issues from the editors (Claire Goodess, Hans von Storch) who resigned because of a breakdown of the peer review process at that journal, that came to light with the particularly egregious (and well-publicised) paper by Soon and Baliunas (2003). The publisher’s assessment is here.

Update: Pulling out some of the common points being raised in the comments.

  • HARRY_read_me.txt. This is a four-year-long work log by Ian (Harry) Harris, who was working to upgrade the documentation, metadata and databases associated with the legacy CRU TS 2.1 product, which is not the same as the HadCRUT data (see Mitchell and Jones, 2003 for details). The CRU TS 3.0 is available now (via ClimateExplorer for instance), and so presumably the database problems got fixed. Anyone who has ever worked on constructing a database from dozens of individual, sometimes contradictory and inconsistently formatted datasets will share his evident frustration with how tedious that can be.
  • “Redefine the peer-reviewed literature!”. Nobody actually gets to do that, and both papers discussed in that comment, McKitrick and Michaels (2004) and Kalnay and Cai (2003), were cited and discussed in Chapter 3 of the IPCC AR4 report. As an aside, neither has stood the test of time.
  • “Declines” in the MXD record. This decline was ‘hidden’ by being written up in Nature in 1998, where the authors suggested not using the post-1960 data. Their actual programs (in IDL script) unsurprisingly warn against using post-1960 data. Added: Note that the ‘hide the decline’ comment was made in 1999, 10 years ago, and has no connection whatsoever to more recent instrumental records.
  • CRU data accessibility. From the date of the first FOI request to CRU (in 2007), it has been made abundantly clear that the main impediment to releasing the whole CRU archive is the small percentage of it that was given to CRU on the understanding it wouldn’t be passed on to third parties. Those restrictions are in place because of the originating organisations (the various National Met. Services) around the world and are not CRU’s to break. As of Nov 13, the umpteenth FOI request for the same data met with exactly the same response. This is an unfortunate situation, and pressure should be brought to bear on the National Met Services to release CRU from that obligation. It is not, however, the fault of CRU. The vast majority of the data in the HadCRU records is publicly available from GHCN (v2.mean.Z).
  • Suggestions that FOI-related material be deleted … are ill-advised even if not carried out. What is and is not responsive and deliverable to an FOI request is however a subject that it is very appropriate to discuss.

Further update: This comment from Halldór Björnsson of the Icelandic Met. Service goes right to the heart of the accessibility issue:

Re: CRU data accessibility.

National Meteorological Services (NMSs) have different rules on data exchange. The World Meteorological Organization (WMO) organizes the exchange of “basic data”, i.e. data that are needed for weather forecasts. For details on these see WMO resolution number 40 (see http://bit.ly/8jOjX1).

This document acknowledges that WMO member states can place restrictions on the dissemination of data to third parties “for reasons such as national laws or costs of production”. These restrictions are only supposed to apply to commercial use; the research and education community is supposed to have free access to all the data.

Now, for researchers this sounds open and fine. In practice it hasn’t proved to be so.

Most NMSs can also distribute all sorts of data that are classified as “additional data and products”. Restrictions can be placed on these. These special data and products can range from regular weather data from a specific station to maps of rain intensity based on satellite and radar data. Many nations do place restrictions on such data (see link for additional data on above WMO-40 webpage for details).

The reason for restricting access is often commercial, since NMSs are often required by law to have substantial income from commercial sources. In other cases it can be for national security reasons, but in many cases (in my experience) the reasons simply seem to be “because we can”.

What has this got to do with CRU? The data that CRU needs for their database comes from entities that restrict access to much of their data. And even better, since the UK has submitted an exception for additional data, some nations that otherwise would provide data without question will not provide data to the UK. I know this from experience, since my nation (Iceland) did send in such conditions and for years I had problems getting certain data from the US.

The ideal that all data should be free and open is unfortunately not adhered to by a large portion of the meteorological community. Probably only a small portion of the CRU data is “locked”, but the end effect is that all their data becomes closed. It is not their fault, and I am sure that they dislike these restrictions as much as any other researcher who has tried to get access to all data from stations in region X in country Y.

These restrictions end up wasting resources and hurting everyone. The research community (CRU included) and the public are the victims. If you don’t like it, write to your NMSs and urge them to open all their data.

I can update this further if there is demand. Please let me know in the comments, which, as always, should be substantive, non-insulting and on topic.


Note: Sent by Miguel Madeira, about the CRU East Anglia emails, which were hacked by a Russian group(?), or at least placed on a Russian server, and which point to data falsified and tampered with by this team, which in this article tries to deny what is found in those emails.
