Climate division methodology
Published July 25, 2007
In order to ascertain which climate stations have the tendency to exhibit the same climate anomalies, we performed analyses on temperature (T), precipitation (P), and combined records. We found that the last approach, with combined time series, yielded the best-defined climate regions.
From currently available records for 17,575 COOP stations in the lower 48 states, we selected 4,324 stations with both sufficient P and T records to perform statistical analyses for water years 1979–2006. For much of the U.S., this translates into at least one station per 1,000 square miles. Some less populated regions, such as the deserts in the Interior West, have less dense spatial coverage. We used several thousand more P COOP stations of similar quality for supportive analyses. In addition, there are more than 500 SNOTEL sites in the higher elevations of the western U.S. that have sufficient P records since 1979 to be analyzed as well. However, their T records typically only start in the late 1980s and have been somewhat unreliable. To develop new experimental climate divisions (CDs) we used five steps:
Step 1. For every climate station, we computed average T and P totals for every three-month season from October 1978 to September 2006. These ‘sliding’ seasons include all three-month periods (e.g., October–December, November–January) within the 28-year record. Individual seasonal anomalies were calculated by subtracting the 28-year average for that same season. For missing data, anomaly values were set equal to zero to keep all station anomaly time series the same length.
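The Step 1 calculation can be sketched as follows. This is a minimal illustration, not the code used for the analysis: it assumes one station's monthly values are held in a NumPy array that starts in October, with NaN marking missing months. The function name `sliding_seasonal_anomalies` and the rule that a season counts as missing when any of its three months is missing are assumptions made for the example.

```python
import numpy as np

def sliding_seasonal_anomalies(monthly, agg="mean"):
    """Anomalies of overlapping three-month seasons for one station.

    monthly: 1-D sequence of monthly values starting in October (NaN = missing).
    agg: "mean" for temperature, "sum" for precipitation totals.
    """
    monthly = np.asarray(monthly, dtype=float)
    # Each row holds the three months of one sliding season.
    windows = np.lib.stride_tricks.sliding_window_view(monthly, 3)
    seasons = windows.mean(axis=1) if agg == "mean" else windows.sum(axis=1)
    # Assumption: NaN in any month makes the whole season missing.
    anomalies = np.zeros_like(seasons)
    for start in range(12):  # 12 season types: Oct-Dec, Nov-Jan, ...
        idx = np.arange(start, seasons.size, 12)
        vals = seasons[idx]
        if np.all(np.isnan(vals)):
            continue  # nothing to average; anomalies stay at zero
        long_term_mean = np.nanmean(vals)
        # Missing seasons get an anomaly of zero, as described in Step 1.
        anomalies[idx] = np.where(np.isnan(vals), 0.0, vals - long_term_mean)
    return anomalies
```

For a complete 28-year record of 336 months, this returns 334 seasonal anomalies per variable.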
Step 2. Multivariate cluster analysis is a statistical method for grouping data in a way that yields a strong degree of association between members of the same cluster and a weak degree of association between members of different clusters (http://www.statsoft.com/textbook/stcluan.html). We used it to find out which stations tended to experience climate anomalies of the same sign (i.e., above average or below average), based on correlation matrices among all stations. The two cluster analysis techniques applied here were Average Linkage and Ward’s method, both well established and superior to other methods (Wilks, 1995, pp. 419–428).
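As a rough illustration of Step 2, the sketch below runs both linkage methods with SciPy's hierarchical clustering routines. The text only states that the clustering was based on correlation matrices; the specific choices here (a correlation distance of 1 - r for average linkage, Euclidean distance on standardized series for Ward's method, and a fixed target number of clusters) are assumptions, and `cluster_stations` is a hypothetical name.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

def cluster_stations(anomalies, n_clusters):
    """Cluster stations (rows) by the similarity of their anomaly series.

    Returns two label arrays: (average-linkage labels, Ward labels).
    """
    x = np.asarray(anomalies, dtype=float)
    # Average linkage on the correlation distance 1 - r (assumed choice).
    d_corr = pdist(x, metric="correlation")
    avg_labels = fcluster(linkage(d_corr, method="average"),
                          t=n_clusters, criterion="maxclust")
    # Ward's method expects Euclidean distances. For standardized rows the
    # squared Euclidean distance equals 2 * n_times * (1 - r), so it is
    # still driven by the correlations between stations.
    z = (x - x.mean(axis=1, keepdims=True)) / x.std(axis=1, keepdims=True)
    ward_labels = fcluster(linkage(z, method="ward"),
                           t=n_clusters, criterion="maxclust")
    return avg_labels, ward_labels
```

Stations that land in the same cluster under both methods would be natural candidates for the core regions described in Step 3.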
Step 3. Results from both clustering methods were compared against each other and used to group stations with similar T and P anomalies into core regions. A large majority of these cores could be identified by simple overlapping station counts, but some less clear-cut groupings were settled by correlating the respective cluster time series against each other. After this initial classification, core time series were computed based on normalized T and P time series (produced by taking each data value, subtracting from it the long-term average, then dividing by the standard deviation) at the station level. These were used to calculate correlation coefficients between all stations and all cores.
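The normalization and station-to-core correlations in Step 3 could look like the sketch below. It assumes that a core time series is the plain average of its member stations' normalized series, which the text does not spell out, and that unclassified stations carry the label -1. All function names are hypothetical.

```python
import numpy as np

def standardize(series):
    """Subtract each row's long-term mean and divide by its standard deviation."""
    x = np.asarray(series, dtype=float)
    return (x - x.mean(axis=1, keepdims=True)) / x.std(axis=1, keepdims=True)

def core_series(z, labels):
    """Average the standardized series of the stations assigned to each core.

    labels: one integer per station; -1 marks an unclassified station
    (an assumed convention). Returns (core_ids, cores).
    """
    labels = np.asarray(labels)
    core_ids = np.unique(labels[labels >= 0])
    cores = np.vstack([z[labels == c].mean(axis=0) for c in core_ids])
    return core_ids, cores

def station_core_correlations(z, cores):
    """Pearson r between every station (rows) and every core (columns).

    z must already be standardized; the mean product of two standardized
    series is their correlation coefficient.
    """
    return z @ standardize(cores).T / z.shape[1]
```

The resulting matrix has one row per station and one column per core, which is the input needed for the add, drop, and transfer decisions in the following steps.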
Step 4. The assignment of stations to cores was refined iteratively until no changes occurred. In particular, if a station was not classified as belonging to a core but correlated highly with a nearby core, it was admitted to that core. Conversely, if a station had been classified as being inside a core but did not correlate highly with the core time series, it was removed from that core. (This was a rare event in the combined analysis suite, but more common in P analyses.) A third scenario involved the transfer of a station from one core to another if its correlation with the new core was substantially higher than with the old core.
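The iterative refinement can be sketched as a loop that recomputes core time series and reassigns stations until nothing changes. This is an illustration under several assumptions: core series are simple averages of standardized member series, "substantially higher" is expressed as a minimum gain in squared correlation, unclassified stations are labeled -1, and spatial proximity ("nearby") is not checked. The default thresholds are placeholders based on the values quoted in Step 5, and `refine_assignments` is a hypothetical name.

```python
import numpy as np

def _standardize(x):
    return (x - x.mean(axis=1, keepdims=True)) / x.std(axis=1, keepdims=True)

def refine_assignments(z, labels, add_r=0.60, drop_r=0.55,
                       min_gain=0.01, max_iter=50):
    """Add, drop, and transfer stations until the assignment is stable.

    z: standardized station series, shape (n_stations, n_times).
    labels: initial core id per station, -1 for unclassified (assumed).
    Returns (final_labels, number_of_passes).
    """
    z = np.asarray(z, dtype=float)
    labels = np.asarray(labels).copy()
    for iteration in range(1, max_iter + 1):
        core_ids = np.unique(labels[labels >= 0])
        if core_ids.size == 0:
            return labels, iteration
        cores = _standardize(
            np.vstack([z[labels == c].mean(axis=0) for c in core_ids]))
        r = z @ cores.T / z.shape[1]          # station-by-core correlations
        best = r.argmax(axis=1)
        new = labels.copy()
        for i, lab in enumerate(labels):
            best_r = r[i, best[i]]
            if lab < 0:
                if best_r >= add_r:           # admit an unclassified station
                    new[i] = core_ids[best[i]]
                continue
            own = np.searchsorted(core_ids, lab)
            own_r = r[i, own]
            gain = best_r ** 2 - own_r ** 2   # gain in explained variance
            if best[i] != own and best_r >= add_r and gain >= min_gain:
                new[i] = core_ids[best[i]]    # transfer to a better core
            elif own_r < drop_r:
                new[i] = -1                   # drop a poorly correlated station
        if np.array_equal(new, labels):
            return labels, iteration
        labels = new
    return labels, max_iter
```

The `max_iter` cap is a safeguard added for the sketch, since a simple rule set like this one is not guaranteed to converge on arbitrary data.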
Step 5. While there was some experimentation with correlation thresholds, the basic procedure always remained the same and yielded similar results. Transfers between core regions required at least a 1 percent increase in explained variance for that station, and the drop-correlation threshold had to be lower than the add-correlation threshold. The final correlation thresholds were in the 0.55–0.60 range, to allow for virtually all stations to be classified. One final check consisted of correlating all new CD time series against each other to flag regions that were extremely well correlated (r > 0.90) and thus prime candidates for mergers, as long as the resulting new division did not exceed certain size limitations.
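The final check for merger candidates amounts to scanning the correlation matrix of the division time series for pairs above the 0.90 threshold. The sketch below shows one way to do that with NumPy; `merger_candidates` is a hypothetical name, and the size limitation mentioned above is not modeled because its definition is not given.

```python
import numpy as np

def merger_candidates(division_series, r_min=0.90):
    """List pairs of divisions whose time series correlate above r_min.

    division_series: array-like of shape (n_divisions, n_times).
    Returns a list of (index_a, index_b, r) tuples with index_a < index_b.
    """
    r = np.corrcoef(np.asarray(division_series, dtype=float))
    i, j = np.triu_indices_from(r, k=1)   # each unordered pair once
    keep = r[i, j] > r_min
    return [(int(a), int(b), float(c))
            for a, b, c in zip(i[keep], j[keep], r[i, j][keep])]
```

Any flagged pair would still need to be checked against the size limits before the two divisions are merged.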
The current version of the 139 new combined core regions (i.e., new CDs) for water years 1979–2006 is shown in Figure 1c. From the pool of 4,324 COOP stations with sufficient temperature and precipitation data, the initial core map classified 3,112 stations as being within 145 initial clusters (Step 3). Using the iterative methodology described above, the remaining stations were gathered into core regions, resulting in a stable classification of all but one station by the seventh iteration in 139 final core regions (Steps 4 and 5; Figure 1c). While there was no requirement for stations within a core to be spatially adjacent to each other, it is reassuring to see that virtually all of them are indeed neighbors, even in the more diverse terrain of Wyoming, Colorado, Utah, and New Mexico.