Historical Water Supply Outlooks


Official US government seasonal streamflow forecasts (called water supply outlooks) are issued jointly by the National Weather Service (NWS) and the US Department of Agriculture (USDA) Natural Resources Conservation Service (NRCS). In Arizona, the Salt River Project (the water and energy provider for the Phoenix metropolitan area) also participates. The period for which forecasts are made has varied over time, but is now generally April–July in the Upper Colorado River Basin and January–May in the Lower Colorado River Basin.

This project aims to establish a quantitative baseline of forecast performance by conducting the most comprehensive evaluation of official historical forecasts ever attempted for the Colorado River Basin.


We compared forecasts with “observed” values at 55 streamflow locations in the Colorado River Basin. The forecast flows represent the volume of water that would have occurred in the absence of diversions or regulation (such as by dams, irrigation, or municipal use). For a like-for-like comparison, the measured flows were combined with estimates of withdrawals and diversions to reconstruct what the natural flows at each site would have been in the absence of these activities; these reconstructed values are the “observed” flows.

At each site, the observed flows were used to determine three flow levels for each forecast period: low flows (the lowest 30 percent of observations), moderate flows (the middle 40 percent), and high flows (the highest 30 percent of observations).
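
The three-way split described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's own procedure: the function names `flow_thresholds` and `categorize` are hypothetical, the sample volumes are made up, and the treatment of values that fall exactly on a cut point is an assumption.

```python
from statistics import quantiles


def flow_thresholds(observed):
    """Return the 30th and 70th percentile cut points of the observed flows."""
    # quantiles(..., n=10) returns nine cut points: 10%, 20%, ..., 90%.
    deciles = quantiles(observed, n=10)
    return deciles[2], deciles[6]


def categorize(flow, low_cut, high_cut):
    """Label a flow as low, moderate, or high relative to the cut points."""
    # Assumption: values exactly on a cut point go to the outer category.
    if flow <= low_cut:
        return "low"
    if flow >= high_cut:
        return "high"
    return "moderate"


# Made-up seasonal volumes for one site (arbitrary units), for illustration only.
observed = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
low_cut, high_cut = flow_thresholds(observed)
categories = [categorize(v, low_cut, high_cut) for v in observed]
```

With ten evenly spaced sample values, three fall in the low category, four in the moderate category, and three in the high category, matching the 30/40/30 split.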

Several different statistical measures were used to examine the quality of the historical forecasts. These included basic measures, such as root-mean-square error and correlation; categorical measures, such as the probability of detection and false alarm rate; probabilistic measures, such as Brier scores and the ranked probability score; and distributive statistics, such as reliability and discrimination.
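
As a rough guide to what some of these measures compute, the sketch below gives textbook forms of root-mean-square error, Pearson correlation, and the Brier score for a yes/no event. The function names are hypothetical and the exact formulations used in the study are not stated here, so treat this as an illustration under those assumptions.

```python
import math


def rmse(forecast, observed):
    """Root-mean-square error between paired forecast and observed volumes."""
    squared = [(f - o) ** 2 for f, o in zip(forecast, observed)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_correlation(forecast, observed):
    """Pearson correlation coefficient of paired forecast and observed volumes."""
    n = len(forecast)
    mean_f = sum(forecast) / n
    mean_o = sum(observed) / n
    cov = sum((f - mean_f) * (o - mean_o) for f, o in zip(forecast, observed))
    spread_f = math.sqrt(sum((f - mean_f) ** 2 for f in forecast))
    spread_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed))
    return cov / (spread_f * spread_o)


def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes.

    Lower is better: 0 is a perfect score, and always forecasting 0.5 scores 0.25.
    """
    squared = [(p - o) ** 2 for p, o in zip(probabilities, outcomes)]
    return sum(squared) / len(squared)
```

Root-mean-square error and correlation treat each forecast as a single volume, whereas the Brier score evaluates a forecast probability that an event (for example, a low-flow season) will occur.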


The study revealed that forecasts for the majority of streams have been very conservative, tending toward average conditions. Below-average flows are often over-predicted (forecast values are too high) and above-average flows are under-predicted (forecast values are too low). This problem is most pronounced for early forecasts (e.g., January) at many locations, and diminishes with later forecasts (e.g., May).

For low and high flows there was a low false alarm rate, which means that when low or high flows are forecast, those forecasts are generally accurate and such flows do occur. However, for low and high flows there was also a low probability of detection at most sites. In other words, low and high flows actually occurred far more often than they were forecast. Moderate flows, on the other hand, had a very high probability of detection but also a very high false alarm rate, indicating that moderate flows are forecast more frequently than they actually occur.
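
The two categorical measures discussed above can be illustrated with a short Python sketch. The function name `detection_scores` is hypothetical and the six forecast/observation pairs are invented to mimic a conservative forecaster; they are not data from the study. The false alarm rate is computed here as the share of forecasts of a category that did not verify, which is the reading the paragraph above describes; other definitions exist.

```python
def detection_scores(forecast_cats, observed_cats, category):
    """Return (probability of detection, false alarm rate) for one flow category."""
    pairs = list(zip(forecast_cats, observed_cats))
    hits = sum(f == category and o == category for f, o in pairs)
    misses = sum(f != category and o == category for f, o in pairs)
    false_alarms = sum(f == category and o != category for f, o in pairs)
    # Probability of detection: share of observed events that were forecast.
    pod = hits / (hits + misses) if hits + misses else float("nan")
    # False alarm rate (assumed definition): share of forecasts that did not verify.
    far = false_alarms / (hits + false_alarms) if hits + false_alarms else float("nan")
    return pod, far


# Invented example: the forecasts lean toward "moderate" regardless of the outcome.
forecasts = ["moderate", "moderate", "low", "moderate", "high", "moderate"]
observed = ["low", "moderate", "low", "high", "high", "high"]

low_pod, low_far = detection_scores(forecasts, observed, "low")
mod_pod, mod_far = detection_scores(forecasts, observed, "moderate")
high_pod, high_far = detection_scores(forecasts, observed, "high")
```

In this invented sample, the low and high categories have no false alarms but miss half or more of the events that occurred, while the moderate category detects every moderate season at the cost of three false alarms in four forecasts. That is the same pattern the evaluation found.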

There was good discrimination between high and low flows, particularly with forecasts issued later in the year. This means that when high flows were forecast, low flows rarely occurred, and vice versa. The accuracy of forecasts tended to improve with each month, so that forecasts issued in May tended to be much more reliable than those issued in January.