What is a Probability Forecast?

There are two options when delivering a forecast. In the first, the forecaster gives a single value that represents their best estimate of the most likely outcome. This is known as a deterministic forecast. The forecast is unlikely to be perfectly correct, but the forecaster's objective is to select the most accurate value among the many possibilities.

The second option breaks down the possible outcomes into ranges, or bins. A probability of occurrence is assigned to each bin. This is a probability forecast. In this type of forecast we do not seek to forecast future weather conditions with a specific number. Rather, the probabilistic forecaster seeks to correctly describe the probabilities of the outcome falling in each of the bins.

It is impossible to say whether a single calibrated probability forecast was correct; taken on its own, it is never right or wrong. Forecasters can, however, gauge how correct (or reliable) probability forecasts are once we have a long enough history of similar forecasts. A reliable probability forecast predicts specific conditions with the same probability as the frequency with which they actually occur in the real world. In other words, a forecast of above normal temperature with a 60 percent probability is reliable if the temperature is above normal on about 60 percent of the occasions that a forecast with that probability is issued.
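
The reliability check described above can be sketched in a few lines of Python. This is a minimal illustration, not the World Climate Service's verification method: the function name `reliability_table`, the rounding of probabilities to the nearest 10 percent, and the sample history are all assumptions made for the example.

```python
from collections import defaultdict


def reliability_table(forecast_probs, outcomes):
    """Compare issued probabilities with how often the event actually occurred.

    forecast_probs: probabilities between 0 and 1 issued for one event,
                    for example "temperature in the above normal tercile".
    outcomes:       matching True/False values, True when the event occurred.
    Returns {issued probability (rounded to 0.1): (observed frequency, count)}.
    """
    groups = defaultdict(list)
    for prob, occurred in zip(forecast_probs, outcomes):
        # Group similar forecasts by rounding to the nearest 10 percent
        # (an assumption made for this sketch).
        groups[round(prob, 1)].append(bool(occurred))
    return {
        prob: (sum(hits) / len(hits), len(hits))
        for prob, hits in sorted(groups.items())
    }


# Hypothetical history: ten forecasts of 60 percent, six of which verified.
history_probs = [0.6] * 10
history_outcomes = [True] * 6 + [False] * 4
table = reliability_table(history_probs, history_outcomes)
```

For a reliable forecast, the observed frequency in each row sits close to the issued probability. Rows with a small count say little, which is why a long history of similar forecasts is needed.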

What is a Probability Forecast: Probability Forecast Bins

When using probability forecasts we have the freedom to choose the size of each bin. For example, if a wind turbine requires a minimum wind speed of 5 m/s to start generation, then we could create two bins: below 5 m/s, and 5 m/s and above. We could then assign a forecast probability to each of the two bins. The wind speed must fall into one bin or the other, so the two predicted probabilities must add up to 100 percent.

For a given weather variable, we know the full range of possible future outcomes is highly likely to be described by the long-term history of that variable, which is known as a climatology. As signals in long-range weather forecasting are relatively weak, the World Climate Service splits the climatology into three equally likely categories, or terciles: above normal, near normal, and below normal. An example of creating terciles in an observed temperature distribution is shown in Figure 1.

Fig. 1: Segmentation of an observed temperature probability distribution into three equal bins, or terciles. Each bin represents a 33.33 percent likelihood

A value drawn at random from the climatology would have a 33.33 percent chance of falling into any one of these three categories. The forecast contains useful information when the predicted probabilities are not evenly distributed amongst the terciles.
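
As a rough sketch of how tercile boundaries could be derived from a climatology, the example below uses Python's standard `statistics.quantiles`. The function names and the sample data are hypothetical, and the interpolation method is the library default rather than anything specified by the World Climate Service.

```python
import statistics


def tercile_boundaries(climatology):
    """Return the two values that split the data into three equally likely bins."""
    lower, upper = statistics.quantiles(climatology, n=3)
    return lower, upper


def tercile_category(value, lower, upper):
    """Name the tercile that a single value falls into."""
    if value < lower:
        return "below normal"
    if value > upper:
        return "above normal"
    return "near normal"


# Hypothetical climatology: 30 historical weekly mean temperatures.
climatology = list(range(1, 31))
lower, upper = tercile_boundaries(climatology)
categories = [tercile_category(v, lower, upper) for v in climatology]
```

By construction, each category contains about one third of the historical values.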

The World Climate Service shows data enabling the user to see the predicted distribution for every point on Earth. An example from the product is shown in Figure 2. Notice how the probabilities of the below normal, near normal, and above normal categories are not each 33 percent, yet still sum to 100 percent.

Fig. 2: The tercile probabilities at a point on a WCS probability forecast temperature map 

A forecast framed in terms of probabilities may be difficult for people to interpret as the forecast information includes many possible outcomes. However, the various probabilities do provide a convenient input into a numerical contingency model. In other words, a probability forecast allows a user to estimate the risk of different outcomes.
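
To illustrate how tercile probabilities might feed a simple contingency model, the sketch below computes a probability-weighted cost. The function name, the probabilities, and the cost figures are all invented for the example; a real model would use the user's own loss estimates.

```python
def expected_cost(tercile_probs, cost_by_tercile):
    """Probability-weighted average cost across the three terciles."""
    total = sum(tercile_probs.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("tercile probabilities must sum to 1")
    return sum(prob * cost_by_tercile[name] for name, prob in tercile_probs.items())


# Hypothetical forecast and hypothetical costs (arbitrary currency units).
forecast = {"below normal": 0.2, "near normal": 0.3, "above normal": 0.5}
costs = {"below normal": 0.0, "near normal": 100.0, "above normal": 400.0}
risk = expected_cost(forecast, costs)
```

The same calculation with the climatological probabilities (one third each) gives a baseline, so the difference between the two shows how much the forecast changes the estimated risk.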

What is a Probability Forecast: World Climate Service Probability Maps

Figure 3 shows a WCS surface temperature probability map with a week 2 lead time. The map displays only the highest tercile probability for each location, and only when that probability is greater than 40 percent. This threshold is well above the 33.33 percent expected by chance and therefore represents a strong signal. In this example we see a large red area indicating an 80-90 percent probability of above normal temperatures over the northwestern US.

Fig. 3: WCS surface temperature probability map with a week 2 lead time

It is important to remember that, despite appearances, this does not tell us precisely how warm it will be, but simply that temperature is highly likely to fall into the above normal tercile. 

The map is shaded blue over locations where below normal temperatures are the most likely tercile. The gray shades represent areas where the near normal tercile is most likely. For statistical reasons, gray shades are less likely to appear than the other colors.
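
The display rule described above (show only the most likely tercile, and only when it exceeds 40 percent) can be sketched as follows. The function name `map_shading` and the input format are assumptions made for illustration; this is not the World Climate Service's rendering code.

```python
# Colors as described in the text: red, gray, and blue.
SHADING = {"above normal": "red", "near normal": "gray", "below normal": "blue"}


def map_shading(tercile_probs, threshold=0.40):
    """Return (color, category, probability) for one location, or None if blank."""
    category, prob = max(tercile_probs.items(), key=lambda item: item[1])
    if prob > threshold:
        return SHADING[category], category, prob
    return None  # no tercile is likely enough, so the location is left unshaded


strong = map_shading({"below normal": 0.05, "near normal": 0.10, "above normal": 0.85})
weak = map_shading({"below normal": 0.35, "near normal": 0.33, "above normal": 0.32})
```

In this sketch, `strong` is shaded red and `weak` is left blank because its highest probability is below the threshold.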

As we move to long forecast lead times, the signals in long-range dynamical forecast models become weaker. As a result, there tends to be less shading on the map. Largely blank maps can be expected at longer lead times, but this focuses attention on the locations where the model does show a signal.

Fig. 4: WCS surface temperature probability map with a week 5 lead time

What is a Probability Forecast: Ensembles

Ensemble forecasting generates many forecasts simultaneously. Each forecast is an equally likely possible future outcome. The output lends itself well to probabilistic forecasting, but the forecasts must first be calibrated, a step that applies corrections based on historical forecast performance.
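
One simple way to turn ensemble output into tercile probabilities is to count the fraction of members falling in each category. The sketch below shows only this raw counting step. The function name, the member values, and the tercile boundaries are hypothetical, and the calibration step mentioned above is not included.

```python
def ensemble_tercile_probs(members, lower, upper):
    """Fraction of ensemble members in each tercile (raw, uncalibrated)."""
    n = len(members)
    below = sum(1 for value in members if value < lower)
    above = sum(1 for value in members if value > upper)
    near = n - below - above
    return {
        "below normal": below / n,
        "near normal": near / n,
        "above normal": above / n,
    }


# Hypothetical 10-member ensemble and hypothetical climatological boundaries.
members = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
probs = ensemble_tercile_probs(members, lower=2.5, upper=6.5)
```

Because every member is treated as equally likely, each one contributes the same weight to the count.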

The World Climate Service displays subseasonal ensemble forecasts from ECMWF, GEFS, CFSv2 and the JMA as probability maps. In addition, the WCS combines the output from ECMWF and the CFSv2 to create a “super-ensemble”, or multi-model ensemble (MME).

In the seasonal time frame, the World Climate Service displays calibrated long-range forecasts from the CFSv2, ECMWF, UKMO, ECCC, CMCC, DWD, JMA, and Météo-France.

The various models possess different levels of skill, and it is observed that the combination of forecasts from different models produces a higher skill than any one individual model.

Fig. 5: Skill usually increases when forecasts from different models are combined
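
The source does not describe how the multi-model forecast is built, so the following is only one plausible sketch: an equally weighted average of each model's tercile probabilities. The function name, the model labels, and the probabilities are assumptions made for illustration.

```python
def combine_models(model_probs):
    """Average tercile probabilities across models with equal weights.

    model_probs: {model name: {tercile name: probability}}
    """
    n_models = len(model_probs)
    terciles = ("below normal", "near normal", "above normal")
    return {
        name: sum(probs[name] for probs in model_probs.values()) / n_models
        for name in terciles
    }


# Hypothetical calibrated forecasts from two models at one location.
forecasts = {
    "model A": {"below normal": 0.2, "near normal": 0.3, "above normal": 0.5},
    "model B": {"below normal": 0.1, "near normal": 0.3, "above normal": 0.6},
}
combined = combine_models(forecasts)
```

Other schemes, such as weighting each model by its historical skill, are equally possible; equal weighting is used here only because it is the simplest to show.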

What is a Probability Forecast: Long-Range Forecasting

Long-range forecasts may be highly variable because the uncertainty of the forecast increases with lead time. It is impossible to precisely predict weather variables for a given time and place with lead times greater than around 14 days. It is possible, however, to predict the likelihood of various possible outcomes over longer time periods. Averaging forecasts over a week in the future reduces the noise that causes forecast uncertainty while extracting the signal that provides predictability. This is why subseasonal forecasts for weeks three to six in the future can have skill.

As a result, long-range forecasts are framed in terms of the probabilities of specific outcomes, such as below normal temperatures averaged over a week. WCS probability forecasts have been shown to be skilful surprisingly far into the future. 
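
The effect of weekly averaging on noise can be illustrated with a small simulation. It rests on a strong simplifying assumption, namely that daily errors are independent random noise around a fixed signal. Real daily weather is correlated from day to day, so the reduction in practice is smaller. All names and numbers here are invented for the example.

```python
import random
import statistics


def spread_daily_vs_weekly(signal=1.0, noise_sd=3.0, n_weeks=2000, seed=0):
    """Return (spread of daily values, spread of 7-day means) for synthetic data."""
    rng = random.Random(seed)
    # Synthetic daily values: a constant signal plus independent Gaussian noise.
    daily = [signal + rng.gauss(0.0, noise_sd) for _ in range(n_weeks * 7)]
    # Average each consecutive block of seven days.
    weekly = [statistics.fmean(daily[i:i + 7]) for i in range(0, len(daily), 7)]
    return statistics.stdev(daily), statistics.stdev(weekly)


daily_sd, weekly_sd = spread_daily_vs_weekly()
```

Under the independence assumption the spread of the weekly means is roughly the daily spread divided by the square root of seven, while the underlying signal is unchanged.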

The graphs in Figure 6 show that in lead week six (i.e., a forecast six weeks into the future), when a surface temperature tercile is predicted with a greater than 33.3 percent chance, that outcome occurs as often as 40 percent of the time (winter, WCS multi-model forecast).

Similarly, in lead week six, when a tercile is predicted with a greater than 50 percent chance, it occurs as often as 68 percent of the time (for summertime forecasts from the WCS multi-model [MME] forecast).

Fig. 6: World Climate Service skill scores for probability forecasts with a 6 week lead time

What is a Probability Forecast: Conclusion

Probability forecasts do not seek to predict an atmospheric variable precisely for a given time. Instead, probability forecasts describe the likelihood of a variable falling into a defined range over a given period of time. This means that probability forecasts are ideally suited to long-range forecasting where useful signals only appear in ranges over periods of time. 

Probability forecasts are also ideally suited to conveying the output of an ensemble forecast, which consists of a group of similar forecasts.

The skill of a probability forecast cannot be gauged in isolation; it is only when we have a significant number of similar forecasts that we can define the success or reliability of those forecasts.
