We demonstrate how a probabilistic population forecast can be evaluated once observations for the predicted variables become available. Statisticians have developed various scoring rules for this purpose, but there are hardly any applications in the population forecasting literature. A scoring rule measures the distance between the probability distribution of the predicted variable and the actual outcome. We use scoring rules that reward accuracy (the outcome is close to the expected value of the prediction) and sharpness (the predictive distribution has low variance, which makes it difficult to hit the target).
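To make the trade-off between accuracy and sharpness concrete, the following minimal sketch computes the interval score, one widely used scoring rule for prediction intervals. Which rules are actually applied is not specified here, so the choice of the interval score is an assumption, and the function name `interval_score` and all numbers are hypothetical illustrations rather than forecast results.

```python
def interval_score(lower, upper, observed, alpha):
    """Interval score for a central (1 - alpha) prediction interval.

    Lower scores are better. Assumed here for illustration only; it is
    not necessarily the scoring rule used in the evaluation.
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    width = upper - lower                # sharpness: narrow intervals score better
    below = max(lower - observed, 0.0)   # size of a miss below the interval
    above = max(observed - upper, 0.0)   # size of a miss above the interval
    return width + (2.0 / alpha) * (below + above)


# Hypothetical 80% intervals for population size (millions); not real data.
narrow_hit = interval_score(16.0, 17.0, 16.5, alpha=0.2)   # sharp, covers outcome
wide_hit = interval_score(15.0, 18.0, 16.5, alpha=0.2)     # covers, but less sharp
narrow_miss = interval_score(16.0, 17.0, 17.5, alpha=0.2)  # sharp, misses outcome
```

In this sketch the narrow interval that covers the outcome receives the best (lowest) score, the wide interval is penalized for its width, and the narrow interval that misses is penalized most, in proportion to the size of the miss.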
We evaluate probabilistic population forecasts for France, the Netherlands, and Norway. For all three countries, we use results from the UPE-project ("Uncertain Population of Europe"). We inspect prediction intervals for population size in the period 2004-2019 and 3000 sample paths for population pyramids for the year 2010. For the Netherlands and for Norway, we compare the UPE-results with findings from the official probabilistic population forecast by Statistics Netherlands (2001-2019) and from a probabilistic forecast for Norway (1997-2019). All forecasts were computed using the cohort-component method and stochastically varying parameters for fertility, mortality and migration.
We show that the UPE-forecasts for the Netherlands and for Norway performed better than the other forecasts for these two countries. An error in the jump-off population caused a poor score for the French forecast.
We evaluate the 3000 UPE-simulations of the age and sex composition predicted for the year 2010. When normalized for population numbers in each age-sex category, the predictions for the Netherlands received the best scores, except for the oldest old. The age pattern for the Norwegian score reflects the under-prediction of immigration after the enlargement of the European Union in 2004.
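As an illustration of how a set of simulated values for one age-sex category could be scored and normalized, the sketch below estimates the continuous ranked probability score (CRPS) from a sample and divides it by the observed count. Both the use of the CRPS and the normalization by the observed count are assumptions made for this example; the function names and the numbers are hypothetical.

```python
def crps_from_samples(samples, observed):
    """Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'|. Lower is better."""
    xs = sorted(samples)
    n = len(xs)
    if n == 0:
        raise ValueError("at least one sample is required")
    # Accuracy term: mean absolute distance between samples and the outcome.
    mean_abs_error = sum(abs(x - observed) for x in xs) / n
    # Spread term: mean absolute difference over all ordered sample pairs,
    # computed from the order statistics instead of an O(n^2) double loop.
    spread = 2.0 * sum((2 * k - n - 1) * x for k, x in enumerate(xs, start=1)) / (n * n)
    return mean_abs_error - 0.5 * spread


def normalized_crps(samples, observed):
    """CRPS divided by the observed count (assumed normalization)."""
    if observed <= 0:
        raise ValueError("observed count must be positive")
    return crps_from_samples(samples, observed) / observed


# Hypothetical simulated counts (thousands) for one age-sex category.
simulated = [98.0, 101.0, 103.0, 105.0, 110.0]
score = normalized_crps(simulated, observed=104.0)
```

Dividing by the observed count makes scores comparable across categories of very different size, such as young adults and the oldest old.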