Presenting Climate Projection Ensembles as Mean and Reasonable Worst Case, with Application to EURO-CORDEX Precipitation


 Users of ensemble climate projections have choices with respect to how they interpret and apply the ensemble. A simplistic approach is to consider just the ensemble mean and ignore the individual ensemble members. A more thorough approach is to consider every ensemble member, although for complex impact models this may be unfeasible. Building on previous work in ensemble weather forecasting we explore an approach in-between these two extremes, in which the ensemble is represented by the mean and a reasonable worst case. The reasonable worst case is calculated using Directional Component Analysis (DCA), which is a simple statistical method that gives a robust estimate of worst-case for a given linear metric of impact, and which has various advantages relative to alternative definitions of worst-case. We present new mathematical results that clarify the interpretation of DCA and we illustrate DCA with an extensive set of synthetic examples. We then apply the mean and worst-case method based on DCA to EURO-CORDEX projections of future precipitation in Europe, with two different impact metrics. We conclude that the mean and worst-case method based on DCA is suitable for climate projection users who wish to explore the implications of the uncertainty around the ensemble mean without having to calculate the impacts of every ensemble member.

of northern Europe are projected to experience an increase in precipitation because of climate change, while parts of southern Europe are projected to experience a decrease in precipitation (European Environment Agency, 2017).

In this situation, a Europe-wide worst-case might consist of greater than the ensemble mean precipitation in northern Europe and less than the ensemble mean precipitation in southern Europe. Second, we present various additional mathematical properties of DCA that were not described in Jewson (2020) or SJM, that help to clarify the interpretation of DCA. Third, we discuss in detail how the linear impact function used in DCA can be considered as a linearisation of a non-linear impact function, and how the method can be generalized to quadratic and cubic approximations. Fourth, we apply DCA to high-resolution multi-model climate projections from EURO-CORDEX for two different linear impact functions.

In Sect. 2 we describe the data we will use and review the DCA methodology. In Sect. 3 we present new illustrations of DCA. In Sect. 4 we present new mathematical properties of DCA that help clarify the interpretation. We also discuss how to generalize DCA to quadratic and cubic impact functions. In Sect. 5 we illustrate the DCA methodology with results from the EURO-CORDEX climate model ensemble. In Sect. 6 we draw some conclusions. Appendices A to E include a number of related mathematical derivations.

In Sect. 3 we analyse synthetic ensemble data to illustrate the DCA method in various ways, and compare it with using just the worst member to define the worst case. The synthetic data were created by simulating from a bivariate normal distribution. The two dimensions are intended to represent changes in future annual mean precipitation values at two different but nearby locations, and each ensemble member is intended to represent output from a different climate model. We simulate ensembles of both 10 and 1000 members, and we also simulate 500 repeats of the 10-member ensemble to understand the robustness of the results to the statistical sampling involved in creating the ensembles. An ensemble of 1000 members is larger than the number of climate models in existence, but is helpful for illustrating how DCA works.
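The synthetic ensembles described above could be generated along the following lines. This is only a minimal sketch: the means, standard deviations, correlation and random seed are assumed values chosen for illustration (the text here does not state them), and the names `simulate_ensemble`, `MEAN`, `SIGMA` and `RHO` are hypothetical.

```python
import numpy as np

# Assumed (hypothetical) parameters of the bivariate normal distribution.
MEAN = np.zeros(2)                 # mean precipitation change at the two locations
SIGMA = np.array([1.0, 1.0])       # standard deviations at the two locations
RHO = 0.5                          # correlation between the two locations
COV = np.array([
    [SIGMA[0] ** 2, RHO * SIGMA[0] * SIGMA[1]],
    [RHO * SIGMA[0] * SIGMA[1], SIGMA[1] ** 2],
])

def simulate_ensemble(n_members, rng):
    """Return an (n_members, 2) array; each row is one synthetic ensemble member."""
    return rng.multivariate_normal(MEAN, COV, size=n_members)

rng = np.random.default_rng(seed=0)
small_ensemble = simulate_ensemble(10, rng)       # one 10-member ensemble
large_ensemble = simulate_ensemble(1000, rng)     # one 1000-member ensemble
# 500 independent repeats of the 10-member ensemble, shape (500, 10, 2)
repeats = np.stack([simulate_ensemble(10, rng) for _ in range(500)])
```

The 500 repeats are kept as a single three-dimensional array so that any statistic of a 10-member ensemble can be evaluated once per repeat.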

The climate model data we use in Sect. 5 are annual mean precipitation data extracted from the EURO-CORDEX ensemble projections of future climate (Jacob et al., 2014; Benestad et al., 2017; Jacob et al., 2020). We use data from 10 separate projections, simulated by 10 different combinations of global models and regional models: a list of the models used is given in Table 1. The model output is at 0.11-degree resolution (roughly 12 km). Our analysis is based on the ensemble member by ensemble member difference between RCP4.5 annual mean precipitation in the period 2011-2040 and the baseline period 1981-2010, which we refer to as precipitation changes.

[...] example of impact that can be expressed using $I = I_0 + \mathbf{r}^\top\mathbf{x}$ is the total precipitation change (change relative to the previous climate) across a precipitation field: if $\mathbf{x}$ is the spatial field of precipitation anomalies, $I_0$ is the total precipitation change of the ensemble mean and $\mathbf{r}$ is a vector of ones, then $I$ is the total precipitation change due to the ensemble mean plus the anomaly pattern $\mathbf{x}$. A more complex example might involve using the components of the vector $\mathbf{r}$ to apply weighting as a function of spatial variation in population density.
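As a rough illustration of the two steps described above (forming member-by-member precipitation changes, then evaluating a linear impact with a weight vector of ones), the following sketch uses small random placeholder arrays. The array sizes, the random values and all variable names are assumptions made for the example; no EURO-CORDEX data are used.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
n_members, n_points = 10, 6          # placeholder sizes, not the real model grid

# Placeholder 30-year mean precipitation for each member and grid point.
baseline = rng.gamma(shape=2.0, scale=1.0, size=(n_members, n_points))
future = baseline + rng.normal(0.1, 0.3, size=(n_members, n_points))

changes = future - baseline              # member-by-member precipitation changes
mean_change = changes.mean(axis=0)       # ensemble mean change at each point
anomalies = changes - mean_change        # anomalies about the ensemble mean

weights = np.ones(n_points)              # vector of ones: impact = total change
impact_of_mean = weights @ mean_change   # impact of the ensemble mean
impact_of_anomalies = anomalies @ weights            # linear impact of each member's anomaly
total_impact = impact_of_mean + impact_of_anomalies  # total impact of each member
```

Replacing `weights` with, for example, a population-density field would give a weighted impact of the kind mentioned in the text.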

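Given the above definitions, the DCA pattern is then a vector $\mathbf{x}$ with components given by the proportional relationship:

$$\mathbf{x} \propto \mathbf{C}\mathbf{r} \qquad (1)$$

where $\mathbf{C}$ is the ensemble covariance matrix and $\mathbf{r}$ is the vector of impact weights. The DCA vector can be interpreted as a spatial pattern. Since $\mathbf{C} = \mathbf{X}^\top\mathbf{X}/n$, where the $n$ rows of $\mathbf{X}$ are the ensemble member anomalies $\mathbf{x}_i$, this gives $\mathbf{x} \propto (\mathbf{X}^\top\mathbf{X}/n)\,\mathbf{r} = \mathbf{X}^\top(\mathbf{X}\mathbf{r})/n$, which can also be written as

$$\mathbf{x} \propto \frac{1}{n}\sum_{i=1}^{n} (\mathbf{r}^\top\mathbf{x}_i)\,\mathbf{x}_i \qquad (2)$$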

Equations (1) and (2) are proportional relationships and define the direction of the DCA vector (i.e., the shape of the spatial pattern described by $\mathbf{x}$), but not the magnitude of the vector (i.e., not the amplitude of the spatial pattern described by $\mathbf{x}$). To make the definition of DCA unique, Jewson (2020) defines the first DCA pattern as a unit vector. The length of the unit vector DCA pattern can then be scaled to an appropriate value, depending on the application. Subsequent DCA patterns can also be derived, to create a set of orthogonal spatial patterns and a corresponding method for matrix factorisation, although in this study we will only consider the first pattern, which we will refer to simply as the DCA pattern or DCA vector.
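The two proportional relationships and the unit-vector normalisation can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the authors' code: the function names and the toy four-member ensemble are made up for the example, and the covariance is assumed to be estimated by dividing by the number of members (the choice of divisor does not affect the direction).

```python
import numpy as np

def dca_pattern(ensemble, r):
    """Unit-length first DCA pattern for an (n_members, n_points) ensemble."""
    anomalies = ensemble - ensemble.mean(axis=0)         # remove the ensemble mean
    cov = anomalies.T @ anomalies / anomalies.shape[0]   # covariance estimate
    x = cov @ r                                          # direction: covariance times weights
    return x / np.linalg.norm(x)                         # undefined if cov @ r is zero

def dca_pattern_from_members(ensemble, r):
    """Same direction, written as a sum of members weighted by their linear impact."""
    anomalies = ensemble - ensemble.mean(axis=0)
    member_impacts = anomalies @ r                       # linear impact of each member
    x = (member_impacts[:, None] * anomalies).sum(axis=0)
    return x / np.linalg.norm(x)

# Toy 4-member, 2-location ensemble (made-up numbers, for illustration only).
ensemble = np.array([[1.0, 0.5], [-1.0, -0.5], [0.5, 1.0], [-0.5, -1.0]])
r = np.array([1.0, 1.0])                                 # equal weight on both locations
pattern = dca_pattern(ensemble, r)
pattern_from_members = dca_pattern_from_members(ensemble, r)
print(pattern, pattern_from_members)
```

Both functions return the same unit vector, which can then be rescaled to whatever amplitude the application requires.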

If the ensemble is not MVN, but is still elliptically distributed (for example, is distributed as a multivariate t distribution) then these properties may still hold: more precise details are given in Appendix A.

If the ensemble is not elliptically distributed it is useful to distinguish between what we will call smoothable and [...]

[...] we move away from the DCA pattern in any direction then either the impact or the probability density will reduce: moving into the ellipse or along the ellipse reduces impact, while moving out of the ellipse reduces probability density.

Figure 1b illustrates the same ensemble (i.e., the same ellipse) as Fig. 1a but a situation in which the definition of impact puts twice as much weight on precipitation change at the second location (the vertical axis) as on the [...] Fig. 1a, and the line of constant impact is more horizontal. This changes the DCA vector, which now has more precipitation at location 2 than location 1. Fig. 1c illustrates a different ensemble, but for the same impact vector as in Fig. 1a. This also changes the DCA vector relative to Fig. 1a. Finally, Fig. 1d illustrates a more complex situation using a different ensemble in which the impact increases as precipitation changes increase at location 2 but impact decreases as precipitation changes increase at location 1. This would occur in the situation discussed in the introduction in which at location 2 the main concern with respect to climate change and future precipitation is related to increases in precipitation (e.g., concerns about increased flooding in the UK) while at location 1 the main concern is with respect to decreases in precipitation (e.g., concerns about increased drought in some parts of Southern Europe). In this case the impact vector points to the top left corner of the diagram, and the lines of constant impact are correspondingly different, and run from the lower left to the upper right. The DCA direction is also correspondingly very different, and selects a pattern that consists of decreased precipitation at location 1 and increased precipitation at location 2, relative to the ensemble mean, as the reasonable worst-case.

[...] (1). Figure 2b shows the same ensemble, with the same ensemble members, but now with a different impact vector which puts twice as much weight on the precipitation change on the vertical axis (following Fig. 1b).

This shifts the DCA pattern. Figure 2c shows a different ensemble (now generated using a correlation of zero but unequal variances in the two directions), but with the same impact vector as Fig. 2a (following Fig. 1c). This leads to a different DCA pattern relative to Fig. 2a. Figure 2d shows a different ensemble again (generated using a

while the DCA pattern shows much lower variability (Fig. 4b). This shows that the DCA pattern is a more robust estimate of a worst-case than the worst member of the ensemble, which is because the DCA pattern is based on more information: from Eq. (1) and Eq. (2) we see that it uses information from the entire ensemble, rather than

We will illustrate these additional properties in Fig. 5 with a simple two-dimensional example. Fig. 5a shows [...]

Figure 5d shows the DCA pattern, given by a red circle, calculated using Eq. (2), without further scaling. The red circle in Fig. 5d represents the same spatial pattern as shown by the black cross in Fig. 5c (i.e., has the same direction from the origin) but with a slightly different length scaling.

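The first new property we describe is that the DCA vector maximises the ratio of the linear impact $\mathbf{r}^\top\mathbf{x}$ to the Mahalanobis distance $d(\mathbf{x})$. We write this ratio as

$$\frac{\mathbf{r}^\top\mathbf{x}}{d(\mathbf{x})} = \frac{\mathbf{r}^\top\mathbf{x}}{\sqrt{\mathbf{x}^\top\mathbf{C}^{-1}\mathbf{x}}} \qquad (3)$$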

This property of DCA emphasizes that in the MVN case DCA finds patterns that have both a high linear impact (a large value of $\mathbf{r}^\top\mathbf{x}$) and at the same time have a high probability density, which corresponds to a low value of the Mahalanobis distance $d$.

The proof that maximising Eq. (3) leads to the same expression for the DCA pattern as Eq. (1) above is given in Appendix B.
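The ratio property can also be checked numerically in two dimensions. The sketch below is not a proof: it uses an assumed covariance matrix and weight vector (arbitrary illustrative values) and compares the ratio for the DCA direction with the ratio for a dense set of other directions.

```python
import numpy as np

cov = np.array([[1.0, 0.6], [0.6, 2.0]])   # assumed covariance matrix
r = np.array([1.0, 2.0])                   # assumed impact weights
cov_inv = np.linalg.inv(cov)

def impact_to_distance_ratio(x):
    """Linear impact divided by Mahalanobis distance from the origin."""
    return (r @ x) / np.sqrt(x @ cov_inv @ x)

dca = cov @ r                              # unnormalised DCA direction
dca_ratio = impact_to_distance_ratio(dca)

# Evaluate the ratio for 3600 unit vectors spanning the whole circle.
angles = np.linspace(0.0, 2.0 * np.pi, 3600, endpoint=False)
candidates = np.column_stack([np.cos(angles), np.sin(angles)])
candidate_ratios = np.array([impact_to_distance_ratio(x) for x in candidates])

print(dca_ratio, candidate_ratios.max())
```

No candidate direction exceeds the DCA ratio, and rescaling `dca` leaves its ratio unchanged, since the length cancels between numerator and denominator.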

The ratio can also be written using the fact that $d^2 = 2\,[\ln(p_0) - \ln(p)]$, where $p$ is the probability density of $\mathbf{x}$ and

The second new property we describe is that any DCA pattern, of any length scaling, can be described as the pattern that maximises the product of the linear impact $\mathbf{r}^\top\mathbf{x}$ and the probability density $p(\mathbf{x})$ to some positive power $\alpha$, i.e., the function

$$(\mathbf{r}^\top\mathbf{x})\,p(\mathbf{x})^{\alpha} \qquad (4)$$

In Appendix C we show that maximising Eq. (4) leads to a pattern which is a scaling of the DCA pattern given

This property of DCA emphasizes more clearly than any of the other properties that, in the MVN case, DCA finds patterns that have both a high linear impact and a high probability density. One could also attempt to find patterns

This property is shown in Appendix D below.

The fourth new property we describe is that, for MVN data, the direction of the DCA vector is the same as the

This is shown in Appendix E below.

A special case of this property is that the expectation over all patterns, weighted by the linear impact, is parallel to DCA. This is the expectation version of Eq. (2).
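This special case lends itself to a quick Monte Carlo check. The sketch below draws a large MVN sample using an assumed covariance matrix and weight vector (illustrative values only), weights each sampled pattern by its linear impact, and compares the direction of the weighted average with the direction of the covariance matrix times the weights.

```python
import numpy as np

rng = np.random.default_rng(seed=2)
cov = np.array([[1.0, 0.6], [0.6, 2.0]])   # assumed covariance matrix
r = np.array([1.0, 2.0])                   # assumed impact weights

# Large zero-mean MVN sample: each row is one possible anomaly pattern.
samples = rng.multivariate_normal(np.zeros(2), cov, size=200_000)
member_impacts = samples @ r               # linear impact of each sampled pattern

# Average of the patterns, each weighted by its own linear impact.
weighted_mean = (member_impacts[:, None] * samples).mean(axis=0)

dca = cov @ r                              # unnormalised DCA direction
cosine = weighted_mean @ dca / (np.linalg.norm(weighted_mean) * np.linalg.norm(dca))
print(cosine)
```

A cosine very close to one indicates that the two vectors are parallel up to sampling noise.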

Given all the above, we can summarize the key properties of DCA, as applied to MVN distributed data, as follows:

1) When scaled appropriately, the DCA pattern gives the unique spatial pattern that maximises the probability density, for a given level of linear impact, for any level of linear impact.

2) When scaled appropriately, the DCA pattern gives the unique spatial pattern that maximises the linear impact, for a given level of probability density, for any level of probability density.

3) All scalings of the DCA pattern maximise the ratio of linear impact to Mahalanobis distance. All scalings give the same value for this ratio.

4) For any given scaling, the DCA pattern is the unique spatial pattern that maximises the linear impact multiplied by a positive power of the probability density. An implication of this is that for any given scaling of the DCA pattern, there is no other pattern which has both a higher probability density and a higher linear impact, and that any change to a DCA pattern would lead to either the probability density or the impact reducing.
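Property 1 in the list above can be illustrated numerically in two dimensions. In the sketch below the covariance matrix, weight vector and target impact are assumed values chosen for illustration. For an MVN, a smaller Mahalanobis distance means a higher probability density, so the code checks that, among patterns with the same linear impact, the suitably scaled DCA pattern has the smallest Mahalanobis distance.

```python
import numpy as np

cov = np.array([[1.0, 0.6], [0.6, 2.0]])   # assumed covariance matrix
r = np.array([1.0, 2.0])                   # assumed impact weights
cov_inv = np.linalg.inv(cov)

def mahalanobis(x):
    """Mahalanobis distance of pattern x from the origin (the ensemble mean)."""
    return np.sqrt(x @ cov_inv @ x)

target_impact = 3.0                        # assumed level of linear impact
# DCA direction, scaled so that its linear impact equals the target.
x_dca = target_impact * (cov @ r) / (r @ cov @ r)

# In 2D, all patterns with the same impact lie on a line through x_dca
# in a direction orthogonal to r.
tangent = np.array([-r[1], r[0]])
steps = np.linspace(-5.0, 5.0, 201)        # the middle step (index 100) is x_dca itself
same_impact_patterns = x_dca + steps[:, None] * tangent
distances = np.array([mahalanobis(x) for x in same_impact_patterns])

print(mahalanobis(x_dca), distances.min())
```

The minimum distance along the line of constant impact occurs at the scaled DCA pattern itself.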

The derivation for DCA in Jewson (2020) is based on maximising likelihood for a given impact, and solves a Lagrangian maximisation problem for the linear impact with the Lagrangian given by: [...] which has the solution $\mathbf{x} \propto \mathbf{C}\mathbf{r}$.
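For orientation, one standard way to set up such a constrained problem for MVN data is sketched below. This is an illustrative formulation under assumed notation (pattern $\mathbf{x}$, covariance matrix $\mathbf{C}$, impact weights $\mathbf{r}$, a fixed linear impact level $I'$ and a Lagrange multiplier $\lambda$) and is not necessarily the exact form used in Jewson (2020). It maximises the MVN log-density, up to an additive constant, subject to a fixed linear impact.

```latex
\begin{align}
  L(\mathbf{x}, \lambda)
    &= -\tfrac{1}{2}\,\mathbf{x}^\top \mathbf{C}^{-1} \mathbf{x}
       + \lambda \left( \mathbf{r}^\top \mathbf{x} - I' \right) \\
  \frac{\partial L}{\partial \mathbf{x}}
    &= -\mathbf{C}^{-1}\mathbf{x} + \lambda\,\mathbf{r} = \mathbf{0}
       \quad\Longrightarrow\quad \mathbf{x} = \lambda\,\mathbf{C}\mathbf{r} \\
  \mathbf{r}^\top \mathbf{x} = I'
    &\quad\Longrightarrow\quad
       \lambda = \frac{I'}{\mathbf{r}^\top \mathbf{C}\,\mathbf{r}},
       \qquad
       \mathbf{x} = \frac{I'}{\mathbf{r}^\top \mathbf{C}\,\mathbf{r}}\;\mathbf{C}\mathbf{r}
\end{align}
```

Under these assumptions the solution is proportional to $\mathbf{C}\mathbf{r}$ for every impact level $I'$, with the impact level only setting the length of the pattern.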

We now show DCA patterns derived from the EURO-CORDEX data described in Sect. 2.2. Figure 6a

that where the ensemble mean is positive there may be concern about increasing precipitation, and where the ensemble mean is negative there may be concern about decreasing precipitation. However, as with simply adding two standard deviations everywhere, this is also not a realistic pattern of climate uncertainty and is also not a candidate for worst-case over the whole domain.

[...] also, there is no pattern with higher total precipitation anomaly for the same likelihood; this pattern maximises both total precipitation anomaly divided by Mahalanobis distance and total precipitation anomaly multiplied by probability density to some positive power; the pattern is proportional to (i.e., the vector is parallel to) the expectation over all possible patterns weighted by their linear impact; and finally there is no possible adjustment to this pattern which could increase both the total precipitation anomaly and the likelihood. Figure 7c shows the sum of the ensemble mean with this DCA pattern. We see that precipitation amounts have increased over much of Northern Europe, relative to the ensemble mean, but to a much lesser extent than was seen in the ensemble mean plus two standard deviations pattern in Fig. 6c, which emphasizes further that Fig. 6c is not a realistic

Another alternative would be to consider the worst member of the ensemble. However, the worst member is highly affected by the randomness of the ensemble generation process, and hence is not robust. In a multi-model ensemble, in which some models may be less realistic than others, there is also a risk that the model that determines [...]

[...] approximation to a more general non-linear impact function, and shown that a higher order quadratic approximation is possible, and is scarcely more complex to apply than DCA. A cubic approximation is also possible. Finally, we have applied DCA to a high-resolution EURO-CORDEX climate projection ensemble, and shown that it gives a reasonable worst-case that is materially less severe than adding or subtracting two standard deviations from the mean locally.

The properties of DCA can be summarized as follows. If we assume that the ensemble is multivariate normally

If $f$ is strictly decreasing then $f'$ is never zero, and this equation gives the same solution as Eq.

To cover all elliptical distributions, including those that are not strictly decreasing, property 1 has to be amended to become "DCA is the spatial pattern that minimises the Mahalanobis distance for a given value of the linear impact function".

We will now show that the DCA pattern provides a solution to the question: given an ensemble with covariance matrix $\mathbf{C}$, what is the spatial pattern $\mathbf{x}$ that maximises the ratio of the linear impact $\mathbf{r}^\top\mathbf{x}$ to the Mahalanobis distance from the ensemble mean, $d(\mathbf{x}) = \sqrt{\mathbf{x}^\top\mathbf{C}^{-1}\mathbf{x}}$? The Mahalanobis distance represents the distance between the pattern $\mathbf{x}$ and the origin, measured using a metric that takes into account the covariance structure: it can be loosely described as being proportional to the number of equally-spaced contour lines of probability density between the origin and the pattern $\mathbf{x}$. In one dimension it is the number of standard deviations from the mean. This derivation applies to any probability distribution, although it is most meaningful for elliptically distributed ensembles.

We will consider the function

$$\frac{\mathbf{r}^\top\mathbf{x}}{\sqrt{\mathbf{x}^\top\mathbf{C}^{-1}\mathbf{x}}}$$

The length of the vector $\mathbf{x}$ does not affect the ratio because it cancels between the numerator and the denominator.

The point with maximum probability density of an MVN is also the mean of that MVN (just as the mode and mean coincide for a univariate normal distribution) and so the DCA pattern is the mean of the reduced dimension MVN.

We write this property for a pattern $\mathbf{x}$, for any value of the linear impact $\mathbf{r}^\top\mathbf{x}$, as the proportional relationship:

We will integrate this expression over a range of levels of linear impact $\mathbf{r}^\top\mathbf{x}$, and since the integral of DCA patterns is a DCA pattern, we find that DCA is also the expectation conditional on a range of linear impacts. Integrating

For instance, we could integrate over the region between two of the diagonal lines in Fig. 5b.
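The expectation over such a region can be approximated by Monte Carlo sampling. In the sketch below the covariance matrix, the weight vector and the band limits are assumed values chosen for illustration. The code averages all sampled MVN patterns whose linear impact falls between two levels and compares the direction of that average with the DCA direction.

```python
import numpy as np

rng = np.random.default_rng(seed=3)
cov = np.array([[1.0, 0.6], [0.6, 2.0]])   # assumed covariance matrix
r = np.array([1.0, 2.0])                   # assumed impact weights

samples = rng.multivariate_normal(np.zeros(2), cov, size=400_000)
impacts = samples @ r                      # linear impact of each sampled pattern

# Keep only the patterns whose linear impact lies between two chosen levels.
lower, upper = 2.0, 4.0                    # assumed band of linear impact
in_band = (impacts > lower) & (impacts < upper)
band_mean = samples[in_band].mean(axis=0)  # expectation conditional on the band

dca = cov @ r                              # unnormalised DCA direction
cosine = band_mean @ dca / (np.linalg.norm(band_mean) * np.linalg.norm(dca))
print(in_band.sum(), cosine)
```

The conditional mean is parallel to the DCA direction up to sampling noise, and narrowing or moving the band changes its length but not its direction.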

Considering the LHS first: writing the expectation as an integral: [...] where $V_1$ is the volume defined by the values of $\mathbf{x}$ that satisfy $\mathbf{r}^\top\mathbf{x} = I'$, $dV_1$ is a volume element in this volume, and $h(I')$ can be taken inside the $V_1$ integral because it is a constant within this volume.

The two integrals can then be combined into one, giving: [...]

Figure 3. As Fig. 2a, but now for two randomly generated 10-member ensembles.

Figure 4. Worst members and DCA patterns for 500 randomly generated ensembles, for the same parameters as the ensembles shown in Fig. 3. The red dots show the mean of the ensemble means. Panel (a) shows 500 worst members, one for each ensemble. Panel (b) shows 500 DCA patterns, one for each ensemble.

Figure 6. Annual mean precipitation change from the EURO-CORDEX ensemble, based on the models listed in [...]

Figure 7. Reasonable worst-case precipitation scenarios calculated using DCA, based on the EURO-CORDEX data used to create Fig. 6. Panel (a) shows a DCA pattern derived from an impact function defined as the sum of precipitation at all points, normalized to two standard deviations. Panel (c) shows the ensemble mean change plus this pattern. Panel (b) shows a DCA pattern derived from an impact function defined as the sum of precipitation at all points where the ensemble mean change shows positive precipitation, minus the sum of precipitation at all points where the ensemble mean change shows negative precipitation. Panel (d) shows the ensemble mean change plus this pattern.

Note: The designations employed and the presentation of the material on this map do not imply the expression of any opinion whatsoever on the part of Research Square concerning the legal status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries. This map has been provided by the authors.