On the dependency of GCM-based regional surface climate change projections on model biases, resolution and climate sensitivity

We investigate the dependency of projected regional changes in surface air temperature (SAT) and precipitation on the model biases, resolution and global temperature sensitivity in two global climate model (GCM) ensembles. End of twenty-first century changes under high end scenarios normalized in units of per degree of global warming (PDGW) are examined for CMIP5 (RCP8.5) and CMIP6 (SSP5-8.5) ensembles of comparable size over 26 sub-continental scale regions, for December–January–February (DJF) and June–July–August (JJA). A brief analysis is also carried out for the scenario SSP3-7.0, which shows results essentially in line with the SSP5-8.5 ones. We find that the average regional change patterns are very similar between the CMIP5 and CMIP6 ensembles, both for SAT and precipitation, with spatial correlations exceeding 0.84. Also similar are the regional bias patterns over most regions analyzed, suggesting that these two generations of models still share some common systematic errors. A statistically significant relationship between projected regional changes and biases is found in 27% of regional cases for both SAT and precipitation; between regional changes and model resolution in 2% of cases for SAT and 12% of cases for precipitation; and between regional changes and global temperature sensitivity in 19% of cases for SAT and 14% of cases for precipitation. Therefore, we assess that the GCM resolution does not appear to be a significant factor in affecting the sub-continental scale projected changes, at least for the resolution range in the CMIP5 and CMIP6 models, while global temperature sensitivity and especially model biases play a more important role. These dependencies are not always consistent between the CMIP5 and CMIP6 ensembles. 
Overall, in our assessment the CMIP6 ensemble does not appear to provide substantially different, and presumably improved, regional surface climate change information compared to CMIP5 despite the use of more comprehensive models and somewhat higher resolution.


Introduction
Coupled global climate models (GCMs) are the primary tools we have today to carry out projections of future climate change based on different greenhouse gas (GHG) concentration pathways. During the last decades, different ensembles of GCM climate projections have been completed as part of various phases of the Coupled Model Intercomparison Project, culminating in CMIP5 (Taylor et al. 2012) and CMIP6 (Eyring et al. 2016), which have provided the basis of assessments of projected changes in surface climate variables, from global to regional scales (e.g. Christensen et al. 2007; Collins et al. 2013).
With successive generations of models, the model resolution and complexity, along with the performance in reproducing the fundamental characteristics of present day climate, have indeed improved (Flato et al. 2013). However, some broad scale ensemble average patterns of change, such as a precipitation increase at high latitudes and in the equatorial belt, and a decrease in the sub-tropics, have persisted for at least three generations of models (Giorgi et al. 2001; Giorgi and Bi 2005; Christensen et al. 2007; Collins et al. 2013). On the other hand, at the sub-continental scale, a pronounced intra-ensemble variability of changes across individual models can be found, in magnitude and often in sign (for precipitation), especially over those regions lying in the transition zones of the change patterns (e.g. Christensen et al. 2007). Different aspects of model configuration can contribute to the regional climate change patterns projected by a given GCM, such as the model physics representations, which affect the model's systematic errors and sensitivity to changing external forcings (often referred to as "global climate sensitivity"), or the model spatial resolution. Some past studies have investigated the dependency of the GCM simulated changes in variables such as surface air temperature (SAT) and precipitation on model biases (e.g. Buser et al. 2009; Giorgi and Coppola 2010; Gobiet et al. 2015; Ivanov et al. 2018) and model resolution (e.g. Gao et al. 2006; Rajendran et al. 2013; Iles et al. 2020; Rana et al. 2020), finding regionally specific responses. It is also well known that the absolute magnitudes of the regional responses of temperature and precipitation depend on the model global climate sensitivity, a result that constitutes the basis of the "pattern-scaling" technique (e.g. Mitchell 2003; Tebaldi and Arblaster 2014; Osborn et al. 2018).
The availability of at least two generations of relatively large ensembles of advanced model projections allows a revisitation of the issues mentioned above, which are of critical importance for the understanding and production of regional climate change scenarios for use either in downscaling experiments or directly in impact and climate service applications (e.g. Gutowski et al. 2016; Hewitson et al. 2013; Giorgi 2019, 2020). More specifically, here we investigate the sensitivity of GCM-based projections of regional (sub-continental scale) changes in SAT and precipitation to model resolution, global temperature sensitivity (GTS) and biases in present day climatologies with respect to observations. We analyze CMIP5 and CMIP6 ensembles to investigate possible differences in behavior between these successive generations of models, and focus on an end of twenty-first century time slice for the most extreme scenarios, RCP8.5 for CMIP5 and SSP5-8.5 for CMIP6 (Moss et al. 2010).
This choice of scenario was made in order to maximize the signal-to-noise ratio. However, the analysis is carried out in terms of change per degree of global warming, which on the one hand removes the direct dependency of the regional change magnitudes on the GTS, and on the other hand can be extended to a first order approximation to other scenarios and time slices through the above mentioned pattern scaling property of GCM projections. To verify this, we also added a brief analysis of another CMIP6 scenario, the SSP3-7.0. Also, we include in our study only one realization per model so as not to skew the results towards models with large numbers of realizations. This is justified on the grounds that the change signals are much less sensitive to the internal model variability than to the inter-model differences and scenarios when looking at end of century changes (Hawkins and Sutton 2009).
Models and analysis methodology are described in Sect. 2, while the results are discussed in Sect. 3 and conclusions are given in Sect. 4.

Models and methods
We analyze ensembles from the CMIP5 (Taylor et al. 2012) and CMIP6 (Eyring et al. 2016) projections. The models, along with their horizontal resolution and GTS are reported in Table 1. We use 34 CMIP5 and 32 CMIP6 simulations, i.e. ensembles of comparable size. We selected available models at the time of completion of this work and single realizations, although we recognise that some models are different versions of the same base modeling system, and therefore cannot be considered as rigorously independent. The horizontal grid spacing varies from 0.75° to 3.75° in the CMIP5 ensemble and 0.7° to 3.75° in the CMIP6 one (Table 1), i.e. the range of resolutions is comparable in the two ensembles, although on average there are more high resolution models in CMIP6.
In our basic analysis we consider two 30-year time slices, 1970-1999 for reference present day climate and 2070-2099 for end of twenty-first century future climate, under the high end RCP8.5 (CMIP5) and SSP5-8.5 (CMIP6) scenarios, which are comparable in terms of greenhouse gas (GHG) forcing. As recommended by a reviewer, however, we also added the analysis of a further CMIP6 scenario, the SSP3-7.0. The GTS for each model and scenario is reported in Table 1, and is calculated as the mean global temperature (land and ocean areas) in the future time slice minus that in the present day one. It can be seen that the CMIP5 ensemble has a GTS in the range of ~2.5 to ~4.7 °C while the CMIP6 ensemble is in the range of ~2.5 to ~6.1 °C, i.e. the CMIP6 models cover a substantially wider GTS range in the upper tail of the distribution (e.g. Forster et al. 2020). In order to filter out the direct dependence of the regional changes on the GTS, we normalize all changes in the individual models by the corresponding model GTS value, i.e. the changes are expressed in Per Degree of Global Warming (PDGW) units. Therefore, all changes mentioned hereafter in the paper refer to the normalized (PDGW) difference between the 2070-2099 and 1970-1999 time slices.
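The GTS and PDGW definitions above can be illustrated with a minimal sketch. The function names, the model labels and all numerical values below are hypothetical and are not taken from Table 1; the percent change convention for precipitation (relative to present day values) follows the units quoted in the figure captions.

```python
from statistics import fmean

def global_temperature_sensitivity(future_global_mean_c, present_global_mean_c):
    """GTS: 2070-2099 minus 1970-1999 global mean temperature (land and ocean), in °C."""
    return future_global_mean_c - present_global_mean_c

def percent_change(future_value, present_value):
    """Change expressed in % of the present day value (used here for precipitation)."""
    if present_value <= 0:
        raise ValueError("present day value must be positive")
    return 100.0 * (future_value - present_value) / present_value

def to_pdgw(regional_change, gts_c):
    """Normalize a regional change by the model's own GTS (per degree of global warming)."""
    if gts_c <= 0:
        raise ValueError("GTS must be positive to normalize a change")
    return regional_change / gts_c

# Hypothetical models: (regional SAT change in °C, model GTS in °C)
models = {"model_a": (6.0, 4.0), "model_b": (3.0, 2.5)}
sat_pdgw = {name: to_pdgw(change, gts) for name, (change, gts) in models.items()}
ensemble_mean_sat_pdgw = fmean(sat_pdgw.values())
```

In this sketch each model is normalized by its own GTS before the ensemble average is taken, so that a high sensitivity model does not dominate the mean simply because its absolute warming is larger.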
We analyze SAT and precipitation changes for the 26 sub-continental scale regions shown in Fig. 1, which are variants of those used by Giorgi (2006) following some of the IPCC AR6 framework. All the data are interpolated onto a common 1° grid and we only consider land points of this grid, hence for coastal regions there might be an uncertainty related to how the models describe the coastlines at their respective resolutions. Finally, we focus on the boreal winter (austral summer; December-January-February, DJF) and boreal summer (austral winter; June-July-August, JJA) seasons, given the strong seasonal dependency of the climate change signals (e.g. Collins et al. 2013).
Table 1 List of CMIP5 and CMIP6 models used in this study, along with the model horizontal grid spacing in the east-west (Delta-x) and north-south (Delta-y) directions, and the global temperature sensitivity (GTS, see text). Bold characters indicate models that are included in the sub-ensemble of clear evolutions from CMIP5 to CMIP6. Asterisks indicate models used in the analysis of the SSP3-7.0 scenario
Also shown are correlation coefficients between the CMIP5 and CMIP6 change patterns. Units are °C PDGW

Ensemble mean changes
Figures 2 and 3 show the ensemble mean SAT and precipitation changes, respectively, for DJF and JJA in the CMIP5 and CMIP6 ensembles over land areas, along with the linear correlation coefficient between the corresponding changes in the two ensembles. The PDGW temperature change patterns are very similar between CMIP5 and CMIP6 (correlations of 0.93-0.99), with a strong maximum in DJF over the Arctic regions (exceeding 2.5 °C PDGW) and maxima exceeding 1.5 °C PDGW in JJA over the northern hemisphere mid-continental regions. Also for precipitation the patterns are generally similar in CMIP5 and CMIP6, with correlations of 0.91 in DJF and 0.84 in JJA, consisting of the well known increases over high latitude regions, decreases over some sub-tropical areas and prevailing increases over the equatorial belt and the monsoon regions, except for West Africa (Collins et al. 2013). The magnitudes of the PDGW changes are different in some instances across the two ensembles, such as over the Amazon Basin and Eastern Africa. Figure 4 compares the SAT and precipitation changes averaged over the 26 analysis regions in the CMIP5 and CMIP6 ensembles. For temperature, in the vast majority of cases the magnitudes of the change agree in the two ensembles, with particularly large values exceeding 2 °C PDGW over the northern hemisphere high latitude regions EEU, NWNA and NENA in DJF, most likely as a result of the snow/ice albedo feedback mechanism. For precipitation, in all cases with changes of magnitude in excess of 0.5% PDGW the sign of the change is the same in CMIP5 and CMIP6, with most cases also exhibiting similar magnitudes. Note the particularly large positive percent changes exceeding 6% PDGW found over the northern high latitude regions in DJF and the desert regions of SAH and ARP in JJA (for CMIP6), evidently related to the low precipitation amounts in the present day time slice.
The most pronounced decreases of precipitation, in excess of −5% PDGW, are found over SAF, MED
The inter-model spread of changes for the 26 regions is illustrated in Figs. 5 (SAT) and 6 (precipitation). Starting with temperature (Fig. 5), there is no systematic difference in the inter-model range of PDGW changes projected by the two ensembles, even though the CMIP6 ensemble covers a more pronounced climate sensitivity range. Both in DJF and JJA the largest inter-model spreads are found in high latitude regions, especially over North America, Northern Europe and Asia. This is evidently due to the model representation of snow and ice processes and the associated snow albedo feedback mechanism. The inter-quartile range is generally less than 0.5 °C, except for some of the northernmost American regions, where the inter-model 5th-95th percentile range of the PDGW changes can exceed 1.5 °C.
Moving to precipitation (Fig. 6), there appear to be more cases of large interquartile range in the CMIP5 ensemble than in the CMIP6 one, especially over the dry regions (WCAS, ARP, MED in JJA; EAS, SAS, WCAS in DJF), which is an indication of greater diversity of model behaviors in CMIP5. Differently from temperature, the inter-model spreads in the PDGW precipitation changes appear more pronounced in tropical regions, likely due to the use of different cumulus convection schemes in the models. In about 36% of regional/season cases the 5th-95th percentile range does not cross the 0 line, indicating full (or almost full) inter-model agreement in the projected precipitation change sign, and in the vast majority of these cases the two ensembles exhibit a similar behavior. The (almost) full inter-model agreement is found mostly in mid to high latitude regions, especially during the winter season (positive changes), while the lack of inter-model agreement in the precipitation change sign prevails over tropical regions.
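The sign agreement criterion used above (the 5th-95th percentile range not crossing the zero line) can be sketched as follows. The percentile convention (linear interpolation between sorted values) is an assumption, since the text does not specify one, and the regional samples are hypothetical.

```python
import math

def percentile(values, q):
    """q-th percentile (0-100), linearly interpolated between sorted values (assumed convention)."""
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def sign_agreement(changes_pdgw):
    """Return +1 or -1 if the 5th-95th percentile range lies entirely above or below zero, else 0."""
    p5 = percentile(changes_pdgw, 5)
    p95 = percentile(changes_pdgw, 95)
    if p5 > 0:
        return 1
    if p95 < 0:
        return -1
    return 0

# Hypothetical regional precipitation changes (% PDGW), one value per model
wet_region = [0.5, 1.0, 1.5, 2.0, 2.5]
dry_region = [-3.0, -2.0, -1.5, -0.5]
mixed_region = [-1.0, 0.5, 1.0, 2.0]
```

With these illustrative samples, the first two regions would be counted as cases of (almost) full inter-model agreement in sign, while the third would not.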

Sensitivity to model bias
We now turn our attention to the assessment of the dependence of the model projected regional changes on the regional model bias in reproducing present day climate. It has been suggested that errors in the present day model climatology might be carried over, possibly in an amplified way, to the change projections (e.g. Gobiet et al. 2015; Ivanov et al. 2018). In this regard, for example, Giorgi and Coppola (2010) carried out an analysis of the dependency of the model regional change patterns on the model bias in the CMIP3 GCM ensemble over 24 sub-continental scale regions. They found a negligible effect for temperature and a significant effect in about 30% of the regional cases for precipitation, even though in many cases models with quite different biases showed a similar change signal. They concluded that, at least in the CMIP3 ensemble and for regional scale mean changes, errors in present day climatologies did not play a dominant role in modulating the projected change signals, especially for temperature.
Here we extend the analysis of Giorgi and Coppola (2010) to the CMIP5 and CMIP6 ensembles, with the different approach of assessing PDGW changes. Figure 7 first compares the SAT and precipitation biases over the 26 regions of Fig. 1 in the two ensembles. The biases are calculated with respect to the observation dataset of the University of Delaware (UDEL, Willmott and Matsuura 2001). Observation datasets are characterized by substantial uncertainties and can differ significantly from one another (e.g. Sylla et al. 2013; Hartmann et al. 2013), so that the UDEL should be considered more as representing a "reference" dataset useful for intercomparing models rather than the best representation of reality. We selected it because it is generally close to the dataset from the Climatic Research Unit of the University of East Anglia (CRU, Harris et al. 2020) with the added value of including a gauge undercatch correction for precipitation, which has been shown to be important especially in mountainous regions during the winter (e.g. Adam and Lettenmaier 2003).
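As a minimal sketch of the bias measures used in this section, the regional bias can be written as the model value minus the reference (here UDEL) value for SAT, and as a percentage of the observed value for precipitation. The sign convention (positive for warm or wet biases), the function names and the numbers are assumptions for illustration only.

```python
def sat_bias(model_sat_c, observed_sat_c):
    """Regional SAT bias in °C: positive values denote a warm bias (assumed convention)."""
    return model_sat_c - observed_sat_c

def precipitation_bias_percent(model_pr, observed_pr):
    """Regional precipitation bias in % of the observed value: positive values denote a wet bias."""
    if observed_pr <= 0:
        raise ValueError("observed precipitation must be positive")
    return 100.0 * (model_pr - observed_pr) / observed_pr

def ensemble_mean_bias(biases):
    """Ensemble average of the individual model biases for one region and season."""
    if not biases:
        raise ValueError("need at least one model bias")
    return sum(biases) / len(biases)

# Hypothetical regional seasonal means for three models against one reference value
model_pr_mm_day = [3.0, 2.5, 1.0]
observed_pr_mm_day = 2.0
pr_biases = [precipitation_bias_percent(m, observed_pr_mm_day) for m in model_pr_mm_day]
```

Expressing the precipitation bias in percent of observed values makes dry and wet regions comparable, at the cost of large percent values where the observed precipitation is very low.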
The most striking aspect of Fig. 7 is that for the vast majority of cases the ensemble average bias values of the CMIP5 and CMIP6 ensembles have the same sign, both for temperature and precipitation. In addition, we do not find a systematic difference in the bias magnitudes between the ensembles, with cases in which either the CMIP5 or the CMIP6 ensemble exhibits lower values. The only exception is JJA, where more cases of warm bias in excess of 0.5 °C are actually found in CMIP6. Therefore, overall, the performance of the newest generation CMIP6 ensemble does not appear to be significantly improved compared to CMIP5, at least in terms of seasonal ensemble averages. For precipitation, in only two cases do we find regional biases exceeding 5% in magnitude having different signs between CMIP5 and CMIP6. This indicates that the two generations of models still share similar systematic errors in terms of precipitation generation and its driving processes (e.g. circulation features). Particularly noticeable are the large dry biases in excess of −40% over the northeast Brazil region in JJA and the Arabian Peninsula and Sahel regions in DJF, along with the pronounced wet biases (> 40 to 50%) over Western North America and Central Asia in DJF.
To explore further the possible evolution of biases from the CMIP5 to the CMIP6 models, we identified 12 GCMs in CMIP6 that are clearly evolutions of CMIP5 counterparts. These are highlighted in Table 1, and the regional biases in these sub-ensembles are presented in Supplementary Fig. S1. This figure confirms that the CMIP6 sub-ensemble does not show a systematic improvement in biases, and that the biases in the CMIP5 sub-ensemble are mostly found also in the CMIP6 evolution models.
Fig. 6 Box and whiskers plots denoting interquartile (boxes) and the 5th-95th percentile inter-model ranges (bars) of projected regionally averaged precipitation changes in the CMIP5 (blue, RCP8.5) and CMIP6 (green, SSP5-8.5) ensembles. Data are shown for the 26 regions of Fig. 1 and units are % PDGW
In order to explore the dependency of the change signals on the model biases we calculated the linear trend regression between the seasonal change and bias values of the individual models over the different regions of Fig. 1 for the two ensembles. Figure 8 shows the linear regression values for the regional cases in which this regression is statistically significant at the 95% confidence level.
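A regression of this kind, with its significance test, could be sketched as follows for one region and season. The text does not state which test was used; this sketch assumes an ordinary least-squares fit and a two-sided Student t test on the correlation coefficient, with the p-value obtained by numerical integration. All names and data values are hypothetical.

```python
import math

def regress(x, y):
    """Least-squares slope, intercept and Pearson r of y (change) against x (e.g. bias)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length samples with at least 3 models")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must both vary across models")
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, sxy / math.sqrt(sxx * syy)

def two_sided_p_value(r, n, steps=2000):
    """Two-sided p-value of a Student t test on r with n - 2 degrees of freedom.

    Uses the substitution t = sqrt(dof) * tan(theta), so that P(|T| <= t) is a
    normalized integral of cos(theta)**(dof - 1) from 0 to asin(|r|), evaluated
    here with Simpson's rule (steps must be even).
    """
    dof = n - 2
    if dof < 1:
        raise ValueError("need at least 3 models")
    theta_max = math.asin(min(1.0, abs(r)))
    norm = 2.0 * math.exp(math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2)) / math.sqrt(math.pi)
    h = theta_max / steps
    total = 1.0 + math.cos(theta_max) ** (dof - 1)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * math.cos(i * h) ** (dof - 1)
    return max(0.0, 1.0 - norm * total * h / 3.0)

# Hypothetical regional SAT biases (°C) and PDGW changes (°C PDGW), one pair per model
bias = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
change = [1.9, 1.6, 1.5, 1.2, 1.1, 0.7]
slope, intercept, r = regress(bias, change)
significant = two_sided_p_value(r, len(bias)) < 0.05  # 95% confidence level
```

In this illustrative case the slope is negative, i.e. the warmer the bias the smaller the normalized warming. The same two functions could be applied with the model resolution or the GTS as the predictor.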
This figure offers some interesting considerations. For temperature, about 27% of regional cases show a significant bias/change dependency, a percentage much higher than found by Giorgi and Coppola (2010). This is possibly because here we use changes in PDGW while Giorgi and Coppola (2010) used the absolute values of the temperature change, and thus the effect of the GTS might have at least partially masked the actual regional amplification signal. In DJF, the CMIP5 ensemble exhibits many more cases of significant change-bias relationship than CMIP6, with this dependency being negative for northern hemisphere regions (thus winter conditions). This implies that, as the bias becomes more positive (or less negative), the warming tends to decrease, which may be associated with the fact that warm biases yield an underestimation of snow cover, and thus a reduced warming amplification by the snow albedo feedback mechanism. The opposite appears to occur during the warm seasons, with mostly positive regression coefficients, implying that the warmer the bias the more amplified the warming. This can at least partially be ascribed to an underestimation of soil moisture by the models, which tends to amplify the response to the radiative forcing because of the lack of moisture available for evaporation (and thus evaporative cooling).
Also for precipitation about 27% of regional cases show a significant bias/change dependency, thus more in line with the results of Giorgi and Coppola (2010), with a prevalence of negative significant linear regression values. This implies that the drier (or less wet) the bias, the more negative (or less positive) the change. In other words, dry biases tend to either amplify dry changes or reduce wet changes. Again, this response can at least partially be attributed to an amplification of the effect of underestimated soil moisture and precipitation feedback over these regions.
The analysis of Fig. 8 is repeated in Supplementary Fig. S2 for the sub-ensemble of 12 models identified as evolutions of previous generation models (see Fig. S1 and Table 1). In this case, we find a lower number of regional cases with statistically significant bias-change correlations, but this result can be expected in view of the smaller number of ensemble members, which implies more stringent conditions for statistical significance.
Some examples of scatter plots and regression lines for regions exhibiting significant bias-change relationships, either positive or negative, are given in Fig. 9 for both CMIP5 and CMIP6, SAT and precipitation changes. It can be seen that in some cases, for example SAT in DJF-NEU-CMIP6 or precipitation in JJA-SAU-CMIP5, the spread around the trend line is relatively low, while in most cases it is wide, although still statistically significant at the 95% confidence level. Also note that the range of the model biases can be quite large. For SAT, most regional biases are in the range of ± 3 °C, but there are several cases of biases with magnitude > 4 °C, reaching even − 10 °C in DJF-NEU-CMIP6. For precipitation we find a wide range of biases, in most cases within ± 40% of observed values, but in some cases even exceeding 100%. In addition, there are instances in which almost all models share the same sign of precipitation bias, indicating common systematic errors.

Sensitivity to model resolution
In general, the GCM resolution can affect the regional surface climate changes either through its effect on the simulation of large scale circulation features and teleconnection patterns or through the different representation of local forcings and associated feedbacks (e.g. topography; Giorgi et al. 2016). Figure 10 reports the cases of significant linear regression coefficients (95% confidence level) between the model simulated change and the model resolution, indicating that only a small number of regional cases are characterized by a significant change/resolution dependency.
In fact, for temperature, the model resolution essentially does not affect the change significantly (except for two regional cases). For precipitation more cases of significant dependency are found, about 12%, but with no consistent responses across the CMIP5 and CMIP6 ensembles. Specifically, in 7 regional cases the CMIP6 ensemble shows a positive change/resolution relation, implying an amplified positive precipitation response, or decreased negative response, by resolution. Of interest is the case of the Mediterranean region in JJA, where the CMIP6 ensemble appears to indicate that the decrease in precipitation tends to be less pronounced with increasing resolution. This may be at least partially related to the contribution of increased high elevation convection associated with enhanced topographical warming over better resolved topography (e.g. Giorgi et al. 2016). Conversely, the CMIP5 ensemble shows mostly negative trend values, implying an amplification of dry signals or a reduction of wet ones with resolution, for example over the Mediterranean in DJF and central America in JJA, both of which are prominent dry hotspots (Fig. 4; Giorgi 2006).
Examples of scatterplots and regression lines for which the change-resolution relationship is statistically significant at the 95% confidence level are shown in Fig. 11. It can be seen that in both ensembles different models sharing the same resolution can have quite different change responses, which results in relatively pronounced deviations from the regression line, even if the regression itself is significant. This is again an indication of a weak resolution-change dependency.
In summary, although some regional cases of significant resolution dependency are found, Fig. 10 overall indicates that, at least within the range of resolutions in the coupled GCMs of CMIP5 and CMIP6, model resolution is not a dominant factor in determining the regional surface climate responses.

Sensitivity to model global temperature sensitivity (GTS)
Figure 12 shows the regional cases of significant linear regression between change and GTS. In this regard, we recall that the direct dependency of the change on the model climate sensitivity is removed via the expression of the change in PDGW units, and therefore the dependencies of Fig. 12 indicate whether the response is regionally amplified or reduced compared to the global sensitivity. For temperature, we find 19% of cases with a significant regression relation at the 95% confidence level. There is a prevalence of negative values of the change/GTS relation, implying that over these regions the warming signal is reduced in the high sensitivity models. Noticeable exceptions are some high latitude Northern Hemisphere regions, such as NEU, EEU, NENA, NWNA in JJA, where the regional warming is actually amplified for high sensitivity models. This is possibly related to the snow albedo feedback, since some snow may persist in these regions during the summer in present day conditions, while it essentially disappears in future warmer climates especially for the higher sensitivity models, thereby amplifying the warming.
For precipitation, only 3 cases in JJA show a significant (positive) linear regression in the CMIP5 ensemble, while in CMIP6 significant regression values are found in 12 regional cases (about 23% of all cases), with prevailing positive regressions implying that positive (negative) precipitation changes are amplified (reduced) for high sensitivity models. The larger number of significant cases in CMIP6 may be related to the broader range of GTS response within this ensemble compared to CMIP5. Noticeable cases are some north and northeastern African and central Asia regions, although it is difficult to attribute this result to specific processes.
Fig. 8 Regional cases for which the linear regression coefficient between the models' regional projected change and bias is statistically significant at the 95% confidence level in the CMIP5 (C5, RCP8.5) and CMIP6 (C6, SSP5-8.5) ensembles. The value presented is the coefficient of the linear regression line (if significant) and units are °C/°C for temperature and %/% for precipitation
Fig. 9 Illustrative examples of individual model simulated change (y-axis, units of °C for SAT and % of present day values for precipitation) vs. bias (x-axis, units of °C for SAT and % of observed values for precipitation) scatterplots for different regional cases in which the change/bias linear regression is statistically significant at the 95% confidence level. The regression line is also shown. CMIP5 (RCP8.5) and CMIP6 (SSP5-8.5) cases are included for both SAT and precipitation, as reported in the title of the different figure panels
Examples of scatterplots for regional cases with statistically significant GTS-change dependency are shown in Fig. 13. In general, the spread around the regression line appears somewhat more pronounced than in the case of the bias-change dependency (Fig. 9), but the relationships are clear in all cases.
Another measure of global climate sensitivity is the equilibrium climate sensitivity (ECS), defined as the global temperature increase associated with a doubling of carbon dioxide concentration. Compared to the GTS, on the one hand, the ECS does not depend on the chosen scenario, and thus it can provide a more independent indicator of climate sensitivity, but on the other it does not include the effects of changes in other forcings (e.g. aerosols), and thus is not entirely consistent with the regional changes for a given scenario. As suggested by a reviewer, we repeated the calculations of Fig. 12 using ECS instead of GTS to calculate regional cases with significant change-ECS correlations. Table 1 reports the sub-ensembles of models for which we could retrieve the information on ECS, showing that while for the CMIP6 this was possible for almost all models, for CMIP5 the ECS-based sub-ensemble is significantly smaller than the GTS-based one. Also note that, interestingly, for a few models there is no agreement in relative model ranking between the GTS and ECS magnitudes.
The results of the ECS-based analysis are presented in Supplementary Fig. S3. They show, first of all, a lower number of statistically significant regional correlation cases than when GTS is used, which may be due to the above mentioned greater independence of the ECS variable or, for CMIP5, to the smaller ensemble size. Focusing on the significant cases, for precipitation all the ECS-based significant regional cases are also significant when using GTS, in fact with the same sign and magnitude of regression slope. This is mostly found also for temperature, although in a few regional instances the results differ across the two analyses. Therefore, by and large, the ECS-based calculations support the GTS-based ones.

Dependency on scenario: SSP3-7.0
Following the recommendation by a reviewer, we extended our calculations to the projections associated with an additional scenario within CMIP6, the SSP3-7.0. The reason for choosing this scenario is twofold. First, this is also a relatively high GHG concentration scenario, and thus provides a clearer signal than lower concentration ones. Second, it includes larger forcings for aerosols and land-use (IPCC 2021), which are most relevant at the regional scale and can therefore affect regional projections. Table 1 shows the 21 models for which we could find projections with the SSP3-7.0 scenario, i.e. a sub-ensemble of the SSP5-8.5 one. First, we computed maps of the ensemble average changes of temperature and precipitation in PDGW units analogous to Figs. 2 and 3 (shown in Supplementary Figs. S4 and S5). Note that for these we used for the normalization the GTS from the SSP3-7.0 scenario (also shown in Table 1) and found that the change patterns are very similar to those for the SSP5-8.5, with pattern correlations between the two ensembles of 0.99 for temperature and 0.96-0.97 for precipitation. Figure 14 then compares the regional precipitation and temperature changes for the SSP5-8.5 and SSP3-7.0 scenarios, including only the 21 models common to the two scenarios. It is evident that in the vast majority of cases there is agreement across the two scenarios, both in magnitude and, for precipitation, in sign. For temperature, in only 5 out of 104 regional cases does the magnitude of the warming differ across the two scenarios. For precipitation, this happens in 15 out of 104 cases, and the sign of the change always agrees across the two scenarios when the changes are of magnitude greater than 0.5%. Supplementary Fig. S6 finally compares the regional cases with significant change-bias, change-resolution and change-GTS relationships in the SSP5-8.5 and SSP3-7.0 scenarios (21 models only), showing that, in general, fewer regional cases with statistically significant correlations are found in both scenarios due to the smaller size of the projection ensembles. When significance is found, the two scenarios show a tendency to mostly agree on the sign of the dependency, although this is not found ubiquitously. Overall, our assessment is that relatively small differences are found across the two projection scenario ensembles.
Fig. 10 Regional cases for which the linear regression coefficient between the models' regional projected change and resolution is statistically significant at the 95% confidence level in the CMIP5 (C5, RCP8.5) and CMIP6 (C6, SSP5-8.5) ensembles. The value presented is the coefficient of the regression line (if significant) and units are °C/Deg for temperature and %/Deg for precipitation. Note that the value of the east-west grid spacing of Table 1 is used to calculate the regression
Fig. 11 Illustrative examples of individual model simulated change (y-axis, units of °C for SAT and % for precipitation) vs. resolution (x-axis, units of Deg) scatterplots for different regional cases in which the change/resolution linear regression is statistically significant at the 95% confidence level. The regression line is also shown. CMIP5 (RCP8.5) and CMIP6 (SSP5-8.5) cases are included for both SAT and precipitation, as reported in the title of the different figure panels
Fig. 12 Regional cases for which the linear regression coefficient between the models' regional projected change and global temperature sensitivity (GTS) is statistically significant at the 95% confidence level in the CMIP5 (C5, RCP8.5) and CMIP6 (C6, SSP5-8.5) ensembles. The value presented is the coefficient of the regression line (if significant) and units are °C/°C for temperature and %/°C for precipitation

Conclusions
In this paper we analyzed the projected regional changes in surface air temperature (SAT) and precipitation produced by ensembles of CMIP5 and CMIP6 projections, and in particular we investigated the effect of model biases, resolution and global temperature sensitivity (GTS) on the regional changes. We focused on 26 regions of sub-continental size (Fig. 1) and on an end of century time slice of high end scenarios (RCP8.5 for CMIP5 and SSP5-8.5 for CMIP6). However, since our analysis is carried out on the basis of changes expressed in PDGW, the conclusions are to a first order scalable to other scenarios and time slices (e.g. Tebaldi and Arblaster 2014; Osborn et al. 2018). To verify this scalability, we also carried out an analysis of the SSP3-7.0 scenario in CMIP6, finding results generally consistent with those for the SSP5-8.5 scenario.
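The PDGW normalization underlying this scalability argument can be sketched as follows. This is a minimal illustration, assuming the normalization amounts to dividing each model's regional change by that model's GTS for the same scenario and time slice. The function name `pdgw_change` and all numerical values are hypothetical and are not taken from the analysis.

```python
def pdgw_change(regional_change, global_warming):
    """Express a regional change per degree of global warming (PDGW).

    regional_change: regional SAT change (degC) or precipitation change (%).
    global_warming: the model's global mean SAT change (GTS, degC) for the
        same scenario and time slice (assumed definition of the normalization).
    """
    if global_warming <= 0:
        raise ValueError("global warming must be positive for PDGW normalization")
    return regional_change / global_warming


# Hypothetical values for one model and one region (not from the paper).
sat_change = 6.0        # regional SAT change, degC
precip_change = -12.0   # regional precipitation change, %
gts = 4.0               # global mean SAT change for the same scenario, degC

sat_pdgw = pdgw_change(sat_change, gts)        # degC per degC of global warming
precip_pdgw = pdgw_change(precip_change, gts)  # % per degC of global warming
```

Because both the regional change and the GTS scale with the strength of the forcing, their ratio is, to a first order, comparable across scenarios of different intensity.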
We first found that, in PDGW terms, the ensemble average change patterns in the new CMIP6 ensemble are close to the CMIP5 ones, both for SAT and precipitation, with spatial correlations over land areas greater than 0.84. Therefore, the basic regional change patterns found in the last few generations of GCMs (Giorgi et al. 2001; Giorgi and Bi 2005; Christensen et al. 2007; Collins et al. 2013) remain essentially unchanged in CMIP6, despite the use of more comprehensive and higher resolution Earth System Models.
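A spatial (pattern) correlation of this kind can be sketched as a Pearson correlation between two ensemble average change fields sampled on the same land grid points. This is a sketch under stated assumptions: the function name `pattern_correlation`, the optional area weights (e.g. the cosine of latitude) and the sample values are all hypothetical, and the exact weighting used in the analysis is not specified here.

```python
import math


def pattern_correlation(field_a, field_b, weights=None):
    """Weighted Pearson correlation between two flattened fields.

    field_a, field_b: sequences of values at the same grid points
        (e.g. land-only PDGW changes from two ensembles).
    weights: optional per-point weights (hypothetical choice: cos(latitude));
        uniform weights are used when omitted.
    """
    if len(field_a) != len(field_b):
        raise ValueError("fields must have the same number of grid points")
    if weights is None:
        weights = [1.0] * len(field_a)
    w_sum = sum(weights)
    mean_a = sum(w * a for w, a in zip(weights, field_a)) / w_sum
    mean_b = sum(w * b for w, b in zip(weights, field_b)) / w_sum
    cov = sum(w * (a - mean_a) * (b - mean_b)
              for w, a, b in zip(weights, field_a, field_b))
    var_a = sum(w * (a - mean_a) ** 2 for w, a in zip(weights, field_a))
    var_b = sum(w * (b - mean_b) ** 2 for w, b in zip(weights, field_b))
    return cov / math.sqrt(var_a * var_b)


# Hypothetical PDGW change values at four land grid points (not from the paper).
ensemble_1 = [1.0, 2.0, 3.0, 4.0]
ensemble_2 = [1.1, 1.9, 3.2, 3.8]
r = pattern_correlation(ensemble_1, ensemble_2)
```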
The dependency of the regional changes on the model resolution is generally weak, except for a small number of regional cases, most notably the Mediterranean region in JJA, where high elevation convection in the higher resolution models might reduce the projected drying (Giorgi et al. 2016).
Concerning the dependency of the projected regional changes on the model regional biases, first we do not find a general improvement of systematic biases in CMIP6 compared to CMIP5, and in fact the CMIP6 SAT biases are larger in JJA. In addition, the signs of the regional biases mostly agree between the CMIP5 and CMIP6 ensembles, indicating that the latest generation GCMs used for climate projections are still characterized by the same basic systematic errors present in the previous one. This conclusion is also generally valid for a sub-ensemble of models that are clear evolutions from CMIP5 to CMIP6. Concerning the change/bias dependency, we find a significant relationship in about 25% of regional cases for temperature (more for CMIP5 than CMIP6), and 30% for precipitation. In particular, most of the significant cases exhibit a negative change/bias relationship for precipitation, implying that dry biases tend to amplify negative changes or reduce positive ones, likely because of an underrepresentation of soil moisture-precipitation feedbacks. For SAT a negative relation dominates in DJF and a positive one in JJA, the latter (former) implying that warm (cold) biases tend to amplify the warming. For JJA this may be ascribed to the underestimation of soil moisture inducing a feedback with surface temperatures, while for DJF the result is likely related to the snow albedo feedback mechanism.
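The kind of across-model change/bias relationship discussed here can be illustrated with a least-squares slope and a significance estimate. The sketch below is not the procedure used in the analysis: it swaps in a two-sided permutation test for whatever parametric test underlies the 95% confidence level, and the function names, the ten-model sample and its values are all hypothetical.

```python
import random


def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the slope (swapped-in technique).

    The y values are shuffled across models to break any x-y relationship;
    the p-value is the fraction of shuffles whose absolute slope is at least
    as large as the observed one (with the usual +1 correction).
    """
    rng = random.Random(seed)
    observed = abs(ols_slope(x, y))
    y_shuffled = list(y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(y_shuffled)
        if abs(ols_slope(x, y_shuffled)) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)


# Hypothetical ten-model sample for one regional case (not from the paper):
# regional precipitation bias (%) and projected change (% per degC).
bias = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
noise = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.05, -0.05, 0.1, -0.1]
change = [-0.8 * b + e for b, e in zip(bias, noise)]

slope = ols_slope(bias, change)
p_value = permutation_p_value(bias, change)
significant = p_value < 0.05  # analogous to a 95% confidence level
```

A permutation test makes no distributional assumption, which can be convenient for ensembles of a few tens of models, but it will not in general reproduce the exact set of significant cases obtained with a parametric test.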
Finally, about 20% of the regional cases show a significant dependency of the changes on the GTS for the SAT, and about 15% for precipitation, with most cases occurring in CMIP6 and indicating a positive dependence, i.e. amplified positive change or reduced negative change, with increasing GTS. This may be related to the generally more pronounced climate sensitivity range shown by the CMIP6 models compared to CMIP5. A brief analysis using ECS instead of GTS shows fewer cases of significant change-ECS correlations than change-GTS ones, either due to a lower number of models for which we could retrieve the ECS (for CMIP5) or because of the more independent nature of the ECS metric, which reduces consistency with the actual changes simulated for a given scenario. When cases show significant correlations, there is a prevailing agreement between the two analyses.
Fig. 13 Illustrative examples of individual model simulated change (y-axis, units of °C for SAT and % for precipitation) vs. the model global temperature sensitivity (GTS) (x-axis, units of °C) scatterplots for different regional cases in which the change/GTS linear regression is statistically significant at the 95% confidence level. The regression line is also shown. CMIP5 (RCP8.5) and CMIP6 (SSP5-8.5) cases are included for both SAT and precipitation, as reported in the title of the different figure panels
Overall, the effect of bias and GTS on the sub-continental scale ensemble average changes appears to be an important, albeit not dominant, feature of these two generations of model projections, while the effect of resolution is largely not significant. As a general assessment, the CMIP6 ensemble does not appear to provide a stepwise improvement in regional surface climate performance and projections compared to CMIP5, with the models seemingly still sharing common systematic biases and responses, despite the increase in resolution (although not pronounced) and complexity of the CMIP6 models.
This conclusion is obviously limited to the variables and temporal/spatial scales analyzed here, but offers grounds for some considerations concerning the potential improvement in regional scale projections by current GCMs. Specifically, it is the authors' opinion that further incremental enhancements in model resolution and complexity will likely not change substantially the basic patterns of regional SAT and precipitation projections. Probably only the use of very high resolutions of ~ 10 km or even higher (convection permitting) will lead to substantial effects, since RCM experiments have indeed indicated that at these resolutions the surface climate change signals are strongly affected by the description of local scale processes (e.g. cumulus convection) and forcings (e.g. complex topography) (Giorgi et al. 2016;Prein et al. 2016). Global high resolution atmospheric model simulations have also shown that some features of the global circulation, e.g. the position of the mid-latitude storm track or the monsoon, are better represented at high resolutions (e.g. Rajendran et al. 2013;Flato et al. 2013;van Haren et al. 2015;Haarsma et al. 2016), thus potentially affecting regional projections. In addition, regional climate model simulations at convection permitting resolutions have shown that the simulation and projection of higher order precipitation characteristics, such as extremes and sub-daily intensities, are considerably improved at such resolutions (Prein et al. 2015;Coppola et al. 2020;Ban et al. 2021;Pichelli et al. 2021).
Another factor that may significantly affect projected regional changes is a more detailed representation of forcings relevant at regional to local scales, such as the effects of changes in tropospheric aerosols and land use. Our analysis of the SSP3-7.0 scenario, which includes more pronounced aerosol and land-use change forcings, does not indicate an important role of these factors, at least for the metrics utilized. However, so far, these model components and associated forcings have not been included with sufficient detail in climate change projections, and their model description needs to be improved in order to achieve greater robustness of regional projections.