Future Changes in Precipitation Over Northern Europe Based on a Multi-model Ensemble from CMIP6: Focus on Tana River Basin

Accurate climate projections help policymakers mitigate the negative effects of climate change and prioritize environmental issues based on scientific evidence. These projections rely heavily on the outputs of GCMs (General Circulation Models), but the large number of GCMs and the differences among their outputs in each region make model selection difficult for researchers. In this paper, we analyzed the performance of a CMIP6 (Coupled Model Intercomparison Project Phase 6) multi-model ensemble for Pr (precipitation) data over NE (Northern Europe). First, we evaluated the overall performance of 12 CMIP6 GCMs over the 30-year period 1985–2014. Future projections for 2071–2100 were then analyzed under SSP1-2.6 and SSP5-8.5 (Shared Socioeconomic Pathways). Next, the simulations were statistically improved using an ensemble method to correct the systematic error of the CMIP6 models, and the capacity of the post-processed data to reproduce historical trends of climate events was investigated. Finally, the possible spatio-temporal changes of future Pr were explored in the Tana River Basin. The results show that the CMIP6 models differ in their accuracy in estimating Pr in the study area, and that the ensemble method can be effective in increasing the accuracy of the projections. Monthly Pr over the Tana River Basin is projected to increase by 2.46% and 2.06% in 2071–2100 relative to the historical period under SSP1-2.6 and SSP5-8.5, respectively.


Introduction
Climate is changing in the Arctic areas twice as fast as in other areas (Anisimov et al. 2007; Cohen et al. 2014). Although the effects of climate change have been investigated previously in different Arctic regions (Ashraf et al. 2016; Callaghan et al. 2010; 2009; Olsson et al. 2015; Pascual et al. 2021), recent research has shown that ecosystem responses to climatic changes vary among these regions (Callaghan et al. 2013; Danby and Hik 2007; Elmendorf et al. 2012; Kivinen et al. 2017; Van Bogaert et al. 2011). However, less emphasis has been given to possible changes in local Sub-Arctic areas than to changes at the broader pan-Arctic scale.
GCMs are commonly employed to simulate and predict climatic data across the entire planet (Hwang and Graham 2013; Rupp et al. 2013; Su et al. 2013). Because GCMs differ in structure, assumptions, parametrization, and calibration procedures, their forecasts differ from place to place and from time to time (Kay et al. 2009; Khan et al. 2018; Kumar et al. 2013). As a result, performance evaluation of different GCMs has been emphasized as a precondition for any climatic research (Ahmadalipour et al. 2015; Gulizia and Camilloni 2015; Lovino et al. 2018; Miao et al. 2014; Moise et al. 2015; Purich et al. 2013; Raghavan et al. 2018; Rivera and Arnould 2019; Zazulie et al. 2018).
Recently, CMIP6, the latest generation of coordinated GCM experiments, released its suite of simulations to support the 6th Assessment Report of the IPCC (Intergovernmental Panel on Climate Change). Recent studies have shown that the CMIP6 models can better predict climatic parameters than previous ones (Gusain et al. 2020; Rivera and Arnould 2019). Furthermore, the projections of all GCMs can be post-processed to better represent local scales. In particular, ensemble methods combine multiple models to produce improved results. Such multi-source methods take advantage of different models by reducing the effects of systematic biases, errors, and the sensitivity to the factors that decrease the accuracy of estimations (Bishop and Abramowitz 2013; Gong et al. 2022; Heidinger et al. 2012; Kim et al. 2020; Moradian et al. 2022; Pakdaman et al. 2022; Tebaldi and Knutti 2007; Todini 2001; Zhai et al. 2020).
In this paper, we analyzed the performance of a CMIP6 multi-model ensemble for Pr data over NE. We evaluated the overall performance of different CMIP6 models and analyzed future projections (up to 2100) under SSP1-2.6 and SSP5-8.5. The simulations were then statistically improved using a novel ensemble weighting method, and the capacity of the post-processed data to reproduce historical trends of climate events was investigated. Finally, the possible spatio-temporal changes of future Pr were explored across the Tana River Basin.

Study Area
Based on the classification devised by the United Nations geoscheme, NE covers the area between latitudes 50° and 75° and longitudes -25° and 35°, including Sweden (528,447 km²), Norway (385,207 km²), Finland (338,440 km²), the UK (243,610 km²), Iceland (103,000 km²), Ireland (84,421 km²), Lithuania (65,300 km²), Latvia (64,589 km²), Estonia (45,338 km²) and Denmark (42,951 km²) (UNSD 2019), with climates ranging from sub-arctic to humid continental. The present study was performed in NE, focusing on the Tana River Basin (Fig. 1). Since the spatial variation of Pr is very high (Han et al. 2021; Portmann et al. 2009; Shang et al. 2019; Zhang et al. 2009), the basin was chosen to provide a more detailed study. The basin (16,389 km²) is situated in the middle parts of Finnmark in Norway (69%) and the northern parts of Lapland in Finland (31%). Previous studies have indicated that Sub-Arctic and Arctic regions, such as the Tana basin, are expected to be influenced by climate change (Lotsari et al. 2010).
Mean temperatures range from -2.9 ºC to below 0 ºC. Temperatures are usually below 0 ºC from October to June. As in most sub-arctic environments, snowmelt in this basin is usually a rapid process. It is expected that changes in summer rainfall will be reinforced by changes in summer temperature and evapotranspiration. The vegetation of the basin includes pine forests, birch forests, tundra heaths, and bare rock (Lee et al. 2000; Seppa and Birks 2002).

Data and Method
Pr simulations were gathered from the runs of 12 GCMs from the CMIP6 ensemble (Table 1). The CMIP6 simulated data were assessed against GPCC (Global Precipitation Climatology Centre; Schneider et al. 2018) data at 1-degree resolution. The assessment and projection presented in this study are based on historical monthly Pr data from 1985 to 2014 and future outputs from 2071 to 2100 under the SSP1-2.6 and SSP5-8.5 scenarios. SSP1-2.6 is the IPCC's most optimistic scenario, in which the emission of greenhouse gases is cut to net zero around 2050, leading to a forcing pathway of 2.6 W m⁻² by 2100. In contrast, SSP5-8.5 represents an extreme scenario in which no policies are applied to global CO₂ emissions, resulting in a forcing pathway of 8.5 W m⁻² in 2100. Based on the performance of the models in the baseline period, an ensemble method was then applied, after which the paper focuses on the impacts of climate change over the Tana Basin. An overview of the proposed method follows.
First, to evaluate and predict the effects of climate change on Pr in NE, the following metrics were used in this study: (1) bias, (2) correlation coefficient (CC), (3) mean, (4) mean absolute error (MAE), (5) mean error (ME), (6) median, (7) centered root mean square difference (RMSD), (8) root mean squared error (RMSE), and (9) standard deviation (StD) (Duan et al. 2016; Moradian and Yazdandoost 2021; Taylor 2001).
Then, to improve the climate projections, an ensemble method was employed. Recent studies have shown that ensemble-averaged data generally offer better projections than individual models do (Becker et al. 2014; Christiansen 2018a, b, 2019; Danandeh 2021; Ma et al. 2015; Madadgar et al. 2016; Rahmani-Rezaeieh et al. 2020; Sajedi et al. 2020; Sheikh et al. 2020). The employed technique combines multiple models to produce improved results, originating from the idea that the greater the dispersion of a model's simulations, the more important that model becomes (Hwang and Yoon 1981; Shannon and Weaver 1947). In other words, the weighting scheme is inspired by the hypothesis that models with larger relative variability in their Pr time series should be given more weight, because such models are probably better able to predict changes in Pr values. This hypothesis is evaluated later by comparing the proposed method with the commonly used unweighted ensemble mean.
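The pairwise metrics listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `evaluation_metrics` and the sample values are hypothetical, bias is assumed here to mean relative bias in percent (the text does not give its formula), and standard deviations use the population form (division by n) as in Taylor (2001).

```python
import math

def evaluation_metrics(sim, obs):
    """Return verification metrics for paired simulated/observed Pr series."""
    if len(sim) != len(obs) or len(sim) < 2:
        raise ValueError("sim and obs must have the same length (>= 2)")
    n = len(sim)
    mean_s = sum(sim) / n
    mean_o = sum(obs) / n
    errors = [s - o for s, o in zip(sim, obs)]
    # Population standard deviations (divide by n).
    std_s = math.sqrt(sum((s - mean_s) ** 2 for s in sim) / n)
    std_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs) / n)
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(sim, obs)) / n
    # Centered RMSD: root mean square difference after removing each mean.
    rmsd = math.sqrt(
        sum(((s - mean_s) - (o - mean_o)) ** 2 for s, o in zip(sim, obs)) / n
    )
    return {
        "ME": sum(errors) / n,
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        # Assumption: bias expressed as relative bias in percent.
        "Bias": 100.0 * sum(errors) / sum(obs),
        "CC": cov / (std_s * std_o),
        "StD_sim": std_s,
        "StD_obs": std_o,
        "RMSD": rmsd,
    }

# Hypothetical monthly Pr values (mm), for illustration only.
obs = [50.0, 62.0, 41.0, 70.0, 55.0]
sim = [54.0, 60.0, 47.0, 66.0, 58.0]
metrics = evaluation_metrics(sim, obs)
```

The centered RMSD, the two standard deviations, and the CC are linked by the relation RMSD² = StD_sim² + StD_obs² − 2·StD_sim·StD_obs·CC, which is what allows a Taylor diagram to show all three on one plot.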
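Let the matrix contain m objectives of Pr data in each 1° cell and n alternatives of CMIP6 models (m × n elements; Eq. (1)):

$$\begin{bmatrix} Pr_{11} & \cdots & Pr_{1n} \\ \vdots & \ddots & \vdots \\ Pr_{m1} & \cdots & Pr_{mn} \end{bmatrix} \tag{1}$$

In this matrix, $Pr_{ij}$ represents the value of objective i (monthly data from 1985 to 2014) for alternative j (the CMIP6 model). The degree of diversification $d_j$ of the information provided by the outcomes of alternative j can be defined as Eq. (2):

$$d_j = 1 + K \sum_{i=1}^{m} P_{ij} \ln P_{ij} \tag{2}$$

where K is calculated from Eq. (3):

$$K = \frac{1}{\ln m} \tag{3}$$

and $P_{ij}$ is the normalized $Pr_{ij}$ (Eq. (4)):

$$P_{ij} = \frac{Pr_{ij}}{\sum_{i=1}^{m} Pr_{ij}} \tag{4}$$

The final weights $w_j$ are obtained as follows (Eq. (5)):

$$w_j = \frac{d_j}{\sum_{k=1}^{n} d_k} \tag{5}$$

In the final step, the performance of the proposed method was analyzed and a detailed evaluation of the effects of climate change on Pr was carried out over the Tana Basin based on the proposed technique.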
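The weighting steps described above can be sketched as follows. This is a minimal illustration, assuming that the scheme is the standard Shannon-entropy weighting of Hwang and Yoon (1981), in which the degree of diversification is one minus the normalized entropy of each model's series. The function names `entropy_weights` and `weighted_ensemble` and the sample values are hypothetical; in the study the procedure would be applied separately in each 1° cell.

```python
import math

def entropy_weights(pr):
    """Entropy-based weights for the columns (models) of a Pr matrix.

    pr[i][j] is the Pr value of month i for model j; values are assumed
    non-negative, with at least two rows and a positive column sum.
    """
    m = len(pr)
    n = len(pr[0])
    k = 1.0 / math.log(m)  # scales the entropy into [0, 1]
    diversification = []
    for j in range(n):
        column = [pr[i][j] for i in range(m)]
        total = sum(column)
        p = [x / total for x in column]  # normalized Pr for model j
        # Terms with p == 0 contribute nothing to the entropy.
        entropy = -k * sum(pij * math.log(pij) for pij in p if pij > 0)
        diversification.append(1.0 - entropy)
    d_sum = sum(diversification)
    # Assumption: at least one model has a non-uniform series (d_sum > 0).
    return [d / d_sum for d in diversification]

def weighted_ensemble(pr, weights):
    """Combine the models month by month using the given weights."""
    return [sum(w * x for w, x in zip(weights, row)) for row in pr]

# Hypothetical monthly Pr (mm): rows are months, columns are two models.
pr = [
    [48.0, 20.0],
    [52.0, 80.0],
    [50.0, 40.0],
    [50.0, 60.0],
]
weights = entropy_weights(pr)
ensemble = weighted_ensemble(pr, weights)
```

In this toy example the second model, whose series is more dispersed, receives nearly all of the weight, which mirrors the hypothesis stated in the text; whether that behavior is desirable is exactly what the comparison with the unweighted ensemble mean is meant to test.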

Evaluation of Raw CMIP6 Pr Simulations
In the first step, monthly area-averaged Pr data of the analyzed CMIP6 models were plotted against observational data in violin plots (Fig. 2). The plots depict the groups of simulated Pr data in both space and time through their quartiles, and the spacings between different parts of the plots indicate the degree of dispersion in the data (Thrun et al. 2020). The plots are a hybrid of kernel density plots and box plots, showing the peaks, mean and median of the data. They also illustrate the probability density of the data at different values, smoothed by a kernel density function (Hintze and Nelson 1998; Parzen 1962; Rosenblatt 1956). The findings show substantial inter-model variability over the studied area. The evaluation also shows that the CMIP6 models have the largest errors in areas with heavy Pr. Figure 3 offers a quantitative view of the CMIP6 model errors by showing the Taylor diagram of CMIP6 estimations against in situ observational data, using the CC along with the RMSD and StD (Taylor 2001). This figure clearly shows that none of the models can reproduce the Pr patterns apparent in the observed data. Results from this figure, along with previous research (Gusain et al. 2020; Rivera and Arnould 2019), emphasize the need for post-processing and improving the accuracy of the CMIP6 outputs before any further analysis.
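The kernel density smoothing that gives a violin plot its outline can be illustrated with a short sketch. It assumes a Gaussian kernel and a fixed, hand-picked bandwidth; the kernel and bandwidth actually used for Fig. 2 are not stated in the text, and the function name `gaussian_kde` and the sample values are hypothetical.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the probability density of samples."""
    if not samples or bandwidth <= 0:
        raise ValueError("need at least one sample and a positive bandwidth")
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum of Gaussian bumps centered on each sample.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

# Hypothetical monthly area-averaged Pr values (mm), for illustration only.
monthly_pr = [38.0, 45.0, 52.0, 55.0, 61.0, 70.0, 95.0]
density = gaussian_kde(monthly_pr, bandwidth=8.0)

# Evaluate the density on a grid; mirroring these values around a vertical
# axis gives the outline of a violin.
grid = [x * 0.5 for x in range(-40, 321)]  # -20 mm to 160 mm
outline = [density(x) for x in grid]
```

A wider bandwidth gives a smoother outline that can hide separate peaks, while a narrower one follows individual samples more closely.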
Since the study area has a diverse climatic pattern with a wide range of Pr values, comparing the estimated and observed Pr values across it is part of the evaluation process. To capture the spatial Pr patterns, Fig. 4 shows the distribution maps of monthly ME over the study area during the historical period. This figure, along with Fig. 3, indicates that all the CMIP6 models capture a similar spatial Pr pattern, but none of them reproduces the observed pattern accurately. Most of the models underestimate coastal Pr in Norway (ME < 0) and overestimate Pr in the UK. The outputs of the 12 CMIP6 models exhibit less systematic error in Finland, Sweden, and the eastern parts of the study area.
Fig. 2 Violin plots of monthly area-averaged Pr of the analyzed CMIP6 models versus observational data over NE (1985-2014)
Because bias is a major issue, Fig. 5 depicts the variation of monthly mean biases in space and by month through their quartiles in box plots. The plots also show outliers as individual points, indicating variability outside the upper and lower quartiles. The figure indicates that, in terms of bias, CESM2 best captures the variability of Pr during the study period. However, bias correction is still necessary to reduce the uncertainties. Figure 6 further shows the ratio of future Pr to observed Pr (P/P_obs) versus the ratio of future Pr to historical simulated Pr (P/P_his). In this figure, P_obs and P_his refer to GPCC data and to data derived from the CMIP6 models for the historical period, respectively, while P represents modeled data for either the historical period or the future period 2071-2100. The blue dots representing the historical data should lie on the bisector, which indicates that the mean of the modeled data is close to that of the observations.
Among the models, GFDL-ESM4, followed by HadGEM3-GC31-LL and GISS-E2-1-G, has this characteristic. Moreover, BCC-CSM2-MR, followed by CanESM5, CNRM-CM6-1, and FGOALS-g3, shows the largest changes in future Pr under SSP1-2.6 and SSP5-8.5. Additionally, this figure shows that the FGOALS-g3 model has the worst performance according to these criteria.
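The two ratios plotted in Fig. 6 can be computed as in the sketch below. The function name `ratio_point`, the axis order, and all numerical values are hypothetical and serve only to show how a historical point and a future point for one model would be obtained from mean Pr values.

```python
def ratio_point(p_mean, p_obs_mean, p_his_mean):
    """Return (P/P_obs, P/P_his) for one model and one period."""
    if p_obs_mean <= 0 or p_his_mean <= 0:
        raise ValueError("reference means must be positive")
    return (p_mean / p_obs_mean, p_mean / p_his_mean)

# Hypothetical mean monthly Pr values (mm) for a single model.
p_obs = 60.0      # observed (GPCC) mean over the historical period
p_his = 66.0      # modeled mean over the historical period
p_future = 69.3   # modeled mean over 2071-2100 under one scenario

historical_point = ratio_point(p_his, p_obs, p_his)
future_point = ratio_point(p_future, p_obs, p_his)
```

For the historical period P equals P_his, so the second ratio is exactly 1; the point falls on the bisector only when the first ratio is also 1, that is, when the modeled historical mean matches the observed mean.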
As CMIP6 outputs are not forced with all of the historical oceanic and climatic boundary conditions and are not expected to simulate observational data at high spatiotemporal scales, assessments are usually performed over a large spatial domain averaged over a long-term period. Figure 7 shows 30-year time series of area-averaged Pr over the
Concerning Figs. 2, 3, 4, 5, 6, 7 and 8, the results indicate that none of the raw CMIP6 models provided accurate magnitudes of Pr compared to the observations. Results from these figures, along with previous studies (Gusain et al. 2020; Lovino et al. 2018; Rivera and Arnould 2019), highlight the need for post-processing the CMIP6 outputs. Therefore, in this study, an ensemble method was employed to improve the climate forecasts.

Fig. 6 Future Pr changes to the observed ratio (P/P_obs) plotted against the future change to the historical ratio (P/P_his) over NE. This figure is inspired by Moshir et al. (2020)

Evaluation of the Proposed Multi-model Ensemble Estimations
To assess the performance of the proposed multi-model ensemble method, Table 2 summarizes the forecast verification results for the CMIP6 models and for the data derived from the proposed ensemble method. It evaluates the models based on the Bias, CC, MAE, ME, RMSE and StD. In this table, the top three models according to each evaluation criterion are shown in bold. As expected, the merged model never performed the worst in the study area and is among the best models in Bias, CC, MAE, RMSE and StD. It also produces better statistics than the simple 12-model mean. These results are consistent with previous studies that used different ensemble techniques (Christiansen 2018a, b; Donat et al. 2010; Hargreaves et al. 2007; Kumar et al. 2021; Sanderson et al. 2015; Yazdandoost and Moradian 2021); however, the method used here is simple, comprehensible, rational, and efficient. It can therefore be argued that the proposed method provides accurate forecasts, as it corrects the drawbacks and enhances the advantages of the individual sources for each 1° cell.
As previous studies have shown that the performance of climate models varies across temporal scales (Debebe and Mengistu 2020), we also compared the spatially averaged CMIP6 Pr simulations with observational data across different months (the right panel) and seasons (the left panel) in Fig. 9. In this figure, the lowest bar presents the observational data (GPCC). Based on the results of this figure, unlike the whole basin, the models in this area overestimated the amount of Pr. This figure also shows the superiority of the presented hybrid model over the Tana Basin. According to these results, the hybrid method is more reliable than the commonly used ensemble mean method (Ensemble Mean).

Future Changes in Pr over Tana River Basin
In the final step, a detailed evaluation of the effects of climate change on Pr was carried out over NE and the Tana River Basin based on the proposed technique. Since the study area has a wide range of Pr variability, we first focused on capturing the spatial variability of Pr. Figure 10 depicts the distribution maps of monthly Pr derived from the proposed ensemble method during the historical and forecast periods over NE and the Tana Basin. The model can simulate Pr in regions receiving substantial Pr (e.g., the UK and Ireland). We note that the model captures relatively high Pr changes over the study area between (Jylha et al. 2008). Figure 11 evaluates the changes in extreme climate, that is, situations in which the average conditions of a weather event are severe (IPCC 2007; Planton et al. 2008). According to the figure, from 2071 to 2100, more extreme Pr events will occur over the basin under SSP1-2.6 and SSP5-8.5 (see the tails of the distributions). Projected Pr data of the hybrid method reveal that the monthly mean Pr is expected to increase from 54.94 mm during 1985-2014 to 56.29 and 56.07 mm in 2071-2100 under SSP1-2.6 and SSP5-8.5, respectively. It also showed an impact from recent climate-related extremes, such as cyclones, droughts, floods, and heat waves; moreover, it suggested that combinations of extreme weather patterns will increase and, as a consequence, the combined effects of these extreme events will be more serious than the effects of individual events (Kaewunruen et al. 2018).
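The relative changes implied by the monthly means quoted above can be checked with a short calculation. The three Pr values are taken from the text; the function and variable names are illustrative only.

```python
def percent_change(future_value, baseline_value):
    """Relative change of a future value with respect to a baseline, in %."""
    return 100.0 * (future_value - baseline_value) / baseline_value

baseline_pr = 54.94  # mm, monthly mean for 1985-2014
future_pr = {"SSP1-2.6": 56.29, "SSP5-8.5": 56.07}  # mm, 2071-2100

changes = {
    scenario: round(percent_change(value, baseline_pr), 2)
    for scenario, value in future_pr.items()
}
# changes == {"SSP1-2.6": 2.46, "SSP5-8.5": 2.06}
```

These figures match the 2.46% and 2.06% changes reported in the abstract.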

Conclusion
In this study, the effects of climate change on Pr in NE were evaluated by employing 12 CMIP6 models, comparing SSP1-2.6 and SSP5-8.5 in 2071-2100 with the baseline period 1985-2014. For this purpose, based on the performance of the CMIP6 models in the historical period, an ensemble method was presented to correct the systematic error of the models. Then, the effects of climate change over the Basin were investigated using the data obtained from the proposed method. The results showed that the CMIP6 models differ in their accuracy in estimating Pr in the study area, and that the proposed ensemble method can be appropriate for increasing the accuracy of the forecasts.
The present study consisted of a high-level, large-scale assessment. The proposed algorithm for developing a merged Pr model can be strengthened and extended in several respects. For example, in this study the weighting process was based on the values of the Pr data; a sensitivity analysis could evaluate other possible parameters, such as statistical and categorical metrics, to improve the efficiency of the proposed method.