Comparison of CMIP6 and CMIP5 model performance in simulating historical precipitation and temperature in Bangladesh: a preliminary study

The relative performance of global climate models (GCMs) of phases 5 and 6 of the coupled model intercomparison project (CMIP5 and CMIP6, respectively) was assessed in this study based on their ability to simulate annual and seasonal mean rainfall and temperature over Bangladesh for the period 1977–2005. Multiple statistical metrics were used to measure the performance of the GCMs at 30 meteorological observation stations. Two robust multi-criteria decision analysis methods were used to integrate the results obtained using different metrics for an unbiased ranking of the GCMs. The results revealed MIROC5 as the most skillful among CMIP5 GCMs and ACCESS-CM2 among CMIP6 GCMs. Overall, CMIP6 MME showed a significant improvement in simulating rainfall and temperature over Bangladesh compared to CMIP5 MME. The highest improvements were found in simulating cold season (winter and post-monsoon) rainfall and temperature in higher elevated areas. The improvement was relatively more for rainfall than for temperature. The models could capture the interannual variability of annual and seasonal rainfall and temperature reliably, except for the winter rainfall. However, systematic wet and cold/warm biases still exist in CMIP6 models for Bangladesh. CMIP6 GCMs showed higher spatial correlations with observed data, but the higher difference in standard deviations and centered root mean square errors compared to CMIP5 GCMs indicates better performance in simulating geographical distribution but lower performance in simulating spatial variability of most of the climate variables except for minimum temperature at different timescales. In terms of Taylor skill score, the CMIP6 MME showed higher performance in simulating rainfall but lower performance in simulating temperature than CMIP5 MME for most of the timeframes. 
The findings of this study suggest that the added value of rainfall and temperature simulations in CMIP6 models is not consistent among the climate models used in this research. However, it sets a precedent for future research on climate change risk assessment for the scientific community.


Introduction
Appraisal of climate change impacts on precipitation and temperature has become essential due to increased climate-related extreme events such as floods and droughts. Global climate models (GCMs) are vital for climate change impact assessment. However, the major challenge in climate change projections and impact assessments is selecting an appropriate subset of GCMs. GCM simulations are associated with large uncertainties arising from different sources, including model resolution, mathematical formulation, initial assumptions, and calibration processes, which restrict the use of all GCMs for reliable projections of climate at the regional or local scale (Hijmans et al. 2005; Foley 2010; Chen et al. 2011; Northrop 2013; Khan et al. 2018a; Salman et al. 2018; Sun et al. 2018; Ahmed et al. 2019c). Therefore, selecting a subset of GCMs by removing the models that are less skilled in simulating the observed climate is suggested to minimize uncertainties in projection (Lutz et al. 2016; Lin and Tung 2017; Khan et al. 2018b; Salman et al. 2018; Ahmed et al. 2019b).
Previous studies also suggest GCM selection based on their performance in simulating the climate variable of interest to reduce the uncertainty in the projection of that variable (Gleckler et al. 2008; McMahon et al. 2015; Lutz et al. 2016; Sa'adi et al. 2017; Salman et al. 2018). The ability of GCMs to simulate different climatic parameters, such as surface mean temperature, precipitation, summer monsoon rainfall, and sea surface temperature, has been assessed in different regions of the world (Perkins et al. 2007; Maxino et al. 2008; Johnson et al. 2011). These studies revealed no generally recommended approach for GCM selection, and there are no well-established guidelines for selecting appropriate GCMs. However, a selected GCM is expected to replicate the mean, spatial variability, and distribution of the historical climate. It is also suggested that GCMs be selected based on their performance in simulating both rainfall and temperature, as both are equally required for most climate change studies (Ahmed et al. 2019a; Nashwan and Shahid 2020; Shiru et al. 2020).
GCM simulations disseminated through different phases of the coupled model intercomparison project (CMIP) are vital sources for quantitative climate projection over the twenty-first century (Baker and Huang 2014; Eyring et al. 2016). The CMIP phase 3 (CMIP3) GCM simulations (Meehl et al. 2007) were used to prepare the fourth assessment report of the IPCC (Solomon et al. 2007). The CMIP5 models were improved versions of the CMIP3 models in terms of physical processes and network accuracy (Taylor et al. 2012). Comparisons of CMIP3 and CMIP5 models showed better performance of CMIP5 GCMs in simulating the observed climate in many regions and the large-scale atmospheric circulations that define regional climate (Sperber et al. 2013; Ogata et al. 2014).
A new coordinated series of climate experiments has recently been carried out under the umbrella of phase 6 of CMIP. The CMIP6 GCMs differ from previous generations in many ways, including finer spatial resolutions, enhanced parameters of cloud microphysical processes, and additional Earth system processes and components such as biogeochemical cycles and ice sheets. The vital difference between CMIP5 and CMIP6 is the future scenario. CMIP5 projections are available based on 2100 radiative forcing values for four GHG concentration pathways (van Vuuren and Riahi 2011). In contrast, CMIP6 uses shared socioeconomic pathways (SSPs) together with the premises of the CMIP5 scenarios (O'Neill et al. 2014). Therefore, the SSPs are considered more realistic future scenarios (Song et al. 2021). Another vital update of CMIP6 is the development and support of model intercomparison focusing on biases, processes, and feedbacks in climate models (Heinze et al. 2019). Several studies have compared the performance of CMIP6 GCMs with that of CMIP5 GCMs in different regions (Rivera and Arnould 2020; Gusain et al. 2020). Both better and poorer performance of CMIP6 GCMs relative to their earlier versions in CMIP5 has been reported in simulating different climate variables and phenomena in different regions. Gusain et al. (2020) compared the performance of CMIP6 and CMIP5 models in simulating Indian summer monsoon rainfall and reported that the added value of CMIP6 models in summer rainfall simulation was inconsistent. Nie et al. (2020) showed that CMIP6 models provide more accurate measures of the magnitude of global temperature extremes compared to CMIP5. Rivera and Arnould (2020) showed the better capability of CMIP6 models in simulating declining precipitation and droughts in Southwestern South America. Wu et al. (2019) found important enhancements in CMIP6 models in simulating tropospheric air temperature and circulation in East Asia at global and regional levels and climatic variability at different time intervals, including the diurnal rainfall cycle, annual shifts in sea levels, and the long-term surface air temperature trend in the Pacific Ocean. Different studies also showed higher warming and greater sensitivity of CMIP6 GCMs compared to their previous versions (Tokarska et al. 2020; Zelinka et al. 2020). Overall, the studies suggest that the performance of CMIP6 GCMs relative to CMIP5 GCMs differs among regions. This can be attributed to the spatial variability of GCM uncertainty (Tiwari et al. 2014). Therefore, it is necessary to assess the ability of the newly released CMIP6 models to simulate the current climate and evaluate their performance relative to CMIP5 in different regions.
It remains unclear how well the new CMIP6 models simulate the climate response to anthropogenic forcing in Bangladesh. The comparison of CMIP6 and CMIP5 is important for the various sectors of this highly vulnerable country for which policymakers have been engaged in developing adaptation alternatives based on the climate change impacts assessed by the CMIP5 simulations. Any major improvement in the projection of the CMIP6 model relative to that of CMIP5 models will alter the probable impact and alternatives to adaptation (Shashikanth et al. 2014). However, the CMIP6 datasets have not been examined by any analysis to investigate precipitation and temperature changes in Bangladesh so far. Therefore, it is of great interest to systematically evaluate CMIP6 GCMs in climate simulation across Bangladesh and compare their performance with the previous generation of GCMs.
The motivation of this study is to compare the performance of CMIP6 models with their versions in CMIP5 in simulating precipitation and temperature climatology of Bangladesh for the period 1977-2005. The common GCMs from CMIP5 and CMIP6 were ranked based on their performance in replicating the annual and seasonal climatology to facilitate selecting suitable subsets of GCMs of CMIP5 and CMIP6 for climate change impact assessment in Bangladesh. The adaptation measures based on CMIP5 scenarios can be simplified with the new shared socioeconomic pathways (SSPs) scenarios. The performance assessment of CMIP6 models would also provide important information such as their biases for different climate variables in different timescales which are essential for making decisions on effective adaptation measures.

Study area
Bangladesh is located between latitude 20.34-26.38°N and longitude 88.01-92.41°E and is bordered by India on three sides (west, north, and northeast), Myanmar in the southeast, and the Bay of Bengal in the south. The country is a low-lying flood plain with three major river systems: the Ganges, the Brahmaputra, and the Meghna, commonly known as the GBM river system. The elevation of the country varies from near mean sea level (MSL) in the south to about 105 m above MSL in the north (Fig. 1). However, there are a few uplifted lands and hills in the northeast and southeast of the country. A warm and humid climate characterized by wide seasonal variation in rainfall dominates the country. Most of the rainfall (~70%) occurs during the monsoon (June to September). The rest of the rainfall is distributed between the pre-monsoon (March to May) and post-monsoon (October to November) seasons. The winter (December to February) is fairly dry. About 20% of the country is flooded annually due to the flat topography and heavy rainfall during the monsoon. The recorded inundation was as high as 70% of the total land in extreme cases, as in 1998. According to some reports, annual precipitation in most parts of Bangladesh will increase in the twenty-first century. The drought-prone northern area will see the greatest rise in rainfall; however, rainfall will decrease in the southwest (Kamruzzaman et al. 2019a). The mean annual temperature of Bangladesh is about 25°C (Kamruzzaman et al. 2018). A noticeable regional variation in rainfall and temperature is seen in Bangladesh, despite its location in a monsoon-dominated area (Khan et al. 2019). The annual rainfall varies from nearly 1600 mm in the northwest to more than 4000 mm in the northeast, and the mean temperature varies between 11 and 29°C in winter and between 21 and 34°C during summer (Kamruzzaman et al. 2019b).
Bangladesh frequently suffers from different kinds of natural disasters such as flash floods, monsoon floods, droughts, cyclone and storm surges, riverbank erosion, and urban floods. It is recognized globally as one of the most vulnerable countries to natural hazards and climate change.

Models, datasets, and analysis method
The study compared the performance of 11 GCMs of CMIP5 with their updated versions in CMIP6. The monthly simulation of rainfall (R), maximum temperature (Tmax), and minimum temperature (Tmin) of CMIP5 and CMIP6 GCMs were retrieved from the data portals of the Earth System Grid Federation (ESGF). For each model, only the historical realization was analyzed. The performance assessment was conducted for the period 1977-2005, considering the availability of observed data for that period. The list of GCMs and their developing organization is given in Table 1.
Rainfall and temperature data recorded at 35 in situ meteorological stations were collected from Bangladesh Meteorological Department (BMD). The common period of the collected data was 29 years, ranging from 1977 to 2005. After the quality control and homogeneity test, 30 stations were selected for the present study. The locations of the stations over the map of Bangladesh are shown in Figure 1. It can be observed that the stations are distributed over the country, and therefore, it can be considered that these 30 stations can well represent the climate of Bangladesh. Some missing values were observed in the collected dataset. However, the amount of missing data was <2%. The average values of the nearest three stations were used to replace the missing data.
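The gap-filling step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `fill_missing`, the use of planar distance in degrees, and the choice to take the three nearest stations that actually have data at that time step are all hypothetical details that the text does not specify.

```python
import math

def fill_missing(values, coords, k=3):
    """Fill None entries of `values` (one value per station, one time step)
    with the mean of the k nearest stations that have data.

    Assumptions (not from the study): planar distance in degrees, and
    neighbours with missing data are skipped.
    """
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        # Sort the stations with valid data by distance to station i.
        neighbours = sorted(
            (math.dist(coords[i], coords[j]), values[j])
            for j in range(len(values))
            if j != i and values[j] is not None
        )[:k]
        filled[i] = sum(val for _, val in neighbours) / len(neighbours)
    return filled

# Hypothetical station coordinates (lon, lat) and one month of rainfall.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (1.0, 1.0)]
values = [None, 10.0, 20.0, 100.0, 30.0]
filled = fill_missing(values, coords)
```

In this toy case the missing value at the first station is replaced by the mean of its three closest neighbours, and the distant fourth station is ignored.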

Methodology
The performance of the GCMs was evaluated based on their capability in reconstructing annual and seasonal R, Tmax, and Tmin climatology of Bangladesh for the period 1977-2005.
Observed data are available from 1977, while the CMIP5 hindcast is available until 2005. Therefore, the period 1977-2005 was selected for performance assessment. The GCM simulations were interpolated to the 30 observation locations using the inverse distance weighting method. The annual and seasonal means of GCM and observed R, Tmax, and Tmin for the period 1977-2005 were estimated at all 30 station locations. These values were compared using the Kling-Gupta efficiency (KGE) metric to assess the performance of the GCMs. For example, the performance of a GCM in simulating annual rainfall was evaluated by comparing the annual mean of GCM rainfall and the annual mean of observed rainfall at the 30 locations. Similarly, the performance of GCMs for all three climate variables (R, Tmax, and Tmin) for five timescales (annual and four seasons) was computed. Therefore, a total of 15 (3 variables × 5 timescales) KGE values were generated for each GCM to represent its performance. Ranking GCMs based on their performance in simulating multiple variables in different timeframes is challenging because a GCM may show various degrees of accuracy for different variables and timeframes. Therefore, multi-criteria decision analysis (MCDA) algorithms were used to generate a composite index from the 15 KGE values. In this study, two MCDA algorithms, the global performance indicator (GPI) (Despotovic et al. 2015) and the compromise programming index (CPI) (Raju and Kumar 2020), were used to avoid the bias that may arise from a single MCDA. The GCMs were then ranked in descending order of GPI and ascending order of CPI. The simple average of the ranks obtained using GPI and CPI was used to provide the final rank of the GCMs. Details of the methods used for performance evaluation and ranking of GCMs are presented in the following subsections.
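The interpolation of GCM grid values to a station location can be illustrated with a minimal inverse distance weighting sketch. The function name `idw`, the distance exponent of 2, and the use of planar distance in degrees are assumptions made for this example; the study does not report these settings.

```python
import math

def idw(grid_points, grid_values, target, power=2.0):
    """Inverse distance weighted estimate at `target` (lon, lat).

    grid_points: list of (lon, lat) GCM grid-cell centres.
    grid_values: values at those grid cells (same order).
    power: distance exponent (assumed to be 2 here).
    """
    num = 0.0
    den = 0.0
    for point, value in zip(grid_points, grid_values):
        d = math.dist(point, target)
        if d == 0.0:
            return value  # the station coincides with a grid point
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den

# Two hypothetical grid cells and three hypothetical station positions.
grid = [(0.0, 0.0), (2.0, 0.0)]
vals = [10.0, 20.0]
midway = idw(grid, vals, (1.0, 0.0))
on_cell = idw(grid, vals, (0.0, 0.0))
closer = idw(grid, vals, (0.5, 0.0))
```

A station halfway between the two cells receives their plain average, while a station closer to the first cell is pulled towards that cell's value.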

Kling-Gupta efficiency
The KGE (Gupta et al. 2009; Kling et al. 2012) is an objective statistical metric that uses three measures, correlation, bias, and variability, to assess the similarity between two datasets. The multi-component nature of KGE makes it a composite index that can be used alone for a more holistic and balanced goodness-of-fit evaluation (Koch et al. 2018). KGE is expressed as follows:

$$\mathrm{KGE} = 1 - \sqrt{(r-1)^2 + (\beta-1)^2 + (\gamma-1)^2}$$

$$\beta = \frac{\mu_s}{\mu_o}, \qquad \gamma = \frac{\sigma_s/\mu_s}{\sigma_o/\mu_o}$$

where r is Pearson's correlation between the GCM simulation (s) and observed data (o), β represents the bias, expressed as the ratio of the simulated to the observed mean, γ is the ratio of the coefficients of variation reflecting spatial variability, and μ and σ represent the mean and standard deviation of the GCM simulation (s) and observed data (o), respectively. The values of KGE vary between −∞ and 1, where 1 indicates a perfect agreement. The KGE is a robust metric and is also commonly used for spatial assessment (Zambrano-Bigiarini et al. 2017; Ahmed et al. 2019a; Nashwan et al. 2019; Nashwan and Shahid 2020).
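As an illustration, the KGE between simulated and observed climatological means at the station locations could be computed as in the sketch below. It assumes the Kling et al. (2012) form, KGE = 1 − sqrt((r − 1)² + (β − 1)² + (γ − 1)²), with β the ratio of means and γ the ratio of coefficients of variation. The function name `kge` and the use of population standard deviations are choices made for this example only.

```python
import math

def kge(sim, obs):
    """Kling-Gupta efficiency (Kling et al. 2012 form, assumed here)."""
    n = len(obs)
    mu_s = sum(sim) / n
    mu_o = sum(obs) / n
    sd_s = math.sqrt(sum((s - mu_s) ** 2 for s in sim) / n)
    sd_o = math.sqrt(sum((o - mu_o) ** 2 for o in obs) / n)
    cov = sum((s - mu_s) * (o - mu_o) for s, o in zip(sim, obs)) / n
    r = cov / (sd_s * sd_o)                   # Pearson correlation
    beta = mu_s / mu_o                        # bias ratio
    gamma = (sd_s / mu_s) / (sd_o / mu_o)     # ratio of coefficients of variation
    return 1.0 - math.sqrt((r - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)

obs = [1.0, 2.0, 3.0]
perfect = kge(obs, obs)                # identical fields
doubled = kge([2.0, 4.0, 6.0], obs)    # simulation is twice the observation
```

A simulation that doubles every observed value keeps r = 1 and γ = 1 but has β = 2, so its KGE drops from 1 to 0 purely because of the mean bias.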

Global performance indicator
GPI (Despotovic et al. 2015) combines the effects of individual statistical indicators to provide a single measure. The GPI has been used in many other fields as an effective MCDA tool (Behar et al. 2015; Despotovic et al. 2015). It is calculated from the distance between the normalized value of a performance indicator and the median of the normalized values of the same indicator:

$$\mathrm{GPI}_i = \sum_{j=1}^{n} \alpha_j \left(\tilde{y}_j - y_{i,j}\right)$$

where ỹ_j is the median of the normalized values of the performance indicator j, n is the number of performance indicators, y_{i,j} is the normalized value of the performance indicator j for the model i, and α_j is a sign coefficient, equal to −1 for indicators such as KGE for which a larger value means better agreement. A higher value of GPI indicates better performance.
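A rough sketch of a GPI calculation is given below. It assumes the form GPI_i = Σ_j α_j (ỹ_j − y_{i,j}) with α_j = −1 for every indicator (appropriate when all indicators are KGE values, for which larger is better) and min-max normalization of each indicator across models. The normalization scheme and the function name `gpi` are assumptions of this example, not details stated in the text.

```python
from statistics import median

def gpi(scores, alpha=-1.0):
    """Global performance indicator for each model.

    scores[i][j] is indicator j (e.g. one KGE value) for model i.
    alpha = -1 assumes larger indicator values are better.
    """
    n_ind = len(scores[0])
    cols = [[row[j] for row in scores] for j in range(n_ind)]
    # Min-max normalisation of each indicator across models (assumed scheme).
    norm_cols = []
    for col in cols:
        lo, hi = min(col), max(col)
        norm_cols.append([(v - lo) / (hi - lo) if hi > lo else 0.0 for v in col])
    medians = [median(col) for col in norm_cols]
    return [
        sum(alpha * (medians[j] - norm_cols[j][i]) for j in range(n_ind))
        for i in range(len(scores))
    ]

# Three hypothetical models, two indicators each.
kge_table = [[1.0, 0.5], [0.5, 0.25], [0.0, 0.0]]
gpi_values = gpi(kge_table)
```

The model whose indicators sit above the median for every indicator receives the largest GPI, and the model at the median receives zero.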

Compromise programming index
The CPI, like GPI, combines multiple performance metrics into a single metric, but in a different way. It is calculated from the distance between the normalized value of a performance indicator and the ideal value of the same indicator (Raju and Kumar 2014):

$$\mathrm{CPI} = \left[\sum_{j=1}^{n} \left|x^{1}_{j} - x^{*}_{j}\right|^{p}\right]^{1/p}$$

where j denotes the statistical index, n is the number of indices, x^1_j is the normalized value of index j, x^*_j is the normalized ideal value of index j, and p is the distance parameter, which was set to 1 in this study to measure the linear distance from the ideal value. A lower value of CPI indicates better performance of a GCM.
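The CPI and the final rank averaging can be sketched as follows. The code assumes CPI = [Σ_j |x_j − x*_j|^p]^(1/p) with p = 1, takes already normalized indicator values as input, and uses an ideal normalized value of 1 for every indicator. The helper names (`cpi`, `ranks`, `final_rank`) and the simple tie handling are hypothetical choices for illustration.

```python
def cpi(norm_scores, ideal, p=1):
    """Compromise programming index: L_p distance from the ideal point."""
    return [
        sum(abs(x - x_star) ** p for x, x_star in zip(row, ideal)) ** (1.0 / p)
        for row in norm_scores
    ]

def ranks(values, descending=False):
    """Rank 1 = best. Ties are broken by input order (no tie averaging)."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=descending)
    out = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        out[i] = rank
    return out

def final_rank(gpi_values, cpi_values):
    """Average of the GPI rank (descending) and the CPI rank (ascending)."""
    r_gpi = ranks(gpi_values, descending=True)
    r_cpi = ranks(cpi_values)
    return [(a + b) / 2 for a, b in zip(r_gpi, r_cpi)]

# Hypothetical normalized indicators for three models and their GPI values.
norm = [[1.0, 1.0], [0.5, 0.5], [0.0, 0.0]]
cpi_values = cpi(norm, ideal=[1.0, 1.0])
combined = final_rank([1.0, 0.0, -1.0], cpi_values)
```

When both indices agree on the ordering, as in this toy case, the averaged rank simply reproduces it; disagreement between GPI and CPI would show up as fractional average ranks.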

Relative performance of CMIP5 and CMIP6 GCMs
Multi-model ensembles (MMEs) of CMIP6 and CMIP5 GCMs were prepared by a simple averaging method. The performance of individual GCMs and their MMEs for CMIP5 and CMIP6 were compared to show the relative performance of the GCMs of those two intercomparison projects. Taylor diagram (Taylor 2001) was prepared for visual presentation of relative performance. The Taylor diagram provides a concise statistical summary of the degree of correlation (spatial correlation coefficient (SCC)), centered root-mean-square error (CRMSE), and the ratio of spatial standard deviation (SD) and, thus, provides a composite comparison of model performance.
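The simple-average MME and the three statistics summarized on a Taylor diagram can be illustrated with the short sketch below. The function names `mme` and `taylor_stats` are hypothetical, population standard deviations are assumed, and each input list is taken to hold one climatological value per station.

```python
import math

def mme(models):
    """Unweighted multi-model mean at each station."""
    return [sum(vals) / len(vals) for vals in zip(*models)]

def taylor_stats(sim, obs):
    """Return (SCC, SD ratio, CRMSE) of a simulated field against observations."""
    n = len(obs)
    mu_s = sum(sim) / n
    mu_o = sum(obs) / n
    ds = [s - mu_s for s in sim]   # simulated anomalies
    do = [o - mu_o for o in obs]   # observed anomalies
    sd_s = math.sqrt(sum(d * d for d in ds) / n)
    sd_o = math.sqrt(sum(d * d for d in do) / n)
    scc = sum(a * b for a, b in zip(ds, do)) / (n * sd_s * sd_o)
    crmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(ds, do)) / n)
    return scc, sd_s / sd_o, crmse

# Two hypothetical models at three stations, and a toy observed field.
ensemble = mme([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]])
scc, sd_ratio, crmse = taylor_stats([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
```

The three statistics are linked by CRMSE² = SD_s² + SD_o² − 2·SD_s·SD_o·SCC, which is what allows them to be shown together on a single diagram.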
The quantitative assessment of the relative performance of CMIP5 and CMIP6 models was done using the Taylor skill score (Taylor 2001), which is computed from r and SDR, where r represents the correlation between model simulation and observation and SDR is the ratio of the standard deviations (SD) of model simulation and observation.
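The exact expression of the skill score used in the study is not reproduced in the text above. Purely as an illustration, the sketch below uses one of the forms proposed by Taylor (2001), S = 4(1 + r) / [(SDR + 1/SDR)² (1 + r0)], with the maximum attainable correlation r0 assumed to be 1. Both the choice of this form and the value of r0 are assumptions, and the function name is hypothetical.

```python
def taylor_skill_score(r, sdr, r0=1.0):
    """Skill score from correlation r and SD ratio sdr.

    Uses S = 4(1 + r) / ((sdr + 1/sdr)^2 (1 + r0)); this form and
    r0 = 1 are assumptions, not the study's documented settings.
    """
    return 4.0 * (1.0 + r) / ((sdr + 1.0 / sdr) ** 2 * (1.0 + r0))

best = taylor_skill_score(1.0, 1.0)     # perfect pattern and variability
worst = taylor_skill_score(-1.0, 1.0)   # perfectly anti-correlated
too_variable = taylor_skill_score(1.0, 2.0)
```

Under this form, the score equals 1 only when the correlation is perfect and the simulated and observed standard deviations match, and it is penalized symmetrically for SD ratios above or below 1.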

Evaluation of GCMs
The KGEs estimated for CMIP5 and CMIP6 models in simulating annual, pre-monsoon, monsoon, post-monsoon, and winter rainfall (R), maximum temperature (Tmax), and minimum temperature (Tmin) climatology in Bangladesh are presented in Table 2. The performance of the GCMs varied significantly among climate variables and timescales. For example, ACCESS1-0 of CMIP5 performed best in simulating annual Tmin and Tmax, while MIROC5 performed best in simulating annual R. MIROC5 also performed best in simulating monsoon R but performed poorly in simulating pre-monsoon R. A similar disparity in the performance of CMIP6 GCMs can also be observed in Table 2. A large disparity between the performance of a model in CMIP5 and that of its CMIP6 version was also noticed. For example, MIROC5 performed best among the CMIP5 GCMs in simulating observed rainfall, but its CMIP6 version, MIROC6, was ranked third among the CMIP6 GCMs. This inconsistency between CMIP5 and CMIP6 was larger in simulating seasonal mean climatology than annual mean climatology. It was not possible to compare the relative performance of different GCMs directly due to the large variability in their performance in simulating R, Tmax, and Tmin for different timescales. Therefore, GPI and CPI were used to generate composite metrics and rank the GCMs.

Ranking of GCMs
The estimated GPI and CPI values for each of the CMIP5 and CMIP6 GCMs are presented in Table 3. The ranking of the GCMs based on average values of GPI and CPI is also presented in the table. As the GCMs were ranked based on composite indices, the ranks can be considered to indicate their performance in reproducing the spatial characteristics of all climate variables for all timeframes. The results revealed MIROC5 as the most skillful among the CMIP5 GCMs and ACCESS-CM2 as the most skillful among the CMIP6 GCMs. On the other hand, IPSL-CM5A-LR and MPI-ESM1-2-LR showed the poorest performance among the CMIP5 and CMIP6 GCMs, respectively.
Fig. 2 and Fig. S1-S4 compare the annual and seasonal precipitation climatology simulated by the CMIP5 and CMIP6 MMEs with the observed climatology. The observed rainfall (both annual and seasonal, except winter) in Bangladesh is highest in the east and gradually decreases to the west (Fig. 2a, S1a, S2a, S3a, S4a). This spatial feature was reasonably reproduced by both CMIP5 and CMIP6 MMEs (Fig. 2, S1-4 (b, c)). However, underestimation of annual, pre-monsoon, and post-monsoon rainfall and overestimation of monsoon and winter rainfall, especially in the hilly eastern region, were noticed (Fig. 2, S1-4 (b, c)).

The spatial patterns of the biases of CMIP5 and CMIP6 MMEs were almost similar. The dominant rainfall underestimation was in the northeastern and southeastern hilly areas. This indicates that the effect of high topography on precipitation is still a challenge in climate modeling. In CMIP5, the underestimation in those regions for annual, monsoon, pre-monsoon, post-monsoon, and winter precipitation was over 6 mm·day⁻¹, 3 mm·day⁻¹, 1 mm·day⁻¹, 0.4 mm·day⁻¹, and 0.1 mm·day⁻¹, respectively, which was higher than the underestimation in CMIP6 models, particularly in the eastern hilly areas (Fig. 2, S1-4 (d, e)).
CMIP6 MME showed improvement compared to CMIP5 MME in terms of bias in annual, pre-monsoon, and post-monsoon rainfall in most of the country. However, the bias in CMIP5 MME for monsoon and winter was higher than that in CMIP6 MME. Notably, an improvement was observed in CMIP6 over CMIP5 in simulating the spatial variability of mean rainfall over the areas receiving high rainfall. A significant improvement in CMIP6 models and their MME has been reported over Central and North India by Jain et al. (2019). Most CMIP5 models underestimated high rainfall over these areas.
The performance of each GCM and MME for both CMIP5 and CMIP6 in reproducing observed rainfall is presented using the Taylor diagram in Fig. 3. The figure shows a wide spread among models for both CMIP6 and CMIP5 and, therefore, large variability in bias and RMSE. The SCC of CMIP6 MME for annual rainfall was 0.61 (Fig. 3a), whereas it was 0.33 for CMIP5 MME. Overall, CMIP6 models showed a better ability to simulate the spatial pattern of annual precipitation in Bangladesh. The SCCs of CMIP6 MME for pre-monsoon (0.57), monsoon (0.62), and post-monsoon (0.74) were greater than those of CMIP5 MME. However, it was inferior for winter (0.22). The SCC intervals for different models were relatively consistent for pre-monsoon, monsoon, post-monsoon, and summer. However, most models showed a poor correlation in winter. This may be due to low and erratic rainfall during winter, which is often difficult to capture in climate models.

Fig. 3. Taylor diagrams for precipitation (axes: correlation coefficient, standard deviation, and centered RMS difference); models are labeled as in Table 1
Compared to CMIP5 GCMs, the SCCs of CMIP6 GCMs were higher, but their SDs and CRMSEs were further away from the observation (Fig. 3). This indicates a relative superiority of CMIP6 GCMs in reproducing geographical distribution but inferiority in simulating spatial variability. A higher SD is associated with more extreme precipitation events (Mohsenipour and Shahid 2018; Attogouinon et al. 2020). Since CMIP5 MME has a smaller SD than CMIP6 MME except for pre-monsoon, the likelihood of extreme precipitation events is higher in CMIP6 GCMs.
Fig. 4 shows the Taylor skill scores of CMIP5 and CMIP6 models in reproducing annual and seasonal precipitation. The skill score varied with season. However, the skill scores of CMIP6 models were higher than those of their previous versions in CMIP5 for all the seasons except winter.
The number of CMIP6 GCMs showing a better (poorer) score than their corresponding CMIP5 versions was 9 (2) for annual and monsoon, 7 (4) for pre-monsoon, and 5 (6) for winter (Fig. 4). ACCESS-CM2 (0.99) and INM-CM5-0 (0.99) for annual, ACCESS-CM2 (1.00) and ACCESS-ESM1-5 (1.00) for monsoon, ACCESS-CM2 (0.98) for pre-monsoon, CanESM5 (1.00) for post-monsoon, and CNRM-CM6-1 (0.94) and INM-CM5-0 (0.94) for winter were the highest performing models of CMIP6. The most remarkable improvement among the models was found for IPSL-CM6A-LR (0.82) relative to IPSL-CM5A-LR (0.01) in the pre-monsoon season. The skill scores of CMIP6 MME were 0.78 for post-monsoon and 0.54 for winter, which were larger than those of CMIP5 MME (0.50 and 0.19, respectively), indicating the better performance of CMIP6 MME compared to CMIP5 MME. Almost similar scores were found for annual (0.37) and monsoon (0.35) timeframes. However, CMIP6 MME showed a lower score than CMIP5 MME for pre-monsoon. Based on a fair comparison of GCMs produced by the same modeling group, the precipitation simulation of the GCMs improved slightly from CMIP5 to CMIP6, suggesting improvement in intrinsic key physics schemes (Wu et al. 2019).
Fig. 4 Skill scores of the climatology of (a) annual, (b) monsoon, (c) pre-monsoon, (d) post-monsoon, and (e) winter precipitation in each model and the MME from CMIP5 and CMIP6 models
Fig. 5 and Fig. S5-S8 present the Tmax climatology simulated by CMIP5 and CMIP6 MMEs and their biases in annual and seasonal timeframes. The annual and seasonal Tmax is highest over the western part of Bangladesh and gradually decreases to the east (Fig. 5a, S5a, S6a, S7a, S8a), except for the winter season. In winter, the maximum temperature is higher in the north and lower in the south. Both the CMIP5 and CMIP6 MMEs were able to reproduce this spatial attribute of Tmax reasonably. However, CMIP6 MME underestimated Tmax during the post-monsoon and winter seasons and overestimated annual, monsoon, and pre-monsoon Tmax in most parts of the country except for the high elevated areas (Fig. 5, S5-8 (b, c)).

The Tmax biases in CMIP5 and CMIP6 MMEs were dominant in the northwest, northeast, and southeastern hilly areas. The underestimations in CMIP5 MME for annual (0.98-2.98°C), monsoon (0.36-1.89°C), pre-monsoon (0.03-2.52°C), post-monsoon (1.13-4.02°C), and winter (0.93-4.3°C) Tmax were higher than those in CMIP6 MME, particularly in eastern hilly areas (Fig. 5, S5-8 (d, e)). The highest negative bias was found in cold seasons (winter and post-monsoon) due to cold bias in high elevated areas. Overall, the results indicate greater reproducibility of annual, post-monsoon, and winter Tmax in most areas by CMIP6 MME compared to CMIP5 MME.
Fig. 6 shows the performance of each GCM and MME of both CMIP5 and CMIP6 on the Taylor diagram. The CMIP6 models performed well in simulating the spatial pattern of seasonal Tmax. The SCC of CMIP6 MME (CMIP5 MME) was 0.67 (0.63), 0.77 (0.65), 0.73 (0.73), 0.28 (0.30), and 0.59 (0.53) for annual, pre-monsoon, monsoon, post-monsoon, and winter, respectively. This indicates a greater or similar performance of CMIP6 models compared to CMIP5 models. For individual models, the correlation coefficients ranged from 0.23 to 0.72 for annual, 0.31 to 0.82 for monsoon, 0.31 to 0.76 for pre-monsoon, −0.04 to 0.42 for post-monsoon, and −0.07 to 0.79 for winter. The ratio of simulated to observed SD was larger than 1 for both CMIP6 and CMIP5 MMEs. This implies that the models overestimated the annual and seasonal variabilities of Tmax. This ratio in CMIP6 MME was highest for monsoon (2.22) and lowest for post-monsoon (1.15). All the models also showed larger variability in monsoon Tmax, with ratios ranging from 1.51 to 4.14. Compared to CMIP5 GCMs, the SCCs of CMIP6 GCMs were higher, but their SDs and CRMSEs were further away from observation (Fig. 6). This indicates a relative superiority of CMIP6 GCMs in reproducing the geographical distribution of Tmax but inferiority in simulating the spatial variability of Tmax.
Fig. 7 presents the skill scores of CMIP5 and CMIP6 models in reproducing annual and seasonal Tmax. The skill scores of CMIP6 models and MME were lower than those of their corresponding CMIP5 models. The number of CMIP6 GCMs showing a better (poorer) score than their CMIP5 parents was 2 (9) for annual, 3 (8) for post-monsoon, 4 (7) for monsoon, 3 (7) for pre-monsoon, and 2 (9) for winter. Among the CMIP6 models, the greatest improvement was noticed for INM-CM5-0 and MRI-ESM2-0 for all the timeframes.
Fig. 8 and Fig. S9-S12 compare the Tmin climatology simulated by CMIP5 and CMIP6 MMEs with the observed climatology at annual and seasonal timescales. The observed annual and seasonal Tmin is highest in the south and decreases gradually towards the north, except for monsoon (Fig. 8a, S9a, S10a, S11a, S12a). This spatial feature was reasonably reproduced by both the CMIP5 and CMIP6 MMEs. However, CMIP5 MME underestimated and CMIP6 MME overestimated annual, pre-monsoon, and monsoon Tmin, while both CMIP5 and CMIP6 MMEs underestimated post-monsoon and winter Tmin over most of the country (Fig. 8, S9-12 (d, e)). The highest negative bias (less than 5°C) was found for winter and post-monsoon seasons in high elevated areas in CMIP5 MME. Notably, an improvement was observed in CMIP6 MME over CMIP5 MME in simulating Tmin for annual and seasonal timeframes. The highest improvements were found in the cold season (winter and post-monsoon) in high elevated areas. This implies that CMIP6 MME reduces the cold bias compared to CMIP5 MME. Overall, the results indicate greater reproducibility of annual and seasonal Tmin in most of the areas by CMIP6 MME compared to CMIP5 MME.

In the northwest of the country, biases in CMIP5 and CMIP6 MMEs were found to be dominant. However, the positive bias persisted in the southwest for CMIP6 MME.
In the northwestern regions, the Tmin estimated by CMIP5 MME for post-monsoon and winter was more than 5°C lower than that estimated by CMIP6 MME.
Fig. 6. As in Fig. 3, but for maximum temperature
Fig. 9 shows the performance of each GCM and the MMEs of both CMIP5 and CMIP6 in reproducing observed Tmin based on the Taylor diagram. The SCCs of CMIP6 MME (CMIP5 MME) were 0.74 (0.78), 0.77 (0.83), 0.70 (0.69), 0.50 (0.54), and 0.69 (0.66) for annual, pre-monsoon, monsoon, post-monsoon, and winter, respectively. This indicates better performance of CMIP6 GCMs compared to CMIP5 GCMs for all seasons except for winter. The SCCs of the individual models ranged from 0.33 to 0.61 for annual, −0.15 to 0.70 for monsoon, 0.34 to 0.71 for pre-monsoon, 0.19 to 0.59 for post-monsoon, and 0.22 to 0.63 for winter. The interval of annual and seasonal SCCs for CMIP6 was narrower than that for CMIP5. The SCC intervals for different models were relatively consistent for pre-monsoon, monsoon, post-monsoon, and summer. However, most models showed poor SCC in post-monsoon.
The simulated SDs of both CMIP6 and CMIP5 models were larger than the observed SD for all timescales, implying that the models overestimated annual and seasonal Tmin variabilities. Larger variability was noticed for monsoon compared to other seasons. The SCCs, SDs, and CRMSEs were further away from observation in CMIP6 GCMs compared to CMIP5 GCMs (Fig. 9). This indicates the inferior performance of CMIP6 GCMs compared to CMIP5 GCMs in replicating both geographical distribution and spatial variability.
Fig. 10 presents the skill scores of CMIP5 and CMIP6 models in reproducing annual and seasonal Tmin. The skill scores of CMIP6 MME were lower than those of CMIP5 MME for all seasons except post-monsoon. The number of CMIP6 GCMs showing a better (poorer) skill score than their CMIP5 parents was 2 (9) for annual, post-monsoon, and monsoon and 3 (8) for pre-monsoon and winter. Among the CMIP6 models, the most significant improvement in simulating Tmin was observed for INM-CM5-0 and MRI-ESM2-0.
Fig. 7. As in Fig. 4, but for maximum temperature

Fig. 9. As in Fig. 3, but for minimum temperature (Taylor diagram axes: correlation coefficient, standard deviation, centered RMS difference)
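The three Taylor diagram statistics compared above (SCC, SD, and CRMSE) and a Taylor skill score can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes population (divide-by-n) standard deviations across stations and uses one of the skill score forms proposed by Taylor (2001), with `r0` (the maximum attainable correlation) set to 1 by default. The study may use a different form of the score, and the station values are hypothetical.

```python
import math


def taylor_stats(model, obs):
    """Return (scc, sd_model, sd_obs, crmse) for paired station values."""
    n = len(obs)
    if n != len(model) or n < 2:
        raise ValueError("model and obs must be paired and have at least 2 values")
    mean_m = sum(model) / n
    mean_o = sum(obs) / n
    # Anomalies relative to each field's own spatial mean ("centered" values).
    dm = [m - mean_m for m in model]
    do = [o - mean_o for o in obs]
    sd_m = math.sqrt(sum(d * d for d in dm) / n)
    sd_o = math.sqrt(sum(d * d for d in do) / n)
    scc = sum(a * b for a, b in zip(dm, do)) / (n * sd_m * sd_o)
    crmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(dm, do)) / n)
    return scc, sd_m, sd_o, crmse


def taylor_skill_score(scc, sd_m, sd_o, r0=1.0):
    """Assumed form: S = 4(1 + R) / ((s + 1/s)^2 (1 + R0)), with s = sd_m / sd_o."""
    ratio = sd_m / sd_o
    return 4.0 * (1.0 + scc) / ((ratio + 1.0 / ratio) ** 2 * (1.0 + r0))


# Hypothetical Tmin climatology (degrees C) at five stations.
obs_tmin = [20.1, 21.4, 19.8, 22.0, 20.7]
sim_tmin = [21.0, 22.9, 20.2, 23.5, 21.1]

scc, sd_m, sd_o, crmse = taylor_stats(sim_tmin, obs_tmin)
score = taylor_skill_score(scc, sd_m, sd_o)
print(f"SCC={scc:.2f} SD_model={sd_m:.2f} SD_obs={sd_o:.2f} "
      f"CRMSE={crmse:.2f} skill={score:.2f}")
```

The three statistics are linked by CRMSE² = SD_model² + SD_obs² − 2·SD_model·SD_obs·SCC, which is why a model can have a higher SCC and still sit farther from the observation point on the diagram when its SD departs more from the observed SD.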

Discussions
The performance of CMIP5 and CMIP6 GCMs in simulating rainfall and temperature at annual, pre-monsoon, monsoon, post-monsoon, and winter timescales over Bangladesh for the period 1977-2005 was evaluated in this study. Both the CMIP5 and CMIP6 MMEs reproduced the spatial pattern of rainfall climatology in Bangladesh reasonably well. However, both MMEs underestimated annual, pre-monsoon, and post-monsoon rainfall and overestimated monsoon and winter rainfall, especially in the hilly eastern region. The coarse resolution of GCMs does not capture the orographic effects and local landmass changes that affect spatial variability and rainfall distribution (Shashikanth et al. 2014; Jain et al. 2019). Therefore, high-resolution climate information, obtained through spatial downscaling techniques, is needed for practical applications over Bangladesh. The wet bias in annual precipitation was higher over the northeastern hilly regions of the country. The models simulated higher than observed rainfall in the northeastern region during monsoon and winter, which is a possible cause of the wet bias in annual rainfall. Most of the CMIP6 and CMIP5 models simulated the southwest monsoon signals and the easterly wind flows from the Bay of Bengal in winter and monsoon. However, precipitation uncertainties in monsoon and winter were larger than those in pre-monsoon and post-monsoon due to the easterly wind system and the orographic effect.
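The wet and dry biases of an MME described above can be illustrated with a short sketch. It assumes an equal-weight ensemble mean across models at each station and defines bias as simulated minus observed; the study's exact MME construction is not restated here, and the model names and rainfall values are hypothetical.

```python
def ensemble_mean(simulations):
    """Equal-weight multi-model mean at each station (an assumption)."""
    members = list(simulations.values())
    if not members:
        raise ValueError("at least one model is required")
    n_stations = len(members[0])
    if any(len(m) != n_stations for m in members):
        raise ValueError("all models must cover the same stations")
    return [sum(m[i] for m in members) / len(members) for i in range(n_stations)]


def bias(simulated, observed):
    """Station-wise bias: positive values mean the simulation is too wet (or warm)."""
    return [s - o for s, o in zip(simulated, observed)]


def percent_bias(simulated, observed):
    """Aggregate bias over all stations as a percentage of the observed total."""
    return 100.0 * (sum(simulated) - sum(observed)) / sum(observed)


# Hypothetical monsoon rainfall (mm) at three stations for two models.
simulations = {
    "model_a": [1500.0, 1800.0, 2600.0],
    "model_b": [1700.0, 2000.0, 3000.0],
}
observed = [1500.0, 1900.0, 2500.0]

mme = ensemble_mean(simulations)
print("MME:", mme)
print("Station bias (mm):", bias(mme, observed))
print("Percent bias:", round(percent_bias(mme, observed), 2))
```

Station-wise biases keep the spatial detail (for example, a wet bias concentrated at high-rainfall stations), whereas the percent bias summarizes the whole network in one number.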
Overall, an improvement in CMIP6 over CMIP5 was observed in simulating the spatial variability of mean precipitation over the high rainfall receiving areas. This result is consistent with the finding of Jain et al. (2019) for India. They reported a significant improvement over Central and North India in CMIP6 models and their MME, where the majority of CMIP5 models underestimated high precipitation. Compared to CMIP5 GCMs, the SCCs of CMIP6 GCMs were higher (except in winter), but the SDs and CRMSEs were further from the observations (Fig. 3). This indicates a relative superiority of CMIP6 GCMs in reproducing the geographical distribution but inferiority in simulating spatial variability. Duan et al. (2013) indicated that the incorporation of sulfate aerosol indirect effects could enhance the ability of CMIP6 models to simulate the annual precipitation cycle, but a large bias remains. Considering the systematic model biases, more caution should be exercised in using GCM projections for impact assessment over Bangladesh.
Fig. 10. As in Fig. 4, but for minimum temperature
Cold bias was a common error for most models in the previous generations of CMIP (IPCC 2013; Guo et al. 2013). It was also found to exist in CMIP6. Zhu and Yang (2020) showed a general cold bias in annual mean temperature in China. Almazroui et al. (2020) showed a general cold bias in annual mean temperature in South Asian countries, including Bangladesh. This study shows that CMIP6 models outperformed CMIP5 in simulating Tmax and Tmin for annual and seasonal timeframes in most parts of the country. The highest improvements were seen in highly elevated areas during the cold season (winter and post-monsoon). The SCCs of the CMIP6 MMEs were higher than those of the CMIP5 GCMs, but the SDs and CRMSEs were farther from the observations (Fig. 6). This indicates relative superiority in replicating the geographical distribution of Tmax but inferiority in replicating spatial variability. Almazroui et al. (2020) also demonstrated that some models overestimated the annual mean temperature. Previous studies based on CMIP5 models showed that the performance of GCMs varies significantly and depends on the variable being considered (Kamworapan and Surussavadee 2019; Pathak et al. 2019). In this study, most CMIP6 GCMs, including their MME, had a lower skill score for simulating Tmax and Tmin at both annual and seasonal scales than CMIP5. However, some individual CMIP6 models performed better than their earlier versions in CMIP5. The results emphasize the need for further research to understand the origins of systematic model biases in Bangladesh.
Several CMIP6 models showed better simulations of temperature or precipitation compared to CMIP5. IMM-CM5 from CMIP5 and MRI-ESM2 from CMIP6 performed best in simulating temperature over the country. Contrary to this finding, the temperature simulation capability of CanESM5 over the Tibetan Plateau was among the top five CMIP5 models, as reported by Chen et al. (2017). Likewise, IPSL-CM5A-LR from CMIP5 was the optimal model for simulating precipitation in China (Zhou and Li 2002). Zamani et al. (2020) reported that HadGEM2-ES from CMIP5 and CESM2 from CMIP6 were best in simulating precipitation in northeastern Iran. Uncertainties, errors, and topographic differences are possible reasons for the large geographical variation in the optimal model. Explicit parameter control experiments are required to detect the main factors influencing the errors and uncertainties of individual models over Bangladesh. The weaker performance of some new-generation models indicates that confidence in the models' ability to simulate surface temperature and precipitation at larger scales does not necessarily carry over to the regional scale. CMIP6 models should therefore be chosen with care rather than merely substituted for the corresponding CMIP5 models without verification. Bias correction is also needed to improve the utility of CMIP6 for further applications. The seasonal changes of intertropical convergence zone (ITCZ) bias and their differences among CMIP5 and CMIP6 models are significant (Tian and Dong 2020) in the Indian subcontinent, including Bangladesh. Moreover, the effects of El Niño-Southern Oscillation (ENSO), Indian Ocean Dipole (IOD), and Southern Oscillation Index (SOI) on rainfall and temperature have been observed in Bangladesh (Chowdhury 2003; Wahiduzzaman and Luo 2020; Yousuf 2019; Ghose et al. 2021). However, they were not examined in this paper and should be examined in the future.
Furthermore, considering the Indian summer monsoon (ISM) dominated climate of Bangladesh, it is expected that GCMs that simulate the ISM appropriately would provide better climate simulations for Bangladesh. Recently, a few studies have evaluated the performance of CMIP6 GCMs in simulating the ISM. These investigations revealed that GCMs that can simulate mean monsoon rainfall are not always capable of simulating the spatial variability of monsoon rainfall. Katzenberger et al. (2021) showed that the CMIP6 GCMs CNRM-CM6-1, NorESM2-MM, and FGOALS-f3-L simulated mean ISM rainfall closest to the reanalysis mean, but all the models overestimated rainfall over the Himalaya region. This indicates that better ISM rainfall simulation capability does not guarantee good performance in Bangladesh. However, the findings of previous studies on the capability of CMIP6 GCMs in replicating the ISM are consistent with our results. Gusain et al. (2020) compared the skills of CMIP6 and CMIP5 MMEs in simulating ISM rainfall and reported better capability of the CMIP6 MME in capturing ISM precipitation in most parts of India. The present study also revealed that the CMIP6 MME can better simulate rainfall and temperature over Bangladesh than the CMIP5 MME. The models could capture the interannual variability of annual and seasonal rainfall and temperature reliably, except for winter rainfall. Winter rainfall in Bangladesh is scant (nearly 3% of total annual rainfall) and erratic, and therefore, GCMs are not expected to capture it. Among the CMIP6 models, ACCESS-CM2 and INM-CM5-0 showed better performance in simulating annual rainfall, ACCESS-CM2 and ACCESS-ESM1-5 in simulating monsoon rainfall, ACCESS-CM2 in simulating pre-monsoon rainfall, and CanESM5 in simulating post-monsoon rainfall. Although ACCESS-CM2 performed best in reproducing annual and monsoon rainfall in Bangladesh, Katzenberger et al. (2021) showed that it underestimated ISM rainfall over India.
However, their results also showed that it was able to replicate the spatial distribution of monsoon rainfall over northeast India, including Bangladesh. Improvement in model resolution from one generation to the next has improved model performance. For example, the spatial resolution of many CMIP5 models was higher than that of CMIP3 models. Sun et al. (2015) evaluated the performance of CMIP3 and CMIP5 GCMs. They concluded that the improvement in the skill of CMIP5 models over CMIP3 models was partially due to the improvement in spatial resolution. It is generally assumed that the improved parameterizations and additional process representations required for higher spatial resolution eventually improve model skill (Sheffield et al. 2013). However, the improvement in skill did not hold for all climate variables. Chan et al. (2012) showed that the improvement in resolution did not improve the precipitation simulation skill of some CMIP5 models compared to CMIP3 models. Moreover, within a particular CMIP phase, model skill is not related to model resolution. For example, the skills of CMIP5 models are not related to their resolution. The resolution of the CMIP6 models developed for basic diagnostic analysis is not different from that of CMIP5 (Table 1). Therefore, the improved performance of some of the CMIP6 GCMs cannot be attributed to resolution. Besides the basic diagnostic simulations, CMIP6 introduced several model intercomparison projects (MIPs) for specific climate change assessments. HighResMIP is one such MIP that can be used in the future to evaluate the performance of high-resolution GCMs compared to the basic diagnostic models.
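One simple way to examine whether skill is related to resolution within a single CMIP phase is a rank correlation between grid spacing and skill score across models. The sketch below is illustrative only: the source does not state how this relationship was assessed, Spearman's coefficient is an assumed choice, and the grid spacings and skill scores are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical grid spacing (degrees) and skill scores for six models.
grid_spacing = [1.25, 1.9, 2.8, 1.4, 2.0, 1.1]
skill = [0.62, 0.71, 0.58, 0.49, 0.66, 0.55]

print("Spearman rho:", round(spearman(grid_spacing, skill), 2))
```

A coefficient near zero would be consistent with skill being unrelated to resolution, although with only about eleven models per generation such an estimate carries large sampling uncertainty.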

Conclusion
The ability of eleven CMIP6 climate models to simulate annual and seasonal mean rainfall and temperature over Bangladesh for the period 1977-2005 was compared with that of their previous versions in CMIP5. The results showed MIROC5 as the most skillful among CMIP5 GCMs and ACCESS-CM2 among CMIP6 GCMs in reproducing annual and seasonal rainfall and temperature of Bangladesh. The CMIP6 GCMs showed better skill in simulating the geographical distribution of temperature and precipitation climatology over Bangladesh. The performance was relatively better for rainfall than for temperature. However, systematic wet biases in CMIP6 were found to exist in high precipitation receiving areas. CMIP6 models outperformed CMIP5 in most parts of the country in simulating Tmax and Tmin for annual and seasonal timeframes. The CMIP6 MME also showed significant improvement in Tmax and Tmin biases, with the highest improvement found in highly elevated regions during the cold season (post-monsoon and winter), but systematic cold/warm biases still exist. The SCCs of CMIP6 GCMs were higher than those of CMIP5 GCMs, but the SDs and CRMSEs were farther from the observations for most of the CMIP6 GCMs. This indicates a relative superiority of CMIP6 GCMs in replicating the geographical distribution of temperature but inferiority in simulating its spatial variability. However, for minimum temperature, relative inferiority was noticed in simulating both geographical distribution and spatial variability. The Taylor skill score was higher for the CMIP6 MME in precipitation simulation but lower for temperature than for the CMIP5 MME in most of the timeframes. However, some individual models showed good agreement with observations in simulating the quantity and spatial distribution of rainfall.
This preliminary study has some limitations and points to a field of near-future research at the beginning of a new age of high-resolution climate models for CMIP6. A similar evaluation could be carried out after the release of more CMIP6 models to gain greater insight into the changes within CMIP6 models with respect to climate variables over Bangladesh.