In recent years, climate change, a complex and multi-dimensional phenomenon, and its impact on the human living environment have become a key research issue (Cameron, 2011; Schlosser & Pfirman, 2012). Precipitation and temperature are often used as the major variables in climate change studies (Sheffield & Wood, 2008). Existing studies have shown that changes in precipitation and temperature alter the frequency and intensity of droughts (Qaisrani et al., 2021), floods (C. Wu et al., 2014), heat waves (Brown, 2020), and low-temperature events (Y. Wang et al., 2016). It is generally believed that the sharp rise in temperature since the Industrial Revolution has had a significant impact on the global hydrological cycle (Schlosser & Pfirman, 2012), which in turn affects human production and the living environment. Therefore, it is necessary to study the spatial variables of climate models, such as precipitation and temperature (Das et al., 2015). Predictions of climate change are mostly based on Global Climate Models (GCMs), which simulate the large-scale processes of the atmospheric circulation system (Lal et al., 2012). The parameterization of some unknown physical processes (Choi et al., 2017) and the lack of initial conditions (Piao et al., 2021) introduce uncertainty into the models. Therefore, it is important to evaluate the uncertainty in different models (Tebaldi & Knutti, 2007).
The Coupled Model Intercomparison Project (CMIP) builds on the model comparison results of the Atmospheric Model Intercomparison Project (AMIP) of the 1990s. Under the auspices of the World Climate Research Programme (WCRP), CMIP was developed to gain a better understanding of past, present, and future climate change in a multi-model environment (Simpkins, 2017). CMIP6 builds on the historical simulations of earlier CMIP phases, supports IPCC assessments, and comprises a series of GCMs (Eyring et al., 2015). The GCMs participating in CMIP are mainly used to simulate and project the global climate (Knutti et al., 2017; Nguyen et al., 2017). Compared with CMIP5, CMIP6 offers increased horizontal and vertical resolution and an improved treatment of climate sensitivity in Earth system models (Wyser et al., 2020), and it performs better overall in terms of inter-annual variation (Schlosser & Pfirman, 2012). Currently, CMIP6 already includes over 100 GCMs with different resolutions produced by multiple institutions. In analyses of climate change and its impacts, a subset of the available models is usually selected in order to save effort (L. Gu et al., 2020; Herger et al., 2018; H. M. Wang et al., 2020).
Some studies have tried to establish the applicability of GCM variables such as precipitation (H. Gu et al., 2015; McMahon et al., 2015). Applicability is usually assessed against four criteria: (1) recency, considering only the latest generation of GCMs; (2) spatial resolution, with high-resolution models regarded as more applicable than low-resolution ones; (3) effectiveness, considering the performance of different GCMs; and (4) representativeness, considering a combination of variables (e.g., precipitation) within the GCMs (Feenstra et al., 1998). Of these criteria, the third is the most widely used: GCMs are ranked and selected according to how well they simulate the historical climate (Mendlik & Gobiet, 2016).
Several methods are currently used to evaluate how well climate models reproduce historical data, including reliability ensemble averaging (Nychka & Tebaldi, 2003), relative entropy (Shukla et al., 2006), Bayesian methods (Tebaldi et al., 2005), probability density functions (Perkins et al., 2007), hierarchical ANOVA models (Sansom et al., 2013), clustering methods (Knutti et al., 2013), correlation methods (Jiang et al., 2015), and the symmetric uncertainty method (Salman et al., 2018). Johnson & Sharma (2009) evaluated the inter-annual variability of GCMs, and Thober & Samaniego (2014) selected indicators of extreme precipitation and extreme temperature for evaluation. Ahmadalipour et al. (2017) combined several performance measures, namely root mean square error, mean absolute error, correlation coefficient, and a comprehensive scoring index, to evaluate the accuracy of GCM historical data. Some studies evaluate GCMs on different time scales: daily (Perkins et al., 2007), monthly (Srinivasa Raju et al., 2017), seasonal (Ahmadalipour et al., 2017), and annual (M. et al., 2004). In addition to the time scale, some studies have also evaluated GCMs on the spatial scale, using either the spatial area average (Abbasian et al., 2019; Ahmadalipour et al., 2017) or the performance of GCMs on the spatial grid (Srinivasa Raju et al., 2017).
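As a rough illustration of the point-wise accuracy measures named above (root mean square error, mean absolute error, and the Pearson correlation coefficient), the following minimal sketch compares a simulated series with an observed one. The function names and the sample values are hypothetical and are not taken from any of the cited studies; the comprehensive scoring index is not reproduced here.

```python
import math


def rmse(sim, obs):
    """Root mean square error between simulated and observed values."""
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs))


def mae(sim, obs):
    """Mean absolute error between simulated and observed values."""
    return sum(abs(s - o) for s, o in zip(sim, obs)) / len(obs)


def pearson_r(sim, obs):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(obs)
    mean_sim = sum(sim) / n
    mean_obs = sum(obs) / n
    cov = sum((s - mean_sim) * (o - mean_obs) for s, o in zip(sim, obs))
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    return cov / math.sqrt(var_sim * var_obs)


# Hypothetical observed and simulated values (illustrative only).
obs = [1.0, 2.0, 3.0, 4.0]
sim = [1.5, 2.5, 2.5, 4.5]

print("RMSE:", rmse(sim, obs))
print("MAE:", mae(sim, obs))
print("r:", pearson_r(sim, obs))
```

Lower RMSE and MAE and a correlation closer to 1 indicate closer agreement with observations. In practice such measures would be computed per model, and per grid cell or area average, before any ranking.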
Some scholars (McMahon et al., 2015; Räisänen, 2007) hold that there is no widely accepted standard time scale for GCM evaluation. Gleckler et al. (2008) demonstrated that assessing GCMs on different time scales, such as seasonal precipitation, can provide important information for water resources management. McMahon et al. (2015) stated that GCM simulations on annual time scales reproduce long-term mean statistics better than those on the daily scale. The arguments of Srinivasa Raju et al. (2017) and Salman et al. (2018) provide more useful information for GCM evaluation at the regional level; however, selecting GCMs according to their performance at individual grid points cannot guarantee their ability to simulate the spatial patterns of the regional climate. Koch et al. (2018) and Demirel et al. (2018) argued that the climate modelling community focuses mostly on the temporal performance of GCMs and neglects the direct evaluation of spatial performance. They also emphasized the importance of evaluating GCMs with multiple spatial metrics.
Tian et al. (2021) conducted a statistical analysis of CMIP6 precipitation data under different scenarios in four different parts of China and found that model performance varied from region to region. Yang et al. (2021) showed that climate models simulate different variables with different degrees of skill in China. Li et al. (2021) demonstrated that different models show different degrees of deviation in precipitation prediction in the Yangtze River Basin. Some studies (Ahmed et al., 2019; Pour et al., 2018; Salman et al., 2018; Srinivasa Raju et al., 2017) considered performance over the study region when evaluating GCMs; however, they overlooked the ability of the GCMs to reproduce spatial patterns.
Accordingly, this study focuses on evaluating the performance of GCM climate variables in historical temporal and spatial assessments. The working hypothesis is that a subset of GCMs, selected according to their ability to simulate the temporal and spatial patterns of precipitation, can be applied in China. The rest of the paper is arranged as follows: Section 2 gives a brief introduction to the study area and datasets, and Section 3 presents the methodology, including the GCM performance assessment metrics and comprehensive rating metrics. Section 4 presents the results, followed by the discussion in Section 5 and the conclusion in Section 6.