DTR in Winter Wheat Growing Regions of China: CMIP6 Model Evaluation and Comparison

Winter wheat is widely planted in China, and changes in its yield and quality bear directly on food security. Climate change has an important impact on the yield and quality of winter wheat, and the diurnal temperature range (DTR) is an important factor affecting its yield and protein content. Furthermore, climate models are one of the main sources of error in crop model simulations of yield, so improving the accuracy of climate data has become an important concern for scholars. Previous model evaluations covering the entire country or region cannot identify which model is suitable for estimating future winter wheat yield. Therefore, we evaluated the ability of climate models to simulate DTR within the winter wheat-growing regions of China to identify the most suitable climate models for winter wheat yield and quality projections. The results show that CMIP6 models can basically reproduce the DTR of winter wheat-growing regions in China, but model performance differs between the nationwide domain and the winter wheat-growing regions. EC-Earth3-Veg has the best simulation of climatological DTR for the wheat-growing regions (TS=0.848) and nationwide (TS=0.842), and ACCESS-CM2 has the strongest ability to simulate the annual growing season DTR (TS=0.46). In summary, when estimating future winter wheat yield, attention should be given to selecting models suited to the actual growing regions and growing seasons of winter wheat.


Introduction
Winter wheat is widely grown in China (Cao et al., 2011; Song and Dong, 2006) and is one of the major food sources for humans. Changes in winter wheat yield and quality bear on the food security of human society, so it is necessary to estimate the future yield of winter wheat. The protein content of wheat is known to be influenced by the environment and other factors (Rao et al., 1993; Baenziger et al., 1985; Vaughan et al., 1990; Smika and Greb, 1973). The yield, quality and growing season of winter wheat vary with geographical location (Wu et al., 2021). Climate change has an important impact on the yield and quality of winter wheat. The diurnal temperature range (DTR) is one of the most important agrometeorological variables in agricultural production, and both the quality and the yield of crops are often strongly influenced by it. Nutritional quality is an important index for evaluating the quality of food crops (Mo et al., 1993), and the protein provided by food crops exceeds 20% of total protein consumption (Jin et al., 2018). DTR is an important factor affecting the yield and protein content of winter wheat (Wang et al., 1990). Therefore, DTR is one of the meteorological variables to which agricultural scientists pay close attention.
To predict the future yield of winter wheat, scholars have developed a variety of crop growth models (Keating et [...] America, Europe, mid-latitude Asia and Australia) and pointed out that the DTR has decreased in the past [...] decreasing DTR trends have been universally observed (Karl et al., 1993; Easterling et al., 1997; You et al., 2017). You et al. (2017) assessed the DTR simulated by 17 CMIP5 models over the Tibetan Plateau (TP) by comparing model simulations with observations over the period 1961-2005. Most CMIP5 models underestimated DTR compared with observations, and 15 CMIP5 models reproduced an overall decreasing trend in DTR on the TP, but at a smaller rate than observed. It has also been suggested that the pattern differences in the DTR over the TP may be determined by the radiometric variables and total cloudiness in CMIP5 models. Wang and Clow (2020) evaluated the ability of CMIP6 models to simulate the global DTR and showed that CMIP6 models underestimate climatological DTR relative to observations and do not fully reflect the observed spatial and temporal evolution of DTR. The large differences between models appear to be controlled by the daily minimum temperatures. Overall, the CMIP6 models do not improve on the CMIP5 models in simulating temporal changes in DTR from 1901 to 2005, although they are generally better than CMIP5 in simulating the rapid decline in DTR from 1951 to 1980. Lindvall and Svensson (2015) evaluated the ability of 20 CMIP5 models to simulate terrestrial DTR in recent-climate simulations and future projections using HadGHCND and CRU and found that the DTR varies considerably between CMIP5 models and is often underestimated.
However, most models predict a decrease in global mean DTR but an increase over Europe and a decrease over the Sahara Desert. This creates a great deal of confusion when applying the results of climate models to assess the extent to which crops are exposed to climate change.
The retrospective analysis of systematic biases in current climate models, as well as their correction, is one of the scientific issues that CMIP6 focuses on (Zhou et al., 2019). This suggests that the ability of climate models to simulate climatic variables that are critical in agriculture needs to be carefully evaluated in order to provide a more credible understanding of the agricultural impacts of climate change. Moreover, existing assessments of models (Lindvall and Svensson, 2015; Zhuang and Zhang, 2020) usually cover the entire country or region instead of the real crop-growing regions. Such evaluations cannot identify which model is suitable for estimating future winter wheat yield, and they focus more on the mean temperature and less on the DTR. Therefore, in this study the ability of models to simulate the DTR was evaluated and analyzed in winter wheat-growing regions, to better serve the prediction of the future yield and quality of winter wheat.

CMIP6 Model outputs
CMIP6 takes into account the effects of external forcing, including natural factors and human activities, over time in the simulation of historical periods. Global near-surface maximum air temperature and minimum air temperature data simulated by twenty-six CMIP6 models from 1961 to 2014 were retrieved from the CMIP6 website (https://esgf-node.llnl.gov/search/cmip6). The model data used in this study were the simulated results of the near-surface maximum air temperature (Tasmax) and near-surface [...] The available starting and ending times of these data are 1961-2018, with a high spatial resolution of 0.5°×0.5°. This dataset has a long time span and high spatial resolution. It was generated solely by statistical interpolation of actual station observations and covers the entire land area of China (statistical data for Taiwan province are missing). Compared with reanalysis data, the CN05.1 data have greater reliability.
The data of all models were interpolated uniformly to a 0.5°×0.5° grid using the bilinear interpolation method. Because the model data and observational data cover different time spans, the study period was restricted to the 54 years from 1961 to 2014, and only China's land area was considered.
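The regridding step can be illustrated with a minimal sketch. It assumes regular, strictly ascending latitude and longitude axes and computes DTR as the daily maximum minus the daily minimum temperature. The function name `bilinear_regrid` and the toy grid are hypothetical and are not taken from the study's processing chain.

```python
import numpy as np

def bilinear_regrid(field, src_lat, src_lon, dst_lat, dst_lon):
    """Bilinearly interpolate a 2-D field (lat, lon) onto a target grid.

    Assumes src_lat and src_lon are strictly ascending and that every
    target point lies inside the source grid (no extrapolation).
    """
    field = np.asarray(field, dtype=float)
    # Index of the lower corner of the source cell for each target coordinate.
    i = np.clip(np.searchsorted(src_lat, dst_lat) - 1, 0, len(src_lat) - 2)
    j = np.clip(np.searchsorted(src_lon, dst_lon) - 1, 0, len(src_lon) - 2)
    # Fractional position of each target coordinate inside its source cell.
    wy = ((dst_lat - src_lat[i]) / (src_lat[i + 1] - src_lat[i]))[:, None]
    wx = ((dst_lon - src_lon[j]) / (src_lon[j + 1] - src_lon[j]))[None, :]
    f00 = field[np.ix_(i, j)]
    f01 = field[np.ix_(i, j + 1)]
    f10 = field[np.ix_(i + 1, j)]
    f11 = field[np.ix_(i + 1, j + 1)]
    return ((1 - wy) * (1 - wx) * f00 + (1 - wy) * wx * f01
            + wy * (1 - wx) * f10 + wy * wx * f11)

# Toy 2-degree source grid (hypothetical values, not model output).
src_lat = np.array([30.0, 32.0, 34.0])
src_lon = np.array([110.0, 112.0, 114.0])
tasmax = 20.0 + 0.5 * src_lat[:, None] + 0.1 * src_lon[None, :]
tasmin = 10.0 + 0.2 * src_lat[:, None] + 0.1 * src_lon[None, :]
dtr = tasmax - tasmin  # DTR = daily maximum minus daily minimum temperature

# Target 0.5-degree grid, matching the resolution used for the observations.
dst_lat = np.arange(30.0, 34.01, 0.5)
dst_lon = np.arange(110.0, 114.01, 0.5)
dtr_05 = bilinear_regrid(dtr, src_lat, src_lon, dst_lat, dst_lon)
```

Because the toy DTR field varies linearly with latitude, the interpolated values match the analytic field exactly, which makes the sketch easy to check by hand.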

Methods
To facilitate the analysis, a bilinear interpolation method was adopted to interpolate the model data uniformly to the same resolution, corresponding to the grid positions and resolutions of the observed datasets. According

Evaluation of climatological DTR
The 1995-2014 climate mean was used to evaluate the ability of the CMIP6 models to simulate the spatial distribution of the DTR in winter wheat-growing regions in China.

Evaluation of DTR in the winter wheat-growing season
Wang (2020) indicated that the reviving and maturity periods of winter wheat in northern China mainly occur from March to June. In this study, March to June of each year was therefore taken as the winter wheat-growing season, and 54 years of data from 1961 to 2014 were analyzed.
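The two averaging windows described above, the March to June growing season and the 1995-2014 climatological period, can be sketched as follows. The sketch assumes monthly mean DTR is already available as an array with a leading year axis and a month axis. The helper names and toy values are hypothetical.

```python
import numpy as np

def growing_season_mean(monthly, months=(3, 4, 5, 6)):
    """Average monthly fields over the given calendar months.

    monthly has shape (n_years, 12, ...), with month index 0 = January.
    Returns an array of shape (n_years, ...).
    """
    idx = [m - 1 for m in months]
    return np.asarray(monthly, dtype=float)[:, idx].mean(axis=1)

def climatological_mean(yearly, years, start=1995, end=2014):
    """Average yearly fields over the inclusive window [start, end]."""
    years = np.asarray(years)
    mask = (years >= start) & (years <= end)
    return np.asarray(yearly, dtype=float)[mask].mean(axis=0)

years = np.arange(1961, 2015)  # 54 years, 1961-2014
# Toy monthly DTR at one grid point: the value equals the month number.
monthly_dtr = np.tile(np.arange(1, 13, dtype=float), (len(years), 1))
season_dtr = growing_season_mean(monthly_dtr)      # one value per year
clim_dtr = climatological_mean(season_dtr, years)  # 1995-2014 mean
```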

Evaluation of winter wheat-growing season DTR interannual trend
Based on 54 years of data from 1961 to 2014, the interannual trend of the growing season DTR was calculated to evaluate the simulation ability of the CMIP6 models and the multimodel ensembles.
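The text does not state how the trend was estimated. As an assumption, the sketch below uses an ordinary least-squares linear fit and expresses the slope per decade, matching the °C/10a unit used in the results. The function name and toy series are hypothetical.

```python
import numpy as np

def decadal_trend(series, years):
    """Least-squares linear slope of series against years, per decade."""
    x = np.asarray(years, dtype=float)
    x = x - x.mean()  # centering keeps the fit well conditioned
    slope, _intercept = np.polyfit(x, np.asarray(series, dtype=float), 1)
    return 10.0 * slope

years = np.arange(1961, 2015)
# Toy growing-season DTR series declining by 0.008 degC per year.
toy_dtr = 10.0 - 0.008 * (years - 1961)
trend = decadal_trend(toy_dtr, years)  # about -0.08 degC per decade
```

Applied to each grid point of a (year, lat, lon) array, the same function would give the spatial pattern of trends.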

Multimodel ensemble method
Previous studies have shown that the multimodel ensemble mean is usually more reliable than an individual model in reproducing the present Chinese climate (Jiang et al., 2005, 2009). Therefore, the following two multimodel ensemble methods were used in this study: (1) the multimodel arithmetic mean ensemble with equal weights (MME); (2) the multimodel median ensemble (Median).
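Both ensemble methods reduce to an operation along the model axis once all models share a common grid. The sketch below assumes a stacked array of shape (model, lat, lon). The three toy "models" are hypothetical values, not CMIP6 output.

```python
import numpy as np

def ensemble_mme(models):
    """Equal-weight arithmetic mean across the model axis (axis 0)."""
    return np.mean(np.asarray(models, dtype=float), axis=0)

def ensemble_median(models):
    """Median across the model axis (axis 0)."""
    return np.median(np.asarray(models, dtype=float), axis=0)

# Three toy models on a 2 x 2 grid (DTR in degC).
models = np.array([
    [[8.0, 9.0], [10.0, 11.0]],
    [[9.0, 9.0], [10.0, 14.0]],
    [[13.0, 9.0], [10.0, 11.0]],
])
mme = ensemble_mme(models)
median = ensemble_median(models)
```

In the toy data the outlying values (13 and 14) pull the MME away from the other two models, while the Median is unaffected, which is the main practical difference between the two methods.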

Performance Metrics
In evaluating the capability of the CMIP6 models to simulate DTR in winter wheat-growing regions in China, quantitative statistics were calculated and displayed in a Taylor diagram. The Taylor diagram compares the consistency between the model results and observations in terms of the correlation coefficient (R), the centered root mean square error (RMSE'), and the standard deviations of the model simulation and the observations (Taylor, 2001).
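The statistics shown on a Taylor diagram can be computed as in the sketch below. It assumes unweighted statistics over grid points, since the text does not say whether area weighting was applied, and it uses population standard deviations so that the identity RMSE'^2 = SD_sim^2 + SD_obs^2 - 2·SD_sim·SD_obs·R holds exactly. The function name and toy values are hypothetical.

```python
import numpy as np

def taylor_statistics(sim, obs):
    """Return (R, SD_sim, SD_obs, centered RMSE) over all valid points."""
    sim = np.asarray(sim, dtype=float).ravel()
    obs = np.asarray(obs, dtype=float).ravel()
    valid = np.isfinite(sim) & np.isfinite(obs)  # drop missing grid points
    sim, obs = sim[valid], obs[valid]
    sd_sim = sim.std()  # population standard deviation (ddof=0)
    sd_obs = obs.std()
    r = np.corrcoef(sim, obs)[0, 1]
    # Centered RMSE: remove each field's mean before differencing.
    crmse = np.sqrt(np.mean(((sim - sim.mean()) - (obs - obs.mean())) ** 2))
    return r, sd_sim, sd_obs, crmse

# Toy observed and simulated DTR at five grid points (degC).
obs = np.array([8.0, 9.0, 10.0, 11.0, 12.0])
sim = np.array([7.5, 9.0, 9.5, 11.5, 11.0])
r, sd_sim, sd_obs, crmse = taylor_statistics(sim, obs)
```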
To further evaluate the overall skill of the CMIP6 models in simulating DTR in China's winter wheat-growing regions, the Taylor skill score (TS) was used. TS is computed from the standard deviations of the observation and the simulation, the spatial correlation coefficient R between the simulation and the observation, and R0, the maximum attainable correlation coefficient. The score lies between 0 and 1. A TS of 1 indicates that the simulated pattern matches the observation perfectly, whereas a TS of 0 indicates no match at all.
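The exact expression for TS is not reproduced in the text. Taylor (2001) proposes a skill score of the form 4(1+R) / [(s + 1/s)^2 (1+R0)], where s is the ratio of the simulated to the observed standard deviation, together with a variant in which (1+R) and (1+R0) are raised to the fourth power. The sketch below exposes that exponent as a parameter, because which variant this study used is an assumption. The function name and the default R0 = 1 are also hypothetical.

```python
def taylor_skill_score(sd_sim, sd_obs, r, r0=1.0, power=1):
    """Taylor (2001) style skill score.

    sd_sim, sd_obs: standard deviations of simulation and observation.
    r: pattern correlation; r0: maximum attainable correlation (assumed 1).
    power: 1 or 4, selecting between the two forms given by Taylor (2001).
    """
    ratio = sd_sim / sd_obs
    return (4.0 * (1.0 + r) ** power
            / ((ratio + 1.0 / ratio) ** 2 * (1.0 + r0) ** power))

ts_perfect = taylor_skill_score(2.0, 2.0, 1.0)        # identical pattern
ts_anticorr = taylor_skill_score(2.0, 2.0, -1.0)      # fully anticorrelated
ts_uncorr = taylor_skill_score(2.0, 2.0, 0.0)         # uncorrelated, power 1
ts_uncorr4 = taylor_skill_score(2.0, 2.0, 0.0, power=4)
```

The fourth-power variant penalizes low correlation much more strongly, so scores from the two variants are not directly comparable.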

Evaluation of climatological DTR
The CMIP6 models can basically reproduce the spatial distribution characteristics of the climate mean DTR (Figure 2), which is consistent with CMIP5 (Lindvall and Svensson, 2015):
(1) The DTR increases gradually from low latitudes to high latitudes, and the DTR in winter wheat-growing regions ranges from 8°C to 12°C.
(2) From coastal to inland regions, the DTR gradually increases, and the DTRs of NEC and TP are higher than those of other regions.
Figure 2(e)~(h) shows the simulation results of the multimodel ensembles. Compared with observations, the multimodel ensemble data are approximately 3°C lower nationwide and 6°C lower in NEC. In addition, the DTR in CY is 2°C higher than observed.
EC-Earth3-Veg has the best simulation ability among the 26 CMIP6 models for the climatological DTR both nationwide (TS=0.842) and in the winter wheat-growing regions (TS=0.848). In both domains, the MME and Median do not simulate DTR as well as EC-Earth3-Veg. In the winter wheat-growing regions, the EC-Earth3-Veg simulations in SWC are lower than the observed values, while the simulations in other regions are approximately 1°C higher. The same characteristics hold nationwide, but in NEC the simulation is approximately 3°C lower, and the biases in some regions exceed 5°C.
The simulated results of the models differ greatly in space (Figure 3). The mean SD within the winter wheat-growing regions is 2.33, and nationwide the SD is 2.72, so the consistency among models is higher within the winter wheat-growing regions than in the entire country. The SDs of simulations in NEC and TP are slightly higher than those in other regions, at approximately 4°C, and the difference between models is large. This shares the same characteristics with mean temperature (Guo et al., 2013; Zhou and Yu, 2006). The SD of the simulated results in Xinjiang is also slightly higher, and the CMIP6 models differ greatly in this region. This indicates that CMIP6 models have good simulation capability in eastern China, whereas in NEC and TP there are great differences between the simulated results of different models. Similar to CMIP5, CMIP6 models are therefore still deficient in their ability to simulate the climate mean surface temperature and DTR in TP, which is also consistent with previous assessments.
In general, the correlations between simulations and observations are concentrated in the range of 0.5-0.85 both nationwide and in the wheat-growing regions (Figure 4). EC-Earth3-Veg has the highest correlation coefficient with observations in the entire country, 0.84. Its correlation coefficient with observations in the winter wheat-growing regions is 0.856, which is higher than the nationwide value. More than 80% of the CMIP6 models have smaller SDs than the observations. The SD of EC-Earth3-Veg-LR (2.38) is the closest to the SD of the observations (2.46), so this model better reflects the true spatial distribution of the DTR. The RMSEs of simulations in winter wheat-growing regions are smaller than those for the entire nation.
We evaluated each region and calculated the TS scores for each model separately. The evaluation results are presented in Figure 5 to show the comprehensive performance of the models more visually.
In general, there are large differences in model performance in different regions: EC-Earth3-Veg scores the highest among the 26 CMIP6 models both nationwide and in the growing regions, with TS scores of 0.84 and 0.85, respectively, indicating the best overall simulation ability. The multimodel ensembles (MME, Median) have better simulated results at the national scale than at the scale of the growing regions. However, the MME (TS=0.72) and Median (TS=0.65) are not as good as EC-Earth3-Veg. For different regions, EC-Earth3-Veg has the highest TS scores of 0.75, 0.89, and 0.67 in NC, SWC and CY, respectively, which is consistent with the nationwide and winter wheat-growing region assessments. However, the simulation ability of EC-Earth3-Veg is not the best in JH and NWC. This indicates that even the best-performing model among the 26 models does not always have the best performance in all regions. The multimodel ensembles (MME, Median) share the same characteristics. Calculating the mean TS score for each region separately shows that the CMIP6 models have better simulation effects in JH (TS=0.57) and NC (TS=0.49); the simulation effects in NWC (TS=0.20) are relatively weak and vary widely between models, which is also related to the complex topography of NWC (Hu et al., 2014).

Evaluation of DTR in the winter wheat-growing season
According to the observations, the growing season DTR in the wheat-growing regions shows a decreasing trend at a rate of -0.080°C/10a (Figure 6). The same trend is observed at the national scale, with a rate of -0.185°C/10a, so the decrease is faster at the national scale than in the growing regions. The multimodel ensembles can basically simulate these trends, but at slower rates. CMIP6 models can simulate the corresponding trends better [...]
(2) The positive correlation rate of the MME nationwide (positive=92.02%) is higher than that of the single-model simulations, but within the growing regions (positive=94.67%) it is lower than that of the single-model simulations. In contrast, the MME is more effective than the Median. In the entire country, 80.36% of grid points have a positive correlation coefficient, but the distribution of negative correlation grid points is more extensive. Within the winter wheat-growing regions, the spatial distributions of the Median and MME correlations are more consistent with the observations.
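The "positive" percentages quoted above can be read as the share of grid points at which the simulated and observed growing season DTR series are positively correlated. Under that reading, which is an assumption, a minimal sketch is given below. It assumes arrays of shape (year, grid point) with no constant series. The function name and toy data are hypothetical.

```python
import numpy as np

def positive_correlation_rate(sim, obs):
    """Percent of grid points with a positive temporal correlation.

    sim, obs: arrays of shape (n_years, n_points); no series may be constant.
    Returns (rate in percent, per-point correlation coefficients).
    """
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    sim_a = sim - sim.mean(axis=0)  # anomalies about each point's time mean
    obs_a = obs - obs.mean(axis=0)
    num = (sim_a * obs_a).sum(axis=0)
    den = np.sqrt((sim_a ** 2).sum(axis=0) * (obs_a ** 2).sum(axis=0))
    r = num / den
    return 100.0 * np.mean(r > 0), r

# Toy data: 4 years x 4 grid points; the last point is anticorrelated.
obs = np.array([[1.0, 2.0, 3.0, 4.0],
                [2.0, 1.0, 4.0, 3.0],
                [3.0, 4.0, 1.0, 2.0],
                [5.0, 3.0, 2.0, 6.0]])
sim = obs * np.array([1.0, 1.0, 1.0, -1.0])
rate, r_points = positive_correlation_rate(sim, obs)
```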
There are large differences in the performance of CMIP6 models in different regions (Figure 9): EC-Earth3-Veg has the highest TS score (0.54) nationwide; however, within the growing regions, ACCESS-CM2 simulates better than EC-Earth3-Veg, with a TS score of 0.46. The multimodel ensemble data perform better nationwide than in the growing regions. For different regions, ACCESS-CM2 has the highest TS scores of 0.46 and 0.47 in NC and JH, respectively, which is consistent with the assessment results in the growing regions. However, no model has the best simulation ability in all or most regions: in SWC, GISS-E2-1-G has a TS score of 0.39, which is higher than that of the other models. The multimodel ensemble data (MME, Median) have the same characteristics. By calculating the average TS score for each region separately, we find that the average TS scores of the CMIP6 models are more concentrated in each region, with less variation between regions. Furthermore, the TS scores of the multimodel ensemble data are relatively low in all regions, and their simulation effects are not as good as those of the individual models.
The same evaluations were performed for the simulation of annual DTR. The results show that CanESM5 has the highest TS scores among the 26 models both nationwide (0.68) and in the growing regions (0.58). EC-Earth3-Veg (0.41) and ACCESS-CM2 (0.34) also perform better than most models nationwide and in the winter wheat-growing regions, respectively. The CMIP6 models differ in how well they simulate annual and growing season DTR, and the best-performing models for mean annual DTR and mean growing season DTR are different.

Evaluation of winter wheat-growing season DTR interannual trend
The CMIP6 models can reproduce the spatial distribution characteristics of growing season DTR trends (Figure 10):
(1) Within the growing regions, according to the observations, the growing season DTR has an increasing trend in CY and southern JH, at a rate smaller than 0.2°C/10a. There is a clear decreasing trend in Shandong, at a rate between 0.2 and 0.4°C per decade. In SWC, the decline is slow, less than 0.2°C/10a. Nationwide, the increase in DTR mainly occurs in CY, Shaanxi, the southern part of JH and the northern part of SEC. Significant decreasing trends exist in NWC and TP, as well as in NEC, with some grid points exceeding 0.4°C per decade.
(2) KIOST-ESM (TS=0.46) and MPI-ESM-1-2-HAM (TS=0.37) have the best simulation effects in winter wheat-growing regions and nationwide, respectively. Overall, the simulation effects are better in the winter wheat-growing regions than nationwide. In addition, the nationwide results of the CMIP6 models show an increasing tendency over a large area in the northern region, which is less consistent with the observations.
In general, the correlations (Figure 11) between the models and observations are concentrated between 0.1 and 0.55 both nationwide and in the growing regions. The standard deviation varies widely, with most models having smaller standard deviations than the observations. For individual models, nationwide, FGOALS-f3-L and INM-CM5-0 have the highest correlations with observations, reaching 0.55 and 0.41, respectively. In the winter wheat-growing regions, the correlation coefficients of EC-NESM3 and MRI-ESM2-0 with observations reach 0.55 and 0.48, respectively, which are higher than the nationwide correlations. The standard deviation of EC-Earth3-Veg-LR (2.38) is closest to that of the observations (2.46), so this model better reflects the true spatial distribution of the DTR. The RMSEs of simulations in winter wheat-growing regions are smaller than those for the entire nation. The multimodel ensembles all perform better in the growing regions than in the entire nation, and the multimodel ensemble data (MME, Median) improve the correlations of the simulated results (growing regions: R_MME=0.67 and R_Median=0.47). However, although the multimodel ensembles significantly improve the correlation, their standard deviations differ greatly from those of the observations (Wu et al. 2016).
There are significant differences in the performance of the models in different regions (Figure 12): MPI-ESM-1-2-HAM has the highest TS score, 0.37, in the entire country, while KIOST-ESM has a better TS score, 0.46, in the growing regions. CMIP6 models perform better than both the MME and the Median. In terms of subregions, KIOST-ESM has the highest TS score (0.73) in NC, which is consistent with the evaluation results for the wheat-growing regions. EC-Earth3 has the best performance in JH (TS=0.31). No model is capable of providing the best simulation in all or most regions. The multimodel ensembles (MME, Median) share the same characteristics. The mean TS scores are calculated separately for each region, and we find that, in general, the mean TS scores of CMIP6 models are more concentrated in each region, and the differences in TS scores between regions are smaller.
[...] future yield changes for a variety of crops, including wheat. It is worth noting that most GCMs used in the above studies may have a good simulation capability for national or regional climate as a whole, but when the output data of GCMs were applied to analyze the impact of climate change on winter wheat or other crops, the simulation capability of these models in the actual wheat-growing regions of the study area was not further evaluated. In this study, we found that most CMIP6 models can simulate the DTR well at the national scale, but these models do not have the best simulation ability in the actual winter wheat-growing regions. Therefore, estimates of climate change effects on winter wheat yield, phenology and quality based on these models may be inaccurate. Climate models are the second largest source of error in predicting wheat yields, and targeted evaluation would reduce this uncertainty.

Conclusions
The evaluation of the DTR simulation by CMIP6 models across China and within winter wheat-growing regions from 1961 to 2014 provides an understanding of the models' ability to simulate the spatial distribution and trends of the DTR in China. Our study better defines the scope of model assessment for winter wheat by using actual winter wheat-growing regions and therefore allows for more targeted results when quantitatively assessing the ability of climate models to simulate critical agrometeorological elements. In this study, the performance of CMIP6 models in reproducing the DTR in winter wheat-growing regions in China was analyzed by comparing model simulations with observations. CMIP6 models can basically reproduce the DTR for winter wheat-growing regions in China, but there are discrepancies in performance between regions. The main conclusions are summarized as follows:
(1) When studying the impact of climate on winter wheat, data from the model with the best simulation effects in the winter wheat-growing regions should be selected. The following recommendations are made for the CMIP6 models in each region. For the simulation of climatological DTR, EC-Earth3-Veg is recommended in the winter wheat-growing regions as a whole and in NC, SWC, and CY, whereas EC-Earth3-Veg-LR is recommended in JH and CanESM5 in NWC. For the simulation of DTR in the winter wheat-growing season, ACCESS-CM2 is recommended in the winter wheat-growing regions as a whole, while the recommended model differs between subregions: ACCESS-CM2 in NC and JH, GISS-E2-1-G in SWC, AWI-ESM-1-1-LR in CY, and EC-Earth3-Veg in NWC.
(2) The recommended models have the highest TS scores in their respective regions, which is the main reason for their recommendation. For the simulation of climatological DTR, EC-Earth3-Veg has the highest TS scores in both the entire country and the winter wheat-growing regions (0.84 and 0.85, respectively), and its TS scores in NC, SWC, and CY are also the highest among the models. EC-Earth3-Veg-LR has the highest TS score (0.84) in JH, higher than EC-Earth3-Veg (0.83). In addition, CanESM5 has a higher TS score in NWC. For the simulation of DTR in the winter wheat-growing season, ACCESS-CM2 has the highest TS score (0.46) in the growing regions, and EC-Earth3-Veg has the highest TS score (0.48) in NWC. The models recommended for the other subregions likewise rank highest in their respective subregions.
(3) This study found that the assessment of nationwide DTR cannot replace the assessment of DTR in winter wheat-growing regions, and that the assessment of mean annual DTR cannot replace the assessment of DTR in the winter wheat-growing season. This shows that it is necessary to evaluate models for the specific research area and research period. Although the most suitable models for different regions have been recommended, it should be noted that these are the most suitable models only for the study of DTR in winter wheat-growing regions. When a study needs DTR data for these regions, they are the most suitable choice for researchers. Similarly, when the entire country or another region is selected as the study area, the model with the best simulation effect in that study area is the most suitable option. However, the above models differ in their ability to simulate the climatological DTR and the annual DTR in winter wheat-growing regions. For example, EC-Earth3-Veg simulates the climatological DTR well, but ACCESS-CM2 has a stronger ability to simulate the DTR in the winter wheat-growing season. Therefore, attention should be given to selecting the most appropriate data in scientific research.
The CMIP6 models differ in how well they simulate DTR in China as a whole and in the winter wheat-growing regions, and the best-performing models nationwide and in the growing regions are different. In summary, when studying the relationship between crops and climate, it is important to select a model appropriate for the crop-growing regions under study, rather than simply choosing the climate model that has good overall simulation effects in the entire country or region.

Figures
Figure 5
TS scores of climatological DTR simulated by CMIP6 models in each subregion in China. Each column represents a subregion, and each row represents a CMIP6 model.

Winter wheat-growing season DTR trends simulated by CMIP6 models and observations. The red lines are the median of model simulated trends, and grey dots are the results of CMIP6 models. The red, blue and green dots represent the results of CN05.1, MME and Median, respectively.

TS scores of annual growing season DTR simulated by CMIP6 models in each subregion in China. Each column represents a subregion, and each row represents a CMIP6 model.

Figure 12
TS scores of growing season DTR interannual trends simulated by CMIP6 models in each subregion in China. Each column represents a subregion, and each row represents a CMIP6 model.