Spatial scale transformation–based estimation model for fresh grass yield: a case study of the Xilingol Grassland, Inner Mongolia, China

Estimating the grass yield of a grassland area is of vital theoretical and practical significance for determining grazing capacity and maintaining ecological balance. Because sampling data and remote sensing data differ in spatial scale, improving the accuracy of remote sensing-based fresh grass yield (FGY) estimation is difficult. Using vegetation coverage at different spatial scales, this paper proposes a spatial scale transformation (SST)-based estimation model for FGY that adopts the normalized difference vegetation index (NDVI) as its estimation factor, with the grassland in Xilingol League, Inner Mongolia, as the study area. Results showed that the SST-based FGY estimation model greatly improved estimation precision; the relative estimation errors (REEs) of the estimation models constructed using a linear function with zero intercept (linear-0) and a power function were 18.16% and 18.35%, respectively. These two models were employed to estimate the grass yield of the grassland in Xilingol League, and the total FGYs estimated were 8.777 × 10¹⁰ kg and 8.583 × 10¹⁰ kg, respectively. The two models obtained roughly the same totals, but there were significant differences between them in the spatial distributions of FGY per unit area. Taking net primary productivity (NPP) as an example, the effectiveness of other remote sensing data as estimation factors was further verified, and the results showed that SST-based estimation also effectively improved the estimation accuracy of grass yield.


Introduction
Grassland ecosystems are among the most widely distributed ecosystems globally, covering about 40% of the Earth's land surface (Suttie et al. 2005). China has approximately 400 million hectares of natural grassland, mainly distributed in the Inner Mongolia Autonomous Region (Inner Mongolia), the Xinjiang Uygur Autonomous Region (Xinjiang), and Sichuan Province. These grasslands account for 41.7% of China's total land area and constitute the largest terrestrial ecosystem in the country (Xie et al. 2001; Xu et al. 2013). Grasslands not only produce feed for animal husbandry, but also fulfill many important ecological functions, such as carbon fixation and oxygen release, windbreak provision and sand fixation, headwater conservation, soil and water conservation, and biodiversity maintenance (Fry et al. 2013). Grass yield is the material basis of ecosystem maintenance and the most direct reflection of grassland status, and it determines the functional strength of the ecosystem (Xu and Yang 2009). Accurately and efficiently identifying the spatio-temporal distribution of grass yield and understanding its interannual variation provides an essential scientific basis for determining grazing capacity, which has vital theoretical and practical significance for maintaining grassland ecological balance and planning livestock production.
Traditional methods of assessing grassland productivity mainly include ground surveys, statistical models (Gao et al. 2009; Li et al. 2003; Liu et al. 2007; Yang et al. 2008), process models (Feng and Zhao 2011; Goetz et al. 1999; Luo et al. 2012), and parameter models (Li et al. 2007; Zhang et al. 2008). The development of remote sensing technology has prompted numerous studies that use it to estimate productivity and grass yield, and a series of remote sensing estimation models have been developed (Liu et al. 2020). Remote sensing-based estimation saves both time and labor, and supports efficient decision-making for grassland management at the regional scale.
Literature in this field has shown that grass yield is closely related to vegetation indices. Using multiple vegetation indices, scholars have employed linear functions, power functions, and other mathematical relationships to construct remote sensing estimation models for grass yield (Bella et al. 2004; Gao et al. 2013a; Xu et al. 2007; Yang et al. 2007). For instance, Gao et al. (2013b) investigated the spatial distribution of aboveground and underground biomass in the Xilingol Grassland, Inner Mongolia, using MODIS NDVI. Xu et al. (2008) presented a systematic model for estimating grass yields in mainland China based on MODIS NDVI and ground samples. Yang et al. (2009) estimated the aboveground biomass in Tibet using the MODIS enhanced vegetation index (EVI), and analyzed the relationship between the aboveground biomass of grasslands and meteorological factors. Further discussion and experiments are still needed, however, to determine the suitability of a vegetation index for a specific region or environment. Some studies hold that NDVI has advantages in remote sensing estimation of grass yield (Ni 2004); however, the instability and saturation of NDVI under different vegetation coverage tend to introduce errors of unknown size. In recent years, many scholars have attempted to estimate grass yield with other remote sensing derivative products (such as gross primary productivity and MODIS PSNnet). Fu et al. (Ni 2004) combined MODIS GPP and NDVI with calculated vegetation coverage and actual survey data to estimate the grass yield of grasslands in Sichuan Province. Zhao et al. (2014) directly constructed a regression model based on MODIS PSNnet and ground survey data, and used it to estimate the grass yield of the Xilingol Grassland.
In addition, remote sensing estimation models frequently suffer from a mismatch in spatial scale between ground sample data and digital remote sensing images. While sample collection usually adopts a quadrat size of 1 m² for the sake of grassland protection, remote sensing data adopted for large-scale grass yield estimation often have a spatial resolution of 30-1000 m, and the surface features within each pixel are not homogeneous; each pixel integrates the spectral characteristics of the different vegetation types it contains (Zribi et al. 2003). Therefore, to improve the precision of estimation models, a spatial scale transformation between remote sensing images and ground samples is necessary.
Overall, studies on the estimation of grass yield using remote sensing technology have achieved numerous significant results. Despite this, a method is still needed that resolves the spatial scale inconsistency between ground measured samples and remote sensing data and thereby improves the accuracy of remote sensing-based grass yield estimation models. This paper obtained a spatial scale conversion coefficient from the relationship between the measured vegetation coverage (MVC) and the estimated vegetation coverage, and put forward an SST-based estimation model for fresh grass yield. After verification and evaluation against the ground measured samples, the model was used to estimate the fresh grass yield in the research area.

Study area
The study area was the Xilingol Grassland, a typical temperate grassland located in central Inner Mongolia, northern China (41°35′ ~ 46°46′ N, and 111°09′ ~ 119°58′ E; Fig. 1). It occupies a total area of 202,580 km², 192,512 km² (95.03%) of which is grassland. The dominant grassland type is natural grassland, which accounts for 97.2% of the total grassland area. The climate is a typical temperate continental semi-arid climate, cold in the winter and hot in the summer, with mean annual temperatures of 1.3 ~ 4.8 °C and mean annual precipitation of 150 ~ 400 mm. Precipitation increases progressively from west to east, and its distribution within the year is uneven (70% concentrated in the period from June to August), with significant interannual variability.

Remote sensing data
The remote sensing data were mainly derived from China's 500 m NDVI monthly synthetic products (MODND1M) from the Geospatial Data Cloud site, Computer Network Information Center, Chinese Academy of Sciences (http://www.gscloud.cn). MODND1M is a 1-month synthetic product with a spatial resolution of 500 m. Considering the collection time of ground-measured samples, NDVI data for July and August were used. To effectively eliminate the errors caused by cloud cover, solar zenith angle, and atmosphere and to improve the quality of images (Liu et al. 2016), the NDVI data were preprocessed using the maximum value composite method (Formula (1)), and the result was marked as MNDVI.
$$\mathrm{MNDVI} = \max(\mathrm{NDVI}_7, \mathrm{NDVI}_8) \quad (1)$$
where MNDVI is the maximum value of NDVI; $\mathrm{NDVI}_7$ and $\mathrm{NDVI}_8$ are the NDVI values of July and August, respectively.
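The maximum value composite in Formula (1) amounts to a pixel-wise maximum over the two monthly rasters. The following is a minimal sketch, assuming the July and August NDVI grids are already loaded as NumPy arrays of equal shape with NaN marking missing pixels; the function name and the array values are invented for illustration and are not data from the study.

```python
import numpy as np

def max_value_composite(ndvi_jul, ndvi_aug):
    """Pixel-wise maximum of two NDVI rasters.

    np.fmax returns the non-NaN value when only one input is NaN,
    so a pixel stays missing only if both months are missing.
    """
    return np.fmax(ndvi_jul, ndvi_aug)

# Hypothetical 2 x 2 rasters (not data from the study).
ndvi_7 = np.array([[0.30, 0.55], [np.nan, 0.10]])
ndvi_8 = np.array([[0.42, 0.50], [0.25, np.nan]])
mndvi = max_value_composite(ndvi_7, ndvi_8)
```

Treating a pixel that is missing in one month as valid in the composite is a design choice of this sketch; the source does not state how missing values were handled.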

Ground survey data
The ground sample data were measured quadrat data collected by the local grassland authority, as entrusted by the Grassland Supervision and Monitoring Center of the Chinese Ministry of Agriculture, during July-August (the flourishing period for grass) of 2013. Figure 1 shows the spatial distribution of quadrats (1 m × 1 m in size). The main sample variables included quadrat number (QN), longitude and latitude, grassland type (GT), altitude, vegetation type (VT), FGY, measured vegetation coverage (MVC), community height (CH), grazing intensity (GI), and sampling time (ST). The FGY of each sample was determined by mowing all the plants within the quadrat to ground level and weighing them. Considering that ground sample data quality can substantially affect the estimation precision of models (Matsushita and Tamura 2002), the data were put through rigorous testing and standardized screening before modeling. After rejecting individual abnormal records based on grassland type and multi-year mean quadrat status, this study ultimately selected 460 samples (400 randomly selected for modeling and 60 for precision verification). The statistical results of measured ground data for different grassland types are shown in Table 1.

Meteorological data
The meteorological data were from the China Meteorological Data Service Network (http://data.cma.cn/), including datasets of monthly average temperature, monthly cumulative precipitation, and monthly total solar radiation. Since there are few solar radiation observation stations in the study area, relevant stations around the study area were also included in the interpolation to ensure its precision. The spatial distribution of the meteorological stations is shown in Fig. 1. After the data were downloaded, kriging spatial interpolation based on the latitude and longitude of each meteorological station was used to obtain meteorological raster images with the same pixel size and projection as the NDVI data.

Estimation of vegetation coverage
It was assumed that each pixel of the study area consisted of two parts, vegetation and bare land, and that the spectral information acquired by the sensor was a linear combination of these two pure components weighted by their area ratios. On this basis, the classical binary pixel method (Ge et al. 2018) was adopted to calculate vegetation coverage from MNDVI data using Formula (2). The result is marked as FVC.
$$\mathrm{FVC}_i = \frac{\mathrm{MNDVI}_i - \mathrm{MNDVI}_{\min}}{\mathrm{MNDVI}_{\max} - \mathrm{MNDVI}_{\min}} \quad (2)$$
where i is the geographical position; $\mathrm{FVC}_i$ is the estimated vegetation coverage of pixel i; $\mathrm{MNDVI}_i$ is the MNDVI value of pixel i; $\mathrm{MNDVI}_{\max}$ is the maximum MNDVI of the study area; $\mathrm{MNDVI}_{\min}$ is the minimum MNDVI. This paper adopts the MNDVI value corresponding to a cumulative frequency of 95% as $\mathrm{MNDVI}_{\max}$, and that corresponding to a cumulative frequency of 5% as $\mathrm{MNDVI}_{\min}$.
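The binary pixel calculation with the 5% and 95% cumulative-frequency thresholds can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function name is hypothetical, the thresholds are taken as percentiles of all valid pixels, and clipping the result to [0, 1] is an added assumption, since pixels outside the 5% to 95% range would otherwise yield coverage values below 0 or above 1.

```python
import numpy as np

def binary_pixel_fvc(mndvi, low_pct=5.0, high_pct=95.0):
    """Estimate fractional vegetation coverage from an MNDVI raster."""
    valid = mndvi[np.isfinite(mndvi)]
    # MNDVI values at the 5% and 95% cumulative frequencies.
    mndvi_min, mndvi_max = np.percentile(valid, [low_pct, high_pct])
    fvc = (mndvi - mndvi_min) / (mndvi_max - mndvi_min)
    # Assumption of this sketch: clip to the physically meaningful range.
    return np.clip(fvc, 0.0, 1.0)

# Hypothetical MNDVI values spread evenly between 0 and 1.
mndvi = np.linspace(0.0, 1.0, 101)
fvc = binary_pixel_fvc(mndvi)
```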

Modification of vegetation coverage
MVC is usually measured at discrete points according to the sampling design, while subsequent grass yield estimation is conducted pixel by pixel. Assuming that the vegetation coverage of the pixel where a sample is located is uniform and consistent with the measured vegetation coverage of the sample, this paper proposes to modify FVC using MVC in order to obtain the measured vegetation coverage of the entire study area, as described below: ① Extract the FVC value at the position of each quadrat, and calculate the difference between the MVC and FVC values to obtain the residual (Formula (3)).
$$\mathrm{residual}_i = \mathrm{MVC}_i - \mathrm{FVC}_i \quad (3)$$
② Apply kriging interpolation to the calculated residuals to obtain the interpolated residual image (residual′), modify FVC by image operation (Formula (4)), and mark the result as RFVC.
$$\mathrm{RFVC} = \mathrm{FVC} + \mathrm{residual}' \quad (4)$$
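The two correction steps can be sketched in a few lines. Two assumptions are made loudly here: the residual is taken as MVC minus FVC, so that adding the interpolated residual moves FVC toward the measured values, and inverse-distance weighting is swapped in for kriging purely to keep the example self-contained. All function names and numbers are hypothetical.

```python
import numpy as np

def idw_surface(xy_samples, values, xy_grid, power=2.0):
    """Inverse-distance-weighted interpolation (a stand-in for kriging)."""
    # Pairwise distances: one row per grid cell, one column per sample.
    d = np.linalg.norm(xy_grid[:, None, :] - xy_samples[None, :, :], axis=2)
    d = np.maximum(d, 1e-12)  # avoid division by zero at sample locations
    w = 1.0 / d**power
    return (w * values).sum(axis=1) / w.sum(axis=1)

def correct_fvc(fvc_grid, xy_grid, xy_samples, mvc_samples, fvc_at_samples):
    """Step 1: residual at quadrats; step 2: interpolate and add to FVC."""
    residual = mvc_samples - fvc_at_samples  # assumed sign convention
    residual_surface = idw_surface(xy_samples, residual, xy_grid)
    return fvc_grid + residual_surface

# Hypothetical example: two quadrats and three pixel centres on a line.
xy_samples = np.array([[0.0, 0.0], [10.0, 0.0]])
mvc_samples = np.array([0.40, 0.70])
fvc_at_samples = np.array([0.50, 0.60])
xy_grid = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 0.0]])
fvc_grid = np.array([0.50, 0.60, 0.55])
rfvc = correct_fvc(fvc_grid, xy_grid, xy_samples, mvc_samples, fvc_at_samples)
```

At the two quadrat locations the corrected coverage reproduces the measured values, and midway between them the opposite residuals cancel, leaving the original estimate unchanged.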

Estimation of grassland NPP
In this paper, the cumulative value of NPP from January to August was selected to further verify the applicability of the SST-based estimation model for estimating fresh grass yield; however, no NPP data for a monthly scale was found, and therefore, estimations were used. At present, the Carnegie-Ames-Stanford Approach (CASA)

SST of grass yield data
FGY and MNDVI differ in spatial scale: FGY comes from measured ground samples with a spatial scale of 1 m, while MNDVI is derived from the MODND1M data product with a spatial resolution of 500 m (Fig. 2). For this reason, directly adopting MNDVI for FGY estimation introduces unknown factors due to the inconsistency between spatial scales, lowering the fitting precision of the model. Similarly, FVC is calculated from MNDVI according to the binary pixel method and has a spatial scale of 500 m, whereas RFVC assumes that the vegetation coverage of each pixel is uniform and consistent with the measured vegetation coverage of the sample. RFVC can therefore be regarded as the vegetation coverage of the sample, with a spatial scale of 1 m. Assuming the coefficient of SST between them is k, the value of k can be calculated from the ratio of FVC to RFVC.
$$k = \frac{\mathrm{FVC}}{\mathrm{RFVC}}$$
Assuming that there is a proportional relationship between FGY and vegetation coverage, the coefficient of SST can also be regarded as the ratio of the grass yield per unit area at a spatial scale of 500 m ($\mathrm{FGY}_{500}$) to the grass yield per unit area at the ground measured spatial scale of 1 m (FGY) (Formula (6)).
$$k = \frac{\mathrm{FGY}_{500}}{\mathrm{FGY}} \quad (6)$$
As can be seen from the above assumptions and analysis, the SST between $\mathrm{FGY}_{500}$ and FGY can be calculated based on FVC and RFVC.
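Taken together, the two ratios imply that a quadrat-scale yield can be converted to the 500 m scale by multiplying it by FVC/RFVC. A small sketch of that conversion follows; the function names and the numeric values are hypothetical and serve only to illustrate the arithmetic.

```python
import numpy as np

def sst_coefficient(fvc, rfvc):
    """Scale transformation coefficient k = FVC / RFVC."""
    return np.asarray(fvc, dtype=float) / np.asarray(rfvc, dtype=float)

def to_pixel_scale(fgy, fvc, rfvc):
    """Convert quadrat-scale FGY to the 500 m scale: FGY_500 = k * FGY."""
    return sst_coefficient(fvc, rfvc) * np.asarray(fgy, dtype=float)

# Hypothetical quadrat: 0.30 kg/m^2 measured, pixel FVC 0.60, RFVC 0.40.
k = sst_coefficient(0.60, 0.40)
fgy_500 = to_pixel_scale(0.30, 0.60, 0.40)
```

In this invented case k is 1.5, so the pixel-scale yield is larger than the quadrat value because the pixel as a whole is more densely covered than the quadrat.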

Precision verification
To verify the precision of different remote sensing estimation models for FGY, this paper selected the commonly used relative estimation error (REE) for precision evaluation. The formula is given below: where $y_i$ is the survey value; $y_i'$ is the estimate from the regression model; N is the number of validation points.
To evaluate the total FGYs estimated by different models, this paper adopted relative simulation precision (RSP) for quantitative analysis according to the following formula: where $\mathrm{RSP}_{ij}$ is the estimation precision of model j relative to model i; $s_i$ is the total FGY of the entire study area estimated by model i; $s_j$ is the total FGY estimated by model j.

Comparison of vegetation coverage before and after modification
The vegetation coverage estimated based on MNDVI is shown in Fig. 3a. Verification against the validation samples gave an REE of 28%. The RFVC is shown in Fig. 3b. Compared with Fig. 3a, the overall vegetation coverage in the study area decreased significantly, especially in the center and southwest, with a maximum reduction of more than 0.3. The verification results showed that the accuracy of RFVC was significantly improved, with an REE of 8%.

FGY estimation model based on NDVI
As pointed out in related literature, NDVI produces desirable results as an estimation factor in the remote sensing estimation of grass yield (Liu et al. 2020). This paper therefore selected NDVI as the main factor for FGY estimation. A scatter diagram of MNDVI against FGY was constructed and fitted with a linear function and a power function, respectively, to build the FGY remote sensing estimation model (Fig. 4). It can be seen from Fig. 4 that the points were widely scattered, and the adjusted coefficients of determination (adj. R²) of the linear and power functions were nearly identical (between 0.60 and 0.61). With the linear fit, however, the FGY estimate was negative whenever MNDVI was lower than 0.22. The linear function was therefore fitted again with the intercept set to zero (linear-0 function) (red line; Fig. 5).
Comparing the two linear fitting results, it was found that the adj. R² of the linear-0 function was significantly higher than that of the linear function; its residual sum of squares (RSS) reached 6.59, which was also higher than that of the linear function. The FGY of the ground measured verification samples was used to verify the accuracy of the constructed models (Table 2). The verification result of the linear-0 function was much higher than that of the linear function or power function.
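The three function forms compared above can be fitted with a few lines of NumPy. The sketch below is illustrative only: the function names are hypothetical, the data are invented, and the power function is fitted by linear regression in log-log space, which is an assumption because the source does not say which fitting procedure was used. It also shows why a linear fit with a negative intercept yields negative estimates below a threshold value of the predictor.

```python
import numpy as np

def fit_linear(x, y):
    """Ordinary least squares y = a*x + b; returns (a, b)."""
    a, b = np.polyfit(x, y, 1)
    return float(a), float(b)

def fit_linear0(x, y):
    """Least squares through the origin, y = a*x; returns a."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return float(np.sum(x * y) / np.sum(x * x))

def fit_power(x, y):
    """y = a * x**b via regression on log(x), log(y); returns (a, b)."""
    b, log_a = np.polyfit(np.log(x), np.log(y), 1)
    return float(np.exp(log_a)), float(b)

# Invented data lying exactly on y = 2x - 0.2 (not data from the study).
x = np.array([0.3, 0.5, 0.7, 0.9])
y = 2.0 * x - 0.2
a, b = fit_linear(x, y)
zero_crossing = -b / a   # the linear estimate is negative below this x
a0 = fit_linear0(x, y)   # slope when the intercept is forced to zero
```

Forcing the intercept to zero removes the negative estimates at the cost of a larger residual sum of squares, which matches the trade-off described in the text.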

SST-based FGY estimation model
The above results indicated that there was a high correlation between $\mathrm{FGY}_{500}$ and MNDVI, which could be described by a mathematical function (Formula (9)). In combination with Formula (6), it can be held that, theoretically, there is a high correlation between FGY × FVC and MNDVI × RFVC. To verify the above assumptions and deductions, scatter diagrams were constructed with FGY × FVC and MNDVI × RFVC, and linear and power functions were used for fitted regression (Fig. 5). Compared with Fig. 4, the scatter diagram constructed with FGY × FVC and MNDVI × RFVC showed a higher correlation. For both the linear function and the power function, adj. R² improved significantly and the RSS and reduced chi-square decreased significantly. The intercept of the linearly fitted regression model was negative, however, and when the value of MNDVI × RFVC was lower than 0.04, it resulted in a negative estimate of FGY. The statistical results showed that the proportion of estimated samples with an MNDVI × RFVC lower than 0.04 was nearly 10%, and the proportion in the whole study area was also substantial. Because of this, the linear-0 function was also adopted for fitting (red line; Fig. 6). The RSS of the linear-0 function fitted model was 1.62, very close to that of the linear function fitted model (1.49). When measured ground sample data were adopted for precision verification (Table 3), it was found that, compared with the FGY estimation model directly based on NDVI, the accuracy of the estimation model was greatly improved after SST, whether a linear or a power function was used. The REE of the linear function was 19.04%, that of the linear-0 function was 17.02%, and that of the power function was 17.45%. The fitted regression models constructed using the linear-0 function and the power function had roughly the same estimation precision.
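The step from Formula (6) to the product variables FGY × FVC and MNDVI × RFVC can be written out explicitly. The derivation below is a sketch that assumes the relationship between FGY at the 500 m scale and MNDVI takes the zero-intercept linear form with slope a; the symbol a is introduced here for illustration only. For other function forms, such as the power function, the relationship does not reduce exactly to a function of the product MNDVI × RFVC, so fitting on the product variables is then an approximation.

```latex
\begin{aligned}
\mathrm{FGY}_{500} &= a \cdot \mathrm{MNDVI}, \qquad
k = \frac{\mathrm{FGY}_{500}}{\mathrm{FGY}} = \frac{\mathrm{FVC}}{\mathrm{RFVC}} \\
\Rightarrow\quad \mathrm{FGY} \cdot \frac{\mathrm{FVC}}{\mathrm{RFVC}} &= a \cdot \mathrm{MNDVI} \\
\Rightarrow\quad \mathrm{FGY} \times \mathrm{FVC} &= a \cdot \left(\mathrm{MNDVI} \times \mathrm{RFVC}\right) \\
\Rightarrow\quad \mathrm{FGY} &= a \cdot \frac{\mathrm{MNDVI} \times \mathrm{RFVC}}{\mathrm{FVC}}
\end{aligned}
```

Under this assumption, the last line gives the quadrat-scale FGY of a pixel once the slope has been fitted on the product variables.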

Estimation of the grass yield of the study area
The SST-based fitted regressions constructed using the linear-0 and power functions were used to estimate the grass yield of the study area (Figs. 6 and 7). According to the statistical results, the total FGYs of the grassland in the study area estimated by the linear-0 and power function fitted models were 4.98 × 10¹¹ kg and 4.83 × 10¹¹ kg, respectively, indicating that there was no significant gap. Relative simulation precision (RSP) was used for comparative analysis of relative accuracy. The results showed that, when compared to the power function fitted model, the linear-0 function fitted model had a relative precision of 96.99%; when compared to the linear-0 function fitted model, the power function fitted model had a relative precision of 96.89%.
Comparison between Figs. 6 and 7 revealed that, spatially, the overall trend of FGY per unit area was the same for the two models (a gradual decrease from northeast to southwest); however, significant differences in spatial distribution could be observed. As indicated by the statistical results (Table 4), in the west and center of the study area, the area with an FGY per unit area of less than 0.2 kg/m² estimated by the power function fitted model was far greater than that estimated by the linear-0 function fitted model. In the east and south of the study area, the area with an FGY per unit area of greater than 0.6 kg/m² estimated by the power function fitted model was also far greater than that estimated by the linear-0 function fitted model. Due to the lack of data on large-scale regional FGY, it was difficult to evaluate the advantages and disadvantages of the two models. In follow-up research, the coverage of ground samples will be expanded to further explore this issue.
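The two relative precision values can be checked from the reported totals. The RSP formula itself is not reproduced in the text, so the form used below, one minus the absolute difference of the two totals divided by one of them, is an assumption; it is adopted here only because it reproduces both reported percentages from the rounded totals. The function name is hypothetical.

```python
def relative_precision(s_ref, s_other):
    """Assumed form: (1 - |s_ref - s_other| / s_ref) * 100, in percent."""
    return (1.0 - abs(s_ref - s_other) / s_ref) * 100.0

total_linear0 = 4.98e11  # kg, total FGY from the linear-0 model
total_power = 4.83e11    # kg, total FGY from the power model

rsp_a = relative_precision(total_linear0, total_power)
rsp_b = relative_precision(total_power, total_linear0)
print(f"{rsp_a:.2f}% {rsp_b:.2f}%")
```

Because the totals are rounded to three significant figures, agreement with the reported values supports this form of the formula but does not confirm it.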

Discussion
Further analysis of the spatial positions of the ground measured samples showed that, in the process of sample collection, multiple samples sometimes fell within a single pixel. As an example, three points were found in one pixel in the red box of Fig. 5 after searching in Fig. 1. The measured FGYs of the three points were 0.035 kg/m², 0.043 kg/m², and 0.051 kg/m², respectively, while the NDVI value of the corresponding pixel was 0.63. This confirms that the inconsistency of spatial scale between ground measured samples and remote sensing data did lower FGY estimation accuracy. From the perspective of practical application and the statistical results of ground measured sample data, vegetation growth and regional vegetation coverage effectively express the level of grass yield. The FGY of grassland with high coverage is higher than that of areas with low vegetation coverage (Han et al. 2021; Shi et al. 2022). It was also found during the collection of ground measured sample data, however, that FGY differed significantly among areas with the same vegetation coverage because of different vegetation types and heights. To further explore the relationship between vegetation coverage and FGY, the relationship between FGY and MVC was analyzed based on the ground measured sample data (Fig. 8). There was an obvious linear relationship between FGY and MVC, and therefore, it was feasible to obtain the SST coefficient based on the vegetation coverage before and after correction and use it to convert FGY between spatial scales. The final experimental results also showed that the spatial scale conversion through FVC and RFVC significantly improved the estimation accuracy of FGY.
To explore the effectiveness of the SST-based estimation model for FGY proposed in this paper, the cumulative NPP value (SNPP) from January to August 2013 was selected as the estimation factor of FGY. The SST method based on the vegetation coverage before and after correction was used, and an FGY estimation model based on NPP was established. Scatter diagrams were constructed for SNPP against FGY and for SNPP × RFVC against FGY × FVC, and all were fitted by linear, linear-0, and power functions (Figs. 9 and 10). Whether the linear function or the power function was used, the fit based on FGY × FVC and SNPP × RFVC was significantly improved. The verification based on the ground measured sample data showed that the REE decreased from 40-50% to about 20%, so the estimation accuracy was significantly improved. This demonstrated that the FGY estimation model based on SST was effective with different estimation factors.
Liu et al. (2020) used five methods to estimate the grass yield of grassland in Qinghai based on MODIS-NDVI. The root-mean-square error (RMSE) of the model with the highest accuracy was 1140 kg/ha. Zhang et al. (2017) estimated the aboveground biomass of Qilian County in Northeast Qinghai in 2014, and the RMSE of the estimation result was 1713.4 kg/ha. Another study modeled grass yields of five provinces in the northwest and southwest of China during 2005-2006; even its most accurate model, a power function, had an RMSE of 2595.21 kg/ha. To allow comparison with this literature, this study also calculated the RMSE of the FGY estimation model based on SST. The RMSEs of the linear-0 function and power function models with NDVI as the estimation factor were 830.57 kg/ha and 837 kg/ha, respectively. The comparison showed that the accuracy of the FGY estimation model proposed in this paper is higher than that in the above literature, which further confirms the utility of this model.
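Because the yields in this study are reported in kg/m² while the literature values are in kg/ha, comparing RMSE values requires a unit conversion (1 ha = 10,000 m²). A minimal sketch of the calculation follows, using the standard RMSE definition; the function name and the sample values are invented for illustration.

```python
import numpy as np

KG_PER_M2_TO_KG_PER_HA = 10_000.0  # 1 ha = 10,000 m^2

def rmse_kg_per_ha(observed_kg_m2, estimated_kg_m2):
    """Root-mean-square error of FGY estimates, converted to kg/ha."""
    observed = np.asarray(observed_kg_m2, dtype=float)
    estimated = np.asarray(estimated_kg_m2, dtype=float)
    rmse_kg_m2 = np.sqrt(np.mean((estimated - observed) ** 2))
    return float(rmse_kg_m2 * KG_PER_M2_TO_KG_PER_HA)

# Hypothetical validation values in kg/m^2 (not data from the study).
observed = [0.30, 0.50]
estimated = [0.40, 0.40]
rmse = rmse_kg_per_ha(observed, estimated)
```

In this invented case every estimate is off by 0.1 kg/m², which corresponds to an RMSE of 1000 kg/ha.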

Conclusion
NDVI as an estimation factor of grass yield has been examined in many studies. The inconsistency in spatial scale between remote sensing data and ground samples makes it difficult to improve its estimation accuracy for grass yield. SST was performed using the vegetation coverage before and after modification, and an SST-based FGY estimation model was proposed. As demonstrated by the results of this case study of the Xilingol Grassland, the model effectively improved estimation precision.
The SST-based fitted regression models constructed using linear-0 and power functions were employed to estimate the total FGY for the entire study area, resulting in respective estimates of 4.98 × 10¹¹ kg and 4.83 × 10¹¹ kg. The two models therefore obtained roughly the same total, and their overall spatial trends were also consistent. There were significant differences in the spatial distribution of FGY per unit area, however, meaning that further study is needed. Based on this analysis, it can be concluded that, with MODIS-derived NDVI or NPP as the main variable, the SST-based FGY estimation model can be used to estimate regional grass yield and effectively improves the estimation accuracy.

Author contribution
Haixin Liu proposed the new idea and prepared the manuscript with contributions from all co-authors. Anzhou Zhao, Yuling Zhao, and Dongli Wang performed the material preparation, data collection, and analysis. Anbing Zhang reviewed and edited the manuscript. All authors read and approved the final manuscript.

Data availability
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Consent for publication
Not applicable

Competing interests
The authors declare no competing interests.