Analysis of spectral characteristics of Sabina vulgaris. The spectral curve of Sabina vulgaris conforms to the spectral characteristics of typical green plants (Fig. 4). In the visible band, the reflectance of green plants is governed mainly by the pigments contained in the leaves (chlorophyll, lutein, carotenoids, etc.), among which chlorophyll plays the most important role35. Because these pigments strongly absorb radiation in this band, the reflection and transmission of leaves are very low. In the 420-450 nm blue waveband and the 620-780 nm red waveband, chlorophyll absorbs strongly and forms absorption valleys. Between these two valleys absorption is relatively weak, so a reflection peak forms, which makes plants appear green. If normal plant growth is inhibited in some way, the resulting decrease in chlorophyll content reduces absorption and increases reflection in the blue-green band.
The curve clearly shows the "five valleys and four peaks" characteristic of green plants. The main features of the vegetation spectrum in the visible band are the "green peak" and the "red valley": a small reflection peak near 550 nm, the "green peak", is followed by an absorption valley in the red band, the "red valley". The red edge appears between 680 and 760 nm, where reflectance rises steeply from the red valley toward the high near-infrared reflectance; it is a diagnostic spectral feature of vegetation. As chlorophyll content increases, the red edge of the spectral curve shifts to the right (toward longer wavelengths).
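The red edge described above is often located numerically as the wavelength of the steepest rise in reflectance between 680 and 760 nm. The following is a minimal sketch under that assumption, using a central finite difference. The function name `red_edge` and the synthetic logistic spectrum are hypothetical and are not taken from the study.

```python
import math

def red_edge(wavelengths, reflectance, lo=680.0, hi=760.0):
    """Return (position_nm, max_slope) of the steepest rise within [lo, hi].

    The slope is a central finite difference, so the first and last
    samples of the spectrum are never evaluated.
    """
    best_pos, best_slope = None, float("-inf")
    for i in range(1, len(wavelengths) - 1):
        wl = wavelengths[i]
        if not lo <= wl <= hi:
            continue
        slope = (reflectance[i + 1] - reflectance[i - 1]) / (
            wavelengths[i + 1] - wavelengths[i - 1]
        )
        if slope > best_slope:
            best_pos, best_slope = wl, slope
    return best_pos, best_slope

# Synthetic spectrum (made-up values): logistic rise from 5% to 50%
# reflectance centred at 720 nm, sampled every 1 nm.
wavelengths = [float(w) for w in range(600, 901)]
reflectance = [0.05 + 0.45 / (1.0 + math.exp(-(w - 720.0) / 10.0)) for w in wavelengths]

position, slope = red_edge(wavelengths, reflectance)
print(position, round(slope, 5))
```

On measured spectra the first derivative is usually noisy, so some smoothing before differentiation would normally be needed; that step is omitted here.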
In the near-infrared band, the main factor influencing the reflectance of green plants is the internal cell structure of the leaves. In this band leaves absorb little energy, and reflection and transmission are similar in magnitude. High reflectance is formed in the 680-1300 nm range.
In the infrared band beyond this range, the transmission of plants is very small, and absorption is close to the incident energy. The main influencing factor is the water content of plant cells. Two main water absorption bands are generally formed, centered at 1400 nm and 1900 nm.
Extraction of optimal spectral index wavelength combination. In the correlation analysis of chlorophyll against the two-band ratio of original spectral reflectance, the vegetation indices, and the improved RVI, NDVI, and mNDVI, the RVI performed best (Fig. 5); in the figure, colors from blue to yellow indicate correlations from strongly negative to strongly positive. For the ratio vegetation index, the highest correlation lies in the band combination of 610-680 nm and 700-940 nm; for the improved red-edge ratio vegetation index, it lies in the band combination of 350-430 nm and 650-690 nm. For NDVI, the highest correlation was found in the combined bands of 470-500 nm, 610-680 nm, and 740-840 nm, and the correlation between NDVI and chlorophyll was 0.95. Comparative analysis showed that the band pair with the highest correlation with chlorophyll was (660, 790) for NDVI, (630, 720) for the ratio vegetation index, and (360, 450) for the improved red-edge ratio vegetation index. Follow-up monitoring of the growth of Sabina vulgaris can therefore focus on these better-performing bands.
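The band-pair search summarized above can be sketched as an exhaustive loop over two-band combinations, scoring each by its Pearson correlation with chlorophyll. The sketch uses the standard two-band forms NDVI = (R2 - R1)/(R2 + R1) and RVI = R2/R1. Assigning R1 to the shorter wavelength is an assumption, and the function names and reflectance values are hypothetical illustration data, not measurements from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ndvi(r1, r2):
    """Normalized difference of two reflectances (r1: shorter wavelength)."""
    return (r2 - r1) / (r2 + r1)

def rvi(r1, r2):
    """Simple ratio of two reflectances (r1: shorter wavelength)."""
    return r2 / r1

def best_band_pair(spectra, chlorophyll, index):
    """Return ((band1, band2), r) with the largest |r| over all band pairs.

    spectra maps a wavelength in nm to a list of reflectances, one per sample.
    """
    bands = sorted(spectra)
    best_pair, best_r = None, 0.0
    for i, b1 in enumerate(bands):
        for b2 in bands[i + 1:]:
            values = [index(a, b) for a, b in zip(spectra[b1], spectra[b2])]
            r = pearson(values, chlorophyll)
            if abs(r) > abs(best_r):
                best_pair, best_r = (b1, b2), r
    return best_pair, best_r

# Hypothetical data: five samples, three bands.
chlorophyll = [1.0, 2.0, 3.0, 4.0, 5.0]
spectra = {
    550: [0.12, 0.08, 0.14, 0.09, 0.12],
    660: [0.10, 0.08, 0.06, 0.05, 0.04],
    790: [0.40, 0.42, 0.45, 0.47, 0.50],
}

pair, r = best_band_pair(spectra, chlorophyll, ndvi)
print(pair, round(r, 3))
```

With hyperspectral data sampled at 1 nm the same loop runs over several million pairs, so in practice it would be vectorized; the plain loop is kept here for readability.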
Selection of optimal index. From the five vegetation indices used for linear regression, three indices and the red-edge parameters were selected at random to build structural equation models, and the vegetation indices best suited to mathematical modeling were then chosen. Among the five vegetation indices selected in this study, NDVI is closely related to chlorophyll and characterizes it significantly, which is consistent with previous research. The characterization of chlorophyll by RVI and mNDVI is also significant. Previous studies have shown that NDVI is not suitable for areas with high vegetation coverage, while RVI is not suitable for areas where vegetation is too sparse and the soil background has a strong influence. The Sabina vulgaris in this study is a species with unsaturated vegetation coverage, and the selected measurement area and the experimental design minimized the error introduced by the ground soil. Therefore, in this study the characterization of chlorophyll by these indices is more obvious (Fig. 6b, Fig. 6c), and they are the preferred indices for the follow-up modeling.
However, the characterization of chlorophyll by mSR is relatively unstable. In the model built with DVI and mNDVI, the relationship between mSR and chlorophyll is significant (Fig. 6a), but in the model built with RVI and NDVI, mSR is negatively correlated with chlorophyll (Fig. 6c). This is because effects in a structural equation model are divided into direct and indirect effects, and all indicators in the model interact with one another. As the figure shows, in the model built from mSR, RVI, and NDVI, RVI and Dr have strong effects on chlorophyll, which indirectly alter the direct effect of mSR on chlorophyll. The characterization of chlorophyll by mSR is therefore unstable. mNDVI, RVI, and NDVI are also indirectly affected by other factors, but their effects remain significant and stable, so these three are the best indices for modeling (Fig. 6).
Mathematical model. Univariate linear regression model. Univariate linear regression analysis was conducted between the spectral data and chlorophyll (Fig. 7). According to the fitting results, the best fit is obtained with the normalized difference vegetation index (NDVI), with R2 above 0.9. Previous researchers have used NDVI to monitor vegetation growth status, and NDVI can also reduce the error caused by atmospheric radiation, so it is well suited to this study and its results are satisfactory. The worst fit is obtained with the difference vegetation index (DVI), whose R2 is only 0.16.
However, except for the difference vegetation index, the correlation between the other vegetation indices and chlorophyll is about 0.6, and the R2 of mSR and RVI exceeds 0.7. This is because RVI is by definition "the ratio of scattering of green leaves in the near-infrared band to chlorophyll absorption in the red band"37. RVI is therefore well suited to studying the spectrum of green plants, and it performs well when relating green plant spectra to chlorophyll. Similarly, mSR corrects for the specular reflection of leaves, and because RVI, the index on which mSR is based, suits this study, the fitting effect of mSR is also good. Although mNDVI is an improved form of NDVI, it is mainly sensitive to small changes in canopy foliage content, gap fraction, and senescence38. In previous studies it has mostly been used for precision agriculture, vegetation monitoring, and vegetation stress detection, and it was selected here because of its suitability for vegetation monitoring. According to the linear fitting results, however, its performance for Sabina vulgaris is not ideal. In future work mNDVI would be better used for spectral monitoring of broad-leaved tree species or of areas with high vegetation coverage.
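The univariate fits discussed above reduce to ordinary least squares on one predictor, with R2 computed from the residual and total sums of squares. A minimal sketch follows; the function name `fit_line` and the NDVI and chlorophyll values are hypothetical and are not data from the study.

```python
def fit_line(x, y):
    """Least-squares line y = intercept + slope * x, plus its R2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return slope, intercept, 1.0 - ss_res / ss_tot

# Hypothetical NDVI values and chlorophyll contents for five samples.
ndvi_values = [0.60, 0.68, 0.76, 0.81, 0.85]
chlorophyll = [1.0, 2.0, 3.0, 4.0, 5.0]

slope, intercept, r2 = fit_line(ndvi_values, chlorophyll)
print(round(slope, 3), round(intercept, 3), round(r2, 3))
```

For a single predictor, R2 equals the square of the Pearson correlation coefficient, which is why the correlation-based band selection and the regression fit rank the indices in the same order.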
Multiple stepwise regression model analysis. The advantage of multiple stepwise regression is that the final regression equation includes all independent variables that have a significant effect on the dependent variable and excludes those that do not. Stepwise regression is built on this principle: it is essentially an algorithm for establishing the optimal multiple linear regression equation on the basis of multiple linear regression analysis. Using a double test (one test when a variable is introduced and one when it is eliminated), it introduces and eliminates independent variables step by step until the optimal regression equation is obtained39.
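The introduce-and-eliminate loop described above can be sketched with partial F-tests: a candidate enters when its partial F exceeds an entry threshold, and a selected variable is dropped when its partial F falls below a removal threshold. This is an illustrative sketch, not the SPSS procedure used in the study. The thresholds `f_enter=4.0` and `f_remove=3.9`, the function names, and the data are all assumptions.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def rss(columns, y):
    """Residual sum of squares of an OLS fit (with intercept) on the columns."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(p)] for a in range(p)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(p)]
    beta = solve(xtx, xty)
    return sum((yi - sum(bj * xj for bj, xj in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))

def stepwise(candidates, y, f_enter=4.0, f_remove=3.9):
    """Return the names of the selected predictors, in order of entry.

    Assumes noisy data (nonzero residuals) and more samples than predictors.
    """
    def cols(names):
        return [candidates[k] for k in names]

    selected, n = [], len(y)
    while True:
        changed = False
        # Entry test: add the candidate with the largest partial F above f_enter.
        base, best = rss(cols(selected), y), None
        for name in candidates:
            if name in selected:
                continue
            trial = selected + [name]
            r = rss(cols(trial), y)
            f = (base - r) / (r / (n - len(trial) - 1))
            if f > f_enter and (best is None or f > best[0]):
                best = (f, name)
        if best is not None:
            selected.append(best[1])
            changed = True
        # Removal test: drop the selected variable with the smallest partial F below f_remove.
        if len(selected) > 1:
            full, df, worst = rss(cols(selected), y), n - len(selected) - 1, None
            for name in selected:
                reduced = [k for k in selected if k != name]
                f = (rss(cols(reduced), y) - full) / (full / df)
                if f < f_remove and (worst is None or f < worst[0]):
                    worst = (f, name)
            if worst is not None:
                selected.remove(worst[1])
                changed = True
        if not changed:
            return selected

# Hypothetical predictors standing in for vegetation indices; y depends on x1 and x2 only.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
x2 = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0]
x3 = [5.0, 3.0, 8.0, 1.0, 9.0, 2.0, 7.0, 4.0, 6.0, 0.0]
noise = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.05, -0.05, 0.1, -0.1]
y = [1.0 + 2.0 * a + 0.5 * b + e for a, b, e in zip(x1, x2, noise)]

selected = stepwise({"x1": x1, "x2": x2, "x3": x3}, y)
print(selected)
```

Statistical packages typically express the entry and removal criteria as significance levels rather than fixed F values; fixed thresholds are used here only to keep the sketch free of external dependencies.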
In this study, the vegetation indices and the red-edge parameters were used as independent variables and chlorophyll content as the dependent variable. Two multivariate linear stepwise regression models were established in SPSS 26. The resulting regression equations are shown in Table 3. According to Table 3, the goodness of fit of the model built from vegetation indices is much higher than that of the model built from red-edge parameters, and its test RMSE is also smaller, so its prediction accuracy is high. Overall, the accuracy of the multivariate stepwise regression models is higher than that of the univariate linear regression models.
Table 3
Multiple stepwise regression model equation
Parameter | Model | Model accuracy R2 | Model accuracy RMSE | Test model accuracy R2 | Test model accuracy RMSE
Vegetation index | y = -0.542 + 5.063 NDVI + 7.373 RVI | 0.938 | 0.194 | 0.934 | 0.124
Red-edge parameter | y = -5.962 + 0.19 Rp + 23.114 Dr | 0.885 | 0.183 | 0.820 | 0.194
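The two regression equations in Table 3 can be applied directly to new samples, and RMSE can be computed against measured chlorophyll. The sketch below only transcribes the Table 3 coefficients. The function names and the input values are hypothetical, and the units of y and the exact definitions of Rp and Dr are those of the study and are not restated here.

```python
import math

def chlorophyll_from_indices(ndvi, rvi):
    """Table 3 vegetation index model: y = -0.542 + 5.063 NDVI + 7.373 RVI."""
    return -0.542 + 5.063 * ndvi + 7.373 * rvi

def chlorophyll_from_red_edge(rp, dr):
    """Table 3 red-edge model: y = -5.962 + 0.19 Rp + 23.114 Dr."""
    return -5.962 + 0.19 * rp + 23.114 * dr

def rmse(predicted, observed):
    """Root mean square error between predictions and observations."""
    return math.sqrt(
        sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)
    )

# Hypothetical index values, chosen only to show the call pattern.
example = chlorophyll_from_indices(0.5, 0.1)
print(round(example, 4))
```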
Partial least squares regression model. Partial least squares regression models for different leaf coverage areas were programmed in MATLAB R2012a, taking the vegetation indices and the red-edge characteristic parameters as inputs respectively. To improve the accuracy of the results, the three indices that best characterize chlorophyll (mNDVI, RVI, and NDVI, selected in 3.3) were used as the vegetation index inputs. The results are shown in Table 4. Compared with the multiple stepwise regression models, the partial least squares regression models are more accurate: R2 increased for both the vegetation index model and the red-edge parameter model, while RMSE decreased to below 0.1. The model with vegetation indices as input fits better than the model with red-edge parameters as input.
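For a single response such as chlorophyll content, partial least squares regression can be sketched with the NIPALS PLS1 algorithm: each component is a weighted combination of the centered predictors chosen for maximal covariance with the response, after which predictors and response are deflated. This is an illustrative Python sketch and not the MATLAB program used in the study. The function names, the number of components, and the data are assumptions.

```python
import math

def pls1_fit(X, y, n_components):
    """Fit PLS1 by NIPALS; returns (x_mean, y_mean, [(weights, loadings, q), ...])."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered predictors
    f = [v - y_mean for v in y]                                # centered response
    components = []
    for _ in range(n_components):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate predictors and response before extracting the next component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components

def pls1_predict(model, x):
    """Predict the response for one sample by replaying the deflation steps."""
    x_mean, y_mean, components = model
    e = [a - m for a, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in components:
        t = sum(a * b for a, b in zip(e, w))
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, load)]
    return y_hat

# Hypothetical training data: two predictors, response exactly linear in both.
X_train = [[0.2, 1.1], [0.4, 1.9], [0.6, 3.2], [0.8, 3.8], [0.5, 2.0]]
y_train = [1.0 + 2.0 * a + 3.0 * b for a, b in X_train]

model = pls1_fit(X_train, y_train, n_components=2)
prediction = pls1_predict(model, [0.3, 2.5])
print(round(prediction, 4))
```

With as many components as predictors, PLS1 reproduces the ordinary least squares fit; its practical benefit appears when fewer components are kept and the predictors are strongly correlated, as vegetation indices derived from the same bands tend to be.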
Comparing the three modeling methods, the model established by the partial least squares method is more accurate than the univariate linear regression model and the multivariate linear regression model, and it gives the best fit.
Table 4
Partial least squares regression equation
Parameter | Model accuracy R2 | Model accuracy RMSE | Test model accuracy R2 | Test model accuracy RMSE
Vegetation index | 0.971 | 0.094 | 0.982 | 0.037
Red-edge parameter | 0.914 | 0.091 | 0.938 | 0.043