Statistical analysis of properties
Descriptive analysis of soil properties is presented in Table 4. According to analysis, there has been observed a wide range of studied properties. AK, sand fraction, and AP had the highest, and pH had the lowest range. This broad variation expresses the heterogeneity in soil resources and the wide range of soil condition that influences the extent of crop and land management (Balland et al., 2008; Gallardo, 2003). The coefficient of variation (CV) also supports the results. The CV changed from 2.37 for pH to 96.84 for AP, respectively. de Avila e Silva (2022) classified CV into low (< 10%), medium (10–20%), high (20–30%), and very high (> 30%). Hence, except for soil pH, all characteristics had very high CV.
The percentage of sand gradually decreased from 84% in the eastern part (Maru’ak) to 1% in the western part (Da’ei-chi), and as a result, the amount of clay increased from 10 to 68%. The electrical conductivity varied from 0.14 to 1 dSm-1 and the soils were not saline (Bashir et al., 2022). Although in general, no significant correlation was detected between EC and soil fractions (Table 5) or land use, higher values of salinity were observed in orchards with coarser soil texture. The CCE percentage had a different trend compared to EC. It ranged between 0 in the eastern part (Maru’ak) to 43.8% in the western part (Da’ei-chi). Consequently, soil pH gradually changed from neutral (pH = 7.00) in the east to moderate alkaline (pH = 8.03) in the west (Weil & Brady, 2017).
The percentage of organic carbon showed a range from 0.12 to 1.8, with a mean value of 0.9. Higher organic carbon was observed in soils with less clay content that had orchard land use. The same pattern repeated for AP and AK. The AP and AK increased from 2.2 and 60 mgKg-1 (in the west) to 81 and 780 mgKg-1 (in the east), respectively. The Pearson analysis between soil properties (Table 4) showed that AP And AK had a positive significant correlation with OC, a negative significant correlation with CCE, and no significant correlation with soil texture particles and EC.
The accepted range of skewness and kurtosis coefficients are between − 2 and 2 for normal data (Asakereh et al., 2020; Kim, 2013; Sharma & Ojha, 2020). Therefore, except AP all properties had a normal distribution (Table 5). The Non-normal structure of the AP has caused the mean (9.18 mgKg-1) and SD (± 8.89 mgKg-1) to be close together. This proximity also offers a large skewness and kurtosis (Livingston, 2004), which is also evident in the skewness and kurtosis values in Table 4. The non-Gaussian structure of the AP could be explained by numerous factors such as variable intrinsic characteristics, method of sampling, the number of samples taken, or environmental conditions such as human activities, diversity in land uses, and land management (Fathololoumi et al., 2020).
Model performance
In the present study, the GLM, RF, and Cubist models were compared for predicting the spatial variation of AP and AK. The performance of the models was interpreted by R2, RMSE, and MAE, listed in Table 6. The best performance for each model was obtained within 80:20 cross-validation.
In terms of prediction accuracy, the R2 coefficient explains the degree of well-fitting the predicted and observed variables. The higher coefficient leads to better performance of the model in predicting (Zhou et al., 2019). Results showed that the RF model had better performance for both test and train datasets, than GLM and Cubist models. The proficiency of the RF model in predicting soil properties has been mentioned before (Taghipour et al., 2022; Xie et al., 2021). Kaya and Başayiğit (2022) described that the RF model is high-accurate in predicting soil properties that are affected by soil management.
The R2 parameters for AP train data were 0.22, 0.88, and 0.86 for GLM, RF, and cubist models, respectively. The values for AP test dataset were also 0.02, 0.4, and 0.25 for GLM, RF, and cubist models, respectively. The coefficient was 0.44 (GLM), 0.92 (RF), and 0.49 (cubist) for AK train dataset. was 0.31, 0.57, and 0.27 for GLM, RF, and cubist models for AK test dataset, respectively.
Although the R2 coefficients obtained from the evaluation of the model were less than 0.6 (R2 = 0.4 for AP and 0.57 For AK), it was still within the common range and acceptable accuracy declared by the studies (Taghizadeh-Mehrjardi et al., 2021a; 2014). The parameter also revealed that the model had acceptable accuracy in predicting AK than AP (Al Masmoudi et al., 2022), and was able to explain 40% and 57% of the AP and AK variability, respectively.
The root mean square error (RMSE) values for AP train dataset were 8.26, 4.74, and 3.5 mgKg-1 for GLM, RF, and cubist models, respectively. The RMSE values for AP test dataset were 8.97, 2.81, and 6.54 mgKg-1 for GLM, RF, and cubist models, respectively. The values of the RMSE for the AK train dataset were 136.65 (GLM), 65.65 (RF), 128.67 (cubist) mgKg-1; and for the AK test dataset were 142.82, 143.77, and 162.88 mgKg-1 for GLM, RF, and cubist models, respectively. The RMSE analysis showed that the RF model had better performance compared to GLM and cubist models for both AP and AK, however, the cubist model had lower RMSE for the AP train dataset, and performed slightly better than others. Zeraatpisheh et al. (2019) reported that the high prediction correlation does not necessarily lead to the high accuracy and the least error. The contradictory trends in RMSE results for AP and AK training and testing set for the RF model may confirm the issue. The RMSE for the AP training dataset (RMSE = 4.74) was higher than the testing dataset (RMSE = 2.81), while, the AK train set showed a lower value (RMSE = 65.65) than the test set (RMSE = 143.77).
RMSE has been widely used in numerous environmental science types of research, although it is highly affected by outliers and the number of observations (Chai & Draxler, 2014). Furthermore, Willmott and Matsuura (2005) stated that as there is no clear functional relationship between RMSE and average error, its interpretation might lead to miss understanding. They suggested that, as an unambiguous parameter, MAE provides the more acceptable measurement of average error; and it is preferable to RMSE. Hence, in addition to the RMSE, the MAE was also calculated.
The MAE values for AP train dataset were 4.61, 2.40, and 2.10 mgKg-1 for GLM, RF, and cubist models, respectively. The MAE values for AP test dataset were 7.34, 2.43, and 3.90 mgKg-1 for GLM, RF, and cubist models, respectively. The values of the MAE for the AK train dataset were 109.12 (GLM), 51.81 (RF), and 88.25 (cubist) mgKg-1; and were 131.53, 116.61, and 121.58 mgKg-1 for the AK test dataset for GLM, RF, and cubist models, respectively. As the results show, the MAE values showed more uniform trends than RMSE; and had almost the identical pattern for train and test datasets. Researchers have stated that there are no optimal values for RMSE and MAE, and have only reported different ranges of these parameters (Hengl et al., 2017; Song et al., 2018; Xu et al., 2020).
Important variables
As the RF model was the best-fitted model in the present study, only the important predictors selected by this model have been explained (Fig. 4).
The three most effective covariates in predicting AP were valley depth, TVI, and watershed basin (Fig. 4a), and in predicting AK were SAVI, SI, and SMI (Fig. 4b). Limited studies have been performed to estimate P and K content using environmental covariates and remote sensing data, but the role of terrain attributes in the movement of nutrients on and inside the soil have been proven in numerous investigations. Slope-related features as part of topography features significantly affect the soil nutrient contents by controlling water and energy transportation through a landscape (Adhikari et al., 2018; Hook & Burke, 2000; Smólczyński & Orzechowski, 2010; Zhang et al., 2011). However, the processes that occur along land topography when accompanied by long-term irrigation and soil management could affect nutrient redistribution (Salmanpour et al., 2018) and create complex relationships in the spatial variation of minerals (Wang et al., 2009).
Overview of Fig. 4, remote sensing indicators seem to have a higher ability to estimate soil phosphorus and potassium. Misbah et al. (2021) stated that soil chemical properties like P and K cannot be identified through spectral reflections, however, some investigations reported the correlation between bands, indices of satellite images, and soil K and P. Chao et al. (2018) reported significant correlations between soil nutrient contents, visible (400–760 nm), and NIR spectrum (760–1100 nm). The selection of SAVI, SI, SMI, and TVI indices may also be related to the ability of NIR and SWIR bands to recognise phenomena used in the formula of these indices (Mazur et al., 2022a; 2022b; Yu et al., 2018).
Soil AP and AK maps
About 71% and 21% soils of Iran have low content of AP and AK content, respectively (Shokri Vahed et al., 2022). Malakouti and Gheibi (2000) provided criteria levels of some soil nutrients, which were updated by Shahbazi and Besharati (2013) based on the percentage of plant response to fertilizer for each province in Iran. In accordance with their reports, about 28% of soils in Lorestan province have 10–20 mgKg-1 AP, and about 73% have more than 250 mgKg-1 AK. Therefore, in the present study, for better explanation, the obtained raster maps were classified according to these standards. As mentioned before, since the RF model had lower RMSE and MAE values in predicting AP and AK properties, it is dedicated only to the description of the maps produced by the RF model (Fig. 5).
According to the maps (Fig. 5a and b), a similar distribution pattern was observed for both minerals (AP and AK) in the area. In general, most of the region had low and moderate amounts of these elements. In the west and north, with agriculture, paddy field, and abandoned land uses, AP and AK were under the threshold level and values of 5–10 (Fig. 5a) and < 250 mg (Fig. 5b), respectively, with almost uniform spatial distribution pattern.
In the eastern and southern parts (Maru’ak and Do-khaharan), the AP and AK increased up to 15 (Fig. 5a) and 450 mgKg-1 (Fig. 5b), respectively with a heterogeneous and inconsistent pattern in orchards. In a small part on the river bank, the AP increased to more than 15 mgKg-1 (Fig. 5a).
A noteworthy point is that no difference in AP and AK content was observed between agriculture, paddy farms, and abandoned areas, in the west and north (Da’ei-Chi and Yazdgerd). Likewise, higher amounts of AP and AK were observed in the orchards, nevertheless, it seemed that the spatial distribution of these nutrients in topsoil was not affected by topographic changes. This was exactly the opposite of what the researchers of this project in the region expected. These maps also explain the reason why the land use map or most DEM derivatives were not selected by the model.
Different and sometimes contradictory results can be observed in the investigation of the relationship between soil nutrients and land use in different research. For instance, Babaei and Gholami (2022) reported that land use change from uncultivated land to sugarcane cultivation has caused an increase in soil AP; and a decrease in AK. On the contrary, Pichand (2017) after analyzing four different land uses including pasture, orchards, agriculture, and fallow lands, mentioned that the soil AP and AK were not significantly different between agriculture and uncultivated lands. They also stated that orchards and pastures had high and significant amounts of AP and AK compared to the other two land uses. What is deduced from the research is that in the first stage, the spatial distribution of macronutrients in topsoil may be controlled by sedimentary and erosion processes at different geomorphic surfaces that were influenced by topographic attributes (Aama Azghadi et al., 2010; Azadi & Shakeri, 2021; Farshadirad & Dordipour, 2015; Gopp et al., 2017; Ly et al., 2022; Omonode & Vyn, 2006). In the second stage, human activities, such as land use change (Signor et al., 2018), intensive and long-term land and water management (Shabanpour et al., 2020), cropping system and fertilizer management (Huang et al., 2012) affect the equilibria of mineral fractions, and depletion or addition of nutrients in the soil (Karami & Bazgir, 2019). Specifically, in the study area, it can be assumed that the higher amount of AP and AK in orchards were mainly related to the management difference in orchards management compare to paddy fields and agricultural lands such as the lack of removal of plant residues, and pattern, amount and method of fertilizer consumption. Non-significant correlation between AP, AK, and soil texture particles (Table 5) can also be considered as an emphasis on the dominant effect of agricultural management. Based on the results of the present study, agriculture has not improved soil quality in the region and it has not been a suitable land use for sustainable management. Consequently, it is recommended that the decision-makers should play a more prominent role as a guide for farmers to train in selecting more suitable crops and fertilizer recommendations.