3.1 Survival and reproduction of F. candida
The soil samples used in this study were broadly representative of the Cd concentrations found in most Chinese Cd-contaminated agricultural soils (Zhao et al., 2015), and the Cd concentrations in all soils were higher than the current national soil Cd risk screening value (GB 15618-2018). On average, 76.1% of adults survived in the 28 Cd-contaminated soils (Table 1), indicating that mortality of F. candida was relatively low in the naturally Cd-contaminated soils. Pearson’s correlation analysis showed no significant relationship between survival number and soil total Cd concentration or any other soil property (Fig. 1), and the Cd LC50 could not be calculated for these soils. The scatter plot of the survival rate in the different Cd-contaminated soils is shown in Fig. 2. The survival rate appeared to increase with Cd concentration in soils containing < 3 mg Cd kg−1 (r = 0.528, P < 0.05) but showed a significant negative relationship with soil Cd concentration in soils containing > 3 mg Cd kg−1 (r = −0.645, P < 0.01). These trends differ from those of previous laboratory studies, which reported a significant negative relationship between soil Cd concentration and survival rate and identified soil pH, SOC, and CEC as essential parameters for survival (Luo et al., 2014; Sahraoui et al., 2021). This may be because the previous conclusions were drawn mainly from range-finding tests in which Cd concentration, soil pH, SOC, CEC, etc. were each varied as a single variable in a specific soil (artificial soil, standard LUFA soil, or freshly Cd-spiked soil) (Lock and Janssen, 2001a, b; Liu et al., 2019). Under such conditions, it is easy to identify the main factors and the most appropriate soil conditions. However, the naturally contaminated soils studied here were more complex, with soil properties closely associated with one another, and it was therefore difficult to isolate these relationships.
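The contrast between the pooled and the split correlations described above can be illustrated with a short sketch. The soil values below are hypothetical and chosen only to mimic the reported pattern (a rise in survival below 3 mg Cd kg−1 and a decline above it); the function names are illustrative and are not taken from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def split_correlations(cd, survival, threshold=3.0):
    """Return (r_pooled, r_low, r_high) for soils below/above `threshold` mg Cd/kg."""
    low = [(c, s) for c, s in zip(cd, survival) if c < threshold]
    high = [(c, s) for c, s in zip(cd, survival) if c > threshold]
    r_low = pearson_r(*zip(*low))
    r_high = pearson_r(*zip(*high))
    return pearson_r(cd, survival), r_low, r_high


# Hypothetical soils (NOT data from the study): survival rises with Cd
# below 3 mg/kg and falls above it.
cd = [0.5, 1.0, 1.5, 2.0, 2.5, 4.0, 8.0, 12.0, 20.0]
survival = [60, 65, 72, 78, 85, 90, 80, 75, 55]

r_pooled, r_low, r_high = split_correlations(cd, survival)
print(f"pooled r = {r_pooled:.2f}, low-Cd r = {r_low:.2f}, high-Cd r = {r_high:.2f}")
```

With data of this shape, the two subsets show strong correlations of opposite sign while the pooled correlation is much weaker, which is one way a significant relationship within each concentration range can coexist with a non-significant relationship across all soils.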
Notably, the current study showed that survival number decreased with increasing soil pH (r = −0.585, P < 0.05) in soils with low Cd concentrations (< 3 mg kg−1) (Fig. S1), suggesting that pH, rather than Cd, was the major factor constraining survival in these soils; Cd may even have had a promotive effect on the survival rate in soils with low Cd concentrations. Previous studies indicate that collembolans are better suited to acidic and circumneutral soils and that few survive in soils with pH values > 8.0, indicating that soil pH plays a major role in collembolan survival (Fountain and Hopkin, 2005; Howcroft et al., 2009). Therefore, the standard single-species test should not be generalized across a wide range of soil pH values. Instead, the standard single-species test animal should be specified for certain ranges of pH values (Fountain and Hopkin, 2005; Howcroft et al., 2009; Liu et al., 2019). An alternative explanation is hormesis, in which stimulatory or beneficial effects occur at low doses but inhibitory or toxic effects occur at higher doses (Mark, 2008; Gospodarek et al., 2020). Hormesis should therefore be considered in experimental design, in the selection of biological markers and endpoints, and in risk model development when assessing pollutants (Christiani and Zhou, 2016). Overall, the relationships between survival rate and soil properties are intricate and challenging to predict using traditional statistical methods. The BPNN model was therefore used here to resolve the complex relationship between survival rate and soil properties, with soil total Cd concentration and soil pH selected as the input variables of the neural network model.
Table 1
Survival number and rate, juvenile number and reproduction inhibitory rate, Cd concentration in body tissue, and Cd bioaccumulation factor (BAF) of F. candida in 28 naturally Cd-contaminated soils.
| Statistic | Survival number | Survival rate (%) | Juvenile number | Reproduction inhibitory rate (%) | Cd concentration in tissue (mg kg−1) | BAF |
| Min | 5 | 50 | 61 | 0 | 2.44 | 0.40 |
| Max | 9.3 | 93 | 352 | 82.7 | 85.7 | 36.0 |
| Median | 7.3 | 73.3 | 172 | 51.1 | 10.1 | 3.16 |
| Mean ± SE | 7.6 ± 0.2 | 76.1 ± 2.10 | 181 ± 16 | 48.5 ± 4.4 | 18.2 ± 3.8 | 5.00 ± 1.46 |
| Coefficient of variation (%) | 20.6 | 20.6 | 46.3 | 46.3 | 112 | 146 |
Reproduction number showed significant negative relationships with soil pH and total Cd concentration based on Pearson’s correlation analysis (Fig. 1). Based on the total soil Cd concentration, the EC50 was 8.00 (5.92–24.1) mg kg−1, which is lower than the values reported in previous studies using artificial soils (Lock and Janssen, 2001a). This also indicates that Cd is not the only factor limiting F. candida reproduction in soils. The scatter plot of the reproduction number in the different naturally Cd-contaminated soils is shown in Fig. S2. In some soils with lower Cd concentrations, the reproduction number was quite low, possibly because the alkaline soil pH was unsuitable for F. candida reproduction (Fountain and Hopkin, 2005). For example, in soil No. 2, the total Cd concentration was only 0.62 mg kg−1 but the reproduction inhibitory rate was 50%, which can be attributed to the high pH value (8.47) (Table S1). Zhang et al. (2021) obtained similar results, in which soil pH had more adverse effects than metals on F. candida reproduction in naturally contaminated soils, especially in calcareous soils. In addition, adverse effects on individuals and populations may arise only when metal uptake exceeds the organism's detoxification capacity (Lock and Janssen, 2001b; Howcroft et al., 2009). The Cd accumulated in soil animals may be rendered non-toxic through the formation of metal-protein complexes. Such complexes were observed by Morgan et al. (2004) and Vijver et al. (2006), who found that 70% of Cd in earthworms was sequestered by metallothionein (MT). Therefore, Cd may accumulate to some extent in the animal’s body with minimal toxic effects on the survival and reproduction of the test organisms. Survival as a toxicity endpoint responds in a complex way to soil characteristics and has low sensitivity; therefore, further studies are required to better understand metal accumulation in soil animals.
3.2 Cadmium accumulation in F. candida
The internal Cd concentrations in F. candida and the BAF values in the 28 naturally Cd-contaminated soils were 18.2 ± 3.83 mg kg−1 (2.44–85.7 mg kg−1) and 5.00 ± 1.46 (0.398–36.0), respectively, and differed significantly among soils (P < 0.001) (Table 1). The coefficients of variation were > 100%, indicating that the wide variation in soil properties affected Cd accumulation in F. candida. The Cd BAF values were > 1.0 in 92.9% of the soil samples, suggesting a high ecological risk of Cd and potential food chain risks in these soils, which warrant concern.
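For readers who want to reproduce this kind of summary, the sketch below computes a bioaccumulation factor and the share of soils with BAF > 1.0. It assumes the conventional definition of the BAF as the ratio of the tissue Cd concentration to the total soil Cd concentration, which is consistent with Eqs. (9) and (10) differing only by Log CdT. The tissue and soil pairs are hypothetical, and the function names are illustrative.

```python
def bioaccumulation_factor(tissue_cd_mg_kg, soil_cd_mg_kg):
    """BAF = Cd in tissue (mg/kg) divided by total Cd in soil (mg/kg)."""
    if soil_cd_mg_kg <= 0:
        raise ValueError("soil Cd concentration must be positive")
    return tissue_cd_mg_kg / soil_cd_mg_kg


def percent_above(values, threshold=1.0):
    """Percentage of values strictly greater than `threshold`."""
    return 100.0 * sum(v > threshold for v in values) / len(values)


# Hypothetical (tissue Cd, soil Cd) pairs in mg/kg; NOT data from the study.
pairs = [(10.0, 2.0), (4.0, 8.0), (30.0, 3.0), (6.0, 1.5)]
bafs = [bioaccumulation_factor(t, s) for t, s in pairs]
print("BAF values:", bafs)
print(f"Share with BAF > 1: {percent_above(bafs):.1f}%")

# With 28 soils, the reported 92.9% corresponds to 26 soils with BAF > 1.
print(f"26 of 28 soils = {100 * 26 / 28:.1f}%")
```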
Pearson correlation analysis (Fig. 1) showed that soil pH was the primary factor negatively influencing the accumulated Cd concentration in F. candida and the BAF-Cd values. Conversely, silt, total K, and total Cd concentrations in soils were positively associated with the internal Cd concentration. Soil total K and available K concentrations were also positively related to the BAF, whereas the total Cd concentration in soils was negatively associated with the BAF. The RDA gave similar results (Fig. 3). A Monte Carlo permutation test in the RDA showed that Log [Cd] and Log [BAF] were significantly correlated with the selected soil properties (Pseudo-F = 10.5, P < 0.005). The first and second RDA components (RDA1 and RDA2) explained 50.4% and 39.7% of the total variation, respectively, suggesting that the soil attributes were highly diverse. Soil total Cd concentration was the dominant factor significantly constraining Cd accumulation, contributing 43.6% of the variance. Soil pH, K, and SOC made smaller contributions of 22.0%, 12.6%, and 6.2% of the total variation, respectively.
Nutrient elements have been given less consideration in previous studies on metal accumulation (van Gestel, 2008; Liu et al., 2019). Here, however, Pearson’s correlation analysis and the RDA both indicated that K was important for Cd accumulation by F. candida. These results may be explained as follows. Firstly, K may exchange with Cd on soil particle surfaces, thereby increasing the exchangeable/available Cd concentration in soils and, in turn, Cd assimilation by soil animals (Wu et al., 2015; Wang et al., 2019). Wang et al. (2019) found that K addition had a significantly positive relationship with available Cd content in soils and that K increased Cd accumulation in plant tissues. Secondly, K generally tends to be deficient in Chinese agricultural soils, with deficiency affecting up to 60% of the agricultural soil area (Chen and Zheng, 2004; Römheld and Kirkby, 2010). In this study, the average total K level in the selected soils was 15.1 g kg−1 (3.16–26.9 g kg−1), slightly lower than the average K content in Chinese soils (Chen et al., 2020). K deficiency may therefore be a constraining factor for collembolans in the selected soils. Thirdly, K is an essential nutrient for the growth and development of organisms and is involved in numerous processes, such as the regulation of enzyme activity, membrane potential, cellular homeostasis, and stable protein synthesis (Tester, 2001; Luersen et al., 2016). Thus, K may increase the resistance of F. candida under metal exposure, further increasing Cd accumulation in the body. Studies have also confirmed that K channels control neuromuscular, mating, and locomotion behaviors in Caenorhabditis elegans (Abraham et al., 2010; LeBoeuf and Garcia, 2012; Luersen et al., 2016). Potassium channels have secretory and excretory functions and thus may promote the absorption and accumulation of Cd (Piermarini et al., 2013; Wu et al., 2015).
More importantly, Cd2+ binds to K channels, and this influences the transfer and cytotoxicity of Cd (Wang et al., 2017). Thus, K is important for Cd accumulation in F. candida. To provide direct evidence of the effects of soil K on Cd accumulation in F. candida, a K-spiked single-species test was conducted based on the OECD guidelines with some modifications (OECD, 2009; Supplementary materials). The results confirmed that soil K was positively related to Cd accumulation in F. candida (R2 = 0.964, P < 0.001), with the body Cd concentration increasing significantly with increasing soil K concentration (Fig. S3, P < 0.001). The current study indicates the necessity of studying the effects of nutrient elements on metal accumulation in biota.
3.3 Comparison of Cd toxicity prediction performance between multiple linear regression and BPNN models
Multiple linear regression models are commonly used to predict a dependent variable from the optimum combination of multiple independent variables (De’ath and Fabricius, 2000). Here, the survival rate of F. candida in naturally Cd-contaminated soils could not be predicted by multiple linear regression analysis; therefore, only the BPNN model was used for this endpoint. The reproduction inhibitory rate was predicted by both the BPNN and multiple linear regression models, and the results were compared. All soil properties were entered as candidate variables, and the reproduction inhibitory rate in the tested soils could be described by a stepwise multiple linear regression equation in which only soil total Cd concentration and soil pH were retained as valid parameters. The stepwise regression equation was as follows:
Reproduction inhibitory rate = 0.180 Log CdT + 0.073 pH – 0.097 (R2 = 0.302, P < 0.05) (8)
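As a worked illustration of Eq. (8), the sketch below evaluates the equation for soil No. 2 (total Cd 0.62 mg kg−1, pH 8.47; both values are taken from Section 3.1). It assumes that Log denotes the base-10 logarithm and that the equation returns the inhibitory rate as a fraction rather than a percentage; neither is stated explicitly alongside the equation, and the function name is illustrative.

```python
import math


def reproduction_inhibition_eq8(total_cd_mg_kg, ph):
    """Eq. (8): inhibitory rate (assumed to be a fraction) from total Cd and pH."""
    # Assumption: "Log" in Eq. (8) is the base-10 logarithm.
    return 0.180 * math.log10(total_cd_mg_kg) + 0.073 * ph - 0.097


# Soil No. 2 as described in Section 3.1: low Cd but strongly alkaline.
soil_no_2 = reproduction_inhibition_eq8(0.62, 8.47)
print(f"Predicted inhibitory rate for soil No. 2: {100 * soil_no_2:.1f}%")
```

Under these assumptions the equation gives roughly 48% inhibition for soil No. 2, close to the 50% observed, and almost all of the predicted inhibition comes from the pH term rather than from the Cd term.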
Based on Pearson’s correlation analysis, soil total Cd concentration and soil pH had significant relationships with survival and reproduction inhibitory rate; thus, these two properties were set as input variables in the BPNN model, and the survival or reproduction inhibitory rate was the output target. The network had 2 input nodes, 1 output node, and 5 hidden-layer nodes (determined from the empirical formula), and the number of network iterations was 5000. In the BPNN model, 20 sets of samples were used for training, 4 for validation, and 4 for testing. Specific training, validation, and testing results are shown in Table S2, and the predictions for all the samples were significant (P < 0.001). The prediction results were compared with those of the multiple linear regression model, as shown in Table 2 and Fig. 4. The survival and reproduction inhibitory rates were successfully predicted by the BPNN model, with R2 = 0.797 and 0.930, respectively. The prediction accuracy of the BPNN model for the reproduction inhibitory rate was much higher than that of multiple linear regression, with a higher R2 and lower MAE, MRE, and RMSE. This indicates that the BPNN model is superior to the stepwise regression model. In addition, the wide range of soil total Cd concentrations (0.54–25.7 mg kg−1) indicated that the BPNN model was sensitive and suitable for a wide range of soil metal concentrations. Therefore, the newly developed BPNN model can be useful for predicting Cd toxicity to F. candida in naturally contaminated soils.
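The network layout described above (2 inputs, 5 hidden nodes, 1 output, 5000 iterations, and a 20/4/4 split) can be sketched in plain Python as follows. This is a minimal illustration and not the implementation used in the study: the tanh activation, learning rate, weight initialization, and batch gradient descent are assumptions, and the 28 samples are synthetic values generated only to exercise the code.

```python
import math
import random


def forward(w_hid, w_out, x):
    """Return (hidden activations, output) for one input vector."""
    # The last weight in each row is the bias; zip() stops before it.
    h = [math.tanh(sum(w * v for w, v in zip(row, x)) + row[-1]) for row in w_hid]
    y = sum(w * v for w, v in zip(w_out, h)) + w_out[-1]
    return h, y


def mse(w_hid, w_out, xs, ts):
    """Mean squared error of the network on a data set."""
    return sum((forward(w_hid, w_out, x)[1] - t) ** 2 for x, t in zip(xs, ts)) / len(xs)


def train_bpnn(xs, ts, n_hidden=5, epochs=5000, lr=0.05, seed=0):
    """Train a one-hidden-layer network with back-propagation (batch gradient descent)."""
    rng = random.Random(seed)
    n_in = len(xs[0])
    w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    history = []  # training MSE measured before each weight update
    for _ in range(epochs):
        g_hid = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        g_out = [0.0] * (n_hidden + 1)
        sse = 0.0
        for x, t in zip(xs, ts):
            h, y = forward(w_hid, w_out, x)
            err = y - t
            sse += err * err
            for j in range(n_hidden):
                # Error propagated back through the tanh hidden node.
                delta = err * w_out[j] * (1.0 - h[j] ** 2)
                for i in range(n_in):
                    g_hid[j][i] += delta * x[i]
                g_hid[j][-1] += delta
                g_out[j] += err * h[j]
            g_out[-1] += err
        for j in range(n_hidden):
            for i in range(n_in + 1):
                w_hid[j][i] -= lr * g_hid[j][i] / len(xs)
        for j in range(n_hidden + 1):
            w_out[j] -= lr * g_out[j] / len(xs)
        history.append(sse / len(xs))
    return w_hid, w_out, history


# Synthetic, pre-scaled inputs standing in for (Log CdT, pH); NOT study data.
data_rng = random.Random(1)
inputs = [[data_rng.random(), data_rng.random()] for _ in range(28)]
targets = [0.5 + 0.3 * math.sin(3.0 * a) - 0.2 * b for a, b in inputs]

train_x, train_t = inputs[:20], targets[:20]      # 20 training samples
valid_x, valid_t = inputs[20:24], targets[20:24]  # 4 validation samples
test_x, test_t = inputs[24:], targets[24:]        # 4 testing samples

w_hid, w_out, history = train_bpnn(train_x, train_t)
print(f"training MSE: first {history[0]:.4f}, last {history[-1]:.4f}")
print(f"validation MSE: {mse(w_hid, w_out, valid_x, valid_t):.4f}")
print(f"testing MSE: {mse(w_hid, w_out, test_x, test_t):.4f}")
```

With only four validation and four testing samples, the held-out errors in a setup like this can vary noticeably from one split to another, so they are best read alongside the training error rather than in isolation.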
Table 2
Prediction performance parameters of the BPNN and stepwise multiple linear regression models for the estimation of the survival rate, reproduction inhibitory rate, Cd concentration in F. candida, and Cd bioaccumulation factor (BAF).
| Target | Model | R2 | MAE | MRE (%) | RMSE |
| Survival rate | BPNN | 0.797 | 3.27 | 0.042 | 4.69 |
| Reproduction inhibitory rate | BPNN | 0.827 | 8.24 | 0.194 | 10.3 |
| Reproduction inhibitory rate | Multiple linear regression | 0.302 | 16.3 | 0.318 | 20.7 |
| Log [Cd] | BPNN | 0.961 | 0.003 | 0.014 | 0.077 |
| Log [Cd] | Multiple linear regression | 0.638 | 0.002 | 0.012 | 0.201 |
| Log [BAF] | BPNN | 0.964 | 0.002 | 0.014 | 0.075 |
| Log [BAF] | Multiple linear regression | 0.591 | 0.002 | 0.011 | 0.194 |
3.4 Comparison between multiple linear regression and BPNN models in predicting Cd accumulation in collembolan
Cd accumulation in collembolans was described well by stepwise multiple linear regression equations in the tested soils. Only soil total Cd and pH were retained as valid parameters in the stepwise analysis, and the resulting equations for Log [Cd] and Log [BAF] were as follows:
Log [Cd] = 0.568 Log CdT – 0.154 pH + 1.748 (R2 = 0.638, P < 0.001) (9)
Log [BAF] = – 0.432 Log CdT – 0.154 pH + 1.748 (R2 = 0.591, P < 0.001) (10)
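Equations (9) and (10) share the same pH coefficient and intercept and differ by exactly one unit in the Log CdT coefficient (0.568 versus –0.432), which is what would be expected if the BAF is the tissue Cd concentration divided by the total soil Cd concentration. The sketch below evaluates both equations and checks this relationship numerically. It assumes base-10 logarithms; the example soil (8.0 mg Cd kg−1, pH 6.0) is hypothetical, and the function names are illustrative.

```python
import math


def log_cd_eq9(total_cd_mg_kg, ph):
    """Eq. (9): Log [Cd] in F. candida from soil total Cd and pH."""
    return 0.568 * math.log10(total_cd_mg_kg) - 0.154 * ph + 1.748


def log_baf_eq10(total_cd_mg_kg, ph):
    """Eq. (10): Log [BAF] from soil total Cd and pH."""
    return -0.432 * math.log10(total_cd_mg_kg) - 0.154 * ph + 1.748


# Hypothetical soil, chosen only for illustration.
total_cd, ph = 8.0, 6.0
tissue_cd = 10 ** log_cd_eq9(total_cd, ph)
baf = 10 ** log_baf_eq10(total_cd, ph)

print(f"Predicted tissue Cd: {tissue_cd:.1f} mg/kg")
print(f"Predicted BAF: {baf:.2f}")
print(f"Tissue Cd / soil Cd: {tissue_cd / total_cd:.2f}")
```

The last two printed values agree, so the two equations are algebraically the same model expressed for two different response variables.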
Based on Pearson’s correlation analysis and the RDA, soil total Cd concentration, soil pH, SOC, and soil K play significant roles in Cd bioaccumulation and were therefore selected as input variables in the BPNN model. Log [Cd] or Log [BAF] was the output target. The network had 4 input nodes, 1 output node, and 5 hidden-layer nodes (determined from the empirical formula), and the number of network iterations was 5000. In the BPNN model, 20 sets of samples were used for training, 4 sets for validation, and 4 sets for testing. Specific training, validation, and testing results are also shown in Table S2, and the predictions for all samples were significant (P < 0.001).
To keep the selected soil properties consistent across all models, multiple linear regression equations with all four variables entered were also fitted, as follows:
Log [Cd] = 0.623 Log CdT – 0.144 pH + 0.228 Log KT – 0.319 Log SOC + 1.815 (R2 = 0.687, P < 0.001) (11)
Log [BAF] = – 0.377 Log CdT – 0.144 pH + 0.228 Log KT – 0.319 Log SOC + 1.815 (R2 = 0.646, P < 0.001) (12)
The prediction performance of the entered multiple linear regression models was only slightly better than that of the stepwise multiple linear regression models; therefore, the entered equations were not compared with the BPNN model. Although the MAE and MRE of the BPNN and multiple linear regression models were similar (Table 2), the BPNN model had a higher R2 and a lower RMSE than the multiple linear regression model. The prediction accuracy of the BPNN model for Log [Cd] and Log [BAF] was therefore superior to that of the multiple linear regression model.
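Two models can have similar MAE yet different RMSE and R2 because the RMSE and R2 weight large individual errors more heavily. The sketch below illustrates this with toy numbers. The metric definitions are the conventional ones (MRE as the mean of absolute error divided by the absolute observed value, R2 as 1 − SSres/SStot) and are assumptions here, since the exact formulas used for Table 2 are defined elsewhere in the paper; the observed and predicted values are hypothetical.

```python
import math


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


def mre(obs, pred):
    """Mean relative error (as a fraction of the observed value)."""
    return sum(abs(p - o) / abs(o) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root mean squared error."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def r_squared(obs, pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


# Toy values, NOT study data: model A is off by 0.5 everywhere,
# model B is exact except for one large miss.
observed = [1.0, 2.0, 3.0, 4.0]
model_a = [1.5, 2.5, 3.5, 4.5]
model_b = [1.0, 2.0, 3.0, 6.0]

for name, pred in (("A", model_a), ("B", model_b)):
    print(
        f"model {name}: MAE={mae(observed, pred):.2f} "
        f"MRE={mre(observed, pred):.2f} "
        f"RMSE={rmse(observed, pred):.2f} "
        f"R2={r_squared(observed, pred):.2f}"
    )
```

In this toy case both models have the same MAE, but the single large miss in model B doubles its RMSE and sharply lowers its R2, mirroring the pattern in which a regression model deviates mainly for a few high-valued samples.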
Notably, the Log [Cd] predicted by the BPNN model correlated more strongly with the measured Log [Cd] than did the Log [Cd] predicted by the multiple linear regression model. The BPNN-predicted and measured values lay very close to the 1:1 line across a wide range of soil Cd concentrations (R2 = 0.910), whereas the correlation for the multiple linear regression model was lower (R2 = 0.638, Fig. 4C). A similar trend was found for Log [BAF] (Fig. 4D): the BPNN predictions correlated more strongly with the measured Log [BAF] than the multiple linear regression predictions did and lay very close to the 1:1 line (R2 = 0.979). In addition, with increasing Log [BAF], the values predicted by the multiple linear regression model gradually deviated from the 1:1 line, especially when the measured Log [BAF] exceeded 1.00. Recently, Wang et al. (2021) obtained similar results, showing that a multiple linear regression model could not predict the Zn BAF in rice grain well when the BAF exceeded 0.40. Thus, the multiple linear regression model is not suitable for predicting high Cd BAF values and is of limited use across a wide range of soil properties. The BPNN model accounted for the complex nonlinear relationships between Cd accumulation in F. candida and soil properties and was superior in predicting internal Cd concentrations and Cd BAF values in F. candida over a wide range of Cd concentrations.
3.5 Application of the BPNN models for predicting Cd ecotoxicity in Southern Chinese soils
With the developed BPNN models, Cd toxicity to and accumulation in soil collembolans can be predicted without conducting toxicity tests. Here, 57 soil samples were collected from southern China, covering a wide range of soil properties (soil pH 4.14–8.36, soil total Cd 0.15–25.5 mg kg−1, SOC 5.94–47.8 g kg−1, Table S3). The accumulated Cd concentrations in F. candida and the BAF values were predicted using the BPNN models with soil total Cd, soil pH, SOC, and K as inputs (4 input nodes, 1 output node, 5 hidden-layer nodes, and 5000 network iterations). Multiple linear regression models with total Cd and pH in soils as the valid parameters were also used for comparison (Fig. 5). The predicted values were visualized on sensitivity maps, with different colors indicating the level of risk. The internal Cd concentrations in F. candida and the BAF-Cd values predicted by the BPNN models were generally higher than those predicted by the multiple linear regression models. Notably, the BPNN models better identified the regions of higher risk. This is consistent with the results above, in which multiple linear regression underestimated Cd accumulation in F. candida (Fig. 4).
Mathematical models are also beneficial for quantifying influencing factors and developing further strategies to control metal risks in soils. Soil pH tends to decrease and SOC tends to increase in agricultural soils (Guo et al., 2010; Sun et al., 2012). Therefore, the effects of changes in soil properties need to be considered when evaluating the ecological risks of Cd in soils. Three soils (Table S1, Soil No. 15, 25, and 6) with different pH values (4.73, 5.85, and 7.36) were used to demonstrate the effects of changes in pH (± 0.5 and ± 1 unit) and SOC (+ 1% and + 3%) on the BAF-Cd values. The results showed that soil pH was more important than SOC for Cd accumulation in F. candida (Fig. 6). Increases in SOC of 1% and 3% had no significant effect on the BAF-Cd values, but pH was significantly negatively associated with the BAF-Cd values. This is probably because decreasing soil pH increased Cd bioavailability, further increasing the Cd accumulated in the collembolan bodies. Thus, remediation measures that increase soil pH should be adopted to lower the ecological risks of Cd in soils. The input data for the BPNN model are easily derived from soil chemical analyses or obtained from national soil survey databases (Zhao et al., 2015), so Cd toxicity to and accumulation in F. candida in soils can be predicted by the BPNN model without conducting toxicity tests.
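The direction and rough size of the pH effect can be illustrated with the linear model of Eq. (10), used here only as a simple stand-in for the BPNN model. In that equation, a pH change alters Log [BAF] by –0.154 per unit regardless of the soil Cd concentration, so the relative change in BAF depends only on the size of the pH shift. The sketch assumes base-10 logarithms and will not reproduce the BPNN-based values in Fig. 6; the function name is illustrative.

```python
PH_COEFFICIENT = -0.154  # slope of Log [BAF] on pH in Eq. (10)


def baf_ratio_after_ph_shift(delta_ph, ph_coefficient=PH_COEFFICIENT):
    """Factor by which the predicted BAF changes when soil pH shifts by `delta_ph`.

    Assumes the linear Eq. (10) with base-10 logarithms, not the BPNN model.
    """
    return 10.0 ** (ph_coefficient * delta_ph)


# pH shifts examined in the text: +/- 0.5 and +/- 1 unit.
for delta in (-1.0, -0.5, 0.5, 1.0):
    ratio = baf_ratio_after_ph_shift(delta)
    print(f"pH shift {delta:+.1f}: predicted BAF x {ratio:.2f}")
```

Under this linear approximation, acidification by one pH unit raises the predicted BAF by roughly 40%, while raising the pH by one unit lowers it by roughly 30%, in line with the conclusion that increasing soil pH reduces Cd accumulation.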