2.1 Study area
The study area is located in the eastern part of the Qinghai-Tibet Plateau, a typical industrial area(range: 36.715667°~36.943608°N, 101.554588°~101.805370°E) (Fig. 1) in a valley, belonging to the Qilian Mountains in eastern Qinghai Province, with an average elevation ranging from approximately 2,210 to 3,900 m. The multiyear average temperature is approximately 4.9°C, and the precipitation reaches approximately 520 mm. The main soil type in the study area is black calcium soil and chestnut calcium soil. Heavy metal pollution, including zinc, lead and cadmium pollution, may be caused by some industries in this area (Khademi et al., 2019). In addition, there are mining, farming and agro-food processing industries, which together constitute an industrial cluster.
2.2 Sampling and analysis
The sampling sites were set up according to the principle of a uniform distribution, and the density was appropriately increased around the key polluting enterprises and should further consider the need for sampling in areas with different land use types(Fig. 1). Fifty-one soil samples were collected in the study area in spring (March to May), in which each soil sample included two subsamples at different sampling depths: topsoil (layer A: 0–20 cm) and middle soil (layer B: 20–40 cm). Each sample was stored in a foam box or sample box containing ice bags and sent to the laboratory on the same day of sampling to assess the relevant parameters.
After preliminary processing, such as debridement and sieving, the collected soil samples were treated according to Chinese national standards HJ/T 166–2004 Technical Specification for Soil Environmental Monitoring, NYT 391–2021 Environmental Quality of Green Food Production Land, and Soil Environmental Quality Soil Contamination Risk Control Standards for Agricultural Land (Trial) (GB 15618 − 2018), and the sample pre-treatment technique involving nitric acid + hydrochloric acid + hydrofluoric acid supplemented by microwave digestion was used to determine the five heavy metals of Cr, Cd, Cu, Pb and Zn in the collected soil samples via inductively coupled plasma optical emission spectrometry (ICP‒OES). Soil organic matter (SOM) and organic carbon (SOC) were determined via the potassium dichromate (K2Cr2O7) volumetric method (hydration heat method). The soil pH was determined using water as a leaching agent with a water-to-soil ratio of 2.5:1, followed by composite electrode measurement. Other soil physical and chemical properties (total nitrogen (TN), total phosphorus (TP), total potassium (TK) and bulk weight) were determined in accordance with the Technical Specification for Soil Analysis.
The spatial pattern of the content and heavy metal combination type, integrated pollution risk and ecological risk, were obtained via the IDW interpolation prediction method based on the ArcGIS 10.6 platform. We used the GDM based on R language to identify the optimal parameters (intermittent methods and intermittent intervals) for continuous-type variables such as topographic factors, soil indicators, and distance factors to obtain the maximum q-statistic value and thus fully reflect the explanatory power of the influencing factors.
2.3 Types of heavy metal combinations of similar pollution sources
The APCS-MLR model calculation process includes 6 steps:
Step 1
Five heavy metal contents were converted into a standardized form.
$${Z}_{i}=({C}_{i}-{\overline{C}}_{i})/{\sigma }_{i}$$
4
where \({C}_{i}\) and \({\overline{C}}_{i}\) are the measured and mean value of heavy metal i, respectively, and \({\sigma }_{i}\) is the standard deviation of the content of heavy metal i.
Step 2
PCA was performed of the standardized soil heavy metals.
$${Z}_{i}=\sum _{j=1}^{p}{g}_{j}\bullet {h}_{ij}$$
5
where j = 1..., p denotes the sources, and \({g}_{j}\) and \({h}_{ij}\) are the factor loading and factor score, respectively. Factor extraction with eigenvalues > 1 was employed after varimax rotation.
\({Z}_{i}\) must be converted into unstandardized absolute principal component scores (APCSs) before they can be used to analyse the contribution of the PCs to each heavy metal. Next, the APCSs could be calculated.
Step 3
An artificial sample of heavy metal i accumulated to 0 was artificially introduced at the observation sites and standardized. The standardized value of 0 can be calculated as follows
$${Z}_{i0}=\frac{0-\overline{{C}_{i}}}{{\sigma }_{i}}=-\frac{\overline{{C}_{i}}}{{\sigma }_{i}}$$
6
where \({Z}_{i0}\) is the standardized value of 0, \({\overline{C}}_{i}\) and \({\sigma }_{i}\) denotes the same meanings as Eq. (4). After calculating the standardized value of 0, i.e., \({Z}_{i0}\) is attached to the standardized data of Step 1, PCA was again performed.
Step 4
The APCS value for each heavy metal at the different positions was obtained by subtracting the factor score from the initial value, and was subtracted from the factor score (artificial samples). The APCS can be calculated as follows
$${APCS}_{i}={Z}_{i}-{\left({Z}_{0}\right)}_{i}$$
7
$${\left({Z}_{0}\right)}_{j}=\sum _{i=1}^{i}{S}_{i}\bullet {Z}_{i0}$$
8
where\({Z}_{i}\) is the same as that in Eq. (4),\({\left({Z}_{0}\right)}_{i}\) is the PCS value at a value of 0,\({S}_{i}\) is the factor score coefficient, and\({Z}_{i0}\) is the zero-valued standardized pollutant concentration at the observation sites.
Step 5
The contribution of heavy metal combination type k to the accumulation of heavy metal factor i was determined. Adopting the measured heavy metal content as the dependent variable and the APCS as the independent variable, multiple linear regression (MLR) fitting was performed, and regression coefficients were obtained. Then, the linear relationship between heavy metal content and the APCS under combination type k () can be expressed as
$${C}_{i}={b}_{i0}+\sum _{k=1}^{k}\left({b}_{ki}\bullet {APCS}_{ik}\right)$$
9
where \({b}_{ki}\) is the coefficient of MLR, \({b}_{i0}\) is a constant term, and \({b}_{ki}\bullet {APCS}_{ik}\) is the contribution value of combination type k to heavy metal i.
Step 6
The contribution rate of combination type k to heavy metal factor i can be calculated as follows
$${PC}_{ki}=\frac{{b}_{pi}\bullet \overline{{APCS}_{ik}}}{{b}_{i0}+\sum _{k=1}^{k}\left({b}_{pi}\bullet {APCS}_{ik}\right)}$$
10
where \(\overline{{APCS}_{ik}}\) is the mean APCS value of all samples of heavy metal i.
2.4 Geo-detector model for quantitative analysis
We choose two models, the factor-detector and the interaction-detector in the Geo-detector model, to further analyse potential pollution sources of heavy metal combination types. The APCS was chosen as the dependent variable, and environmental influencing factors were selected as independent variables. With reference to the results of previous studies and combining the actual conditions of the study area and sample selection, thereby focusing on the exploration of the influence of factors such as industrial activities, we selected 17 potential pollution sources as indicated in Table A1, to match several combination types identified through the APCS-MLR model (considering the common characteristics of waste and wastewater treatment plants (Pan et al., 2021; Yuan et al., 2022), we combined them into one potential pollution source), of which the distance factor was selected as the shortest distance between a given sampling site and the potential pollution sources, and the soil and land use types were assessed and recorded in the field.
Factor and interaction-detector methods were selected to calculate the driving force (q statistic) of the selected driving factors and potential pollution sources, respectively, on the APCS, and the factor-detector model can be expressed as:
$${q}_{x}=1-\frac{\sum _{h=1}^{L}{N}_{h}{\sigma }_{h}^{2}}{{N\sigma }^{2}}$$
11
where \({q}_{x}\) is the explanatory power of the potential sources for the independent variable, with q∈(0,1) and h = 1, 2, 3..., L, and L is the number of levels of factor x; N and \({N}_{h}\) are the study area and the number of samples in each stratum h, respectively; and \({\sigma }^{2}\) and \({\sigma }_{h}^{2}\) are the variance in Y in the whole study area and each level h, respectively.
The interaction detector determines whether there exists an interaction between any two of the potential pollution sources. As such, we determined whether the joint effect of the variables X1 ∩ X2 on the distribution of the soil heavy metal content was enhanced or diminished or whether these variables were independent. The calculation principle entails the calculation of q-statistic values of the effect of variables X1 and X2 on the soil heavy metal content, i.e., q(X1) and q(X2), respectively, calculation of the q-statistic value of their interaction, q(X1∩X2), and comparison of q(X1), q(X2) and q(X1∩X2). The calculation results could indicate the following five modes:
Nonlinear-weakening: MIN[(q(X1), q(X2)] > q(X1∩X2)
Univariate-weakening: MAX[q(X1), q(X2)] > q(X1∩X2) > MIN[q(X1), q(X2)]
Bivariate-enhancing: q(X1∩X2) > MAX[q(X1), q(X2)]
Independent: q(X1∩X2) = q(X1) + q(X2)
Nonlinear-enhancing: q(X1∩X2) > q(X1) + q(X2)
2.5 Integrated risk index evaluation model
In this study, a new integrated ecological risk factor evaluation model, the NIRI, was established, and the corresponding evaluation results were compared to those of the PLI and NIPI.
The PLI is mainly used to identify the combined pollution due to all heavy metals which are influence the soil pollution risks.
$$PLI=\sqrt[n]{{P}_{i1}\times {P}_{i2}\times {P}_{i3}\times \dots {P}_{in}}$$
12
$${P}_{i}=\frac{{C}_{i}^{}}{{C}_{b}^{}}$$
13
where n is the quantity of heavy metals;\({C}_{i}^{}\)and \({C}_{b}^{}\)are the content and background value of heavy metal I, and Pi1... Pin denotes the pollution factors of the individual heavy metal pollutants. The PLI can be used to evaluate the quality of the soil pollution more directly and intuitively.
The NIPI can be calculated as follows:
$${P}_{n}=\sqrt{{({P}_{iave}}^{2}+{{P}_{imax}}^{2})/2}$$
14
where Pn is the Nemerow composite \({P}_{i}\) value and Pi can be calculated via the same procedure as that expressed in Eq. (13). \({P}_{iave}\) and\({P}_{imax}\) are the mean and maximum values of Pi, respectively.
NIRI not only refers to the algorithm principle of the NIPI and RI but also captures the most important feature of introducing the degree of the toxicity of the different heavy metal factors, thus preventing extreme evaluation results due to the quantity of heavy metals. The NIRI can be expressed as follows:
$${E}_{i}={T}_{i}\bullet {C}_{i}/{S}_{i}$$
15
$$NIRI=\sqrt{{(E}_{imax}^{2}+{E}_{iaverage}^{2})/2}$$
16
where \({E}_{i}\) is the potential ecological risk factor for heavy metal i, and \({T}_{i}\), in this study, were chosen as 30, 2, 5, 5, and 1, respectively (Fang et al., 2019) for the toxicity of Cd, Cr, Cu, Pb and Zn; \({C}_{i}\) is the same as Eq. (13), and \({S}_{i}\) is heavy metal reference level. \({E}_{imax}\)and \({E}_{iaverage}\)are the maximum and average values of\({E}_{i}\), respectively.
The level of PLI, Pi and NIRI are shown in Table A2.
To compare the contributions of the different heavy metals to the above two risk evaluation models, the following equation could be used for calculation.
$${W}_{{P}_{i}}={P}_{i}/n{P}_{iave}$$
17
$${W}_{{E}_{i}}={E}_{i}/n{E}_{iaverage}$$
18
where \({W}_{{P}_{i}}\) and \({W}_{{E}_{i}}\)are the contribution rates associated with the NIPI and NIRI, respectively, of heavy metal i, and n is the number of heavy metal factors.