Spatial Heterogeneity Analysis of Soil Heavy Metals in Chongqing City Based on Different Interpolation Methods

: Heavy metal pollution in urban soil is an important indicator of environmental pollution, 3 which is of great significance to the sustainable development of cities. Choosing the best 4 interpolation method can accurately reflect the distribution characteristics and pollution 5 characteristics of heavy metals in soil, which is conducive to effective management and 6 implementation of protection strategies. In this study, the grid sampling with a depth of 40cm 7 was carried out in the whole study area based on the principle of uniform sampling, and the 8 characteristics of As, Cu and Mn elements in the soil of the main urban area of Chongqing 9 were investigated. The interpolation accuracy and difference of results of four interpolation 10 methods, namely ordinary Kriging (OK), inverse distance weighting (IDW), local polynomial 11 (LPI) and radial basis function (RBF), were analyzed and compared. The results showed that 12 the average values of As (5.802 mg kg -1 ), Cu (23.992 mg kg -1 ) and Mn (573.316 mg kg -1 ) in 13 the soil of the study area were lower than the background values of heavy metals in Chongqing. 14 Coefficient of variation showed that As (55.71%), Cu (35.73%) and Mn (32.21%) all belonged 15 to moderate variation. The parameters of semi-variance function theory model show that Cu 16 element belongs to moderate spatial correlation, and As and Mn element have strong spatial 17 correlation. The spatial distribution of the three elements was further predicted by using OK 18 method, IDW method, LPI method and RBF method, which showed that LPI and OK method 19 had strong smoothing effect and could not reflect the information of local point source 20 pollution, while the interpolation results of IDW method and RBF method greatly retained the 21 maximum and minimum information of element content, which reflected the necessity of using 22 different methods when studying the spatial distribution of soil properties.


Introduction
Due to the change of natural environment such as soil formation process and spatial continuity of climate zone, the characteristics of soil are related and related in space, which are not uniform and independent (Lahlou et al.,2004), and have the characteristics of irreversibility, long-term, concealment and lag (Li & Wu,1991).
As an important part of urban environmental factors, the environmental quality of urban soil is directly related to human health and safety, and is of great significance to the sustainable development of cities. Various media (such as atmospheric dry and wet deposition, dust and soil, etc.) exist in urban environment.Urban soil is the main gathering place of urban pollutants, and heavy metals carried by these pollutants also enter urban soil in large quantities, resulting in heavy metal pollution of urban soil, resulting in the degradation or even loss of the original function of urban soil .(Chen et al.,2020).Heavy metal pollution in urban soil is an important indicator of environmental pollution, and it has become the focus of the research on urban environmental pollution.
Geostatistics is a mathematical geological method with regionalized variables as its core and theoretical basis, and spatial correlation and variogram as its basic tools.After years of development, geostatistics has developed into a mature tool for studying spatial variability, which can retain spatial variability information to the maximum extent.The application of geostatistics in mineral geology has reached a mature stage, and has been widely used in hydrology, pedology and other fields (Cao et al.,2008;Chen et al.,2005).The optimization of spatial interpolation method is the key to accurately predict the spatial distribution characteristics and pollution risk of heavy metal content in regional soil (Ma et al.,2018), and the selection of interpolation model determines the effect of soil heavy metal pollution assessment.The spatial interpolation methods widely used in soil heavy metal pollution assessment include inverse distance weighting method, Kriging interpolation method, spline function method, multiple regression method and radial basis function method (McGrath et al.,2004;Zhao et al.,2020).
Heavy metal pollution in soil has been widely concerned by the government and the public.Many experts and scholars at home and abroad have been devoted to the study of heavy metal pollution in soil for many years, but there is little research and analysis on heavy metal pollution in the main urban area of Chongqing.Among the heavy metals, As, Cu and Mn have very strong toxicity, and have an important impact on urban development and people's health.Therefore, the objectives of this study are :(1) describe the characteristics of As, Cu and Mn contents in the soil of Chongqing;(2) Four interpolation methods, namely ordinary kriging (OK), inverse distance weighted (IDW), local polynomial interpolation (LPI) and radial basis function (RBF), were used to visualize the distribution of three kinds of heavy metal pollution and reveal the spatial distribution law of heavy metal content in soil.( 3) analyze the spatial patterns generated by OK, IDW, LPI and RBF interpolation methods and their differences.The main urban area of Chongqing includes six districts of Yuzhong, Jiangbei, Dadukou, Shapingba, Nan 'an and Jiulongpo in the core area of the city and three districts of Banan, Yubei and Beibei in the peripheral metropolitan area.The study area belongs to subtropical monsoon humid climate, with warm winter and hot summer, with annual average temperature of 16~18℃, average temperature of 5~7.9℃ in January in winter and 28~34.4℃ in July in summer, with four distinct seasons, with more fog and less frost (XU et al.,2009).The Yangtze River runs through the central part of the study area, and the Jialing River flows to the north.There are many rivers with a long history and abundant water.Because of the complex lithology of the parent rock, the soil types in the study area are rich and varied, which can be divided into 8 soil types and 16 sub-types such as paddy soil, newly accumulated soil, yellow soil, yellow brown soil, purple soil, limestone soil, red soil and mountain meadow soil.

Soil sampling and data collection
Soil sampling is carried out in the main urban area of Chongqing, China.According to the characteristics of the study area, grid sampling is carried out in the whole study area with the help of GPS.The sampling depth is 40cm and the sampling density is 4 km.Three multi-points are collected and combined within 100m around the sampling center, and four sampling points are determined by taking the sampling point located by GPS as the center and radiating around for 40m (Figure 1).Special areas such as roads and ditches were avoided when sampling, and 1kg soil samples were taken according to the quaternization method.A total of 342 soil samples were collected in the 9 district of Chongqing.
Samples collected in the field are packed in plastic bottles with caps and sent to the national first-class qualification testing center laboratory for testing.After receiving the samples in the laboratory, the soil samples are dried in the air, removed of gravel, plant debris, etc., ground with agate mortar, passed through a soil sieve with a diameter of 2mm, and then ground all through a 100-mesh sieve, of which it is best to grind all through a 100-mesh sieve at one time.200g samples are taken by quartering method and stored for later use, and 4g samples are weighed and put into a mold to be lined with boric acid.
Under a pressure of 40t, the pressure holding time is 20s, and the pressure is pressed into a diameter of 32mm.The detection was carried out according to the geological survey technical standard of China Geological Survey "Technical Requirements for Analysis of Eco-geochemical Evaluation Samples (Trial)" (DD 2005-03).The detection limits of As, Cu and Mn in samples were calculated by using different determination methods, and the detection limits all reached mg kg-1, which can meet the requirements of rapid analysis of Class II soil in the national soil environmental quality standard.

Statistical and geostatistical analyses
Standard statistical analysis includes maximum value, mean value, standard deviation, coefficient of variation, etc. to explain the situation and trend of As, Cu and Mn reserves in soil.To satisfy the assumption of normality in geostatistical analysis, the original data were logarithmically transformed using GS+10.0 and inversely transformed by weighted averages.In this study, the distribution of As, Cu and Mn reserves in the soil was generated in ArcGIS 10.2 by comparing the four interpolation methods of ordinary Kiger (OK), inverse distance weighting (IDW), local polynomial (LPI) and radial basis function (RBF).

Analysis of spatial structure characteristics
Geostatistics is a mathematical method based on the theory of regionalized variables and using semivariance function as the basic tool.It is based on the concepts of regionalized variable, random function, intrinsic hypothesis and stability hypothesis.Semi-variogram function, also known as semi-variogram, is a key function for studying soil variability in geostatistics.Grid sampling is usually used to estimate the semi-variance function of a soil property, which is conditionally negative qualitative.The calculation formula is (Dai et al.,2019): (1) (ℎ): Pair number of all observation points with h as the spacing (if there are n sampling points, then (ℎ) = n-1);(ℎ): Semi-variance, usually using (ℎ) as a semi-variance function diagram in a certain characteristic direction.
Semi-variance function has three extremely important parameters, namely, Range, Nugget and Sill, which are semi-variance functions used to express the spatial variation and correlation degree of regionalized variables on a certain scale.The variable range (a) reflects the spatial variability of soil properties, which is spatially independent outside the range value and correlated within the range value.
The Nugget value (C0) represents a variation caused by the distance between non-sampling points, which belongs to random variation and reflects the spatial variation caused by random factors (such as socioeconomic factors) (TAN et al.,2009).The Sill value (C0+C), also known as the top value, refers to the semi-variance maximum value existing in different sample spacing, reflecting the spatial variation caused by natural factors (such as soil parent material, terrain, etc.) and socio-economic factors (such as fertilization, planting system, etc.), which is composed of random variation and structural variation (Guo et al.,2019).The purpose of analyzing the spatial structure characteristics of variables is to use the determined semi-variance function to best fit the model and its parameters (Nugget value C0), Sill value (C +C0), Nugget effect (C0/C +C0) and range (a).Combined with spatial correlation distance, the spatial correlation of each attribute was evaluated, and the variation law, variation degree and the reason of variation were analyzed.The size of the range (a) is used to analyze the mobility of the variable, namely the degree of spatial dependence (Dai et al.,2019).

Ordinary kriging
Ordinary Kriging (OK) is based on the theory of regionalized variables and takes variogram as the main tool.The advantage of this method is that it considers the random distribution of sample points in the spatial structure.The advantage of this method is that the sample points are randomly distributed in the spatial structure.The accuracy of the estimate depends on the selection of the weight coefficient, and the optimal weight coefficient depends on the selection of the variogram model (Xie et al.,2018), which is used to calculate the integrity of the spatial continuity in one or more directions (Schoening et al.,2006).
In order to reflect the spatial heterogeneity of heavy metal elements in soil more accurately, the theoretical models are constructed by fitting the determination coefficient R 2 and residual error of semivariogram.Linear model, Spherical model, Exponential model and Gaussian model can be selected to construct semivariogram of As, Cu and Mn content in topsoil, and the best model is selected to analyze the spatial structure and provide interpolation input parameters (ZHANG et al.,2009).

Inverse Distance Weighting
Inverse Distance Weighting (IDW) is a simple interpolation method based on Tobler theorem (Song & Wu,2010).It is assumed that each measuring point is affected locally, and this influence is inversely proportional to the distance.Its principle is to interpolate by the weighted average of the measured values of each point near the point to be measured.According to the principle of spatial autocorrelation (Bargaoui & Chebbi,2009), the point closest to the predicted position is assigned a larger weight, but the weight decreases as a function of distance.
Where Z is the estimated value of interpolation point, Zi(i=1~n) is the measured sample value, n is the measured sample number used for estimation, Di is the distance between interpolation point and ith control point, r is the power of distance, and r=2 is taken.

Local polynomial interpolation
This method is suitable for polynomial with a given order to interpolate all the sample points in the given search neighborhood, and the resulting surface mainly depends on local variation and is easily affected by the distance between adjacent regions.The data set with small variation is most suitable for this method.

Radial basis function
The radial basis function is used to approximate the predicted value of the measured function F=F(x), and its core is to construct the approximation function.Compared with other interpolation methods, the processing is more complex, and it is suitable for interpolation of a large number of data. (4) φ(di) is the radial basis function, di is the distance between interpolation point i and sampling point x, and fj(x) is the trend function.

Data verification
The prediction accuracy of As, Cu, and Mn contents was evaluated by cross validation method (Gotway et al.,1996;Rodriguez Martin et al.,2016).Cross-checking method first assumes that the content value of each sampling point is unknown, and estimates it by using the values of surrounding sampling points, then calculates the error between the estimated value and the actual measured value, and evaluates the advantages and disadvantages of interpolation method according to the error statistical results (Robinson & Metternicht,2006).Commonly used indicators include root mean square error (RMSE), mean error (ME) and inaccuracy (IP), which are used to compare the interpolation accuracy of different methods (Yang et al.,2002).These indices are calculated as follows: (5) (6) In which Pi, Mi and M represent predicted value, measured value and measured average value respectively 3 Results and discussion

Descriptive statistical analysis
Table 1 shows the descriptive statistics of As, Cu and Mn in the study area.The content of As in soil heavy metals in the main urban area of Chongqing ranges from 1.965 mg/kg to 21.180 mg/kg, and the difference between the maximum value and the minimum value is 10.8 times.The average value (5.802 mg/kg) is lower than the background value of Chongqing soil (6.62 mg/kg).Among the three elements, the coefficient of variation of As element, the variation coefficient of As element (55.71%) was the highest, indicating that the external pollution factor was larger, and the cumulative use of pesticides, herbicides and insecticides might be the important cause.The range of Mn content is 107.900-1584.000mg/kg, the difference between the maximum value and the minimum value is 14.7 times, the average value (573.316mg/kg) is lower than the background value of Chongqing soil (615.00mg/kg), and the coefficient of variation of Mn is the lowest (32.21%), indicating that it may be less affected by external factors.The range of Cu content is 6.208-76.600mg/kg, the difference between the maximum value and the minimum value is 12.3 times, the average value (23.992 mg/kg) is not much different from the background value of Chongqing soil (24.60mg/kg), and the coefficient of variation.SD = standard deviation; CV = coefficient of variation;

A theoretical model of GS+ fitting semi-variance function
After analyzing and fitting the variogram, the best variogram model is selected based on the principle of maximum coefficient (R 2 ) and minimum residual error (RSS).Table 2 shows the best theoretical model and related parameters selected by semi-variance fitting for three heavy metals.The results show that the semi-variance theoretical model of As and Mn is Gaussian, and the semi-variance theoretical model of Cu is Exponential.Generally, the ratio of Nugget value to Still value (nugget-base ratio) is used as a scale to measure the spatial correlation degree of variables, which is an important index to reflect the spatial variation degree of regionalized variables, also known as Nugget effect.If the ratio is less than 25%, it shows that the spatial variation of variables is mainly structural variation, and the system has strong spatial correlation, which is mainly controlled by natural factors and less affected by human factors If the ratio is 25%-75%, it shows that the system has moderate spatial correlation; If the ratio is greater than 75%, the spatial correlation of the system is weak, and the variables are greatly influenced by human factors (Zheng et al.,2006).Spatial correlation is the result of structural factors and

Comparison of Four Interpolation Methods
Cross-verify and check the interpolation prediction accuracy by using the leave one method (Figure 2-4).Different scatter distribution patterns show that different methods can predict different values of the same point (Liu et al.,2013).The linear model and 1: 1 line intersect with the contents of As, Cu and Mn.The linear model is higher than the contents of As, Cu and Mn, and vice versa.This method aims to realize unbiased estimation of the mean value (Zhao et al.,2015).
It can be seen from fig. 2 that the IDW method As is the largest correlation coefficient between the predicted and measured values of as elements, followed by LPI and RBF, and the OK method has the smallest value; The correlation coefficient between predicted and measured values of Cu element is the largest by IDW method, followed by LPI and RBF, and the smallest by OK method.The correlation coefficient between predicted and measured values of As element is the largest by IDW method, followed by OK and RBF method, and the lowest by LPI method.Generally speaking, the correlation coefficients between the predicted and measured values of OK method are 0.2854~0.3186,IDW method is 0.3365~0.4384,LPI method is 0.2570~0.3949,and RBF method is 0.2325 ~ 0.3949 The correlation coefficients of As, Cu and Mn in IDW method are higher than those in OK method, LPI method and RBF method.In order to compare the accuracy of the four interpolation methods more intuitively, the mean square error (RMSE), mean error (ME) and inaccuracy (IP) were calculated respectively.The smaller the RMSE value, the higher the accuracy.The closer ME is to 0, the smaller the interpolation error is.The smaller the IP, the higher the interpolation accuracy.
Table 3 shows the cross-validation results of four interpolation methods for As, Cu and Mn elements.
The comparative analysis shows that for As elements, among the four spatial interpolation methods, OK method, IDW method, LPI method and RBF method, the three elements in RMSE are significantly different, indicating that the predicted values of the three elements are overestimated; For the same element, there is little difference in overestimation degree between different methods, among which Mn element is the most overvalued.According to the value of ME, the value of OK method is closer to 0 for As element, which indicates that its predicted value is relatively unbiased; The difference between LPI of Cu element and ME and AME of RBF method is smaller and closer to 1, which indicates that LPI and RBF method have better prediction accuracy.Compared with other methods, ME in LPI method of Mn element is closer to 1, which indicates that the prediction accuracy is better.

Prediction of Spatial Distribution of Heavy Metals in Soil
Cross-validation method is used to evaluate interpolation errors of sample points, which cannot  with small area; The situation predicted by LPI method is more concise and concentrated, which smoothly reflects the wide-ranging trend and trend; The RBF method is similar to the OK method as a whole, but it misses some local information with small distribution area and fails to reflect the transition region between high value and low value.According to the variogram, Cu belongs to a moderate degree of spatial correlation, which is greatly influenced by the thought factors, which is consistent with previous studies.Guo et al.(Guo et al.,2016)studied that heavy metal pollution in the soil of Chongqing's main urban area has a certain relationship with traffic intensity, industrial pollution and the length of urban construction.The more developed the traffic, the greater the traffic flow, and the more serious the accumulation of Cu elements.Mn content is low in the southeast of the study area, but high in the north, east and west.Mn content gradually increases from southeast to north and east-west direction, with good continuity.LPI method reflects the above characteristics, but the smoothing effect is too obvious, which can not accurately reflect point source pollution and small-scale non-point source pollution, and its interpolation results are not as detailed as the other three methods.The similarity between OK method and LPI method is higher, and the change trend of OK method is more obvious.The RBF method reflects the pollution situation in the whole study area in a small scope in detail, and the performance content is more detailed.IDW method can show the characteristics of point source pollution in areas with high concentration, the content change trend is not obvious, and the pollution degree is more accurate.The pollution degree of Mn in the main urban area of Chongqing is small, but the degree of Mn pollution in the mining area belongs to moderate pollution to heavy pollution and there are many polluted areas (Luo et al.,2018), which shows that the pollution of Mn element is closely related to artificial mineral mining.shows that the accuracy of OK method is higher than the other three methods for As element.For Cu element, the accuracy difference of OK method, LPI method and RBF method is small; For Mn element, OK method has certain forecasting advantages.Generally speaking, LPI method and OK method show strong smoothing effect, but IDW method and LPI method can better reflect the extreme value information and local pollution situation, which reflects the necessity of using different methods when studying the spatial distribution of soil properties.

Fig. 1
Fig. 1 Distribution of main urban areas and soil sampling points in Chongqing Chongqing municipality is located in southwest China, the upper reaches of the Yangtze River, across 105°11'~110°11' E, 28°10'~32°13' N, and is located in the transition zone between the Qinghai-Tibet Plateau and the plain of the middle and lower reaches of the Yangtze River.Chongqing's terrain gradually decreases from north to south to the Yangtze River Valley, with hills and low mountains in the northwest and middle, Daba Mountain and Wuling Mountain in the southeast, with many slopes, which is called "mountain city".

Z
(xi)=y,i=0, 1, 2, …, n. xi is the interpolation node, n is the number of samples, Z(x0) is the predicted value of the first sample, Rn(x) is the interpolation remainder, and f(x) is the kernel function(Samadder et al.,2010)。 random factors.Structural factors such as parent material, soil type, climate and other soil-forming factors; Random factors include farming, management measures, planting system, pollution and other human activities.The parameters show that the spatial variation of heavy metals in Chongqing is: the Nugget effect [C0/(C+C0)] Cu > As >Mn.The Nugget value of Cu element block to Still value is 43.7%, which belongs to moderate spatial correlation, which shows that its spatial variation is the result of random factors and structural factors.Therefore, the Cu content in urban topsoil in Chongqing is affected by random factors (mainly artificial input) to a certain extent.H. Khademi(Khademi et al.,2020) once pointed out that there is a great correlation between Cu concentration and particle size, and the particle size dependence of metal concentration in street dust and the enrichment degree of Cu element are higher.The results show that the contribution rate of human pollution to Cu is high in the economically developed main city of Chongqing, but Cu in the study area is still controlled by structural factors (such as climate, parent material, soil type and other natural factors), and its original spatial pattern has not been destroyed.The Nugget effect of As and Mn is less than 25%, which indicates that the system has strong spatial correlation, and its spatial variation is less affected by random factors.The variation range of Cu element is large, which explains the controlling effect of structural factors on Cu content on the other hand.The range of As and Mn are roughly equal, and both are close to the distance between sampling points, which shows that they are greatly influenced by human factors, and the degree of influence of human factors is Mn>As.

Fig. 2
Fig. 2 Cross-validation of ordinary Kriging (OK), inverse distance weighting (IDW), local polynomial (LPI) and radial basis function (RBF) interpolation methods for As content in soil reflect the spatial distribution characteristics of interpolation errors.According to ordinary kriging (OK), inverse distance weight (IDW), local polynomial (LPI), radial basis function (RBF) interpolation principle and semi-variance function fitting parameters, this paper applies Geostatistical Analyst module in ArcMap software to carry out spatial variation interpolation, and draws the spatial distribution trend map of three heavy metals in soil in Chongqing main urban area.As, Cu and Mn elements are interpolated by ordinary Kriging interpolation (OK), inverse distance weighting (IDW), local polynomial (LPI) and radial basis function (RBF), respectively.The interpolation distribution results are shown in Figures 5, 6 Fig. 5 Spatial distribution of soil heavy metal As by different spatial interpolation methods Fig. 6 is the spatial distribution map of Cu in the study area under four interpolation methods.It can be seen from fig. 6 that the highest values predicted by OK method are distributed in the north and west of the study area, and there are large areas of high values in the north and west, and low values are mainly concentrated in the south; The prediction results of IDW method can better reflect the local information

Fig. 6
Fig. 6 Spatial distribution of heavy metal Cu in soil by different spatial interpolation methods

Fig. 7
Fig. 7 Spatial distribution of soil heavy metal Mn by different spatial interpolation methods

Table 1
Descriptive statistical analysis of heavy metal content in soil.

Table 2
Theoretical model and related parameters of heavy metal content in soil

Table 3
Cross-validation results of four interpolation methods for As, Cu and Mn elements