Analysis of the correlation and variation of cadmium in soil around enterprises in the Fujiang River Basin

To explore the correlation and spatial variation of cadmium (Cd) content in the soil around enterprises in the Fujiang River Basin, Global Moran's I and Anselin Local Moran's I were used to analyze spatial correlation, and the semivariogram was used to analyze spatial structure. The results show that the overall level of Cd in the soil around the enterprises is relatively low and does not exceed the risk screening value of the national Soil Environmental Quality Standard (GB15618-2018); only in some regions does the content exceed the risk screening value. The spatial correlation of soil Cd differs with direction and distance. High-value cluster points and outlier points are few; most points are low-value cluster points or insignificant points, and their distribution is related to the number of enterprises and the geological background. The spatial variation of Cd in soil is of moderate intensity, and the spatial difference varies with direction in the order 0° > 90° > 45° > 135°, which is consistent with the distribution of local cold and hot spots and outliers.


Introduction
Soil is one of the basic elements of the ecosystem. It not only affects the development of the national economy and the safety of national land resources and the environment, but also relates directly to the safety of agricultural products and human health (Chen et al. 2018). Cd is a heavy metal element with strong biological toxicity. Its natural form, derived from bedrock and soil parent material, does not affect organisms, whereas Cd introduced into the soil by human activity is easily absorbed by plants because of its valence state and is finally enriched at the top of the food chain, endangering human health (Vries et al. 2007; Cui 2003).
In recent years, the spatial variability of soil Cd has been studied extensively by domestic and international scholars, mainly using Moran's I, the coefficient of variation, semivariogram theory, scale variance analysis, geostatistics and other methods to analyze the spatial correlation, spatial structure, variation pattern and causes of soil heavy metals in a study area. Among these, Moran's I and the coefficient of variation can be combined with semivariogram theory. For example, Qin et al. (2018) used Moran's I and semivariogram theory to reveal the spatial correlation of soil Cd and the variation of its spatial structure, while Li et al. (2018) and Chandra et al. (2014) used the coefficient of variation and the semivariogram to explore the variation and spatial autocorrelation of heavy metals in soil. Scale variance analysis, which is based on semivariogram theory, is used to study the variation of soil heavy metals at different spatial scales. For example, to study the spatial heterogeneity and spatial structure of soil heavy metals, Liu et al. (2019) used the semivariogram and scale variance analysis to quantitatively describe the hierarchical structure and characteristic scale of soil Cd spatial variation. Geostatistics is also a powerful tool for analyzing spatial variation. For example, Wu et al. (2017), Lin et al. (2002) and Ye et al. (2019) used it to predict the spatial distribution of element content in soil and to analyze the spatial distribution characteristics and variation patterns by examining the properties of the soil itself and external factors. These studies did not consider local clustering features, and there is no report of using the Anselin local Moran's I index to analyze the spatial clustering of soil heavy metal content. Only some scholars have used this index to analyze the spatial clustering of animals, plants and other subjects (Feng et al. 2019; Sun et al. 2018).
This study takes the Fujiang River Basin as the research object. First, the Global Moran's I index and the Anselin Local Moran's I index were used to evaluate the spatial correlation and clustering characteristics of soil heavy metal content at global and local scales. Then, the semivariogram was examined to explore the spatial variation of soil heavy metals in different directions. Finally, the causes of the spatial variation were analyzed according to the distribution of different types of surrounding enterprises, so as to provide a theoretical basis for subsequent pollution control and treatment.

Overview of the study area
The Fujiang River originates from Xuebaoding, the main peak of Minshan Mountain between Songpan County (32°35′N, 103°03′E) and Pingwu County (32°24′N, 104°31′E), and mainly passes through Mianyang City, Deyang City and Suining City. It is the largest tributary of the Jialing River, itself a tributary of the Yangtze River. The basin has a subtropical humid climate, with an annual average temperature of about 15 °C and average precipitation of 1200 mm. The basin has a large population, dense towns and developed transportation, and the cities along the river have built their own light and heavy industrial systems. With social and economic development, the soil near the river basin has been polluted to some extent, especially by heavy metals, so it is necessary to study the heavy metals in the river basin.

Sample collection and determination
ArcGIS 10.5 was used to lay out the sampling points in the study area. In principle, one point was placed in every 10,000 m × 10,000 m grid cell, with the center of the cell as the soil sampling point, adjusted as appropriate according to the terrain and surrounding enterprises. A total of 137 surface soil points were laid out, as shown in Fig. 1. Using the longitude and latitude of the grid-center sampling points, each planned sampling point was located by GPS with an error of no more than 100 m. On arriving at the target point, the sampling personnel checked whether it met the representativeness requirements of soil sampling. The sampling point was selected within the allowed range with a displacement of no more than 50 m, and sampling sites were not placed near steep slopes, low-lying stagnant land, residential buildings, roads, ditches or cesspools.
The sampling area was generally 100 m × 100 m, centered on the determined point, and the quincunx mixed sampling method was adopted.
When sampling, a shovel was first used to dig a 20 cm deep pit larger than the required soil volume; a wooden shovel was then used to remove the soil that had been in contact with the shovel, and the sample was put into the sample bag. The pit was not dug obliquely, so that the sampled volume was as consistent as possible from top to bottom, and the collected soil sample was stored in a cool place. To prepare the soil samples, a proper amount of fresh soil was first spread on a clean enamel or glass plate, away from direct sunlight and at an ambient temperature of no more than 40 ℃. The soil was then air-dried naturally, stones, branches and other impurities were removed, and the soil blocks were crushed, passed through a 100-mesh nylon screen and mixed evenly for testing. Cd was determined by inductively coupled plasma mass spectrometry (ICP-MS). The precision and accuracy of the analytical method were controlled with the national soil standard GSS-15 and indoor parallel samples. The recovery was between 99% and 105%.

Research method

Spatial autocorrelation analysis
In geostatistics, a completely random sample point is meaningless, and the heavy metal content of the soil at a given point in space does not vary randomly. It is related not only to the distance and location of the sample point, but is also significantly affected by the pathway and mode of transmission (Schmal et al. 2017). When judging spatial autocorrelation, both the large-scale trend of the whole study area and the local effect should be considered. Therefore, the global Moran's I was used to evaluate spatial autocorrelation over the whole area, and the Anselin local Moran's I index was used to find hot spots, cold spots and outliers in local areas (Fu et al. 2019). In the calculation formulas of the two indices, $n$ is the total number of spatial elements, $w_{ij}$ is the spatial weight matrix, $x_i$ and $x_j$ are the observed values of the spatial variable $x$ at positions $i$ and $j$, respectively, and $\bar{x}$ is the mean of $x$. The value of $I$ lies in [-1, 1]. When $I > 0$, the variable is positively correlated in space, and the larger the value, the more obvious the spatial correlation; when $I = 0$, the variable is spatially uncorrelated; when $I < 0$, the variable is negatively correlated in space, and the smaller the value, the more obvious the spatial difference. According to $I_i$, we can distinguish high-value (HH) clusters, low-value (LL) clusters, low-value outliers (LH) surrounded by high values, and high-value outliers (HL) surrounded by low values, at a statistical significance level of P = 0.05.
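As a hedged illustration, the standard textbook forms of the two indices, written with the symbols defined above, are $I = \frac{n \sum_i \sum_j w_{ij}(x_i-\bar{x})(x_j-\bar{x})}{\left(\sum_i \sum_j w_{ij}\right)\sum_i (x_i-\bar{x})^2}$ and $I_i = \frac{x_i-\bar{x}}{m_2}\sum_j w_{ij}(x_j-\bar{x})$. The normalization $m_2 = \frac{1}{n}\sum_i (x_i-\bar{x})^2$ is an assumption here, since software packages differ on this term. The Python sketch below implements these forms; the function names and the toy data are hypothetical and it is not the ArcGIS or GS+ implementation used in this study.

```python
import numpy as np

def global_morans_i(x, w):
    """Global Moran's I for values x and a spatial weight matrix w (zero diagonal)."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    n = x.size
    z = x - x.mean()                        # deviations from the mean
    num = n * np.sum(w * np.outer(z, z))    # n * sum_ij w_ij * z_i * z_j
    den = w.sum() * np.sum(z ** 2)          # (sum_ij w_ij) * sum_i z_i^2
    return num / den

def local_morans_i(x, w):
    """Local Moran's I_i, normalized by m2 = sum(z^2) / n (an assumed convention)."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    z = x - x.mean()
    m2 = np.sum(z ** 2) / x.size
    return z / m2 * (w @ z)                 # z_i / m2 * sum_j w_ij * z_j

# Hypothetical example: four points on a line, each linked to its direct neighbors.
x_demo = [1.0, 2.0, 3.0, 4.0]
w_demo = np.array([[0, 1, 0, 0],
                   [1, 0, 1, 0],
                   [0, 1, 0, 1],
                   [0, 0, 1, 0]], dtype=float)
I_global = global_morans_i(x_demo, w_demo)
I_local = local_morans_i(x_demo, w_demo)
```

With this normalization the local values sum to the global index multiplied by the total weight, which gives a quick consistency check on any implementation.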

Semivariogram analysis
The semivariogram, also known as the semivariance function, is a function specific to geostatistical analysis. It describes the spatial variation of regionalized variables accurately and shows quantitatively how spatial properties change with distance (Styrishave et al. 2012; Rajendra et al. 2019). In its calculation formula, $N(h)$ is the number of point pairs separated by the lag vector $h$, and $Z(x_i)$ and $Z(x_i+h)$ are the values of the regionalized variable at $x_i$ and $x_i+h$, respectively. The semivariogram value increases with distance. Its behavior at a lag of 0 and at the lag where it stabilizes yields four parameters: the nugget, the range, the partial sill and the sill. When the lag $h$ is 0, the semivariogram should also be 0, but because of random factors $\gamma(h)$ is not 0; this value of $\gamma(h)$ is the nugget. As the lag $h$ increases, the correlation between point pairs becomes weaker and $\gamma(h)$ tends to stabilize. The value of $\gamma(h)$ at that point is the sill, the corresponding $h$ is the range, and the partial sill is the difference between the sill and the nugget.
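As a hedged illustration, the standard empirical semivariogram consistent with the symbols above is $\gamma(h) = \frac{1}{2N(h)}\sum_{i=1}^{N(h)}\left[Z(x_i) - Z(x_i+h)\right]^2$. The Python sketch below estimates it by grouping point pairs into distance bins; the function name, the binning scheme and the toy data are assumptions for illustration, not the GS+ procedure used in this study.

```python
import numpy as np

def empirical_semivariogram(coords, values, bin_edges):
    """Omnidirectional empirical semivariogram.

    Returns gamma(h) and the pair count N(h) for each distance bin
    (bin_edges[k], bin_edges[k + 1]].
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)           # every pair counted once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sqdiff = (values[i] - values[j]) ** 2
    n_bins = len(bin_edges) - 1
    gamma = np.full(n_bins, np.nan)                    # NaN where a bin is empty
    counts = np.zeros(n_bins, dtype=int)
    for k in range(n_bins):
        mask = (dist > bin_edges[k]) & (dist <= bin_edges[k + 1])
        counts[k] = mask.sum()
        if counts[k] > 0:
            gamma[k] = sqdiff[mask].sum() / (2 * counts[k])
    return gamma, counts

# Hypothetical example: four equally spaced points along a transect.
coords_demo = [(0, 0), (1, 0), (2, 0), (3, 0)]
values_demo = [1.0, 2.0, 4.0, 7.0]
gamma_demo, counts_demo = empirical_semivariogram(
    coords_demo, values_demo, bin_edges=[0.0, 1.5, 2.5, 3.5])
```

The nugget, sill and range are then obtained by fitting a theoretical model to the binned values, a step that is not shown here.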

Statistics of Cd in soil
The statistics of Cd in the study area are shown in Table 1. The analysis of 125 soil samples showed that the soil in the study area was weakly acidic overall, with mean and median soil pH values of 6.66 and 6.45, respectively. The Cd content of the soil was at a low level and did not exceed the risk screening value of the national Soil Environmental Quality Standard (GB15618-2018), while the maximum value of Cd exceeded the risk screening value, indicating that Cd contaminated the soil in some areas. In terms of kurtosis, the distribution of the data is higher and narrower than the normal distribution, indicating that the data are more concentrated near the mean than in a normal distribution; in terms of skewness, the distribution is positively skewed (right-skewed), but after log transformation it is close to a normal distribution.
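The shape statistics mentioned above can be sketched as follows. This is a minimal illustration using moment-based skewness and excess kurtosis; the function names and the sample values are hypothetical and are not the measured Cd data.

```python
import numpy as np

def skewness(x):
    """Moment-based skewness: m3 / m2^1.5 (positive means right-skewed)."""
    x = np.asarray(x, dtype=float)
    z = x - x.mean()
    return np.mean(z ** 3) / np.mean(z ** 2) ** 1.5

def excess_kurtosis(x):
    """Excess kurtosis: m4 / m2^2 - 3 (positive means more peaked than normal)."""
    x = np.asarray(x, dtype=float)
    z = x - x.mean()
    return np.mean(z ** 4) / np.mean(z ** 2) ** 2 - 3.0

# Hypothetical right-skewed sample whose logarithms are symmetric.
sample = np.exp(np.array([-2.0, -1.0, 0.0, 1.0, 2.0]))
skew_raw = skewness(sample)           # positive: right-skewed
skew_log = skewness(np.log(sample))   # close to 0 after log transformation
kurt_log = excess_kurtosis(np.log(sample))
```

A skewness near 0 after the log transformation is one simple indication that the transformed data are closer to a normal distribution, which is what geostatistical fitting generally assumes.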

Global spatial autocorrelation
To explore the spatial autocorrelation of soil Cd in the Fujiang River Basin in different directions, the global Moran's I in four directions (0°, 45°, 90° and 135°, rotating clockwise) was calculated with the Moran's I module of GS+ 9.0, as shown in Fig. 2. In the 0° direction, the global Moran's I is positive or 0 when the distance between sampling points is 0-50 km, indicating positive spatial correlation or no correlation, and is mostly negative when the distance exceeds 50 km. In the 45° direction, the global Moran's I is generally greater than 0, indicating positive spatial correlation. In the 90° direction, the global Moran's I oscillates like a trigonometric function: it is greater than 0 (positive correlation) at distances of 0-50 km and 100-150 km, and negative at 50-100 km and 150-250 km. In the 135° direction, the global Moran's I follows a "W"-shaped curve: it is positive at distances of 0-10 km and 100-125 km, and negative at 10-100 km and 125-250 km.

Local cold and hot spots and outliers
Compared with the global Moran's I, the local Moran's I focuses more on the clustering characteristics of the observed values at the local scale and on inferring the location of outliers. The two indices complement and illustrate each other, sharing some similarities while each having its own particularities. Bivand et al. (2018) compared the global and local correlation methods for measuring spatial autocorrelation and pointed out that the global method considers the whole data set, whereas the local method applies to each individual unit and allows statistical inference for it. Boots et al. (2000) studied the characteristics of the global and local Moran's I indices in different spatial models and showed that the global Moran's I index is suitable for the overall measurement of a study area, while the local Moran's I index is suitable for determining the local observational characteristics of specific and adjacent areas. In this paper, after analyzing the spatial correlation of soil Cd over the whole study area, the Anselin local Moran's I analysis module of ArcGIS 10.5 was run to further explore its local spatial distribution, giving Fig. 3. Fig. 4 was obtained by collecting relevant enterprise information and carrying out field investigation in the study area. Overall, the Cd content at most sites in the study area is not significantly high or low, but some areas show clustering and outliers. The high-value cluster areas (hot spots) are mainly distributed in Deyang City, where the main industries are metal mining and processing and the manufacture of chemical raw materials and chemical products. The low-value cluster areas (cold spots) are mainly concentrated near the middle reaches of the Fujiang River Basin, with a few in the upper and lower reaches. Although there are more enterprises in the middle reaches, soil Cd pollution there is relatively light, which indicates that the geological background value of soil Cd in this area is low. There is only one high-value outlier, located in the middle reaches of the Fujiang River Basin; it may be caused by the geological background value and the inadequate management of some enterprises. The low-value outliers are mainly concentrated in Mianyang City, with a few in Deyang City. Because the number of enterprises in this area is small, these outliers may be caused by the uneven distribution of geological background values.

Spatial structure analysis
Most geographical phenomena are spatially correlated: the closer things are, the more similar their characteristics, and the semivariogram is the quantitative expression of this similarity. The variogram module of GS+ 9.0 was run to fit the semivariogram of soil Cd around enterprises in the Fujiang River Basin, giving Table 2. The nugget value of the basin is very small, so the spatial variation caused by random factors can be ignored. The nugget-to-sill ratio (base effect) is between 0.25 and 0.75, so the spatial variation of Cd in soil is moderate. The range is more than 29 km, that is, soil Cd is spatially autocorrelated within this distance. However, the measured maximum distance between soil sampling points in the study area was 252 km, far greater than this range, which shows that the distribution of Cd in soil is greatly affected by external factors. Because the coefficient of determination of the fitted semivariogram is higher than 0.8, the spatial difference was further explored in different directions. GS+ 9.0 was used to calculate the semi-variance (angle tolerance ±22.5°) in the four directions of 0°, 45°, 90° and 135° (north is 0°, with the angle rotated clockwise), as shown in Fig. 5. There is a significant difference in the semi-variance of Cd in different directions in the soil around the enterprises in the Fujiang River Basin, which indicates that the spatial anisotropy is significant. In the 0° direction, the spatial difference of Cd in soil is the largest; the semi-variance shows a roughly linear relationship with the distance between sampling points and increases with it. In the 45° direction, the spatial difference is relatively small, but the semi-variance tends to decrease as the distance between sampling points increases. Since the semi-variance should generally increase gradually with distance, this indicates that there are local outliers in this direction, which is consistent with the distribution of low and medium outliers in Fig. 3. In the 90° direction, the spatial difference is relatively low, but the semi-variance first decreases and then increases as the distance between sampling points increases. This indicates that there are also individual outliers in this direction, which is consistent with the distribution of high outliers in Fig. 3. In the 135° direction, the spatial difference is the smallest, but the semi-variance follows an "m"-shaped curve as the distance between sampling points increases, indicating that there are also a few outliers in this direction. Fig. 3 shows that this is related to the level of outliers in the study area.
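The directional calculation described above can be sketched by keeping only the point pairs whose bearing lies within the angle tolerance of a chosen direction. In the Python sketch below, bearings are measured clockwise from north and treated as axial (0° and 180° are the same direction). The function name and the toy data are hypothetical, and the semi-variance is pooled over all distances for brevity rather than binned by lag as in GS+.

```python
import numpy as np

def directional_semivariance(coords, values, direction_deg, tol_deg=22.5):
    """Semi-variance of pairs whose bearing is within tol_deg of direction_deg.

    coords are (east, north) positions; returns (gamma, number of pairs).
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)
    dx = coords[j, 0] - coords[i, 0]                    # eastward offset
    dy = coords[j, 1] - coords[i, 1]                    # northward offset
    azimuth = np.degrees(np.arctan2(dx, dy)) % 180.0    # clockwise from north
    diff = np.abs(azimuth - (direction_deg % 180.0))
    diff = np.minimum(diff, 180.0 - diff)               # axial angular distance
    mask = diff <= tol_deg
    n_pairs = int(mask.sum())
    if n_pairs == 0:
        return float("nan"), 0
    sq = (values[i][mask] - values[j][mask]) ** 2
    return float(sq.sum() / (2 * n_pairs)), n_pairs

# Hypothetical example: four sites on the corners of a unit square.
coords_sq = [(0, 0), (0, 1), (1, 0), (1, 1)]
values_sq = [1.0, 3.0, 2.0, 6.0]
gamma_by_direction = {d: directional_semivariance(coords_sq, values_sq, d)
                      for d in (0, 45, 90, 135)}
```

Comparing the values returned for the four directions is a simple way to see anisotropy: in the toy data the semi-variance differs in every direction.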

Conclusion
The overall level of Cd in the soil around enterprises in the Fujiang River Basin is relatively low and does not exceed the risk screening value of the national Soil Environmental Quality Standard (GB15618-2018); only a few regions have relatively high Cd. The global spatial autocorrelation analysis shows that the spatial autocorrelation of soil Cd differs with direction and distance. The analysis of local hot and cold spots and outliers shows that high-value cluster points and outliers of soil Cd are few; most points are low-value cluster points or insignificant points, and their distribution is related to the number of enterprises and the geological background. In terms of spatial structure, the spatial variation of Cd in soil is of moderate intensity, and the spatial difference varies with direction in the order 0° > 90° > 45° > 135°. The spatial differences are consistent with the distribution of local hot and cold spots and outliers.

Declarations
This work was funded by the Shandong Provincial Natural Science Foundation (ZR2017MD009) and the National Natural Science Foundation of China (No. 41202165, No. 41102149).

References
Figures
Distribution of enterprises around the study area. Note: This map has been provided by the authors.

Note: low-value outliers (LH) are low values surrounded by high values, and high-value outliers (HL) are high values surrounded by low values, with statistical significance (P = 0.05).

Figure 2 Global

Table 1
Statistical analysis of heavy metal content in soil

Table 2
Theoretical models and parameters of the semivariogram