Effect of multiple spatial scale characterization of land use on water quality

Land use in uplands is an important factor affecting water quality in the respective catchment, and its influences at different spatial scales and configurations warrant further investigation. Here, we selected 26 catchments in the upper Han River (China) and sampled the surface water at the outlet of each catchment in four seasons during 2019. Multivariate statistics were used to identify the relationships between land use characteristics in uplands and water quality in the river system. The results indicated that chemical oxygen demand (CODMn), pH, dissolved oxygen, electrical conductivity, nutrients (i.e., NH4+–N and NO3−–N), and dissolved phosphorus (DP) in rivers displayed significant seasonal variations. Stepwise regression revealed that landscape metrics such as patch density, landscape shape index, and splitting index were important factors influencing water quality in rivers regardless of spatial scale and season. Urban was the most frequently chosen land-use type in the best prediction models, and forest area showed a negative correlation with water quality parameters, for example DP, in most cases. Overall, the influence of land use on river water quality was slightly stronger at reach scale than at catchment and riparian scales. Also, nutrients (i.e., NH4+–N, NO3−–N, and DP) in rivers were primarily impacted by the land use characteristics at catchment and riparian scales. Our results suggested that multi-scale explorations would help to achieve a full understanding of the impacts of land use on river water quality.


Introduction
The water quality of rivers profoundly influences public health, agricultural and industrial development, and social sustainability (Perez-Gutierrez et al. 2017;Shi et al. 2017). However, due to the increasing intensity of human activities, land originally covered by natural vegetation has been converted into cropland or cities (Chen et al. 2019;Gibbs et al. 2010;Lambin and Meyfroidt 2011). Such land conversion usually leads to increases in the concentration of nitrogen, phosphorus, and other elements in water bodies due to the excessive use of fertilizer (Perez-Gutierrez et al. 2017), which results in water pollution and eutrophication (Namugize et al. 2018;Zhang et al. 2019). In addition, the increase in impervious surfaces generates more surface runoff and transports pollutants into the river (Wilson and Weng 2010). Conversely, increased forest area can to some extent reduce the amount of nutrients and solids that enter water bodies and mitigate pollution caused by agricultural activities (de Mello et al. 2018;Sliva and Williams 2001;Turunen et al. 2019).
The effects of land use on water quality also depend upon spatial scale (Ding et al. 2016;Shi et al. 2017;Sliva and Williams 2001). Some studies have found that catchment scale land use explains water quality better than land use in the riparian buffer zones (de Mello et al. 2018;Pratt and Chang 2012;Zhang et al. 2019). De Mello et al. (2018) reported that croplands and urban areas could explain more variability in total phosphorus and total suspended solids at the catchment scale than at the riparian scale. However, others have found that riparian land use better accounted for water quality variability than catchment or reach scale land use (Shi et al. 2017;Tran et al. 2010). For example, Shi et al. (2017) found that forest and grassland can explain changes in water quality better at the riparian scale than at other scales, because the riparian buffer could intercept the sediment in the surface runoff (Santos et al. 2015).
Responsible Editor: Xianliang Yi
Linkages between landscape metrics and water quality are also significant (Bu et al. 2014;Uuemaa et al. 2007). Previous studies mostly correlated water quality to land use composition but neglected the effects of land use configuration (Bolstad and Swank 1997;Li et al. 2008). For example, Bu et al. (2014) found that landscape metrics, such as aggregation and diversity, were closely related to stream water quality. Xu et al. (2019) also reported that Shannon's diversity index and patch cohesion index were strongly linked with riverine water quality. Moreover, landscape metrics are also easily affected by changes in spatial scale (Zhang et al. 2018).
Land use can not only directly influence water quality but also has scale effects that could impact water quality (Ding et al. 2016;Shi et al. 2017;Sliva and Williams 2001). Thus, understanding the relationships between land use, landscape metrics, and water quality, and understanding the scale effects of these parameters, is important for preventing water contamination, because the composition and configuration of land use at different scales could reflect the variation of human activities and thus have an essential influence on the types and degree of contamination. Besides, measuring the proportions and spatial arrangement of land use at different scales could enable us to predict water quality conveniently.
We hypothesize that the urban and cropland land use at reach scale could influence the water quality more because of their higher proportion at reach scale and the close proximity to streams, which could transfer more nutrients directly into the stream. Here, we explored the relationships between land use composition and landscape configurations and water quality at different spatial scales in the upper Han River basin. Our objectives were to identify (1) correlations between land use, landscape metrics, and water quality; (2) the variation of those above-mentioned correlations when scale changed; and (3) on which spatial scale the land use composition and configuration could explain water quality variation better. Our findings provide important information for decision-making and the management of land use for water conservation.

Study area
The Han River originates from Ningqiang County in Shaanxi Province and flows into the Yangtze River at Wuhan in Hubei Province. It is the largest tributary of the Yangtze River with a total length of 1577 km (Yang et al. 1997;Shen and Liu 1998). The Han River covers about 1.59 × 10⁵ km², and the entire basin is in the subtropical monsoon region (Fig. 1). The mean annual precipitation is approximately 873 mm, and about 80% of the precipitation falls in May to August (Yang et al. 1997). The Han River is geographically separated into upper and lower reaches by the Danjiangkou Reservoir, with drainage areas of 95,200 km² and 63,800 km², respectively. Most of the upper Han River basin is mountainous, with an elevation of between 201 and 3500 m, and the dominant vegetation is coniferous forest, deciduous broad-leaved forest, mixed coniferous forest, and broad-leaved forest (Shen et al. 2006). Major crops in this area include maize (Zea mays Linn), wheat (Triticum aestivum Linn), rice (Oryza sativa Linn), cassava (Manihot esculenta), vegetables including radish (Raphanus sativus L. var. radiculus Pers.), cucumber (Cucumis sativus L.), tabasco (Capsicum L.), and citrus (Citrus) (Shen et al. 2006).

Water samples and water quality analysis
We delineated the Han River Basin into hundreds of small catchments using the hydrological analysis function in ArcMap 10.3 by processing a 30-m resolution digital elevation model. We then selected 26 tributaries of the upper Han River and placed a sampling site at the outlet of each tributary. Subsequently, we aggregated the small catchments belonging to the same tributary into an integrated catchment for that tributary, and each sampling site was also considered the outlet point of its catchment. Therefore, the effect of land use composition and configuration on water quality could be analyzed at the catchment level.
We collected water samples at the 26 selected sites in spring (April), summer (July), autumn (October), and winter (January) in 2019 (Fig. 1). Samples were collected at the edge of the streams under base flow conditions, when the weather was clear and the river flow was relatively stable. Base flow conditions were assumed if there had been at least 5 days of no significant rain (< 10 mm over 48 h), as recorded at nearby rainfall stations (Buck et al. 2004). Acid-washed 500-ml high-density polyethylene bottles were used to collect water samples at a depth of approximately 10 cm. Before sampling, the bottles were rinsed three times with sample water on site, and three replicates were taken at each sampling site. The unfiltered samples were used for analyzing CODMn (chemical oxygen demand). In addition, a 100-ml subsample was taken from each replicate and filtered through a 0.45-μm pore size filter membrane, and the filtered water was used for testing NO3−–N (nitrate-nitrogen), NH4+–N (ammonium-nitrogen), and DP (dissolved phosphorus). All samples were stored in a refrigerator at 4 °C before laboratory analysis. DO (dissolved oxygen), pH, EC (electrical conductivity), and turbidity were measured in situ using a YSI 6600 V2 sonde (YSI Inc., Yellow Springs, USA). The concentrations of NO3−–N and NH4+–N were analyzed using an automatic discrete analyzer (AMS westco, Smartchem200, Italy). DP was determined by the inductively coupled plasma atomic emission spectrometer (ICP-AES) method (Metrohm, 940 Professional, Switzerland). CODMn was analyzed using the potassium permanganate index method.
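The base flow criterion above can be sketched as a simple check on daily rainfall records. This is only an illustration under one assumed reading of the rule: within the 5 days before sampling, no 48-h window (two consecutive daily totals) reaches 10 mm. The function name, its parameters, and the use of daily totals are assumptions and are not taken from the study.

```python
from typing import Sequence


def is_base_flow(daily_rain_mm: Sequence[float],
                 dry_days: int = 5,
                 window_days: int = 2,
                 threshold_mm: float = 10.0) -> bool:
    """Return True if the days before sampling qualify as base flow.

    daily_rain_mm holds daily rainfall totals (mm), oldest first, ending
    on the day before sampling. Assumed rule (hypothetical reading of the
    text): within the last `dry_days` days, every run of `window_days`
    consecutive days sums to less than `threshold_mm`.
    """
    if len(daily_rain_mm) < dry_days:
        raise ValueError("need at least `dry_days` daily records")
    recent = list(daily_rain_mm[-dry_days:])
    # Slide a 48-h (two-day) window over the recent record.
    for start in range(len(recent) - window_days + 1):
        if sum(recent[start:start + window_days]) >= threshold_mm:
            return False
    return True


# Hypothetical rainfall records, not data from the study.
print(is_base_flow([0.0, 2.0, 3.0, 0.0, 1.0]))  # light rain only
print(is_base_flow([0.0, 0.0, 6.0, 5.0, 0.0]))  # 11 mm within 48 h
```

A different reading of the criterion (for example, a rolling hourly window) would need finer-resolution rainfall data than the daily totals assumed here.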

Land-use and landscape metrics
The land-use map for 2019 was obtained by interpreting Landsat 8 (https://earthexplorer.usgs.gov/) multispectral imagery in ENVI 5.2 software (Exelis Visual Information Solutions, Colorado, USA). The classification accuracy of the land-use data was determined using the method of Rwanga and Ndambuki (2017), which gave an overall accuracy of 95.45%. The categories of land use were forest, cropland, waterbody, urban, bare land, and grassland. The land-use map was processed in ArcMap 10.3 to calculate the area of each land-use type for the entirety of each catchment (the entire upstream catchment of the sampling sites), the riparian zones (500 m on both sides of the stream, extending the entire length upstream of the sampling sites), and the reach (500 m on both sides of the stream, extending 2000 m upstream of the sampling sites) (Fig. 1). Eight landscape metrics, namely patch density (PD), largest patch index (LPI), edge density (ED), landscape shape index (LSI), contagion index (CON), patch cohesion index (COH), splitting index (SPLIT), and Shannon's diversity index (SHDI) (Table 1), were calculated using the FRAGSTATS 4.2 software.
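As an illustration of how a composition-based metric follows from the land-use areas, the sketch below computes Shannon's diversity index using its standard definition, SHDI = −Σ p_i ln p_i, where p_i is the proportion of the landscape in class i. The study computed its metrics in FRAGSTATS; the function name and the area values here are hypothetical and are not taken from the study.

```python
import math
from typing import Mapping


def shannon_diversity(area_by_class: Mapping[str, float]) -> float:
    """Shannon's diversity index from per-class areas (any consistent unit)."""
    total = sum(area_by_class.values())
    if total <= 0:
        raise ValueError("total area must be positive")
    shdi = 0.0
    for area in area_by_class.values():
        if area > 0:  # classes with zero area contribute nothing
            p = area / total
            shdi -= p * math.log(p)
    return shdi


# Hypothetical areas (km^2) for one site at two scales, for illustration only.
catchment = {"forest": 90.0, "cropland": 8.0, "urban": 1.0, "water": 1.0}
reach = {"forest": 40.0, "cropland": 40.0, "urban": 12.0, "water": 8.0}
print(round(shannon_diversity(catchment), 3))
print(round(shannon_diversity(reach), 3))
```

With these made-up numbers the more evenly mixed reach has the higher index, which is the behavior expected of SHDI when no single land-use type dominates.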

Statistical analysis
The Shapiro-Wilk test and the Bartlett test were used to test the normality and homogeneity of variance of the data, respectively, before conducting variance analysis (Jackson 1993;Shapiro and Wilk 1965). These tests showed that the data were not normally distributed and had heterogeneous variances, so we used non-parametric tests in the statistical analysis. The Kruskal-Wallis test was conducted to test the differences in the water quality parameters among the seasons. It can be used as a substitute for ANOVA when the conditions of ANOVA are not met. A statistically significant Kruskal-Wallis result implies that at least one of the compared groups is different from the others. Therefore, the source of significance among groups should be located by further multiple comparisons. The Nemenyi test is similar to the Tukey test for ANOVA and is used when all groups are compared to each other (Demsar 2006;Liu and Chen 2012;Nemenyi 1962).
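To make the seasonal comparison concrete, the following minimal sketch computes the Kruskal-Wallis H statistic with average ranks and the usual tie correction. It is a plain-Python illustration, not the R routine used in the study, and the function name and sample values are hypothetical.

```python
from typing import Sequence


def kruskal_wallis_h(groups: Sequence[Sequence[float]]) -> float:
    """Kruskal-Wallis H statistic with average ranks and tie correction."""
    if len(groups) < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    # Pool all observations, remembering which group each came from.
    pooled = sorted((value, gi) for gi, g in enumerate(groups) for value in g)
    n_total = len(pooled)
    rank_sums = [0.0] * len(groups)
    tie_term = 0.0
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values occupy ranks i+1 .. j and share their average rank.
        avg_rank = (i + 1 + j) / 2.0
        for _, gi in pooled[i:j]:
            rank_sums[gi] += avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    h = (12.0 / (n_total * (n_total + 1))
         * sum(r * r / len(g) for r, g in zip(rank_sums, groups))
         - 3.0 * (n_total + 1))
    correction = 1.0 - tie_term / (n_total ** 3 - n_total)
    if correction == 0.0:
        raise ValueError("all observations are identical")
    return h / correction


# Hypothetical values for three groups, for illustration only.
print(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```

For four seasons, H is compared against a chi-square distribution with 3 degrees of freedom, whose 0.05 critical value is about 7.815; this is a large-sample approximation and may be rough for small groups.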
Table 1 Landscape metrics (partial)
LSI: describes the complexity of landscape patch shapes
CON: the degree of agglomeration or trend of different patch types in the landscape; P_i = proportion of the landscape occupied by patch type (class) i; g_ik = number of adjacencies (joins) between pixels of patch types (classes) i and k based on the double-count method
COH: Z = total number of cells in the landscape
SPLIT: the degree of separation of different patches in a landscape type
SHDI: reflects the heterogeneity of the landscape

Land-use and landscape metrics were screened using Pearson correlation analysis, and variance inflation factors were tested to eliminate multicollinearity. Multiple linear regressions were used to identify the significant variables influencing water quality among the remaining parameters. To eliminate insignificant variables, we applied stepwise regression, and the model with the lowest Akaike information criterion and a p-value < 0.05 was chosen as the best model (Kuk and Varadhan 2013). Finally, among the three regression models of the same water quality parameter at different scales, the model with the highest adjusted coefficient of determination (adjusted R²) was considered the best model. R² is defined as the proportion of variation in the outcome variable explained by the model, and is a popular measure of the strength of association between the outcome and the predictors in linear regression. Its disadvantage is that when the number of predictors is not small compared to the number of observations, R² can substantially overestimate the strength of association (Liao and McGee 2003). The adjusted R² corrects for this overestimation by taking the number of predictors in the model into account and is generally considered superior (Neter et al. 1990). Thus, the response of water quality parameters to land use distribution and configuration was considered stronger at the scale with the highest adjusted R². The variables were standardized before multiple linear regression. All statistical analysis was performed in R 3.6.1 (R Core Team 2019).
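The quantities used for model screening and selection can be written out in a few lines. The sketch below uses the standard textbook definitions: the variance inflation factor VIF_j = 1 / (1 − R_j²), where R_j² comes from regressing predictor j on the other predictors, and adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1) for n observations and p predictors. The function names and all numeric values are hypothetical and are not results from the study.

```python
from typing import Mapping


def variance_inflation_factor(r2_j: float) -> float:
    """VIF for one predictor, given R^2 of that predictor on the others."""
    if not 0.0 <= r2_j < 1.0:
        raise ValueError("r2_j must be in [0, 1)")
    return 1.0 / (1.0 - r2_j)


def adjusted_r2(r2: float, n_obs: int, n_predictors: int) -> float:
    """Adjusted R^2 for a linear model with an intercept."""
    dof = n_obs - n_predictors - 1
    if dof <= 0:
        raise ValueError("too many predictors for the number of observations")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / dof


def best_scale(adj_r2_by_scale: Mapping[str, float]) -> str:
    """Pick the spatial scale whose model has the highest adjusted R^2."""
    return max(adj_r2_by_scale, key=lambda scale: adj_r2_by_scale[scale])


# Hypothetical numbers: 26 sites, 3 retained predictors, made-up R^2 values.
print(round(adjusted_r2(0.50, n_obs=26, n_predictors=3), 4))
print(best_scale({"catchment": 0.31, "riparian": 0.28, "reach": 0.42}))
```

With only 26 sites, each added predictor lowers the adjusted R² noticeably unless it explains real variation, which is why it is a safer basis than raw R² for comparing models across scales.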

Land-use and landscape metrics
Generally, forests made up the largest proportion of land use at all three scales, but forest area decreased as the scales became smaller (Table 2). The average percentage that forests occupied decreased from 88.62% at catchment scale to 69.37% and 41.82% at riparian and reach scale, respectively. Moreover, the maximum and minimum forest proportions dropped from 97.34% and 69.41% to 78.85% and 1.36%, respectively, as the scale declined from catchment to reach. The proportion of cropland increased when the scale changed from catchment to reach (Fig. 2). Contrary to the proportion of forests, the average proportion of cropland in the 26 catchments increased from 9.71% at catchment scale to 23.11% and 37.39% at the riparian and reach scale, respectively. Water and urban area increased similarly, from on average 0.16% and 1.17% at catchment scale to 7.10% and 13.02% at reach scale, respectively. Neither bare land nor grassland averaged more than 1% of the total area at any of the three scales. As for spatial differences, catchments in the Hanzhong Plain and Dan River Plain had a higher proportion of cropland than other catchments. The sum of cropland and urban percentage at riparian scale accounted for nearly 50% in these plains. Besides, most proportions of cropland at reach scale in these regions exceeded 50%, while the reaches in other zones were still dominated by forest coverage (Fig. 2). The landscape metrics also varied among the three scales (Fig. 3).

Spatial and temporal differences of water quality parameters
Many water quality parameters, such as CODMn, NH4+–N, and NO3−–N, had obvious spatial variability (Fig. 4). Their higher values were in regions with a higher proportion of croplands, such as in the Dan River catchment, and for CODMn in the Ankang Plain. EC was also higher in the Dan River catchment than in the other catchments. Moreover, the Hanzhong Plain and Ankang Plain had high turbidity. Water quality parameters also varied seasonally (Fig. 5). CODMn and pH peaked in summer and were significantly higher than those in the other three seasons. Significant differences in DO were also observed among different seasons. The highest NO3−–N was observed in summer, while the highest NH4+–N was observed in winter. Significant differences in NH4+–N were only observed in winter and spring, while NO3−–N in summer was significantly different from that in other seasons. DP was significantly different in summer, autumn, spring, and winter, and EC was only significantly different between autumn and winter.

Relationships between land use, landscape metrics, and water quality
Water quality parameters could be slightly better explained by land-use characteristics at the reach scale than by those at the catchment and riparian scales across the four seasons (Table 4). In winter, the best models were mostly at the catchment scale (4/7), and in summer half of the models with the highest R² were at the catchment scale, while the best models in the other seasons were mostly at the reach scale (4/6 and 4/7 in spring and autumn, respectively).
Among the land-use types, the proportion of urban area was positively correlated with CODMn, EC, and DP, but negatively correlated with pH. Cropland and bare land were also positively correlated with the nutrient parameters, especially DP, NO3−–N, and NH4+–N, and grassland was positively correlated with DP, NH4+–N, and EC. The Pearson correlation coefficient indicated that forests were negatively related with cropland and urban area at all three scales (Table S1, Table S2, Table S3); thus, forests had effects on water quality opposite to those of croplands and urban areas.
Furthermore, the strength of the effect of land use on water quality depended on scale. EC was more strongly influenced by parameters such as urban area and grassland at the reach scale, while DP was best modeled at the catchment scale and was mostly affected by cropland and urban area. The best models for NO3−–N were all observed at the riparian scale and were affected by SPLIT, which represented the degree of human activity. NH4+–N was best modeled at the catchment scale in spring and summer and at the reach scale in autumn and winter. The best model of turbidity was only significant at the reach scale in summer. Moreover, the effects of urban area and grassland were mostly best captured at the reach scale, while cropland and bare land had a larger influence at the catchment and riparian scales.
Landscape metrics, such as SPLIT and PD, were positively related with the nutrient parameters in most cases. In addition, LSI correlated negatively with NH4+–N at the catchment scale in summer. Other landscape metrics, such as SHDI, were positively related to cropland and urban area at all three scales, and ED was also positively related with PD at all scales. LPI was significantly negatively correlated with SPLIT, and SPLIT and PD showed their influence mostly at the catchment and riparian scales.

Seasonal variation of water quality parameters
Water quality parameters were variable among the four seasons in the upper Han River. The opposing seasonal dynamic patterns of CODMn and DO could be attributed to intensive rainfall, which washed organic matter into rivers (Aschermann et al. 2016). As water temperature increases, the solubility of oxygen in the water gradually decreases, which causes a reduction in DO concentration (Kannel et al. 2008). Also, the seasonal differences in pH may be attributed to biogeochemical processes in rivers, because as the water temperature increased, denitrification would also increase and cause an increase in pH (Zilberbrand et al. 2001). In the aerobic environment in winter, increased nitrification increases the acidity of the water, so pH decreases (Stumm and Morgan 1996).
For nutrient parameters, the significant seasonal differences in NO3−–N might be due to the use of nitrogenous fertilizers during crop planting and growth in spring and summer and the increased runoff of these fertilizers during and after intense precipitation events. Moreover, NH4+–N and DP had significantly lower values in spring and higher values during the rainy season (i.e., summer and autumn). A likely explanation is that, with the significant precipitation during the rainy season, substantial nutrients and contaminants accumulated in croplands and urban areas could be scoured into streams, leading to the deterioration of river water quality (Li et al. 2009;Tran et al. 2010). Therefore, it is better to develop containment measures to reduce non-point source pollution and to analyze the relationship between land use and water quality during the rainy season. However, in highly urbanized catchments, which are dominated by point source pollution, the concentration of nutrients could be more elevated in the dry season because of the reduced dilution effect (Liu et al. 2017). Thus, keeping high hydrologic connectivity levels is necessary to prevent water from being contaminated.

Impacts of land-use compositions on water quality at different scales
In general, we found that the water quality parameters, especially CODMn, NH4+–N, NO3−–N, and DP, had noticeable spatial differences (Fig. 4). Most of them were higher in the Hanzhong Plain, Ankang Plain, and Dan River Plain, which were the regions with higher coverage of cropland and urban area. This indicated a positive correlation between urban and cropland area and these parameters (Table 4), and that forest area positively impacted river water quality. Thus, forests may prevent or offset the deterioration of water quality by cropland and urban area. Urban areas and croplands usually produce more nitrogen, phosphorus, and sulfur pollutants than other land-use types (Shi et al. 2017).
Table 4 Stepwise regression models for water quality parameters. The model listed is the one with the highest adjusted R² among scales. The most influential parameters and their coefficients in the best models are shown in bold. Only significant coefficients are listed. CRO cropland, WAT water, URB urban, BAR bareland, GRA grassland, PD patch density, LSI landscape shape index, SPLIT splitting index
In urban areas, nutrients and organic matter accumulate on impervious surfaces and then run off into streams after rainfall. Also, the discharge of wastewater can deteriorate water quality (Ding et al. 2016;Xu et al. 2013). In cropland areas, pesticides and chemical fertilizers can be overused in the fields in an effort to improve production. Hence, nutrients could enter the surface water by surface runoff and lead to the deterioration of water quality. Conversely, forests are believed to mitigate nutrient pollution (Li et al. 2009;Xu et al. 2019;Ye et al. 2014), and in our study, forest area had a negative relationship with NO3−–N, which suggested that forest area could act as a filter and decrease the sediment and pollutants in the surface runoff (Li et al. 2008;Ding et al. 2013). Alternatively, there may simply be little nutrient pollution in the forest that could affect the stream water quality. Unlike previous studies that found that grassland areas could mitigate pollution of surface water (Shi et al. 2017), we found a positive relationship between grassland and NH4+–N, NO3−–N, and DP. This relationship may be caused by the mismanagement of animal waste and free livestock grazing in the grassland areas, which induced soil erosion (Ou et al. 2016).
Land use at different scales also has various patterns that could influence water quality (Ding et al. 2016;Sliva and Williams 2001). We found that the reach scale could generally better explain the water quality parameters than the catchment and riparian scales. The reach scale data could be more representative and yield a reliable correlation, while data at the larger scale might exaggerate the influence of land-use types that comprise a large proportion of the catchment and underestimate the influence of the land-use types near the riverbanks (Ye et al. 2014). As the scale decreased, the proportion of cropland and urban area became larger, which meant that most farmlands and cities were located near the river and put the streams at risk of contamination. Specifically, we found urban areas to be more influential at the reach scale than at other scales, even though their proportion was relatively low. This is because urban regions are usually distributed along the river, and the pollutants emitted by cities can degrade water quality (de Mello et al. 2018). Hence, in order to reduce contamination, more consideration should be given to the spatial configuration of urban areas rather than the land use composition. However, croplands were more influential at the catchment and riparian scales, because the larger area and wider distribution of croplands could lead to severe non-point source pollution (Sun et al. 2014). Furthermore, the higher forest area at larger scales could improve the absorption and retention of organic matter and nutrient runoff and help prevent water quality degradation (Li et al. 2009). But in our study, the forest cover was relatively low, and the distribution was very scattered at the reach scale. Thus, the ability of forests to absorb and retain nutrients was limited (Lee et al. 2009).
The close proximity of croplands and urban areas to the streams at our sampling sites might be why all the best models with EC appeared at the reach scale. Smaller distances between cropland and rivers reduce the infiltration and retention time compared with agricultural activities that are not in close proximity, and thus more nutrients and ions could be transported into the rivers (Varanka and Hjort 2017). The best model of CODMn in spring and winter was at the reach scale, which was contrary to a previous study (Ding et al. 2016). This inconsistency may be due to the more fragmented landscape (higher PD) at the reach scale in our study area, which could result in more organic matter inputs and soil erosion (Shi et al. 2017;Uuemaa et al. 2005).

Relationships between landscape configuration and water quality
Landscape metrics describing density, size, diversity, and aggregation capture important land-use features that can affect river water quality, but their relationships with water quality were diverse (Table 4). PD, which reflects landscape fragmentation, was positively correlated with CODMn and NO3−–N in most instances (Table 4), which suggested that small patches of land use could lead to higher overflow and soil erosion (Shi et al. 2017). Furthermore, some landscape metrics carry abundant information that reflects both the configuration and composition of land use, and the influence of these land-use characteristics on water quality is not negligible (Turner et al. 2001). For instance, CON indicates the degree of aggregation and physical connection of land-use types, and it ranges from 0 to 100. When CON is close to 100, it indicates that different land uses have aggregated completely by type. We found that CON was negatively correlated with cropland (CRO); thus, CON was also negatively related to the water quality parameters, which indicated that a landscape with higher fragmentation and lower dispersion could lead to water quality degradation (Uuemaa et al. 2005). However, good physical connection between agricultural landscapes might pollute the stream in a cropland-dominated watershed (Gemesi et al. 2011). Moreover, SHDI reflects the complexity of the types of patches in the landscape, and it was positively correlated with cropland and urban area. Hence, it increased as the scale decreased and was positively related to the water quality parameters (Shi et al. 2017;Zhang et al. 2019).
LPI represented the percentage of the landscape that the largest patch comprises, and it showed a negative relationship with SPLIT and water quality parameters. The dominant land-use type in our study area was forest. Thus, a higher LPI means larger areas of forests, which could filter pollutants, sediments, and nutrients (Winston et al. 2011) and prevent water from degradation. LSI, which represented the complexity of landscape patch shapes, was negatively correlated with NH4+–N. Anthropogenic activities usually lead to regular shapes, such as with cropland and urban area, while forest was commonly characterized by irregular shapes and boundaries. With the increase of LSI, the complexity of landscapes also increased, which could help retain more nutrients (Uuemaa et al. 2005).
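The negative relationship between LPI and SPLIT follows largely from how the two metrics are defined. As a minimal sketch, assuming the standard landscape-level definitions LPI = 100 · max(a_j) / A and SPLIT = A² / Σ a_j², where a_j are patch areas and A is the total landscape area, the example below compares a landscape dominated by one large patch with an evenly subdivided one. The function names and patch areas are hypothetical and are not data from the study.

```python
from typing import Sequence


def largest_patch_index(patch_areas: Sequence[float]) -> float:
    """Percentage of the landscape occupied by the largest patch."""
    total = sum(patch_areas)
    if total <= 0:
        raise ValueError("total area must be positive")
    return 100.0 * max(patch_areas) / total


def splitting_index(patch_areas: Sequence[float]) -> float:
    """Squared total area divided by the sum of squared patch areas."""
    total = sum(patch_areas)
    if total <= 0:
        raise ValueError("total area must be positive")
    return total ** 2 / sum(a * a for a in patch_areas)


# Two hypothetical landscapes with the same total area (100 units).
dominated = [80.0, 10.0, 10.0]        # one large patch, e.g. continuous forest
subdivided = [25.0, 25.0, 25.0, 25.0]  # evenly split into four patches

print(largest_patch_index(dominated), round(splitting_index(dominated), 3))
print(largest_patch_index(subdivided), round(splitting_index(subdivided), 3))
```

In this toy comparison the landscape with the larger dominant patch has the higher LPI and the lower SPLIT, in line with the negative LPI and SPLIT correlation reported above.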

Conclusion
We found that most water quality parameters had significant correlations with land-use and landscape metrics, and that these correlations depended on spatial scale. The overall variation of water quality variables could be explained slightly better by land use and landscape metrics at the reach scale than by those at the catchment and riparian scales. Urban areas, primarily located along the river, contributed more to water quality degradation than other land use types, even though their proportion was not high. Cropland showed its negative effects on water quality mainly at the larger scales. Additionally, contamination was positively associated with the landscape metrics SHDI and PD and negatively correlated with LPI, CON, and LSI. Thus, the management of land use should focus not only on the composition (i.e., percentage of different land use types) but also on the land use configuration. A larger area of forest along the riverside, and forest patches that break up the connection of cropland and urban area, may intercept nutrients better than a large overall proportion of forest coverage. Our results have important implications for improving water quality and support a multi-scale perspective in regional land use planning.