Quantifying groundwater phosphorus flux to rivers in a typical agricultural watershed in eastern China

Increasing evidence indicates that groundwater can contain high dissolved phosphorus (P) concentrations, thereby contributing as a potential pollution source for surface waters. However, limited quantitative knowledge is available concerning groundwater P fluxes to rivers. Based on monthly hydrochemical monitoring data for rivers and groundwater in 2017–2020, this study combined baseflow separation methods and a load apportionment model (LAM) to quantify contributions from point sources, surface runoff, and groundwater/subsurface runoff to riverine P pollution in a typical agricultural watershed of eastern China. In the studied Shuanggang River, most total P (TP) and dissolved P (DP) concentrations exceeded targeted water quality standards (i.e., TP ≤ 0.2 mg P L−1, DP ≤ 0.05 mg P L−1), with DP (76 ± 20%) being the major riverine P form. Observed DP concentrations in groundwater were generally higher than those of river waters. There was a strong correlation between river and groundwater P concentrations, implying that groundwater might be a considerable P pollution source to rivers. The nonlinear reservoir algorithm estimated that baseflow/groundwater contributed 66–68% of monthly riverine water discharge on average, which was consistent with results estimated by an isotope-based sine-wave fitting method. The LAM incorporating point sources, surface runoff, and groundwater effectively predicted daily riverine TP [calibration: coefficient of determination (R2) = 0.76–0.82, Nash–Sutcliffe Efficiency (NSE) = 0.61–0.77; validation: R2 = 0.88–0.98, NSE = 0.54–0.64] and DP loads (calibration: R2 = 0.73–0.84, NSE = 0.67–0.72; validation: R2 = 0.88–0.97, NSE = 0.56–0.83). The LAM estimated point source, surface runoff, and groundwater contributions to riverine loads were 15–18%, 14–35%, and 46–70% for TP loads and 7–9%, 10–32%, and 59–82% for DP loads, respectively. Groundwater was the dominant riverine P source due to long-term accumulation of P from excess fertilizer and farmyard manure applications. The developed methodology provides an alternative method for quantifying P pollution loads from point sources, surface runoff, and groundwater to rivers. This study highlights the importance of controlling groundwater P pollution from agricultural lands to address riverine water quality objectives and further implies that decreasing fertilizer P application rates and utilizing legacy soil P for crop uptake are required to reduce groundwater P loads to rivers.


Introduction
Increasing anthropogenic phosphorus (P) inputs have substantially elevated riverine P loads worldwide, resulting in a myriad of environmental issues including eutrophication, hypoxia, and harmful algal blooms in downstream lakes, estuaries, and coastal waters (Anderson et al. 2002;Gao and Zhang 2010;Meinikmann et al. 2015). Riverine P may originate from a range of sources, such as industrial, residential, and animal wastewaters (namely point source (PS)), as well as agricultural runoff and leaching (namely nonpoint source (NPS)) (Chen et al. 2016). Due to their different pollution processes and control strategies, it is necessary to identify P pollution loads derived from PS and NPS for a given river.
Many lumped watershed models (e.g., export coefficient model, load apportionment model, and SPARROW Johnes 1996;Li et al. 2015)) and complex mechanistic models (e.g., AGNPS, HSPF, and SWAT Chen et al. 2015;Moriasi et al. 2007;Shrestha et al. 2008)) have been developed to identify riverine P sources at the watershed scale. The load apportionment model (LAM) was developed based on the principle that the PS pollution load is independent of changing runoff volume, whereas the NPS pollution load is appreciably increased with runoff volume (Bowes et al. 2009;Rattan et al. 2021). Based on the hydrological differences affecting PS and NPS, the LAM is able to quantify PS and NPS pollution loads using a power-law function of flow (Bowes et al. 2008(Bowes et al. , 2009. Compared with other source apportionment models, the LAM does not require detailed watershed attribute information, but only a dataset comprised of paired P concentrations and river discharge (Greene et al. 2011;Jarvie et al. 2010). In addition, the LAM has few parameters and is relatively simple in format and application, making the calibration process straightforward. Once the LAM has been calibrated to produce an acceptable fit to the river monitoring record, its parameters can be applied to analyze high temporal-resolution river discharge data (Bowes et al. 2009;Chen et al. 2013), which is critical for determining nutrient sources and loads during the most sensitive times of the year when eutrophication is most likely to occur (Anderson et al. 2002). The LAM methodology has been successfully applied to a wide range of watersheds in England (Chen et al. , 2016Meinikmann et al. 2015), Ireland (Greene et al. 2011;Mockler et al. 2017), Canada (Rattan et al. 2021), and China (Chen et al. 2013).
For a given watershed or catchment, NPS nutrient pollution loads are primarily conveyed to rivers by surface runoff and groundwater flow (Meinikmann et al. 2015). In general, farmland soil P loss to adjacent waters is mainly through surface runoff rather than subsurface or groundwater flow due to limited P mobility through soil profiles (Jiao et al. 2011). However, increasing evidence indicates that soil P loss via subsurface or groundwater runoff can contribute substantial dissolved P loads to streams and rivers. For example, P loads from groundwater were found to account for more than 40% of the total P export in Northern Germany, south-west Ireland, and Iowa, respectively (Meinikmann et al. 2015;Mellander et al. 2016;Smith et al. 2015). Other investigations found that dissolved P concentrations in groundwater were much higher than in receiving surface waters and even exceeded the typical water quality criteria used to assess risk to surface water systems (Kannel et al. 2008). In general, groundwater could contribute a high P flux to rivers in agricultural watersheds with long-term excessive P fertilizer applications (i.e., soil P accumulation), manure application, well-drained or intensively cultivated soils, low dissolved oxygen concentrations in groundwater (Kannel et al. 2008;Wang et al. 2003;Zhou et al. 2005) and watersheds with good connectivity between surface and groundwater (McDowell et al. 2015). The buildup of legacy P in soils and groundwater due to historical excess inputs of anthropogenic P can contribute a longterm continuous source for leaching P to surface water , thereby inducing algae blooms in sensitive seasons (Anderson et al. 2002). Therefore, it is warranted to address the contributions of surface runoff and groundwater flow to riverine P pollution loads to guide the development of efficient P management strategies in agricultural watersheds (Kannel et al. 2008). However, studies concerning quantifying groundwater or baseflow P contributions to riverine P exports in Chinese watersheds are still limited (He and Lu 2021;Guan et al. 2022). In addition, these studies usually lacked the necessary validations for estimated surface runoff and groundwater components. More efforts are required to quantitatively identify contributions of surface runoff and groundwater/ baseflow to surface water P pollution.
Based on monthly hydrochemical monitoring data of rivers and groundwater in 2017-2020, this study integrated baseflow separation methods with a LAM to quantify the contribution of groundwater to riverine P pollution in a typical agricultural watershed in eastern China. The nonlinear reservoir algorithm (NRA) and water stable isotope-based sine-wave fitting (ISF) method were combined to identify the monthly contributions of surface runoff and groundwater to river discharge. Then the LAM was adopted to quantify the contributions of point sources, surface runoff, and groundwater to riverine P loads. This study improves our quantitative understanding of groundwater and surface water P pollution dynamics, providing critical information for guiding the development of efficient agricultural NPS pollution control practices.

Study area
The Shuanggang River (120.82°-121.02°E and 28.88°-29.04°N) is located in the subtropical region of eastern China (Fig. 1) and has a watershed area of 163 km 2 . It is a major tributary of the Jiaojiang River, one of eight major river systems in Zhejiang Province that finally flows into Taizhou Estuary and the East China Sea. The estuary and coastal area commonly experience hypoxia (Gao and Zhang 2010;Li et al. 2007). Climate is subtropical monsoon with an average annual temperature of 17.0 °C. Average monthly precipitation and evapotranspiration were 129.8 mm and 86.2 mm, respectively, during the study period (October 2017-December 2020, Fig. 2). Rainfall distribution is temporally heterogeneous and mainly occurs during May-June and the typhoon season in July-October. Annual average water discharge was 5.37 m 3 s −1 at the watershed outlet (Table 1).
Population density and domestic animal density of the Shuanggang River watershed in 2019 were 399 km −2 and 94 km −2 , respectively. Agricultural land averaged ~ 36% of the total watershed area in 2020, with forest and residential lands contributing ~ 55% and ~ 9%, respectively (Table 1). Among the three tributaries investigated in this study, T1 had the highest percentages of agricultural land (34%) and paddy fields (22%), whereas T3 had the lowest farmlands (Table 1). Rice, corn, vegetables, waxberry, and oranges are the major cultivated crops. Watershed soils are dominated by highly weathered and acidic soils that are predominantly classified as Oxisols and Ultisols (Chen et al. 2016). As a typical agricultural watershed, > 98% of total P input (~ 51.9 kg P ha −1 year −1 on agricultural soils) in the Shuanggang River watershed is from synthetic fertilizer and farmyard manure; the ratio of P application rates between these two fertilizer sources is 2.3:1 ).

Water quality sampling and analyses
River water samples were collected on a monthly basis from October 2017 to December 2020 from upstream to downstream (4 mainstream, 3 tributaries, and 2 groundwater sites) in the Shuanggang River watershed (Fig. 1). Surface water samples were collected at 3 points across the river channel cross section and composited to obtain a single depth-width integrated sample. Two groundwater sampling wells were established 100-200 m from the river channel near M1 (G1: ~ 3-m depth) and T2 (G2: ~ 3-m depth). Groundwater samples were collected using a peristaltic pump to purge a minimum of 3 well Fig. 1 Distribution of land-use types and water quality sampling sites in the Shuanggang River watershed volumes of water before collecting a representative sample for analysis. Rainfall water samples were collected near T2 on an event basis and composited into a volume-weighted, monthly sample that was used for water chemistry and stable isotope analysis. All water samples were immediately sealed and stored at 4 °C to prevent fractionation from evaporation. Water electrical conductivity (EC) and chloride ion (Cl − ) concentration were measured in the field using YSI electrodes (Xylem, NY, USA). River water width, depth, and flow rates were determined at the time of sampling using a 6526 Starf low Ultrasonic Doppler (Unidata, O'Connor, Western Australia).
All river water samples were analyzed within 24 h of collection for total phosphorus (TP) and dissolved phosphorus (DP, water samples filtered by 0.45-μm microfiltration membrane) concentrations using the ammonium molybdate spectrophotometry method with a detection limit of 0.01 mg P L −1 (GB 11,893-89, Chen and Arai 2020). Other forms of P (OFP) were estimated as the difference between TP and DP, which includes particulate and colloidal (> 0.45 µm) bond P (Kovar and Pierzynski 2009;Liu et al. 2014 (Brand et al. 2014). Analytical precisions were ± 0.2‰ and ± 1‰ for δ 18 O and δ 2 H, respectively.

The methodology for riverine P source apportionment
This study adopted the nonlinear reservoir algorithm (NRA) and isotopically based sine-wave fitting (ISF) Fig. 2 Monthly river water discharge (at site M4), mean temperature, precipitation, and evaporation in the Shuanggang River watershed from October 2017 to December 2020. Blue bars represent the monthly water discharge; orange dots represent the monthly precipitation; green squares represent the monthly evaporation Table 1 Land-use distribution and river water discharge for the seven catchments in the Shuanggang River watershed M mainstream, T tributary, D dry land, PF paddy field, F forest land, R residential land, PD population density, DA domestic animal density (the number of each type of domestic animal is converted into the equivalent number of pigs according to their P excretion rates) method to determine surface runoff and groundwater flow/ baseflow components. Based on the estimated monthly surface runoff and groundwater flow components, the load apportionment model (LAM) was used to quantify contributions from point sources, surface runoff, and groundwater/subsurface flow to riverine P pollution (Fig. 3).

Surface runoff and groundwater components
The NRA method was conducted for mainstream sampling sites (e.g., M2, M3, and M4) based on the continuous stream water discharge records provided by the local Hydrology Bureau. Then the ISF method was conducted for the entire Fig. 3 The methodology framework for estimating point source, surface runoff, and groundwater contributions to riverine phosphorus load watershed (e.g., M4) to validate the NRA results using monthly water isotope signals in precipitation and river water during the study period.
The nonlinear reservoir algorithm The nonlinear reservoir algorithm (NRA, Wittenberg 1999) separates river discharge into baseflow and direct runoff. Baseflow is defined as the sustaining flows in a river during low-flow periods, which includes groundwater and a portion of the delayed shallow subsurface flow (He et al. 2016). The NRA assumes a nonlinear storagedischarge relationship to describe the baseflow recession: where storage (S) and discharge (Q) are in m 3 and m 3 s −1 , respectively, the parameter a is related to basic watershed characteristics (e.g., porosity and hydraulic conductivity) and morphological factors, and parameter b controls the concavity of the recession curve, usually ranging from 0 to 1 (He et al. 2016;Wittenberg 1999). Combining Eq. (1) with the continuity equation of a reservoir without inflow, dS/dt = -Q, yields the recession curve equation (Eq. (2)) for the nonlinear reservoir starting at any initial discharge Q 0 .
The baseflow at t-Δt can be further estimated as: The coefficient of variations (CV, Eq. (4)) between observed records (Q) and the fitted values (Q calc ) with the NRA were used to select typical recession curves for recession analysis (Aksoy &Wittenberg 2011). The data for the recession curve with CV less than 0.1 would be further used for the subsequent parameter estimation.
By systematically varying parameter b, the parameters (a and b) in the nonlinear reservoir algorithm were calibrated using an iterative least-square method: Calibration of typical recessions was conducted to determine parameter b (He et al. 2016). Based on the calibration of typical recessions, the parameter b varied between 0.34 and 0.85 with a mean and median of 0.50, which was consistent with the value (b = 0.5) recommended by Wittenberg (1999). As a result, the exponent b was set as a constant of 0.5. With a fixed b = 0.5, the parameter a for 91, 91, and 79 recession events during 2010-2020 in M2, M3, and M4 was estimated according to Eq. (5). Based on the seasonal variations of parameter a, Fourier fitting was conducted to determine monthly values of parameter a for further estimation of baseflow using MATLAB software (Ver. R2020a, MathWorks, Natick, USA). Daily surface runoff was predicted by subtracting the daily baseflow from the daily river discharge. Monthly and annual surface runoff and groundwater flow/baseflow were estimated as the sum of daily surface runoff and baseflow, respectively.
The isotope-based sine-wave fitting method The isotopebased sine-wave fitting (ISF) method separates river discharge into young (F yw ) and old (F ow ) water fractions based on the damping and phase-shifted isotopic signals between precipitation and river water. F yw is defined as the proportion of the transit-time distribution younger than a threshold age (~ 0.2 year) (Kirchner 2016). The F yw includes overland flow and rapid subsurface flow paths (e.g., subsurface lateral flow through upper soil horizons), whereas the F ow includes slow subsurface flow paths (e.g., deeper soil horizons and vadose zone) and groundwater. Sine wave regressions on the δ 18 O (‰) time series (McGuire and McDonnell 2006) were performed to determine the cosine and sine coefficients for precipitation and streamflow: In Eq. (6), C p (t) and C s (t) are the tracer signal (δ 18 O) for precipitation and streamflow at time t, f is the frequency of the annual fluctuations (set to 1/365 days), k P and k S represent the vertical shift of the sine wave, and a p , b p , a s , and b s are coefficients for determining the amplitude (i.e., A P and A S ) of the seasonal cycles.
As indicated by Kirchner (2016), F yw is equal to the amplitude ratio A S /A P . F ow was calculated as the total water component minus F yw : An uncertainty analysis was further conducted to determine the uncertainty range for F yw following Song et al. Stockinger et al. (2016). The 95% confidence limits for the coefficients (i.e., a p , b p , a s , or b s ) obtained from the sine wave regressions were used as parameter boundaries for 10,000 Monte Carlo simulations to estimate the range of F yw values. The sine-wave fitting and Monte Carlo simulations were conducted in MATLAB software.

Application of the load apportionment model (LAM)
In the conventional LAM, point (PS) and nonpoint (NPS) source pollution loads are identified as a constant parameter and a power-law function of river discharge, respectively (Bowes et al. 2008(Bowes et al. , 2009. In this study, we further apportioned the NPS pollution load into surface runoff and groundwater components using a power-law function of surface runoff and groundwater flow, respectively. Therefore, riverine P load at the outlet on the i th month (L i , kg P month −1 ) can be expressed as: where Q s,i and Q g,i are monthly surface runoff and groundwater flow on the i th month (m 3 month −1 ), respectively, and a, b, c, d, and e are unknown fitting parameters. In Eq. (10), the parameter a represents the PS-contributed pollution load (kg P month −1 ) based on the assumption that the P load from PS pollution is discharged from domestic, industrial, and animal wastewater and remains constant throughout the study period. The second term represents the NPS P load from surface runoff (kg P month −1 ) and is based on the consideration that the surface runoff-contributed NPS P load is flow-dependent and increases with increasing surface runoff volume. The third term represents the NPS P load from groundwater (kg P month −1 ) and assumes that the groundwater-contributed NPS P load is flow-dependent and increases with increasing groundwater flow volume.
Using monthly river discharge and riverine TP/DP loads, the Markov chain Monte Carlo (MCMC) program was implemented to calibrate parameters (a, b, c, d, and e) for sampling sites M2, M3, and M4 using the MAT-LAB software (Vrugt 2016). To provide realistic solutions, parameters a, b, and d values were set to be ≥ 0, as P loads from all three sources must be non-negative. In addition, parameter c and e values were set to be ≥ 0, as NPS P inputs from surface runoff and groundwater flow should not decrease with increasing water flow (Rattan et al. 2021). For calibration purposes, all parameters were assumed to be normally distributed ). The MCMC simulation was performed using 20,000 runs to find the optimum parameter set that maximized the log-likelihood function. At each sampling site, 70% of the available data was used for model calibration, with the (10) remaining 30% retained for validation. Model efficiency was evaluated using correlation of determination (R 2 ) and Nash-Sutcliffe Efficiency (NSE) coefficient metrics (Nash and Sutcliffe 1970). Uncertainties associated with predictions of riverine P loads were expressed by 95% confidence intervals. Finally, using the validated LAM models for TP and DP, monthly riverine TP or DP loads (kg P month −1 ) derived from point sources, surface runoff, and groundwater were estimated as a, bQ c s, i , and dQ e g, i , respectively. Monthly riverine OFP loads derived from point sources, surface runoff, and groundwater were estimated as the difference between DP and TP loads. Annual point sources, surface runoff, and groundwater contributions to riverine TP/DP/OFP loads were calculated as the sum of monthly loads over a given time period.

Other data sources and statistical analysis methods
Daily precipitation and evaporation for the Shuanggang River watershed and daily river discharge for the outlet in 2017-2020 were obtained from the local Weather Bureau and Hydrology Bureau, respectively. We categorized the water year into three different flow regimes according to monthly average water discharge: high flow regime (upper 30% of the flow regime), low flow regime (lower 30% of the flow regime), and median flow regime (30-70% of the flow regime) . Land-use distribution and catchment characteristics were derived from a watershed DEM (30-m resolution) using ArcGIS 10.8. Analysis of variance (ANOVA), independent sample t-tests, and regression analyses were performed using SPSS (Ver. 26.0, SPSS, Chicago, USA). One-way ANOVA was conducted to reveal temporal and spatial variations in P concentrations across the Shuanggang River watershed. Univariate ANOVA was used to assess differences between P concentrations in river water and groundwater. An independent sample t-test was conducted to examine the parameter values for the LAM. Regression analyses were used to assess relationships among variables.

P dynamics in river water and groundwater
Average TP, DP, and OFP concentrations for the 7 river sampling sites in the Shuanggang River watershed were 0.12, 0.09, and 0.03 mg P L −1 in 2017-2020, respectively. A total of 89% of riverine TP concentrations exceeded targeted water quality standards (i.e., Grade III, ≤ 0.2 mg P L −1 ) in terms of Chinese National Environmental Quality Standards for Surface Water (GB 3838-2002). Similarly, 69% of DP concentrations exceeded the critical concentration of 0.05 mg P L −1 established as the maximum DP concentration to reduce the risk of excessive algal growth (Li et al. 2010). These concentrations were much higher than observed average riverine TP (0.03 mg P L −1 ) and DP (0.02 mg P L −1 ) concentrations in a nearby forest-dominated catchment (agricultural land + residential land < 4%, Hu et al. 2020a, b, c) and imply the substantial P contribution from agricultural activities. Riverine DP (76 ± 20%) was the major component of TP, which was significantly higher than OFP (24 ± 20%) across all sampling sites. This implies that DP from PS pollution or groundwater flow should be major potential cause of river P pollution (Kannel et al. 2008). Observed electrical conductivity (EC) and Cl − concentrations in river water (EC: 84.0-138.2 μS cm −1 , Cl − : 2.15-5.03 mg L −1 ) and in groundwater (EC: 131.3-232.0 μS cm −1 , Cl − : 2.36-4.56 mg L −1 ) sampling sites were also much high than those observed in a nearby forest-dominated catchment (EC: 47.6 μS cm −1 and Cl − : 1.25 mg L −1 in river water; EC: 45.1 μS cm −1 and Cl − : 0.93 mg L −1 in groundwater, Hu et al. 2020a, b, c). In addition, significant correlations were observed between DP concentrations with EC (p < 0.001, r = 0.258) and Cl − concentrations (p = 0.012, r = 0.166) in river water. These higher EC values and Cl − concentrations as well as the positive correlation between DP concentration and EC in the Shuanggang River watershed further indicated potential contributions of sewage and groundwater to river water pollution in terms of previous studies (Yu et al. 2019).
Longitudinally, average TP and DP concentrations increased by 50% from upstream (M1) to downstream (M4) along the mainstream (Table 2) due to the increasing cumulative P inputs from PS and NPS inputs. Average TP, DP, and OFP concentrations of tributaries were all significantly Table 2 Spatial and seasonal variations of riverine total phosphorus (TP), dissolved phosphorus (DP), and forms of P (OFP) concentrations in the Shuanggang River watershed P concentrations are expressed as "mean ± standard error." The lowercase letters ("a", "b," and "c") denote the differences of each P concentration (TP, DP, and OFP) among different flow regimes for each sampling site. The uppercase letters ("A", "B," and "C") denote the differences of annual averages for each P concentration across different sampling sites (p < 0.05) higher than that of the mainstream, potentially resulting from dilution effects . TP, DP, and OFP concentrations in T3 were the highest (p < 0.05) among the river sampling sites (Table 2), which may be due to the low water discharge (0.31 m 3 s −1 , Table 1) and high domestic animal density (209 km −2 , Table 1). No significant (p > 0.05) correlations were observed between average riverine TP/DP/OFP concentrations and area percentages for different land-use type across the subwatersheds. This may result, in part, from the low number of catchments, as well as interactions among multiple, complex hydrological processes within each subwatershed. In terms of seasonal variations, there were no significant (p > 0.05) correlations between riverine TP/DP/OFP concentrations and river discharge (Table 2), implying a mixture of pollution sources and complex pollution processes in the Shuanggang River watershed. Average riverine TP and DP concentrations were generally higher in the low flow regime (TP = 0.13 mg P L −1 , DP = 0.85 mg P L −1 ) than in the high flow regime (TP = 0.11 mg P L −1 , DP = 0.73 mg P L −1 ), which might reflect the higher P concentration contributed by groundwater during the low flow regime (Flores-López et al. 2011). During the high flow regime (corresponding to the summer growing season in this area), 68% of the riverine DP concentrations exceed the critical concentration (0.05 mg P L −1 ), implying a high risk for algal blooms in the river and downstream water bodies (Bowes et al. 2009).
Average groundwater TP concentrations in G1 and G2 were 0.19 mg P L −1 and 0.18 mg P L −1 , respectively, which were significantly (p < 0.05) higher than their corresponding river water samples in M1 and T2 (Table 2). Groundwater TP concentrations were much higher than those observed in a nearby forest-dominated catchment (0.03 mg P L −1 , Hu et al. 2020a, b, c) and measured in other areas of eastern China (0-0.18 mg P L −1 , Kang and Xu 2017;Qian et al. 2011). These high groundwater TP levels reflect the high agricultural soil P contents (32-90 mg Olsen P kg −1 ) resulting from long-term excessive chemical fertilizer and farmyard manure applications in the Shuanggang River watershed . The positive correlation observed between P concentrations in river water and groundwater (R 2 = 0.76-0.88, p < 0.05, Fig. 4) support the inference that groundwater is a potential source of riverine P pollution.

Surface runoff and groundwater contributions to river discharge
Using the nonlinear reservoir algorithm (NRA) method, estimated monthly groundwater or baseflow contribution to river discharge ranged within 30-96%, 30-96%, and 28-96% for M2, M3, and M4, respectively, with surface runoff contributions of 4-70%, 4-70%, and 4-72% (Fig. 5). The large estimates for groundwater contributions were supported by the low isotopic variability of river water (CV: 16.7% for δ 2 H-H 2 O; 15.0% for δ 18 O-H 2 O) compared to that of precipitation (CV: 69.3% for δ 2 H-H 2 O; 50.0% for δ 18 O-H 2 O). This damping effect is consistent with substantial contributions of groundwater to river discharge (Hu et al. 2020c). The lowest groundwater contribution occurred in August 2019 and coincided with "Typhoon Likima" that resulted in a cumulative rainfall of > 400 mm. The intensive storm exceeded the soil infiltration capacity generating high surface runoff. The highest groundwater contribution occurred in June 2020 during the low flow regime when high temperature and strong evapotranspiration limited surface runoff (Fig. 2). The positive correlation between monthly groundwater contribution and temperature (r = 0.486-0.491, p < 0.01) highlights the role of evapotranspiration in limiting surface runoff (Fig. 5). This is also consistent with the considerably lower slope of the Local River Water Line (LRWL: δ 2 H = 5.66δ 18 O-2.37‰, n = 27, R 2 = 0.90, p < 0.001) compared to that of the Local Meteoric Water Line (LMWL: δ 2 H = 7.05δ 18 O + 10.22‰, n = 27, R 2 = 0.90, p < 0.001), which further implies that high evapotranspiration reduced the surface runoff contribution and increases the relative groundwater contribution (Gonfiantini 1986). Estimated monthly groundwater contributions showed a consistent longitudinal trend along the sampling site progression of M2, M3, and M4 (Fig. 5), with similar annual groundwater contributions (M2: 68%, M3: 68%, and M4: 66%, Table 3). Such similar temporal trends were associated with similar Fig. 4 Relationships between observed total phosphorus (TP) or dissolved phosphorus (DP) concentrations in river water (at sampling sites M1 and T2) and groundwater (at sampling sites G1 and G2) in the Shuanggang watershed. Blue line represents the regression line for TP of M1 and TP of G1; red line represents the regression line for DP of M1 and G1; blue dot-dash line represents the regression line for TP of T2 and TP of G2; red dot-dash line represents the regression line for DP of T2 and G2 topographic attributes (e.g., topographic gradients and soil types) across catchments of M2, M3, and M4 (Table 1), resulting in homogeneous hydrological characteristics.
Based on the observed δ 2 H and δ 18 O values in precipitation and river water (Table 3), the isotope-based sine-wave fitting method (ISF) estimated an average young water Fig. 5 The nonlinear reservoir algorithm estimated monthly surface runoff and groundwater contributed river water discharges at sampling sties M2 (a), M3 (b), and M4 (c) in the Shuanggang River watershed. "L", "M," and "H" represent the low, median, and high flow regimes, respectively. Error bars denote the 95% confidence intervals fraction (F yw ) of 29% (95% CI: 15-60%) for the entire watershed (M4), implying that the majority (71%) of total streamflow originates from old water (F ow ). The F ow includes groundwater and slow subsurface flows (Hu et al. 2020c), which are similar in principle to the components of baseflow (groundwater and a portion of delayed shallow subsurface flows, He et al. (2016)). The estimated F ow contribution from the ISF method was fully consistent with the estimated groundwater contribution from the NRA method (Table 3). Moreover, the high baseflow contribution was comparable with estimates from a lumped-model simulation for this same watershed (82%, Hu et al. 2018), supporting the reliability of our NRA results. Our estimated baseflow contribution (66-68%) was also similar to results (58-74%) from a nearby agricultural watershed (He et al. 2016).
The variations in groundwater contribution may be primarily related to landscape characteristics. The high percentage of paddy fields (Table 1) in the Shuanggang River watershed is prone to hold a certain level of water during the rice production season, thereby enhancing evapotranspiration and water infiltration into the soil where it may become subsurface flow and/or groundwater. This would lead to delayed transport of the young water fraction to rivers (Peng et al. 2011). Additionally, the agricultural and forest lands typically have high vegetation cover, which promotes the interception of precipitation and the infiltration processes and thereby increases subsurface flows (Du and Shi 2012).

Riverine P source apportionment
Using estimated surface runoff and groundwater flow/baseflow components along with riverine P concentrations in 2017-2020, the load apportionment model (LAM) was calibrated for the Shuanggang River. Calibrated parameters (i.e., a, b, c, d, and e) of the LAM were statistically significant (p < 0.001) for all three mainstream sampling sites (M2, M3, and M4), indicating that a satisfactory fraction of the variance was explained by each model component (Table 4). The TP-LAM accounted for 75-82% (R 2 ) of the variability in TP concentrations with a NSE of 0.61-0.77 for the calibration, compared to a R 2 = 0.88-0.98 and NSE = 0.54-0.64 for the validation (Fig. 6). Similarly, the DP-LAM explained 73-84% (R 2 ) of the variability in DP concentrations with a NSE of 0.67-0.72 for the calibration and R 2 = 0.88-0.97, NSE = 0.56-0.83 for the validation (Fig. 6). The higher accuracy observed in the validation step for each LAM (TP and DP) compared to their respective calibration step demonstrates the robustness of the LAM for the Shuanggang River. The efficiency metrics for the LAM (R 2 = 0.73-0.98, NSE = 0.54-0.83) were satisfactory and comparable with other monthly model simulation results using SWAT, AGNPS, and HSPF (R 2 > 0.50 and NSE > 0.50, Moriasi et al. 2007). Notably, these results were comparable or superior to nutrient simulations for other watersheds using the LAM (R 2 = 0. 26-0.98, NSE = 0.70-0.98;Chen et al. 2013;Greene et al. 2011;Mockler et al. 2017;Stevenson et al. 2021). Collectively, these assessments validate the feasibility of using the LAM for apportioning PS, surface runoff and groundwater contributions of P to the Shuanggang River.
The LAM estimated that PS contributions to annual riverine TP (15-18%) and DP (7-9%) loads were limited (Fig. 7), with a higher contribution of OFP from PS (38-65%) in the Shuanggang River watershed. Estimated PS contributions to longitudinal riverine TP loads followed M4 > M3 > M2, which is consistent with the distribution of residential lands in the watershed (Table 1). In terms of seasonal variation among mainstream sites, estimated PS contributions to TP and DP loads during the low flow regime (TP = 36%, DP = 21%) were considerably higher than for the median and high flow regimes (TP = 11-19%, DP = 5-9%). We attribute this seasonal pattern to lower river discharge and weaker dilution effects during the dry season .
The LAM estimated that surface runoff contributed 14-35%, 10-32%, and 35-55% of annual riverine TP, DP, and OFP loads in the Shuanggang River, respectively (Fig. 7). This indicates that riverine OFP pollution loads were mainly derived from PS and surface runoff. Estimated TP and DP loads from surface runoff followed the order of M2 > M4 > M3, which is consistent with the percentage of agricultural lands associated with the three sampling sites (Table 1). This infers that agricultural lands are a primary source of riverine P via surface runoff (McDowell et al. 2001). As expected, surface runoff contributed the highest riverine OFP loads during the high flow regime (222 kg P month −1 ), being significantly (p < 0.05) higher than the low (64 kg P month −1 ) and median (112 kg P month −1 ) flow regimes. This is consistent with heavy rainfall events causing surface runoff and soil erosion that transports OFP from the landscape to stream channel (Cheng et al. 2021). The LAM estimated that groundwater contributions to annual riverine TP (46-70%) and DP (59-82%) loads were relatively high with very limited OFP loads in the Shuanggang River (Fig. 7). Similar results were found in a nearby agricultural watershed where the annual groundwater DP load contributed ~ 64% of the annual riverine DP load in 2003(He and Lu 2021. Estimated groundwater contributions to riverine TP and DP loads in M3 (TP = 62%, DP = 76%) were higher than that in M2 (TP = 42%, DP = 56%) and M4 (TP = 42%, DP = 60%), which we ascribe to the low percentage of paddy fields in M3 (Table 1). The plow pan within paddy field soil profiles has been shown to hinder P leaching into the groundwater system (Wang et al. 2012).
High groundwater contributions to riverine P loads are due to a combination of high groundwater contributions to river discharge (Fig. 5) and high DP concentrations in groundwater ( Table 2). As a typical agricultural catchment, the annual P fertilizer application rate to agricultural lands within the Shuanggang River watershed was 51.3 kg P ha −1 year −1 in 1980-2010 ). This P application rate was nearly three times higher than that in the Yangtze River basin (~ 13.47 kg P ha −1 year −1 , Hu et al. 2020b) and more than twice that of the maximum P application standards in Europe (14.2-20.7 kg P ha −1 year −1 , Amery and Schoumans 2014). However, P uptake by crops comprised only 27% of the total annual P input to the studied area . As a result, agricultural soil P contents (32-90 mg Olsen P kg −1 ) in 2009 often exceeded the critical concentration of 60 mg Olsen P kg −1 that is considered conducive to soil P leaching (Hesketh and Brookes 2000). Moreover, application of farmyard manure (average P application rate of 15.7 kg P ha −1 year −1 during 1980-2010, Chen et al. 2017) in the Shuanggang River watershed adds additional P and inhibits P adsorption by soils, thereby accelerating soil P leaching to groundwater (Spiteri et al. 2007). Thus, the accumulation of large soil P pools (i.e., legacy P) associated with chemical fertilizer and farmyard manure application over long time periods, coupled with high groundwater contributions to river discharge, result in a high groundwater contribution of P to the Shuanggang River.

Implications for watershed-scale P pollution control
Quantitative information concerning monthly point source, surface runoff, and groundwater contributions to riverine P pollution loads provides critical information for guiding development of efficient P pollution control strategies (Fig. 7). Although the PS contribution to riverine P pollution was limited, its impact can be significant, especially during the low flow regime (Withers et al. 2012), when the average DP concentration of ~ 0.1 mg P L −1 was significantly higher than the critical concentration (0.05 mg P L −1 ) for eutrophication control (Table 2). Therefore, improvements in sewage collection and treatment, which are lacking in some areas within the Shuanggang River watershed (Chen et al. 2016), are warranted to reduce domestic, industrial, and animal wastewater discharge to the river system. Surface runoff contributes a substantial OFP load to the river system during the high flow regime due to soil erosion primarily from agricultural lands (Fig. 7). Thus, promoting conservative tillage and crop residue management to reduce soil erosion are practical methods to reduce OFP loss via surface runoff (storm events) in croplands (Kleinman et al. 2011). Further, constructing wetlands, ponds, and vegetation buffer zones can slow runoff velocity and decrease flood peak flows, Fig. 6 Calibration (a, c) and validation (b, d) results of the load apportionment model for riverine total phosphorus (TP) and dissolved phosphorus (DP) loads in the Shuanggang River watershed thereby reducing surface runoff and associated P loads (Wu et al. 2017). This study highlights the importance of designing control strategies to reduce P pollution loads from groundwater in the Shuanggang River watershed (Fig. 7). Given the considerable lag time of groundwater flows between precipitation inputs and river discharge, it will likely take several years for groundwater-directed BMPs to become evident in streamflow reductions in P loads. A first step is to reduce P leaching by utilizing/recycling legacy P stored in soils through cessation or reduction of P fertilizer/manure application. Long-term field studies indicate that after cessation of P fertilization, legacy P in soils can be effectively utilized by crops with minimal reduction in crop yields for several years to decades Rowe et al. 2016;Sharpley et al. 2013). Enhancing bioavailability of legacy soil P for crop production through using P activators (i.e., bio-inoculants, bio-fertilizers, low-molecular-weight organic acids, zeolites) and deep rooted crops is warranted to attenuate groundwater P pollution (Zhu et al. 2018). Expanding paddy field area provides a potential avenue to improve P use efficiency and deplete the legacy soil P pool. For example, previous studies in China showed that cereal crops in paddy fields (43%) have a higher P use efficiency than non-cereal crops in dryland and garden fields (19%) ). Finally, intercepting P during subsurface hydrologic transport using artificial ditches and P removal materials (e.g., fly ash and steel slag) could be beneficial for reducing groundwater DP loads before discharge to rivers (Penn et al. 2014).

Conclusion
Based on monthly hydrochemical monitoring data in 2017-2020, this study combined baseflow separation methods and a LAM to assess P dynamics and riverine Fig. 7 The LAM estimated point sources, surface runoff, and groundwater contributions to monthly riverine total phosphorus (TP) and dissolved phosphorus (DP) loads at mainstream sampling sites M2, M3, and M4 in the Shuanggang River watershed. "L", "M," and "H" represents the low, median, and high flow regimes, respectively. Error bars denote the 95% confidence intervals from Monte Carlo simulations P sources in the Shuanggang River watershed. The LAM effectively predicted riverine P pollution loads with reasonable accuracy metrics for both calibration and validation steps. Point sources, surface runoff, and groundwater contributed an estimated 15-18%, 14-35%, and 46-70% of the annual riverine TP load and 7-9%, 10-32%, and 59-82% of the riverine DP pollution load, respectively. A notably high groundwater contribution to riverine P loads resulted from the combination of high groundwater contributions to river discharge (66-68%) and high groundwater DP concentrations. High groundwater P concentrations were associated with considerable cropland soil P accumulations coupled with farmyard manure application over long time periods. Therefore, to reduce P loads in the Shuanggang River watershed, pollution control strategies must address reducing the groundwater P load, such as decreasing P fertilizer application rates and utilizing/recycling soil legacy P for crop uptake. The integrated methodology developed in this study provides a relatively simple method to quantify P pollution loads from point sources, surface runoff, and groundwater to rivers. This study highlights the importance of considering groundwater pollution in regulating P loads in agriculturally dominated watersheds.