Deciphering the influence of multiple anthropogenic inputs on taxonomic and functional profiles of the microbial communities in Yitong River, Northeast China

We conducted the analysis of physicochemical parameters, 16S rRNA amplicon sequencing and real-time quantitative polymerase chain reaction to explore the impact of human inputs on the bacterioplankton communities within a tributary of the largest river flowing through a megacity in northeast China. Agriculture largely accounted for the alteration of diversity and functions of the microbial communities. Furthermore, nitrate and total phosphorus declined at the reservoir outlet. The WWTP effluent discharge caused a decrease of the relative abundance of Actinobacteria and Cyanobacteria, while the impact on the variation of alpha diversity of river microbial community was slight. Carbon fixation and nitrogen cycle varied with the change of land use type. The rare taxa contributed with a predominant role in the response to environmental variables and NH3-N as well as NO3−-N were the main environmental factors that drove the shift in the bacterial community. The occurrence of the human-specific fecal indicator was mostly derived from agriculture, and its increase in relative abundance was observed in the WWTP effluent. Thus, our study provides guidance for ecological assessment and management of rivers by revealing the response pattern of river bacterioplankton to multiple types of anthropogenic stressors.


Introduction
The river ecosystem is vital for the persistence of human civilization, and harbors numerous microbes (Clark et al., 2018;Mansfeldt et al., 2020). Microorganisms are essential components for the freshwater environment, playing an important role in material and energy flow in aquatic ecosystems (Falkowski et al., 2008;Zeglin, 2015;Yang and Zhang, 2020). Stream microbial communities drive organic chemical degradation, carbon flow, nitrification or denitrification, and phosphate uptake in rivers, and have been suggested as sensitive bioindicators for water quality assessments (Guecker et al., 2006;Valett et al., 2008;Tank et al., 2010).
As a typical lotic ecosystem, rivers are influenced not only by climate factors but also by the disturbance of anthropogenic effects, such as dam construction, land-use alteration, and wastewater discharge by wastewater treatment plants (WWTPs) (Garcia-Armisen et al., 2014;Liao et al., 2018;Price et al., 2018;Burdon et al., 2020;Yang and Zhang, 2020). On the one hand, the construction of a reservoir may significantly modify hydrological regime, Responsible Editor: Robert Duran * Yang Huo huo0814@outlook.com * Mingxin Huo huomx097@nenu.edu.cn limit physical exchange, drive the variation of the microbial structure, and change transformation and flux of nutrients by biotic and abiotic processes (Li et al., 2015;Schmadel et al., 2018). On the other hand, land use alteration is proved to induce the change of the existence form of nitrogen Eggen et al., 2014). It was reported that nitrogen cycle processes including nitrogen fixation, ammonia oxidization and nitrite oxidization were abundant in the forest area, while key microbes in the nitrogen cycle were lost in the urban area (Liao et al., 2020). Other human activities, such as discharging contaminants from both agricultural area and municipal wastewater are highly influential to rivers, which also bring potential threats to human health (Vadde et al., 2019). Rivers in arid and semi-arid regions receive a major portion of their base flow from WWTP effluents (Yu et al., 2020). China discharges more than 60 billion m 3 municipal wastewater per year, of which more than 90% is treated through WWTPs (Tong et al., 2020). Treated effluent discharge into the rivers is considered to alter the availability of nitrogen and phosphorus, and also the relative content of various forms of nitrogen in the receiving water (Ekka et al., 2006;Lewis and Grimm, 2007). The downstream decrease of ammonia nitrogen is usually accompanied by an increase of nitrate, which results from nitrification by nitrifying bacteria to a great extent (Sonthiphand et al., 2013).
Until now, many researches have focused on the spatial and temporal taxonomic profile variation of microbes exposed to inputs. However, the functional diversity of river microbial communities and their direct relevance to various anthropogenic stresses have yet to be comprehensively understood Mansfeldt et al., 2020). In this study, we selected a river (Yitong River) in the northeast China that experiences different human disturbances, including land-use partition, hydraulic changes, and WWTP effluent discharge. High-throughput amplicon sequencing and quantitative polymerase chain reaction (qPCR) were performed to depict the taxonomic and functional diversity of bacterioplankton communities in the highly disturbed watershed in different sampling times. The aim of this study is to investigate the relationship between the diversity pattern of river bacterioplankton and multiple types of anthropogenic disturbances for guidance in evaluation and management of the river ecosystem.

Study area and sample collection
Yitong River is a secondary tributary of Songhua River in Jilin Province, northeast China, with a total length of 343 km, covering 8,440 km 2 of the watershed area and the average velocity is 0.011 cm/s. The mean annual precipitation of the watershed is 622 mm, with 78.1% accounting for rainfall events from June to September. The river experiences a four-month freezing period from mid-November to late March the following year. We selected a stretch of 80.04 km of the river (43°10′ N to 43°57′ N, 125°16′ E to 125°29′ E), which successively flows through an agricultural area and a constructed reservoir (Xinlicheng Reservoir), and then enters a highly urbanized area (Changchun City) that serves a population of 7.538 million. In southern Changchun, the river receives effluent discharged from a domestic WWTP (Dongnan WWTP), with a daily wastewater treatment capacity of 150,000 m 3 and the oxidation ditch treatment process.
Sampling campaigns were conducted along the above mentioned stretch on October 11, 2019, and August 22, 2020, which represented normal and wet periods, respectively. Eight sampling sites were selected, including agriculture sites (Y0-Y3), of which Y0 contained aquaculture activity and Y3 was the reservoir outlet, and urban sites (Y4-Y7), upstream and downstream of the WWTP effluent discharge (Fig. 1). For the two periods, we collected a total of 16 samples. The latitude and longitude information of each sampling site could be found in Table S1. For each site, A total volume of 5.5 L surface water was collected into sterile containers (4.5 L in a polypropylene bucket and 1 L in a brown glass bottle) at each site. All the containers were carefully rinsed using the river water prior to collection. After collection, the samples were immediately transported to a laboratory in Northeast Normal University (Changchun, China) with a distance of 7.5 km from the WWTP and preserved at 4 ℃.

Physicochemical analysis
For each site, water temperature (T), pH, electric conductivity, oxidation-reduction potential (ORP), and total dissolved solid (TDS) were measured in the field using a multiparameter water quality analyzer (ULTRAMETER II™ 6PFC E , MYRON Co., USA). The laboratory analysis of samples for the physicochemical parameters of water were done within 24 h, following the standard methods suggested by the Ministry of Ecology and Environment of China (Chinese EPA, 2002;Yu et al., 2020). Ammonia nitrogen (NH 3 -N), total phosphorus (TP), and soluble reactive phosphorus (SRP) were measured using an ultraviolet spectrophotometer (UV-VIs, GENESYS 10S, Thermo Scientific Co., USA). Nitrite (NO 2 − -N) and nitrate (NO 3 − -N) were measured using ion chromatography (IC, 881, Metrohm Co., Switzerland). Fe, Mn, K, Mg, Ca, and Si were measured in element via inductively coupled plasma-optical emission spectroscopy (ICP-OES, Avio-200, Perkinelmer, USA), and total organic carbon (TOC) was measured by a total organic carbon analyzer (TOC, 5000A, Shimadzu Co., Japan). Significance of the variance of main physicochemical characteristics among sampling sites and seasons was calculated via analysis of variance (ANOVA) in the R software (version 4.0.2) (Ramette, 2007;Aristi et al., 2016;R Core Team, 2020).

DNA extraction
Each water sample (500 mL to 1100 mL) was vacuum pumpfiltered using a Millipore 0.45 μm filter membrane on the day of sampling, 30 min for each sample approximately. These membranes were stored at − 80 ℃ before DNA extraction, which was conducted using a DNeasy PowerWater Kit (Qiagen Co., USA) following the manufacturer's instructions. Extracted DNA concentration was measured using a NanoPhotometer (N50, IMPLEN Co., Germany). Prepared DNA samples were stored at − 20 ℃ until further analysis.

Amplicon sequencing and bioinformatic analysis
The 16S rRNA gene V4 region was amplified using the forward primer 515F (GTG YCA GCMGCC GCG GTAA) and reverse primer 806R (GGA CTA CNVGGG TWT CTAAT). The amplicon library was constructed according to the following procedure: the Illumina barcode was added to the end of the target region via PCR, and the product was collected following the detection in a 2% agarose gel. Finally, the product was denatured with sodium hydroxide to produce single chain DNA fragments. Both amplicon library construction and amplicon sequencing were performed on an Illumina MiSeq PE300 platform (Illumina Inc., USA) at Shanghai Majorbio Bio-pharm Technology Co., Ltd. Raw read files were uploaded to the National Center for Biotechnology Information Sequencing Read Archive database (SRA; https:// trace. ncbi. nlm. nih. gov/ Traces/ sra/) with the BioProject Accession ID PRJNA683092. A list of the sample names and their corresponding BioSample Accession IDs are shown in Table S2.
The quality of reads and the effect of merge were filtered for quality control to remove barcode primers and low-quality reads by using a Galaxy pipeline in Denglab (Feng et al., 2017). A total of 770,634 high-quality bacterial sequences from 16 samples were obtained after quality control, which was performed as the following procedure: Firstly, sequences were removed if the sequences contained N, and primers in the sequences were removed. Then sequences were trimmed by the minimum length of 240 bp and the maximum length of 260 bp. Finally, the average length of each read is 256 bp. The clean reads were clustered into OTUs with the UPARSE method at the similarity threshold of 97% (Edgar, 2013). The taxonomy annotation for each OTU in bacteria was assigned against the SILVA database (version 132) (Pruesse et al., 2012;Quast et al., 2013;Yilmaz et al., 2014).

Diversity analysis
Diversity analysis was performed using R software (version 4.0.2) (R Core Team, 2020). Alpha diversity indices, including Shannon, Inv_Simpson, and Richness, were calculated for each sample in vegan 2.5-7 (Oksanen et al., 2020). ANOVA was used to calculate the significance of the change of indices. For beta diversity, PCoA was performed based on Bray-Curtis dissimilarity to identify potential patterns and relationships within the bacteria community and the sites where the sample was collected and Permutational Multivariate Analysis of Variance (PERMANOVA) was used to test the significance of dissimilarity between groups (Ramette, 2007;Coll et al., 2020). The significant key taxa for each period were determined via the linear discriminant analysis effect size (LEfSe) conducted online within a Galaxy pipeline in Denglab (Segata et al., 2011).
Hellinger distance-based redundancy analysis was used to test the relationships between the bacterial communities and the water chemistry parameters by using vegan 2.5-7 (Oksanen et al., 2020;Xu et al., 2020). To eliminate the influence of dimension and acquire a normal distribution, all environmental variables were standardized through z-score transformation (Ramette, 2007). We removed the environmental variables with maximum value analyzed through VIF (variance inflation factor) to exclude the collinear factors until all of the values were below 10, and forward selection was performed to select which variables were statistically significant in shaping the community composition. The statistical significance was determined through ANOVA-like permutation tests (with 999 permutations) using the anova. cca function in the package (Blanchet et al., 2008).

Co-occurrence network and functional analysis
The top 10 phyla network was constructed by calculating the absolute value of Spearman rank correlation with Rho > 0.6 and p value < 0.05 in Hmisc 4.5-0 Wang et al., 2020;Harrell, 2021). Cytoscape software (version 3.8.0) was applied for network visualization via the algorithm of Prefuse force directed layout (Shannon et al., 2003). We tested the random network via a tool of Network Randomizer (version 1.1.3) to construct random network with the same number nodes and edges, and a tool of Centiscape (version 2.2) to calculate the parameters of the network for each period in CytoScape (version 3.8.0).
Piphillin was used to predict functional diversity aligned to the KEGG database (version Oct2018) based on microbial 16S rRNA genes at the clustering threshold of 97% similarity (Iwai et al., 2016;Narayan et al., 2020). The visualization of predicted function diversity was plotted in R package pheatmap 1.0.12 (Raivo, 2019).

Fecal indicator quantification by qPCR
qPCR assays were performed to investigate the quantitative distributions of total 16S rRNA gene and human-related fecal indicator bacteria. Bacteroides dorei plays a significant role in contributing to normal intestinal functioning, and its assay is highly specific to the gastrointestinal tract of humans. A universal Bacteroides spp. protocol was used to determine the overall fecal pollution (Bakir et al., 2006;Shanks et al., 2007;Price et al., 2018). Standard plasmids were constructed using a pClone007 Vector Kit (TsingKe Biotech Co.,Ltd., Beijing). Each qPCR was performed in a 20 μL reaction mixture containing 7.2 μL of ddH 2 O, 10 μL of fast-start essential DNA green master (Roche Ltd., Germany), 0.4 μL of each primer (10 μM), and 2 μL of DNA template. Each sample was processed in triplicate. The primer pairs and thermal programs are detailed in Table S3.

The physicochemical characteristics of surface water
The environmental factors of Yitong River are summarized in Table S4. Figure 2 shows that the contents of TOC, NH 3 -N, and TP were at a high level across the agriculture area, which may be caused by irrigation runoff and aquaculture wastewater. As a drinking water protection area, the river exhibited a decline in the concentration of N and P at the reservoir outlet (Fig. 2a, b). The maximum concentrations of NO 2 − -N and NO 3 − -N were found in the WWTP Normal Wet effluent discharge point in both periods (normal 10.18 mg/L and wet 12.67 mg/L). The variation of ammonia nitrogen in the wet period before and after the discharge of WWTP effluent fluctuated more than that in the normal period (Fig. 2a). The concentration of TP and SRP was higher in the wet period than in the normal period at most sampling points, which was possibly due to release of phosphorus from sediments. TOC concentrations showed a significant difference between wet and normal periods (Fig. 2c). Based on the ANOVA results, precipitation caused a significant fluctuation of nutrient concentrations. Precipitation might not only accelerate human activity related contaminates discharge into the rivers, but also dilute these pollutants to a degree (Yu et al., 2019). However, the concentration of nitrate was affected by the WWTP effluent discharge, land use change and reservoir.

Diversity variation and community dissimilarity of bacterioplankton
The Shannon index values varied from 2.6 to 6.0, the Inv_Simpson index values ranged from 5.5 to 101.8, and the Richness index values varied from 783 to 2,315. The richness decreased at the reservoir outlet, thereby indicating lower alpha diversity (Table S5). Although both maximum and minimum values of the alpha diversity indices appeared in the wet period, the normal period exhibited larger differences in terms of Shannon indices (Table S5). During the wet period, precipitation events could reduce environmental heterogeneity (Shang et al., 2021), which might promote the recovery of microbial diversity in the surface water and result in similar richness to the upstream sites of the WWTP effluent (Table S5). This result suggested that the WWTP effluent accounted for the increasing rate of the total river flow during the normal period, as well as contributing with an input of a large number of microorganisms into the receiving water, which was consistent with the study of Price et al. (2018). However, the increase of Shannon index from the upstream to the downstream of the WWTP was not significant (Fig. 3a).
The PCoA plot suggested a slight separation (p < 0.13) between the normal and wet period with a dissimilarity between groups (Fig. 3b). As can be seen in Fig. 3c and d, a clear separation between the agricultural and the urban area was also observed. Moreover, land use change could cause a more significant variation in the bacterial composition during normal season (p < 0.031) compared with wet season (p < 0.176). Despite different community diversities, a prominent shift of the bacterial community did not occur from the upstream to the downstream despite the discharge of WWTP effluent into the receiving water. This finding indicates that land-use variation mostly changed the bacterial community composition (Liao et al., 2018(Liao et al., , 2020.

Taxonomic compositions and co-occurrence relationship of bacterioplankton
The 5,674 clustered OTUs belonged to 47 phyla and 575 families with only 49 shared OTUs in all samples. At the phylum level, the top 4 phyla were Proteobacteria (45.7%), Bacteroidetes (16.9%), Actinobacteria (16.2%), and Cyanobacteria (9.0%), which accounted for more than 80% of the total sequences (Fig. 4c). Actinobacteria and Cyanobacteria decreased in the WWTP effluent, similar to a study by Romero et al. (2019). In terms of family level (Fig. 4d), the relative abundance of Sporichthyaceae decreased apparently from the agricultural area to the urban area. More interestingly, in the normal period we found an abrupt increase in the relative abundance of Acidobacteria and one of its family Ilumatobacteraceae at the reservoir outlet. The predominance of Proteobacteria in freshwater ecosystems was well documented in many previous studies (Xia et al., 2013;Ligi et al., 2014;Dai et al., 2016). Study has indicated that the ratio of certain taxa to α-Proteobacteria was effective for identifying pollution sources from sewage (Wu et al., 2010). Further, the relative abundance of α-Proteobacteria and γ-Proteobacteria might be related to the availability of nutrients and dissolved oxygen (Chiaramonte et al., 2014). Bacteroidetes, which widely inhabit marine and freshwater ecosystems (Zeglin, 2015), are specialized in dissipation of organic chemicals with high molecular weight (Traving et al., 2017). The high abundance of Actinobacteria is often recorded in eutrophic freshwater systems (Wu et al., 2007;Debroas et al., 2009;, but their abundance often decreases with decreasing oxygen concentrations (Allgaier and Grossart, 2006). Actinobacteria have been used as a water quality indicator on account of their sensitivity to nutrients and dissolved oxygen condition that promote algal growth (Ghai et al., 2014;Pascual-Benito et al., 2020). On account of agriculture, substances containing N or P from the upstream were stored by the reservoir with long retention time, which might have caused the increase in Actinobacteria and Ilumatobacteraceae at the outlet. Cyanobacteria play a key part in nutrient cycling in freshwater. Therefore, their variation might indicate the change in the content of different forms of nitrogen .
A co-occurrence network was constructed to better understand the interaction of microbial members in the effluentriver ecosystem (Fig. 4a, b). Both networks consisted of 11 nodes (phyla) and 54 edges (correlations) ( Table S8). The positive interactions between phyla increased from the normal period to the wet period, whereas the total absolute interactions decreased. In the normal period, the positive interactions were mainly observed in Bacteroidetes, Acidobacteria, and Chloroflexi, which were all the main phyla in the upstream (Fig. 4a, c), and the negative correlations were mainly observed in Actinobacteria, Verrucomicrobia, and Planctomycetes. However, during the wet period, the positive interactions were mainly observed among Chloroflexi, Cyanobacteria, Planctomycetes and other phyla, and the interactions enhanced in the effluent (Fig. 4b, d), while the negative correlations were mainly observed in Bacteroidetes and Actinobacteria. To connect the results of co-occurrence network and LEfSe (Fig. S2), Firmicutes (including family Lachnospiraceae) and Planctomycetes could be the critical taxa for two periods, respectively.

Predicted functions of bacterioplankton
Piphillin was applied to predict functional diversity based on the microbial community of 16S rRNA genes. The KEGG annotation results (Iwai et al., 2016;Narayan et al., 2020) show that the main 50 level-KO3 functions belonged to cellular processes, environmental information processing, genetic information processing, human diseases, metabolism, and organismal system level-KO1 categories (Fig. 5). Carbon fixation can be regularly found in Proteobacteria, which are dominant in Y0A, Y1A, Y4A, and Y5A. Notably, the decrease in Proteobacteria in Y4, Y5, Y6, and Y7 was in accordance with the decline of carbon fixation (Figs. 4c and 5). The core nitrogen cycle involves nitrogen fixation and denitrification. As mentioned above, the highest total concentration of NO 2 − -N and NO 3 − -N in Y5A and Y5 corresponded to the strong correlation between WWTP effluent discharge and the function of the nitrogen cycle. Conversely, the low N concentration in Y3A and Y3 corresponded to the decreased function of the nitrogen cycle ( Fig. 1a and  5). Studies demonstrated that variations of functional genes often result in differences in the community composition to maintain the ecosystem function. Thus, the community composition may not respond comprehensively to environmental changes (Rousk et al., 2009;Wang et al., 2020).

Identification of main environmental factors
Abundant and rare taxa exhibit unequal life histories, and rare taxa sometimes display unique biogeography, which differs fundamentally from that of abundant taxa. Thus, the rare taxa are essential for understanding community changes (Lynch and Neufeld, 2015). In this study, we defined relative abundance thresholds as 0.01% for abundant and rare OTUs (Chen et al., 2019;Xu et al., 2020). The abundant and rare OTUs are described in Table S6 and Figure S1.
To explore the relationship between bacterial community composition and environmental variables, and whether both the abundant and rare taxa show the same or different response to environmental variables, 17 environmental variables were standardized to determine the evident variables that shape the community composition, as analyzed by RDA for all OTUs, abundant OTUs, and rare OTUs (Fig. 6). Calculating the variance inflation factor determines that pH, ORP, TP, SRP, NH 3 -N, NO 3 − -N, NO 2 − -N and Si were strongly related to OTUs. These physicochemical characters explained a comparatively low proportion of the variation of the abundant OTUs among samples (Fig. 6b), and similarity in the significance of key environmental parameters that determined the composition of bacterial community for all and rare taxa was found (Fig. 6a, c). Besides, of the six OTUs significantly impacted in Fig. 6a, four were also found in the rare taxa but only one OTU was included in the abundant taxa, i.e., the OTUs playing a crucial role in rare taxa were also present as the dominant groups of all taxa in response to variables, which was consistent with the view that environmental variables mainly shape the community of rare OTUs rather than the abundant OTUs (Xu et al., 2020). In addition, NH 3 -N was a dominant factor that leads to the obvious partitioning of community composition between the two periods, and NO 3 − -N was the main cause for the differences from upstream to downstream of the WWTP effluent, generally belonging to agricultural and urban areas, respectively. From Fig. 6a, it can be observed that Cytophagales were positively correlated with NH 3 -N and positive relation could also be found between Chloroplast and NO 3 − -N.

Identification and quantification of anthropogenic microbial indicators
Two Bacteroides taxa indicating anthropogenic fecal pollution (Bacteroides spp. and Bacteroides dorei) were quantified by using real-time PCR. An increasing trend of 16S rRNA and Bacteroides spp. from upstream to downstream could be observed, except for the high concentration in Y0 site, presumably due to agricultural runoff. As for B. dorei, the human specialized indicator, the qPCR results indicated few distinctions in the concentrations (Fig. 7a). Significantly, the relative abundance of Bacteroides spp. demonstrated that B. dorei appeared at Y5A and Y5 sites and higher relative abundance was observed during the wet period (Fig. 7b).
Although human-specific fecal pollution was observed in the WWTP effluent, the effluent did not significantly increase the downstream abundance of fecal indicators such as B.
dorei. On the one hand, the discrepancy observed between the qPCR and NGS results may be due to primer bias either during qPCR or amplicon library preparation or limitations of the taxonomic database used for assignment (Price et al., 2018). On the other hand, the actual absolute concentrations of some sites were lower than the limit of detection and limit of quantification, thereby leading to imprecise real-time PCR assay results in freshwater (Seurinck et al., 2005).

Conclusion
In this study, we investigated the response of the bacterioplankton community composition and function to multiple types of anthropogenic inputs, including agriculture, reservoir construction, and WWTP effluent discharge. Our results indicate decreased concentration of nutrients concomitant with an increase in the actors of carbon and nitrogen metabolism at the reservoir outlet. Moreover, temporal and spatial changes were observed in bacterial community composition  and function, and these changes were shaped by land-use change and WWTP effluent. The change in precipitation was the main factor that facilitated the changes. All these changes were more significant during the normal period. The RDA analysis results suggest that NH 3 -N and NO 3 − -N were the main environmental variables that drove the change of bacterial community composition. The absolute concentration acquired from qPCR showed evidence of pollution with Fig. 6 Relationships between bacterial communities and environmental variables. Redundancy analysis of (a) all, (b) abundant, and (c) rare bacterial community taxa at the OTU level from the eight surface water sites with respect to the 17 measurable variables. The OTUs displayed in the plots were generated by a cut-off with the absolute value of the explanation of RDA 1 > 0.1. Arrows indicate the direction and magnitude of variables. * p < 0.05; ** p < 0.01; *** p < 0.001  agriculture as the primary source. The relative abundance of fecal indicators was significantly higher in the effluent site.
Our study on Yitong River provides a reference for evaluating anthropogenic stressors on river ecology by revealing the response pattern of river bacterioplankton.