Microbial community compositions
Microbiome data tend to be very noisy, and the total number of counts per sample is highly variable because of the experimental process and quality control filtering. For consistency in statistical analyses, all samples in this study were normalized to the minimum read count of 33,827. The taxonomic analysis identified 55 phyla, 1,502 genera, 3,161 species, and 8,042 OTUs across all samples. Of the 55 phyla, 40 were common to April, May, and August, 40 were shared between April and May, 48 between April and August, and 42 between May and August. April and August had two and three unique phyla, respectively, while May had none.
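As an illustration of normalizing every sample to the minimum read count, the sketch below subsamples reads without replacement (rarefaction). Whether the study normalized by rarefaction or by another method is an assumption here, and the sample names, OTU labels, and counts are hypothetical toy values rather than data from this study.

```python
import random

def rarefy(counts, depth, seed=0):
    """Subsample one sample's OTU counts to exactly `depth` reads without replacement."""
    total = sum(counts.values())
    if total < depth:
        raise ValueError("sample has fewer reads than the target depth")
    # Expand to one entry per read, then draw `depth` reads at random.
    reads = [otu for otu, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)
    picked = rng.sample(reads, depth)
    out = {}
    for otu in picked:
        out[otu] = out.get(otu, 0) + 1
    return out

# Hypothetical toy samples (not data from this study).
samples = {
    "A1": {"OTU1": 60, "OTU2": 30, "OTU3": 10},
    "B1": {"OTU1": 20, "OTU2": 15, "OTU3": 5},
}

# The target depth is the smallest library size among all samples.
min_depth = min(sum(c.values()) for c in samples.values())
rarefied = {name: rarefy(c, min_depth) for name, c in samples.items()}
```

After this step every sample has the same total count, so diversity indices and relative abundances can be compared without a sequencing-depth effect. The sample that already sits at the minimum depth is left unchanged.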
Major phyla with a relative abundance greater than 1% included Proteobacteria, Bacteroidetes, Firmicutes, Actinobacteria, Cyanobacteria, Deinococcus-Thermus, Patescibacteria, Epsilonbacteraeota, Chloroflexi, and Acidobacteria (Fig. 2a). Additionally, Nanoarchaeaeota, WS4, and Calditrichaeota were identified in a sample affected by sewage discharge to the Chan River. Zixibacteria, WS4, and Calditrichaeota were also detected in samples from the Zao River. Diapherotrites, which was first identified by Youssef (2014) in groundwater seepage, was found in the August samples from the Zao River.
Some pathogenic bacteria, such as Lentisphaerae, Fusobacteria, Spirochaetes, Dependentiae, and Elusimicrobia, were also found at various locations along the rivers, with abundances that differed between sampling periods. Their presence indicated that industrial and pharmaceutical wastes had contaminated the urban water environment and posed a serious threat to the health of the eco-environment in the region.
Microbial community differences
For the statistical analysis, samples were grouped by river segment, and each river segment was sub-grouped by sampling period. Fig. 2c shows the PCoA of all 12 sample groups. Samples from the same river segment are marked with the same symbol, with red, green, and blue distinguishing the samples collected in April, May, and August, respectively. The PCoA plot showed that the differences in microbial communities between sampling periods were far greater than the differences between river segments, with PC1 and PC2 explaining 26.65% and 10.24% of the variance, respectively. Along PC1, the April/May and August samples were clearly separated for the Chan, Ba, and Feng Rivers but not for the Zao River.
The distance bar plot (Fig. 2b) showed that the dissimilarity of microbial communities between sample groups was higher than that within each sample group. This test therefore also provides information on how similar the microbial communities of different sample groups are. The ANOSIM (analysis of similarities) had an R statistic of 0.65 with a p-value of 0.001, which indicates that the dissimilarity between groups was significantly greater than the dissimilarity within groups. The ADONIS test results, with an R2 of 0.51 and a p-value of 0.001, imply that grouping the samples by the geographical characteristics of the rivers and the temporal variation of microbial communities is reasonable.
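To make the ANOSIM result easier to interpret, the sketch below computes the R statistic from ranked pairwise dissimilarities and estimates a p-value by permuting the group labels. It is a minimal illustration and not the software used in this study; the function names and the 4-sample dissimilarity matrix are hypothetical. R is the difference between the mean rank of between-group and within-group dissimilarities, divided by M/2, where M is the number of sample pairs.

```python
import random
from itertools import combinations

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def anosim_r(dist, groups):
    """ANOSIM R from a square dissimilarity matrix and one group label per sample."""
    pairs = list(combinations(range(len(groups)), 2))
    ranks = average_ranks([dist[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    mean_w = sum(within) / len(within)
    mean_b = sum(between) / len(between)
    return (mean_b - mean_w) / (len(pairs) / 2)

def anosim(dist, groups, permutations=999, seed=0):
    """Return (R, p), with p estimated by shuffling the group labels."""
    rng = random.Random(seed)
    observed = anosim_r(dist, groups)
    labels = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(labels)
        if anosim_r(dist, labels) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)

# Hypothetical dissimilarities: small within groups, large between groups.
dist = [
    [0.0, 0.1, 0.7, 0.8],
    [0.1, 0.0, 0.9, 0.6],
    [0.7, 0.9, 0.0, 0.2],
    [0.8, 0.6, 0.2, 0.0],
]
groups = ["April", "April", "August", "August"]
r_stat, p_value = anosim(dist, groups)
```

In this toy matrix every between-group dissimilarity exceeds every within-group dissimilarity, so R reaches its maximum of 1. With the permutation formula used here, 0.001 is the smallest p-value obtainable from 999 permutations; whether the study used 999 permutations is an assumption.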
Seasonal changes in microbial communities
As indicated by the Shannon indices (Table S1), the overall microbial diversity was higher in August than in April/May. Samples from the Qin Mountains and from the Chan, Ba, Chan_Ba joined, and Feng River segments all had higher diversity in August than in April or May. Shannon indices varied more between river segments in April and May than in August, which could be due to the dominance of recharge from the Qin Mountains areas in August, as opposed to the various local inflows, such as sewer outflows, in April and May. The Zao River is a drainage channel for municipal sewer discharge without the direct impact of flows from the Qin Mountains, and its microbial communities did not appear to change significantly between May and August. Shannon indices varied between 1.33 and 1.54 in August but between 0.48 and 1.25 in April/May. Shannon evenness ranged from 0.37 to 0.44 in August but from 0.15 to 0.38 in April/May.
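For reference, the Shannon index and Shannon evenness discussed above can be computed from a vector of taxon counts as sketched below. The natural logarithm and the definition of evenness as H divided by ln(S), with S the number of observed taxa, are assumptions, since the section does not state the conventions used for Table S1. The example counts are hypothetical.

```python
import math

def shannon(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in props)

def shannon_evenness(counts):
    """Pielou-style evenness H / ln(S), where S is the number of observed taxa."""
    richness = sum(1 for c in counts if c > 0)
    if richness < 2:
        return 0.0  # evenness is undefined for a single taxon; report 0 here
    return shannon(counts) / math.log(richness)

# Hypothetical samples: one dominated by a single taxon, one evenly spread.
dominated = [970, 10, 10, 10]
even = [250, 250, 250, 250]

h_dominated = shannon(dominated)
h_even = shannon(even)
e_dominated = shannon_evenness(dominated)
e_even = shannon_evenness(even)
```

A sample dominated by one taxon yields a low index and low evenness, which matches the pattern reported for April/May samples dominated by Proteobacteria.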
To determine whether there were statistically significant differences between two or more sample groups, the Kruskal-Wallis H test was conducted on all sample groups. For each of the ten most abundant phyla, the difference among all groups was significant, with p-values less than 0.05. The proportions of Proteobacteria and Bacteroidetes were higher in April/May than in August for the Chan, Ba, Chan_Ba joined, Feng, and Zao River segments. In contrast, Firmicutes, Actinobacteria, Cyanobacteria, and Deinococcus-Thermus were higher in August than in April/May for all sample groups. In the Zao River, Patescibacteria and Epsilonbacteraeota were notably higher than in the other rivers, reflecting microbial communities affected by stormwater and sewage discharges in an urban area.
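The Kruskal-Wallis H test referred to above can be sketched as follows. This is a minimal illustration with a tie correction, not the implementation used in this study. The helper `chi2_sf_even_df` is a hypothetical convenience that evaluates the chi-square tail probability in closed form for an even number of degrees of freedom only, which covers the three-group example shown; the abundance values are invented.

```python
import math

def kruskal_wallis(groups):
    """Return (H, df) for a list of groups of observations, with tie correction."""
    values = [v for g in groups for v in g]
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        t = j - i + 1                      # size of this run of tied values
        tie_term += t ** 3 - t
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    h = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        h += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * h - 3 * (n + 1)
    h /= 1 - tie_term / (n ** 3 - n)       # equals 1 when there are no ties
    return h, len(groups) - 1

def chi2_sf_even_df(x, df):
    """Chi-square upper-tail probability; closed form valid for even df only."""
    if df % 2:
        raise ValueError("closed form implemented for even df only")
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))

# Hypothetical relative abundances (%) of one phylum in three sampling periods.
april = [12.1, 15.3, 14.0]
may = [18.2, 20.5, 19.9]
august = [30.4, 28.7, 33.1]

h_stat, df = kruskal_wallis([april, may, august])
p_value = chi2_sf_even_df(h_stat, df)
```

Because the three toy groups do not overlap at all, H is large relative to the chi-square distribution with two degrees of freedom and the p-value falls below 0.05. The chi-square approximation is rough for groups this small, so the example is only meant to show the mechanics.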
The Wilcoxon rank-sum test between two groups and Fisher's exact test between two samples were also conducted to compare microbial community diversity between locations and periods. Sample groups were compared to examine how the microbial communities differed over time, particularly between sampling periods for the same river segment (Fig. 3). The figure shows the 15 most abundant phyla in the compared groups. Proteobacteria was the dominant phylum in the Chan, Ba, Feng, and Zao Rivers in both April/May and August. It was also apparent that the microbial communities shifted from April/May to August. The Chan, Ba, and Zao Rivers had higher proportions of Proteobacteria and Bacteroidetes in April/May than in August. In contrast, the proportions of Firmicutes, Actinobacteria, Cyanobacteria, and Deinococcus-Thermus were higher in August than in April/May. Shifts in proportion for the majority of the phyla in the Ba and Feng Rivers were statistically significant, with Wilcoxon rank-sum p-values less than 0.05; only Actinobacteria and Deinococcus-Thermus were significantly higher in August. For the Zao River, although no phylum showed a statistically significant shift in proportion, Firmicutes, Bacteroidetes, Deinococcus-Thermus, Cyanobacteria, and Epsilonbacteraeota appeared to have higher proportions in August.
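The two-group comparison can be illustrated with a small Wilcoxon rank-sum test that uses the normal approximation, without continuity or tie corrections in the variance. This is a sketch under those assumptions, not the procedure used in this study, and an exact test would be preferable for very small groups. The abundance values are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test; returns (z, p) from the normal approximation."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    w = sum(ranks[:n1])                          # rank sum of the first group
    mean_w = n1 * (n1 + n2 + 1) / 2              # expected rank sum under H0
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))         # two-sided tail of the normal
    return z, p

# Hypothetical Proteobacteria proportions (%) for one river segment.
april_may = [62.1, 58.4, 70.3, 66.0, 61.2]
august = [41.5, 39.8, 47.2, 44.0, 50.6]

z_stat, p_value = rank_sum_test(april_may, august)
```

A positive z means the first group tends to rank higher than the second, so the sign indicates the direction of the shift and the p-value its significance.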
Differentially abundant biomarkers between sampling periods
The Wilcoxon rank-sum test was conducted for the 15 most abundant phyla. Three major phyla differed significantly between April and August in the Feng River. The majority differed significantly in the Ba and Chan Rivers, but none differed significantly between May and August in the Zao River. While the Wilcoxon rank-sum test checks the statistical significance of the relative abundance difference of a selected class, the LEfSe method has the advantage of identifying clades with statistically and biologically significant differences between communities for a range of classes. The LEfSe analysis in this study ran from the phylum to the class level (Fig. 4 and Table S3) and identified 12 differentially abundant clades in the Chan River, 44 in the Ba River, and 53 in the Feng River, but none in the Zao River. It is worth noting that the absence of any differentially abundant phylum between May and August in the Zao River further supports the finding that its water environment did not change significantly between May and August. For the Chan River, which is also impacted by sewage effluent, Proteobacteria was more abundant in April, but Firmicutes, Actinobacteria, and Deinococcus_Thermus were more abundant in August. The Ba River and Feng River, both on the outer side of the city, showed microbial communities that are more typical of soil and freshwater environments. In both rivers, Proteobacteria and Bacteroidetes were differentially abundant in April/May, whereas Firmicutes, Actinobacteria, Cyanobacteria, Chlamydiae, Planctomycetes, Chloroflexi, and Gemmatimonadetes were differentially abundant in August. The presence of Gemmatimonadetes could imply the impact of soil erosion from the Chan River watershed. Chlamydiae, the phylum that includes the agent of chlamydia, the most common bacterial STD in the United States with 2.86 million reported infections annually, was found in the Ba River and Zao River in August. The differentially abundant taxa identified by the LEfSe analysis in the Feng River in August were also surprising.
Proteobacteria and Bacteroidetes were more abundant in May, but the 18 other phyla more abundant in August included the typical freshwater bacteria Cyanobacteria, Armatimonadetes, Planctomycetes, Chloroflexi, and Fibrobacteres; the typical soil bacteria Gemmatimonadetes, Kiritimatiellaeota, Acidobacteria, and Patescibacteria; and Deinococcus_Thermus and Dependentiae, which are often found in sewage-polluted water. The others were Margulisbacteria, which was first found in a marine sample; Omnitrophicaeota; Latescibacteria, which is often associated with algae; and Nitrospirae, which can be found in a wide range of environments, including soils. This combination of bacteria indicated that, in addition to the recharge from the Qin Mountains, the Feng River was also affected by soil erosion, non-point source pollution from horticulture and agriculture, and some degree of sewage pollution. It was quite alarming that Fusobacteria, Spirochaetes, and Chlamydiae, which are associated with pathogenic diseases, were found in the Feng River.
Microbial community differences along rivers
The composition of microbial communities in the study area was significantly affected by environmental factors such as land-use changes and by human activities such as sewage discharges. Figs. 5a and 5b show the composition of major phyla at each location, that is, the spatial distribution of microbial communities. Comparing samples from the same sampling locations in April/May and August shows the temporal dynamics of the microbial communities. Figure 5a shows that Proteobacteria was the predominant phylum in all samples from April and May, exceeding 80% in samples from the Qin Mountains area. The percentages of Bacteroidetes, Firmicutes, and Actinobacteria were relatively high in the May samples from the Feng River and in samples from the upper portion of the Ba River. The percentage of Patescibacteria appeared to be relatively high in samples affected by sewage discharge, such as sample A18 and all samples from the Zao River. In Fig. 5b, the composition of microbial communities in the August samples differs from that in April and May. While Proteobacteria, Bacteroidetes, Firmicutes, and Actinobacteria remained dominant in most of the August samples, the relative abundance of Cyanobacteria was higher in August than in April/May in the Chan, Ba, and Feng Rivers. The percentage of Firmicutes was higher in August than in April in most of the samples from the Chan and Ba Rivers. The Patescibacteria percentage was also high in sample B32, again because of sewage discharge in August.
Microbial community differences between rivers
Comparison of microbial diversity with the Wilcoxon rank-sum test (Figure S1) showed no statistically significant shifts in proportion for the majority of the phyla between the Chan River and Ba River within the same sampling period. Only Epsilonbacteraeota, Cyanobacteria, and Acidobacteria were higher in the Chan River than in the Ba River in April, and only Spirochaetes was higher in the Chan River in August. Differences between the Feng River and Zao River in May and August were statistically significant for a large portion of the phyla because the two rivers are not hydrologically well connected. The Zao River had higher proportions of Patescibacteria, Epsilonbacteraeota, Chloroflexi, Fusobacteria, Chlamydiae, and Synergistetes in May and higher proportions of Patescibacteria, Epsilonbacteraeota, and Fusobacteria in August. In contrast, the Feng River had higher proportions of Verrucomicrobia in May and higher proportions of Cyanobacteria, Actinobacteria, Verrucomicrobia, Gemmatimonadetes, and Armatimonadetes in August.
LEfSe analysis was also conducted to compare the abundance of microbial communities between two river segments for the same sampling period (Figure S2). In April, differentially abundant phyla in the Chan River included Epsilonbacteraeota, Cyanobacteria, Acidobacteria, Planctomycetes, Gemmatimonadetes, Chlamydiae, Kiritimatiellaeota, and BRC1, whereas none were identified in the Ba River. In August, microbial communities in the Chan and Ba Rivers were very similar because of the dominant inflows from the Qin Mountains areas. Spirochaetes was identified as abundant in the Chan River, which could be due to effluent from sewage discharge. Comparison of the Feng and Zao Rivers showed that Patescibacteria, Epsilonbacteraeota, Diapherotrites, Kiritimatiellaeota, Omnitrophicaeota, Elusimicrobia, Fusobacteria, Chlamydiae, WPS_2, WS2, Synergistetes, and Dependentiae were differentially abundant in the Zao River in both May and August.
Differences in major phyla among all four rivers were examined with the Kruskal-Wallis H test. The test results showed that, among the ten most abundant phyla, Patescibacteria, Deinococcus_Thermus, Epsilonbacteraeota, Verrucomicrobia, and Chloroflexi differed significantly among the four rivers in April/May (Fig. 8a). Figure 8b-f showed that Patescibacteria and Deinococcus_Thermus were most abundant in the Zao River, and Epsilonbacteraeota in both the Chan and Zao Rivers. Verrucomicrobia was most abundant in the Feng River and second most abundant in the Chan River. Chloroflexi was most abundant in the Zao River and second most abundant in the Chan River.
The Kruskal-Wallis H test showed an increase in the typical freshwater bacteria Cyanobacteria and Actinobacteria in the Feng, Chan, and Ba Rivers due to the recharge from the Qin Mountains streams. Patescibacteria and Epsilonbacteraeota remained the most abundant phyla only in the Zao River because its primary source of recharge is domestic sewage effluent.
Differentially abundant biomarkers between rivers
The LEfSe test was also conducted on the differential abundances at the phylum and class levels between the Chan, Ba, Feng, and Zao Rivers (Fig. 7, Tables S4 and S5). For the April/May samples, the Chan, Ba, Feng, and Zao Rivers had 13, 0, 5, and 51 differentially abundant clades, respectively. The differentially abundant phyla were Epsilonbacteraeota, Kiritimatiellaeota, Synergistetes, and Gemmatimonadetes in the Chan River; none in the Ba River; Verrucomicrobia and Planctomycetes in the Feng River; and Patescibacteria, Deinococcus_Thermus, Chloroflexi, Omnitrophicaeota, Dependentiae, Acidobacteria, Chlamydiae, WPS_2, Elusimicrobia, and Fibrobacteres in the Zao River. For the August samples, there were 1, 2, 14, and 51 differentially abundant clades at the phylum and class levels in the Chan, Ba, Feng, and Zao Rivers, respectively. The differentially abundant phyla were Actinobacteria in the Chan River; none in the Ba River; Cyanobacteria, Verrucomicrobia, Gemmatimonadetes, Acidobacteria, Armatimonadetes, and Margulisbacteria in the Feng River; and Patescibacteria, Epsilonbacteraeota, Omnitrophicaeota, Lentisphaerae, Fusobacteria, Synergistetes, Kiritimatiellaeota, WPS_2, Spirochaetes, Dependentiae, Latescibacteria, and Elusimicrobia in the Zao River. The presence of Epsilonbacteraeota and Synergistetes in the Chan and Zao Rivers indicated some degree of pollution from domestic wastes. The presence of pathogenic bacteria such as Lentisphaerae, Fusobacteria, Spirochaetes, Dependentiae, and Elusimicrobia was not only an indication of contamination from industrial and pharmaceutical wastes but also a serious threat to the health of the eco-environment in the urban area.
Although both the Kruskal-Wallis H test and the LEfSe analysis are efficient in detecting taxa that differ among sample groups, they cannot show the spatial distribution of microbes. This study therefore used a geographical information system (GIS) to map the spatial distribution of selected bacteria by absolute abundance, as shown in Figure S4, which helps to locate microbes such as pathogenic bacteria that are harmful to human and eco-environmental health.
Microbial diversity driven by environmental factors
Single water quality indices for pH, DO, nitrogen, phosphorus, and COD were computed and combined into a composite water quality index (CWQI) for each river segment, with a smaller value indicating better water quality. Based on the index values (Fig. 8a), it was clear that water from the Qin Mountains areas had better quality. The composite water quality indices for the Chan River were high because of high nitrogen and phosphorus from non-point source stormflow pollution and sewage discharge. The drainage area of the Ba River is predominantly agricultural land, and its water quality indices for April and August were both lower. As expected, the composite index for the joined segment of the Chan and Ba Rivers fell between those of the Chan and Ba Rivers. Indices for the Feng River and the Zao River were lower than those of the Ba and Chan Rivers because of lower indices for nitrogen and phosphorus. Seasonally, the composite index for all river segments was higher in August than in April and May, mostly because of the increase in phosphorus and nitrogen. Influences of the water environment on microbial communities are analyzed in this section.
Correlation of microbial communities with environmental variables
Pearson's correlation and Spearman's rank correlation are common methods for correlation analysis, but Pearson's correlation is sensitive to outliers; therefore, this study adopted Spearman's rank correlation. The Spearman's rank correlation coefficients between environmental variables and microbial communities at the phylum level are presented in the heatmap in Fig. 8c. The strength of the correlation is indicated by color, with darker blue for negative correlation and red for positive correlation. Significance is marked by * for 0.01 < P ≤ 0.05, ** for 0.001 < P ≤ 0.01, and *** for P ≤ 0.001. The average distance was used for clustering environmental variables and microbial communities.
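The difference between the two coefficients, and the asterisk convention of the heatmap, can be sketched as follows. Spearman's coefficient is computed here as Pearson's coefficient applied to ranks. The function names and example values are hypothetical, and the p-value passed to `stars` is assumed to come from whatever significance test accompanies the correlation.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson's product-moment correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def stars(p):
    """Significance marker following the thresholds stated for the heatmap."""
    if p <= 0.001:
        return "***"
    if p <= 0.01:
        return "**"
    if p <= 0.05:
        return "*"
    return ""

# Hypothetical values: a monotonic relationship with one extreme observation.
total_nitrogen = [1, 2, 3, 4, 5]
phylum_abundance = [2, 4, 5, 9, 20]

r_pearson = pearson(total_nitrogen, phylum_abundance)
rho = spearman(total_nitrogen, phylum_abundance)
```

In this toy example the relationship is perfectly monotonic, so the rank correlation is 1, while the extreme last value pulls Pearson's coefficient below 1. This is the outlier sensitivity that motivates the choice of Spearman's rank correlation.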
There were three major clusters of environmental variables: a cluster of pH, DO, and Resistance; a cluster of Turbidity, TOC, Nitrate, and Total Nitrogen; and a cluster of Nitrite, EC, Soluble Phosphorus, Total Phosphorus, Hardness, and CODMn. The heatmap shows that most of the major phyla were negatively correlated with the pH of the water sample, and some were negatively correlated with DO (dissolved oxygen) and Resistance. Proteobacteria and Verrucomicrobia were positively correlated with all three variables. Proteobacteria had a stronger negative correlation with the second cluster of environmental variables. Except for Verrucomicrobia, Bacteroidetes, Dependentiae, and WPS-2, the other phyla were positively correlated with the second cluster of environmental variables. Chlamydiae, Firmicutes, Deinococcus-Thermus, Epsilonbacteraeota, Kiritimatiellaeota, Synergistetes, Omnitrophicaeota, Fusobacteria, Elusimicrobia, Margulisbacteria, Tenericutes, Chloroflexi, and Spirochaetes were positively correlated with both the second and third clusters of environmental variables.
The impact of environmental variables was also tested using redundancy analysis (RDA). Figure 8b showed that pH, DO, EC, Resistance, Nitrate, Total Nitrogen, Hardness, and TOC had more influence on microbial communities, all with p-values less than 0.05 (Table S2). To further examine the effect of environmental factors on microbial communities, linear regression analysis was conducted for each environmental variable on the April/May samples, the August samples, and all April/May/August samples. The respective coefficients of determination and p-values are summarized in Table S2. DO, Resistance, Nitrate, Total Nitrogen, Hardness, and TOC had a relatively better correlation with phyla, whereas pH, Ammonia, Total Phosphorus, Soluble Phosphorus, and CODMn did not show a good linear correlation. The environmental variables that the linear regression analysis identified as having a major impact on microbial communities were consistent with the conclusions from the Spearman's rank correlation and the RDA, except for EC.
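The coefficient of determination reported for each regression can be computed as in the sketch below, which fits an ordinary least-squares line to one explanatory variable. The section does not state which community measure served as the response, so the variable names and values are hypothetical, and the p-values in Table S2 are not reproduced here.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept; returns (slope, intercept, r2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    syy = sum((b - my) ** 2 for b in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    # For simple linear regression, R^2 equals the squared Pearson correlation.
    r2 = sxy ** 2 / (sxx * syy)
    return slope, intercept, r2

# Hypothetical environmental variable and community response (not study data).
nitrate = [1.0, 2.0, 3.0, 4.0]
response_tight = [2.0, 4.0, 6.0, 8.0]   # perfectly linear
response_loose = [1.0, 3.0, 2.0, 4.0]   # noisier relationship

fit_tight = linear_fit(nitrate, response_tight)
fit_loose = linear_fit(nitrate, response_loose)
```

An R2 close to 1 means the environmental variable explains most of the variation in the response, whereas a low value corresponds to the weak linear correlation reported for variables such as pH and CODMn.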