Differences in composition and diversity of bacteria communities, focusing on potential pathogens and indicator bacteria
A total of 1,586,824 high-quality reads were obtained after quality filtering, and the average number of sequences per sample was 44,078, ranging from 24,690 to 108,652. A total of 29603 feature values were obtained through QIIME2 processing, with an average of 822, and sample coverage from 98.72% to 99.97% (Table S1). The rarefaction curve (Fig. S2) based on species richness index indicates that the sequencing profundity was depth. Compared with other sampling sites, group D showed a significant decrease in chao1, shannon, simposn diversity index and observed otus (Kruskal-Walls, p < 0.01) (Fig. S3). Furthermore, the unifrac distance was evaluated by Principal co-ordinates Analysis (PCoA) to reveal the differences in bacterial community composition among all samples (Fig. 1a), a significant segregation between two groups was observed. Differences in bacterial community were characterized using LefSe method (Fig. S4). All samples were divided into river group (XR, HR and FR) and wastewater group (D, E and A). The results showed that 10 phyla, 17 classes, 33 orders, 46 families, and 62 genera with significant spatial variation when a LDA threshold of 3 was used, at the phylum level, Actinobacteria, Cyanobacteria, Chloroflexi, Planctomycetes and Gemmatimonadetes are significantly enriched in river water, while Firmicutes, Epsilonbacteraeota, Synergistetes, Proteobacteria, Patescibacteria and Bacteroidales (from order to species) dominated in wastewater (Fig. 1b).
After the alignment of self-built potential pathogen database, a total of 60 potentially pathogenic species (Table S3) were found in all samples, which were clustered into 34 genera containing potential pathogenic species (Fig. 2). Group D harbored the highest relative abundance of potential pathogens (0.0624% on average), followed by FR (0.0347%). Pseudomonas, Mycobacterium, Comamonas and Aeromonas were the most widely distributed potential pathogens genera in this study. Among them, the relative abundance of Pseudomonas and Mycobacterium in group A was significantly higher than other groups (p < 0.05 or p < 0.001, Wilcoxon), the relative abundance of Aeromonas in group E was significantly higher than that in group A (p < 0.01), D (p < 0.01) and HR (p < 0.01), simultaneously, Comamona showed the highest relative abundance in group E. (p < 0.05 or p < 0.001), Klebsiella was detected in 4 sites of FR and one site of HR, but not in wastewater group. It is worth noting that Bacillus was detected in 11 samples from A and E, but not at D, indicating that the potential pathogens from this genus originated from the sewage treatment plant rather than other pollutants. Comamonas terrigena and Pseudomonas alcaligenes were the most frequently detected potential pathogen species (Fig. 2, Table S3). The highest density of Pseudomonas alcaligenes was observed in A3 and A5, Comamonas terrigena was detected in all sites except D5 and D6, and it’s relative abundance increased after sewage treatment.
According to the Guidelines for Drinking-water Quality (WHO), Escherichia, Citrobacter, Klebsiella and Enterobacter, Serratia and Hafnia were choosed as indicator microorganisms, which contained most of the members of the total coliforms (TC), aerobic and facultatively anaerobic, Gram-negative, non-spore-forming bacilli capable, can survive in water environment. Moreover, the genus Bacteroides, Prevotella, Faecalibacterium, Helicobacter and the species Clostridium perfringens were classified as FIB, both of which was the member of the human and animal gut microbial community, can hardly reproduce in the environment, widely used as microbial source tracking (MST) (Shin et al. 2019; Duan et al. 2016; Ahmed et al. 2016). In addition, Enterococcus was classified as FIB group, which was recommended as indicators of fecal pathogens in drinking water standards. Specifically, here we classified Escherichia coli as a member of TC rather than FIB, because the source of Escherichia coli cannot be well identified through amplicon next-generation sequencing (Devane et al. 2020). Significant differences in the distribution of fecal indicator bacteria among all groups were observed (Fig. 2). Group D harbored the highest proportions of indicator bacteria (1.31% on average, p < 0.001), dominated by Bacteroides (0.78% on average) and Escherichia Shigella (0.34% on average), followed by FR (1.04% on average, p < 0.001), predominated by Klebsiella (0.86% on average) and Enterobacter (0.13% on average). Compared with group D, the relative abundance of indicator bacteria decreased significantly in group E and group A (p < 0.001), the relative abundance of TC in the XR group was higher than that in FR and XR (p < 0.05 or p < 0.01). The results of metagenomic sequencing showed a trend similar to that of 16S rRNA sequencing (Fig. S4).
Diversity, occurrence and potential hosts of ARGs
14 major types and 461 subtypes of ARGs were detected from the CARD database, and the number of ARGs in each sample ranged from 274 to 420, 221 shared ARGs were detected in all samples. The distribution pattern of ARGs was significantly different among all samples (Kruskal-Wallis, p < 0.001), D was significantly higher than that of other samples (p < 0.001), while A had the lowest ARGs relative abundance (p < 0.01 or p < 0.001, Wilcoxon). The most prominent ARG types were multidrug resistance (MDR) efflux pump (53.6%), aminoglycoside (6.0%), fluoroquinolone (5.8%), polymyxin (5.46%) and isoniazid (4.5%) (Fig. 3) there was no significant difference in ARGs abundance between XR, FR and HR. Furthermore, the abundance of trimethoprim (p < 0.001), polymyxin (p < 0.01), lipopeptide (p < 0.01) resistance gene in the wastewater group was significantly higher than that in the river water group, while the abundance of ethambutol resistance gene (p < 0.01) in the river water group was significantly higher than that in the wastewater group, but its relative abundance was relatively low (0.03% of the total ARGs abundance).
It is worth noting that a higher relative abundance of polymyxin resistance gene such as PmrA, PmrB, PmrC, PmrE, PmrF was found at site D, but a significant decrease was observed at E and A. MCR-1 was detected in all samples except E, but the relative abundance was relatively low. The relative abundance of some subtypes of glycopeptide resistance gene, such as vanTG, vanL, did not decrease after sewage treatment, while the relative abundance of vanO, vanB, vanKI, vanC, vanRO, vanF, vanSG, vanRG, vanA increased significantly after sewage treatment. In addition, except for some low-abundance ARG such as streptogramin (0.07%), elfamycin (0.04%), most of the ARG levels at WWTP effluent were significantly lower than those at the influent (p < 0.001). Similarly, a significant decrease in the total abundance of antibiotic resistance genes was observed in the ecological interception ditch downstream of the sewage outlet. Furthermore, we aligned the AACs against the NR database (e-value ≤ 1e-5) using BLAST analyse, whereas 71.37% of ARG-Carrying contigs can be annotated at genus level. Escherichia coli, Bacteroides graminisolvens, Enterobacter cloacae were the main ARGs carrier, which was annotated to 54, 39 and 10 ARGs subtypes respectively, and these species are mainly distributed in site D (WWTP influent).
It is worth noting that 20 species were simultaneously annotated as potential pathogen, Arcobacter butzleri and Stenotrophomonas maltophilia carried 36 ARG subtypes, Pseudomonas alcaligenes and Klebsiella pneumoniae carried 17 and 4 drug resistance gene subtypes respectively. The most abundant type of ARGs carried by potential pathogens was efflux pump, followed by fluoroquinolone resistance gene (Table S5), the highest relative abundance of ARGs carried by potential pathogens was registered in site D, followed by FR. Furthermore, the main hosts of polymyxin are Achromobacter and Burkholderia, which account for 28.5% and 25.5% of the total, respectively.
Horizontal gene transfer potential of ARGs
The predicted ORFs was translated into amino acid sequence, and then alignment with MGE database by BLASTp analysis to evaluate the level of mobile genetic elements (MGEs). Alignment results showed that there were a total of 9294 Contigs with movable elements in all samples (WithE-value ≤ 10% similarity ≥ 80% focus coverage ≥ 70%), of which the most abundant type of MGEs was plasmid. The highest relative abundance of MGEs was observed at site D, followed by FR. Meanwhile, a significant decrease of MGEs abundance was observed at point E compared with point D, the prominent MGE type found in site D was plasmid, while the dominant type in FR was IS (Fig. 4).Consistent with the change trend of ARGs, a significant decrease in MGEs level was observed after sewage treatment.
In order to identify the mobile ARGs, we matched the Contigs annotated as ARGs with the annotated Contigs carrying MGEs (the ARG carryings Contigs contains at least one MGE), 80, 166, 120, 90, 94 and 116 Contigs carrying both ARG and MGE were detected in A, D, E, XR, FR and XR, respectively, with the dominate type of plasmid (Table S5). Subsequently, a total of 15 ARGs were found to coexist with MGEs, and the main types were elfamycin, efflux pump, aminoglycoside, iSoniazid and sulfonamide resistance gene. We further performed taxonomic annotations using BLAST analysis on these ARG and MGE carrying contigs to determining the taxonomic origin. 84.94% of the Contigs that could be annotated to the genus level were Thauera (23.75%), Pseudomonas (10.62%), Escherichia (7.22%), and Acinetobacter (6.48%) (Fig. 5).
There were differences in dominant types of drug resistance among mobile ARGs carriers. For instance, Thauera mainly carried elfamycin resistance gene (97.96%), with Escherichia coli EF-tu mutants conferring resistance to Enacyloxin IIA as the dominant subtype. Pseudomonas mainly carried sulfonamide resistance gene (52.23%), followed by Efflux pump (29.20%), Escherichia mainly carried aminoglycoside resistance gene (51.78%), followed by elfamycin resistance gene (22.87%) and Acinetobacter mainly carried sulfonamide resistance gene (73.43%) (Fig. 5).
Correlation analysis of indicator bacteria and potential pathogens and ARGs subtypes
Pearson correlation analysis showed that the relative abundance of potential pathogens was significantly correlated with FIB, while TC was not significant, indicating that FIB was forceful to predict potential pathogens (Fig. 6). Network analysis approach was used to discern the co-occurrence patterns among TC and FIB communities and ARGs. The nodes are colored according to different bacterial taxa and ARG subtypes. The co-occurrence network consists of 233 nodes and 1714 edges. In this network, the average weighting degree is 14.804, the network diameter is 14, the graph density is 0.063, the modularization index is 0.666, the connection components are 13, and the average network distance between all paired nodes is 13. That is, the average path length is 5.832 and the average clustering coefficient is 0.701 (Fig. 7). In general, 20 ARG subtypes showed significant correlation with TC, while 10 ARGs showed significant correlation with FIB. TC and FIB members exhibited close correlations, dfrE, sul2, mexD, PmrE, oprA and FIB and TC community were observed as inter-type co-occurrence correlations. Specifically, dfrE, sul2, PmrE, which encodes trimethoprim, sulfonamide and polymyxin resistance, exhibits a significant correlation with 3 of FIB and 3 of TC members, while oprA and four kinds of FIB and one kind of TC members showed closely related.In particular, Klebsiella was the indicator bacteria most closely related to ARGs, and it was positively associated with 14 ARG subtypes of which 10 drug resistance gene subtypes have no significant correlation between other indicator bacteria.
It is worth noting that potential hosts of oqxB, mexD and sul2, including Thauera, Pseudomonas and Acinetobacter, were identified as major contributors to mobile ARGs in the samples above. Additionally, we analyzed the association between the overall indicator bacteria and AACs, and the results showed that sul2 was the most strongly associated mobile drug resistance gene with the indicator bacteria (Fig. S5).