Monitoring and evaluation of antibiotic resistance genes in three rivers in northeast China

Antibiotic resistance genes (ARGs) have become an important public health problem. In this study, we used metagenomic sequencing to analyze the composition of ARGs in selected original habitats of northeast China, comprising three different rivers and riverbank soils of the Heilongjiang River, Tumen River, and Yalu River. Twenty types of ARG were detected in the water samples. The major ARGs were multidrug resistance genes, at approximately 0.5 copies/16S rRNA, accounting for 57.5% of the total ARG abundance. The abundance of multidrug, bacitracin, beta-lactam, macrolide-lincosamide-streptogramin, sulfonamide, fosmidomycin, and polymyxin resistance genes covered 96.9% of the total ARG abundance. No significant ecological boundary of ARG diversity was observed. The compositions of the resistance genes in the three rivers were very similar to each other, and 92.1% of ARG subtypes were shared by all water samples. Except for vancomycin resistance genes, almost all ARGs in riverbank soils were detected in the river water. About 31.05% ARGs were carried by Pseudomonas. Opportunistic pathogenic bacteria carrying resistance genes were mainly related to diarrhea and respiratory infections. Multidrug and beta-lactam resistance genes correlated positively with mobile genetic elements (MGEs), indicating a potential risk of diffusion. The composition of ARGs in three different rivers was similar, indicating that climate plays an important role in ARG occurrence. ARG subtypes in river water were almost completely the same as those in riverbank soil. ARGs had no significant geographical distribution characteristics. Many ARGs were carried by human pathogenic bacteria related to diarrhea and respiratory infections, such as Pseudomonas aeruginosa and Aeromonas caviae. In general, our results provide a valuable dataset of river water ARG distribution in northeast China. The related ecological and geographical distribution characteristics should be further explored.


Introduction
Antibiotics to treat and prevent bacterial infections are widely used in animal husbandry and aquaculture (Marti et al. 2014). Studies have shown that with the increased use of antibiotics, the number and type of antibiotics in the environment have gradually increased (Zhou et al. 2011). Under the pressure of antibiotic selection, a large number of antibiotic-resistant bacteria (ARB) have appeared, and their antibiotic resistance genes (ARGs) have been rapidly transmitted via horizontal transfer (Aminov 2011). There are also many pollutants in the environment that further promote the horizontal transfer of resistance genes, such as heavy metals ) and nanomaterials (Qiu et al. 2012). Ultimately, ARGs are imported into the water  , hospital water (Dias et al. 2020), and aquaculture ) result in the discharge of a large number of ARGs into natural water bodies, making rivers and lakes an important location for ARG exchange and transmission. The close relationship between water and human activities means that ARGs and ARBs in water are a potential threat to human health. Therefore, determining the distribution of ARGs in natural water is important to evaluate this health risk. Surface water, especially rivers, is an important medium for many material exchanges, including ARGs, in ecological niches, such as urban sewage , feces , soil , and sediment (Cheng et al. 2020). Rivers flow through many cities, and ARGs in river water will be widely spread in their watersheds. It has been reported that the river environment is easily affected by human activities and has become the most important repository of ARGs Marathe et al. 2017;Peng et al. 2020). As water sources and important aquaculture environments, rivers have strong influence on public health. Understanding the occurrence, diffusion, and fate of ARGs in rivers is key to controlling ARG pollution and diffusion. Bacterial antibiotic resistance is highly prevalent in the natural environment, and antibiotic secretion and resistance are an important competition mechanism in the microbial community (Baltz 2008;Newman et al. 2003). At present, a large amount of ARGs have been discharged into rivers by human activities; therefore, it may be difficult to determine the ARG baseline profiles. Nevertheless, an alternative approach would be to study ARGs in a pristine environment. In northeast China, few people live around the Heilongjiang River, Tumen River, and Yalu River. The area lacks large cities (most cities have less than 250 thousand people), sewage treatment plants, and hospitals. Characterizing mobile genetic elements (MGEs) and ARGs could be helpful to understand the diffusion potential of ARGs in the initial stage.
In China, there have been numerous investigations into ARGs in the Yangtze River Yang et al. 2017;Zhang et al. 2020b), Yellow River (Lu et al. 2018;Shi et al. 2019), and Pearl River (Chen et al. 2015;Li et al. 2018); however, there has been none concerning ARGs in the Heilongjiang River. Compared with other rivers, the Heilongjiang River is located in a colder climate with high humus levels and thus might have a unique microbial community structure and ARG subtypes. There have been few reports on the distribution characteristics of river ARGs under a low temperature climate. The Heilongjiang River, Tumen River, and Yalu River are located in three different provinces in northeast China, separated by large geographical distances. It is difficult to associate the differences in ARGs with ecosystem functions based on an analysis of the composition of ARGs (Louca et al. 2016).
Biogeographical methods are a better way to describe the relationship between microbial communities and geographical characteristics (Berendonk et al. 2015). Some studies found no significant correlation between the distribution of microbial communities and geographical space, indicating that the spread of these microbial communities might not be limited (Angermeyer et al. 2016). However, other studies reported the characteristics of the geographical distribution of microbial communities (Liu et al. 2018). Currently, there is no consensus on the geographical distribution characteristics of ARGs.
In the present study, ARGs in three different rivers and riverbank soils in selected original habitats of northeast China were investigated using metagenomic analysis, with the following aims: (1) to understand the composition and geographical distribution characteristics of river ARGs in this area; (2) to explore the potential interaction of ARGs in river water and riverbank soil; and (3) to evaluate the potential health risks and horizontal transfer possibilities caused by ARGs in northeast China. This large sequencing dataset for river water and riverbank soil ARGs will increase our understanding of the distribution characteristics of ARGs in northeast China and will clarify the original state of antibiotic resistance in rivers that have been little affected by human activities.

Sampling and DNA extraction
Water samples were collected in the dry season (September 2019) at 12 points in the Heilongjiang River, Tumen River, and Yalu River (Fig. 1). Three subsamples were collected at every point, resulting in 36 samples in total. Five liters of water was collected per sample in glass bottles for DNA extraction. Water samples (1 L) were filtered through a 0.45-μm membrane (39-mm diameter). The membranes were washed with phosphate-buffered saline (PBS)-EDTA (1% EDTA) and centrifuged to collect the precipitate for DNA extraction. Eleven soil samples were collected from the riverbank of the water sampling site, with one sample for each point, except for Fuyuan (FY), because the FY sample site was sandy with very low microbial abundance. According to the manufacturer's protocol, total genomic DNA extraction was performed using a TIANamp DNA Kit for soil (Tiangen Biotech, Beijing, China). DNA quality was evaluated using gel electrophoresis (1% agarose). The DNA concentration was measured using a Qubit dsDNA Assay Kit in Qubit 4.0 Fluorometer (ThermoFisher Scientific, Waltham, MA, USA). DNA samples were stored at − 80 °C until sequencing.

Metagenomic sequencing and analysis
A total of 1 μg DNA per sample was used to generate sequencing libraries whose inserts size was about 350 bp (NEBNext® Ultra™ DNA Library Prep Kit, NEB, Ipswich, MA, USA). The libraries were subjected to paired-end sequencing on the Illumina HiSeq platform (Illumina, San Diego, CA, USA). Library construction and sequencing were completed by Novogene (Beijing, China). The raw data obtained from the Illumina HiSeq sequencing platform was processed using Readfq (V8, https:// github. com/ cjfie lds/ readfq) to acquire the clean data for subsequent analysis. The specific processing steps were as follows: (a) reads containing low-quality bases (default quality threshold value ≤ 38) above a certain portion (default length of 40 bp) were removed; (b) reads with a certain percentage of N bases (default length of 10 bp) were removed; and (c) reads which shared an overlap above a certain portion with the adapter (default length of 15 bp) were also removed. Raw sequencing data is available at NCBI under SRA accession number PRJNA793053 and PRJNA793168 for river water and riverbank soil, respectively.

ARGs-OAP operation
ARG gene copy numbers were quantified using the antibiotic resistance genes-online analysis pipeline (ARGs-OAP) method (Yin et al. 2018). ARGs-OAP can be accessed through http:// smile. hku. hk/ SARGs. This pipeline is based on the SARG v2.0 database, which contains sequences from the CARD and ARDB databases, as well as the latest protein dataset from the NCBI-NR database. ARGs-OAP normalizes ARG gene copy numbers with 16S rRNA gene copy numbers and prokaryotic where Ni ARG-like sequence is the quantity of the ARGlike reads annotated to one specific ARG reference sequence; Li read is read length; Li ARGs reference sequence is the nucleotide sequence length of the corresponding ARG reference sequence; N 16S sequence is the total amount the 16S rRNA sequence refer ring to the Greengenes database; L 16S sequence is 1432 bp in this study stands for the average length of 16S rRNA gene in Greengenes; n is the number of mapped ARG reference sequences belonging to that ARG type or subtype; and N 16S copy number is the average copy number of 16S rRNA genes per cell number.

HPB and MGE annotation of ARG-carrying genes and ORFs
The predicted open reading frame (ORF) sequences were searched against ARGs-OAP v2.2 database for ARG-like ORF identification using DIAMOND blastp, with a cut-off e value of 1e −5 and an identity cut-off value of 45%. An ORF sequence was considered to be an ARG-like ORF if its best hit alignment to ARG sequences was under a cutoff of ≥ 45% query coverage. The ARG-like ORF sequences were then classified according to the ARGs-OAP v2.2 database.
Mobile genetic elements (MGEs) were identified using BLASTP to compare the ARG-carrying genes to a classification of mobile genetic elements (ACLAME) amino acid database for plasmids, using an E value ≤ 10 −5 and a cutoff of ≥ 50% query coverage. The ISfinder database (Siguier 2006) was used to find insertion sequences (ISs) with an E value ≤ 10 −5 and a cutoff of ≥ 45% query coverage. In addition, searching against the IntegrALL nucleotide database (Moura et al. 2009) was performed to find integrons, with an E value ≤ 10 −5 and a cutoff of ≥ 45% query coverage.
Meanwhile, ARG-carrying genes were compared to the human bacterial pathogens (HBPs) genome database using BLASTP, with an E value ≤ 10 −10 and a cutoff of ≥ 45% query coverage. Each sequence might have multiple aligned results; therefore, for the final alignment results of each sequence, we chose the result with the best hit score as the accurate annotation information.

Ecological boundary analysis
To study how ARGs change according to latitude and longitude, we used the split moving-window analysis of ecological differentiation, which was achieved using the R package EcolUtils (https:// github. com/ Guill emSal azar/ EcolU tils). The analysis was conducted with ARG type abundance or ARG subtype abundance, and the latitude or longitude of each site. The obtained analysis result was used for plotting using the R (version 3.5.1) package ggplot2. In the split moving-window analysis, we selected Bray-Curtis dissimilarity and set the windows size as 8, and the probabilities for confidence interval was from 0.025 to 0.975. Thus, we could determine the ecological boundary center based on the ARG composition.

Network analysis
ARG subtypes and the genera network were created using the following method. A correlation matrix was constructed with ARGs and metagenomics data to explore the potential correlations of ARG-ARG, ARG-genus, and genus-genus by calculating all pairwise Spearman's correlation coefficients (ñ) among ARG subtypes and taxonomy (at the genus level) that occurred in at least 70% of samples, using the R (v3.4.0) package, psych. A correlation between two nodes was regarded as statistically significant if r ≥ 0.85 and P was ≤ 0.01, which was adjusted using the Benjamini-Hochberg method. Then, the correlation between two nodes was termed the edge file, and we added a description to the node file comprising the phylum taxonomy at the genus level and the ARG type or ARG subtype. The network analysis was visualized using the interactive platform Gephi (v 0.9.2) (Bastian et al. 2009).
The ARG subtypes and MGE network were created using a similar method. A correlation matrix was constructed with ARGs and MGEs to explore the potential correlations of ARG-ARG, ARG-MGE, and MGE-MGE by calculating all pairwise Spearman's correlation coefficients (ñ) among ARG subtypes and MGEs that occurred in at least 70% of samples using the R (v3.4.0) package, psych. A correlation between two nodes was regarded as statistically significant if r ≥ 0.9 and P was ≤ 0.01, which was adjusted using the Benjamini-Hochberg method. Then, the correlation between two nodes was termed the edge file, and we added a description to the node file comprising the MGEs and the ARG type or ARG subtype. The network analysis was visualized using the interactive platform Gephi (v 0.9.2).

Diversity and abundance of ARGs and the bacterial community in different rivers
In total, 21 types of ARG were detected in water samples from Heilongjiang River (HLJ), Yalu River (YLJ), and Tumen River (TMJ), as shown in Table S1. Twenty types of ARG were present in all water samples, and bleomycin resistance genes were only detected in a few samples of YLJ and HLJ. The top 10 types of ARG (Fig. 2) were multidrug resistance, bacitracin, beta-lactam, unclassified, aminoglycoside, tetracycline, macrolide-lincosamidestreptogramin (MLS), sulfonamide, fosmidomycin, and polymyxin, covering 96.9% of the total ARG abundance. Multidrug resistance genes accounted for 57.5% of the ARG abundance. We detected approximately 790 ARG subtypes. The top 10 subtypes were bacA, macB, acrB, mdtB, mexF, mexT, mexW, multidrug transporter genes, ompR, and cAMP regulatory protein genes. The subtype with the highest abundance was bacA, at 0.09-0.14 copies/16S rRNA among all ARGs, resulting in bacitracin resistance genes accounting for a large proportion of the total ARG abundance. Multidrug resistance genes had many subtypes with high abundance, and 7 of the top 10 subtypes were multidrug resistance genes, meaning that these genes had the highest proportion.
The heatmap in Fig. 3 shows the top 20 genera in river water samples from each sample site. Among all samples, Proteobacteria was the major bacterial phylum, accounting for more than 70% of the total abundance. Pseudomonas was highly abundant in each sample, while its abundance in Ji'an (JA) and Dandong (DD) was low. Myroides, a representative genus of Bacteroidetes, showed high abundance in Changbai (CB) and JA. The abundance of other genera in different samples was quite different, suggesting different microbial community structures in each sample site.
To understand the variation in ARG subtypes with geographical distribution, the diversity of latitude and longitude was analyzed using ecological boundaries analysis. The abundance of ARGs changed with latitude and longitude (Fig. S1). However, no obvious ecological boundary center was observed.

Comparison of ARG profiles from rivers and riverbanks
To explore the relationship between river water and nearby soil, soil samples from riverbanks were also analyzed. From 11 soil samples, 22 types and 228 subtypes of ARGs were detected. Fewer ARG subtypes were detected in soil than in river water. The top 10 subtypes in soil were vanR, bacA, mexF, ADP-ribosylating transferase-arr genes, macB, mdtB, multidrug ABC transporter genes, multidrug transporter genes, mdtC, cpxR, and aac(2')-I. Compared with the ARGs in river water, the distribution of highly abundant ARG types in soil was simpler.
The ARG subtype composition in different sample sites was similar, whether in river water or riverbank soil. As shown in Fig. 4, 95 subtypes of ARGs were shared in all river water samples, accounting for 12.0% of the number of subtypes, but representing approximately 92.1% of the total abundance. Furthermore, 407 ARGs subtypes, accounting for 51.5% of all subtypes, were present only in one or two samples with very low abundance, representing just 0.2% of the total abundance. For the soil samples, 22.4% of ARG subtypes were shared by all samples, covering 93.0% of the total abundance. Moreover, 84 ARG subtypes contributed Fig. 2 Abundance of the top 10 ARG types in samples from the three rivers. Zero abundances were considered in the plot about 0.5% of the total abundance. As for the total abundance of ARG subtypes in soil, few subtypes had high abundance, and many subtypes contributed only a small part of the total abundance. More attention should be focused on ARG subtypes with high abundance to prevent potential human health risks from ARGs.
To explore the potential correlation among ARGs in river water and soil, ARG abundance and types in different groups were compared. As shown in Fig. 5, the abundance of ARGs in soil was lower than that in river water. In soil, vancomycin resistance genes were a characteristic type, with high abundance (up to 25%). Other ARG compositions in river water  Shared ARG subtypes of river water samples or soil samples. A shows shared ARG subtypes in river water samples and B shows those in soil. The table shows the number of shared ARG subtypes and percentage of these shared ARG subtypes in terms of total abundance and soil were similar, with multidrug resistance genes and bacitracin resistance genes representing large proportions in all samples. Spectinomycin and bleomycin resistance genes were only detected in river water samples with low abundance. Except for vancomycin, other members of the top 20 ARGs types were common in river water and soil.
To learn more about the similarities and differences of ARGs in river water and soil samples, ARG subtypes were further analyzed. First, 206 ARG subtypes were detected in both river water and soil, covering 90.4% of the ARGs in soil. Second, 32 ARG subtypes were detected in all samples, whether from river water or soil, including bacA, class A beta-lactamase genes, macA, macB, arnA, vanR, and mexB. BacA, mexB, and other multidrug resistance genes were widely present in sample sites with high abundance. Third, vanR was mainly present in soil, accounting a small amount in river water.

The hosts of ARGs in river water and their potential pathogenicity
After annotation, about 17,960 ARGs were identified as being carried by 440 genera. In all river water samples, Pseudomonas, Aeromonas, Comamonas, Acidovorax, Stenotrophomonas, Herbaspirillum, Acinetobacter, Achromobacter, Delftia, and Variovorax accounted 82.66% of bacteria carrying ARGs. For each sample site, Helong (HL) had the least bacteria carrying ARGs with 53 genera, and Dandong (DD) had the most, with 163 genera (Table S2).
Pseudomonas frequently carried ARGs, and 60.00% of ARGs carried by Pseudomonas were multidrug resistance genes. Aeromonas carried many kinds of ARGs, such as multidrug, beta-lactam, MLS, tetracycline, and other unclassified ARGs. The result for the three different rivers was similar to each other. For HLJ, the top 10 bacterial genera carrying ARGs were almost the same as those in all samples, including genus and ARGs types. The only difference was that Shewanella carried more ARGs and Achromobacter carried fewer. For TMJ, Hydrogenophage and Cupriavidus carried more ARGs than in HLJ and YLJ. In addition, Enterobacter carried more ARGs in YLJ, contributing 2.08% of ARG hosts.
Antibiotic resistant pathogenic bacteria are a direct threat to human health. Therefore, we further analyzed the ARGs present in pathogenic bacteria. The top 10 pathogenic bacteria carrying ARGs are shown in Fig. 6B. Escherichia coli carried the most ARGs, representing 26.09% of all ARGs. Notably, Pseudomonas aeruginosa, Bacteroides fragilis, and other common clinical pathogenic strains carried many ARGs. Pseudomonas aeruginosa is associated with serious infection, and often carried ARGs, such as chloramphenicol and streptomycin resistance genes. In this study, Pseudomonas aeruginosa carried mainly multidrug resistance genes with active efflux mechanisms, and beta-lactam, fosmidomycin, and unclassified ARGs. The results for the three rivers were similar. In YLJ, Brucella abortus carried eight types of ARGs, representing a serious challenge for antibiotic treatment. For TMJ, Staphylococcus aureus and Bordetella bronchiseptica MO275 carried more ARGs, especially multidrug resistance genes. These human pathogenic bacteria carrying ARGs are a direct threat to human health. Our data provide some suggestions for treatment after bacterial infection.

Transfer risk and potential hosts of ARGs in river water
Horizontal transfer is the main method by which ARGs diffuse in the environment. To evaluate the risk of ARG transfer, network analysis was used to describe the correlations between ARGs and bacteria, as well as the relationship between ARGs and mobile genetic elements (MGEs), to explain the risk of transfer and the potential hosts of ARGs. As shown in Fig. 7A, the modularization index was 0.661, indicating that the ARG-bacteria network had a modular structure. Some multidrug resistance genes, like omp36 and ompF, correlated positively with many species of Proteobacteria, indicating that these resistance genes had a high risk of transfer in Proteobacteria. Some macrolideslincosamides-streptogramins (MLS) ARGs, such as macA and macB, correlated negatively with certain Proteobacteria and Actinobacteria, indicating that the transfer risk of macA and macB in these strains was low. The relationships of ARGs with MGEs were analyzed using the same method (Fig. 7B). Some ARGs, such as Oxa-12, correlated positively with many kinds of MGEs, indicating a high risk of horizontal transfer.

ARG abundance by type and subtype in the three rivers
To protect human health, it is necessary to understand the distribution and transfer risk of ARGs. To date, research has concentrated on the geographical distribution of ARGs (Liu et al. 2018, Yang et al. 2019. In this study, ARGs in three important rivers in northeast China (the Heilongjiang River, Tumen River, and Yalu River) were assessed. River water samples and riverbank soil samples were collected and analyzed for their ARG composition using metagenetic methods.
The abundance of ARGs in river water was between 7.5 × 10 -1 and 1.25 × 10 0 copies per 16S rRNA gene, and that in the riverbank soil was between 1.75 × 10 -1 and 2.25 × 10 -1 copies per 16S rRNA gene. The abundance range of ARGs in rivers and soil was similar to the range reported previously . In terms of composition, multidrug resistance genes represented more than half of the total abundance, which was related to the high expression of efflux pumps (Hiroshi and Jean-Marie 2012). Other ARG types, such as bacitracin, beta-lactam, and tetracycline resistance genes, also showed high abundance. As expected, these dominant ARG types corresponded to the antibiotics that have been used widely for human treatment and livestock breeding. BacA is a ubiquitous ancient resistance gene, which has been obtained by many bacteria during the process of evolution, resulting in the high abundance of bacitracin resistance genes (He et al. 2020;Zhang et al. 2020a). Among the multidrug resistance genes, multidrug efflux pumps are ubiquitous in bacteria (Du et al. 2018;Zgurskaya et al. 2018), which not only effectively reduce the concentration of antibiotics, but also participate in other processes, such as detoxification of metabolic intermediates, virulence, and signal trafficking (Bao et al. 2016). The presence of a variety of ARGs with efflux function, such as adeF and mexF, would lead to a high proportion of multidrug resistance genes. The increase in the abundance of polymyxin resistance genes is a concern. Polymyxin is termed the last line of defense of human antibiotics treatment, and an increase in polymyxin resistance genes represents a potential threat to human health.
ARG data in different niches were reported in previous studies. In four sewage treatment plants in Harbin, tetracycline and sulfonamide resistance genes were detected (Wen et al. 2016). These ARG subtypes were also detected in river water samples with high abundance in this study. In a study on the effect of soil fertilization on ARGs in northeast China , the reported highly abundant ARGs types were also detected as highly abundant in the present study. In particular, the ARG data in Heihe (HH) were similar to our data. In a large-scale survey of ARGs in drinking water (Ma et al. 2017), bacitracin, multidrug, and sulfonamide resistance genes were the top three ARG types in Heilongjiang Province.
Compared with ARGs in the Yangtze River, ARGs' abundance in the three rivers was lower, especially super ARGs, such as mcr-1 and NDM-1. According to a previous study, mcr-1 and NDM-1 were present at about 1.2 × 10 9 copies/L and 4.1 × 10 6 copies/L in the downstream region of the Yangtze River . In contrast, the highest abundance of mcr-1 was only 4.0 × 10 -6 copies/16S rRNA and NDM-1 was not detected in the three rivers in the present study. For common ARGs, the abundance of 16S rRNA gene, suls (sul1 and sul2), tets (tetA, tetC, tetE, tetO, Fig. 6 Composition of ARG-carrying bacteria (genus level) and the percentage of ARG types. A shows all bacteria with ARGs and B shows pathogenic bacteria with ARGs. Pie charts show the percentage of taxonomy and percentage of ARG-carrying bacteria. Bar charts show the percentages of ARG types that were determined using data from the annotated ARG-carrying bacteria ◂ Fig. 7 Correlation between ARGs and bacteria and ARGs and MGEs based on network analysis. A shows the relationship of ARGs and bacteria; and B shows that of ARGs and MGEs. The red line indicates a positive correlation between two nodes. The blue line represents a negative correlation. The size of each node is proportional to its number of connections and tetW), erms (ermB and ermF), and qnrs (qnrB and qnrS) ranged from 2.9 × 10 8 to 2.4 × 10 10 copies/L; 1.1 × 10 7 to 3.6 × 10 8 copies/L; 1.5 × 10 6 to 4.9 × 10 7 copies/L; 8.2 × 10 5 to 1.8 × 10 7 copies/L; and 7.7 × 10 3 to 1.5 × 10 6 copies/L, respectively, with mean concentrations of 1.9 × 10 8 , 2.0 × 10 7 , 6.8 × 10 6 , and 4.2 × 10 5 copies/L, respectively Zhang et al. 2020c). In this study, the abundances of these ARGs are from 2.1 × 10 -3 to 3.6 × 10 -2 copies/16S rRNA; 7.6 × 10 -4 to 4.4 × 10 -2 copies/16S rRNA; 0 to 4.9 × 10 -3 copies/16S rRNA; and 1.8 × 10 -4 to 7.5 × 10 -2 copies/16S rRNA, respectively. It seems that there were fewer ARGs in the three rivers than in the Yangtze River. Actually, there were more ARG subtypes detected in river water samples, such as sul3 and sul4. In other words, the quantitative results of some ARG subtypes assessed using quantitative real-time reverse transcription PCR (qRT-PCR) cannot include all ARG subtypes.

The similarity of ARGs in the three rivers and soils
Notably, the composition of ARGs in the three rivers was very similar, regardless of types or subtypes. The Heilongjiang River, Tumen River, and Yalu River flow in different directions and are located in three different provinces. Among the samples from the three rivers, the abundance of ARG was highest in YLJ river water and soil and was lowest in TMJ. In previous studies, the abundance and composition of ARGs in water was seriously affected by human activities , which might be one of the main driving forces for the formation of ARG structures in water. The population living along the coast of YLJ is significantly larger than that of TMJ and HLJ, and the urbanization process has been relatively faster, which might be one of the reasons for the high abundance of resistance genes in YLJ. Although TMJ and YLJ come from the same origin, there are great differences in population density and economic development in the regions they flow through. The results of ecological boundary analysis showed that the abundance and diversity of ARGs did not have a clear demarcation according to the change of longitude and latitude. These three rivers are all located in northeast China with a northern temperate climate, relatively low temperature, and similar precipitation and other conditions. Therefore, we hypothesized that climate plays a decisive role in the composition of ARGs. Currently, there are no large-scale sequencing data on the distribution of ARGs in the natural environment of northeast China, whether in water, soil, or other niches. Therefore, this conjecture needs to be further verified in future studies.
In terms of their composition, ARGs in soil had certain characteristics. Vancomycin resistance genes accounted for a large proportion in soil, but not in river water. Vancomycin is used to treat humans and was originally produced by Amycolatopsis orientalis (Xu et al. 2014), which was isolated from soil. There are more vancomycin-producing bacteria in soil; thus, the abundance of vancomycin resistance genes was very high. Except for vancomycin resistance genes, the composition of the ARGs in soil was very similar to that in river water, and almost all the resistance gene types and subtypes detected in soil were found in river water. This might be related to the characteristics of the river itself. When the river is in flood, the water level rises, covering part of the riverbank. In this process, the bacteria and resistance genes in the soil and river water are mixed and exchanged. In the dry season, the water level of the river drops and the riverbank is exposed, which reduces the mixing and exchange of bacteria and resistance genes. The samples in this study were collected in September, which is in the dry season of these rivers. The composition and structure of the bacteria and resistance genes in river water and soil are relatively stable in this period, which might explain the similar composition of ARGs detected in this study. This similarity suggested that rivers could obtain ARGs from the riverbank soil, and the soil could act as a reservoir of ARGs, resulting in potential secondary ARG pollution.

ARGs hosts and pathogenic bacteria hosts
To further evaluate the potential threat caused by ARGs, the hosts carrying ARGs were analyzed, especially human pathogenic bacteria. Pseudomonas was the most abundant ARG host. Pseudomonas is widely distributed in nature and is closely related to human infection, for example, Pseudomonas aeruginosa, Pseudomonas fluorescens, and Pseudomonas putida (Mazurier et al. 2015). Among Pseudomonas species, Pseudomonas aeruginosa was the most common and carried more ARGs, among which multidrug resistance genes accounted for 60% of the total. In a previous study, 82% of Pseudomonas aeruginosa isolated from hospital sewage treatment plants showed resistance to multiple antibiotics (Miranda et al. 2015). Pseudomonas aeruginosa can cause serious infections and is very common in nosocomial infections. The presence of different ARGs, especially multidrug resistance genes, makes treatment challenging. Among all the samples in this study, Escherichia coli was the pathogen that carried the most ARGs. Worldwide, antibiotic-resistant E. coli have been detected in all kinds of niches. Third-generation cephalosporin-resistant E. coli was detected in coastal surface waters in England and Wales (Leonard et al. 2015). In Dutch rivers, 27.6% of E. coli were found to be resistant to at least one antibiotic (Blaak et al. 2014) and 42% of E. coli in the river Seine in France were found to have at least one antibiotic resistance gene (Servais and Passerat 2009). Other antibiotic-resistant pathogenic bacteria are related to human diarrhea, such as Aeromonas caviae and Bacteroides fragilis, and to respiratory infection, such as Acinetobacter baumannii BIDMC 57. A previous study has shown that humans can acquire antibiotic-resistant bacteria from water while performing water sports or other activities (Leonard et al. 2015). The three rivers in this study were closely related to the daily lives of residents; therefore, we should be vigilant concerning the related risk of infection.

Potential risk of ARG spread
To further understand risk of ARG spread, it is important to evaluate the potential horizontal transfer of ARGs. MGEs participate in this process directly. The ARG-bacteria network showed the potential hosts of ARGs. In the network, Aeromonas was predicted to correlate positively with many types ARGs, including beta-lactam, multidrug, tetracycline, and MLS type genes, and experimentally, these ARGs were detected in Aeromonas. This showed that the network was an effective way to find potential hosts of ARGs. Positive correlations between different types of ARGs, such as that between the multidrug resistance gene ompF and the betalactam resistance gene PBP-1B, indicated their co-occurrence. Different ARG subtypes seem to be preferred by different hosts. The multidrug resistance gene ompF correlated negatively with the Proteobacteria genus Rhodoferax, but positively with the genus Trabulsiella. In the ARG-MGEs network, many MGEs correlated positively with ARGs, which was consistent with previous studies Zheng et al. 2018). Some ARG subtypes correlated positively with many MGEs, such as OXA-12 and mdtH, indicating a high risk of horizontal transfer.

Study limitations
This study had some limitations. In the process of collecting soil samples, only one mixed sample was collected from each sample site. To improve the representativeness of the samples, soil samples were collected every 1 m and mixed for analysis. For groups with fewer sample sites, there might be a great difference between a single sample data and the average; thus, a more intensive sampling strategy might be considered in future research. Moreover, the impact of season should be considered. The sampling time of this study was autumn, and the high temperature in summer might promote the growth and reproduction of many kinds of eukaryotes and strongly affect the distribution of bacteria and ARGs. Therefore, our results are only strictly applicable to the time-frame in which the samples were collected, and different results might be obtained if the study was repeated in other seasons.

Conclusions
In this study, we investigated ARGs in the original habitats of three rivers in northeast China. Twenty-one ARG types were detected in total. Multidrug resistance genes were the major ARG type detected. The composition of ARGs in the three different rivers was similar, and ecological boundaries analysis showed no differences. ARG subtypes in riverbank soil were almost the same as those in the river water, except for vancomycin resistance genes. The bacterial communities in three rivers were different, but ARG hosts were similar. Among the 440 identified ARG host genera, Pseudomonas, Aeromonas, Comamonas, Acidovorax, Stenotrophomonas, Herbaspirillum, Acinetobacter, Achromobacter, Delftia, and Variovorax were the most abundant genera. Besides, many human pathogenic bacteria were identified as ARG hosts related to human respiratory infections and diarrhea, such as Pseudomonas aeruginosa and Aeromonas caviae. Furthermore, many multidrug and beta-lactam ARGs correlated significantly and positively with MGEs, indicating the potential risks of ARG horizontal transfer, especially among human pathogenic bacteria. Overall, these results revealed the potential threats to public health caused by ARGs in this area; however, their specific impact requires further evaluation.