Sample collection was performed on farms located in the three Chinese provinces of Shandong, Henan and Liaoning (Fig. 1 and Supplementary Tables 1 and 2, further information in the Methods section). Sample collection resulted in a total of 461 viable biological samples, covering two time points in the bird life cycle within the farm (t1 and t2) and one time point in the slaughterhouse (t3). Biological samples consisted of bird faeces (n = 223; 116 at t1, 107 at t2), feathers (n = 36; 17 at t1, 19 at t2), barn floors (n = 23; 10 at t1, 13 at t2), carcasses (n = 94 at t3), abattoir wastewater (n = 21 at t3), abattoir processing lines (n = 12 at t3), and outdoor soil (n = 52; 25 at t1, 27 at t2).
Microbial Communities And Resistomes Are Differentiated Across Farm Sources And Between Farm And Abattoir
Taxonomic profiling of the metagenomic samples revealed 19 bacterial phyla, 2 archaea phyla and 3 eukaryote phyla (Supplementary Fig. 1A and Supplementary Table 3). The microbial communities formed two clusters (Supplementary Fig. 1B) (PERMANOVA, p < 0.001), clearly separating farm sources (faeces, feathers, barn floor and outdoor soil) from abattoir sources (carcasses, processing line and wastewater). Pairwise testing indicated that abattoir sources were not significantly separated from each other. However, farm sources were consistently separated from abattoir sources and from each other (adjusted p values < 0.05). Comparison of individual phyla abundances (Wilcoxon Rank-sum test) highlighted nine phyla with differences across sources (Supplementary Fig. 2 and Supplementary Table 4). As expected, typical soil phyla, Actinobacteria and Planctomycetes were more abundant in outdoor soil samples, and commensal gut species (Firmicutes, Protobacteria, and Bacteroidetes) were found in high abundances in most sample sources indicative of potentially contaminated environments. Chlamydiae, a phylum endemic in birds, was found in high abundance in abattoir samples.
In the resistome of all samples, after rarefying to the minimum read depth, 336 ARGs were found, representative of 14 antibiotic classes. The ARG presence/absence patterns differed in all farm and abattoir sources (PERMANOVA, p < 0.05) except for wastewater and abattoir processing line, Supplementary Fig. 3. The analysis of pattern differentiation for ARGs belonging to each specific class (Wilcoxon Rank-sum test, Supplementary Fig. 4 and Supplementary Table 5) revealed that 13 of the 14 classes had significant differentiation at least in one pairwise comparison across sources, consistent across farms. Specifically, barn floor (t1 and t2) samples carried a greater number of aminoglycosides, amphenicol, MLSB and tetracycline genes compared to processing line, wastewater, outdoor soil and (for t1 only) carcasses (adjusted p values < 0.05). Chicken faeces carried a greater number of aminoglycoside, MLSB and amphenicol genes compared to outdoor soil, wastewater, processing line and carcasses (adjusted p values < 0.05), and beta lactam genes compared to outdoor soil, wastewater and carcasses (adjusted p values < 0.05). Additionally, chicken faeces collected at t1 carried a greater number of MDR and fosfomycin genes compared to outdoor soil, wastewater and carcasses (adjusted p values < 0.05).
Analysis Of Mobile Args Reveals Numerous Clinically Relevant Ones, Shared Between Birds And Environments
As gene mobility may influence ARG presence across sources, and because of the potential importance of mobile genetic elements (MGEs) in the development of effective surveillance systems11, we considered ARGs in proximity to MGEs. In total, 695 different MGE-ARG combinations (mobile ARGs) were found, Supplementary Table 6, featuring 202 unique ARGs. Eighty ARGs (40%) were found in only one MGE-ARG combination, whilst the remaining 122 (60%) were found in multiple combinations (2 to 22), Fig. 2A. Over half (57%) of the 695 mobile ARGs were present in more than one source, Fig. 2B, with three MGE-ARG combinations (IS1216-poxtA, IS15-APH(3’)-Ia and ISCfr1-ACC(3)-IId) present in every source. Chicken faeces had the highest number of mobile ARGs, but also the greatest variance, Fig. 2C. Feathers and barn floor also carried many mobile ARGs, the mean number not being statistically different from faeces (Dunns test adjusted p values > 0.05). Outdoor soil, carcasses, processing line and wastewater generally had lower numbers of mobile ARG patterns per sample, with these numbers differing significantly (Dunns test adjusted p values < 0.01) from faeces and feather, but not with each other. In total, 157 different MGE-ARG combinations were found in bird and environmental sources on the same farm, with some of these appearing on multiple farms. Of these, 47 contained clinically relevant ARGs24, Fig. 2D. Notably, we found blaNDM−5 known to be present in the IncX3 plasmid which can be disseminated among humans, animals, food and environment25, and qnrS1 a plasmid-mediated quinolone resistance gene known to be present in the chicken supply chain that is capable of being transferred over different bacteria26.
The antimicrobial resistance profile of E. coli residing in the gut is correlated to the resistome and microbiome of the gut itself
We further investigated if there was a correlation between the chicken gut resistome and microbiome and the antimicrobial resistance profiles of E. coli taken from the same samples as the metagenome data.
We cultured E. coli isolates from 170 chicken faeces samples (a subset of the samples that had been used for metagenomics) and characterized their AMR profiles against a panel of 26 antibiotics. The proportion of isolates resistant to each antibiotic ranged from 1–98% (Supplementary Table 7). All isolates were resistant to at least one antibiotic, with 169 resistant to at least three.
To investigate correlations between resistance in E. coli and gut microbiome/resistome, we developed a bespoke data mining method based on machine learning, Fig. 3. The method consists of building a ML-powered “predictive function”, whose output is the resistance of E. coli to a specific antibiotic (true or false) from antimicrobial susceptibility testing (AST), and whose input is the aggregation of information from the gut microbiome (relative abundances of microbial species) and gut resistome (presence/absence of ARGs). The predictive function is trained using experimental data (supervised learning) and swapping different underlying ML technologies, until optimal prediction performance is achieved. A set of the most informative features, also referred to as “predictors”, is extracted from the ML models. The set is then refined by analysis of correlation with temperature and humidity (see later section).
Out of the 26 antibiotics, only 17 had sufficient data (resistance and susceptibility cases) to allow proper ML training: amikacin, amoxicillin/clavulanic acid, aztreonam, cefepime, cefoxitin, cefotaxime/clavulanic acid, ceftazidime, ceftazidime/clavulanic acid, chloramphenicol, cefotaxime, gentamycin, kanamycin, minocycline, nalidixic acid, streptomycin, sulphafurazole, and trimethoprim/sulfamethoxazole. For all, the extra-tree classifier (ML technology) resulted in best prediction performance (Nemenyi test, Supplementary Table 8 and Supplementary Fig. 5). Prediction performance indicators using extra-tree are reported in Fig. 4A and Supplementary Fig. 6. A total of 11 predictive models (amikacin, aztreonam, cefoxitin, chloramphenicol, cefotaxime, kanamycin, minocycline, nalidixic acid, streptomycin, and sulphafurazole) achieved the best performance according to AUC (> 0.90).
Data mining showed the existence of a core subset of the chicken gut resistome and microbiome that exhibited a strong predictive power on the resistance of E. coli. This core consisted of 495 features (191 microbial species and 304 ARGs), acting as strong predictors of E. coli resistance/susceptibility to 11 antibiotics (Fig. 4B, Fig. 4C and Supplementary Table 9). The 304 ARGs from the top 11 antibiotic models belonged to beta lactams (26% of the ARGs), aminoglycosides (16%), and MLSB (14%), with other antibiotic classes accounting for less than 10% each. Of these 304 ARGs, based on the correlation of ARG read depth with species abundance (see Methods)27, 70 were found to be present in contigs identified as originating from E. coli with 69 of these also present in other bacteria. A further 100 ARGs (of the 304) were present only in contigs identified as other bacterial species (i.e. did not originate from E. coli). To further explore the relationships between core gut features and antibiotic resistances, the 495 features and the 11 antibiotic resistances were visualised as nodes of a graph, with edges only connecting predictors to predicted resistances (Fig. 4C). This analysis highlighted a core of 117 ARGs (68 clinically relevant, including blaCTX−M−64, blaCTX−M−123, fosB3 and dfra5) acting as predictors of more than three antibiotic resistances. Six ARGs (aadA16, aadA22, PER-1, AAC(6')-Ia, tet(39) and SHV-110) were found to be predictors of eight antibiotic resistances. The same analysis revealed 29 microbial species in the gut, acting as predictors of five antibiotic resistances (aztreonam, chloramphenicol, cefotaxime, kanamycin and nalidixic acid). These 29 species included the bacterial genera Arcobacter, Acinetobacter and Sphingobacterium in addition to other commensal bacteria.
Gut resistome and microbiome features recognised as predictors of E. coli resistance are also correlated with temperature and humidity
For the top 11 antibiotic models, we developed bespoke regression models using individual gut features as independent variables (one model per variable), and temperature or humidity as dependent variables, to see if model fitting would highlight a correlation, Fig. 3 – phase III (see also Materials and Methods for details). Amongst the original 495 features, 183 ARGs and 48 microbial species correlated well with humidity, whilst 60 ARGs and 23 microbial species correlated with temperature (Supplementary Fig. 7 and Supplementary Table 10). Correlation with humidity was on average stronger (higher R2 values in the regression analysis, Supplementary Fig. 7). Of the 183 ARGs correlated with humidity, 20% were beta lactams, 16% MLSB, 15% aminoglycosides and 11% tetracyclines. Of the 60 ARGs correlated to temperature, 22% were MLSB, 20% beta lactams, 18% aminoglycosides and 12% glycopeptides. Forty-eight ARGs correlated with both temperature and humidity, four of them clinically relevant (AAC(6')-Ib4, ANT(2'')-Ia, NDM-1 and VEB-1). Four microbial species from the phyla Proteobacteria (Helicobacter pullorum and Alcaligenes faecalis), Firmicutes (Bacillus cereus group), Bacteroidetes (Bacteroides stercoris) correlated with both temperature and humidity. One species from Tenericutes (Mycoplasma yeatsii) correlated with temperature only; while other species from Proteobacteria, Firmicutes, Bacteroidetes and Actinobacteria correlated either with temperature or humidity (Supplementary Table 10).
We tested for the possibility that some ARGs found correlated with temperature or humidity might belong to microbial species also correlated to temperature and humidity. The analysis highlighted four distinct subgraphs correlated with humidity (Fig. 5A) and one correlated with temperature (Fig. 5B). Notably, one subgraph correlated with humidity contained Klebsiella pneumoniae and 5 related ARGs (KpnE, KpnF, KpnG, OmpK37 and acrA). The subgraph containing A. faecalis and ARGs ErmY, FosD, dfrA16, vgaC and vgaE, was found in both analyses (i.e., correlated with both temperature and humidity).
We then investigated if the gut ARG features identified as predictors of resistance in E. coli, and further identified as correlated to humidity or temperature, were in close proximity to MGEs. Nine ARGs were found located in close proximity to MGEs (MLSB: lsaB, mphF, cfrC and ermX; beta-lactams: NDM-1, PER-1, DHA-1; amphenicol: catB2 and trimethoprim: dfrA16). Four of the nine ARGs were found associated to only one MGE (lsaB with ISSau9; cfrC with ISEc9; DHA-1 with IS15, and drfA16 with IS6100), whilst the other five were found associated from 2–6 different MGEs. All the MGE-ARG pairs were investigated for conserved structure across farms or sources. For example, the clinically important NDM-1 was found in close proximity of IS15 in four samples (three chicken faeces from LN1 and one barn floor sample from LN3 (Supplementary Fig. 8)). In 19 samples from chicken faeces and feather sample from LN1, LN3, SD2 and SD4, NDM-1 was found in proximity of MGE ISAba125, and located next to another ARG ble, which is a known association for plasmid-borne NDM-1 in Enterobacteriae species from Asian regions28. Despite the having found the same NDM-1-ISAba125 pattern in several farms (LN1, LN3, SD2 and SD4) there was no evidence of transmission between farms, Fig. 6. Instead, evolutionary analysis of the contigs suggested recent branching of isolates within individual farms (most common recent ancestor (MCRA) less than two years on most branches) and much earlier MRCAs between different farms (greater than 20 years), indicating the likelihood of this mobile ARG widely circulating in livestock throughout China.
Gut microbiome and resistome features, recognised as predictors of E. coli resistance, are correlated with drug use
We investigated if the core chicken gut microbiome and resistome, previously identified as predictors of resistance in E. coli, may in turn be influenced by antibiotics usage (Supplementary Table 11). We found statistically significant differences in the relative abundance of ARGs per antibiotic class (Supplementary Fig. 9), presence/absence of ARGs (Supplementary Fig. 10) and relative abundances of microbial species (Supplementary Fig. 11), between farms using and not using antibiotics (Supplementary Table 12). On the farms that received tetracycline antibiotics, the tetracycline class and tetracycline ARGs (tet(39), tet(B), tet(G), tet(Y), tetS and tetX) were found be significantly increased (adjusted p values < 0.05). In addition, in these farms there was also a significantly increased presence of genes from the classes: aminoglycoside, beta lactam, MLSB, MDR, phenicol, sulfonamide, trimethoprim, fosfomycin, glycopeptide and nucleoside (adjusted p values < 0.05). All these, except fosfomycin, glycopeptide and nucleoside, had a greater than expected co-presence on contigs with tetracycline genes (Chi-square test, Holm correction adjusted p values < 0.0001). Similarly, on farms that received lincosamide antibiotics there was an increased presence of MLSB genes. There was also an increased presence of genes in aminoglycoside, beta lactam, tetracycline rifamycin, fosfomycin, phenicol and glycopeptide classes. All except glycopeptide had a greater than expected co-presence on contigs with MLSB genes (Chi square test, Holm correction adjusted p values < 0.0001).
All the ARGs for which presence/absence was found significantly correlated to antibiotic usage (indicated in Supplementary Fig. 10) were also found correlated with humidity. Additionally, seven of these genes (ANT(2’’)-Ia, catB2, dfrA16, DHA-1, ErmX, FosC2 and spd) were also correlated with temperature as well as humidity. Of the 16 microbial species that were found to have a statistically significant difference in relative abundances in relation to antibiotic usage, seven were correlated to changes in humidity (Baciillus cereus group, Jeotgalibaca sp PTS2502, Klebsiella pneumoniae, Klebsiella variicola, Lysinibacillus sp BF 4, Proteus hauseri and Proteus mirabilis) whilst three were correlated to changes in temperature (Alistipes sp An66, Baciillus cereus group and Enorma massiliensis).