Genomic analysis reveals fty years of antimicrobial resistance evolution in husbandry Escherichia coli from China

Antimicrobial agents have been used in meat production for decades and its consumption is considered an key driver for the emergence and dissemination of antimicrobial resistance (AMR). However, large-scale studies on AMR changes in animal isolates since the introduction of antimicrobial usage remain scarce. We applied whole genome sequencing analysis to 982 animal-derived Escherichia coli collected in China from 1970s to 2019 and found increasing trends for the presence of numerous antimicrobial resistance genes (ARGs), including those conferring resistance to critically important agents for veterinary (orfenicol and noroxacin) and human medicine (colistin, cephalosporins, and meropenem). Extensive diversity and increasing complexity of ARGs and their associated mobile genetic elements (MGEs) such as plasmids were also observed. The plasmids, IncC, IncHI2, IncK, IncI, IncX and IncF played a key role as highly effective vehicles for disseminating ARGs. Correlation analysis also revealed an association between antimicrobial production and emergence of ARGs at a spatial and temporal level. Prohibiting or strictly curtailing antimicrobial use in animals will potentially negate the current trends of AMR as the bacterial genome is highly changeable and using different drugs of the same class, or even unrelated classes, may co-select for MGEs carrying a plethora of co-existing ARGs. Therefore, limiting or ceasing antimicrobial use in animals to control AMR requires careful consideration.


Introduction
Antimicrobial resistance (AMR) has become a global health and development threat. Currently, at least 700,000 people die annually of infections caused by antimicrobial-resistant organisms, and this number is projected to rise to at least 10 million with accumlated costs of approx. $100 trillion by 2050 1 . This relentless increase in AMR is exacerbated by global antibiotic consumption (AMC), which increased by nearly 70% during the 2000s with much of the AMC occuring in animalproduction 2 . In the USA, for example, over 70% (by weight) of antimicrobial agents assigned to classes that are medically important in human medicine, such as β-lactams, tetracyclines, macrolides and colistin, are sold for use in animals 3 . Such extensive AMC has driven the rapid emergence of multidrug-resistant (MDR) pathogens in both humans 4 and animals 5 .
Whilst horizontal gene transfer via plasmids and transposons may account for inter-bacterial dissemination 6 , the interconnections between animals and humans either through the food chain, direct contact or environmental routes, also contribute to the widespread of AMR 7 .
Monitoring longitudinal AMR data is not only essential for analysing trends, but also for the early detection of emerging resistance traits i.e. globally coordinated surveillance studies have been conducted since the 1990s 8 , of which Escherichia coli, a commensal bacterium, is considered as an excellent AMR indicator 9 . Moreover, E. coli is the most frequently reported clinical pathogen worldwide 10 . In 2015, WHO launched the Global Antimicrobial Resistance Surveillance System (GLASS) to facilitate AMR surveillance in bacteria causing critical infections and to assess AMR trends across WHO regions 11 . However, no comparable framework exists for AMR surveillance in animals, and few country-speci c monitoring or surveillance studies have been conducted in Europe 12 and the United States 9 . In low-middle countries (LMIC)s, the OIE are liasing with the Fleming Fund to leverage such data but is restricted to 24 countries 13,14 . These programs show large AMR variations across countries, which may be due to a myriad of reasons including technical e.g. antimicrobial susceptibility testing (AST) methods. One of the earliest (2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015)(2016)(2017)(2018), large LMIC AMR studies on bacteria including E. coli from animals revealed an AMR increase among pig and chicken isolates 15 . AMR surveillance of E. coli from in rm chickens in China between 1993 and 2013 showed an increasing trend in MDR 16 . Most AMR surveillance studies on bacteria from animals are limited in their spatial and temporal sampling, and mostly focus on phenotypic data only. Moreover, an association between AMR and animal AMC has seldom been investigated.
In China, antimicrobial agents have been used as feed additives in the animal farming industry since the 1970s 17 . The annual per capita consumption of meat has increased from 10.3 kg in 1978 to 57.6 kg in 2019 18 , and this dramatic increase has resulted in a correspondingly rapid increase in animal AMC; thereby, en aming the AMR problem. However, despite these mitigating factors, changes in animal AMR levels remain poorly understood. Herein, we investigated the AMR trends of animal-derived E. coli in China from 1970s to 2010s. The evolution and transmission of ARGs alongside the plasticity of ARG-carrying plasmids were also evaluated, focusing on speci c AMR markers to clinically important antimicrobial agents used in animals 19 and humans 20 .

MLST diversity and patterns
MLST of 982 isolates demonstrated signi cant diversity (Fig. S2), with isolates assigned to 226 distinct sequence types (STs), except for 67 isolates with novel STs. More than half of the STs (54.0%, n=122) were represented by a single isolate.

Pan-genome analysis
Pangenomic analysis of the 982 E. coli isolates revealed an open pangenome of 66,767 genes and showed no evidence of reaching a plateau (Fig. S5a), indicating that genes were expanded as the sample size increased. Contigs were classi ed as most likely originating from chromosomes or plasmids according to the prediction results from mlplasmids 24 . The chromosomal core genome was predicted to contain 1,724 genes using a 99% threshold, and 57,434 accessory genes were identi ed with mean 2,837 genes per genome (Table S1). The isolates from the rst three chronological groups (1970s-2000s inclusive) were found to be more conserved based for their chromosomal pangenome accumulation curves (Fig.   S5b), while group 4 (2010s) appeared to have a more open pangenome, which may be account for the larger spatial and host representation. The accessory gene counts per genome varied over time, ranging from 2,311 to 3731. Isolates collected in the 1970s-1980s and 1990s had more chromosomal accessory genes (mean=2223), compared to those from 2000s and 2010s (mean=2152, p < 0.001), whilst the plasmid accessory genes increased from mean 193 in isolates from 1970s-1980s to mean 292 in isolates from 2010s (p < 0.001, Table S1).
Plasmid replicon marker analysis also indicates a slight increase in the number of plasmids over time with a mean count of 10.0 in isolates collected from the 1970s-1980s to 12.5 in isolates from 2010 (p < 0.001, Fig. S6a). The prevalence of several ARG-associated plasmids, such as IncX3, IncX4, IncI2 and IncHI2, were signi cantly higher in 2010s than in the 1970s-1980s (Fig. S8). The IncX3, IncI2 and IncB/O/K/Z plasmids were rarely detected between 1970s and 1990s (0%-6.06%), but increasingly presented in isolates from the 2010s (14.54% -31.78%, Fig. S8 Temporal changes of AMR, ARG and virulence gene (VG) pro les Antimicrobial susceptibility testing of 450 E. coli isolates and ARG screening of 982 E. coli genomes revealed an increasing trend in AMR across the near-50 years time span (Fig. 2 a-c). We observed an increase in resistance to more recently approved and/or less commonly used veterinary drugs (p < 0.01), such as orfenicol, apramycin, nor oxacin, and ceftiofur, and the drugs exclusively used in human medicine, such as aztreonam and meropenem (Fig. 2a). Similar trends were observed in their corresponding ARGs in the full genome set ( Fig. 2a and c). In contrast, resistance to antimicrobial agents commonly employed since the 1970's, e.g. tetracycline, trimethoprim/sulfamethoxazole and chloramphenicol, remained relatively high throughout the 1970s-2010s (48.67%-79.78%, Fig. 2a, Table S2). Correspondingly, the presence of ARGs also remained high in the full geneome set (Fig. 2c). Porcine isolates had a relatively higher prevalence of ARGs conferring resistance to colistin and uoroquinolones (65.5% vs. 44.9% in other hosts, p < 0.001), whilst sheep isolates had the lowest prevalence of ARGs (mean count of 5.97 vs 15.57 in others, p < 0.001, Fig. 2c).
We speci cally examined the resistance to β-lactams and phenicols which are key antimicrobial agents for both human and/or veterinary medicine. The rates of resistance to penicillins (ampicillin) and penicillins + β-lactamase inhibitors (amoxicillin/clavulanic acid), remained low in the 1970s-1980s (24.39% and 13.41%, respectively), but rose rapidly since 1990s (67.86% and 71.43%, respectively) and remain at a high-level (72.86% and 53.77%, respectively, in 2010s). Resistance to cephalosporins and aztreonam have increased over time (from 1.22% and 1.22% to 52.24% and 34.17%) and resistance to meropenem was most noticeable in the 2010s (1.01%, Fig. 2a, Table S2). The average counts of ESBL genes per genome for each chronological group in the full genome set were 0.24, 0.73, 0.83 and 1.16, respectively (Fig. 3a); 70.37% of the isolates (n=691) were found to contain at least one ESBL gene and the percentage in each time group increased from 22.12% in 1970s-1980s to 69.70%-77.66% in 1990s-2010s. Among the ESBL-positive isolates, 28.08% were also positive for bla NDM (n=194). Although the use of chloramphenicol in food-producing animals was banned in China in 2002 25 , orfenicol, a uorinated thiamphenicol derivative, is still in use. The prevalence of chloramphenicol resistance has remained relatively high with a slight decline in the 2000s (60.71%-43.26%, Fig. 2a) and the most prevalent chloramphenicol resistance genes have changed from catA1 to catB3, cmlA and oR (which mediates bacterial resistance to both chloramphenicol and orfenicol) (Fig. 2c, Table S3). Genetic context analysis of phenicol resistance genes indicate that catA is linked with only a few ORFs encoding hypothetical proteins and likely associated with an IncP plasmid, while catB, cmlA and oR genes were found to be associated with IncHI2 plasmids along with other ARGs (Figs. 5 and 6).
Four uoroquinolones (nor oxacin, pe oxacin, lome oxacin, and o oxacin) were banned from usage in food-producing animals by 2016 26 , and colistin was prohibited for use as an animal growth promoter in April 2017 27 . We therefore used these two time points to further group isolates in order to investigate how these restrictions may have affected AMR and ARG pro les. Resistance to nor oxacin increased rapidly from the 1990s to the 2000s before declining in the 2010s (Fig.   S11a). Correspondingly, the average counts of PMQR genes for each chronoloigical group were 0.02, 0.18, 0.45 and 0.73, respectively (Table S3). The prevalence of oqxAB and qnrS increased; qnrB and qepA continue to remain low, with a slight decrease in aac(6')-Ib-cr (19.39%, 102/526 before 2016 vs. 17.02%, 24/141, after 2016). In contrast, the prevalence of colistin resistance and the gene, mcr-1, increased rapidly during the 2000s (Fig. S11b), but decreased from 44.27% (n=278/628) in 2010-2016 to 23.08% (n=9/39) in 2017-2019, after the ban of colistin as growth promoter was enforced.
For virulence genes pro les, most isolates were detected to be hypovirulent, with 769 VGs (Table S3) functionally clustering into 16 classes and 100 groups (Fig. S9). No obvious trends or changes in VGs were observed. Pig isolates had a lower prevalence of virulence genes associated with adherence, autotransporters, invasion and iron uptake compared to isolates from chickens. Sheep isolates horbored highest virulence genes but lowest number of ARGs (Fig. 3a).

Correlation analysis of the gross output value of veterinary medicine for ARGs and VGs
To understand the association between AMC and AMR, we used the antimicrobial production data as an alternative measure to veterinary drug usage, which was unavailable in China prior to 2018. In general, the gross output value of the veterinary drugs produced in China during the different chronological periods (Table S5)

The transfer of ARGs
We applied the DISCOVER method commonly used for testing the somatic alterations in cancer 28 as a statistical approach for analyzing the co-occurrence of ARGs and plasmid replicons to infer the transmission of resistance genes. Several plasmid replicons were associated with ARGs (Fig. 5), such as IncC with the ESBL-related gene bla CMY , IncI2 with mcr, IncX3 with the carbapenem resistance gene bla NDM , IncHI2 with seven resistance gene classes, among others. The cooccurrence of ARGs and plasmid replicons largely remained stable when the analysis was restricted to different chronological-groups, with the IncC plasmid replicon being the one exception which was associated with the trimethoprim resistance gene dfrA23 in the 1970s-1990s, but then associated with the ESBL gene bla CMY after 2000 (Fig. S13). ARG genetic context analysis was used to investigated clustering of ARGs in assembled contigs, and showed that isolates from the 1970s-1990s possessed smaller clusters with fewer links between ARGs, but those from the 2000 and 2010 group gradually possessed more densely populated ARG regions (e.g. the β-lactamase genes or aminoglycoside resistance genes, Fig. 6 and S14). For example, we found that catB and cmlA co-existed with the bla OXA-1 , bla OXA-10 , bla NDM , aph(3')-XV, ant(2'')-Ia, aadA, ant(3'')-Ia, the PMQR gene aac(6')-Ib-cr, and the rifampicin resistance gene arr (Fig. 6a-c). Additionally, oqxAB gene was the rst to be detected in category of PMQR during the 1970s-1990s, while the appearance of acc(6')-Ib-cr in the 2000s was the only PMQR that co-existed with other ARGs such as bla OXA-1 and arr. The mcr-1 gene was always associated with the IS30 family transposase, ISApl1, but the anking regions have diversi ed over time.

Discussion
Previous studies investigating AMR in animals in China have been limited at both a spatial and/or temporal level 29-33 , while our study presents a nation-wide surveillance report encompassing isolates from different food-producing animals from the 1970s, a time when antimicrobial agents seldom used in veterinary medicine and have since largely replaced traditional Chinese herbs 34 , to 2019. Worldwide, China is now one of the largest users of veterinary antimicrobial agents 21 .
In this study, we observed a signi cant association between veterinary antimicrobial production and the enrichment of ARGs, further demonstrating that the long-term use of antimicrobials in animals is one of the key drivers of AMR. As AMC in animals increased over the ve decades, so the evolution of E. coli from animal origins with expanded AMR pro les continue to expand via antimicrobial selection.
To address this critical issue, the Chinese government announced a series of directives aimed at reducing AMR in animals.
Concurrently, several programs were launched for AMR monitoring 35,36 with several antimicrobial classes banned (chloramphenicol in 2002, four uoroquinolones in 2016) and colistin prohibited as growth promoter in April 2017 [25][26][27] . The overall prevalence of AMR pro les and corresponding ARGs; however, did not decline in the 2010s, but exhibited divergent trends (Fig. 3). The observation, that the dissemination of colistin resistance in E. coli dramatically reduced after the withdrawal of colistin, was consistent with our previous study 37 . This reduction may have been mediated by the following: 1. the mcr-mediated target modi cation on lipid A moiety of LPS is the sole transferable mechanism of colistin resistance in E. coli 38 . 2. mcr-1 expression can impose a considerable tness cost for E. coli carrying mcr-bearing plasmids 39-42 . 3. The ban of colistin as a growth promoter signi cantly reduced the selection pressure for mcr-1 which would have previously occurred. 4. The observation that mcr genes are seldom found together with other ARGs on the same mobile genetic elements (Fig. 6) reduces the likelihood for co-selection by other antimicrobial agents.
In contrast, the ban of chloramphenicol and the four uoroquinolones from use in animals did not result in a phenotypic AMR decrease or a reduction in their corresponding ARGs. This might be attributable to the following: 1. Other antimicrobial agents within the same class, including orfenicol (phenicols) as well as cipro oxacin, dano oxacin, enro oxacin, and sara oxacin ( uoroquinolones) were still used in animals and may have provided selection pressure for particular genotypes. 2. The co-selection by other antimicrobial agents might have contributed to the persistence and continual spread of resistance to phenicols and uoroquinolones e.g. the PMQR gene acc(6')-Ib-cr is frequently found to be located in a class 1 integron with catB, bla OXA and arr 43,44 . Co-occurrence of these genes was observed in 64.1% (91/142) of our isolates that were positive for acc(6')-Ib-cr. 3. The immediate genetic environment have become increasingly complex due to the accumulation of additional ARGs via mobile genetic elements (MGEs) (Fig. 6). These genes are likely to be mobilised together and the exposure to any one of these correspsonding antimicrobial agents could result in the coselection of the entire gene cluster or plasmid. 4. Multiple mobile resistance genes mediating diverse resistance mechanisms relating to chloramphenicol resistance ( oR-related e ux pump and cat-associated enzymatic inactivation, cfr-mediated target site modi cation) and uoroquinolone resistance (oqxAB and qepA-related e ux pump, acc(6')-Ib-crrelated enzymatic inactivation and qnr-associated target protection) render the loss of these resistance traits more di cult than a single resistance mechanism e.g. mcr-1. 5. Fluoroquinolone resistance is more frequently mediated by mutations in speci c chromosomal target genes, such as gyrA, gyrB, parC or parE. Such chromosomal mutations are very stable, especially under continuous selection pressure imposed by the use of uoroquinolones in animals. Finally, uoroquinolones can further induce mutations causing resistance to other classes of antimicrobials 45 , which continue to strengthen their competitive positions.
Another di culty hindering AMR control is the prosperity of horizontal gene transfer by MGEs. Acquisition of ARGs is greatly aided by a variety of MGEs, among which plasmids play a crticial role 46 . To date, 27 major plasmid incompatibility groups have been associated with ARGs in Enterobacteriaceae 47,48 . The co-occurrence of ARGs and several plasmid replicon types was commonly detected within our isolates (Fig. 5). The IncI2, IncX3, IncHI2, IncC and IncF types were linked with ARGs, with some of these associations highlighted in previous studies 49 . The IncF, IncC and IncX plasmid types were recently identi ed to be the most prevalent types in carbapenem-resistant E. coli 50 , while IncI2, IncHI2 and IncX4 types are commonly linked with mcr-1 transmission 51 . As the increasing prevalence of ARGs was consistent with increasing detection of these plasmid replicons (Fig. 2c, S2b, S8), any AMR containment necessitates the mitigation of these multiresistance plasmids and therefore strategies to inhibit their transmission should be further explored.
We acknowledge several limitations in this study. A few isolates (n=60) were collected several decades ago and their collection details (e.g., exact location) were no longer available. Apart from those downloaded from NCBI, all remaining isolates from the 1990s (n=28, 84.85%) were collected from a chicken farm in Jiangsu Province, and may represent a sampling bias. Additional sampling bias may also result from the available data from online database given the skew towards sequencing isolates with speci c ARGs. However, given the extensive sources involved in this study, and the obvious huge diversity of these genomes, the bias is likely to be minor.
Stricter use of antimicrobial agents in animals is a strating to become a global trend; at least in theroy. As such, the stop growth-promoting medicines as feed additives after 1 st July 2020 52 . It should be noted that antimicrobial agents are still irreplaceable for the treatment and prevention of infectious diseases caused by bacteria. It should also be noted that only healthy livestock animals provide high quality meat products for human consumption which is an increasing global trend. Thus, sick animals need an adequate treatment, which in the case of bacterial infections also involves the appopriate application of antimicrobial agents. However, forti ed with the understanding that antimicrobial agents will select speci c resistant genotypes, it is crticial to design AMR reduction stratgeies that considers the co-existence and cotransfer of ARGs i.e., interventions exploring separate classes of antimicrobials for humans and animals.

Materials And Methods
Sample collection and whole genome sequencing A total of 450 E. coli isolates from animals in China collected since the 1970s were collated from our previous studies. HiSeq X Ten platform (Annoroad, Beijing, China). Raw sequence data of clinical strains were assembled using SPAdes 57 v3.13.1 via the Unicycler 58 v0.4.7 assembly pipeline. Assemblies were deposited in NCBI (BioProject Accession number: PRJNA718052). Three isolates were selected for long-read sequencing, because they were collected from the early temporal group or were assigned to the clades in which we were particularly interested. MinION library preparation was performed using the Rapid Barcoding Sequencing kit (SQK-RBK004) according to the the manufacturer's (Oxford Nanopore, Oxford, UK) standard protocol. The constructed library was loaded into the Flow Cell R9.4 (FLO-MIN106) on a MinION device and run with the SQK-RBK004_plus_Basecaller script in MinKNOW1.5.12 software. De-multiplexing and basecalling were conducted using guppy 3.2.4. The long-read sequencing data was combined with their corresponding short-read Illumina data and used to generate de novo hybrid assemblies with Unicycler 58 v0.4.7. Additional genomes were downloaded from the National Center for Biotechnology Information (NCBI) Pathogen detection database (https://www.ncbi.nlm.nih.gov/pathogens/isolates#/search/). There were n=1728 isolates retrieved with the search criteria "species_taxid:562" and "geo_loc_name: China*" (E. coli from China) on December 9 th 2019. Among which, 606 isolates collected from food-producing animals (chicken, gallus gallus, pig, swine, duck, ovis aries, bos taurus, et al.) were secected. In addition to 74 genomes originating from our lab, 532 E. coli genomes collected between 1976-2019 were downloaded. The details of these projects involved in this study were listed in Table S6. About half of these isolates (n=243) were collected with no speci c selection criteria, 5 isolates were submitted as virulent isolates, 5 isolates were related to speci c phage, and 279 isolates were involved in AMR research. However, some of these resistant isolates were sequenced by our previous study and the other non-resistant isolates in the same epidemiological investigations have also been added. The large sample size of this collective dataset may account for a potential sampling bias to a certain extent.

Antimicrobial susceptibility testing
The minimum inhibitory concentrations (MICs) of the 450 E. coli isolates were determined using the agar diffusion method according to Clinical and Laboratory Standards Institute (CLSI) document M100, 30 th ed. 59 and The European Committee on Antimicrobial Susceptibility Testing (EUCAST) standard v10.0 60 . Altogether, 16 antimicrobial agents from nine classes were selected based on their usage in the animal industry and/or in human medicine, and included gentamicin, apramycin, tetracycline, tigecycline, nor oxacin, orfenicol, chloramphenicol, ampicillin, amoxicillin/clavulanic acid, meropenem, cefazolin, cefoxitin, ceftiofur, trimethoprim/sulfamethoxazole, aztreonam and colistin.

Phylogenetic analysis
The full set of 982 genomes were used to generate a core-genome SNP alignment and construct a phylogenetic tree, using Parsnp v1.1.2 in the Harvest package 61 . The mid-point rooted phylogenetic tree was annotated in ITOL (https://itol.embl.de/). The Bayesian model-based population structures were identi ed with HierBAPS 62 v6.0. Following initial phylogenetic analysis, two clades, C1 and C2, were selected for further analysis. A representative isolate from each clade was selected for long-read sequencing as detailed above, yielding high-quality complete chromosome sequences for use as reference genomes. Illumina data sets were aligned against the chromosomes of high-quality complete assemblies using Bowtie v2 63 . Single nucleotide polymorphisms (SNPs) were identi ed, it for all of the sequenced genomes with SAMtools 64 v1.3.1 using RedDog v1beta.11 (https://github.com/katholt/RedDog) pipeline with default settings. For each of the two clades, core SNPs (conserved in >95% genomes) were used for subsequent phylogenetic analysis. Gubbins 65 v2.3.4 was used to identify and lter out putative chromosomal recombination imports within each lineage, before generating recombination-free maximum likelihood phylogenetic trees using FastTree 66 v2.1.10. The minimum spanning tree of MLST was generated in BioNumerics v7.6.

The gene content analysis
Illumina read sets were screened against the E. coli 7-locus multi-locus sequence type (MLST) database 67 , ARGannot 68 database, PlasmidFinder 69 database and virulence factor database (VFDB, http://www.mgc.ac.cn/VFs/) using SRST2 70 v0.2.0 to assign sequence types (STs) and detect acquired AMR genes, plasmid replicons and VGs, respectively, in each genome. All the VGs were grouped according to the information from VFDB. The results were transformed into a binary table in R 71 v3.6.0 to analyse the presence/absence of acquired AMR gene alleles and prevalence of each gene in different isolates grouped by temporal or host information. Contigs in each genome assembly were classi ed as chromosomal or plasmid sequences using mlplasmids 24 v1.0.0 the posterior probability cut-off at 0.5. The resulting contig sets (chromosomal vs. plasmid) were each annotated with Prokka 72 v1.11 and subjected to a pan-genome analysis with Roary 71 v3.6.0, separately. The resulting gene content matrix comprised 982 genomes and was used as input for plotting in R v3.6.3. Gene networks showing the linkage of genes within contigs were generated in Gephi 73 .

Statistical analyses
Statistical analyses were conducted with R v3.6.3. The Pearson correlation analysis was used to determine the correlation between the gross output value of veterinary medicine and gene content, and the pairwise correlation between ARGs, VGs, and plasmid replicons. Differences between the numbers of ARGs/VGs/plasmid replicons per isolate in different decadespeci c dataset (1970s-1980s, 1990s, 2000s, 2010s) were assessed by the Wilcoxon test. Gene prevalence differences between each decade-speci c dataset were assessed with the χ2 test (n≥50) or Fisher's exact test (n<50).
We used the R package Discover (Discrete Independence Statistic Controlling for Observations with Varying Event Rates) v0.9.3, a novel statistical independence test, to identify the pariwise co-occurrence and mutual exclusivity of ARGs and plasmid replicons in the genomes 28 . This package assesses pairwise co-occurrence and mutual exclusivity by counting how many genomes have both of the two genes and comparing this to the number of genmoes expected to have such an overlap by chance if these genes were appeared independent. In addition, mutual exclusion can be detected, and the detection of mutual exclusion would be improved. In short, Discover identi es more signi cant co-occurrences and mutual exclusivities without increasing the false positive rate. Figure 1 The origins of 982 E. coli isolates from 26 Chinese provinces. All isolates were grouped by sampling time and host. The two sets of pie charts show the temporal and host distribution of isolates overall and for each province. Note: The designations employed and the presentation of the material on this map do not imply the expression of any opinion whatsoever on the part of Research Square concerning the legal status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries. This map has been provided by the authors. Temporal changes in AMR. a. Temporal changes in AMR phenotypes (n=450). Different colors represent the different antimicrobial agents. The values were presented in Table S2. b. The distribution of numbers of ARGs per genome in different sampling time groups (n=982). The mean ARG count for each group is noted. The "*" on the right represent pvalues of pairwise groups using Wilcox test. *: p < 0.05; **: p < 0.01; ***: p < 0.001. c. ARG prevalence grouped by sampling time and host (n=982). ARGs that were detected at >5% prevalence in any one of the sampling groups are shown. They were clustered by their resistance of different groups of antimicrobial agents and colored by antimicrobial classes on the left.

Figure 3
Temporal changes in ESBL and PMQR gene pro les amongst the full genome set (n=982). a. Number of ESBL genes per isolate over sampling time. b. Prevalence of ESBL genes over sampling time. c. Number of PMQR genes per isolate over sampling time. d. Prevalence of ESBL(b)/PMQR(d) genes over sampling time. The size of each circle in a and c corresponds to the proportion of isolates (percentage). blaTEM*: specify only ESBL producing blaTEM genes.

Figure 4
The correlation of antimicrobial production and resistance gene content. a. The correlation of gross output value of veterinary medicine and average count of ARGs detected from genomes. b. The correlation of gross output value of veterinary medicine and average count of plasmid replicons from genomes. c. The correlation of gross output value of veterinary medicine and average account of VGs detected from genomes. Co-occurrence of ARGs and plasmid replicons. We de ne the false discovery rate (FDR) threshold as 0.05 in this algorithm; signi cant pairwise values from 1 (blue) to -1 (red) (i.e. co-occurrence and mutual exclusivity) are shown. The blue box labelled "ARG-ARG" shows the pairwise co-occurrence and mutual exclusivity of ARGs, and the red box labelled "plasmidplasmid" shows the pairwise co-occurrence and mutual exclusivity of plasmid replicons. The gold boxes labelled with numbers indicate the co-occurrence and mutual exclusivity between ARGs and plasmid replicons, and these are annotated on the bottom left ARG co-occurrence networks of phenicol, uoroquinolone and polymyxin resistance genes among isolates grouped by sampling time: 1970s-1990s (n=146), 2000s (n=169) and 2010s (n=667). Nodes represent genes and are colored to represent the antimicrobial class for ARGs; aminoglycoside: red; β-lactam: blue; trimethoprim: cyan; phenicol: green; rifampin: pink; uroquinolones: yellow; poymyxins: orange; grey: non-ARG. Unlabelled nodes were annotated as hypothetical protein. Edges are coloured according to the nodes the connect. Edge widths represent connection weight calculated by Roary.

Supplementary Files
This is a list of supplementary les associated with this preprint. Click to download. TableS1S3.xlsx