Functional selection associated with microbial genome size under polycyclic aromatic hydrocarbons (PAHs) pollution stress in soils

Bacteria can adapt to complex environmental conditions by expanding or reducing their genomes. Microbes with large genomes carry a more diverse gene pool and thus have more metabolic function than those with small genomes. Various environmental disturbances have been investigated and proved to be important in determining the community genome sizes. However the impacts of environmental pollution remain poorly understood, which is a selective factor. Polycyclic aromatic hydrocarbons (PAHs) are ubiquitous pollutants that cause great damage to the natural environment and health. Here, we conducted incubation experiments under different concentrations to evaluate the impacts of organic pollution stress on the community genome size. Besides, changes of community structure, functions and potential key genes in different concentrations were illustrated by 16S rRNA amplicon gene sequencing and quantitative-PCR.


Results
We found the distinct communities of signi cantly larger genome size with the increase of PAHs concentration gradients in soils. Highly contaminated soil notably enhanced functions relative to xenobiotics biodegradation and metabolism, environmental information processing and signaling molecules and interaction. Abundance of Proteobacteria and Deinococcus-Thermus with relatively larger genomes increased along with PAHs stress and well adapted to polluted environments, while abundance of Patescibacteria with a highly streamlined and smaller genome decreased. Co-ccurrence network analysis indicated a decreased average clustering coe cient but an increased connection number per node positively related to nidA gene along the concentration gradients. Moreover, nidA gene, appearing as an indicator, showed positive correlation with community genome size and PAHs concentrations.

Conclusions
Our ndings demonstrate the signi cant impact of pollution stress on genome size and clarify that the increase in genome size is due to the directional selection for microorganisms with larger genomes or carrying speci c functional genes and the increased capacity for horizontal transfer of degrading genes between communities.
Background Soils harbor diverse microbiomes on Earth and soil microorganisms are sensitive to environment disturbances [1,2]. The indigenous microbiomes in soils have a capacity for self-adjusting to adapt to changing environmental stresses [3,4]. Polycyclic aromatic hydrocarbons (PAHs) are ubiquitous pollutants that cause great damage to the natural environment and health [5]. Soil is the reservoir and transfer station of PAHs in the environment, bearing the major environmental load of PAHs [6]. PAHs stress makes some microorganisms enhance their ability to mutate and evolve in order to adapt, which led to changes of microorganisms at the genome level [7,8].
In previous studies, some microorganisms have been isolated from PAHs contaminated soil [5,9], and studies of isolated bacteria in single cell level have shown that some bacteria have high GC content and large genomes size [11,12]. In addition, some researchers have demonstrated that PAHs stress altered soil community structure and composition and enriched speci c functional genes (i.e. polycyclic aromatic hydrocarbon ring-hydroxylating dioxygenases (PAH-RHD) encoding genes) [5,10]. However, at the community level, the detailed mechanisms of how the genome responds to environmental stress are not well clear. Especially, microbial communities are better able to survive in diverse environments than individual species because of their diverse metabolic patterns [13,14]. Therefore, it is necessary to elucidate the reply of genomes to pollution stress at the community level.
Mutation is generally random, resulting in an increase or loss of bases [20,21], while horizontal gene transfer brings new genes into the bacterial genome [22,23]. In general, free-living bacteria have a larger genome than most parasitic bacteria, which discard some biosynthetic pathways and genes that could bene t from the host [24,25]. Hence, genome size is important for microbial adaptation to environmental disturbances because it directly affects microbial functions.
Genome size of free-living bacteria is in uenced by many factors, such as niche [7,26] and nutrient availability [27,28], but is not usually determined by replication e ciency [15]. Bacterial niche changes with the change of environmental conditions and the intrinsic ability of bacteria to adapt to the changing environment is related to genome size [27,29]. In recent years, some studies have shown that environmental factors, including poor nutritional conditions [27,30], depth [31], environmental stability [32], high temperature [33,34], alkaline milieu [35] and symbiotic lifestyle [24], affect the microbial genome size to some extent. When exposed to a new environmental stress, the bacterial genome size may be increased by acquiring adaptive gene modules [22,23], or reduced by discarding related genes for additional metabolic burdens [24,34]. That is, the variation in the size of bacterial genomes is largely due to the acquisition and loss of functional helper genes. Generally, bacteria with large genomes size have greater metabolic diversity and are better able to cope with a wide range of environmental conditions [25,32]. In addition, bacteria can also improve their ability to survive by expanding their gene pool [29].
In this study, we harness an experimental evolution approach in order to decipher how the community genome size are affected by the pollution disturbances, and how the community was selected and changed under pollution stress. We tested the potential change mechanisms of genome size based on niche and function. We speculated that the community genome size may increase when the community stable state was broken by environmental pollution and hypothesized that (1) the pollution stress would select for bacteria with large genomes with degrading genes or a wide metabolic range; (2) a potential higher occurrence of horizontal gene transfer between resident species account for the changes of genome size in community level.

Inhibition of abundance and diversity under pyrene stress
To analyze the effect of pyrene in in-situ soils on the total number of microbial community, we measured the residual pyrene concentration and quanti ed the abundance of microbial community. A negative correlation was found between degradation rate and spiked pyrene over the 35-day period. The abundance of total bacterial, archaeal and fungal communities were signi cantly inhibited by pyrene. The bacterial gene copy numbers were lowest in PYR500, indicating some inhibiting effect of high concentration on the bacterial growth (Fig. 1a). There was a statistically signi cant difference in bacterial gene copy numbers between the ve treatments (ANOVA, P < 0.001). The abundance of bacteria in PYR500 was signi cantly lower than in non-contaminated and other spiked-pyrene soils ( Fig. 1a; Tukey HSD post hoc test, P < 0.05). In addition, Archaeal and fungal rRNA gene copy numbers were about 1-2 orders of magnitude lower compared with bacteria (Additional le: Fig. S2). The abundance of archaea and fungi in PYR100 decreased dramatically, with no signi cant difference compared with PYR500 (P > 0.05), suggesting that archaea and fungi were more sensitive to high concentrations of pyrene.
To gure out the in uence of different pyrene stress on species richness and evenness, we calculated different alpha diversity indices. Alpha diversity described the diversity of the community members within a single sample. Alpha diversity values decreased with the increasing concentrations of pyrene, regardless of diversity measure used (Fig. 1b, Additional le: Table S3). The Shannon index of species diversity ranged between 5.67 and 7.05 and differed signi cantly depending on pyrene concentration (P < 0.001). Furthermore, Tukey HSD post hoc comparison identi ed high concentration (PYR100 and PYR500) to have signi cantly lower bacterial diversity compared with control and low concentration treatment (P < 0.05).
Resistance index, an ecologically relevant parameter, was calculated based on α-diversity (Shannon index) to estimate the stability of microbial communities in response to different pyrene stress. Across all experiment treatments, resistance index was altered signi cantly (P < 0.05, Fig. 1c) and decreased with increasing pyrene concentrations. The microbial communities in PYR500 had signi cantly lower resistance compared with other treatments (P < 0.05), except for PYR100 (P > 0.05), which further suggested that 100 mg/kg may be the turning point for the bacterial community to reach another steady state.

Changes of microbial structure and composition
We investigated the community structure and composition of all treatments to determine whether there were differentially representative taxa, which might be conducive to account for the differences in diversity and resistance. Non-metric multidimensional scaling (NMDS) of OTU-based Bray-Curtis dissimilarity distances was measured to visualize the similarities and differences in community composition between different samples. The samples spiked with and without pyrene formed clearly delimited groups and also clustered separately (Fig. 2a). Treatment CK, PYR1 and PYR10 were separately clustered at a certain distance, whereas samples of PYR100 overlapped with PYR500 to a large extent, indicating that there was no/low effect on microbial community structure when the pyrene concentration in soil was more than 100 mg/kg dry soil. The bacterial community structures were signi cantly changed by pollution levels of pyrene based on Bray-Curtis dissimilarity (ANOSIM, R = 0.895, P = 0.001), suggesting pollution levels were a potential predictor of community composition (Additional le: Fig. S3).
A total of 25,260 distinct OTUs were annotated to 42 different phyla, and the samples of experimental treatments were divided into two concentration-level groups according to different community compositions: (1) the low and medium concentration group (LM group; PYR1 and PYR10); (2) the high concentration group (H group; PYR100 and PYR500). Across all samples, Proteobacteria was the most abundant phylum, accounting for 35.8-45.7% of the total effective sequences (Fig. 2a). The other predominant phyla were Firmicutes (9.6-12.9%), Actinobacteria (7.0-11.1%), Acidobacteria (6.9-10.1%), with smaller proportion of Chloro exi (6.7-9.5%) and Bacteroidetes (5.3-8.8%). Among the dominant phyla, only Actinobacteria signi cantly decreased with increasing pyrene concentrations (P < 0.05). Two dominant phyla, Proteobacteria and Deinococcus-Thermus, were identi ed as signi cantly differentially abundant taxa between the LM and H groups, while Patescibacteria showed a drastically decrease. Compared with CK, the overall trend of Firmicutes and Patescibacteria was an increase and Proteobacteria had been slightly inhibited in LM group. In contrast, Proteobacteria was signi cantly enriched in H group compared with CK.
To explain the great differences in microbial composition caused by pyrene stress in soils among all samples, Random Forests analysis with 10-fold cross-validation of 5 repeats was performed at the phylum level. Finally, according to the cross-validation error curve, 12 phyla were de ned as biomarkers to distinguish samples of different concentrations, among which Fibrobacteres (4.25%) and Nitrospirae (2.75%) might contribute most to the community differences under pyrene stress (Fig. 2b). Intriguingly, almost all the biomarkers picked by Random Forest accounted for only 1.5-4.2% of all taxa, most of which had a relative abundance of less than 1%, except Proteobacteria (35.8-45.7%).
In order to investigate whether these signi cant community differences at different pyrene stresses affected soil genomes, an average genome size for each sample was calculated based on the relative abundance of the community at phylum level. A strong positive correlation was found between the logarithm of the concentrations and average genome size ( Fig. 2c; Pearson's r = 0.5164, P = 0.0082). Compared with CK, the soil average genome size of PYR500 increased by 0.43 Mbp.
Linking functional gene and genome size to potential community functions To explore how these differences in biodiversity and community composition affect the functions to change, we analyzed PICRUSt based on KEGG Orthology database to predict the functional composition of soil bacterial communities. A total of 298 level-3 pathways a liating with 7 components in KEGG were predicted, and metabolism accounted for about 53%, followed by genetic information processing (16%). Functional pathways at KEGG level 2 revealed that differences in pyrene stress led to a signi cant shift differences across all samples in 39 functional categories, except "biosynthesis of other secondary metabolites", "Carbohydrate metabolism" and "Membrane transport" (ANOSIM, R = 0.392, P = 0.001). According to the functional hotspot, these changing KEGG modules were divided into two functional groups: internal functions within microorganism (de ned as Self-functional group) and external functions between microorganisms (de ned as Community functional group) (Fig. 3a).
In Self-functional group, the relative abundances of two main categories relating to Metabolism (i.e. "Xenobiotics biodegradation and metabolism" and "Metabolism of other amino acids") and Cellular Processes (i.e. "Cell motility" and "Cell growth and death") were more abundant in samples spiked with pyrene. However, not all categories relating to Metabolism had a consistent trend of increasing. For instance, abundance of "Glycan biosynthesis and metabolism", "Nucleotide metabolism" and "Energy metabolism" were signi cantly lower in Experiment Treatments compared with CK ( Fig. 3a, P < 0.05). The abundance of Environmental information processing KEGG Modules including "Replication and repair" and "Translation" gradually eliminated as the pyrene stress increased. For functional categories of Community functional group, "Signal transduction" and "Signaling molecules and interaction" showed an increasing trend along with increasing pyrene concentrations, suggesting pyrene stress enhanced the ability of microorganisms to communicate with each other.
To further test how the speci c xenobiotics (Pyrene) biodegradation and metabolism function changed at different pyrene stresses, we chose the nidA gene, which was a key dioxygenase gene for pyrene degradation, to evaluate the abundance of pyrene-degrading bacteria. No nidA gene was detected over the course of 35 days in CK. After 35 days of incubation, the nidA gene copy numbers across spikedpyrene samples signi cant changed (P < 0.05) from 6.85 × 10 3 to 8.74 × 10 4 . The number of nidA gene copies increased quickly with pyrene concentrations, reaching the maximum value at PYR100. Thereafter, there was no signi cant decrease in PYR500 compared with PYR100 (P > 0.05). Correlation analysis between unique nidA gene and metabolic functional categories at KEGG level 2 showed that the indicator gene for pyrene metabolism were positively correlated with, metabolism of other amino acids and xenobiotics biodegradation and metabolism ( Fig. 3b; P < 0.05). A signi cant negative correlation was revealed between nidA gene and Nucleotide metabolism ("Sulfur metabolism", "Nitrogen metabolism", "Methane metabolism").
Previous studies have shown that the increase in genome size was proportional to the number of metabolic genes [28]. To test whether this consequence could also be observed in our experiments, we calculated the correlation between genome size and metabolic functions. The average genome size was only positively correlated with "Xenobiotics biodegradation and metabolism" (Fig. 3b; P < 0.05). Focusing on the speci c degradation gene (nidA), we discovered that there was also a signi cant positive correlation with genome size (P < 0.05).

Co-occurrence network under pyrene stress
In order to determine the effect of pyrene on soil microbial ecological links between different species, 5 networks were constructed at the OTU level based on spearman correlation. The numbers of nodes (979-1987) decreased from 1987 to 979 with the increasement of pyrene concentration, and edges between microorganisms reduced from 3720 to 1413. Compared with CK (maximum value, 0.539), the average clustering coe cient in experimental treatments was decreased by 0.011, 0.086, 0.018 and 0.097 (Fig. 4), respectively, which indicate pyrene decreased the soil microbial associations.
To obtain a better understanding of the cooperative relationships of pyrene-degrading consortiums, we performed Spearman correlation analysis to link the unique nidA gene and bacterial taxa pro les. A total of 151 OTUs belongs to 14 phylum in spiked-pyrene samples were signi cantly positively correlated with nidA gene copy numbers (r > 0.8, P < 0.05), which was supposed as the "potential degraders". Proteobacteria and Chloro exi were the most abundant potential degraders, accounting for about 60%. There were 43, 56, 79 and 61 "potential degrader" in experimental treatments. In addition, 142,132,180 and 134 nodes in PYR1, PYR10, PYR100 and PYR500 were associated with the "potential degrader", respectively, which might become the "potential horizontal gene transfer receptors". An average of 3.33, 2.43, 2.61 and 2.36 links in each network was correlated with each potential degrader, with an average of 1.24, 1.18, 0.99 and 1.05 positive correlation links. In general, the addition of pyrene resulted in an increase in the number of OTUs positively correlated with the nidA gene copies, and the same trend was observed for the edges positively correlated with these OTUs. In experimental treatments, OTUs positively correlated with the potential degraders mainly belong to Proteobacteria, Chloro exi, Actinobacteria and Acidobacteria (Additional le: Fig. S4).
Veri cation of genome size and nidA gene in soils of different latitudes In order to further verify the universality of our conclusion that genome size increased under pyrene stress, we selected soils at two other latitudes to conduct microcosmic incubation experiments. The average genome sizes of soils with added contaminants in HZ-PYR500 and GZ-PYR500 were 3.69 ± 0.033 Mb and 3.94 ± 0.038 Mb, increasing by 0.08 and 0.18 Mbp compared with HZ-CK and GZ-CK ( Fig. 4), respectively, which showed a consistent trend with the soil in Beijing. In addition, the nidA gene was also not detected in uncontaminated soil and the average nidA gene copy numbers of HZ-PYR500 and GZ-PYR500 were 2 × 10 5 and 8.2 × 10 5 , respectively (Additional le: Fig. S5).

Discussion
In this study, we found for the rst time that PAHs increased the average genome size of soils. Pyrene, as a selective pressure and a special carbon source, on the one hand, speci cally enriched larger genome microorganisms with stronger adaptability, and on the other hand, enhanced the ability of microbial communities to communicate and potential horizontal gene transfer between species.
Based on previous studies, genome size was affected by many different factors. Firstly, species in stable community tend to have small genomes [7,25]; secondly, shift in community composition and structure can change the community genome size and microorganisms with larger genome size have better adaptation to environmental stress [25,32]; thirdly, strong ability of community communication and horizontal transfer can improve the viability of bacteria in a changing environment [22,29].
For the rst factor, in our work, we discovered signi cant changes in community stable sate under different pyrene stresses. The Shannon and resistance index decreased with the increasing concentration, indicating that different concentrations of pyrene may be used as environmental ltering factors in different degrees to screen microorganisms with speci c niche. Moreover, in the β-diversity analysis, it was found that other treatments except the high-concentration treatment accumulated separately, suggesting that the low-concentration pyrene broke the stable state of the original microbial community, while the high-concentration pyrene made the community enter a new stable state and reshaped the community niche. As reported in previous studies, species living in unchanging niches tend to have small genomes [7,26]. In our study, it was found that pyrene not only led to the change of niche but also caused the increase of average genome size, and this result was con rmed in three soils of different latitudes, indicating that the positive correlation between the genome size and pollutant concentration had a certain universality. Overall, signi cant changes in species diversity and niche led to an increase in community average genome size.
Besides, it is generally believed that microorganism with larger genomes are better able to cope with a wide range of environmental conditions [15,24]. We then focused on the change of community composition under the selection of pollutant. Results showed that among the abundant phyla (> 1%), Proteobacteria and Deinococcus-Thermus were signi cantly enriched but the relative abundance of Actinobacteria and Patescibacteria decreased. We checked the genome size of these species and found that Proteobacteria and Deinococcus-Thermus had genome sizes of 0.  [5,37]. These suggested that Proteobacteria might have their own unique mechanisms for PAHs-tolerance, possibly by carrying genes with speci c functions such as nidA gene. The result that OTU with positive correlation with nidA gene copy numbers mainly assigned to Proteobacteria proved that Proteobacteria had the potential to utilize pyrene as carbon and energy source. The enhancement of degradation capacity further promoted xenobiotics biodegradation and metabolism. The increase in genetic information processing compared with the control treatment may be due to Deinococcus-Thermus, which carried unique carotenoid genes or enzymes. Carotenoids were helpful to protect genetic material from stress and thus made bacteria better adapt to harsh environments [38]. Interestingly, Patescibacteria, with a decreasing relative abundance, had a highly reduced genome size of 1.1 ± 0.2 Mb and signi cantly streamlined cellular activities (cell motility; signal transduction) and metabolic potentials (carbohydrate, protein, and lipid metabolisms; sulfur and nitrogen compound metabolisms) [39]. To some extent, the increasing phyla including Proteobacteria and Deinococcus-Thermus had bigger genome sizes and did well in adaptation to PAHscontaminated soils, while the decreasing phyla containing Patescibacteria owned simpli ed genomes and reduced function related to metabolisms and cellular activities. In a word, the increase of largegenome microbes and the decrease of small-genome microbes eventually resulted in the increase of community genome size.
Last, the increase of Proteobacteria provided more genes in the community to handle with the changes of circumstances. Some microbes can choose other strategies to improve their own adaptation to PAHs selective pressure, in addition to carrying degradation or anti-stress genes [37]. Many previous studies have found that microorganisms in a community have ability to degrade and tolerate xenobiotic compounds through horizontal transfer of mobile genetic elements [40,41]. Our results have shown that the addition of pyrene in uncontaminated soils activates pyrene degradation genes. Furthermore, in the network, it could be discovered that with the increasing concentration, the proportion and connectivity of potential degraders and OTUs connected to them improved. The results suggested that there was a strong link between the microbes. What was interesting was that these OTU belonged mainly to Proteobacteria, Chloro exi and Actinobacteria. Moreover, currently, the microorganisms carrying catabolic mobile components are mainly Proteobacteria and Actinomyces, such as Pseudomonas sp., Sphingomonas sp., and Mycobacterium sp. [37,40,42]. These degradation genes, whether on plasmids or chromosomes, are usually transferred in the form of an insertion machines-elements. In addition, the increase of environmental adaptability, signal transduction and signaling molecules and interaction indirectly proved that there was a strong cooperative function between microbial communities.

Conclusions
Overall, this is the rst time to discover an increase in genome size under PAHs stress and associated with speci c functions. In our study, different PAHs concentrations were found to inhibit microbial growth, alter microbial community structure, and signi cantly increase the copy numbers of nidA gene. Based on the positive correlation among genome size, functional genes and metabolic function, we propose a possible hypothesis for the increasing genome size. As a powerful selective factor, PAHs not only enriched the large genome microorganisms carrying degrading or anti-stress genes with stronger adaptability, but also promoted the horizontal transfer of mobile genetic elements with degrading genes, thus improving the tolerance of the whole community. However, much more follow-up work need to be done, not only more direct evidence is needed to con rm the role of horizontal gene transfer, but also make sure whether there are other functional genes associated with genome size. Moreover, we should investigate whether genome size decreases with the disappearance of pollution stress.

Soil sampling and experimental design
Surface soil sample (from 0 to 20 cm) with unknown history of PAHs contamination was collected from an area (40°23' E and 116°40' N) in Changping District, Beijing, North China. One part of the sample was air-dried at room temperature, homogenized with a 2 mm sieve and stored in the dark at room temperature until the microcosm incubation, and the remaining part was stored in a self-sealing bag at -20°C to measure soil physicochemical properties. The original soil was one type of uvo-aquic soil according to the FAO/UNESCO soil classi cation system and had a pH of 7.9. Additional details on soil characteristics are provided in the supplementary materials (Additional le: Table S1).
In order to better reveal the variation of genome size in soils with different pollution levels, microcosms under laboratory were constructed. All the microcosms were carried out by placing 10 g soils into 120 ml serum bottles which were sealed with rubber stoppers and aluminum caps, and sterile distilled water was added every 2 days to keep the soil moisture content at about 60%. The headspace in these microcosms was replaced daily with sterile ltered synthetic air (20% O 2 , 80% N 2 ), then incubated at 25℃ in the dark.
Five treatments were established with 12 replicates, including one Control Check: non-contaminated soil without Pyrene (denoted as CK) and four Experimental Treatments: contaminated soils with 1, 10, 100 and 500 mg Pyrene (denoted as PYR1, PYR10, PYR100 and PYR500, respectively). The contaminated soils in Experimental Treatments were conducted by following the procedure described by Brinch et al [43]. Brie y, the pyrene solution was mixed well with 25% air-dried soil in a glass container and then the container was placed in a fume hood overnight to let the acetone evaporate. Thereafter, the remainder of the soil (75%) was added to the spiked soil and blended thoroughly. The non-contaminated soil, in the meanwhile, was prepared using the same methods with pure acetone solution. During the whole incubation period, the soil CO 2 emission ux in all treatments were measured every day (Additional le: Fig. S1). When the CO 2 emission rate began to decline, suggesting inadequate carbon sources in soils, all the microcosms were harvests immediately after 35-days incubation and stored at -80℃ for the following analyses. The soil samples were equally divided equally into two parts: one for phenanthrene quanti cation, and one for DNA extraction.

Construction of another two latitudes soil microcosms
To further con rm the results concluded in microcosms of Beijing, uncontaminated soils were collected from two other cities according to different latitudes to construct microcosms. One is Hangzhou (HZ, 30°26' N, 120°19' E) in the middle latitude of China, and the other is Guangzhou (GZ, 23°16' N, 113°23' E) in the low latitude. The HZ and GZ soils had a pH value of 6.25 and 5.46, respectively. The HZ and GZ soils was classi ed as Haplic Ferralsol soil based on the FAO/UNESCO soil classi cation system. Two treatments were performed for each soil, including Control Check (denoted as HZ-CK, GZ-CK) and contaminated soils with 500 mg Pyrene (denoted as HZ-PYR500, GZ-PYR500). The whole incubation process and sampling were consistent with the microcosmos mentioned above.

Pyrene extraction and quanti cation
Residual pyrene concentrations in soils were detected to gure out the different rates of pyrene degradation in each treatment, which was an parameters to determine the reply of resident species to the exposure. Pyrene extractions were performed using the ASE 350 system (Dionex, Thermo Scienti c, USA) by carefully loading freeze-dried soils (8 g) into sample cells containing pelletized diatomaceous earth (2 g). Sample cells were vertically hung in the tray slots in order and extracted twice with dichloromethane/acetone (1:1, v/v). Extract were concentrated to approximately 3 mL, puri ed via solidphase extraction and then quanti ed by gas chromatography coupled to mass spectrometry (GC-MS, Shimadzu, Kyoto, Japan) with a HP-5MS capillary column (5% diphenyl-95% dimethyl polysiloxane, 30 m × 0.25 mm i.d., 0.25 μm lm thickness, Agilent). The analysis procedure of GC-MS was slightly modi ed from the methods of Rombolà et al. [44]. High purity helium was used as a carrier gas with a ow rate of 1 ml/min. The GC oven procedure proceeded as follows: 50℃ hold for 2 min, rise to 180℃ at a rate of 20℃/min, increase at 10℃/min to 290℃, then hold for 10 min. High-throughput sequencing data analysis was performed through the Galaxy pipeline [50] at the Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences (http://159.226.240.74:8080/). In short, each sequence was split by detected barcodes, following by combing forward sequences and reverse sequences. Quality ltered sequences were clustered using UPARSE in Operational Taxonomic Units (OTUs) at 97% similarity level to pick represent sequences and OTUs. A total of 1,374,419 quality-ltered paired-end sequences from samples, ranging from 39,134 to 51,214 sequences, were clustered into 20,641 bacterial OTUs at 97% identity. Then OTU representative sequences were annotated using the RDP classi er with the SILVA database. A resample OTU table at 26,971 reads per sample was created for further statistical analysis based on the minimum value of reads numbers for all samples.

Bioinformatic analysis
The Shannon diversity index (α-diversity) and Non-metric multidimensional scaling (NMDS) analysis based on Bray-Curtis dissimilarity were performed using the Galaxy pipeline based on the resample OTU table. The resistance index [51] was calculated by comparing the α-diversity between spiked-samples and controls in order to evaluate the response stability of the microbial community to pollutant changes. An average genome size was calculated using genomes at each phylum level deposited in IMG and 16S rRNA gene composition following the method of Sorensen et al. [34]. Spearman correlation was used to evaluate correlations between nidA gene copies and the OTU relative abundance using SPSS 25.0 (IBM, USA), following by a network analysis to explore the co-occurrence patterns at http://ieg2.ou.edu/MENA. The metagenomic function of the microbiome was predicted based on 16S rRNA amplicon data through PICRUSt [52]. The output of PICRUSt showed the amount of each functional-gene in each sample by a functional-gene-count matrix.

Statistics analysis
All statistical analyses were carried out in R platform (3.6.2, http://www.r-project.org). The Shapiro-Wilk test and Bartlett test were used to check whether the data conformed to normality and homoscedasticity respectively. One-way ANOVA was used to determine whether there were any statistically signi cant differences (P < 0.05) among treatments in Shannon index and copies. The post-hoc Tukey HSD test were used to check if the relationship between two sets of data was statistically signi cant.

Declarations
Ethics approval and consent to participate Not applicable Consent for publication Not applicable Availability of data and materials Raw sequences data of 16S rRNA high-throughput sequencing are available in NCBI Sequence Read Archive repository database (www.ncbi.nlm.nih.gov/sra) under accession number SRP266388 : PRJNA638003.

Competing interests
The authors declare that they have no competing interests   Network of the bacterial communities at OUT level in soils. The network of (a) Treatment PYR1 (1 mg pyrene/kg dry soil), (b) Treatment PYR10 (10 mg pyrene/kg dry soil), (c) Treatment PYR100 (100 mg pyrene/kg dry soil) and (d) Treatment PYR500 (500 mg pyrene/kg dry soil) were visualized by Gephi. The red nodes represent OTUs signi cantly positively correlated with nidA gene and the green nodes represent OTUs which are positively connected to the red nodes. The size of each node is proportional to its number of connections.