Characteristics of Bacillus sp. S3 with different heavy metal(loid)s
In the previous study, Bacillus sp. S3 showed high tolerance to multiple heavy metal(loid)s [18]. As shown in Fig.1, the SEM images of Bacillus sp. S3 intuitively showed that the cell walls were enveloped by filaments, possibly as a result of the presence of extracellular polymeric substances. Intriguingly, after cultivation of Bacillus sp. S3 for exponential phase under the Sb(III) stress, the cell surface became smoother than control. As shown in Fig. S1, no physical Sb(III) adsorption was detected on Bacillus sp. S3 cell surfaces by EDS analysis, the result is kind of other studies [9]. When the initial concentration was 1mM Pb(II), the peak value of EDS was significant from that of the control group and other treatment groups. The SEM results showed that smaller cell size, lesser wrinkled cell wall and the occurrence of intracellular dissolution in present of heavy metal(loid), implying that the elevated levels of heavy metal(loid) ions might suppress the secretion of extracellular polymers substance and normal metabolism. The smallness of bacterial cells provided a large contact interface, which would facilitate the interaction between heavy metal(loid)s and biosorption process of Bacillus sp. S3. The EDS spectra results revealed that Bacillus sp. S3 might further absorb Pb(II) to a certain extent compared with other heavy metal(loid)s, such as extracellular adsorption and surface complexation [9].
General genome feature of Bacillus sp. S3
The general genomic features of Bacillus sp. S3 were summarized in Table 1 and Fig. 2. The whole genome of Bacillus sp. S3 contained one single circular chromosome of 5,579,638 bp with 40.30% GC content and one plasmid of 241,339 bp with 36.74% GC content (Fig. 2A, B). The whole genome harbored 5,131 protein-coding sequences covering 85.35% of the genome with the average length of 904 bp, as well as 36 rRNAs and 104 tRNAs (Table 1). In addition, the genomic features of Bacillus sp. S3 and the other 44 closely related strains were summarized in Table 2. The average chromosome length of the 45 Bacillus genomes is 4.99 Mb with ranging from 2.2 to 7.08 Mb and the average GC content was 40.31% with ranging from 35.4% to 47.8%, indicating substantial species-to-species or strain-to-strain variations and similar genomic characteristics. Among these investigated strains, Bacillus sp. OxB-1 showed the highest GC content (47.8%), and B. thuringiensis and B. cereus SJ1 showed lower GC contents (35.4%).
Based on BLASTP searches (e value<1e-5), there were 4371 and 2564 CDSs involved in COG database (Fig. S2, Table S4) and GO database, respectively (Fig. S3, Table S5). A high proportion of genes in COG database were assigned to the general function prediction only (R, 10.3%), amino acid transport and metabolism (E, 8.05%), carbohydrate transport and metabolism (G, 6.15%), energy production and conversion (C, 5.33%), transcription (K, 5.06%), and replication, recombination, and repair (L, 4.6%) categories. In addition, the proteins were distributed to GO database in three functional classification, including “molecular function” (2715), “cellular component” (1001) and “biological process” (3615). Compared to other bacteria, enrichment profiles of Bacillus sp. S3 genes assigned to COG functional categories showed an overabundance of genes involved in amino transport and metabolism, carbohydrate transport and metabolism, energy production and conversion. These resistance genes involved in defense and repair mechanisms for dealing with heavy metals propelled by harsh eco-environments, such as nitrate and heavy metals [10]. These resistance genes include ABC antiporters and Cd2+/Zn2+/Co2+ efflux components (CzcABC, CzcD) were probably important for Bacillus sp. S3 in the adaption of specific niche.
In addition, the KEGG pathway database was used to map their corresponding terms on the Bacillus sp. S3 genome. A total of 3691 CDSs were assigned to 180 KEGG pathways (Fig. S4, Table S6), which could enriched in “Metabolism” (637), “Biosynthesis of secondary metabolites” (285), “Microbial metabolism in diverse environments” (215), “Carbon metabolism” (135), “Biosynthesis of amino acids” (134) and “ABC transporters” (122). The arsenite oxidase (AioB) was assigned to metabolic pathway (ko01100), the phosphate-binding protein (PstS), sn-glycerol-3-phosphate-binding periplasmic protein (UgpB) and phosphate import ATP-binding protein (PstB) were assigned to ABC transporters pathway (ko02010), the copper-ion-binding protein CopZ was assigned to mineral absorption pathways (ko04978), these pathways may play an important role in response to the metal(loid) toxicity.
As shown in Fig. S5, we identified 237 carbohydrate-active enzymes (CAZymes) in the genome of Bacillus sp. S3. Predicted CAZymes were classified into 6 classes, encompassing auxiliary activities (AAs, 4), carbohydrate-binding modules (CBM, 63) carbohydrate esterases (CEs, 39), glycoside hydrolases (GHs, 90), glycosyltransferases (GTs, 39), polysaccharide lyases(PLs, 2).To induce inhibition, the heavy metal(loid) ions might non-specifically bind to regions of CAZymes [42].This interactions between heavy metal(loid)s and CAZymes need a great deal of energy (e.g. for pumping out intracellular metal ions by ATPases, or for the strengthened expression of metal(loid) resistance proteins, etc.) through increased conversion of carbohydrates [43]. It implied that these enzymes might play an important role in coping with the metal(loid) ions [42].
Notably, large numbers of heavy metal(loid) resistance genes were found to locate on the chromosome rather than plasmid in Bacillus sp. S3 (Table 3), which could be used for further analysis of genetic diversity and evolutionary. Unlike chromosome, the plasmid was found to contain a large number of hypothetical proteins. The aioAB genes encoding As(III) oxidase were responsible for Sb(III)/As(III) oxidation, which could convert the more toxic Sb(III) to the less toxic Sb(V) in the periplasm [12]. In our study, the aioB gene was only detected in the chromosome of Bacillus sp. S3 but not the other Bacillus genomes, giving Bacillus sp. S3 the capacity to oxidize Sb(III). Moreover, our data showed that cytoplasmic Sb(III) extrusion was the main As/Sb resistance strategy in Bacillus genus. As resistance genes (arsB and arsC) were detected in all comparative Bacillus strains. Three arsB genes encoding Sb(III)/As(III) efflux pump membrane proteins and arsC gene encoding As(V) reductase were located on the chromosome in Bacillus sp. S3. The phosphate (Pi) related genes, such as phoB, pstS, and phoR, were also identified in Bacillus sp. S3, which have been proved to be co-regulated with As(III) oxidation and could be induced by Sb(III) in previous report [44].
Phylogenetic analyses
To evaluate the phylogenetic relationship, we downloaded 44 Bacillus genome sequences (including 16 complete genomes) and their annotations from the NCBI GenBank database (Table 2). The phylogenetic tree based on 16S rRNA gene sequences revealed that Bacillus sp. S3 belonged to Bacillus genus and grouped with other Bacillus sp. L75, Bacillus soil strain G8 and Bacillus drentensis G18 (Fig. 3A). Although 16S rRNA gene has been conventionally been used for assessing bacterial taxonomy and phylogeny, there was still controversy and uncertainty based on only one gene [45]. The phylogenetic trees based on 554 core genes and whole-genome composition vectors (CVs) were also constructed (Fig. 3B, C). In the two phylogenetic trees, Bacillus sp. S3 was most closely related to the Bacillus bataviensis. However, the topologies of the two phylogenetic trees exhibited some differences, suggesting that the flexible genes could be crucial in altering the genome content and shaping the topology of the trees.
ANI is the most widely accepted bioinformatics tool that evaluate the genomic distance and delineate species in evolutionary progress, overcoming the difficulty of conventional deviations caused by evolutionary mutation rate and HGT events [32]. As show in Table 3, the closest ANI values 81.51% between Bacillus sp. S3 and other reference strains considerably lower than threshold value of 95-96% of the boundary for species circumscription [46]. Besides, the value of 70% dDDH was the recommended standard for species delineation of bacteria, corresponding tightly to 95% ANI [47]. The results showed that the dDDH% values of Bacillus sp. S3 against all reference genomes ranging from 12.7% to 34% (Table S7). Thus, the results of ANI and dDDH values indicated that Bacillus sp. S3 represented a novel species.
Core and pan genomes analysis
To clarify the genomic features specific to each Bacillus strain, all genes from tested Bacillus strains were described by MP method in the pan-genome analysis pipeline with a 50% cut-off for protein sequence identity. There were total 39,933 orthologs protein coding sequences in the pan genome for the Bacillus genus (Fig. 4A). Of these orthologs genes, 554 (1.38% of total pan genome) were identified as core conserved genes, and 16,234 were identified as strain-specific genes. Accessory gene number varied from 1,189 to 4,431 (mean 3,693) and the isolated Bacillus sp. S3 had 3,893 accessory gene. Accessory gene is known as indispensable orthologs, the variability of accessory gene indicating the flexibility of genome structure [48]. After comparing strain-specific genes, the variability in the total number of strain-specific genes ranges from 0 to 1560 genes (mean 360). Bacillus sp. OxB-1 had the highest amount of these (n = 1560), reflecting the greatest difference with other tested genomes.
Previous reports stated that a mathematical extrapolation of the pangenome was highly remarkably reliable when sufficient genomes (>5) were involved [49]. The deduced power law regression function [Ps(n) = 5313.92n0.527477] revealed that the pan-genome of Bacillus had a parameter (γ) of 0.527477 (0 < γ < 1), implying a stabilized core structure and an open pan-genome of Bacillus strains (Fig. 4B). Thus, the new orthologs were intuitively observed after addition of more genomes to the group. Furthermore, the extrapolated curve of the core genome presented a steep slope according to the exponential regression [Fc(n) = 1779.45e-0.0387412n] (Fig. 4B). The addition of an extra genome would not significantly alter the size of the core genome due to the numbers of core genes were relatively constant [43].
These core genes shared with all Bacillus genomes also could be classified into different COG categories (Fig. 4C), which was agreed with previous reports that larger prokaryotic genomes tended to pile up genes directly or indirectly involved in different metabolism [50]. The KEGG annotation of 385 specific genes of Bacillus sp. S3 showed that 10 genes involved in the environmental information processing, including one cobalt transport system protein-encoding gene corA and one As(III) efflux pump membrane proteins-encoding gene arsB. Further, small number of strain-specific genes (<40%) were assigned to the COG categories for the Bacillus sp. S3, which were mainly found to be enriched in phosphotransferase ABC-type metal ion transport systems. The result revealed specific adaptive strategies for Bacillus sp. S3 in response to harsh eco-environment.
Mobile gene elements in Bacillus genomes
It is well recognized that bacteria genomes have notable genome plasticity by several elements of HGT events, known as mobile gene elements (GEIs, IS, Prophages) and CRISPRs [51]. The presence of the majority of GEIs in Bacillus sp. S3 and other comparative strains renders clues about the genomic plasticity of these isolates (Table S8). We identified 20/5/15 GEIs in Bacillus sp. S3 through three methods. These GEIs were possibly conducive to the integrated pool of transposase. Analysis of transposable elements showed that numerous insertion sequence (IS) elements were distributed over the genomes and plasmids of Bacillus strains, harboring IS1, IS2, IS3, IS4, IS5, IS21 and IS256. The numbers of insertion sequence (IS) elements could magnify the size of genome, and result in frequently genomic exchange with other community members [51].
As shown in Table S8, all Bacillus genomes could be served as feasible targets of phage infections. A total of 5 intact (100% score) prophage regions were predicted in the genome of Bacillus sp. S3, their information containing size: 11485 bp, 5841 bp, 8630 bp, 8468 bp, 7959 bp; coding sequence: 11, 7, 6, 10, 8; GC content: 42.84%, 38.02%, 35.25%, 36.25%, 33.84% (Figure S6). It is interesting noted that the number of CRISPRs varied between 0 and 15 per strain and CRISPR loci absolutely scattered the chromosome. 4 confirmed CRISPR with 39, 20, 60, 4 spacers were detected in Bacillus sp. S3. Our findings suggested that the Bacillus genus could trigger various defense mechanisms against the invasion of exogenous DNA for maintaining the stability of their genetic architecture during the evolution.
Signatures of HGT events generally could be facilitated by MGEs, such as integration sites, anomalous GC contents, or varied codon usage, suggesting that the potential contributor of the observed genomic rearrangements between the genomes of genus Bacillus [51]. MGEs were key driving forces of genome evolution [11], indicating that high genomic plasticity in Bacillus genus was extended to potential strategies to cope with high metal(loid) ion concentrations of their natural habitats. Genomic island (GEIs), which have been committed to provide antibiotic resistance to the host bacteria, were generally divided into 4 categories based on their function, including resistance island (RIs), virulence genes, metabolic islands, and symbiotic island(SIs) [52]. These islands promoted symbiotic integration of the host with other microorganisms [53]. CRISPR-Cas system is a type of adaptive immunity in bacteria and archaea, which protect them against invading genetic elements [54]. Bacillus sp. S3 might acquire specific functional gene clusters related metal(loid)s from other genera via HGT and genomic reshuffling, which was integrated into its genome and enhanced the adaptability of coping with heavy metal(loid)s.
Comparative genomic analysis of Bacillus genus
Mauve has been used for constructing multiple genome alignments in large-scale evolutionary events such as genome rearrangement, inversion, and other recombination [55]. To prove the extent of genomic shuffling, the whole genome sequence of Bacillus sp. S3 was compared individually with the other 10 complete genomes using mauve with default settings. As shown in Fig. 5, synteny maps of the 10 complete genomes were inspected, and represented large-scale blocks of inversions and several crisscrossing locally collinear blocks (LCBs). 615 LCBs with a minimum weight of 45 were exhibited between Bacillus sp. S3 and B. bataviensis LMG21833, other comparative information was also observed. Mauve analyses showed Bacillus sp. S3 had the highest synteny with B. bataviensis LMG21833 genome, other strains exhibited more number of rearrangement, insertion, and deletions. These results revealed that conserved structural synteny and lack of inversions and rearrangements were observed among members of Bacillus group, suggesting that the large-scale evolutionary events were occurred at the genus level [24, 56]. This was in agreement with genome distance in phylogenetic tree.
The origin and evolution of heavy Sb(III)-related genes
To examine the origin and evolution of Sb(III)-related genes in Bacillus sp. S3, including aioB, arsB and arsC, these genes that contribute to Sb(III) resistance of Bacillus genus were selected. Genome evolution could be driven by the acquisition and loss of genes, which was conducted by HGT, genomic reshuffling and natural selection [57]. Comprehensive analysis of HGT events was performed, since it was difficult to identify via either phylogenetic analysis or deviant G+C content. The deviant G+C content between gene and the genome could be used as a detection method of HGT [46]. Herein, we detected the G+C contents of Sb(III)-related genes and their corresponding genomes in Bacillus strains. It is worth noting that the aioB gene was specific gene of Bacillus sp. S3, whose GC content was significantly higher than that of the genome (44.64 vs. 40.3). As shown in Fig. S7, the average GC contents of the Sb(III)-related genes were different from that of their corresponding genomes in Bacillus strains.
The overall origin and evolution of Sb(III)-related genes in Bacillus genus were inferred by phylogenetic trees with NJ, ML and UPGMA methods. We speculated that the aioB gene in Bacillus sp. S3 derived from the evolution of 2Fe-2S ferredoxin (Fig. S8-10). To further elucidate the evolution of arsB cluster/arsC, phylogenetic trees based on ArsB and ArsC proteins were constructed. The results showed that arsB clusters of Bacillus sp. S3 formed separate 3 groups and a monophyletic clade, suggesting that the arsB cluster/arsC of Bacillus sp. S3 may originate from a common ancestor with B. bataviensis LMG21833 (Fig. S11-16). Bacillus sp. S3 likely obtained arsB cluster from B. bataviensis LMG21833, B. vireti and B. drentensis, since prosperus branch of arsB cluster of Bacillus sp. S3 near the base of the clade (Fig. S11-13). As shown in Fig. S14-16, the arsC gene of Bacillus drentensis may be acquired via HGT events from Rhodococcus qingshengii. The results showed that Bacillus sp. S3 acquired As/Sb resistance genes via HGT. These data suggested that the As/Sb resistance genes were horizontally transferred between the Bacillus sp. S3 and other species. The results were found to be consistent with previous reports, such as bacterial As resistance and detoxification acquired via HGT [58].
Assessment of functionality of heavy metal(loid)s-related genes
Previous reports showed that the CAI was a numerical estimator of gene expression level, due to highly expressed genes in bacteria were prone to magnify stronger codon bias [59]. The CAI value varies from 0 to 1.0, with higher CAI value indicating a higher expression level [60]. In order to indirectly assess the functionality of metal(loid) resistance genes, the appraisal of the strength of natural selection was performed, along with the CAIs of these genes. Putative highly expressed (PHX) genes associated to metal(loid) resistance in the Bacillus genus were inferred, where gerd gene encoding spore germination protein was used as a reference. As shown in Fig. 6A, the CAI values of these above-mentioned metal(loid)s resistance genes were calculated. The cutoff values were indicated with average CAI values of Gerd genes in each species. Our result showed that only about 8% of the metal(loid) resistance genes were predicted to be PHX genes that greater than 0.75, while lots of genes had CAI values range from 0.4 to 0.8. The PHX genes lead to stronger codon bias than those of low expressed level genes, resulting from codon translational selection.
A gene in the node or tip of a given tree was considered under diversifying selection (dN/dS > 1), evolving neutrally (dN/dS ≈ 1) or purifying selection (dN/dS < 1) using the likelihood ratio test after adjusting for multiple testing (P value < 0.1). As shown in Fig. 6B, 96.3% of 10 selected genes associated metal(loid) had a ratio of nonsynonymous substitutions (dN/dS<1), implying that these genes were indispensable factor for the above-mentioned Bacillus species under the pressure of purifying selection. Only an arsC gene from B. bataviensis LMG21833 (dN/dS=1.63), double chrA genes from B. niacin DSM 2923 and B. liceniformis ATCC 14580 (dN/dS=2.08, dN/dS=1.28), a copZ gene from B. firmus NCTC 10335 (dN/dS=1.28), a corA gene from Bacillus sp. LF1 (dN/dS=2.80), double znuB genes from B. soli 15604 and B. bataviensis LMG21833 (dN/dS=2.02, dN/dS=1.1) showed that dN/dS>1, suggesting that they might be under diversifying selection. Furthermore, the lowest dN/dS ratio was remarked for copA gene (average dN/dS=0.08) and the zur gene (average dN/dS=0.09), demonstrating strong purifying selection. Additionally, these genes including arsB from Bacillus mesonae H20-5, chrR from Bacillus mesonae H20-5 and Bacillus sp. S3, copZ from B. thuringiensis 97-27, Bacillus glycinifermentans BGLY and B. oceanisediminis strain Bhandara28, corA from B. firmus strain 14_TX, showed dN/dS ratio=1.0, indicating that selection force had little effect on them. These results showed that metal(loid) resistance genes had universally low dN/dS ratios and high CAI values, indicating that their functions play a role in supporting the growth of Bacillus species in response to harsh environments [60].
Transcriptional expression analysis in Bacillus sp. S3 with or without Sb(III)
To gain the insights into the role of Bacillus sp. S3-specific genes annotated as “arsenate oxidase (aio)” and “protein of resistance function (ars)” in presence of Sb(III), the expression levels of aioB, arsB_123, arsC and psts_1 was investigated by qRT-PCR. Primers used in the study are listed in Table S9, where 16S rRNA gene was used as an internal reference. As shown in Fig. 7, the results showed that the transcriptional expression levels of most of genes were up-regulated induced by Sb(III) except for arsC. Although the exposure of 0.5h Sb(III) decreased the transcriptional expression levels of arsB_123 and arsB cluster, the expression levels of aioB and arsB_123 were up-regulated after 1h, 2h and 4h. The expression levels of aioB gene was 15.8, 4.4, 2.6-fold change higher with 100μM Sb(III) from 1 to 4h compared to uninduced culture. Nevertheless, when the Bacillus sp. S3 was exposed to high Sb(III) concentration (200 and 300 μM), the expression level of aioB gene was up-regulated 2.6, 2.1, 1.3 and 2.5, 2.8, 2.1 fold changes from 1 to 4h. Thus, the expression level of aioB had nothing to do with the concentration and stress time of Sb(III). In addition, the expression levels of arsB_123 (2h and 4h) were remarkably up-regulated. The expression levels of arsB_123 and psts_1 were up-regulated along with increasing the Sb(III) concentration. The increased expression levels of aioB and arsB_123 in the presence of Sb(III) suggested that Sb(III) could stimulate the expression levels of As(III)/Sb(III) resistance genes, which might act synergistically to release the toxicity of Sb(III) in Bacillus sp. S3.
It has been reported that aioA expression was induced by Sb(III) [11]. Our results showed that expression level of aioB was up-regulated from 1 to 4h compared to uninduced cell, indicating that the aioB gene was induced by Sb(III) and played a putative part in oxidizing Sb(III). Nevertheless, 0.5h Sb(III) exposure suggested that the expression level of aioB gene was not up-regulated during the earlier time points. It is noteworthy that the expression level of aioB in higher Sb(III) concentration (200 and 300 μM) notably lower than 100μM Sb(III), indicating that the higher Sb(III) concentration could inhabit aioB expression. These results were basically consistent with the previous reports that the excessive As(III) treatment could inhibit the aioB expression [12]. In addition, the expression level of psts_1 gene involved in phosphate metabolism and co-regulated the aioBA genes was induced by Sb(III), suggesting that Bacillus sp. S3 required the DNA repair and amino acids synthesis processes in response to Sb(III) by enhancing production of Pi and phosphoribosyl pyrophosphate [22].