Whole-genome sequencing reveals putative underlying mechanisms of biocontrol capability of IBFCBF-5

As global food safety and environmental safety problems become increasingly severe, the agricultural sectors of many countries are paying closer attention to the use of biofertilizers and biocontrol agents. Rhizosphere bacteria are a significant source of commonly used biofertilizers and biocontrol agents. This study aims to describe the genome and genomic traits of a biocontrol agent in the genus Bacillus. In this paper, Bacillus amyloliquefaciens strain IBFCBF-5 was isolated and found to inhibit several common oomycete and fungal pathogens: Phytophthora capsici, Sclerotinia sclerotiorum, Colletotrichum gloeosporioides, and Fusarium oxysporum f. sp. cucumerinum. The genome of strain IBFCBF-5 was sequenced, and the assembled genome was 4,338,658 bp, with a G + C content of 46.05%. The IBFCBF-5 genome contains abundant GH, GT, CE, PL, AA, and CBM gene families, whose products can potentially degrade cellulose, hemicellulose, chitin, starch, xylan, peptidoglycan, etc. In addition, 14 lipopeptide and polypeptide antibiotic gene clusters were found in IBFCBF-5, including those coding for the synthesis of several known antifungal and antibacterial compounds: fengycin, bacilysin, bacillibactin, and plantazolicin. Our results show that Bacillus amyloliquefaciens IBFCBF-5 has broad-spectrum antifungal ability and that its genome contains many genes coding for enzymes involved in the synthesis of antimicrobial metabolites.


Background
With international food safety and environmental safety issues becoming increasingly severe, the agricultural sectors in many countries are paying more attention to the use of biofertilizers and biocontrol agents (Ongena and Jacques 2008a). Consequently, plant growth-promoting rhizobacteria (PGPR) and their biologically active compounds that promote plant growth and/or antagonize plant pathogens have received increasing recognition. These potential biofertilizers and biocontrol agents are generally considered more environmentally friendly than chemical fertilizers and pesticides (Bhardwaj et al. 2014; Ali et al. 2014).
Bacillus amyloliquefaciens is closely related to Bacillus methylotrophicus and Bacillus subtilis. It has a short generation time and is resistant to various stresses. Its spores are highly tolerant to high temperature, drying, ultraviolet and ionizing radiation, and many kinds of toxic chemicals. It can also produce several types of antibacterial substances such as lipopeptides, bacteriocins, and antibacterial proteins (Yang et al. 2015). These antibacterial substances belong to different groups and have shown wide antibacterial activities and no known detrimental effects on the environment. These bacteria and the compounds that they produce have been the focus of research in biological control of plant diseases, animal feed processing, and medical research and development in recent years (Jin and Xiao 2018; Planchot and Colonna 2018). The antibacterial substances secreted by Bacillus amyloliquefaciens can be broadly grouped into two types: low molecular weight antibiotics and high molecular weight antibacterial proteins. According to their structural differences, low molecular weight antibiotics can be divided into three categories: surfactin, fengycin, and iturin. Surfactin mainly inhibits the growth of bacteria, viruses, and mycoplasma. Fengycin has a strong inhibitory effect on filamentous fungi. Iturin mainly inhibits the growth of fungi. High molecular weight antibacterial proteins, including chitinases and glucanases, act on fungi primarily by inhibiting mycelial growth and destroying fungal cells. Chitinase and β-1,3-glucanase can degrade the cell wall of pathogenic fungi (Finn et al. 2008; Zhang et al. 2021). Bacteriocins and lipopeptides have been commonly used in medicine, agriculture, and other fields because of their stable physical and chemical properties, broad antibacterial spectrum, and low frequencies of drug resistance, and they have become a focus of bio-pesticide research (Huang et al. 2010).
Communicated by J. Zhao.
Chen Luo and Ai-rong Shen contributed equally to this work.
* Ji-lie Li lijilie12@163.com * Liang-bin Zeng zengliangbin@caas.cn
Here we screened and identified a strain of Bacillus amyloliquefaciens IBFCBF-5 obtained from the rhizosphere soil of a healthy pepper plant. This strain showed vigorous biocontrol activity against a variety of fungal pathogens. To help understand its potential mechanisms of antifungal activities, we sequenced the whole genome of strain IBFCBF-5 and analyzed its potential to produce secondary metabolites related to antimicrobial compounds. Our study provides a basis for the development and utilization of metabolites of B. amyloliquefaciens strain IBFCBF-5 in the future.
For morphological observations, strain IBFCBF-5 was grown on LB (Luria-Bertani) solid medium. The colonies are irregular and sub-radial, off-white, wet, and sticky on the surface without wrinkles. Microscopically, the cells are rod-shaped, and the flagella are visible under scanning electron microscopy (Fig. 2).
Based on a BLAST search of the 16S rRNA gene, strain IBFCBF-5 showed a sequence identity of > 90% with Bacillus (Zhao et al. 2018; Song et al. 2008). The 16S rRNA gene sequences of 14 type strains of different species of Bacillus were extracted from GenBank to construct a phylogenetic tree. The result is shown in Fig. 3. The 16S rRNA gene of strain IBFCBF-5 had the highest similarity with that of Bacillus amyloliquefaciens MPA (accession number: 117946.1). The GenBank accession number for strain IBFCBF-5 was SUB9291579. Combined with morphological, physiological, and biochemical identification results, it was preliminarily determined that strain IBFCBF-5 belongs to Bacillus amyloliquefaciens.

Genome sequencing and sequence assembly of IBFCBF-5
A total of 2,879,651,700 bp of clean data was generated on the third-generation PacBio sequencer, corresponding to a sequencing depth of 663×. For genome assembly, reads shorter than 500 bp were filtered out, and a total of 317,690 reads were retained, with an average length of 9064 bp and an N50 length of 10.5 kb. The genome was assembled into one circular contig with a size of 4,338,658 bp (Fig. 4). The genomic G + C content was 46.05%, and 4546 genes were predicted in total.
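The read statistics above can be cross-checked with simple arithmetic. The following is a minimal sketch, not the pipeline used in this study: it recomputes depth and mean read length from the reported totals and shows how an N50 value is derived from a list of read lengths. The `n50` helper and its toy input are illustrative assumptions.

```python
def n50(lengths):
    """Return the N50: the length L such that reads of length >= L
    together contain at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("n50() needs at least one read length")


# Totals reported in the text.
total_bases = 2_879_651_700   # clean data, bp
n_reads = 317_690             # reads retained after length filtering
genome_size = 4_338_658       # assembled circular contig, bp

depth = total_bases / genome_size          # fold coverage
mean_read_length = total_bases / n_reads   # bp

print(f"depth: {depth:.1f}x, mean read length: {mean_read_length:.0f} bp")
print("toy N50:", n50([2, 3, 4, 5, 6]))    # toy lengths, not real reads
```

Both derived values agree with the reported 663× depth and 9064 bp average read length.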

Taxonomic status and genetic evolution of IBFCBF-5 strains
Based on the results of online ANI (Average Nucleotide Identity) analysis, the ANI values of strain IBFCBF-5 with Bacillus amyloliquefaciens and Bacillus velezensis were between 97.65 and 99.00%. In contrast, the ANI values of strain IBFCBF-5 with Bacillus licheniformis and Bacillus subtilis were less than 78.00% (Table 3). At present, an ANI of about 95% is the accepted threshold for species delineation: genomes sharing less than 95% ANI are regarded as belonging to different species. Combined with results in Fig. 5

Functional annotation of the IBFCBF-5 genome
GO (gene ontology) is a database established by the Gene Ontology Consortium to annotate genes in three ways: Cellular Component, Molecular Function, and Biological Process. As shown in Fig. 6, a total of 3061 genes are annotated, of which 31.7% were in "Cellular Component", followed by membrane and membrane parts, at 23.8 and 17.5% respectively. The most significant proportion of "Molecular Function" is in the "Catalytic Activity" category, accounting for 67.2%. In the "Biological Process", those in the "Metabolic Processes" and "Cellular Processes" accounted for 57 and 46.9%, respectively, with significant overlap between these two groups.
Note: The outermost circle refers to genome size and gene position, with the distance between two bars representing 5 kb; the second circle and the third circle are genes on the positive and negative strands of the genome, and different colors represent different COG functional classifications; the fourth circle is repetitive sequences; the fifth circle is tRNA and rRNA, with blue representing tRNA and purple representing rRNA; the sixth circle is the GC content, where the light yellow part indicates that the GC content of the region is higher than the average GC content of the genome (the higher the peak value, the greater the difference) and the blue part indicates that the GC content of the region is lower than the average GC content of the genome; the innermost circle is GC-skew, where dark gray represents the area with G content greater than C and red represents the area with C content greater than G. The colors on the right represent the annotated functions of the groups of genes (the numerals in parentheses represent the number of genes in each category): A: RNA processing and modification; B: Chromatin structure and dynamics (1); C: Energy production and conversion (179)
There are 25 detoxification genes related to antibiotic antagonism, accounting for 0.8% of the total annotated protein-coding genes.
The sequence data were also analyzed based on the KEGG (Kyoto Encyclopedia of Genes and Genomes) database. As shown in Fig. 7, "Metabolism" accounts for the largest proportion, followed by "Environmental Information Processing". Within the "Metabolism" category, biosynthesis of amino acids accounted for a large proportion, followed by "Carbon Metabolism". Genes for several predicted bacteriostatic substances secreted by Bacillus amyloliquefaciens overlapped with those involved in the synthesis of amino acids.
The results of COG (Cluster of Orthologous Groups of proteins) database analysis showed that a total of 3512 genes were functionally annotated. Among them, the numbers of genes involved in amino acid transport and metabolism, transcription, carbohydrate transport and metabolism, inorganic ion transport and metabolism, cell wall/membrane/envelope biogenesis, and ribosomal structure and biogenesis were 346, 303, 261, 214, 180, and 162, respectively (Fig. 8).
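To put the six largest COG categories in proportion, the counts can be expressed as shares of the 3512 annotated genes. This is a small illustrative calculation using only the numbers given in the text; the variable names are assumptions.

```python
# Gene counts for the six largest COG categories, as reported in the text.
cog_counts = {
    "Amino acid transport and metabolism": 346,
    "Transcription": 303,
    "Carbohydrate transport and metabolism": 261,
    "Inorganic ion transport and metabolism": 214,
    "Cell wall/membrane/envelope biogenesis": 180,
    "Ribosomal structure and biogenesis": 162,
}
total_annotated = 3512  # genes with a COG annotation

# Percentage of COG-annotated genes falling in each category.
shares = {name: 100 * n / total_annotated for name, n in cog_counts.items()}

for name, pct in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {cog_counts[name]} genes ({pct:.1f}%)")
```

Amino acid transport and metabolism, the largest category, accounts for just under 10% of the COG-annotated genes.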

Gene clusters for secondary metabolites
We used the online software antiSMASH to analyze the secondary metabolites of strain IBFCBF-5. A total of 14 secondary metabolite gene clusters were predicted (Table 5). The amino acid sequence similarity between Cluster 2 in strain IBFCBF-5 and the plantazolicin synthesis gene cluster BGC0000569 is 91%. The similarity between the difficidin synthesis gene cluster in strain IBFCBF-5 and the difficidin synthesis gene cluster BGC0000176 is 86%. The similarity between Cluster 6 and the fengycin synthesis gene cluster BGC0001095 is 100%. The similarity between the bacillaene synthesis gene cluster and that in BGC000108 is 100%. The similarity between the macrolactin H synthesis gene cluster and that in BGC0000181 is 100%. The similarity between Cluster 10 and the butirosin synthesis gene cluster BGC0000693 is 7%. The similarity between Cluster 11 and the bacillibactin synthesis gene cluster BGC0000309 is 100%. The similarity between Cluster 13 and the bacilysin synthesis gene cluster BGC0001184 is 100%. Finally, the similarity between Cluster 14 and the surfactin synthesis gene cluster BGC0000433 is 100%.
Five gene clusters of unknown function (Clusters 1, 4, 5, 9, 12) were found, including one phosphonate cluster, one T3PKS (type III PKS) cluster, two terpene clusters, and one non-ribosomal peptide synthetase (NRPS) cluster. These indicate that the strain carries additional gene clusters that may direct the synthesis of new bacteriostatic substances, with potential applications in agriculture and the pharmaceutical industry.
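The similarity values listed above are easier to compare when tabulated. The sketch below simply transcribes the percentages from the text into a small Python structure and ranks them; it is not antiSMASH output, and the `ClusterHit` type and `at_least` helper are illustrative assumptions. Cluster numbers are left as `None` where the text does not state them.

```python
from typing import NamedTuple, Optional


class ClusterHit(NamedTuple):
    cluster: Optional[int]  # cluster number; None where the text gives none
    product: str            # best-matching known compound
    similarity: int         # similarity to the reference cluster, in percent


# Values transcribed from the text above.
hits = [
    ClusterHit(2, "Plantazolicin", 91),
    ClusterHit(None, "Difficidin", 86),
    ClusterHit(6, "Fengycin", 100),
    ClusterHit(None, "Bacillaene", 100),
    ClusterHit(None, "Macrolactin H", 100),
    ClusterHit(10, "Butirosin", 7),
    ClusterHit(11, "Bacillibactin", 100),
    ClusterHit(13, "Bacilysin", 100),
    ClusterHit(14, "Surfactin", 100),
]


def at_least(threshold):
    """Return hits whose similarity is >= threshold, highest first."""
    selected = [h for h in hits if h.similarity >= threshold]
    return sorted(selected, key=lambda h: h.similarity, reverse=True)


for hit in at_least(0):
    print(f"{hit.product}: {hit.similarity}%")
```

A very low value such as the 7% butirosin match suggests that only a small part of the reference cluster is shared, so that hit is weak evidence that the compound is actually produced.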

Discussion
In 1943, the Japanese scientist Fukumoto first discovered Bacillus amyloliquefaciens in soil. Because this group of bacteria could produce liquefying amylase and decompose starch, it was named Bacillus amyloliquefaciens. Many subsequent studies have shown that Bacillus amyloliquefaciens has biological control characteristics, including the abilities to colonize plants, inhibit pathogens, promote plant growth, and induce systemic resistance in plants (Hunter 2009). For example, B. amyloliquefaciens strain LX-11 could limit rice bacterial stripe disease by secreting lipopeptides and surfactants (Hunter 2009); strain LX1 isolated from Hainan saline soil could prevent and control banana wilt by secreting antibacterial proteins; and strain HN06 showed an excellent inhibitory effect on Aspergillus niger and Magnaporthe grisea (Qiu 2015). Bacillus can produce various substances with broad-spectrum antimicrobial activities, including lipopeptide antibiotics, bacteriocins, and antibacterial proteins (Chu et al. 2014).
Here we explored the potential genomic factors affecting the biocontrol ability of strain IBFCBF-5 from the rhizosphere soil of a pepper plant. The in vitro experiments showed that strain IBFCBF-5 had an apparent inhibitory effect on the oomycete and fungal pathogens causing pepper blight, pepper white silk disease, Camellia oleifera anthracnose, and cucumber Fusarium wilt, indicating that the strain has great potential as a biocontrol agent. To help identify the potential biocontrol mechanism of strain IBFCBF-5, we obtained and analyzed its whole genome sequence.
The genome of strain IBFCBF-5 was compared with the GO, KEGG, and COG databases. Comparison with the GO database showed that 67.2% of GO terms belonged to catalytic activity, the largest proportion. Comparison with the KEGG database showed that biosynthesis of amino acids accounted for a large proportion of "Metabolism". Finally, comparison with the COG database showed a high abundance of genes related to transcription, translation, and core metabolism. Together, these features of the IBFCBF-5 genome are similar to those of most other Bacillus strains.
We also compared the IBFCBF-5 genome with the CAZyme database and identified the gene clusters involved in the production of bioactive secondary metabolites. A total of 174 CAZyme family genes were obtained from strain IBFCBF-5, including glycoside hydrolase (23), glycosyltransferase (10), carbohydrate esterase (8), carbohydrate-binding module (9), auxiliary activities (6), and polysaccharide lyase (2), whose products can degrade cellulose, hemicellulose, chitin, starch, xylan, peptidoglycan, etc. The cell wall of most pathogenic fungi is mainly composed of cellulose, glucan, and chitin. The results of the CAZyme analyses suggest that strain IBFCBF-5 has the potential to degrade most components of the fungal cell wall, thus inhibiting fungal spore germination and mycelial growth. Furthermore, we found 14 lipopeptide and polyketide antibiotic gene clusters in the genome of strain IBFCBF-5, including six antibiotic synthesis gene clusters coding for proteins with amino acid similarities > 90% to six known classes of antibiotics (fengycin, bacillaene, macrolactin H, bacillibactin, bacilysin, and plantazolicin). Among these known antibiotics, fengycin has strong antifungal activity and weak bacteriostatic activity (Ongena and Jacques 2008b; Zhi 2017); bacilysin is a dipeptide with extensive inhibitory activity against fungi and bacteria (Tao et al. 2018; Chen and Jia 1992); bacillibactin and plantazolicin both have strong inhibitory activity against bacteria (Chen 2015); and bacillibactin also has strong antifungal activity (Zhang et al. 2020). Moreover, strain IBFCBF-5 contains gene clusters similar to those for macrolactin and bacillaene, indicating that this strain has broad potential to inhibit pathogenic fungi as well as pathogenic bacteria.
We did not perform expression analyses. However, the genes coding for the enzymes that synthesize these compounds, as well as those involved in their export, would all need to be expressed for the antimicrobial metabolites to be synthesized and secreted. This will be the focus of our future research.
Fig. 9 Gene clusters in strain IBFCBF-5 involved in the biosynthesis of nine secondary metabolites related to antimicrobial substances

Conclusion
In this paper, B. amyloliquefaciens strain IBFCBF-5 was isolated and shown to have an inhibitory effect on Phytophthora capsici, Sclerotinia sclerotiorum, Colletotrichum gloeosporioides, and Fusarium oxysporum f. sp. cucumerinum. To explore the biocontrol mechanism of the strain, we obtained its whole genome sequence. The whole genome sequence data confirmed its taxonomic identity and showed that the strain has the potential to degrade most components of the fungal cell wall and to secrete antimicrobial metabolites. The genome sequence also identified putative candidate genes in strain IBFCBF-5 on which further exploration of its biocontrol mechanisms can be targeted.

Strain screening and identification
The rhizosphere bacteria were isolated using protocols described previously (Anith et al. 2003). All strains were inoculated on LB plates and cultured at 30 ℃ for 18 h. A biological microscope (model: BX53, Olympus (China) Co., LTD.) was used to observe colony morphology (at 20 × magnification). All strains were tested for their ability to inhibit the growth of four plant pathogens. Specifically, using the standard culture method (Ren et al. 2012), we grew cultures of four plant fungal and oomycete pathogens: Phytophthora capsici, Sclerotinia sclerotiorum, Colletotrichum gloeosporioides, and Fusarium oxysporum f. sp. cucumerinum. To test for their susceptibility to rhizosphere bacteria, a mycelial block of 5 mm diameter was inoculated in the center of a 9 cm diameter PDA plate, and then strain IBFCBF-5 was spotted at a distance of 2.5 cm from the center of the plate. Each treatment was repeated three times. The control plates had no culture of strain IBFCBF-5. The plates were placed upside down in an incubator at 28 ℃ for seven days. The diameters of the zones of inhibition were measured, and the bacterial isolate with the largest inhibition zone was selected as the candidate strain for follow-up research (Khedher et al. 2015; Anith et al. 2003; Du et al. 2017).
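The selection step described above (three replicates per isolate, largest inhibition zone wins) can be sketched as follows. The isolate names and diameters are hypothetical placeholders chosen only to make the example run; they are not measurements from this study, and `pick_candidate` is an illustrative helper.

```python
from statistics import mean


def pick_candidate(zones_mm):
    """Average the replicate inhibition-zone diameters of each isolate and
    return the isolate with the largest mean, plus all the means."""
    means = {isolate: mean(reps) for isolate, reps in zones_mm.items()}
    best = max(means, key=means.get)
    return best, means


# Hypothetical diameters (mm), three replicates per isolate.
example_zones = {
    "isolate_A": [12.0, 13.5, 12.5],
    "isolate_B": [18.0, 17.0, 19.0],
    "isolate_C": [9.0, 10.0, 8.5],
}

best, means = pick_candidate(example_zones)
print(best, means[best])
```

In practice the comparison would be repeated for each of the four pathogens, and differences between isolates would be checked statistically before a candidate is chosen.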

Extraction of high-quality DNA from strain IBFCBF-5 for genome sequencing
The strain was inoculated in LB liquid medium and cultured aerobically at 30℃ for 15 h, until the optical density value reached 2.697. The cells were collected by centrifugation at a speed of 6500 r/min for 15 min, and the collected bacteria weighed about 5 g. The genomic DNA was extracted using the bacterial DNA kit (E.Z.N.A. Bacterial DNA Kit, Omega Bio-Tek), following the manufacturer's instructions.
The extracted genomic DNA was sent to Beijing Biomarker Biotechnology Co. Ltd. for sequencing. Sequencing was conducted using both the Illumina HiSeq platform and the PacBio Sequel third-generation single-molecule real-time sequencing system. Low-quality reads were filtered through SMRT 2.3.0 (Berlin et al. 2015).

Genome assembly and sequence analysis of strain IBFCBF-5
The genome of strain IBFCBF-5 was assembled using Canu v1.5 software (Koren et al. 2017). The Pilon software (Walker et al. 2014) was used to correct errors in the assembled genome using the Illumina HiSeq sequencing results.
For gene prediction, the RepeatMasker software was used to predict and mask the repetitive sequences of the assembled genome (Tarailo-Graovac and Chen 2009). We used the Prodigal software to predict the protein-coding genes (Hyatt et al. 2010), tRNAscan-SE to predict the transfer RNA (tRNA) genes, and Infernal 1.1 to predict the ribosomal RNA (rRNA) genes and other non-coding RNAs (Lowe and Eddy 1997; Nawrocki and Eddy 2013). Using the predicted protein sequences and the protein sequences in the Swiss-Prot database, we searched for homologous gene sequences in the IBFCBF-5 genome with the software GenBlastA (She et al. 2009), and then used the software GeneWise [41] to find premature stop codons and frameshift mutations in these sequences and thereby identify pseudogenes. Using the predicted genome information, we obtained estimates of repeat sequences, GC content, etc. We used the program Circos to draw the circular genome map (Krzywinski et al. 2009).
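GC content and GC skew, the two sequence statistics plotted on the circular genome map, are straightforward to compute. The following is a minimal sketch in plain Python, assuming the usual definitions, GC% = (G + C) / (A + C + G + T) and skew = (G - C) / (G + C); it is not the code used to produce the map, and the window size and toy sequence are arbitrary.

```python
def gc_content(seq):
    """Percent G + C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return 100 * gc / acgt if acgt else 0.0


def gc_skew(seq):
    """(G - C) / (G + C); positive where G exceeds C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def windows(seq, size, step):
    """Yield (start, subsequence) for full-length sliding windows."""
    for start in range(0, len(seq) - size + 1, step):
        yield start, seq[start:start + size]


toy = "GGCCAATTGGGCATAT"  # toy sequence, not genome data
for start, window in windows(toy, size=8, step=4):
    print(start, f"{gc_content(window):.1f}", f"{gc_skew(window):+.2f}")
```

Applied to the whole assembly, `gc_content` should reproduce the reported genome-wide value, and the per-window values give the deviations from that average that are drawn on the map.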

Analysis of genome evolution of strain IBFCBF-5
The genome-wide average nucleotide identity (ANI) refers to the overall similarity of homologous genes between two genomes. It is generally believed that the ANI value between strains of the same species needs to reach more than 95% (Hui and Zhang 2016). To further determine the taxonomic status of Bacillus strain IBFCBF-5, the online ANI calculator (https://www.ezbiocloud.net/tools/ani) was used to analyze the strain according to its genome sequence. DNA-DNA hybridization (DDH) measures the extent to which DNA strands with complementary base sequences from two genomes form stable double-stranded regions through hydrogen bonds between base pairs. The DDH value of strain IBFCBF-5 was analyzed with the online DDH calculator (http://ggdc.dsmz.de/ggdc.php) to further determine its taxonomic status.
The relationships among strains were analyzed using their 16S rDNA sequences and by comparing with various reference strains to construct a phylogenetic tree. In addition, the previously published whole-genome sequences of strains CC09, XJ5, 2J01, WF02, YP6, S4, ATCC 14580, CW14, SP1, BS49, and ZJU1 were compared with that of our strain IBFCBF-5 to derive the concatenated SNP profiles. The concatenated SNPs were analyzed using the PhyML (version 3.0) software (Gascuel 2010), and the phylogenetic tree was constructed by the maximum likelihood (ML) method.
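The 95% ANI criterion amounts to a simple threshold test. The sketch below applies it to the boundary values reported in the Results (97.65 to 99.00% against B. amyloliquefaciens and B. velezensis genomes, below 78.00% against B. licheniformis and B. subtilis); the function name and the use of `>=` at exactly 95% are assumptions made for illustration.

```python
SPECIES_ANI_THRESHOLD = 95.0  # percent ANI, as stated in the text


def same_species(ani_percent, threshold=SPECIES_ANI_THRESHOLD):
    """True if the ANI value meets the species-level threshold."""
    return ani_percent >= threshold


# Boundary values reported in the Results for strain IBFCBF-5.
reported_ani = {
    "lowest value vs. B. amyloliquefaciens / B. velezensis": 97.65,
    "highest value vs. B. amyloliquefaciens / B. velezensis": 99.00,
    "upper bound vs. B. licheniformis / B. subtilis": 78.00,
}

for label, ani in reported_ani.items():
    verdict = "same species" if same_species(ani) else "different species"
    print(f"{label}: {ani:.2f}% -> {verdict}")
```

Because both B. amyloliquefaciens and B. velezensis comparisons clear the threshold, ANI alone does not separate these two close relatives here, which is why it is combined with DDH and phylogenetic analysis.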
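A concatenated SNP profile keeps only the alignment columns at which the genomes differ. The sketch below illustrates the idea on a toy alignment; it assumes the genomes have already been aligned to equal length, is not the procedure used in this study, and uses hypothetical strain names and sequences.

```python
def snp_profiles(alignment):
    """Return (variable column indices, per-strain concatenated SNP strings).

    `alignment` maps strain name -> aligned sequence of equal length.
    Columns containing gaps or ambiguous bases are skipped.
    """
    names = list(alignment)
    seqs = [alignment[name].upper() for name in names]
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")

    keep = []
    for i in range(length):
        column = {s[i] for s in seqs}
        # A SNP column has only A/C/G/T and more than one distinct base.
        if column <= set("ACGT") and len(column) > 1:
            keep.append(i)

    profiles = {n: "".join(s[i] for i in keep) for n, s in zip(names, seqs)}
    return keep, profiles


# Toy alignment with hypothetical strain names.
toy_alignment = {
    "strain_1": "ACGTACGT",
    "strain_2": "ACGAACGT",
    "strain_3": "ACGTACCT",
}
positions, profiles = snp_profiles(toy_alignment)
print(positions, profiles)
```

The resulting per-strain strings form a compact alignment that can be written out in PHYLIP or FASTA format and passed to a maximum likelihood program such as PhyML.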

Functional annotation of the genome of strain IBFCBF-5
The predicted gene sequences were compared with available databases including COG (Tatusov et al. 2000), KEGG (Kanehisa et al. 2004), Swiss-Prot (Boeckmann et al. 2003), TrEMBL (Deng et al. 2006), Nr (Altschul et al. 1997) and other functional databases to obtain functional gene annotations. Based on the comparison results for the Nr database, the program Blast2GO (Altschul et al. 1997; Conesa et al. 2005) was used to link the functions of the annotated genes to the GO database. Furthermore, the software HMMER (Eddy 1998) was used to identify and annotate Pfam domains based on the Pfam database (Finn et al. 2016). In addition, we conducted COG and KEGG metabolic pathway enrichment analysis, GO functional enrichment analysis, and other gene function annotation analyses.

Comparative analysis of CAZyme database of IBFCBF-5 strains
Based on their functional properties, carbohydrate-active enzymes (CAZymes) are separated into different categories, including glycoside hydrolases, glycosyltransferases, polysaccharide lyases, carbohydrate esterases, auxiliary activities (oxidoreductases), and carbohydrate-binding modules without catalytic activity. Based on the carbohydrate-active enzyme database CAZyme, the functional annotation and analysis of carbohydrate enzyme genes were carried out using HMMER software (Finn et al. 2014, 2016).

Analysis of gene clusters related to secondary metabolites of strain IBFCBF-5
The antiSMASH software (version 5.1.2) was used to analyze and predict the gene clusters related to the synthesis of antagonistic substances in the genome of strain IBFCBF-5. According to the preliminary online prediction results, the corresponding genes related to antibacterial substance synthesis were downloaded and compared one by one to determine whether the antagonistic genes in the IBFCBF-5 genome were deleted or mutated.

Statistical analysis
Statistical analysis was performed using Microsoft Excel 2010 (Microsoft Corporation, Redmond, WA, USA) and DPS 7.05 software (Zhejiang University, Hangzhou, China). Mean values were compared using Duncan's multiple range test with P < 0.05 as the level of significance.
Author contribution statement Zeng L-B, Li J-L, Wei L and Chen Y developed the idea for the study and guided it throughout. Luo C, Ren Y-S and Shen A-R designed and completed this study. All authors analyzed the data and were involved in writing the manuscript. Xu J-P and Luo C revised the English manuscript.
Data availability The data set generated and/or analyzed during the current study can be found in the NCBI repository [https://www.ncbi.nlm.nih.gov/], including a total of 1 sequence, with accession number SUB9291413 (IBFCBF-5). The data will be released in October 2021.

Conflict of interest
The authors declare that they have no competing interests.
Ethical approval Not applicable.
Consent for publication Not applicable.