Genome sequencing, assembly and annotation
The B. velezensis YYC strain was propagated overnight in Luria-Bertani broth at 30°C with shaking at 180 r/min. Based on alignments of the 16S ribosomal RNA and housekeeping genes, it was identified as Bacillus velezensis. Genomic DNA was extracted with a bacterial genomic DNA extraction kit (Majorbio Bio-pharm Technology Co., Ltd., Shanghai, China), and the purified DNA was quantified with a TBS-380 fluorometer (Turner Bio Systems Inc., Sunnyvale, CA). The genomic DNA was sequenced on the PacBio RS II Single Molecule Real Time (SMRT) and Illumina platforms. Sequencing yielded 170,436 reads totaling 1,341,760,841 bp, corresponding to a sequencing depth of 337.7×. Low-quality data were removed by quality trimming based on read quality statistics to obtain clean data. The reads were assembled into contigs using Unicycler (version 0.4.7) (Wick et al. 2017). A complete genome was generated by inspecting and completing the final circularization step. Finally, the PacBio assembly was error-corrected using the Illumina reads.
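As a quick consistency check on the sequencing statistics, average depth can be estimated as the total number of sequenced bases divided by the genome size. The sketch below assumes the assembled genome size of 3,973,236 bp reported in the results; the function names are illustrative and are not part of the authors' pipeline.

```python
def sequencing_depth(total_bases: int, genome_size: int) -> float:
    """Average depth = total sequenced bases / genome length."""
    return total_bases / genome_size


def mean_read_length(total_bases: int, n_reads: int) -> float:
    """Mean read length = total sequenced bases / number of reads."""
    return total_bases / n_reads


TOTAL_BASES = 1_341_760_841  # reported total bases
N_READS = 170_436            # reported read count
GENOME_SIZE = 3_973_236      # assumption: genome size taken from the results section

depth = sequencing_depth(TOTAL_BASES, GENOME_SIZE)
read_len = mean_read_length(TOTAL_BASES, N_READS)

print(f"estimated depth: {depth:.1f}x")
print(f"mean read length: {read_len / 1000:.1f} kb")
```

Under these assumptions the estimate agrees with the reported 337.7× depth, and the mean read length comes out at roughly 7.9 kb.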
Protein coding sequences (CDSs) in the B. velezensis YYC genome were predicted with Glimmer (version 3.02) (http://ccb.jhu.edu/software/glimmer/index.shtml) (Delcher et al. 2007) and GeneMarkS (version 4.3) (Besemer et al. 2005). Transfer RNA (tRNA) genes were predicted with tRNAscan-SE (version 2.0) (http://trna.ucsc.edu/software) (Chan et al. 2019), and ribosomal RNA (rRNA) genes were predicted with Barrnap (version 0.8) (https://github.com/tseemann/barrnap). All predicted genes were annotated by aligning their sequences against the Nonredundant (NR), Swiss-Prot, Kyoto Encyclopedia of Genes and Genomes (KEGG) (Kanehisa et al. 2016), Gene Ontology (GO) (Ashburner et al. 2000), Clusters of Orthologous Groups of proteins (COG) (Galperin et al. 2015) and protein families (Pfam) (Finn et al. 2014) databases. Bioactive secondary metabolites were predicted with antiSMASH (version 4.0.2) (Weber et al. 2015).
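Gene predictors such as Barrnap report their results as GFF3 feature tables, so the number of predicted rRNA genes can be tallied by counting feature lines. The following is a minimal sketch, assuming tab-separated GFF3 lines with `rRNA` in the type column and a `Name=` attribute; the example records, contig name and coordinates are hypothetical and are not taken from the YYC genome.

```python
from collections import Counter


def count_rrna_genes(gff_text: str) -> Counter:
    """Count rRNA features in GFF3 text, grouped by their Name attribute."""
    counts = Counter()
    for line in gff_text.splitlines():
        # Skip blank lines and GFF3 header/comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # A GFF3 feature line has exactly nine tab-separated columns.
        if len(fields) != 9 or fields[2] != "rRNA":
            continue
        # Column 9 holds semicolon-separated key=value attributes.
        attrs = dict(
            item.split("=", 1) for item in fields[8].split(";") if "=" in item
        )
        counts[attrs.get("Name", "unknown")] += 1
    return counts


# Hypothetical records for illustration only.
EXAMPLE_GFF = "\n".join([
    "##gff-version 3",
    "contig1\tbarrnap:0.8\trRNA\t9810\t11363\t0\t+\t.\tName=16S_rRNA;product=16S ribosomal RNA",
    "contig1\tbarrnap:0.8\trRNA\t11560\t14487\t0\t+\t.\tName=23S_rRNA;product=23S ribosomal RNA",
    "contig1\tbarrnap:0.8\trRNA\t14600\t14710\t1e-11\t+\t.\tName=5S_rRNA;product=5S ribosomal RNA",
])

counts = count_rrna_genes(EXAMPLE_GFF)
total_rrna = sum(counts.values())
print(dict(counts), total_rrna)
```

Summing the per-type counts gives the total number of rRNA genes, which is the kind of figure reported for the genome.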
General genome features of B. velezensis YYC
Whole-genome sequencing showed that the B. velezensis YYC genome was 3,973,236 bp in size, with an average G + C content of 46.52%. Glimmer predicted 4,034 protein coding sequences (CDSs) with an average gene length of 877.29 bp. In addition, 86 tRNA genes and 27 ribosomal RNA genes were identified in the genome. Alignment against the NR, Swiss-Prot, Pfam, COG, GO and KEGG databases annotated 4,034, 3,533, 3,337, 3,013, 2,668 and 2,163 genes, respectively.
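To put these counts in proportion, the annotated fraction per database and an approximate coding density can be derived from the reported figures. This is a rough sketch: the coding density estimate assumes non-overlapping CDSs and simply multiplies the CDS count by the average gene length, and the function names are illustrative.

```python
GENOME_SIZE = 3_973_236  # bp, reported genome size
N_CDS = 4_034            # reported number of CDSs
MEAN_GENE_LEN = 877.29   # bp, reported average gene length

# Reported number of genes annotated in each database.
ANNOTATED = {
    "NR": 4_034,
    "Swiss-Prot": 3_533,
    "Pfam": 3_337,
    "COG": 3_013,
    "GO": 2_668,
    "KEGG": 2_163,
}


def coding_density(n_cds: int, mean_len: float, genome_size: int) -> float:
    """Approximate fraction of the genome covered by CDSs (overlaps ignored)."""
    return n_cds * mean_len / genome_size


def annotation_rates(annotated: dict, n_cds: int) -> dict:
    """Fraction of predicted CDSs with an annotation in each database."""
    return {db: n / n_cds for db, n in annotated.items()}


density = coding_density(N_CDS, MEAN_GENE_LEN, GENOME_SIZE)
rates = annotation_rates(ANNOTATED, N_CDS)

print(f"approximate coding density: {density:.1%}")
for db, rate in rates.items():
    print(f"{db}: {rate:.1%} of CDSs annotated")
```

Under these assumptions, roughly 89% of the genome is coding, and the annotated fraction ranges from all predicted CDSs in NR down to about 54% in KEGG.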
The KEGG analysis identified a large number of genes assigned to two-component systems (113 genes) and ABC transporters (117 genes). In addition, 69 genes were related to quorum sensing, which is important for cross-kingdom communication (Schikora et al. 2016).