Paenibacillus arenilitoris sp. nov., isolated from seashore sand and genome mining revealed the biosynthesis potential as antibiotic producer

Strain IB182493T, a marine, aerobic, Gram-stain-negative and motile bacterium, was isolated from seashore sand of the South China Sea. Cells grew optimally at 25–30 °C, pH 7.0–8.0 and with 2–4% NaCl (w/v). Phylogenetic analysis based on 16S rRNA gene sequence comparison revealed that the strain formed a distinct lineage within the genus Paenibacillus and was most closely related to Paenibacillus harenae DSM 16969T (similarity 96.6%) and Paenibacillus alkaliterrae DSM 17040T (similarity 96.1%). The chemotaxonomic characteristics of strain IB182493T included MK-7 as the predominant isoprenoid quinone, anteiso-C15:0 and iso-C16:0 as the major cellular fatty acids and meso-diaminopimelic acid as the diagnostic diamino acid in the cell wall peptidoglycan. The polar lipids consisted of phosphatidylethanolamine, phosphatidylglycerol, diphosphatidylglycerol and two unidentified phospholipids. The DNA G + C content of strain IB182493T was 56.2%. The whole-genome average nucleotide identity (ANI) and digital DNA-DNA hybridization (dDDH) values between the isolate and the closely related type strains were no higher than 84.7% and 23.6%, respectively. On the basis of phenotypic and chemotaxonomic properties, phylogenetic distinctiveness and genomic data, strain IB182493T (= MCCC 1K04626T = JCM 34215T) represents a novel species of the genus Paenibacillus, for which the name Paenibacillus arenilitoris sp. nov. is proposed.


Introduction
The genus Paenibacillus, a member of the family Paenibacillaceae (De Vos et al. 2009), was created for rRNA group 3 bacilli on the basis of 16S rRNA gene sequence analysis (Ash et al. 1993). Most members of the genus Paenibacillus are non-pigmented, motile by means of peritrichous flagella, and contain meso-diaminopimelic acid as the major diamino acid in the cell wall peptidoglycan and menaquinone 7 (MK-7) as the major menaquinone (Priest 2009). At the time of writing, the genus comprised more than 270 species with validly published names (https://lpsn.dsmz.de/genus/paenibacillus). Species of the genus Paenibacillus are widely distributed in various ecological niches, and many are relevant to humans, animals, plants and the environment (Grady et al. 2016). Recently, many new species of this genus have been isolated from various habitats, including soil (Kim et al. 2021; Kämpfer et al. 2021; Klm et al. 2021; Yang et al. 2021a), Arabidopsis thaliana (Qi et al. 2021), soybean nodules (Wang et al. 2021), seawater (Chen et al. 2021), corridor air (Liu et al. 2021) and a salt lake (Yang et al. 2021b). Paenibacillus species play an important role in industrial, agricultural and medical applications, such as degrading starch granules (Vander Maarel et al. 2000), enhancing plant growth through phosphate solubilization and nitrogen fixation (Lee et al. 2011; Jin et al. 2011) and producing antibiotics (Chung et al. 2000; Romanenko et al. 2013).
Natural products and their derivatives account for 50% of approved drugs worldwide (Newman and Cragg 2020). In the search for pharmaceutical leads, a very large number of studies have examined the secondary metabolites of terrestrial plants and microorganisms. As a result, the number of novel natural products discovered has reached a plateau, and finding compounds with novel skeletal structures has become extremely difficult over recent decades (Pye et al. 2017). Marine environments cover more than 70% of the surface of the earth and are the habitat of diverse microorganisms. Marine microorganisms are rich sources of bioactive natural products. Marine natural products have been a relatively productive source of drug leads (Pereira 2019; Khalifa 2019), including the anti-cancer drugs trabectedin (discovered from the marine tunicate Ecteinascidia turbinata) and eribulin mesylate (a synthetic mimic of halichondrin B, which was isolated from the marine sponge Halichondria okadai) (Pereira 2019). It has become clear that the identification of new antimicrobial compounds is closely linked to the discovery of novel species (Thumar et al. 2010). Thus, mining microorganisms from various habitats is considered an advantageous approach to discovering novel antibiotics (Baumann et al. 2014). In this paper, we describe strain IB182493T, which has the potential to produce compounds with various biological activities, such as receptor antagonist, enzyme inhibitor, anti-tumor metastasis and antibacterial activities (Wilson et al. 2003; Knappe et al. 2010; Iwatsuki et al. 2006). The purpose of the present study was to establish the taxonomic position of the novel Paenibacillus-like strain IB182493T using a polyphasic taxonomic approach.

Collection and microbial isolation
A seashore sand sample was collected from Zhaoshu Island (16°58′53.3′′ N, 112°16′33.6′′ E), Hainan province, China, in March 2018. Sterilized PBS was used to suspend the sand sample and to prepare serial dilutions. A 100 μL aliquot of the suspension was spread on marine agar 2216 (MA, Hopebio). After 5 days of incubation at 28 °C, colonies with different morphologies were picked and purified. Among these isolates, strain IB182493T was obtained and identified. The strains used in this study were sub-cultured on MA at 30 °C and stored at −80 °C in marine broth 2216 (MB, Hopebio) containing 20% glycerol (v/v).

Genome features and phylogenomic analysis
Draft genome sequencing of strain IB182493T and P. alkaliterrae DSM 17040T was conducted on an Illumina HiSeq 2500 platform by Biomarker Technologies Co., Ltd. (Beijing, China). De novo genome assembly was performed using SPAdes 3.5.0 (Bankevich et al. 2012). The G + C content was determined from the draft genome sequence using the RAST server (Brettin et al. 2015). The genome of P. harenae DSM 16969T (NCBI accession number AULV01000000) was retrieved from the NCBI database.
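As a rough illustration of what a G + C content value represents, the sketch below computes the percentage of G and C bases across a set of contig sequences. It is not the RAST procedure used in this study; the function name `gc_content` and the toy contigs are hypothetical and chosen only for illustration, and ambiguous bases such as N are assumed to be excluded from the denominator.

```python
def gc_content(contigs):
    """Return the G+C percentage over all contigs.

    Ambiguous bases (e.g. N) are ignored, which is an assumption of
    this sketch rather than a documented behaviour of any tool.
    """
    gc = at = 0
    for seq in contigs:
        s = seq.upper()
        gc += s.count("G") + s.count("C")
        at += s.count("A") + s.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("no unambiguous bases found")
    return 100.0 * gc / total


# Hypothetical toy contigs, not taken from the IB182493T assembly.
toy_contigs = ["GGCCATAT", "GCGCNNAT"]
toy_gc = gc_content(toy_contigs)
print(f"G+C content: {toy_gc:.1f}%")
```

For a real draft assembly, the same function could be applied to the contig sequences read from the assembler's FASTA output.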
The phylogenomic tree of strain IB182493T and related species based on whole-genome sequences was constructed using the Type (Strain) Genome Server (http://tygs.dsmz.de/) (Meier-Kolthoff et al. 2019). The draft genome of IB182493T was annotated using KEGG and COG analyses for gene function prediction (Kanehisa et al. 2016; Tatusov et al. 2003). The genomes of strains IB182493T and P. alkaliterrae DSM 17040T were deposited at GenBank/EMBL/DDBJ under the accession numbers JACXIY000000000 and JAKGAP000000000, respectively.

16S rRNA phylogeny
Genomic DNA extraction and PCR amplification of the 16S rRNA gene were performed as described previously (Chen et al. 2021). The determined 16S rRNA gene sequence was compared with those of other type strains using the EzBioCloud database (https://www.ezbiocloud.net/identify). Phylogenetic reconstructions based on 16S rRNA gene sequences were performed using MEGA 11 (Tamura et al. 2021). Neighbour-joining (NJ) (Saitou et al. 1987), maximum-parsimony (MP) (Fitch et al. 1971) and maximum-likelihood (ML) (Felsenstein et al. 1981) methods with 1000 bootstrap replicates were used to reconstruct phylogenetic trees. Evolutionary distances were calculated using Kimura's two-parameter model (Kimura et al. 1980). Bacillus subtilis NCIB 3610T was included in the phylogenetic trees as an outgroup.
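To make the distance model concrete: under Kimura's two-parameter model, the distance between two aligned sequences is d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)], where P is the proportion of sites showing a transition (A↔G or C↔T) and Q the proportion showing a transversion. The sketch below is a minimal stand-alone implementation and not the MEGA 11 code; the function name `k2p_distance` and the example sequences are hypothetical, and skipping sites with gaps or ambiguous bases is an assumption of this sketch.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned sequences.

    Sites containing a gap or an ambiguous base in either sequence
    are skipped (an assumption of this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared    # proportion of transitions (P)
    q = transversions / compared  # proportion of transversions (Q)
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P model")
    return -0.5 * math.log(term1 * math.sqrt(term2))


# Hypothetical 10-bp alignment with a single C->T transition.
example_distance = k2p_distance("ACGTACGTAC", "ACGTACGTAT")
print(round(example_distance, 4))
```

With one transition in ten sites, P = 0.1 and Q = 0, so d = −½ ln(0.8), slightly more than the raw proportion of differing sites because the model corrects for multiple substitutions at the same site.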

Phenotypic characterization
Gram staining of strain IB182493T was performed using a Gram-staining kit (Solarbio), and endospores were stained according to the Schaeffer-Fulton method (Smibert and Krieg 1994). Cell morphology was observed under a light microscope (Leica DM6000B, ×1000 magnification). Images of exponential-phase cells grown on MA were obtained with a transmission electron microscope (HT7700, Hitachi, Japan) at an accelerating voltage of 100 kV. Motility was determined by observing the spread of growth in MB as described previously (Chen et al. 2021). Growth under anaerobic conditions was determined on MA for 7 days at 28 °C using anaerobic jars containing AnaeroPack-CO2 bags (Mitsubishi). Cultural characteristics of strain IB182493T were investigated on MA, yeast-malt extract agar (ISP2), oatmeal agar (ISP3), Reasoner's 2A agar (R2A), nutrient agar (NA), tryptic soy agar (TSA), Gause's agar and potato dextrose agar (PDA); all media were adjusted to 2% (w/v) NaCl.
Growth at different salinities was tested in the presence of 0–20% (w/v) NaCl in modified nutrient broth (peptone 5 g L⁻¹, beef extract 3 g L⁻¹) at 28 °C for 7 days.
Catalase activity was evaluated with a 3% (v/v) hydrogen peroxide solution. Acid production from carbon sources, enzyme activities and utilization of sole carbon sources were determined using API 50CH, API ZYM and API 20NE test strips (bioMérieux, France), respectively, according to the manufacturer's recommendations, except that the AUX medium was adjusted to 2% (w/v) NaCl. All strips were incubated at 30 °C and read after 24 and 48 h.

Chemotaxonomy
For fatty acid analysis, cell mass of strain IB182493T, P. harenae DSM 16969T and P. alkaliterrae DSM 17040T was harvested from TSA (Hopebio, China) plates after incubation for 48 h at 28 °C. The whole-cell fatty acids were then extracted, methylated and analyzed using the standard protocol of the Microbial Identification System (Sherlock software version 6.3; MIDI library: RTSBA6) as described by Sasser et al. (2001). The respiratory quinones were extracted and analyzed by reversed-phase HPLC as described by Komagata and Suzuki (1988). Polar lipids were extracted and analyzed by two-dimensional TLC according to the protocols of Minnikin et al. (1984). The amino acid composition of the peptidoglycan was determined using the method described by Schumann (2011).

Results and discussion
The draft genome of strain IB182493T comprised 86 contigs with a total size of 7.06 Mbp. Ten rRNA genes (two 5S, five 16S and three 23S) and 70 tRNA genes were detected. The general features of the genome of strain IB182493T are listed in Table S1. The genomic DNA G + C content of the isolate was 56.2%, within the range of 40–59% reported for the genus Paenibacillus (Priest 2015) but higher than those of P. harenae DSM 16969T (49.9%) and P. alkaliterrae DSM 17040T (49.4%). The distribution of the genes into clusters of orthologous groups (COGs) functional categories is presented in Fig. S1. In the phylogenomic tree (Fig. S2), strain IB182493T clustered with P. harenae DSM 16969T. The ANIb, ANIm and OrthoANIu values between strain IB182493T and the closely related type strains ranged from 69.1–78.6%, 82.9–84.7% and 71.8–80.1%, respectively, while dDDH values ranged from 18.5–23.6% (Table S2). All of these values are well below the thresholds generally accepted for bacterial species demarcation (95–96% for ANI and 70% for dDDH; Richter and Rosselló-Móra 2009; Chun et al. 2018) and support the conclusion that IB182493T represents a novel species within the genus Paenibacillus.
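The comparison against species-demarcation thresholds can be written out explicitly. The sketch below checks the highest values reported above against the commonly used cut-offs of 95–96% ANI and 70% dDDH; the variable and function names are hypothetical, and applying the lower ANI bound (95%) to all three ANI variants is a simplifying assumption of this sketch.

```python
# Commonly used species-demarcation cut-offs (assumed here:
# the lower bound of the 95-96 % ANI range, and 70 % dDDH).
ANI_SPECIES_THRESHOLD = 95.0
DDDH_SPECIES_THRESHOLD = 70.0

# Highest values between strain IB182493T and the related
# type strains, as reported in the text (Table S2).
reported_max = {
    "ANIb": 78.6,
    "ANIm": 84.7,
    "OrthoANIu": 80.1,
    "dDDH": 23.6,
}


def below_species_threshold(metric, value):
    """Return True if the value falls below the cut-off for its metric."""
    threshold = DDDH_SPECIES_THRESHOLD if metric == "dDDH" else ANI_SPECIES_THRESHOLD
    return value < threshold


supports_novel_species = all(
    below_species_threshold(metric, value) for metric, value in reported_max.items()
)
print(supports_novel_species)
```

Because even the highest pairwise values fall below the cut-offs, every pairwise comparison in the reported ranges does as well.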
The 16S rRNA gene sequence obtained for strain IB182493T was 1470 bp long and was deposited under the GenBank/EMBL/DDBJ accession number MK249696. According to the EzBioCloud database, strain IB182493T was a member of the genus Paenibacillus and showed the highest 16S rRNA gene sequence similarity to P. harenae DSM 16969T (96.6%), P. alkaliterrae DSM 17040T (96.1%), P. agarexedens DSM 1327T (96.1%) and P. agaridevorans DSM 1355T (96.1%). Phylogenetic analyses showed that the isolate formed a discrete cluster with P. harenae DSM 16969T and P. alkaliterrae DSM 17040T, and the other tree-building methods gave similar results (Figs. S3 and S4, available in the online version of this article). Based on 16S rRNA gene sequence similarities and phylogenetic analyses, the most closely related species, P. harenae DSM 16969T and P. alkaliterrae DSM 17040T, were used as reference strains and examined for their genomic and phenotypic characteristics in comparison with those of the new isolate. Both reference strains were obtained from the Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSMZ).
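For readers unfamiliar with how a pairwise similarity percentage is derived, the sketch below computes percent identity between two aligned sequences. It is not the EzBioCloud algorithm; the function name `percent_identity` and the toy alignment are hypothetical, and excluding gapped columns from the comparison is an assumption of this sketch. As a rough guide, a similarity of 96.6% would correspond to about 50 differing positions if the full 1470 bp were compared, well below the 16S rRNA similarity level of roughly 98.65% that is often cited as a species boundary.

```python
def percent_identity(aligned1, aligned2):
    """Percent identity between two aligned sequences.

    Columns with a gap ('-') in either sequence are excluded,
    which is an assumption of this sketch.
    """
    if len(aligned1) != len(aligned2):
        raise ValueError("sequences must be aligned to the same length")
    matches = compared = 0
    for a, b in zip(aligned1.upper(), aligned2.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical toy alignment: one gapped column and one mismatch.
toy_identity = percent_identity("ACGT-ACGTA", "ACGTTACGTC")
print(f"{toy_identity:.1f}%")
```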
The fatty acid profile of the novel isolate was similar to those of the reference strains (Table S3).
Table fragment (three columns; the first column matches the values given for strain IB182493T). pH range: 5.0–10.0, 6.0–10.0, 6.0–9.5; pH optimum: 7.0–8.0, 7.0–7.5, 7.5–8.0; NaCl range (%, w/v): 0–9, 0–3, 0–2; NaCl optimum: 2–4, 0–2, 0–1.
Based on the results of gene cluster prediction, the genome of strain IB182493T contained 9 secondary metabolite gene clusters (Table S4), whereas 8 clusters were predicted in P. harenae DSM 16969T and P. alkaliterrae DSM 17040T. Genome mining revealed that the novel strain has the potential to produce many secondary metabolites, including the lasso peptide paeninodin, ectoine, basiliskamide A/B and staphylobactin (Table S4). Lasso peptides are a class of natural products with a highly compact and stable structure and show various biological activities, such as anti-tumor metastasis, receptor antagonist, enzyme inhibitor and antibacterial activities (Wilson et al. 2003; Knappe et al. 2010; Iwatsuki et al. 2006). Lasso peptides are also reported to be non-pathogenic and highly resistant to high temperature, acidic conditions and most proteases, and may therefore serve as multifunctional scaffolds for further medical use (Knappe et al. 2011; Hegemann et al. 2014; Meyer et al. 2006). In the genome of strain IB182493T, a paeninodin biosynthetic gene cluster with 100% similarity to that of P. dendritiformis C454 (Sirota-Madi et al. 2012) was found. The paeninodin cluster includes a gene encoding a kinase that represents a new class of lasso peptide tailoring kinases. Using a wide variety of peptide substrates, this type of kinase was shown to specifically phosphorylate the C-terminal serine residue while ignoring serine residues located elsewhere (Zhu et al. 2016).
An ectoine gene cluster with 75% similarity to that of Streptomyces anulatus (Beijerinck 1912) was found in the genome of strain IB182493T. The isolate and the closely related species (P. harenae DSM 16969T and P. alkaliterrae DSM 17040T) all contain an ectoine gene cluster and were isolated from similarly harsh habitats, such as sand and soil, that are dry, saline or alkaline. This is consistent with an association between ectoine and extreme environments; Brown (1976) reported that ectoine is essential for extremophiles to survive in such environments. In addition, a basiliskamide A/B biosynthetic gene cluster with 9% similarity to that of Brevibacillus laterosporus PE36 (Theodore et al. 2014) was found in the genome of strain IB182493T. Despite the relatively close relationship among the strains, no basiliskamide A/B biosynthetic gene clusters were detected in the genome sequences of P. harenae DSM 16969T and P. alkaliterrae DSM 17040T.
In conclusion, the draft genome of strain IB182493T will support further studies on the biosynthesis of diverse secondary metabolites and their regulatory mechanisms. Based on phenotypic, phylogenetic and genomic analyses, strain IB182493T is considered to represent the type strain of a novel species, for which the name Paenibacillus arenilitoris is proposed.