Salt tolerance and ectoine biosynthesis potential
P. halophilum was selected for its high salt tolerance in our primary screen of the Sfax saltern strain library. The strain grows well on Bennett's and ISP2 agar and forms white spores. To determine its tolerance to environmental salinity, it was grown in modified SW medium supplemented with different concentrations of NaCl; the results are presented in Fig. 1a. No growth was detectable below a salinity of 5% NaCl, a property expected for a halophile. An increase in salinity up to 20% NaCl strongly stimulated growth, but a further increase to 25% impaired it. Thus, P. halophilum not only depends on a considerable salt concentration for growth but can also cope with a broad range of salinities (from 5 to 20% NaCl). To correlate the osmotolerance of the strain with compatible solute production, we assessed the ability of P. halophilum to synthesize ectoines. HPLC analysis showed that, at the growth-optimal salinity of 15% NaCl, the most abundant osmoprotectants were ectoine and hydroxyectoine, which eluted at the same retention times as the corresponding standards (22.609 min and 20.027 min, respectively) (Fig. 1b).
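The peak identification described above rests on matching sample retention times to those of the standards. A minimal Python sketch of that comparison follows; the matching window `tolerance` and the sample peak list are assumptions made for illustration and are not values from this study.

```python
# Retention times (min) of the standards, as reported in the text (Fig. 1b).
STANDARDS = {"ectoine": 22.609, "hydroxyectoine": 20.027}


def assign_peaks(peak_times, standards=STANDARDS, tolerance=0.1):
    """Map each sample peak to the closest standard within `tolerance` minutes.

    `tolerance` is an assumed matching window, not a value from the study.
    Peaks with no standard inside the window are labelled None.
    """
    assignments = {}
    for t in peak_times:
        # Pick the standard whose retention time is nearest to this peak.
        name, ref = min(standards.items(), key=lambda item: abs(item[1] - t))
        assignments[t] = name if abs(ref - t) <= tolerance else None
    return assignments


# Hypothetical chromatogram: two peaks near the standards, one unrelated peak.
sample_peaks = [20.03, 22.61, 15.40]
print(assign_peaks(sample_peaks))
```

Co-elution with a standard supports, but does not by itself prove, the identity of a compound, so the function only reports which standard a peak is compatible with.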
Genome characteristics and phylogeny
Upon sequencing, the chromosome of P. halophilum was assembled from 13 million reads, giving a total length of 3,789,770 bp in 213 contigs with an N50 of 304,586 bp. The GC content of the strain's DNA is 51.5 mol%, as calculated from the whole-genome sequence. The properties and statistics of the genome are summarized in Table 1. The NCBI genome annotation report showed that the genome comprises a total of 3,775 genes, including 3,616 protein-coding genes and 73 RNA genes (57 tRNAs, 12 rRNAs and 4 ncRNAs). A total of 8 retroelements with a size of 2,314 bp, represented by LTR elements comprising seven Ty1/Copia (2002) and one Gypsy/DIRS1, and two DNA transposons (158 bp) were also detected (Table 1). To further resolve the evolutionary relationship of P. halophilum with other thermoactinomycete strains, the whole-genome sequence of the strain was aligned and compared with those of strains selected from the genomes deposited in the NCBI genome sequence database, using Mauve software (http://asap.ahabs.wisc.edu/mauve/). P. halophilum clustered under a separate node with three thermoactinomycetes, Kroppenstedtia eburnea, Planifilum fulgidum and Melghirimyces thermohalophilus, as shown by the phylogenomic tree (Fig. 2). These results confirm a previous study based on the comparison of the 16S rRNA gene sequence of the strain with those of similar species deposited in GenBank (Frikha-Dammak et al. 2016). The genome-scale phylogeny constructed in this work clearly places P. halophilum among taxa of the phylum Firmicutes, within the family Thermoactinomycetaceae, and confirms that P. halophilum represents a new genus and a new species.
Table 1
Features and mobile elements of P. halophilum
Features | P. halophilum
Genome size | 3.78 Mb
GC % | 51.5
Number of contigs | 213
Contig N50 | 304,586 bp
Contig N90 | 5 Mb
Secreted proteins | 3527
Predicted gene clusters | 9
Retroelements | 8
Size of retroelements | 2314 bp
Ty1/Copia (2002) | 7
Gypsy/DIRS1 | 1
DNA transposons | 2
Size of DNA transposons | 158 bp
Comparison with the three closely related species shows that the genome of P. halophilum is similar in size to those of P. fulgidum (3.36 Mb) and K. eburnea (3.53 Mb) and slightly larger than that of M. thermohalophilus (3.19 Mb) (Table S1). The GC content of P. halophilum is comparable to that of M. thermohalophilus (52.9%) and slightly lower than those of P. fulgidum (58.5%) and K. eburnea (54.1%). In this database search, 3,527 proteins were deduced from 3,702 CDSs (coding sequences), more than in K. eburnea (3,360), M. thermohalophilus (3,074) and P. fulgidum (3,223). tRNAscan-SE predicted a total of 57 tRNAs in the genome of strain SMBg3, similar to K. eburnea (57 tRNAs) and M. thermohalophilus (56 tRNAs) but more than in P. fulgidum (54 tRNAs). The rRNA genes present in SMBg3 (5S, 16S, 23S) were also detected in the comparison strains M. thermohalophilus and P. fulgidum, whereas only 5S and 16S were found in K. eburnea (Table S1).
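The assembly statistics reported in this section (total length, N50, N90, GC content) follow directly from the contig sequences. The sketch below shows the standard definitions in Python; the toy contig lengths and sequences are hypothetical and serve only to make the example runnable.

```python
def nx(contig_lengths, fraction=0.5):
    """Return the Nx length: the length of the contig at which the cumulative
    sum of contigs, sorted longest first, first reaches `fraction` of the
    total assembly length (fraction=0.5 gives N50, 0.9 gives N90)."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= fraction * total:
            return length
    raise ValueError("empty assembly")


def gc_percent(sequences):
    """GC content (%) computed over the A, C, G and T bases of all sequences."""
    gc = 0
    acgt = 0
    for seq in sequences:
        s = seq.upper()
        gc += s.count("G") + s.count("C")
        acgt += sum(s.count(base) for base in "ACGT")
    return 100.0 * gc / acgt


# Hypothetical toy assembly (total length 150).
toy_lengths = [50, 40, 30, 20, 10]
print(nx(toy_lengths, 0.5))   # N50 of the toy assembly
print(nx(toy_lengths, 0.9))   # N90 of the toy assembly
print(gc_percent(["GGCCAT", "ATATGC"]))
```

By construction, N90 can never exceed N50, and neither can exceed the total assembly length, which is a quick consistency check for any table of assembly statistics.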
Distribution of genes using Gene Ontology (GO)
Among the 3,775 annotated genes in the P. halophilum genome, 3,702 genes with specific functions were assigned to Gene Ontology Consortium (GOC) terms and classified into 13 functional classes (Fig. 3). Among them, protein synthesis (23%), energy metabolism (11.8%), transcription (5.1%) and amino acid biosynthesis (9.4%) were abundant categories. Cell envelope (3.1%), synthesis of cofactors and prosthetic groups (14.7%), DNA metabolism (7.4%), and synthesis of purines, pyrimidines, nucleosides and nucleotides (7.8%) were further functions assigned. Unclassified genes and genes of unknown function were estimated at 4.9% and 3.6%, respectively. Genes with InterProScan hits were then examined for their distribution within GO functional categories. The distribution across the GO domains "Biological Process" and "Molecular Function", according to generic terms at level 2 in Blast2GO, is illustrated in Fig. 4. The main "Biological Process" groups were single-organism process (23.23%), metabolic process (30.44%), cellular process (27.22%), biological regulation (5.35%), localization (1.5%), response to stimulus (2.78%), and cellular component organization or biogenesis (2.32%) (Fig. 4a). The distribution according to "Molecular Function" revealed genes involved in compound binding (36.46%), catalytic activity (50.17%), transporter activity (5.69%), nucleic acid binding and transcription factor activity (2.66%), structural molecule activity (1.79%), molecular transcription factor activity (0.866%), electron carrier activity (0.835%), antioxidant activity (0.68%) and molecular function regulation (0.06%) (Fig. 4b).
To further distinguish P. halophilum from the three closely related thermoactinomycete species, we ran an EDGAR analysis; the results are shown in Fig. 5. Strain SMBg3 harbors 43 distinct genes that were not found in the other three closely related species. Distinctive genes were also identified in each of the other three species: K. eburnea (21), P. fulgidum (26) and M. thermohalophilus (15) (Fig. 5a). Homology searches for genes encoding known proteins involved in secondary metabolism revealed at least 1,638 common gene clusters associated with the biosynthesis of secondary metabolites shared by the four strains. In total, 2,621 genes coding for secondary metabolite production were found in P. halophilum, 2,614 in K. eburnea, 2,516 in M. thermohalophilus and 2,264 in P. fulgidum (Fig. 5b).
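The counts in Fig. 5 are, in essence, set operations on the gene content of the four genomes: the shared fraction is an intersection, and the genes distinct to one strain are a set difference against all others. The Python sketch below illustrates this logic only; it is not the EDGAR implementation, and the ortholog-group identifiers (`g1`, `g2`, ...) are hypothetical.

```python
from functools import reduce


def core_and_unique(gene_sets):
    """Given {genome: set of ortholog-group IDs}, return the core set shared by
    all genomes and, for each genome, the groups found in no other genome."""
    core = reduce(set.intersection, gene_sets.values())
    unique = {}
    for name, genes in gene_sets.items():
        # Union of the gene content of every other genome.
        others = set().union(*(g for n, g in gene_sets.items() if n != name))
        unique[name] = genes - others
    return core, unique


# Hypothetical toy data; real input would be ortholog assignments per genome.
toy = {
    "P. halophilum": {"g1", "g2", "g3", "g7"},
    "K. eburnea": {"g1", "g2", "g4"},
    "P. fulgidum": {"g1", "g2", "g5"},
    "M. thermohalophilus": {"g1", "g2", "g3"},
}
core, unique = core_and_unique(toy)
print(sorted(core))
print({name: sorted(genes) for name, genes in unique.items()})
```

In the toy data, `g3` is not counted as unique to P. halophilum because it also occurs in M. thermohalophilus, which is the same criterion that separates the strain-specific sectors of a Venn diagram from the pairwise overlaps.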
Genes involved in ectoines biosynthesis and degradation
The second objective of the study was to identify, from the draft genome of P. halophilum, the biosynthetic and catabolic pathways of ectoines under salt stress conditions. While the genome lacks all the genes for ectoine degradation, it contains the complete canonical ectABCD ectoine/hydroxyectoine biosynthetic gene cluster, encoding an L-2,4-diaminobutyric acid acetyltransferase (ectA), a diaminobutyrate-2-oxoglutarate transaminase (ectB), an ectoine synthase (ectC) and a hydroxyectoine synthase (ectD) (Fig. 6). Nucleotide BLAST searches against the NCBI database showed that the ectA, ectB, ectC and ectD genes of P. halophilum share 100% identity with those of Streptomyces chrysomallus. Ectoine/hydroxyectoine biosynthetic gene clusters often contain additional genes involved in the transcriptional regulation of the ect operon (ectR), in the provision of the precursor L-aspartate-β-semialdehyde (L-ASA) from aspartyl phosphate (asd) and of aspartyl phosphate from aspartate (ask_ect), or sometimes even a gene for a mechanosensitive channel (mscS) (Reshetnikov et al. 2011; Widderich et al. 2016; Czech et al. 2018b). The P. halophilum genome lacks the ectR, ask_ect and mscS genes. However, four genes encoding a binding-protein-dependent ABC transporter are located upstream of the ectABCD genes (Fig. 6). A closer analysis of these genes revealed that the encoded proteins are related to those of the functionally characterized ectoine/hydroxyectoine ABC-type uptake system EhuABCD from S. meliloti (Jebbar et al. 2005; Hanekop et al. 2007). To compare the organization of the genomic region encoding ectoine/hydroxyectoine synthesis in P. halophilum with that of the industrial ectoine producer H. elongata, a Mauve analysis was also performed (Fig. 6). The four genes in P. halophilum are organized in a canonical ectABCD cluster, whereas in H. elongata the ectD gene, encoding the hydroxylase for hydroxyectoine synthesis, is located apart from the ectABC cluster. In addition, ehu genes coding for the transport of ectoine/hydroxyectoine were found only in P. halophilum.
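The difference in gene organization described above can be stated as a simple contiguity test on gene order. The following Python sketch is illustrative only: the gene-order lists are simplified, the internal order of the ehu genes is assumed, and `geneX` and `geneY` are hypothetical placeholders for whatever separates ectD from ectABC in H. elongata.

```python
def is_contiguous(gene_order, cluster):
    """True if all genes in `cluster` are present in `gene_order` (a list of
    unique gene names along one replicon) and occupy adjacent positions."""
    if any(gene not in gene_order for gene in cluster):
        return False
    positions = sorted(gene_order.index(gene) for gene in cluster)
    return positions[-1] - positions[0] == len(cluster) - 1


# Simplified, assumed gene orders for illustration.
p_halophilum = ["ehuA", "ehuB", "ehuC", "ehuD", "ectA", "ectB", "ectC", "ectD"]
h_elongata = ["ectA", "ectB", "ectC", "geneX", "geneY", "ectD"]

ect_cluster = ["ectA", "ectB", "ectC", "ectD"]
print(is_contiguous(p_halophilum, ect_cluster))               # canonical ectABCD
print(is_contiguous(h_elongata, ect_cluster))                 # ectD located apart
print(is_contiguous(h_elongata, ["ectA", "ectB", "ectC"]))    # ectABC alone
print(is_contiguous(h_elongata, ["ehuA", "ehuB", "ehuC", "ehuD"]))  # no ehu genes
```

A whole-genome aligner such as Mauve works on sequence rather than gene names, so this test only captures the conclusion drawn from the alignment, not the method itself.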
Central carbon metabolism related to the synthesis of precursors of ectoines
As a next step, we used KEGG analysis to map the central carbon metabolism of P. halophilum and to assess its capacity to provide precursors for ectoine biosynthesis (Czech et al. 2018b; Ma et al. 2020). The genome carries genes for the glycolysis, pentose phosphate (PPP) and tricarboxylic acid (TCA) pathways but lacks genes for the Entner-Doudoroff (ED) pathway (Fig. S1). We also searched for a number of key enzymes, namely pyruvate carboxylase, phosphoenolpyruvate carboxylase and oxaloacetate decarboxylase. These enzymes interconvert pyruvate, phosphoenolpyruvate and oxaloacetate (OAA) and could help sustain high ectoine biosynthetic fluxes through anaplerotic pathways that replenish the OAA needed for the TCA cycle. The genes encoding these enzymes, except oxaloacetate decarboxylase, were identified in the genome of P. halophilum (Table 2).
Table 2
Genomic analysis of P. halophilum genes involved in anaplerotic pathways for replenishing OAA and glutamate
Gene | Protein | ORF number
oad | oxaloacetate decarboxylase | -
pc | pyruvate carboxylase | 2
ppc | phosphoenolpyruvate carboxylase | 2
alaat | alanine aminotransferase | -
ald | alanine dehydrogenase | -
glt | glutamate synthase | 13
gdh | glutamate dehydrogenase | 3
aspC | aspartate aminotransferase | 8
ask | aspartokinase | 1
Regarding nitrogen metabolism, analysis of the ectoine biosynthesis pathways has shown the importance of glutamate and alanine in directing fluxes through the ectoine synthesis pathway (Ono et al. 1999). Thirteen copies of genes for glutamate synthase and 3 copies for glutamate dehydrogenase were identified in the P. halophilum genome, but alanine aminotransferase and L-alanine dehydrogenase were not detected (Table 2). Glutamate synthase and glutamate dehydrogenase are responsible for the reductive transfer of ammonium to 2-ketoglutarate to generate glutamate, which acts as the major ammonium donor in the cell (Magasanik 1982). There were also 8 putative aspartate aminotransferases, which catalyze the reversible transfer of the amino group from glutamate to oxaloacetate, yielding aspartate and 2-ketoglutarate. This is a key enzyme, as it links the TCA cycle with the first enzyme of the ectoine synthesis pathway (aspartokinase). P. halophilum has only one aspartokinase, which catalyzes the formation of aspartyl phosphate, a common metabolic intermediate in the biosynthesis of ectoines and of the aspartate family of amino acids. Together, these results support the view that the genome of P. halophilum harbors the genes needed for a high flux toward ectoines through the metabolic model shown in Fig. 7.
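The argument above amounts to checking, step by step, that at least one gene is available for each reaction leading from the TCA cycle to aspartyl phosphate. A minimal Python sketch of that bookkeeping is given below, using the ORF counts of Table 2 (gene symbols written in lower case). The grouping of enzymes into steps is an illustrative assumption, not the metabolic model of Fig. 7.

```python
# ORF counts transcribed from Table 2; 0 stands for the "-" entries
# (no ORF detected).
TABLE2_ORFS = {
    "oad": 0, "pc": 2, "ppc": 2, "alaat": 0, "ald": 0,
    "glt": 13, "gdh": 3, "aspC": 8, "ask": 1,
}

# Steps named in the text; a step counts as covered if any one of its
# candidate enzymes has an ORF. The step grouping is an assumption.
ROUTE = [
    ("OAA replenishment", ["pc", "ppc"]),
    ("glutamate formation", ["glt", "gdh"]),
    ("OAA + glutamate -> aspartate", ["aspC"]),
    ("aspartate -> aspartyl phosphate", ["ask"]),
]


def missing_steps(route, orfs):
    """Return the names of steps for which no candidate enzyme has an ORF."""
    return [name for name, genes in route
            if not any(orfs.get(gene, 0) > 0 for gene in genes)]


print(missing_steps(ROUTE, TABLE2_ORFS))
# The alanine-related enzymes of Table 2 were not detected:
print(missing_steps([("alanine route", ["alaat", "ald"])], TABLE2_ORFS))
```

Gene presence is a necessary condition only; it says nothing about expression or actual flux, which is why the text frames the genomic evidence as support for the model rather than proof of it.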
Genes involved in diverse secondary metabolites biosynthesis
Another interesting genomic trait of P. halophilum is the presence of several new gene clusters that have low similarity to known clusters. A total of 18 gene clusters involved in secondary metabolism were predicted by antiSMASH, including one NRPS (non-ribosomal peptide synthetase) cluster, one type 3 PKS (polyketide synthase) cluster and two hybrid clusters, namely type 1 PKS-NRPS and NRPS-fatty acid biosynthetic clusters (Table 3). Of the 18 potential biosynthetic clusters, 8 showed some similarity to known biosynthetic gene clusters (BGCs), whereas 10 were orphan BGCs for which no known homologous gene cluster could be identified. Notably, 6 of the known clusters shared similarity with those for the antibacterial compounds colabomycin E, fusaricidin A, zwittermicin A, streptomycin, meilingmycin and mycosubtilin, whereas the other 2 shared similarity with S-layer glycan and ectoine clusters. However, the level of similarity was fairly low in most cases, which suggests that the metabolites of these predicted gene clusters may be novel. Several other secondary metabolites could potentially be produced by P. halophilum: a siderophore encoded by cluster 2, a bacteriocin (cluster 16) and terpenes, whose biosynthesis is predicted for clusters 8 and 9.
Table 3
List of putative secondary metabolite biosynthetic clusters in the P. halophilum genome as predicted by antiSMASH
Cluster | Type | From-To (bp) | Most similar known biosynthetic cluster* | MIBiG BGC-ID
1 | Cf-fatty-acid | 30039-54529 | Colabomycine-E (6%) | BGC0000213_c1
2 | Siderophore | 65324-80929 | - | -
3 | Cf-saccharide | 122667-165299 | S-layer-glycan (9%) | BGC0000796-c1
4 | Cf-fatty acid-Nrps | 187235-255808 | Fusaricidin-A (50%) | BGC00001152-c1
5 | NRPS | 279782-325448 | - | -
6 | Ectoine | 91720-102106 | Ectoine (100%) | BGC0000853-c1
7 | NRPS-T1PKS | 35902744171 | Zwittermycin-A (25%) | BGC0001059-c1
8 | Terpene | 8549-30450 | - | -
9 | Terpene | 275106-295939 | - | -
10 | Cf-putative | 95801-108118 | - | -
11 | Cf-saccharide | 152878-183803 | Streptomycin (3%) | BGC0000717-c1
12 | Cf-putative | 290995-295475 | Mycosubtilin (20%) | BGC0001103-c1
13 | Cf-putative | 107-5001 | - | -
14 | Cf-putative | 81210-92644 | - | -
15 | T3PKS | 221283-262347 | Meilingmycin (2%) | BGC0000093-c1
16 | Cf-putative | 15440-27365 | - | -
17 | Cf-putative | 60137-67649 | - | -
18 | Cf-putative | 127588-142084 | - | -
* The percentage in parentheses indicates the proportion of genes showing similarity to the corresponding known biosynthetic cluster
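The split into clusters with and without a known homolog can be recomputed from Table 3. The Python sketch below transcribes the table by hand; the assignment of the meilingmycin hit to cluster 15 follows the row order of the table and is an assumption of this transcription, and compound names are given in their usual spelling.

```python
# (cluster number, type, most similar known cluster, % of genes similar),
# transcribed from Table 3; None marks clusters without a known homolog.
CLUSTERS = [
    (1, "Cf-fatty-acid", "Colabomycin E", 6),
    (2, "Siderophore", None, None),
    (3, "Cf-saccharide", "S-layer glycan", 9),
    (4, "Cf-fatty acid-NRPS", "Fusaricidin A", 50),
    (5, "NRPS", None, None),
    (6, "Ectoine", "Ectoine", 100),
    (7, "NRPS-T1PKS", "Zwittermicin A", 25),
    (8, "Terpene", None, None),
    (9, "Terpene", None, None),
    (10, "Cf-putative", None, None),
    (11, "Cf-saccharide", "Streptomycin", 3),
    (12, "Cf-putative", "Mycosubtilin", 20),
    (13, "Cf-putative", None, None),
    (14, "Cf-putative", None, None),
    (15, "T3PKS", "Meilingmycin", 2),  # row assignment assumed from table order
    (16, "Cf-putative", None, None),
    (17, "Cf-putative", None, None),
    (18, "Cf-putative", None, None),
]

known = [c for c in CLUSTERS if c[2] is not None]
orphan = [c for c in CLUSTERS if c[2] is None]
print(len(known), "clusters with a known homolog;", len(orphan), "orphan clusters")

# List the known hits from most to least similar.
for number, kind, hit, percent in sorted(known, key=lambda c: c[3], reverse=True):
    print(f"cluster {number:>2} ({kind}): {hit}, {percent}%")
```

Sorting by similarity makes the pattern noted in the text visible at a glance: apart from the ectoine cluster and the fusaricidin-like cluster, every hit covers a quarter of the reference genes or fewer.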