Genome-wide screening of sigma54-regulated sRNAs
Non-coding sRNAs were predicted from the nitrogen-fixing Rhizobium strains by genome-wide computational analysis, using position weight matrices (PWMs) of the conditional sigma factor 54 (the nitrogen-limitation sigma factor) and the improved sRNA scanner program (Raja et al., 2018). sRNA scanner demarcates transcription units (TUs) using consensus sequences of sigma factor binding sites (−35 and −10; SI 2) and rho-independent transcription terminator sequences. The total number of sRNAs predicted from each Rhizobium strain is shown in Fig. 1a, and details of the predicted sRNAs are given in Table 1. Among the Rhizobium strains listed in Table 1, the highest number of sRNAs was predicted from the genome of S. fredii, followed by R. leguminosarum, R. etli, R. tropici and B. japonicum. The genome size and the number of replicons vary among the Rhizobium strains. In most strains, sRNAs were transcribed in roughly equal numbers from the positive and negative strands, whereas in R. leguminosarum they were transcribed mostly from the negative strand (Fig. 1b). In all the Rhizobium strains, most sRNAs were predicted on the chromosome. To identify novel putative sRNAs, the predicted sRNAs were searched against the Rfam and BSRD databases and the conserved homologs were eliminated.
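The PWM-based scanning step can be illustrated with a minimal sketch. It assumes a toy log-odds matrix built from four invented binding-site sequences and a uniform background; the sequences, the threshold and the function names are hypothetical and are not the matrices or code of sRNA scanner.

```python
import math

# Hypothetical aligned binding sites used only to build a toy matrix;
# these are NOT the sigma factor 54 training sequences used by sRNA scanner.
SITES = ["TGGCAC", "TGGCAT", "TGGTAC", "CGGCAC"]
BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Return a log-odds position weight matrix as a list of dicts."""
    length = len(sites[0])
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm


def scan(sequence, pwm, threshold):
    """Yield (start, window, score) for windows scoring at or above threshold."""
    width = len(pwm)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            yield start, window, score


pwm = build_pwm(SITES)
hits = list(scan("AATTGGCACGATTT", pwm, threshold=5.0))
for start, window, score in hits:
    print(f"hit at {start}: {window} (score {score:.2f})")
```

In a full pipeline, hits of this kind would be paired with a downstream rho-independent terminator to delimit a candidate transcription unit; that pairing step is not shown here.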
Table 1
sRNAs identified from the genome of Rhizobium strains.
Name of the strain | No. of sRNAs predicted (negative strand) | No. of sRNAs predicted (positive strand) | Total number of sRNAs predicted | Homologs identified in Rfam | Homologs identified in BSRD
Bradyrhizobium japonicum USDA 110 | 39 | 45 | 84 | 4 | -
Rhizobium etli CFN 42 | 120 | 120 | 240 | 46 | 1
Rhizobium leguminosarum 3841 | 144 | 109 | 253 | 48 | 1
Rhizobium tropici CIAT 899 | 112 | 108 | 220 | 53 | -
Sinorhizobium fredii NGR234 | 130 | 132 | 262 | 33 | -
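The strand balance reported for each strain can be checked directly from the Table 1 counts. The short sketch below recomputes the totals and the fraction of sRNAs on the negative strand; the function name is illustrative, and the numbers are copied from Table 1.

```python
# Strand-specific counts copied from Table 1: (negative strand, positive strand).
TABLE_1 = {
    "Bradyrhizobium japonicum USDA 110": (39, 45),
    "Rhizobium etli CFN 42": (120, 120),
    "Rhizobium leguminosarum 3841": (144, 109),
    "Rhizobium tropici CIAT 899": (112, 108),
    "Sinorhizobium fredii NGR234": (130, 132),
}


def strand_summary(counts):
    """Return {strain: (total, fraction on the negative strand)}."""
    summary = {}
    for strain, (negative, positive) in counts.items():
        total = negative + positive
        summary[strain] = (total, negative / total)
    return summary


summary = strand_summary(TABLE_1)
for strain, (total, neg_fraction) in summary.items():
    print(f"{strain}: total={total}, negative strand={neg_fraction:.1%}")
```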
Length and GC% content distribution of sRNAs
sRNAs are generally 50–500 bp long. Predicted sRNAs shorter than 50 bp or longer than 500 bp were removed, and the remainder were used for further analysis. Most of the sRNAs predicted from the Rhizobium genomes are 50–150 bp long (Fig. 1c). The GC content of the sRNAs ranges from 30% to 80%, and the majority have a GC content of 50–60% (Fig. 1d).
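The length filter and GC% calculation described above can be sketched as follows. The sequences are invented for illustration, and the function names are hypothetical; only the 50–500 bp window comes from the text.

```python
def gc_percent(sequence):
    """Return the GC content of a nucleotide sequence as a percentage."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    gc = sequence.count("G") + sequence.count("C")
    return 100.0 * gc / len(sequence)


def filter_by_length(candidates, min_len=50, max_len=500):
    """Keep candidates whose length lies within [min_len, max_len]."""
    return {name: seq for name, seq in candidates.items()
            if min_len <= len(seq) <= max_len}


# Hypothetical sequences for illustration only.
candidates = {
    "too_short": "ACGT" * 10,      # 40 nt, removed
    "kept": "GGCCATGCAT" * 12,     # 120 nt, retained
    "too_long": "ACGT" * 150,      # 600 nt, removed
}
retained = filter_by_length(candidates)
gc_values = {name: gc_percent(seq) for name, seq in retained.items()}
print(gc_values)
```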
sRNA conservation and comparison analysis
sRNAs are highly conserved in nature. The predicted sRNA candidates were compared to assess conservation among the Rhizobium strains. Eight sRNAs were conserved between R. etli and R. leguminosarum, one between R. leguminosarum and R. tropici, and one between R. tropici and S. fredii; interestingly, eight sRNA candidates of S. fredii were conserved with sRNAs of S. meliloti (unpublished data). To identify novel putative sRNAs, the sRNAs predicted from the genomes of the Rhizobium strains were compared with sRNAs previously reported in the literature. A total of 6 sRNA candidates of R. etli were conserved with the sRNAs reported by Vercruysse et al. 2010 and López-Leal et al. 2015 (SI 3a). Because bacterial cells harbour different types of regulatory RNAs, the predicted sRNAs were checked to determine whether they are truly sRNAs or other regulatory RNAs. For this validation and functional characterization, the predicted sRNAs were searched against the comprehensive Rfam and BSRD repositories (Table 1). The conserved homologs identified were eliminated, and the remaining novel sRNAs were taken for further analysis.
Transcriptome-based sRNA prediction
The high-quality RNA sequence reads of B. japonicum under free-living and symbiotic conditions (SI 4) were aligned to the respective genome using Rockhopper. After alignment, transcripts from intergenic regions and antisense transcripts from the strand complementary to protein-coding genes were identified. sRNAs shorter than 50 nt or longer than 500 nt were removed, and the intergenic sRNAs with a length of 50–500 nt were taken for further analysis. A total of 1375 trans-encoded sRNAs were predicted. The predicted sRNAs vary in length between 50 and 400 nt, and most of the candidates are 50 to 100 nt long. The GC content of the sRNAs ranged from 31% to 80%, with most between 51% and 60%. The predicted sRNAs were searched against the Rfam and BSRD databases to eliminate conserved homologs. Only one sRNA of B. japonicum matched an entry in the Rfam database (chrB RNA). To screen for novel sRNAs, the predicted sRNAs were also compared with previously reported sRNAs; the results indicate that 8 sRNA candidates of B. japonicum are conserved and comparable with sRNAs reported by Hahn et al. 2016 (SI 3b). To relate the transcriptome-derived sRNAs to nitrogen regulation, they were compared with the sRNAs predicted from the B. japonicum genome using the sigma factor 54 PWMs. A total of 21 sRNAs of B. japonicum were found to be conserved with the sigma factor 54-based predicted sRNAs. To study the regulatory role of the sRNAs in nitrogen fixation and symbiotic association, highly regulated sRNAs (based on expression level) were selected for further analysis.
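The distinction between intergenic and antisense transcripts can be made concrete with a small coordinate-based sketch. This is an illustration of the idea only, not Rockhopper's algorithm; the `Feature` type, the function names and all coordinates are hypothetical, and only the 50–500 nt window comes from the text.

```python
from typing import List, NamedTuple


class Feature(NamedTuple):
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def overlaps(a: Feature, b: Feature) -> bool:
    """True if the two intervals share at least one position."""
    return a.start <= b.end and b.start <= a.end


def classify(transcript: Feature, genes: List[Feature]) -> str:
    """Label a transcript relative to annotated protein-coding genes."""
    hits = [gene for gene in genes if overlaps(transcript, gene)]
    if not hits:
        return "intergenic"
    if all(gene.strand != transcript.strand for gene in hits):
        return "antisense"
    return "sense"  # overlaps a gene on the same strand


def is_candidate(transcript: Feature, genes: List[Feature],
                 min_len: int = 50, max_len: int = 500) -> bool:
    """Keep intergenic transcripts within the 50-500 nt length window."""
    length = transcript.end - transcript.start + 1
    return classify(transcript, genes) == "intergenic" and min_len <= length <= max_len


# Hypothetical annotation and transcripts for illustration only.
genes = [Feature(100, 900, "+"), Feature(1500, 2400, "-")]
print(classify(Feature(1000, 1120, "+"), genes))
print(classify(Feature(200, 300, "-"), genes))
```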
Since a single sRNA can regulate multiple mRNA targets, the mRNA targets of the selected sRNAs were predicted using the TargetRNA2 tool to assess their post-transcriptional role in nitrogen fixation, and the flanking genes of these sRNAs were identified using IGV (Table 2).
Table 2 Targets and secondary structures of sRNAs identified from the transcriptome data.
Target prediction
Generally, sRNAs act through short perfect or imperfect base-pairing with complementary sequence stretches of multiple target mRNAs to mediate post-transcriptional gene regulation. It is therefore necessary to identify the relevant mRNA targets of each sRNA. To screen for sRNAs that directly influence Rhizobium nitrogen fixation/regulation post-transcriptionally, putative targets of the sRNAs were predicted in all the Rhizobium strains with the TargetRNA2 tool. Target mRNAs were predicted for both genome- and transcriptome-derived sRNAs. Target prediction revealed that 30 sRNAs of Sinorhizobium fredii, 18 sRNAs of R. leguminosarum, 15 sRNAs of R. etli and 5 sRNAs of R. tropici target nitrogen fixation/regulation genes. In all the Rhizobium strains employed in the present study, the sRNAs were mostly complementary to binding sites in nodulation genes such as nodI, nodD2, nodC, nodF, nodQ1, nodS, nopB, nopC, nopT, nopX, nopM, nopP, nolO, noeI, fdxB and fdxN, in nitrogen fixation genes such as nifA, nifB, nifE, nifD1, nifD2, nifH, nifK, nifN, nifS, nifT, nifQ, nifZ, fixA, fixB, fixC, fixF, fixG, fixX, fixS and fixU, and in several cytochrome proteins. All these mRNA targets show significant complementarity with the sRNAs and a significant P-value (< 0.05). In B. japonicum, nearly 100 sRNAs from the transcriptome were selected for target prediction. Of these, 43 transcriptome-derived sRNAs, together with 6 genome-derived sRNAs (SI 5), were found to target nitrogen regulation/fixation-related genes. Based on target interaction energy and high scores, the sRNAs that target the most nitrogen fixation genes of B. japonicum (Table 2) were taken for further analysis and experimentally validated.
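The screening of predicted targets for nitrogen-related genes can be sketched as a simple filter. The record layout, sRNA names, energies and P-values below are invented for illustration and do not reproduce the TargetRNA2 output format; only the P-value cutoff of 0.05 and the gene families come from the text.

```python
# Gene-name prefixes taken from the gene families listed in the text.
PREFIXES = ("nif", "fix", "nod", "nop", "nol", "noe", "fdx")


def nitrogen_related_targets(predictions, p_cutoff=0.05):
    """Group significant nitrogen-related targets by sRNA.

    `predictions` is an iterable of (sRNA, target gene, energy, p-value)
    tuples; this layout is an assumption, not the TargetRNA2 output format.
    """
    selected = {}
    for srna, gene, energy, p_value in predictions:
        if p_value < p_cutoff and gene.lower().startswith(PREFIXES):
            selected.setdefault(srna, []).append((gene, energy, p_value))
    return selected


def rank_by_target_count(selected):
    """Order sRNAs by how many nitrogen-related genes they target."""
    return sorted(selected, key=lambda srna: len(selected[srna]), reverse=True)


# Hypothetical records for illustration only.
predictions = [
    ("sRNA_a", "nifH", -12.4, 0.003),
    ("sRNA_a", "fixA", -10.1, 0.020),
    ("sRNA_a", "rpoD", -11.0, 0.010),   # significant but not nitrogen-related
    ("sRNA_b", "nodC", -9.8, 0.040),
    ("sRNA_b", "nifA", -7.2, 0.200),    # fails the P-value cutoff
]
selected = nitrogen_related_targets(predictions)
ranking = rank_by_target_count(selected)
print(ranking)
```

Matching on gene-name prefixes is a simplification; a real screen would more likely rely on curated gene annotations than on naming conventions.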
Promoter, terminator and secondary structure prediction
Promoter and rho-independent terminator sequences were predicted for the identified putative novel sRNAs (Table 3). Secondary structures of the selected sRNAs were predicted using the RNAfold server. The predicted minimum free energy for the majority of the sRNAs ranges from −20 to −70 kcal/mol. The color of each structure represents the base-pairing probabilities (Table 2, SI 5).
Table 3
Promoter, terminator and flanking genes of sRNAs identified from the transcriptome data.
sRNA | sRNA coordinates | -10 sequence | -35 sequence | Terminators | Upstream gene | Downstream gene
BJ1 | 1927274–1927558 | CTTTATAGT | TTGCGG | TGACAGAGACCTTGCGCGGCTTCTCGCGCGACTTTTGGAATGGA | outer membrane protein | hypothetical protein
BJ2 | 2174239–2174361 | GGTCATTCT | TCGATG | - | hypothetical protein | hypothetical protein
BJ3 | 2226897–2226997 | GTGTATACT | GTGCCA | - | phenolhydroxylase-like protein | hypothetical protein
BJ4 | 2066092–2066221 | GGTTAGCAT; CCCCATAA | CTCAGT; CCCATAAC | - | hypothetical protein | transposase
BJ5 | 4506199–4506295 | CCGGATTCT | GTGAAG | - | alanine racemase | replicative DNA helicase
BJ6 | 1741298–1741451 | GATTAGAGT | TTGCCC | - | hypothetical protein | site-specific integrase/recombinase
BJ7 | 1214552–1214648 | ATCCAGAGT | TTGCCA | - | Hsp33-like chaperonin | hypothetical protein
BJ8 | 5185970–5186073 | TGTTAGACT | TTCGCA | - | cysteine synthase | queuine tRNA-ribosyltransferase
BJ9 | 7340538–7340630 | - | - | - | hypothetical protein | hypothetical protein
BJ10 | 157905–158001 | CCTTAAGCT | TTGCGA | - | ATP-dependent helicase | hypothetical protein
Experimental validation
From the combined genome and transcriptome data, ten sRNAs of B. japonicum were selected: one sigma 54-regulated sRNA from the positive strand and nine sRNAs from the transcriptome (seven from the negative strand and two from the positive strand). Of these 10 sRNAs, 8 (BJ1–BJ8) were amplified in semi-quantitative PCR (Fig. 2).
Semi-quantitative PCR analysis
Semi-quantitative PCR was performed for the 8 sRNAs; the primers used to amplify them are listed in SI 6. Densitometric analysis of the semi-quantitative PCR revealed that 16S rRNA expression was constant under both symbiotic and free-living conditions, whereas the 8 sRNAs were differentially expressed between symbiotic and free-living conditions (Fig. 2).
Functional categorization of sRNA target genes
To study the role of the sRNA target mRNAs, a selected set of target mRNAs was functionally annotated by COG and GO analysis. The mRNA targets under both free-living and symbiotic conditions were enriched in the COG categories of energy production and conversion; amino acid transport and metabolism; coenzyme biosynthesis and metabolism; and post-translational modification, protein turnover and chaperones (Fig. 3). In GO analysis, the target mRNAs were grouped into three categories: biological process, molecular function and cellular component. Under biological process, the targets include genes involved in cellular process, metabolic process and localization; under molecular function, they include binding, catalytic activity and transporter activity (Fig. 4).
GO regulatory network
A GO regulatory network (GRN) was constructed for the target mRNAs under free-living and symbiotic conditions. The network for the target mRNAs of sRNAs identified under the free-living condition is shown in Fig. 5. Signal transduction is the central node in this GRN, governed by nodW and ctpA (GO ID: 7165), and interacts with other GO terms such as regulation of transcription, positive regulation of sporulation, nodulation, carbohydrate metabolism, protoporphyrinogen IX biosynthetic process, bacterial-type flagellum-dependent swarming motility, protein folding and aerobic respiration. For the target mRNAs of the symbiotic association, regulation of transcription is the central node in the network (Fig. 6), governed by ccmc (GO ID: 17004), and interacts with nitrogen fixation, aerobic respiration, transmembrane transport and nodulation.