Bacterial growth conditions
Two hundred microliters of L. biflexa serovar Patoc inoculum (1 × 10⁹ leptospires/ml) were inoculated into 10 ml of complete EMJH medium (1:50 dilution) in a 100 ml conical flask. This starter culture was grown at 30°C with agitation at 60 rpm until it reached the mid-log phase (OD420 of 0.4). Next, 200 microliters of the starter culture were inoculated into 200 ml of complete EMJH medium in a 1 L conical flask. OD420 readings were taken using a NanoPhotometer (Implen GmbH, Germany) at 6-hour intervals for a total of 7 days to capture a complete growth cycle. The growth curves were plotted from the triplicate OD420 readings acquired over this 7-day period.
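As an illustration of how the triplicate readings can be summarized before plotting, the following minimal Python sketch derives the sampling schedule (every 6 hours over 7 days) and computes the mean and standard deviation per time point. The function names, the inclusion of a time-zero reading, and the data layout are assumptions made for this example and are not taken from the study.

```python
from statistics import mean, stdev

SAMPLING_INTERVAL_H = 6      # one reading every 6 hours
TOTAL_DURATION_H = 7 * 24    # 7 days of monitoring


def sampling_times(interval_h=SAMPLING_INTERVAL_H, total_h=TOTAL_DURATION_H):
    """Return the sampling time points in hours (assumes a reading at time zero)."""
    return list(range(0, total_h + 1, interval_h))


def summarize_triplicates(readings):
    """Map each time point to (mean, sample standard deviation) of its OD420 replicates.

    `readings` is a dict {time_in_hours: [od_rep1, od_rep2, od_rep3]}.
    """
    return {t: (mean(values), stdev(values)) for t, values in sorted(readings.items())}


def first_time_reaching(summary, target_od=0.4):
    """Return the earliest time point whose mean OD420 reaches `target_od`, or None."""
    for t, (mean_od, _sd) in summary.items():
        if mean_od >= target_od:
            return t
    return None
```

The resulting dictionary of means and standard deviations can be passed directly to any plotting library to draw the growth curve with error bars.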
RNA isolation
Total RNA was extracted using the TRI reagent (Molecular Research Center, USA). In brief, about 0.1 g of cell pellet was re-suspended in 1 mL of TRI reagent, and the cell lysate was thoroughly mixed with 0.2 mL of chloroform. The mixture was then incubated at room temperature for 10 minutes and centrifuged at 12,000 g and 4°C for 15 minutes. The upper aqueous layer was transferred to a new tube and mixed with 0.5 mL of isopropanol, followed by incubation at room temperature for 5 minutes and centrifugation at 12,000 g and 4°C for 10 minutes. The resulting supernatant was discarded and the pellet was washed with 1 mL of pre-chilled 80% ethanol via centrifugation at 12,000 g and 4°C for 8 minutes. The supernatant was then discarded and the pellet was air-dried before re-suspension in RNase-free water. The concentration and quality of the total RNA were determined with a Bio-Photometer (Eppendorf, Germany). To assess RNA integrity, the total RNA was separated via denaturing urea-polyacrylamide gel electrophoresis (urea-PAGE: 7 M urea, 8% polyacrylamide gel) and visualized with the Gel Doc XR+ System (Bio-Rad Laboratories, Inc., USA).
Before constructing RNA-seq libraries, the total RNA samples were treated with DNase I to remove all traces of DNA. In brief, ~20 µg of the total RNA was combined with 1X DNase I buffer, 10 U of DNase I, and 80 U of RiboLock RNase Inhibitor in a final volume of 50 µL, followed by incubation at 37°C for 30 minutes. Subsequently, the reaction mixture was supplemented with 10 µL of 3 M sodium acetate, 40 µL of RNase-free water, and 10 µg of glycogen. The mixture was vortexed briefly and mixed thoroughly with 100 µL of phenol, followed by centrifugation at 12,000 g and room temperature for 5 minutes. The upper aqueous layer was transferred into a new tube and mixed thoroughly with 100 µL of chloroform, followed by centrifugation at 12,000 g and room temperature for 5 minutes. After that, the upper phase was transferred into a new tube and mixed thoroughly with 400 µL of pre-chilled absolute ethanol. After incubation at -80°C for 30 minutes, the mixture was centrifuged at 12,000 g and 4°C for 20 minutes. The resulting supernatant was discarded and the pellet was washed with 1 mL of pre-chilled 80% ethanol via centrifugation at 12,000 g and 4°C for 8 minutes. Finally, the supernatant was discarded and the air-dried pellet was re-suspended in 20 µL of RNase-free water.
Construction of cDNA libraries for the TSS mapping
For the TSS mapping, each total RNA sample was divided into two portions. The first portion was directly subjected to cDNA library (TEX-) construction, while the second portion was treated with Terminator™ 5’-phosphate-dependent exonuclease (TEX) to degrade 5’-monophosphorylated transcripts, leaving only the 5’-triphosphorylated RNAs (i.e., the primary transcripts) in the sample. In brief, 10 µL of the DNase I-treated RNA was combined with 1X Terminator Reaction Buffer A, 2 U of TEX, and 60 U of RiboLock RNase Inhibitor to a final volume of 50 µL. The reaction was incubated at 30°C for 60 minutes, followed by phenol/chloroform purification as per the protocol described above. The resulting air-dried pellet was re-suspended in 20 µL of RNase-free water. This TEX-treated portion, together with the first, untreated portion, was subjected to the following treatments sequentially.
Tobacco acid pyrophosphatase (TAP) treatment
Each RNA sample was incubated with 1X TAP Reaction Buffer, 2.5 U of TAP, and 60 U of RiboLock RNase Inhibitor in a final volume of 50 µL. After incubation at 37°C for 90 minutes, the mixture was purified with phenol/chloroform and the air-dried pellet was re-suspended in 20 µL of RNase-free water.
Ligation of 5’-adapter
The TAP-treated RNA was denatured at 90°C for 2 minutes, snap-cooled on ice for 2 minutes, and mixed with 100 picomoles of 5’-adapters (PE_DNAAdapt and PE_RNAAdapt), 1X T4 RNA ligase reaction buffer, 15 U of T4 RNA ligase, and 60 U of RiboLock RNase Inhibitor to a final volume of 30 µL, followed by incubation at room temperature for 60 minutes, 30°C for 30 minutes, and 37°C for 60 minutes. After that, the reaction mixture was supplemented with 1X T4 RNA ligase reaction buffer, 10 U of T4 RNA ligase, and 40 U of RiboLock RNase Inhibitor to a final volume of 40 µL. The reaction was then incubated overnight at 4°C and purified with phenol/chloroform on the next day. Finally, the air-dried RNA pellet was re-suspended in 17.5 µL of RNase-free water.
Synthesis of cDNA
The 5’-adapter-ligated RNA was mixed with 100 picomoles of TRUmRN6 (with random hexamer), denatured at 90°C for 2 minutes, and snap-cooled on ice for 2 minutes, followed by the addition of 1X Transcriptor RT buffer, 20 U of Transcriptor Reverse Transcriptase, 60 U of RiboLock RNase Inhibitor, and 10 mM of dNTP to a final volume of 30 µL. The reaction mixture was sequentially incubated at 25°C for 10 minutes and 55°C for 40 minutes, followed by phenol/chloroform purification as per the above-described protocol. The air-dried pellet (first-strand cDNA) was re-suspended in 20 µL of RNase-free water and combined with 1X FastStart PCR buffer containing 1.5 mM of MgCl2, 2.5 U of FastStart Taq DNA polymerase, 10 mM of dNTP, and 50 picomoles of primers complementary to the 5’- and 3’-adapters (1Mlx3 and 1Mlx5) to a final volume of 50 µL. The reaction mixture was subjected to the following PCR amplification steps in a thermocycler (Bio-Rad Laboratories Inc., USA): denaturation at 94°C for 2 minutes, followed by 6 cycles of 94°C for 20 seconds, 62°C for 20 seconds, and 72°C for 25 seconds. The PCR products were subsequently purified using the High Pure PCR Cleanup Micro Kit (Roche, Germany) according to the manufacturer’s instructions and eluted in 15 µL of RNase-free water.
Barcoding
Prior to the barcoding of cDNA libraries, the double-stranded cDNA was purified to eliminate primer dimers (78 bp). Briefly, 15 µL of the purified cDNA sample was separated on a 2% low-melting agarose gel (1X TAE buffer) via electrophoresis at 100 V for 30 minutes. The desired gel fraction (100–200 bp) was excised for subsequent gel extraction using the E.Z.N.A. Gel Extraction Kit (Omega Bio-tek Inc., USA). After that, the eluted cDNA sample was combined with 1X FastStart PCR buffer containing 1.5 mM of MgCl2, 2.5 U of FastStart Taq DNA polymerase, 10 mM of dNTP, and 50 picomoles of primers complementary to the 5’- and 3’-ends of the cDNA library (TruseqUniv and IDX1-4, see Online Resource 1) to a final volume of 50 µL. The reaction mixture was subjected to the following PCR amplification steps: denaturation at 94°C for 2 minutes, followed by 18 cycles of 94°C for 20 seconds, 63°C for 20 seconds, and 72°C for 30 seconds. The PCR products were then purified using the High Pure PCR Cleanup Micro Kit (Roche, Germany) and eluted in 15 µL of RNase-free water. Size selection and gel extraction were performed to eliminate adapter dimers (128 bp). A total of 15 µL of the barcoded cDNA library was separated on a 2% low-melting agarose gel (1X TAE buffer) at 100 V for 30 minutes, and the desired fraction (150–250 bp) was excised for gel extraction using the E.Z.N.A. Gel Extraction Kit (Omega Bio-tek Inc., USA) and eluted in 30 µL of RNase-free water.
Amplification of cDNA libraries
To prepare a sufficient amount of starting materials for RNA-seq, each of the barcoded cDNA libraries was subjected to PCR amplification. This was achieved by the addition of 1X FastStart PCR buffer containing 1.5 mM of MgCl2, 2.5 U FastStart Taq DNA polymerase, 10 mM of dNTP, and 50 picomoles of primers (IDXPCR and TruSeqUnivPCR) to a final volume of 50 µL. The PCR parameters were as follows: denaturation at 94°C for 2 minutes, followed by 15 cycles of 94°C for 20 seconds, 60°C for 20 seconds, and 72°C for 30 seconds with a final extension step at 72°C for 2 minutes. After that, the PCR products were purified using the QIAquick PCR Purification Kit (Qiagen, Germany) and eluted in 30 µL of RNase-free water.
Construction of cDNA libraries for sRNA-seq
For the sRNA-seq, the total RNA (100 µg) was subjected to electrophoresis on 8% denaturing urea-PAGE at 120 V for 60 minutes. Gel fractionation was carried out by excising the bands containing RNAs shorter than 120 nt and eluting them overnight with 0.3 M sodium acetate (pH 5.2), followed by purification with an equal volume of chloroform and ethanol precipitation. The purified size-selected RNA samples were then subjected to the TAP treatment as described above and to the following treatments sequentially.
C-tailing and Ligation of 5’-adapter
The TAP-treated RNA was denatured for 2 minutes at 90°C, snap-cooled on ice for 2 minutes, and combined with 20 mM of CTP, 1X tailing reaction buffer, 8 U of Poly(A) Polymerase, and 80 U of RiboLock RNase Inhibitor to a final volume of 40 µL. The reaction mixture was incubated at 37°C for 2 hours and purified with phenol/chloroform as described above on the next day. Finally, the air-dried RNA pellet was re-suspended in 22 µL of RNase-free water. After that, the C-tailed RNA was ligated with a 5’-adapter (PE_DNAAdapt and PE_RNAAdapt).
Synthesis of cDNA
200 picomoles of UningTruSeq8g were added to the C-tailed, 5’-adapter-ligated RNA, and the resulting mixture was denatured at 90°C for 2 minutes, snap-cooled on ice for 2 minutes, and added with 1X Transcriptor RT buffer, 20 U of Transcriptor Reverse Transcriptase, 60 U of RiboLock RNase Inhibitor, and 10 mM of dNTP to a final volume of 40 µL. The reaction mixture was sequentially incubated at 33°C for 10 minutes and 55°C for 45 minutes, followed by purification using Sephadex G-50 DNA purification Quick Spin Columns. The first-strand cDNA was eluted in 40 µL of RNase-free water and subject to the synthesis of the second strand of cDNA as per the above-described protocols.
Barcoding
Prior to the barcoding of cDNA libraries, the double-stranded cDNA was purified as described above to eliminate primer dimers (81 bp). After that, each cDNA library was barcoded with a unique identifier sequence (Online Resource 1) as described above and purified again to eliminate primer dimers (131 bp). To prepare a sufficient amount of starting materials for RNA-seq, the size-selected cDNA libraries were subject to PCR according to the above-described protocols.
RNA-seq
In summary, 4 cDNA libraries for TSS mapping and 2 size-selected cDNA libraries for sRNA-seq were outsourced for paired-end sequencing on the Illumina HiSeq 2000 platform (Beijing Genomics Institute, China) (Fig. 1). Raw sequence reads generated in this study have been submitted to the Sequence Read Archive (SRA) database and can be accessed using the BioProject accession: PRJNA884181.
Bioinformatics Analyses
The bioinformatics analysis workflow in this study is depicted in Online Resource 2. All bioinformatics tools and webservers are listed in Table S4 (Online Resource 1). All Shell scripts for bioinformatics analyses are provided in Online Resource 3.
Pre-processing of RNA-seq reads
All raw sequencing reads (FASTQ files) were interleaved with the SeqFu tool (Telatin, Fariselli, and Birolo 2021) using the script interleave.sh to merge the paired-end read files into a single interleaved read file, which was then pre-processed to filter low-quality reads (quality score < 28) and remove adapter sequences with Cutadapt (Martin 2011) using the script trim.sh. The quality of both pre- and post-processed reads was assessed with FastQC (Andrews 2010) and compiled with MultiQC (Ewels et al. 2016) using the scripts quality_report.sh and compile_report.sh, respectively.
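To make the interleaving step concrete, the following minimal Python sketch shows what merging paired-end FASTQ files into one interleaved stream involves: each forward read is followed immediately by its reverse mate. This is not the SeqFu implementation or the content of interleave.sh; the function names are hypothetical, and the sketch assumes standard four-line FASTQ records with both files listing the mates in the same order.

```python
from itertools import islice, zip_longest


def read_fastq(lines):
    """Yield (header, sequence, separator, quality) tuples from FASTQ lines."""
    stripped = (line.rstrip("\n") for line in lines)
    while True:
        record = tuple(islice(stripped, 4))
        if not record:
            return                      # clean end of input
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def interleave(forward_lines, reverse_lines):
    """Yield FASTQ lines with each forward read followed by its reverse mate."""
    pairs = zip_longest(read_fastq(forward_lines), read_fastq(reverse_lines))
    for forward, reverse in pairs:
        if forward is None or reverse is None:
            raise ValueError("forward and reverse inputs differ in read count")
        yield from forward
        yield from reverse
```

Because both functions accept any iterable of lines, they can be applied to open file handles as well as to in-memory lists, and the output can be written line by line without loading whole files.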
Alignment to the reference genome
All trimmed paired-end reads were aligned to the reference genome of L. biflexa serovar Patoc strain (NCBI Accession: NC_010602, NC_010843, and NC_010844) with Bowtie2 (Langmead and Salzberg 2012) using the scripts index_genome.sh and mapping.sh. The resulting alignment (SAM) files were compressed into BAM files and indexed using the scripts convert2bam.sh and index_bam.sh, respectively. To generate coverage files from the alignment files for subsequent visualization, the BAM files were first converted to the bedgraph format, which was then converted to the bigwig format using the scripts generate_bedgraph.sh and bedgraph2bigwig.sh, respectively. After that, BAM files generated from TEX+ and TEX- reads were subjected to TSS mapping, while BAM files generated from sRNA-seq reads were subjected to sRNA prediction as follows.
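For readers unfamiliar with the bedgraph format, the following minimal Python sketch illustrates the underlying idea: aligned read intervals are collapsed into runs of constant read depth, each reported as chromosome, start, end, and depth. It is an illustration only and does not reproduce generate_bedgraph.sh; the function names are hypothetical, and intervals are assumed to be 0-based and half-open, as in the bedgraph convention.

```python
from collections import Counter


def coverage_bedgraph(chrom, intervals):
    """Collapse aligned intervals into (chrom, start, end, depth) runs with depth > 0."""
    deltas = Counter()
    for start, end in intervals:
        if end <= start:
            raise ValueError("interval end must be greater than start")
        deltas[start] += 1   # coverage rises where a read begins
        deltas[end] -= 1     # and falls just past its last base

    rows = []
    depth = 0
    previous = None
    for position in sorted(deltas):
        if previous is not None and depth > 0:
            if rows and rows[-1][3] == depth and rows[-1][2] == previous:
                # Same depth as the adjacent run: extend it instead of splitting
                rows[-1] = (chrom, rows[-1][1], position, depth)
            else:
                rows.append((chrom, previous, position, depth))
        depth += deltas[position]
        previous = position
    return rows


def format_bedgraph(rows):
    """Render coverage rows as tab-separated bedgraph lines."""
    return [f"{chrom}\t{start}\t{end}\t{depth}" for chrom, start, end, depth in rows]
```

Two reads covering positions 0 to 10 and 5 to 15 would therefore yield three runs, with a depth of 2 only in the overlapping region.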
TSS mapping
TSS maps were extracted from BAM files generated from TEX+ and TEX- reads using the TSSAR Client, which submitted them to the TSSAR Web Server (http://nibiru.tbi.univie.ac.at/TSSAR/server) for automated TSS annotation with a p-value threshold of 0.01, noise threshold of 10, and maximum merging range of 10 (Amman et al. 2014). Since the TSSAR Client can only process BAM files containing a single chromosome, the above BAM files were split by chromosome using the scripts splitbam_chr1.sh, splitbam_chr2.sh, and splitbam_p74.sh, respectively. The TSSAR Web Server generated a list of annotated TSSs, each of which is classified with respect to its genomic position relative to annotated genes as follows: a primary TSS is located up to 250 nt upstream of an annotated gene, an internal TSS is located within the open reading frame of an annotated gene, an antisense TSS is located opposite to an annotated gene, and an orphan TSS is located beyond the above-mentioned regions.
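The four TSS classes described above can be expressed as a simple decision rule. The Python sketch below is a simplified illustration and not the TSSAR implementation: the names are hypothetical, gene coordinates are assumed to be 1-based and inclusive, a TSS coinciding with the first base of a gene is treated as primary, "antisense" is restricted to positions inside a gene on the opposite strand, and the precedence primary > internal > antisense > orphan is an assumption.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    start: int    # 1-based, inclusive; start < end regardless of strand
    end: int
    strand: str   # "+" or "-"


MAX_UPSTREAM_NT = 250  # window for primary TSSs, as stated in the text


def upstream_distance(position, gene):
    """Distance from a TSS to the gene's 5' end; positive when the TSS is upstream."""
    if gene.strand == "+":
        return gene.start - position
    return position - gene.end


def classify_tss(position, strand, genes):
    """Return 'primary', 'internal', 'antisense', or 'orphan' for one TSS."""
    labels = set()
    for gene in genes:
        inside = gene.start <= position <= gene.end
        if gene.strand == strand:
            if 0 <= upstream_distance(position, gene) <= MAX_UPSTREAM_NT:
                labels.add("primary")
            elif inside:
                labels.add("internal")
        elif inside:
            labels.add("antisense")
    # Assumed precedence when a TSS qualifies for several classes
    for label in ("primary", "internal", "antisense"):
        if label in labels:
            return label
    return "orphan"
```

For a gene annotated at positions 1000 to 2000 on the forward strand, a forward-strand TSS at position 900 would be labeled primary, one at position 1500 internal, and a reverse-strand TSS at position 1500 antisense.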
Predictions of other genomic features
The TSSs annotated in this study can be used to predict putative promoter regions and 5’UTRs. For the prediction of promoter regions, the genomic region within 50 nt upstream of each TSS was scanned for potential promoters (consensus sequence motifs: “TANNNT”, “TTGACA”, “AAAAAARNR”, and “AWWWWWTTTTT”). Briefly, the entire genome sequences were scanned for promoter motifs with MEME FIMO (Bailey et al. 2009) using the scripts iupac2meme.sh and promoter_prediction.sh. The resulting list of potential promoter regions was screened for potential promoters located within 50 nt upstream of TSSs using the script promoter_TSS.sh. Meanwhile, 5’UTRs, which have not yet been made available in existing annotations, were predicted based on the TSSs identified in this study using the script 5UTR.sh, which also summarizes their length distribution and scans their sequences for potential riboswitches and other highly conserved regulatory elements using the Infernal cmscan tool (Nawrocki and Eddy 2013) based on covariance models downloaded from the Rfam database (Nawrocki et al. 2015).
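The logic of searching the 50 nt upstream of a TSS for the listed consensus motifs can be sketched in Python as follows. Note that this sketch substitutes exact IUPAC pattern matching with regular expressions for the statistical motif scoring performed by FIMO, so it is an approximation of the idea rather than the method used in the study. All function names are hypothetical, TSS coordinates are assumed to be 1-based, and the replicons are treated as linear for simplicity.

```python
import re

# IUPAC nucleotide codes expressed as regular-expression character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}
MOTIFS = ("TANNNT", "TTGACA", "AAAAAARNR", "AWWWWWTTTTT")


def iupac_to_regex(motif):
    """Compile an IUPAC motif; the lookahead lets overlapping matches be reported."""
    return re.compile("(?=(" + "".join(IUPAC[base] for base in motif) + "))")


def reverse_complement(sequence):
    return sequence.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def upstream_window(genome, tss, strand, width=50):
    """Return up to `width` nt immediately upstream of a 1-based TSS, read 5' to 3'."""
    if strand == "+":
        return genome[max(0, tss - 1 - width):tss - 1]
    return reverse_complement(genome[tss:tss + width])


def scan_upstream(genome, tss, strand, motifs=MOTIFS, width=50):
    """List (motif, offset, matched_sequence); offset -1 is the base just upstream."""
    window = upstream_window(genome, tss, strand, width)
    hits = []
    for motif in motifs:
        for match in iupac_to_regex(motif).finditer(window):
            hits.append((motif, match.start() - len(window), match.group(1)))
    return hits
```

Offsets are reported relative to the TSS, so a hit starting at -13 begins 13 nt upstream of the transcription start site on the strand of the transcript.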
Identification and classification of novel sRNA candidates
The BAM files generated from sRNA-seq reads were subject to transcript assembly using the Rockhopper software, which is a graphical user interface (GUI) program for bacterial RNA-seq analysis, including the identification of novel transcripts, using the Bayesian approach (McClure et al. 2013). The predicted novel transcripts were extracted from the output file of Rockhopper software and converted to BED files for subsequent manipulations using the script sRNA_characterization.sh, which extracted the sequence of sRNA candidates in FASTA format and further categorized them according to their lengths and genomic locations relative to annotated genes into promoter-associated sRNAs, 5’UTR-derived sRNAs, 5’UTR-antisense sRNAs, ORF-antisense sRNAs, and possible individual intergenic sRNAs.