2.1. Deep Sequencing of Guinea Pig Lung Small RNAs
To identify the miRNAs involved in immune regulation of Mtb infection in guinea pig lungs, small RNA libraries from the non-infected and BCG-infected lungs were sequenced side-by-side using the Illumina HiSeq™ 2000 sequencing system. The statistics of the small RNA sequences from the two libraries were summarized in Table 1 and Table 2. A total of 9,289,133 and 9,600,000 raw reads were obtained from non-infected lungs and BCG-infected lungs, respectively. After removing the low-quality sequences, adapter sequences, and sequences smaller than 18 nt, the remaining clean reads from the two libraries were aligned to the guinea pig genome. A total of 9,237,270 and 9,551,301 clean reads were perfectly matched to the genome (Table 3). The two libraries generated 302,562 unique small RNA reads, indicating a highly complex small RNA population. For the total small RNAs, 98.43% were found in both libraries. For the unique small RNAs, only 16.09% were shared in both libraries, 42.86% were non-infected-lung-specific and 41.05% were BCG-infected-lung-specific, indicating a dynamically similar small RNA population expression in the non-infected and BCG-infected lungs of guinea pigs (Fig. 1). The size of small RNAs was not evenly distributed in both libraries; however, the overall size distribution of the sequenced reads from the two sequencing efforts were very similar. The majority (97%) of small RNAs from both libraries were 20–24 nt, with 22 nt being the most abundant, followed by 23 and 21 nt classes (Fig. 2). This result was consistent with a recent report in the mouse model [11, 12] and with two results in rat [13], human [14] and porcine [12, 15] models.
2.2. Conserved miRNAs
Conserved miRNA families have been found in many mammalianspecies and have unique expression profiles in the cells of the innate and adaptive immune systems, in which they play pivotal roles in the regulation of both cell development and function [16]. To identify the conserved miRNAs in our dataset, all small RNA sequences were BLASTN searched against the known mature miRNAs and their precursors in the miRNA sequence database, miRBase. There were currently 255 families containing 1187 and 1202 known, unique miRNAs in non-infected and BCG-infected guinea pig lungs, respectively, which match the mouse genome in miRBase 20 (Additional File 1 Table S1). The statistics of conserved miRNA families in non-infected and BCG-infected guinea pig lungs were listed in Additional File 2 Table S2[17].
In the non-infected and BCG-infected guinea pig lung groups of the miRNA expression profile, we observed that the majority of miRNAs with abundant counts are represented by a few miRNAs. As shown in Fig. 3, the miRNA expression profile consisted of unevenly distributed counts of sequences in which the top 13 unique miRNAs with the highest expression levels account for 82.47% and 82.98% (by count) of the total counts of all 13,144,294 and 13,606,027 unique miRNAs of the mappable sequences in non-infected and BCG-infected guinea pig lung libraries, respectively.
Interestingly, the histogram showed similar results for both the non-infected and BCG-infected miRNA libraries, with same unique miRNAs making up the top 13 positions. The miR-140 family (miR-140, miR-140-3p) takes the top ranking in both non-infected (257,733) and BCG-infected (277,646) guinea pig lungs, and these account for 1.96% and 2.04% (by count) of mappable sequences, respectively. Furthermore, 7 of top 13 miRNAs are from the let-7 family (Fig. 3). This finding is consistent with previous reports indicating that the let-7 family miRNAs are widely expressed in various cell and tissue types at high expression levels and are involved in the basic process of cell life activities as major regulators of cell proliferation pathways [18, 19].
However, the relative abundance of these miRNAs in non-infected and BCG-infected lungs of guinea pig may also suggest their basic cellular roles and that they may be the most important regulatory miRNA groups during the resistance to foreign pathogenic microorganism invasion in lung[20]. In addition, recent evidences also support our hypothesis. Indeed, the anomalous expression of these top 13 ranked miRNAs may be related to the body's immune response [21, 22]. These findings may support the prediction that the 13 miRNAs are related to guinea pig lung physiology, indicating that these may be the most important regulatory miRNA groups and a signature of infectious lung diseases.
It is well known that miRNAs often have implications in signal transduction regulating the binding of the 3' untranslated region (UTR) of its target mRNA, which would be translated into the key signal protein molecules at a later time in immune-related cells. The miRNAs that are differentially expressed between non-infected and BCG-infected guinea pig lungs may represent a major regulationory mechanism for immune responses that could contribute to BCG infection in the lung. To determine the significance of differences in known miRNA expression between non-infected and BCG-infected lungs, we showed the expression of miRNA in two samples by plotting the Log2-ratio figure and Scatter Plot (Fig. 4A and Additional file 3 Table S3). As shown in Fig. 4B, of 1487 unique miRNAs, 902 (60.66%) unique miRNAs were co-expressed in the non-infected and BCG-infected guinea lung libraries, while 285 (19.17%) and 300 (20.30%) miRNAs appear to be preferentially expressed in the non-infected and BCG-infected libraries, respectively. The analysis of library sequencing data resulted in the identification of 381 unique miRNAs (out of 1187; 32.10%) with statistically significant differential expression between the non-infected and BCG-infected guinea pig lung libraries. Out these 381 differential expression unique miRNAs, 248 (Non-infected-specific: 42, co-expressed: 206) and 133 (BCG-infected-specific: 55, co-expressed: 78) unique miRNAs are down- and upregulated in non-infected versus BCG-infected lungs, respectively (Fig. 4B).
In the miRNA differential expression libraries of the non-infected and BCG-infected guinea pig lung, the 1487 unique miRNAs belong to 55 of the 255 identified conserved miRNA families. In addition, of the 15 most abundant conserved miRNAs in miRNA differential expression libraries of non-infected guinea pig lungs, 13 were significantly downregulated and 2 were significantly upregulated, compared to those of the BCG-infected lungs (Fig. 5). The 13 downregulated miRNAs belong to 6 miRNA families: miR23, miR29, miR145, miR320, miR378, and miR451. The 2 upregulated miRNAs belonged to the miR423 family (Fig. 6). These data was in line with previous findings that miR-29 was expressed during active pulmonary tuberculosis[23], while miR-23, miR-29, miR-145, and miR -452 were either up- or downregulated in chronic obstructive pulmonary disease (COPD), non-small cell lung cancer (NSCLC) and inflammatory responses [24-27].
2.3. Novel miRNAs
To identify more potential miRNAs in the BCG-infected guinea pig lungs, the unclassified reads were further processed by Mireap software (http://sourceforge.net/projects/mireap). Only those tags meeting the default parameters and with read counts greater than 56 and 63 in non-infected and BCG-infected guinea pig lungs, respectively, were defined as candidate novel mature miRNAs. The target genes (433 of non-infected and 434 of BCG-infected guinea pig lung) of novel miRNAs were predicted from Mireap, and the number of miRNAs (only miRNAs with predicted target genes will be included here) and the number of corresponding targets is shown in Table 4 (for detailed information on the miRNA sequence and target gene sequence, see Additional File 4 Table S4 for novel miRNAs and Additional File 5 for target prediction of novel miRNA). The rules used for target prediction are based on those suggested by Allen [28] and Schwab et al. [29].
On the basis of this analysis, 79 novel miRNAs were identified in the guinea pig lung, and these showed different expression levels between the non-infected and BCG-infected lungs. The expression of miRNAs in these two groups was presented by plotting the Log2-ratio figure and a scatter plot (Fig. 7A). Of the 79 novel unique miRNAs, 21 (26.58%) novel miRNAs were coexpressed in non-infected and BCG-infected guinea lung libraries, while 28 (35.44%) and 30 (37.97%) novel miRNAs appear to be preferentially expressed in non-infected and BCG-infected libraries, respectively. The analysis of library sequencing data resulted in the identification of 29 novel miRNAs (out of 79, 36.71%) with statistically significant differential expression between the non-infected and infection guinea pig lung libraries. Out of these 29 differentially expressed novel miRNAs, 14 (non-infected lung-specific) and 15 (BCG-infected-specific: 13, coexpressed: 2) novel miRNAs were found to be down- and upregulated in non-infected versus BCG-infected lung samples, respectively (Fig. 7B and Additional File 6 Table S5 show the novel miRNA differential expression).
The Venn diagram displays the distribution of 49 novel miRNAs between non-infected (left, blue circle) and BCG-infected guinea pig lung (right, red circle) libraries. The dashed circles indicate the miRNAs that were significantly differentially expressed (p < 0.0001) in the two lung samples.
2.4. Nucleotide bias of miRNAs in guinea pig lungs
We analyzed the nucleotide bias of the conserved and novel miRNAs in non-infected and BCG-infected guinea pig lungs. U was the most common first nucleotide at the 5’ end of the conserved and novel miRNAs. U was the dominant first nucleotide of the conserved 20–24 nt miRNAs in accounting for 97% of 49977 in non-infected guinea pig lungs and for 81% of 43124 in BCG-infected lungs. Like the conserved miRNA, the novel 21–24 nt miRNA accounted for 59% of 1871 in non-infected group and 61% of 1697 in the BCG-infected group (Fig. 8). Some novel miRNAs predicted in our miRNA expression profile may be just small RNAs rather than miRNAs, which could be why the percentage of novel miRNAs was lower than the percentage of conserved miRNAs. Like the other studies, these results also confirmed U to be the most common nucleotide at the 5' end of miRNAs[30].
2.5. Prediction of target genes and signaling pathways
In organisms, genes usually interact with each other to play different roles in certain biological functions. An analysis based on pathways could allow us to better understand the biological functions of genes. KEGG is the major public pathway-related database.
The target genes were predicted on the guinea pig (Cavia porcellus) genome sequence (http://asia.ensembl.org/Cavia_porcellus/Info/Index?db=core) to seek out miRNAs involved in the lungs of guinea pigs as a model against the M. tuberculosis infection. A total of 19606 potential target genes were predicted (Additional file 7 Table S6). We identified 308 signaling pathways on the Kyoto Encyclopedia of Genes and Genomes (KEGG), including at least 12 pathways (Table 6) involved in the innate immune response of guinea pigs against M. tuberculosis infection, such as BCG, Toll-like receptor (TLR) and NF-kappa B signaling pathways, which could regulate host immune cell recognition and phagocytosis to pathogenic bacteria [31]; NF-kappa B[32], Wnt, MAPK, and TGF-beta signaling pathways are involved in the host inflammatory response and innate immune response[33]. The mTOR and p53 signaling pathways regulate the autophagy and apoptosis processes in immune cells [34]. A number of targets genes were detected. 223 belonged to signaling pathways involving tuberculosis, 119 belonged to the TLR signaling pathway, 158 belonged to the NF-kappa B signaling pathway, and 114 belonged to the TGF-beta signaling pathway (Table 5).
In the innate immunity of hosts infected with M. tuberculosis, TLRs are important pattern-recognition receptors for recognizing microbial pathogen-associated molecule patterns (PAMPs) and fighting aM. tuberculosis attack by initiating an immune response via tactivating TLR, NF-kappa B and many other signaling pathways that regulate the immune and inflammatory gene expression[35]. To better understand the biological functions of the conserved miRNAs in differential expression profiles that target to TLRs signaling pathways, putative target genes of miRNAs were predicted using TargetScan (http://www.targetscan.org/vert_72/) (Fig. 9). The results showed that 52 miRNAs were downregulated and 6 were upregulated on activated TLR and NF-kappa B signaling pathways in BCG-infected guinea pig lung models. This reminds us that these miRNAs may be involved in the host innate immunity response by regulating target genes in many signaling pathways.
2.6. Validation of miRNAs by quantitative polymerase chain reaction
To confirm the expression of the conserved miRNAs, quantitative RT-PCR (qRT-PCR) was performed on 6 randomly selected miRNAs, namely, miR-1421 g-5p, miR-3689a-3p, miR-2227, miR-4958-5p, miR-29d-3p and miR-1591-3p in non-infected and BCG-infected guinea pig lungs. Aliquots of the RNA samples used for sequencing were subjected to the qRT-PCR assay, with similar expression levels detected for all the 6 conserved miRNAs candidates by deep sequencing (Fig. 10).