Identification and characteristics of lncRNAs
At first, the raw data was filtered and trimmed to eliminate the adaptor sequences and low-quality reads using Trimmomatic. This process yielded an average of 121,690,584 clean reads for each sample (Table S1). Then the clean reads were utilized to re-construct the new transcriptome for human for the present study. Firstly, all the clean data were aligned to human genome sequences using STAR (Table S2). After that, the aligned bam files were subjected to StringTie to assemble into 215,331transcripts (Table S3). We simply compared these assembled transcripts against the known transcripts of human derived from Ensembl database via BLASTn, the result showed that 16,220 (~7.5%) novel transcripts were identified in the present study.
To identify high-confidence lncRNAs for this study, we initially extracted the known lncRNAs based on the annotation information of gtf file of Homo Sapiens (GRCh38.p10) from Ensembl database (details see Methods). As a result, a total of 52,791 known lncRNAs were detected. On the other hand, we applied a stringent stepwise filtering pipeline (Figure 1) to identify novel lncRNAs. This analysis generated 5,310 novel lncRNAs (Table S4). Furthermore, aimed at assessing the quality of identified lncRNAs, we performed a comparative study between lncRNAs and mRNAs before and after treatment in the transcript length, exon counts, open reading frame (ORF) length and expression level (Figure 2). The results indicated that there was no obvious difference of expression level among these patients no matter in mRNAs or lncRNAs (Figure 2-A). The similar situation exists in the comparison of expression before and after treatment (Figure 2-B). In consistent patterns of with previous studies of eukaryotic lncRNAs, the length of the majority of identified lncRNAs and the corresponding ORF was much shorter than that of mRNAs (Figure 2-C, D), and the exon count of the lncRNAs was also less than that of mRNAs (Figure 2-E).
Moreover, we found that a small proportion of known and novel lncRNAs specifically expressed in distinct status (Figure 3-A, B). In addition, the length of majority of known lncRNAs ranges from 500bp to 700bp, whereas the length of lncRNAs were found mainly shorter than 300bp (Figure 3-C). We also detected five main alternative splicing (AS) modes for these transcripts using AStalavista [21]. The results showed that there was no significant discrepancy in those five AS modes before and after treatment (Figure 3-D).
Differential expression analysis reveals desensitization-treatment-related module
To investigate the lncRNAs as well as their possible roles implicated in asthma before and after treatment, we initially performed paired differential-expression analysis using DESeq2 package for R [22]. The results indicated that only 54 transcripts (15 lncRNAs and 39 mRNAs) were significantly differentially expressed in the patients after treatment (Table 1, Figure 4-A). After that, all the differentially expressed transcripts were subjected to metascape to search for possible functional biological processes. As a result, two biological processes were significantly enriched, including humoral immune response and cytoplasmic vesicle membrane (Figure 4-B).
Co-expression network analysis reveals desensitization-treatment-related module
Alternatively, we utilized a more sensitive and robust approach to explore the lncRNAs implicated in asthma, i.e., co-expression network analysis, which was achieved via WGCNA in R. Firstly, the expression profile of all transcripts was assessed using featurecounts and normalized by DESeq2 package for R. Then the cluster analysis was conducted using flash tools package of WGCNA to detect outliers of samples. The results indicated that all samples were well clustered and passed the cuts (Figure S1). A network-topology analysis based on different soft-thresholding powers was performed to get relative balance scale independence and mean connectivity of WGCNA. The optimal power for the scale-free topology fit index was calculated out for construction of hierarchical clustering tree (Figure S2). This analysis clustered all the correlated transcripts into 80 modules with strong correlation with trait (Figure S3). Each module contains independent datasets of transcripts (includes mRNAs and lncRNAs). The interactions among those modules were visualized in Figure S4 with randomly selected transcripts. After that, the modules significantly associated with specific trait (desensitization treatment) was detected using the function plotEigengeneNetworks of WGCNA (Table 2). The results showed that a module (khaki1) were significantly correlated to the trait (desensitization treatment) (Figure 5), which contains 27 mRNAs and 21 lncRNAs. The relationship of transcript significance with module membership was visualized in Figure S5.
Functional analysis of module reveals key genes related to asthma
In order to investigate the potential relationship of transcripts which co-localizes in desensitization-treatment-related module with asthma treatment, the annotated transcripts (mRNAs) of the module were subjected to a gene annotation server Metascape for functional enrichment analysis. The result showed that one biological process (GO:0043065: positive regulation of apoptotic process) was significantly enriched (Figure 5) by four genes: TRIO, PEA15, KLRK1, NDUFA13.
In addition to these four apoptosis-related genes (TRIO, PEA15, KLRK1, NDUFA13), we also found that the other genes in desensitization-treatment-related module exhibiting distinct degree of correlations to asthma via literature review [23-37]. Concretely, four genes (ATF7IP, SLC43A3, AKAP7, HMGN1) were found to be correlated to pathogenesis or treatment of asthma in immune. For instance, ATF7IP can form a protein-complex by binding to MBD1 (methyl CpG binding protein 1) and Aire (autoimmune regulator) to maintain immune tolerance [38]. Immunological tolerance to self-antigens is critical to the prevention of autoimmune disease, including asthma. In addition, two genes (ELMO2, WDFY1) were demonstrated to play roles in inflammation response of asthma. ELMO2 was found as an ILK-binding protein to play key role in integration of adhesion and migration pathways. Abnormal regulation of migration is associated to chronic autoinflammatory disorder, such as asthma [33]. Furthermore, two genes (G2E3, ZNF19) were found to be differentially methylated between asthmatics and non-asthmatics [34, 35]. Combined with four apoptosis-related genes, all those asthma-related genes could be summarized in Table 3.
Network analysis based on PCC reveals key lncRNAs implicated in asthma
Our aforementioned analysis has successfully assisted us to identify several asthma-related genes in desensitization-treatment-related module. However, the correlation of lncRNAs with asthma in the module were still unknown. For that, we performed a network analysis to establish RNA-RNA interaction network for all the 48 transcripts (including mRNAs and lncRNAs) in desensitization-treatment-related module based on the remaining 102,041 assembled transcripts using pearson correlation coefficient (PCC) method. The RNA-RNA pairs which had absolute value of pearson correlation ≥0.7 and P-value<0.05 were retained. This analysis detected a total of 3,502 RNA-RNA interaction relationships including mRNA-mRNA, mRNA-lncRNA, which involved 1,654 mRNAs and 864 lncRNAs (Figure 6).
We further concerned which lncRNAs target to the 17 asthma-related genes as well as their possible roles in asthma. Since that, we extracted a sub-network of these 17 asthma-related genes from the above network (Figure 7-A). This step indicated that 323 lncRNAs exhibited potential correlations to these asthma-related genes, implying they may play roles in pathogenesis and treatment of asthma. To further investigate their possible functions implicated in asthma, the target mRNAs for each asthma-related gene were extracted to be subjected to Metascape server for functional analysis. The results showed that majority of target mRNAs of asthma-related gene could significantly enriched by diverse biological processes (Figure 7-A).
Among them, many RNA-RNA interactions aroused our attentions since both sides of interaction co-localizes in the desensitization-treatment-related module (Figure 7-B), which suggested their strong correlation with asthma. Concretely, three known lncRNAs (LINC01771, LINC02145, AL031289.1) and three novel lncRNAs (MSTRG.24219.1, MSTRG.28178.1, MSTRG.18506.1) were found to interact with many asthma-related genes. Notably, LINC02145 exhibited to be correlated to six asthma-related genes, including ZFN19, G2E3, ATF7IP, PEA15, SPDYE6, VWA5A. Despites there is no clinical significance was reported in this known lncRNA, our analysis indicated that its strong association with asthma before and after desensitization treatment. A similar situation was found in a novel lncRNA MSTRG.18506.1 which interacts with three asthma-related genes, i.e., SPDYE6, ATF7IP, VPS37A. Interestingly, in addition to lncRNAs, we found some other types of noncoding RNAs exhibited correlations with asthma-related genes, including antisense, snoRNA, snRNA and pseudogene. Notably, our analysis indicated that a pseudogene GUSBP2 (also defined as lncRNA in GeneCards Suite [39]) was found to interact with the immune/inflammatory-related genes, including KLRK1, HMGN1 and WDFY1. KLRK1 encodes NKG2D, and aberrant expression of NKG2D ligands has been reported in sites of inflammation and in tissues undergoing autoimmune pathologies, including asthma [40]. Klrk1-/- mice exhibited a profoundly impaired inflammatory response to allergen challenge compared with klrk1+/+ mice [36]. HMGN1 encodes the nucleosome-binding protein, which was reported as a potent alarmin that binds TLR4 and induces antigen-specific Th1 immune responses [27]. WDFY1 could mediate Toll-like receptor3/4 (TLR3/4) signaling by recruiting TRIF. TLRs were found to be expressed in the lung tissue and some cells of innate and adaptive immune system. Abnormal expression of TLRs could lead to aberrant expression of many inflammatory and anti-inflammatory mediators [36]. These 3 genes thereby have been well demonstrated to be closely associated to immune and inflammatory response in asthmatic status. Regarding GUSBP2, the associations with asthma actually has been reported in a microarray study for asthma, which showed that GUSBP2 was differentially expressed in comparison of asthmatic patients with health controls [41]. However, its detailed mechanisms in asthma was not illuminated. In the present study, our network analysis disclosed GUSBP2 exhibiting correlations to these immune/inflammatory-related genes, implying it might target to these genes to participate the regulation of immune and inflammatory process in asthma. Indeed, GUSBP2 was found as the highest up-regulated fold change of lncRNA in CD (Crohn’s disease) patient plasma via microarray screening [42]. CD is known to us as one of inflammatory bowel diseases, which suggests the existence of strong association between GUSBP2 and inflammatory reaction