Construction and investigation of lncRNA-miRNA-mRNA ceRNA network in thymic epithelial neoplasms

Background Competing endogenous RNA (ceRNA) networks may be used to relate the functions of protein-coding mRNAs with those of the non-coding RNAs, such as microRNAs (miRNAs) and the long non-coding RNAs (lncRNAs). ceRNAs enable the post-transcriptional regulation of gene expression by competing for the shared miRNAs. However, the role and function of the lncRNA-miRNA-mRNA ceRNA network in thymic epithelial neoplasms (TEN) remains unknown. Methods The miRNA, mRNA, and lncRNA expression profiles of 124 patients with TEN were downloaded from The Cancer Genome Atlas. We identified the differentially expressed (DE) miRNAs, mRNAs, and lncRNAs using the limma package in R software. The GDCRNATools package was used for the Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway annotations. Cytoscape software was used to construct the lncRNA-miRNA-mRNA ceRNA network. The Gene Expression Profiling Interactive Analysis platform was used to estimate the overall survival (OS) rates of the patients. Survival curves were analyzed using the log-rank test. Finally, the mRNAs in the ceRNA network were analyzed using the GOplot package in R. Results A total of 1513, 188, and 579 TEN-specific mRNAs, lncRNAs, and miRNAs, respectively, were identified. The lncRNA-miRNA-mRNA ceRNA network was constructed, and included 53 mRNAs, 4 lncRNAs, and 27 miRNAs. A total of 10 DEmRNAs (DLX2, C8orf88, CD38, GATA3, MAL, FOXQ1, FOLH1, NLRP12, HJURP, and ACSM1) and 1 lncRNA (SNHG3) were found to be significantly associated with OS ( P <0.05). Conclusion In this study, we constructed a lncRNA-miRNA-mRNA ceRNA gene regulatory network for TEN, and identified potential prognostic and diagnostic biomarkers, as well as therapeutic targets, for the disease.


Background
Thymic epithelial neoplasms (TEN) are the most common primary tumors of the anterior mediastinum and are associated with invasion, recurrence, and metastasis involving the heart, the mediastinum, and the pleura [1] [2]. As the thymus has both immune and endocrine functions, TEN may result in the development of systemic symptoms and paraneoplastic syndromes, including myasthenia gravis, 3 hypohemoglobinemia, systemic lupus erythematosus, and myocarditis. Surgical treatment is the first line treatment for TEN; however, chemotherapy and radiotherapy are often used as a comprehensive treatment strategy. The treatment of TEN is challenging, and the complete removal of the tumor, the histological classification of thymoma, and paraneoplastic syndromes affect patient prognosis [3]. The morphological heterogeneity of TEN complicates the identification of molecular abnormalities specific to the thymic tumors, and there is a lack of immunohistochemical diagnostic techniques and targeted drugs. Therefore, the main goal of the present study was to investigate the molecular genetics of TEN and to identify promising therapeutic targets for the treatment of this disease. Competitive endogenous RNAs (ceRNAs), including mRNAs, pseudogenes, long non-coding (lncRNAs), and circular RNAs (circRNAs), can compete with a common set of microRNAs (miRNAs) through miRNA response elements to regulate their abundance [4]. The role of ceRNAs in gene regulation has attracted increasing attention, and may lead to important insights into tumor progression. For example, shortening of the 3'UTR region of the mRNA through an alternative polyadenylation has been shown to play a major role in suppressing the expression of the tumor suppressor genes by disrupting the ceRNA crosstalk [5].
lncRNAs and circRNAs usually regulate their expression levels through a competitive binding with miRNAs. Previous studies have revealed that lncRNAs play an important role in TEN. In the estrogenmediated TEN mouse thymocytes, 36 differentially expressed (DE) lncRNAs affecting the cell proliferation, the cell cycle, and apoptosis were identified [6]. The FBXW7-associated lncRNAs regulate genes involved in the cell cycle and DNA repair, lymphocyte activation, and cell differentiation. Furthermore, they regulate genes controlling the response to interferon-beta in the radiation-induced thymic lymphoma in mice [7]. However, to the best of our knowledge, a lncRNA-miRNA-mRNA ceRNA network for TEN remains to be reported. Therefore, the role of the ceRNA network in TEN prognosis requires further investigation, particularly through the mining of clinical data.
The lncRNA-miRNA-mRNA ceRNA network plays an important role in the regulation of gene expression. The present study established a ceRNA network for TEN by analyzing raw data downloaded from The Cancer Genome Atlas (TCGA). Furthermore, potential prognostic biomarkers were identified. The results of this study may shed light into the molecular mechanisms underlying the development of TEN, and may aid the identification of novel therapeutic and diagnostic biomarkers for the disease.

Patients and samples
RNAseq, miRNA, and clinical data from the TCGA-THYM study were downloaded from the TCGA database. Incomplete data (including sex, age at initial diagnosis, and survival time) were excluded. A total of 124 patients with TEN, including 94 patients with thymic cancer and 30 patients with cardiac, mediastinal or pleural metastases, were included in the present study. Sixty of the patients were females and the remaining 64 were males. A total of 115 patients survived, and nine succumbed. The number of Asian, black or African American, white, and patients with undisclosed ethnicity was 13, 6, 103, and 2, respectively.
Standardization and normalization of the RNA sequences Data were parsed in the R software and duplicated samples were removed. Analysis of the mRNAs and lncRNAs in samples of the non-primary tumor and the non-solid tumor tissues was performed.
Subsequently, the original RNAseq metadata and clinical data were merged. The merged data was normalized using the edgeR and the limma packages.

Analysis of the DE genes (DEGs)
The gdcDEAnalysis package in the R software was used to identify the DE lncRNAs, mRNAs, and miRNAs with cut-off values of | logFC |> 2 and P < 0.01. The DEGs were visualized using a bar plot and a volcano plot.

Functional enrichment analysis
The gdcEnrichAnalysis package in the R software was used to perform the Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses. Bar and bubble plots were used to visualize the analyses.
Construction of the lncRNA-miRNA-mRNA ceRNA network 5 lncRNA-miRNA-mRNA ceRNA networks are based on the theory that lncRNAs can directly sponge miRNAs, thereby regulating their activities. The gdcCEAnalysis package in the R software was used to identify the miRNA-mRNA and the miRNA-lncRNA interactions. Thereafter, the Cytoscape software (version 3.7.2) was used to construct the lncRNA-miRNA-mRNA ceRNA network. Finally, the proteincoding genes in the ceRNA network were visualized using the GOplot package in the R software.

Survival analysis
Gene Expression Profiling Interactive Analysis (GEPIA; http://gepia.cancer-pku.cn/) is an interactive web server for cancer and normal gene expression profiling and interactive analyses [8]. To further clarify the relationship between the RNAs and survival rate of patients with TEN, GEPIA and the logrank test (the Mantel-Cox test) were used for survival and statistical analyses, respectively. We identified the DE mRNAs and lncRNAs in the ceRNA network with significant associations with the OS, and survival curves were drawn. P < 0.05 was considered to indicate a statistically significant difference.
Construction and analysis of the lncRNA-miRNA-mRNA ceRNA network We constructed a ceRNA network based on the expression of miRNAs, lncRNAs, and mRNAs in patients with TEN. A total of 53 mRNA-, 27 miRNA-, and 4 lncRNA-nodes were identified. The network has been shown in Fig. 6. In addition, the GOCord circle graph shows the relationship between the genes and the GO terms (Fig. 7a), the GOCircle chart shows the 10 most significant GO terms ( Fig. 7b), and GOBar Plot shows the P value of the relevant GO terms (Fig. 7c were significantly associated with the prognosis of patients with TEN (P < 0.05; Fig. 8).

Discussion
TENs are the most common non-lymphomatous primary neoplasms of the anterior mediastinum, and are difficult to diagnose and treat. TENs have the potential to undergo malignant transformations, and complete resection and radiation therapy are only effective at the early Masaoka stages. Advanced, unresectable, or recurrent TEN are typically treated with palliative chemotherapy [9 23] [ 10 24 ]. TEN type C is the most aggressive subtype of TEN and has a poor survival rate. Studies have shown that the majority of thymic carcinomas are diagnosed at advanced stages, and the cumulative incidence of recurrence at 5 years is 35% [ 11 1]. Variations in the morphological appearance of TEN and heterogeneous neoplastic epithelial cells confound the identification of the biological mechanisms involved in the development of TEN. Therefore, it is necessary to prospectively study the prognosis of patients with TEN. The importance of the non-coding transcriptome has become increasingly evident 7 in recent years [12]. The ceRNA hypothesis suggests that all RNA transcripts containing miRNA binding sites may competitively interact with miRNAs, thereby acting as ceRNAs [12 8]. ceRNAs, which include lncRNAs and mRNAs, mutually regulate each other by competing for their shared miRNAs. This type of ceRNA crosstalk has been widely reported in physiological and pathological processes [13]. As ceRNAs serve a regulatory role in tumorigenesis, they may also serve as potential prognostic biomarkers [14] [15] [16]. Thus, the mining of prognostic biomarkers by analyzing ceRNAs may aid the development of novel anti-TEN drugs, thereby improving the survival rate of patients.
In the present study, a total of 1513 DE mRNAs, 188 DE lncRNAs, and 579 DE miRNAs in TEN tissues were identified, based on the RNA-seq data obtained from TCGA. The GO and the KEGG analyses Accumulating evidence suggests that lncRNAs play important roles in cancer progression. Among the four lncRNAs in the ceRNA network, only SNHG3 was statistically significant. Indeed, research on the mitochondrial proteomes in ovarian cancer cells has shown that the lncRNA SNHG3 regulates energy metabolism [17]. Furthermore, SNHG3 has been revealed to promote the progression of non-small cell lung carcinoma [18] and gastric cancer [19]. However, studies have shown that the other three lncRNAs identified in the present study may also be associated with cancer. lncRNA Neat1 is essential for the paraspeckle formation and plays a vital role in inhibiting the transformation of oncogenic 8 signals, particularly p53. Research shows that p53 induces NEAT1 paraspeckle formation, reduces the replication-associated DNA damage. Neat1 deficiency enhances the development of a premalignant pancreatic intraepithelial neoplasia and cystic lesions in the KrasG12D-expressing mice [20]. In addition, NEAT1 targeting in the established human cancer cell lines induced a synthetic lethality with genotoxic chemotherapeutics and a nongenotoxic activation of p53 [21]. The expression of MAGI2-AS3 is down-regulated in liver cancer [22], bladder cancer [23], and breast cancer [24]. HCG11 promotes the proliferation and migration of gastric cancer cells by targeting miR-1276/CTNNB1 and activating the Wnt signaling pathway [25]. The aforementioned studies suggest that HCG11, MAGI2-AS3, SNHG3, and NEAT1 may serve as valuable prognostic markers in TEN. To the best of our knowledge, the present study was the first to investigate the role of these four lncRNAs in the diagnosis and prognosis of TEN through bioinformatics analysis. Therefore, these lncRNAs may provide important insights for the treatment of TEN in the future. It is important to note that the ceRNA prognostic biomarkers identified in this study were obtained through bioinformatics analysis, therefore future experimental studies or clinical trials will be required to validate these results.

Conclusion
The present study identified the TEN-specific lncRNAs, miRNAs, and mRNAs. A lncRNA-related ceRNA network was constructed, and lncRNAs were revealed to serve as potential prognostic biomarkers for TEN. This study may improve our understanding of the role of ceRNA regulation in the development of TEN, and lays the foundation for future clinical research.

Availability of data and materials
All relevant data are within the manuscript.

Ethics approval and consent to participate
No ethics approval was required for this work. All utilized public data sets were generated by others who obtained ethical approval.

Consent for publication
Not Applicable.  GOCord circle graph shows the relationship between the genes and the GO terms ( Figure   7a), the GOCircle chart shows the 10 most significant GO terms (Figure 7b