Differentially expressed RNAs in UCEC
In a study of 575 UCEC samples collected from TCGA (http://cancergenome.nih. gov/publications/publication guide lines), "edge R" (Adjust P < 0. 01 and | log2FC |>2 ) was adjusted to identify the DERNAs, including DElncRNAs, DEmiRNAs, DEmRNAs. A total of 1513 up-regulated DEmRNAs, 914 down-regulated DEmRNAs, 686 up-regulated DElncRNAs, 330 down-regulated DElncRNAs, 124 up-regulated DEmiRNAs, and 50 down-regulated DEmiRNAs were identified in TCGA database. Similarly, a total of 116 samples were identified with the same criteria in the CPTAC database, including 2463 differential DEmRNAs, 1741 DElncRNAs and 204 DEmiRNAs. The heatmap and volcano plot of those highest dysregulated DElncRNAs, DEmiRNAs and DEmRNAs were shown in Fig.1 and the top 10 up and down regulated DElncRNAs, DEmiRNAs and DEmRNAs were shown in supplement Tables 1.
Construction of PPI Network
The 1981 DEmRNAs (| log2FC | >3) were further selected to construct PPI network to select hub genes that play crucial roles in UCEC genesis. Given the large number of DEmRNAs in this module, we used MCC algorithm in “Cytohubba” of Cytoscape software to visualize and select hub genes in the PPI network. The top 20 high score genes in in CPTAC were shown (Fig.2A). Similarly, the top 20 high score genes in TCGA belong to Histone cluster 1 H family which was shown (Fig.2B).
Two graphs separately describe the top 20 most dynamic hub genes and their intersection relationships evaluated by MCC algorithm in “Cytohubba”. These sub-graphs of these selected mRNA-coding protein nodes are shown from highly essential (red) to essential (yellow).
TIICs Enrichment Analysis
Using CIBERSORT algorithm, we evaluated 101 tumor transcriptome profiles from CPTAC database and 543 tumor transcriptome profiles from TCGA database (Fig.3A,3B). In addition, we also performed ssGSEA analysis by GSVA package to score the corresponding TIICs in each simple sample, and finally we found that TIICs in both CPTAC and TCGA database sources expressed well (Fig.3C,3D). On account of the top five immune cell components in UCEC patients by CIBERSORT algorithm, 2 bar graphs were then visualized. In CPTAC, there were abundant CD8 T cells (28.5%) and plasma cells (20.5%) and other TIICs. In TCGA samples, naive CD4 cells (18.9%) and CD8 cells (15.2%) were well infiltrated, accompanied by slightly increased activated NK cells (10.2%), plasma cells (9.1%) and macrophages M0(7.5%) (Fig.3E,3F).
Construction of ceRNA Network and hub LncRNA-miRNA-mRNA subnetwork
In order to better comprehend the interactions of mRNAs, lncRNAs, and miRNAs in UCEC, we constructed an lncRNA-mediated ceRNA regulatory network. To begin with, 1741 DElncRNAs in CPTAC database succeeded to match with 123 lncRNAs in the miRCODE database. Considering 123 of 1741 DElncRNAs could interact with DEmiRNAs, 36 miRNAs in both miRCODE and CPTAC database were selected to construct lncRNA-miRNA pairs. Meanwhile, to interplay with 204 DEmiRNAs acquired from CPTAC database, we retrieved 1420 mRNAs in three databases (miRTarBase, miRDB and TargetScan). The 1420 miRNA-targeted mRNAs predicted in these databases were intersected with 2463 DEmRNAs thus to obtain 124 miRNA-targeted mRNAs belonging to CPTAC database. Finally, an lncRNA-mediated ceRNA network consisting of 36 miRNAs, 123 lncRNAs and 124 mRNAs were achieved (Fig.4A). Meanwhile, the same workflow of UCEC-specific ceRNA network construction was repeated in data from TCGA. We obtained an lncRNA-mediated ceRNA network consisting of 38 miRNAs, 83 lncRNAs and 110 mRNAs (Fig.4B).
Furthermore, cytohubba was applied to visualize our extracted hub genes composing of lncRNA, miRNA and mRNAs and derived regulatory ceRNA network thus to identify potentially prognostic molecular pathways of UCEC in CPTAC and TCGA (Fig.4C,4D). A total of 6 hub lncRNA-miRNA-mRNA regulatory relationships from 2 databases were shown (Table 2). Moreover, coincident ceRNA results in the overall ceRNA network from both CPTAC and TCGA databases were shown in the Venn diagram (Fig.4E).
Table 2 CeRNA subnetwork of prognostic regulatory DEGs in UCEC from CPTAC (up) and TCGA (down) database
LncRNA
|
miRNA
|
mRNA
|
TRBV11-2
|
has-mir-363
|
SOX11
|
MEG8
|
has-mir-424
|
CCNE1, CBX2
|
has-mir-363
|
SOX11
|
has-mir-183
|
DLX4, NR3C1
|
lncRNA
|
miRNA
|
mRNA
|
LINC00443, C2orf48, LINC00483
|
miR-183
|
DLX4, NR3C1
|
DGCR5
|
has-mir-195
has-mir-383
has-mir-424
|
CCNE1
|
Identification and Validation of Prognostic lncRNAs in ceRNA networks
In order to figure out the effects of interactions for survival between lncRNAs, miRNAs and mRNAs, we imported survival-related data of UCEC and genes in ceRNA to analyze its prognosis. Survival R were operated for DERNAs significantly correlated with overall survival in the ceRNA network(p < 0. 05), the results of which were plotted by Kaplan–Meier (K-M) curves (Fig.5A-5W).
As shown in CPTAC, 4 survival-related DElncRNAs on the level of 3-year survival were identified in DElncRNA-mediated ceRNA networks, including FREM2-AS1, HPYR1, LINC00028, MIR205HG (Fig.5A-5D). Similarly, 19 survival-related DElncRNAs, 9 DEmiRNAs and 33 DEmRNAs on the level of 5-year survival were revealed in the ceRNA network for TCGA. Fig.5E-5W illustrated Kaplan–Meier curve analyses about 10 lncRNAs of 19 survival-related DElncRNAs, 5 mRNAs of 33 DEmRNAs and 2 of 9 DEmiRNAs derived from TCGA database. Besides, we successfully validated 5 survival-related lncRNAs in GEPIA database by p values ≦0.1 (Fig.6A-6E).
To further identify DElncRNAs with prognostic features in a more accurate way, multi-Cox regression analyses and corresponded ROC curves were carried out. After eliminating some samples lacking in survival time, 94 complete samples in CPTAC were divided into the high-risk (n=47) and low-risk (n=47) groups (cutoff value = -0. 78) and 543 samples with complete survival information in TCGA into the high-risk (n=272) and low-risk (n=272) cohort by median value (cutoff value= -0. 18; one sample of survival data was just in the median and counted in both groups). We performed a multi-factor COX regression analysis and a global survival analysis of the model thus separately identified two lncRNAs of 3-year survival data in CPTAC and eight lncRNA prognosis candidates of 5-year survival UCEC data in TCGA by p < 0.05 (Fig.7A,7B,7C,7F)). Receiver operating characteristic (ROC) curves tested the influence on their lncRNA signatures associated with overall survival in UCEC. Area under ROC curve of 3-year survival rate (AUC) and 5-year survival rate (AUC) were respectively 0.967 and 0.751. (Fig.7B,7E) Besides, multivariate cox regression analysis of totally 10 prognostic lncRNAs associated with overall survival in UCEC patients generated from 2 databases were shown in Table 3.
Table 3 Multivariate Cox regression analysis of totally 10 prognostic lncRNAs associated with overall survival (OS)
LncRNA(in TCGA)
|
Coef
|
HR
|
95%CI_LL
|
95%CI_HL
|
p-value
|
FAM41C
|
0.017169325
|
1.017317565
|
1.007489748
|
1.027241249
|
5.27E-04
|
MIR7_3HG
|
0.003483874
|
1.00348995
|
1.001209451
|
1.005775643
|
0.002688886
|
LINC00483
|
0.010635045
|
1.010691798
|
1.003527131
|
1.017907617
|
0.003389722
|
ABHD11_AS1
|
8.26E-04
|
1.000826495
|
1.00038596
|
1.001267224
|
2.35E-04
|
LINC00443
|
0.007147984
|
1.007173592
|
1.001433816
|
1.012946267
|
0.014233205
|
OXCT1_AS1
|
0.026906491
|
1.027271739
|
1.006299256
|
1.048681314
|
0.010568912
|
PRICKLE2_AS2
|
0.273785608
|
1.314932861
|
1.133483915
|
1.525428288
|
3.02E-04
|
GLIS3_AS1
|
9.53E-04
|
1.000953853
|
1.000210868
|
1.001697389
|
0.011853011
|
LncRNA(in CPTAC)
|
Coef
|
HR
|
95%CI_LL
|
95%CI_HL
|
p-value
|
MEG8
|
0.007280976
|
1.007307546
|
1.003839185
|
1.010787892
|
3.51E-05
|
TRBV11_2
|
0.027816351
|
1.028206838
|
1.012679086
|
1.043972683
|
3.40E-04
|
In the multivariate Cox regression analysis derived from TCGA database, 8 lncRNAs including FAM41C, MIR7_3HG, LINC00483, ABHD11_AS1, LINC00443, OXCT1_AS1, PRICKLE2_AS2 and GLIS3_AS1 were identifed to construct the OS prediction model. OS-related prediction model=(0.017169325* expression value of FAM41C)+(0.003483874* expression value of MIR7_3HG)+(0.010635045* expression value of LINC00483)+(8.26E-04* expression value of ABHD11_AS1)+(0.007147984* expression value of TRBV11_2)+(0.027816351* expression value of TRBV11_2)+(0.027816351* expression value of LINC00443)+(0.026906491* expression value of OXCT1_AS1)+(0.273785608* expression value of PRICKLE2_AS2)+(9.53E-04* expression value of GLIS3_AS1). We divided the 543 UCEC cases into the high and low-risk groups according to the median values of the OS-related prediction model.
In the multivariate Cox regression analysis derived from CPTAC database, 2 lncRNAs including MEG8 and TRBV11_2 were identified to construct the OS prediction model. OS-related prediction model=(0.007280976* expression value of MEG8)+(0.027816351* expression value of TRBV11_2). We divided the 94 UCEC cases into the high and low-risk groups according to the median values of the OS-related prediction model.
Validations of survival analysis and mRNA Expression at the Transcriptional Level
To further demonstrate the prognostic significance of 33 mRNAs screened from the ceRNA network, we selected external databases for survival analysis and validation with IHC images. Firstly, we input 33 screened mRNAs into HPA database (version 20.1; https://www.Proteinatlas.org/about/assays+annotation#tcga_survival) to validate whether they were associated with the prognosis of UCEC. Consequences revealed that 8 mRNAs(CBX2, CCL22, CCNE1, DLX4, IGFBP5, NR3C1, SOX11, POLQ) highly expressed in UCEC were closely related with its prognosis(log rank P values <0.001). Subsequently, we retrieved overall survival analyses of 8 mRNAs generated from GEPIA by filtered criteria of P values ≤0.1(although 0.11 was also considered as significant in this study) and verified 5 mRNAs (CCNE1, CCL22, NR3C1, IGFBP5 and POLQ).
Based on two previous steps for external verifications, two IHC images of the last screened mRNAs (CCNE1, NR3C1) in the HPA database approved the same results (Fig.8A). Survival validations of 5 mRNAs including CCNE1, CCL22, NR3C1, IGFBP5 and POLQ from GEPIA were shown in the Fig.8B-8F. In this study, we identified 5 survival-related mRNAs, there were no related IHC samples of CCL22, IGFBP5 and POLQ but CCNE1 and NR3C1 to further validate in the HPA database. The translational expression level of CCNE1 and NR3C1 was positively linked with disease status, as they were up-regulated in UCEC samples.
Enrichment Analyses of Functional Pathways
To elucidate the biological functions represented in two profiles of identified DEmRNAs, we performed enrichment analyses mainly by "cluster profiler", with the standard of p <0. 05. In this study, GO analyses disclosed that top significant GO terms (p < 0. 05) commonly obtained from UCEC data in CPTAC (Fig.9A,9B) and TCGA database (Fig.9E,9F). The KEGG analyses revealed that what closely related to DEmRNAs originated from CPTAC were mainly enriched in pathways such as “cellular senescence”, “proteoglycans in cancer” and “microRNAs in cancer” (Fig.9C,9D)). The KEGG results derived from DEmRNAs in TCGA database were shown (Fig.9G,9H). The top 20 GO and KEGG results for TCGA and CPTAC database were provided in supplement Table 2.