3.1 Exploration of immune-related and differentially expressed genes and TFs in EC
We identified 6267 DE genes (3861 upregulated and 2406 downregulated) between tumor and normal tissues (Fig. 1a, b). We then collected 2498 immune-related genes, of which 410 were differentially expressed in EC (Fig. 1c, d). The association between these 410 DE immune-related genes and patient OS was estimated by univariate Cox analysis, and 53 OS-related DE immune genes with a P-value<0.05 were selected for subsequent analysis (Fig. S2). A total of 102 TFs (55 upregulated and 47 downregulated) were differentially expressed between tumor and normal samples; all of them were among the 6267 DE genes (Fig. 1e, f).
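The selection of DE immune-related genes described above amounts to intersecting two gene lists. A minimal sketch, assuming the gene symbols are held in Python sets; the symbols and the helper name immune_de_genes are hypothetical placeholders, not genes or code from this study:

```python
# Hypothetical placeholder symbols; the real lists hold 6267 DE genes
# and 2498 immune-related genes.
de_genes = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
immune_genes = {"GENE_B", "GENE_D", "GENE_E"}


def immune_de_genes(de, immune):
    """Return the sorted immune-related genes that are also differentially expressed."""
    return sorted(de & immune)


overlap = immune_de_genes(de_genes, immune_genes)
print(overlap)
```

Applied to the full lists, an intersection of this kind would yield the 410 DE immune-related genes reported above.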
3.2 The network of TFs and genes
To better understand the potential functions of the DE TFs and the prognosis-related DE immune genes in EC, we constructed a TF-gene network based on coexpression. The interaction relationships are listed in Table S1, and the interaction network is shown in Fig. 2.
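A coexpression network of this kind can be sketched by correlating each TF with each gene across samples and keeping the pairs that pass a cutoff. The correlation measure (Pearson), the cutoff of 0.4 and all names and values below are assumptions made for illustration; the criteria actually used are not stated in this section.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpression_edges(tf_expr, gene_expr, min_abs_r=0.4):
    """List (TF, gene, r) pairs whose |r| reaches the assumed cutoff."""
    edges = []
    for tf, tf_values in tf_expr.items():
        for gene, gene_values in gene_expr.items():
            r = pearson(tf_values, gene_values)
            if abs(r) >= min_abs_r:
                edges.append((tf, gene, round(r, 3)))
    return edges


# Hypothetical expression values across five samples.
tf_expr = {"TF_1": [1.0, 2.0, 3.0, 4.0, 5.0]}
gene_expr = {
    "GENE_X": [2.0, 4.0, 6.0, 8.0, 10.0],  # rises in step with TF_1
    "GENE_Y": [3.0, 1.0, 3.0, 1.0, 3.0],   # no linear relation to TF_1
}
edges = coexpression_edges(tf_expr, gene_expr)
```

Each retained pair corresponds to one edge of the network; in practice a significance threshold on the correlation P-value is usually applied as well.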
3.3 GO and KEGG analysis of TFs and immune-related genes
The enrichment analysis identified 29 GO terms and 30 KEGG pathways. Several tumor-related regulatory pathways were enriched, including the PD-L1 expression and PD-1 checkpoint pathway, the MAPK pathway and the PI3K-Akt pathway. The enriched GO terms were dominated by regulatory functions, such as transcription factor complex, nuclear chromatin, and DNA-binding transcription activator activity. The GO terms in EC are shown in Fig. 3a, b, and the significantly enriched KEGG pathways in EC are shown in Fig. 3c, d.
3.4 Identification of prognostic lncRNAs
A total of 204 lncRNAs that may be related to immunity were identified. We then divided the 541 patients into a training set and a testing set by complete randomization. Univariate Cox regression analysis in the training set identified 4 immune-associated lncRNAs that correlated with OS (P<0.01). These 4 lncRNAs were further filtered by LASSO regression, performed with the glmnet R package (iteration=1000). The coefficient trajectories of the four independent variables are presented in Fig. 4a. Cross-validation was applied for model construction (Fig. 4b), and the mean cross-validated error was minimal at λ=0.0216. At this value, all 4 immune-related lncRNAs were retained as closely associated with overall survival in EC. Multivariate Cox regression then showed that three of the four lncRNAs (FP671120.4, LINC02381 and AC074212.1) had positive coefficients and may be indicators of poor prognosis, while the remaining lncRNA (LNCTAM34A) could be a favorable prognostic factor (Fig. 4c).
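The random division of the 541 patients can be illustrated as follows. This is a minimal sketch, assuming a simple shuffle with a fixed seed; the patient identifiers, the seed and the function name random_split are hypothetical, and the training-set size of 272 is the one reported for the training cohort.

```python
import random


def random_split(patient_ids, n_train, seed=0):
    """Shuffle patient IDs and return (training, testing) lists."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    return ids[:n_train], ids[n_train:]


patients = [f"P{i:03d}" for i in range(1, 542)]  # 541 hypothetical IDs
train, test = random_split(patients, n_train=272)
print(len(train), len(test))
```

Fixing the seed makes the partition reproducible, which matters when the same training and testing sets are reused across analyses.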
3.5 The 4-lncRNA signature for survival prediction
In the training cohort, 272 samples were classified into a high-risk group (n=136) and a low-risk group (n=136) using the median risk score as the cutoff. Kaplan-Meier survival analysis revealed that patients with high-risk scores had significantly poorer OS than those with low-risk scores (P=4.602e-03, Fig. 5a). The AUC of the 4-lncRNA signature was 0.717 (Fig. 5b). The distribution of the risk score, the survival duration of EC patients and a heatmap of the expression profiles of the 4 prognostic lncRNAs are shown in Fig. 5c. Patients in the high-risk group had poorer survival than patients in the low-risk group.
The testing set and the entire set were each divided into a high-risk group (n=137 in the testing set, n=273 in the entire set) and a low-risk group (n=132 in the testing set, n=268 in the entire set) according to the 4-lncRNA signature. Patients with high-risk scores had poorer survival outcomes than patients with low-risk scores (Fig. S3a, S4a). The AUC of the 4-lncRNA signature was 0.686 in the testing set and 0.703 in the entire set (Fig. S3b, S4b).
PCA was performed to explore the biological function of the 4-lncRNA signature in EC, based on the 4 lncRNAs in the model, the whole-genome expression set and the immune-related lncRNA set (Fig. 5d-f). With the four signature lncRNAs and with the immune-related lncRNAs, patients in the low- and high-risk groups were separated in two different directions. This indicates that EC patients in the low- and high-risk groups generally display distinct immune status patterns, and that these immune states can be distinguished by the lncRNA signature.
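The median-based grouping can be sketched as follows, assuming the risk score takes the usual linear form (each lncRNA's expression weighted by its regression coefficient). The coefficient magnitudes and expression values below are invented for illustration; only the signs (three positive, one negative) follow the results reported here.

```python
from statistics import median

# Hypothetical coefficients: signs follow the text, magnitudes are invented.
COEFFICIENTS = {
    "FP671120.4": 0.30,
    "LINC02381": 0.20,
    "AC074212.1": 0.10,
    "LNCTAM34A": -0.25,
}


def risk_score(expression):
    """Sum of lncRNA expression values weighted by the coefficients."""
    return sum(COEFFICIENTS[name] * expression[name] for name in COEFFICIENTS)


def split_by_median(scores):
    """Label a patient 'high' if the score exceeds the median, else 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical expression profiles for four patients.
profiles = {
    "P1": {"FP671120.4": 2.0, "LINC02381": 1.0, "AC074212.1": 1.0, "LNCTAM34A": 0.0},
    "P2": {"FP671120.4": 0.0, "LINC02381": 0.0, "AC074212.1": 0.0, "LNCTAM34A": 2.0},
    "P3": {"FP671120.4": 1.0, "LINC02381": 1.0, "AC074212.1": 0.0, "LNCTAM34A": 0.0},
    "P4": {"FP671120.4": 0.0, "LINC02381": 1.0, "AC074212.1": 0.0, "LNCTAM34A": 1.0},
}
scores = {pid: risk_score(expr) for pid, expr in profiles.items()}
groups = split_by_median(scores)
```

In this toy example the patients with high expression of the positively weighted lncRNAs fall in the high-risk group, while high LNCTAM34A expression lowers the score.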
3.6 Assessment of independent risk factors
Independent risk factors were estimated and verified by Cox regression analyses. Univariate Cox regression analysis identified age, histologic grade and the risk score based on the 4 immune-related lncRNA signature as factors influencing survival (Fig. 6a). Multivariate Cox regression showed that age (HR=1.023, P=0.049), grade (HR=2.378, P<0.001) and risk score (HR=1.045, P<0.001) were all independent prognostic indicators of EC (Fig. 6b). ROC curves were then calculated to assess the prognostic accuracy of these factors. As shown in Fig. 6c, the immune-related 4-lncRNA signature displayed a better AUC (AUC=0.694) and can serve as an effective index to independently predict prognosis.
To further evaluate the predictive ability of the signature, survival analysis was performed in the randomly regrouped training set (n=272) and testing set (n=269). As shown in Fig. S5 and S6, age (P=0.005), grade (P=0.002) and risk score (P<0.001) were related to prognosis in the training set, whereas only grade (P=0.004) and risk score (P<0.001) were related to prognosis in the testing set. Multivariate Cox regression analysis indicated that age was not an independent predictor (training set: P=0.055, testing set: P=0.177), whereas grade (training set: P=0.004, testing set: P=0.012) and risk score (training set: P=0.002, testing set: P<0.001) were statistically independent prognostic indicators of endometrial cancer. In both sets, the AUC of the risk score based on the 4-lncRNA signature (training set: AUC=0.733, testing set: AUC=0.657) was higher than that of grade (training set: AUC=0.665, testing set: AUC=0.635) and age (training set: AUC=0.651, testing set: AUC=0.544). This indicates that the 4-lncRNA signature can compete with traditional clinical factors in predicting the OS of EC patients, and that it outperformed the classical clinical and pathological factors examined here in predicting EC patient OS.
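The AUC used to compare predictors can be read as the probability that a patient who died has a higher score than a patient who survived. The sketch below computes this for a binary outcome; it is a simplification assumed for illustration, since survival studies typically use time-dependent ROC analysis that also accounts for censoring, and all scores and labels are hypothetical.

```python
def auc(scores, labels):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical risk scores and event labels (1 = death observed).
example_scores = [0.9, 0.8, 0.4, 0.3, 0.2]
example_labels = [1, 0, 1, 0, 0]
example_auc = auc(example_scores, example_labels)
```

An AUC of 0.5 corresponds to a predictor no better than chance, and values closer to 1 indicate better discrimination.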
3.7 Comparison of the immune-related lncRNA signature with other prognostic models
To determine whether this immune-related lncRNA signature outperforms other prognostic biomarkers for endometrial cancer, we compared it with a nine-gene signature [25], a six-gene signature [26], a seven-gene signature [27] and another nine-gene signature [28]. The genes in these signatures were obtained from the literature, and we constructed ROC curves and survival curves for the entire cohort. As shown in Fig. 7, the AUC values for OS were 0.703 for our signature and 0.675, 0.597, 0.61 and 0.665 for the four published signatures, respectively. This comparison shows that our signature predicted the prognosis of endometrial cancer more accurately than the other four biomarkers (Table 2).
3.8 Prognostic value of each of the four lncRNAs
We compared the expression levels of each of the four lncRNAs (FP671120.4, LINC02381, LNCTAM34A and AC074212.1) between EC tissues and non-tumor tissues (Fig. 8). The 541 EC patients were then divided into high- and low-expression groups, using the median expression level of each lncRNA as the cutoff. Kaplan-Meier survival analysis was employed to explore the prognostic capacity of each lncRNA, and the results are presented in Fig. 9.
3.9 qRT-PCR verification
To further evaluate the reliability of the immune-related signature, we measured the expression of the four lncRNAs in the tissues of 29 patients by qRT-PCR. Compared with adjacent normal tissues, FP671120.4, LINC02381 and AC074212.1 were upregulated, while LNCTAM34A was downregulated in EC tissues (Fig. 10). Similarly, compared with the normal endometrial epithelial cell line hEEC, the expression levels of FP671120.4 and LINC02381 were significantly upregulated in EC cell lines (Fig. 11). In addition, AC074212.1 was upregulated and LNCTAM34A was downregulated in Ishikawa cells. However, neither AC074212.1 nor LNCTAM34A expression differed significantly between HEC-1A and hEEC cells. The qRT-PCR results in the 29 patients with endometrial cancer and in the cell lines were consistent with the bioinformatics results above, supporting the validity and reliability of the signature we constructed. The flowchart of our research strategy is presented in Fig. 12.
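For each expression group defined by the median cutoff, the Kaplan-Meier estimate can be computed as in the following minimal sketch. The follow-up times and event indicators are hypothetical, and the function is an illustrative product-limit estimator rather than the software used in this study.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times: follow-up durations; events: 1 if death observed, 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up times (months) and event indicators.
km_curve = kaplan_meier([5, 8, 8, 12, 20], [1, 1, 0, 1, 0])
```

Running the estimator separately on the high- and low-expression groups gives the two curves that a log-rank test then compares.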