3.1 Identification of Immune-related lncRNAs
In total, 14,142 lncRNAs and their expression profiles were obtained from the TCGA database, and the list of 331 Immune-related mRNAs was identified from the MSigDB. Then, 272 Immune-related lncRNAs were screened according to the pre-defined threshold. The entire expression profile of the Immune-related lncRNAs was randomly divided into the training cohort and the testing cohort by using the “caret” package in R. As shown in Table 1, there was no significant difference in clinical parameters between the training cohort and the testing cohort. Therefore, the expression profiles of the 272 Immune-related lncRNAs in the training cohort were selected for subsequent study.
Table 1
Clinical features of the patients with CRC in each cohort.
Variables | Training cohort (n = 253) | Testing cohort (n = 253) | P | Entire TCGA cohort (n = 506) |
Age (years) | | | | |
≤ 65 | 120 | 105 | 0.2104 | 225 |
> 65 | 133 | 148 | | 281 |
Gender | | | | |
Female | 105 | 125 | 0.0898 | 230 |
Male | 148 | 128 | | 276 |
Survival status | | | | |
Alive | 209 | 206 | 0.8169 | 415 |
Dead | 44 | 47 | | 91 |
Tumor invasion (T) | | | | |
T1 | 7 | 7 | 0.8897 | 14 |
T2 | 45 | 46 | | 91 |
T3 | 174 | 170 | | 344 |
T4 | 27 | 29 | | 56 |
Unknow | 0 | 1 | | 1 |
Lymph node (N) | | | | |
N0 | 150 | 145 | 0.6741 | 295 |
N1 | 63 | 61 | | 124 |
N2 | 40 | 46 | | 86 |
Unknow | 0 | 1 | | 1 |
Metastasis (M) | | | | |
M0 | 214 | 213 | 0.2475 | 427 |
M1 | 38 | 35 | | 73 |
Unknow | 1 | 5 | | 6 |
Tumor stage | | | | |
Stage I | 44 | 46 | 0.5482 | 90 |
Stage II | 100 | 88 | | 188 |
Stage III | 66 | 74 | | 140 |
Stage IV | 38 | 35 | | 73 |
Unknow | 5 | 10 | | 15 |
3.2 Construction of the Immune-related lncRNA Signature
We carried out univariate Cox regression model to assess the prognostic significance of the 272 Immune-related lncRNAs in the training cohort and found that 13 lncRNAs were significantly associated with the overall survival (OS) of CRC patients (P < 0.01, Fig. 1A). Subsequently, we performed Lasso regression with a 10-fold cross-validation on these lncRNAs in order to avoid over-fitting the prognostic signature. As a result, 10 genes were remained from 13 significant prognosis related lncRNAs in univariate Cox regression model by using Lasso regression model (Fig. 1B&C). Then, a total of five lncRNAs were identified as independent prognostic factors by Multivariate Cox regression model, and its regression coefficients (β) were also determined for further analysis (Fig. 1D and Table 2). Finally, based on the expression levels of these five lncRNAs, the risk score of each patient was calculated as follow: riskScore = (0.22928 × ExpHCG11) + (0.52785 × ExpMIR181A2HG) + (0.46034 × ExpNIFK−AS1) + (0.05573 × ExpSNHG7) + (0.65167 × ExpZEB1−AS1).
Table 2
Multivariate Cox results of lncRNAs based on TCGA data.
lncRNA | Coefficient (β) | HR | 95% CI of HR |
HCG11 | 0.22928 | 1.258 | 1.058–1.494 |
MIR181A2HG | 0.52785 | 1.695 | 1.059–2.712 |
NIFK-AS1 | 0.46033 | 1.585 | 0.932–2.693 |
SNHG7 | 0.05573 | 1.057 | 1.015–1.102 |
ZEB1-AS1 | 0.65167 | 1.919 | 1.111–3.314 |
HR, Hazard Ratio; CI, Confidence Interval. |
3.3 The Prognostic Influence of the Established Signature in TCGA
The CRC patients were divided into low-risk group and high-risk group according to the median risk score. We assessed the predictive performance of the five-lncRNA immune signature by time-dependent ROC curves and found that the AUC for 1-, 3-, and 5-year OS were 0.790, 0.799, and 0.760, respectively (Fig. 2A). Survival status analysis showed that the patient mortality rate increased with the increase of risk score (Fig. 2B). The Kaplan-Meier log-rank test revealed that patients in the high-risk group suffered shorter survival time than those in the low-risk group (P < 0.05, Fig. 2C). Then, we applied the testing cohort to validate the prognostic ability of the five-lncRNAs immune signature. Time-dependent ROC curves showed that the AUC for 1-, 3-, and 5-year OS were 0.651, 0.633, and 0.690, respectively (Fig. 2D). Survival status analysis showed that the patient mortality rate increased with the increase of risk score (Fig. 2E). Kaplan-Meier curves showed that patients in the high-risk group suffered shorter survival time than those in the low-risk group (P < 0.05, Fig. 2F). Similarly, we verified the prognostic ability of this Immune-related lncRNA signature in the entire TCGA cohort. Time-dependent ROC curves showed that the AUC for 1-, 3-, and 5-year OS were 0.710, 0.716, and 0.799, respectively (Fig. 2G). Survival status analysis showed that the patient mortality rate increased with the increase of risk score (Fig. 2H). Kaplan-Meier curves showed that patients in the high-risk group suffered shorter survival time than those in the low-risk group (P < 0.05, Fig. 2I). These findings above suggested that the Immune-related lncRNA signature was competent for predicting the prognosis of CRC patients.
3.4 Clinical Values of the Immune-related lncRNA Signature
We assessed the association between the signature and clinical parameters to further confirm the clinical values of the Immune-related signature. As a result, the risk scores roughly increased with increasing T stage (Fig. 3A), N stage (Fig. 3B), M stage (Fig. 3C), and American Joint Committee on Cancer (AJCC) tumor stage (Fig. 3D). These findings revealed that this Immune-related lncRNA signature may be associated with the metastasis and progression of human CRC. Then, we applied univariate and multivariate Cox regression model to determine whether the signature can be used as an independent prognostic factor in CRC. The hazard ratio (HR) of risk score and 95% CI were 1.205 and 1.135–1.280 in univariate Cox regression analysis (P < 0.001), and 1.189 and 1.109–1.274 in multivariate Cox regression analysis (P < 0.001), respectively. This Immune-related lncRNA signature could serve as prognostic factor for CRC independent of clinicopathological factors.
3.5 Construction of the Prognostic Nomogram
To better predict the prognosis of patients with CRC, we constructed a nomogram by integrating risk score and other clinicopathologic risk factors and found that risk score and TNM stage were the largest risk factors for OS of CRC patients (Fig. 4A). The AUC of OS rate showed that risk score (0.778) and stage (0.769) had a certain prediction ability (Fig. 4B). In addition, the calibration plot showed a satisfactory agreement between predictive and actual values at the probabilities of 3- and 5-year survival (Fig. 4C). These findings indicated that the nomogram held good accuracy in predicting the survival of CRC patients.
3.6 lncRNA-mRNA Co-expression Network Analysis
Based on the correlation coefficient threshold > 0.5, and the corresponding P < 0.001, we found that these five Immune-related lncRNAs had obvious correlation with 24 mRNAs, and the lncRNA-mRNA co-expression relationship network was constructed and visualized by using Cytoscape software (Fig. 5A). Subsequently, GO enrichment analysis showed that the mRNAs co-expressed with the Immune-related lncRNA signature mainly enriched in immunomodulatory pathways, such as myeloid cell differentiation, regulation of hemopoiesis, regulation of leukocyte differentiation (Fig. 5B). KEGG pathway enrichment analysis showed these mRNAs mainly enriched in NF-κB signaling pathway, RIG-I-like receptor signaling pathway, IL-17 signaling pathway, and TNF signaling pathway (Fig. 5C).