Immune scores was significantly associated with prognosis and clinical characteristics of SKCM
The information of 471 patients with SKCM, including gene expression and clinical characteristics such as survival time and status, age, gender and tumor stage, were downloaded from TCGA. ESTIMATE algorithm method was applied to evaluate the immune scores and stromal scores, representing the proportion of immune cells and stromal cells respectively. To investigate the relationship between scores and overall survival, Kaplan-Meirer analysis was performed to analyze the survival probability of all tumor samples, which were split into two groups according to the median of scores. As a result, compared with low immune scores group, a higher survival probability was observed in high immune scores group (Fig. 1A, P < 0.01). Similarly, prognosis of patients in high stromal scores group was better than that in low stromal scores group, although without statistical difference (Fig. 1B, P = 0.076). Nonetheless, ESTIMATES scores were positively related to survival probability (Fig. 1C, P < 0.01), indicating the significant role of immune components for predicting prognosis of SKCM patients. Immune scores were negatively associated with T classification (Fig. 1D), while there was no significant correlation between stromal scores and clinical features (data was not shown), suggesting that immune scores may be related to progress of SKCM. Taken together, these results indicate that the analysis of immune scores contributes to predicting the prognosis of patients with SKCM.
Identification and functional enrichment of DEGs
According to the above results, we considered that immune components was important for prognostic diagnosis of SKCM. Hence, we divided all the tumor samples into low and high immune scores groups to differentiate immune related genes. As a result, 493 DEGs were identified, including 486 upregulated and 7 downregulated DEGs. All of the DEGs were applied to annotate their potential function by GO and KEGG enrichment analysis. GO enrichment analysis ascribed for DEGs mostly enriched in immune-related GO terms, including T cell activation, regulation of T cells and lymphocyte activation (Fig. 2B). In addition, KEGG enrichment results showed that DEGs mapped to cytokine-cytokine receptor interaction, Graft-versus-host disease and hematopoietic cell lineage (Fig. 2C). Hence, these results suggested a predominant role of immune factors, which was worth investigating further.
Intersection of top-ranked DEGs in PPI network and DEGs responsible for prognosis
PPI network was performed by STRING database with high confidence (0.9). Re-analysis of PPI network by MCODE of in cytoscape software differentiated top two important modules (Fig S1). Additionally, the top 30 DEGs with more nodes were depicted in the bar plot (Fig. 2D). Univariate COX regression and Kaplan-Meier survival analysis for survival probability were conducted to discriminate significant genes from 493 DEGs. Subsequently, intersection analysis of 100 top-ranked DEGs in PPI network and DEGs with significant p-value (P < 0.05) in univariate COX regression and Kaplan-Meier survival analysis was performed to screen significant prognostic factors (Fig. 2E). In total, 84 DEGs were identified for further analysis.
Construction of immune-related genes prognostic model
Firstly, all samples were randomly split into a training dataset and a test dataset (7:3). Lasso Cox regression analysis was applied to establish the prognostic signature model in training dataset. Subsequently, 9 genes were selected for further multivariate Cox regression analysis. 3 genes including HLA Class II Histocompatibility Antigen, DQ Beta 2 Chain (HLA-DQB2), CD80 Molecule (CD80), Guanylate Binding Protein 4 (GBP4) were defined to develop a prognostic model for OS (Table 1). Risk scores = expression of HLA-DQB2 * (-0.012311896) + expression of CD80 * (-0.628923031) + expression of GBP4 * (-0.017960845). According to the formula, the risk scores of patients in train or test dataset were calculated. Additionally, patients were classified into low- and high- groups based on the median of risk scores. The distribution of risk scores and expression patterns 3 prognostic genes in training and test datasets was displayed in Fig. 3C and Fig. 3D respectively. Kaplan-Meier curves in both training and test datasets demonstrated that patients in the low-risk group had a better prognosis (P < 0.01) (Fig. 3E and 3F).
Table 1
Prognostic model generated by multi-cox analysis.
gene
|
coef
|
HR
|
HR.95L
|
HR.95H
|
pvalue
|
HLA-DQB2
|
-0.01231
|
0.987764
|
0.974121
|
1.001597
|
0.082721
|
CD80
|
-0.62892
|
0.533166
|
0.246753
|
1.152024
|
0.109611
|
GBP4
|
-0.01796
|
0.982199
|
0.968251
|
0.996349
|
0.01385
|
Establishment of nomogram for survival probability prediction
Next, ROC curves for training and test dataset were depicted to evaluate the prognostic value of immune-related prognostic model. The AUCs ascribed for 3-, and 5-year OS in training dataset were 0.718 and 0.722 respectively, while that in test dataset were 0.670 and 0.674 (Fig. 4A and 4B). As shown in Fig. 4C and 4D, the results analyzed by Univariate Cox and Multiple Cox revealed that T and N stage classification, as well as our prognostic model were independent prognostic factors in SKCM patients. In addition, the statistical difference of clinicopathological factors including survival status, age, gender and TMN stage classification between low- and high-risk groups was investigated by chi-square test. The results in Fig. 4E determined that survival status, T and stage classification were significantly different in low- and high-risk groups. Finally, in order to predict OS for SKCM patients, a visual nomogram model was established in Fig. 4F.
Correlation between risk scores and immune cells
CIBERSORT algorithm was applied to calculate the proportion of 21immune cells in all tumor samples (Fig. 5A). And the correlation of 21 immune cells was depicted as a matrix in Fig. 5B. T cells CD8 exhibited negative correlation with Macrophages M0 and T cells CD4 memory resting, while it was positively related to T cells CD4 memory activated and T cells follicular helper. The expression of immune cells was different in low and high risk scores groups (Fig. 6A and 6B). To identify the significant survival-related immune cells, the intersection analysis of difference and correlation analysis was performed (Fig. 6C). As a result, 7 immune cells such as T cells CD8, T cells CD 4 memory activated, Macrophages M1 and NK cells activated presented negative correlation with risk scores, while 6 immune cells such as T cells CD4 memory activated, Macrophages M0 and Macrophages M2 were positively associated with risk scores.