Design of workflow
For investigating the biomarkers in MM, a hypothesis was proposed, which assumes that differential genes related to immunity may be related to the prognosis of metastatic patients. Thus, the prognosis was explored in MM patients with OBS from TCGA database. The workflow was listed in Figure1
Patient demographics and clinical factors
Our study sample comprised 448 patients. Patient characteristics of the discovery and validation set are shown in Table 1 and Table 2. The TCGA sample form 448 patients as the discovery set (96 patients of primary, 352 patients of metastasis) was analyzed, and the GSE19234 (44 patients with above stage III of MM) samples of MM patients as the validation set (Table 2) (44 patients of metastasis) was analyzed.
The analysis of immune-related genes in primary and metastasis tissue of MM patients
Compared to primary tissues, a total of 172 DEGs consisting of 21 up-regulated and 151down-regulated genes were identified. The heatmap and volcano plot of the DEGs are shown in Figures. 2 A. To further verify whether these 172 DEGs were related to immune microenvironment reconstruction, the immune-related genes (IRGs) from the Immport database https://immport.niaid.nih.gov/home (The Immunology Database and Analysis Portal) were applied to validate. Finally, 28 IRGs were obtained by matched them, the heatmap of 28 differentially expressed IRGs was shown in Figure. 2B (Supplement file. 1).
Establishment of the immune-related genes risk signature
Seven predominant clinical and prognostic factors, including age, gender, race, clark level value, pathologic TNM, stage, Breslow depth value,and IRGs were evaluated using univariate and after testing for collinearity, the results showed that 16 IRGs and two clinical factors (Clerk scores and pathologic T) were selected (Table. 3), for univariable Cox regression, the coefficients of clerk scores (HR was 1.261 with 95% CI [1.026, 1.55], and p-value = 0.0275 *) and pathologic T stage (HR was 1.101 with 95% CI [1.013,1.197], and p-value = 0.024 *) were statistically significant. Then, the stepwise multivariate Cox regression analysis was performed to establish the IRG-derived risk signature in metastasis MM. After stepwise multivariate cox proportional hazard regression, only 6 IRGs for OBS were included in the model (Table 4). The data showed that the pathologic T stage continued to be influencing factors of the prognosis of MM patients in TCGA, with p-values of 0.025. For every MM patient, a PI as a risk score was calculated according to equation (1). The PI was significantly associated with OBS in metastasis group. As, the PI was the most significant with the smallest p-value (1.921×10-13**) and the largest HR (1.614 [1.420, 1.833]), which thereby implied that the PI was superior to conventional prognostic markers, including T stage in predicting patient outcome.
The time-dependent ROC curve was created in Figure. 3B. The ability and efficiency of each model to predict metastasis patient’s outcome was estimated by calculating the AUC of ROC curve, which was conducted using the survival ROC package in R software. The time-dependent ROC curve was created, the AUC of ROC curve were 0.700 (5 year), 0.692 (3 year) and 0.709 (1 year) larger than 0.50 for OBS indicated that the model performed well. Above results demonstrated that the integrative model could effectively classify patients. Thus, if the PI of a patient was larger than 0.228 (1 year) and 0.239 (3 year) or -0.153(5 year), this patient was assigned to a high-risk group. On the contrary, a patient would be considered low-risk (Figure. 3C). K-M curves (Figure. 3A) show that the p-value of the log-rank test was 2.01Í10-10, which implied that the PI could significantly separate patients in a high-risk and low-risk group. Therefore, it can be concluded that the PI calculated via 6 IRGs had a good predictive value of prognosis of MM patients in the TCGA database.
GO functional enrichment analysis
We analyzed the GO enrichment and pathways in which gene-signature involved in. Of these IRGs models, the number of genes in a gene signature model is so small that it is difficult to enrich in GO analysis. ClusterProfiler package was employed to analyze the models. The above package can compare the results of biological process, cellular component and molecular function (Figure. 4). The results of biological process suggested that the 172 differential genes share many processes (Figure. 4A), include epidermis development, skin development, epidermal cell differentiation. These biological processes are related to the occurrence and development of MM.
Correlation of the risk score with tumor-infiltrating immune cells in MM
By applying the CIBERSORTx algorithm to RNA-seq data, the relative proportions of 22 immune cell subsets of MM were acquired. Consecutively, 96 cases of primary MM and 352 cases of metastasis MM in the TCGA dataset were enrolled for further analysis after the filter criteria with p-value < 0.05 via CIBERSORTx algorithms.
The boxplot showed that in TCGA datasets, the immune status of the primary and metastasis samples showed a certain degree of heterogeneity. Overall, we noticed that the majority of infiltrating immune cells within primary MM tissue were comprised of naïve B cells followed by resting CD8 + T cells and plasma cells (Figure. 5A, Supplement file.2).
As shown by bar plot in Figure 5, the abundance of the 22 infiltrative immune cells by using CIBERSORTx were significantly different between primary group (Figure. 5B) and metastasis group (Figure. 5C) in MM cohorts. Among them, the macrophage M2 was the most significant enrichment of immune cells.
Subsequently, as shown in the box plots (Figure. 5D), the infiltration levels of plasma cells, naive B cells and Tfh cells were significantly differential in primary group and metastasis group in TCGA datasets (p-value < 0.05). The naïve B cells and plasma cells were all significantly higher in metstasis group than that in primary group in (p-value < 0.05). And conversely, for Tfh cells, there is a lower infiltration level in metastasis group than primary group.
Correlation of the risk score with tumor-infiltrating immune cells
As we known the immune checkpoint gene expression, which is associated with treatment responses of immune check point inhibitors, was also analyzed. We evaluated nine genes previously reported to be targets of ICIs: PDCD1 (PD-1), CD274 (PD-L1), CD276 (B7-H3), CTLA-4, IDO1, LAG3, VTCN1, HAVCR2 and TNFRSF4 (OX40). The correlation analysis showed that correlationbetween 22 types of tumor immune cells both and 9 immune-checkpoint genes (ICGs) and 6-IRGs were very different in primary tissues and metastatic tissues of MM. (Figure. 6A,C and 7A,C, Supplement file. 3,5 and 7,9). Except for CXCR4 and VTCNI, the correlation between 6 genes and 9-ICGs is very similar. In the primary group, CXCR4 and VTCNI showed a negative correlation (r = -0.18, p-value = 0.01) in the metastasis group, while there was no correlation in the primary group. The transfer group CD79A (r = -0.28, p-value <= 0.001) and LYZ (r = -0.189, p-value <= 0.001) showed a negative correlation with CD276, but there was no between CD79A, LYZ, and CD276 in the primary group correlation (p-value = 1), (Figure. 6B and 7B, Supplement file.4 and 8). The correlation between 6-IRGs and tumor purity showed that (Figure. 6D and 7D, Supplement file.6 and 10), fourth IRGs were negativities correlated with the tumor purity both in primary group and metastasis group. The correlation analysis results of the 6-genes and the differential tumor infiltrating cells in the primary and metastatic groups showed that (Figure. 6E-G and 7E-G): in the metastatic group, LYZ (r =0.309, p-value <0.0001) and CXCR4(r =0.331, p-value < 0.0001) were highly correlated with plasma cells, and CCL19(r =0.303, p-value < 0.0001), CD79A (r =0.241, p-value < 0.0001 ) and the naïve B cells were correlated. Not only that, CXCR4 is also highly correlated with Tfh (r =-0.175, p-value < 0.0001). Perhaps these genes are related to the transfer of primary tumor.
Validation set prognosis analysis for MM in GEO cohort
We screened six IRGs and verified them with the GEO data set again (GSE19234), which can distinguish the prognosis of high-risk and low-risk patients. The validation result is shown in Figure 8. From figure 8, the PI was significantly associated with OS in independent data of MM (HR= 1.457, 95% CI= 0.991-2.142) of GSE19234 dataset. The value of AUC=0.676 (5 year), AUC=0.681 (3 year) and AUC=0.845 (1 year) also indicated that the model performed well.
Validation set immune analysis for MM in GEO cohort
The correlation analysis showed that correlation between 22 types of tumor immune cells and 6 IRGs were very similarity in metastatic tissues of TCGA dataset and tissues of MM in GSE19234(Figure. 9A, supplement file.11). The correlation analysis results of 6 IRGS and 22 immune infiltrating cells showed that: CXCR4 showed a significant positive correlation with plasma cells (r = 0.524, p-value =0.03), and a significant negative correlation with activated NK cells and resting DC cells (r = -0.534, p-value =0.02). CCL19 showed a significant positive correlation with T cells CD4 memory activated (0.534, p-value =0.02), and a significant negative correlation with NK cells activated (r= -0.683, p-value <0.001) and Dendritic cells resting (r= -0.535, p-value =0.02). It is worth noting that the correlation analysis results of 6 IRGS and TME showed that 6 IRGS showed a significant negative correlation with tumor purity (p-value <0.001), and a significant positive correlation with other immune microenvironment scores (p-value <0.001) (except SLP1 and S100A7) (Figure 9B, supplement file.12). From Figure 9C (supplement file. 13), we can see the correlation between 6 IRGs and ITLS, 6 IRGs and PDL1 have a strong positive correlation (except SLP1 and S1007A), which also provides a certain amount for future immunotherapy Ideas. From Figure 9D, we can see the correlation between 6-IRGs and three type ITLSs, verifying the results of the TCGA dataset.
Evaluation the immune status between low-risk and high-risk groups
To further explore the relationship between six-gene signature and TME, ESTMATE method was used to assess the overall immune status of high-risk and low-risk groups by analyzing the expression profiles of the 29 immune signature gene sets. The low-risk group in TCGA and GSE19234 (Figure.10) showed more immune activities than that of high-risk group. Tumor purity of low-risk group in TCGA and GSE19234 were significantly higher than that of high-risk group, which suggested more infiltrated immune and stromal cells in the TME of low-risk group.