2.1 Landscape of PRG expression and somatic mutation in MM
The workflow chart was shown in Figure 1. The expression levels of 29 PRGs in MM patients and controls from the GSE39754 dataset were displayed as heat maps (Figure 2B). By comparison, we found that 12 out of 29 PRGs were significantly differentially expressed, among which GPX4, CASP4, PLCG1, CASP3, GSDMB, and AIM2 were upregulated, while NLRC4, ELANE, CASP9, NLRP3, IL18, and TNF were downregulated in the patients’ group (Figure 2A). The correlation matrix of 33 PRG expression levels in the MMRF project was depicted in Figure 2C, among which the results with Pearson correlation coefficient greater than 0.65 were shown in Figure 3. Next, we compared the difference of each PRG expression in different clinical subgroups. For gender subgroups, CASP4, CASP5, GSDMA, IL18, IL1B, IL6, NLRP3, NLRP2, and TIRAP were significantly differentially expressed. For age subgroups, there were differences in the expression of CASP1, IL18, NLRP3, and TNF. For ISS stage subgroups, the expression of genes, such as AIM2, CASP1, CASP3, CASP4, CASP5, CASP6, CASP8, CASP9, GSDMB, GSDMD, IL6, NLRC4, NLRP1, PRKACA, and PYCARD, was different, and the expression level increased with stage grade (Figure 4).
Somatic mutation analysis revealed that PRG mutations were present in 14% (107/764) patient samples, among which 6.8% (52/764) samples have non-synonymous mutations. Among 33 PRGs, 31 genes were found to have mutations, of which 24 genes had non-synonymous mutations. Missense mutation was the most common variant classification, and C > T was the most common SNV class. The gene with the highest mutation frequency was NLRP3, followed by NLRP2, NLRP7, GSDMB, AIM2, CASP14 and NLRP4 (Figure 2D-E).
2.2 Differences of immune infiltration and correlation between PRGs and immune cells
The infiltration of 22 immune cells between MM patients and controls from the GSE39754 dataset was shown in Figure 5A. We analyzed the correlation of infiltration degree between different immune cells in the MMRF project, and the correlation matrix was displayed in Figure 5B. Finally, we evaluated the relationship between PRGs and immune cells (Fig. 5C). The result indicated that ELANE, GSDMA, IL18, IL1B, NLRP3, and TNF were significantly negatively correlated with plasma cells; ELANE, IL1B, and NLRP3 were significantly positively correlated with monocytes; ELANE, NLRP3 and eosinophils were significantly positively correlated; GSDMA, IL18 and M2 macrophages were significantly positively correlated; CASP5, NLRP3 and neutrophils were significantly positively correlated; and ELANE was significantly positively correlated with M0 macrophages and resting NK cells, respectively.
2.3 Analysis of MM molecular subtypes based on PRGs
The NMF classification results showed that the cophenetic value began to drop significantly when rank = 4, so the optimal number of clusters was 4 (Figure 6A). The consensus map of NMF clustering and principal component analysis plot were displayed in Figure 6B-C, and the heatmap of PRG expression in 4 clusters was depicted in Figure 7A. Then we analyzed the prognosis of patients in different clusters and found that there was no difference in the survival time of patients in general, while multiple comparisons revealed that the survival time of cluster 1 and cluster 2 patients was statistically different, and patients in these two clusters accounted for 80.5% (615/764) of all cases (Figure 6D-E). DEGs analysis between these two clusters identified 372 up-regulated genes and 36 down-regulated genes. By GSVA analysis of these DEGs, two differential pathways including the toll-like receptor signaling pathway and cytosolic DNA sensing pathway were identified (Figure 6F). Subsequently, we compared the subtype scores of different clinical characteristics in each cluster and found that males had higher scores in cluster 1 and cluster 4; patients aged < 50 years had higher scores in cluster 3, while patients aged ≥ 50 years had higher scores in cluster 4; scores in cluster 1 increased with the increase of ISS stage (Figure 7B-D).
The immune infiltration analysis of different clusters indicated that the infiltration degree of multiple immune cells in cluster 4 was higher than that in other clusters, while for plasma cells, which accounted for the highest proportion in all immune-infiltrated cells, the infiltration degree was lower in cluster 4 (Figure 8A-B).
2.4 Prognostic marker screening and risk score calculation based on PRGs
The results of univariate Cox regression showed that 14 PRGs, including AIM2, CASP5, IL1B, IL6, NLRC4, NLRP1, NLRP6, NOD1, PJVK, PLCG1, PRKACA, PYCARD, SCAF11, and TIRAP, were significantly associated with the survival of MM patients (Figure 9A). By LASSO regression analysis, 9 PRGs were screened out as prognostic markers, which were AIM2, CASP5, IL1B, IL6, NLRP6, NOD1, PRKACA, PYCARD, and SCAF11 separately (Figure 9B-C). The predictive ROC curve of the LASSO regression results revealed that the AUC value was 0.643, indicating a relatively good prognostic predictive ability (Figure 9D).
According to the analysis results of the LASSO regression model, the coefficients of the candidate prognostic markers were determined, and the risk score (RS) was calculated as follows: RS = 0.0485 × AIM2 + 0.0735 × CASP5 + 0.0152 × IL1B + 0.1043 × IL6 + 0.1053 × NLRP6 + 0.1188 × NOD1 + 0.1533 × PRKACA + 0.1004 × PYCARD + 0.0674 × SCAF11. The ROC curves of 1, 3 and 5-year survival predicted by RS were displayed in Figure 9E, among which the predictive ability of 3-year survival was the best (AUC = 0.658). The optimal cutoff value for RS predicting survival in MM patients was 1.7714. According to the cutoff value, MM patients were divided into high-risk and low-risk groups, and patients with no survival information were excluded. The survival time of patients in high-risk group was significantly shorter than that in low-risk group (Figure 10A).
GSE2658 dataset was used to validate the prognostic predictive ability of risk groups constructed by the prognostic markers screened by LASSO regression. By validation, we found that the survival analysis results of risk groups in the GSE2658 dataset were consistent with those in the MMRF project (Figure 10B). Comparisons of the survival time for patients in high and low expression groups of 9 candidate markers showed that there were differences in the groups of CASP5, IL1B, NOD1, PRKACA, AIM2 and SCAF11 (Figure 10C-K).
By comparing the differences of RS in different clinical subgroups, we found that there were differences in ISS stage subgroups, and the higher the stage, the higher the RS (Figure 10L-N).
2.5 Construction of individualized prognostic prediction model
Univariate Cox regression analysis demonstrated that in addition to RS, the impact of gender, age, and ISS stage on patient survival were statistically significant (Figure 11A). We constructed a multivariate prognostic predictive model using Cox regression (Figure 11B), and generated a nomogram (Figure 11C) and a calibration curve (Figure 11D-F) using rms package, to predict the probability of 1-, 3-, 5-year survival for myeloma patients. The nomogram constructed by ISS stage (c-index = 0.661), RS (c-index = 0.648), or ISS stage combined with gender and age (c-index = 0.690), showed that these variables were good predictors, while the survival predictive model constructed by gender, age, ISS stage and RS had a c-index of 0.721, indicating better prognostic predictive ability.