Landscape of genetic variation of histone methylation modification regulators in AML
The overview of this work was shown in the form of a flowchart (Fig. 1). The clinical profiling of TCGA-LAML cohorts was summarized in Table 1.
Table 1
Clinical characteristics of AML cohort for classification and validation
Clinical Characteristics | | Number | Percent (%) |
TCGA-LAML (n = 124) | | | |
Survival status | Survival | 47 | 37.9 |
| Death | 77 | 62.1 |
Age | ≥ 60 years | 51 | 41.1 |
| < 60 years | 73 | 58.9 |
Gender | Female | 57 | 46.0 |
| Male | 67 | 54.0 |
IDH2-R132 mutation | Negative | 112 | 90.3 |
| Positive | 12 | 9.7 |
IDH2-R140 mutation | Negative | 114 | 91.9 |
| Positive | 10 | 8.1 |
IDH2-R172 mutation | Negative | 122 | 98.4 |
| Positive | 2 | 1.6 |
NPM1 mutation | Negative | 93 | 75.0 |
| Positive | 31 | 25.0 |
FLT3 mutation | Negative | 88 | 71.0 |
| Positive | 36 | 29.0 |
Activating RAS mutation | Negative | 116 | 93.5 |
| Positive | 8 | 6.5 |
FAB subtype | M0 | 12 | 9.7 |
| M1 | 30 | 24.2 |
| M2 | 27 | 21.8 |
| M3 | 13 | 10.4 |
| M4 | 27 | 21.8 |
| M5 | 12 | 9.7 |
| M6 | 2 | 1.6 |
| M7 | 1 | 0.8 |
Risk status | Favorable | 28 | 22.6 |
| Intermediate | 71 | 57.3 |
| Poor | 25 | 20.1 |
GEO-GSE71014 (n = 104) | | Number | Percent (%) |
Survival status | Survival | 68 | 65.4 |
| Death | 36 | 34.6 |
A total of 24 HMMRs were finally identified in this study, including 13 writers, 7 erasers and 4 readers. Figure 2A summarized the dynamic reversible process of histone methylation mediated by HMMRs. The differences in gene expression levels between normal and cancer tissues are the critical messages for researchers to assess its attribute, oncogene or tumor suppressor [26, 27]. However, due to the special distribution of hematological malignancy, it is almost impossible for us to obtain the expression messages of HMMRs of the patients with AML before illness. Thence, we chosen to explore the gene expression of 24 regulators in AML patients with different RiskStatus. The result showed that there were 6 regulators with remarkable gene expression differences in different RiskStatus, including 1 writer (CARM1), 2 readers (ATRX and PC) and 3 erasers (KDM2A, KDM4A and KDM5B) (Fig. 2B). Among them, the up-regulation of ATRX, PC, KDM2A and CARM1 implied poor RiskStatus while the up-regulation of EZH2 and KDM5B suggested favorable RiskStatus (Fig. 2B).
Considering the interaction of the HMMRs, we preformed the co-expression analysis. As showed in Fig. 2C, there were two positive co-expression zones (as shown in red boxes) and the bigger one displayed that SETD2, a histone methyltransferase, expressed a positive correlation with ATRX, KDM5A, NSD1 and KMT2A, respectively in gene expression, while the smaller one exhibited 3 writers, including SUV39H1, DOT1L and EHMT2, taking together to form a co-expression loop. Meanwhile, a significant negative correlation in gene expression was discovered between KDM5B and SETD7, KDM5A and SUV39H1, along with ATRX and DOT1L (Fig. 2C). Subsequently, we summarized the incidence of the somatic mutation and copy number variation (CNV) of 24 regulators in the patients with AML. The result showed that there was no obvious genetic alteration in other regulators except for KMT2A that exhibited a significant CNV alteration, including fusion and amplification in copy number (Fig. 2D).
All in all, the above analyses displayed the basic landscape of 24 HMMRs in AML and indicated that cross-talking among the regulators of writers, readers, and erasers may play an important role in the formation of different histone methylation modification patterns in AML.
Histone methylation modification patterns mediated by 24 regulators in AML
In order to verify whether Histone methylation regulators affect the formation of histone methylation modification patterns in AML, the R package of ConsensusClusterPlus was used to classify patients with qualitatively different m6A modification patterns based on the expression of 24 HMMRs. It is obvious that the optimal consensus clustering can be obtained when K value of the consensus matrix was equal to 2 (Fig. 3A, S1), which was confirmed by the results of Consensus Cumulative Distribution Function (CDF) Plot (Fig. 3B) and Delta Area Plot (Fig. 3C). Therefore, two distinct histone methylation modification patterns were eventually identified, including 95 cases patterns A and 56 cases in patterns B. We termed these two patterns as Cluster A and Cluster B, respectively. Principal component analysis (PCA) for the transcriptome profiles of these two modification patterns also showed that there was significant distinction existed on them (Fig. 3D). To validate the established histone methylation modification patterns, we repeated the correlated analyses using another independent AML-cohort, GSE71014, of which the clinical profiling was summarized in Table 1. As shown in Figure S2A-C, the best consensus clustering also obtained when K value of the consensus matrix was equal to 2, which was consistent with the previous clustering analysis (Fig. 3A-D, S1). What’s more, the survival analysis displayed that the patients of Cluster A exhibited a more prominent survival advantage than Cluster B in both TCGA-LAML cohort and GSE71014 cohort (Fig. 3E-F).
Subsequently, unsupervised clustering of 24 HMMRs was conducted to explore the clinical features of these two histone methylation modification patterns in TCGA-LAML cohort while the two modification patterns, FAB subtypes, mutations of several genes, gender, age, fustat and futime were used as patient annotations. As showed in Fig. 3G, the expression of 24 histone methylation regulators were remarkably upregulated in Cluster A,comparing with Cluster B. It is worth noting that there was a significant difference between Cluster A and Cluster B in the distribution of FAB category, that was, AML-M0 subtype was only distributed in Cluster A but not in Cluster B. Besides, more patients with NPM1 mutation distributed in Cluster B than Cluster A. However, there were no obvious differences between the patients of Cluster A and Cluster B in FLT3 mutation, IDH1 mutation, activating Ras, gender, age, fustat and futime.
The above results showed that the patients with AML could be clustered effectively based on the expression of histone methylation modification regulators. What’s more, the significant clinical characteristics’ differences of the two histone methylation modification patterns also indicated histone methylation modification plays a critical role in AML progression.
Biological characteristics of distinct histone methylation modification patterns in AML
To determine the biological behavior differences between Cluster A and Cluster B, we firstly analyzed their distribution differences of somatic mutation in TCGA-AML cohort using maftools package. It is believed that higher TMB results in more neo-antigens, increasing chances for T cell recognition and clinically correlates with better immune checkpoint inhibitors (ICI) outcomes [28, 29]. The result of somatic mutation analysis obviously shown that Cluster A presented more extensive tumor mutation burden than Cluster B, with the altering frequency 62.5% against 48.15% (Figure S3A-B). Furthermore, the tumor mutation burden (TMB) quantification analysis also confirmed that Cluster A was markedly correlated with a higher TMB value (Fig. 4A) although there was no significant difference in microsatellite instability between Cluster A and Cluster B (Fig. 4B).
To further investigate the characteristics of Cluster A and Cluster B, we performed GSVA enrichment analysis. As showed in Fig. 4C, Cluster A was remarkably enriched in protein and RNA metabolism pathways, such as RNA degradation, spliceosome pathway, valine leucine and isoleucine biosynthesis pathway and so on while Cluster B presented an enrichment pathway associated with energy metabolism, including oxidative phosphorylation pathway, pantothenate and CoA biosynthesis pathway and lysosome pathway.
It has been well known that tumor microenvironment (TME), which is also composed of tumor cells, stromal cells, immune cells and multiple secreted factors, plays a crucial role in tumor progression [30, 31]. Therefore, we firstly performed the MCP-counter method to investigate the immune cell infiltration of the two histone modification patterns. The result showed that Cluster A presented a higher immune cell infiltration in T cells, B cells, NK cells and endothelial cells while a lower infiltration in monocytes and neutrophils that were the noumenon of tumor cells (Fig. 4D). Then, we conducted the CIBERSORT method to verify the result of MCP-counter. Consistently, the result also showed that the infiltration of CD4 + memory T cells, naive B cells, resting NK cells and plasma cells in Cluster A were all higher than in Cluster B but the infiltration of monocytes was lower (Figure S3C).
Enlighten by the results of immune cell infiltration analyses, we compared the expression of some immune cell markers, chemokines and cytokines between Cluster A and Cluster B so that we could learn more about the immune characteristics of histone modification patterns in AML. As showed in Fig. 4E, a series of immune-activating factors were up-regulated in Cluster A in comparison with Cluster B, including CD244 that is a marker of NK cells, inducible T cell costimulatory ligand (ISOSLG), CD96 that plays a role in the adhesive interaction of activated T and NK cells, TNFRSF18 and CD47. Meanwhile, some immune inhibiting factors were down-regulated in Cluster A, such as HAVCR2, CD86 and LGALS9. Combining the previous results that the expression of 24 histone methylation modification regulators was generally up-regulated in Cluster A (Fig. 3G), we speculated that HMMRs could promote the expression of a series of immune-activating factors in AML.
Overall, we had made a rather detailed description of the biological characteristics of the two histone methylation modification patterns in AML and found that Cluster A displayed more remarkable immune cell infiltration that should be a key factor for its prominent survival advantage.
Generation of histone methylation modification-related score in AML
The expression differences of histone methylation modification regulators between Cluster A and Cluster B would directly lead to the differences in the transcriptome of Cluster A and Cluster B, which resulted in the clinical and biological characteristics differences in these two histone methylation modification patterns of AML.
To further investigate the changes of the transcriptome of Cluster A and Cluster B, we firstly conducted the empirical Bayesian approach of limma package to determine the differentially expressed genes (DEGs) between them. The volcano plots exhibited that there were lots of DEGs between Cluster A and Cluster B, remaining 61 DEGs even with the log2 (fold change) value was equal to ± 3 (Fig. 5A). Then, the clusterProfiler package was used to perform GO enrichment analysis for the DEGs. As shown in Fig. 5B, lots of pathways directly related to immune system were enriched in Cluster A compared with Cluster B, which indicated again that histone methylation modification played a non-negligible role in regulating TME landscapes in AML.
Although we had proved that histone methylation modification patterns were strongly associated with the prognosis of AML, it would be difficult to directly predict individual therapeutic effect and prognosis according to the histone methylation modification patterns due to its complexity. Therefore, we established a scoring system to quantify the histone methylation modification patterns of the patients with AML based on the DEGs discovered in Fig. 5A. We termed as M-RiskScore. Univariate Cox analysis was performed to identify the genes that were eligible for lasso regression analysis and the results showed that a total of 8 genes were selected with their p-value was less than 0.01 (Fig. 5C). The subsequent lasso regression analysis displayed the formula for calculating the M-RishScore was composed of 6 genes finally and they were ADAMTS15, CADM1, CDO1, SYT17, FAM163A and POF1B (Fig. 5D-E).
Taken together, we identified the DEGs between Cluster A and Cluster B, by which we constructed a scoring system, M-RiskScore successfully. We would try to explore the value of M-RiskScore for predicting the therapeutic effect and prognosis of the patients with AML in the subsequent analysis.
The value of M-RiskScore in prognosis prediction for the patients with AML
After we have established the scoring system, we started to explore its potential value in predicting the outcome of the patients with AML.
With the medium value was set as the cutoff value, we divided the patients with AML into high or low M-RiskScore group. The prognosis analysis showed that the low M-RiskScore group exhibited a better survival advantage than the high M-RiskScore group (Fig. 6A). Importantly, the results of correlation analysis between histone methylation modification patterns and M-RiskScore displayed that Cluster A, which had been proved to possess a more prominent survival benefit than Cluster B (Fig. 3E), was with a lower M-RiskScore (Figure S4A), indicating the reliable and trustful of M-RiskScore due to these consistent results. Therefore, the above analyses implied that M-RiskScore was likely to be an independent poor prognostic marker for the patients with AML.
To verify and compare the predictive accuracy of M-RiskScore with those of other well-known prognostic markers in AML, two receiver-operating characteristics (ROC) curves were created. As showed in Fig. 6B-C, M-RiskScore had the highest AUC value both in the cohort of 3 years and 5 years, indicating its remarkable predictive accuracy compared with other prognostic markers of AML. What’s more, both the results of univariate and multivariate Cox regression analysis showed that the Hazard ratio of M-RishScore was significantly higher than those of other prognostic markers of AML (Fig. 6D-E). Logically, it was no doubt that M-RiskScore could serve as an independent poor prognostic marker for the patients with AML.
Finally, to further improve the prognostic value of M-RiskScore in AML, we performed the nomogram associating M-RiskScore with other prognostic markers of AML, including FAB subtype, gender, RiskStatus and age (Fig. 6F). The result of decision curve analysis (DCA) for the nomogram showed that M-RiskScore had a common higher Net Benefit value than other independent prognostic markers in the range of Risk Threshold (Fig. 6G). Moreover, all the calibration plots for the probability of survival of 1 year, 3 years and 5 years ran very close to the diagonal, showing outstanding calibration (Fig. 6H). However, the result of ROC analysis for nomogram exhibited that there was no significant increase in the predictive accuracy compared with independent M-RiskScore in the cohorts of both 3 years and 5 years (Fig. 6B-C, Figure S4B).
In a word, these results proved that M-RiskScore played an important role in predicting the outcome of the patients with AML and could be an independent poor prognostic marker of AML.
M-RiskScore in the role of chemotherapy of AML
Chemotherapeutic drugs will continue playing an important role in AML treatment by the virtue of their universal and highly effective cytotoxicity [2]. Therefore, we specifically tested the ability of M-RiskScore to predict the efficacy of induction chemotherapy in patients with AML. Three GEO-AML cohorts with the chemotherapeutic information were download and processed. They were GSE84344, having the respond information for decitabine treatment of the patients with AML, GSE103424 that contained the clinical characteristics of the patients with AML following the treatment of IA regimen (idarubicin + cytarabine) and GSE10087, which processed the clinical information of the patients with AML treated with different standard induction chemotherapy, including IA regimen, CIA regimen (clofarabine + idarubicin + cytarabine), FAI regimen (fludarabine + cytarabine + idarubicin), decitabine, BID FA regimen (twice daily fludarabine + cytarabine) and CECA regimen (cyclophosphamide + etoposide + carboplatin + cytarabine). The basic clinical characteristics of these three GEO-AML cohorts were summarized in Table 2. We found that patients with low M-RiskScore showed more obvious therapeutic advantages both in the complex-chemotherapy-treatment cohort and the decitabine-treatment cohort compared to those with high M-RiskScore (Fig. 7A-B). In contrast, the analysis of the IA regimen cohort exhibited the AML-patients with high M-RiskScore benefited more from the related treatment relatively (Fig. 7C). The somatic mutation analysis for AML-TCGA cohort displayed that the top three most frequently mutated genes in AML-patients with low M-RiskScore were DNMT3A, KIT and WT1, while they were NPM1, DNMT3A and RUNX1 in the high M-RiskScore group (Fig. 7D-E). Given DNMT3A, a DNA methyltransferase encoded by DNMT3A, is one of the well-known target of decitabine [32], we speculated that a higher mutation frequency of DNMT3A could be the cause of its poorer chemotherapeutic response to the treatment with decitabine in the high M-RiskScore group (Fig. 7B, D). What’s more, given these mutation characteristics, we proposed a hypothesis that the mutated KIT and/or mutated WT1 led to the oncogenesis in AML-patients with low M-RiskScore. Meanwhile, the leukemogenesis in the patients with high M-RiskScore should be attributed to the mutations of NPM1 and/or RUNX1.
Table 2
Clinical characteristics of AML cohort for correlation analyses.
Patient series | GSE103424 | GSE103424 | GSE103424 |
No. of patients | 52 | 45 | 41 |
Age | | | |
≥ 60 years | NA | 45 | 25 |
< 60 years | NA | 0 | 16 |
Gender | | | |
Female | 24 | 17 | 14 |
Male | 28 | 28 | 27 |
Risk status | | | |
Favorable | 5 | 0 | 4 |
Intermediate | 30 | 23 | 15 |
Poor | 10 | 14 | 22 |
NA | 7 | 8 | 0 |
Response to chemotherapy | | | |
Yes | 17 | 18 | 11 |
No | 35 | 27 | 30 |
All in all, the above results indicated that M-RiskScore was associated with the chemotherapeutic response of the AML-patients with different chemotherapy treatments and could serve as an evaluated index for chemotherapeutic drugs’ usage.