Clinical characteristics of the samples
A total of 137 AML patients were included in the analysis: 77 males and 60 females, 126 white and 11 Black patients, with a mean age at diagnosis of 54.44 ± 15.85 years. According to the current SWOG criteria for cytogenetic risk, the sample was divided into three categories: 28 patients had favorable cytogenetics, 74 had intermediate-risk/normal cytogenetics, and 35 had poor cytogenetics. Forty-eight patients survived and 89 died. The clinical characteristics of the included AML patients are shown in Table 1.
Identification and analysis of DEGs related to both survival and cytogenetic risk
For the survival-related DEG analysis, the data were adjusted for gender (male; female as the reference group), age, and race (white; Black or African American as the reference group). Screening by Cox proportional hazards regression (P-value < 0.05, exp(coef) > 1) yielded 1022 survival-related DEGs. Seventeen genes were lost during gene name conversion, leaving 1005 survival-related DEGs.
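The screening rule can be illustrated with a minimal sketch. It assumes the per-gene Cox results are already available as a coefficient and a P-value; the `CoxResult` record, the function name, and the toy gene names and values are hypothetical and are not taken from the study data.

```python
import math
from dataclasses import dataclass


@dataclass
class CoxResult:
    """Hypothetical container for one gene's Cox regression output."""
    gene: str
    coef: float      # estimated log hazard ratio
    p_value: float   # P-value of the coefficient


def screen_survival_genes(results, alpha=0.05):
    """Keep genes with P < alpha and hazard ratio exp(coef) > 1."""
    return [
        r.gene
        for r in results
        if r.p_value < alpha and math.exp(r.coef) > 1.0
    ]


# Toy values for illustration only (not study results).
toy_results = [
    CoxResult("GENE_A", 0.40, 0.01),   # significant, HR > 1: kept
    CoxResult("GENE_B", -0.30, 0.02),  # significant, HR < 1: dropped
    CoxResult("GENE_C", 0.25, 0.20),   # HR > 1, not significant: dropped
]
selected = screen_survival_genes(toy_results)
```

Because exp(coef) > 1 is equivalent to coef > 0, this filter retains only genes whose higher expression is associated with a higher hazard.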
Because cytogenetic risk is an important clinical factor in AML, we further filtered the survival-related DEGs by cytogenetic risk to identify target genes that are also associated with AML cytogenetic risk. For this comparison, the cytogenetic risk categories were collapsed into two groups: favorable or intermediate/normal risk (104 cases) and poor risk (33 cases). Genes were considered up-regulated (or down-regulated) if the log2 fold change in expression was greater than 1 (or less than -1) and the adjusted P-value was < 0.05. Using DESeq2, a total of 2291 DEGs related to cytogenetic risk category were identified (Fig 1, in which the abscissa shows the log2 fold change and the ordinate the adjusted P-value).
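As a sketch of this thresholding step, the function below labels a gene from its log2 fold change and adjusted P-value. It assumes symmetric cut-offs (greater than 1 for up-regulation, less than -1 for down-regulation) and treats a missing adjusted P-value as not significant; the function name and toy inputs are hypothetical, and the actual DESeq2 output handling may differ.

```python
def classify_deg(log2_fc, padj, lfc_cutoff=1.0, alpha=0.05):
    """Label a gene as 'up', 'down', or 'ns' (not significant).

    Assumption: symmetric log2 fold change cut-offs (+/- lfc_cutoff)
    and an adjusted P-value threshold of alpha.
    """
    # A missing adjusted P-value is treated as not significant.
    if padj is None or padj >= alpha:
        return "ns"
    if log2_fc > lfc_cutoff:
        return "up"
    if log2_fc < -lfc_cutoff:
        return "down"
    return "ns"


# Toy inputs for illustration only (not study results).
labels = [
    classify_deg(2.3, 0.001),   # large positive change, significant
    classify_deg(-1.8, 0.010),  # large negative change, significant
    classify_deg(0.4, 0.001),   # significant but small change
    classify_deg(3.0, 0.300),   # large change but not significant
    classify_deg(1.5, None),    # no adjusted P-value available
]
```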
Because analyzing the large number of survival-related DEGs would require substantial computing power, we performed functional enrichment analysis to focus on the DESeq2 genes with highly relevant biological functions. GSEA identified a total of 790 meaningful pathways related to cytogenetic risk category. The top 10 up-regulated and top 10 down-regulated gene sets, ranked by NES, are shown in Fig 2.
Finally, a Venn diagram was drawn to show the intersection of the survival-related DEGs (Overall Survival), the cytogenetic risk-related DEGs (CR DESeq2), and the functional enrichment results (CR GSEA) (Fig 3). A total of 42 DEGs were shared by the first two sets, and only one of these 42 genes, IGHM, also appeared in the top 10 up-regulated CR GSEA pathways.
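The intersection logic behind the Venn diagram amounts to two set operations, sketched below. The gene identifiers are placeholders chosen for illustration; they are not the study's gene lists.

```python
# Placeholder gene sets for illustration only (not the study's lists).
survival_degs = {"G1", "G2", "G3", "G4"}   # stands in for Overall Survival
cytogenetic_degs = {"G2", "G3", "G5"}      # stands in for CR DESeq2
top_pathway_genes = {"G3", "G6"}           # stands in for CR GSEA top 10 up-regulated

# Step 1: genes related to both survival and cytogenetic risk.
dual_related = survival_degs & cytogenetic_degs

# Step 2: of those, genes that also belong to the top enriched pathways.
candidates = dual_related & top_pathway_genes
```

In the study, the first step corresponds to the 42 shared DEGs and the second to the single remaining gene.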
Survival analysis of IGHM
Multivariate Cox regression was used to analyze the influence of clinical characteristics and IGHM expression on the survival time of AML patients. Age, gender, and race were included as covariates, and the survival of patients with different levels of IGHM expression was analyzed with a Cox proportional hazards model. Hypothesis testing of the model gave P = 0.21 (Fig 4). The Wald test statistic of the overall Cox regression model was 31.54 (P-value < 0.001). The average survival time of patients based on all factors was 19 months (95% CI: 12 to 27 months), or about 1.5 years. In the multivariate analysis, two factors, IGHM expression and age, were significantly associated with the survival time of AML patients (both P-values < 0.05; Table 2).
After adjustment for the other variables, the Kaplan-Meier survival curves for each factor are shown in Fig 5. Survival in the IGHM high expression group was significantly lower than in the low expression group (P = 0.041). The risk of death in patients with high IGHM expression was 2.07 times that in patients with low expression (P < 0.05). Because age is a continuous variable, it was converted into a categorical variable with the cut function, using the mean value of 55 years as the cut-off point, and the survival curves were then plotted. Kaplan-Meier survival analysis further showed that survival in the older group was lower than in the younger group, and the difference was statistically significant (P < 0.001). The risk of death in the older group was 3.00 times that in the younger group (P < 0.001).
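The dichotomization of age and the Kaplan-Meier estimate can be sketched in a few lines. This is a minimal illustration and not the analysis code used in the study: the function names and toy patients are hypothetical, and assigning patients aged exactly 55 to the older group is an assumption, since the text does not state on which side the cut-off falls.

```python
def age_group(age, cutoff=55):
    """Dichotomize age; placing age == cutoff in 'old' is an assumption."""
    return "old" if age >= cutoff else "young"


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up times
    events : 1 if death was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


# Toy patients (age, months of follow-up, event); not study data.
patients = [(40, 10, 1), (45, 20, 0), (50, 30, 1),
            (60, 5, 1), (70, 8, 1), (65, 12, 0)]

curves = {}
for group in ("young", "old"):
    subset = [p for p in patients if age_group(p[0]) == group]
    curves[group] = kaplan_meier([p[1] for p in subset],
                                 [p[2] for p in subset])
```

Censored patients leave the risk set without lowering the curve, which is why the estimate only steps down at observed death times.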
Establishment and evaluation of the nomogram prognostic model
A nomogram, also called an alignment diagram, is useful in medical research because it estimates the survival of an individual patient by incorporating multiple clinical variables and their interdependent relationships (12). In essence, it is a visualization of the fitted model; the closer the C-index of the nomogram is to 1, the better the accuracy of the model. In this study, the C-index of the fitted Cox regression prediction model was 0.69, indicating that the model has good accuracy. The calibration curve is shown in Fig 6; the predicted calibration curve lies close to the standard curve, which also indicates good predictive ability of the nomogram.
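For readers unfamiliar with the C-index, the sketch below computes Harrell's concordance index on toy data. It is a simplified illustration under stated assumptions, not the routine used in the study: pairs with tied survival times are skipped, tied risk scores count as half concordant, and the function name and inputs are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data (simplified).

    A pair (i, j) is comparable when patient i died (event == 1) and
    did so strictly before patient j's follow-up time. The pair is
    concordant when the patient who died earlier has the higher risk
    score. Pairs with tied times are skipped (a simplification).
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot be the earlier death
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data for illustration only (not study results).
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.4, 0.6, 0.1]
c_index = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1 to perfect ranking of patients by risk.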
The nomogram showed that IGHM expression in LAML patients has a strong influence on survival time. The predicted 1-, 3-, and 5-year survival of patients in the low IGHM expression group was > 85%, > 70%, and > 60%, respectively, whereas that of patients in the high expression group was 68%, 43%, and 30%, respectively (Fig 7).
Analysis of IGHM-related signaling pathways
Having selected IGHM as our target gene, we reviewed the GSEA results to evaluate its potential function (Table 3). Among the top 10 significantly up-regulated pathways (abs(NES) ≥ 1, NOM p-value ≤ 0.05, and FDR q-value ≤ 0.25), IGHM participated in 7: humoral immune response mediated by circulating immunoglobulin, complement activation, phagocytosis recognition, B cell mediated immunity, B cell receptor signaling pathway, membrane invagination, and positive regulation of B cell activation.