The prognostic value of RASGEF1A RNA expression and DNA methylation in cytogenetically normal acute myeloid leukemia.

BACKGROUND
Acute myeloid leukemia (AML) is a highly heterogeneous malignancy of the blood. Cytogenetic abnormalities are crucial for the prognosis of AML. However, since more than half of patients with AML have cytogenetically normal AML (CN-AML), prognostic indicators need to be further refined. In recent years, gene abnormalities have come to be considered strong prognostic factors in CN-AML and already have clinical significance for treatment. In addition, relationships between the methylation of some genes and AML prognosis have been discovered. RASGEF1A is a guanine nucleotide exchange factor of Ras and is widely expressed in brain tissue, bone marrow and 17 other tissues. RASGEF1A has been reported to be associated with a variety of diseases, including Hirschsprung disease, renal cell carcinoma, breast cancer, diffuse large B cell lymphoma and intrahepatic cholangiocarcinoma [1, 2]. However, the relationship between the RASGEF1A gene and CN-AML has not been reported.


METHODS
By integrating 75 patients with CN-AML from the Cancer Genome Atlas (TCGA) database and 240 CN-AML samples from the Gene Expression Omnibus (GEO) database, we examined the association of RASGEF1A RNA expression level and DNA methylation with the prognosis of AML patients. We then investigated the prognostic value of RASGEF1A RNA expression and DNA methylation in 77 patients with AML after allogeneic hematopoietic stem cell transplantation (Allo-HSCT) and in 101 AML patients after chemotherapy. We investigated the association between sensitivity to Crenolanib and RASGEF1A expression level by integrating 191 CN-AML patients from the BeatAML dataset. We combined the expression and methylation of RASGEF1A to predict the prognosis of CN-AML patients, and investigated the relationship between the prognosis of AML patients in different risk classifications and the expression or methylation levels of RASGEF1A.


Introduction
Acute myeloid leukemia (AML) is a malignancy of the blood that is highly heterogeneous in genetic basis and prognosis. The main feature of AML is the clonal proliferation of myeloid blasts in bone marrow, peripheral blood and/or other tissues, which may cause impaired hematopoiesis and bone marrow failure, with symptoms such as frequent infections, easy bruising or bleeding, anemia, and bone or abdominal pain. [1][2][3] The incidence of AML is approximately 4.3 per 100,000 persons in the United States and 3.7 per 100,000 persons in Europe. [4] AML is most common in elderly people: the median age at diagnosis is 67 years, and nearly one-third of AML patients are diagnosed at the age of 75 or older. [1] For a long time, the standard chemotherapy for AML has been the "3 + 7" regimen, which combines 3 days of daunorubicin or idarubicin with 7 days of cytarabine to achieve complete remission (CR). [2,5] Of patients under 60 years old, 70-80% enter CR after receiving several anthracycline-based chemotherapy combinations. The situation is worse in older patients: only approximately 40-65% achieve CR, but 85% will relapse within 2 to 3 years. [2] In the Medical Research Council (MRC) AML11 trial, the 5-year overall survival (OS) was 12% for patients receiving DAT (thioguanine, cytarabine, and daunorubicin), 8% for patients receiving ADE (etoposide, cytarabine, and daunorubicin) and 10% for patients receiving MAC (mitoxantrone and cytarabine). [6] In the Leukemia Research Fund (LRF) AML14 trial, the 1-year and 3-year survival rates of 1137 intensively treated patients were 47% and 19%, and those of 275 non-intensively treated patients were 19% and 2%. [7] Another study showed that only 40-50% of patients over 65 years old achieve CR. The 5-year survival rate is 40% for young AML patients and only 5% for elderly patients. [4] Since common treatments are not effective for some AML patients, precision medicine is crucial for AML, and more prognostic factors are urgently needed.
Cytogenetic abnormalities have long been considered one of the most significant factors in the prognosis of AML, and this has been affirmed in clinical practice; for instance, patients with inv(16), t(15; 17) or t(8; 21) have a better prognosis. [8,9] However, nearly one-half of AML patients are cytogenetically normal (CN), so their prognosis cannot be predicted by cytogenetic markers. [9] In recent years, several genetic mutations have been described, providing more possibilities for predicting prognosis. [9,10] The 4-year survival rate of AML patients with CEBPA mutations is nearly 60%, and AML patients with NPM1 mutations have a 4-year survival rate of approximately 50%. Patients with mutations of the CEBPA or NPM1 genes have a favorable prognosis, and BAALC, MLL-PTD and CEBPA can help to predict the response of patients receiving allogeneic hematopoietic stem cell transplantation (Allo-HSCT). Patients with FLT3-Internal Tandem Duplication (FLT3-ITD) have an adverse prognosis, with a 4-year survival rate of only 20-25%. [11][12][13] Mutations in genes such as RUNX1, DNMT3A and ASXL1 are adverse prognostic factors for patients with cytogenetically normal AML (CN-AML). [14][15][16] These findings illustrate that gene abnormalities are extremely valuable prognostic factors for CN-AML, and they provide research directions for finding more precise risk stratifications.
DNA methylation is a direct chemical modification of DNA that forms 5-methylcytosine by transferring a methyl group to the C5 position of cytosine. DNA methylation controls gene expression mostly by recruiting proteins that inhibit gene expression or by impairing the binding of transcriptional activators, and its main effect is to reduce gene expression. [17] In addition, methylation also exerts its effects through other mechanisms, including generating splicing variants, downregulating microRNA and fostering DNA rearrangements. [18][19][20] Previous studies have demonstrated that DNA methylation levels are closely related to the prognosis of AML patients. For example, a high GADD45A methylation level can predict poor survival of AML patients, and decreased SCIN expression due to promoter methylation results in an adverse prognosis, while high HOXA5 methylation levels are valuable prognostic biomarkers in AML patients. [21][22][23] Methylation thus affects the prognosis of disease through several mechanisms, indicating that gene methylation also has important research significance in AML patients.
RasGEF1A belongs to the RasGEF family and is capable of regulating the transformation and activation of Rap2A, a member of the Ras family. [24] Studies have shown that Rap2a is a direct target of p53, and that the overexpression of Rap2a may promote the invasion and migration of many cancers, including lung cancer, renal cell carcinoma, osteosarcoma, nasopharyngeal carcinoma, and gliomas of the central nervous system. [25][26][27][28] One study showed that the expression of RASGEF1A is significant for the survival and migration of intrahepatic cholangiocarcinoma (ICC) cells, while the suppression of RASGEF1A inhibits the growth of ICC cells. [29] It has been reported that RASGEF1A is highly expressed in brain, moderately expressed in bone marrow, spleen, lymph node, testis and gall bladder, and weakly expressed in the other 21 examined tissues. [30] However, the prognostic significance of the expression and methylation of the RASGEF1A gene in CN-AML has not been reported so far. Our study integrated two independent cohorts of CN-AML patients to study the prognostic value of RASGEF1A RNA expression level and DNA methylation level in CN-AML patients.

Data source
We integrated RNA expression profiles and DNA methylation profiles of 75 CN-AML samples from the Cancer Genome Atlas (TCGA) and 240 CN-AML patients from the Gene Expression Omnibus (GEO). [31,32] We analyzed the relation between survival and RASGEF1A RNA expression in the 75 TCGA CN-AML patients and the 240 GSE12417 CN-AML patients. We analyzed the relation between post-treatment survival rates and RASGEF1A RNA expression in 67 patients with AML receiving Allo-HSCT and 92 AML patients receiving only chemotherapy from the TCGA database. We analyzed the relation between prognosis and the level of RASGEF1A methylation in 85 CN-AML patients, 77 AML patients with Allo-HSCT and 101 AML patients with only chemotherapy from the TCGA database. We investigated the association between sensitivity to Crenolanib and the expression level of RASGEF1A by integrating 191 CN-AML patients from the BeatAML dataset. [33] This research was conducted in accordance with the Declaration of Helsinki.
AML patients were selected based on the following criteria. The 75 TCGA AML patients had a normal karyotype and received chemotherapy or transplant treatment; their gene expression was measured using RNA-seq. The 92 AML patients from the TCGA database had a normal or abnormal karyotype and received chemotherapy; their gene expression was measured using RNA-seq. The 67 AML patients from the TCGA database had a normal or abnormal karyotype and received Allo-HSCT; their gene expression was measured using RNA-seq. The 162 AML patients of the GSE12417 U133B dataset and the 78 AML patients of the GSE12417 U133 Plus dataset had a normal karyotype and received chemotherapy or transplant treatment; their gene expression was measured using microarray. The 191 CN-AML patients from BeatAML had a normal karyotype and received chemotherapy or transplant treatment; their gene expression was measured using RNA-seq.
RNA expression profiles and DNA methylation profiles analysis
The RNA expression microarray data of each CN-AML sample from the GEO database were processed with the robust multiarray averaging (RMA) method. The expression level of each probe was log2-transformed. RNA expression data (RNA-seq) and DNA methylation data (HumanMethylation450 chip) were obtained from the TCGA database. The RNA expression level of each gene was expressed in RPKM (Reads Per Kilobase per Million mapped reads) and transformed as log2(FPKM + 1). The methylation of each gene was calculated from the probes related to that gene and ranged from 0 to 1 (0 for no methylation and 1 for 100% DNA methylation). The P-values of event-free survival (EFS) and OS for each gene were screened from the RNA expression profiles or DNA methylation profiles of each dataset. Using the survminer package, the patients of each dataset were classified into high and low groups according to RASGEF1A RNA expression (RASGEF1A-high expression and RASGEF1A-low expression) or RASGEF1A DNA methylation (RASGEF1A-high methylation and RASGEF1A-low methylation).
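The preprocessing described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline, which used R. The function names, the use of the mean over probes as the gene-level methylation summary, and the fixed cutoff are assumptions made for illustration only; the text states only that gene methylation was calculated from the related probes and that grouping was done with survminer.

```python
import math


def log2_plus_one(value):
    """Return log2(value + 1), the transform described for RNA-seq expression."""
    if value < 0:
        raise ValueError("expression values must be non-negative")
    return math.log2(value + 1.0)


def gene_methylation(probe_betas):
    """Summarise a gene's methylation from its probe beta values (each in 0..1).

    Assumption: the gene-level value is the arithmetic mean of the probes.
    """
    if not probe_betas:
        raise ValueError("at least one probe is required")
    if any(b < 0.0 or b > 1.0 for b in probe_betas):
        raise ValueError("beta values must lie between 0 and 1")
    return sum(probe_betas) / len(probe_betas)


def split_by_cutoff(values_by_sample, cutoff):
    """Label each sample 'high' if its value exceeds the cutoff, else 'low'.

    Assumption: the cutoff is supplied by the caller; this sketch does not
    reproduce how survminer selects a cutpoint.
    """
    return {
        sample: ("high" if value > cutoff else "low")
        for sample, value in values_by_sample.items()
    }
```

For example, `split_by_cutoff({"s1": log2_plus_one(3), "s2": log2_plus_one(0)}, 1.0)` labels the hypothetical sample `s1` as high and `s2` as low.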

Statistics
The statistical results were analyzed with the ggplot2 and survminer packages of R software v3.1.3. Survival curves were estimated by the Kaplan-Meier method, and survival differences were assessed with the Log-rank test. The unpaired t-test was used to compare the mean values of two groups. Fisher's exact test was used for enumeration data of two or more groups. Multivariate Cox proportional hazards regression models were used to assess the prognostic value of multiple biomarkers related to CN-AML.
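To make the two survival methods named above concrete, the following standard-library Python sketch implements the Kaplan-Meier product-limit estimator and a two-group Log-rank test. It is an illustrative re-implementation, not the R code used in the study. The function names and the input layout (parallel lists of follow-up times and event indicators, 1 for an event and 0 for censoring) are assumptions.

```python
import math


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect every subject whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group Log-rank test; returns (chi-square statistic, P-value, 1 df)."""
    event_times = sorted(
        {t for t, e in zip(times_a, events_a) if e}
        | {t for t, e in zip(times_b, events_b) if e}
    )
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += n_a * n_b * d * (n - d) / (n * n * (n - 1))
    if variance == 0.0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

With the invented inputs `kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])`, the subject censored at time 2 leaves the risk set without lowering the curve, so the estimate drops only at times 1, 3 and 4.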

Results
The 75 TCGA CN-AML samples' baseline characteristics
We examined the baseline characteristics of the 75 TCGA CN-AML patients, comprising 10 patients with RASGEF1A-low expression and 65 patients with RASGEF1A-high expression, and of the 240 GSE12417 CN-AML patients, comprising 162 from GSE12417 U133B and 78 from GSE12417 U133 Plus. In the 75 TCGA CN-AML patients, the baseline features of the two groups were matched. Except for recurrence (P = 0.034, Fisher's exact test), no indicator differed significantly between the groups. These indicators included gender, race, FAB type, nuclear type, risk, induction, transplantation, pre-transplantation and gene mutations such as DNMT3A, NPM1 and NET2 (P > 0.05, Fisher's exact test), as well as age, bone marrow blast cells (BM_BLAST), peripheral blood WBC (WBC) and peripheral blood blast cells (PB_BLAST) (P > 0.05, unpaired t-test, Table 1). In GSE12417 U133B, which included 21 patients with low RASGEF1A expression and 141 patients with high RASGEF1A expression, the baseline features of the two groups were matched, including French-American-British (FAB) type (P > 0.05, Fisher's exact test) and age (P > 0.05, unpaired t-test). In the 78 patients from GSE12417 U133 Plus, which included 12 patients with low RASGEF1A expression and 66 patients with high RASGEF1A expression, the baseline features of the two groups were also matched, including FAB type (P > 0.05, Fisher's exact test) and age (P > 0.05, unpaired t-test, Table S1 and S2).
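As an illustration of how a categorical baseline feature can be compared between the low and high expression groups, the sketch below computes a two-sided Fisher's exact test for a 2 x 2 table using only the Python standard library. It is a simplified stand-in for the R routine used in the study: it handles only 2 x 2 tables (features such as FAB type have more categories), the function name is hypothetical, and the counts in the usage note are invented rather than taken from Table 1.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test P-value for the table [[a, b], [c, d]].

    Rows could be the low and high expression groups, and columns the presence
    or absence of a feature. The P-value sums the hypergeometric probabilities
    of all tables with the same margins that are no more likely than the
    observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )
    return min(1.0, p_value)
```

For the invented table `fisher_exact_2x2(3, 1, 1, 3)` the P-value is 34/70, roughly 0.49, so such a feature would be reported as not significantly different between groups.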

RASGEF1A expression can forecast the prognosis of CN-AML
To address the correlation between RASGEF1A expression level and the survival of AML patients, we examined RASGEF1A expression level and prognosis in 75 TCGA CN-AML patients and 240 GSE12417 CN-AML patients. We found that the 10 TCGA CN-AML patients with low RASGEF1A expression had better EFS and OS (P < 0.0001, Log-rank test, Figure 1A). Among the CN-AML patients from GSE12417, the 21 patients with low RASGEF1A expression in GSE12417 U133B (P < 0.0001) and the 12 patients with low RASGEF1A expression in the GSE12417 U133 Plus dataset (P < 0.0001, Figure 1B) had improved OS.
The 75 TCGA CN-AML samples' multivariate analysis

RASGEF1A expression level can forecast the prognosis of AML patients receiving chemotherapy or Allo-HSCT
To further understand the correlation between RASGEF1A expression and the prognosis of patients with AML who received chemotherapy or Allo-HSCT, we compared the survival of treated patients with low and high RASGEF1A expression. In 67 AML patients after Allo-HSCT, the RASGEF1A-low expression group had a more favorable prognosis than the RASGEF1A-high expression group (EFS: P < 0.0001, OS: P < 0.0001, Log-rank test, Figure.

Combination of expression and methylation levels of RASGEF1A provides an accurate prognostic classification for patients with CN-AML
We comprehensively examined the expression and methylation levels of RASGEF1A in 75 CN-AML patients and obtained a more accurate method of prognostic classification. As shown in Figure 4, the 75 patients were divided into 4 groups: the RASGEF1A-high expression, RASGEF1A-high methylation group (G1); the RASGEF1A-high expression, RASGEF1A-low methylation group (G2); the RASGEF1A-low expression, RASGEF1A-high methylation group (G3); and the RASGEF1A-low expression, RASGEF1A-low methylation group (G4). For EFS, G3 had the best prognosis, G1 and G4 had a moderate prognosis and G2 had the worst prognosis among the 4 groups (EFS: P < 0.0001, Log-rank test, Figure 4). Due to the small sample size, G4 in the right side of Figure 4 had only 1 patient, whose outcome may reflect individual differences. After removing this patient, G3 had the best prognosis, G1 had a moderate prognosis and G2 had the worst prognosis among the 3 remaining groups (OS: P < 0.0001, Log-rank test, Figure 4).
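The four-group classification can be written down directly once each patient has a high or low label for expression and for methylation. The short Python sketch below shows the mapping to G1 through G4 as defined above. The function names and the example patient identifiers are hypothetical, and how the high and low labels are derived is outside the scope of the sketch.

```python
from collections import Counter

# Group definitions as given in the text: (expression, methylation) -> group.
GROUPS = {
    ("high", "high"): "G1",
    ("high", "low"): "G2",
    ("low", "high"): "G3",
    ("low", "low"): "G4",
}


def combined_group(expression_level, methylation_level):
    """Map a patient's expression and methylation labels to G1..G4."""
    return GROUPS[(expression_level, methylation_level)]


def group_sizes(labels_by_patient):
    """Count patients per group; input maps patient -> (expression, methylation)."""
    return Counter(
        combined_group(expr, meth) for expr, meth in labels_by_patient.values()
    )
```

Counting group sizes before running the survival comparison makes very small groups, such as the single-patient group noted above, visible early.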
The BeatAML CN-AML samples with high RASGEF1A expression level have better sensitivity to Crenolanib
To investigate the relationship between RASGEF1A expression level and sensitivity to Crenolanib in CN-AML patients, we integrated 191 CN-AML patients from BeatAML, comprising 50 with low RASGEF1A expression and 141 with high RASGEF1A expression. We found that the IC50 of Crenolanib was lower in CN-AML patients with high RASGEF1A expression, which suggests that the RASGEF1A-high group is more sensitive to Crenolanib (Figure S1).

Discussion
With the gradual development of research on AML, the heterogeneity of the disease is becoming more widely recognized, and cytogenetics is increasingly recognized as an independent predictor of prognosis in AML patients. However, although many prognostic factors have been identified, more than 50% of patients do not have a cytogenetic marker; these are the cytogenetically normal AML patients. [9,10] The good news is that many gene abnormalities that escape cytogenetic detection, such as those in CEBPA, NPM1 and KIT, have been discovered and have great significance in predicting prognosis. [14,15] In addition, the role of gene methylation in prognosis has also been discovered; for example, high GADD45A methylation predicts poor survival of AML patients. [21] Therefore, new gene abnormalities still need to be discovered in order to provide clues for predicting prognosis, studying pathogenesis and exploring new therapeutic targets. In this study, we integrated the survival data of 75 TCGA samples with CN-AML and 240 GEO CN-AML samples, and examined the effect of RASGEF1A expression on the prognosis of CN-AML patients, including patients who received chemotherapy or Allo-HSCT, as well as the influence of RASGEF1A methylation levels on the prognosis of CN-AML samples.
In this study, we found that RASGEF1A can be used as a prognostic factor for CN-AML patients. We compared the OS and EFS of 75 TCGA samples with CN-AML and 240 CN-AML patients from the GEO database with high or low RASGEF1A expression, covering both the overall prognosis of CN-AML patients and the respective prognosis of AML patients who received chemotherapy or Allo-HSCT. The study showed that the prognosis of patients in the RASGEF1A-high expression group was poorer than that of the RASGEF1A-low expression group (P < 0.0001, Figure 1 and Figure 2), indicating that the RASGEF1A gene expression level is a strong adverse factor for the prognosis of CN-AML. The baseline characteristics of the 75 CN-AML samples are shown in Table 1. In addition, we studied the association between the methylation of the RASGEF1A gene and patient survival. Whereas a high methylation level of GADD45A indicates an adverse prognosis, we found that high methylation of RASGEF1A suggests a favorable prognosis in CN-AML patients. [21] As can be seen from Figure 3A, the EFS and OS of patients were much better in the RASGEF1A-high methylation group than in the RASGEF1A-low methylation group (P < 0.0001, Figure 3A). We also found the same result in treated patients: a high level of methylation suggests a favorable prognosis for AML patients receiving chemotherapy (P < 0.0001, Figure 3B) or Allo-HSCT (P < 0.0001, Figure 3C). Therefore, the level of RASGEF1A methylation is a new prognostic factor for AML patients.
Furthermore, we found that the integrative analysis of gene expression and methylation can provide a more accurate method of prognostic classification. In Figure 4, CN-AML samples are separated into 4 groups by RNA level and methylation level, and the differences in EFS and OS among the 4 groups were statistically significant (P < 0.0001, Figure 4). In the left side of Figure 4, it is easy to see that the RASGEF1A-low expression, RASGEF1A-high methylation group (G3) had the best EFS and the RASGEF1A-high expression, RASGEF1A-low methylation group (G2) had the worst EFS among the 4 groups, consistent with the previous conclusions. Due to the small overall sample size, there is only one patient in the G4 group in the OS analysis (Figure 4, right side), so individual factors may have a great impact on that outcome. Despite this, the G3 group similarly had the best OS and G2 had the worst OS among the other three groups. It has been found that methylation can affect survival not only by regulating gene expression, but also through other mechanisms. [34] This suggests that expression and methylation may reflect different processes in pathogenesis, so that the integrative analysis may be more comprehensive. This method has been applied in other diseases and has yielded many new discoveries. New strategies to overcome tamoxifen resistance may be discovered by analyzing the expression and DNA methylation features of cancer stem cells. [35] By investigating the DNA imbalance and methylation profiles of myeloma cells, it was found that genomic heterogeneity is always present from diagnosis to relapse. [36] The integrative analysis in soft tissue sarcomas helped to find new biomarkers associated with pathogenesis. [37] The integrative analysis of the gene expression level and methylation level of RASGEF1A allows CN-AML patients to be graded more accurately, making prognosis prediction more accurate and even providing a more accurate direction for treatment. The CN-AML patients with a high RASGEF1A expression level have better sensitivity to Crenolanib.
RASGEF1A is an important guanine nucleotide exchange factor of Ras. Previous studies have revealed that high RASGEF1A expression is related to the survival and migration of intrahepatic cholangiocarcinoma cells, and our study has shown a correlation with the prognosis of patients with AML. [29] This may suggest that this gene contributes to the pathogenesis of AML and is associated with some malignant diseases. At the same time, this gene was detected in the bone marrow and 26 other examined tissues, especially in brain, lymph nodes, spleen and testis. [25][26][27][28][29][30] We speculate that there are also potential correlations between the RASGEF1A gene and diseases in those tissues, which requires further study.
A limitation of this study is that it does not address molecular mechanisms. To date, no prospective studies have proved the clinical significance of RASGEF1A in the treatment of CN-AML.
In conclusion, a high RASGEF1A expression level and a low RASGEF1A methylation level suggest poor survival and adverse prognosis for CN-AML patients. The integrative analysis of RNA and methylation levels can provide a more accurate classification for prognosis. CN-AML patients with a high RASGEF1A expression level have better sensitivity to Crenolanib. Low RASGEF1A expression is a favorable prognostic factor for AML patients receiving chemotherapy or Allo-HSCT.
Declarations

Figure 1 Comparison of survival between the RASGEF1A-high expression group and the RASGEF1A-low expression group in CN-AML patients. A: EFS and OS in 75 CN-AML patients from the TCGA database (P < 0.0001). B: OS in 162 CN-AML patients from the GSE12417 dataset (P < 0.0001) and OS in 78 CN-AML patients from the GSE12417 dataset (P < 0.0001). A Log-rank test was used to compare the survival curves of high and low gene expression. CN-AML, cytogenetically normal acute myeloid leukemia; EFS, event-free survival time (months); OS, overall survival time (months). Left side of A: the x-axis represents the EFS time (months); the y-axis represents the survival probability. Right side of A: the x-axis represents the OS time (months); the y-axis represents the survival probability. B: the x-axis represents the OS time (months); the y-axis represents the survival probability.

Figure 2 Comparison of survival between the RASGEF1A-high expression group and the RASGEF1A-low expression group in AML patients receiving chemotherapy or Allo-HSCT. A: EFS and OS in 67 AML patients receiving Allo-HSCT from the TCGA database (P < 0.0001). B: EFS and OS in 92 AML patients receiving chemotherapy from the TCGA database (P < 0.0001). A Log-rank test was used to compare the survival curves of high and low gene expression. EFS, event-free survival time (months); OS, overall survival time (months). Left side: the x-axis represents the EFS time (months); the y-axis represents the survival probability. Right side: the x-axis represents the OS time (months); the y-axis represents the survival probability.

Figure 3