High expression of CD52 mRNA predicts poor prognosis for cytogenetically normal acute myeloid leukemia patients


Background: The prognosis of cytogenetically normal acute myeloid leukemia (CN-AML) varies. Identifying new biomarkers that affect the prognosis of these patients may enable more precise classification and treatment. CD52 plays a significant role in chronic lymphocytic leukemia (CLL). However, the potential role of CD52 in CN-AML remains largely elusive.
Methods: We analyzed the prognostic role of CD52 expression levels in 58 CN-AML patients from The Cancer Genome Atlas (TCGA) dataset and validated the results in 345 CN-AML patients from Gene Expression Omnibus (GEO) datasets.
Results: CN-AML patients with high CD52 mRNA expression had a poorer prognosis than those with low CD52 expression (event-free survival [EFS], P=0.056; overall survival [OS], P=0.043; log-rank test), and the results were verified in GSE12417 (OS, P=0.020; log-rank test) and GSE71014 (OS, P=0.020; log-rank test). Hematopoietic stem cell transplantation (HSCT) may improve the prognosis of CD52 high patients. Regression analysis showed that the expression level of CD52 (HR=1.503; 95% CI: 1.158-1.949; P=0.002) is a prognostic factor independent of age (HR=3.045; 95% CI: 1.524-6.086; P=0.002) and FLT3 mutation status (HR=2.219; 95% CI: 1.123-4.382; P=0.022). CD52 gene expression showed a predictive effect on EFS (area under the curve [AUC] for 1-year survival: 0.685; for 2-year survival: 0.752) and OS (AUC for 1-year survival: 0.717; for 2-year survival: 0.770). We also found a significant negative correlation between CD52 mRNA expression and DNA methylation. Accordingly, we speculate that CD52 DNA hypomethylation may be responsible for the high level of CD52 mRNA. Functional enrichment analysis of genes differentially expressed between the CD52 high and CD52 low groups suggests that adhesion molecule deregulation may also be a potential pathological mechanism of CD52.
Conclusions: CD52 mRNA overexpression is an independent adverse prognostic factor for CN-AML. CD52 DNA hypomethylation may be responsible for the high level of CD52 mRNA. Adhesion molecule deregulation may be a potential pathological mechanism of CD52. Whether CD52 monoclonal antibodies have a role in high-risk patients needs further research.

Acute myeloid leukemia (AML) is a genetically heterogeneous malignant disease with adverse clinical outcomes. The prognosis of these patients is mainly influenced by cytogenetic and molecular aberrations. Allogeneic hematopoietic stem cell transplantation (allo-HSCT) in first complete remission (CR1) is recommended for high-risk AML patients, whereas the data are less clear for the intermediate cytogenetic subgroup, particularly cytogenetically normal acute myeloid leukemia (CN-AML) (1). The prognosis of these patients can be further stratified according to gene mutations such as FLT3, NPM1, and CEBPA (2). In addition to genetic mutations, abnormal expression of genes such as NCALD, PDK2, PDK3, BAALC, CDKN1B, ERG, and MN1 is also used to predict the prognosis of CN-AML (3)(4)(5). It is therefore important to identify new biomarkers that predict the prognosis of CN-AML, which may provide a new perspective for developing effective strategies to improve the outcome of these patients.
CD52 (Campath-1) is a 21 to 28 kilodalton glycoprotein with a peptide core of 12 amino acids, which is mainly expressed on normal and malignant B and T lymphocytes (6,7) and in some acute hematological diseases (8)(9)(10). CD52 has been reported to be an adverse predictor in cutaneous T-cell lymphoma (CTCL) and in "double-hit" and "double-expressor" lymphomas (11,12). CD52 expression on neoplastic stem cells (NSCs) was shown to correlate with poor survival in patients with myelodysplastic syndromes (MDS) and AML with 5q- (13). The CD52 antigen is a glycoprotein anchored on the cell membrane and may be shed from cells as soluble CD52 (14). Soluble CD52 antigen is also a significant survival predictor in CLL (15). CAMPATH-1H (alemtuzumab), a CD52-specific monoclonal antibody (mAb) that induces cell lysis by activating complement and cell-mediated cytotoxicity, has been approved for the treatment of relapsed or refractory B-cell chronic lymphocytic leukemia (CLL) (16,17). Although CD52 targeting is already used in clinical practice, the physiological and pathological significance and the prognostic value of CD52 remain poorly understood, especially in CN-AML. Herein, we systematically evaluated the clinical significance of CD52 mRNA expression in CN-AML, which may provide new insights for prognostic stratification and individualized treatment strategies.

Data source and preprocessing
A total of 58 de novo CN-AML patients with RNA-seq data, expressed as FPKM (fragments per kilobase per million mapped fragments) values, were identified in The Cancer Genome Atlas (TCGA; https://tcga-data.nci.nih.gov/), and the relative mRNA expression values were log2-transformed. Methylation data (Illumina Human Methylation 450k) were available for 48 of the 58 patients.
Two independent expression datasets, comprising 242 CN-AML patients in GSE12417 and 104 CN-AML patients in GSE71014, were downloaded from the National Center for Biotechnology Information Gene Expression Omnibus (NCBI-GEO) database (https://www.ncbi.nlm.nih.gov/geo). Expression values in all GEO arrays were normalized. Gene expression in GSE12417 was profiled using the Affymetrix Human Genome U133A Array (GPL96; N=163) and the Affymetrix Human Genome U133 Plus 2.0 Array (GPL570; N=79). We corrected the batch effect between these two platforms using the sva package in R (version 3.6.1; http://www.r-project.org). Gene expression in GSE71014 was profiled using the Illumina HumanHT-12 V4.0 expression beadchip (GPL10558; N=103). All mRNA expression values were log2-transformed.

Statistical and bioinformatics analysis
Considering clinical significance, age was transformed into a categorical variable. The Pearson chi-square test or Fisher's exact test was applied to compare categorical variables.
Univariate and multivariate Cox proportional hazards regression analyses were performed to estimate hazard ratios (HRs).
The prognostic impact of CD52 gene expression was analyzed using the Kaplan-Meier method and the log-rank test. Overall survival (OS) was defined as the time from AML diagnosis until death from any cause or last clinical follow-up. Event-free survival (EFS) was defined as the time from diagnosis to removal from the study due to death, relapse, or failure to achieve complete remission (CR), with censoring at the last follow-up. Area under the receiver operating characteristic curve (AUC-ROC) analysis was performed with the timeROC package.
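As an illustration of the survival comparison described above, the following is a minimal Python sketch of a Kaplan-Meier estimator and a two-group log-rank test. It is not the code used in this study (the analyses were run in R); the function names and the follow-up data are hypothetical and serve only to make the calculation concrete.

```python
import math
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up time of each patient
    events : 1 if the event (e.g. death) was observed, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    survival = 1.0
    curve = []
    for t in sorted(deaths):
        at_risk = sum(1 for u in times if u >= t)  # patients still followed just before t
        survival *= 1.0 - deaths[t] / at_risk
        curve.append((t, survival))
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value)."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e == 1})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for u in times_a if u >= t)
        n_b = sum(1 for u in times_b if u >= t)
        d_a = sum(1 for u, e in zip(times_a, events_a) if u == t and e == 1)
        d_b = sum(1 for u, e in zip(times_b, events_b) if u == t and e == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:  # hypergeometric variance of the number of events in group A
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail of chi-square with 1 d.f.
    return chi2, p_value


# Hypothetical follow-up times (months) and event indicators, not study data.
high_times, high_events = [2, 3, 5, 8], [1, 1, 1, 1]
low_times, low_events = [6, 9, 12, 15], [1, 0, 1, 0]

for t, s in kaplan_meier(high_times, high_events):
    print(f"high group: t={t}, S(t)={s:.3f}")
chi2, p = logrank_test(high_times, high_events, low_times, low_events)
print(f"log-rank chi2={chi2:.3f}, p={p:.4f}")
```

The p-value is taken from the chi-square distribution with one degree of freedom, which is the usual large-sample approximation for the two-group log-rank statistic.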
Comparisons of continuous parameters between two subgroups were made using the Kruskal-Wallis test. The Pearson correlation test was used to measure associations among continuous variables. A two-tailed P value of less than 0.05 was considered statistically significant for the above analyses. Differential gene expression analysis between the two groups was performed with the edgeR package using raw read counts. Genes with a false discovery rate (FDR) < 0.05 and |log2 fold change (FC)| > 1 were considered significantly differentially expressed. Gene Ontology (GO) function and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses of the differentially expressed genes were performed with the "org.Hs.eg.db", "clusterProfiler", "ggplot2" and "enrichplot" packages, and a P value < 0.05 was regarded as statistically significant.
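The selection rule for differentially expressed genes (FDR < 0.05 and |log2 FC| > 1) can be sketched in Python as follows. This is only an illustration: it assumes that the FDR is obtained by Benjamini-Hochberg adjustment of per-gene P values, and the gene names, fold changes and P values are invented placeholders rather than edgeR output.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_degs(genes, log2_fc, p_values, fdr_cutoff=0.05, lfc_cutoff=1.0):
    """Keep genes with FDR < fdr_cutoff and |log2 FC| > lfc_cutoff."""
    fdr = benjamini_hochberg(p_values)
    return [
        gene
        for gene, lfc, q in zip(genes, log2_fc, fdr)
        if q < fdr_cutoff and abs(lfc) > lfc_cutoff
    ]


# Placeholder values for illustration only (assumption, not study results).
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
log2_fc = [2.5, -1.8, 0.4, 1.2]
p_values = [0.001, 0.01, 0.02, 0.5]

print(select_degs(genes, log2_fc, p_values))  # ['GENE_A', 'GENE_B']
```

In this toy input, GENE_C passes the FDR threshold but not the fold-change threshold, and GENE_D passes the fold-change threshold but not the FDR threshold, so both criteria are needed to reproduce the selection.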

Baseline patient characteristics
A total of 58 CN-AML patients were included in the study. Thirty patients received chemotherapy alone, and the remaining twenty-eight patients underwent allogeneic hematopoietic stem cell transplantation. We divided the patients into a high expression group and a low expression group according to the median FPKM value of the CD52 gene. The clinical and molecular characteristics of the two groups were compared (Table 1). The median ages of the CD52 high and CD52 low groups were 51 years (range, 21-75 years) and 63 years (range, 21-88 years), respectively. The patients in the CD52 high cohort were older than those in the low expression cohort (P=0.019), and the mutation ratio of DNMT3A was higher in the latter group (P=0.014). There were no significant differences between the two groups in sex, race, FAB classification, chemotherapy, transplant, or genetic mutations (NPM1, FLT3, IDH1, IDH2, RUNX1) (all P values > 0.05). The baseline characteristics of the patients in GSE12417 and GSE71014 have been described in previous studies (18,19).
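The median-based grouping can be illustrated with a short Python sketch. The patient identifiers and expression values below are hypothetical, and assigning values exactly equal to the median to the low group is an assumption made here, since the text does not specify how ties were handled.

```python
import statistics


def split_by_median(expression):
    """Map each patient ID to 'high' or 'low' relative to the cohort median."""
    cutoff = statistics.median(expression.values())
    # Assumption: a value equal to the median is placed in the low group.
    return {
        patient: ("high" if value > cutoff else "low")
        for patient, value in expression.items()
    }


# Hypothetical CD52 expression values, not TCGA data.
cd52_expression = {"P1": 3.2, "P2": 5.1, "P3": 4.0, "P4": 6.3}
print(split_by_median(cd52_expression))
# {'P1': 'low', 'P2': 'high', 'P3': 'low', 'P4': 'high'}
```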

High expression of CD52 is a poor prognostic marker for CN-AML patients
To evaluate the impact of CD52 expression on the survival of CN-AML patients, we applied the Kaplan-Meier method and the log-rank test to the 58 CN-AML patients from the TCGA dataset. The CD52 high group had shorter EFS (P=0.056) and OS (P=0.043) than the CD52 low group, and the OS result was verified in GSE12417 (P=0.020) and GSE71014 (P=0.020). The 58 CN-AML patients were divided into a chemotherapy-only group (n=30) and an allo-HSCT group (n=28). Kaplan-Meier survival curves suggested that CD52 high was an adverse factor in the chemotherapy group (EFS, p=0.041; OS, p=0.013; Fig.1 e-f), whereas the expression level of CD52 had little effect on survival in the HSCT group (EFS, p=0.3647; OS, p=0.4812; Additional file: Fig.S1 a-b). HSCT may prolong the overall survival of CD52 high patients to some extent (p=0.0795) but not that of CD52 low patients (p=0.4812) (Additional file: Fig.S1 c-d). These results suggest that CD52 low patients may achieve long-term remission with standard intensive chemotherapy, whereas HSCT is advised for CD52 high patients.

Correlation of CD52 mRNA expression with other biomarkers in CN-AML patients
We analyzed the relationship between CD52 mRNA expression and the gene mutations that have been reported to affect the prognosis of CN-AML patients. CEBPA mutation was associated with lower levels of CD52 mRNA (Fig.4a, p=0.001), while DNMT3A mutation tended to be associated with higher levels of CD52 (Fig.4b, p=0.004). The relationships between CD52 expression and other gene mutations, such as FLT3 (p=0.188) and NPM1 (p=2.642×10^-6), were also examined but are not shown here.

Functional annotation and pathway enrichment of differentially expressed genes (DEGs)
To gain insight into the biological function of CD52, we analyzed differential gene expression between the CD52 high group and the CD52 low group. A total of 933 differentially expressed genes were found (214 downregulated genes; 719 downregulated genes). The differentially expressed genes between the two groups are shown in a volcano plot (Fig.4e). GO and KEGG functional annotation analyses are shown in Fig.4f-g. GO analysis found that T cell activation is an important GO category (32 genes; P=0.0005) in the biological process (BP) ontology. In the cellular component (CC) ontology, the most significant GO category is T cell receptor complex (7 genes; P=0.000061). In the molecular function (MF) ontology, major histocompatibility complex (MHC) protein binding is the most important GO category (9 genes; P=0.0003). KEGG analysis also showed that the differentially expressed genes were mainly enriched in the T cell receptor signaling pathway. In addition to T cell activation-related pathways, some DEGs were enriched in leukocyte cell-cell adhesion and regulation of leukocyte cell-cell adhesion (Table 2).

Discussion
The prognostic value of CD52 at the protein level has been recognized in CLL and some malignant lymphomas (12,26). CD52 molecules and CD52+ microvesicles in plasma both play a role in disease prognosis (12,15,26,27). Katharina  CD52 mRNA is a prognostic marker for CN-AML patients. However, it is not clear whether its prognostic value is independent or depends on other prognostic factors. We analyzed the correlation of the CD52 mRNA level with age, gene mutations and other biomarkers that have been reported as prognostic markers in intermediate-risk AML. There was no correlation between age and CD52 mRNA expression level, which may exclude abnormal gene expression related to aging. CN-AML patients with CEBPA mutation show a relatively good prognosis, while DNMT3A mutation confers an adverse prognosis (29,30). Mutations of CEBPA and DNMT3A are associated with lower and higher expression of CD52 mRNA, respectively. These associations support the prognostic value of the CD52 gene. However, whether the pathogenic effects of these mutations and of CD52 are independent or interrelated remains to be verified.
There is a significant negative correlation between CD52 gene expression and DNA methylation. This result suggests that CD52 gene hypomethylation may be responsible for the overexpression of CD52 mRNA in CN-AML. Further studies are needed to confirm a direct connection between CD52 mRNA expression and methylation.
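For readers unfamiliar with the statistic, the Pearson correlation coefficient underlying this kind of expression-methylation comparison can be computed as in the Python sketch below. The methylation beta values and expression values are invented for illustration and are not the TCGA measurements analyzed in this study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Invented values: higher methylation paired with lower expression.
methylation_beta = [0.9, 0.7, 0.5, 0.3, 0.1]
log2_expression = [1.2, 2.1, 2.9, 4.3, 4.8]

r = pearson_r(methylation_beta, log2_expression)
print(f"Pearson r = {r:.3f}")  # a negative r indicates an inverse relationship
```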

CD52 is an important immune regulator of T-cell activation, as previously reported; it can modulate T-cell activation either through its intracellular signaling pathways or through the interaction of soluble CD52 with Siglec-10 expressed on T cells (31,32). Functional enrichment analysis of the genes differentially expressed between the CD52 high and CD52 low groups found that DEGs were significantly enriched in terms related to T cell activation, which is consistent with previous reports. In addition to T cell activation-related pathways, some DEGs were enriched in leukocyte cell-cell adhesion, regulation of leukocyte cell-cell adhesion, positive regulation of cell-cell adhesion and cell adhesion molecules (CAMs). Many oncogenes, such as BCR-ABL1, RUNX1-ETO and MLL-AF6, directly regulate the activity of adhesion molecules to control malignant hematopoietic progenitor cells (28). We speculate that adhesion molecule deregulation may be a potential pathological mechanism of CD52. However, whether CD52 acts in this manner needs to be confirmed by further experiments.
A CD52 monoclonal antibody has already been approved for the treatment of relapsed or refractory CLL. The use of CD52 mAb in AML is rarely reported. Raoul et al. found that single-agent alemtuzumab has limited activity in CD52-positive recurrent or refractory acute leukemia (33). Our study found that CD52 high was an adverse prognostic factor for CN-AML. However, whether patients with high CD52 expression will benefit from CD52 mAb in combination with chemotherapy needs further research.
The limitations of this study should also be mentioned. The patients in our study came from three different datasets and received different chemotherapy strategies, which may have influenced the results to some extent. In addition, the bioinformatics analysis only provides possible clues to the pathological mechanism of CD52, and more experimental data are needed to verify our hypothesis.

Ethics approval and consent to participate
Written informed consent for all patients in this study was consistent with the Declaration of Helsinki.

Consent for publication
Not applicable.

Availability of data and materials
The datasets analyzed in the study are available from the corresponding author.

Competing interests
All authors declare that they have no competing interests.

Supplementary Files
Additional file 1.pdf