Biomarkers Associated with CD8+ T Cell Infiltration in Childhood AML

Acute myeloid leukemia (AML) is a common hematological malignancy in children and is characterized by high morbidity, recurrence and mortality rates worldwide. Immune cell infiltration in the tumor microenvironment plays an important role in tumor progression. This study aimed to explore biomarkers related to CD8+ T cell infiltration in children with AML. Transcriptome data and clinical data were retrieved from the TARGET database, and whole blood samples were collected from children with AML to verify the results. Joint analysis of data from multiple databases showed that CAMK2D, MPZL3 and MSL3 are associated with CD8+ immune cell infiltration. PCR analysis showed that CAMK2D, MPZL3 and MSL3 were highly expressed in the whole blood of children with AML. These results indicate that CAMK2D, MPZL3 and MSL3 are implicated in the prognosis of AML and are potential clinical prognostic markers related to CD8+ T cell infiltration in children with AML. Notably, the three genes are implicated in CD8+ T cell-related pathways.

Introduction
Leukemia is the most common hematological malignancy in children, with a mortality rate of 4% worldwide 1. Leukemia can be grouped into lymphocytic leukemia and myelogenous leukemia based on the source of hematopoietic primordial cells. Among childhood leukemias, acute myeloid leukemia (AML) has the highest recurrence rate and is responsible for the highest number of leukemia-related deaths. Childhood leukemia is characterized by multiple organ metastasis and poor prognosis. Currently, leukemia is mainly diagnosed through bone marrow biopsy. However, carrying out bone marrow biopsy in children is challenging. In addition, multiple bone marrow aspirations are needed for accurate diagnosis, which further weakens the child and affects the child's psychological well-being. Although several studies have explored the pathogenesis of AML, AML remains a health burden due to high morbidity and mortality rates. Advances in bioinformatics have enabled deposition of large gene chip and sequencing data sets, together with clinical data of AML patients, in public tumor databases. Multi-directional analysis of the chip data and clinical data of AML patients will help in identifying effective markers for the diagnosis and prognosis of AML. Such findings will provide a basis for more accurate clinical diagnosis and treatment of AML.
Previous studies show that the tumor microenvironment (TME) plays an important role in tumor development. Cancer cells and surrounding stromal cells promote malignant phenotypes of cancer, such as malignant proliferation, resistance to apoptosis and evasion of immune surveillance. Therefore, the TME plays a key role in the prognosis and progression of different cancer types. The TME is mainly composed of stromal cells and immune cells. Previous studies report that stromal cells are implicated in tumor angiogenesis and extracellular matrix remodeling. However, the role of immune cells in the TME in tumor growth and development has not been fully explored. Studies report that tumor-infiltrating immune cells (TICs) in the TME are potential biomarkers of therapeutic efficacy 2,3. In renal clear cell carcinoma, high expression of the CCL5 gene inhibits the proliferation and invasion of tumor cells. CCL5 is implicated in CD8+ T cell infiltration and is thus a potential biomarker and therapeutic target in renal clear cell carcinoma 6. Previous studies report that patients with AML present with exhaustion and senescence of immune cells. Cancer cells directly affect the activity, expansion, co-signaling and expression of CD8+ T cells. Chemotherapy partly restores the function of CD8+ T cells 7. The role of AML in shaping the response of CD8+ T cells, and the restoration of CD8+ T cell function after treatment, provide a theoretical basis for new immunotherapy approaches for AML. However, no specific molecular markers have yet been reported for immunotherapy against AML. Therefore, there is a need to explore molecular markers related to CD8+ T cell immunity for the development of AML immunotherapy.

Infiltration of CD8+T cells in tumors and
In this study, we analyzed RNA-seq transcriptome data and clinical data of children with AML. The analysis identified key regulatory genes and regulatory pathways related to CD8+ immune cells. The findings may inform novel immunotherapy approaches for the diagnosis and treatment of AML.

1. Data collection and processing
Transcriptomic files and clinical information on childhood acute myeloid leukemia were obtained from the TARGET public database. Count data were converted into the FPKM format for subsequent verification.
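The count-to-FPKM conversion mentioned above can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study: the function name `counts_to_fpkm`, the gene names, counts and gene lengths are all hypothetical, and the sketch assumes gene lengths in base pairs and uses the sum of counts in the sample as the library size.

```python
def counts_to_fpkm(counts, lengths_bp):
    """Convert raw read counts for one sample to FPKM.

    FPKM = count * 1e9 / (gene length in bp * total counts in the sample).
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no counted reads")
    return {
        gene: count * 1e9 / (lengths_bp[gene] * total)
        for gene, count in counts.items()
    }


# Hypothetical toy sample: made-up counts and gene lengths (assumption).
toy_counts = {"geneA": 500, "geneB": 1500, "geneC": 8000}
toy_lengths = {"geneA": 1000, "geneB": 3000, "geneC": 4000}
toy_fpkm = counts_to_fpkm(toy_counts, toy_lengths)
```

In this toy sample geneA and geneB receive the same FPKM because geneB has three times the reads but is three times as long, which is the length normalization the conversion is meant to provide.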
In this study, we constructed a weighted gene co-expression network to identify co-expressed gene modules, the relationship between the gene network and phenotype, and the core genes in the network.
The co-expression network of all genes in the dataset was constructed using the WGCNA R package. The 10,000 genes with the highest variance were selected for further analysis, and the soft threshold was set at 9. The weighted adjacency matrix was transformed into the topological overlap matrix (TOM) to estimate network connectivity, and hierarchical clustering was used to construct the clustering tree of the TOM. Different branches of the cluster tree represent different gene modules, and different colors represent different modules. Based on the weighted gene correlation coefficients, genes were classified according to their expression patterns, and genes with similar patterns were grouped into one module.
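The adjacency and TOM steps can be sketched in plain Python as below. This is an illustrative sketch under stated assumptions, not the WGCNA implementation: it assumes an unsigned network (adjacency equal to the absolute Pearson correlation raised to the soft threshold β), and the function names and the three-gene toy expression matrix are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def adjacency(expr, beta):
    """Unsigned adjacency a_ij = |cor(x_i, x_j)| ** beta, with a zero diagonal."""
    g = len(expr)
    return [
        [0.0 if i == j else abs(pearson(expr[i], expr[j])) ** beta
         for j in range(g)]
        for i in range(g)
    ]


def topological_overlap(adj):
    """TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    g = len(adj)
    k = [sum(row) for row in adj]  # connectivity of each gene
    tom = [[1.0] * g for _ in range(g)]
    for i in range(g):
        for j in range(g):
            if i != j:
                shared = sum(adj[i][u] * adj[u][j]
                             for u in range(g) if u not in (i, j))
                tom[i][j] = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
    return tom


# Hypothetical toy data: rows are genes, columns are samples (assumption).
toy_expr = [
    [1.0, 2.0, 3.0, 4.0],    # gene 0
    [2.0, 4.0, 6.0, 8.0],    # gene 1, perfectly correlated with gene 0
    [1.0, -1.0, 1.0, -1.0],  # gene 2, weakly correlated with the others
]
toy_adj = adjacency(toy_expr, beta=9)
toy_tom = topological_overlap(toy_adj)
```

Raising the correlation to the power 9 pushes the weak correlation between gene 2 and the other genes close to zero, while the strongly co-expressed pair keeps a topological overlap near 1; hierarchical clustering on 1 - TOM would then place genes 0 and 1 in the same module.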

2. Functional enrichment analysis of gene module
To determine the biological functions and signaling pathways associated with the WGCNA module of interest (the darkolivegreen module in this study, which has the highest correlation with phenotype), the Metascape database (www.metascape.org) was used to annotate and visualize the genes in the module for Gene Ontology (GO) analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. Terms with a minimum overlap ≥ 3 and p ≤ 0.01 were considered statistically significant.
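The significance filter can be illustrated with a generic over-representation test. The sketch below assumes a one-sided hypergeometric test, which is a common choice for this kind of enrichment analysis; it is not taken from the study, and the function names and gene numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_term, n_module, n_overlap):
    """P(X >= n_overlap) when n_module genes are drawn from n_background
    genes, of which n_term are annotated to the term."""
    upper = min(n_term, n_module)
    total = comb(n_background, n_module)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_module - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def is_significant(n_overlap, p_value, min_overlap=3, p_cutoff=0.01):
    """Apply the thresholds stated in the text: overlap >= 3 and p <= 0.01."""
    return n_overlap >= min_overlap and p_value <= p_cutoff


# Hypothetical numbers: 20 background genes, 5 annotated to a term,
# a module of 4 genes sharing 3 or 4 genes with the term (assumption).
p_three = hypergeom_enrichment_p(20, 5, 4, 3)
p_four = hypergeom_enrichment_p(20, 5, 4, 4)
```

With these toy numbers an overlap of three genes gives p of about 0.032 and fails the filter, while an overlap of four genes gives p of about 0.001 and passes, which shows that both thresholds have to be met.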

3. Immune gene correlation
The effects of genes on immune infiltration were evaluated. CIBERSORT was used to quantify the level of immune cell infiltration in each sample, and the correlation between gene expression and immune cell content was analyzed by Spearman rank correlation.
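The Spearman rank correlation used here can be sketched as below: values are replaced by their ranks (ties receive the average rank) and the Pearson correlation of the ranks is computed. The function names, expression values and CD8+ T cell fractions are hypothetical and serve only as an illustration.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / sqrt(sxx * syy)


# Hypothetical per-sample values (assumption): expression of one gene and
# the estimated CD8+ T cell fraction of the same five samples.
toy_expression = [1.2, 3.4, 2.2, 5.0, 4.1]
toy_cd8_fraction = [0.02, 0.10, 0.05, 0.30, 0.12]
toy_rho = spearman_rho(toy_expression, toy_cd8_fraction)
```

Because the measure depends only on ranks, it captures any monotonic relationship between expression and immune cell content and is not sensitive to the skewed scale of the estimated fractions.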

4. TCGA data acquisition
The TCGA database (https://portal.gdc.cancer.gov/) is the largest cancer gene information database. It stores gene data including gene expression data, miRNA expression data, copy number variation, DNA methylation and SNP data. We downloaded the original mRNA expression data of AML (level 3, FPKM format) for a total of 151 samples.

5. Gene set enrichment analysis
Gene set enrichment analysis (GSEA) ranks genes according to their degree of differential expression between two sample types and then determines whether predefined gene sets are enriched at the top or bottom of the ranked list. In this study, we used GSEA to compare signaling pathways between the high-risk and low-risk groups, in order to explore possible molecular mechanisms underlying the difference in prognosis between the two groups. The number of permutations was set to 1000 and the permutation type was set to phenotype.
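The idea of testing whether a gene set sits at the top or bottom of a ranked list can be sketched with a weighted running-sum statistic. The code below is a simplified illustration and not the GSEA software itself; the function name, gene names and ranking scores are hypothetical.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Running-sum enrichment score for a ranked gene list.

    ranked_genes must be sorted by descending score. The running sum
    increases at genes in gene_set (in proportion to |score| ** weight)
    and decreases at all other genes; the score is the maximum deviation
    of the running sum from zero.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    hit_weights = [abs(s) ** weight if h else 0.0 for s, h in zip(scores, hits)]
    total_hit = sum(hit_weights)
    if total_hit == 0:  # all hit scores are zero: fall back to equal weights
        hit_weights = [1.0 if h else 0.0 for h in hits]
        total_hit = float(n_hits)
    running, best = 0.0, 0.0
    for w, h in zip(hit_weights, hits):
        running += w / total_hit if h else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranked list (assumption): six genes ordered by a made-up
# differential expression score between high-risk and low-risk groups.
toy_genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
toy_scores = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
es_top = enrichment_score(toy_genes, toy_scores, {"g1", "g2"})
es_bottom = enrichment_score(toy_genes, toy_scores, {"g5", "g6"})
```

A positive score indicates enrichment at the top of the list and a negative score enrichment at the bottom. Significance would then be assessed by recomputing the score after repeatedly shuffling the phenotype labels, as described in the text; that step is omitted from this sketch.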

6. GeneMANIA analysis
GeneMANIA (http://www.genemania.org) is a flexible and user-friendly database for constructing PPI networks. It is used to visualize functional networks between genes and to analyze gene functions and interactions. The website allows the data sources for gene nodes to be configured and supports a variety of bioinformatics analyses, such as physical interaction, gene co-expression, gene co-localization, gene enrichment analysis and prediction. In this study, GeneMANIA was used to generate the core gene network in order to explore the potential mechanisms of the core genes.

7. Drug sensitivity analysis
Based on the largest pharmacogenomics database (Genomics of Drug Sensitivity in Cancer, GDSC, https://www.cancerrxgene.org/), the R package "pRRophetic" was used to predict the chemosensitivity of each tumor sample. The IC50 estimate for each chemotherapeutic drug was obtained by regression analysis. Ten-fold cross-validation was performed on the GDSC training set to verify the regression and prediction accuracy. Default values were used for all parameters, including the "combat" option for removing batch effects and the averaging of duplicate gene expression values.

8. Verification of gene expression
We collected whole blood samples from 30 children with AML and from 30 normal children as controls.
Quantitative PCR analysis showed that the expression of CAMK2D, MPZL3 and MSL3 in whole blood cells of children with AML was higher than that in normal children, and the difference was statistically significant.

9. Statistical analysis
All statistical analyses were performed in R (version 3.6). p ≤ 0.05 was considered statistically significant.

Data collection and WGCNA analysis
The soft threshold β was taken from "sft$powerEstimate" (the estimate returned by the WGCNA function pickSoftThreshold) and was set to 9 (Figure 1A).
Gene modules were detected based on the TOM, and a total of 20 gene modules were identified (Figure 1B and Figure 1C). The darkolivegreen module showed the highest correlation with the phenotype (p=2e-28). Therefore, the darkolivegreen module was selected for follow-up verification. The constituent genes of the darkolivegreen module were selected and their correlation with CD8+ immune cell characteristics was determined (Figure 1D). A significant correlation was found (cor=0.71, p=2.8e-14).

GO and KEGG analysis
The darkolivegreen module genes were selected for GO and KEGG analysis (Figure 2A). The darkolivegreen module was mainly enriched in T cell activation (Figure 2B), the T cell receptor signaling pathway, the antigen receptor-mediated signaling pathway, alpha-beta T cell activation and other T cell-associated signaling mechanisms. We speculate that the influence of the darkolivegreen module on AML progression is closely related to these pathways. The interactions between the modules are shown in the PPI diagram (Figure 2C).

KM-plot survival analysis
We performed KM-plot survival analysis on the darkolivegreen module genes (Figure 3) to establish the relationship between these genes and the prognosis of AML patients, and screened out the core genes associated with AML progression. A total of 10 genes (ZAP70, BIRC3, CAMK2D, CD6, ERP27, LY9, MPZL3, MSL3, OXNAD1 and TMEM2) were closely associated with AML prognosis. Exploring the potential mechanisms of these genes in AML and describing the molecular map of immune-related therapy for AML will be helpful in the therapeutic management of AML.
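A Kaplan-Meier survival estimate of the kind used in this analysis can be sketched as follows. The sketch assumes that patients are split into high-expression and low-expression groups at the median, which the text does not state; the function names and the toy follow-up data are hypothetical, and the log-rank test needed to compare two curves is not implemented here.

```python
from statistics import median


def kaplan_meier(times, events):
    """Kaplan-Meier estimate.

    times: follow-up time per patient; events: 1 = death, 0 = censored.
    Returns (time, survival probability) at each distinct death time.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for tt, e in zip(times, events) if tt == t and e == 1)
        leaving = sum(1 for tt in times if tt == t)
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients leave the risk set
    return curve


def split_by_median(expression):
    """Label each sample 'high' or 'low' relative to the median expression."""
    cutoff = median(expression)
    return ["high" if value > cutoff else "low" for value in expression]


# Hypothetical toy cohort (assumption): follow-up in months and event flags.
toy_times = [5, 8, 12, 12, 20]
toy_events = [1, 0, 1, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
toy_groups = split_by_median([2.0, 7.5, 3.1, 9.4, 6.0])
```

In practice the estimator would be applied separately to the high and low groups of each gene, and the two curves compared with a log-rank test to obtain the p value for that gene.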

Internal and external interactions of core genes
The tumor microenvironment is mainly composed of tumor-associated fibroblasts, immune cells, extracellular matrix, a variety of growth factors and inflammatory factors, cancer cells, and special physical and chemical characteristics. The tumor microenvironment significantly affects tumor diagnosis, survival outcomes and sensitivity to clinical therapy. We found that the 10 genes were closely correlated with immune cell infiltration, and that 9 genes (ZAP70, BIRC3, CAMK2D, CD6, ERP27, LY9, MPZL3, OXNAD1 and TMEM2) were associated with CD8+ T cells in AML (Figure 4A), which further validates our previous results. In addition, we downloaded the top 20 genes most highly associated with AML in children from the GeneCards database.
We analyzed their correlation with the expression of the core genes. The results showed that the 10 core genes were closely correlated with the top 20 AML-associated genes (Figure 4B). The Circos diagram shows the interactions among these 10 genes (Figure 4C). The GeneMANIA diagram shows that the 10 genes participate in a network with neighboring genes (Figure 4D), which implies that the core genes are involved in immune mechanisms such as lymphocyte activation, T cell activation and leukocyte activation.

TCGA database verification
To verify the roles of the 10 genes in AML, we extracted transcriptomic and clinical data from the TCGA database. The KM-plot results suggested that CAMK2D (p=1.021e-03), MPZL3 (p=8.941e-04) and MSL3 (p=3.897e-02) could predict the prognosis of AML patients (Figure 5A). Immune infiltration analysis of the TCGA-AML expression profile showed that MSL3 and MPZL3 were closely correlated with CD8+ T cells (Figure 5B). We further examined the relationship between the core genes and sensitivity to common chemotherapeutic drugs. Based on the drug sensitivity data obtained from the GDSC database, we used the R package "pRRophetic" to predict the chemosensitivity of each tumor sample. The results revealed that high and low expression levels of CAMK2D, MPZL3 and MSL3 affected sensitivity to common chemotherapeutic drugs (Docetaxel and Paclitaxel) (Figure 6A, Figure 6B and Figure 6C).

MSI and GSEA analysis
Microsatellite instability (MSI) refers to the gain or loss of simple repetitive sequences in the genome. MSI is prevalent in many tumors. Because MSI is significantly correlated with the response to immune checkpoint blockade, it can be used to predict tumor responses to immunotherapy. In this study, analysis of the relationship between the core genes and MSI revealed that high expression of MPZL3 was associated with increased MSI (p=0.031). We then determined the specific signaling pathways involving the three core genes and explored the potential molecular mechanisms by which the core genes affect AML progression.
GSEA results showed that overexpressed CAMK2D (Figure 7A) was involved in the B cell receptor signaling pathway, pantothenate and CoA biosynthesis, and Th1 and Th2 cell differentiation signaling pathways (Figure 8A and Figure 8B); overexpressed MPZL3 (Figure 7B) was involved in antigen processing and presentation, galactose metabolism and Th17 cell differentiation signaling pathways (Figure 8C and Figure 8D); and overexpressed MSL3 (Figure 7C) was involved in the Toll-like receptor signaling pathway, autoimmune thyroid disease and Th17 cell differentiation signaling pathways (Figure 8E and Figure 8F).

Experimental verification
Whole blood samples from 30 children with AML and 30 healthy children were collected and compared. The expression of the three genes was analyzed by PCR. CAMK2D, MPZL3 and MSL3 were highly expressed in the whole blood cells of children with AML (Figure 9A, Figure 9B and Figure 9C), and the difference was statistically significant.

Discussion
AML is a common non-solid tumor in children worldwide. It is characterized by high prevalence and mortality rates and is therefore a major burden in children 1. Accurate diagnosis of AML is achieved through bone marrow biopsy. The diagnostic accuracy of bone marrow puncture varies with the clinical stage of AML; therefore, multiple biopsies may be required for accurate diagnosis. These repeated biopsy procedures delay the diagnosis and treatment of AML. In some cases, recurrent refractory AML is not sensitive to treatment. The components and function of immune cells in the tumor microenvironment play a key role in the diagnosis, treatment and prognosis of different tumors. In this study, we explored diagnostic, therapeutic and prognostic targets against AML in children through analysis of AML chip data and clinical data. The findings of this study may be useful for timely AML diagnosis, for the development of novel chemotherapeutic drugs, and for AML prognosis.
We constructed a WGCNA network using transcriptome data and clinical data from the TARGET database. Recent studies reported the presence of CD8+ immune cells in 6 different malignant tumors. Therefore, CD8+ T cell-based immunotherapy may be used for cancer treatment. Identification of tumor antigen-specific T cells from cancer patients is key to the development of immunotherapy, diagnosis and treatment approaches 11.

Analysis showed that biomarkers involved in immune infiltration in children with
In addition, a previous study explored the anti-tumor activity of dendritic cells, which cross-activate CD8+ T cells by presenting foreign antigens 12 17. A previous study reports that mP3-expressing PMN significantly inhibit the proliferation of autologous healthy donor T cells but do not affect cytokine production in activated T cells. The activity of mP3-expressing PMN on T cells requires cell proximity and is abrogated by P3 inhibition 18.
In this study, we report that CD8+ cells are significantly correlated with the prognosis of AML. Moreover, CAMK2D, MPZL3 and MSL3 were implicated in T cell regulatory pathways, which are correlated with the function and activity of CD8+ cells, and thus may be used in AML prognosis. The MPZL3 protein resides in the mitochondria. MPZL3 is a protein-coding gene implicated in seborrheic dermatitis 21. Studies have shown that expression of MPZL3 is associated with different cancer types. A previous study reports that MPZL3 is associated with lung cancer susceptibility 22. A microarray analysis showed that MPZL3 is highly expressed in rectal cancer cells 23. These findings imply that MPZL3 is a cancer-related gene. In this study, joint analysis of three databases identified three genes with diagnostic value for AML. Pathway and enrichment analysis showed that these three genes were associated with CD8+ cells. Because AML is a common hematological tumor in children, we collected clinical blood samples over nearly a year; quantitative PCR analysis showed that CAMK2D, MPZL3 and MSL3 were clearly highly expressed in the whole blood cells of children with AML. AML is a hematological tumor with high mortality, systemic distribution and no designated target organ, so it is difficult to carry out further clinical trials to verify the detailed pathways and function of a given gene. Animal experiments cannot fully simulate the human bone marrow environment in which leukemic blood cells are produced, so verification in this study was limited to gene expression.
More in-depth basic research is still needed on the specific regulatory mechanisms by which these three genes act on CD8+ cells in AML, so as to provide new ideas for immunotherapy in children with AML.
In summary, we conducted a multi-angle analysis of transcriptome data and clinical data of AML patients.

Figure 4. (A) Heatmap showing correlations of the ten hub genes with T-cell infiltration. (B) Heatmap showing correlations between the ten hub genes and the top 20 genes most related to AML in children, selected from the GeneCards database. (C) Internal interactions between the 10 hub genes. (D) The most relevant interaction networks and functional networks between these 10 hub genes and neighboring genes.