Predicting diagnostic gene biomarkers associated with N6-methyladenosine and ferroptosis in patients with acute myocardial infarction

This study aimed to identify potential biomarkers for the early diagnosis of acute myocardial infarction (AMI) and to determine their correlation with ferroptosis, immune checkpoints, and N6-methyladenosine (m6A). We downloaded microarray data from NCBI (GSE61144, GSE60993, and GSE42148) and identified differentially expressed genes (DEGs) between samples from healthy subjects and patients with AMI. We also performed systematic gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses and used STRING to predict the interactions between proteins. We then identified the top ten DEGs using the multiscale curvature classification algorithm and verified their diagnostic significance by plotting receiver operating characteristic curves. Next, we investigated the relationship between these target genes and immune checkpoints, ferroptosis, and m6A.

targets for AMI have been identified through the study of genetic factors, such as the mRNA loci assigned to AMI in genome-wide association studies (GWAS) [10][11][12]. Non-coding RNA (ncRNA) is another such genetic factor. Previous studies have extensively investigated the regulatory effects of microRNA (miRNA) on several pathological aspects of AMI, including myocardial apoptosis, inflammation, angiogenesis, and fibrosis. Further study of miRNAs, long ncRNAs (lncRNAs), and circular RNAs (circRNAs) may clarify how these processes are regulated through various mechanisms.
Herein, we downloaded the NCBI microarray data and identified DEGs in AMI samples compared with normal controls [23]. The identification of DEGs was followed by systematic GO and KEGG analyses [24][25][26][27]. Protein-protein interactions (PPIs) among the products of the DEGs were studied using STRING [28,29]. Finally, the key genes were identified and examined to determine whether they were associated with immune checkpoints, ferroptosis, and N6-methyladenosine (m6A) modification [27,30,31]. Overall, this study provides new insights into the molecular mechanisms underlying the occurrence of AMI.

Microarray data
From the GEO database (http://www.ncbi.nih.gov/geo), we downloaded three microarray datasets in MiniML format (GSE42148, GSE60993, and GSE61144). GSE42148 was based on the GPL13607 platform (Agilent-028004 SurePrint G3 Human GE 8x60K Microarray, Feature Number Version). GSE60993 was based on the GPL6884 platform (Illumina HumanWG-6 V3.0 Expression BeadChip). GSE61144 was based on the GPL6106 platform (Sentrix Human-6 V2 Expression BeadChip). The GSE42148 dataset included peripheral blood samples from six patients with AMI and 11 healthy controls. The GSE60993 dataset included seven patients with ST-segment elevation myocardial infarction (STEMI), ten patients with non-STEMI (NSTEMI), and seven healthy controls. The GSE61144 dataset included peripheral blood samples from patients with STEMI (n=7). We log2-transformed and normalised the transcriptome data using the preprocessCore package in R (R Foundation for Statistical Computing) version 4.0.3 (3027). Based on the platform annotation information, we converted the probes into gene symbols using Strawberry Perl (version 5.32.1.1) and excluded probes that mapped to multiple genes. For genes represented by multiple probes, we calculated the mean value across those probes. Furthermore, we performed initial quality control using ANOVA and removed batch effects using the sva package in R [1][2][3].
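The probe-level preprocessing described above can be illustrated with a minimal sketch. It is written in Python rather than the R and Perl tools used in the study, it omits the normalisation and batch-correction steps, and the probe identifiers, intensities, and the function name collapse_probes are hypothetical values chosen only for illustration.

```python
import math
from collections import defaultdict

def collapse_probes(probe_values, probe_to_genes):
    """Log2-transform probe intensities, drop probes that do not map to
    exactly one gene, and average the probes that map to the same gene."""
    per_gene = defaultdict(list)
    for probe, values in probe_values.items():
        genes = probe_to_genes.get(probe, [])
        if len(genes) != 1:
            # Unannotated probes and probes covering several genes are excluded.
            continue
        per_gene[genes[0]].append([math.log2(v) for v in values])
    collapsed = {}
    for gene, rows in per_gene.items():
        # Column-wise mean: one value per sample for each gene.
        collapsed[gene] = [sum(col) / len(rows) for col in zip(*rows)]
    return collapsed

# Hypothetical toy data: three probes measured in two samples.
probe_values = {
    "p1": [8.0, 16.0],
    "p2": [2.0, 4.0],
    "p3": [32.0, 32.0],
}
probe_to_genes = {"p1": ["GZMB"], "p2": ["GZMB"], "p3": ["GZMA", "GZMK"]}

gene_matrix = collapse_probes(probe_values, probe_to_genes)
print(gene_matrix)
```

In this toy input, probe p3 is discarded because it maps to two genes, and the two GZMB probes are averaged sample by sample after the log2 transformation.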

Filtering DEGs
We identified differentially expressed RNAs using the limma package in R. The analysis included 30 AMI cases and 28 healthy controls. We used adjusted P-values to control for false positive results in the GEO datasets. An adjusted P-value < 0.05 and |log2 fold-change (FC)| > 1.5 were used as the thresholds for screening differentially expressed RNAs.
We drew box plots using the R package ggplot2. The R packages ggord and pheatmap were used to draw the PCA plot and heatmap, respectively. All of the above analyses were performed in R (R Foundation for Statistical Computing, 2020), version 4.0.3 [32].
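The screening thresholds can be made concrete with a short sketch. It assumes that the adjusted P-values are Benjamini-Hochberg adjusted (the text does not name the adjustment method), and the gene names, fold changes, and P-values are hypothetical. It is an illustration in Python, not the limma workflow itself.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest so that the adjusted
    # values never increase as the raw P-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def filter_degs(genes, log2fc, pvalues, fc_cutoff=1.5, alpha=0.05):
    """Keep genes with adjusted P < alpha and |log2FC| > fc_cutoff."""
    adjusted = bh_adjust(pvalues)
    return [
        (gene, "up" if fc > 0 else "down")
        for gene, fc, q in zip(genes, log2fc, adjusted)
        if q < alpha and abs(fc) > fc_cutoff
    ]

# Hypothetical toy results for four genes.
genes = ["geneA", "geneB", "geneC", "geneD"]
log2fc = [2.1, -1.8, 0.4, 1.6]
pvalues = [0.001, 0.004, 0.03, 0.2]

degs = filter_degs(genes, log2fc, pvalues)
print(degs)
```

Here geneC passes the P-value criterion but not the fold-change criterion, and geneD passes the fold-change criterion but not the P-value criterion, so only geneA and geneB are retained.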

Functional enrichment analysis
We used GO for functional gene annotation in three categories: molecular function (MF), biological process (BP), and cellular component (CC). KEGG enrichment analysis provides a useful reference for studying gene function and related high-level genomic functional information. To better understand the effects of the mRNAs, we used the clusterProfiler package (version 3.18.0) in R to analyse the GO functions of potential targets and KEGG pathway enrichment.

Screening of candidate diagnostic biomarkers
The Search Tool for the Retrieval of Interacting Genes (STRING) is an online biological database that provides gene analysis and builds gene interaction networks at the protein level [33]. In this study, we constructed the protein-protein interaction network of the DEGs using STRING (version 11.0) [27,34]. We then visualised the PPI network using Cytoscape version 3.8.2 (3330). The MCC and MCODE algorithms in the cytoHubba plug-in were used to screen key genes in the network.

Diagnostic Value of Characteristic Biomarkers in AMI
To test the predictive value of the identified biomarkers, we used the glm function in R (version 3.6.3) to build a logistic regression model and the ggplot2 package to visualise the results. Receiver operating characteristic (ROC) curves were generated using the mRNA expression data from the GSE42148, GSE60993, and GSE61144 datasets, which comprised 30 patients with AMI and 28 controls without AMI. The diagnostic value of the identified hub genes was evaluated using the area under the ROC curve (AUC), which ranges from 0.5 to 1. The closer the AUC is to 1, the better the diagnostic performance: an AUC of 0.5 to 0.7 indicates low accuracy, an AUC of 0.7 to 0.9 indicates moderate accuracy, and an AUC above 0.9 indicates high accuracy.
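As a rough illustration of how an ROC curve and its AUC are obtained from model outputs, the following Python sketch sweeps a decision threshold over a set of scores and integrates the resulting curve with the trapezoidal rule. The scores stand in for the fitted probabilities of a logistic model and are hypothetical, as are the function names. This is not the R code used in the study.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs obtained by lowering the decision threshold
    through every distinct score. Labels: 1 = AMI, 0 = control."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def trapezoid_auc(points):
    """Area under a piecewise-linear curve given as ordered (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Hypothetical predicted probabilities for four AMI and four control samples.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

curve = roc_points(scores, labels)
auc = trapezoid_auc(curve)
print(curve)
print(auc)
```

The curve starts at (0, 0) when no sample is called positive and ends at (1, 1) when every sample is called positive. The area between these extremes summarises performance across all thresholds.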

Effect of immune checkpoint-, m6A-, and ferroptosis-related gene expression in AMI
Based on the results of previous studies, we identified immune checkpoint-, ferroptosis-, and m6A-related genes. The datasets were downloaded from the GEO database (https://www.ncbi.nlm.nih.gov/geo/) in MiniML format. We extracted the expression of immune checkpoint-related genes. The ferroptosis-related genes were derived from the systematic analysis of the aberrances and functional implications of ferroptosis in cancer published by Ze-Xian Liu et al. The m6A-related genes were derived from the molecular characterisation and clinical significance of m6A modulators across 33 cancer types published by Juan Xu. We used the R package ggord to draw PCA plots, the R package ggplot2 to draw box plots, and the R package pheatmap to draw the heatmap. All of the above analyses were performed in R (R Foundation for Statistical Computing, 2020), version 4.0.3 [34][35][36][37].

Results
Identification of differentially expressed genes
DEGs in GSE42148, GSE60993, and GSE61144 were identified using limma after quantile normalisation and background correction. The limma screening identified 253 DEGs, including 181 down-regulated and 72 up-regulated genes (Figure 1).

Functional correlation analysis
Using the clusterProfiler package in Bioconductor and the gene function profiles obtained through GO and KEGG pathway enrichment analysis, we found that the DEGs were mainly concentrated in the following functional categories: allograft rejection, cell adhesion molecules (CAMs), graft-versus-host disease, Th1 and Th2 cell differentiation, natural killer cell-mediated cytotoxicity, leukocyte cell-cell adhesion, immune response-activating signal transduction, immune response-activating cell surface, and T cell activation (Fig. 2).

Identification and validation of biomarkers for diagnostic characteristics
The Search Tool for the Retrieval of Interacting Genes (STRING) is an online biological database that provides gene analysis and constructs networks of gene interactions at the protein level [28]. In this study, we used STRING (version 11.0) to construct the protein-protein interaction (PPI) network of the DEGs [30]. To further explore the central genes related to AMI and their mechanisms of action, the 72 up-regulated genes among the 253 DEGs in the AMI group were uploaded to the STRING online database to build a PPI network, yielding a network with 72 genes as nodes and 72 edges (as shown in the figure). Nodes represent DEGs enriched in the STRING database, while edges reflect the interactions between DEGs. Because genes with a high degree and a high clustering coefficient are more important in maintaining the stability of the entire network, we searched the PPI network for genes with a high degree and a clustering coefficient greater than 0.4. The average node degree was 7.36, the average local clustering coefficient was 0.526, and the PPI enrichment P-value was 1.0e-16. Among the 72 nodes, the ten genes with the highest degree were identified using the MCODE and MCC methods in Cytoscape (version 3.8.2): GZMB, GZMA, PRF1, KLRB1, CD2, KLRD1, IL2RB, NKG7, GZMK, and CCL5 (Fig. 3,4). These were identified as central genes that play key roles in AMI.
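The two network statistics used above, node degree and local clustering coefficient, can be computed as in the following Python sketch. The edge list is a hypothetical toy graph with generic node names and does not represent the STRING network of this study. The MCC and MCODE scoring performed in Cytoscape is not reproduced here.

```python
from itertools import combinations

def degree_and_clustering(edges):
    """Return {node: (degree, local clustering coefficient)} for an
    undirected graph given as a list of (node, node) edges."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    stats = {}
    for node, nbrs in neighbours.items():
        k = len(nbrs)
        if k < 2:
            # The coefficient is undefined for fewer than two neighbours;
            # it is set to 0 here by convention.
            stats[node] = (k, 0.0)
            continue
        # Count the edges that exist among the neighbours of this node.
        links = sum(1 for u, v in combinations(nbrs, 2) if v in neighbours[u])
        stats[node] = (k, 2.0 * links / (k * (k - 1)))
    return stats

# Hypothetical toy network: a triangle A-B-C plus a pendant node D.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "D")]

stats = degree_and_clustering(edges)
mean_degree = sum(k for k, _ in stats.values()) / len(stats)
clustered = sorted(n for n, (_, c) in stats.items() if c > 0.4)
print(stats)
print(mean_degree, clustered)
```

In the toy graph, node A has the highest degree but a clustering coefficient of only 1/3, because just one of the three possible links among its neighbours exists. Nodes B and C each have a coefficient of 1 and therefore pass the 0.4 cut-off.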

Diagnostic effect of characteristic biomarkers on acute myocardial infarction
Seven biomarkers were used to distinguish AMI from control samples, demonstrating high diagnostic predictive value (Figure 5).

Acute myocardial infarction is associated with immune checkpoints, m6A, and ferroptosis
Among the 24 ferroptosis-related genes collected, changes in gene expression were observed between patients with AMI, patients with stable angina, and healthy controls. We observed that the ferroptosis-related genes ACSL4, CARS, CISD1, CS, GPX4, NFE2L2, RPL8, and SAT1 were closely associated with AMI. Studies have shown that high levels of the antioxidant enzyme glutathione peroxidase (GPx) are associated with improved prognosis after acute coronary syndrome (ACS) and have a protective effect [38]. Many regulators are involved in RNA methylation, including methyltransferases (writers), RNA-binding proteins (readers), and demethylases (erasers) [34,36,37]. Therefore, we collected genes associated with these three regulator types and investigated their association with AMI. We found that the expression of METTL3, WTAP, YTHDC1, and YTHDF2 was significantly increased in patients with AMI (P < 0.01). In the analysis of immune checkpoints, LAG3, SIGLEC15, and TIGIT were found to be closely related to AMI (P < 0.01) (Figures 6, 7, and 8).

Discussion
In this study, we first used the GEO gene expression datasets to detect differential gene expression associated with AMI and characterised the DEGs via functional analysis. Thereafter, we used MCC and MCODE to screen out ten genes as potential diagnostic markers. We also analysed the ROC curves for independent and joint prediction. Subsequently, we validated the association of AMI with immune checkpoints, ferroptosis, and m6A. This study may contribute to the timely diagnosis and improved treatment of AMI.
Myocardial infarction is a leading cause of morbidity and mortality worldwide. Studies show that in 2015 alone, 15.9 million patients suffered from AMI [39]. Despite significant improvements in the early diagnosis and treatment of AMI in the past decade, it remains a leading cause of death and disability. Therefore, the identification of new biomarkers for the early diagnosis of AMI requires further investigation. In recent years, RNA has emerged as an important biomarker for cardiovascular disease. For example, SOCS3 can be used as a biomarker to predict the risk of AMI, and high expression of SOCS3 is an independent risk factor for AMI [40]. Small RNAs, such as miR-34, have significant regulatory effects after heart failure and provide important information about heart failure [41,42]. PTGS2 is associated with reduced risk of stroke and myocardial infarction [43]. With the development of gene chip technology, microarrays have been widely used in heart disease research [44,45].
On comparing the expression levels of the target genes that predicted STEMI development, we found significant differences in GZMB, GZMA, GZMK, PRF1, KLRB1, CD2, KLRD1, IL2RB, NKG7, and CCL5 expression. The expression of these genes was significantly increased after STEMI. Granzyme B (GZMB), a member of the granzyme family of serine proteases, promotes apoptosis and is currently the most widely studied granzyme in the field of health and disease [46]. Studies have shown that miR-518a-5p can target GZMB to reduce hypoxia-reoxygenation-induced vascular endothelial cell injury, thereby improving coronary artery disease [47]. GZMK, which also belongs to the granzyme family of serine proteases, plays a key role in myocardial ischaemia. For example, microRNA-145 can protect mice from myocardial ischaemia-reperfusion injury by regulating the expression of GZMK under sevoflurane treatment [48]. CD2 is an immunoglobulin superfamily transmembrane glycoprotein expressed on the surface of T cells, NK cells, thymocytes, and dendritic cells [49,50]. Because of the increased expression of CD2 on activated and memory T cells and its importance for spontaneous NK cell cytotoxicity, CD2-targeted therapy may be an effective tool for regulating the activation of these cell types in transplant patients or patients with autoimmune diseases [51]. It is therefore plausible that CD2 also plays a key role in the occurrence of AMI. IL2RB mutations can lead to immune dysregulation driven by T and NK cells [52]. In addition, PRF1 mutations can alter immune system activation, inflammation, and autoimmune risk [53]. Therefore, we believe that immune dysregulation or immune inflammation is significantly associated with AMI. However, further experiments are warranted to verify this hypothesis.
In the analysis of immune checkpoint-related genes in patients, we found that the immune checkpoint proteins LAG3, SIGLEC15, and TIGIT were closely related to the occurrence of AMI. LAG3, a member of the Ig superfamily, binds to MHC class II with high affinity and is a strong inhibitory immune checkpoint for activated T cells [54,55]. Studies have shown that plasma LAG3 is a potential independent predictor of HDL-C levels and the risk of coronary heart disease [56]. In recent years, ferroptosis has been a hot topic in investigations of atherosclerotic lesions, and frequent and long-term whole blood donation can reduce iron content in the body, which may be related to the reduced risk of atherosclerotic cardiovascular events [57]. In our study on the relationship between DEGs and ferroptosis in AMI, we found that eight ferroptosis-related genes were closely related to the occurrence of AMI. However, the specific mechanism linking AMI and ferroptosis has not yet been explained by basic research. Similarly, in the analysis of m6A-related genes in patients with AMI, we found that the m6A regulators METTL3, WTAP, YTHDC1, and YTHDF2 were closely related to the occurrence of AMI. Studies have shown that the cytoplasmic proteins YTHDF1 and YTHDF2 mediate methylation-dependent translation promotion and degradation of target genes, respectively [58,59]. However, the mechanism of their association with AMI remains to be studied further. Therefore, we believe that the occurrence of AMI is related to immune checkpoints, ferroptosis, and m6A, but this hypothesis needs to be verified.
This study has some limitations. For example, several studies have examined the differential expression of genes in ACS, and their results differ from those of this study. This could be due to the following reasons: (1) different batches of microarray analyses yield results that differ to some extent; (2) compared with other studies, this study used three AMI datasets, providing a comprehensive bioinformatics analysis of AMI. Therefore, the results of this study are reliable. Nevertheless, it is important to verify the results in subsequent experiments. In addition, the reproducibility of the immune checkpoint-, ferroptosis-, and m6A-related genes obtained from the datasets needs to be further validated. Further large-scale basic studies can be carried out to verify the conclusions of this study.

Conclusions
The timely diagnosis and treatment of AMI can help improve global health. Considering this, our study aimed to identify new genetic markers associated with AMI. We found ten genes related to the occurrence of AMI. Furthermore, we believe that the occurrence of AMI is related to immune checkpoints, ferroptosis, and m6A; however, since the mechanisms underlying these associations with AMI remain unclear, our hypothesis needs further verification.

The enriched KEGG signalling pathways were selected to demonstrate the primary biological actions of the major potential mRNAs. The abscissa indicates the gene ratio, and the enriched pathways are presented on the ordinate. Gene ontology (GO) analysis of potential mRNA targets. The biological process (BP), cellular component (CC), and molecular function (MF) terms of potential targets were clustered using the clusterProfiler R package (version 3.18.0). In the enrichment results, p < 0.05 or FDR < 0.05 was considered meaningful.

Figure 3
In the PPI network, filled nodes indicate that the 3D structure of the protein is known or predicted, and empty nodes indicate that the 3D structure is unknown. The connections between proteins represent predicted functional associations that are specific and meaningful. There are seven differently coloured lines: 1. Light blue for curated database evidence; 2. Purple for experimental evidence; 3. Red for gene fusion; 4. Yellow-green for text-mining evidence; 5. Green for gene neighbourhood; 6. Blue for gene co-occurrence; and 7. Black for gene co-expression.

False positive rate (FPR): the proportion of all negative samples that the classifier predicts as positive, i.e., FP/(FP+TN). By varying the threshold, pairs of TPR and FPR values are obtained. The ROC curve is drawn with FPR as the abscissa and TPR as the ordinate. As shown in the figure, each point on the curve corresponds to the FPR and TPR at a particular threshold.
(TPR is the proportion of all samples whose true category is 1 that are predicted to be category 1. FPR is the proportion of all samples whose true category is 0 that are predicted to be category 1. To interpret the AUC, suppose that one positive sample and one negative sample are drawn at random. Let P1 be the probability the classifier assigns to the positive sample being positive, and P2 the probability it assigns to the negative sample being positive. The AUC is the probability that P1 > P2.)
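The probabilistic reading of the AUC given in this note can be checked directly by comparing every positive sample with every negative sample. The Python sketch below does this for hypothetical scores. The function name and the convention of counting ties as one half are assumptions made for illustration.

```python
def pairwise_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs in which the positive sample
    receives the higher score (P1 > P2); ties count as one half."""
    wins = 0.0
    for p1 in pos_scores:
        for p2 in neg_scores:
            if p1 > p2:
                wins += 1.0
            elif p1 == p2:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical predicted probabilities of being positive.
ami_scores = [0.9, 0.8, 0.6, 0.4]
control_scores = [0.7, 0.3, 0.2, 0.1]

auc_value = pairwise_auc(ami_scores, control_scores)
print(auc_value)
```

Of the 16 possible pairs in this toy example, the positive sample is ranked higher in 14, which is why an AUC close to 1 corresponds to a classifier that almost always scores a true case above a control.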

Figure 6
A: Box plot after data standardisation. B: PCA results before batch effect removal for multiple datasets. C: PCA results after batch effect removal. D: Immune checkpoint-related gene analysis results. G1 represents the healthy control group and G2 represents the AMI group. The significance of the difference between the two groups was analysed by the Wilcoxon test. The horizontal axis represents the different immune checkpoint-related mRNAs, and the vertical axis represents the expression distribution of the related genes. Different colours represent different groups, and asterisks in the upper left corner indicate the significance levels: *P < 0.05, **P < 0.01, ***P < 0.001.

Figure 7
A: Box plot after data standardisation. B: PCA results before batch effect removal for multiple datasets. C: PCA results after batch effect removal. D: Ferroptosis-related gene analysis results. G1 represents the healthy control group and G2 represents the AMI group. The significance of the difference between the two groups was analysed by the Wilcoxon test. The horizontal axis represents the different ferroptosis-related mRNAs, and the vertical axis represents the expression distribution of the related genes. Different colours represent different groups, and asterisks in the upper left corner indicate the significance levels: *P < 0.05, **P < 0.01, ***P < 0.001.

Figure 8