Myocardial Infarction Biomarker Discovery with Integrated Gene Expression, Pathways and Biological Networks analysis

Abstract


Introduction
Despite the improvement in its management, Myocardial infarction (MI) remains one of the most prevalent and challenging heart diseases due to high morbidity and mortality.The estimated worldwide incidence of MI rises to approximately by 7.29 million people every year [1].In recent years, experimental and epidemiological studies have shown that the vascular in ammation is a central hallmark of cardiac disease by increasing the endothelial damage resulting in the hemodynamic changes and blood thrombogenicity [2][3][4].Endothelial cells dysfunction response is observed after MI and followed by dysregulated immune system and in ammatory responses [5].However, mechanisms underlying the changes that occur in immune cells and other signaling pathways during MI remain poorly elucidated.Identi cation of dysregulated individual genes and signaling pathways is crucial for understanding the pathogenic molecular changes in MI and for developing effective treatments with novel targets.Gene expression microarray studies are commonly used for high throughput biomarker identi cation.
Compared to tissue microarray studies, peripheral blood is an attractive tissue for new biomarker screening due to the simplicity and low invasiveness of sample collection.In the present study, we applied integrated analysis of microarray data collected from MI patients and healthy control to identify and classify differentially expressed genes (DEGs) and to explore the reliable and unique MI-associated molecular signature pro le and their key pathways and screened for potential MI target genes based on genetic association studies.Using systems biology approach, we searched for degree of network interactions and highly connected gene network modules to identify potential biomarkers in MI.

Identi cation of DEGs
Based on the cutoff criteria (adjusted P < 0.05 and |logFC|>1.5),over 20,000 genes were analyzed, and 806 genes were differentially expressed of which 571 genes were up-regulated and 235 down-regulated.
After removing the un-mapped DEGs, there were 492 up-regulated and 184 down-regulated DEGs taken up for downstream analysis.Top 10 up-and down-regulated DEGs according to their signi cance are shown in Table 1 and Table 2.A box plots of the data distribution before and after dataset normalization shown that all chip data reached comparable levels and a volcano plots demonstrates an overview of high expressed genes using logFC, where the threshold in the volcano plot was |logFC|>1.5.Subsequently, the heatmap of DEGs was created showing the expression pro les of MI and non-MI resulted in hierarchical clusters (Fig. 2a-c).

DEGs GO enrichment analysis
To characterize the biological role of identi ed DEGs, functional gene functional annotation and enrichment analysis was performed using DAVID, GO (molecular function, biological process and cellular component), KEGG and Reactome pathways.We combined up and down-regulated DEGs to have a larger scale perspective of the biological function.Top GO analysis results are presented in Fig. 3a,b.In biological process category, the most common processes include immune system, regulated exocytosis, response to stimulus and cytokine receptor binding.Under molecular function categories, DEGs were associated with receptor ligand activity, chemokine receptor binding and G protein coupled receptor (GPCR) binding (Fig. 3a,b).

KEGG and Reactome pathway
KEGG pathway analysis revealed that the DEGs were signi cantly enriched in Osteoclast differentiation, TNF signaling, Cytokine-cytokine receptor interaction and NF-kappa B signaling pathways.The results of the Reactome pathways analysis demonstrates an enrichment in neutrophil degranulation, signaling by interleukins and TLR signaling as shown in Fig. 4a,b.

PPI Network and modular analysis
To explore the interaction of DEGs in MI, DEGs were submitted to STRING database for PPI network construction and Cytoscape software 3.7.2 was used to analyze the network [9].PPI network of up-and down-regulated DEGs contain 681 nodes and 4704 edges with a mean node degree of 13.8 (Fig. 5).
DEGs in the PPI network were ranked by the topological analysis method of degree centrality score of the network where TNF, TLR4, CXCL8, TLR2, STAT3, TLR8, IL1B, VEGFA, MMP9, and JUN had the highest degree of connectivity among DEGs.which may play a critical role in cardiovascular disease.Further analysis of PPI was carried out by MCODE, a Cytoscape plugin used to identify the most signi cant connected and tightly clustered subnetworks with degree cutoff 2, node score cutoff 0.2, K-score 2, and maximum depth of 100.Four modules were extracted from the initially constructed PPI network (Fig. 6ac).Pathway enrichment analysis was then carried out in ClueGO to show pathway speci city for each module (Additional le 1: Fig. S1a-d).

Hub and bottleneck genes
To investigate the signal of identi ed modules, we construct PPI network for identi ed MCODE modules (PPI-MCODE).Maximal Clique Centrality (MCC) method in Cytohubba was used to identify hub genes, centralities and shortest paths method to identify bottlenecks genes.From identi ed modules, we extracted top 20 hub and 20 bottleneck genes (Fig. 7a,b).The analysis reveals unique twelve genes each as hub and bottleneck genes and eight were classi ed as hubs as well as bottlenecks.Filtered genes include unique hub genes CCL20, IL1A, CXCL8, CCR2, IL1RN, CXCL2, ICAM1, ITGAX, MMP9, TLR8, CCL3, PTGS2; unique bottleneck genes CDKN1A, LILRB2, FCGR2A, PTX3, MNDA, ATG7, JUN, CD33, DDX58, FCGR1A, FPR1, UBR4; and overlapped hub-bottleneck genes TNF, VEGFA, STAT3, CXCL1, CCL4, TLR2, IL1B, TLR4 where TNF and TLR4 were the most connected gene in the whole PPI network (Table 3).Functional GO, KEGG and Reactome pathways analysis of identi ed hub and bottleneck genes are illustrated in Additional le 2: Table S1, S2, S3 and S4.Signi cant key genes in the network and candidate genes screening In order to identify key genes, we referred to DisGeNET, a comprehensive discovery platform database containing known human gene-disease association to validate our work ow.A total of 32 hub and bottleneck genes, identi ed by Cytohubba, were mapped into DisGenNET database and MI and HF related genes were extracted.A total of seventeen genes were found to be associated with MI and HF (Fig. 8a).
Remaining fteen genes were resubmitted to DisGeNET for an overview of cardiovascular or other associated diseases.Some of these genes, CCL4, CCL3, CXCL1, JUN, DDX58, FCGR1A and FCGR2A were involved in multiple cardiovascular disease for instance myocardial ischemia, heart valve disease and thrombosis (Fig. 8b).A glimpse of the remaining gene-associated disease is illustrated in Fig. 8c and Table 4.In parallel, we used Open Target platform to screen for distinct biomarkers that correlate with MI (Fig. 9).We rst submitted hub and bottleneck genes to Open Target platform and a total of 12 drug target were extracted including PTGS2, VEGFA, TNF, CD33, IL1B, IL1A, MMP9, CCR2, TLR4, ICAM1, CXCL8 and FCGR1A (Additional le 2: Table S5).Separately.We extracted speci c MI drug targets list based on genetic association score < 0.2 (Additional le 2: Table S6).A total of 77 MI-drug target were then compared with hub and bottleneck genes based on enzyme-substrate, PPI and pathways interaction.Associated genes with MI drug targets were found to be IL1B, IL1A, IL1RN, STAT3, PTX3, PTGS2 and JUN (Additional le 1: Fig. S2).Reanalysis of these genes in DisGeNET database show that all genes were associated with MI except for CD33, FCGR1A and JUN which are bottleneck genes according to our analysis.Available drug for CD33 and FCGR1A and their targets are displayed in Additional le 2: Table S7.Various of identi ed targets in our work ow have been already in use to improve the clinical outcome in MI patient, such as IL1A, IL1B and PTGS2 as discussed below.We speculated that these genes may play an important role in MI development and could be potential therapeutic targets for MI treatment.

Discussion
Myocardial infarction is a complex disease where the genetic makeup may contribute to distinct cellular processes and to the development of disease phenotypes [10,11].MI is associated with systemic in ammatory dysregulated response including elevated levels of circulating in ammatory mediators, and activation of peripheral leukocytes and platelets.
In concordance with previous studies, our work highlights a possible immune change in MI pathology.
The crucial role of immune system in myocardial infarction is well established [12,13].Previous studies suggested an intense immune and in ammatory responses trigger several distinct cardiac pathological conditions including early onset MI [14][15][16].The immune signals may indirectly contribute to the reverse remodeling in dilated cardiomyopathy, dilative remodeling, infarcted size increase and heart failure [17,18].
Identi cation of key genes in major pathways might be an attractive therapeutic target for MI.MCODE analysis found four tightly connected sub-networks (modules) containing most of the topologically signi cant nodes of the whole networks.The transcriptional pro ling presented in identi ed modules demonstrating the relationship variability in the functional biological pathways of MI.Several identi ed hub genes were signi cantly associated with MI.Thrombomodulin (THBD) is an endothelial anticoagulant protein, involved in anti-in ammatory responses, including innate immunity, brinolysis and complement activity [29,30].Genetic studies linked several polymorphisms within THBD to be associated with higher risk of coronary artery disease and myocardial infarction [31][32][33][34][35].
Another identi ed gene is Superoxide Dismutase 2 (SOD2).SOD2 is one of the major mitochondrial antioxidant enzymes.Homozygous mice for SOD2 have a severe phenotype, heart and liver complications, metabolic acidosis, and early neonatal death [36].Recent study showed that plasmatic concentration of SOD2 was higher in CAD than in healthy control [37,38].Dubois-Deruy et al. showed a direct interaction between circulatory MicroRNAs and SOD2 in heart failure post MI rat model [39].These data indicate that SOD2 might serve as a potential biomarker candidate for post-myocardial infarction.
Endothelin 1 (EDN1) is a potent endogenous vasoconstrictor and a crucial modulator of neutrophil function [40,41].High plasmatic levels of EDN1 have been linked to poor clinical outcomes after STsegment elevation myocardial infarction (STEMI), microvascular obstruction presence and were found to be a strong predictor of 1-year survival post-MI [42][43][44].
TLR2 a member of TLR family was also identi ed as DEGs in MI and an important node in MCODE modules.TLR2 is a key player in innate immunity, in ammatory response and pathogenesis of heart failure after MI [45].TLR2 modulates ventricular remodeling after myocardial infarction and its expression is negatively correlated with the in ammatory response which may in uence myocardial ischemia-reperfusion injury therapy.Monoclonal anti-TLR2 antibody causes a pronounced reduction of leukocyte in ux, cytokine production and preserves cardiac function and geometry in vivo [46].TLR2 de ciency has also been shown to have a protective effect in a mouse [47,48].These experimental evidences increase the potential of TLR2 as a therapeutic target.
Hubs and bottlenecks are crucial components in signaling networks as they are hypothesized to be encoded by essential genes and more relevant to biologically signi cant process with respect to the disease [49][50][51].We ltered out 32 key genes by using Cytohubba.We utilized the gene expression data to infer the disease related gene expression to identify candidate regulatory genes and to infer signaling pathway networks directly from candidate genes expression.According to DisGeNET database, role of 17 hub and bottleneck genes in MI is supported by many experimental work across the globe [52][53][54][55][55][56][57][58][59][60][61][62][63] including VEGFA, which limits myocardial damage in MI animal models [63], ICAM1, implicated in pathophysiologic responses and neutrophil in ltration [55,61], TLR4 that mediates maladaptive left ventricular remodeling and impairs cardiac function [62].Interleukins family members such as IL1A and IL1B as well as PTGS2 which will be discussed further below.
Detection of these key MI related genes indicated that our framework of integrated bioinformatics analysis pipeline is sensitive enough to rapidly identify potential target gene biomarkers or novel drug targets for any complex disease.Remaining genes in hubs and bottlenecks network were tightly connected with known MI functions.Prior to the present study, few studies have addressed the gaps in identifying potential biomarkers in MI early stage[64-66].Our strategy in identifying distinct biomarkers that correlate with MI and its development were based on comparing predicted physical interaction and genetic association of MI-gene targets using Open Target Platform.Twelve genes from the hubs and bottlenecks were target for other diseases (Additional le 2: Table S5) and seven hubs and one bottleneck gene in total were linked to MI (Additional le 1: Fig. S2).Among identi ed targets IL1A and IL1B, member of the interleukin-1 gene family, which are proin ammatory factors produced by different cell types, including endothelial cells, in response to various stimuli [67-69].The intense in ammatory response is a key process is which play a central role in many cardiovascular diseases including Myocardial infarction [70][71][72].Targeting in ammatory pathways key genes such as Interleukin-1 members in cardiovascular disease in order to neutralization elimination of speci c activated in ammatory mediators may contribute to protective effect in the infarcted heart without disturbing the reparative response [73].
Variants in IL1A and IL1B are associated with acute MI as well as heart failure [74][75][76], Interleukin-1 blockers are used to improve clinical outcome in patients with prior MI.There are currently four clinically available IL-1 blockers, anakinra, rilonacept, canakinumab and gevokizumab each with a different e cacy and safety pro le [77].For instance, IL-1 blocker, was originally approved for the treatment of rheumatoid arthritis (anakinra) [78].However, It has also been successfully studied in small clinical trials to test its effectiveness in patients with myocardial infarction and favorably affect cardiac remodeling and exhibit a protective effect from post-infarction heart failure development [79-

Conclusion
By using a series of bioinformatics analysis based on DEGs between MI patient and healthy sample, we illustrated the hub, bottleneck genes and pathways as markers of in ammatory response which may be involved in the early stage of MI.In parallel, we have generated a candidate MI genes list by using DisGeNET and Open Target platform.Identi cation of MI speci c genes and pathways, which predispose individuals to MI, will facilitate the targeting of novel therapeutics and treatment strategies in development.These results may provide insight for the study of MI therapeutic targets.However, a limitation of the present study was that the identi ed candidate genes in MI were not experimentally validated, which may become a focus of future studies.

Methods
Public Microarray domain Data collection

Figures
Figures

Figure 1 Work
Figure 1

Table 1
Top 10 signi cant up-regulated DEGs in Myocardial infarction

Table 3
Top 20signi cant hubs, bottlenecks and overlapped genes

Table 4
[90,91]adays, only canakinumab has an indication for cardiovascular disease [82].Our study revealed that PTGS2 is one of the key players in MI development.PTGS2, also known as COX-2, is inducible enzyme that plays an important role in several pathophysiological processes, including in ammation, angiogenesis, and tumorigenesis[83, 84].GO analysis showed that PTGS2 was enriched in TNF and Interleukins signaling pathways.PTGS2 has previously been associated with MI, ischemic heart disease and stroke risk [56, 85, 86].However, the role of selective COX-2 inhibitors in medical practice remain controversial.Saito el al., have shown that COX-2 induction during MI appears to contribute to myocardial injury, and treatment with the speci c inhibitor of the enzyme ameliorated the course of the disease [87].Similarly, COX-2 inhibition was shown to be associated with improved hemodynamics and lower cardiomyocyte apoptotic rates in a genetically modi ed mouse model of non-ischaemic heart failure and in a canine model of myocardial ischemia-reperfusion[88, 89].Others have shown that COX-2 inhibitors may be detrimental for myocardial infarction[90,91].
MI dataset was obtained from NCBI Gene Expression Omnibus (GEO) (http://www.ncbi.nlm.nih.gov/geo/), a functional genomics public database for data repository of high throughput gene expression.GSE66360 [GPL570, (HG-U133_Plus_2), Affymetrix Human Genome U133 Plus 2.0 Array] dataset consists of microarray data from isolated circulating endothelial cells from patients experiencing acute myocardial infarction (n = 49) and healthy subject (n = 50).The healthy samples had no previous history of cardiovascular diseases or other comorbidities.No ethics committee approval or patient consent were required for the present study.The study design was performed according to the experimental work ow in Fig.1.By using robust multiarray average algorithm in R package software version 3.6.3;(http://www.R-project.org/),datasets were analyzed with the Affymetrix platform to convert raw array data into expression value, followed by background correction, quintile normalization and probe summarization.Differentially Expressed Genes (DEGs) from chosen dataset GSE66360 was identi ed between the MI and non-MI groups using affy package in R, a Volcano plot was generated to assess the overall DEGs.The limma package was then used to select DEGs and paired t-test to analyze the statistical signi cance of the observed DEG patterns.Benjamini-Hochberg correction method was used for p-value adjustment to false discovery rate (FDR).FDR < 0.05, and |logFC| >1.5 was set as a cutoff point for DEGs selection.Finally, the DEGs were divided into up-and down-regulated genes.
(http://www.cytoscape.org/) was used for PPI network visualization.Modular and cluster analysis of the PPI network were performed in MCODE, a Cytoscape plugin, to identify functional gene communities in the subnetwork with degree cutoff 2, node score cutoff 0.2, K-score 2, and maximum depth of 100.Enrichment analysis and pathway annotation of functional genes in the MCODE identi ed sub-network were searched using functional annotation tool in Cytoscape ClueGO.PPI-MCODE modules was then