A Comprehensive Bioinformatics Analysis of Young-Aged Coronary Heart Disease

Background: Coronary heart disease (CHD) is a leading cause of morbidity and mortality worldwide [1]. Although effective primary and secondary prevention has successfully reduced CHD mortality, the morbidity and mortality of young-aged coronary heart disease (YA-CHD) have not decreased. Moreover, little is known about the prevalence and mechanism of YA-CHD.
Methods: Dataset GSE12288 from the Gene Expression Omnibus was subjected to a comprehensive bioinformatics analysis, including gene ontology (GO) analysis, pathway analysis, protein-protein interaction (PPI) network analysis and core network analysis.
Results: RAP1A, which regulates platelet integrin activation and has a critical role in platelet production, was significantly upregulated, while TNKS2, which maintains the integrity of leukocyte telomere structure and shows a significant association with longevity, was significantly downregulated. Pathway analysis showed that the “phagosome” pathway was the most significantly associated with YA-CHD. The innate immune response module and the type I interferon signaling module, both of which interact with IRF1, may play a major role in regulating YA-CHD progression and may be potential therapeutic targets for YA-CHD.
Conclusions: RAP1A and TNKS2 in peripheral leukocytes may serve as novel biomarkers for predicting the onset of YA-CHD. Further studies are needed on whether IRF1 influences YA-CHD by regulating the innate immune type I interferon signaling pathway.

increases steadily [7,8]. Epidemiological studies show that premature-onset CHD is associated with a 2- to 3-fold increased risk of CHD in first-degree relatives, which indicates a genetic predisposition to CHD [9]. Studies of YA-CHD patients (CHD patients younger than 45 years) have shown clearly different characteristics compared with CAD in older patients (older than 65 years) [10]. Based on this evidence, YA-CHD seems to have a stronger genetic predisposition, greater differences in sex ratio, better outcomes with early intervention, fewer complications and a worse long-term outcome. Early screening and intervention therefore seem essential. However, the mechanism underlying YA-CHD and suitable serum biomarkers for screening YA-CHD patients remain unclear.
Next-generation sequencing technologies have, over the past few decades, helped identify a variety of genes involved in the onset and progression of multiple diseases and screen high-risk populations. With the help of gene profiling studies, we may identify novel biomarkers and therapeutic targets of YA-CHD and demonstrate the roles of specific metabolic pathways. In this study, we carried out a comprehensive bioinformatics analysis comparing blood samples from healthy young adults and YA-CHD patients, using data from the Gene Expression Omnibus (GEO) database. We tried to identify key differentially expressed genes (DEGs), biological processes, pathways and PPI networks closely associated with YA-CHD.

Methods
Gene Expression Profiles
GEO (https://www.ncbi.nlm.nih.gov/gds) is a public repository at the National Center for Biotechnology Information for storing high-throughput gene expression datasets [11]. We searched the GEO database with the query "CAD" or "CHD" AND "Homo sapiens" [Organism] to identify potential datasets. We further screened the datasets according to the following inclusion criteria: (1) CAD clearly diagnosed by coronary angiography (CAG); (2) patients younger than 45 years; (3) healthy young adults of similar age to the patients (age difference of less than 2 years) taken as controls. Finally, one dataset, GSE12288, based on the GPL96 platform (Affymetrix Human Genome U133A Array), was included in our study; it contains 7 "case-control" pairs. GSE12288 profiled gene expression in circulating leukocytes of patients with coronary artery disease. Baseline conditions are shown in Table 1.
Differentially expressed genes screening
The average expression value of the different probes corresponding to the same gene was used to represent the expression value of that gene. DEGs between the YA-CHD group and the healthy control group were selected using the cut-offs P<0.05 and |fold change (FC)|>1.5. The expression data were processed using the limma package in R [12].
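The screening rule described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study (which relied on limma in R): the function names, the dictionary-based input layout and the signed fold-change convention (ratios below 1 reported as negative reciprocals) are all hypothetical, and the P-values are treated as precomputed inputs rather than derived from limma's moderated statistics.

```python
from collections import defaultdict
from statistics import mean


def collapse_probes(probe_expr, probe_to_gene):
    """Average, sample by sample, all probes that map to the same gene.

    probe_expr: probe ID -> list of expression values (one per sample).
    probe_to_gene: probe ID -> gene symbol (unmapped probes are skipped).
    """
    grouped = defaultdict(list)
    for probe, values in probe_expr.items():
        gene = probe_to_gene.get(probe)
        if gene is not None:
            grouped[gene].append(values)
    # zip(*rows) walks the samples; mean() averages the probes of one gene.
    return {gene: [mean(col) for col in zip(*rows)]
            for gene, rows in grouped.items()}


def signed_fold_change(case_mean, control_mean):
    """Assumed convention: ratio if >= 1, otherwise the negative reciprocal."""
    ratio = case_mean / control_mean
    return ratio if ratio >= 1 else -1 / ratio


def select_degs(stats, p_cut=0.05, fc_cut=1.5):
    """Split genes into up/down lists using P < p_cut and |FC| > fc_cut.

    stats: gene symbol -> (p_value, signed fold change).
    """
    up = sorted(g for g, (p, fc) in stats.items() if p < p_cut and fc > fc_cut)
    down = sorted(g for g, (p, fc) in stats.items() if p < p_cut and fc < -fc_cut)
    return up, down
```

Whether the fold change is computed on raw or log-transformed intensities changes which genes pass the 1.5 cut-off, so the convention should be fixed before the thresholds are applied.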

Functional enrichment analysis
The selected DEGs were submitted to the Database for Annotation, Visualization, and Integrated Discovery (DAVID) v6.8 Beta (https://david-d.ncifcrf.gov/) for further analysis.
The DAVID bioinformatics resources consist of an integrated biological knowledgebase and analytic tools aimed at systematically extracting biological meaning from large gene/protein lists [13]. In this study, DAVID was used to perform Gene Ontology (GO) annotation and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis of the DEGs. P<0.05 was chosen as the threshold.
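The idea behind term enrichment can be illustrated with a plain hypergeometric over-representation test. This sketch is an assumption-laden stand-in rather than DAVID's own calculation: DAVID reports an EASE score, a more conservative modified Fisher exact P-value, and the function name and argument names below are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(list_hits, list_size, term_size, background_size):
    """Upper-tail hypergeometric P-value for one annotation term.

    list_hits: DEGs annotated with the term.
    list_size: total number of DEGs submitted.
    term_size: background genes annotated with the term.
    background_size: total number of background genes.
    Returns P(X >= list_hits) when list_size genes are drawn at random.
    """
    upper = min(list_size, term_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, list_size - k)
        for k in range(list_hits, upper + 1)
    )
    return tail / total


# Toy numbers only: 5 of 5 listed genes hit a term covering 5 of 10 genes.
p_value = hypergeom_enrichment_p(5, 5, 5, 10)
is_enriched = p_value < 0.05  # same threshold as in the text
```

Because many terms are tested at once, an unadjusted P<0.05 threshold admits some false positives; a multiple-testing correction is a common additional safeguard.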

Protein-protein interaction analysis
Protein-protein interactions (PPI) among the DEGs were obtained from the Search Tool for the Retrieval of Interacting Genes (STRING, http://string-db.org/). STRING provides uniquely comprehensive coverage of, and easy access to, both experimental and predicted interaction information [14]. A confidence score >0.4 was the selection threshold used to construct the PPI network. PPI networks were visualized with Cytoscape software [15]. Modules within the PPI network were identified using the Molecular Complex Detection (MCODE) plugin (13) with the following criteria: degree cutoff, 2; node score cutoff, 0.2; k-core, 2; and max depth, 100. P<0.05 was used as the threshold value.
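Two of the steps described above, filtering interactions by confidence score and restricting the network to its 2-core, can be sketched as follows. This is a simplified illustration and does not reproduce the MCODE algorithm (vertex weighting, seed expansion and the node score cutoff are omitted). The function names and the edge-tuple input format are hypothetical, and scores are assumed to be on a 0 to 1 scale.

```python
def build_network(scored_edges, min_score=0.4):
    """Keep interactions whose confidence score exceeds min_score.

    scored_edges: iterable of (protein_a, protein_b, score) tuples.
    Returns an undirected adjacency map: protein -> set of neighbours.
    """
    adjacency = {}
    for a, b, score in scored_edges:
        if score > min_score and a != b:
            adjacency.setdefault(a, set()).add(b)
            adjacency.setdefault(b, set()).add(a)
    return adjacency


def k_core(adjacency, k=2):
    """Repeatedly remove nodes with fewer than k neighbours."""
    core = {node: set(nbrs) for node, nbrs in adjacency.items()}
    changed = True
    while changed:
        changed = False
        weak = [node for node, nbrs in core.items() if len(nbrs) < k]
        for node in weak:
            for nbr in core.pop(node):
                if nbr in core:
                    core[nbr].discard(node)
            changed = True
    return core


# Placeholder protein names and scores, for illustration only.
edges = [("A", "B", 0.9), ("B", "C", 0.8), ("A", "C", 0.7),
         ("C", "D", 0.5), ("D", "E", 0.3)]
network = build_network(edges)
core = k_core(network, k=2)
```

In this toy input the D-E edge falls below the score threshold and D is then pruned because it has only one remaining neighbour, leaving the A-B-C triangle as the 2-core.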

Differentially Expressed Genes
We identified 384 DEGs, 293 up-regulated in young-aged CHD patients and 91 down-regulated. (Table 3)
KEGG pathway analysis
To further investigate the functions of the upregulated DEGs, we submitted them to the KEGG database. We identified 17 significant pathways based on KEGG database analysis. The most significant pathway in our KEGG analysis was phagosome, with a P-value of 3.2E-08.
Among the downregulated DEGs, macromolecular complex disassembly (GO:0032984, p = 0.00216) was the enriched biological process, while large ribosomal subunit (GO:0015934, p = 0.00667) was the enriched cellular component. No KEGG pathway was found among the downregulated DEGs. (Table 4)
PPI network and module selection
expressed mostly in YA-CHD patients and shows the greatest difference. RAP1A encodes a member of the Ras family of small GTPases. It has been reported as a novel regulator of significant processes in the cardiovascular system, ranging from blood vessel formation and permeability and platelet aggregation to cardiac myocyte growth and survival [16].
A recent study demonstrated an essential role for RAP1 signaling in platelet integrin activation and a critical role in platelet production [17]. The expression level of RAP1A may serve as a predictive factor for identifying high-risk young adults who are more susceptible to CAD.
BACH1 (BTB Domain and CNC Homolog 1) has been associated with tumor metastasis, especially in breast cancer. ERV3-2 (Endogenous Retrovirus Group 3 Member 2) is an uncategorized gene, and few studies of it are currently available. HLA-DQA1 (MHC Class II HLA-DQ-Alpha-1) belongs to the HLA class II alpha chain paralogues. Gene Ontology (GO) annotations related to this gene include peptide antigen binding and MHC class II receptor activity. The association between HLA-DQA1 and YA-CHD remains unclear.
TNKS2 (Tankyrase 2) is a protein-coding gene that is involved in various processes such as the Wnt signaling pathway, telomere length and vesicle trafficking. Further studies found that it is important for maintaining the integrity of leukocyte telomere structure and shows a significant association with longevity [18]. Leukocyte telomere length (LTL) has been associated with CAD since 2001 [19] and has been regarded as a potential marker of biological aging [20]. The downregulation of TNKS2 in YA-CHD patients reflects the biological aging of the cardiovascular system. Thus, its expression level may serve as a marker of biological aging of the cardiovascular system.
ACYP1 (Acylphosphatase 1) is a protein-coding gene whose related pathways are pyruvate metabolism and pyrimidine metabolism. Gene Ontology (GO) annotations related to this gene include acylphosphatase activity. It has been reported to possess two alternative splicing forms that induce apoptosis [21]. GO analysis of the 293 upregulated genes suggests that the onset of YA-CHD is closely related to the immune response.
Interestingly, RAP1A was not involved in the immune response biological processes, which suggests that its influence on YA-CHD is independent of inflammation. The platelet aggregation associated with RAP1A and the biological aging associated with TNKS2 are significant in YA-CHD, so these two genes may become novel biomarkers for predicting disease risk. Previous studies showed that coronary atherosclerosis is associated with macrophage polarization in epicardial adipose tissue [22]. Moreover, increased circulating C-reactive protein and macrophage-colony stimulating factor are complementary predictors of long-term outcome in patients with chronic coronary artery disease [23]. The onset of YA-CHD and of CHD in older patients may both be influenced by macrophage activation.
Core network analysis further indicates that the innate immune response and the type I interferon signaling pathway are important in YA-CHD. There are 13 nodes in the most significant module, module 1, which form an immune response network. GO analysis of these 13 nodes shows that innate immune response (GO:0045087, p = 5.16e-12) was the significantly enriched biological process. Many of the genes, whether upregulated or downregulated, are associated with regulating macrophage function. Module 2 consists of 5 nodes and regulates the type I interferon (IFN) signaling pathway. Previous studies show that B cells producing type I IFN modulate macrophage polarization in tuberculosis [24]. The relationship between modules 1 and 2 and macrophages further hints that the innate immune system may regulate macrophage polarization through type I interferon and thereby influence the progression of YA-CHD. Indeed, interferons (IFNs) are key regulators of both innate and adaptive immune responses [25]. Myeloid type I interferon signaling was found to promote atherosclerosis by stimulating macrophage recruitment to lesions in mice [26]. Reducing macrophage proteoglycan sulfation increases atherosclerosis and obesity through enhanced type I interferon signaling [27]. Considering that YA-CHD patients have higher BMI, whether YA-CHD is caused by reduced macrophage proteoglycan sulfation needs further exploration.
Although the findings of this study are encouraging, several limitations cannot be ignored. Firstly, owing to our laboratory conditions, no clinical validation was performed. Secondly, whether the patients included in our study also had familial hypercholesterolemia or Kawasaki disease is uncertain, because the clinical and genetic background of GSE12288 was not available. However, the clustering result in the heatmap of DEGs showed high homogeneity, which indicates reliable internal validity.
In summary, RAP1A may be a new treatment target in YA-CHD patients. The expression of TNKS2 may be a novel biomarker for predicting the biological age of the cardiovascular system. The immune response and macrophages may also play an important role in YA-CHD. Further studies on the role of IRF1 and the type I interferon signaling pathway in YA-CHD are needed.

Figure 1. The 384 differentially expressed genes between the YA-CHD group and the control group.
Top 17 KEGG pathways among the 293 upregulated DEGs.