The abnormal regulation of spliceosome may be a key factor for coronary heart disease: a simple bioinformatics analysis

Background: Cardiovascular Diseases (CVDs) has become a major disease threatening human health. As the main species of CVDs, coronary heart disease (CHD) is becoming more and more common. The pathogenesis of CHD, especially at the molecular level, is not entirely clear up to now. Explaining the pathogenesis of CHD is particularly important for its treatment and prognosis. Biological database data analysis via bioinformatics has been an important method for studying gene expression strategy in multiple human disease. The aim of this study was to identify key different expressed genes (DEGs) in CHD and elucidate the biological process of it. Methods: A total of two published microarray datasets of CHD was downloaded from the Gene Expression Omnibus (GEO). Then, bioinformatics analyses including differentially expressed genes (DEGs) analysis, venn analysis, gene ontology (GO) annotation, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment, protein and protein interaction (PPI) network construction was performed. Quantitative real-time polymerase chain reactions (RT-qPCR) were used to detect the expression levels of DEGs in CHD. Results: A total of 122 dysregulated genes were selected as DEGs in CHD. The GO annotation analysis displayed these DEGs involved in DNA transcription and mRNA splicing regulation. The DEGs regulatory network showed the downregulated genes LUC7L3, HNRNPA1, SF3B1, ARGLU1, SRSF5, SRSF11, SREK1, PNISR, DIDO1, ZRSR2 and NKTR were located in the network control center, which were the spliceosome related genes. The RT-qPCR results were consistent with our microarray analysis. Conclusion: The abnormal regulation of spliceosome might be a key factor in the development of CHD, which must play key roles in cardiovascular disease (CVD), especially in CHD. Our study has provided a new idea for the treatment and prognosis of CHD, and the spliceosome might be the potential prognostic biomarkers of it.


Introduction
Cardiovascular Diseases (CVDs) is the main cause of death of population in the world. About 17.7 million people died from CVDs, accounting for 31% of all deaths worldwide, according to the statistics of the World Health Organization in 2015. And coronary heart disease (CHD) is the main cause of CVDs, with an estimated 7.4 million deaths [1] . CHD is a heart disease caused by narrowing or obstruction of vascular lumen which resulted from atherosclerotic lesions in the coronary arteries, resulting in myocardial ischemia, hypoxia or necrosis. In a 2014 study, nearly 1/5 men and 1/10 women died of coronary heart disease [2][3][4] . It is estimated that the prevalence of CHD will increase by approximately 10% over the next 20 years [5] . It has become one of the important diseases threatening human health, and the research and exploration of effective treatment for CHD has never stopped. The main risk factors for CHD include dyslipidemia, diabetes, arteriosclerosis, obesity, smoking, a sedentary lifestyle, as well as stress, age, male sex and a family history of CHD [6] , but the cause is not entirely clear.
The spliceosome has been proven to be a protein-directed metalloribozyme [7] , which as one of the most complex machineries of eukaryotic cells, removes intronic sequences from primary transcripts to generate functional messenger RNA (mRNA) and long noncoding RNAs (lncRNA) [8] , a process called alternative splicing. Alternative splicing is a dynamic and regulated process that can be in uenced by an array of variables such as cis-regulatory sequences and trans-acting factors, the transcriptional process, and DNA/RNA methylation [9,10] . Multiple studies have shown that aberrant alternative splicing is related to human disease, which can both be the cause and the consequence of disease [11] . Mutations in genes that are required for proper function of the spliceosome have been described to be a key cause for spinal muscular atrophy, retinitis pigmentosa, and Prader-Willi syndrome [12][13][14] . However, not many mutations in splicing factors have been described that lead to human cardiac pathologies. To date, only mutations in the splicing factor RNA-binding motif protein 20 (RBM20) have been causally linked to heart disease [15][16][17] . Otherwise, genes involved in RNA splicing has been reported to be aberrantly expressed in heart disease. For example, the Hif1α-inducible splicing factor SF3B1 is reported to be upregulated in the diseased human and mouse heart [18] . In addition, Rbfox1 is downregulated in failing human and mouse hearts [19] . However, different expressed genes in CHD has not been expounded clearly.
In this study, we used bioinformatics analyses to realize mRNA expression pro ling of CHD, which were available in the Gene Expression Omnibus (GEO) database. In our results, 122 DEGs in CHD were displayed and functional analyzed, and the key biological process in CHD was indicated. This study aimed to provide valuable information for further pathogenesis mechanism elucidation, and identify biomarkers of diagnosis, prognosis and therapeutic targets in CDH for further.

Materials And Methods 1 Microarray data collection
The microarray expression pro les of CHD were obtained from GEO database (http://www.ncbi.nlm.nih.gov/geo/). Coronary heart disease was inputted in the retrieval eld of the database, and then 210 related datasets were shown. According to the preliminary screening criteria of human, series and tissue, the last 12 datasets were chosen for the next screening. We chose GSE71226 and GSE19339 to analyze in the end following the inclusion criteria of mRNA microarray analysis in CHD.
The speci c information was showed in Table 2 in detail.

Data preprocessing and identi cation of DEGs
The raw expression datasets were downloaded and preprocessed to perform background subtraction, quantile normalization and log2 transformation before using moderated t-tests within Bioconductor package Affy and Limma (BiocGenerics, Biobase, parallel). Subsequently, the Linear Models Limma package for microarray data was used to screen DEGs between CHD and normal controls. The threshold for the DEGs screening was set as P-value <0.05. Hierarchical cluster analysis of group samples based on expression levels of mRNA were visualized through "pheatmap" package in R language.
3 Venn analysis of co-DEGs in GSE71226 and GSE19339 The co-DEGs in GSE71226 and GSE19339 were analyzed in the Draw Venn Diagram database (http://bioinformatics.psb.ugent.be/webtools/Venn/). The list of genes to be analyzed was uploaded on the database, the venn diagram and the relative co-gene list would be showed for the next analysis. 4 Gene ontology (GO) and pathway enrichment analysis of DEGs The GO annotation analysis has commonly used for functional studies of large-scale transcriptomics data. The Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database contains biochemistry pathways. Genes list was uploaded on the DAVID Bioinformatics Resources 6.8 (https://david.ncifcrf.gov/), and the GO (biological process, cell component, molecular function, KEGG) results would be showed and downloaded for a txt le. The GO results were visualized through "ggplot2" package in R language.

Gene set enrichment analysis (GSEA) of DEGs
The GSEA analysis was accomplished via GSEA_4.0.2 software and visualized through GSEA online. The threshold of pathway enrichment of DEGs were P<0.01.

Protein and protein interaction (PPI) analysis of DEGs
The PPI analysis of DEGs was accomplished via STRING (https://string-db.org/) online analysis software. The genes list was uploaded in the multiple proteins menu bar and the PPI results would be showed later. The speci c network diagram were visualized through Cytoscape software.

RT-qPCR
The samples were obtained from ten Chinese patients with coronary heart disease and ten healthy people in Department of Cardiac Surgery, Qingdao municipal hospital, which was showed detailedly in Table3. Total RNA of each samples were extracted from peripheral blood of CHD patients and normal people using high e ciency blood total RNA extraction kit purchased from Tiangen (Lot# DP443). The mRNA was extracted by Oligo (dT) Primer (Takara, cat# 3806; Lot# T2301AA) at 65℃ for 5 mins and then being subjected to reverse transcription using RevertAid Reverse Transcriptase (Thermo Scienti c, #EP0441) and dNTP mixture (Takara, Cat# 4019, Lot#AI11312A). The relative mRNA expression level was detected by RT-qPCR using PowerTrack SYBR Green Master Mix (Thermo Scienti c, #4367659) with the speci c primer of genes. The primer sequence was showed in Table 4.

DEGs in CHD
We collected two mRNA expressions pro ling including 7 CHD cases and 7 normal controls ( Table 2) from GEO database. With a screening condition of |log 2 (FC)|≧1 and p-value≦0.05, a total of 2262 DEGs including 694 up-and 1568 down-regulated genes were identi ed in GSE71226 ( Figure 1A) and 537 DEGs including 263 up-and 274 down-regulated genes were identi ed in GSE19339 ( Figure 1B). The all DEGs were subjected to heatmap analysis. As Figure 1 shown, the difference of expression pattern of DEGs between CHD and normal controls were notable ( Figure 1C, 1D).
Due to the different sources of samples (The samples in GSE71226 were from the peripheral blood of patients with coronary heart disease and the normal people; the samples in GSE19339 were from blood vessel after myocardial infarction of patients with coronary heart disease and the normal people), there were some differences between the DEGs of GSE71226 and GSE19339. And because of the minimal patient information in these GEO database, we could not analysis the in uence of age, gender and the history on the DEGs in CHD.

Venn analysis of DEGs in GSE71226 and GSE19339
We next analyzed the common DEGs in the two databases. We found that the number of co-upregulated genes was 8, which was 1.15% in GSE71226 and 3.04% in GSE19339 (Figure2 Left). And the number of co-downregulated genes was 114, which was 7.27% in GSE7226 and 40.14% in GSE19339 (Figure2 Right). It indicated that most co-DEGs in the two databases were downregulated genes, suggesting us that these co-downregulated genes might be the key genes for the pathogenesis of CHD.

GO annotation of DEGs in CHD
To understand the biological roles of the co-DEGs in the two database, we performed GO analysis. GO enrichment analysis showed most DEGs in CHD belonged to categories that might related to mRNA processing and splicing regulation, regulation of transcription in cells ( Figure 3A). The cell components of DEGs in CHD were nucleoplasm, nucleus and cytoplasm ( Figure 3B), and the molecular functions of the most DEGs were poly(A) RNA binging, protein binding, DNA binding ( Figure 3C). The KEGG enrichment analysis was also performed which showed that the signi cantly enriched signaling pathways of most DEGs were spliceosome ( Figure 3D). These results all indicated that the occurrence of CHD was related to the integral protein expression disorder in cells.

GSEA enrichment of DEGs in CHD
In order to detect the reactome pathways of DEGs in CHD, GSEA analysis was performed for further. The threshold was FDR <0.01. We chose the common pathways of the DEGs in the two GSE database to analyze. The common pathways of the DEGs related to mRNA processing and DNA repair, which was consistent with our previous results ( Figure 4).

Protein-protein interaction of DEGs in CHD
In order to nd the core DEGs in CHD, the protein-protein interaction (PPI) of the 122 co-DEGs was analyzed next. As gure 5 indicated that, most co-downregulated genes were in the PPI network, while the co-upregulated genes were bare. The core DEGs in the PPI network were LUC7 like 3 pre-mRNA splicing factor (LUC7L3), heterogeneous nuclear ribonucleoprotein A1 (HNRNPA1), splicing factor 3b subunit 1 (SF3B1), arginine and glutamate rich 1 (ARGLU1), serine and arginine rich splicing factor 5 (SRSF5), serine and arginine rich splicing factor 11 (SRSF11), splicing regulatory glutamic acid and lysine rich protein 1 (SREK1), PNN interacting serine and arginine rich protein (PNISR), death inducer-obliterator 1 (DIDO1), zinc nger CCCH-type, RNA binding motif and serine/arginine rich 2 (ZRSR2) and natural killer cell triggering receptor (NKTR) in CHD ( Figure 5), suggesting that the aberrant expressional levels of these genes might have signi cant and prominent in uence on CHD development. 6 The core DEGs in CHD To further illuminate the possible pathogenesis of CHD, the 11 DEGs in the PPI network were analyzed. We extracted the fold change of these DEGs in CHD patients compared with the normal, which revealed that these DEGs were all downregulated notable in CHD ( Figure 6A). The related biological progress, these DEGs involved in, were DNA transcription and RNA spliceosome regulation, which was consistent with our speculation ( Figure 6B).

RT-qPCR veri cation of the core DEGs
To validate the microarray analysis data, the core DEGs (LUC7L3, HNRNPA1, SF3B1, ARGLU1, SRSF5, SRSF11, SREK1, PNISR, DIDO1, ZRSR2 and NKTR) we indicated in gure 5 were used to perform RT-qPCR veri cation. The expression levels of these DEGs were all signi cantly downregulated in the peripheral blood of ten patients with coronary heart disease compared with the normal people, which were detected by RT-qPCR ( Figure 7A-7K). The regulation trends of candidate DEGs in our RT-qPCR were generally consistent with our integrated analysis. All in all, our results indicated that protein expression and regulation disorder in cells, especially RNA spliceosome and DNA transcription related genes, inevitably leads to the occurrence of CHD which might provide a new thought for the clinical treatment of CHD.

Discussion
Despite more than 40 years of scienti c and clinical research on the treatment of multiple coronary artery lesions, the optimal treatment regimen remains controversial. CABG in cardiac surgery and PCI in interventional cardiology are powerful weapons in the treatment of multiple coronary artery lesions. OPCAB has become one of the mainstream surgical methods of conventional CABG. Although cardiopulmonary bypass is avoided, OPCAB still has a large surgical trauma and a long recovery time for patients. From stainless steel to alloy, bare metal to drug-eluting stents, PCI has also experienced continuous innovation in stent technology.
The trend of surgery is to become more and more minimally invasive, coronary surgery is no exception, the minimum trauma for maximum clinical bene t is our tireless pursuit of surgeons. The coronary hybridization technique is in line with the concept of minimally invasive coronary surgery. Although our MIDCAB surgery is very mature, our pursuit of minimally invasive coronary surgery has never stopped. Our center has planned to send cardiac surgeons to top heart centers in China to learn about robot and thoracoscopic coronary artery bypass surgery. In the future, our hybrid coronary artery surgery will be more minimally invasive. However, in the treatment of CHD, it is far from enough to pursue the perfection of surgical treatment.
Our work analyzed gene expression data in patients with CHD in GEO database GSE71226 and GSE19339 datasets, with the purpose of DEGs detection and to explore the gene-level mechanism of the pathogenesis of CHD. Our study showed about 1.118%-2.954% (GSE71226:2.954%; GSE19339:1.118%) of 23517 genes were upregulated and 1.165%-6.667% (GSE71226:6.66%; GSE19339:1.165%) of 23517 genes were downregulated in CHD, indicating that the occurrence of CHD is closely related to the change of gene expression at cell level. Due to the differences of samples and platforms in various microarray studies, integrated analysis of various microarray datasets could obtain more accurate disease-related regulators with a larger sample size than an individual microarray, we chose the 8 co-upregulated and 114 co-downregulated genes as the co-DEGs for further analysis. The GO annotation analysis of these 122 DEGs displayed these DEGs involved in DNA transcription and mRNA splicing regulation, indicating that the occurrence of CHD was related to the integral protein expression disorder in cells. Alternative splicing is a posttranscriptional mechanism that can substantially change the pattern of gene expression. Up to 95% of human genes have multiexon alternative spliced forms, suggesting that alternative splicing is one of the most signi cant components of the functional complexity of the human genome. Our results showed most of DEGs in CHD were alternative splicing related genes, indicating alternative splicing regulation should be received more attention in the study of cardiac diseases.
In the DEGs regulatory network, the downregulated LUC7L3, HNRNPA1, SF3B1, ARGLU1, SRSF5, SRSF11, SREK1, PNISR, DIDO1, ZRSR2 and NKTR were located in the network control center, which were DNA transcription and RNA splicing regulation related genes. LUC7L3 has been reported to be involved in the formation of splicesome via the RE and RS domains. Previous study showed LUC7L3 functioned in cardiac sodium channel splicing regulation of human heart failure [20,21] . HNRNPA1 is one of the most abundant core proteins of heterogeneous nuclear ribonucleoproteins (hnRNP) complexes and plays a key role in the regulation of alternative splicing. SF3B1 is an outstanding pre-mRNA splicing factor due to its frequent cancer-associated mutations and targeting for drug therapy [22][23][24][25][26] . In the early stages of spliceosome assembly, SF3B1 orchestrates an ATP-dependent series of structural and compositional rearrangements among small nuclear ribonucleoprotein particles (snRNP) at the pre-mRNA splice sites that ultimately accomplishes the act of pre-mRNA splicing [22,27,28] , but its roles in CHD has not been shown. ARGLU1 has been reported to be a transcriptional coactivator and splicing regulator important for stress hormone signaling and development [29] and multiple cancer regulation [30] . SRSF5 is a member of the serine/arginine (SR)-rich family of pre-mRNA splicing factors, which constitute part of the spliceosome [31] . Each of these factors contains an RNA recognition motif (RRM) for binding RNA and an RS domain for binding other proteins. It has been reported that SRSF5 functions as a novel oncogenic splicing factor and plays a key role in multiple cancers [32][33][34][35] and immunity regulation [36][37][38][39] , however its roles in CHD has not been reported. SRSF11 is one kind of splicing factor experienced alternative splicing switch, and in turn induced different amino acid sequences between MDS and controls [40] . SREK1 is a member of a family of serine/arginine-rich (SR) splicing proteins containing RNA recognition motif (RRM) domains [41] . PNISR, also known as SFRS18. Data mining using publicly available interaction databases also supported the credibility of the interaction of LUC7L3 and SFRS18 splicing regulators [42] . Previous paper has reported that these splicing factors indeed contribute to disease pathogenesis and their effect is-at least partially-mediated by the regulation of psoriasis-associated EDA+ bronectin [43] . García-Domingo et al. showed that DIDO1 translocates to the nucleus when is overexpressed and activates apoptosis in vitro by the upregulation of procaspase 3 and 9 [44,45] . The deletion of DIDO in mice caused a disease suggestive of Myelodysplastic Syndrome/MPN as observed by Fütterer and cols [46] . ZRSR2 is an essential splicing factor, associates with the U2 auxiliary factor heterodimer, which is required for the recognition of a functional 3' splice site in pre-mRNA splicing, and may play a role in network interactions during spliceosome assembly. Roger A Fleischman showed that the common clinical features of patients with an isolated mutation of ZRSR2 are a macrocytic anemia without leukopenia, thrombocytopenia or an increase in marrow blast percentage [47] , and mutation of ZRSR2 bas been reported to be associated with myelodysplastic syndromes [48][49][50] . NKTR has been reported to be a membrane-anchored protein with a hydrophobic amino terminal domain and a cyclophilin-like PPIase domain. It is present on the surface of natural killer cells and facilitates their binding to targets, and regulates the expression of the NK-TR gene [51] . Most of these DEGs in CHD we detected in our paper were all reported to be mRNA splicing related genes, which regulated different human disease via their RNA splicing function. Our results indicated that these DEGs were all downregulated signi cantly in CHD, suggesting that these genes must play key roles in CHD. In conclusion, RNA splicing regulation must play key roles in cardiovascular disease (CVD), especially in CHD and gene mutation and expression regulation should be received more attention in the study of cardiac diseases. Our study has provided a new idea for the treatment and prognosis of coronary heart disease.

Conclusions
Our nding shows that the spliceosome related genes, especially LUC7L3, HNRNPA1, SF3B1, ARGLU1, SRSF5, SRSF11, SREK1, PNISR, DIDO1, ZRSR2 and NKTR were markedly low-expressed in CHD, which indicates they are key regulators in the development of CHD. In addition, it means that the abnormal regulation of spliceosome must play key roles in cardiovascular disease (CVD), especially in CHD. The abnormal regulation of spliceosome might not be the most immediate but the most primary reason. The abnormal expression and mutation of genes involved in the spliceosome pathway network could be the potential biomarkers of CHD, which provides a propranolol method for the early diagnosis or prognosis of it. Furthermore, our results remind us that the spliceosome regulation should be received more attention in the study of cardiac diseases. Our study has provided a new direction for the scienti c research and treatment of CHD.

Declarations
Ethics approval and consent to participate It was agreed that the experimental project, that peripheral blood of the patients were extracted for sample analysis, was designed in accordance with the principles of the Helsinki Declaration, reviewed by the Medical Ethics Committee of Qingdao Municipal Hospital, fully respected the informed consent of the subject and their family members. The sampling techniques used in the study were consistent with medical practice, the project meets the requirements of medical ethics.

Consent for publication
Not applicable.      RT-qPCR veri cation of the core DEGs. RT-qPCR to detect mRNA levels of the core DEGs in the peripheral blood of ten patients with coronary heart disease compared with the normal people. mRNA levels of each gene were lower relative to GAPDH in the CHD than in the normal. The statistical signi cance from at least three independent repeats was calculated via student's t-test, ***P<0.001.