Integrated analysis of the proteome and transcriptome in human ischemic cardiomyopathy

Background: Ischemic cardiomyopathy (ICM) is the primary cause of heart failure, which leads to an unacceptable rate of mortality and morbidity. The molecular mechanisms involved in ICM remain incompletely understood. This study aimed to investigate the molecular mechanisms of ICM by integrated proteome and transcriptome analyses. Methods and Results: Data-independent acquisition (DIA) mass spectrometry and RNA-seq were performed on left ventricular specimens from 5 ICM patients and 5 unused non-failing donors. A total of 546 differentially expressed proteins (DEPs) and 1080 differentially expressed mRNAs (DEGs) were identified in ICM compared with control, which were mainly involved in inflammatory/immune response, response to stress (such as hypoxia and reactive oxygen species), oxidative stress, and extracellular matrix (ECM) organization. Moreover, despite the low correlation between transcriptome and proteome, 41 key genes were identified that showed the same direction of expression change at the mRNA and protein levels. Among them, HSP90AA1 occupied a central position in the protein-protein interaction (PPI) network. Furthermore, a differentially expressed lncRNA-mRNA-protein network was constructed, which consisted of 13, 11, and 11 differentially expressed lncRNAs, mRNAs, and proteins, respectively, and the expression of this network was validated by qRT-PCR. Conclusion: This study identified key genes and a lncRNA-mRNA-protein regulatory network involved in ICM, which should provide a framework for an in-depth interrogation into the complex molecular mechanisms of ICM.


Background
Heart failure (HF) is one of the most common diseases, with an unacceptable rate of mortality and morbidity and an extremely serious medical and socio-economic burden [1,2]. Among the various causes of HF, ischemic cardiomyopathy (ICM) has emerged as the primary one, accounting for more than 50% of HF patients [3]. ICM refers to the cardiac dysfunction caused by diffuse or severe coronary artery stenosis, or even complete occlusion. The ischemic events trigger a series of cellular events, such as myocardial stunning, hibernation, or even death, accompanied by activation of fibroblasts, which ultimately leads to cardiac remodeling or myocardial fibrosis and to adverse events such as heart failure [4]. Although advances in medical therapy have markedly improved prognosis, these patients still have a high rate of death. Therefore, an in-depth and comprehensive understanding of the molecular mechanisms and the identification of new molecular markers and drug targets are particularly important for better management of ICM.
Increasing evidence has suggested crucial roles for long non-coding RNAs (lncRNAs), transcripts more than 200 nucleotides in length, in many physiological and pathological processes of cardiovascular diseases, including ICM [5][6][7]. Although several lncRNAs have been identified as involved in the pathogenesis of ICM [8,9], the role of these transcripts is largely unknown and needs to be further explored.
With the development of high-throughput techniques, numerous studies have tried to discover the molecular mechanisms and potential biomarkers of disease through "omics" techniques at different levels, such as genomics, transcriptomics, proteomics, and metabolomics [10]. Single biomarkers may not be sufficient to represent the complex biological process of a disease. In contrast, omics techniques can capture thousands of variables and demonstrate the interrelation of these variables in the development of disease [10]. Compared with RNAs, proteins reflect the molecular status of disease more reliably because of their relative stability. However, proteomics may miss low-abundance proteins such as transcription factors [11]. Integration of distinct, complementary approaches, such as proteomics and transcriptomics, has the potential to yield novel and comprehensive insights into the complex biological processes of diseases [12,13]. However, few studies have systematically investigated the molecular mechanisms of ICM by integrated proteome and transcriptome analyses.
In the present study, human left ventricular tissues from ICM patients were systematically explored through transcriptomics and proteomics to identify the relevant biomarkers and mechanisms. Moreover, potential lncRNA-mRNA regulation pairs were identified by antisense-, cis-, and trans-predictions. These results may provide important evidence for understanding the mechanisms of ICM.

Ethics statement
This study was conducted in accordance with the guidelines of the Declaration of Helsinki and approved by the Ethics Committee of local Hospital. All participants provided informed written consent.

Tissue collection
Left ventricular tissue from five ICM patients with end-stage HF undergoing heart transplantation, as well as from five unused non-failing donors, was collected for analysis in this study. All samples were dissected on ice and snap-frozen in liquid nitrogen or fixed in 4% paraformaldehyde for later analysis. The diagnosis of ICM was based on clinical history, echocardiography, and coronary angiography data. The clinical characteristics of the patients, such as clinical history and echocardiography data, are shown in Table 1.

Protein extraction and digestion
The samples were ground to powder in liquid nitrogen and dissolved in lysis buffer (7 M urea, 2% SDS, 1× Protease Inhibitor Cocktail (Roche Ltd., Basel, Switzerland)). After lysis on ice for 30 min, the samples were centrifuged at 15,000 rpm for 15 min at 4 °C. The supernatant was collected and precipitated with ice-cold acetone at -20 °C overnight. The precipitates were then washed with acetone three times and redissolved in 7 M urea by sonication on ice. The protein concentration of the supernatant was determined using the BCA Protein Assay Kit. For protein digestion, 50 μg of protein was digested with sequencing-grade trypsin (Promega, Madison, WI) at a substrate/enzyme ratio of 50:1 (w/w) at 37 °C for 16 hours.
The digested peptides were then lyophilized under a vacuum.

High pH reverse-phase separation
The peptide mixture was redissolved in buffer A (20 mM ammonium formate in water, pH 10.0, adjusted with ammonium hydroxide) and then fractionated by a linear gradient pH separation system, an Ultimate 3000 system (Thermo Fisher Scientific, MA, USA) connected to a reverse-phase column (XBridge C18 column, 4.6 mm × 250 mm, 5 μm; Waters Corporation, MA, USA). The linear gradient ran from 5% to 45% buffer B (20 mM ammonium formate in 80% acetonitrile, pH 10.0, adjusted with ammonium hydroxide). After re-equilibration, the column was run at a flow rate of 1 ml/min at 30 °C for 40 min. Ten fractions were collected during the separation and freeze-dried in a vacuum concentrator.

Data-dependent acquisition (DDA) analysis
The dried fractions were redissolved in 30 μl of buffer C (0.1% formic acid in water) and analyzed by online electrospray tandem mass spectrometry. The experiments were performed on an EASY-nLC 1200 system connected to an Orbitrap Fusion Lumos Tribrid mass spectrometer (Thermo Fisher Scientific, MA, USA). Three μl of peptide sample was separated on an analytical column (Acclaim PepMap C18, Thermo Fisher Scientific) at a flow rate of 200 nL/min, with a linear gradient of 5% to 35% buffer D (0.1% formic acid in acetonitrile) over 120 min. An electrospray voltage of 2 kV versus the inlet of the mass spectrometer was used.
The raw data were searched against the H. sapiens proteome database using Spectronaut X (Biognosys AG, Switzerland) with default settings. Carbamidomethylation and oxidation were set as the fixed and variable modifications, respectively. The false discovery rate (FDR) was set to 1% for peptide and protein identification.

Data-independent acquisition (DIA) analysis
DIA analysis was performed with the same mass spectrometer and LC system as DDA. The mass spectrometer was run in data-independent acquisition mode and automatically switched between MS and MS/MS modes. The raw DIA data were also analyzed using Spectronaut X with default settings. iRT peptides were used to calibrate retention time. All results were filtered at an FDR of 1%. Differentially expressed proteins (DEPs) were identified using the t-test with Benjamini-Hochberg correction. Proteins with fold change > 1.2 or < 0.83 and adjusted p value < 0.05 were considered DEPs.
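The DEP criteria above can be illustrated with a minimal Python sketch. It assumes that per-protein fold changes and raw t-test p values are already available; the function names `benjamini_hochberg` and `call_deps` are hypothetical and are not part of Spectronaut or of the pipeline used in this study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def call_deps(fold_changes, pvalues, up=1.2, down=0.83, alpha=0.05):
    """Label each protein 'up', 'down', or 'ns' using the stated cut-offs."""
    adjusted = benjamini_hochberg(pvalues)
    calls = []
    for fc, q in zip(fold_changes, adjusted):
        if q < alpha and fc > up:
            calls.append("up")
        elif q < alpha and fc < down:
            calls.append("down")
        else:
            calls.append("ns")
    return calls
```

The same pattern would apply to the DEG and DEL thresholds by changing `up`, `down`, and `alpha`, although those calls were made with DESeq2 in this study.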

RNA extraction, library preparation, and RNA sequencing
Total RNA was extracted using TRIzol reagent (Invitrogen Life Technologies, USA). The concentration and quality of RNA were assessed using the NanoDrop 1000 spectrophotometer (Thermo Fisher Scientific, USA) and the Agilent 2100 Bioanalyzer (Agilent Technologies, USA). After ribosomal RNA was removed with the Ribo-Zero™ rRNA Removal Kit (Epicentre, Madison, USA), the remaining mRNA and ncRNA were digested into short fragments and reverse-transcribed into cDNA with the SuperScript III Reverse Transcriptase kit (Life Technologies, Thermo Fisher Scientific, USA). The cDNA was then purified with the QiaQuick PCR extraction kit (Qiagen, Germany), followed by end polishing, poly(A) addition, and ligation of Illumina sequencing adapters. The complementary strand was digested, and the sequencing strands were amplified with 15 cycles of PCR, followed by sequencing on the Illumina HiSeq™ 4000 platform.

RNA-seq Data processing
Raw reads were filtered with fastp (version 0.18.0) to obtain high-quality clean reads. Reads that mapped to the ribosomal RNA database with Bowtie2 (version 2.2.8) were removed. The reference genome was downloaded from the genome website, and the paired-end clean reads were aligned to it using HISAT2 (version 2.1.0). Transcripts were reconstructed with StringTie (version 1.3.4). For lncRNA identification, CNCI (version 2) and CPC (version 0.9-r2) were used to assess the protein-coding potential of novel transcripts, and transcripts predicted to be non-coding by both tools were defined as lncRNAs. The fragments per kilobase of transcript per million mapped reads (FPKM) value was calculated with StringTie to quantify expression abundance and variation. Differentially expressed mRNAs (genes) (DEGs) and lncRNAs (DELs) between the two groups were identified with DESeq2. mRNAs with adjusted p value < 0.05 and fold change > 1.5 or < 0.67 were considered DEGs, while lncRNAs with p value < 0.05 and fold change > 2 or < 0.5 were considered DELs.
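For readers unfamiliar with the FPKM unit, the following sketch shows the standard textbook definition. It is an illustration only: StringTie estimates abundance from assembled transcript coverage, so its internal calculation differs in detail, and the function name `fpkm` is hypothetical.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million).
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)
```

For example, 500 fragments on a 2,000 bp transcript in a library of 10 million mapped fragments corresponds to an FPKM of 25.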

Functional annotation of DEPs/DEGs
To obtain an overview of the characteristics and mechanisms of the DEPs/DEGs involved in ICM, gene functional enrichment analysis, including Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, was performed using the online databases DAVID (https://david.ncifcrf.gov/) and Metascape (http://metascape.org/). A p value < 0.05 was considered statistically significant.

Protein-protein interaction (PPI) network construction
The PPI network was constructed using the online STRING database (http://string-db.org/). Interactions with a score > 0.4 (medium confidence) were considered significant. Cytoscape (version 3.6.1) was used to further analyze and visualize the network.
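The network step can be sketched as follows, assuming the interactions are available as (protein A, protein B, score) tuples with scores on a 0 to 1 scale. The function names and the input layout are hypothetical; in this study the filtering was done in STRING and the network analysis in Cytoscape.

```python
from collections import Counter


def filter_edges(edges, min_score=0.4):
    """Keep interactions whose score exceeds the medium-confidence cut-off."""
    return [(a, b) for a, b, score in edges if score > min_score]


def node_degrees(pairs):
    """Count distinct interaction partners per protein (undirected network)."""
    # frozenset makes (A, B) and (B, A) the same edge; self-loops are dropped.
    unique_edges = {frozenset(pair) for pair in pairs if pair[0] != pair[1]}
    degree = Counter()
    for edge in unique_edges:
        for node in edge:
            degree[node] += 1
    return degree
```

Calling `node_degrees(filter_edges(edges)).most_common(10)` would then list the ten most connected proteins, which is the kind of ranking used to find hub proteins in a PPI network.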

lncRNA-mRNA association analysis
To identify antisense lncRNAs, RNAplex (version 0.2) (http://www.tbi.univie.ac.at/RNA/RNAplex.1.html) was used to predict complementary interactions between lncRNAs and mRNAs, based on the minimum free energy calculated from thermodynamic structure. Another function of lncRNAs is the cis-regulation of neighboring genes on the same allele. Therefore, the lncRNAs were annotated again, and those located less than 100 kb upstream or downstream of a gene were considered cis-regulators. Furthermore, the correlations of expression between lncRNAs and mRNAs were calculated to reveal the trans-regulation function of lncRNAs. lncRNA-mRNA pairs with a Pearson's correlation coefficient of no less than 0.95 were considered trans-regulation pairs.
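The cis- and trans-criteria above can be expressed as a short sketch. It assumes genomic features are given as (chromosome, start, end) tuples and expression as per-sample lists; strand is ignored, and the cut-off is applied to the signed correlation as literally stated (applying it to the absolute value would be an alternative reading). All function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def is_cis_pair(lncrna, gene, window=100_000):
    """True if both features share a chromosome and lie < window bp apart."""
    if lncrna[0] != gene[0]:
        return False
    gap = max(lncrna[1], gene[1]) - min(lncrna[2], gene[2])
    return max(gap, 0) < window  # overlapping features have a gap of 0


def is_trans_pair(lncrna_expr, gene_expr, cutoff=0.95):
    """True if expression profiles correlate at or above the cut-off."""
    return pearson(lncrna_expr, gene_expr) >= cutoff
```

With only ten samples, a correlation cut-off this strict can still admit chance pairs among many thousands of comparisons, so such predictions are best treated as candidates for validation.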

Validation by qRT-PCR
The qRT-PCR was performed as described in previous studies [11,14]. In brief, total RNA was extracted and reverse-transcribed into cDNA. The SYBR-green qPCR kit (Life Technologies, USA) was used to detect the relative expression of the RNA. The primer sequences are shown in Table 2. The 2^-ΔΔCt method was used to calculate relative gene expression levels. Endogenous GAPDH was used for normalization.
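As a worked illustration of the 2^-ΔΔCt calculation, the sketch below takes the Ct values of the target and of the reference gene (GAPDH here) in an ICM sample and in a control sample. The function name and the example Ct values are hypothetical and are not data from this study.

```python
def relative_expression(ct_target_case, ct_ref_case, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression of a target gene by the 2^-ddCt method."""
    delta_ct_case = ct_target_case - ct_ref_case  # normalize to reference gene
    delta_ct_ctrl = ct_target_ctrl - ct_ref_ctrl
    delta_delta_ct = delta_ct_case - delta_ct_ctrl
    return 2 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses threshold two cycles earlier
# in the case sample after normalization, i.e. a four-fold increase.
fold = relative_expression(24.0, 18.0, 26.0, 18.0)
```

The method assumes roughly 100% amplification efficiency for both the target and the reference primers.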

Identification of ICM-related DEPs
The proteomic analysis based on the DIA technique was performed to investigate the abundance of cardiac proteins in end-stage ICM patients. A total of 28,724 unique peptides and 3419 protein groups were identified. Compared with non-failing donors (control), a total of 546 proteins were differentially expressed in the ICM group, including 377 up-regulated and 169 down-regulated DEPs (Fig. 1A and 1B). [...] 2C and Additional file 1). The KEGG pathway analysis revealed that the ICM-related DEPs were mainly involved in autoimmune response and in substance and energy metabolism, such as Systemic lupus erythematosus (hsa05322), Complement and coagulation cascades (hsa04610), Carbon metabolism (hsa01200), Fatty acid degradation (hsa00071), and Citrate cycle (TCA cycle) (hsa00020) (Fig. 2D and Additional file 1). Furthermore, similar results were observed in the Metascape analysis. As shown in Fig. 2E and 2F, the ICM-related DEPs were markedly enriched in cardiac muscle structure development/organization and contraction, oxidative stress, energy metabolism, lipid metabolism, responses to stress, nucleotide metabolism, extracellular matrix (ECM) organization, and immune responses.

Identification of ICM-related DEGs and the common enriched functions between DEGs and DEPs
To investigate the transcriptomic variations in ICM, RNA-seq was performed. After filtering, an average yield of 70 M clean reads per sample was obtained, with an average alignment rate of 97.38%. Analysis of the expression levels identified a total of 1080 DEGs, including 609 up-regulated and 471 down-regulated DEGs in ICM compared with control (Fig. 3A and 3B).
Moreover, the functions of these DEGs were investigated, and a total of 129 BP, 35 CC, and 33 MF terms and 17 KEGG pathways were enriched (Additional file 2). Similar to the DEPs, the DEGs were mainly involved in inflammatory/immune response, response to stress, and fibrosis, such as inflammatory response (GO:0006954), T cell activation (GO:0042110), response to hydrogen peroxide (GO:0042542), bone mineralization (GO:0030282), and ECM-receptor interaction (hsa04512) (Additional file 2).

Integrated analysis of proteome and transcriptome
The integrated analysis of transcriptome and proteome can provide a more comprehensive insight into the gene transcriptional profile and post-transcriptional regulation [15]. In the present study, the expression data of the transcriptome and proteome were combined, and the correlation between the two profiles was calculated. As shown in Fig. 4A, there was no significant correlation between the transcriptome and proteome (r = 0.23). According to the fold-change thresholds, the integrated data could be divided into nine quadrants. The third and seventh quadrants represented the same trend for transcriptome and proteome expression, and a total of 28 significantly co-up-regulated genes, as well as 13 significantly co-down-regulated genes, were identified (Fig. 4A).
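The nine-quadrant division can be sketched as below. The numbering is an assumption: quadrants are counted 1 to 9 from left to right and top to bottom, with mRNA fold change on the horizontal axis and protein fold change on the vertical axis, which makes quadrant 3 "both up" and quadrant 7 "both down". The cut-offs reuse the DEG (1.5) and DEP (1.2) fold-change thresholds, and the significance filter on adjusted p values is omitted for brevity. The function name is hypothetical.

```python
import math


def nine_quadrant(mrna_fc, protein_fc, mrna_cut=1.5, protein_cut=1.2):
    """Return the quadrant (1-9) for one gene from its two fold changes."""

    def state(fold_change, cut):
        # +1 = up, -1 = down, 0 = unchanged, symmetric on the log2 scale.
        log_fc, bound = math.log2(fold_change), math.log2(cut)
        if log_fc > bound:
            return 1
        if log_fc < -bound:
            return -1
        return 0

    column = state(mrna_fc, mrna_cut) + 1    # 0 = left (mRNA down), 2 = right
    row = 1 - state(protein_fc, protein_cut)  # 0 = top (protein up), 2 = bottom
    return row * 3 + column + 1
```

Under this numbering, genes in quadrants 1 and 9 change in opposite directions at the two levels, which may point to post-transcriptional regulation.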
The functions of these co-regulated genes were mainly involved in response to hypoxia or oxidative stress, muscle contraction, and the complement cascade, such as the HIF-1 signaling pathway, oxidation-reduction process, response to hydrogen peroxide, striated muscle contraction, and complement activation (Fig. 4B). The interactions of these co-regulated genes/proteins are shown in Fig. 4C, and only 18 interactions among 19 proteins were identified. The co-regulated genes with the top 10 degrees are shown in Fig. 4D, and HSP90AA1 occupied a central position in the PPI network.

Identification of the DELs and differentially expressed lncRNA-mRNA-protein network construction
Increasing evidence has suggested the crucial role of lncRNAs in regulating the transcriptional and post-transcriptional expression of coding genes through three main modes: antisense-, cis-, and trans-regulation [6,16]. In the present study, a total of 1227 lncRNAs were differentially expressed in ICM compared with control, including 626 up-regulated and 601 down-regulated DELs (Fig. 5A and 5B). Moreover, 12 lncRNA:antisense-mRNA pairs, 37 lncRNA:cis-mRNA pairs, and 4283 lncRNA:trans-mRNA pairs were identified between the DEGs and DELs. Additionally, the genes co-regulated between transcriptome and proteome were extracted together with their regulatory lncRNAs, and their network was constructed. As shown in Fig. [...] SNHG5-283) were identified, which regulated the expression of 11 co-regulated genes (TF, CACYBP, FMOD, AOC3, TAGLN, PDK1, S100A13, S100A4, HMGN2, IGHA1, HLA-DPA1) through two antisense-, two cis-, and 13 trans-regulatory relationships. For example, the decreased lncRNA of HLA-DPB1 might up-regulate the expression of HLA-DPA1 mRNA via antisense activity, resulting in an increase in HLA-DPA1 protein (Fig. 5C).

Validation by qRT-PCR
To validate the expression of the genes in the differentially expressed lncRNA-mRNA-protein network, qRT-PCR was performed. As shown in Fig. 6, four lncRNAs (AL513365.2-202, DNM1P46-201, HLA-DPB1-217, and SNHG32-204) and one mRNA (S100A13) showed no statistically significant difference, while the other 9 lncRNAs and 10 mRNAs showed significant differences in the same directions as the RNA-seq data.

Discussion
In the present study, an integrated analysis of transcriptome and proteome was performed to investigate the molecular mechanisms related to the pathogenesis of ICM. Each molecular feature represented partial aspects of the ICM, and the combination of lncRNA, mRNA, and protein information provided a more comprehensive insight into the cellular mechanisms related to ICM.
Compared with DNA and RNA, protein is the most directly functional molecule [17]. In this study, the DIA approach was used to quantify protein abundances and their changes in ICM, and a total of 546 proteins (377 up- and 169 down-regulated DEPs) were differentially expressed compared with non-failing donors (NFD). Previous studies have revealed multiple cellular pathophysiological processes involved in ICM, such as oxidative stress, inflammation, mitochondrial dysfunction, the apoptosis cascade, calcium overload, and myocardial fibrosis [4,18,19]. Similarly, our GO and pathway analyses suggested that these DEPs participated in cardiac muscle structure development/organization and contraction, oxidative stress, energy metabolism, lipid metabolism, responses to stress and nucleotide metabolism, ECM organization, and immune responses.
The use of complementary approaches can compensate for some deficiencies and provide new perspectives compared with each omics method used in isolation. In this study, RNA-seq was performed, and 1080 genes were found to be changed at the RNA level, including 609 up- and 471 down-regulated DEGs in ICM compared with NFD. Similarly, these DEGs were markedly involved in inflammatory/immune response, response to stress, oxidative stress, and fibrosis. The terms enriched in both DEPs and DEGs further highlighted the key roles of ECM organization and response to stress, as well as the complement and coagulation cascades signaling pathway, in the mechanisms of ICM, which was consistent with previous research [19][20][21].
The integrated analysis of transcriptome and proteome has the potential to provide a systemic view of gene expression profiles and of complex biological functions related to post-translational turnover and varying translation efficiencies [13,15]. However, the correlation of gene expression profiles at the transcript and protein levels was poor overall, which was similar to other studies [15,22]. Although genetic information is ultimately translated into amino acid information, post-transcriptional and post-translational variations can result in an imperfect correlation between genetic programs and protein phenotypes [22,23], which may partially explain the poor correlation between transcripts and proteins. In this study, 41 key genes were identified that showed the same direction of expression change at the mRNA and protein levels. These key genes were mainly associated with response to hypoxia or oxidative stress, muscle contraction, and the complement cascade. Among the 41 genes, HSP90AA1 occupied a central position in the PPI network and was down-regulated, with log2 FC (fold change) of -1.41 and -1.06 at the mRNA and protein levels, respectively.
HSP90AA1 is a member of the HSP90 (heat shock protein) family, which is considered a protein stabilizer that binds to multiple receptors, such as epidermal growth factor receptor, transforming growth factor TGFβ receptor, and epithelial-mesenchymal transition factor, and thus protects the corresponding signals [24][25][26]. Previous research has revealed the role of HSP90 in promoting fibrosis in different models, including renal fibrosis, liver fibrosis, and cardiac fibrosis in pressure-overloaded rats [24,27,28]. HSP90 may therefore act as a risk factor that exacerbates cardiac fibrosis and heart failure. However, studies have also suggested a protective function of HSP90 in cardiomyocytes under ischemia and reperfusion conditions [29,30]. It was demonstrated that over-expression of HSP90AA1 could alleviate cardiomyocyte apoptosis, and knockdown of HSP90AA1 could enhance apoptosis [30]. Therefore, whether HSP90AA1 plays a protective or an aggravating role in ICM or HF needs to be further investigated.
LncRNAs have been suggested as crucial regulators of various cardiovascular diseases, involved in response to stress, apoptosis, autophagy, proliferation, fibrosis, and so on [9,31]. For example, overexpression of lncRNA Gm2691 could attenuate apoptosis after myocardial infarction [32], while down-regulation of H19 could promote cell proliferation and inhibit cell apoptosis [33]. LncRNAs regulate physiological and pathophysiological processes through a variety of mechanisms, especially binding miRNA as competing endogenous RNA, influencing neighboring genes as cis-acting elements, regulating remote genes as trans-acting elements, and binding mRNA as antisense regulators [16,32,34]. Thus, lncRNAs regulate the expression of relevant genes through transcriptional and post-transcriptional modulation. In this study, 12 lncRNA:antisense-mRNA pairs, 37 lncRNA:cis-mRNA pairs, and 4283 lncRNA:trans-mRNA pairs were identified between the DEGs and DELs. Of the 41 key genes with the same expression trends at the mRNA and protein levels, related regulatory lncRNAs were identified for only 11 genes (TF, CACYBP, FMOD, AOC3, TAGLN, PDK1, S100A13, S100A4, HMGN2, IGHA1, HLA-DPA1) (

Conclusion
Taken together, we systematically investigated the transcriptomic and proteomic data of left ventricles collected from ICM patients and NFD. The results highlighted the processes of ECM organization and response to stress (such as oxidative stress), as well as the complement and coagulation cascades signaling pathway, in the mechanisms of ICM. Though the correlation between transcriptome and proteome was poor, 41 key genes were identified that showed the same direction of expression change at the mRNA and protein levels. Moreover, the lncRNA regulatory network of the DEGs was investigated based on antisense-, cis-, and trans-regulation, and 13 lncRNAs were obtained, which regulated 11 of the 41 key genes. It is important to note that many of these lncRNAs and genes remain poorly understood in the pathogenesis of ICM and need further exploration. This work should provide a framework for an in-depth interrogation into the complex molecular mechanisms of ICM.

Declarations
Ethics approval and consent to participate
The collection and research use of the human heart tissues were conducted in accordance with the guidelines of the Declaration of Helsinki and approved by the Ethics Committee of Guangdong People's Hospital (approval No. GDREC2016255H). All participants provided informed written consent.

Consent for publication
Not applicable.

Availability of data and materials
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.