Genomic and RNA-Seq Profiling of Patients with Heart Failure Identified Alterations in Phosphorylation and Immune Signaling Pathways

Background: Heart failure (HF) is a complex pathophysiological state in which the delivery of blood and nutrients to the tissues is inadequate. It is rarely curable and is often associated with poor prognosis. The current study aimed to analyze the exomic and RNA-Seq data of patients with HF to identify the key altered pathways. Methods: Participants with HF were recruited, and whole blood samples were collected. The samples were used for whole exome sequencing and RNA-Seq analysis. Gene expression and RNA-Seq results were verified by gene chips and RT-PCR. Results: We report a dysregulation of phosphorylation and immune signaling in patients with HF in both the exomic and RNA-Seq data. Exomic analysis identified mutations in TTN, OBSCN, NOD2, CDH2, MAP3K5, and SLC17A4 as associated with HF. In addition, S100A12, S100A8, S100A9, PFDN5, and TMCC2 were found to be upregulated in patients with HF by RNA-Seq. Conclusion: Our results demonstrated the overall disruption of key phosphorylation and immune signaling pathways in patients with HF.

Chinese and American scholar from Heart Association [7,8]. These mutations result in aberrant proteins in the sarcomere, which in turn affect the contractile function of the latter; this induces cardiac remodeling through dysregulated downstream signaling pathways, including calcium signaling, G-protein, protein kinase C (PKC), and renin-angiotensin system/mitogen-activated protein kinase (MAPK) pathways [9,10]. Pro-inflammatory cytokines and immune-related pathways have also been studied in relation to HF [11]. Tumor necrosis factor (TNF) is upregulated in patients with severe congestive heart failure [12]. The roles of interleukin, endothelin, and other cytokines are still under investigation, and their upregulation has been proven to be detrimental under certain conditions [13].
Several therapeutics that target HF are currently being evaluated. This study presents a cross-platform integrated analysis of patients with HF, aimed at dissecting the molecular pathogenesis of the disease. We combined whole exome sequencing, gene chip, and RNA-Seq technology to find the gene expression changes in patients with HF.

Human participants and sampling
For human studies, all participants were recruited, and whole blood samples were collected, at the Chinese PLA General Hospital. Blood samples were kept in BD PAXgene tubes for RNA extraction and in BD EDTA routine blood tubes for DNA and plasma extraction. The diagnosis of HF was based on standard guidelines according to the symptoms or signs, electrocardiogram, chest radiograph, and echocardiography. Patients were included in the sequencing if they had a reduced left ventricular ejection fraction (LVEF < 40%).

Exome sequencing library preparations and analysis
Genomic DNA was extracted from human whole blood. The ClearSeq Inherited Disease capture kit was used for target capture. Whole-exome sequencing data were generated at 10× coverage. The raw reads were filtered to obtain high-quality clean data, which were then aligned against the human reference genome using the Burrows-Wheeler Aligner. Alignment files were converted to BAM files, from which SNPs were called. ANNOVAR was used for the analysis of the variants.

Gene chips
Gene chips containing cardiovascular disease-related genes were used in this study. 1 µg of genomic DNA was fragmented and hybridized to the capture probes following the manufacturer's protocols (Roche). The resulting libraries were sequenced on the Illumina HiSeq 2000 platform to generate paired-end reads of 90 bp. Finally, a mean coverage of 153.01× was obtained for 957 samples.

RNA-sequencing
Human blood samples were stored in BD PAXgene tubes. Total RNA was extracted using the RNeasy Mini Kit (Qiagen, Dusseldorf, Germany), and RNA concentration was measured using a NanoDrop 2000 (Thermo). The RNA Nano 6000 Assay Kit was used to measure RNA integrity. The NEBNext Ultra RNA Library Prep Kit was used to generate the sequencing libraries, which were then sequenced on the Illumina HiSeq X Ten platform. Clean reads obtained from the raw sequences were mapped to the human reference genome. Only reads with a perfect match were retained for further analysis. Differential expression analysis was performed using the DESeq R package (1.10.1). The resulting P-values were adjusted using the Benjamini-Hochberg approach for controlling the false discovery rate. Genes with a P-value < 0.01 were considered differentially expressed.
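The false discovery rate adjustment described above can be illustrated with a minimal Python sketch. It is not the DESeq implementation used in the study; the function name, the example P-values, and the choice to apply the 0.01 cutoff to the adjusted values are assumptions made for illustration only.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that adjusted values
    # never increase as the raw P-value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-gene P-values, not taken from the study.
raw = [0.001, 0.008, 0.039, 0.041, 0.6]
adj = benjamini_hochberg(raw)
# Assumption: the 0.01 cutoff is applied to the adjusted values.
significant = [p < 0.01 for p in adj]
print(adj, significant)
```

In this toy input only the first gene remains below 0.01 after adjustment, which shows how the correction becomes more conservative as the number of tested genes grows.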

Protein chip analysis
The samples were analyzed with Human Cytokine G1000 arrays (AAH-CYT-G1000; RayBiotech, Norcross, GA, USA) according to the manufacturer's instructions. Briefly, each sample was biotinylated and added onto duplicate glass slides preprinted with capture antibodies. Bound proteins were incubated with a streptavidin-conjugated fluorescent dye, and the observed fluorescence intensities were normalized to the intensities of the internal positive controls. The arrays were scanned using an excitation wavelength of 532 nm, and the fluorescence was measured using an InnoScan 300 Microarray Scanner. Differentially expressed proteins were arranged using hierarchical clustering and represented as a heat map.
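As a rough illustration of the normalization step, the following Python sketch divides each spot intensity by the mean intensity of the positive-control spots on the same slide and then averages the duplicate slides. The exact procedure of the array vendor may differ; the function names, analyte labels, and intensity values are hypothetical.

```python
from statistics import mean


def normalize_to_positive_controls(spot_intensities, positive_controls):
    """Scale spot intensities by the mean positive-control intensity of the same slide."""
    reference = mean(positive_controls)
    if reference <= 0:
        raise ValueError("positive-control mean must be positive")
    return {name: value / reference for name, value in spot_intensities.items()}


def average_duplicates(slide_a, slide_b):
    """Average the normalized signal of each analyte across two duplicate slides."""
    return {name: (slide_a[name] + slide_b[name]) / 2.0 for name in slide_a}


# Hypothetical raw intensities for two duplicate slides of one sample.
slide_1 = normalize_to_positive_controls(
    {"ANALYTE_1": 1200, "ANALYTE_2": 300}, [4000, 4200, 3800]
)
slide_2 = normalize_to_positive_controls(
    {"ANALYTE_1": 1500, "ANALYTE_2": 500}, [5000, 5000, 5000]
)
normalized = average_duplicates(slide_1, slide_2)
print(normalized)
```

Expressing every signal relative to the slide's own positive controls makes values comparable between slides that were scanned at different overall brightness.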

Data analysis and statistics
Statistical analysis was performed using the SPSS 20.0 software, and the nonparametric test and two-sided χ² test were used to compare the differences between the two groups. A P-value of < 0.05 was considered statistically significant.
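For readers unfamiliar with the χ² test on two groups, a minimal Python sketch for a 2 × 2 table follows. It computes the Pearson statistic without continuity correction, which is an assumption, since the SPSS options used in the study are not stated; the counts are hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("every row and column must have a non-zero total")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts: variant carriers / non-carriers in HF vs. control groups.
stat, p = chi_square_2x2(30, 70, 10, 90)
print(stat, p, p < 0.05)
```

With these illustrative counts the statistic is 12.5, and the difference between groups would be called significant at the 0.05 threshold stated above.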

Whole exome sequencing results and validation
We analyzed 17 human whole blood samples, including seven from patient volunteers diagnosed with HF with a mean ejection fraction (EF) of 34% and 10 from healthy volunteers as controls (CNTs), using whole exome sequencing. The mean ± SD of the clinical characteristics of all the volunteers is indicated in the Supplementary materials (Supplementary Table 1). We found 430 variants to be present in the proband (Supplementary Table 2). We functionally annotated two clusters of genes with altered expression levels, which encoded proteins for serine/threonine kinase activity and immune response (Fig. 1). We detected 35 variants in 29 genes in the phosphatase/kinase group and 45 variants in 36 genes in the immune response group (Table 1).
To validate the results obtained using whole exome sequencing, we expanded the sample size. We analyzed 1,000 patient volunteers with HF and 250 healthy volunteers. The mean ± SD of the clinical characteristics of all the volunteers is indicated in Supplementary Table 3. We used a gene chip to evaluate the association of HF with the SNPs of phosphatase/kinase genes and immune response genes.

RNA-Seq results and validation
To investigate the differentially expressed genes (DEGs) in patients with HF, the samples used for whole exome sequencing were subjected to RNA-Seq analysis. Significance analysis of the RNA-Seq results revealed 810 DEGs: 197 upregulated (≥ 1.5-fold, P < 0.01) and 613 downregulated (≤ 1.5-fold, P < 0.01; Fig. 2A, Supplementary Table 4). Functional analysis of the genes was performed using the DAVID database. We performed Gene Ontology (GO) term enrichment to identify the genes regulating functional processes with biological significance. Functional annotation of two clusters of the significantly dysregulated genes revealed enrichment of genes corresponding to protein autophosphorylation and inflammatory response (Fig. 2B).
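The thresholding used to call DEGs can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' pipeline: fold change is taken as the HF/control expression ratio, "downregulated" is interpreted as a ratio at or below 1/1.5, and the gene names and values are placeholders.

```python
def classify_genes(results, fold_cutoff=1.5, p_cutoff=0.01):
    """Label each gene 'up', 'down', or 'unchanged'.

    results maps gene name -> (fold_change, p_value), where fold_change
    is the HF/control expression ratio (assumed to be positive).
    """
    labels = {}
    for gene, (fold_change, p_value) in results.items():
        if p_value < p_cutoff and fold_change >= fold_cutoff:
            labels[gene] = "up"
        elif p_value < p_cutoff and fold_change <= 1.0 / fold_cutoff:
            # Assumption: downregulation mirrors the upregulation cutoff.
            labels[gene] = "down"
        else:
            labels[gene] = "unchanged"
    return labels


# Hypothetical values; GENE_A to GENE_D are placeholders, not study results.
example = {
    "GENE_A": (2.4, 0.0004),
    "GENE_B": (0.5, 0.002),
    "GENE_C": (1.8, 0.2),
    "GENE_D": (1.1, 0.001),
}
labels = classify_genes(example)
print(labels)
```

A gene must pass both the fold-change and the P-value criterion to be counted, so GENE_C (large change, weak evidence) and GENE_D (strong evidence, small change) are both left unchanged.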
To validate the results obtained through RNA-Seq analysis, qRT-PCR was performed with an expanded sample size. We selected nine genes from the altered phosphorylation and inflammatory response groups, namely TRAPPC5, PDCD, PFDN5, and TMCC2 from the phosphorylation group and S100A12, S100A8, S100A9, LGALS3, and CISD2 from the inflammatory response group. qRT-PCR analysis revealed increased expression of the nine genes in the HF samples (Fig. 2C). The qPCR results thus validated the results of RNA-Seq. STRING analysis elucidated the relationship of the dysregulated genes with altered phosphorylation or inflammatory response pathways (Fig. 2D).

Discussion
In this study, we identified phosphorylation and inflammatory responses as the two biological pathways associated with HF, using RNA-Seq and whole exome sequencing. We also showed that mutations in patients with HF are located in TTN, OBSCN, NOD2, CDH2, MAP3K5, and SLC17A4, and the genes that showed upregulated expression in patients with HF were S100A12, S100A8, S100A9, PFDN5, and TMCC2.
Technological advances have enabled the use of whole exome sequencing for the study of human diseases. However, the cost per base of next-generation sequencing platforms still precludes the generation of large sample sizes of completely sequenced genomes with high coverage [14]. In comparison, the gene chip technology is much cheaper. Therefore, there has been considerable focus on whole exome sequencing as the first step, with gene chip as the validation step [15]. RNA-Seq is used for transcriptome profiling using deep-sequencing technologies, and it generates millions of short sequences in a single run. These fragments, or 'reads', can be used to measure gene expression levels and to identify novel splicing events [16]. In this study, we comprehensively used whole exome sequencing, gene chip, and RNA-Seq technology to find the gene expression changes in patients with HF. Both whole exome sequencing and RNA-Seq identified phosphorylation and inflammatory response as the two key pathways associated with HF. These results were further verified using gene chip analysis.
Many eukaryotic cell functions, including signal transduction, cell adhesion, gene transcription, RNA splicing, apoptosis, and cell proliferation, are regulated via protein phosphorylation [17]. The expression of cardiac phosphatases is increased in patients with end-stage HF [18]. Muscle contraction and its molecular motor myosin are regulated through the phosphorylation of cardiomyocyte cytoskeletal proteins, such as the regulatory myosin light chain (MLC2). Decreased levels of phosphorylated MLC2 (MLC2-P) have been observed in HF [19,20]. Top-down quantitative proteomics in HF has identified the phosphorylation of cardiac troponin I (cTnI) as a candidate biomarker for chronic HF [21]. cMyBP-C phosphorylation has a direct effect on the contractile properties of the heart, sarcomere organization, and its ability to attenuate the development of HF [22]. Chronic protein kinase A (PKA) hyperphosphorylation of RyR2 results in a diastolic leak that causes cardiac dysfunction [23]. The p38 MAPK pathway is a potential target in the therapeutic regimens for infarction, hypertrophy, and HF [24]. Consistent with the results of these previous reports, the current study elucidated the role of altered phosphorylation in HF, using RNA sequencing and whole exome sequencing.
In addition, we demonstrated alteration of inflammatory processes in HF. The mechanisms that drive the development of HF can be divided into four broad categories [25]. The first is based on traditional risk factors, such as ischemic injury, hypertension, and metabolic syndrome; the second includes genetic cardiomyopathies; and the third is based on valve dysfunction. In these three categories, the initial insult is not immune-based; rather, activation of the immune system is a secondary response. The fourth category is immune-based and includes autoimmune and infectious (viral and bacterial) triggers. Inflammation is therefore involved in all four categories, whether as a primary trigger or as a secondary response. Upregulation of pro-inflammatory cytokines has been implicated in the progression of HF, and elevated inflammatory markers constitute important risk factors for HF [26].
Gene chip verification in a larger sample showed that the mutations in patients with HF were mainly in TTN, OBSCN, NOD2, CDH2, MAP3K5, and SLC17A4. The two most significantly mutated genes were TTN and OBSCN. Mutation in the TTN gene locus 2q31 has been implicated in human skeletal muscle and heart diseases for over 15 years [28]. However, only recently was TTN mutation found to be associated with HF [29]. OBSCN variants are also relatively common in inherited cardiomyopathies. Unique OBSCN variants have been found in a group of 30 end-stage failing hearts [30], wherein the frequency was similar to that for TTN-truncating mutations, which were proposed to be the major alterations associated with HF. Interestingly, obscurin (encoded by OBSCN) interacts with titin (encoded by TTN) through the N-terminal domain, which in turn interacts with M-line complexes of titin and myomesin, hence enhancing the binding and contributing to stability [31]. In addition, titin and obscurin contain signaling domains close to their C-termini (a protein serine/threonine kinase domain in titin, and a Rho-GTPase GTP-GDP exchange factor domain in obscurin) that can be coupled to two serine/threonine protein kinase domains [32]. Phosphorylation of a specific sequence in the cardiac isoform (N2B region) by cAMP-dependent or cGMP-dependent protein kinases results in acute reduction in stiffness [33]. Based on the gene chip analysis, we identified that nearly 70% of mutations were related to TTN and nearly 40% of mutations were related to OBSCN. Therefore, targeting TTN and OBSCN as a therapeutic strategy will be highly significant. The pathological increase in passive stiffness caused by TTN and OBSCN mutations may be reversed by PKA and cGMP-dependent protein kinase (PKG) [34]. We also found NOD2, CDH2, MAP3K5, and SLC17A4 mutations in HF here. NOD2 plays a key role in the immune response to intracellular bacterial lipopolysaccharides [35].
CDH2 encodes a classical cadherin and is a member of the cadherin superfamily, generating a calcium-dependent cell adhesion molecule. The glycoprotein MAP3K5 is a member of the MAPK signaling cascade, abundantly expressed in the human heart [36]. SLC17A4 is a sodium/phosphate cotransporter in the intestinal mucosa and plays an important role in the absorption of phosphate from the intestine [37]. The significance of mutations in the genes encoding these four proteins as therapeutic targets will require further validation, as strong mutations in these genes were not identified in this study. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and are mainly secreted from neutrophils, monocytes, and macrophages [4]. S100 proteins are recognized for their roles in inflammation [38]. S100A8/A9 is homologous to S100A12, which provides a significant predictive value for one-year mortality in elderly patients with severe HF [39]. Our group was the first to show, in a previous study, that S100A12 is a potential biomarker for the prediction of HF. Upregulation of S100A12, S100A8, and S100A9 in patients with HF was consistent across our studies. Further, S100A8, S100A9, and S100A12 could be potential indicators in evaluating HF [40,41]. In addition, PFDN5 and TMCC2 were upregulated in RNA-Seq, which was further verified using real-time PCR. PFDN5 encodes a member of the prefoldin alpha subunit family, and TMCC2 encodes a novel protein of the endoplasmic reticulum. Misfolded proteins are upregulated in patients with chronic HF, and misfolded PFDN5 and TMCC2 are promising biomarkers for the prediction of HF.
Our study revealed the key pathways and molecules implicated in HF; however, it has several limitations. First, the initial population used for RNA-Seq and whole exome sequencing was modest, which decreased the sensitivity and accuracy of detecting the changes in expression levels and gene variants that do not make large contributions to HF. Second, the SNP coverage was relatively low when we used a large sample to verify the variants.
In conclusion, we verified that phosphorylation and inflammation are associated with HF, and confirmed that compounds partly inhibiting protein phosphatase and inflammation can have cardiac protective effects.
SKZ and CLL confirm the authenticity of all the raw data.
Ethics approval and consent to participate
The study was approved by the Ethics Committee of the Chinese PLA General Hospital, and the document number is S.2017-035-01.

The first vertical column shows the normalized fold-changes for RNA-Seq and qPCR of five inflammation-related genes (S100A12, S100A8, S100A9, LGALS3, and CISD2). The second vertical column shows the normalized fold-changes for RNA-Seq and qPCR of four stress-related genes (TRAPPC5, PDCD, PFDN5, and TMCC2). (D) The dysregulated signaling pathway. Genes with mutations or copy number changes are listed, and relationships across them are shown by a connection diagram. For each altered gene, the percentage of cases with this alteration is indicated, and for significantly upregulated genes, the average fold-changes are shown.
