Identication of hub genes for adult patients with sepsis via construction of a microRNA-mRNA-PPI network

Objective: To screen out potential prognostic hub genes for adult patients with sepsis via construction of a microRNA-mRNA-PPI network, and investigate the localization of these core genes in peripheral blood monocytes. Methods: The peripheral bloods of 33 subjects were performed microRNA and mRNA sequencing using high-throughput sequencing, differentially expressed genes and microRNAs were identied by bioinformatics. 10×single-cell transcriptome sequencing was conducted further. Results: Of which 23 adult septic patients and 10 healthy individuals, a relative levels of transcription of 20391 genes, and 1633 microRNAs were detected by RNA sequencing. A total of 1114 differentially expressed genes and 76 differentially expressed microRNAs were obtained using DESeq2, 454 differential expression genes (DEGs) were distinguished ultimately. The microRNA-mRNA-PPI network was constructed based on DEGs and the top 20 differentially expressed microRNAs, which included 10 upregulated and 10 down-regulated microRNAs. Furthermore, hub genes TLR5, FCGR1A, ELANE, GNLY, IL2RB and TGFBR3 which may be association with the prognosis of sepsis and their negative correlation microRNAs were recognized. Genes TLR5, FCGR1A and ELANE were mainly expressed in macrophages, genes GNLY, IL2RB and TGFBR3 were expressed specically in T cells and nature kill cells. Conclusions: The parallel analysis of mRNA and microRNA in patients with sepsis using RNA-seq was feasible. Potential hub genes and microRNAs which are possibly related to sepsis prognosis were identied, it provides new prospects for sepsis. However, further experiments need to be required.


Introduction
Sepsis is a syndrome complex involving host reaction malfunction and life-threatening organ damage caused by infection [27]. It is a common severe emergency symptom, and there are approximately 19 million patients with sepsis worldwide each year [17]. The incidence rate of sepsis in the ICU of hospitals in China is approximately 20.6%, and the mortality rate of sepsis is still high despite the timely use of antibiotics and other positive adjuvant therapies. The fatality rate of patients with severe sepsis is up to 50% and more [38]. The pathogenesis of sepsis is complex and has long been the focus of medical research. The Multiple Organ Dysfunction (MOD) score and Sequential Organ Failure Assessment ( SOFA)score were widely used in the evaluation of organ damage and prognosis in patients with sepsis, but further research showed that they had no good predictive effect for the death group and survival group of sepsis [44]. Rapid diagnosis and identi cation of sepsis is an important means to improve the survival rate of patients with sepsis and reduce organ dysfunction related to sepsis. At present, the diagnosis of sepsis mainly depends on clinical infection symptoms and blood cultures, which lack rapid and sensitive biomarkers [12]. Currently, researches focused on the biomarkers of sepsis are increasing. The level of procalcitonin (PCT) in patients with sepsis or septic shock at admission is considered to be a better prognostic indicator than other in ammatory markers [17]. Nevertheless, the critical value for determining death or survival of patients with sepsis cannot be determined [30], PCT can not to be applied for predicting the prognosis of sepsis. C-reactive protein and white blood cell count are also poorly speci c and sensitive in the diagnosis of bacterial infection [26]. In particular, under the standard of sepsis 3.0, it is urgent to further study the pathogenesis of sepsis and nd new biomarkers related to sepsis to provide a theoretical basis for clinical diagnosis and treatment.
As advances in science and technology, the high-throughput sequencing technologies are acquiring increasing attention, genomic information related to sepsis can be found, which is helpful for pathogenesis research and rapid and accurate diagnosis, treatment and prognostic prediction of sepsis [20].It is an important foundation of precision medicine and provides new ideas and technologies for medical science research.
In summary, we used RNA-seq technology to perform microRNA and mRNA sequencing in peripheral blood cells of adult patients with sepsis to construct the microRNA-mRNA-PPI regulatory network in sepsis and screened out genes TLR5, FCGR1A, ELANE, GNLY, IL2RB and TGFBR3 which were associated with the prognosis of sepsis, the microRNAs that regulate negatively these key genes were also identi ed. Furthermore, the location of the genes in peripheral blood mononuclear cells (PBMCs) were con rmed.

Subjects recruitment and Blood samples collection
In this study, septic patients(n=23) were recruited when were hospitalized in the EICU,

RNA-sequencing
The blood samples of one patient were sent to mRNA and microRNA sequencing at the same time. mRNA and microRNA sequencing were performed with the assistance of BGI (Shenzhen,China). Brie y, total RNA was extracted from peripheral blood cells using Trizol ((Invitrogen, Carlsbad, CA, USA)and RNA integrity number (RIN) was quali ed and quanti ed by Agilent 2100 bioanalyzer(Agilent, Santa Clara, USA), mRNA should meet the requirement of 28S/18S>1; microRNA quality control should meet the requirement of 28S/18S>1.5.
According to the manufacture's protocol, after removal of ribosomal RNA(rRNA) from total RNA and Solid Phase Reversible Immobilization(SPRI) beads puri cation, the RNA was fragmented into small pieces, afterwards the cleaved RNA was reversely transcribed into cDNA and ampli ed with Polymerase Chain Reaction(PCR) to create the cDNA library. Quality and quantity assessments of the libraries were processed via the Agilent 2100 bioanalyzer and real-time quantitative PCR (qPCR) (TaqMan Probe). The quali ed libraries were applied to mRNA-sequencing on the DNBSEQ platform (BGI-Shenzhen, China) nally.
After quality and quantity control, 1 µg total RNA for each sample was prepared to construct microRNA library.Puri cation of total RNA was implemented by electroph-oretic separation and small RNA regions which are correspondent to the 18-30 nucleotides(nts) bands in the marker lane (14-30 ssRNA Ladder Marker, TAKARA) were gathered and recovered. After annealing, using the SuperScript II Reverse Transcriptase (Invitrogen ,Carlsbad, CA, USA), the adapter-ligated small RNAs were subsequently transcribed into cDNA and the products were enriched by several rounds of PCR. The PCR products were screened by agarose gel electrophoresis that binding to target fragments 110~130 bp, and afterwards puri ed by QIAquick Gel Extraction Kit (Qiagen, Valencia, CA). The miRNA libraries was quali ed using the Agilent 2100 bioanalyzer, and quanti ed through real-time quantitative PCR (QPCR) (TaqMan Probe). The nal ligation PCR products were sequenced microRNA by means of the BGISEQ-500 platform (BGI-Shenzhen, China) Sample reads were trimmed to remove reads with an unknown base N content greater than 5%, adapters and low-quality bases using Trimmomatic software and aligned with the reference genome using HISAT and Bowtie2 software.

Differential genes and microRNAs analysis
The data was divided into the sepsis group and the healthy control group. Raw data were pre-processed via the edgeR (V4.0) and the differential expressed genes and microRNAs were analyzed using DESeq2[18] with false discovery rate(FDR) < 0.05 and log2 fold change(log2FC )>2. Afterwards, the top 20 microRNAs in ascending order of FDR were selected as differentially expressed microRNAs (DEMs) for sepsis. the miRWalk3.0 (http://mirwalk.umm.uniheidelberg.de/) was used to predict the potential target genes of DEMs. Genes with a negative correlation between the DEMs and the differential genes were identi ed by the OmicShare (https://www.omicshare.com/), which were intersected with the predicted target genes of DEMs, the screening criteria were de ned as P value <0.05 and cor<-0.4. The intersecting genes were de ned as the nal differential expression genes(DEGs).

Enrichment analysis of differential genes function
To further understand the integrating information of DEGs from an overall perspective, the Metascape database (https://metascape.org/) was employed to conduct the DEGs genes ontology (GO) analysis, gene-disease association analysis [25], gene-organization distribution analysis [24]. The Metascape database integrates multiple authoritative data resources such as GO, Kyoto Encyclopedia of Genes and Genomes(KEGG), UniProt and DrugBank, it could provide comprehensive and detailed information about each gene, and complete not only pathway enrichment and bioprocess annotation, but also gene-related protein network analysis,gene-disease association analysis, gene-organization distribution analysis.

Construction of microRNA-mRNA-PPI network
The DEGs were subjected to protein-protein interaction (PPI, combined score > 0.4) analysis using the STRING database (https://string-db.org/), and the microRNA-mRNA-PPI network was constructed via the OmicShare (https://www.omicshare.com/) to screen potential core genes.

Hub genes survival analysis
The GSE65682 dataset [2] was downloaded from the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/) and included clinical information such as gene expression data and survival time of 479 patients with sepsis in ICU, including 365 sepsis survivor. The survival data of patients with sepsis in GSE65682 were applied to conduct the survival analysis over the core genes in the microRNA-mRNA-PPI network, and the survival curve was generated by GraphPad Prism (version 7.0) software to select the potential key genes related to the prognosis of sepsis.
Negative regulation of key microRNA-mRNA pairs To understand the mechanism of the core genes in sepsis, and by which microRNAs are these genes are regulated, directed network analysis between the core genes and the top 20 microRNAs(DEMs) were performed using the OmicShare tools(https://www.omicshare. com/tools).
10×Singlecell RNA sequencing and data analysis 5 peripheral blood Mononuclear cells (PBMCs) were collected from 2 healthy controls, 1 systemic in ammatory response syndrome(SIRS), 2 sepsis and were subjected to single-cell RNA sequencing analysis (10× Genomics), the 10× Genomics platform applied micro uidic technology according to the manufacturer's protocol.
The experimental data were further quality control based on the preliminary quality control of Cell Ranger, excluding the data from multicellular, double-cell or unbound cells. Cells with gene numbers and UMI numbers within the mean ±2 x standard deviation and less than 20% UMIs mapped to mitochondrial genes were considered as high-quality cells. The results were visualized in a two-dimensional space by tSNE (non-linear dimensionality reduction).

Statistical analysis
The clinical data of subjects were analysed by SPSS 22.0 software. the continuous variables were expressed as the mean ± standard deviation(SD), and analysed with unpaired Student's t-test; the chisquare test was adopted for the categorical variables, and a P value < 0.05 was considered statistically signi cant.

Subjects clinical characteristics
The experimental ow chart of this study is shown in Figure 1. Total 23 patients with sepsis and 10 healthy volunteers were recruited in the current study, as clinical information shown in Table 1, there were no signi cant differences in age, gender, platelet (PLT) between the septic patients and the healthy controls, considerable differences were found in white blood cell(WBC), neutrophiles, neutrophile/lymphocyte ratio (NLR) and hemoglobin(HGB). Blood culture were positive in 12 (52%) sepsis cases, Gram negative bacteria were the predominant Microorganism(8,66%). In addition,Surgical conditions are the major primary sites of infection 10,43%).

Differnetially expressed gens and miRNAs in sepsis
To identify the plasma prognostic biomarkers and potential mechanisms of sepsis, RNA-seq included mRNA and microRNA sequencing were utilized for one specimen simultaneously, mRNA sequencing for 33 subjects yielded relative levels of transcription of 20391genes ,furthermore, by microRNA sequencing, approximately 1633 microRNAs were detected.
Afterwards, bioinformatics were used to analyzed the above sequencing data for obtaining differential genes and microRNAs. Through pre-processed, the principal component analysis(PCA) was applied for removing the heterogeneous samples among the sepsis samples and control samples ( Fig. 2A and B). A total of 1114 differentially expressed genes(767 up-regulated ) and 76 differentially expressed microRNAs(45 down-regulated) were initially screened out by DESeq2 (Fig. 2C and D) between the sepsis group and the healthy control group. According to the ascending order of FDR, 10 up-regulated and 10 down-regulated miRNAs respectively were considered as differentially expressed microRNAs (DEMs) for sepsis. Intersection analysis of differentially expressed genes which were negative association with DEMs and the predicted target genes based on DEMs by miRwalk3.0 yielded 454 differential expression genes (DEGs) ultimatelly, the majority of which (361; 79.5%) were upregulated in sepsis.

Enrichment analysis of differential genes function
It is essential for comprehensive understanding of the biological function of the DEGs. Enrichment analysis presented that these DEGs were signi cantly related to neutrophil degranulation, regulation of defense, response to bacterium, and chemotaxis (Fig. 3A). Gene-disease association analysis found that these genes may be involved in in ammation, immunosuppression,pneumonitis, bacterial infection (Fig. 3B). Furthermore, these genes were mainly distributed at NK cells, bone marrow, spleen, adipocyte, blood, and liver through geneorganization distribution analysis (Fig. 3C).

microRNA-mRNA-PPI regulatory network
The DEMs and DEGs were submitted to the STRING database (https://string-db.org/), and the OmicShare (https://www.omicshare.com/), the microRNA-mRNA-PPI regulatory network for sepsis was constucted ( Fig. 4A and B), the 30 potential key genes which located at the center of the microRNA-mRNA-PPI network were screened out to be the potential core genes involved in sepsis, and presented by heatmap (Fig. 4C).

Hub genes survival analysis
Based on the data from GSE65682, we explored associations between the potential core genes and sepsis outcome, the survival analysis showed that genes TLR5, FCGR1A,ELANE which were upregulated in sepsis, and decreased expression genes GNLY, IL2RB and TGFBR3 were signi cantly associated with sepsis outcome, the higher these genes TLR5, FCGR1A, GNLY, IL2RB and TGFBR3 expression, the better the clinical outcomes in patients with sepsis(P < 0.05), the relationship between the gene ELANE and sepsis outcomes showed the opposite trend (P < 0.05) (Fig. 5A-F ). Consequently, genes TLR5, FCGR1A,ELANE, GNLY, IL2RB and TGFBR3 were screened out to be the potential prognostic biomarkers involved in sepsis ultimately.
Negative regulation of core microRNA-mRNA pairs Directed network analysis by the OmicShare (https://www.omicshare.com/) was used to analyze the regulatory relationship between core genes and microRNAs and a directed network diagram was drawn. The miRNAs accosiated with these core biomarkers were identi ed and presented in Figure 6A-F.

Location of hub genes in cell clusters
We compared gene expression patterns in PBMCs among healthy controls,SIRS and sepsis and identi ed 9 transcriptionally distinct cell clusters ( Figure 7A), marker CD14, CD3E represented monocytes and NK-T cells, respectively ( Figure 7B,C). Genes TLR5, FCGR1A and ELANE were mainly expressed in macrophages( Figure 7D-F), nevertheless, genes GNLY, IL2RB and TGFBR3 were expressed speci cally in Tcells and NK cells( Figure 7G-I). the expression abundance and ratio of each hub gene in different cell clusters were shown in Figure 7J. It provides a better prerequisite for subsequent mechanistic studies.
Discussion RNA-seq, or RNA sequencing, also referred to as transcriptome sequencing, is a newly developed technology for transcriptome analysis using deep sequencing technology and can quantitatively detect RNA expression levels [21]. RNA-seq could be applied for identifying DEGs in healthy and diseased tissues and providing a platform to study the mechanism of sepsis further. Sepsis is a life-threaten condition that threatens patents' health, despite the early administration of antibiotics, the improvement in organ support, the rates of mortality remain high in patients with sepsis. the development, immune response, and prognosis of sepsis are the focus of medical research. The focus of the present study was to distinguish the sepsis from health, and the sepsis outcome. we conducted RNA-seq (including mRNA and microRNA) on 23 patients with sepsis and 10 healthy controls, and distinguished 454 signi cantly differential expressed genes (361 upregulated) and 20 microRNAs with different expression for sepsis, compared to healthy volunteers. Through gene enrichment analysis and construction of microRNA-mRNA-PPI regulatory network, up-regulated genes TLR5, FCGR1A,ELANE and down-regulated genes GNLY, IL2RB and TGFBR3 were identi ed to be the potential vital markers, moreover, these genes are highly related with the prognosis of sepsis. In addition, the biological function of the core genes were probably associated with bacterial infection, in ammation, immuno-suppression, chemotaxis. The localization of these hub genes in PBMCs was clari ed further by single-cell sequencing.
MicroRNAs are endogenous RNAs of approximately 23 nucleotides(NT) and can inhibit gene transcription and gene protein expression by pairing with the mRNA of protein-coding genes. Studies have shown that microRNAs can be used as independent prognostic factors of disease[28 ; 29]. Therefore, we discussed the core genes combined with differential expression microRNAs and found that they were regulated by different microRNAs negatively, providing an bene cial opportunities for studying the pathogenesis of sepsis. In this study, We also observed that miR-20a, miR-101-3p, let-7f, miR-196B-3p, miR-212a, miR-129-5p, miR-149-5p, miR-219a, miR-124-3p and miR-9-5p are potentially regulate these hub genes.
For the core biomarkers, toll-like receptor 5(TLR5), as a member of Toll-like receptors(TLRs), is the receptor of bacterial motile components, targeting extracellular agellin,and thus induce the production of in ammatory cytokines and immune response [34]. Previous studies demonstrated that TLR5 is not only involved in H. pylori infection [23],systemic in ammatory response syndrome (SIRS) [15], but also prevents systemic in ammation and liver damage caused by pulmonary B. pseudomallei infection, TLR5 de ciency facilitates bacterial growth and dissemination [3]. Another paper indicated that increased TLR5 expression on monocytes was associated with mortality in patients with sepsis [39], which was not consistent with our results that higher TLR5 expression in peripheral blood attenuates mortality of septic patients, this discordance might be explained by the different samples used, more experiments are needed to con rmed. FCGR1A, also named as CD64,is expressed on most myeloid cells and is a high a nity Fc receptor (FcγRI), which binds to monomeric IgG [4].It participates in a number of functions, including phagocytosis, antigen-presentation, the production of cytokines [32].
Neutrophil FCGR1A was involved in tubculerlosis, regardless of HIV infection [33; 37], and would differentiated TB from latent TB infection [11]. Previous study indicated that Neutrophil CD64 expression would be an important diagnostic marker of infection and sepsis in hospital patients [8]. In this study, higher expression of FCGR1A in peripheral blood represented good outcome in sepsis and it was mainly expressed in macrophages.Elastase, neutrophil expressed (ELANE) gene, encodes neutrophil elastase, which exists in neutrophil, and plays an important role in killing pathogens. Previous study indicated that overexpression of ELANE was observed in septic patients via gene expression pro les analysis [42], which was in accordance with our conclusions. Moreover, inhibition of ELANE-mediated histone H3 proteolysis contributes to mononuclear-macrophage differentiation[6].
GNLY, is a cytotoxic granuloprotein secreted by cytotoxic T lymphocytes and natural killer cells [14],It exerts a toxic role with respect to bacteria, fungi, parasites, and tumors [13], Granulysin acts as an immune alarmin and promotes antigen-presenting cells activation through TLR4 [31], It was associated with the e cacy of pegylated-interferon-alpha therapy in Chinese patients with HBeAgpositive chronic hepatitis[16], rheumatoid arthritis [40],children with Mycoplasma pneumoniae pneumonia [10].
Limitations of this study should be mentioned. First, this study was done in a single center and comprised small samples, we would augment the samples to further study. Secondly, the mechanisms of the core genes in sepsis are required to validate by subsequent experiments.

Conclusions
In the current study, RNA-seq technology was used to conduct joint analysis of mRNA and microRNA in patients with sepsis. Potential genes TLR5, FCGR1A, ELANE, GNLY, IL2RB and TGFBR3, and microRNAs which are possibly post-transcriptional regulatory factor related to sepsis prognosis were screened out, which provides new prospects for exploring the physiopathologic mechanisms, diagnosis, and treatment of sepsis. Further experiments need to be required to verify these potential key genes and correspondent microRNAs in sepsis. Figure 1 The experimental ow chart of this study. RNA seq technology was applied to mRNA and microRNA sequencing for the peripheral blood of total 33 subjects, combined with the bioinformatics, the key genes and corresponding microRNA in sepsis were screened out.  PPI, protein-protein interaction, DEMs, differentially expressed genes, DEGs, differential expression genes, NC, the healthy control group Figure 5 Identi cation of core genes for sepsis.The survival analysis of the potential core genes based on GSE65682. It showed that the relationship between the expression of core genes and 28-day survival time of the septic patients. Higher expression of these genes TLR5(A), FCGR1A(B), GNLY(D), IL2RB(E) and TGFBR3(F)states higher survival rates in patients with sepsis(P < 0.05). Higher expression of these genes ELANE(C) indicates poorer outcome in septic patients(P < 0.05).

Figure 6
The network graph between the core genes and microRNAs.