SCREENING OF POTENTIAL CORE GENES IN THE PERIPHERAL BLOOD OF ADULT PATIENTS WITH SEPSIS BASED ON IMMUNOREGULATION AND SIGNAL TRANSDUCTION FUNCTIONS

ABSTRACT Objective: Based on the functions of immunoregulation and signal transduction, peripheral blood sequencing of patients with sepsis and bioinformatics technology were used to screen potential core targets. Methods: Peripheral blood of 23 patients with sepsis and 10 normal volunteers underwent RNA-seq processing within 24 hours after admission to the hospital. Data quality control and differential gene screening were performed based on R language (P < 0.01; log2FC ≥ 2). Gene function enrichment analysis was conducted on differentially expressed genes (DEGs). Then, target genes were submitted to STRING to constitute the PPI network, and GSE65682 was used to explore the prognostic relevance of potential core genes. Meta-analysis was used to verify the expression trends of core genes in the sepsis group. Then, cell line localization analysis of core genes in the 5 peripheral blood mononuclear cell samples (normal control = 2; systemic inflammatory response syndrome = 1; sepsis = 2) was performed. Results: A total of 1,128 DEGs were obtained between the sepsis and normal groups, of which 721 were upregulated and 407 downregulated. These DEGs were mainly enriched in leukocyte-mediated cytotoxicity, cell killing regulation, adaptive immune response regulation, lymphocyte-mediated immune regulation, and negative regulation of adaptive immune response. PPI network analysis results showed that CD160, KLRG1, S1PR5, and RGS16 were located in the core area, which are related to adaptive immune regulation, signal transduction, and intracellular components. The above four genes in the core area were found to be related to the prognosis of patients with sepsis, of which RGS16 was negatively correlated with the survival rate, and CD160, KLRG1, and S1PR5 were positively correlated. In addition, several public data sets showed that CD160, KLRG1, and S1PR5 were all downregulated in the peripheral blood of patients with sepsis, while RGS16 was upregulated in the sepsis group. Single-cell sequencing analysis showed that they were mainly expressed in NK-T cells. Conclusions: CD160, KLRG1, S1PR5, and RGS16 were mainly located in human peripheral blood NK-T cells. Patients with sepsis expressed lower levels of S1PR5, CD160, and KLRG1 and higher levels of RGS16. This suggests that they may be potential research targets for sepsis.


INTRODUCTION
Sepsis, a life-threatening organ dysfunction caused by a dysregulated host response to infection, is a biphasic disease whose progression has been shown to be closely related to systemic inflammatory responses and host immunosuppression (1,2). As one of the most common critical clinical diseases, sepsis is the leading cause of death in intensive care units (ICUs), and its high morbidity and mortality have attracted the attention of the World Health Organization because of the heavy financial burden it places on the medical system (3). The pathology of sepsis is complex, and new diagnostic and therapeutic strategies are urgently needed for early prevention and treatment to improve prognosis.
The core mechanism of sepsis is immune dysfunction, and the innate and adaptive immune systems play distinct roles. Recent studies have confirmed that the fast-acting innate immune system plays a greater role in preventing the rapid progression of sepsis than the adaptive immune system (4). At present, the morbidity and mortality of sepsis show increasing trends because of population aging, the growing use of immunomodulators to treat an increasing number of diseases, and immunosuppressive therapy in organ transplant recipients and cancer patients. Therapies that suppress inflammation help reduce the time that patients with sepsis spend in the ICU but do not reduce overall mortality. However, understanding the synthesis of inflammatory mediators, the role of cells in the host response, and their different effects on cells, tissues, or systems is key to developing medical treatments that modulate the immune system's response to pathogens (5). The immune function of patients with sepsis is poor: their susceptibility to pathogenic bacteria is increased, but the clearance rate is reduced, so secondary infections and organ dysfunction are easily induced. Studies of the immune status of patients who died of sepsis found that the levels of inflammatory factors and the numbers of immunocompetent cells in their serum were significantly reduced (6). Studies on septic mice also indicate that the antigen-presenting ability of pulmonary macrophages and dendritic cells is weakened, indicating an immunosuppressive state during sepsis. Studies have increasingly shown that immunodepression plays a crucial role in the pathophysiology of sepsis (7). Therefore, the focus of treatment has shifted to the stage of immunosuppression caused by immune cell apoptosis, and immune enhancement strategies have become a research hotspot in the treatment of sepsis (8,9).
Sepsis is related to immune response disorders caused by bacterial infection. However, the mechanism of systemic immune disorder caused by bacteria is unclear, and there are no effective targeted therapies (10,11). A major problem with previous sepsis sequencing or gene chip studies is that human peripheral blood is a mixture of a variety of nucleated cells. These technologies cannot attribute gene expression to specific cell lines, which makes it difficult to select appropriate cell lines for in-depth research. Studies have shown that the immunocytologic characteristics related to sepsis in the whole blood of patients with sepsis can be detected by single-cell genomics (12). Single-cell RNA sequencing (scRNA-seq) technology can be used to analyze the whole transcriptome of a large number of single cells and is widely used to identify the heterogeneity of the immune system in diseases (13,14). Recent studies have used scRNA-seq to detect renal injury in patients with sepsis and have shown that the failure of cell-cell communication is related to organ dysfunction in endotoxemia (15). Because the whole blood of patients with sepsis is composed of several types of nucleated cells, scRNA-seq can detect the gene expression of single cells and identify heterogeneous cell types in more samples (16).
This study intends to explore the core targets related to immune regulation in sepsis through sequencing and bioinformatics technology. To further clarify the cell line localization of the core target, a hybrid analysis strategy was adopted by mixing the single-cell sequencing results of several patients. Localizing the cell line of the core gene lays the foundation for subsequent in vitro experimental verification and functional research.

MATERIALS AND METHODS
Sepsis sample collection
Patients with sepsis (n = 23) who were admitted to the ICU/emergency intensive care unit of the Affiliated Hospital of Southwest Medical University from January 2019 to January 2020 were enrolled within 24 hours of admission, and normal volunteers were included as the control group (n = 10). Blood samples were collected from patients or volunteers using PAXgene tubes and stored in a freezer at −80°C in the biological sample bank of the Affiliated Hospital of Southwest Medical University. The admission diagnosis of patients was based on the Sepsis 3.0 criteria, namely, infection + quick SOFA ≥ 2. The exclusion criteria were as follows: (1) patients who were younger than 16 years or older than 65 years; (2) patients with an organ failure history; (3) patients with a history of blood diseases, such as immunodeficiency; and (4) patients who were unwilling to be enrolled in the study. Informed consent was signed by all participants or their relatives before inclusion in the study. This trial was approved by the ethics committee of the Affiliated Hospital of Southwest Medical University (ethics no: ky2018029), and the clinical trial registration number is ChiCTR1900021261.

Peripheral blood gene sequencing and filtration
Total RNA was extracted from blood samples by the TRIzol method (Invitrogen, Carlsbad, CA) and quantitatively analyzed by an Agilent 2100 (Thermo Fisher Scientific, MA). According to the manufacturer's instructions, the first step was the removal of ribosomal RNA (rRNA) using target-specific oligonucleotides and ribonuclease H reagent. After purification with SPRI beads, RNA was fragmented into small pieces with divalent cations at high temperature. The RNA fragments were copied into first-strand cDNA using reverse transcriptase and random primers, and second-strand cDNA was then synthesized using DNA polymerase I and RNase H. The quality and quantity of the library were evaluated by two methods: the distribution of fragment size was examined by an Agilent 2100 Bioanalyzer, and the library was quantified by real-time quantitative PCR (TaqMan probe). The qualified library was subjected to paired-end sequencing on a BGISEQ-500/MGISEQ-2000 system (BGI-Shenzhen, China). The raw sequencing data (lncRNA/mRNA, miRNA) were filtered using the filtering software SOAPnuke (https://github.com/BGI-flexlab/SOAPnuke), and the filtered clean reads were saved in FASTQ format.

Differential genetic screening
The expression matrix was normalized for quality control in R (edgeR: log2(CPM + 2)), and principal component analysis (PCA) was used to perform dimension-reduction analysis on the samples to exclude outlier samples and identify sample clusters with high similarity. The DESeq2 method was used to compare the two groups of data and screen out differentially expressed genes (DEGs). The difference threshold parameters were set as follows: P < 0.01; log2FC ≥ 2.
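The normalization and thresholding step can be illustrated with a minimal Python sketch. It is not the R/edgeR/DESeq2 pipeline used in this study: the function names and example values are hypothetical, the normalization takes the stated formula log2(CPM + 2) literally, and the fold-change cutoff is assumed to apply to the absolute value of log2FC because both upregulated and downregulated genes are reported.

```python
import math

def log2_cpm(counts, offset=2.0):
    """Convert raw counts of one sample to log2(CPM + offset).

    `counts` maps gene -> raw read count. The offset of 2 follows the
    formula quoted in the text, taken literally (an assumption).
    """
    library_size = sum(counts.values())
    return {
        gene: math.log2(count / library_size * 1e6 + offset)
        for gene, count in counts.items()
    }

def select_degs(results, p_cutoff=0.01, lfc_cutoff=2.0):
    """Split genes into upregulated and downregulated lists.

    `results` maps gene -> (log2 fold change, P value). A gene passes
    when P < p_cutoff and |log2FC| >= lfc_cutoff (assumed reading of
    the threshold).
    """
    up, down = [], []
    for gene, (log2fc, p_value) in results.items():
        if p_value >= p_cutoff or abs(log2fc) < lfc_cutoff:
            continue
        (up if log2fc > 0 else down).append(gene)
    return sorted(up), sorted(down)

# Hypothetical values for illustration only (not data from this study).
example = {
    "GENE_A": (2.6, 0.0004),
    "GENE_B": (-2.3, 0.002),
    "GENE_C": (1.1, 0.0001),  # fold change too small
    "GENE_D": (-3.0, 0.2),    # not significant
}
up, down = select_degs(example)
print("up:", up, "down:", down)
```

In practice the fold changes and P values would come from the DESeq2 output table rather than from a hand-written dictionary.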

Gene ontology analysis
Gene ontology (GO) is a method to classify and describe genes from the following three aspects: biological process (BP), cellular component (CC), and molecular function (MF). To further explore the functional enrichment of DEGs, R (version 3.6.3) was used to conduct GO analysis on the differentially expressed mRNAs (DEmRNAs), and P < 0.05 was considered statistically significant. In this study, we mainly focused on functions such as immune regulation and signal transduction to explore the correlation of immune regulation-related genes and visualized the results as a network.
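GO enrichment of a gene list is commonly assessed with an over-representation (hypergeometric) test. The sketch below shows that calculation in Python under stated assumptions: the text does not name the R package or test that was used, and the function name and all numbers in the example are hypothetical.

```python
from math import comb

def enrichment_p(n_background, n_in_term, n_degs, n_overlap):
    """One-sided hypergeometric P value for over-representation.

    Returns P(X >= n_overlap), where X is the number of DEGs annotated
    to a GO term when n_degs genes are drawn from n_background genes,
    of which n_in_term carry the annotation.
    """
    upper = min(n_in_term, n_degs)
    total = comb(n_background, n_degs)
    tail = sum(
        comb(n_in_term, k) * comb(n_background - n_in_term, n_degs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical example: 20,000 background genes, a GO term with 150
# genes, 1,128 DEGs, 25 of which belong to the term.
p = enrichment_p(20000, 150, 1128, 25)
print(f"enrichment P = {p:.3g}")
```

A term would be reported as enriched when this P value (usually after multiple-testing correction, which is not shown here) falls below the chosen cutoff.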

Protein-protein interaction analysis
To further screen out the potential core genes, STRING (https://cn.string-db.org/) was used to construct the protein-protein interaction network diagram for the related genes and to identify the genes located in the middle of the network. In theory, the closer a gene lies to the middle of the network, the more important its function and its connections with other genes. This study focused on the analysis of immune regulation and signal transduction, which would help researchers screen out core targets of the potential mechanism of sepsis.
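One simple way to make "located in the middle of the network" concrete is to rank nodes by degree, that is, by the number of distinct interaction partners. The sketch below assumes that an edge list has been exported from STRING; the text does not state which centrality measure, if any, was applied, and the node names and edges shown are placeholders rather than real interactions.

```python
from collections import defaultdict

def rank_by_degree(edges):
    """Rank nodes of an undirected network by number of neighbors.

    `edges` is an iterable of (node_a, node_b) pairs. Self-loops are
    ignored and duplicate edges are counted once. Ties are broken
    alphabetically.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue
        neighbors[a].add(b)
        neighbors[b].add(a)
    return sorted(
        ((node, len(partners)) for node, partners in neighbors.items()),
        key=lambda item: (-item[1], item[0]),
    )

# Placeholder edge list for illustration only.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
for node, degree in rank_by_degree(edges):
    print(node, degree)
```

Degree is only one possible definition of centrality; betweenness or closeness could give a different ranking on the same network.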
To reduce the false-positive rate of the potential core genes in this study, we downloaded other data sets on the same subject from the GEO public database and validated the core genes using a meta-analysis strategy. The data set screening criteria were as follows: (1) the peripheral blood was of human origin; (2) the method was gene chip or RNA-seq; (3) the age was 16 years or older and 65 years or younger; (4) patients with sepsis were selected as the experimental group, while normal people were selected as the control group; and (5) the total sample size was 20 cases or more. Poor-quality data sets were excluded after quality control of the raw data. Finally, five high-quality data sets (GSE28750, GSE54514, GSE69528, GSE95233, and GSE67652) were obtained and downloaded. If there were several values for a particular gene in a data set, the average value was used. Meta-analysis was performed on the expression levels of core genes, and a forest plot was constructed.
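The pooling step of a continuous-variable meta-analysis can be sketched as follows. This is an illustrative inverse-variance implementation with the DerSimonian-Laird estimate of between-study variance, which is an assumption: the text does not name the software or estimator that was used. The function name and the input values are hypothetical, and the heterogeneity P value (a chi-square test on Q with k − 1 degrees of freedom) is not computed here.

```python
import math

def pool_effects(effects, variances):
    """Pool per-study effect sizes with fixed and random effect models.

    `effects` are per-study effect sizes (for example, standardized
    mean differences) and `variances` their sampling variances.
    """
    k = len(effects)
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Cochran's Q measures heterogeneity between studies.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random effect weights add the between-study variance tau2.
    re_weights = [1.0 / (v + tau2) for v in variances]
    sum_re = sum(re_weights)
    random_effect = sum(w * y for w, y in zip(re_weights, effects)) / sum_re

    return {
        "fixed": fixed,
        "fixed_se": math.sqrt(1.0 / sum_w),
        "random": random_effect,
        "random_se": math.sqrt(1.0 / sum_re),
        "Q": q,
        "tau2": tau2,
    }

# Hypothetical effect sizes from five data sets (illustration only).
result = pool_effects([-0.8, -1.1, -0.5, -0.9, -1.4],
                      [0.10, 0.15, 0.08, 0.20, 0.12])
print(result)
```

A 95% confidence interval for either pooled estimate is the estimate ± 1.96 times its standard error, which is what each diamond in a forest plot represents.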

Survival curve
Survival curve analysis helps determine whether specific genes have important functions. To explore the prognostic function of potential core genes screened by the protein-protein interaction (PPI) method in patients with sepsis, the public data set GSE65682, which was submitted by Scicluna BP in 2015, was downloaded. It includes gene expression data and 28-day prognosis for more than 400 patients with sepsis. We extracted the relevant data and used GraphPad Prism (version 7.0, GraphPad Software USA, San Diego, California) to plot the survival curves. The log-rank test was used for statistical comparison, and P < 0.05 was considered statistically significant.
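The log-rank comparison of two survival curves can be written out in a few lines. The sketch below is a plain-Python illustration of the standard two-group test, not the GraphPad Prism procedure itself; the function name and the example follow-up data are hypothetical, and how patients were split into expression groups is not specified in the text.

```python
import math

def logrank_test(group_a, group_b):
    """Two-group log-rank test.

    Each group is a list of (time, event) pairs, where event is 1 for
    death and 0 for censoring. Returns (chi_square, p_value) with one
    degree of freedom.
    """
    event_times = sorted({t for t, e in group_a + group_b if e == 1})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for time, _ in group_a if time >= t)  # at risk in A
        n_b = sum(1 for time, _ in group_b if time >= t)  # at risk in B
        d_a = sum(1 for time, e in group_a if time == t and e == 1)
        d_b = sum(1 for time, e in group_b if time == t and e == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi_square = observed_minus_expected ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value

# Hypothetical follow-up data in days (illustration only).
low_expression = [(1, 1), (2, 1), (3, 1)]
high_expression = [(4, 1), (5, 1), (6, 1)]
chi_square, p_value = logrank_test(low_expression, high_expression)
print(f"chi-square = {chi_square:.3f}, P = {p_value:.4f}")
```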

Single-cell sequencing
Blood cells are a mixture of multiple cell lines. To clarify the cell line localization of specific target genes and lay a foundation for later in vitro functional research, 10× single-cell sequencing technology was used to explore the cell line localization of each target gene. The procedure followed the manufacturer's operation manual. In short, we collected blood samples from 5 cases (2 normal subjects, 1 in the SIRS group, 2 in the sepsis group) and sequenced them separately, and the expression data of all samples were then analyzed together for visualization. Raw reads generated by high-throughput sequencing were in FASTQ format. The 10× Genomics software CellRanger was used for quality statistics of the raw data, and the Seurat software package was used for further quality control and processing of the data. Linear dimension reduction was conducted by PCA based on gene expression levels, and the PCA results were visualized by tSNE (nonlinear dimension reduction). In addition, the specific genes identified were visualized by the FeaturePlot function.
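The per-cluster summary behind a bubble chart such as Fig. 6L can be sketched as follows: for one gene, each cluster is described by the fraction of cells expressing the gene and by its mean expression. This is a plain-Python illustration rather than the Seurat workflow used in this study; the function name, the definition of "expressing" as a value above zero, and the example cells are all assumptions.

```python
from collections import defaultdict

def summarize_by_cluster(cells):
    """Summarize one gene across clusters.

    `cells` is an iterable of (cluster_label, expression_value) pairs,
    one per cell. Returns, per cluster, the fraction of cells with
    expression above zero and the mean expression.
    """
    values = defaultdict(list)
    for cluster, expression in cells:
        values[cluster].append(expression)
    return {
        cluster: {
            "fraction_expressing": sum(1 for x in xs if x > 0) / len(xs),
            "mean_expression": sum(xs) / len(xs),
        }
        for cluster, xs in values.items()
    }

# Hypothetical normalized expression values (illustration only).
cells = [("NK", 2.0), ("NK", 0.0), ("T", 0.0), ("T", 0.0)]
summary = summarize_by_cluster(cells)
for cluster, stats in summary.items():
    print(cluster, stats)
```

In a bubble chart, the first value would set the bubble size and the second the bubble color.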

Statistical analysis
Routine measurement data of each group are expressed as the mean ± standard deviation and were analyzed by t test; P < 0.05 was considered statistically significant. The original RNA-seq data were compared after logarithmic conversion. The log-rank test was used for survival curves, and core genes were verified by continuous variable meta-analysis.

RESULTS
Clinical information of patients with sepsis
In this study, peripheral blood samples were collected from 23 patients with sepsis and 10 normal volunteers. Patients with sepsis were admitted according to the Sepsis 3.0 standard, and they all experienced dysfunction of two organs. This study collected patient sex, age, white blood cell (WBC), direct bilirubin, creatinine, prothrombin time, and other clinical organ function damage data. According to the clinical data in Table 1, there were no appreciable differences between the septic patients and the healthy controls in terms of age, sex, platelet counts, alanine transaminase, aspartate aminotransferase, or creatinine, but significant differences were found in WBC counts, neutrophil counts, neutrophil/lymphocyte ratios, and hemoglobin. In addition, the sepsis group had significantly higher concentrations of procalcitonin, prothrombin time/international normalization ratio, and lactic acid. Twelve of the sepsis cases (52%) had positive blood cultures, and gram-negative bacteria (8, 66%) were the most common pathogens. Moreover, surgical sites made up the largest share of primary infection sites (10, 43%). The Charlson Comorbidity Index, which is used to predict 10-year survival in individuals with multiple comorbidities, was significantly higher in the sepsis group than in the healthy controls.

Differential gene screening
The expression values of the two groups of samples were standardized and then subjected to dimension-reduction processing by PCA. The two groups of samples were distinguished well from each other (Fig. 1A). Upon comparing the genes in the peripheral blood of the two groups, 1,128 genes were identified as differentially expressed in the sepsis group. Compared with the normal group, 721 DEGs in the peripheral blood of the sepsis group were upregulated, and 407 were downregulated (Fig. 1B). Among them, the expression of S1PR5, CD160, and KLRG1 was downregulated in the sepsis group, while the expression of RGS16 was upregulated in the sepsis group.

Functional enrichment
This study mainly explored the BP of sepsis from the aspect of immunoregulation function. The GO enrichment results showed that terms such as negative regulation of the adaptive immune response, regulation of lymphocyte-mediated immunity, regulation of cell killing, and leukocyte-mediated cytotoxicity were statistically significant (Fig. 2A). Functional interconnection analysis showed that many target proteins were shared among the abovementioned functions, which were closely linked (Fig. 2B).

Protein-protein interaction core gene screening
Based on the abovementioned gene clusters with immunoregulation and other related functions, this study showed that many genes, such as BCL6, IL10, IL2RB, CD160, CD247, KLRG1, S1PR5, and RGS16, were located in the central area of the PPI network through protein interaction analysis. Functional enrichment also showed that these factors were related to the adaptive immune response, cell surface receptor signaling pathway, signal transduction, and other functions, which was largely consistent with the subject of this study (Fig. 3A). The results showed that CD160, KLRG1, S1PR5, and RGS16 were located in the intrinsic component of the membrane and were involved in signal transduction. CD160, KLRG1, and S1PR5 were downregulated in the sepsis group, whereas IL10, BCL6, NFIL3, and RGS16 were upregulated in the sepsis group (Fig. 3B).
FIG. 1. [...] The results showed that the mean value of each sample was essentially at the same level and was comparable between the two groups. The ordinate is the logarithm value of gene expression. B, Volcano plot of differential gene distribution. Each point represents a gene; blue points represent downregulated genes, and red points represent upregulated genes. The abscissa is the logarithm of the fold difference, and the ordinate is the negative logarithm (base 10) of the corrected P value for the significance of the difference.
FIG. 4. Core gene meta-analysis verification. Meta validation of core genes was conducted based on multiple sepsis data sets in the GEO database. If the heterogeneity test was P ≥ 0.05, the fixed effect model was selected; if P < 0.05, the random effect model was selected for testing and statistical analysis. CD160 (A), KLRG1 (B) and S1PR5 were downregulated in the sepsis group, while RGS16 showed an increasing trend in the sepsis group, with no significant difference (C).

Core gene expression meta validation
To compensate for the small sample size of these sequencing data, other sepsis-related data sets in the GEO database were used for meta-analysis verification. The random effect model showed that the expression of CD160, KLRG1, and S1PR5 decreased in the sepsis group (Fig. 4, A-C). RGS16 showed an increasing trend in the sepsis group in most of the data sets, but the difference was not statistically significant (Fig. 4D).

Relationship between core genes and prognosis
Analysis of the clinical prognosis data of patients with sepsis in the GSE65682 data set showed that, among the abovementioned candidate core genes, the expression values of CD160, KLRG1, and S1PR5 were positively correlated with the survival rate of patients with sepsis (Fig. 5, A, B, D). The expression of the RGS16 gene was negatively correlated with the survival rate of patients with sepsis (Fig. 5C).

Localization of core genes in single cell lines
Human blood samples are a mixture of multiple cell types. Single-cell sequencing technology helps clarify the expression and localization of core genes in cell lines. According to PCA dimension-reduction processing, the cells were divided into nine cell modules. After identification by general markers, groups 1, 2, 6, and 8 were T-cell lines; group 4 was the NK-cell group; groups 3 and 5 were monocyte populations; and group 7 was the B-cell group (Fig. 6A). S100A12 is a common biomarker of monocytes, which confirmed that groups 3 and 5 were monocyte lines (Fig. 6B). CD3G is a biomarker of the T-cell line and was located in groups 1, 2, 6, and 8 (Fig. 6C). HOPX was mainly located in group 4 (Fig. 6D). The violin diagrams show the expression distribution of the S100A12, CD3G, and HOPX markers across cell lines (Fig. 6, E-G). Single-cell sequencing and location analysis showed that, of the four core genes, CD160 and S1PR5 were mainly expressed in NK cells, whereas KLRG1 and RGS16 were mainly expressed in T cells (Fig. 6, H-L). These results lay a foundation for follow-up in vitro experiments.

DISCUSSION
At present, the difficulty in the diagnosis and treatment of sepsis relates to the lack of mechanistic understanding of its occurrence, development, and core targets related to prognosis; thus, accurate targeted treatment cannot be carried out. This study was committed to screening out the potential core targets, and 1,128 DEGs were screened by RNA-seq technology and clinical prognosis information. The functions of the DEGs were analyzed, and the genes related to immunoregulation and signal transduction were examined. Four potential core genes were screened out by PPI and survival curves: CD160, KLRG1, S1PR5, and RGS16. These genes are located in the core area of the network and may be potential core targets. These core genes are expressed in NK-T cells and may be associated with the prognosis of patients with sepsis. These findings suggest that these potential targets are worthy of further research, which may provide clues for future therapeutic targets.
Immunoregulation and signal transduction play important roles in the occurrence and development of sepsis. Sepsis can manifest as the controllable release of inflammatory mediators or the dynamic regulation of the immune system. Which genes or factors are involved, and which of them are beneficial or harmful, remain to be answered with the help of modern research technologies, such as RNA-seq and single-cell sequencing. In this study, four potential targets were selected through big data analysis and bioinformatics analysis.
CD160 is a glycosylphosphatidylinositol-anchored transmembrane glycoprotein that was first found on the surface of NK cells and is a member of the immunoglobulin superfamily (17). Studies have shown that CD160 plays a role in various cancers, chronic viral diseases, malaria, paroxysmal nocturnal hemoglobinuria, atherosclerosis, autoimmune diseases, skin inflammation, acute liver injury, and retinal vascular diseases (18). In recent years, studies have shown that CD160 positively regulates CD8+ T cells during chronic virus infection and delays the disease progression of patients with chronic HIV-1 infection (19). In our study, CD160 expression was significantly reduced in peripheral blood, which may reflect suppressed immune function and thus contribute to the death of septic patients.
FIG. 6. Single-cell sequencing diagram used to locate the core gene cell lines. A, Two-dimensional general diagram of tSNE after PCA dimension reduction. Each dot indicates a cell, where groups 1, 2, 6, and 8 are T-cell lines; group 4 is an NK-cell line; groups 3 and 5 are monocyte lines; and group 7 is a B-cell line. B, The distribution of S100A12 in the general diagram. S100A12 is the monocyte line surface marker. C, The distribution of CD3G in the general diagram. D, HOPX is a marker of NK cells. E-G, The expression and distribution of S100A12, CD3G, and HOPX in human blood PBMCs. H-K, tSNE visualization of KLRG1, CD160, S1PR5, and RGS16. L, The gene expression bubble chart, with the color representing the relative expression level; red indicates high expression, and blue indicates low expression. The bubble size indicates the proportion of expression across cell lines. PCA, principal component analysis; PBMC, peripheral blood mononuclear cell.
The coinhibitory receptor killer cell lectin-like receptor G1 (KLRG1) is expressed on NK cells and antigen-experienced T cells and is considered to be a marker of aging. The high expression of KLRG1 on tumor cells can induce the activation of immune cells. One study showed that the expression of KLRG1 is positively correlated with the overall survival rate of patients with lung adenocarcinoma (20). This result is consistent with our findings for KLRG1 in sepsis.
Sphingosine-1-phosphate receptors (S1PRs) are involved in a variety of cellular and physiological activities, including lymphocyte/hematopoietic cell trafficking (21). In a study of COPD, it was found that the expression of S1PR5 is significantly correlated with the phagocytosis of alveolar macrophages, which can be used as a strategy for macrophage-targeted treatment of COPD and other chronic inflammatory lung diseases (22). Our study also suggests that increased expression of S1PR5 may be associated with an increased survival rate in patients with sepsis.
The regulator of G protein signaling (RGS) protein superfamily negatively controls the G protein-coupled receptor signal transduction pathway. It is a key factor in the G protein-mediated activation of lymphocytes and regulates the inflammation and survival response of various cells. In a study of glioma, an increase in RGS16 expression levels was significantly correlated with poor prognosis (23). In a study using the human monocyte line THP-1 as an in vitro model, RGS16 was found to limit the proinflammatory spectrum induced by the activation of myelocytes (24). In our study, we also found high expression of RGS16 in the peripheral blood of patients who died from sepsis.
How these prognostic genes regulate the progression of sepsis and the mechanism by which immune regulation shapes the inflammatory response require further study. This study focused on immunoregulation and signal transduction, screened out the potential core targets, and conducted a prognosis correlation analysis and cell line localization to lay an experimental foundation for further mechanistic research. However, the number of sequencing samples in this study was relatively small, and there may have been false-positive results. A further limitation is that this study is observational: it provides clues for future research, but the functions of the target genes have not been verified in depth.
Author contributions: YT performed the experiments, analyzed experimental results, and wrote the manuscript. LW performed the experiments, designed the study, and wrote the manuscript. WC prepared Figures 1-6 and Table 1. WZ and YH designed the study and revised the manuscript. All authors read and approved the final manuscript.