Differential Expression of Long Noncoding RNAs in Peripheral Blood Leukocytes of Patients with Schizophrenia

Background: The diagnosis of schizophrenia (SCZ) depends on the evaluation of clinical symptoms, but there is no objective biomarker. Studies have shown that long non-coding RNAs (lncRNAs) may be involved in the pathogenesis of SCZ. In this study, we evaluated the differential gene expression of lncRNAs and mRNAs in peripheral blood (PBL) of patients with SCZ. Methods: We proled the transcriptome of PBL in 50 patients with SCZ and 50 controls without psychiatric diagnoses, using RNA-seq. LncRNA-mRNA interactions were predicted using “RNAplex”, a hierarchical classication-Spielman correlation coecient approach was used to analyze the correlation between lncRNAs and protein-coding gene expression among samples, and systematic bioinformatic methods (Go/Pathway) were used to perform lncRNA functional annotation. The results were validated using qPCR. Functional sites in sequences were predicted using the PROSITE, NCBI, UCSC, and JASPAR databases. Results: We identied 94 lncRNAs and 1179 mRNAs differentially expressed in PBL, of which 46 lncRNAs were identied for the rst time. Enrichment of lncRNAs involved biological processes and signaling pathways related to neutrophil activation involved in the immune response. Spearman correlation coecient analysis showed that 81 lncRNAs and 410 mRNAs had correlated expression (p < 0.01 and |r| ≥ 0.4). qPCR performed on independent samples veried that the core node of the lncRNA-mRNA co-expression network, the IL1RAP-TCONS_00138311 variable splicing transcript, was highly expressed in patients with SCZ (2 △△ Ct of 0.56, area under the ROC curve of 0.924). The top four ranked transcription factors were predicted to be HSF1, HSF2, HSF4, and FOXA1. Conclusions: Combined with sequence function analysis, we showed that the transcription factors FOXA1, HSF1, HSF2, and HSF4 may mediate the activation of IL1B-induced NF-kβ pathway and other inammatory pathways through regulation of the IL1RAP

Background Schizophrenia (SCZ) is a serious mental disorder of uncertain etiology, with a high incidence of morbidity, recurrence, and disability, as well as a lifetime prevalence of approximately 1% in the adult population [1].
SCZ has a high prevalence of residual symptoms, with approximately 10-15% of patients requiring lifelong care [2]. Till date, SCZ is diagnosed empirically by clinicians, and there are no objective laboratory indicators. Elucidating the etiology and pathological mechanism of SCZ, identifying clinical diagnostic biomarkers, and developing targeted drug therapy have valuable application and signi cance.
Advances in genome-wide association studies (GWAS) indicate that the occurrence of SCZ is affected by a combination of genetic and environmental factors [3]. Recent studies have shown that long non-coding RNAs (lncRNAs) may be involved in the pathogenesis of SCZ [4]. LncRNAs play important roles in many life processes, such as dose compensation, epigenetic regulation, cell cycle regulation, regulation of cell differentiation, transcription activation, and transcription interference [5]. Pro-in ammatory cytokines, such as IL-6, TNF-α, and IFN-γ, have been reported to be signi cantly elevated in SCZ; in contrast, lncRNAs, such as TMEVPG1, enhance the expression of IFN-γ, and NRON regulates the transcription factor NFAT, thereby regulating IL-6 and IFN-γ expression. Conversely, these in ammatory cytokines can stimulate and alter lncRNA expression. LncRNAs are mostly located in the nucleus and are highly expressed in the mammalian brain [6]. As brain tissue and peripheral blood (PBL) gene expression have a common regulatory pathway [7], multiple cytokines secreted in the brain are also present in peripheral blood leukocytes. Therefore, many lncRNAs that are highly expressed in brain tissue are highly expressed in PBL as well. It is noteworthy that the detection of lncRNA expression levels has high feasibility and reliability. In this study, we analyzed the transcriptome of PBL to evaluate potential lncRNA markers of SCZ.

Study Samples
Blood samples used for sequencing were collected from patients in the psychiatric inpatient or outpatient departments of the Sixth A liated Hospital of Kunming Medical University, the Second People's Hospital of Honghe Prefecture, and the Second People's Hospital of Yuxi City. Patients (50 cases) in the test groups conformed to the diagnosis of SCZ according to the fourth edition of the American Diagnostic and Statistical Manual of Mental Disorders (DSM-IV). Patients with SCZ were at the initial onset or did not take antipsychotic drugs in the ve weeks before enrollment, belonged to Han ancestry, and were aged 15 to 60 years. A control group was recruited at the same conditions. RNA Sequencing of PBL After drawing 5 ~ 10 mL of venous blood into tubes containing the anticoagulant EDTA, centrifugation was performed within 60 min of collection. Total RNA was extracted using the TRIZOL Reagent (Ambion®), and the VAHTS Total RNA-seq (H/M/R) Library Prep Kit for Illumina® (NR603) kit was used for RNA puri cation and library construction. The constructed sequencing library was used for RNA sequencing (RNA-seq) in Illumina HiSeq TM 4000 (China, Illumina). Illumina HiSeq TM4000 is currently the only expression pro le chip that can achieve 30 repetitions of the probe, and the chip results and qPCR show a correlation coe cient R 2 = 0.97 [8]. RNA-seq compared with the other sequencing techniques has a broader dynamic range, provides a better estimate of relative expression levels of any genomic region with higher technical reproducibility, and facilitates the detection of alternative splicing [9]. With these advantages, RNA-seq has been used to reconstruct the entire transcriptome of an organism [10]. The novel lncRNA predictions for the selected new transcripts, for the coding capacity of new transcripts using CPC, CNCI, and SwissProt, and for the intersection of these transcripts without coding potential and protein annotation information were considered reliable.
Gene Co-Expression Network Construction Differential analysis of lncRNA expression between groups was performed using edgeR [11]. Differential transcripts were selected using FDR and log2FC, and the ltering conditions were FDR < 0.05 and |log2FC| > 1. As lncRNA expression is low compared to mRNA levels, a hierarchical classi cation-Spielman correlation coe cient approach was used to analyze the correlation between lncRNA and protein-coding gene expression among samples. The Cytoscape software was used to build the co-expression network.

LncRNA Action Modes
The action modes of lncRNAs involved in regulation are divided into three categories: antisense, cis, and trans [12]. We predicted target genes by the modes of action, in which each lncRNA is likely to be involved in the following manner: Antisense action target gene prediction was performed using RNAplex to nd short interactions between two long-stranded RNAs and to nd complementary binding between antisense lncRNAs and mRNAs. Cis action target gene prediction was performed by annotating lncRNAs within 10 kb upstream or downstream of a gene, and analyzing their potential to intersect with regions where cis-acting elements are located. Trans-action target gene prediction was performed using Pearson's correlation coe cient method to analyze the correlation of lncRNAs and protein-coding gene expression between samples, and protein-coding genes with absolute correlation values greater than 0.999 were selected.
Targeted mRNA Enrichment Analysis (Go/Pathway) Using the DAVID database (V.6.8) and the "clusterPro ler" R package [13], Gene Ontology (GO) enrichment analysis of 410 differentially expressed mRNAs was performed to annotate the function of lncRNAs [14].
Signaling pathways were also annotated using the Kyoto Encyclopedia of Genes and Genomes (KEGG) [15].

Quantitative Polymerase Chain Reaction (qPCR)-based Validation of LncRNAs
The selected core lncRNAs were subjected to qPCR analysis for validation. The ltering conditions were FDR < 0.05, |log2FC| > 1, FPKM > 0.5, and GO/Pathway enrichment analysis of functional lncRNAs. β-Actin was used as an internal reference gene to compare the control group (CK) and test group (T), using the 2 △△Ct method. The lncRNA expression rates between the CK and T groups were compared using the independent samples t-test. ROC curve statistical test was used to determine whether lncRNAs can be used as a possible diagnostic biomarker for SCZ. The SPSS 22.0 software was used to analyze and process the data, and statistically signi cant differences are indicated by p < 0.05.

Functional Analysis of qPCR-validated LncRNAs
Functional site prediction in qPCR-validated transcripts was performed using the PROSITE database [16]. If the candidate sequence was found to have no transcription or translation function or was not itself a transcription factor, then an NCBI blast search was performed to predict whether the lncRNAs are transcripts of target genes and their homologues [17]. A conservative analysis was then performed via the UCSC database [18], and transcription factors that bind to lncRNAs were searched. Finally, the JASPAR database was used to predict transcription factors and binding sites [19], selecting transcription factors with the same direction of transcription.

Clinical Characteristics of Patients that Provided Sequencing Samples
Statistical differences were evaluated in sex and age between the CK (n = 50) and T (n = 50) groups. Analysis using the independent samples t-test showed that the mean age ± standard deviation of the T and CK groups were 35.56 ± 10.04 and 32.52 ± 8.04, respectively, and had no signi cant difference (p > 0.05). There was no signi cant difference in gender composition (p > 0.05).

RNA-seq Quality Assessment
RNA quality of the sequencing samples was analyzed using an Agilent 2100 Bioanalyzer. The OD 260 /OD 280 values were all between 1.8-2.0 (Fig. 1a). The two-sample independent repeated correlation is shown in Fig. 1b, which shows that the quality of the isolated RNA met the requirements of sequencing. The Euclidean distance algorithm was used to calculate the expression distance of each sample gene, the Pearson algorithm was used to calculate the distance between the samples, and then a cluster map was established according to the distance (Fig. 1c). A total of 5106 new lncRNA transcripts were obtained by CPC, CNCI, and SwissProt prediction (Fig. 1d). Principal component analysis of sequencing samples showed that the PC1 principal component was 79.4%, PC2 was 9.3%, and that PC1 + PC2 could distinguish 88.7% of the overall variance (Fig. 1e).
A total of 1179 differentially expressed mRNAs were obtained (Fig.   2b, 2d), of which 702 were upregulated and 477 downregulated. CNOT2, RNF19B, MECP2, TNFAIP3, and ALDO were the top ve upregulated mRNAs (|log2FC| > 9) and AHR, ARMC8, HBP1, ERVFC1-1, and COTL1 were the top ve downregulated ones (|log2FC| > 7). Heat maps of the expression patterns of 94 lncRNAs and 1179 mRNAs showed distinct group information after clustering between groups. The SCZ group was clearly distinguished from the normal group, suggesting that the differentially expressed genes may serve as potential biomarkers of SCZ.
The obtained data and Cytoscape software were used to construct a lncRNA-mRNA co-expression network, and independent individuals were excluded.
Some core lncRNAs, such as PDCD1, MEG3, LINC00599, PSMA3-AS1, CLSTN2-AS1, AC008892.1, and XLOC078139, which target multiple mRNAs, were identi ed in the co-expression network. The core node IL1RAP was also found to be linked to several genes, suggesting that it could be a candidate gene for late functional validation.

LncRNA Action Modes
Target gene prediction of lncRNA mode of action yielded one lncRNA with an antisense interaction with mRNA, 22 lncRNAs with cis action, and 94 trans-acting lncRNAs (Fig. 4). Among them, XLOC_079942, XLOC_081142, LINC00958, LINC00599, LINC01052, CLSTN2-AS1, AL034397.3, and LINC01287 had more trans-acting association with mRNAs as members of the core node. IL1RAP was associated with multiple lncRNAs. The results showed that lncRNAs mainly affected mRNA expression in a co-expression manner, and thus participated in gene expression regulation and transcription.

Results of GO/Pathway Enrichment Analysis
GO/Pathway enrichment analysis of 410 target mRNAs was performed and ltered at p < 0.05. A total of 271 mRNAs were enriched in biological process (BP) terms, with the rst four items being signi cantly related to neutrophil activation, neutrophil degranulation, neutrophil activation involved in immune response, neutrophil-mediated immunity, and leukocyte migration (all related to immune response) (Fig. 5a, 5b). Twenty-three mRNAs were enriched in the following cellular component terms: speci c granule lumen, secret granule lumen, and cytoplasmic vesicle lumen (Fig. 5c, 5d). A total of 16 mRNAs were enriched in molecular function (MF) terms, including protein binding, transcription factor binding, and transcription regulatory region DNA binding (Fig. 5e, 5f). The rst four of these BPs share the genes CAPN1, PLAU, OLR1, CTSD, ABCA13, LCN2, MS4A3, and TCN1, and are all involved in the pathogenesis of neuropsychiatric disorders.
The KEGG enrichment results also showed that the main enrichment pathways for targeted mRNAs were concentrated in immune response, such as C-type lectin receptor, NOD-like receptor, TNF, and the NFkappa β signaling pathways (Fig. 6a-6d).

qPCR Validation Results
The mean Ct values of β-actin obtained in PBL samples from CK and T groups were 15.44-15.24, and the CV% was 3.38-3.05.
We performed qPCR experiments with the lncRNAs selected by bioinformatic analysis to meet the conditions of FDR < 0.05, |log2FC| > 1, and differential expression of FPKM > 0.5 between groups. The experimental results were as follows: IL1RAP-TCONS_00138311 was downregulated 0.56-fold in the T group relative to the CK group (p < 0.05) (Fig. 7a, Table 1), and CCR3-TCONS_00134168 was upregulated 4.68-fold in the T group relative to the CK group (p < 0.05) (Fig. 7b, Table 1). The sequencing results were considered accurate and credible, and were subjected to further analysis and functional veri cation.
Taking IL1RAP-TCONS_00138311 and CCR3-TCONS_00134168 as test variables, and the CK and T groups as static variables, a ROC curve was constructed. The results showed that the expression of IL1RAP-TCONS_00138311 and CCR3-TCONS_00134168 in the PBL leukocytes of patients with SCZ had certain effects on the predicted value (p < 0.05). Using IL1RAP-TCONS_00138311 as a diagnostic indicator, AUC = 0.924, 95% of the con dence interval was 0.873-0.974, and the Jorden index reached its maximum value of 0.74 when FPKM = 15.48 in PBL, at which point the sensitivity was 80.0% and the speci city 94.0% (Fig. 7c, Table 1). Principal component analysis, using IL1RAP-TCONS_00138311 sequencing expression value, showed that the PCA1 + PCA2 components explained 24.3% of the overall variance (Fig. 7d).

IL1RAP-TCONS_00138311 Sequence Functional Analysis
The selected transcripts were predicted using the CPC, CNCI, and SwissProt databases, and none had coding potential or protein annotation information. The IL1RAP-TCONS_00138311 transcript sequence was used for an NCBI blast search, with Query Length 1311, bits to 1023 length, and matches to Sequence ID: NG_029105.2 (43715-44737, Score = 1890, Expect = 0.0, Identities = 100%, Gaps = 0%), and was identi ed as a new IL1RAP transcript. Using the upstream 2000 bp and the downstream 100 bp sequences as the gene promoter region, a conservative analysis through the UCSC database identi ed transcription factors that can bind to the lncRNA (Fig. 8a). Finally, using the JASPAR database, with score set to 500 and relative pro le score threshold set at 90%, the transcription factors in the same transcription direction as IL1RAP-TCONS_00138311 were selected. The top four ranked transcription factors were predicted to be HSF1, HSF2, HSF4, and FOXA1, and the binding sites are shown in Fig. 8b.

Discussion
Our results show that lncRNAs mainly affect mRNA expression in a co-expression manner and thus participate in gene expression regulation and transcription. IL1RAP, identi ed as a core node, was also found to be functionally linked to several genes. The differential expression of the variable splicing transcript TCONS_00138311 of IL1RAP was veri ed using qPCR, and a higher ROC area of 0.924 could speci cally distinguish between patients with SCZ and healthy individuals.
The GO/Pathway analyses showed 271 transcripts enriched in BP terms. Among them, the rst four of these BPs shared the genes CAPN1, PLAU, OLR1, CTSD, ABCA13, LCN2, MS4A3, and TCN1, with all of them involved in the pathogenesis of neuropsychiatric disorders. OLR1, CTSD, and MS4A3 are involved in the occurrence and development of neurodegenerative diseases. Interaction between OLR1 and the LDL component, L5, inhibits the nerve growth factor-induced activation of the PI3k/Akt cascade and suppresses cell viability and neuronal differentiation induced by nerve growth factor [20]. CTSD is expressed at high levels in almost all tissues of the brain and is essential for maintaining a steady state of lysosome-dependent proteins in this organ [21]. Defective or reduced activity of CTSD expression can lead to pathological protein aggregation in Parkinson's disease and Alzheimer's disease (AD) [22]. The MS4A3 gene encodes a member of the transmembrane 4A gene family that exhibits a unique expression pattern in hematopoietic cells and non-lymphoid tissues, which may play a role in signal transduction and may function as a subunit associated with receptor complexes. Polymeric SNP association analysis of this gene showed a correlation with AD genes and cognitive decline [23]. CAPN1, PLAU, and TCN1 are involved in neuroprotection. CAPN1 calpain is a calcium-dependent protease, which is related to synaptic plasticity and neuroprotection in the mammalian brain [24] and regulates Alzheimer's disease-related protein signaling pathways [25]. PLAU is a urokinase-type plasminogen activator ligand that promotes neuroprotection with the receptor PLAUR or its plasma membrane-binding partner platelet-derived growth factor receptor beta (PDGFRβ) [26]. The TCN1 gene encodes a member of the vitamin b12-binding protein family, which is a major component of secondary granules in neutrophils. Vitamin B12 de ciency disrupts neurodevelopment during pregnancy and throughout childhood [27], and adequate levels of this vitamin are also necessary for adult neurocognitive function [28]. Elevated TCN1 expression is signi cantly associated with poorer memory [29]. ABCA13 and LCN2 are signi cantly associated with schizophrenia. The ABCA13 gene encodes a transmembrane transport protein, with a C-terminal motif involved in lipid transport in lipid metabolism [30], and dysregulation of lipid metabolism is strongly associated with psychiatric disorders, with ABCA13 increasing the susceptibility to SCZ, bipolar disorder, and major depression [31]. LCN2 plays an important role in mediating the in ammatory activity by binding to chemotactic peptides, leukotrienes, and platelet-activating factors [32], and is signi cantly increased in SCZ [33]. Proteins, such as TNIP1 and IL1B, regulate NF-kβ activation and play a role in autoimmunity and tissue homeostasis; MGST1, CAMP, TNFRSF14, and FOS are involved in cell chemotaxis, immune mediator induction, and in ammatory response regulation.
The KEGG enrichment results also showed that the main enriched pathways for targeted mRNAs were concentrated in the immune response, such as C-type lectin receptor, NOD-like receptor, TNF, and the NF-kappa β signaling pathways (Fig. 6a-6d). The nuclear factor-kβ (NF-kβ) complex provides direct transcriptional control of interferon-induced transmembrane protein (IFITM) and cytokines [34][35][36], which are overexpressed in SCZ [37]. In the prefrontal cortex (PFC) of schizophrenic subjects, the mRNA levels of two NF-kβ family members, NF-kβ1 and NF-kβ2, increased, whereas the mRNA levels of NF-kβ sitebinding protein Schnurri, which inhibits the function of NF-kβ-2, were low [36]. These ndings suggest that higher NF-kβ transcriptional activity may play an important role in elevating IFITM and cytokine mRNA levels in the PFC of patients with SCZ.
IL1RAP is an essential protein for the IL1 receptor [38], regulating the IL1B-induced activation of the NF-kβ pathway and other in ammatory pathways, which is closely related to SCZ. In addition, the IL1B in ammatory pathway plays an important role in the development of SCZ [39]. There are currently four main types of IL1RAP transcripts and protein isomers. Among them, mIL1RAP and sIL1RAP are the main types, with different molecular structures, tissue distribution, and biological functions [40,41]. mIL1RAP has extracellular, transmembrane, and intracellular segments, which affect IL1B downstream signaling and activating transcription and translation of IL1B effector genes [42]. sIL1RAP only has the extracellular segment that can bind to IL1B, as absence of an intracellular segment prevents activation of the IL1B pathway.

Conclusions
In this RNA-seq study, we used more samples for whole transcriptome sequencing and screened the differentially expressed lncRNAs, cicrRrna, microRNAs, and small RNAs in the PBL of patients with SCZ. Among them, the variable splicing transcript TCONS_00138311 of IL1RAP caught our attention. IL1RAP is an essential protein for the IL1 receptor, regulating the IL1B-induced activation of the NF-kβ pathway and other in ammatory pathways, which is closely related to SCZ. The limitation of this research is that the relevant discussions only focus on the bioinformatics analysis stage. Although the differential expression of the target gene is veri ed at the molecular level, there is still a lack of systematic molecular biology and zoology experimental veri cation; but the relevant research results are also provides precise directions for subsequent experiments, and we are also conducting cell-level experiments based on the results of this research. Hence, the next step we aimed to investigate the regulation of IL1RAP variable splicing by the transcription factors FOXA1, HSF1, HSF2, and HSF4, to clarify whether IL1RAP aberrant variable splicing is involved in the pathogenesis of SCZ, or whether the IL1RAP-TCONS_00138311 new transcripts induce the activation of NF-kβ pathway and other in ammatory pathways through IL1B, and thus participate in the development of SCZ (Fig. 8c)

Consent for Publication
Not applicable.

Availability of Data and Materials
The datasets used and analyzed during the current study are available from the corresponding author upon reasonable request.

Competing Interests
The authors declare that they have no competing interests.    Predicted lncRNA and mRNA co-expression network in schizophrenia. The co-expression network was established between the 81 signi cantly differentially expressed lncRNAs and 410 signi cantly differentially expressed mRNAs that had Spearman correlation coe cients (p < 0.01 and |r| ≥ 0.4). Red through blue colors indicate high to low expression levels. Purple through green colors indicate high to low correlation levels.

Figure 4
LncRNA-mRNA mode of action diagram. Diamonds and circles represent lncRNAs and mRNAs, respectively (colors from blue to red represent correlation size from -1 to 1); green, black, and purple lines indicate antisense, cis, and trans modes of action, respectively.