Clinical Characteristics of Patients that Provided Sequencing Samples
Statistical differences were evaluated in sex and age between the CK (n = 50) and T (n = 50) groups. Analysis using the independent samples t-test showed that the mean age ± standard deviation of the T and CK groups were 35.56 ± 10.04 and 32.52 ± 8.04, respectively, and had no significant difference (p > 0.05). There was no significant difference in gender composition (p > 0.05).
RNA-seq Quality Assessment
RNA quality of the sequencing samples was analyzed using an Agilent 2100 Bioanalyzer. The OD260/OD280 values were all between 1.8-2.0 (Fig. 1a). The two-sample independent repeated correlation is shown in Fig. 1b, which shows that the quality of the isolated RNA met the requirements of sequencing. The Euclidean distance algorithm was used to calculate the expression distance of each sample gene, the Pearson algorithm was used to calculate the distance between the samples, and then a cluster map was established according to the distance (Fig. 1c). A total of 5106 new lncRNA transcripts were obtained by CPC, CNCI, and SwissProt prediction (Fig. 1d). Principal component analysis of sequencing samples showed that the PC1 principal component was 79.4%, PC2 was 9.3%, and that PC1 + PC2 could distinguish 88.7% of the overall variance (Fig. 1e).
Gene Co-Expression Network Construction
A total of 94 differentially expressed lncRNAs and mRNAs with FDR < 0.05 and |log2FC| > 1 were identified (Fig. 2a, 2c), of which 70 were upregulated and 24 downregulated. Forty-six new lncRNAs were identified for the first time, of which XLOC_079942, LINC02362, ACTA2-AS1, AL162431.2, and CLSTN2-AS1 were the top five upregulated lncRNAs (|log2FC| > 5), and AL390763.2, XLOC_078139, XLOC_043880, AC008892.1, and AL034397.3 were the top five downregulated ones (|log2FC| > 4).
A total of 1179 differentially expressed mRNAs were obtained (Fig. 2
d), of which 702 were upregulated and 477 downregulated. CNOT2, RNF19B, MECP2, TNFAIP3, and ALDO were the top five upregulated mRNAs (|log2FC| > 9) and AHR, ARMC8, HBP1, ERVFC1-1, and COTL1 were the top five downregulated ones (|log2FC| > 7). Heat maps of the expression patterns of 94 lncRNAs and 1179 mRNAs showed distinct group information after clustering between groups. The SCZ group was clearly distinguished from the normal group, suggesting that the differentially expressed genes may serve as potential biomarkers of SCZ.
According to Spearman correlation coefficient analysis, 81 lncRNAs and 410 mRNAs showed correlated expression (p < 0.01 and |r| ≥ 0.4), 1152 mRNAs were positively correlated with lncRNAs, and 300 mRNAs were negatively correlated with lncRNAs (Fig. 3).
The obtained data and Cytoscape software were used to construct a lncRNA-mRNA co-expression network, and independent individuals were excluded.
Some core lncRNAs, such as PDCD1, MEG3, LINC00599, PSMA3-AS1, CLSTN2-AS1, AC008892.1, and XLOC078139, which target multiple mRNAs, were identified in the co-expression network. The core node IL1RAP was also found to be linked to several genes, suggesting that it could be a candidate gene for late functional validation.
LncRNA Action Modes
Target gene prediction of lncRNA mode of action yielded one lncRNA with an antisense interaction with mRNA, 22 lncRNAs with cis action, and 94 trans-acting lncRNAs (Fig. 4). Among them, XLOC_079942, XLOC_081142, LINC00958, LINC00599, LINC01052, CLSTN2-AS1, AL034397.3, and LINC01287 had more trans-acting association with mRNAs as members of the core node. IL1RAP was associated with multiple lncRNAs. The results showed that lncRNAs mainly affected mRNA expression in a co-expression manner, and thus participated in gene expression regulation and transcription.
Results of GO/Pathway Enrichment Analysis
GO/Pathway enrichment analysis of 410 target mRNAs was performed and filtered at p < 0.05. A total of 271 mRNAs were enriched in biological process (BP) terms, with the first four items being significantly related to neutrophil activation, neutrophil degranulation, neutrophil activation involved in immune response, neutrophil-mediated immunity, and leukocyte migration (all related to immune response) (Fig. 5a, 5b). Twenty-three mRNAs were enriched in the following cellular component terms: specific granule lumen, secret granule lumen, and cytoplasmic vesicle lumen (Fig. 5c, 5d). A total of 16 mRNAs were enriched in molecular function (MF) terms, including protein binding, transcription factor binding, and transcription regulatory region DNA binding (Fig. 5e, 5f). The first four of these BPs share the genes CAPN1, PLAU, OLR1, CTSD, ABCA13, LCN2, MS4A3, and TCN1, and are all involved in the pathogenesis of neuropsychiatric disorders.
The KEGG enrichment results also showed that the main enrichment pathways for targeted mRNAs were concentrated in immune response, such as C-type lectin receptor, NOD-like receptor, TNF, and the NF-kappa β signaling pathways (Fig. 6a–6d).
qPCR Validation Results
The mean Ct values of β-actin obtained in PBL samples from CK and T groups were 15.44–15.24, and the CV% was 3.38–3.05.
We performed qPCR experiments with the lncRNAs selected by bioinformatic analysis to meet the conditions of FDR < 0.05, |log2FC| > 1, and differential expression of FPKM > 0.5 between groups. The experimental results were as follows: IL1RAP-TCONS_00138311 was downregulated 0.56-fold in the T group relative to the CK group (p < 0.05) (Fig. 7
a, Table 1
), and CCR3-TCONS_00134168 was upregulated 4.68-fold in the T group relative to the CK group (p < 0.05) (Fig. 7
b, Table 1
). The sequencing results were considered accurate and credible, and were subjected to further analysis and functional verification.
Taking IL1RAP-TCONS_00138311 and CCR3-TCONS_00134168 as test variables, and the CK and T groups as static variables, a ROC curve was constructed. The results showed that the expression of IL1RAP-TCONS_00138311 and CCR3-TCONS_00134168 in the PBL leukocytes of patients with SCZ had certain effects on the predicted value (p < 0.05). Using IL1RAP-TCONS_00138311 as a diagnostic indicator, AUC = 0.924, 95% of the confidence interval was 0.873–0.974, and the Jorden index reached its maximum value of 0.74 when FPKM = 15.48 in PBL, at which point the sensitivity was 80.0% and the specificity 94.0% (Fig. 7c, Table 1). Principal component analysis, using IL1RAP-TCONS_00138311 sequencing expression value, showed that the PCA1 + PCA2 components explained 24.3% of the overall variance (Fig. 7d).
qPCR and ROC analyses results
(5' to 3')
7.13 ± 0.69
14.59 ± 0.88
8.58 ± 1.06
12.93 ± 1.43
|*△△Ct = Test group△Ct-Control group△Ct|
|*2△△Ct = Expression ratio, 2△△Ct > 1 = upregulated, 2△△Ct < 1 = downregulated.|
IL1RAP-TCONS_00138311 Sequence Functional Analysis
The selected transcripts were predicted using the CPC, CNCI, and SwissProt databases, and none had coding potential or protein annotation information. The IL1RAP-TCONS_00138311 transcript sequence was used for an NCBI blast search, with Query Length 1311, bits to 1023 length, and matches to Sequence ID: NG_029105.2 (43715–44737, Score = 1890, Expect = 0.0, Identities = 100%, Gaps = 0%), and was identified as a new IL1RAP transcript. Using the upstream 2000 bp and the downstream 100 bp sequences as the gene promoter region, a conservative analysis through the UCSC database identified transcription factors that can bind to the lncRNA (Fig. 8a). Finally, using the JASPAR database, with score set to 500 and relative profile score threshold set at 90%, the transcription factors in the same transcription direction as IL1RAP-TCONS_00138311 were selected. The top four ranked transcription factors were predicted to be HSF1, HSF2, HSF4, and FOXA1, and the binding sites are shown in Fig. 8b.