Clinical Characteristics of Sequencing Samples
Statistical differences were analyzed between the sex and age of the 50 CK and 50 T groups collected. Analysis using the independent samples t-test yields: the mean age (mean ± standard deviation) of the T and CK groups was 35.56 ± 10.04 and 32.52 ± 8.04, respectively, and no significant difference (p>0.05), there was no significant difference in gender composition (p>0.05).
RNA-seq Quality Assessment
Total RNA of the sequencing samples was analyzed for quality by an Agilent 2100 Bioanalyzer. The OD260/OD280 values are all between 1.8-2.0 (Fig. 1a), the two-sample independent repeated correlation is shown in Fig (Fig. 1b), suggesting that the quality of the total RNA isolated and purified in this experiment meets the requirements of sequencing. The euclidean distance algorithm is used to calculate the expression distance of each sample gene, and the Pearson algorithm is used to calculate the distance between the samples, then established cluster map according to the distance (Fig. 1c). 5106 new lncRNA transcripts obtained by CPC, CNCI and SwissProt prediction (Fig. 1d). Principal component analysis of sequencing samples showed that PC1 principal component was 79.4%, PC2 was 9.3%, and PC1+PC2 could distinguish 88.7% of the overall variance (Fig. 1e).
Gene Co-Expression Network Construction
Selecting for differential expression of lncRNA and mRNA with FDR<0.05 and |log2FC|>1. 94 differentially expressed lncRNAs were obtained (Fig. 2a,c), 70 were up-regulated and 24 were down-regulated, 46 new lncRNAs were identified for the first time, of which XLOC_079942, LINC02362, ACTA2-AS1. AL162431.2, CLSTN2-AS1 were the top five differential fold change lncRNAs in up-regulated expression (|log2FC|>5), and AL390763.2, XLOC_078139, XLOC_043880, AC008892.1, AL034397.3 are the top five lncRNAs with differential fold change in down-regulated expression (|log2FC|>4). 1179 differentially expressed mRNAs were obtained (Fig. 2b,d), 702 were up-regulated and 477 were down-regulated, of which CNOT2, RNF19B, MECP2, TNFAIP3, ALDO are the top five mRNAs with differential fold change in upregulated expression (|log2FC|>9), and AHR, ARMC8, HBP1, ERVFC1-1, COTL1 are the top five mRNAs with differential fold change in downregulated expression (|log2FC|>7). Heat map of 94 lncRNA and 1179mRNA expression, exhibiting distinct group information after clustering between groups, The up-regulated expression genes could clearly distinguish from the down-regulated expression group, and the differential genes could significantly represent the schizophrenia group and thus distinguish from the normal group, suggesting that the sequenced differential genes may serve as potential biomarkers of schizophrenia.
According to Spearman correlation coefficient analysis, 81 lncRNA and 410 mRNA have expression correlation (p<0.01 and |r|≥0.4), 1152 mRNAs showed positive correlation with lncRNA, 300 mRNAs showed negative correlation with lncRNA (Fig. 3), the obtained data used cytoscape software to construct the lncRNA-mRNA co-expression network, and independent individuals were excluded. Exhibiting some core lncRNAs in the co-expression network, such as PDCD1, MEG3, LINC00599, PSMA3-AS1, CLSTN2-AS1, AC008892.1, XLOC078139, etc, targeting multiple mRNAs;the core node IL1RAP was also found to be linked to several genes, suggesting that it could be a candidate gene for late functional validation.
LncRNA Action Modes
Target gene prediction of the mode of action in which lncRNAs might be involved, yielding one lncRNA with an antisense interaction with mRNA, 22 lncRNAs present cis action, the trans-acting lncRNA is the most, up to 94 (Fig. 4). Among them, XLOC_079942, XLOC_081142, LINC00958, LINC00599, LINC01052, CLSTN2-AS1, AL034397.3, LINC01287 are more trans-associated with mRNA, as the core node, ILIRAP is associated with multiple lncRNA. The results show that lncRNAs mainly affect mRNA expression in a co-expression manner and thus participate in gene expression regulation and transcription.
Results of GO/Pathway Enrichment Analysis
GO/Pathway enrichment analysis of 410 target mRNAs, filtered at P<0.05. Enriched to 271 Biological Process (BP) terms, among them, the first four items that are significantly related to neutrophil activation, neutrophil degranulation, neutrophil activation involved in immune response, neutrophil mediated immunity, and leukocyte migration are all related to immune response (Fig. 5a,b); 23 Cellular Component (CC) terms, including specific granule lumen, secret granule lumen, cytoplasmic vesicle lumen, etc (Fig. 5c,d); 16 Molecular Function (MF) terms, including protein binding、transcription factor binding、transcription regulatory region DNA binding etc. (Fig. 5e,f);The first four of these BPs share the genes CAPN1, PLAU, OLR1, CTSD, ABCA13, LCN2, MS4A3, and TCN1 are all involved in the pathogenesis of neuropsychiatric disorders.
The KEGG enrichment results also prove that the main enrichment pathways for targeted mRNAs are concentrated in the immune response, such as C-type lectin receptor signaling pathway, NOD-like receptor signaling pathway, TNF signaling pathway, NF-akppa β signaling pathway and so on (Fig. 6a-d).
QPCR Validation Results
Mean Ct values of β-Actin were detected in peripheral blood samples from CK and T groups using β-Actin as an experimental internal reference, at 15.44-15.24, the CV% was 3.38-3.05. We performed qPCR experiments on the lncRNAs selected by the prenatal bioinformatics analysis to meet the conditions of FDR<0.05 and |log2FC|>1, and the differential expression of FPKM>0.5 between groups. The experimental results were as follows: IL1RAP-TCONS_00138311 was down-regulated in the T group relative to the CK group, which was 0.56 times (P<0.05) (Fig. 7a, Table 1); CCR3-TCONS_00134168 was up-regulated in the T group relative to the CK group, and was 4.68 times that in the CK group (P<0.05) (Fig. 7b, Table 1); Thus the sequencing results are believed to be accurate and credible, and the sequencing results can be further analyzed and functionally verified.
Taking IL1RAP-TCONS_00138311 and CCR3-TCONS_00134168 as test variables, and CK group and T group group as state variables, the ROC curve was constructed. The results showed that the expression of IL1RAP-TCONS_00138311 and CCR3-TCONS_00134168 in peripheral blood leukocytes of SCZ patients had certain effects on the predicted value (P<0.05). Using IL1RAP-TCONS_00138311 as a diagnostic indicator, AUC=0.924, 95% of the confidence interval is 0.873-0.974 and the Jorden index reaches its maximum value 0.74 when FPKM = 15.48 in PBL, at which point the sensitivity was 80.0% and the specificity was 94.0% (Fig. 7c, Table 1). Principal component analysis using IL1RAP-TCONS_00138311 sequencing expression value shows that PCA1 plus PCA2 components can explain 24.3% of the overall variance (Fig. 7d).
Table 1: QPCR and ROC Analysis Results
LncRNA
|
IL1RAP-TCONS_00138311
|
CCR3-TCONS_00134168
|
Sequence
(5' to 3')
|
F:AGGGGAAGGGAATCAACAAATAG
|
F:TCAAGACTTCGTGGCTTAAACAATA
|
R:GGGGCGTGGCATGTAACC
|
R:GGAACTCCATACCTGAAAGACCCTA
|
CK
|
7.13±0.69
|
14.59±0.88
|
T
|
8.58±1.06
|
12.93±1.43
|
2^-△△Ct*
|
0.56
|
4.68
|
AUC
|
0.9236
|
0.6486
|
P-value*
|
0.009***
|
0.016**
|
*△△Ct=Test group△Ct-Control group△Ct
*2^-△△Ct= Expression ratio, 2^-△△Ct>1= Up-regulated, 2^-△△Ct <1= Down-regulated.
IL1RAP-TCONS_00138311 Sequence Functional Analysis
The selected transcripts were predicted by CPC, CNCI, and SwissProt databases, and none had coding potential or protein annotation information. The acquires IL1RAP-TCONS_00138311 transcript sequence uses the NCBI blast function, enter Query Length 1311, bits to 1023 length, and matches to Sequence ID: NG_029105.2 (43715-44737, Score=1890, Expect=0.0, Identities=100%, Gaps= 0%), identified as a new IL1RAP transcript. Enter the upstream 2000bp and downstream 100bp sequences as the gene promoter region, and conduct a conservative analysis through the UCSC database to find transcription factors that can bind to lncRNA (Fig. 8a). Finally, using the JASPAR database, the score is set to 500, relative profile score threshold set 90%, select the transcription factor in the same transcription direction as IL1RAP-TCONS_00138311, the top four ranked transcription factors were predicted to be HSF1, HSF2, HSF4, and FOXA1, and the binding sites are as follows (Fig. 8b).