Identification of overlapping common DEGs
This study collected the gene expression profiles of three RNA-seq datasets (GSE92874, GSE63738, and GSE121376) from the NCBI GEO database to identify and investigate differential gene expression patterns that may affect SZ development. First, the three RNA-seq datasets were analyzed using the GREIN interactive web platform. A total of 25213, 22214, and 21849 DEGs were observed in the GSE92874, GSE63738, and GSE121376 datasets, respectively. Then, 3880 (2453 upregulated and 1427 downregulated), 1953 (1080 upregulated and 873 downregulated), and 5787 (2922 upregulated and 2865 downregulated) deregulated DEGs were identified from GSE92874, GSE63738, and GSE121376, respectively, based on the criteria of p-value < 0.5 and log2FC ≥ 1 or ≤ −1 (supplementary data 1). Figure 1(A, C, E) shows interactive heatmaps of the top 100 upregulated and downregulated genes. Figure 1(B, D, F) shows all the DEGs between the SZ patients and healthy groups in a plot of log2FC versus −log10(p-value); positive log2FC values represent upregulated genes, whereas negative log2FC values indicate downregulated genes. Using Venn diagrams, 420 overlapping DEGs were identified among the GSE92874, GSE63738, and GSE121376 datasets (Figure F), including 290 upregulated and 130 downregulated genes. Bioinformatics and machine learning tools were then employed to analyze the functional annotation, interaction network enrichment, and molecular and biological pathways of these commonly deregulated DEGs.
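The filtering and overlap steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the GREIN workflow itself: the input shape (gene symbol mapped to a log2FC and p-value pair), the function names, the toy genes, and the constant P_CUTOFF are all assumptions introduced here, and the sketch assumes that the overlap is taken separately for upregulated and downregulated genes.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical input shape: gene symbol -> (log2FC, p-value) for one dataset.
DatasetResult = Dict[str, Tuple[float, float]]


def select_degs(result: DatasetResult, p_cutoff: float,
                lfc_cutoff: float = 1.0) -> Tuple[Set[str], Set[str]]:
    """Split one dataset into upregulated and downregulated gene sets."""
    up: Set[str] = set()
    down: Set[str] = set()
    for gene, (log2fc, p_value) in result.items():
        if p_value >= p_cutoff:
            continue  # fails the significance criterion
        if log2fc >= lfc_cutoff:
            up.add(gene)
        elif log2fc <= -lfc_cutoff:
            down.add(gene)
    return up, down


def overlapping_degs(results: List[DatasetResult], p_cutoff: float,
                     lfc_cutoff: float = 1.0) -> Tuple[Set[str], Set[str]]:
    """Intersect the per-dataset DEG sets (the Venn diagram core)."""
    selected = [select_degs(r, p_cutoff, lfc_cutoff) for r in results]
    common_up = set.intersection(*(up for up, _ in selected))
    common_down = set.intersection(*(down for _, down in selected))
    return common_up, common_down


# Illustrative cutoff only; substitute the threshold used in the study.
P_CUTOFF = 0.05

# Toy data with invented gene names and values.
toy_results: List[DatasetResult] = [
    {"G1": (2.1, 0.001), "G2": (-1.5, 0.01), "G3": (0.4, 0.001), "G4": (1.2, 0.9)},
    {"G1": (1.0, 0.02), "G2": (-2.0, 0.03), "G3": (1.8, 0.001)},
    {"G1": (3.3, 0.0001), "G2": (-1.0, 0.04), "G4": (1.9, 0.01)},
]
common_up, common_down = overlapping_degs(toy_results, P_CUTOFF)
print(sorted(common_up), sorted(common_down))
```

Keeping the two directions in separate sets avoids counting a gene as overlapping when it is upregulated in one dataset and downregulated in another.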
Gene Ontological Analysis Of Overlapping DEGs
GO is a comprehensive conceptual model for describing the functional characteristics of DEGs. GO analysis annotated the overlapping DEGs in three functional categories: BP, MF, and CC. In terms of BP, the overlapping DEGs were enriched in cell communication, signal transduction, cell growth and maintenance, regulation of immune responses, cell-matrix adhesion, immune response, peptide metabolism, regulation of the biological process, cell differentiation, cell surface receptor linked signal transduction, regulation of gene expression, epigenetic, calcium-mediated signaling, transport, cell adhesion, and transcription (Fig. 2A). In terms of MF (Fig. 2B), the commonly shared DEGs were mainly involved in ECM structural constituent, MHC class I receptor activity, MHC class II receptor activity, transcription factor activity, cell adhesion molecule activity, metallopeptidase activity, receptor binding, motor activity, growth factor activity, amino acid transporter activity, GTPase activator activity, ATPase activity, and calcium ion binding. For the CC category, the overlapping DEGs were significantly enriched in extracellular, ECM, extracellular space, plasma membrane, nucleus, cytoplasm, fibrinogen complex, integral to the plasma membrane, proteinaceous ECM, collagen type I, collagen type V, integrin complex, microfibril, TAP complex, inclusion body, intermediate filament, and basement membrane (Fig. 2C). These findings suggest that ECM-related processes are essential in the mechanisms of SZ development.
Identification Of Significant Signaling Pathway Enrichment Analysis
Pathway enrichment analysis helps clarify how complex underlying biological processes contribute to SZ pathogenesis. Using the web-based bioinformatics tool EnrichR with four pathway databases (Reactome, Panther, BioPlanet, and NCI-Nature), gene set enrichment analysis of the overlapping DEGs was performed to find critical signaling pathways that may relate to SZ pathogenesis (Fig. 3). The top 15 signaling pathways were selected using an adjusted p-value of less than 0.1 as the significance criterion. Using the Reactome 2016 pathway database, critical biological pathways that may be implicated in SZ were identified, including ECM organization, collagen biosynthesis and modifying enzymes, assembly of collagen fibrils and other multimeric structures, collagen formation, elastic fibre formation, integrin cell surface interactions, muscle contraction, antigen processing-cross presentation, transmission across chemical synapses, endosomal/vacuolar pathway, neuronal system, ER-phagosome pathway, neurotransmitter receptor binding, ECM proteoglycans, and fibronectin matrix formation (Fig. 3A). The Panther 2016 pathway database revealed integrin signaling pathway, ionotropic glutamate receptor pathway, nicotinic acetylcholine receptor signaling pathway, 5-hydroxytryptamine degradation, Alzheimer disease-presenilin pathway, TGF-beta signaling pathway, cadherin signaling pathway, cytoskeletal regulation by Rho GTPase, inflammation mediated by chemokine and cytokine signaling pathway, p53 pathway, T cell activation, dopamine receptor-mediated signaling pathway, adrenaline and noradrenaline biosynthesis, GABA-B receptor II signaling, and JAK/STAT signaling pathway (Fig. 3B).
Results from the BioPlanet 2019 pathway database identified the most significant signaling pathways, including beta-1 integrin cell surface interactions, TGF-beta regulation of ECM, collagen biosynthesis and modifying enzymes, ECM organization, syndecan-1 pathway, ECM-receptor interaction, integrins in angiogenesis, antigen processing and presentation, beta-3 integrin cell surface interactions, interleukin-1 regulation of ECM, neural crest differentiation, inflammatory response pathway, integrin cell surface interactions, integrin beta-5 pathway, and neuronal system (Fig. 3C). Pathway analysis with the NCI-Nature 2016 database revealed beta1 integrin cell surface interactions, syndecan-1-mediated signaling events, integrins in angiogenesis, beta3 integrin cell surface interactions, beta5 beta6 beta7 and beta8 integrin cell surface interactions, VEGFR3 signaling in lymphatic endothelium, plexin-D1 signaling, direct p53 effectors, syndecan-4-mediated signaling events, IL4-mediated signaling, BMP receptor signaling, ALK1 signaling events, integrin family cell surface interactions, Arf6 trafficking events, and syndecan-2-mediated signaling events (Fig. 3D). The pathway enrichment results, with gene names, adjusted p-values, and combined scores, are provided in supplementary data 2. These findings suggest that the ECM and integrin cell surface interactions may play a role in the progression of SZ.
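The selection of top pathways by adjusted p-value can be illustrated as follows. This is a hedged sketch rather than EnrichR's own code: it assumes that the adjusted p-values come from a Benjamini-Hochberg correction, and the function names, pathway labels, and raw p-values are invented for illustration.

```python
from typing import Dict, List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def top_pathways(raw_p: Dict[str, float], alpha: float = 0.1,
                 top_n: int = 15) -> List[str]:
    """Keep pathways with adjusted p < alpha, most significant first."""
    names = list(raw_p)
    adj = benjamini_hochberg([raw_p[n] for n in names])
    kept = [(a, raw_p[n], n) for a, n in zip(adj, names) if a < alpha]
    kept.sort()  # adjusted p first, raw p as tie-breaker
    return [name for _, _, name in kept[:top_n]]


# Invented pathway labels and raw p-values, for illustration only.
toy_p = {"pathway_A": 0.01, "pathway_B": 0.04, "pathway_C": 0.03, "pathway_D": 0.5}
toy_adjusted = benjamini_hochberg(list(toy_p.values()))
toy_top = top_pathways(toy_p)
print(toy_top)
```

Sorting on the adjusted p-value with the raw p-value as a tie-breaker keeps the ranking stable when the correction assigns the same adjusted value to neighbouring pathways.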
Protein-protein Interaction Network And Hub Protein Identification
The PPI network was examined to investigate the relationship between the overlapping DEGs and SZ. The PPI network of DEGs was generated using the STRING database (version 11.5) with a combined score greater than 0.7 (interaction score: high confidence) and visualized using Cytoscape software; it contained 183 nodes and 384 edges (Fig. 4A). Proteins are represented as nodes in the PPI network and are connected by undirected edges, each implying a relationship between two proteins. Furthermore, the PPI network was used for hub protein discovery, which may aid in the identification of therapeutic biomarkers for the disease comorbidities.
To identify hub proteins across the whole PPI network, the MNC and MCC methods of the Cytoscape plug-in CytoHubba were used to generate the hub protein network. The top 15 hub proteins were selected from the PPI network using these methods. These top 15 hub proteins (FN1, COL1A1, COL3A1, COL1A2, COL5A1, COL2A1, COL6A2, COL6A3, MMP2, THBS1, DCN, LUM, HLA-A, HLA-C, and FBN1), which have significantly higher degrees of interaction, were taken as candidate therapeutic targets in SZ pathogenesis and progression (Fig. 4B). The rank scores of the hub genes are provided in supplementary data 3. The identified hub proteins are potential biomarkers that might lead to new SZ therapeutic targets and may play critical roles in ECM construction during SZ development.
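The network construction and hub ranking can be sketched in plain Python. This is an illustration under stated assumptions, not the CytoHubba implementation: nodes are ranked by degree, with ties broken by an MNC score taken here to be the size of the largest connected component among a node's neighbours. The MCC method is not reproduced, and the node names and scores are invented.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Edge = Tuple[str, str, float]  # (protein_a, protein_b, combined_score)


def build_network(edges: Iterable[Edge],
                  min_score: float = 0.7) -> Dict[str, Set[str]]:
    """Keep edges whose combined score exceeds min_score (undirected)."""
    adj: Dict[str, Set[str]] = defaultdict(set)
    for a, b, score in edges:
        if score > min_score and a != b:
            adj[a].add(b)
            adj[b].add(a)
    return dict(adj)


def mnc(adj: Dict[str, Set[str]], node: str) -> int:
    """Size of the largest connected component among the node's neighbours."""
    neighbours = adj[node]
    seen: Set[str] = set()
    best = 0
    for start in neighbours:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            current = stack.pop()
            size += 1
            # Only follow edges that stay inside the neighbourhood.
            for nxt in adj[current] & neighbours:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        best = max(best, size)
    return best


def top_hubs(adj: Dict[str, Set[str]], k: int = 15) -> List[str]:
    """Rank by degree, then MNC, then name for a deterministic order."""
    return sorted(adj, key=lambda n: (-len(adj[n]), -mnc(adj, n), n))[:k]


# Invented nodes and scores, for illustration only.
toy_edges: List[Edge] = [
    ("A", "B", 0.90), ("A", "C", 0.80), ("A", "D", 0.95),
    ("B", "C", 0.75), ("C", "D", 0.71), ("D", "E", 0.40),
]
toy_net = build_network(toy_edges)
toy_hubs = top_hubs(toy_net, k=2)
print(toy_hubs)
```

In this toy network the low-confidence D-E edge is dropped, so E never enters the network, which mirrors how a score threshold reduces the node count before hub ranking.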
Analysis Of Transcription Factors
Transcription factors (TFs) are essential molecules that regulate gene expression and maintain gene regulatory networks. DEGs are controlled by the activation or repression of TFs, which are required for various biological and cellular processes and have been linked to the development of neurological disorders. The X2K online tool, using the ChEA database, identified TFs that control the expression of overlapping DEGs in SZ. Transcription factor enrichment analysis (TFEA) was performed based on the hypergeometric p-value, and the top 20 TF candidates were selected (Fig. 5A). These included SUZ12, EZH2, IRF8, TP63, TRIM28, TP53, FOSL2, EGR1, PPARD, KLF4, NANOG, POU5F1, REST, TCF3, SALL4, SOX2, GATA1, and GATA2. The Genes2Networks (G2N) method was employed to find proteins that physically interact with the TFs through known PPIs and to investigate their relationships. The regulatory network of linked TFs and their functionally and physically interacting proteins was depicted according to the degree of the nodes (Fig. 5B).
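The hypergeometric p-value behind this kind of TF enrichment can be written out directly. The sketch below is an illustration with invented numbers, not the X2K code: it computes the upper-tail probability of seeing at least the observed number of TF targets in a gene list, and the function and parameter names are assumptions made here.

```python
from math import comb


def hypergeom_enrichment_p(background: int, tf_targets: int,
                           list_size: int, overlap: int) -> float:
    """P(X >= overlap) when list_size genes are drawn without replacement
    from a background containing tf_targets targets of the TF."""
    upper = min(tf_targets, list_size)
    favourable = sum(
        comb(tf_targets, i) * comb(background - tf_targets, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / comb(background, list_size)


# Invented numbers: 10 background genes, 4 targets of a TF,
# a list of 3 DEGs, 2 of which are targets of that TF.
toy_p_value = hypergeom_enrichment_p(background=10, tf_targets=4,
                                     list_size=3, overlap=2)
print(round(toy_p_value, 4))
```

A smaller p-value means the overlap between the DEG list and the TF's targets is less likely to arise by chance, which is the basis for ranking TF candidates.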
Identification Of Protein Kinase With Upstream Regulatory Network
Protein kinases, which act as sensors and effectors in signal transduction cascades, regulate mRNA translation, cell proliferation and survival, and the nuclear genomic response to cellular stressors. Dysregulation of protein kinase activity has been linked to various disorders, ranging from inflammatory to neurodevelopmental disorders. Using the KEA module of X2K, the most critical PTKs linked with SZ were identified to examine possible neurotherapeutic kinase targets. The KEA results revealed that CSNK2A1, GSK3B, CDK1, MAPK14, ATM, HIPK2, CDK2, DNAPK, MAPK1, MAPK8, CDK4, CK2ALPHA, MAPK3, ABL1, AKT1, ERK1, IKKALPHA, JNK1, PKR, ERK2, and PKBALPHA are the most critical PTKs in intracellular signaling pathways related to SZ (Fig. 6A). Then, the Human Protein Reference Database (HPRD), PhosphoSite, phospho.ELM, NetworKIN, and Kinexus were used to build a kinase-substrate network. From these databases, a regulatory kinase-substrate network was constructed in which active PTKs phosphorylate substrates inside the enlarged subnetwork of transcription factors and intermediary proteins (Fig. 6B).
Prediction Of miRNA Targets With Hub Genes
Using NetworkAnalyst, the miRNA-hub gene regulatory network associated with the development of SZ was constructed. This network included hsa-miR-29c-3p, hsa-miR-29b-3p, hsa-miR-29a-3p, hsa-miR-143-3p, hsa-miR-767-5p, hsa-miR-6825-5p, hsa-miR-29b-1-5p, hsa-let-7g-5p, hsa-miR-544a, and hsa-miR-1331-3p, which had targeted relationships with the hub genes (Fig. 7A). Finally, miRTarBase was used to perform miRNA-target enrichment analysis based on significant adjusted p-values and interactions with the target hub genes (Fig. 7B).
The FunRich software was used to examine possible biological pathways related to SZ (Fig. 8). The major potential biological pathways in SZ comprised beta3 integrin cell surface interactions, integrin cell surface interactions, integrin family cell surface interactions, beta1 integrin cell surface interactions, VEGFR3 signaling in lymphatic endothelium, dopamine and serotonin degradation pathway, transmission across chemical synapses, and neuronal system. These biological pathway results might help future studies investigate SZ pathogenesis.
Interconnection Of Hub Genes Associated With ECM Organization And Integrin Cell Surface Interactions Mechanism
Interactions in the pathway databases (NetPath, Reactome, and WikiPathways) were annotated as genetic interactions, co-localizations, or neighbouring reactions, and only the small subset of data describing direct protein-protein interactions or experimentally resolved protein complexes was extracted. Connectome analysis revealed a network of signaling molecules, receptors, plasma membrane-bound transcription factors, kinases, enzymes, and structural proteins interconnected with hub genes, which was used to determine the potential biological signal of the ECM organization and integrin cell surface interactions network pathway. Figure 9A shows that FBN1, which acts as a signaling molecule, interacts with MMP-2/13/8/12, which regulate structural protein networks and also maintain plasma membrane stability. Similarly, MIA, itself a signaling molecule, is tightly bound to ECM proteins such as FN1, THBS1, COL5A1, COL3A1, COL1A2, COL2A1, and COL6A3, and may thus modulate integrin activity. Then, the association of the hub genes with the shared ECM organization and integrin cell surface interactions network pathway was determined. The hub gene results indicated that ECM organization proteins interact with integrin cell surface molecules and their signaling pathways (Fig. 9B). However, these computational results need to be confirmed and validated experimentally. Finally, the hub genes among the overlapping DEGs of SZ were used to create a schematic diagram of the ECM-receptor interaction pathway based on the KEGG database. Fig. 9C illustrates how ECM-associated genes interact with integrin receptor and proteoglycan molecules according to the KEGG pathway analysis. Fibronectin, collagen, and THBS were observed to interact with integrin dimers (α1, α2, α3, α8, αV, αIIb, β1, and β3), thereby maintaining cell adhesion and signal strength.
Identified Targets Validation With Literature Review
According to GO and functional enrichment analysis, the overlapping dysregulated genes were mainly implicated in ECM organization, signal transduction, collagen fibril organization, antigen processing and presentation, and ECM-receptor interaction. These processes might be crucial in the progression of SZ. ECM molecules govern GABAergic function, synaptic pruning and plasticity during childhood and adolescence, neuronal migration, proliferation, differentiation, neurodevelopment modulation, neuroplasticity, axon guidance, and neurite outgrowth42. Altered ECM organization has previously been associated with the pathophysiology of SZ and neurodegenerative diseases43, 44. A small peptide generated from a collagen protein may protect the brain from SZ by stimulating the formation of neural connections45. These studies suggest that abnormalities of collagen fibril organization may be linked to the pathogenesis of SZ. Signal transduction anomalies caused by changes in the kinase activity network have been reported to underlie the fundamental symptoms of SZ46. The results of the functional enrichment analysis were therefore consistent with earlier research. Pathway enrichment analysis was used to identify the top 15 pathways associated with the pathogenesis of SZ. SZ patient-derived cells have previously been reported to be sensitive to ECM proteins that bind integrin receptors, resulting in increases in focal adhesion number and size in response to ECM proteins and in alterations in cell shape and cytoskeleton47. Cadherins have a critical role in the development of brain circuitry and in mature synaptic function48; hence, genes encoding members of the cadherin superfamily play a vital role in the pathophysiology of neuropsychiatric illnesses. In the mammalian brain, the cadherin signaling system regulates adhesion molecules that are important in cell-cell contact49.
The TGF-beta signaling pathway is crucial for the use-dependent control of GABAA synaptic transmission and dendritic homeostasis; moreover, a disruption in the excitatory-inhibitory balance in the hippocampal network may induce psychiatric-like behavior50. In human mental illnesses such as SZ, components of the TGF-β signaling pathway are altered in the hippocampus51. Syndecan signaling from the ECM to various cytoplasmic components and their distribution to different membrane compartments rely on oligomerization; hence, the syndecan homodimer is a crucial functional unit52. However, future research needs to investigate the bidirectional link between syndecan signaling and ECM organization in the pathophysiology of SZ. A growing number of studies have reported that inflammatory signaling pathways have a significant role in depression and SZ, and that chronic inflammation is linked to immunological dysregulation in SZ53–55.
FN1 mutations may increase SZ susceptibility, suggesting that FN1 is an SZ susceptibility gene56. Moreover, in patients with SZ, FN1 gene expression, along with that of complement pathway activators (C1qA) and mediators (C3 or C4), was enhanced in the midbrain parenchyma near dopamine cell bodies, especially in individuals with a high inflammatory biotype57. That study also found that peripheral macrophages were significantly increased in the midbrain of SZ cases with high inflammatory potential. However, it remains unclear how FN1 triggers inflammatory signaling pathways in the progression of SZ. Expression of the collagen chain gene COL1A1 has been reported to be associated with ECM ligands, resulting in the inactivation of DDR1 with a time-dependent decrease in SZ patients58. In addition, COL1A1 and COL1A2 gene expression levels were reported to be significantly decreased in patients with SZ compared with controls59. COL3A1, one of the most strongly upregulated genes detected, encodes the pro-α1 chains of type III collagen and exhibits a de novo mutation in SZ patients60. COL5A1 may be associated with genetic variations of psychotic experiences in SZ61. The expression patterns of collagen family genes (COL1A1, COL3A1, COL1A2, COL5A1, COL2A1, COL6A2, COL6A3), which encode proteins that are fundamental components of the ECM, have been described in distinct brain areas62. Chronic agonistic interactions may cause abnormal collagen gene expression, which might indicate specific ECM abnormalities in the brain areas of mice with different social experiences63. MMP-2 levels were significantly increased in the CSF of SZ patients and may thus be connected to neuroinflammation in the brain64. That study postulated that the pathophysiology of SZ is influenced by state-dependent changes in MMP-2 and by activation of the MMP-2, -7, and -10 cascades. A clinical study reported that the THBS1 gene might have a critical role in the development of SZ65.
The involvement of the THBS1 gene in SZ pathogenesis will need to be investigated further in the future.
This investigation revealed that transcription factors (including SUZ12, EZH2, TRIM28, TP53, and EGR1) are associated with the development and progression of SZ. SUZ12 binds a location 1.5 kb downstream of the GAD1 TSS, and the H3K27me3 mark is enhanced at the GAD1 promoter region in the prefrontal cortex of individuals with SZ, along with a significant number of the discovered DMRs within the GAD1 regulatory network66, 67. EZH2 may play a role in SZ risk at two different times: during development, when it causes neurodevelopmental abnormalities, and during adulthood, when it causes abnormal expression reactivation68. TRIM28 suppresses ERVs in the gene regulatory network that governs the expression of protein-coding transcripts essential for brain development69. An essential link between TP53 and SZ was discovered by haplotype analysis70, suggesting that TP53 may have a role in SZ etiology. Consequently, transcription factors might be useful as potential therapeutic targets for SZ. PKs such as CSNK2A1, GSK3B, CDK1, and MAPK14 have been reported to have a critical role in the pathogenesis of SZ71, 72.