Identification of network-based differential gene expression signatures and their transcriptional factors to develop progressive blood biomarkers for Alzheimer’s disease 

DOI: https://doi.org/10.21203/rs.3.rs-2107982/v1

Abstract

Background: Alzheimer's disease (AD) can go undiagnosed for years because diagnostic biomarkers are lacking, while its incidence in the geriatric population continues to grow. Identifying genes, together with the transcriptional factors and kinases that regulate phosphorylation and the pathogenesis of AD, is a state-of-the-art approach to identifying novel diagnostic biomarkers.

Methodology: Microarray dataset GSE140829 was retrieved from the GEO database to identify differentially expressed genes (DEGs) between AD and control samples. Furthermore, a protein interaction network was built using the STRING database, and the DEGs were examined in Cytoscape based on high betweenness centrality (BC) and degree values. Additionally, the hub genes were identified via Cytohubba, and eight modules were then identified using molecular complex detection (MCODE).

Results:

Using a Venn diagram, we mined 1674 common DEGs from AD and control samples. The primary interaction data from STRING consist of 1198 nodes and 1992 edges, which serve as an extended network. Further, a core network consisting of 676 nodes connected via 1955 edges was extracted from the extended network and analyzed based on high BC and degree values. Based on network topological analysis and network clustering, the hub genes were identified and further validated by comparing them with the backbone network. Compiling the results from both the core network and the backbone network, HSP90AA1 was identified as the major blood biomarker, followed by HSPA5, CREBBP, UBC, GRB2, MAPK3, and TRAF6.

Conclusion:

This study shows the potential for predicting AD risk factors and identifies promising blood biomarkers for early AD diagnosis. Additionally, developing inhibitors for the identified transcriptional factors and kinases might improve future therapeutic applications.

1. Introduction

Cerebral amyloid angiopathy and tauopathy are emphasized as the main pathological hallmarks of Alzheimer's type of dementia (DeTure & Dickson, 2019). The global prevalence of dementia is as high as 44 million, and it is projected to more than triple by 2050 as the population ages. Alzheimer's disease (AD) itself accounts for about 60–70% of all dementia cases ("2020 Alzheimer's disease facts and figures," 2020). AD is characterized by the extensive deposition of extracellular amyloid peptide plaques and intracellular neurofibrillary tangles (2). Amyloid beta 1−42 (Aβ1−42) is a proteolytic product of amyloid precursor protein (APP) generated by β- and γ-secretases, which turn APP into insoluble Aβ. This results in the accumulation of plaques between the neurons and leads to early-onset AD (EOAD); Aβ1−42 is the most common isomer present in this regard (Long & Holtzman, 2019). In the downstream pathogenesis of Aβ1−42, neurofibrillary tangles are formed mainly due to the hyperphosphorylation of tau protein, which causes the dissociation of tau protein and the destabilization of microtubules (Chu & Liu, 2019). The accumulation of these lesions sustains the immune response, which induces pro-inflammatory activation and leads to a local inflammatory response. As a result, neuronal loss occurs in specific segments of the brain, particularly the parietal-temporal-occipital region, resulting in reduced brain volume (Kinney et al., 2018).

Nevertheless, the underlying pathophysiological mechanism has not yet been elucidated; as a result, there are no targeted drugs to prevent the pathogenesis of AD. Moreover, only a limited number of inhibitors, such as cholinesterase inhibitors and N-methyl-D-aspartate antagonists, target the broad range of enzymes and specific proteins that induce pathogenesis. These inhibitors can help to control some behavioral symptoms but fail to limit the pathogenesis (Marucci et al., 2021). Besides, there are no diagnostic blood biomarkers that predict the risk of AD development. The available diagnostic methods include positron emission tomography (PET), the CSF Aβ1–42/Aβ1–40 ratio, and adrenalinergic biomarkers, but these methods are used to confirm AD at later stages (Ashton et al., 2021; Montoliu-Gaya et al., 2021). Therefore, the identification of blood biomarkers offers the chance of a broadly accessible triage for the quick evaluation of patients in primary care or for the selection of suitable therapeutic investigations. Further, the identification of blood biomarkers can support the detection of disease modification, show target engagement, aid in diagnosis, and support safety monitoring (Cummings, 2019). Recent in silico analyses have uncovered many molecules underlying the accumulation of pathological insults. This will help to identify and target the pathological progression and can limit the formation of altered tau protein and abnormal oligomeric peptide deposition (Soeda & Takashima, 2020).

In this article, we aim to showcase the driving molecules that alter the normal counter-mechanisms and increase the severity of pathogenic onset, making use of data produced by experimental methods and available as open-source datasets. A computational approach can help to identify the multiple interconnected genes and gene products that contribute to the pathogenesis of AD. This study used Gene Expression Omnibus (GEO) data to examine the differentially expressed genes (DEGs). Further, pathway and gene ontology (GO) enrichment analyses of the DEGs were performed to elucidate the biological processes involved. Our systematic study gives insights into several biomarkers that are assumed to play a critical role in the molecular basis of the risk of developing AD pathology. Besides, we also identified the transcriptional regulatory factors and their kinases, which could help to track the molecules more precisely. To conclude, the identified biomarkers can be used to diagnose the early onset of AD, and developing inhibitors for the identified transcriptional factors and kinases could scale down AD pathogenesis. Additionally, because the interplay between genes and signaling networks has been crucial to the development of AD, further in vivo and in vitro research could provide a better baseline for clinical research.

2. Materials and Methods

2.1. Microarray data source

Gene Expression Omnibus (GEO) (http://www.ncbi.nlm.nih.gov/geo) is a public functional genomics data repository that contains high-throughput gene expression data, chips, and microarrays (Clough & Barrett, 2016). The series matrix file of the GSE140829 dataset was retrieved from the GEO database using the following search terms: "Alzheimer's disease," "inflammation," "homo sapiens," and "expression profiling by array." This dataset was generated using the platform GPL15988 HumanHT-12 v4 expression BeadChip (nuID) and was deposited in the GEO database by Nachun D et al., 2020. A total of 587 samples are included in this dataset, comprising blood samples from 204 patients with Alzheimer's disease, 134 patients with other dementia, and 249 control samples.

2.2. Identification of differentially expressed genes

GEO provides GEO2R (http://www.ncbi.nlm.nih.gov/geo/geo2r/), a tool that analyses nearly any GEO series and compares groups of samples under the same experimental conditions. GEO2R was used for data pre-processing and to screen DEGs between the groups. An adjusted p < 0.01 and a |log2 fold change (FC)| threshold were applied for each group (Barrett et al., 2013).
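The screening step can be sketched as follows. This is a minimal illustration, not the GEO2R implementation: the row layout (keys `gene`, `adj.P.Val`, `logFC`) is an assumed export format, and because the |log2 FC| cut-off is not stated above, `lfc_cutoff` is a hypothetical placeholder that must be set by the analyst.

```python
def screen_degs(rows, padj_cutoff=0.01, lfc_cutoff=0.0):
    """Split differential-expression rows into up- and down-regulated genes.

    rows: iterable of dicts with keys "gene", "adj.P.Val" and "logFC"
          (an assumed layout for an exported results table).
    lfc_cutoff: hypothetical placeholder; the study does not state its value.
    """
    up, down = [], []
    for row in rows:
        if row["adj.P.Val"] >= padj_cutoff:
            continue  # not significant after multiple-testing adjustment
        if abs(row["logFC"]) <= lfc_cutoff:
            continue  # change too small to count as differential expression
        (up if row["logFC"] > 0 else down).append(row["gene"])
    return up, down


# Made-up example rows, for illustration only.
example_rows = [
    {"gene": "A", "adj.P.Val": 0.001, "logFC": 1.2},
    {"gene": "B", "adj.P.Val": 0.500, "logFC": 2.0},
    {"gene": "C", "adj.P.Val": 0.005, "logFC": -0.8},
    {"gene": "D", "adj.P.Val": 0.001, "logFC": 0.1},
]
up_genes, down_genes = screen_degs(example_rows, lfc_cutoff=0.5)
```

Keeping the two cut-offs as explicit parameters makes it easy to report exactly which thresholds produced a given DEG list.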

2.3. Functional enrichment analysis: GO and KEGG pathway

Functional analysis of the extracted DEGs was performed using the Database for Annotation, Visualization, and Integrated Discovery (DAVID) 6.8, available at http://david.ncifcrf.gov/summary.jsp, a web-based tool for extracting biologically significant genes or proteins from larger datasets (Yang et al., 2020). DAVID was used to carry out Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses. The GO enrichment analysis covers the categories Biological Process (BP), Cellular Component (CC), and Molecular Function (MF). KEGG pathway analysis was performed to assign groups of DEGs to particular pathways and thereby construct molecular interaction, reaction, and relation networks. GO terms and KEGG pathways with p < 0.05 were considered significantly enriched.

2.4. Construction of protein-protein interaction (PPI) network for the identification of hub genes

STRING (Search Tool for the Retrieval of Interacting Genes), available at https://string-db.org/, is an online database of known and predicted protein-protein interactions and was used for the analysis (Szklarczyk et al., 2021). The interactions investigated include physical and functional links, with data collected mainly from computational predictions. We mapped the DEGs into the PPI network and set an interaction score threshold of > 0.7 (high confidence). Furthermore, the PPI core network was visualized and analyzed using the Cytoscape v3.6.0 software.

2.5. Topological analysis for the constructed PPI core network

The Cytoscape plug-in Network Analyzer was used for the topological study of the interaction network. Differently linked nodes can be used to depict the molecular organization of a network (Assenov et al., 2008). Each node represents a protein, while the edges indicate interactions between proteins. As a result, each node can be characterized by numerical values. Topological parameters such as the number of nodes, connecting edges, network diameter, density, radius, centralization, heterogeneity, clustering coefficient, characteristic path length, distribution of node degrees, neighborhood connectivity, average clustering coefficients, and shortest path lengths were calculated and displayed. Network theory uses three important topological characteristics to evaluate the nodes in a network: connectivity degree (K), betweenness centrality (BC), and closeness centrality (CC). Degree is the total number of edges connected to a specific node, BC measures how often a node lies on the shortest paths between other pairs of nodes, and CC reflects how short the average shortest-path distance is from a node to all other nodes. The node with the greatest BC value significantly influences the information flow and has greater control over the network. In contrast, the node with the highest CC value is generally the network's topological center. Here, the confidence of the interacting network was evaluated with a power law fit of the type y = ax^b using Network Analyzer v3.3.1 (Hwang et al., 2008; Raman, 2010).
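To make the three node measures concrete, the sketch below computes degree, closeness centrality, and betweenness centrality for a small unweighted, undirected graph using only the Python standard library. It is an illustration under stated assumptions rather than a reproduction of Network Analyzer: betweenness is returned as a raw count of shortest paths (Brandes' algorithm) without normalization, closeness is computed over the nodes reachable from each node, and the toy edge list is made up.

```python
from collections import deque


def build_graph(edges):
    """Adjacency sets for an undirected, unweighted graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree(adj):
    """Number of edges attached to each node."""
    return {n: len(nb) for n, nb in adj.items()}


def closeness(adj):
    """(reachable nodes) / (sum of shortest-path distances) for each node."""
    result = {}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:  # breadth-first search from s
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[s] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness(adj):
    """Unnormalized betweenness centrality via Brandes' algorithm."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v precedes w on a shortest path
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        while stack:  # accumulate dependencies, farthest nodes first
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}


# Made-up toy network: "b" bridges "a" to the triangle b-c-d.
toy = build_graph([("a", "b"), ("b", "c"), ("c", "d"), ("b", "d")])
toy_degree = degree(toy)
toy_closeness = closeness(toy)
toy_betweenness = betweenness(toy)
```

In the toy network, node "b" is the only node lying on shortest paths between other nodes, which is the property BC is meant to capture.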

2.6. Construction of backbone network for PPI

Proteins with high BC values and the bridges that connect them were used to construct a backbone network. Here, we extracted the proteins with the top 5% of BC values and the links between them from the PPI network (Wahab Khattak et al., 2021). Because the backbone network is made up of these proteins, their connections and crossing points within the network are expected to be heavily used. The core network was used to mine high-BC nodes and their interconnections to build the backbone network. High BC values also reflect the number of shortest paths in the network that pass through a node. These high-BC nodes therefore operate as bottlenecks that control the interactions with the rest of the network's nodes (Sekaran et al., 2021).
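The extraction described above can be sketched as follows: rank the nodes by BC, keep the top fraction, and retain only the edges whose two endpoints are both kept. The function names are hypothetical, and rounding the node count up is an assumption; it does, however, give 34 nodes for a 676-node core network at 5%.

```python
import math


def backbone_size(n_nodes, fraction=0.05):
    """Number of nodes to keep; rounding up is an assumption of this sketch."""
    return max(1, math.ceil(fraction * n_nodes))


def backbone(bc_scores, edges, fraction=0.05):
    """Return the top-BC nodes and the edges that connect them.

    bc_scores: dict mapping node -> betweenness centrality.
    edges: iterable of (u, v) pairs from the core network.
    """
    k = backbone_size(len(bc_scores), fraction)
    ranked = sorted(bc_scores, key=bc_scores.get, reverse=True)
    keep = set(ranked[:k])
    links = [(u, v) for u, v in edges if u in keep and v in keep]
    return keep, links


# Made-up scores and edges, for illustration only.
demo_nodes, demo_links = backbone(
    {"p1": 0.40, "p2": 0.30, "p3": 0.02, "p4": 0.01},
    [("p1", "p2"), ("p1", "p3"), ("p3", "p4")],
    fraction=0.5,
)
```

Ties at the cut-off are broken by input order here; a real analysis may prefer to keep all tied nodes.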

2.7. Functional enrichment analysis for the backbone network

FunRich is a free, Windows-based analytical application for functional enrichment and interaction network analysis of genes and proteins. The FunRich analysis is based on a backend database of human-specific aggregated genomic and proteomic datasets with more than 1.5 million annotations, which is updated often (Pathan et al., 2015). A hypergeometric distribution test is used to determine the statistical significance of the backbone genes in the functional enrichment analysis. Bar graphs and interactive, downloadable vector graphic network visualizations are used to display the results (Pan et al., 2017).
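A common way to assess this kind of over-representation is a one-sided hypergeometric test; whether it matches FunRich's internal calculation exactly is an assumption of this sketch. Here `k` is the number of query genes carrying an annotation, `n` the size of the query list, `K` the number of annotated genes in the background, and `N` the background size; all four names are illustrative.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n genes are drawn without replacement from N,
    of which K carry the annotation of interest."""
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


# Made-up numbers: 5 of 5 query genes annotated, 5 of 10 background genes annotated.
p_example = hypergeom_pvalue(5, 5, 5, 10)
```

When many terms are tested, the resulting p-values would normally also be corrected for multiple testing, which this sketch leaves out.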

2.8. Transcription factor and kinase enrichment analysis

eXpression2Kinases (X2K) (http://X2K.cloud) computationally predicts the involvement of upstream cell signalling pathways given a profile of differentially expressed genes from an extended network (Clarke et al., 2018). To find the transcription factors that are most likely to control the expression of the differentially expressed genes, X2K first computes transcription factor enrichment. To build a subnetwork, X2K then connects these enriched transcription factors via known protein-protein interactions (PPIs). In the last stage, the members of the extended network are subjected to kinase enrichment analysis. By comparing several configurations, it is possible to identify the best number of criteria for X2K Web's default settings (Singh et al., 2020).

2.9. Clustering analysis in the PPI network (module analysis)

Several sub-networks or functional modules (clusters) of proteins contribute to a highly complicated biological process intimately connected with a vast biological network. These modules impact each participating node in the network with a defined purpose, regardless of how they affect the core network. Significant modules in the PPI network were detected using Cytoscape's Molecular Complex Detection (MCODE) plug-in. This approach detects dense, connected regions by weighting nodes based on their local neighborhood density. The input parameters were: degree cut-off = 4, node score cut-off = 0.2, k-core = 2, and maximum depth = 100 (Yu et al., 2020; Zhou et al., 2020).
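The idea of weighting a node by its local neighborhood density can be illustrated with a simplified sketch. It is not the MCODE plug-in: it scores a node as the highest k for which the node's neighborhood (the node plus its direct neighbors) contains a non-empty k-core, multiplied by the edge density of that core, and it omits MCODE's later cluster-growing and post-processing stages. The function names and the toy graph are hypothetical.

```python
def k_core(adj, k):
    """Sub-graph in which every remaining node has at least k neighbours."""
    core = {n: set(nb) for n, nb in adj.items()}
    changed = True
    while changed:
        changed = False
        for n in [n for n, nb in core.items() if len(nb) < k]:
            for m in core.pop(n):
                if m in core:
                    core[m].discard(n)
            changed = True
    return core


def node_weight(adj, v):
    """Highest k-core level of v's neighbourhood times that core's density."""
    nbhd = adj[v] | {v}
    sub = {n: adj[n] & nbhd for n in nbhd}  # induced neighbourhood graph
    k, best = 0, sub
    while True:
        nxt = k_core(sub, k + 1)
        if not nxt:
            break
        k, best = k + 1, nxt
    n_nodes = len(best)
    n_edges = sum(len(nb) for nb in best.values()) // 2
    density = 2 * n_edges / (n_nodes * (n_nodes - 1)) if n_nodes > 1 else 0.0
    return k * density


# Made-up toy graph: a 4-clique (a, b, c, d) with a pendant node e on a.
toy_adj = {
    "a": {"b", "c", "d", "e"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"a", "b", "c"},
    "e": {"a"},
}
toy_weights = {v: node_weight(toy_adj, v) for v in toy_adj}
```

Nodes inside the clique receive a higher weight than the pendant node, so a seed-and-grow clustering step would start from the clique.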

3. Results

3.1. Identification of DEGs for the PPI network construction and topology analysis

Using the online analysis tool GEO2R, we evaluated the DEGs between AD and control samples for the microarray expression dataset GSE140829, as represented in Fig. 1. We found 1674 DEGs using the Venn diagram to visualize the DEGs among AD and control samples. Furthermore, the STRING online database was used to obtain the PPI network of the DEGs. A total of 1674 DEGs (1015 up-regulated and 657 down-regulated) were submitted to STRING, which returned a network with 1198 nodes and 1992 edges at a high-confidence interaction score of > 0.7. The primary interaction data from the STRING server were used to build a core network. In the resulting binary scale-free network, all interactions were unweighted and undirected. The core network consists of 676 nodes connected by 1955 edges; here, hub genes were defined as the nodes having the highest number of interactions with neighbors, as highlighted in Fig. 2 (A). In addition, several topological measures, including centrality metrics such as betweenness centrality and closeness centrality, were used to quantify the functional importance of each gene. The shortest path connects any two selected nodes in the network, and the network's betweenness centrality (BC), degree, average clustering coefficient, and their distributions are shown in Fig. 3.

3.2. Functional enrichment and pathway analysis for core-network

The GO functional enrichment analysis of the DEGs in the core network was carried out using the DAVID program. The top GO functional enrichment terms from DAVID are represented in Supplementary Fig. 1 (standard cut-off: p ≤ 0.05). According to the functional enrichment analysis, genes in the core network are enriched in many biological processes, such as GO:0006915; apoptotic process, GO:0045893; positive regulation of transcription, DNA-templated, GO:0006468; protein phosphorylation, GO:0046777; protein autophosphorylation, GO:0002181; cytoplasmic translation, GO:0007165; signal transduction, GO:0035556; intracellular signal transduction, GO:0006954; inflammatory response, GO:0042981; regulation of apoptotic process and GO:0097191; extrinsic apoptotic signaling pathway. The KEGG pathway analysis also indicated that the DEGs in the core network are mainly involved in the downstream signaling of cancer, multiple neurodegenerative disorders, lipid and atherosclerosis, Salmonella infection, Chemokine signaling pathway, Human immunodeficiency virus 1 infection, Coronavirus disease-COVID-19, Human papillomavirus infection, and tuberculosis. These signaling pathways or disease conditions enhance pro-inflammatory action, resulting in severe synaptic dysfunction and/or neuronal death.

3.3. Identification of critical nodes in PPI

Topological parameters were determined with Network Analyzer v3.3.1 to predict and analyze the key nodes, or hub nodes, of the core network. Hub nodes are a small number of highly interconnected nodes that are particularly important in any network. After constructing a network with 676 nodes and 1995 edges, nodes with high BC and high degree were selected using a cut-off equal to the sum of the mean and the standard deviation of the respective measure. In addition, the nodes common to both the high-BC and high-degree sets were selected; these are critical for constructing the backbone network. We identified 81 nodes with high degree values and 44 nodes with high BC values. In addition, 32 nodes present in both the high-degree and high-BC sets were identified (Supplementary Table 1). The Heat Shock Protein 90 Alpha Family Class A Member 1 (HSP90AA1), ribosomal protein S27a (RPS27A), cAMP-responsive transcription factor-binding protein (CREBBP), Polyubiquitin-C (UBC), growth factor receptor-bound protein 2 (GRB2), and Mitogen-Activated Protein Kinase 3 (MAPK3) are the top six nodes present in both the high-BC and high-degree sets.
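The mean-plus-standard-deviation rule can be written down in a few lines. This sketch assumes the population standard deviation; the text does not say whether the population or the sample form was used, and the scores shown are made up.

```python
from statistics import mean, pstdev


def high_value_nodes(scores):
    """Return (cutoff, nodes above cutoff), with cutoff = mean + SD.

    Population SD is an assumption; swap in statistics.stdev for the sample SD.
    """
    values = list(scores.values())
    cutoff = mean(values) + pstdev(values)
    return cutoff, {n for n, s in scores.items() if s > cutoff}


# Made-up degree and BC scores for four nodes.
toy_degree_scores = {"a": 1, "b": 1, "c": 1, "d": 9}
toy_bc_scores = {"a": 0.0, "b": 0.1, "c": 0.0, "d": 0.9}

degree_cutoff, high_degree = high_value_nodes(toy_degree_scores)
bc_cutoff, high_bc = high_value_nodes(toy_bc_scores)
key_nodes = high_degree & high_bc  # nodes high in both measures
```

The intersection in the last line corresponds to selecting the nodes common to the high-degree and high-BC sets.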

3.4. Choosing hub genes from Cytohubba

The top 10 hub proteins from each of five Cytohubba ranking methods (betweenness centrality, closeness centrality, bottleneck, radiality, and degree) were selected to predict the hub proteins. The top ten genes from the five methods were then overlapped, and 7 hub genes were eventually found: HSP90AA1, RPS27A, CREBBP, UBC, GRB2, MAPK3, and TRAF6, as shown in Fig. 4 (A). Notably, HSP90AA1, RPS27A, CREBBP, UBC, GRB2, and MAPK3 are the genes common to the high-degree and high-BC sets, with TRAF6 additionally identified by Cytohubba (Supplementary Table 2). The key involvement of these selected genes in disease progression was further confirmed by constructing a backbone network.
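The overlap of the top-ranked genes across ranking methods can be sketched as a set intersection. Treating the overlap as a strict intersection over all methods is an assumption of this sketch, and the gene names and scores below are placeholders rather than values from the study.

```python
def top_n(scores, n=10):
    """Names of the n highest-scoring genes for one ranking method."""
    return set(sorted(scores, key=scores.get, reverse=True)[:n])


def consensus_hubs(rankings, n=10):
    """Genes that appear in the top n of every ranking method.

    rankings: dict mapping method name -> {gene: score}.
    """
    tops = [top_n(scores, n) for scores in rankings.values()]
    return set.intersection(*tops)


# Placeholder scores for two methods and three genes.
toy_rankings = {
    "degree": {"g1": 5, "g2": 4, "g3": 1},
    "betweenness": {"g1": 0.9, "g3": 0.5, "g2": 0.1},
}
toy_hubs = consensus_hubs(toy_rankings, n=2)
```

A less strict variant would keep genes that appear in most, rather than all, of the rankings.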

3.5. Backbone network construction and functional enrichment analysis

The cut-off for high BC was set at 5% of the network's total nodes; the backbone network was made up of 34 nodes and the 81 links that connected them (Supplementary Table 3). With the highest BC and degree, we found HSP90AA1, RPS27A, CREBBP, UBC, GRB2, and MAPK3 as the top six genes. They also have the highest CC, indicating that they are in charge of information flow in the backbone network. Each of these nodes has 33 neighbors, comprising the other top-six nodes together with PRKCB, ITGAM, MAPK14, TRAF6, TYROBP, HSPA5, PPP1CC, FOS, SKP1, CD14, PLD1, HTT, CREB1, PTPRC, EIF4A3, HIST2H2AC, SRSF1, STX5, NUP98, FLT3, PRKCD, DDX5, RPL5, YWHAE, RPS18, SREBF1, BTK, and CYP2S1.

Further, the backbone genes were examined for functional and pathway enrichment using the FunRich database. Most of the genes were mainly involved in biological processes such as signal transduction, cell communication, and protein metabolism, followed by regulation of nucleobase, nucleoside, nucleotide, and nucleic acid metabolism. The backbone network with several functionally enriched processes is shown in Fig. 5. Furthermore, the backbone genes were imported into Cytoscape for network topological analysis. The genes returned with high BC and high degree are shown in Table 1. Cytohubba analysis of the backbone network returned HSP90AA1, HSPA5, GRB2, MAPK3, TRAF6, and UBC as the hub genes, as shown in the backbone network construct in Fig. 2 (B). The overlapping genes from all five parameters (betweenness, bottleneck, closeness, degree, and radiality) of the backbone network are shown in Fig. 4 (B). Here, HSP90AA1 is the major identified biomarker, with the highest degree and BC values in the backbone network analysis. Besides, GRB2, MAPK3, TRAF6, and UBC were identified in both the core network and backbone network analyses. However, RPS27A and CREBBP were observed in the core network analysis, whereas HSPA5 was observed in the backbone network analysis.

3.6. Identification of transcriptional factors and its kinases in the PPI for the screened biomarkers

eXpression2Kinases Web was employed to identify the transcriptional regulators. The PPI network of highly enriched transcription factors with p value ≤ 0.05 and the expanded PPI network are shown in Fig. 6 (A) and (B), respectively. For the identified biomarkers, transcription factors such as ZMIZ1, CHD1, TAF1, E2F1, YY1, and MYC regulate HSP90AA1. ZMIZ1, CREB1, TAF1, BRCA1, KAT2A, NELFE, NFYB, SP1, ATF2, and FOS regulate HSPA5. RUNX1, GATA2, CHD1, BRCA1, and NRF1 regulate GRB2. ZBTB7A, USF1, and USF2 regulate MAPK3. ELF and STAT5A regulate TRAF6. RUNX1, GATA2, RCOR1, and YY1 regulate CREBBP. SPI1, ZMIZ1, TAF1, STAT3, ELF1, MAX, MYC, and ATF2 regulate RPS27A.

Alongside these, various kinases that are important for signal transduction are present, because the majority of the enriched biological functions are the same. Kinases in the PPI network were also identified with X2K Web. The top enriched kinases in the PPI network with p value ≤ 0.05 are shown in Fig. 6 (C). We also analyzed the kinases that regulate the enriched transcriptional factors, which in turn regulate the expression of the screened markers, as shown in Table 2.

3.7. Identification of clusters in the PPI network (MCODE analysis)

The protein-protein interaction network was analyzed using the MCODE method to discover strongly linked proteins. This method was used to perform a module (clustering) analysis and to confirm the functional DEGs. The clusters were filtered using the K-core = 4 parameter provided in the approach to test the relevance of the functional partners to the core network of essential genes. The genes with the most closely linked interactors are grouped together based on the number of interactions between each node. The clustering analysis of the genes in the interaction network yielded eight strongly connected clusters (C1-C8). The top cluster has 19 nodes and 161 edges, with a score of 17.889, and the remaining clusters are listed in Table 3. Additionally, functional enrichment was examined for biological processes, molecular functions, and cellular components. Here, C1 is primarily involved in protein metabolism, C2 and C5 in cellular transport, C3, C4, and C7 in nucleic acid metabolism, and C6 and C8 in signal transduction.

4. Discussion

Alzheimer's disease is a neuropsychiatric and neurodegenerative disorder with a growing incidence rate all over the world. This study identifies novel genes or proteins involved in pathophysiological aspects such as protein aggregation and neuroinflammation. The altered biological and molecular activities of the DEGs, as well as their signaling pathways, are responsible for the development of AD. Furthermore, these genes can be targeted in therapeutic approaches for the treatment of AD. In this study, we evaluated the DEGs from AD patients compared with normal senile individuals, extracted from the GEO database. A total of 1674 DEGs, including both upregulated and downregulated genes, were identified from AD and control blood samples and were used for the construction and analysis of the protein-protein interaction network. Notably, this extended network was built using the STRING database and has 744 nodes and 1992 edges. Furthermore, a giant or core network with 676 nodes and 1995 edges was created from the extended network. In addition, there are many subnetworks with fewer nodes and edges in the extended network. The first subnetwork has 5 nodes and 4 edges, the second subnetwork has 4 nodes and 3 edges, and the third subnetwork has 3 nodes and 2 edges. The genes present in these subnetworks do not appear to be substantially related to disease severity. However, core network analysis is the main route for identifying truncated proteins by chaperones in AD pathogenesis.

According to functional enrichment studies, genes in the core network are enriched in many biological processes such as GO:0006915; apoptotic process, GO:0045893; positive regulation of transcription, DNA-templated, GO:0006468; protein phosphorylation, GO:0046777; protein autophosphorylation, GO:0002181; cytoplasmic translation, GO:0007165; signal transduction, GO:0035556; intracellular signal transduction, GO:0006954; inflammatory response, GO:0042981; regulation of apoptotic process and GO:0097191; extrinsic apoptotic signaling pathway. These enriched biological functions support prior studies suggesting that oligomeric amyloid beta buildup causes widespread protein phosphorylation, which leads to prolonged autophosphorylation of membrane receptor kinases. The activation of the GSK-3 beta signaling pathway as a result of autophosphorylation destabilizes the microtubules, leading to the production of neurofibrillary tangles as downstream pathogenesis of AD. The KEGG pathway analysis also indicated that the DEGs in the core network are mainly involved in the downstream signaling of cancer, multiple neurodegenerative disorders, lipid and atherosclerosis, Salmonella infection, Chemokine signaling pathway, Human immunodeficiency virus 1 infection, Coronavirus disease-COVID-19, Human papillomavirus infection, and tuberculosis. These signaling pathways or disease conditions enhance the pro-inflammatory signaling pathways, resulting in severe synaptic dysfunction and/or neuronal death. The DEGs enriched in multiple neurodegenerative disorders, including AD, are shown in Fig. 7.

To examine highly interacting pathways related to the pathophysiology of AD, the significantly interacting key nodes in the core network were identified. Using a cut-off equal to the sum of the mean and standard deviation, we obtained 44 genes with high BC values (cut-off value = 0.019528) and 81 genes with high degree values (cut-off value = 12.66321). Further, we investigated the genes common to both the high-BC and high-degree sets; we found 32 genes, with the top six genes being HSP90AA1, RPS27A, CREBBP, UBC, GRB2, and MAPK3. This is further confirmed by the presence of all six genes among the hub genes (i.e., the top 10 genes) of the core network identified by the five ranking methods of Cytohubba. Additionally, using the top 5% of high-BC genes in the core network, a backbone network was built to further validate the information flow in the core network. A total of 34 genes (5%) were selected from the 676 genes of the core network and were submitted to the FunRich software for functional enrichment analysis. In the backbone network, the majority of the genes are involved in biological processes such as signal transduction (41.2%), cell communication (29.4%), and protein metabolism (23.5%), followed by regulation of nucleobase, nucleoside, nucleotide, and nucleic acid metabolism (23.5%). Further, these backbone network genes were again imported into Cytoscape for topological analysis, which returned HSP90AA1 in the first position for all three parameters: high degree, high BC, and Cytohubba. Additionally, HSPA5 was in the second position for high BC and Cytohubba. Further, we compiled the top genes from the core network and backbone network analyses. We obtained HSP90AA1, UBC-RPS27A (closely associated genes), CREBBP, GRB2, MAPK3, and TRAF6 as the major biomarkers enriched in AD pathogenesis. However, HSPA5 is a known biomarker in AD. Both HSP90AA1 and HSPA5 depend mainly on ATP for the regulation and clearance of protein aggregates. However, they are not effective against fibril conformations as seen in AD, and this prompts increased secretion and accumulation, which further promotes the degeneration of neurons and glial cells.

The newly identified biomarker HSP90AA1 is highly abundant in eukaryotic cells, accounting for 1–2% of total cellular proteins. Its concentration can rise to as high as 4–6% under stressful conditions. FunRich functional enrichment analysis also shows that HSP90AA1 is involved in the biological process of protein metabolism (GO:0019538), and its protein domains include a coiled-coil region and HATPase_c. It also shows a fold enrichment of 101.92 in vRNP Assembly, 22.73 in eNOS activation, 21.04 in VEGFR1 specific signalling, 15.23 in glucocorticoid receptor regulatory network, 13.64 in Sema3A PAK dependent axon repulsion, and 12.36 in IL-2 signaling events mediated by PI3K. In this regard, aberrantly expressed molecular chaperones have critical functions in regulating the aggregation of damaged proteins in cells, such as amyloid beta 1−42. This critically misfolded or aggregated protein induces tau phosphorylation as downstream pathogenesis. Some studies suggest that inhibition of over-expressed HSP90AA1 reduces kinase activity and limits the tau phosphorylation seen in AD (Ou et al., 2014). Furthermore, HSP90-induced amyloid beta 1−42 also activates glial cells such as astrocytes and microglia and exacerbates the inflammatory response, leading to severe neuronal degeneration. Besides, HSP90 inhibitory studies show a promising role in the downregulation of several inflammatory responses, mainly by promoting the ubiquitin-proteasome pathway (Dukay et al., 2019; He et al., 2019); as such, targeting HSP90AA1 in AD may reduce the pathophysiology by lowering the accumulation of altered protein structures.

RPS27A (Ubiquitin-40S ribosomal protein), which is also implicated in protein metabolism, has a UBQ protein domain (Montellese et al., 2020). It is involved in biological processes such as IRAK2 mediated TAK1 activation with 60.99-fold change, Fanconi anemia pathway with 50.89-fold change, NF-kB activation with 46.92-fold change, p75NTR signalling complexes with 46.92-fold change, NRIF signaling with 43.57-fold change, and TRAF6 mediated induction of TAK1 complex with 43.57-fold change. Because ubiquitin immunoreactivity is present in AD-related neuronal aggregates, the ubiquitin-proteasome system (UPS) has been thought to have a role in the disease's progression (Gong et al., 2016). According to a recent study, a mutant version of UBC called Ub+1 is connected to both the early and late stages of AD, which are characterized by synaptic dysfunction and neurodegeneration. Ub+1 is expressed through a process known as molecular misreading, in which non-DNA-encoded dinucleotide deletions occur in mRNA within or close to GAGAG motifs (Lam et al., 2000). In addition, dysfunction of other ubiquitin-proteasome members may also promote Aβ accumulation. Altogether, oxidative stress and Aβ accumulation induce the mutant ubiquitin, which was shown to reduce the activity of the proteasome in vitro. However, UBB+1 also increased the expression of heat shock proteins (Ding & Keller, 2001).

It has long been understood that cyclic-AMP response element binding protein (CREB) is crucial for the conversion of short-term memory to long-term memory, which is mediated by long-lasting alterations in synaptic plasticity (Alberini, 2009). It is involved in biological processes, specifically the regulation of nucleobase, nucleoside, nucleotide, and nucleic acid metabolism, and it has protein domain structures such as a coiled-coil region, a zinc finger region, a bromodomain, and KAT1. These protein domains are mainly involved in protein acetylation, which can significantly modify a protein's surface characteristics, solubility, hydrophobicity, and ability to function. It is also involved in biological pathways such as the TRAF3-dependent IRF activation pathway with 20.46-fold change, NICD traffics to the nucleus with 18.60-fold change, the notch-HLH transcription pathway with 18.60-fold change, retinoic acid receptors-mediated signaling with 17.30-fold change, and TRAF6 mediated IRF7 activation with 16.30-fold change. Recently, CREB signaling has been linked to several pathological states of the brain, such as cognitive and neurodegenerative illnesses, including AD. Under normal conditions, protein kinase A phosphorylates CREBBP in the nucleus, and this helps in the production of BDNF (Miranda et al., 2019). In AD, amyloid beta induces the extensive phosphorylation of tyrosine receptors, which results in the successive activation of phosphatases 1 and 2; this attenuates the activation site of CREBBP (serine 133 residue), lowers the level of BDNF, neuronal excitability, and plasticity, and triggers neurodegeneration (Wang et al., 2018). As a result, CREBBP levels can be used as a biomarker for unconsolidated long-term memory potentiation in Alzheimer's patients.

According to the FunRich functional enrichment analysis, the adaptor protein growth factor receptor-bound protein 2 (Grb2) takes part in biological processes such as signal transduction, regulation of the cell cycle, and apoptosis in AD. Its enriched biological pathways include Spry regulation of FGF signaling (38.18-fold change), signal attenuation (37.03-fold change), signal regulatory protein (SIRP) family interactions (37.03-fold change), negative regulation of FGFR signaling (35.89-fold change), and the SHC-mediated cascade (34.08-fold change). Its major protein domains are SRC homology 2 and 3 (SH2 and SH3). Consistent with the enriched biological pathways, the phosphotyrosine-binding domain (PTB) and the SRC homology 2 domain (SH2) of the adaptor proteins Shc and Grb2 can directly bind tyrosine-phosphorylated APP. The MAP kinase cascade components SOS, Ras, Raf, and MEK are then recruited, which activates ERK1/2 (Nizzari et al., 2012). Grb2 can take part in this process in two possible ways: by binding directly to APP or by being recruited by Shc. This mode of altering ERK1/2 activity may help to explain why neurons begin to degenerate in AD patients. In addition, a pathogenic role for Shc/Grb2 binding to Aβ during AD development is supported by the observation that Aβ/ShcA and Aβ/Grb2 complexes are significantly increased in AD brain compared to controls (Meister et al., 2013).

TRAF6 is increasingly being linked to conditions affecting the central nervous system, including neurodegenerative disorders and neuropathic pain. It associates with various kinases to control signaling pathways and functions as an E3 ubiquitin ligase. It is also involved in many biological pathways, including IRAK2-mediated activation of the TAK1 complex (60.99-fold change), NF-kB activation and survival signaling (46.92-fold change), p75NTR recruitment of signaling complexes (46.92-fold change), and induction of TAK1 complex signaling (43.57-fold change). Its enriched protein domains include a coiled-coil region, a RING domain, and a MATH domain. In AD, TRAF6 acts on Beclin-1 (an autophagy activator protein) through TLR4, resulting in ubiquitination of Beclin-1. This prevents autophagy and induces an inflammatory response. Furthermore, numerous activated astrocytes and microglia near amyloid plaques in Alzheimer's disease generate inflammatory cytokines, including IL-1β and TNF-α, and TRAF6 is critical to this process (Dou et al., 2018).

Given the significant role of the identified biomarkers in the pathological process of AD, we used eXpression2Kinases (X2K) to identify their transcription factors and the kinases that regulate them. We found that:

  • the transcriptional regulator TAF1 for HSP90AA1, HSPA5, and RPS27A is phosphorylated by CK2ALPHA, CDK1, GSK3B, and CDK2 kinases.

  • HIPK2, CSNK2A1, CDK1, AKT1, GSK3B, PRKACA, RPS6KA3, RPS6KA1, and PKBALPHA kinases control the HSPA5 transcription regulator CREB1.

  • CK2ALPHA, MAPK1, MAPK3, MAPK14, CDK1, AKT1, and GSK3B kinases control the HSPA5 transcription regulator SP1.

  • RUNX1, a transcriptional regulator of GRB2 and CREBBP, is controlled by HIPK2, MAPK1, MAPK3, CSNK2A1, MAPK14, CDK1, GSK3B, and MAPK8 kinases.

  • CDK1, AKT1, and GSK3B kinases regulate transcription regulator GATA2 for GRB2 and CREBBP.

  • CREBBP transcription regulator RCOR1 is regulated by CK2ALPHA and CDK2 kinases.

  • RPS27A transcription regulator SPI1 is regulated by CK2ALPHA, CSNK2A1, CK2, ABL1 and PRKCA kinases.
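
The relationships listed above can be tallied to see which kinases recur across transcription factors. The following is a minimal sketch, not part of the original X2K analysis: the dictionary simply transcribes the list, the function names are hypothetical, and kinase names are compared as written, so possible aliases of the same kinase are not merged.

```python
from collections import Counter

# Transcription factor -> kinases, transcribed from the list above.
# Names are kept as reported; aliases are NOT merged (assumption).
TF_KINASES = {
    "TAF1": ["CK2ALPHA", "CDK1", "GSK3B", "CDK2"],
    "CREB1": ["HIPK2", "CSNK2A1", "CDK1", "AKT1", "GSK3B",
              "PRKACA", "RPS6KA3", "RPS6KA1", "PKBALPHA"],
    "SP1": ["CK2ALPHA", "MAPK1", "MAPK3", "MAPK14", "CDK1", "AKT1", "GSK3B"],
    "RUNX1": ["HIPK2", "MAPK1", "MAPK3", "CSNK2A1", "MAPK14",
              "CDK1", "GSK3B", "MAPK8"],
    "GATA2": ["CDK1", "AKT1", "GSK3B"],
    "RCOR1": ["CK2ALPHA", "CDK2"],
    "SPI1": ["CK2ALPHA", "CSNK2A1", "CK2", "ABL1", "PRKCA"],
}


def kinase_frequency(tf_kinases):
    """Count how many transcription factors each kinase is listed for."""
    counts = Counter()
    for kinases in tf_kinases.values():
        counts.update(set(kinases))  # set() guards against duplicates per TF
    return counts


def shared_kinases(tf_kinases, min_tfs):
    """Return kinases listed for at least `min_tfs` transcription factors."""
    counts = kinase_frequency(tf_kinases)
    return sorted(k for k, n in counts.items() if n >= min_tfs)


if __name__ == "__main__":
    for kinase, n in kinase_frequency(TF_KINASES).most_common(5):
        print(f"{kinase}: listed for {n} transcription factors")
    print(shared_kinases(TF_KINASES, 4))
```

Under these assumptions, CDK1 and GSK3B are each listed for five of the seven transcription factors, and CK2ALPHA for four.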

5. Conclusion And Future Remarks

The current analysis of the GEO dataset identified key metabolic pathways and processes that contribute to neurodegeneration in AD, which increases the likelihood of cognitive dysfunction. The DEGs were primarily enriched in the regulation of diverse biological processes and pathways. Likewise, genes such as HSP90AA1, HSPA5, CREBBP, UBC, GRB2, MAPK3, and TRAF6 are mainly involved in signal transduction; their dysregulation destabilizes downstream signaling and contributes to the pathogenesis of AD. Furthermore, we have endeavored to identify not only the genes that contribute to the development of the disease, but also their transcription factor regulators and the kinases that control them. Overall, the identified hub genes, their transcription factors, and their kinases are potential targets for treating AD. Before the identified hub proteins can be labeled as disease biomarkers, their possible functions must be confirmed by additional biological and clinical experimentation.

Declarations

Author’s contributions

Pavan K J: Material Preparation, Data Collection, Analysis and Writing-Original Draft Preparation. Praveenkumar Shetty: Conceptualization and Supervision. Pavan Gollapalli: Data Analysis and Critical inputs. Vijaykrishnaraj M: Reviewing and Editing. Lobo Manuel Alexander: Reviewing and Clinical Inputs. Prakash Patil: Reviewing and Editing. All Authors Read and Approved the Final Manuscript.

Declaration of Competing Interest:

The authors report that there are no conflicts of interest.

Funding:

This work was not supported by any external funding agency. It was registered under the Ph.D. program of Mr. Pavan K J (N20PHDBS106) at NITTE (Deemed to be University), Mangalore.

Acknowledgements:

The author PKJ expresses his gratitude to NITTE (Deemed to be University) for providing a Ph.D. fellowship to execute this research work. Dr. PKS and co-authors thank NITTE (Deemed to be University) for providing the necessary facilities.

References

  1. Alzheimer's disease facts and figures. (2020). Alzheimers Dement. https://doi.org/10.1002/alz.12068
  2. Alberini, C. M. (2009). Transcription factors in long-term memory and synaptic plasticity. Physiol Rev, 89(1), 121-145. https://doi.org/10.1152/physrev.00017.2008
  3. Ashton, N. J., Leuzy, A., Karikari, T. K., Mattsson-Carlgren, N., Dodich, A., Boccardi, M., Corre, J., Drzezga, A., Nordberg, A., Ossenkoppele, R., Zetterberg, H., Blennow, K., Frisoni, G. B., Garibotto, V., & Hansson, O. (2021). The validation status of blood biomarkers of amyloid and phospho-tau assessed with the 5-phase development framework for AD biomarkers. Eur J Nucl Med Mol Imaging, 48(7), 2140-2156. https://doi.org/10.1007/s00259-021-05253-y
  4. Assenov, Y., Ramirez, F., Schelhorn, S. E., Lengauer, T., & Albrecht, M. (2008). Computing topological parameters of biological networks. Bioinformatics, 24(2), 282-284. https://doi.org/10.1093/bioinformatics/btm554
  5. Barrett, T., Wilhite, S. E., Ledoux, P., Evangelista, C., Kim, I. F., Tomashevsky, M., Marshall, K. A., Phillippy, K. H., Sherman, P. M., Holko, M., Yefanov, A., Lee, H., Zhang, N., Robertson, C. L., Serova, N., Davis, S., & Soboleva, A. (2013). NCBI GEO: archive for functional genomics data sets--update. Nucleic Acids Res, 41(Database issue), D991-995. https://doi.org/10.1093/nar/gks1193
  6. Chu, D., & Liu, F. (2019). Pathological Changes of Tau Related to Alzheimer's Disease. ACS Chem Neurosci, 10(2), 931-944. https://doi.org/10.1021/acschemneuro.8b00457
  7. Clarke, D. J. B., Kuleshov, M. V., Schilder, B. M., Torre, D., Duffy, M. E., Keenan, A. B., Lachmann, A., Feldmann, A. S., Gundersen, G. W., Silverstein, M. C., Wang, Z., & Ma'ayan, A. (2018). eXpression2Kinases (X2K) Web: linking expression signatures to upstream cell signaling networks. Nucleic Acids Res, 46(W1), W171-W179. https://doi.org/10.1093/nar/gky458
  8. Clough, E., & Barrett, T. (2016). The Gene Expression Omnibus Database. Methods Mol Biol, 1418, 93-110. https://doi.org/10.1007/978-1-4939-3578-9_5
  9. Cummings, J. (2019). The Role of Biomarkers in Alzheimer's Disease Drug Development. Adv Exp Med Biol, 1118, 29-61. https://doi.org/10.1007/978-3-030-05542-4_2
  10. DeTure, M. A., & Dickson, D. W. (2019). The neuropathological diagnosis of Alzheimer's disease. Mol Neurodegener, 14(1), 32. https://doi.org/10.1186/s13024-019-0333-5
  11. Ding, Q., & Keller, J. N. (2001). Proteasome inhibition in oxidative stress neurotoxicity: implications for heat shock proteins. J Neurochem, 77(4), 1010-1017. https://doi.org/10.1046/j.1471-4159.2001.00302.x
  12. Dou, Y., Tian, X., Zhang, J., Wang, Z., & Chen, G. (2018). Roles of TRAF6 in Central Nervous System. Curr Neuropharmacol, 16(9), 1306-1313. https://doi.org/10.2174/1570159X16666180412094655
  13. Dukay, B., Csoboz, B., & Toth, M. E. (2019). Heat-Shock Proteins in Neuroinflammation. Front Pharmacol, 10, 920. https://doi.org/10.3389/fphar.2019.00920
  14. Gong, B., Radulovic, M., Figueiredo-Pereira, M. E., & Cardozo, C. (2016). The Ubiquitin-Proteasome System: Potential Therapeutic Targets for Alzheimer's Disease and Spinal Cord Injury. Front Mol Neurosci, 9, 4. https://doi.org/10.3389/fnmol.2016.00004
  15. He, G. L., Luo, Z., Shen, T. T., Yang, J., Li, P., Luo, X., & Yang, X. S. (2019). Inhibition of HSP90beta by ganetespib blocks the microglial signalling of evoked pro-inflammatory responses to heat shock. Int J Biochem Cell Biol, 106, 35-45. https://doi.org/10.1016/j.biocel.2018.11.003
  16. Hwang, S., Son, S. W., Kim, S. C., Kim, Y. J., Jeong, H., & Lee, D. (2008). A protein interaction network associated with asthma. J Theor Biol, 252(4), 722-731. https://doi.org/10.1016/j.jtbi.2008.02.011
  17. Kinney, J. W., Bemiller, S. M., Murtishaw, A. S., Leisgang, A. M., Salazar, A. M., & Lamb, B. T. (2018). Inflammation as a central mechanism in Alzheimer's disease. Alzheimers Dement (N Y), 4, 575-590. https://doi.org/10.1016/j.trci.2018.06.014
  18. Lam, Y. A., Pickart, C. M., Alban, A., Landon, M., Jamieson, C., Ramage, R., Mayer, R. J., & Layfield, R. (2000). Inhibition of the ubiquitin-proteasome system in Alzheimer's disease. Proc Natl Acad Sci U S A, 97(18), 9902-9906. https://doi.org/10.1073/pnas.170173897
  19. Long, J. M., & Holtzman, D. M. (2019). Alzheimer Disease: An Update on Pathobiology and Treatment Strategies. Cell, 179(2), 312-339. https://doi.org/10.1016/j.cell.2019.09.001
  20. Marucci, G., Buccioni, M., Ben, D. D., Lambertucci, C., Volpini, R., & Amenta, F. (2021). Efficacy of acetylcholinesterase inhibitors in Alzheimer's disease. Neuropharmacology, 190, 108352. https://doi.org/10.1016/j.neuropharm.2020.108352
  21. Meister, M., Tomasovic, A., Banning, A., & Tikkanen, R. (2013). Mitogen-Activated Protein (MAP) Kinase Scaffolding Proteins: A Recount. Int J Mol Sci, 14(3), 4854-4884. https://doi.org/10.3390/ijms14034854
  22. Miranda, M., Morici, J. F., Zanoni, M. B., & Bekinschtein, P. (2019). Brain-Derived Neurotrophic Factor: A Key Molecule for Memory in the Healthy and the Pathological Brain. Front Cell Neurosci, 13, 363. https://doi.org/10.3389/fncel.2019.00363
  23. Montellese, C., van den Heuvel, J., Ashiono, C., Dorner, K., Melnik, A., Jonas, S., Zemp, I., Picotti, P., Gillet, L. C., & Kutay, U. (2020). USP16 counteracts mono-ubiquitination of RPS27a and promotes maturation of the 40S ribosomal subunit. Elife, 9. https://doi.org/10.7554/eLife.54435
  24. Montoliu-Gaya, L., Strydom, A., Blennow, K., Zetterberg, H., & Ashton, N. J. (2021). Blood Biomarkers for Alzheimer's Disease in Down Syndrome. J Clin Med, 10(16). https://doi.org/10.3390/jcm10163639
  25. Nizzari, M., Thellung, S., Corsaro, A., Villa, V., Pagano, A., Porcile, C., Russo, C., & Florio, T. (2012). Neurodegeneration in Alzheimer disease: role of amyloid precursor protein and presenilin 1 intracellular signaling. J Toxicol, 2012, 187297. https://doi.org/10.1155/2012/187297
  26. Ou, J. R., Tan, M. S., Xie, A. M., Yu, J. T., & Tan, L. (2014). Heat shock protein 90 in Alzheimer's disease. Biomed Res Int, 2014, 796869. https://doi.org/10.1155/2014/796869
  27. Pan, Y., Liu, G., Yuan, Y., Zhao, J., Yang, Y., & Li, Y. (2017). Analysis of differential gene expression profile identifies novel biomarkers for breast cancer. Oncotarget, 8(70), 114613-114625. https://doi.org/10.18632/oncotarget.23061
  28. Pathan, M., Keerthikumar, S., Ang, C. S., Gangoda, L., Quek, C. Y., Williamson, N. A., Mouradov, D., Sieber, O. M., Simpson, R. J., Salim, A., Bacic, A., Hill, A. F., Stroud, D. A., Ryan, M. T., Agbinya, J. I., Mariadason, J. M., Burgess, A. W., & Mathivanan, S. (2015). FunRich: An open access standalone functional enrichment and interaction network analysis tool. Proteomics, 15(15), 2597-2601. https://doi.org/10.1002/pmic.201400515
  29. Raman, K. (2010). Construction and analysis of protein-protein interaction networks. Autom Exp, 2(1), 2. https://doi.org/10.1186/1759-4499-2-2
  30. Sekaran, T. S. G., Kedilaya, V. R., Kumari, S. N., Shetty, P., & Gollapalli, P. (2021). Exploring the differentially expressed genes in human lymphocytes upon response to ionizing radiation: a network biology approach. Radiat Oncol J, 39(1), 48-60. https://doi.org/10.3857/roj.2021.00045
  31. Singh, K., Baird, M., Fischer, R., Chaitankar, V., Seifuddin, F., Chen, Y. C., Tunc, I., Waterman, C. M., & Pirooznia, M. (2020). Misregulation of ELK1, AP1, and E12 Transcription Factor Networks Is Associated with Melanoma Progression. Cancers (Basel), 12(2). https://doi.org/10.3390/cancers12020458
  32. Soeda, Y., & Takashima, A. (2020). New Insights Into Drug Discovery Targeting Tau Protein. Front Mol Neurosci, 13, 590896. https://doi.org/10.3389/fnmol.2020.590896
  33. Szklarczyk, D., Gable, A. L., Nastou, K. C., Lyon, D., Kirsch, R., Pyysalo, S., Doncheva, N. T., Legeay, M., Fang, T., Bork, P., Jensen, L. J., & von Mering, C. (2021). The STRING database in 2021: customizable protein-protein networks, and functional characterization of user-uploaded gene/measurement sets. Nucleic Acids Res, 49(D1), D605-D612. https://doi.org/10.1093/nar/gkaa1074
  34. Wahab Khattak, F., Salamah Alhwaiti, Y., Ali, A., Faisal, M., & Siddiqi, M. H. (2021). Protein-Protein Interaction Analysis through Network Topology (Oral Cancer). J Healthc Eng, 2021, 6623904. https://doi.org/10.1155/2021/6623904
  35. Wang, H., Xu, J., Lazarovici, P., Quirion, R., & Zheng, W. (2018). cAMP Response Element-Binding Protein (CREB): A Possible Signaling Molecule Link in the Pathophysiology of Schizophrenia. Front Mol Neurosci, 11, 255. https://doi.org/10.3389/fnmol.2018.00255
  36. Yang, L., He, T., Xiong, F., Chen, X., Fan, X., Jin, S., & Geng, Z. (2020). Identification of key genes and pathways associated with feed efficiency of native chickens based on transcriptome data via bioinformatics analysis. BMC Genomics, 21(1), 292. https://doi.org/10.1186/s12864-020-6713-y
  37. Yu, H., Zhao, F., Li, J., Zhu, K., Lin, H., Pan, Z., Zhu, M., Yao, M., & Yan, M. (2020). TBX2 Identified as a Potential Predictor of Bone Metastasis in Lung Adenocarcinoma via Integrated Bioinformatics Analyses and Verification of Functional Assay. J Cancer, 11(2), 388-402. https://doi.org/10.7150/jca.31636
  38. Zhou, W., Wu, J., Liu, X., Ni, M., Meng, Z., Liu, S., Jia, S., Zhang, J., Guo, S., & Zhang, X. (2020). Identification of crucial genes correlated with esophageal cancer by integrated high-throughput data analysis. Medicine (Baltimore), 99(20), e20340. https://doi.org/10.1097/MD.0000000000020340

Tables

Table 1. Top hub genes of the backbone network ranked by high degree, high BC, and cytoHubba

| Genes with high degree | Degree value | Genes with high BC | BC value | Hub genes from cytoHubba (backbone network) |
|---|---|---|---|---|
| HSP90AA1 | 22 | HSP90AA1 | 0.192917 | HSP90AA1 |
| GRB2 | 16 | HSPA5 | 0.088243 | HSPA5 |
| MAPK3 | 16 | GRB2 | 0.074645 | GRB2 |
| CREB1 | 14 | MAPK3 | 0.073964 | MAPK3 |
| TRAF6 | 14 | CREBBP | 0.068398 | TRAF6 |
| HSPA5 | 12 | PRKCB | 0.063594 | UBC |
| UBC | 12 | CREB1 | 0.047575 | ----- |
| FOS | 12 | TRAF6 | 0.045024 | ----- |
| MAPK14 | 12 | UBC | 0.040602 | ----- |
| CREBBP | 11 | RPS27A | 0.035523 | ----- |
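
The overlap between the three rankings in Table 1 can be checked programmatically. The sketch below is illustrative only and is not the authors' pipeline: the values are transcribed from Table 1, the function names are hypothetical, and ordering the consensus genes by BC is an arbitrary choice made here.

```python
# Values transcribed from Table 1 (backbone network).
HIGH_DEGREE = {
    "HSP90AA1": 22, "GRB2": 16, "MAPK3": 16, "CREB1": 14, "TRAF6": 14,
    "HSPA5": 12, "UBC": 12, "FOS": 12, "MAPK14": 12, "CREBBP": 11,
}
HIGH_BC = {
    "HSP90AA1": 0.192917, "HSPA5": 0.088243, "GRB2": 0.074645,
    "MAPK3": 0.073964, "CREBBP": 0.068398, "PRKCB": 0.063594,
    "CREB1": 0.047575, "TRAF6": 0.045024, "UBC": 0.040602,
    "RPS27A": 0.035523,
}
CYTOHUBBA = ["HSP90AA1", "HSPA5", "GRB2", "MAPK3", "TRAF6", "UBC"]


def consensus_hubs(degree, bc, hub_list):
    """Genes present in all three rankings, ordered by descending BC."""
    common = set(degree) & set(bc) & set(hub_list)
    return sorted(common, key=lambda gene: bc[gene], reverse=True)


def degree_and_bc_only(degree, bc, hub_list):
    """Genes in both topological rankings but absent from the hub list."""
    return sorted((set(degree) & set(bc)) - set(hub_list))


if __name__ == "__main__":
    print("Consensus:", consensus_hubs(HIGH_DEGREE, HIGH_BC, CYTOHUBBA))
    print("Degree and BC only:",
          degree_and_bc_only(HIGH_DEGREE, HIGH_BC, CYTOHUBBA))
```

With the Table 1 values, all six cytoHubba hub genes also appear in both the degree and BC rankings, while CREB1 and CREBBP appear in the two topological rankings only.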

Table 2. List of kinases enriched for the identified transcription factors and their genes

| Gene | Transcription factors | Top enriched kinases, p ≤ 0.01 (max. top 10 kinases) |
|---|---|---|
| HSP90AA1, HSPA5 and RPS27A | TAF1 | CK2ALPHA, CDK1, GSK3B, and CDK2 |
| HSPA5 | CREB1 | HIPK2, CSNK2A1, CDK1, AKT1, GSK3B, PRKACA, RPS6KA3, RPS6KA1, and PKBALPHA |
| HSPA5 | SP1 | CK2ALPHA, MAPK1, MAPK3, MAPK14, CDK1, AKT1, GSK3B, ERK1, MAPK8 and CDK4 |
| GRB2 and CREBBP | RUNX1 | HIPK2, MAPK1, MAPK3, CSNK2A1, MAPK14, CDK1, GSK3B, MAPK8, JNK1, and SRC |
| GRB2 and CREBBP | GATA2 | CDK1, AKT1, and GSK3B |
| CREBBP | RCOR1 | CK2ALPHA and CDK2 |
| RPS27A | SPI1 | CK2ALPHA, CSNK2A1, CK2, ABL1, PRKCA, PRKACA, SRC, FYN, LYN, and PRKCQ |
| TRAF6 | ELF and STAT5A | No kinase was enriched within the network |
| MAPK3 | MAPK3 itself is a kinase which regulates the SP1 and RUNX1 transcription factors in the PPI network | |

Table 3. List of tightly connected modules enriched in the core network

| Cluster | Score (Density × Nodes) | Nodes | Edges | Node IDs |
|---|---|---|---|---|
| 1 | 17.889 | 19 | 161 | RPL36, RPL28, RPS25, RPL32, RPS18, RPLP1, RPP38, RPS8, RPS6, RPL5, RPL12, MRPS10, MRPL2, RPS2, RPL15, RPS16, RPL18A, CCDC124, RPS27A |
| 2 | 9.8 | 11 | 49 | NUP98, NUP37, PHAX, NUP88, HSPA5, HSPA1L, SNUPN, HSPA6, NUP35, RAE1, NUP153 |
| 3 | 8.75 | 9 | 35 | TARDBP, TRA2B, HNRNPL, CHTOP, TAF15, HNRNPH1, HNRNPA2B1, HNRNPF, SRSF1 |
| 4 | 6.545 | 12 | 36 | HIST1H2BD, SAP130, KDM6B, AGO4, RUVBL2, HIST2H2AC, H2AFJ, LEF1, HIST1H2BK, FOS, TNRC6C, MAPK14 |
| 5 | 6.353 | 18 | 54 | BCL6, MAPK3, MAPK1, NCF4, ATP6V0E1, NCF1, PRKCB, PTPN6, ATP6AP1, PRKCZ, ATP6V1H, BCR, ATP6V0D1, ATP6V1B2, TCIRG1, ATP6V1B1, ATP5G2, PRKCD |
| 6 | 6.25 | 9 | 25 | TRAF5, TRADD, CASP8, FADD, RIPK1, TNFRSF1B, TNFRSF1A, BID, SMPD1 |
| 7 | 5.6 | 11 | 28 | POLDIP3, WDR25, PPWD1, SNRPA1, DHX38, SF3A1, CWC15, PLRG1, CWC27, THOC5, THOC2 |
| 8 | 4.4 | 16 | 33 | UBE2W, CD79A, TRIM25, TRAF6, UBE2C, BTK, TAB1, UBE2E1, CD79B, MAP3K3, UBA1, PTPRC, UBE2H, UBE2F, TANK, VAV1 |
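
The score column of Table 3 (density multiplied by the number of nodes) can be recomputed from the node and edge counts. The sketch below assumes that density is that of an undirected graph without self-loops, 2E / (N(N - 1)); this definition is an assumption made here rather than something stated in the text, although it matches the reported scores to three decimal places. The function names are hypothetical.

```python
# (cluster, reported score, nodes, edges), transcribed from Table 3.
CLUSTERS = [
    (1, 17.889, 19, 161),
    (2, 9.8, 11, 49),
    (3, 8.75, 9, 35),
    (4, 6.545, 12, 36),
    (5, 6.353, 18, 54),
    (6, 6.25, 9, 25),
    (7, 5.6, 11, 28),
    (8, 4.4, 16, 33),
]


def density(nodes, edges):
    """Edge density of an undirected graph without self-loops (assumed)."""
    return 2 * edges / (nodes * (nodes - 1))


def cluster_score(nodes, edges):
    """Cluster score as given in the Table 3 header: density times nodes."""
    return density(nodes, edges) * nodes


if __name__ == "__main__":
    for cluster, reported, nodes, edges in CLUSTERS:
        computed = cluster_score(nodes, edges)
        print(f"Cluster {cluster}: reported {reported}, computed {computed:.3f}")
```

For example, cluster 1 has 19 nodes and 161 edges, so its density is 322/342 ≈ 0.942 and its score is 0.942 × 19 ≈ 17.889.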