This section first describes the dataset in detail and then the methodology and approaches adopted.
Dataset
We used the dataset provided by Xin et al., who retrieved studies on genetic association in Alzheimer’s disease from PubMed. The authors of that paper reviewed 5298 reports, excluded the unrelated ones, and selected 823 publications that presented the more significant associations. The result is a list of 431 genes known as Alzheimer-related genes(9).
Extracting protein complexes from Protein-Protein Interaction (PPI) network
First, the STRING database(10), which catalogs interactions between proteins, was used to construct a PPI network. The network constructed for this study consists of the genes linked by experimentally obtained interactions. The PPI network was then clustered with the EAGLE algorithm, one of the algorithms in the ClusterViz application(11). After examining different values, 3 and 2 were selected for the “CliqueSize Threshold” and “ComplexSize” parameters of the EAGLE algorithm, respectively. Running the algorithm produced 9 clusters, of which 5 were selected for the next research step. The remaining four were eliminated because their structures were not meaningful and could not represent a gene complex. The five selected groups of genes are listed in Table 1.
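The network construction step can be illustrated with a minimal Python sketch. It is not the ClusterViz implementation: it only shows how an edge list might be restricted to experimentally supported interactions and how maximal cliques of at least a given size (the role played by the clique-size threshold in EAGLE's first stage) can be enumerated. The row layout `(protein_a, protein_b, experimental_score)`, the `min_experimental` cutoff, and the node names are assumptions made for illustration.

```python
from collections import defaultdict


def build_ppi_graph(rows, min_experimental=1):
    """Build an undirected adjacency map from (a, b, experimental_score) rows.

    Only edges whose experimental score reaches `min_experimental` are kept
    (the cutoff is an assumption, not a value taken from the study).
    """
    adj = defaultdict(set)
    for a, b, score in rows:
        if a != b and score >= min_experimental:
            adj[a].add(b)
            adj[b].add(a)
    return dict(adj)


def maximal_cliques(adj, min_size=3):
    """Enumerate maximal cliques (Bron-Kerbosch with pivoting) of >= min_size nodes."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            if len(r) >= min_size:
                cliques.append(frozenset(r))
            return
        # Pivot on the vertex with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


# Placeholder rows; node names and scores are illustrative, not study data.
toy_rows = [
    ("A", "B", 800),
    ("B", "C", 650),
    ("A", "C", 420),
    ("C", "D", 300),
    ("D", "E", 0),  # no experimental support, so this edge is dropped
]
toy_graph = build_ppi_graph(toy_rows)
toy_cliques = maximal_cliques(toy_graph, min_size=3)
```

EAGLE goes on to merge such cliques agglomeratively into communities; that stage is omitted here.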
Enrichment analysis of genes
Functional enrichment analysis was performed for each cluster independently, and the results are reported in separate tables. In cluster 1, the most significant pathway has a p-value of 4.16E-04 and involves six genes; the term related to this pathway is Transcriptional. In cluster 3, the most significant pathway is Alzheimer’s disease, with a p-value of 5.13E-11 and ten genes. In cluster 6, the most significant pathway is Hepatitis C, with a p-value of 0.004662 and four genes. In cluster 7, it is the Neurotrophin signaling pathway, with a p-value of 4.74E-11 and eight genes. In cluster 8, it is Malaria, with a p-value of 7.32E-04 and three genes. These are the most significant pathways for each cluster. More comprehensive information, including all the pathways, is available in Additional File 1 (Supplementary Tables 3-7).
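The text does not state which statistical test produced these p-values. A common choice for pathway over-representation is the one-sided hypergeometric test, and the following Python sketch shows how such a p-value could be computed under that assumption. The function name and all counts in the example call are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_pathway, n_cluster, n_overlap):
    """One-sided hypergeometric p-value P(X >= n_overlap).

    n_background: genes in the background set
    n_pathway:    background genes annotated to the pathway
    n_cluster:    genes in the cluster being tested
    n_overlap:    cluster genes annotated to the pathway
    """
    total = comb(n_background, n_cluster)
    upper = min(n_pathway, n_cluster)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_cluster - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: a 22-gene cluster with 10 genes in a 170-gene pathway,
# tested against a background of 20000 genes.
p_example = hypergeom_enrichment_p(20000, 170, 22, 10)
```

Enrichment tools typically also correct these p-values for multiple testing; that step is not shown.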
According to the gene ontology analysis, the most important biological processes for cluster 1 were, in order, transcription from RNA polymerase II promoter (p-value=4.66E-13) with 21 genes, regulation of transcription from RNA polymerase II promoter (p-value=5.45E-13) with two genes, and positive regulation of cellular metabolic process (p-value=9.75E-13) with 24 genes. For cluster 3, membrane protein proteolysis was the most significant biological process (p-value=1.78E-15) and included nine genes. The next most important processes were notch receptor processing (p-value=7.92E-12) with six genes and membrane protein ectodomain proteolysis (p-value=8.90E-12) with seven genes. In cluster 6, the most important biological process was negative regulation of biological process, with 18 genes (p-value=3.91E-09). The next three were, in order, negative regulation of response to stimulus with 12 genes (p-value=1.89E-08), regulation of apoptotic process with 12 genes (p-value=2.39E-08), and regulation of programmed cell death with 12 genes (p-value=2.64E-08). In cluster 7, the important processes were, in order, transmembrane receptor protein tyrosine kinase signaling pathway with 11 genes (p-value=4.21E-13), enzyme-linked receptor protein signaling pathway with 11 genes (p-value=1.99E-11), and cell surface receptor signaling pathway with 13 genes (p-value=2.55E-10). In cluster 8, the significant processes were, in order, vesicle-mediated transport with 10 genes (p-value=5.86E-08), endocytosis with 8 genes (p-value=9.86E-08), transport with 13 genes (p-value=2.08E-07), and establishment of localization with 13 genes (p-value=2.87E-07). The other important biological processes, sorted by p-value, are available in separate tables in Additional File 1 (Supplementary Tables 8-12).
Bipartite miRNA-gene networks
Bipartite gene-miRNA networks, visualized with Cytoscape 3.7.0, allow the complexes to be examined in more depth. These bipartite networks are first analyzed by their network specifications and then sorted by node degree. The miRNAs with a higher degree, together with their associated genes and connections, are selected. The bipartite gene-miRNA network for each cluster is shown in Figure 1. The red oval nodes represent the genes, and the blue rectangular nodes represent their related miRNAs. The miRNAs for each cluster are listed in Additional File 1 (Supplementary Table 1). After the bipartite networks are extracted, some genes are omitted, so the gene lists change in the next step. The updated gene lists are given in Additional File 1 (Supplementary Table 2).
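The degree-based selection described above can be sketched in Python as follows. This is an illustrative outline rather than the procedure run in Cytoscape: the `min_degree` cutoff is an assumption (the text does not give a threshold), and the gene and miRNA identifiers are placeholders.

```python
from collections import defaultdict


def rank_mirnas_by_degree(edges):
    """Return (miRNA, target genes) pairs sorted by decreasing degree."""
    targets = defaultdict(set)
    for gene, mirna in edges:
        targets[mirna].add(gene)
    # Ties are broken alphabetically so the ordering is reproducible.
    return sorted(targets.items(), key=lambda kv: (-len(kv[1]), kv[0]))


def select_high_degree(edges, min_degree):
    """Keep miRNAs with degree >= min_degree and the genes still attached to them."""
    ranked = rank_mirnas_by_degree(edges)
    kept_mirnas = {m for m, genes in ranked if len(genes) >= min_degree}
    kept_edges = [(g, m) for g, m in edges if m in kept_mirnas]
    kept_genes = {g for g, _ in kept_edges}
    return kept_mirnas, kept_genes, kept_edges


# Placeholder gene-miRNA edges for one cluster (not study data).
toy_edges = [
    ("gene_1", "miR_a"),
    ("gene_2", "miR_a"),
    ("gene_3", "miR_a"),
    ("gene_1", "miR_b"),
    ("gene_2", "miR_b"),
    ("gene_4", "miR_c"),
]
mirnas, genes, sub_edges = select_high_degree(toy_edges, min_degree=2)
```

In this toy input, `gene_4` is attached only to a low-degree miRNA and therefore drops out, which mirrors how the gene lists can shrink after this step.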
Table 1
List of the selected clusters along with their genes
Cluster number | Number of nodes | Nodes
1 | 31 | PPARG, CAMK2D, RXRA, SP1, ESR2, GRIN2B, FAS, POU2F1, CDKN2A, NOS3, AR, CCNT1, ABCA1, RUNX1, VDR, NME8, UBE2I, TP63, TP73, PPARA, NR1H2, UBD, PNMT, TBX3, ESR1, CLOCK, SNX3, PIN1, TP53, BCR, CAV1
3 | 22 | PSENEN, APH1A, TARDBP, NCSTN, PSEN1, APP, UBQLN1, APBB1, APBB3, COL25A1, BACE2, BACE1, TRAK2, KLC1, CTSD, MAPK8IP1, APH1B, TGFB1, CD14, APBB2, PSEN2, TLR4
6 | 19 | GSTP1, DAPK1, EIF2AK2, SLC6A3, SNCA, UBE2D1, LRP6, NLRP3, RAB7A, LRRK2, YWHAQ, GSK3B, TRAF2, TNF, SLC6A4, RCAN1, EIF4EBP1, UNC5C, SERPINA1
7 | 13 | NEDD9, NGF, SORCS3, NTRK1, LCK, NTRK2, PTK2B, BDNF, NGFR, NTF3, IRS1, PIK3R1, CD44
8 | 13 | RELN, IL10, IL1B, LDLR, VLDLR, SORL1, LRP8, LRPAP1, LRP2, LRP1, A2M, ATP7B, CLU
Drug-gene interaction network
To identify potential drugs for Alzheimer's treatment, we used the DGIdb 3.0 database to extract drug-gene interactions(12). Using this database, the drugs related to the genes of each cluster were extracted and visualized in Figure 2.
The figure illustrates the undirected associations between genes and drugs. Each separately drawn cluster corresponds to one gene complex. Blue hexagonal nodes represent the drugs, and red oval nodes represent the genes. Drugs related to each cluster are listed in Additional File 1, Table 13.
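Grouping drugs by cluster can be sketched in Python as below. The sketch assumes the drug-gene interactions have already been exported from the database as `(gene, drug)` pairs and that each gene belongs to a single cluster; it does not call any DGIdb interface, and every identifier in the example is a placeholder.

```python
def drugs_per_cluster(clusters, interactions):
    """Map each cluster id to the sorted list of drugs that target its genes.

    clusters:     dict of cluster id -> iterable of gene symbols
    interactions: iterable of (gene, drug) pairs, e.g. a database export
    """
    gene_to_cluster = {
        gene: cid for cid, genes in clusters.items() for gene in genes
    }
    found = {cid: set() for cid in clusters}
    for gene, drug in interactions:
        cid = gene_to_cluster.get(gene)
        if cid is not None:  # ignore genes outside the selected clusters
            found[cid].add(drug)
    return {cid: sorted(drugs) for cid, drugs in found.items()}


# Placeholder clusters and interactions (not study data).
toy_clusters = {1: ["gene_1", "gene_2"], 3: ["gene_3"]}
toy_interactions = [
    ("gene_1", "drug_x"),
    ("gene_2", "drug_x"),
    ("gene_2", "drug_y"),
    ("gene_3", "drug_z"),
    ("gene_9", "drug_w"),  # gene not in any selected cluster
]
cluster_drugs = drugs_per_cluster(toy_clusters, toy_interactions)
```

The resulting mapping has the same shape as a per-cluster drug table and could feed an undirected gene-drug edge list for visualization.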