Screening of Differentially Expressed Genes (DEGs)
The gene expression data (GSE48350) on 19 patients with AD were compared with those of 43 control samples (CTs) from GEO database (Tables S1). All samples were hippocampus tissues. After the run of ‘remove Batch Effect’ command in Limma package, the expression level of genes was basically at the same level, which was selected for downstream difference analysis (Fig. 1A, Table S2). Based on the cutoff criteria, a total of 88 DEGs (24 up-regulated and 64 down-regulated) were identified in AD samples (Fig. 1B). Heatmap was used to show gene expression levels (Fig. 1C). According to the absolute value of Log, top ten genes included LTF, SLC17A6, CFAP126, TMEM155, CALB1, SLC47A2, ANKIB1, NWD2, CP, and MRAP2.
GO Annotation and KEGG Pathway Enrichment Analysis
To further confirm the underlying functions of potential target genes, DEGs were analyzed by functional enrichment (Fig. 2, Table S3). The up-regulated genes were significantly enriched in KEGG pathways of complement and coagulation cascades and pertussis, and in the GO functions of negative regulation of endopeptidase activity, negative regulation of peptidase activity, response to molecule of bacterial origin, and negative regulation of proteolysis. The down-regulated genes were significantly enriched in the KEGG pathways of nicotine addiction and neuroactive ligand-receptor interaction, and in the GO functions of vesicle-mediated transport in synapse, synaptic vesicle cycle, neurotransmitter transport, and synapse organization.
Gene Set Enrichment (GSEA) Analysis
We further analyzed the microarray data with the software ‘Gene Set Enrichment Analysis’[29, 37]. Three up-regulated gene sets in AD including CHR2Q13, CHR19P12, and KRAS.PROSTATE_UP. V1_DN and one down-regulated gene set CHR4P14 were justified. No pathway associated with PCD was identified (Fig. 3, Figs. S4-S7). Three positional gene sets (i.e., CHR2Q13, CHR19P12, and CHR4P14) were enriched in different locations of different chromosomes. As the oncogenic signature gene set, the KRAS.PROSTATE_UP. V1_DN could be down-regulated in epithelial prostate cancer cell lines over-expressing an oncogenic form of KRAS gene.
WGCNA and Key Module Identification
All genes in the array were used to conduct WGCNA (Figs. 4 and 5). By setting the soft-threshold power as 18 (scalefree R2 = 0.9, blockSize = 7000, minModuleSize = 20, deepSplit = 2, mergeCutHeight = 0.25, hub_cut = 0.9, net_threshold = 0, slope = –0.96; Fig. 4A), we acquired a total of 12 modules (Fig. 5A; Table S8). The Tom diagram of the relationship between gene clustering and modules in each module of WGCNA was shown in Fig. 4B. The relationship between modules and genes in the modules was shown in Fig. 4C. The correlation coefficient between grey module and gene expression in the module was the lowest. The number of genes in each module was provided in Table1. The list of genes in each module was provided in Supplementary Table 10. Based on the heatmap of module-trait correlations, the key module containing a total of 160 genes was identified as the most positively correlated with AD (correlation coefficient = 0.40, P = 0.01; Fig. 5B).
Selection of Hub Genes
All the common genes from the selected modules were further analyzed by the online STRING database to construct the network. After the PPI analysis, cytoscape was used for visualization and the plug-in cytohubba was used to screen hub genes. The top 20 genes identified were defined as hub genes (Fig. 6; Table 2).
Gene Analysis by Venn Map
In order to verify the reliability of the predicted results, we used the Venn diagram to intersect the DEGs, the related genes from WGCNA analysis, and the genes related to AD in Genecards database. The results showed that there were 59 repeats between DEGs and AD related genes in Genecards and 57 repeats between genes in lightcyan module and AD related genes in Genecards, indicating that the diagnostic efficiency of the two methods was comparable (Fig. 7A, B). Therefore, we used the method of WGCNA to identify the gene module related to AD and PPI to identify the hub genes. Then, we used Venn diagram to intersect the hub genes and the genes related to PCD in Disgenet database. The results showed that 16 out of the 20 hub genes were duplicated, indicating that these hub genes were closely related to PCD. In order to investigate the functions of these hub genes, we further annotated these genes based on GO and KEGG databases. Results of GO annotation showed that these genes were closely associated with the axoneme and cilium (Fig. 7C, Table S9). The results of the enrichment analysis based on KEGG database showed that these hub genes were closely related to Huntington's disease, which shares many medical similarities with AD.
Gene Interaction Heatmap
Because of the application of different chip platforms in investigating the expression of genes, many genes were not detected in GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array. Therefore, the top 30 genes were selected and defined as the genes most closely related to AD. These results suggested that hub genes showed varied degrees of correlation with genes related to AD (Fig. 8; S10).