Transcriptome Meta-Analysis Suggests the Existence of Two Molecular Subtypes in Alzheimer’s Disease

: Alzheimer’s disease (AD) is the most common type of dementia that affects the lives of nearly 50 million people globally. No efficient treatment or cure has been found for AD yet. Different studies have been performed on heterogeneity in AD from various perspectives. Analyzing AD as a disease with several subtypes might assist us to identify effective personalized treatments for AD patients. Here, we investigated the heterogeneity in AD from a molecular perspective. In particular, we used gene expression profiles of 93 samples from the hippocampus of AD patients. To increase the sample size and enhance the robustness of the results, we applied meta-analysis on three different datasets. We first combined gene expression profiles from multiple relevant AD studies, and then, using a clustering analysis, we found two distinct molecular subtypes across AD samples. We validated the obtained results using various statistical and randomization approaches. Furthermore, we showed that the two transcriptomic subtypes are statistically independent of sex, age and Braak stage of the subjects. Notably, one of the identified subtypes indicates high similarity to control samples, which suggests that AD onset does not necessarily affect the hippocampal transcriptome. By performing differential gene expression and pathway enrichment analyses for various settings, we found 2371 differentially expressed genes that can be used for discriminating between the two AD subtypes. Finally, using a clustering approach, we introduced a novel molecular signature for the two subtypes based on14 differentially expressed genes.


Introduction
Alzheimer's disease (AD), as a progressive neurodegenerative disease, is the most common type of dementia [1,2]. Memory loss, attention problems, deficits in language and visuospatial abilities are some of the symptoms that patients experience [3][4][5]. Despite its prevalence, there is currently no efficient treatment or cure for AD [6]. Several different studies have shown that AD is in fact clinically, pathologically, and biochemically heterogeneous [7,8]. The present "incurability" of AD is suggested to be linked to its heterogeneity [9]. A prominent possibility is that AD consists of multiple subtypes with different mechanisms that need to be treated using different approaches.
There are different viewpoints for subtyping AD. Based on the classical "staging hypothesis", the spread of neurofibrillary tangles initiates from the entorhinal cortex and progressively occupies the association cortex. According to this hypothesis, there are different stages of AD severity, namely the Braak stages I to VI [10,11]. In some recent studies, the staging hypothesis is challenged and a new "distinct subtypes hypothesis" is recommended. In this viewpoint, AD samples are categorized, based on brain atrophy patterns, into different subtypes [12][13][14][15] . The age of onset [16] and cognitive symptoms [17,18] are some other factors, based on which subtyping can be performed. However, there is currently no universal consensus on subtyping AD.
The complex nature of AD suggests the existence of multiple distinct subtypes and mechanisms at its molecular roots. To gain a better understanding of this disease, there have been several AD studies analyzing molecular data such as transcriptome [19][20][21]. Some studies focused on a specific subset of genes (such as synaptic genes in [22] and diabetes-related genes in [23]), while others focused on expression levels of all genes [24]. Molecular subtyping has been previously used for studying complex diseases such as cancer [25,26].
Analyzing post-mortem gene expression profiles obtained from brain regions can provide us with precious information for studying molecular variation during AD [27]. Of particular, the hippocampus has been investigated in several transcriptome studies of AD [28][29][30] since it is one of the early brain regions affected in this disease [31,32]. Furthermore, changes in the hippocampus (such as hippocampus atrophy) are associated with AD and some of its symptoms such as memory loss [33,34]. There was one study on molecular subtyping using RNA-seq data from different brain regions, that showed the hippocampal area demonstrates the greatest subtyping signal over the other regions [35].
In the present work, we used transcriptome data of hippocampus to investigate the existence of multiple AD subtypes. Limited sample size, poor reproducibility and low robustness are wellknown problems in the transcriptome analysis of AD. Integrating multiple datasets and performing meta-analysis increases the sample size, which in turn, may lead to more reliable conclusions [36][37][38]. Meta-analysis of AD expression profiles was previously used for differential 3 analysis of AD and healthy subjects or for comparison of different severities of AD [39,40]. Here, we incorporate this strategy for identifying the two different subtypes of AD. To this end, gene expression profiles of the hippocampus from three AD studies were combined and analyzed to gain a better understanding of molecular heterogeneity in AD.

Materials and Methods
Datasets: Hippocampus is a crucial component of the medial temporal lobe memory system. It is generally believed that at the onset of AD, neuronal loss and synaptic and intraneuronal molecular remodeling occur in the hippocampus [32]. Therefore, we focused on gene expression (GE) profiles of post-mortem hippocampus from AD subjects. From the GEO database [41], we found three relevant microarray datasets. We restricted our analysis to hippocampus samples of the datasets, which includes 93 AD and 25 control samples (Table  1). Code availability: All data analyses were performed in the R environment. The codes are available in Additional file 1.
Combining datasets: Dataset 1 and 3 are Affymetrix microarray datasets, while Dataset 2 is an Illumina microarray dataset. In order to integrate these three datasets, we performed the preprocessing steps shown in Figure 1. Part of the pipeline is adapted from previously published methods [17], [26]. : Schematic representation of the data processing for normalizing and combining the three datasets. The combining protocol is based on previously published methods [38,45]. Briefly, Affymetrix data were preprocessed by 'rma' function in 'affy' package [46] and re-annotated by biomaRt [47]. The Illumina data were "log2 transformed", and then, normalized (by quantile normalization approach from package 'preprocessCore' [48]) and re-annotated by biomaRt. After removing gene redundancies, for the combined dataset, batch effect removal was performed by ComBat from 'sva' package [49]. (Red steps: combining only AD samples to find AD clusters; Blue steps: combining AD and controls samples and applying the cluster labels obtained in the "red steps").

Determining the optimal number of clusters:
To the best of our knowledge, Dataset 1 [42] is currently the largest available dataset of microarray GE profiles of hippocampal samples from AD subjects. We investigated whether AD samples in this dataset can be clustered into statistically significant subgroups, which potentially represent the existence of multiple AD subtypes. To achieve this goal, we assumed that the data comes from k = 2, 3, ... ,10 subtypes, and hence, the dataset should be partitioned into k clusters. We used function 'pam' in package 'cluster' [50] to compute average silhouette value, as a measure of clustering validity [51]. The value k that maximizes the average silhouette value of the partitioning was chosen as the optimal number of clusters. We also used function 'pam' in package 'cluster' to determine the clusters [50] . In our case, k=2 was found to be the optimal value for clustering the data.

Statistical Analyses:
Details about statistical tools and analyses are presented in Additional file 1. In the following, we present an overview of statistical tools and analyses used in the present work.

Finding DEGs in the combined dataset
The 93 AD samples of our combined dataset were clustered into two groups: cluster 1 which comprises 56 samples, and cluster 2 which comprises 37 samples. Then, we combined our three datasets again, but this time, instead of only using AD samples, we also included healthy samples in all of our steps (including batch effect removal). Next, we partitioned our samples into three groups, one healthy group and two AD groups consisting the clusters that we found in the previous step. We determined the DEGs between the clusters and healthy samples using 'limma' package [52]. A DEG was considered statistically significant if its FDR was smaller than 0.05 and its absolute log fold change larger than 0.4.

Validation
To assess the reproducibility of the results obtained by our method, we repeated the above steps on the largest dataset (Dataset 1) to determine those genes that are expressed differentially between the two clusters of Dataset 1. Then we investigated whether the identified DEGs and overlaps are statistically significant. In particular, we compared the DEG analysis results to those of random permutations. To this end, two different randomization procedures were considered: 1. Randomizing Dataset 1 via shuffling the expression profile of each gene across samples. This step is repeated 10,000 times. 2. Randomly partitioning Dataset 1 with subgroup sizes equal to the sizes of the genuine clusters. This step is repeated 10,000 times. 6 To check if the statistically significant subgroups from Dataset 1 are observed in other datasets as well, firstly Datasets 2 and 3 were combined (see Figure 1). This combined dataset will be referred to as Dataset 2+3. Then, clustering of Dataset 2+3 was performed based on the GE profiles of those genes which were significantly upregulated in Dataset 1. Next, we compared the sets of DEGs that were found in Dataset 1 and Dataset 2+3. A good agreement between the two sets is expected to be observed if Dataset 2+3 includes a similar set of subgroups with comparable GE profiles.
To confirm that the agreement between the sets of DEGs is not by chance, Dataset 2+3 was randomly partitioned, with subgroup sizes equal to the sizes of the genuine clusters. Then, we compared the number of significant DEGs between the pair of clusters with those of the genuine clusters. This step is repeated 10,000 times.

Gene set enrichment analysis:
In order to understand the difference between AD subtypes at the molecular level, we analyzed the set of DEGs. To this end, we proceeded by the following steps. Firstly, all AD and control samples in Datasets 1, 2 and 3 were combined, as explained above ( Figure 1). Afterward, we identified DEGs between each AD subtype and control as well as between the two AD subtypes. Finally, we used DAVID [53] to perform gene set enrichment analysis for obtained DEGs in each case.

Identification of molecular signatures:
We determined a "molecular signature" that can be used as a set of markers for distinguishing the two AD subtypes. To achieve this goal, we used the identified gene clusters by the biclustering analysis. Each gene cluster represents a set of genes with comparable GE profiles. Then, in each cluster, the medoid gene (i.e., the gene whose average dissimilarity to all the genes in the same cluster is minimal) was chosen as the representative of that cluster to be in the molecular signature.

Clustering of the combined dataset and obtaining DEGs
The average silhouette width suggests that the best partitioning occurs at k=2 clusters ( Figure  2). Based on this result, we suggested a "two-subtype model" for hippocampal GE profiles in AD patients. We demonstrated that the GE profiles of the three AD datasets show two distinct patterns, which may represent two different subtypes of AD. In the next step, we combined the AD samples of the three datasets as explained in the Materials and Methods section. The combined dataset will be referred to as "Dataset A", in which genes with the largest variations across samples were determined. Figure 3a shows the top 300 genes with the greatest variation across samples. After batch effect removal, different datasets are practically indistinguishable, i.e., samples of different datasets are mixed. Based on 14427 mutual genes in the three datasets (after reannotation to Ensembl gene ID), Dataset A was portioned into two subgroups, that is, cluster "c1" with 56 AD samples and cluster "c2" with 37 AD samples ( Figure 3b). Interestingly, the two clusters showed no significant association with gender, age and Braak staging (see Additional file 2). Altogether, when only analyzing AD samples 1558 genes had a significantly higher GE level in c1 compared to c2, while only 88 genes had significantly lower GE levels in c1 compared to c2 (FDR<0.05, LFC>|0.4|). To check the significance of the identified DEGs, we randomly partitioned the 93 samples of Dataset A into two subgroups with 56 and 37 samples. This procedure was repeated 10,000 times and the number of genes with significantly higher (respectively lower) GE levels in c1 compared to c2 was counted. Notably, in none of these permutations, the number of genes with significantly higher GE levels in c1 exceeded 1558. Furthermore, only in 12 out of 10,000 permutations, the number of genes with significantly lower GE levels in c1 was larger than 88. From Fig. 3, one can conclude that the two subgroups of the AD samples are significantly different. From now on, we will refer to the two clusters as subtype 1 and subtype 2 of AD, or "s1" and "s2" for short. In Additional file Y, we extensively show that if the same procedure is separately applied to Dataset 1 and to Dataset 2+3, the results overlap significantly, which confirms that our "two-subtype model" is not an artifact of dataset combining.
One can ask two questions here: 1. How these two subtypes of AD differ from control samples? 2. What is the difference between s1 and s2 at the molecular level? 9 In all the above analyses, we only considered the AD samples. In order to answer the present questions, it is necessary to include the non-AD samples in the analysis. To this end, we combined AD and non-AD samples of the three datasets, as explained in Materials and Methods. This dataset will be referred to as "Dataset U". Then, the list of significant DEGs between s1 and control samples, s2 and control samples, and also between s1 and s2 were computed (LFC>0.4 and FDR<0.05). To our surprise, the GE profile of s1 samples was found to be very similar to that of the control samples ( Figure 4). The results suggest that in subtype 1 of Alzheimer's disease, the hippocampal transcriptome is not significantly influenced compared to control, despite the appearance of the disease. In contrast, in subtype 2, a considerable number of genes show altered GE patterns compared to control samples. All the previously published transcriptome analyses have implicitly assumed AD to be a "homogeneous" disease and determined the DEGs accordingly. Our results, in contrast, suggest that not only AD is heterogeneous, but also it includes a subtype (i.e., s1) that has a healthy-like GE profile. Obviously, in such a case, the distribution of samples over the subtypes can influence the detectable DEGs.
We showed that s1 and s2 samples do not show significant associations with gender, age and Braak staging. As mentioned in the Introduction, a small group of AD patients does not show atrophy in their hippocampus. Hence, one may presume that this group of patients are those in subtype 1, as this subtype is hardly different from controls samples. However, subtype 1 in our analysis includes a larger number of samples compared to subtype 2, which suggests that the similarity between subtype 1 and control samples must have other reasons, which are yet to be explained.

Pathway enrichment analysis results and comparing them to previous studies
In the next step, using DAVID, gene ontology (GO) and pathway enrichment analyses were performed for the union of two sets of significant DEGs, namely, "s1 vs. control" and "s2 vs.
control" comparisons. The results are presented in Additional file 4. The enriched cellular compartment (CC) terms include Cytosol, Extracellular Exosome, Proteasome Complex, etc. The enriched Bioprocess (BP) terms include regulation of cellular amino acid metabolic process, NIK/NF-kappaB signaling, regulation of mRNA stability, etc. The enriched pathways from KEGG database include Oxidative phosphorylation, Alzheimer's disease, etc.
Although poor reproducibility of transcriptome is a known problem in AD studies [18], still one may find a number of commonly enriched GO terms and KEGG pathways in studies focusing on hippocampal transcriptomes of AD subjects. To see if this is the case, we compared our top 20 enriched GO terms/pathways with those reported in four previous studies. The results are shown in Table 4. As expected, there is limited consistency between the enriched sets, which presumably reflects the dataset-to-dataset variability of AD samples. We then focused on those 508 genes which show differential GE patterns between s1 and s2, but are not differentially expressed between AD and control samples (Fig. 6). This set includes those genes which are differentially expressed in the two AD subtypes, and additionally, not differentially expressed in AD vs. control samples. We used the DAVID web server to perform GO and pathway enrichment analysis. The results are shown in Additional file 4.
There are certain similarities between the GO terms and KEGG pathways which are enriched both in "AD vs. control" and "subtype 1 vs. subtype 2" comparisons. However, many GO terms and pathways exist which are enriched solely in case of "subtype 1 vs. subtype 2" comparison. One can observe that certain cellular compartment terms related to the cell nucleus (including "nucleus", "nucleoplasm", and "nuclear chromosome, telomeric region"), "spliceosomal complex" and "chaperonin-containing T-complex" are exclusively enriched in the latter comparison. Furthermore, "protein polyubiquitination" as a biological process and "ubiquitin mediated proteolysis" as a KEGG pathway is exclusively enriched in the subtype-subtype comparison. These observations provide further evidence about the molecular basis of the difference between the two AD subtypes, which are yet to be studied deeply.

Molecular signatures of the two AD subtypes
Since the two proposed AD subtypes show significantly different GE patterns, we decided to present a "molecular signature" that can be used to distinguish between the two AD subtypes. To achieve this goal, the top 300 DEGs were grouped into 14 clusters based on the similarity of their gene expression profiles ( Figure 5). Then, one gene was selected as the representative of each cluster. The distinctive GE patterns of these 14 genes are shown in Figure 8, and further details on the list of these genes are presented in Additional file 4. We suggest that the set of these 14 genes can be used as a molecular signature whose GE levels can determine whether a sample belongs to subtype 1 or subtype 2 of the AD.
Here, it should be noted that while the present study provides strong evidence about the existence of two subtypes in AD, it can hardly give any insight into the molecular mechanism of AD pathogenesis. Recently, it has been shown that disease-associated changes in AD are highly cell-type specific [57][58][59]. Therefore, microarray-based transcriptome analysis, which studies "lumped" transcriptomes of multiple cell-types, might be incapable of determining the molecular changes that occur in AD. With the recent advances in the single-cell RNA-seq analysis [60], one might be able to better understand such cell-specific changes across multiple AD subtypes.

Conclusion
Previous studies on Alzheimer's disease (AD) have shown heterogeneity in various aspects of the disease. In the present work, we pinpoint that such heterogeneity is also present at the transcriptome level. Based on the meta-analysis of hippocampal transcriptomes, we show that two significantly different subtypes of AD exist. One of these subtypes has a highly similar transcriptomic profile to the transcriptome profiles of healthy (i.e., non-AD) subjects. Furthermore, we nominate signature genes for subtypes based on DEGs and identify altered pathway between subtypes and compared to controls.
This study has several limitations. The samples used in this study (and generally in brain transcriptome studies) are postmortem and the results might be different in alive samples. In addition, despite using multiple hippocampus datasets, the sample size was still limited. Using a larger sample size and analyzing other brain regions in the future could help us shed more light on each subtype's mechanisms and their differences. Low robustness is a common issue in AD transcriptomes studies that we hope to have decreased using a dataset with different microarray platforms and meta-analysis approaches.
Understanding heterogeneity in AD-related transcriptomes can potentially influence attempts to find biomarkers and to assess the validity and/or efficacy of potential treatments. In other