Here, we applied a multi-layered integrative approach to disentangle sources of variance within a cohort of elderly participants with normal cognition, mild cognitive impairment or mild AD dementia. We identified five major dimensions of heterogeneity that together comprehensively explained the variance within the cohort and were associated with core AD pathology. Further analysis revealed multiple interactions between single ‘omics modalities, distinct multi-omics molecular patterns differentially associated with amyloid aggregation, neurodegeneration, and tau hyperphosphorylation, and novel molecules associated with cognitive impairment. Specific signatures of four molecules improved the accuracy of both AD and cognitive decline prediction. Additionally, pathway enrichment showed over-representation of the hemostasis, immune response and extracellular matrix signalling pathways in association with AD.
Single modality feature selection: We first used elastic-net regression to identify molecules associated with individual CSF biomarkers of AD pathology, without considering any possible interactions between different ‘omics modalities. This approach identified several proteins (SPARC-related modular calcium-binding protein 1, brain acid soluble protein 1, neuromodulin, pyruvate kinase PKM, thymosin beta-10, 14-3-3 protein zeta/delta, and fructose-bisphosphate aldolase A) in strong accordance with recent studies of the AD CSF proteome (16, 28). The zeta/delta isoform of protein 14-3-3 was associated with Aβ1−42, Tau, and P-Tau levels. This apoptosis inhibitor, one of the most abundant proteins in the brain, was previously found to exhibit altered levels in AD and to modulate AD risk (29, 30). We also identified associations of neurofilament medium polypeptide with Tau levels and of reelin with Aβ1−42 and Tau levels. Both these molecules have previously been associated with AD (31–33). Regarding neuroinflammatory molecules, C-reactive protein and monocyte chemoattractant protein-1 have previously been associated with AD, albeit in plasma (34). In addition, we have previously shown that soluble intercellular adhesion molecule-1 in CSF is associated with AD (18). At the metabolite level, we identified 10 molecules in CSF associated with Tau and P-Tau, which differ from the blood biomarkers associated with AD identified in a recent large-sample study (35). Overall, our approach identified more molecules associated with AD pathology than previous studies. A likely source of this difference is the use of elastic-net regression in the current study, which avoids saturation of the regression and could therefore identify more associations.
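As an illustration of the single-modality selection step, the sketch below fits an elastic-net model by coordinate descent on simulated data in which analytes outnumber samples. It is not the pipeline used in this study: the simulated data, the penalty settings (`alpha`, `l1_ratio`) and the helper names are assumptions chosen for illustration, and in practice an established implementation with cross-validated penalties would be preferred.

```python
import random


def soft_threshold(z, t):
    """Shrink z towards zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def elastic_net(X, y, alpha=0.2, l1_ratio=0.5, n_iter=100):
    """Minimise (1/2n)*||y - Xw||^2 + alpha*l1_ratio*||w||_1
    + 0.5*alpha*(1 - l1_ratio)*||w||^2 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    resid = list(y)  # residual y - Xw, starting from w = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of feature j with the residual that excludes its own effect.
            z = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * w[j]
            new_w = soft_threshold(z, alpha * l1_ratio) / (
                col_sq[j] + alpha * (1.0 - l1_ratio)
            )
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                w[j] = new_w
    return w


# Simulated data (hypothetical): 40 samples, 60 analytes; only the first two
# analytes truly drive the outcome.
random.seed(0)
n_samples, n_features = 40, 60
X = [[random.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
y = [2.0 * row[0] - 1.5 * row[1] + random.gauss(0.0, 0.1) for row in X]

weights = elastic_net(X, y)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print("selected analyte indices:", selected)
```

The L1 part of the penalty sets most coefficients exactly to zero, which is what makes the fit usable for feature selection, while the L2 part stabilises the estimates when analytes are correlated.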
Heterogeneity within the cohort: An important strength of our study is that it considers all interactions between multiple biological levels and their associations with the heterogeneity within the cohort. This was achieved by training a MOFA model on the multi-omics dataset, which has the advantage of not giving any additional weight in the analysis to the established CSF biomarkers of core AD pathology while also reducing the complexity of the data to better depict the sources of variation. This revealed proteomic measures and CSF core AD biomarkers as the main contributors to the variance. This was to be expected since i) protein expression levels not only reveal changes related to AD pathology but also reflect the effects of different environments, lifestyles, health conditions, and genetic backgrounds, all factors potentially affecting protein expression and regulation (36); and ii) our sample contains a large proportion of participants with AD, each displaying CSF AD biomarkers significantly different from those of subjects without AD. Nonetheless, this approach identified 21 proteins with previously reported associations with AD, suggesting that the MOFA approach can accurately disentangle the inter-individual heterogeneity driven by AD pathology and differentiate between individual heterogeneity (i.e., not repeated in the dataset) and cohort heterogeneity (i.e., underlying changes in many participants). Conversely, the metabolomic dataset was responsible for only a small amount of the cohort heterogeneity (3.7%), a possible explanation being that it mostly represents individual heterogeneity caused by the environment, disease processes or nutritional habits. This low contribution of metabolomics to variance could also result from the lower dimensionality of the metabolomics dataset, as the molecules within it had lower concentrations than molecules in the other modalities.
Yet, despite this low level of variance, our model was able to correctly retrieve metabolites previously reported in association with AD, underlining the sensitivity of the model. This is further supported by the ability of our model to identify a four-molecule signature that improves the prediction of the occurrence of AD pathology, confirming that the identified LFs and molecules reflect metabolic differences resulting from the presence of AD pathology rather than from other factors. This also confirms the clinical and diagnostic relevance of the identified molecules.
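To make the notion of variance explained per modality concrete, the following sketch computes the fraction of a modality's variance captured by a single latent factor as a coefficient of determination. The toy matrices, factor scores and loadings are invented for illustration and are not values from this study; a trained MOFA model reports the analogous quantity for every factor and modality.

```python
def variance_explained(Y, z, w):
    """R^2 of one factor for one modality.

    Y: samples x features matrix, z: factor score per sample,
    w: loading per feature. Features are centred before comparison.
    """
    n, d = len(Y), len(Y[0])
    means = [sum(Y[i][j] for i in range(n)) / n for j in range(d)]
    ss_tot = sum((Y[i][j] - means[j]) ** 2 for i in range(n) for j in range(d))
    ss_res = sum(
        (Y[i][j] - means[j] - z[i] * w[j]) ** 2 for i in range(n) for j in range(d)
    )
    return 1.0 - ss_res / ss_tot


# Hypothetical factor scores for four participants (mean zero).
z = [-1.5, -0.5, 0.5, 1.5]
# Variation unrelated to the factor (orthogonal to z, mean zero).
e = [1.0, -1.0, -1.0, 1.0]

# Toy "modality A": two features fully driven by the factor.
w_a = [2.0, -1.0]
Y_a = [[zi * wj for wj in w_a] for zi in z]

# Toy "modality B": one feature weakly driven by the factor plus unrelated variation.
w_b = [0.4]
Y_b = [[0.4 * zi + ei] for zi, ei in zip(z, e)]

r2_a = variance_explained(Y_a, z, w_a)
r2_b = variance_explained(Y_b, z, w_b)
print(f"modality A: {r2_a:.3f}, modality B: {r2_b:.3f}")
```

A modality dominated by sample-specific variation, like toy modality B, yields a low value even when the factor's effect on it is real, which is one way a small share of explained variance can coexist with correctly retrieved molecules.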
Associations between LFs and specific aspects of AD pathology: We next investigated how individual LFs related to specific aspects of AD pathology by comparing which CSF AD biomarkers were most strongly associated with each of them. This revealed that LFs 1–3 were primarily associated with CSF Tau levels, suggesting relationships with neuronal injury, while LF4 and LF5 were mainly associated with CSF Aβ1−42, suggesting involvement in the development of amyloid pathology. In LF3 and LF4, the direction of the association with Aβ1−42 was opposite to that with Tau and P-Tau. We speculate that the molecules within these LFs and the associated alterations could play a role in both amyloid aggregation and tau-related neurodegeneration, or represent a consequence of developing cerebral AD pathology.
Interactions between LFs and ‘omics modalities: Besides identifying molecular profiles and metabolic pathway alterations associated with AD, our approach also disentangled how components of individual LFs interact with each other to explain variance within the cohort. In other words, the contribution of each LF to total variance results from a specific combination of the different ‘omics modalities. Indeed, while the variance explained by LF1 and LF2 was associated with all modalities, other LFs only contained a subset of these (one-carbon metabolism and metabolomics were only very weakly associated with LF3 and LF4, whereas lipidomics was nearly absent from LF3 and LF5), revealing specific interactions between a subset of molecules and particular metabolic pathways. Individual molecules also presented different patterns of association across LFs. For example, a subset of lipids, including PC 32:0, PC 34:1, LPA 18:3 and TAG 54:3, had a strong positive association with LF2 and a weak negative association with LF4. Since LF2 was associated with all tested modalities, this suggests that these analytes interact within multiple biological pathways and could sit within a hub of metabolic changes. LF2 is associated with both Tau and P-Tau; neurodegeneration and tau pathology could therefore relate to a more general metabolic alteration. The association of PC 32:0 with tau pathology in single ‘omics approaches supports this assumption. In contrast, LF4 is strongly associated with amyloid pathology and is only associated with changes in lipids and proteins (in addition to CSF AD biomarkers). Therefore, only a subset of lipids appears to interact directly with amyloid pathology.
Novel associations uncovered by the MOFA model: The MOFA model uncovered additional relationships not revealed by single ‘omics exploration paradigms, such as the associations of S-adenosylhomocysteine and glycoproteins with CSF Aβ1−42, Tau and P-Tau, and of total cysteine with Aβ1−42. Indeed, since the trained MOFA model considered not only molecules from one modality but the whole dataset from different ‘omics, it was able to reveal additional associations resulting from the downstream effects of these molecules or from interactions with other modalities. While several analytes identified by the trained model have previously been associated with AD (see Additional File 1), it also uncovered novel associations, such as those of dynein light-chain 2, cytoplasmic (DYL2) and neurexophilin-4 (NXPH4). Both were associated with LF1 and with cognitive impairment (Additional File 3, Table S3). DYL2 is thought to regulate dynein function (37) and maintain cytoskeletal structure, thereby regulating synaptic function (38). NXPH4 structurally resembles neurexophilin-1, an α-neurexin ligand, which promotes adhesion between dendrites and axons and modulates specific cerebellar synapses and motor functions (39). Altered levels of these proteins may therefore be associated with neurodegenerative processes and related to cognitive impairment in AD. Another novel analyte we identified is the cholesteryl ester SE 27:1 16:0. While links between phosphatidylcholine metabolism and AD in general (40) and PC 32:0 in particular (41) have been previously reported, to our knowledge cholesteryl esters have not previously been associated with AD pathology. In our MOFA model, this cholesteryl ester was strongly correlated with LF4, suggesting a role in amyloid pathology. These molecules were also associated with cognitive performance as measured by MMSE (Additional File 3, Table S4).
Together, these results demonstrate the capacity of integrative multi-omics to provide additional insights into the relationship of molecular alterations with specific aspects of AD pathology.
Prediction of AD pathology and cognitive decline using MOFA-selected molecules: Molecular signatures associated with AD or predictive of cognitive decline were derived from our model. Each signature contains four molecules taken from multiple biological levels, and the two signatures share two molecules, protein 14-3-3 zeta/delta and clusterin, suggesting common biological pathways associated with AD and cognitive decline. Both signatures also significantly improved prediction performance when added to reference models. These findings demonstrate the ability of our model to identify molecules reflecting metabolic differences related to AD pathology or cognitive decline rather than to other factors. While these results need validation in an independent cohort, they already demonstrate the ability of our model to identify biomarker combinations that may be used in clinical practice.
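One common way to quantify whether added molecules improve a reference model is to compare areas under the ROC curve (AUC); whether this matches the exact metric used here is an assumption. The sketch below computes AUC from its rank interpretation on invented scores for eight hypothetical participants, so the numbers carry no meaning beyond illustrating the comparison.

```python
def auc(scores, labels):
    """Probability that a random case scores higher than a random control
    (ties count one half); equivalent to the area under the ROC curve."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical outcome labels: 0 = control, 1 = case.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
# Invented predicted risks from a reference model and from the same model
# extended with a molecular signature.
reference_scores = [0.20, 0.40, 0.60, 0.30, 0.50, 0.70, 0.35, 0.80]
extended_scores = [0.10, 0.30, 0.55, 0.20, 0.60, 0.80, 0.50, 0.90]

auc_reference = auc(reference_scores, labels)
auc_extended = auc(extended_scores, labels)
print(f"reference AUC: {auc_reference:.4f}, extended AUC: {auc_extended:.4f}")
```

In a real analysis both models would be evaluated on held-out participants, and the difference in AUC would be tested formally rather than read off point estimates.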
Inferring pathway relationships with AD pathology: One important strength of the MOFA approach is that it makes it possible to relate multiple biological pathways to one another and to associate them with sources of variance (i.e., LFs). Using over-representation of metabolic pathways, we were able to show that individual LFs, and the main related pathological aspects of AD (i.e., amyloid aggregation, neurodegeneration and tau pathology), are associated with distinct pathways. Hemostasis and immune response were the most over-represented. Only the immune response was associated with all LFs in which individual pathways could be identified. LF1 and LF2 presented a significant enrichment in biomolecules implicated in hemostasis, suggesting an association between this pathway and both neuronal injury and tau pathology. While an association between hemostasis and amyloid pathology was previously described (42), in particular related to expression of amyloid precursor protein and release of Aβ (43), there have also been recent reports of an association between Tau and hemostasis (44). Molecules involved in the extracellular matrix were significantly enriched in LF2, also suggesting an association with tau-related pathology, in line with previous reports (45). However, this pathway was not detected within LF1 or other LFs. We therefore hypothesise that the molecules involved are those presenting a specific pattern of association with LF2, such as PC 32:0, PC 34:1, LPA 18:3 and TAG 54:3. Neuronal function was confined to associations with LF5, suggesting little variation in signal transmission and synaptic function across the cohort, since this LF only explained 8% of the variance. Nonetheless, this result suggests an association with amyloid pathology, which is in accordance with previous findings of amyloid being released from neurons in an activity-dependent fashion and modulating synaptic function and plasticity (46, 47).
Overall, the enriched metabolic pathways suggest that AD pathology not only affects pathways related to neuronal biological systems but is also linked to a broader spectrum of metabolic dysfunctions.
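Over-representation of a pathway is commonly assessed with a one-sided hypergeometric test. The sketch below shows that calculation under the assumption that a test of this kind underlies the enrichment results; the function name and all counts are invented for illustration and are not values from this study.

```python
from math import comb


def enrichment_p_value(n_universe, n_pathway, n_selected, n_overlap):
    """One-sided hypergeometric p-value: probability of observing at least
    n_overlap pathway members among n_selected molecules drawn without
    replacement from a universe containing n_pathway pathway members."""
    upper = min(n_pathway, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 500 measured molecules, 40 annotated to a pathway,
# 30 molecules with high weights on a latent factor, 8 of which are in the pathway.
p_value = enrichment_p_value(n_universe=500, n_pathway=40, n_selected=30, n_overlap=8)
print(f"enrichment p-value: {p_value:.2e}")
```

When many pathways are tested for each LF, the resulting p-values would additionally need correction for multiple testing before pathways are declared over-represented.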
Limitations: The inclusion of some targeted analysis results in the multi-omics models may be considered a limitation. While the proteomic and lipidomic datasets are hypothesis-free measurements and the study could have been limited to these data, we chose to include further available modalities. In particular, we considered neuroinflammation and one-carbon metabolism given their previously reported associations with AD and relevance for brain metabolism. The replication of these and other previously reported associations in our MOFA model supports the validity of the new findings revealed in the present study. Our findings, in particular the identified biomarker combinations, need validation in independent cohorts.