Predictive Models and Features of Patient Mortality across Dementia Types

doi:10.21203/rs.3.rs-2350961/v1

Article

This work is licensed under a CC BY 4.0 License

Journal Publication

published 28 Feb, 2024

Read the published version in Communications Medicine →

Version 1

Dementia care is challenging due to the divergent trajectories in disease progression and outcomes. Predictive models are needed to identify patients at risk of near-term mortality. Here, we developed machine learning models predicting survival using a dataset of 45,275 unique participants and 163,782 visit records from the U.S. National Alzheimer’s Coordinating Center (NACC). Our models achieved an AUC-ROC of over 0.82 utilizing nine parsimonious features for all one-, three-, five-, and ten-year thresholds. The trained models mainly consisted of dementia-related predictors such as specific neuropsychological tests and were minimally affected by other age-related causes of death, e.g., stroke and cardiovascular conditions. Notably, stratified analyses revealed shared and distinct predictors of mortality across eight dementia types. Unsupervised clustering of mortality predictors grouped vascular dementia with depression and Lewy body dementia with frontotemporal lobar dementia. This study demonstrates the feasibility of flagging dementia patients at risk of mortality for personalized clinical management.

Biological sciences/Computational biology and bioinformatics/Machine learning

Biological sciences/Neuroscience/Diseases of the nervous system/Dementia

Dementia has become a growing public health concern, classified as the seventh leading cause of death¹ and the fourth most burdensome disease or injury in the United States in 2016 based on years of life lost². As of 2022, an estimated $1 trillion of global annual costs³ can be attributed to Alzheimer’s disease and other dementias, affecting an estimated 6.5 million Americans⁴ and 57.4 million people worldwide, and those numbers are expected to triple by 2050⁵. Unfortunately, the true mortality burden associated with dementia may still be underestimated, as dementia itself tends to be underreported on death certificates as the underlying cause of death⁶.

This immense healthcare burden of dementia can be attributed to the lack of curative drugs^7,8, the challenge in predicting patient trajectory, and the intrinsic difficulty in diagnosing dementia, which often requires the evaluation of various criteria^9,10, including risk factors (e.g., old age and family history), cognitive impairment screening questionnaires, neuropsychological testing (e.g., the Mini-Mental State Examination (MMSE)¹¹ and the Alzheimer’s Disease Assessment Scale-Cognitive Subscale (ADAS-Cog)¹²), physical examination, biomarkers, and neuroimaging. To aid the detection, diagnosis, and treatment of Alzheimer’s disease and related dementias, the National Institute on Aging (NIA) founded the National Alzheimer’s Coordinating Center (NACC)¹³ in 1999. Using existing protocols, standardized multi-institutional databases have been built over the past decades, encompassing clinical records for tens of thousands of patients that may be used to develop predictive models¹⁴.

While machine learning models have been developed for the diagnosis or classification of dementia^4,15–20, they have rarely been applied to the prediction of near-term survival or mortality in dementia patients. Most dementia survival prediction studies utilize traditional nonparametric estimator and regression models, such as Kaplan-Meier estimator curves and Cox proportional hazards models^21,22, rather than advanced machine learning models. However, due to the issues of high dimensionality, non-linearity, censoring, heterogeneity, and missingness present in dementia clinical data, machine learning can often provide more accurate prediction compared to traditional statistical methods²³. In the few published studies that utilized machine learning to predict dementia patient mortality^23–25, the models achieved reasonable performances. However, two of these studies were conducted within small cohorts (< 2,000 patients) derived from single health systems or geographical regions, and important predictors of dementia patient mortality varied between studies. Moreover, none of these studies differentiated among dementia types, which remains a crucial next step for the personalized treatment and management of dementia^26,27. A systematic effort to build predictive models using a cross-institutional database encompassing multiple dementia types is required to resolve these open questions.

To resolve these challenges, we utilized the NACC database, the largest resource of its kind in the United States, to (1) develop robust machine learning models for predicting dementia patient mortality across various time frames, (2) identify key predictors of mortality, and (3) demonstrate how these predictors differ across dementia subtypes. Our models provide a method of flagging dementia patients at risk of near-term death and highlighting crucial differences between dementia types, which can contribute to the precision care of dementia.

Cohort characteristics and patient mortality across dementia types

Data for this study were obtained from the National Alzheimer’s Coordinating Center (NACC) database¹³, which spans 39 past and present Alzheimer’s Disease Centers (ADCs) across the United States. The NACC collects, audits, and distributes ADC-derived data. The NACC data release used for this study included 45,275 unique participants and 163,782 visit records between 2005 and 2021.

Data were extracted for various dementia severity levels (Table 1, Supplementary Table 1). The mean age at visit increased from 73 years for those who had normal cognition (n = 16,379), to 76 years for those with dementia (n = 19,186). The percentage of females was 64.9% among normal cognition patients, compared to 52.0% among demented patients. For impaired-not-MCI (n = 1,840), MCI (n = 7,870), and dementia groups, the proportions of females were 58.3%, 52.4%, and 52.0%, respectively. Mean education years were 15–16 across all dementia severity levels in this population. The percent of participants with at least one AD risk APOE e4 allele was 23.1% among normal cognition participants, compared to 38.5% among demented participants. This NACC cohort was utilized for statistical analyses and machine learning model training in this work (Fig. 1a).

Table 1

Characteristics of NACC participants by cognitive status
Characteristic	Dementia severity
Characteristic	Normal cognition	Impaired-not-MCI	MCI	Dementia
Number of Participants	16379	1840	7870	19186
Age (years old), mean (SD)	73 (12)	73 (11)	76 (10)	76 (11)
Female, n (%)	10632 (64.9%)	1071 (58.3%)	4124 (52.4%)	9976 (52.0%)
Education (years), mean (SD)	16 (7)	15 (6)	16 (8)	16 (9)
Race, n (%)
White	12869 (78.6%)	1335 (72.6%)	6048 (76.8%)	16169 (84.3%)
Black/African American	2629 (16.1%)	348 (18.9%)	1324 (16.8%)	1918 (10.0%)
American Indian/Alaskan Native	138 (0.8%)	23 (1.3%)	83 (1.1%)	151 (0.8%)
Native Hawaiian/Pacific Islander	14 (0.1%)	5 (0.3%)	5 (0.1%)	25 (0.1%)
Asian	492 (3.0%)	40 (2.2%)	234 (3.0%)	420 (2.2%)
Other/multiracial/unknown	237 (1.4%)	89 (4.8%)	176 (2.2%)	503 (2.6%)
Hispanic ethnicity, n (%)	1115 (6.8%)	234 (12.7%)	766 (9.7%)	1473 (7.7%)
>= 1 APOE e4 allele, n (%)	3781 (23.1%)	400 (21.7%)	2152 (27.3%)	7395 (38.5%)

The whole cohort of 45,275 unique NACC individuals from the 2005–2021 period was analyzed to compare patient survival across dementia types. We estimated survival time using the Kaplan-Meier method (Fig. 1b,c). Survival probability differed across primary etiologic diagnoses (Fig. 1b). Patients with prion disease had a shorter overall median survival time than patients with other dementia types (p < 0.0001), consistent with the rapid onset and progression characteristic of prion disease²⁸. The overall median survival time for Alzheimer’s disease was not reached, with 5- and 7-year survival rates of 76.05% and 66.63%, respectively. The overall median survival time for Lewy body disease was 98.3 months (95% CI 84.2–119.5), with 5- and 7-year survival rates of 60.0% and 52.0%, respectively.
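The Kaplan-Meier estimate and the notion of a median that is "not reached" can be made concrete with a small sketch. The function names and the toy follow-up times below are hypothetical and are not NACC data; the code only illustrates the product-limit calculation, not the exact software used in this study.

```python
from collections import Counter

def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct death time.

    durations: follow-up time per participant (e.g., months)
    events:    1 if death was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # deaths and censorings both leave the risk set
    at_risk = len(durations)
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            surv *= 1.0 - d / at_risk  # product-limit update at a death time
            curve.append((t, surv))
        at_risk -= exits[t]
    return curve

def median_survival(curve):
    """First time at which S(t) <= 0.5, or None if the median is not reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

# Toy follow-up times in months (illustrative values only).
durations = [6, 12, 12, 20, 27, 30, 45, 60]
events = [1, 1, 0, 1, 1, 0, 1, 0]
curve = kaplan_meier(durations, events)
print(curve, median_survival(curve))
```

When too few deaths are observed for the curve to fall to 0.5, `median_survival` returns `None`, which corresponds to a median survival time that is reported as not reached.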

To illustrate the relationship between dementia severity and survival, we performed a survival analysis based on global Clinical Dementia Rating (CDR) scores (Fig. 1c). For a global CDR score of 0, the overall median survival time was not reached, with 5- and 7-year survival rates of 93.7% and 90.1%, respectively. For global CDR scores of 1, 2, and 3, the overall median survival times were 141.8 months (95% CI 128.8 – NA), 75.0 months (95% CI 69.0–81.0), and 27.3 months (95% CI 25.3–29.3), respectively. With increasing global CDR score, which represents more severe cognitive impairment, patients generally showed worse outcomes. Moreover, to determine whether these trends held in patients with comorbidities (i.e., other causes of death such as cancer and cardiovascular disease), we performed additional survival analyses on global CDR scores stratified by comorbidity. The results showed that regardless of whether dementia patients had cancer or heart conditions, global CDR score remained significantly associated with mortality (Supplementary Fig. 1), suggesting that, even among dementia patients with comorbidities, dementia-related causes are likely still the dominant factors in mortality.

Predicting dementia patient mortality using age and standard global CDR

Since our survival analysis revealed that a higher standard global CDR coincides with a faster decline in survival probability, we first built simple XGBoost ML models that use only two features, age and standard global CDR, to predict mortality in dementia patients. We stratified our data into four datasets separated by survival endpoints, each with an 80/20 train/test split and a separate validation cohort of later visits not seen by the model: one-year survival (train: n = 60,367; test: n = 15,092; validation: n = 10,284 visit records), three-year survival (train: n = 53,272; test: n = 13,318; validation: n = 11,552), five-year survival (train: n = 47,196; test: n = 11,800; validation: n = 11,284), and ten-year survival (train: n = 32,569; test: n = 8,143; validation: n = 13,266) (Supplementary Fig. 2). We trained our models accordingly and employed Bayesian optimization²⁹ to select the optimal hyperparameters for each model.
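The dataset construction described above can be sketched in two steps: labeling each visit at a survival threshold, then separating later visits into a validation cohort before an 80/20 split. The labeling rule, the date cutoff, the record layout, and the function names are assumptions made for illustration; the text does not specify them, nor whether the split was grouped by participant.

```python
import random
from datetime import date

def label_visit(visit, death, last_contact, years):
    """1 = died within `years` of the visit, 0 = known alive past that horizon,
    None = outcome unknown at the horizon (excluded in this sketch)."""
    horizon_days = round(365.25 * years)
    if death is not None and (death - visit).days <= horizon_days:
        return 1
    end = death if death is not None else last_contact
    if (end - visit).days > horizon_days:
        return 0
    return None

def split_records(records, cutoff, test_fraction=0.2, seed=0):
    """Visits on or after `cutoff` form the held-out validation cohort;
    earlier visits are shuffled into a train/test split."""
    earlier = [r for r in records if r["visit"] < cutoff]
    validation = [r for r in records if r["visit"] >= cutoff]
    rng = random.Random(seed)
    rng.shuffle(earlier)
    n_test = round(len(earlier) * test_fraction)
    return earlier[n_test:], earlier[:n_test], validation

# Toy visit records (hypothetical, one visit per year from 2006 to 2015).
records = [{"id": i, "visit": date(2006 + i, 1, 1)} for i in range(10)]
train, test, validation = split_records(records, cutoff=date(2014, 1, 1))
print(len(train), len(test), len(validation))
```

Under this rule, a visit whose follow-up ends before the horizon without an observed death receives no label, so the number of usable records can differ between survival thresholds.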

The two-feature XGBoost models achieved an AUC-ROC of over 0.76 at all survival thresholds, though the higher thresholds achieved much higher AUC-PR scores, likely due to the large class imbalances at the lower thresholds. The full table of model performance for the two-feature XGBoost models in the test and validation sets is shown in Supplementary Table 2, and the AUC-ROC curves are shown in Supplementary Fig. 3. Overall, these basic models confirmed that age and clinical dementia rating alone provide a reasonable prediction of dementia patient mortality, and their contributions would be further elucidated with the inclusion of more clinical features in our subsequent analyses.
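The effect of class imbalance on AUC-PR, as opposed to AUC-ROC, can be shown with a toy example. The sketch below uses average precision as one common estimate of AUC-PR, which is an assumption since the text does not state how AUC-PR was computed; the scores are synthetic and the function names are hypothetical.

```python
def auc_roc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(y_true, scores):
    """Mean of the precision values at the rank of each positive example."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    n_pos = sum(y_true)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / n_pos

# Synthetic scores: the same positives, with negatives replicated tenfold
# to mimic a rarer outcome (such as death within one year).
pos_scores = [0.9, 0.7, 0.4]
neg_scores = [0.8, 0.6, 0.3, 0.2, 0.1]

y_bal = [1] * len(pos_scores) + [0] * len(neg_scores)
s_bal = pos_scores + neg_scores
y_imb = [1] * len(pos_scores) + [0] * (10 * len(neg_scores))
s_imb = pos_scores + neg_scores * 10

auc_balanced, auc_imbalanced = auc_roc(y_bal, s_bal), auc_roc(y_imb, s_imb)
ap_balanced, ap_imbalanced = average_precision(y_bal, s_bal), average_precision(y_imb, s_imb)
print(auc_balanced, auc_imbalanced, ap_balanced, ap_imbalanced)
```

Replicating the negatives leaves the pairwise ranking, and therefore AUC-ROC, unchanged, while precision at each positive drops because more negatives are ranked above it.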

Multi-factorial machine learning models for predicting mortality in dementia patients

We proceeded to build multi-factorial models that introduced a wider array of clinical features into the machine learning models. Initially, we built XGBoost models encompassing all 189 features of our preprocessed datasets. These initial results identified numerous recurring features among the top features, ranked by SHapley Additive exPlanations (SHAP)³⁰ values, of each of the four survival time thresholds, and much of the explainability of the predictions could be attributed to these top few features alone (Supplementary Fig. 4). Therefore, we derived a parsimonious and informative feature subset across all four survival time thresholds by taking the union of the top five features from each model, thus enhancing clinical interpretability without a drastic tradeoff in predictive performance.
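The union-of-top-features step can be sketched as follows. The feature names and importance values are placeholders rather than results from this study, and the helper names are hypothetical; the example uses the top two features per model instead of the top five to keep it short.

```python
def top_k(importance, k=5):
    """Names of the k features with the largest mean absolute SHAP value."""
    ranked = sorted(importance.items(), key=lambda kv: -kv[1])
    return [name for name, _ in ranked[:k]]

def union_of_top_features(importance_by_model, k=5):
    """Union of each model's top-k features, in order of first appearance."""
    selected = []
    for importance in importance_by_model.values():
        for name in top_k(importance, k):
            if name not in selected:
                selected.append(name)
    return selected

# Placeholder importances for three hypothetical survival-threshold models.
importance_by_model = {
    "one_year": {"feat_a": 0.9, "feat_b": 0.5, "feat_c": 0.2, "feat_d": 0.1},
    "three_year": {"feat_a": 0.8, "feat_c": 0.6, "feat_b": 0.3, "feat_d": 0.1},
    "five_year": {"feat_a": 0.7, "feat_d": 0.4, "feat_b": 0.2, "feat_c": 0.1},
}
subset = union_of_top_features(importance_by_model, k=2)
print(subset)
```

With four models and five features each, the union can contain at most 20 features, so a much smaller union indicates that the models largely agree on their top predictors.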

The SHAP bar plots showing all features at each survival threshold are shown in Supplementary Fig. 4, where we retain the top features. Notably, other leading causes of death in the US, such as stroke and other cardiovascular conditions, were ranked outside of the top 20 features at all survival thresholds.

After conducting feature selection, our resulting feature subset consisted of nine parsimonious features: ‘NACCAGE’ (Subject’s age at visit), ‘INDEPEND’ (Level of independence), ‘PERSCARE’ (Personal care), ‘TRAILB’ (Trail Making Test Part B — Total number of seconds to complete), ‘STOVE’ (In the past four weeks, did the subject have any difficulty or need help with: Heating water, making a cup of coffee, turning off the stove), ‘SEX’ (Subject’s sex), ‘SMOKYRS’ (Total years smoked cigarettes), ‘TRAILBRR’ (Trail Making Test Part B — Number of commission errors), and ‘EDUC’ (Years of education).

Utilizing the same train/test splits and validation cohorts as in the two-feature models, we employed this subset of nine features on new, Bayesian-optimized XGBoost models to predict survival/mortality across all dementia patients at each survival threshold. All four models achieved a test-set AUC-ROC of over 0.8, though the lower thresholds (i.e., one-year and three-year) had markedly lower AUC-PR. At the ten-year survival threshold, our model achieved the highest AUC-ROC and AUC-PR of all, with an AUC-ROC of 0.829 (95% CI: 0.814–0.832) and an AUC-PR of 0.905 (95% CI: 0.896–0.911). The AUC-PR increased markedly at the higher survival-time thresholds, owing to a higher proportion of mortality in patients and, thus, smaller class imbalances at these thresholds. Moreover, these performance trends were reflected in the validation sets as well. The full table of model performance for the multi-factorial models in the test and validation sets is shown in Table 2, and the AUC-ROC curves are shown in Fig. 2a.

Table 2

Predictive performance of the nine-feature, multi-factorial models
Survival-time Threshold	Test Set			Validation Set
Survival-time Threshold	Accuracy (95% CI)	AUC-ROC (95% CI)	AUC-PR (95% CI)	Accuracy (95% CI)	AUC-ROC (95% CI)	AUC-PR (95% CI)
One-year	0.780 (0.770–0.788)	0.824 (0.820–0.850)	0.259 (0.257–0.301)	0.817	0.870	0.300
Three-year	0.750 (0.743–0.758)	0.825 (0.817–0.830)	0.566 (0.545–0.588)	0.752	0.837	0.539
Five-year	0.744 (0.735–0.749)	0.823 (0.813–0.826)	0.722 (0.702–0.730)	0.710	0.817	0.667
Ten-year	0.748 (0.733–0.755)	0.829 (0.814–0.832)	0.905 (0.896–0.911)	0.693	0.789	0.741
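Confidence intervals such as those in Table 2 are often obtained by resampling the test set. The sketch below shows a percentile bootstrap for AUC-ROC as one possible approach; this section does not state how the reported intervals were computed, so the method, the function names, and the synthetic data are all assumptions.

```python
import random

def auc_roc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(y_true, scores, metric, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for `metric` over resampled records."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:  # AUC is undefined with one class; redraw
            continue
        stats.append(metric(ys, [scores[i] for i in idx]))
    stats.sort()
    k = int(n_boot * alpha / 2)
    return stats[k], stats[n_boot - 1 - k]

# Synthetic labels and scores (not NACC data).
toy_rng = random.Random(1)
y_true = [i % 2 for i in range(60)]
scores = [0.4 * y + toy_rng.random() for y in y_true]
point = auc_roc(y_true, scores)
lower, upper = bootstrap_ci(y_true, scores, auc_roc)
print(f"AUC-ROC {point:.3f} (95% CI: {lower:.3f}-{upper:.3f})")
```

Percentile intervals need not be symmetric around the point estimate, which is also the case for several intervals in Table 2.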

Additionally, to determine whether the model performance was consistent across Alzheimer’s disease centers (ADCs), we verified the performance of each model across all ADCs with at least 200 patients in the test set at each survival threshold. Overall, model performance remained consistent across ADCs, with discrepancies primarily occurring in the ADCs with the smallest patient populations. The full AUC-ROC curves stratified by ADC are available in Supplementary Fig. 5 and demonstrated the broad generalizability of our model.

We generated the bootstrapped SHAP plots (Fig. 2b) of the multi-factorial models to reveal key insights about the nine chosen features in relation to dementia patient mortality. Notably, a higher risk of mortality (positive SHAP value) was predicted by old age (‘NACCAGE’), male sex (‘SEX’), higher levels of dependency (‘INDEPEND’), higher levels of personal care required (‘PERSCARE’), greater difficulty in handling a stove or heating water (‘STOVE’), more years of education (‘EDUC’), and more seconds required to complete the Trail Making Test Part B (‘TRAILB’) across all four survival thresholds. More years of smoking (‘SMOKYRS’) and more commission errors on the Trail Making Test Part B (‘TRAILBRR’) were also predictive of mortality risk, though interestingly, the direction of the effects began to reverse at the longer survival thresholds. The full SHAP beeswarm plots for the multi-factorial models are shown in Fig. 2b.

Dementia type-specific models

The multi-factorial machine learning models provided a cogent framework for predicting mortality in an unspecified population of dementia patients. However, across dementia types, there may have been key similarities and distinctions that could not be captured in the pan-dementia analysis. Therefore, we stratified the NACC cohort into smaller cohorts based on dementia types to conduct sub-dementia analyses. For these analyses, we aimed to predict dementia patient mortality solely at the five-year survival threshold that provides a dataset with the smallest class imbalance while providing an extended time window for possible clinical actionability. We conducted our sub-dementia analysis on these eight dementia types with sample sizes greater than 100: no dementia (n = 42,135 visit records), Alzheimer’s disease (AD, n = 37,990), unknown (n = 6,317), frontotemporal lobar degeneration (FTLD, n = 4,290), Lewy body dementia (LBD, n = 3,182), vascular brain injury or vascular dementia (VaD, n = 2,288), cognitive impairment due to other reasons (n = 1,362), and depression (n = 1,354). We stratified the training, test, and validation sets of the five-year dataset by dementia type and then trained, optimized, tested, and validated an XGBoost model for each of the eight dementia types.

Performance-wise, the models built on the commonly defined dementia types (e.g., AD, FTLD, LBD, and VaD) tended to perform better in the positive class (mortality) and, thus, generally had higher AUC-PR, whereas the models built on the non-dementia patients and the more ambiguous dementia types (e.g., depression, cognitive impairment for other specified reasons, and missing/unknown) were more robust at predicting the larger negative class (survival) and, thus, had higher AUC-ROC overall. All eight models achieved an AUC-ROC of over 0.79, with the no-dementia model attaining the highest AUC-ROC at 0.873 (95% CI: 0.859–0.879). The most consistent, all-around performer was the AD model, which is reasonable given that AD was by far the most prevalent dementia type aside from the no dementia group. The full table of model performance for the sub-dementia models in their respective test and validation sets is shown in Table 3.

Table 3

Predictive performance of the individual dementia type models

The clustered feature importance heatmap is shown in Fig. 3. Hierarchical clustering produced the following four clusters of dementia types: (1) VaD and depression, (2) FTLD and LBD, (3) AD and other dementia, and (4) no dementia and unknown. Notably, many of the key features in the pan-dementia cohort reappeared among the top features within most dementia types, including ‘NACCAGE’ (Subject’s age at visit), ‘INDEPEND’ (Level of independence), and ‘SMOKYRS’ (Total years smoked cigarettes). New features such as ‘NACCADC’ (ADC at which subject was seen), ‘VEG’ (Vegetables — Total number of vegetables named in 60 seconds), ‘TRAVEL’ (In the past four weeks, did the subject have any difficulty or need help with: Traveling out of the neighborhood, driving, or arranging to take public transportation), and ‘TRAILA’ (Trail Making Test Part A — Total number of seconds to complete) also emerged as important features across numerous dementia types.
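Grouping dementia types by the similarity of their feature-importance profiles can be sketched with a small agglomerative clustering routine. The distance metric (Euclidean) and linkage (average) are assumptions, since the text does not specify them, and the type labels and importance vectors are placeholders rather than values from Fig. 3.

```python
from itertools import combinations
from math import dist

def agglomerate(vectors, n_clusters):
    """Average-linkage agglomerative clustering; returns a list of label sets."""
    clusters = [{name} for name in vectors]

    def linkage(a, b):
        # Mean pairwise Euclidean distance between members of two clusters.
        pairs = [(x, y) for x in a for y in b]
        return sum(dist(vectors[x], vectors[y]) for x, y in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a | b)  # merge the two closest clusters
    return clusters

# Placeholder importance vectors, one entry per feature (illustrative only).
vectors = {
    "type_A": [0.30, 0.05, 0.02],
    "type_B": [0.28, 0.06, 0.03],
    "type_C": [0.05, 0.25, 0.10],
    "type_D": [0.06, 0.22, 0.12],
}
groups = agglomerate(vectors, n_clusters=2)
print(groups)
```

Types whose models rely on similar features end up in the same group, which is the sense in which the clusters reported above reflect shared mortality predictors.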

Meanwhile, several key differences distinguished individual dementia types and their clusters from one another. For instance, in both VaD and depression, alongside general cognitive features, body measurements and vital signs, such as ‘HEIGHT’ (Subject’s height (inches)), ‘WEIGHT’ (Subject’s weight (lbs)), ‘NACCBMI’ (Body mass index (BMI)), ‘HRATE’ (Subject’s resting heart rate (pulse)), and ‘BPDIAS’ (Subject’s blood pressure (sitting), diastolic), were more important for predicting mortality than for any other dementia type. In the vascular dementia subgroup, ‘CVCHF’ (Congestive heart failure) was also a pivotal feature, second in importance after age and accounting for over 5% of the mortality prediction among VaD patients.

In the FTLD and LBD cluster, feature importance was distributed across a substantially wider array of cognitive features, with less importance attributed to age and smoking years than in the other dementia types. In FTLD, for instance, a number of new cognitive features emerged: CDR® Plus NACC FTLD features (e.g., ‘CDRLANG’ (Language) and ‘COMMUN’ (Community Affairs)), clinician judgment features regarding motor function (e.g., ‘NACCMOTF’ (Indicate the predominant symptom that was first recognized as a decline in the subject’s motor function) and ‘MOMODE’ (Mode of onset of motor symptoms)), and neuropsychological battery summary scores (e.g., ‘NACCMMSE’ (Total MMSE score (using D-L-R-O-W))). Accordingly, difficulty in performing functional and social activities and the loss of motor function were crucial predictors of mortality in FTLD patients, more so than in any other dementia type. As for LBD, ‘CDRSUM’ (Standard CDR sum of boxes) superseded age as the most important feature, accounting for nearly 10% of the mortality prediction among LBD patients. Other new features such as ‘ANIMALS’ (Animals: total number of animals named in 60 seconds), ‘ORIENT’ (Orientation), and ‘NACCMMSE’ (Total MMSE score (using D-L-R-O-W)) were also relevant to the mortality prediction within the LBD subgroup.

The top features in the AD subgroup comprised almost entirely cognitive features, most of which overlapped with those of the multi-factorial models, with the addition of ‘CDRSUM’ (Standard CDR sum of boxes), ‘SHOPPING’ (In the past four weeks, did the subject have any difficulty or need help with: Shopping alone for clothes, household necessities, or groceries), and ‘TOBAC100’ (Smoked more than 100 cigarettes in life). The top features in the other subgroup similarly consisted primarily of cognitive features, with the addition of body measurements and vital signs, similar to the VaD and depression cluster.

Finally, in the no dementia and unknown subgroups, many of the features typically associated with cognition, such as age and performance on neuropsychological exams, remained important predictors of mortality, though others were superseded by more general comorbidities and risk factors. For instance, the relative importance of ‘SMOKYRS’ (Total years smoked cigarettes) was higher in the no dementia group than in any of the dementia groups, accounting for 7.5% of the mortality prediction among no dementia patients. Other non-cognitive risk factors such as ‘HYPERTEN’ (hypertension) and ‘ENERGY’ (Do you feel full of energy?) were also relevant for predicting mortality in non-dementia patients, despite having little to no contribution to the predictions in the dementia groups. These results reaffirmed that mortality predictors differ between non-demented participants and dementia patients, for whom multiple survival factors relate to neuropsychological ability.

In this study, we developed machine learning models for predicting mortality through training, testing, and validation using 163,782 visit records of 45,275 unique NACC individuals in the United States from 2005 to 2021. We have demonstrated that machine learning models, which have thus far primarily been explored as screening or diagnosis tools in the context of dementia, have substantial utility in the prediction of mortality among dementia patients. First, we conducted multiple survival analyses, which confirmed that increasing global CDR scores coincided with decreased survival and showed that there was considerable variability in survival across dementia subtypes. Subsequently, we developed two-feature models (using only age and standard global CDR) and multi-factorial models (using nine features determined through feature selection) to predict dementia patient mortality at four distinct survival-time thresholds, all of which achieved high predictive performance. We additionally built machine learning models for eight different dementia subtypes and revealed key feature differences among them, though age and cognitive features derived from neuropsychological tests remained important predictors of mortality across all dementia types. These mortality predictors reveal similarities and differences in the etiology and clinical representation among individuals affected by different types of dementia.

The results of our global CDR survival analysis were consistent with those of past survival analyses in dementia patients^31,32, confirming that higher CDR scores correlate with reduced survival probability. With respect to dementia type, there have been very few studies investigating the association between dementia type and survival probability. In the studies that we identified, key distinctions were noted in the cohort composition and mortality risk of Lewy body dementia vs. Alzheimer’s disease²⁷, vascular dementia vs. Alzheimer’s disease in the context of depression²⁶, and among eight different dementia subtypes³³. Our dementia-type survival analysis confirmed this heterogeneity, as survival probability differed drastically across groups of patients with different primary etiologic diagnoses. However, whereas prior studies identified comorbidities such as cardiovascular disease³⁴ to be associated with reduced survival probability, we found that regardless of heart conditions, the survival curves separated decisively across patients with varied global CDR scores within the NACC cohort.

Subsequently, we built machine learning models tasked with predicting dementia patient mortality at one-, three-, five-, and ten-year survival thresholds. Our two-feature models, which utilized age and global CDR scores, achieved an AUC-ROC of over 0.76 at all four survival thresholds in the test set. Thus, age and global CDR provided a solid basis for predicting dementia patient mortality and, in the absence of additional clinical features, may alone be used to guide clinical judgment. Our multi-factorial models, for which we utilized SHAP to select a subset of nine features, achieved an AUC-ROC of over 0.82 at all four survival thresholds in the test set and comparable performance in the validation set. The crucial features used by the multi-factorial models confirm the known clinical indicators of dementia from a machine learning standpoint. The multi-factorial models revealed that a higher risk of mortality was predicted by older age^35–38, male sex^31,36–38, higher levels of dependency and personal care required³⁸, more years of education³⁹, more years of smoking⁴⁰, and poorer performance on neuropsychological exams like the Trail Making Test^41,42.

To our knowledge, our study is one of just a few studies to apply a machine learning-based approach to predicting mortality in dementia patients^23–25 (as opposed to statistical approaches), and the first study to do so within population subsets stratified by dementia type. In predicting dementia patient mortality at the five-year survival threshold, our dementia type-specific models all achieved an AUC-ROC of over 0.79 in the test set and similar performance in the validation set. Hierarchical clustering of survival predictors grouped the following dementia types together: (1) vascular dementia (VaD) with depression, (2) Lewy body dementia (LBD) with frontotemporal lobar degeneration (FTLD), (3) Alzheimer’s disease (AD) with other dementia, and (4) no dementia with unknown. Since many dementia types present similar symptoms and disease progressions⁸, differentiating and targeting dementia type-specific symptoms and mortality predictors can be beneficial for patient populations⁴³. Across all four clusters (even in the no dementia and unknown cluster), many features from the multi-factorial models remained key predictors of mortality, such as age, level of independence, smoking, and performance on neuropsychological exams like the Trail Making Test.

First, within the VaD and depression cluster, body measurements and vital signs (e.g., height, weight, BMI, heart rate, and diastolic blood pressure) contributed to the mortality prediction more than for any other dementia type. For VaD, congestive heart failure was the second most important feature after age, consistent with VaD’s common risk factors⁸. Moreover, the grouping of VaD with depression is consistent with previous literature that has highlighted the synergistic effects of VaD and depression on patient mortality²⁶, as VaD patients tend to exhibit a higher baseline risk for psychiatric symptoms like depression^43,44. Second, within the FTLD and LBD cluster, features corresponding to MMSE score, standard CDR sum of boxes, and involvement in community affairs contributed more heavily to the mortality prediction. For FTLD in particular, features measuring difficulty in performing social and functional activities were the pivotal predictors of mortality, consistent with the pathological effects of FTLD⁸. Our findings regarding FTLD and LBD align with prior studies that have similarly grouped the two subtypes together and determined that executive dysfunction and activity disturbances are the key indicators of cognitive impairment for both^43,45. Third, within the AD and other dementia cluster, general cognitive features, namely those from the multi-factorial models, remained the most important predictors of mortality. Standard CDR sum of boxes was also an important predictor of mortality in AD patients, as were body measurements and vital signs for other dementia patients. The grouping of AD with other dementia may be attributed to the difficulty in differentiating AD from certain other types of dementia⁴⁶, and given that AD was by far the most prevalent dementia type in the NACC cohort, it is likely that the other dementia patients were generally similar to AD patients. Finally, within the no dementia and unknown cluster, general cognitive features such as performance on the Trail Making Test, surprisingly, remained important predictors of mortality. However, general comorbidities and mortality risk factors, such as smoking, hypertension, and lack of energy, demonstrated high relative importance as well, more so than for any of the dementias. Notably, as in the survival analysis, cardiovascular diseases did not appear in the top features in either the multi-factorial models or the dementia type-specific models, with the exception of congestive heart failure for VaD. The absence of these comorbidities from the top features in our machine learning models may suggest that cognitive decline is a stronger predictor of mortality in dementia patients than stroke or other comorbid cardiovascular conditions, though further studies could better interrogate this hypothesis.

Our study had several key strengths. First, the NACC database is the largest resource of its kind in the United States, covering a large, diverse patient population that was current through September 2021. Moreover, we highlight a conscious design choice in stratifying our data into train, test, and validation sets. By introducing a prospective validation set based on date, we were able to ascertain the ability of our models to predict mortality within a prospective cohort based on past data. In our pan-dementia analysis, the use of two-feature and nine-feature (multi-factorial) models provided a parsimonious, clinically feasible framework for predicting dementia patient mortality, while in our sub-dementia analysis, the comparison of important predictors of mortality across various dementia types may help to guide precision management and treatment of dementia.

However, our study also had limitations. Due to the high prevalence of missing values, largely attributed to the difficulty in acquiring certain data (e.g., neuropathological data) and differences in clinical procedures across ADCs, many features were eliminated at the outset. Moreover, many features within the NACC data measure similar phenomena, certain variables have changed over time as updates were made to the UDS form, and many variables were derived from clinician diagnosis, precluding the use of a more granular feature selection method. By first eliminating variables with over 40% missing values and subsequently using MICE to impute the remaining features, we aimed to reduce some bias in the feature selection process⁴⁷, though we acknowledge the limitation of neglecting features that may only be ascertained by a subset of ADCs. We note that the best performance may be achieved if each ADC or clinic derives its own predictive model based on its respective available features. Finally, our data-splitting method excluded patients who were lost to follow-up, which biases the deceased group toward shorter survival times. This likely makes the predictors at different survival thresholds more similar to each other and overestimates the AUC values for the longer survival thresholds.

Overall, this study revealed that machine learning models have utility in predicting dementia patient mortality at various survival-time thresholds. Parsimonious models can be developed when limited clinical features are available, and dementia type-specific models can be used for distinguishing heterogeneous patient populations. If cross-validated and carefully implemented at the primary care level, such predictive models can improve personalized care of dementia.

Data sources

Our study utilized longitudinal data taken from the National Alzheimer’s Coordinating Center (NACC) database, following over 40,000 unique patients and spanning 39 past and present Alzheimer’s disease centers (ADCs) across the United States¹³. The NACC currently maintains a large relational database comprising numerous individual datasets and forms. When the ADC program was first established in 1984, ADCs primarily collected cross-sectional data as part of a Minimum Data Set (MDS), which contained limited demographic and clinical data from each patient’s most recent visit. However, after the NACC was established in 1999, ADCs began collecting more extensive neuropathological data via the 2001 NP form, and then in 2005, a comprehensive longitudinal dataset known as the Uniform Data Set (UDS) replaced the MDS and became standardized across all ADCs¹³,⁴⁸. In 2015, version 3 of the UDS was implemented, and it remains in use¹³.

The raw, unprocessed dataset used for our study contained data from June 2005 up to the September 2021 data freeze, comprising 163,792 patient visits and 1,061 variables. These variables constituted a combination of demographic, comorbidity, neurological examination, clinical diagnosis, neuropathological, and genetic data that are linked to the NACC’s Uniform Data Set (UDS). The variables used in this study and their corresponding descriptors are available in Supplementary Table 3.

Survival analysis

To gain preliminary insights into the relationship between dementia and patient survival/mortality, we first conducted a survival analysis using global clinical dementia rating (CDR) and dementia type as stratification variables, excluding dementia types with fewer than 100 patients. For our survival analysis, we built Kaplan-Meier estimator curves with the “survfit” function from the survival⁴⁹ R package. We used each unique patient’s first visit as the starting point for tracking patient survival, and we calculated days of survival since the first visit based on (1) the time of death if the patient’s death was recorded within the timespan of the dataset or (2) the expiration date of the dataset if the patient was still alive.
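
The Kaplan-Meier estimate described above can be illustrated with a minimal standard-library Python sketch (the study itself used the survival package in R). The function name `kaplan_meier` and the toy durations are hypothetical; an event flag of 0 stands for a patient still alive at the data freeze, i.e., a censored observation.

```python
from collections import Counter

def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct death time.

    durations: days from first visit to death or to the data freeze.
    events:    1 if death was observed, 0 if censored at the data freeze.
    """
    deaths = Counter(t for t, e in zip(durations, events) if e == 1)
    exits = Counter(durations)  # every patient leaves the risk set at their own time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving past t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve

# Toy data (hypothetical): five patients, two censored at the freeze date.
durations = [100, 200, 200, 300, 400]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(durations, events)
```

Censored patients contribute to the risk set until their last known time but never trigger a drop in the curve, which is why the estimate only changes at observed deaths.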

Data cleaning

All data cleaning was conducted in R v4.1.2 (R Foundation for Statistical Computing, Vienna, Austria). First, we preserved NACCID (subject ID number) and NACCADC (ADC at which the subject was seen) but removed all other form header information and text field variables. We then re-encoded the remaining features, which consisted primarily of categorical variables that were originally encoded as type numeric. Accordingly, we converted all “Not available” codes (-4 and -4.4), “Not assessed” or “Not applicable” codes (8, 88, 888, 888.8, and 8888), and “Unknown” codes (9, 99, 999, 999.9, and 9999) to NA, accounting for the specific special code(s) corresponding to each variable. Additionally, we re-encoded the extra categories in the Neuropsychological Battery Summary Score variables (95, 96, 97, 98, 995, 996, 997, and 998), which corresponded to an inability or refusal to complete the Mini-Mental State Exam (MMSE), to NA as well. Any variables with inconsistent coding (e.g., RACE with code 50 for “Other”) were manually re-encoded as appropriate. Finally, we converted all categorical variables from type numeric to type factor.
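
The per-variable recoding step can be sketched as follows. This is an illustration in Python rather than the R code used in the study, and the variable names and code sets in `SPECIAL_CODES` are hypothetical; the real mapping comes from the NACC data dictionary and differs by variable.

```python
# Hypothetical per-variable special codes (not taken from the NACC dictionary).
SPECIAL_CODES = {
    "MMSE_SCORE": {-4, 88, 95, 96, 97, 98},
    "SMOKING_YEARS": {-4, 88, 99},
}

def recode_missing(record, special_codes):
    """Return a copy of one visit record with special codes replaced by None."""
    cleaned = {}
    for name, value in record.items():
        codes = special_codes.get(name, set())
        cleaned[name] = None if value in codes else value
    return cleaned

# A value of 9 is a legitimate count for SMOKING_YEARS here, because that
# variable's "Unknown" code is 99 in this hypothetical mapping.
visit = {"MMSE_SCORE": 95, "SMOKING_YEARS": 9, "HEIGHT_IN": 65.0}
cleaned = recode_missing(visit, SPECIAL_CODES)
```

Keying the codes by variable, rather than replacing 8, 9, 88, or 99 everywhere, avoids erasing genuine measurements that happen to coincide with a special code used by another variable.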

Missing data imputation

Generally, readily available machine learning models are not compatible with missing data. Moreover, large amounts of missing data can affect model performance and generalizability across populations. Within the NACC dataset, which contains a large feature space and missing values scattered across variables, the removal of a row due to a single missing value can be especially detrimental and drastically reduce sample sizes. To avert potential bias introduced by manually selecting features, we opted to impute variables with missing values rather than only including patients with complete data.

The NACC Uniform Data Set has undergone several revisions since its inception in 2005, and the most recent version (version 3) was implemented in 2015. Consequently, certain variables that were collected in older versions of the UDS were no longer collected in UDS v3, and vice versa. Therefore, to minimize the number of features that did not contain sufficient non-missing values, we first omitted all variables with over 40% missing values. For the 189 remaining features, we imputed missing values using MICE (Multivariate Imputation by Chained Equations)⁵⁰. Multiple imputation is an imputation strategy that accounts for variability in missingness by generating multiple imputed datasets, which can then be aggregated into a single complete dataset⁵¹. Accordingly, multiple imputation generally outperforms traditional machine learning methods used for imputation⁵²,⁵³. MICE implements a form of multiple imputation that relies on predictive mean matching to predict the value of a given missing variable based on data points that most closely resemble the missing data point.
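
The two steps above, the 40% missingness filter and predictive mean matching, can be sketched in plain Python. This is a deliberately simplified illustration, not the mice implementation: it uses one predictor and a single pass, whereas MICE cycles over all incomplete variables and also draws the regression coefficients at random. The function names, the donor count `k`, and the toy data are assumptions.

```python
import random

def drop_sparse_columns(rows, max_missing=0.40):
    """Keep columns whose fraction of None values is at most max_missing."""
    keep = [c for c in rows[0]
            if sum(r[c] is None for r in rows) / len(rows) <= max_missing]
    return [{c: r[c] for c in keep} for r in rows]

def pmm_impute(x, y, k=3, seed=0):
    """Fill None entries of y by predictive mean matching on predictor x."""
    rng = random.Random(seed)
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    n = len(observed)
    mean_x = sum(xi for xi, _ in observed) / n
    mean_y = sum(yi for _, yi in observed) / n
    sxx = sum((xi - mean_x) ** 2 for xi, _ in observed)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in observed)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    def predict(xi):
        return intercept + slope * xi

    imputed = []
    for xi, yi in zip(x, y):
        if yi is not None:
            imputed.append(yi)
            continue
        # Donors: the k observed cases whose predicted value is closest.
        donors = sorted(observed,
                        key=lambda d: abs(predict(d[0]) - predict(xi)))[:k]
        # The imputed value is a donor's observed value, not the prediction.
        imputed.append(rng.choice(donors)[1])
    return imputed

rows = [{"a": 1, "b": None}, {"a": 2, "b": None}, {"a": None, "b": 3},
        {"a": 4, "b": None}, {"a": 5, "b": 6}]
filtered = drop_sparse_columns(rows)  # "b" is 60% missing and is dropped

x = [1, 2, 3, 4, 5, 6]
y = [10, None, 30, 40, None, 60]
completed = pmm_impute(x, y)
```

Because each imputed value is borrowed from a real donor, predictive mean matching never produces values outside the observed range of the variable.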

Data splitting

To evaluate patient survival status, we employed one year, three years, five years, and ten years as survival time thresholds. Accordingly, we determined each patient’s survival status based on survival threshold year length, clinic visit date, and patient’s time of death that was derived from variables NACCMOD (Month of Death) and NACCYOD (Year of Death), labeling each patient’s one-year, three-year, five-year, and ten-year survival as either 0 (survival) or 1 (deceased).

To assess the accuracy of our prediction models, we divided the whole cohort into a training/testing dataset and a separate validation dataset. In addition, to maximize the utilization of the dataset, our train/test datasets contained all patients who visited before June 2019 – [Survival Years] or died before June 2019, while our external validation datasets contained all patients who visited after June 2019 – [Survival Years] and had not died by June 2019. Thus, the model had no access to the new records from the validation set in the training phase. For all train/test datasets and validation datasets, we first excluded patients with an unknown time of death or who were lost to follow-up (NACCMOD or NACCYOD = “99/9999: Unknown”) and kept all non-deceased patient records (NACCMOD and NACCYOD = “88/8888: subjects not deceased”) and patient records with a specific time of death. Then, we labeled patients who died before [Visit Date] + [Survival Years] as 1 (deceased), while the others in the datasets were labeled as 0 (survival).
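
The labeling and date-based splitting rules can be sketched as below. This is a hedged illustration, not the authors' code: the function names are hypothetical, and because only the month and year of death are recorded, the sketch assumes the first day of the month for both the death date and the June 2019 cutoff.

```python
from datetime import date

CUTOFF = date(2019, 6, 1)  # assumed day of month; the text only says "June 2019"

def add_years(d, years):
    """Shift a date by whole years, mapping Feb 29 to Feb 28 when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)

def label_visit(visit_date, death_month, death_year, survival_years):
    """Return 1 (deceased), 0 (survival), or None (excluded, unknown death date)."""
    if death_month == 99 or death_year == 9999:
        return None
    if death_month == 88 or death_year == 8888:
        return 0
    death_date = date(death_year, death_month, 1)  # assumed first of the month
    return 1 if death_date < add_years(visit_date, survival_years) else 0

def assign_split(visit_date, death_month, death_year, survival_years):
    """Return 'train_test' or 'validation' following the date-based rule."""
    died_before_cutoff = (
        death_month not in (88, 99)
        and death_year not in (8888, 9999)
        and date(death_year, death_month, 1) < CUTOFF
    )
    visited_early = visit_date < add_years(CUTOFF, -survival_years)
    return "train_test" if visited_early or died_before_cutoff else "validation"
```

For the five-year threshold, for example, a living patient seen in May 2013 falls into the train/test data, while a living patient seen in May 2016 falls into the validation data because five years of follow-up were not complete by the cutoff.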

Subsequently, for each survival time threshold, we stratified each survival dataset by date into a pan-dementia dataset that we used for training and testing and a separate validation cohort that we used to externally evaluate model performance. For our pan-dementia analysis, we included all dementia patients (i.e., all patients who received an impaired not MCI, MCI, or dementia diagnosis). However, for our sub-dementia analysis, we stratified our datasets by dementia type, including non-dementia patients as a baseline for comparison.

Machine learning models

After experimenting with several machine learning algorithms (i.e., random forest, logistic regression, and eXtreme Gradient Boosting), our machine learning algorithm of choice was eXtreme Gradient Boosting (XGBoost), a high-performance, tree-based ensemble learning method that uses gradient tree boosting to sequentially add new trees to reduce the errors from previous trees⁵⁴. We built XGBoost models for each of the one-year, three-year, five-year, and ten-year datasets, with the goal of predicting dementia patient survival/mortality under varying survival thresholds. We built all of our machine learning models in Python v.3.7.12 using the xgboost and scikit-learn⁵⁵ libraries.
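
The idea of sequentially adding trees to reduce the errors of previous trees can be illustrated with a toy gradient-boosting loop. This sketch is not XGBoost: it uses depth-one trees (stumps), squared-error loss, and a single feature, and it omits the second-order gradient information and regularization that XGBoost adds. All names and data are hypothetical.

```python
def fit_stump(x, residuals):
    """Find the single threshold split on x that best fits the residuals."""
    best = None
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]

def boost(x, y, n_trees=20, learning_rate=0.3):
    """Sequentially add stumps, each fitted to the current residuals."""
    base = sum(y) / len(y)
    predictions = [base] * len(y)
    trees = []
    for _ in range(n_trees):
        # For squared error, the negative gradient is the residual.
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        threshold, left_value, right_value = fit_stump(x, residuals)
        trees.append((threshold, left_value, right_value))
        predictions = [
            pi + learning_rate * (left_value if xi <= threshold else right_value)
            for xi, pi in zip(x, predictions)
        ]
    return base, trees, predictions

def mse(y, predictions):
    """Mean squared error between targets and predictions."""
    return sum((yi - pi) ** 2 for yi, pi in zip(y, predictions)) / len(y)

# Toy data (hypothetical): one feature, binary outcome treated as numeric.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 1, 0, 1, 1, 1]
_, _, one_tree = boost(x, y, n_trees=1)
_, _, many_trees = boost(x, y, n_trees=20)
```

Comparing `mse(y, one_tree)` with `mse(y, many_trees)` shows the training error shrinking as more trees are added, which is also why parameters such as the number of trees and the learning rate need tuning to avoid overfitting.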

Feature selection

For our pan-dementia analyses, we aimed to build XGBoost models to predict one-year, three-year, five-year, and ten-year survival among all dementia patients. Our first set of machine learning models utilized only two features: age and standard global CDR. These preliminary models served as a baseline of comparison for the more complex models and provided insight into how much of the mortality prediction could be explained by age and standard global CDR alone.

Subsequently, we built a more complex set of machine learning models that employed the larger feature space. However, in order to make our machine learning models more clinically feasible, we conducted feature selection using SHapley Additive exPlanations (SHAP)⁵⁶, a unified, model-agnostic framework for interpreting the predictions of machine learning models. The SHAP algorithm is rooted in game theory, relying on the calculation of Shapley values to evaluate the relative contribution of each feature to a given prediction. Though SHAP is most often used as a feature importance metric, it has demonstrated considerable utility as a feature selection method as well, even outperforming many conventional feature selection methods³⁰.
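
To make the game-theoretic idea concrete, the sketch below computes exact Shapley values for a three-feature toy "model" by averaging each feature's marginal contribution over all orderings. The risk function and its numbers are invented for illustration only; practical SHAP implementations use far more efficient algorithms than this brute-force enumeration.

```python
from itertools import permutations

def shapley_values(features, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        coalition = frozenset()
        for f in order:
            with_f = coalition | {f}
            totals[f] += value(with_f) - value(coalition)
            coalition = with_f
    return {f: t / len(orderings) for f, t in totals.items()}

def toy_risk(known):
    """Hypothetical predicted risk given the set of features that are known."""
    risk = 0.10
    if "age" in known:
        risk += 0.20
    if "cdr" in known:
        risk += 0.30
    if "age" in known and "cdr" in known:
        risk += 0.10  # interaction term, split equally between the two features
    return risk

phi = shapley_values(["age", "cdr", "bmi"], toy_risk)
```

The attributions sum to the difference between the prediction with all features known and the baseline with none known, and a feature that never changes the prediction (here `bmi`) receives a value of zero.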

We used a variant of SHAP known as TreeSHAP⁵⁷, an enhancement to SHAP designed for tree ensemble methods, such as XGBoost. In our study, we trained default XGBoost classifiers with five-fold cross-validation on each of the four training sets and then took the union of the top five features from each model, ranked in order of decreasing mean absolute SHAP value.
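
The selection rule can be sketched as follows, assuming the SHAP values have already been computed as one row per sample and one column per feature. The function names and the toy matrices are hypothetical, and the example uses the top two features instead of the top five to keep the data small.

```python
def top_features(shap_rows, feature_names, k=5):
    """Rank features by mean absolute SHAP value and return the top k names."""
    n = len(shap_rows)
    mean_abs = [sum(abs(row[j]) for row in shap_rows) / n
                for j in range(len(feature_names))]
    ranked = sorted(zip(feature_names, mean_abs),
                    key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:k]]

def union_of_top_features(shap_by_model, feature_names, k=5):
    """Union of each model's top-k features, kept in first-seen order."""
    selected = []
    for shap_rows in shap_by_model.values():
        for name in top_features(shap_rows, feature_names, k):
            if name not in selected:
                selected.append(name)
    return selected

# Toy SHAP matrices (hypothetical) for two survival thresholds.
names = ["age", "cdr", "bmi"]
shap_by_model = {
    "one_year": [[0.5, -0.4, 0.1], [-0.3, 0.2, 0.0]],
    "five_year": [[0.1, 0.6, -0.3], [0.0, -0.4, 0.5]],
}
selected = union_of_top_features(shap_by_model, names, k=2)
```

Taking the union across four models with five features each yields at most twenty features; overlap between thresholds reduces that number, which is consistent with the nine features used in the multi-factorial models.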

Model training, testing, and validation

We trained our four XGBoost models on their respective training sets and tested their performance on their respective test sets, corresponding to their survival threshold. To account for any variability that may have been introduced by the random state of the train-test split, we conducted bootstrap resampling by generating fifty bootstrap samples, re-fitting the models on each bootstrap train set, and evaluating their performance on each bootstrap test set. All confidence intervals generated represent the 95% confidence intervals derived from bootstrap resampling. We also validated each model’s performance on its respective external validation set, which we set aside during data splitting.
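
A bootstrap confidence interval for AUC-ROC can be sketched as below. Two simplifications are assumed: the sketch resamples already-scored test cases instead of re-fitting a model on each bootstrap training set as the study did, and it uses a percentile interval, since the exact interval construction is not specified in the text. Function names are hypothetical.

```python
import math
import random

def auc_roc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

def bootstrap_auc(labels, scores, n_boot=50, seed=0):
    """AUC on n_boot resamples (with replacement) of the scored cases."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        sample_labels = [labels[i] for i in idx]
        if len(set(sample_labels)) < 2:
            continue  # AUC is undefined unless both classes are present
        aucs.append(auc_roc(sample_labels, [scores[i] for i in idx]))
    return aucs

def percentile_interval(values, level=0.95):
    """Percentile interval taken from the sorted bootstrap values."""
    ordered = sorted(values)
    tail = (1.0 - level) / 2.0
    lower = ordered[math.floor(tail * (len(ordered) - 1))]
    upper = ordered[math.ceil((1.0 - tail) * (len(ordered) - 1))]
    return lower, upper

# Toy scored cases (hypothetical).
labels = [0, 0, 1, 1, 0, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.5, 0.6]
aucs = bootstrap_auc(labels, scores)
low, high = percentile_interval(aucs)
```

With only fifty resamples the interval endpoints are coarse, so reported intervals of this kind are best read as approximate.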

Hyperparameter optimization

To optimize model performance, we used Bayesian optimization to identify the optimal hyperparameters for each XGBoost model, implemented with the BayesianOptimization²⁹ Python library. Bayesian optimization is a robust hyperparameter optimization algorithm that employs Bayes’ theorem and Gaussian processes to efficiently search the hyperparameter space. Given a black-box function, Bayesian optimization builds a probabilistic surrogate model of the objective function, which is then searched by an acquisition function that incrementally selects hyperparameters to optimize the surrogate model⁵⁸,⁵⁹.

For each model, we applied fifty rounds of Bayesian optimization with five-fold cross-validation, optimizing the following hyperparameters: ‘n_estimators’, ‘max_depth’, ‘colsample_bytree’, ‘min_child_weight’, ‘learning_rate’, ‘subsample’, and ‘gamma’. Additionally, to account for class imbalances, we set the ‘scale_pos_weight’ parameter of each model to the ratio between the number of samples in the negative class (survival) and the number of samples in the positive class (mortality). The full lists of tuned hyperparameters for our two-feature, multi-factorial, and dementia-type models are available in Supplementary Table 4.
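
The class-imbalance weight is a simple ratio, sketched below together with a helper that turns an optimizer's proposal into model parameters. The helper `to_model_params` and the rounding of ‘n_estimators’ and ‘max_depth’ to integers are assumptions for illustration (optimizers that search continuous ranges propose real-valued numbers); they are not taken from the study's code.

```python
INTEGER_PARAMS = {"n_estimators", "max_depth"}  # assumed to need integer values

def scale_pos_weight(labels):
    """Ratio of negative (survival, 0) to positive (mortality, 1) samples."""
    positives = sum(1 for l in labels if l == 1)
    negatives = sum(1 for l in labels if l == 0)
    if positives == 0:
        raise ValueError("no positive samples; the ratio is undefined")
    return negatives / positives

def to_model_params(proposal, labels):
    """Round integer-valued hyperparameters and attach the imbalance weight."""
    params = {name: int(round(value)) if name in INTEGER_PARAMS else value
              for name, value in proposal.items()}
    params["scale_pos_weight"] = scale_pos_weight(labels)
    return params

# Toy labels (hypothetical): 90 survivors and 10 deaths.
labels = [0] * 90 + [1] * 10
params = to_model_params({"max_depth": 5.6, "gamma": 0.2}, labels)
```

A weight of 9 in this toy case tells the model to treat each death as roughly nine survivors when computing the loss, which counteracts the tendency to predict the majority class.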

Sub-dementia analysis

In addition to predicting survival in all dementia patients, we conducted a sub-dementia analysis, analyzing discrepancies among dementia types. Since the majority of dementia-related studies are geared toward Alzheimer’s disease, highlighting the distinctions between dementia types may provide insight into the mechanisms of the various forms of neurodegeneration, thus guiding clinical practice.

For our sub-dementia analysis, we used only a five-year survival threshold, as the pan-dementia analysis demonstrated that five years provides a reasonable timeframe for capturing patient mortality without a drastic trade-off in predictive performance. To ensure that each of our dementia-type models received sufficient training data, we limited our analysis to eight dementia types, each of which contained at least 1,000 patients in the five-year dataset across training and testing (excluding validation).

Accordingly, we built XGBoost models for each sub-dementia dataset and applied the same Bayesian optimization methodology and train-test-validation framework as with our pan-dementia analysis. However, in order to conclusively note differences between dementia types, we included all 189 original features and allowed each model to designate the most important features corresponding to its respective dementia type.

Feature importance

For both our pan-dementia analysis and sub-dementia analysis, we used the aforementioned SHapley Additive exPlanations (SHAP)⁵⁶ to determine feature importance within our XGBoost models. To distinguish the most important features in each model, we created 50 bootstrap samples with randomized train-test configurations, fit the model on each training split, and then calculated SHAP values within each test split. We then aggregated the SHAP values across all bootstrap samples before ranking the features in order of decreasing mean absolute SHAP value, based on their relative contribution to the models.

Acknowledgments

We would like to acknowledge the National Alzheimer’s Coordinating Center and its participating patients and families who contributed data to the NACC database. The NACC database is funded by NIA/NIH Grant U24 AG072122. NACC data are contributed by the NIA-funded ADCs: P50 AG005131 (PI James Brewer, MD, PhD), P50 AG005133 (PI Oscar Lopez, MD), P50 AG005134 (PI Bradley Hyman, MD, PhD), P50 AG005136 (PI Thomas Grabowski, MD), P50 AG005138 (PI Mary Sano, PhD), P50 AG005142 (PI Helena Chui, MD), P50 AG005146 (PI Marilyn Albert, PhD), P50 AG005681 (PI John Morris, MD), P30 AG008017 (PI Jeffrey Kaye, MD), P30 AG008051 (PI Thomas Wisniewski, MD), P50 AG008702 (PI Scott Small, MD), P30 AG010124 (PI John Trojanowski, MD, PhD), P30 AG010129 (PI Charles DeCarli, MD), P30 AG010133 (PI Andrew Saykin, PsyD), P30 AG010161 (PI David Bennett, MD), P30 AG012300 (PI Roger Rosenberg, MD), P30 AG013846 (PI Neil Kowall, MD), P30 AG013854 (PI Robert Vassar, PhD), P50 AG016573 (PI Frank LaFerla, PhD), P50 AG016574 (PI Ronald Petersen, MD, PhD), P30 AG019610 (PI Eric Reiman, MD), P50 AG023501 (PI Bruce Miller, MD), P50 AG025688 (PI Allan Levey, MD, PhD), P30 AG028383 (PI Linda Van Eldik, PhD), P50 AG033514 (PI Sanjay Asthana, MD, FRCP), P30 AG035982 (PI Russell Swerdlow, MD), P50 AG047266 (PI Todd Golde, MD, PhD), P50 AG047270 (PI Stephen Strittmatter, MD, PhD), P50 AG047366 (PI Victor Henderson, MD, MS), P30 AG049638 (PI Suzanne Craft, PhD), P30 AG053760 (PI Henry Paulson, MD, PhD), P30 AG066546 (PI Sudha Seshadri, MD), P20 AG068024 (PI Erik Roberson, MD, PhD), P20 AG068053 (PI Marwan Sabbagh, MD), P20 AG068077 (PI Gary Rosenberg, MD), P20 AG068082 (PI Angela Jefferson, PhD), P30 AG072958 (PI Heather Whitson, MD), P30 AG072959 (PI James Leverenz, MD). This work was also supported by NIGMS R35GM138113 and ISMMS funds to KH.

COMPETING FINANCIAL INTERESTS

The authors declare no competing financial interests.

CONTRIBUTIONS

K.H. conceived the research, and J.Z., L.S., and K.H. designed the analyses. Z.M. and K.C. provided the NACC data and consulted on its meaningful use. J.Z. and L.S. conducted the computational analyses. K.C. provided statistical suggestions. K.H. supervised the overall study. J.Z. and L.S. wrote the manuscript, and K.H. edited the manuscript. All authors read, edited, and approved the manuscript.

DATA AND CODE AVAILABILITY

The data used in this study can be requested from the National Alzheimer’s Coordinating Center: https://nacc.redcap.rit.uw.edu/surveys/?s=KHNPKLJW8TKAD4DA. The code implemented in this study is available at: https://github.com/Huang-lab/dementia-survival-prediction.

Ahmad, F. B. & Anderson, R. N. The Leading Causes of Death in the US for 2020. JAMA 325, 1829–1830 (2021).
The US Burden of Disease Collaborators. The State of US Health, 1990-2016: Burden of Diseases, Injuries, and Risk Factors Among US States. JAMA 319, 1444–1472 (2018).
Xu, J., Zhang, Y., Qiu, C. & Cheng, F. Global and regional economic costs of dementia: a systematic review. The Lancet 390, S47 (2017).
Kumar, S. et al. Machine learning for modeling the progression of Alzheimer disease dementia using clinical data: a systematic literature review. JAMIA Open 4, ooab052 (2021).
Nichols, E. et al. Estimation of the global prevalence of dementia in 2019 and forecasted prevalence in 2050: an analysis for the Global Burden of Disease Study 2019. The Lancet Public Health 7, e105–e125 (2022).
Stokes, A. C. et al. Estimates of the Association of Dementia With US Mortality Levels Using Linked Survey and Mortality Records. JAMA Neurology 77, 1543–1550 (2020).
Brodaty, H. et al. The World of Dementia Beyond 2020. Journal of the American Geriatrics Society 59, 923–927 (2011).
Gauthier, S., Rosa-Neto, P., Morais, J. A. & Webster, C. World Alzheimer Report 2021: Journey through the diagnosis of dementia. Alzheimer’s Disease International: London, UK (2021).
Arvanitakis, Z., Shah, R. C. & Bennett, D. A. Diagnosis and Management of Dementia: Review. JAMA 322, 1589–1599 (2019).
Weller, J. & Budson, A. Current understanding of Alzheimer’s disease diagnosis and treatment. F1000Res 7, F1000 Faculty Rev-1161 (2018).
Folstein, M. F., Robins, L. N. & Helzer, J. E. The Mini-Mental State Examination. Archives of General Psychiatry 40, 812 (1983).
Rosen, W. G., Mohs, R. C. & Davis, K. L. A new rating scale for Alzheimer’s disease. The American Journal of Psychiatry 141, 1356–1364 (1984).
Besser, L. M. et al. The Revised National Alzheimer’s Coordinating Center’s Neuropathology Form—Available Data and New Analyses. Journal of Neuropathology & Experimental Neurology 77, 717–726 (2018).
Lin, M. et al. Big Data Analytical Approaches to the NACC Dataset: Aiding Preclinical Trial Enrichment. Alzheimer Dis Assoc Disord 32, 18–27 (2018).
Zhu, F. et al. Machine Learning for the Preliminary Diagnosis of Dementia. Scientific Programming 2020, e5629090 (2020).
Qiu, S. et al. Development and validation of an interpretable deep learning framework for Alzheimer’s disease classification. Brain 143, 1920–1933 (2020).
Joshi, P. S. et al. Temporal association of neuropsychological test performance using unsupervised learning reveals a distinct signature of Alzheimer’s disease status. Alzheimers Dement (N Y) 5, 964–973 (2019).
An, N., Ding, H., Yang, J., Au, R. & Ang, T. F. A. Deep ensemble learning for Alzheimer’s disease classification. Journal of Biomedical Informatics 105, 103411 (2020).
Gupta, A. & Kahali, B. Machine learning-based cognitive impairment classification with optimal combination of neuropsychological tests. Alzheimers Dement (N Y) 6, e12049 (2020).
Kim, J. P. et al. Machine learning based hierarchical classification of frontotemporal dementia and Alzheimer’s disease. Neuroimage Clin 23, 101811 (2019).
Sharma, R., Anand, H., Badr, Y. & Qiu, R. G. Time-to-event prediction using survival analysis methods for Alzheimer’s disease progression. Alzheimer’s & Dementia: Translational Research & Clinical Interventions 7, e12229 (2021).
Haaksma, M. L. et al. Survival time tool to guide care planning in people with dementia. Neurology 94, e538–e548 (2020).
Spooner, A. et al. A comparison of machine learning methods for survival analysis of high-dimensional clinical data for dementia prediction. Sci Rep 10, 20410 (2020).
Wang, L. et al. Development and Validation of a Deep Learning Algorithm for Mortality Prediction in Selecting Patients With Dementia for Earlier Palliative Care Interventions. JAMA Network Open 2, e196972 (2019).
Rose, S. Mortality Risk Score Prediction in an Elderly Population Using Machine Learning. American Journal of Epidemiology 177, 443–452 (2013).
Perna, L. et al. Incident depression and mortality among people with different types of dementia: results from a longitudinal cohort study. Soc Psychiatry Psychiatr Epidemiol 54, 793–801 (2019).
Williams, M. M., Xiong, C., Morris, J. C. & Galvin, J. E. Survival and mortality differences between dementia with Lewy bodies vs Alzheimer disease. Neurology 67, 1935–1941 (2006).
Geschwind, M. D. Rapidly Progressive Dementia. Continuum (Minneap Minn) 22, 510–537 (2016).
Nogueira, F. Bayesian Optimization: Open source constrained global optimization tool for Python. (2014).
Marcílio, W. E. & Eler, D. M. From explanations to feature selection: assessing SHAP values as feature selection mechanism. in 2020 33rd SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI) 340–347 (2020). doi:10.1109/SIBGRAPI51738.2020.00053.
Connors, M. H. et al. Predictors of Mortality in Dementia: The PRIME Study. Journal of Alzheimer’s Disease 52, 967–974 (2016).
Park, S., Lee, J.-Y., Suh, G.-H., Chang, S.-M. & Cho, M.-J. Mortality Rates and Risk Factors in Community Based Dementia Patients. Journal of Korean Geriatric Psychiatry 25–28 (2007).
Garre-Olmo, J. et al. Survival, effect measures, and impact numbers after dementia diagnosis: a matched cohort study. Clin Epidemiol 11, 525–542 (2019).
Mitchell, S. L., Miller, S. C., Teno, J. M., Davis, R. B. & Shaffer, M. L. The Advanced Dementia Prognostic Tool: A Risk Score to Estimate Survival in Nursing Home Residents with Advanced Dementia. Journal of Pain and Symptom Management 40, 639–651 (2010).
Todd, S., Barr, S., Roberts, M. & Passmore, A. P. Survival in dementia and predictors of mortality: a review. International Journal of Geriatric Psychiatry 28, 1109–1124 (2013).
Lee, K.-C. et al. Estimating the survival of elderly patients diagnosed with dementia in Taiwan: A longitudinal study. PLOS ONE 13, e0178997 (2018).
Piovezan, R. D. et al. Mortality Rates and Mortality Risk Factors in Older Adults with Dementia from Low- and Middle-Income Countries: The 10/66 Dementia Research Group Population-Based Cohort Study. J Alzheimers Dis 75, 581–593.
Golüke, N. M. S. et al. Risk factors for in-hospital mortality in patients with dementia. Maturitas 129, 57–61 (2019).
Qiu, C., Bäckman, L., Winblad, B., Agüero-Torres, H. & Fratiglioni, L. The Influence of Education on Clinically Diagnosed Dementia Incidence and Mortality Data From the Kungsholmen Project. Archives of Neurology 58, 2034–2039 (2001).
Alonso, A. et al. Cardiovascular risk factors and dementia mortality: 40 years of follow-up in the Seven Countries Study. Journal of the Neurological Sciences 280, 79–83 (2009).
Vazzana, R. et al. Trail Making Test Predicts Physical Impairment and Mortality in Older Persons. Journal of the American Geriatrics Society 58, 719–723 (2010).
Rosenberg, P. B. et al. The Association of Neuropsychiatric Symptoms in MCI with Incident Dementia and Alzheimer Disease. The American Journal of Geriatric Psychiatry 21, 685–695 (2013).
Chiu, M.-J., Chen, T.-F., Yip, P.-K., Hua, M.-S. & Tang, L.-Y. Behavioral and Psychologic Symptoms in Different Types of Dementia. Journal of the Formosan Medical Association 105, 556–562 (2006).
Ballard, C. et al. Anxiety, depression and psychosis in vascular dementia: prevalence and associations. Journal of Affective Disorders 59, 97–106 (2000).
Johns, E. K. et al. Executive functions in frontotemporal dementia and Lewy body dementia. Neuropsychology 23, 765–777 (2009).
Geldmacher, D. S. & Whitehouse, P. J. Differential diagnosis of Alzheimer’s disease. Neurology 48, 2S-9S (1997).
Mera-Gaona, M., Neumann, U., Vargas-Canas, R. & López, D. M. Evaluating the impact of multivariate imputation by MICE in feature selection. PLOS ONE 16, e0254720 (2021).
Beekly, D. L. et al. The National Alzheimer’s Coordinating Center (NACC) Database: The Uniform Data Set. Alzheimer Disease & Associated Disorders 21, 249–258 (2007).
Therneau, T. M. A Package for Survival Analysis in R. (2021).
Buuren, S. van & Groothuis-Oudshoorn, K. mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software 45, 1–67 (2011).
Sterne, J. A. C. et al. Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls. BMJ 338, b2393 (2009).
Eekhout, I. et al. Missing data in a multi-item instrument were best handled by multiple imputation at the item score level. Journal of Clinical Epidemiology 67, 335–342 (2014).
Coley, N. et al. How should we deal with missing data in clinical trials involving Alzheimer’s disease patients? Curr Alzheimer Res 8, 421–433 (2011).
Chen, T. & Guestrin, C. XGBoost: A Scalable Tree Boosting System. in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 785–794 (Association for Computing Machinery, 2016). doi:10.1145/2939672.2939785.
Buitinck, L. et al. API design for machine learning software: experiences from the scikit-learn project. in ECML PKDD Workshop: Languages for Data Mining and Machine Learning 108–122 (2013).
Lundberg, S. M. & Lee, S.-I. A Unified Approach to Interpreting Model Predictions. in Advances in Neural Information Processing Systems vol. 30 (Curran Associates, Inc., 2017).
Lundberg, S. M., Erion, G. G. & Lee, S.-I. Consistent Individualized Feature Attribution for Tree Ensembles. (2018).
Snoek, J., Larochelle, H. & Adams, R. P. Practical Bayesian Optimization of Machine Learning Algorithms. in Advances in Neural Information Processing Systems vol. 25 (Curran Associates, Inc., 2012).
Wilson, J., Hutter, F. & Deisenroth, M. Maximizing acquisition functions for Bayesian optimization. in Advances in Neural Information Processing Systems vol. 31 (Curran Associates, Inc., 2018).
