Machine learning-based prediction of cognitive trajectories among middle-aged and elderly Chinese adults

Abstract


Background
Cognitive trajectories are heterogeneous with increasing age. Early identification of these trends is essential for prevention in populations at high risk of cognitive decline. This study aimed to explore the heterogeneity and determinants of cognitive trajectories, and to construct prediction models for distinguishing cognitive trajectories among middle-aged and elderly Chinese people at the community level.

Methods
Data were retrospectively collected from the China Health and Retirement Longitudinal Study, using three consecutive survey waves (2011, 2013, 2015). The Mini-Mental State Examination (MMSE) scale was used to measure cognition. The heterogeneity of cognitive trajectories was identified with a growth mixture model, and the determinants of cognitive trajectories were analyzed by logistic regression. Machine learning (ML) algorithms, namely logistic regression (LR), support vector machine (SVM), and random forest (RF), combined with recursive feature elimination, were used to predict cognitive trajectories from epidemiological variables. The area under the receiver operating characteristic curve (AUROC) and the Brier score were used to assess discrimination and calibration, respectively.

Results
Three cognitive trajectories were identified: a "persistently-high" (PH, 83.3%) class with a high baseline cognitive score that remained high subsequently; a "medium-decrease" (MD, 6.6%) class with a medium baseline cognitive score followed by a rapid decline; and a "low-increase" (LI, 10.1%) class with the lowest baseline cognitive score followed by a rapid increase. For classification of PH vs. LI and MD vs. LI, the AUROCs of the three ML methods were nearly all above 0.90, and the Brier scores were nearly all below 0.10. For PH vs. MD, ML performed less well, with a maximum AUROC close to 0.68, although calibration remained good.
The initial MMSE score was the most significant predictor for distinguishing MD vs. LI and PH vs. LI, while age was the most important predictor for distinguishing MD vs. PH.

Conclusions
Predicting cognitive trajectories and identifying their key predictors are of great significance for understanding the heterogeneity of cognitive trajectories and for designing interventions against cognitive decline.

Background
Dementia is a degenerative disease of the central nervous system accompanied by cognitive decline.
According to the World Alzheimer's Report, more than 50 million people worldwide were estimated to be living with dementia in 2020, and with rapid population aging this number is projected to increase to 152 million by 2050 [1].
With frequent exposure to unhealthy lifestyles and environmental hazards, middle-aged people also face a risk of cognitive decline as chronic diseases increasingly occur at younger ages [2]. The etiology and pathogenesis of dementia remain unclear, and there is no effective cure in clinical practice [3]. About three quarters of dementia patients worldwide are not diagnosed in time, especially in low- and middle-income countries. The annual global cost of dementia has been estimated at $818 billion, equivalent to the output of the 18th largest economy [1]. Therefore, it is crucial to identify individuals at high risk of cognitive decline and evaluate their cognitive trajectories for the timely prevention of dementia.
So far, most prediction studies related to cognition have focused on clinical diagnosis [4], and their data were obtained from specific clinical cohorts. Specifically, these studies collected complex, costly, and invasive data, including neuroimaging, cerebrospinal fluid, and genetic data, which are not suitable for early screening in a more general population [5]. In addition, previous studies mainly considered cognition at a cross-sectional level [6], which cannot reveal cognitive trajectories over time. Predicting cognitive trajectories from longitudinal data may therefore shed more light on how cognition changes.
For population trajectory analysis, traditional models are not suitable because they assume that all samples are homogeneous. The growth mixture model (GMM), which combines the traditional growth model with latent class analysis, can identify latent trajectories while accounting for variation both between trajectories and among individuals within the same trajectory, and has been increasingly used in studies of cognitive trajectories [7]. For cognition prediction, machine learning techniques are frequently applied. Machine learning (ML), a set of computational methods that can discover complex nonlinear relationships between inputs and outputs, has been widely used in disease prediction [8]. In medical practice, ML can process complex medical data and construct prediction models for decision-making.
This study focused on two issues. Firstly, a growth mixture model was used to explore the cognitive trajectories of middle-aged and elderly people in China, and the determinants of these trajectories were also explored. Secondly, machine learning methods, namely logistic regression, support vector machine, and random forest, were used to predict the type of cognitive trajectory from epidemiological variables, and the importance of the predictors was assessed by random forest, so as to support specific intervention strategies for high-risk populations.

Data Source
This study retrospectively collected data from the China Health and Retirement Longitudinal Study (CHARLS, http://charls.pku.edu.cn/), using the consecutive surveys of 2011, 2013, and 2015. CHARLS has been widely recognized and used in academia, as described elsewhere [9]. The three waves were used to identify cognitive trajectory classes for middle-aged and elderly adults, and data from the first wave were used as predictors. Participants aged 45 and above who completed all three waves were included. Participants with missing values in MMSE items were excluded. Finally, a total of 4,962 participants were selected for model derivation and internal evaluation. The use of these data was approved by the biomedical ethics committee of Peking University, and all participants provided written informed consent.

Measurement of variables
Predictors in this study included sociodemographics, behavior and lifestyle, baseline cognitive and depression scores, and physical examination information. For sociodemographics, age, gender, marital status (not married vs. married), and education (illiterate vs. literate) were selected. For behavior and lifestyle, smoking, drinking, sleep time, and social activities were considered. Smoking was obtained through the question "Do you smoke now?", and drinking through the question "Have you ever drunk in the past year?". Both smoking and drinking were converted into dichotomous variables. Sleep time referred to the average sleep time in the past month, and social activity engagement referred to participation in any social activity in the past month. Baseline cognitive and depression scores were calculated with the Mini-Mental State Examination (MMSE) scale and the Center for Epidemiologic Studies Depression Scale-10 (CESD-10), respectively [10][11]. Physical examination variables included waist circumference and chronic diseases (hypertension, diabetes, psychiatric problems, and memory-related diseases). Waist circumference was measured in centimeters around the navel at the end of expiration while the subject was breathing calmly. Hypertension was defined as systolic blood pressure ≥ 140 mmHg and/or diastolic blood pressure ≥ 90 mmHg, or use of antihypertensive drugs.
Diabetes was defined as fasting blood glucose ≥ 126 mg/dL, HbA1c ≥ 6.5%, or current anti-diabetic treatment. Psychiatric problems and memory-related diseases were collected through self-reported doctors' diagnoses.
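As an illustration only, the hypertension and diabetes definitions above can be written as two small rule functions. The function and parameter names are hypothetical and are not taken from CHARLS or from the authors' code; only the thresholds come from the text.

```python
def has_hypertension(sbp_mmhg: float, dbp_mmhg: float,
                     on_antihypertensives: bool) -> bool:
    """Systolic >= 140 mmHg and/or diastolic >= 90 mmHg, or drug use."""
    return sbp_mmhg >= 140 or dbp_mmhg >= 90 or on_antihypertensives


def has_diabetes(fasting_glucose_mg_dl: float, hba1c_percent: float,
                 on_antidiabetic_treatment: bool) -> bool:
    """Fasting glucose >= 126 mg/dL, HbA1c >= 6.5%, or current treatment."""
    return (fasting_glucose_mg_dl >= 126
            or hba1c_percent >= 6.5
            or on_antidiabetic_treatment)


# Hypothetical participant: normal systolic but elevated diastolic pressure.
example_hypertension = has_hypertension(138, 92, False)
# Hypothetical participant: normal glucose and HbA1c, no treatment.
example_diabetes = has_diabetes(110, 5.6, False)
```

Either criterion alone is enough to set the flag, which matches the "and/or" and "or" wording of the definitions.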
For missing values, multiple imputation was performed before constructing the prediction models. We generated 5 imputed data sets and averaged the values of each index across them.

Feature selection
Recursive feature elimination (RFE) is an ML method for feature selection that is combined with a classifier to eliminate redundant variables, thus identifying the most important factors for each classifier [12]. To select the best combination of predictors, 10-fold cross-validation was combined with RFE; that is, RFE was performed on each subset of the input data, and the validation error of all subsets was calculated. Finally, the subset with the smallest error was selected as the optimal combination.
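The selection loop described above can be sketched in a few lines. This is a simplified outline, not the authors' pipeline: `cv_error` and `importance` are hypothetical callbacks standing in for a real cross-validated classifier and its feature ranking, and the toy versions below exist only to make the sketch runnable.

```python
from typing import Callable, Dict, Sequence, Tuple

Subset = Tuple[str, ...]


def rfe_with_cv(features: Sequence[str],
                cv_error: Callable[[Subset], float],
                importance: Callable[[Subset], Dict[str, float]]
                ) -> Tuple[Subset, float]:
    """Drop the weakest feature one at a time; keep the subset with the
    smallest cross-validated error seen along the way."""
    current: Subset = tuple(features)
    best_subset, best_error = current, cv_error(current)
    while len(current) > 1:
        scores = importance(current)
        weakest = min(current, key=lambda name: scores[name])
        current = tuple(name for name in current if name != weakest)
        error = cv_error(current)
        if error < best_error:
            best_subset, best_error = current, error
    return best_subset, best_error


# Toy stand-ins (assumptions): two informative features, two noise features.
SIGNAL = {"mmse", "age"}


def toy_cv_error(subset: Subset) -> float:
    missing = len(SIGNAL - set(subset))   # penalty for dropping signal
    noise = len(set(subset) - SIGNAL)     # small penalty for keeping noise
    return missing + 0.01 * noise


def toy_importance(subset: Subset) -> Dict[str, float]:
    ranks = {"mmse": 1.0, "age": 0.9, "sleep_time": 0.2, "waist": 0.1}
    return {name: ranks[name] for name in subset}


selected, error = rfe_with_cv(["age", "mmse", "sleep_time", "waist"],
                              toy_cv_error, toy_importance)
```

In practice the two callbacks would wrap a fitted classifier evaluated under 10-fold cross-validation; the sketch only shows the elimination and selection logic.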

Cognitive trajectories of middle-aged and elderly adults
Firstly, we performed a multiple linear regression to obtain the predicted cognitive score of each person, adjusting for age, gender, and education. Next, the adjusted Z scores were calculated with the following equation: $Z = (Y - \hat{Y}) / \mathrm{RMSE}$, where $Y$ is the original cognitive score in 2011, 2013, and 2015, respectively, $\hat{Y}$ is the predicted population mean cognitive score, and RMSE is the root mean square error of the regression equation [13]. In this study, the adjusted Z scores were used for trajectory analysis.
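A minimal sketch of the adjusted Z score follows, assuming the usual regression-based form in which the residual (observed minus predicted score) is divided by the regression RMSE. The predicted scores are taken as given here, and the `n_params` degrees-of-freedom correction is an assumption, since the text does not state how the RMSE denominator was defined.

```python
import math
from typing import List, Sequence


def rmse(observed: Sequence[float], predicted: Sequence[float],
         n_params: int = 0) -> float:
    """Root mean square error; n_params optionally reduces the denominator."""
    sse = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    return math.sqrt(sse / (len(observed) - n_params))


def adjusted_z(observed: Sequence[float], predicted: Sequence[float],
               rmse_value: float) -> List[float]:
    """Z = (Y - Y_hat) / RMSE for each participant."""
    return [(y - y_hat) / rmse_value for y, y_hat in zip(observed, predicted)]


# Hypothetical MMSE scores and demographic-adjusted predictions.
scores = [15.0, 12.0, 9.0, 14.0]
predictions = [13.0, 13.0, 11.0, 13.0]
z_scores = adjusted_z(scores, predictions, rmse(scores, predictions))
```

A positive Z means the participant scored above what age, gender, and education would predict; a negative Z means below.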
The growth mixture model (GMM) was used to explore the heterogeneity of cognitive trajectories; it divides a population into several groups based on differences in growth trajectories. Previous studies suggested that a latent growth curve model (LGCM) and a latent class growth model (LCGM) should be used to explore the shape of the growth curve and the number of potential trajectory classes before GMM analysis [14][15]. Once the optimal LCGM was selected, the GMM was fitted. For model selection, both statistical indices and interpretability are considered. The statistical indices include the sample-size adjusted Bayesian information criterion (SABIC), entropy, the Vuong-Lo-Mendell-Rubin likelihood ratio test (VLMR-LRT), the bootstrapped likelihood ratio test (BLRT), the proportion of the smallest class, and the average posterior probability (APP). SABIC is an information criterion; a larger reduction in SABIC indicates a greater improvement in model fit. Entropy is a measure of classification accuracy ranging from 0 to 1; the larger the entropy, the better the trajectory classification. VLMR-LRT and BLRT compare the k-1 class model with the k class model, and a significant p value indicates that the k class model is better than the k-1 class model. In addition, each trajectory class must contain enough samples, no less than 5% of the total population, and the APP is recommended to be greater than 0.7.
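To make these indices concrete, the sketch below computes SABIC, relative entropy, class proportions, and APP from a matrix of posterior class probabilities. It uses the commonly cited formulas (SABIC replaces n with (n + 2) / 24 in the BIC penalty; relative entropy is one minus the normalized classification uncertainty). Mplus reports these values directly, so this is illustrative rather than a description of the authors' workflow, and the posterior matrix is hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def sabic(log_likelihood: float, n_params: int, n_samples: int) -> float:
    """Sample-size adjusted BIC: -2 logL + p * ln((n + 2) / 24)."""
    return -2.0 * log_likelihood + n_params * math.log((n_samples + 2) / 24.0)


def relative_entropy(posteriors: Sequence[Sequence[float]]) -> float:
    """1 = perfectly separated classes, 0 = uninformative assignment."""
    n, k = len(posteriors), len(posteriors[0])
    uncertainty = -sum(p * math.log(p) for row in posteriors for p in row if p > 0)
    return 1.0 - uncertainty / (n * math.log(k))


def proportions_and_app(posteriors: Sequence[Sequence[float]]
                        ) -> Tuple[List[float], List[float]]:
    """Class proportions and average posterior probability (APP) per class,
    using modal (most likely class) assignment."""
    k = len(posteriors[0])
    counts, sums = [0] * k, [0.0] * k
    for row in posteriors:
        assigned = max(range(k), key=lambda j: row[j])
        counts[assigned] += 1
        sums[assigned] += row[assigned]
    proportions = [c / len(posteriors) for c in counts]
    app = [s / c if c else float("nan") for s, c in zip(sums, counts)]
    return proportions, app


# Hypothetical posterior probabilities for four people and two classes.
post = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
props, app = proportions_and_app(post)
acceptable = min(props) >= 0.05 and min(app) > 0.7
```

The final line applies the two rules of thumb from the text: every class holds at least 5% of the sample and every APP exceeds 0.7.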

Derivation and evaluation of cognitive trajectories prediction models
Logistic regression (LR) was used as the benchmark model for comparison with two other machine learning methods, namely support vector machine (SVM) and random forest (RF). LR, a probabilistic regression model that relates the log-odds of the outcome to a linear combination of predictors, was used to study the relationship between the outcome and a set of factors. SVM is one of the most common ML algorithms and uses the maximum-margin hyperplane as its decision boundary. It is especially suitable for linear classification with small samples because its output depends only on the support vectors, and it also performs well on nonlinear problems when kernel tricks are used [16]. RF, a popular ensemble learning method, combines multiple decision tree classifiers and integrates their predictions to determine its final output [17]. In addition, RF is often used to assess the importance of predictors [18].
A 10-fold cross-validation was used to evaluate the performance of the prediction models [19]. We used accuracy to evaluate the proportion of samples classified correctly; specifically, balanced accuracy was used for imbalanced data. The F1 score, which combines precision and recall, was also calculated [20]. The area under the receiver operating characteristic curve (AUROC) was used to evaluate the discrimination of the prediction models, and calibration was evaluated by the Brier score. For these indices, 95% confidence intervals (95% CI) were calculated. In addition, the variable importance of the predictors was plotted based on RF.
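For reference, the evaluation indices named above can be computed from first principles as follows. This is a plain-Python sketch for a binary task with labels 0 and 1, not the scikit-learn code used in the study; the labels and probabilities are made up, and AUROC is computed through its pairwise-ranking interpretation.

```python
from typing import Sequence, Tuple


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion(y_true: Sequence[int], y_pred: Sequence[int]
              ) -> Tuple[int, int, int, int]:
    """Return (TP, FP, FN, TN) counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    return tp, fp, fn, tn


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of sensitivity and specificity; robust to class imbalance."""
    tp, fp, fn, tn = confusion(y_true, y_pred)
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))


def f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall."""
    tp, fp, fn, _ = confusion(y_true, y_pred)
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical labels and predicted probabilities, thresholded at 0.5.
labels = [1, 1, 0, 0]
probs = [0.9, 0.4, 0.6, 0.1]
preds = [1 if p >= 0.5 else 0 for p in probs]
```

Lower Brier scores indicate better calibration, while higher AUROC indicates better discrimination, so the two indices answer different questions about the same model.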

Statistical analysis
Continuous variables were presented as mean ± standard deviation. Categorical variables were presented as percentages. Baseline characteristics were compared among trajectories using ANOVA or the chi-square test, as appropriate. All the above analyses were conducted with SPSS 25.0. Trajectory class analyses were performed with Mplus 8.3 (Muthén and Muthén, 2019). Feature selection, model derivation, and model evaluation were performed with the scikit-learn package in Python 3.7.6. A two-sided p-value of < 0.05 was considered statistically significant.

Results
Heterogeneous trajectories of cognition
A linear mean trajectory with a negative slope was found for the cognition of middle-aged and elderly adults (intercept = 3.37, p < 0.001; slope = -0.07, p < 0.001). The results of the 2- to 5-class LCGM models are presented in Table 1. A notable reduction in SABIC was observed for the 2-class and 3-class models. Although the VLMR-LRT and BLRT results favored models with more classes (p < 0.05), the 3-class model had a smallest class proportion > 5.0%, an average posterior probability > 0.70, and an entropy of 0.64, indicating that the 3-class solution was more favorable. GMM was then used to estimate the final trajectories, and the 3-class GMM was adopted as the final model (SABIC = 39381.36, entropy = 0.74, VLMR-LRT p < 0.05, BLRT p < 0.001). The cognitive trajectories are plotted in Fig. 1. The first class (83.3%) had a high baseline cognitive score that remained at a high level subsequently (intercept = 0.19, p < 0.001; slope = 0.018, p > 0.05), and was named "persistently-high" (PH). The second class (6.6%) had a medium baseline cognitive score that decreased rapidly thereafter (intercept = -0.18, p > 0.05; slope = -0.77, p < 0.001), and was named "medium-decrease" (MD). The third class (10.1%) had a relatively lower baseline cognitive score but showed a rapid increase subsequently (intercept = -1.453, p < 0.001; slope = 0.353, p < 0.001), and was named "low-increase" (LI).
NOTE. Values are presented as mean ± standard deviation, number (%), or median (interquartile range). * represents p < 0.05; ** represents p < 0.01; *** represents p < 0.001.
Table 2 shows the baseline characteristics of the whole population and of each trajectory class. This study identified 4370 participants in the "persistently-high" class, 234 participants in the "medium-decrease" class, and 358 participants in the "low-increase" class. The mean age of the whole population was 57.50 years, and 54.4% were male. Of the whole population, 9.5% were illiterate, 92.2% were married, 43.0% were smokers, and 37.5% had a history of drinking within the past year. The baseline MMSE and CESD-10 scores were 13.31 and 6.97, respectively. In comparisons of the three cognitive trajectories, significant differences were found in age, gender, education, social activity engagement, smoking, drinking, MMSE score, CESD-10 score, waist circumference, and memory-related disease (all p < 0.05).

Determinants of cognitive trajectory
The odds ratios (OR) and their 95% CIs from binary logistic regression are presented in Table 3.
Compared with the "persistently-high" class, individuals who had low baseline MMSE scores or memory-related disorders, or who were younger, more highly educated, or stroke-free, were more likely to be classified into the low-increase class, while those with a low baseline MMSE score, a high CESD-10 score, memory-related disorders, a smoking habit, a low waist circumference, or illiteracy were more likely to follow a medium-decrease trajectory. Similarly, with LR as the benchmark model, we noted that logistic regression was superior to the other ML methods (p < 0.05) for recognizing cognitive trajectories, except for MD vs. LI, where LR was equivalent to RF (p > 0.05).
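As a reminder of how an odds ratio and its 95% CI relate to a fitted logistic regression coefficient, the sketch below exponentiates the coefficient and its Wald limits. The Wald-type interval is an assumption, since the text does not state how the intervals were obtained, and the coefficient and standard error are made-up numbers rather than values from Table 3.

```python
import math
from typing import Tuple


def odds_ratio_ci(beta: float, se: float, z: float = 1.96
                  ) -> Tuple[float, float, float]:
    """Return (OR, lower, upper) from a log-odds coefficient and its SE."""
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))


# Hypothetical coefficient for a binary predictor.
or_value, lower, upper = odds_ratio_ci(beta=math.log(2.0), se=0.2)
significant = lower > 1.0 or upper < 1.0  # the CI excludes OR = 1
```

An interval that excludes 1 corresponds to a coefficient that differs from zero at the two-sided 0.05 level under the same Wald assumption.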

Feature importance
The importance of the predictors obtained from RF was plotted for all variables (Fig. 3; Supplementary Fig. 1).

Discussion
In the present study, we described the cognitive trajectories over a 5-year follow-up in a nationally representative sample of community-dwelling middle-aged and older people in China and explored their potential determinants. There were three distinct trajectories of cognition, namely persistently-high, medium-decrease, and low-increase. As in previous studies [21][22], the persistently-high class accounted for most of the population, indicating that the majority of participants maintained stable cognition as they aged. However, we also found an uncommon low-increase trajectory, characterized by a low baseline cognitive score and gradual improvement thereafter.
Compared with the "persistently-high" class, individuals who had low baseline MMSE scores or memory-related disorders, or who were younger, highly educated, or stroke-free, were more likely to be classified into the low-increase class, while those with a low baseline MMSE score, a high CESD-10 score, memory-related disorders, a smoking habit, a low waist circumference, or illiteracy were more likely to follow a medium-decrease trajectory [23]. Memory-related disorders can result in a low MMSE score, and the baseline MMSE score determined the cognitive trajectory to a large extent. Why, then, were people who were younger, highly educated, or stroke-free more likely than the "persistently-high" class to follow a low-increase trajectory? This seems counterintuitive, and we offer the following possible explanations. Younger people may have had an acute disease that lowered their cognitive function at baseline, and their cognitive function may have recovered to some extent after rehabilitation. People with higher education generally have more knowledge and more ways to reinforce cognitive recovery training. Stroke is associated with an increased risk of dementia [24] and impairs patients' cognitive function; the cognitive damage caused by dementia is difficult to reverse, so stroke-free people have more potential for cognitive resilience. Based on these findings, we put forward some proposals to improve cognition, delay cognitive decline, and promote healthy aging, such as setting up universities for the elderly to promote education and encouraging the elderly to increase exercise and social interaction [25]. In addition, a balanced diet and nutrition, not smoking, good sleep quality, and reasonable weight control are also important for preventing cognitive decline.
We found that ML could distinguish cognitive trajectories well, especially MD vs. LI and PH vs. LI. For MD vs. PH, ML performed less well, which implies that the existing predictors may not be sufficient to separate these two classes. We also found that traditional logistic regression was superior to the other ML methods in most comparisons in the present study; similar results were reported in previous suicide studies [26].
The importance of the variables varied among subgroups, but five factors from feature selection were important for predicting the cognitive trajectory: age, initial MMSE score [27], CESD-10 score, waist circumference, and sleep time. This suggests that these factors might be important for the prevention of cognitive decline. Previous studies have shown that cognitive function deteriorates naturally with age [28]. Depressive symptoms are associated with cognitive decline [29], so the CESD-10 score can be viewed as a predictor. High waist circumference suggests abdominal obesity. Urban residents nowadays often use transportation instead of walking and therefore exercise insufficiently; in addition, alcohol drinking and an unhealthy diet are not only associated with abdominal obesity but also affect cognition [30][31]. Previous studies have shown that sleep quality is negatively correlated with cognitive impairment [32][33].
This study has the following advantages. Firstly, the population was selected from a large, representative community-based cohort study, and the predictive data are easy to obtain, low in cost, and pose almost no risk to participants. Secondly, the growth mixture model takes into account not only population heterogeneity but also the differences among individuals within the population, which helps to dynamically understand the changing trend of cognition in middle-aged and elderly people. Finally, machine learning methods were used to predict the future cognitive trend of the population, which is helpful for the early identification of high-risk groups as well as timely personalized intervention.

Limitations
However, the shortcomings of the present study should be noted. Firstly, the sample size was relatively small because of limited data availability. Secondly, we used three waves of data (2011, 2013, and 2015) for cognitive trajectory analysis. Given this relatively short period, we might not be able to recognize fine cognitive changes or the long-term trends of cognitive development. Thirdly, the machine learning methods for predicting the cognitive trajectory were assessed only by internal validation, and external validation in a wider population is needed in future studies. Fourthly, we assumed that the predictors remained unchanged, but in fact some variables might change with time. In the future, more advanced machine learning methods, such as the long short-term memory (LSTM) model, which can analyze time series data [34], could be used for cognitive trajectory prediction.

Conclusion
This study showed that three kinds of cognitive trajectories ("persistently-high", "medium-decrease", "low-increase") exist among middle-aged and elderly Chinese people. Based on easily available and low-cost epidemiological data, especially MMSE score and age, machine learning could effectively distinguish cognitive trajectories.

Declarations
Ethical Approval and Consent to participate
The use of these data was approved by the biomedical ethics committee of Peking University, and all participants provided written informed consent.

Consent for publication
Not applicable.

Availability of data and materials
The data used in this study are accessible at http://charls.pku.edu.cn/.

Competing interests
None.
Funding

Figure 2 Importance of all variables. Values on the x-axis represent the importance of each predictor (%). A, low-increase vs. medium-decrease; B, low-increase vs. persistently-high; C, medium-decrease vs. persistently-high.
