Development of a Sex-Specific Risk Scoring System for Predicting Cognitive Normal to Mild Cognitive Impairment (SRSS-CNMCI)

Objective: We aim to develop a sex-specific risk scoring system for predicting conversion from cognitive normal (CN) to mild cognitive impairment (MCI), abbreviated SRSS-CNMCI, to provide a reliable tool for the prevention of MCI. Methods: Participants aged 61-90 years with a baseline diagnosis of CN and an endpoint diagnosis of MCI were screened from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database with at least one follow-up. Multivariable Cox proportional hazards models were used to identify risk factors associated with conversion from CN to MCI and to build risk scoring systems for male and female groups. Receiver operating characteristic (ROC) curve analysis was applied to determine the risk probability cutoff point corresponding to the optimal prediction effect. We ran an external validation of the discrimination and calibration based on the Harvard Aging Brain Study (HABS) database. Results: A total of 471 participants, including 240 women (51%) and 231 men (49%), aged 61 to 90 years, were included in the study cohort for subsequent primary analysis. The final multivariable models and the risk scoring systems for females and males included age, APOE ε4, Mini-Mental State Examination (MMSE) and Clinical Dementia Rating (CDR). The scoring systems for females and males revealed C statistics of 0.902 (95% CI 0.840-0.963) and 0.911 (95% CI 0.863-0.959), respectively, as measures of discrimination. The cutoff between high and low risk was 33% for females and 9% for males; predicted risks above these values were considered high risk. The external validation effect of the scoring systems was good: C statistic 0.950 for the females and C statistic 0.965 for the males. Conclusions: Our parsimonious model accurately predicts conversion from CN to MCI with four risk factors and can be used as a predictive tool for the prevention of MCI.


Introduction
Alzheimer's disease (AD) is a neurodegenerative disease that worsens over time [1]. There are three stages in the progression of AD: preclinical Alzheimer's disease, mild cognitive impairment (MCI) due to Alzheimer's disease and dementia due to Alzheimer's disease [2][3][4].
According to the latest 2020 Alzheimer's disease report, it is expected that by 2050, 152 million people aged 65 and over will have Alzheimer's disease worldwide [1]. The total annual payments for health care and long-term care for those with AD are expected to increase from $305 billion in 2020 to more than $1.1 trillion in 2050 [1], causing an enormous financial burden to patients' families and society. Therefore, it is expected that if AD can be identified and predicted before the appearance of clinical symptoms or mild cognitive impairment, early prevention or treatment can be performed to reduce the incidence of AD.
At present, many researchers have developed predictive models for AD conversion to identify high-risk populations.
Studies have shown that 15% of MCI patients over 65 years of age developed AD after 2 years of follow-up [5], 32% developed AD during 5 years of follow-up [6], and 38% developed AD after 5 years or more of follow-up [7]. Therefore, MCI is a very dangerous stage. Once MCI develops, there is a great risk of continuing to develop AD, and life expectancy is reduced. We believe that more attention should be paid to the status prior to progression to MCI. If we can predict the risk before progression to MCI, monitoring can be performed earlier, and measures can be implemented to delay disease development or even cure and return to a normal state.
Steenland et al. [8] developed a 'Framingham-like' prediction model for predicting progression from unimpaired cognition to amnestic mild cognitive impairment (aMCI) using a number of dichotomous risk factors, including memory summary score, hippocampus and Tau/Aβ ratio, and the C statistic of this model was 0.80. Due to the limited sample size, the data were not divided into a training set and a test set; that is, there was neither internal validation nor external validation. In this study [8], the classification of risk factors into four or two groups by quartile or ROC analysis was completely data-driven, and the risk factor grouping may not have clinical significance. Barnes et al. [9] used a Cox proportional hazards model to determine the risk factors affecting AD progression and established a point score ranging from 0 to 9 based on the predictors in the final model, which was only internally validated using bootstrapping techniques and, similar to the previous study, lacked external verification.
Sex is recognized as one of the important inherent attributes of people that affects the process of AD [10,11]. Differences between men and women in physiological characteristics, social status, living habits and other factors lead to differences in MCI risk. We expect that the accuracy of the predictive model will be improved by modeling male and female groups separately. Therefore, we aimed to develop a sex-specific risk scoring system for predicting cognitive normal to mild cognitive impairment (SRSS-CNMCI) to provide a reliable tool for the prevention of MCI. We plan to perform external validation in a new heterogeneous database to enhance the reliability of the model prediction and to make up for the shortcomings of previous studies.

Data source and participants
In this study, we used participant data from two independent cohorts: the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (http://adni.loni.usc.edu/) for modeling and the Harvard Aging Brain Study (HABS) database (https://habs.mgh.harvard.edu/) for external validation. For up-to-date information regarding these specific protocols in ADNI, please see www.adni-info.org.
Participants from the ADNI were included in this study if they 1) were diagnosed as cognitively normal at baseline, 2) had a last follow-up in any of the three rounds of ADNI data collection, which was regarded as their end point, and 3) were 61-90 years old.
Exclusion criteria included those who 1) only had baseline data, 2) were diagnosed with AD (converted to AD), and 3) reverted back to cognitive normal.
We screened participants in the HABS database for external validation using the same inclusion and exclusion criteria as for ADNI participants.

Informed consent
Each subject gave written informed consent for imaging and neuropsychological testing in accordance with the Human Subjects Research Committee Guidelines. Please see www.adni-info.org for further details.
As a longitudinal cohort study, the ADNI not only provided information on the progression of MCI but also contained information on the progression duration. Therefore, we developed sex-specific risk scoring systems based on a multivariable Cox proportional hazards model. Age was forced to be included in multivariable models because previous studies have shown that age is the greatest risk factor [1].
The modeling method was as follows.
Step 1: Cox proportional hazards modeling yielded a regression coefficient for each risk factor; the mean value of each continuous risk factor and the proportion in each category of each categorical risk factor were also calculated.
Step 2: Each risk factor was grouped according to its clinical significance or usage habits, and the median value of the group was selected as the reference value of the group.
Step 3: For each risk factor, the reference value of one of the most common groups was selected as the basic risk reference value of this factor.
Step 4: According to the regression coefficient and reference value of each group of risk factors, the distance between the reference value of each group and the basic risk reference value of the factor was calculated.
Step 5: The value of the constant representing 1 point was set up in the risk scoring system.
Step 6: The score corresponding to each group of risk factors was calculated according to the distance and constant.
Step 7: The range of the total score was obtained from the combinations of the different groups of risk factors, and the risk probability corresponding to each total score in this range was calculated according to a variation of the Cox regression equation. The variation of the Cox proportional hazards model has the form $\hat{p} = 1 - S_0(t)^{\exp\left(\sum_i \beta_i W_i - \sum_i \beta_i M_i\right)}$. Here, β, W and M represent the regression coefficient, the reference value, and the mean or proportion of the risk factors, respectively. $S_0(t)$ is the average survival rate of participants in the ADNI cohort at t years, estimated by Kaplan-Meier analysis.
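Steps 1 to 7 can be illustrated with a minimal sketch. It assumes the standard Framingham-style points construction, in which 1 point equals the log-hazard increase of 5 years of age; this choice of constant, the two risk factors shown, and every coefficient, mean, reference value and baseline survival below are hypothetical placeholders and are not the fitted SRSS-CNMCI values.

```python
import math

# All numbers are hypothetical placeholders, NOT the fitted SRSS-CNMCI values.
BETA = {"age": 0.05, "apoe4": 0.60}   # Step 1: Cox regression coefficients (assumed)
MEAN = {"age": 74.0, "apoe4": 0.30}   # Step 1: cohort mean or proportion M (assumed)

# Step 2: reference value W for each group of each risk factor (assumed groups).
GROUPS = {
    "age": {"61-70": 65.5, "71-80": 75.5, "81-90": 85.5},
    "apoe4": {"non-carrier": 0.0, "carrier": 1.0},
}
# Step 3: base group whose reference value is the basic risk reference value.
BASE = {"age": "61-70", "apoe4": "non-carrier"}

S0_T = 0.85                # baseline survival S0(t) at t years (assumed)
B = BETA["age"] * 5        # Step 5: log-hazard worth of 1 point (assumed choice)


def points_table():
    """Steps 4 and 6: distance from the base group in log-hazard units, divided by B."""
    table = {}
    for factor, groups in GROUPS.items():
        w_ref = groups[BASE[factor]]
        table[factor] = {
            name: round(BETA[factor] * (w - w_ref) / B) for name, w in groups.items()
        }
    return table


def risk_from_points(total_points):
    """Step 7: p = 1 - S0(t) ** exp(sum(beta*W_ref) + B*points - sum(beta*M))."""
    base_lp = sum(BETA[f] * GROUPS[f][BASE[f]] for f in GROUPS)
    mean_lp = sum(BETA[f] * MEAN[f] for f in GROUPS)
    return 1.0 - S0_T ** math.exp(base_lp + B * total_points - mean_lp)


if __name__ == "__main__":
    print(points_table())
    for pts in range(0, 7):
        print(pts, round(risk_from_points(pts), 3))
```

With these placeholder inputs, a 71-80-year-old APOE ε4 carrier would receive 2 + 2 = 4 points, and the risk for that total is read from `risk_from_points(4)`. Rounding each group's score to an integer is what makes the system usable by hand, at the cost of a small approximation relative to the underlying Cox model.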
Receiver operating characteristic (ROC) curve analysis was applied to determine the risk probability cutoff point corresponding to the optimal prediction effect [26] , and the risk probability exceeding this cutoff point was considered high risk. The risk probability corresponding to the maximum of Youden's index was selected as the boundary value of high and low risk.
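The cutoff selection by Youden's index can be sketched as follows. This is an illustrative pure-Python version, not the code used in the study; the function name `youden_cutoff` and the example data are hypothetical, and a participant is treated as high risk when the predicted risk is strictly above the cutoff.

```python
def youden_cutoff(predicted_risk, converted):
    """Return (cutoff, J) maximising Youden's J = sensitivity + specificity - 1.

    predicted_risk: predicted risk probabilities, one per participant.
    converted: 1 if the participant converted to MCI, else 0.
    A participant is classified as high risk when predicted risk > cutoff.
    """
    positives = sum(converted)
    negatives = len(converted) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one converter and one non-converter")

    best_cutoff, best_j = None, float("-inf")
    for cutoff in sorted(set(predicted_risk)):
        # True positives: converters classified as high risk.
        tp = sum(1 for r, y in zip(predicted_risk, converted) if y and r > cutoff)
        # True negatives: non-converters classified as low risk.
        tn = sum(1 for r, y in zip(predicted_risk, converted) if not y and r <= cutoff)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


if __name__ == "__main__":
    # Hypothetical predicted risks and outcomes, for illustration only.
    risks = [0.05, 0.10, 0.20, 0.40, 0.60, 0.80]
    outcome = [0, 0, 0, 1, 1, 1]
    print(youden_cutoff(risks, outcome))
```

Only observed risk values are tried as candidate cutoffs, which is sufficient because sensitivity and specificity change only at those values.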
Model discrimination was calculated using the C statistic, analogous to the area under the receiver operating characteristic curve (AUC) [27], which represents an estimate of the probability that a model assigns a higher risk to those who convert to MCI than to those who do not. We estimated model calibration using the Hosmer-Lemeshow χ² statistic to compare the differences between predicted and actual event rates.
We ran an external validation of the discrimination and calibration based on a new cohort of individuals that was drawn from the HABS database according to the same inclusion and exclusion criteria. All analyses were performed using Microsoft Excel 2016, SPSS Statistics 22.0 and Python 3.7.4.
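Both measures can be sketched in a few lines. The version below is a simplified illustration that treats conversion as a binary outcome and ignores censoring, so the C statistic reduces to the AUC; a survival-specific concordance index would handle follow-up time differently. The function names and the default of 10 risk groups are assumptions, not details taken from the study.

```python
def c_statistic(predicted_risk, converted):
    """Share of (converter, non-converter) pairs in which the converter has the
    higher predicted risk; tied pairs count as one half."""
    pos = [r for r, y in zip(predicted_risk, converted) if y]
    neg = [r for r, y in zip(predicted_risk, converted) if not y]
    if not pos or not neg:
        raise ValueError("need at least one converter and one non-converter")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


def hosmer_lemeshow_chi2(predicted_risk, converted, groups=10):
    """Hosmer-Lemeshow statistic: sum over risk-ordered groups of
    (observed - expected)^2 / (n * mean_risk * (1 - mean_risk))."""
    pairs = sorted(zip(predicted_risk, converted))
    n = len(pairs)
    chi2 = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        expected = sum(r for r, _ in chunk)   # expected number of conversions
        observed = sum(y for _, y in chunk)   # observed number of conversions
        mean_risk = expected / len(chunk)
        variance = len(chunk) * mean_risk * (1.0 - mean_risk)
        if variance > 0:
            chi2 += (observed - expected) ** 2 / variance
    return chi2


if __name__ == "__main__":
    # Hypothetical data, for illustration only.
    risks = [0.1, 0.4, 0.35, 0.8]
    outcome = [0, 0, 1, 1]
    print(c_statistic(risks, outcome))
    print(hosmer_lemeshow_chi2(risks, outcome, groups=2))
```

A C statistic near 1 indicates that converters are almost always ranked above non-converters, while a small Hosmer-Lemeshow statistic indicates that predicted and observed event counts agree within each risk group.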

Flow of screening participants
In the ADNI dataset, 1869 participants were assessed for eligibility. According to the inclusion and exclusion criteria for the participants of the ADNI in this study, 510 participants were finally selected and were divided into two groups: male (n = 249) and female (n = 261). According to the requirements of data preprocessing, 18 males and 20 females with missing values were excluded, as well as 1 subject with abnormal CDR values. A total of 471 participants, including 240 women (51%) and 231 men (49%), aged 61 to 90 years, were eventually included in the study cohort for subsequent primary analysis (see Figure 1 for details).

Characteristics of included participants
Female participants with normal cognition at baseline (n = 240) were followed for a median of 3 years (interquartile range: 2.0-5.0, maximum: 12, mean: 3.87, std: 2.68); 37 (15%) converted to MCI during follow-up. Male participants with normal cognition at baseline (n = 231) were followed for a median of 4 years (interquartile range: 2.0-5.0, maximum: 12, mean: 3.77, std: 2.67); 52 (23%) converted to MCI during follow-up. eTable 1 (see Supplementary Material) provides a description of the baseline demographic and clinical characteristics of the participants in our study by gender in the ADNI. Age, education in years, APOE ε4 carrying status, family history of dementia, and MMSE score were imbalanced between the male and female groups (all SD>0.1). There were no significant differences in race, systolic blood pressure, diastolic blood pressure or CDR score between the male and female groups (SD<0.1).
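The balance check above can be sketched under the assumption that "SD" denotes the standardized difference between the two groups, with 0.1 as the conventional imbalance threshold; the text does not define the abbreviation, so this reading and the function names below are assumptions.

```python
import math
from statistics import mean, variance


def std_diff_continuous(a, b):
    """Standardized difference for a continuous variable:
    |mean_a - mean_b| / sqrt((var_a + var_b) / 2)."""
    pooled = math.sqrt((variance(a) + variance(b)) / 2.0)
    return abs(mean(a) - mean(b)) / pooled


def std_diff_binary(p1, p2):
    """Standardized difference for a binary variable with proportions p1 and p2."""
    pooled = math.sqrt((p1 * (1.0 - p1) + p2 * (1.0 - p2)) / 2.0)
    return abs(p1 - p2) / pooled


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(std_diff_continuous([1, 2, 3], [2, 3, 4]))
    print(std_diff_binary(0.30, 0.40))
```

Unlike a P-value, the standardized difference does not depend on sample size, which is why it is commonly used to compare baseline characteristics between groups or cohorts.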

Univariate Cox regression analysis by gender
In the male and female subsets, univariate Cox regression analysis was conducted for the screened risk factors (eTable 2 in the Supplementary Material). Both MMSE and CDR scores were strongly associated with MCI conversion (P<0.001 for MMSE and CDR in males, P<0.001 for CDR and P=0.003 for MMSE in females). Retirement in years (P=0.04) and the proportion of APOE ε4 carriers (P=0.019) were associated with the risk of MCI conversion in females, while age (P=0.019) was associated with MCI conversion in males. The P-values of the other risk factors were all greater than 0.05, indicating no statistically significant association with MCI conversion.

Multivariable Cox proportional hazards regression
The final multivariable models (eTable 3a and 3b in the Supplementary Material) and the risk scoring systems (Table 1a and 1b) for females and males included age, APOE ε4, MMSE and CDR. When age, APOE ε4, MMSE and CDR were incorporated into the models, both the male and female models exhibited good significance overall (P<0.001).

SRSS-CNMCI development
We developed two SRSS-CNMCIs for female and male participants based on the ADNI dataset (Table 1a) and calculated the specific absolute risk of conversion from CN to MCI within a 12-year period corresponding to the total score (Table 1b). Both systems had a total score range of 0 to 23, but the combination of risk scores was different. The maximum risk predicted by the scoring system for females was 65%, while for males, it was only 48%. The final female and male Cox regression models revealed C statistics of 0.878 (95% CI 0.813-0.943) and 0.830 (95% CI 0.757-0.904), respectively, as measures of discrimination, and the scoring systems for females and males had C statistics of 0.902 (95% CI 0.840-0.963) and 0.911 (95% CI 0.863-0.959), respectively (Figure 3). The C statistics demonstrated good discrimination for both the final female and male Cox regression models and the scoring systems. The 12-year risk predicted by the scoring systems was similar to the observed risks (χ² = 35.56, P = 0.154 in females and χ² = 45.0, P = 0.271 in males). Comparing the risk of conversion from CN to MCI calculated from the scoring systems by sex (Figure 2), it could be seen that the risks for males and females under different risk factor combinations were different, and the risks for females were all higher than those for males.

ROC analysis
ROC analyses of the actual diagnostic risk of MCI conversion versus the risk of MCI conversion predicted by the scoring system provided the cutoff point between high and low risk at which diagnostic test accuracy was greatest. The cutoff point was 33% for females, so that a predicted risk of more than 33% was considered high risk, while a predicted risk of more than 9% was considered high risk for males (Figure 3). The C statistic showed good discrimination for the dichotomized model classified as high or low risk: C statistic of 0.881 in females and C statistic of 0.873 in males.
In the female scoring system, the majority of women in the high-risk group were predicted to be 70 to 80 years old (72%), and the majority of women in the low-risk group were also predicted to be 70 to 80 years old (70%). The same trend was observed in the male scoring system (Table 2).

Validation of SRSS-CNMCI
In the HABS database, a total of 283 participants, including 166 women (59%) and 117 men (41%), aged 61 to 90 years, were selected according to the inclusion and exclusion criteria of this study as an external dataset to evaluate the generalization performance of the scoring systems. As shown in eTable 6, the distribution of HABS participants across all major risk factors was consistent with that of the ADNI (all SD<0.1). Refer to the appendixes for the HABS sample screening flow chart and the description of baseline characteristics (eFigure 1 and eTable 4 in the Supplementary Material). The C statistics showed good discrimination for the HABS samples using SRSS-CNMCI (C statistic 0.950 for the females and C statistic 0.965 for the males) (Figure 3). The risk predicted by the scoring systems in the HABS samples was similar to the observed risks (χ² = 30.0, P = 0.314 in females and χ² = 20.00, P = 0.220 in males).
According to the validation of both discrimination and calibration, the sex-specific scoring systems constructed in this study have good performance. The distributions of high and low risk predicted by SRSS-CNMCI at different ages by sex were similar to those in HABS and slightly different from those in ADNI (see eTable 5 and eFigure 2 in the Supplementary Material, respectively).

Summary of results
We presented a sex-specific scoring system (SRSS-CNMCI) for predicting the risk of conversion to MCI within 12 years in cognitively normal adults aged 61 to 90 years. Our scoring system not only estimated the absolute risk of conversion but also assessed the risk grade, that is, whether the conversion risk of the participants is high or low, which provides the most intuitive understanding of the risk. In SRSS-CNMCI, there were differences in MCI conversion risk between men and women, indicating that research on sex-specific models is indeed a direction worthy of further exploration. This also indicates that specific monitoring and treatment plans should be implemented for men and women.
Previous studies have found that there are significant gender differences in the incidence and progression of AD and MCI [28], primarily in the following aspects. First, in terms of brain structure, Pfefferbaum et al. [29] found in a study of patients with MCI and AD that women exhibited a faster decrease in brain volume than men, while men had higher brain reserve, meaning that, given the same neuropathological changes, men with AD were better able than women with AD to resist the disease and its clinical symptoms and exhibited a reduced incidence of disease. Second, in terms of hormones, studies of the effects of sex hormones on brain neurons found that sex hormones play a role throughout the entire life cycle of a person. Sex hormone levels and sexual genetic differences determine nerve regeneration in the brain, highlight form, facilitate axon guidance for the two-way aspect of the development of vessels and nerves, and the differences between men and women are the most notable features of sex hormones in the body type and have different expression levels [30][31][32].
Third, in terms of genetics, among AD patients, the number of women carrying the APOE4 genotype is much higher than that of men, and women carrying one APOE4 allele have a 4-fold higher risk of developing the disease, while men with the same genotype show only a slight increase in prevalence [33] .
Fourth, in terms of social life, Wookyoo et al. [34] found that highly educated AD patients suffered far less damage in the structural connections of the brain than the general population. Historically, men have been far more likely than women to obtain higher education and higher vocational positions, which may mean that men have stronger cognitive reserve than women and thus stronger resistance to pathological damage to the brain.
Therefore, we hypothesized that the development of SRSS-CNMCI from different gender perspectives will improve the prediction accuracy of the scoring system. From the baseline characteristic table (eTable 1) of this study and the prediction accuracy results of SRSS-CNMCI (Figure 3 (a) and (b)), it was indeed observed that there are many differences between men and women, which further strengthened the validity of our hypothesis.

Variable considerations
Referring to past research and clinical significance, we purposely incorporated clinical risk factors that are readily and routinely accessible in clinical trials and primary care. Our study only included data on demographic characteristics, genetics, cognitive tests, vital signs, and medical history and did not take into account neuroimaging or cerebrospinal fluid (CSF) biomarkers. At present, most neuroimaging indexes included in prediction models are the volume, surface area and thickness of a region of interest in the brain, such as middle temporal cortical thickness, hippocampal subcortical volume and right amygdala surface area [8,9,35], which lack strong specificity for MCI, so we did not include neuroimaging data in this study. For biomarkers with high specificity, records in ADNI were incomplete, and biomarker information with a sufficient sample size meeting the inclusion criteria of this study could not be found, so these biomarkers were not considered. In the female multivariable Cox proportional hazards regression model, APOE ε4 was included even though it was not individually significant in the model, because APOE ε4 is the gene with the strongest impact on the risk of late-onset Alzheimer's disease [1], and the final multivariable models were significant (P<0.001). The male multivariable Cox proportional hazards regression model yielded the same result. Clinical significance, previous studies, univariate analysis and multivariate analysis were all integrated into the selection of risk factors in this study. The difference in family history of dementia (FHD) was statistically significant only between men and women and had no effect on conversion to MCI, which may be due to substantial recall bias and inaccuracy in the collection of this information. Therefore, FHD as a risk factor is not convincing enough to be considered in subsequent studies.

Study strength
First, previous studies on risk scoring and prediction models related to AD or MCI [8,36,37] rarely consider sex-specific modeling to explore whether prediction results and performance differ between men and women. The SRSS-CNMCI developed in this study demonstrated that there are differences in risk prediction between men and women, which cannot be ignored and are the basis for improving the accuracy of prediction across genders. Second, most previous studies only considered whether the end point was converted or whether the disease was present but ignored the influence of time on the predicted results and did not include the follow-up time as an outcome indicator [8]. In this study, we comprehensively evaluated the performance of the scoring system [38], estimated discrimination to evaluate the ability of the scoring system to distinguish the unconverted from the converted, and estimated calibration to evaluate the consistency between the predicted value and the actual value. Third, some studies have shown that if the scoring system can be validated in a new independent sample, the results of the study provide a good basis for early prevention and screening in the future [39], so we ran an external validation in the independent cohort HABS. The key risk factors for the two databases were consistent and comparable, so it is reliable to use this external database for validation (eTable 6). The scoring system showed good performance in goodness of fit and calibration, indicating that our scoring system has strong credibility in predictive ability.
Fourth, some studies have found that ROC analysis is useful for identifying the optimal concentration threshold of CSF biomarkers [26]; therefore, we used ROC analysis to determine the threshold between high and low risk predicted by the scoring system. The resulting threshold performed well for risk prediction (C statistic 0.881 in females and C statistic 0.873 in males), indicating that the threshold value is reliable.

Study limitations
First, the risk factors included in the model were not comprehensive. Our goal was to develop a simple and accurate predictive tool. If the most common and easily accessible clinical indicators, such as body mass index (BMI) and daily activities (e.g., exercise frequency and reading), can be incorporated to predict the risk of MCI conversion, they will be of greater value for early prevention. However, the ADNI database contains almost no height data, so BMI cannot be calculated, and data on daily activities are not available from ADNI; these common variables were therefore not included in the scoring system. Second, the sample size we used for modeling was not very large. Although we used the world's largest AD database (ADNI), we included only small sample sizes for modeling. In the future, we will continue to enlarge the sample and further improve the prediction effect of SRSS-CNMCI. Third, the proportion of white people in the samples collected from ADNI and HABS was greater than 90%, so the study population was homogeneous. Even though the performance in external verification was good, SRSS-CNMCI still lacks the credibility to be generalized to other groups.

Conclusion
We successfully developed the SRSS-CNMCI prediction model, whose scoring systems achieved C statistics above 0.90 for both sexes and which can be used to accurately predict conversion from CN to MCI.