Amyloid PET-Positive Predictability of Machine Learning Algorithm Based on MDS-OAβ Levels

Young Chul Youn, Chung-Ang University Hospital
Jung-Min Pyun, Uijeongbu Eulji Medical Center
Hye Ryoun Kim, Chung-Ang University First Campus: Chung-Ang University
Sungmin Kang, PeopleBio Inc.
Nayoung Ryoo, Seoul National University Bundang Hospital
Young Ho Park, Seoul National University Bundang Hospital
Hae-Won Shin, Chung-Ang University Seoul Campus: Chung-Ang University
SangYun Kim (neuroksy@snu.ac.kr), Seoul National University Bundang Hospital, https://orcid.org/0000-0002-9101-5704


Introduction
Alzheimer's disease (AD) is a degenerative brain disease in which deteriorating cognitive function leads to the loss of independent living, and it is linked to the gradual loss of cortical neurons. 1 Pathologically, it is characterized by cerebral amyloidosis that begins about 20 years before the onset of clinical symptoms. 2,3 Early detection of AD is essential for patient outcomes and for clinical trials of disease-modifying drugs.
Cerebral amyloidosis in AD has been evaluated based on the cerebrospinal fluid amyloid-β 1-42 levels and amyloid positron emission tomography (PET) imaging. However, these approaches are invasive, costly, and associated with interlaboratory variability, which limits their clinical use. Therefore, efforts have been made to develop blood-based amyloid-targeted biomarkers. The Multimer Detection System-Oligomeric amyloid-β (MDS-OAβ) level is a valuable blood-based biomarker for AD; it is a modified sandwich immunoassay for measuring Aβ oligomerization in the plasma. 4,5 This technique involves introducing synthetic Aβ into the plasma to trigger oligomerization of Aβ prior to antigen-antibody reactions and measuring the oligomerization tendency of plasma Aβ in AD patients. 6 We have previously evaluated the role of MDS-OAβ levels in differentiating AD and community-based healthy participants, showing high sensitivity and specificity of this approach. 7 Meanwhile, we attempted to evaluate whether brain AD pathology can be predicted based on blood MDS-OAβ levels in studies on the relationship between MDS-OAβ and magnetic resonance imaging or amyloid PET findings. 8,9 Our previous studies have reported MDS-OAβ cut-off, sensitivity, and specificity values in AD diagnosis and amyloid PET-positivity. In the present study, we aimed to use machine learning algorithms to examine amyloid PET-positivity prediction accuracy of MDS-OAβ levels and factors that affect it using multi-center datasets.

Methods
This was an observational cross-sectional study to evaluate the role of MDS-OAβ levels in predicting amyloid PET-positivity using machine learning. The study was based on data obtained from the multicenter Alzheimer's Disease All Markers Study (ADAM), involving the Seoul National University Bundang Hospital and Chung-Ang University Hospital, and on data from the Dementia Overcoming Project in Korea (DOP).
In pre-processing, the dataset was randomly split into training (70%) and test (30%) datasets using the train_test_split function from the scikit-learn library (https://scikit-learn.org/); the numeric variables were normalized using their mean and standard deviation. The feature (x_data) and outcome (y_data) variables were created in each dataset. The model was trained on the training dataset. The cost was computed with a logistic regression model and minimized using the 'GradientDescentOptimizer'. Lastly, the accuracy, sensitivity, and specificity values of the amyloid PET prediction were calculated 50 times on randomly split test datasets for each of the following feature combinations: 'MDS-OAβ', 'MDS-OAβ + age', 'MDS-OAβ + gender', and 'MDS-OAβ + age + gender'.
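The pre-processing and training steps described above can be sketched as follows. This is a minimal, dependency-free illustration rather than the study's code: the source used scikit-learn's train_test_split and a 'GradientDescentOptimizer', whereas here a hand-written split and a plain batch gradient-descent loop stand in for them. The data are synthetic, and the names make_synthetic, split, fit_logistic, and run_once, as well as the learning rate, epoch count, and Gaussian parameters, are assumptions made for illustration only.

```python
import math
import random


def make_synthetic(n_pos=102, n_neg=61, seed=0):
    """Hypothetical data: one MDS-OAβ-like value per subject (1 = PET-positive)."""
    rng = random.Random(seed)
    xs = [rng.gauss(1.0, 0.2) for _ in range(n_pos)]
    xs += [rng.gauss(0.7, 0.2) for _ in range(n_neg)]
    ys = [1] * n_pos + [0] * n_neg
    return xs, ys


def split(xs, ys, test_fraction=0.3, seed=0):
    """Random 70/30 split (stand-in for scikit-learn's train_test_split)."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    n_test = round(len(xs) * test_fraction)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return ([xs[i] for i in train_idx], [ys[i] for i in train_idx],
            [xs[i] for i in test_idx], [ys[i] for i in test_idx])


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(xs, ys, lr=0.5, epochs=300):
    """Single-feature logistic regression fitted by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # gradient of the cross-entropy cost
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def run_once(seed):
    """One split, normalize with training statistics, train, and score."""
    xs, ys = make_synthetic()
    x_tr, y_tr, x_te, y_te = split(xs, ys, seed=seed)
    mean = sum(x_tr) / len(x_tr)
    sd = math.sqrt(sum((v - mean) ** 2 for v in x_tr) / len(x_tr))
    z_tr = [(v - mean) / sd for v in x_tr]
    z_te = [(v - mean) / sd for v in x_te]  # reuse training mean and SD
    w, b = fit_logistic(z_tr, y_tr)
    preds = [1 if sigmoid(w * z + b) >= 0.5 else 0 for z in z_te]
    acc = sum(p == y for p, y in zip(preds, y_te)) / len(y_te)
    return w, b, acc
```

Calling run_once with 50 different seeds and collecting the returned accuracies would mimic the repeated random splits described in the text. Normalizing the test set with the training mean and standard deviation is a design choice of this sketch; the source does not state how its normalization statistics were obtained.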

Results
Logistic regression was used to assess the predictive accuracy of the MDS-OAβ values alone and combined with other variables for amyloid PET-positivity findings. Predictive accuracy was highest for the MDS-OAβ levels, at 78.16 ± 4.97%; the corresponding sensitivity and specificity values were 83.87 ± 9.40% and 70.00 ± 13.13%, respectively. Adding the gender feature to the MDS-OAβ levels did not improve the model's predictive accuracy, whereas including age reduced the predictive accuracy to below 60%.
MDS-OAβ values showed a sensitivity and specificity of 85%. 8 Adding age and Mini-Mental State Examination variables could change the sensitivity and specificity to 91% and 82%, respectively. In the present study, the predictive accuracy of the machine learning algorithm was approximately 78%, and the sensitivity and specificity values were 83% and 70%, respectively. This accuracy is reasonable given that the data came from multiple centers using different anticoagulants.
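The accuracy, sensitivity, and specificity values reported above as mean ± standard deviation can be computed along the following lines. This is a sketch under stated assumptions: the function names confusion_metrics and summarize are hypothetical, the toy labels are invented, and the sample standard deviation is used because the source does not say which estimator it applied.

```python
from statistics import mean, stdev


def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        # Guard against a split that happens to contain only one class.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


def summarize(values):
    """Mean and sample standard deviation over repeated runs."""
    return mean(values), stdev(values)


# Toy example with invented labels: 3 positives, 2 negatives.
metrics = confusion_metrics([1, 1, 1, 0, 0], [1, 1, 0, 0, 1])

# Toy example with invented per-run accuracies from three repeated splits.
acc_mean, acc_sd = summarize([0.8, 0.7, 0.9])
```

In the setting described in the text, summarize would be applied to the 50 per-split values of each metric.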
It was observed that the inclusion of age and gender in the model did not improve its predictive accuracy. As the number of input features increases in a model based on a relatively small dataset, the model's predictive power may decrease due to overfitting. 15,16 Fig. 1 illustrates MDS-OAβ levels, stratified by gender, age, and anticoagulant type. Intuitively, we could not draw any useful decision lines to distinguish amyloid positivity by age and gender. Since there were only 163 subjects with amyloid PET and MDS-OAβ values, including additional features in the model did not improve the model's predictive accuracy.
Pyun et al. used plasma MDS-OAβ values estimated using heparin anticoagulants; however, the present study included both heparin-based (n = 96) and EDTA-based (n = 67) plasma measures because different centers use different anticoagulant types. As shown in Fig. 1, to predict amyloid PET-positivity, the decision line of EDTA plasma-based MDS-OAβ values can be drawn at approximately 1.0 ng/ml; the corresponding heparin-based level is lower. Consequently, we included the anticoagulant type in the algorithm. Although the sensitivity and specificity values were somewhat lower than those previously reported, the present findings are acceptable, given the use of data obtained from different centers, using different anticoagulants.
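One way to pass the anticoagulant type to a model alongside the MDS-OAβ value is as a binary indicator, as sketched below. The record layout, the key names, the 0/1 coding, and the function build_features are all hypothetical; the source states only that the anticoagulant type was included in the algorithm, not how it was encoded.

```python
def build_features(record, feature_names):
    """Turn one participant record into a numeric feature vector."""
    encoders = {
        "mds_oab": lambda r: float(r["mds_oab"]),
        "age": lambda r: float(r["age"]),
        # Binary indicators; the choice of reference level is arbitrary.
        "gender": lambda r: 1.0 if r["gender"] == "F" else 0.0,
        "anticoagulant": lambda r: 1.0 if r["anticoagulant"] == "EDTA" else 0.0,
    }
    return [encoders[name](record) for name in feature_names]


# Invented example record, not taken from the study data.
example = {"mds_oab": 0.95, "age": 71, "gender": "F", "anticoagulant": "heparin"}
row = build_features(example, ["mds_oab", "anticoagulant"])
```

With such an indicator, a logistic regression model can learn a different effective MDS-OAβ threshold for EDTA and heparin plasma, which matches the different decision lines described for Fig. 1.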
We also examined other models, including three- or four-layer deep neural networks and light gradient boosting models; however, the corresponding accuracy values were below 70% and 76%, respectively, and none of the models showed performance superior to that of logistic regression.

Limitations
One limitation of this study is the use of imbalanced data; specifically, 62.58% of the participants were amyloid PET-positive, and the ADAM participants showed a higher positive rate than the DOP participants.
Because of the small number of participants, we could not create a randomly balanced dataset or build an algorithm based on one. Datasets obtained in a clinical setting are unlikely to be balanced. The imbalance observed in this study was deemed acceptable, but the presented findings should be interpreted with caution. Notably, our previous study did not show any differences in classification accuracy between imbalanced clinical and randomly selected balanced datasets. 17 The other limitation of this study is its retrospective design, including the use of data obtained by different projects, in which PET examinations were performed at the discretion of the attending neurologist rather than according to a standardized protocol.
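To put the class imbalance described above in context, the following sketch computes what a trivial classifier that labels every participant as amyloid PET-positive would score. The counts of 102 positive and 61 negative participants are an assumption, back-calculated from the reported 62.58% of 163 subjects; the function name is hypothetical.

```python
def always_positive_baseline(n_pos, n_neg):
    """Metrics of a classifier that predicts 'positive' for everyone."""
    total = n_pos + n_neg
    return {
        "accuracy": n_pos / total,  # only the positives are classified correctly
        "sensitivity": 1.0,         # every positive is flagged
        "specificity": 0.0,         # no negative is ever recognized
    }


# Assumed counts: 102 / 163 rounds to the reported 62.58%.
baseline = always_positive_baseline(102, 61)
```

Under these assumed counts, the trivial baseline has an accuracy of about 62.58%, with 100% sensitivity and 0% specificity, which gives a reference point for the reported accuracy of roughly 78%.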

Conclusions
The machine learning algorithm using logistic regression and multi-center MDS-OAβ values yielded satisfactory predictive accuracy, sensitivity, and specificity values suitable for the prediction of amyloid  (743)). The written informed consent requirement was waived due to the retrospective nature of this study.

Consent for publication
Not applicable.