Study participants
A single-center, case-controlled clinical trial was conducted and registered in the Korean Clinical Trials Registry (KCT0004758). Data of subjects over 50 years of age who visited Seoul National University Bundang Hospital (SNUBH) and underwent a T1-weighted MRI scan between January 2010 and September 2019 were retrospectively collected. Our data include brain MRI scans with clinical assessment and 18F-florbetaben PET scans from visitors to our dementia clinic as well as from participants of the Korean Longitudinal Study on Cognitive Aging and Dementia (KLOSCAD) [17].
A group of patients with AD and an age- and sex-matched group with normal cognition (NC) were screened and enrolled using the following inclusion criteria. The AD group included those who had: (1) a diagnosis of probable or possible AD according to the National Institute of Neurological and Communicative Disorders and Stroke and the Alzheimer’s Disease and Related Disorders Association (NINCDS-ADRDA) criteria, or MCI according to the International Working Group on MCI, and (2) amyloid deposition as determined by a positive 18F-florbetaben PET scan. The NC group included those who (1) had no subjective cognitive complaints, (2) had no objective cognitive decline in the Korean version of the Consortium to Establish a Registry for AD (CERAD-K) neuropsychological assessment battery, (3) were functioning independently in the community, and (4) had no amyloid deposition as determined by a negative 18F-florbetaben PET scan. Subjects who had any of the following conditions were excluded: (1) a diagnosis of dementia with a cause other than or in addition to AD, i.e., mixed dementia, (2) brain pathologies on T1-weighted MRI that may cause cognitive deficits, (3) more than 1 year between the date of clinical assessment and the date of the MRI scan (NC and MCI participants only), and (4) white matter hyperintensities with a Fazekas rating of 3 or higher on fluid-attenuated inversion recovery images.
The participants' data were retrospectively screened and collected from April 27, 2020, to June 5, 2020 (6 weeks). The DLCS was then applied to the data between June 8, 2020, and June 19, 2020 (2 weeks).
Sample size calculation
We used the sensitivity and specificity of the DLCS for AD as the primary outcome measures. We calculated the sample size needed to evaluate whether the DLCS performed better than a reference, based on a one-sided α of 2.5% (Zα = 1.96), a statistical power of 80% (Z1-β = 0.842), and the results of a pilot study. The pilot study tested the performance of the DLCS using a dataset consisting of 367 AD patients and 316 controls with NC: 130 AD and 130 NC from SNUBH and 237 AD and 186 NC from the Alzheimer’s Disease Neuroimaging Initiative database. At a threshold value of 0.38, the DLCS yielded a sensitivity of 82.0% (95% confidence interval [CI], 77.7–85.8%) and a specificity of 83.2% (95% CI, 78.6–87.2%). To calculate the sample size n, we used the following formula [18]:

$$ n = \frac{\left[ Z_{\alpha}\sqrt{p_0(1-p_0)} + Z_{1-\beta}\sqrt{p_1(1-p_1)} \right]^2}{(p_1-p_0)^2} $$
where p0 is the assumed sensitivity/specificity under the null hypothesis H0, and p1 is the targeted sensitivity/specificity under the alternative hypothesis H1. The p0 and p1 values were defined as the lower and upper bounds of the 95% CI of the sensitivity and specificity from the pilot study (p0 = 0.777 and p1 = 0.858 for sensitivity; p0 = 0.786 and p1 = 0.872 for specificity). The null hypothesis was that the sensitivity/specificity of the DLCS is less than or equal to p0, the lower bound from the pilot study. The alternative hypothesis was that it is higher than p0. Based on this, the necessary number of subjects with the disease was 188, and the number of subjects without the disease was 162. Therefore, the final estimated sample size was 350 subjects, consisting of 188 patients with AD and 162 normal controls who were matched for age (<5 years apart) and sex to the AD group.
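The calculation can be sketched in a few lines of Python. This is only an illustration, assuming the standard normal-approximation formula for a one-sample test of a proportion, n = [Zα√(p0(1−p0)) + Z1-β√(p1(1−p1))]² / (p1−p0)²; the function name and the rounding up to the next whole subject are our assumptions and are not taken from reference [18].

```python
import math

def sample_size_one_proportion(p0, p1, z_alpha=1.96, z_power=0.842):
    """Subjects needed to show a proportion exceeds p0 when its true value is p1.

    Hypothetical helper: normal-approximation formula, rounded up.
    """
    numerator = (z_alpha * math.sqrt(p0 * (1 - p0))
                 + z_power * math.sqrt(p1 * (1 - p1))) ** 2
    return math.ceil(numerator / (p1 - p0) ** 2)

# Bounds of the pilot-study 95% CIs, rounded as reported in the text.
n_ad = sample_size_one_proportion(0.777, 0.858)  # sensitivity -> AD group
n_nc = sample_size_one_proportion(0.786, 0.872)  # specificity -> NC group
print(n_ad, n_nc)
```

With the rounded bounds given in the text, the sensitivity calculation reproduces the 188 subjects with AD. The specificity calculation comes out a few subjects below the reported 162, which may reflect unrounded inputs in the original calculation.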
Image acquisition
We acquired three-dimensional (3D) T1-weighted MR images in Digital Imaging and Communications in Medicine format using Philips Achieva and Ingenia scanners (Philips Medical Systems, Eindhoven, the Netherlands). The parameters were as follows: voxel dimensions = 1.0 × 0.5 × 0.5 mm³, slice thickness = 1.0 mm, echo time = 8.15 or 8.20 ms (for Achieva and Ingenia, respectively), repetition time = 4.61 ms, flip angle = 8°, and field of view = 240 × 240 mm.
We acquired 18F-florbetaben PET scans in 3D using a Discovery VCT scanner (General Electric Medical Systems, Milwaukee, WI, USA). The subjects were injected with 8.1 mCi (300 MBq) of 18F-florbetaben (Neuraceq) as a slow single intravenous bolus (6 s/mL) in a total volume of 10 mL. After a 90-min uptake period, 20-min PET images comprising four 5-min dynamic frames were obtained. The four frames were then summed into a single frame. Board-certified nuclear medicine physicians determined Aβ-positivity based on visual interpretation of tracer uptake in the gray matter compared to the neighboring subcortical white matter in the following four brain regions: the temporal lobes, frontal lobes, posterior cingulate cortex/precuneus, and parietal lobes.
Deep learning-based Alzheimer’s disease classification system
We used VUNO Med-DeepBrain AD (version 1.0.0, VUNO Inc., Seoul, South Korea) as the DLCS for AD. The convolutional neural network model used in VUNO Med-DeepBrain AD has been previously described [16]. Briefly, the DLCS receives a subject’s T1-weighted image, extracts coronal slices from areas that span the medial temporal lobe, and feeds each coronal slice as a separate input into a convolutional neural network. The network, which uses Inception-V4 as its backbone, extracts features that capture structural and textural information of the brain from the coronal slice. The feature vector is then concatenated with the subject’s age and sex (which are entered into the system together with the MRI scan) and the location (slice number) of the coronal slice, and passed to a fully connected network that calculates the probability that the slice belongs to a patient with AD. The slice-level probabilities are averaged to yield a final score that represents the subject’s probability of having AD (the score ranges from 0 to 1).
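The final aggregation step can be illustrated with a minimal sketch. It does not reproduce the VUNO Med-DeepBrain AD implementation: the function names, the example slice probabilities, and the convention that a score at or above the cut-off is labeled AD are hypothetical, and the 0.38 cut-off is simply the threshold reported for the pilot study.

```python
from statistics import mean

# Cut-off reported for the pilot study; its use here is an assumption.
AD_THRESHOLD = 0.38

def subject_score(slice_probabilities):
    """Average per-slice AD probabilities into one subject-level score in [0, 1]."""
    if not slice_probabilities:
        raise ValueError("at least one coronal slice probability is required")
    return mean(slice_probabilities)

def classify(score, threshold=AD_THRESHOLD):
    """Label a subject as AD when the score reaches the threshold (assumed convention)."""
    return "AD" if score >= threshold else "NC"

# Made-up slice-level outputs for a single subject.
probs = [0.21, 0.35, 0.62, 0.58, 0.44]
score = subject_score(probs)
print(round(score, 2), classify(score))
```

Averaging gives every extracted slice equal weight, so a single slice with an extreme probability shifts the subject-level score by at most 1/N of its value, where N is the number of slices.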
In this clinical trial, we processed the MRI data of subjects anonymously, omitting information that could identify the individual (name, sex, birth date, and hospital number). A researcher (K.J.S.), who was blinded to the subjects’ clinical diagnoses and did not participate in the construction of the study dataset, performed the processing of the subjects’ data with DLCS. The DLCS was installed on a desktop PC with the following specifications: Intel hexa-core 2.90 GHz CPU with 16 GB RAM running on Ubuntu 18.04.4 LTS.
Statistical analysis
We evaluated the accuracy of the DLCS in the diagnosis of AD by comparing its output (a continuous probability ranging from 0 to 1) with the subject’s clinical diagnosis. We defined sensitivity and specificity as the primary outcomes, and the area under the receiver operating characteristic curve (AUC) as the secondary outcome.
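As an illustration of these outcome measures, the following sketch computes sensitivity and specificity at a fixed threshold and the AUC as the probability that a randomly chosen AD subject scores higher than a randomly chosen NC subject (the Mann-Whitney formulation). The function names, the example scores, and the use of the pilot-study threshold of 0.38 are assumptions made for the example; the analyses in this study were run in SPSS and MedCalc, not with this code.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Return (sensitivity, specificity); labels are 1 for AD and 0 for NC."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

def auc(scores, labels):
    """AUC as the fraction of (AD, NC) pairs ranked correctly; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up DLCS scores for four AD subjects and four NC subjects.
scores = [0.91, 0.72, 0.35, 0.55, 0.20, 0.10, 0.40, 0.05]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
sens, spec = sensitivity_specificity(scores, labels, threshold=0.38)
print(sens, spec, auc(scores, labels))
```

The pairwise AUC is quadratic in the number of subjects, which is acceptable for a few hundred subjects; a rank-based computation would be preferable for much larger datasets.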
Between-group comparisons used the independent-samples t-test for continuous variables and the chi-square test for categorical variables. We estimated the 95% CIs of sensitivity and specificity using the Clopper-Pearson method [19] and the 95% CI of the AUC using the DeLong method [18]. All statistical analyses were performed using SPSS, version 20 (SPSS Inc., Chicago, IL, USA) and MedCalc, version 16.4.3 (MedCalc Software, Mariakerke, Belgium).
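For readers unfamiliar with the Clopper-Pearson (exact binomial) interval, a minimal sketch follows. It inverts the binomial tail probabilities by bisection instead of using beta-distribution quantiles, which gives the same interval. The function names are ours, the code requires Python 3.8 or later for `math.comb`, and it is not the routine used by SPSS or MedCalc.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); adequate for n up to several hundred."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def _root_increasing(f, iters=100):
    """Bisection on [0, 1] for a function that increases from negative to positive."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) CI for a proportion with k successes in n trials."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _root_increasing(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2)
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _root_increasing(
        lambda p: alpha / 2 - binom_cdf(k, n, p))
    return lower, upper

# Example with made-up counts: 5 correct classifications out of 10.
low, high = clopper_pearson(5, 10)
print(round(low, 4), round(high, 4))
```

The interval is conservative: its coverage is at least the nominal 95%, so it tends to be wider than normal-approximation intervals when the sample is small.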
Standard Protocol Approvals, Registrations, and Patient Consents
This clinical trial (Korean Clinical Trials Registry identifier: KCT0004758) was approved by the Ministry of Food and Drug Safety in South Korea and the Institutional Review Board of SNUBH. The design and conduct of this study were in accordance with the principles outlined in the Declaration of Helsinki [20]. Because this clinical trial was conducted retrospectively, the requirement for informed consent from the subjects or their legal guardians was waived.
Data Availability
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.