Alzheimer’s disease (AD) is a specific type of dementia associated with severe neurological deficits that affect cognitive, visual, sensory, and motor functions in people living with the disease1. With AD, neurodegeneration, a progressive loss of the structure or function of neurons, is inevitable, and there is currently no cure for reversing this process. However, clinical studies have shown that, with early diagnosis, treatment, and therapeutic intervention, the progression of neurodegeneration can be slowed. At present, a definitive diagnosis of AD remains a complex task because tests for the presence of amyloid plaques and phosphorylated tau, the true determinants of AD, can mainly be performed posthumously2. Other clinical practices depend on a multitude of evaluations, including clinical assessments, medical history reviews, cognitive assessments, and neuroimaging, and many years of study may be needed to reach a diagnostic decision3. Neuroimaging methods, such as positron emission tomography (PET) and particularly MRI, provide information on the extent of structural changes in the brain relevant to the pathological alterations characteristic of neurodegeneration4. MRI offers several broad viewpoints, namely axial, coronal, and sagittal, each carrying different levels of information for analysing brain neurodegeneration. Notably, the axial view reveals substantial atrophy of the cerebral cortex, leading to shrinkage of the outer layer of the cerebrum. This atrophy is accompanied by ventricle enlargement, reduced brain volume, and diminished gray matter5. In contrast, the coronal view highlights ventricle enlargement and emphasizes significant temporal lobe and cortical atrophy, providing a window into the widespread loss of neurons throughout the brain, accompanied by sulcus widening and gyrus thinning5. The sagittal plane provides the most visible information for AD diagnosis7.
Brain neurodegeneration is evident in the sagittal plane, particularly in the frontal lobe, cerebellum, occipital lobe, thalamus, and corpus callosum, where learning and memory, mental function, motor function, and sensory function7 can be significantly impacted.
Despite the diagnostic potential of MRI, sole reliance on it for early AD diagnosis faces numerous limitations9. For example, AD may elude visual detection, especially when numerous samples from MCI and AD patients are analysed, necessitating a reliable and comprehensive clinical evaluation methodology. Additionally, the interpretation of MRI scans varies among radiologists and clinicians, which can introduce inconsistency into diagnosis. Emerging machine learning techniques, however, offer promise for diagnosing AD, particularly in a timely manner, thereby paving the way for more effective AD management and intervention.
Over the past decade, deep learning algorithms, including both pretrained networks and tailored architectures, have been successfully adopted for AD modelling. Pretrained networks have long-standing relevance in AD diagnostic research. Bae et al. tailored a residual network-50 (ResNet50)10 for discriminating between MCI and AD patients and achieved an accuracy of 82.4%. The GoogLeNet, AlexNet, and ResNet-18 pretrained networks were exploited for classifying patients into cognitively normal, early mild cognitive impairment, mild cognitive impairment, and late mild cognitive impairment categories11. With accuracies of 96.39%, 94.08%, and 97.51% for GoogLeNet, AlexNet, and ResNet-18, respectively, ResNet-18 outperformed the other models. By integrating a 3D mobile inverted bottleneck convolution (MBConv) block into a 3D EfficientNet architecture12, accuracy, sensitivity, specificity, and AUC values of 86.67%, 75.00%, 90.91%, 97.16%, and 83.33% were reported for the sMCI and pMCI sets. In another work, the DenseNet-169 and ResNet-50 CNN architectures were exploited for early AD diagnosis13. DenseNet-169 exhibited superior accuracy, surpassing ResNet-50, with scores of 97.7% and 88.7%, respectively. The ResNet-18 pretrained network has also proven useful for AD classification14. With the use of the Mish activation function (MAF) to enhance the model's learning adaptability and a weighted cross-entropy loss function to ensure equitable consideration of the AD, MCI, and CN classes, the network achieved 88.3% accuracy on the preprocessed ADNI dataset.
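To make the components just mentioned concrete, the following is a minimal sketch of the Mish activation and a class-weighted cross-entropy term. The three-class weights and probabilities here are hypothetical illustrations, not the cited work's configuration.

```python
import math

def mish(x):
    # Mish activation: x * tanh(softplus(x)), where softplus(x) = ln(1 + e^x)
    return x * math.tanh(math.log1p(math.exp(x)))

def weighted_cross_entropy(probs, label, class_weights):
    # Weighted cross-entropy for one sample: scale the negative
    # log-likelihood of the true class by its class weight, so that
    # under-represented classes (e.g. AD vs. MCI vs. CN) contribute
    # more to the loss. Weights here are purely illustrative.
    return -class_weights[label] * math.log(probs[label])

# Hypothetical example: true class 0 with predicted probability 0.7
loss = weighted_cross_entropy([0.7, 0.2, 0.1], 0, [1.0, 2.0, 2.0])
```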
Tailored deep learning algorithms are now paving the way for AD diagnosis. Basaia et al.15 proposed a 3D CNN consisting of 2 convolutional blocks with 5 × 5 × 5 filters and 10 blocks with 3 × 3 × 3 filters, using strided convolutions in place of max-pooling for downsampling. Their work achieved 74.8%, 75.1%, and 75.3% accuracy, sensitivity, and specificity, respectively, on the ADNI stable MCI (s-MCI) and MCI conversion (c-MCI) sets, and 85.9%, 83.6%, and 88.3% accuracy, sensitivity, and specificity, respectively, on the AD and s-MCI sets. Another study proposed a CNN for AD diagnosis and stratification16. This research not only facilitated fast and accurate AD diagnosis but also offered classification of normal, MCI, and AD patients. Additionally, the authors addressed the challenging task of stratifying MCI into very mild dementia (VMD), mild dementia (MD), and moderate dementia (MoD) stages, akin to prodromal AD. Their shallow network16 achieved an overall testing accuracy of 99.68%, which was greater than that of pretrained networks such as DenseNet121, ResNet50, VGG-16, EfficientNetB7, and InceptionV3. Although the dataset used was the Open Access Series of Imaging Studies (OASIS) dataset, their work demonstrates the importance of custom-trained networks in AD diagnosis. A fine-tuned CNN classifier17 called AlzheimerNet was shown to be capable of classifying Alzheimer's disease into five stages; with data preprocessing and augmentation, the method achieved 98.67% accuracy using the RMSProp optimizer. Considering the different patient groups used for diagnosing AD, six independent binary classifications based on a deep belief network (DBN) were proposed18: healthy control (HC) vs. AD, HC vs. pMCI, HC vs. sMCI, pMCI vs. AD, sMCI vs. AD, and sMCI vs. pMCI.
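The strided-convolution downsampling used in place of max-pooling15 can be illustrated in one dimension. This is a toy sketch, not the cited 3D implementation: a stride of 2 filters the signal and halves its spatial resolution in a single operation, with output length floor((n - k) / stride) + 1.

```python
def conv1d_strided(signal, kernel, stride):
    # Valid (no-padding) 1D convolution with a stride: each step skips
    # `stride` positions, so the layer downsamples without pooling.
    k = len(kernel)
    out = []
    for start in range(0, len(signal) - k + 1, stride):
        window = signal[start:start + k]
        out.append(sum(s * w for s, w in zip(window, kernel)))
    return out

# A length-8 signal with a length-3 averaging-style kernel and stride 2
# yields floor((8 - 3) / 2) + 1 = 3 outputs.
downsampled = conv1d_strided([1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1], 2)
```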
The modifications to the DBN18 include dropout and zero-masking to overcome overfitting, a preprocessing algorithm, principal component analysis for dimensionality reduction, and a multitask feature selection approach. Using the ADNI dataset, accuracies ranging from 87.78% to 99.62% were observed. Hazarika et al.19 replaced the downsampling layer in the traditional LeNet architecture with a fusion of min-pooling and max-pooling layers to retain both minimum-valued and maximum-valued signals. Their model achieved an accuracy, precision, recall, and F1-score of 98%, 96%, 97%, and 98%, respectively, on the ADNI dataset. In another work, a VGG-TSwinformer architecture20 that combines a VGG-16 convolutional neural network and a transformer network was proposed and validated on the ADNI sMCI and pMCI cohorts, with an accuracy, sensitivity, specificity, and AUC of 77.2%, 79.97%, 71.59%, and 0.8153, respectively. Similarly, another work21 found a hybrid architecture useful: the authors combined AlexNet with LeNet, varying the filter sizes among 1 × 1, 3 × 3, and 5 × 5, and reported scores as high as 96%, 93%, 93%, and 96% for accuracy, precision, recall, and F1-score, respectively. Another notable architecture is the multiplane convolutional neural network (Mp-CNN), which simultaneously processes the three planes, axial, coronal, and sagittal, of 3D MRI22. The Mp-CNN comprises 14 layers with rectified linear unit (ReLU) activation and softmax for multiclass classification, and it outperforms traditional 2D CNNs in multiclass classification involving AD, MCI, and NC. The Swinformer has also been explored23 as a transformer-based CNN architecture for AD classification. It combines a CNN module for planar feature extraction with a transformer encoder module for capturing 3D semantic connections, and the authors argued that the Swinformer captures local features more accurately.
The pipeline included data preprocessing and augmentation strategies, such as random rotation and mirror reflection, and recorded an accuracy of 88.3%.
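The min-max pooling fusion19 mentioned above can be sketched with a toy one-dimensional example. Averaging each window's minimum and maximum is an assumed fusion rule chosen for illustration, not necessarily the authors' exact operation; the point is that both extreme-valued signals survive downsampling.

```python
def fused_min_max_pool(values, window):
    # Slide a non-overlapping window over the signal and retain both
    # the minimum- and maximum-valued responses from each window,
    # fusing them by averaging (illustrative fusion rule).
    fused = []
    for i in range(0, len(values) - window + 1, window):
        chunk = values[i:i + window]
        fused.append((min(chunk) + max(chunk)) / 2)
    return fused

# With window size 2, [1, 4, 2, 8] pools to [(1+4)/2, (2+8)/2].
pooled = fused_min_max_pool([1, 4, 2, 8], 2)
```

Plain max-pooling would discard the 1 and 2 entirely; the fused variant keeps a trace of both low and high responses in each region.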
While transfer learning with state-of-the-art pretrained models is clearly a promising technique for diagnosing AD, tailored deep learning algorithms outperform traditional methods and appear better at preserving the underlying structure of the data for diagnosing AD. However, none of these works have captured the structural dynamics of neurodegeneration in the brains of individuals with MCI or AD, which leaves room for additional work. Therefore, this paper seeks to bridge this research gap and makes the following contributions to AD research:
- We propose a machine learning framework that integrates the novel SNeurodCNN architecture for modelling the structural neurodegeneration of the brain's cerebral cortex and for the task of discriminating between MCI and AD.
- We investigate whether the varying viewpoints of the two planes of the sagittal axis, midsagittal and parasagittal, provide differing insights into structural neurodegeneration.
- We investigate the sensitivity of SNeurodCNN to brain neurodegeneration, which is relevant for identifying digital biomarkers (digi-biomarkers) for the regions of the brain where structural neurodegeneration is prevalent.