A deep learning approach for classification and diagnosis of Parkinson’s disease

Deep learning has attracted considerable attention in industry, and its techniques are increasingly applied to healthcare problems such as computer-aided detection and diagnosis and disease prediction. Its popularity stems from its ability to process large volumes of patient data accurately and reliably in a short time, whereas practitioners may need considerable time to analyze the data and generate reports. In this paper, we propose a deep neural network-based model for the classification of Parkinson’s disease. The proposed method classifies Parkinson’s disease patients with an accuracy of 94.87%. We also compare the results with existing approaches: linear discriminant analysis, support vector machine, K-nearest neighbor, decision tree, classification and regression trees, random forest, linear regression, logistic regression, multi-layer perceptron, and Naive Bayes.


Introduction
Deep learning has gained momentum in industry and academia because of its ability to find complicated patterns in massive datasets. Other approaches become heavy and complex when dealing with massive datasets, and some cannot handle extensive datasets at all. Before deep learning, classical machine learning was popular in many applications, but its shortcomings led to the rise of deep learning. Deep learning is now widely used in the diagnosis and prediction (Vásquez-Correa et al. 2018; Wang et al. 2020; Sivaranjini and Sujatha 2020; Shahid and Singh 2020) of many diseases, such as diabetes, cancer, Alzheimer's disease, and Parkinsonism (Jyotiyana and Kesswani 2020a; Wingate et al. 2020; Grover et al. 2018). In recent years, many researchers have been working on Parkinson's disease (PD), driven by the growing number of patients affected by it. The Parkinson's Progression Markers Initiative (PPMI) provides many imaging-based datasets of PD patients, including MRI and PET images of patients with Parkinson's disease in the USA.
The actual cause of PD is still unknown, but common factors such as genes, aging, and lifestyle play a major role. Major research on PD shows that when the dopamine level in neurons decreases, communication between neurons is affected, causing PD. Subjects suffering from PD may share common symptoms such as memory loss, tremor, bradykinesia, and weakness. Many studies show that patients suffering from PD have poor concentration; they may be unable to write properly and easily forget daily appointments and social gatherings (Hartelius and Svensson 1994; Politis 2014; Stern and Siderowf 2010; Grossman et al. 2003). Parkinson's disease is the most prevalent neuro-degenerative disease (NDD) (Jyotiyana et al. 2021) among elderly people after Alzheimer's disease (Son et al. 2017; Munteanu et al. 2015), and many researchers are working on PD or Parkinsonism (Grossman et al. 2003; Prashanth et al. 2014; Mostafa et al. 2019; Wroge et al. 2018). Grover et al. (2018) studied the differences in handwriting between healthy subjects and persons suffering from PD: as the level of dopamine decreases, patients find it difficult to walk, speak, and write. Researchers have also applied deep learning to tele-monitoring data of PD patients who have difficulty speaking. People suffering from PD struggle with vocal symptoms and have trouble with the normal production of vocal sounds (Das 2010; De Rijk et al. 1997). This condition is known as dysphonia, a type of voice disorder involving functional problems with the voice (Ho et al. 1999; Berardelli et al. 2001).
In this manuscript, we have designed a deep neural network-based model for the classification of Parkinson's disease (PD) patients. Major contributions of the research are as follows:

Related works
Das (2010) used different techniques for the prediction of Parkinson's disease and found that neural networks achieved an overall accuracy of 92.9%. Berardelli et al. (2001) examined other physiological problems associated with Parkinson's disease, such as bradykinesia, including slower movement, muscle rigidity, resting tremor, and weakness; their study covers the symptoms of PD and the role of surgery and deep brain stimulation. Some researchers are also working on clinical datasets of PD. Choi et al. (2017) developed a deep learning-based FP-CIT SPECT interpretation system that helps to enhance the imaging diagnosis of PD. Their study focused on SWEDD (scan without evidence of dopaminergic deficit); it helps to differentiate between mild and severe PD using SPECT imaging, with a classification accuracy of 98.8%. Worldwide PD surveys have been carried out in different countries (De Rijk et al. 1997; Zhang and Román 1993). These studies show the demographic distribution of the population suffering from PD and the most affected age group, and they found that people aged 60 or more are most affected by Parkinson's disease. Fargel et al. (2007) surveyed PD patients and neurologists. Of the 500 patients surveyed, 49% were at an early stage and 51% were suffering from severe PD. This survey focused on the treatment of PD and on improving the quality of life of patients. Jankovic (2008) discussed the features of PD and their diagnostic methods, along with measures for improving the condition of PD patients. Saeed et al. (2017) described the clinical and pathological features of PD and how structural MRI helps in its diagnosis. Many researchers and academicians are working on different techniques for the correct classification of Parkinson's disease. Some of the research done in this area is summarized in Table 1.
The abbreviations used in Table 1 are as follows. RF: random forest; NN: neural networks; ANN: artificial neural networks; SVM: support vector machine; DT: decision tree; PCA-NN: principal component analysis-neural networks; PCA-ANFIS: principal component analysis-adaptive neuro-fuzzy inference system; PCA-SVR: principal component analysis-support vector regression; DNN: deep neural networks; MLP: multi-layer perceptron; ROC: receiver operating characteristic; AUC: area under the curve; OCFA: optimized cuttlefish algorithm.
Due to the limitations of existing approaches, we have proposed a deep neural network-based model for the classification of Parkinson's disease. In our proposed model, we have tried to increase the classification accuracy and at the same time reduce the overhead of classification by optimizing the number of layers in the model. This model can be further used for the early diagnosis of Parkinson's disease.

Methodology
We collected the Parkinson's tele-monitoring data from the updated link https://www.kaggle.com/mountainguest/parkinsons-telemonitoring?select=parkinsons_updrs.data, where it is listed under the name Parkinson's tele-monitoring data. The dataset has multivariate attribute characteristics and contains 22 attributes. As we do not need the test time column, we chose only 21 attributes for our experiment. The status column specifies whether the subject is suffering from PD or not. The database contains voice recordings of 42 patients at an early stage of PD, who were recruited for a 6-month trial of a tele-monitoring device for remote symptom progression monitoring. The dataset contains the following attributes: Subject ID, Age, Sex, MDVP:Fo(Hz), MDVP:Fhi(Hz), Jitter(%), Jitter(Abs), Jitter:RAP, Jitter:PPQ5, Jitter:DDP, Shimmer, Shimmer(dB), Shimmer:APQ3, Shimmer:APQ5, Shimmer:APQ11, Shimmer:DDA, NHR, HNR, Spread1, Spread2, PPE, and Status, among others. The main aim of the dataset is to predict the motor and total UPDRS scores ('motor_UPDRS' and 'total_UPDRS') from the given voice measures. A brief description of the attributes in the dataset is given in Table 2.
We performed data pre-processing and, after feature selection, chose 21 features for the classification of PD. Our model consists of a total of 11 layers. The input layer takes the 21 features and feeds the next layer, which contains 42 nodes; this first layer has 924 parameters, as shown in Fig. 1. The layer-wise distribution of the nodes and the respective parameters is given in Table 3.
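The figure of 924 parameters for the first layer is what one would expect from a fully connected layer with 21 inputs and 42 nodes, counting one weight per input-node pair plus one bias per node. The short sketch below checks this arithmetic; the function name dense_params is illustrative and does not come from the original work, and the assumption that the layer is a standard dense layer with biases is ours.

```python
def dense_params(n_inputs: int, n_units: int) -> int:
    """Trainable parameters of a fully connected layer.

    Assumes one weight per (input, unit) pair plus one bias per unit.
    """
    return n_inputs * n_units + n_units


# 21 input features feeding 42 nodes: 21 * 42 weights + 42 biases
first_layer = dense_params(21, 42)
print(first_layer)  # 924
```

The same function can be applied to each successive pair of layer widths in Table 3 to check the remaining parameter counts.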
The model was tested on different parameters as illustrated in Table 3; it gave the best results for 11 layers. If the number of layers was increased beyond this, there was no significant improvement in the results. In some cases, there was a drop in classification accuracy. The experiments were executed for 1000 simulations and after using the proposed PDCD model, the highest accuracy achieved was 94.87%. Detailed results are discussed in the next section.

Experiments and results
In this section, we describe in detail the steps performed in the experiment and then discuss the results.

Data pre-processing
All the features were considered during the pre-processing step, and we checked the data for null or missing values. Our dataset contained none, so no missing-value handling techniques were needed. We split the data into training and testing datasets and then normalized it, using StandardScaler to avoid manual processing. After data pre-processing, the PDCD model was applied to the data; the results are discussed in the next subsection.
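The order of operations described above (split first, then normalize) can be sketched as follows. This is a minimal pure-Python illustration rather than the code used in the experiments: the helper names, the 80/20 split, the random seed, and the toy data are all assumptions, and the scaling statistics are computed on the training split only, which is the usual reason for splitting before normalizing.

```python
import random
import statistics


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and split off a test set (hypothetical helper)."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_scaler(rows):
    """Per-column mean and population standard deviation of the given rows."""
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    # Fall back to 1.0 for constant columns to avoid division by zero.
    stds = [statistics.pstdev(c) or 1.0 for c in columns]
    return means, stds


def transform(rows, means, stds):
    """Standardize each value: (x - mean) / std, column by column."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


# Toy data standing in for the voice features (made up for illustration).
rows = [[float(i), float(i * i % 7)] for i in range(10)]

train, test = train_test_split(rows)
means, stds = fit_scaler(train)          # statistics from training data only
train_scaled = transform(train, means, stds)
test_scaled = transform(test, means, stds)  # reuse the training statistics
```

After this step each training column has zero mean and unit variance, while the test columns are scaled with the same statistics and so will generally deviate slightly from those values.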

Model optimization
Optimization is an important task when training a model on a dataset. We used the ''adam'' optimizer, which is easy to implement, computationally efficient, has low memory requirements, is appropriate for problems with noisy or sparse gradients, and typically requires little tuning. The Adam optimizer inherits the strengths of two algorithms: gradient descent with momentum and root mean square propagation (RMSProp). The combination of these two algorithms makes ''adam'' well suited to large and complex data. Adam is widely used because it often reaches good performance at a low training cost compared with other optimizers.
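To make the combination of momentum and RMSProp concrete, the sketch below applies the standard Adam update rule to a one-dimensional toy problem. It is an illustration only, not the training code of the proposed model: the function name adam_minimize, the toy objective (x - 3)^2, and the hyperparameter values are assumptions chosen for the example.

```python
import math


def adam_minimize(grad, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Minimize a 1-D function with the Adam update rule, given its gradient."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        # Momentum part: exponential moving average of the gradient.
        m = beta1 * m + (1 - beta1) * g
        # RMSProp part: exponential moving average of the squared gradient.
        v = beta2 * v + (1 - beta2) * g * g
        # Bias correction for the zero-initialized averages.
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Toy objective f(x) = (x - 3)^2 with gradient 2 * (x - 3); its minimum is at x = 3.
x_min = adam_minimize(lambda x: 2 * (x - 3.0), x0=0.0)
print(round(x_min, 3))
```

The first moving average smooths the direction of the step, and the second rescales the step size per parameter, which is why little manual tuning of the learning rate is typically needed.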

Classification
Classification is the last step: after fitting and evaluating the model, we classify each subject into one of two categories, Parkinson's disease or Healthy. In our dataset, status 0 indicates a healthy person and status 1 a Parkinson's patient. As shown in Table 4, for status 0 we obtain a precision of 82%, a recall of 90%, an F1 score of 86%, and a support of 10, while for status 1 we obtain a precision of 96%, a recall of 93%, an F1 score of 95%, and a support of 29. Overall, the classification model has a precision of 93%, a recall of 92%, and an F1 score of 92% with a support of 39, and the overall classification accuracy is 94%, which is better than that of the other methods.
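As a reminder of how the per-class figures are defined, the sketch below computes precision, recall, and F1 score from a 2 x 2 table of counts. The counts are hypothetical: the confusion matrix is not given in the text, and the values here were chosen only because they match the supports (10 and 29) and reproduce the rounded per-class percentages quoted above. The overall figures are computed here as support-weighted averages, which is an assumption about how they were obtained.

```python
def per_class_metrics(tp, fp, fn):
    """Precision, recall and F1 score from true/false positive and false negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts (NOT from the source): rows = true class, columns = predicted class.
confusion = [[9, 1],    # true status 0 (healthy): 9 predicted 0, 1 predicted 1
             [2, 27]]   # true status 1 (PD):      2 predicted 0, 27 predicted 1

metrics = {}
for k in (0, 1):
    tp = confusion[k][k]        # correctly predicted as class k
    fp = confusion[1 - k][k]    # other class predicted as k
    fn = confusion[k][1 - k]    # class k predicted as the other class
    support = sum(confusion[k])
    metrics[k] = (*per_class_metrics(tp, fp, fn), support)

# Support-weighted averages of precision, recall and F1 over both classes.
total = sum(m[3] for m in metrics.values())
weighted = [sum(m[i] * m[3] for m in metrics.values()) / total for i in range(3)]

for k, (p, r, f1, s) in metrics.items():
    print(k, round(100 * p), round(100 * r), round(100 * f1), s)
print("weighted", [round(100 * w) for w in weighted])
```

With these illustrative counts the rounded per-class values are 82/90/86 for status 0 and 96/93/95 for status 1, and the weighted averages round to 93/92/92.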

Performance comparison of PDCD model with other approaches
The proposed model was compared with other well-established classification techniques (Han et al. 2011): K-nearest neighbor (KNN) (Gupta et al. 2018), support vector machine (SVM) (Pan et al. 2012; Mandal and Sairam 2014; Tiwari 2016; Gao et al. 2018), linear discriminant analysis (LDA) (Dastgheib et al. 2012), decision tree (DT) (Oppedal et al. 2015; Yadav et al. 2012), classification and regression tree (CART) (Ramani and Sivagami 2011), random forest (RF) (Oppedal et al. 2015), linear regression (LR) (Ramani and Sivagami 2011), logistic regression (LogR) (Ramani and Sivagami 2011), multi-layer perceptron (MLP) (Pan et al. 2012; Tiwari 2016), and Naive Bayes (NB) (Morales et al. 2013). The comparative analysis is shown in Table 5. The abbreviations used in Table 5 are as follows. KNN: K-nearest neighbor; SVM: support vector machine; LDA: linear discriminant analysis; DT: decision tree; CART: classification and regression tree; RF: random forest; LR: linear regression; LogR: logistic regression; MLP: multi-layer perceptron; NB: Naive Bayes. As shown in Table 5, our proposed model has the highest accuracy, 94%, and MLP achieves the second-highest accuracy. The precision of our proposed method is the highest among all the methods; MLP, KNN, and LDA also achieve good precision. Similarly, the recall of our proposed method is the highest, and MLP achieves a recall equal to it.
As shown in Fig. 2, the accuracy of the proposed method PDCD is the highest among all the methods. MLP, KNN, and LDA also have good accuracy, and NB has the lowest accuracy.
A comparative analysis of precision is shown in Fig. 3. The precision of the proposed method PDCD is the highest among all the methods. MLP, LR, KNN, and LDA also show good precision, and logistic regression has the lowest precision. Figure 4 shows that the recall of the proposed method is the highest among all the methods. MLP, KNN, and LDA also achieve good recall, and NB has the lowest recall. Figure 5 shows that the F1 score of the proposed method PDCD is the highest among all the methods. MLP, KNN, LR, and LDA also have relatively good F1 scores, and NB and LogR have the lowest F1 scores.

Effect of different parameters on model
We performed many trials with different types of parameters in the model. From these trials, we concluded that an increase in the number of layers enhances the accuracy of the model. The effect of different parameters on the performance of the model is given in Table 6.
As the number of layers increases, the number of nodes and the number of parameters also increase, and in our trials the accuracy of the model increased with them. Increasing the number of layers together with the number of epochs likewise increased the accuracy. Although trial runs of the model were made with up to 100 layers, we explored accuracy only up to 11 layers. In summary, we observed that the number of layers is directly proportional to the number of nodes and to the accuracy of the model, and that the number of parameters grows with the number of layers, which also increases accuracy. Let the number of layers in the model be denoted by N_L, the number of nodes by N_Nodes, the number of parameters by P, and the number of epochs by E.
The relationships between different parameters are as shown in Eqs. 1, 2, 3, and 4. Here, accuracy indicates the accuracy of the model.

Conclusions
Deep learning is now widely used in the prediction and diagnosis of many diseases. It has replaced many earlier state-of-the-art methods because of its reliability, accuracy, and great potential for dealing with data. Deep learning requires less data pre-processing than classical machine learning, since the model itself takes care of much of the filtering and normalization. Deep learning has a promising future, especially in engineering and the health sector. In this paper, we designed a deep neural network-based model for the classification of PD. The proposed model has an accuracy of 94.87%, which compares favorably with other classification techniques. Different parameters were tested, and the results indicate that deep neural networks can be used effectively for the early diagnosis of Parkinson's disease. This study concludes that a DNN is a better approach for the classification and prediction of diseases.

Appendix A: Proposed algorithm
In this sub-section, we discuss the proposed algorithm, which classifies each subject with Status 1, indicating that the subject is suffering from Parkinson's, or Status 0, indicating that the subject is healthy (Table 7).

Data availability Enquiries about data availability should be directed to the authors.

Declarations
Conflict of interest None.
Human participants and animals No human and animal participants were used.