Adel Al-Zebari et al. compared the performance of various machine learning algorithms for diabetes detection. The MATLAB Classification Learner tool was used in this work, covering decision trees, discriminant analysis, SVM (Support Vector Machine), k-NN (k-Nearest Neighbor), logistic regression, and ensemble learners along with their variants; 26 classifiers were considered in total. The results were evaluated using 10-fold cross-validation, with average classification accuracy as the performance measure. Deep neural networks and feature selection techniques can be used in future to increase accuracy [10].
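As an illustration only, the 10-fold evaluation protocol described above can be sketched in Python. The nearest-centroid classifier and the toy data are assumptions made here to keep the sketch self-contained; they are not the MATLAB classifiers or the dataset used in [10].

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def fit_centroids(X, y):
    """Stand-in classifier (assumption): one mean vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    def sq_dist(label):
        return sum((a - b) ** 2 for a, b in zip(x, centroids[label]))
    return min(centroids, key=sq_dist)

def cross_val_accuracy(X, y, k=10, seed=0):
    """Average the held-out accuracy over k folds."""
    scores = []
    for fold in k_fold_indices(len(X), k, seed):
        held_out = set(fold)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(X)) if i not in held_out]
        model = fit_centroids(train_X, train_y)
        correct = sum(predict(model, X[i]) == y[i] for i in fold)
        scores.append(correct / len(fold))
    return sum(scores) / len(scores)

# Hypothetical toy data: two well-separated classes, 40 samples.
X = [[float(i) + (100.0 if i >= 20 else 0.0), float(i % 3)] for i in range(40)]
y = [0 if i < 20 else 1 for i in range(40)]
mean_accuracy = cross_val_accuracy(X, y, k=10)
```

Each sample is held out exactly once, so the averaged score uses every observation for testing without ever testing on data the model was trained on.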
G.A. Pethunachiyar used SVM with different kernel functions for the classification of diabetes. The simulation model of the proposed system includes 5 phases. After the data are collected, a selection process is carried out to rectify errors (inconsistent data, missing values, or wrong information). Next, the data are divided into a training set (70%) and a testing set (30%). SVM was selected for efficient prediction, and the model was built. The test data are then applied to the model to make predictions. Linear, polynomial, and radial kernel based SVMs were implemented in this work. A confusion matrix is used to calculate prediction accuracy, and the ROC (Receiver Operating Characteristic) curve is used to evaluate the 3 kernel functions. The SVM with a linear kernel predicts more accurately than the other kernels [11].
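The split and the accuracy calculation described above can be sketched as follows. This is a minimal Python illustration, not the implementation of [11]: the SVM itself is omitted, and the function names, labels, and predictions are hypothetical values chosen only to show how accuracy follows from the confusion matrix.

```python
import random

def split_70_30(rows, seed=0):
    """Shuffle the rows and return (training 70%, testing 30%)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(0.3 * len(shuffled))
    return shuffled[n_test:], shuffled[:n_test]

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, with 1 = diabetic."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def accuracy_from_counts(tp, fp, fn, tn):
    """Accuracy is the share of correct predictions in the matrix."""
    return (tp + tn) / (tp + fp + fn + tn)

# Hypothetical record ids, labels, and predictions.
train_rows, test_rows = split_70_30(range(10))
y_true = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1, 0, 0]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
acc = accuracy_from_counts(tp, fp, fn, tn)  # (3 + 5) / 10 = 0.8
```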
Pahulpreet Singh Kohli et al. applied various machine learning techniques to 3 different disease datasets for disease prediction. Feature selection is carried out by backward modeling using a p-value test. The proposed model includes 4 phases. First, the dataset is explored in a Python environment. Next, during data munging, missing values are replaced with the mean for continuous variables and the mode for categorical variables. Features are then selected carefully to improve the performance of the model: attributes are eliminated using the backward selection method (based on their p-values), and the model is refitted. After feature selection, 5 algorithms were compared: Decision Tree, Logistic Regression, Random Forest, Adaptive Boosting, and Support Vector Machine. The dataset was divided into a training set (90%) and a test set (10%). In future, the data munging, feature selection, and model fitting steps can be automated; a pipeline structure for preprocessing data would improve results [12].
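The data munging step above (mean for continuous variables, mode for categorical ones) can be illustrated with a short Python sketch. The column names and values are hypothetical, and `None` is assumed here to mark a missing entry; the backward elimination step is not shown.

```python
from statistics import mean, mode

def impute_column(values, kind):
    """Fill missing entries (None) with the mean or the mode of the column."""
    observed = [v for v in values if v is not None]
    fill = mean(observed) if kind == "continuous" else mode(observed)
    return [fill if v is None else v for v in values]

# Hypothetical columns with one missing value each.
glucose = [148.0, None, 183.0, 89.0]   # continuous variable
smoker = ["no", "yes", None, "no"]     # categorical variable

glucose_filled = impute_column(glucose, "continuous")
smoker_filled = impute_column(smoker, "categorical")
```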
Samrat Kumar Dey et al. developed a web application using TensorFlow for the prediction of diabetes. The proposed model requires the patient's data for diagnosis, and techniques such as SVM, ANN (Artificial Neural Network), KNN, and Naive Bayes are used to predict the disease. The dataset is divided into 2 parts: a training set and a testing set. Data preprocessing and normalization increase the accuracy of the model, and Min Max Scaler normalization is used for this purpose. A deep learning model can be adopted in future to predict diabetes [13].
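Min-max normalization rescales each feature to the range [0, 1] using x' = (x - min) / (max - min). A minimal Python sketch follows; the function name and sample values are hypothetical and do not come from [13].

```python
def min_max_scale(column):
    """Rescale a list of numbers linearly so that min -> 0 and max -> 1."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

# Hypothetical feature values.
scaled = min_max_scale([50.0, 100.0, 150.0, 200.0])
```

In practice the minimum and maximum would normally be computed on the training set only and then reused for the testing set, so that no information from the test data leaks into preprocessing.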
Sidong Wei et al. carried out a comprehensive exploration of DNN (Deep Neural Network), logistic regression, SVC (Support Vector Classifier), Naive Bayes, and decision tree techniques for the identification of diabetes. The work was carried out in 4 steps. First, the best preprocessor is identified for the classifier. Next, the parameters are optimized. In the third step, the techniques are compared by accuracy. Finally, the relevance of the features is considered. Features such as plasma glucose concentration, age, and number of times pregnant were found to be more significant [14].
S. Hari Krishnan et al. used machine learning techniques to measure the blood glucose level. A photoplethysmograph (PPG) based system, which uses light sources of 3 different wavelengths, is used to determine the glucose parameters. The skin at the wrist is illuminated, and the reflected light is captured by a photodiode receiver; the resulting signal is conditioned, digitized, and sent to an Arduino UNO microcontroller. The PPG signal is derived by the microcontroller in accordance with the blood glucose values. The waveform is preprocessed and segmented to obtain the peak of the signal. Statistical features such as mean, variance, standard deviation, skewness, kurtosis, and entropy are obtained from the acquired signal, and the random forest technique is applied: the model is designed and trained to estimate blood glucose from the extracted features. Future work would focus on estimating the correlation of the feature sets with different machine learning techniques [15].
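The statistical features listed above can be computed from a signal segment as in the following Python sketch. It is an illustration under stated assumptions: population (not sample) moments are used, entropy is taken as the Shannon entropy of an 8-bin histogram, and a synthetic sine wave stands in for a real PPG segment. None of these choices is confirmed by [15].

```python
import math

def statistical_features(segment, bins=8):
    """Return mean, variance, std, skewness, kurtosis, and histogram entropy."""
    n = len(segment)
    mu = sum(segment) / n
    var = sum((x - mu) ** 2 for x in segment) / n  # population variance
    std = math.sqrt(var)
    if std == 0.0:
        raise ValueError("segment is constant; higher moments are undefined")
    skew = sum((x - mu) ** 3 for x in segment) / (n * std ** 3)
    kurt = sum((x - mu) ** 4 for x in segment) / (n * std ** 4)

    # Shannon entropy (in bits) of the amplitude histogram.
    lo, hi = min(segment), max(segment)
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in segment:
        counts[min(int((x - lo) / width), bins - 1)] += 1
    entropy = -sum((c / n) * math.log2(c / n) for c in counts if c)

    return {"mean": mu, "variance": var, "std": std,
            "skewness": skew, "kurtosis": kurt, "entropy": entropy}

# Synthetic stand-in for one PPG segment: a single sine cycle of 50 samples.
segment = [math.sin(2 * math.pi * i / 50) for i in range(50)]
features = statistical_features(segment)
```

A feature dictionary of this kind, computed per segment, is what a regression model such as a random forest would take as its input vector.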
M. Shanthi et al. proposed and developed a model for diagnosing T2D (type 2 diabetes) using the ELM (Extreme Learning Machine) technique. The ELM mathematical model is a feed-forward network with one hidden layer, whose hidden nodes can be created at random. Initially, the parameters of the hidden nodes are randomly generated. Next, the hidden-layer output matrix is calculated, and the optimal output weights of the network are then obtained. The output is computed from the input characteristics, the input weights, and the activation functions. The available activation functions are triangular basis, sine, hard-limit, and sigmoid. This model assists medical experts in forecasting T2D [16].
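The ELM training procedure outlined above can be sketched in Python as follows. This is a simplified illustration, not the model of [16]: the sigmoid activation, the small ridge term added for numerical stability, the pure-Python linear solver, and the toy data are all assumptions made here.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def hidden_row(W, b, x):
    """Outputs of all hidden nodes for one input vector x."""
    return [sigmoid(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b_j)
            for w, b_j in zip(W, b)]

def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def train_elm(X, y, n_hidden=10, ridge=1e-3, seed=0):
    rng = random.Random(seed)
    n_features = len(X[0])
    # Step 1: random input weights and biases; these are never trained.
    W = [[rng.uniform(-1, 1) for _ in range(n_features)] for _ in range(n_hidden)]
    b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    # Step 2: hidden-layer output matrix H (one row per sample).
    H = [hidden_row(W, b, x) for x in X]
    # Step 3: output weights from regularized least squares,
    # (H^T H + ridge * I) beta = H^T y.
    A = [[sum(h[i] * h[j] for h in H) + (ridge if i == j else 0.0)
          for j in range(n_hidden)] for i in range(n_hidden)]
    rhs = [sum(h[i] * t for h, t in zip(H, y)) for i in range(n_hidden)]
    beta = solve(A, rhs)
    return W, b, beta

def predict_elm(model, x):
    W, b, beta = model
    return sum(h * w for h, w in zip(hidden_row(W, b, x), beta))

# Hypothetical toy data: two binary inputs, label 1 if either input is set.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
y = [0.0, 1.0, 1.0, 1.0]
model = train_elm(X, y)
outputs = [predict_elm(model, x) for x in X]
sse = sum((o - t) ** 2 for o, t in zip(outputs, y))
```

Only the output weights are fitted, and they are obtained in a single linear solve rather than by iterative back-propagation, which is the main reason ELM training is fast.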
Sajratul Yakin Rubaiat et al. introduced an approach to predict type 2 diabetes using a neural network. The analysis is carried out using two methods. The first method involves data recovery followed by feature selection; the selected features are fed to an MLP (multilayer perceptron) neural network classifier. The second method uses the K-means algorithm. The neural network based method involves 3 steps: data recovery (missing data are replaced with the mean value to complete the dataset), feature selection (features with more impact on risk factor identification are selected), and the multilayer perceptron classifier (hyperparameters are selected). K-means reduces noise very effectively, and its output is used as a feature for the model. A model can be trained using either of these 2 methods to predict at an early stage whether or not a person is diabetic. The first method is more efficient and requires less computation than the K-means method [17].
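One way to use the K-means output as a feature, as mentioned above, is to append each sample's cluster index to its feature vector. The following Python sketch illustrates this idea with plain Lloyd iterations; the evenly spaced initialization, the toy points, and the choice of the cluster index as the added feature are assumptions, since the source does not specify these details.

```python
def k_means(points, k, n_iter=20):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    # Simplified deterministic initialization (assumption):
    # take k points spread evenly through the data.
    step = max(len(points) // k, 1)
    centroids = [list(points[min(i * step, len(points) - 1)]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

# Hypothetical toy data: two compact groups of three points each.
points = [[1.0, 1.1], [0.9, 1.0], [1.2, 0.8],
          [5.0, 5.2], [5.1, 4.9], [4.8, 5.0]]
centroids, labels = k_means(points, k=2)

# Append the cluster index as an extra feature for a downstream classifier.
augmented = [p + [float(lab)] for p, lab in zip(points, labels)]
```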
Maham Jahangir et al. presented a novel prediction framework that uses AutoMLP (automatic multilayer perceptron) combined with an outlier detection method. The method involves 2 stages: pre-processing of the data with outlier detection, followed by training of AutoMLP, which is used in the second stage to classify the data. Compared to other neural network architectures, AutoMLP gives higher accuracy. Attributes such as plasma glucose level, blood pressure, and number of times pregnant were found to be more relevant [18].
Ali Mohebbi et al. used CGM (Continuous Glucose Monitoring) signals for adherence detection in diabetic patients. A considerable number of signals were simulated using a T2D-adapted version of the MVP (Medtronic Virtual Patient) model. Logistic regression, Convolutional Neural Network (CNN), and Multi-Layer Perceptron techniques were used in this work, and the classification algorithms were compared using a comprehensive grid search. CNN showed better classification performance than the other techniques [19].