This section forms the basis for the implementation of the proposed research work. It first describes the preprocessing steps and then discusses the final classification algorithms specific to the implementation. In the present study, RF, KNN, SVM, and DNN algorithms are used to build models that identify, among patients who have already undergone corrective treatment for kyphosis, those whose records indicate that the condition persisted and that further skilled surgery may be required. The results of the models are then evaluated and compared.
Description of the dataset: The kyphosis dataset was obtained from Kaggle [20]. It contains the records of patients who underwent a corrective procedure for the kyphosis disorder. The data comprise three inputs (Age, Number, and Start) and one output (Kyphosis). The attributes of the dataset are described in Table 1 below:
Table 1. Description of Dataset
| Characteristic | Explanation |
|---|---|
| Kyphosis | Whether or not the kyphosis condition persisted after the procedure |
| Age | Age of the patient in months |
| Number | Number of vertebrae involved in the procedure |
| Start | The number of the first (topmost) vertebra operated on |
Data Preprocessing: The data were preprocessed with the Scikit-Learn library. As Figure 4 illustrates, the Kyphosis attribute was encoded into 0s and 1s using the LabelEncoder. The data were then standardized using the StandardScaler function of the sklearn library to promote better performance of the ML/DL models.
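A minimal sketch of this preprocessing step (the file name kyphosis.csv is an assumption; the column names follow Table 1):

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Load the Kaggle kyphosis dataset (file name assumed)
df = pd.read_csv("kyphosis.csv")

# Encode the Kyphosis target into 0s and 1s (e.g., "absent" -> 0, "present" -> 1)
y = LabelEncoder().fit_transform(df["Kyphosis"])

# Standardize the three input features to zero mean and unit variance
X = StandardScaler().fit_transform(df[["Age", "Number", "Start"]])
```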
Classification Learning Algorithms: Classification in machine learning and statistics is the process by which a computer program learns from the data it is given and then classifies new observations. The following learning algorithms (RF, SVM, KNN, and DNN) were used in the proposed models for training and prediction of outcomes.
Random Forest (RF): Random Forest is a classifier that consists of an ensemble of decision trees built on various subsets of the given dataset, which improves the predictive accuracy over a single tree. Rather than relying on one decision tree, the random forest takes the forecast from each tree and predicts the final output from the majority vote of those predictions [21, 22].
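As an illustration, such a classifier can be instantiated with scikit-learn as sketched below; the parameter values shown are placeholders, not the tuned values reported in Section 5.2:

```python
from sklearn.ensemble import RandomForestClassifier

# An ensemble of decision trees; the final class is the majority vote of the trees
rf = RandomForestClassifier(n_estimators=100, criterion="gini", random_state=42)
rf.fit(X, y)
```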
Support Vector Machine (SVM): SVM is one of the best classifiers and is linear to some degree. It is supported by sound mathematical intuition and can also handle nonlinear scenarios by employing a nonlinear basis function [23]. The goal of the Support Vector Machine algorithm is to create the single decision boundary, or hyperplane, that best separates n-dimensional space into classes, so that a new data point can quickly be assigned to the correct class [6, 24].
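A corresponding scikit-learn sketch; the RBF kernel supplies the nonlinear basis function, and the C and gamma values shown are placeholders:

```python
from sklearn.svm import SVC

# RBF kernel yields a nonlinear decision boundary; C controls regularization strength
svm = SVC(kernel="rbf", C=1.0, gamma="scale")
svm.fit(X, y)
```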
K-Nearest Neighbor (KNN): The KNN method simply stores the dataset during the training stage; once it receives fresh data, it assigns that data to the class whose existing members are most similar to the incoming point [25, 26].
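A minimal scikit-learn sketch; the n_neighbors and Minkowski power parameter p shown are placeholders:

```python
from sklearn.neighbors import KNeighborsClassifier

# A new point is assigned the majority class among its k nearest stored neighbors
knn = KNeighborsClassifier(n_neighbors=5, p=2)  # p=2 -> Euclidean distance
knn.fit(X, y)
```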
Deep Neural Network (DNN): A DNN is a network with multiple layers and numerous neurons in each layer. The layers of a DNN are activated in turn, with each layer's output serving as the next layer's input in the forward direction. An ANN and a deep neural network differ in that the former has one input layer, one output layer, and at most one hidden layer, whereas the latter contains more hidden layers [27]. The proposed work uses three hidden layers with the aim of extracting high-quality features from the dataset. The DNN model was implemented using the Keras API.
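A minimal Keras sketch of such a network with three hidden layers, reusing the standardized inputs X and encoded labels y from the preprocessing step; the layer widths and training settings mirror the tuned values reported in Section 5.5 but are illustrative here:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import Adam

# Deep multilayer perceptron with three hidden layers and dropout regularization
model = Sequential([
    Dense(60, activation="relu", input_dim=3),  # three input features
    Dropout(0.2),
    Dense(35, activation="relu"),
    Dropout(0.2),
    Dense(25, activation="relu"),
    Dense(1, activation="sigmoid"),  # binary output: kyphosis present/absent
])
model.compile(optimizer=Adam(learning_rate=0.0005),
              loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=100, batch_size=10, verbose=0)
```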
Hyperparameter Tuning & Control: By training a model on existing data, we are able to fit the model parameters. However, there are other adjustable parameters, called hyperparameters, that cannot be learned directly from the regular training process. They are usually set before the actual training begins. These parameters express important properties of the model, such as its complexity or how quickly it should learn [28]. In this work, hyperparameter tuning was performed to select the most effective parameters.
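One common way to perform such tuning is an exhaustive grid search scored by cross validation, sketched below for the RF model; the candidate grid is an assumption, not the exact one used in this study:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Score every hyperparameter combination by cross validation and keep the best
param_grid = {"n_estimators": [100, 150, 400], "criterion": ["gini", "entropy"]}
search = GridSearchCV(RandomForestClassifier(random_state=42), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_)
```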
Cross validation: K-fold cross validation is applied so that the model learns to generalize well. It is typically applied when one does not have enough data or when the data are not uniformly distributed [29]. Plain K-fold cross validation, however, suffers from sampling bias: stratified K-fold should be preferred over K-fold when handling classification tasks with unbalanced class distributions. As a result, stratified K-fold cross validation was employed to assess the models in the current work.
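A minimal sketch of this evaluation, reusing the rf classifier from above; StratifiedKFold preserves the class proportions of the full dataset in every fold:

```python
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Each fold keeps the same present/absent class ratio as the full dataset
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(rf, X, y, cv=skf, scoring="accuracy")
print(scores, scores.mean(), scores.std())  # per-fold, mean, and std accuracy
```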
Various Performance Measures:
The following performance indicators are used to evaluate the predictions of the machine learning and deep learning models on the stated classification problem [30].
Confusion Matrix: A confusion matrix is simply a table with the dimensions "Actual" and "Predicted," whose cells count the "True Positives (TP)," "True Negatives (TN)," "False Positives (FP)," and "False Negatives (FN)," defined as follows:
• TP − Both the actual class and the predicted class of the data point are 1.
• TN − Both the actual class and the predicted class of the data point are 0.
• FP − The actual class of the data point is 0, but the predicted class is 1.
• FN − The actual class of the data point is 1, but the predicted class is 0.
Classification Accuracy: Classification accuracy is defined as the number of correct predictions as a ratio of all predictions made.
Accuracy = (TP + TN) / (TP + FP + FN + TN)
Precision: Precision, a notion borrowed from document retrieval, is defined as the fraction of results returned by the ML/DL models that are actually correct.
Precision = TP / (TP + FP)
Sensitivity or Recall: Recall is defined as the fraction of actual positives that the ML/DL models return.
Recall = TP / (TP + FN)
Specificity: Specificity, in contrast to recall, is defined as the fraction of actual negatives returned by the ML/DL models.
Specificity = TN / (TN + FP)
F1-Score: The F1-score is the weighted (harmonic) average of recall and precision, so precision and recall have an equal impact on it.
F1-score = 2 × (Recall × Precision) / (Recall + Precision)
AUC-ROC Score: The AUC-ROC metric tells us how capable the model is of distinguishing between the classes.
Balanced Accuracy Score: Balanced accuracy is used in binary and multiclass classification problems to cope with unbalanced datasets. It is defined as the average of the recall obtained on each class. The best value is one and the worst value is zero.
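All of these measures can be computed with scikit-learn. A minimal sketch, assuming y_true and y_pred hold the actual and predicted labels of one of the models:

```python
from sklearn.metrics import (accuracy_score, balanced_accuracy_score,
                             confusion_matrix, f1_score, precision_score,
                             recall_score, roc_auc_score)

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("Accuracy:", accuracy_score(y_true, y_pred))    # (TP+TN)/(TP+FP+FN+TN)
print("Precision:", precision_score(y_true, y_pred))  # TP/(TP+FP)
print("Recall:", recall_score(y_true, y_pred))        # TP/(TP+FN)
print("Specificity:", tn / (tn + fp))                 # TN/(TN+FP)
print("F1-score:", f1_score(y_true, y_pred))          # harmonic mean of P and R
print("AUC-ROC:", roc_auc_score(y_true, y_pred))
print("Balanced accuracy:", balanced_accuracy_score(y_true, y_pred))
```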
Proposed System
We demonstrate how the proposed predictive model's working design combines the major machine learning and deep learning components: hyperparameter tuning and control, and model testing using stratified K-fold cross validation. The proposed system has been used to classify individuals with kyphosis illness and healthy individuals, and the outcomes of the various ML/DL predictive models for kyphosis illness were tested. The popular machine learning models (RF, SVM, KNN) and a deep learning model (DNN) were utilized within the system. The framework of the system is demonstrated in Figure 4.
As outlined in Figure 6, we must first select a dataset related to the patients' kyphosis illness. During the preparation stage, any discrepancies that may have entered the dataset while the data were gathered must be removed. The next step is to select the testing mode and the classification techniques to be applied during implementation. The previously described classification algorithms can then be put to work with the help of the hyperparameter tuning and control procedures.
Finally, we compare each of the developed algorithms to determine which one performs best and offers the highest accuracy when predicting the outcome.
Implementation of Proposed System
The models were implemented in a Google Colab notebook using Python version 3.6.9. The DNN model was implemented with the Keras API; the developed DNN learning model uses a deep multilayer perceptron architecture with regularization and dropout [31]. In this part, the results of the research effort are discussed through an analytical investigation of the chosen dataset and of the ML/DL models built with the RF, SVM, KNN, and DNN algorithms under stratified K-fold (K=5 and K=10) cross validation. After hyperparameter tuning with stratified K-fold cross validation, the results of the machine learning and deep learning models were reported for model evaluation and compared.
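The following sketch illustrates, under assumed parameter grids, how the tuning and evaluation steps can be combined by passing a StratifiedKFold splitter directly to the grid search and then scoring the tuned model fold by fold, as in the tables below:

```python
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.svm import SVC

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)

# Tune hyperparameters using the same stratified splits used for evaluation
grid = {"kernel": ["rbf"], "gamma": [0.01, 0.1, 1], "C": [1, 3, 8]}
search = GridSearchCV(SVC(), grid, cv=skf, scoring="accuracy").fit(X, y)

# Per-fold accuracies of the tuned model, from which mean and std are reported
fold_acc = cross_val_score(search.best_estimator_, X, y, cv=skf)
print(fold_acc, fold_acc.mean(), fold_acc.std())
```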
5.1. Exploratory Analysis: Exploratory analysis showed the absence of kyphosis in 79% of patients and its presence in 21% of patients, as shown in Figure 7. Figure 8 shows a correlation of 0.36 between Kyphosis and the Number attribute. Figure 9 contains frequency plots of the kyphosis condition, present or absent, against the characteristics of the individuals.
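A brief pandas sketch of this kind of exploration, reusing the dataframe df loaded earlier and assuming the raw target labels are "present"/"absent":

```python
# Class balance: roughly 79% absent vs. 21% present (Figure 7)
print(df["Kyphosis"].value_counts(normalize=True))

# Correlation between the encoded target and the numeric features (Figure 8)
df["Kyphosis_enc"] = (df["Kyphosis"] == "present").astype(int)
print(df[["Kyphosis_enc", "Age", "Number", "Start"]].corr())
```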
5.2. Random Forest (RF) Model Based Outcome: The RF model was developed using stratified 5-fold cross validation. The accuracy achieved on each of the K folds is shown in Table 2. As Figure 10 illustrates, Fold-3 discriminated with the highest accuracy. The RF model achieved a mean accuracy of 85.22% with a standard deviation of 0.0485 under stratified 5-fold cross validation. After hyperparameter tuning using stratified 5-fold cross validation, the optimal parameters were selected as n_estimators=400 and criterion=gini.
Table 2. RF Model’s accomplished classification accuracies with stratified 5-fold cross validation
| K-Folds | Accuracy |
|---|---|
| 1 | 82.35% |
| 2 | 81.25% |
| 3 | 93.75% |
| 4 | 81.25% |
| 5 | 87.50% |
| Mean: 85.22% | Std. deviation: 0.0485 |
Following additional training, stratified 10-fold cross validation was carried out on the RF model. Table 3 provides the accuracy achieved on each of the K folds. Fold-5 discriminated with the highest accuracy, as shown in Figure 11. The RF model accomplished a mean accuracy of 83.89% with a standard deviation of 0.0982 under stratified 10-fold cross validation. Given hyperparameter tuning through stratified 10-fold cross validation, the best parameters were selected as n_estimators=150 and criterion=gini.
Table 3. RF Model’s accomplished classification accuracies with stratified 10-fold cross validation
| K-Folds | Accuracy |
|---|---|
| 1 | 88.89% |
| 2 | 87.50% |
| 3 | 62.50% |
| 4 | 75.00% |
| 5 | 100.00% |
| 6 | 87.50% |
| 7 | 87.50% |
| 8 | 75.00% |
| 9 | 87.50% |
| 10 | 87.50% |
| Mean: 83.89% | Std. deviation: 0.0982 |
In addition to classification accuracy, Table 4 provides the other performance measures achieved by the RF model under stratified 5-fold and 10-fold cross validation.
Table 4. RF Model’s accomplished other performance measures with stratified 5-fold & 10-fold cross validation
| K-fold stratified cross validation | Recall/Sensitivity | Specificity | Balanced Accuracy Score | Precision | F1 Score | AUC-ROC Score |
|---|---|---|---|---|---|---|
| 5-fold (K=5) | 0.48 | 0.95 | 0.72 | 0.70 | 0.51 | 0.72 |
| 10-fold (K=10) | 0.35 | 0.96 | 0.65 | 0.60 | 0.43 | 0.65 |
5.3. Support Vector Machine (SVM) Model Based Outcome: The SVM model was created using stratified 5-fold cross validation. Table 5 provides the accuracy achieved on each of the K folds. Fold-3 discriminated with the highest accuracy, as seen in Figure 12. The SVM model achieved a mean accuracy of 85.22% with a standard deviation of 0.0839 under stratified 5-fold cross validation. The best parameters selected after hyperparameter tuning using stratified 5-fold cross validation were kernel=rbf, gamma=0.1, and C=3.
Table 5. SVM Model’s accomplished classification accuracies with stratified 5-fold cross validation
| K-Folds | Accuracy |
|---|---|
| 1 | 82.35% |
| 2 | 75.00% |
| 3 | 100.00% |
| 4 | 81.25% |
| 5 | 87.50% |
| Mean: 85.22% | Std. deviation: 0.0839 |
Additional modeling work and stratified 10-fold cross validation were done on the SVM model. Table 6 provides the accuracy achieved on each of the K folds. Fold-6 discriminated with the highest accuracy, as seen in Figure 13. The SVM model achieved a mean accuracy of 85.14% with a standard deviation of 0.0940 under stratified 10-fold cross validation. The best hyperparameter values were chosen as kernel=rbf, gamma=0.1, and C=8 using stratified 10-fold cross validation.
Table 6. SVM Model’s accomplished classification accuracies with stratified 10-fold cross validation
| K-Folds | Accuracy |
|---|---|
| 1 | 88.89% |
| 2 | 87.50% |
| 3 | 62.50% |
| 4 | 87.50% |
| 5 | 87.50% |
| 6 | 100.00% |
| 7 | 87.50% |
| 8 | 75.00% |
| 9 | 87.50% |
| 10 | 87.50% |
| Mean: 85.14% | Std. deviation: 0.0940 |
Apart from classification accuracy, the other performance measures achieved by the SVM model under stratified 5-fold and 10-fold cross validation are given in Table 7.
Table 7. SVM Model’s accomplished other performance measures with stratified 5-fold & 10-fold cross validation
| K-fold stratified cross validation | Recall/Sensitivity | Specificity | Balanced Accuracy Score | Precision | F1 Score | AUC-ROC Score |
|---|---|---|---|---|---|---|
| 5-fold (K=5) | 0.48 | 0.95 | 0.72 | 0.68 | 0.51 | 0.72 |
| 10-fold (K=10) | 0.55 | 0.94 | 0.75 | 0.68 | 0.54 | 0.75 |
5.4. K-Nearest Neighbor (KNN) Model Based Outcome: The KNN model was created using stratified 5-fold cross validation. Table 8 provides the accuracy achieved on each of the K folds. Fold-3 and Fold-5 discriminated with the highest accuracy, as shown in Figure 14. The KNN model achieved a mean accuracy of 85.22% with a standard deviation of 0.0740 under stratified 5-fold cross validation. The best parameters were chosen as n_neighbors=9 and p=2 after hyperparameter tuning with stratified 5-fold cross validation.
Table 8. KNN Model’s accomplished classification accuracies with stratified 5-fold cross validation
| K-Folds | Accuracy |
|---|---|
| 1 | 82.35% |
| 2 | 75.00% |
| 3 | 93.75% |
| 4 | 81.25% |
| 5 | 93.75% |
| Mean: 85.22% | Std. deviation: 0.0740 |
Following additional preparation of the KNN model, stratified 10-fold cross validation was carried out. Table 9 provides the accuracy obtained on each of the K folds. Figure 15 shows that Fold-2, Fold-4, Fold-5, Fold-6, Fold-7, Fold-9, and Fold-10 discriminated with the highest accuracy. The KNN model accomplished a mean accuracy of 84.03% with a standard deviation of 0.0535 under stratified 10-fold cross validation. Given hyperparameter tuning through stratified 10-fold cross validation, the best parameters were chosen as n_neighbors=11 and p=2.
Table 9. KNN Model’s accomplished classification accuracies with stratified 10-fold cross validation
| K-Folds | Accuracy |
|---|---|
| 1 | 77.78% |
| 2 | 87.50% |
| 3 | 75.00% |
| 4 | 87.50% |
| 5 | 87.50% |
| 6 | 87.50% |
| 7 | 87.50% |
| 8 | 75.00% |
| 9 | 87.50% |
| 10 | 87.50% |
| Mean: 84.03% | Std. deviation: 0.0535 |
In addition to classification accuracy, Table 10 provides the other performance measures achieved by the KNN model under stratified 5-fold and 10-fold cross validation.
Table 10. KNN Model’s accomplished other performance measures with stratified 5-fold & 10-fold cross validation
| K-fold stratified cross validation | Recall/Sensitivity | Specificity | Balanced Accuracy Score | Precision | F1 Score | AUC-ROC Score |
|---|---|---|---|---|---|---|
| 5-fold (K=5) | 0.53 | 0.94 | 0.74 | 0.63 | 0.52 | 0.74 |
| 10-fold (K=10) | 0.50 | 0.94 | 0.72 | 0.55 | 0.46 | 0.72 |
5.5. Deep Neural Network (DNN) Model Based Outcome: The DNN model was created using stratified 5-fold cross validation. The accuracy achieved on each of the K folds is shown in Table 11. As Figure 16 illustrates, Fold-3 discriminated with the highest accuracy. The DNN model achieved a mean accuracy of 87.72% with a standard deviation of 0.0666 under stratified 5-fold cross validation. The best parameters selected by hyperparameter tuning were: learning rate = 0.0005, epochs = 100, batch size = 10, dropout = 0.2, kernel_initializer = "uniform", three inputs, 60, 35, and 25 neurons in the three hidden layers with rectified linear unit (ReLU) activation functions, a sigmoid activation with one neuron in the final layer, and the Adam optimizer.
Table 11. DNN Model’s accomplished classification accuracies with stratified 5-fold cross validation
| K-Folds | Accuracy |
|---|---|
| 1 | 82.35% |
| 2 | 81.25% |
| 3 | 100.00% |
| 4 | 87.50% |
| 5 | 87.50% |
| Mean: 87.72% | Std. deviation: 0.0666 |
Additional training and stratified 10-fold cross validation were done on the DNN model. The accuracy achieved on each of the K folds is shown in Table 12. Fold-10 discriminated with the highest accuracy, as seen in Figure 17. The DNN model achieved a mean accuracy of 87.64% with a standard deviation of 0.0561 under stratified 10-fold cross validation. The best parameters chosen after hyperparameter tuning using stratified 10-fold cross validation were: three inputs, 30, 30, and 25 neurons in the three hidden layers with rectified linear unit (ReLU) activation functions, a sigmoid activation with one neuron in the final layer, and the Adam optimizer.
Table 12. DNN Model’s accomplished classification accuracies with stratified 10-fold cross validation
| K-Folds | Accuracy |
|---|---|
| 1 | 88.89% |
| 2 | 87.50% |
| 3 | 87.50% |
| 4 | 87.50% |
| 5 | 87.50% |
| 6 | 87.50% |
| 7 | 87.50% |
| 8 | 75.00% |
| 9 | 87.50% |
| 10 | 100.00% |
| Mean: 87.64% | Std. deviation: 0.0561 |
In addition to classification accuracy, Table 13 provides the other performance measures achieved by the DNN model under stratified 5-fold and 10-fold cross validation.
Table 13. DNN Model’s accomplished other performance measures with stratified 5-fold & 10-fold cross validation
| K-fold stratified cross validation | Recall/Sensitivity | Specificity | Balanced Accuracy Score | Precision | F1 Score | AUC-ROC Score |
|---|---|---|---|---|---|---|
| 5-fold (K=5) | 0.62 | 0.95 | 0.79 | 0.90 | 0.65 | 0.79 |
| 10-fold (K=10) | 0.55 | 0.97 | 0.76 | 0.70 | 0.57 | 0.76 |