Medical researchers have already made considerable progress in diagnosing and predicting disease by comparing and applying a number of machine learning algorithms.
In [5], kyphosis must be addressed according to the patient's needs and current state, since some cases are congenital and some are iatrogenic. Depending on the severity of the curvature, patients who develop kyphosis at a young age may be treated effectively with counseling and exercises; in other cases they may require surgery.
Machine learning methods such as Random Forest (RF), Support Vector Machines (SVM), and Artificial Neural Networks (ANN) were used in [6] to identify and forecast kyphosis disease, and ANN was found to be more accurate than the other two techniques. The authors obtained accuracies of 79.01 percent and 77.76 percent with the SVM-Linear model, 82.72 percent and 85.19 percent with the SVM-RBF model, and 82.72 percent and 85.19 percent with the SVM-Poly model. Using grid search techniques, they found that the ANN (3-6-6-1) configuration performed best, with accuracies of 86.42 and 85.19 percent for 10-fold and 5-fold cross validation, respectively.
The authors in [8] examined the connection between forward postural stability, round shoulders, and progressive kyphosis. These three issues can appear separately or together. A significant correlation was found between cervical lordosis values and thoracic kyphosis values.
The authors of [9] examined adult kyphosis and covered a variety of surgical and non-surgical therapies. Non-surgical treatment includes physical therapy, which has changed little over the years and is used to lessen symptoms. Surgical procedures, by contrast, have undergone a significant evolution to improve outcomes. Three presented cases helped to illustrate these surgical procedures for various conditions.
Artificial intelligence and machine learning have contributed numerous advances to spine research, according to Fabio Galbusera et al. [10], including various tools and cutting-edge techniques developed for radiological image segmentation, clinical outcome prediction, and image recognition. The review discusses several decision support systems, computer-aided diagnosis approaches, and open problems, and also responsibly addresses data security and privacy, covering all pertinent issues.
In [11], ML performed admirably and shows great promise in spine care. ML could aid clinical staff in raising the medical standard, increasing work productivity, and lowering adverse events. However, more randomized controlled trials and improvements in interpretability are needed before clinicians will embrace the use of such models in actual practice.
Kavitha et al. [12] modeled an ANN with thirteen inputs, two hidden nodes, and one output for detecting heart disease. ANN was also employed in a study by Noura [13] to predict cardiac disease with an accuracy of 88%. According to Animesh et al. [14], ANN is frequently used in diagnosis and care systems due to its predictive power as a model. Mrudula et al. [15]'s comparison of SVM and ANN for the prediction of heart disease revealed that the ANN model outperformed the SVM model.
Additionally, Tahmooresi et al. [16] evaluated a variety of machine learning techniques for detecting breast cancer, including SVM, ANN, KNN, and DT, carrying out a thorough literature analysis of numerous machine learning methods. They concluded that SVM performed better than all the other models in recognizing breast cancer, with 99.8% accuracy.
In their study, Kuo et al. [17] examined how NB, SVM, LR, DT, and RF models performed in predicting the clinical costs of spinal fusion. They concluded that, compared with the other methods, the RF model performed best in terms of prognostication, with an accuracy of 84.30 percent.
Abdullah et al. [18] used Random Forest & K-Nearest Neighbors models to detect spinal anomalies. They discovered that the K-Nearest Neighbors (KNN) model surpassed the Random Forest (RF) model, which had 79.57 percent accuracy, with an accuracy of 85.32 percent.
In [19], SVM, LR, bagging SVM, and bagging LR models are applied to a dataset of 310 samples that is publicly accessible in the Kaggle repository. Several performance metrics, such as training and testing accuracy, recall, and miss rate, were used to assess the classification of abnormal and normal spinal patients. In addition, precision-recall curves, kernel parameter optimization, and receiver operating characteristic analysis were employed to assess the classifier models. The results show that the observed training accuracies for SVM, LR, bagging SVM, and bagging LR are 86.30%, 85.47%, 86.72%, and 85.06%, respectively, when 78% of the data are used for training. On the test dataset, however, SVM, LR, bagging SVM, and bagging LR all perform equally well, with an accuracy of 86.96%. SVM nevertheless stands out because of its superior recall value and miss rate.
The literature analysis revealed that existing work still suffers from various problems, including limited accuracy and limited scope. Following the review, we applied the chosen classification algorithms to a kyphosis disease dataset taken from the Kaggle repository. In this study, we examined how well the most widely used machine learning and deep learning algorithms, RF, SVM, KNN, and DNN, performed when tuned hyperparameters and stratified K-fold cross validation were used.
Materials and Methods
This section forms the basis of the implementation of the proposed research. It describes the preprocessing steps, followed by a discussion of the final classification algorithms used in the implementation. In the current study, the RF, KNN, SVM, and DNN algorithms are used to build models that identify, from patients' records, whether kyphosis persisted in individuals who previously underwent corrective surgery for the condition. The models' results are then evaluated and analyzed.
Description of the dataset
The kyphosis dataset was obtained from Kaggle [20]. It contains records of patients who underwent a corrective procedure for the kyphosis disorder. The data comprise three inputs (Age, Number, and Start) and one output (Kyphosis). The attributes of the dataset are described in Table 1 below.
Table 1

Characteristic | Explanation
Kyphosis | Whether or not the kyphosis issue persisted after the procedure
Age | Age of the patient in months
Number | Number of vertebrae involved in the procedure
Start | The number of the first (topmost) vertebra operated on
Data Preprocessing
The data were preprocessed using the Scikit-Learn library. As Figure 4 shows, the Kyphosis column was encoded into 0s and 1s using the LabelEncoder. The data were then standardized using the StandardScaler function of the sklearn library to improve the performance of the ML/DL models.
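The two preprocessing steps above can be sketched as follows; the small inline DataFrame is a hypothetical stand-in for the Kaggle file, using the column names from Table 1.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler

# A tiny stand-in sample with the same columns as the Kaggle kyphosis dataset
df = pd.DataFrame({
    "Kyphosis": ["absent", "present", "absent", "present"],
    "Age": [71, 158, 128, 2],
    "Number": [3, 3, 4, 5],
    "Start": [5, 14, 5, 1],
})

# Encode the target column: 'absent'/'present' -> 0/1
le = LabelEncoder()
y = le.fit_transform(df["Kyphosis"])

# Standardize the three numeric inputs to zero mean and unit variance
X = StandardScaler().fit_transform(df[["Age", "Number", "Start"]])
```

After this step, `y` holds the binary labels and `X` holds the standardized feature matrix passed to the classifiers.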
Classification Learning Algorithms
In machine learning and statistics, classification is the process by which a computer program is trained to assign new observations to categories based on the data it is provided. The following learning algorithms (RF, SVM, KNN, and DNN) were used in the proposed models for training and prediction of outcomes.
Random Forest (RF)
Random Forest is an ensemble classifier consisting of a group of decision trees trained on various subsets of the given dataset, which improves predictive accuracy. Rather than relying on a single decision tree, the random forest takes the forecast from each tree and predicts the final output by majority vote [21, 22].
Support Vector Machine (SVM)
SVM is one of the most effective classifiers for linearly separable data. It is supported by sound mathematical foundations and can also handle nonlinear cases by employing a nonlinear basis (kernel) function [23]. The goal of the Support Vector Machine algorithm is to create the best decision boundary or line that partitions the n-dimensional space into classes, so that new data points can be quickly assigned to the correct class [6, 24].
K-Nearest Neighbor (KNN)
The KNN method simply stores the dataset during the training stage; when it receives new data, it assigns that data to the category most similar to it among its nearest neighbors [25, 26].
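A minimal sketch of how the three classical classifiers might be instantiated with scikit-learn; the synthetic data and parameter values here are illustrative placeholders, not the study's actual settings.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.datasets import make_classification

# Synthetic stand-in for the kyphosis features: 3 inputs, binary target
X, y = make_classification(n_samples=80, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)

models = {
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM": SVC(kernel="rbf", random_state=0),
    "KNN": KNeighborsClassifier(n_neighbors=5),
}
for name, model in models.items():
    model.fit(X, y)  # each classifier is trained on the same data
```

Each fitted model then exposes the same `predict` interface, which is what allows the four algorithms to be compared under identical evaluation conditions.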
Deep Neural Network (DNN)
A network with multiple layers and numerous neurons in each layer is known as a DNN. In the forward direction, each layer's output serves as the input to the next layer. An ANN and a deep neural network differ in that the former has one input layer, one output layer, and at most one hidden layer, whereas the latter contains more hidden layers [27]. The proposed work uses three hidden layers with the aim of extracting high-quality features from the dataset. The DNN model was implemented using Keras.
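The three-hidden-layer DNN described above might be sketched in Keras as follows; the layer widths (16-8-4) and activations are assumptions for illustration, not the paper's exact architecture.

```python
from tensorflow import keras
from tensorflow.keras import layers

# 3 inputs (Age, Number, Start), three hidden layers, one sigmoid output
# for the binary kyphosis label. Layer widths are illustrative assumptions.
model = keras.Sequential([
    layers.Input(shape=(3,)),
    layers.Dense(16, activation="relu"),
    layers.Dense(8, activation="relu"),
    layers.Dense(4, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])
```

The sigmoid output yields a probability of kyphosis persisting, which is thresholded at 0.5 for the binary prediction.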
Hyperparameter Tuning & Control
By training a model on existing data, we can fit the model parameters. However, there are other adjustable parameters, called hyperparameters, that cannot be learned directly from the regular training process; they are typically set before the actual training begins. These parameters express important properties of the model, such as its complexity or how fast it should learn [28]. In this work, hyperparameter tuning was performed to select the most effective parameters.
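One common way to perform such tuning is an exhaustive grid search with cross validation, sketched below; the candidate grid and the KNN example are illustrative assumptions, not the values used in the study.

```python
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.datasets import make_classification

# Synthetic stand-in data with the same shape as the kyphosis features
X, y = make_classification(n_samples=80, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)

# Candidate hyperparameter values (illustrative, not the paper's grid)
param_grid = {"n_neighbors": [3, 5, 7, 9],
              "weights": ["uniform", "distance"]}

# Every combination is evaluated with 5-fold cross validation
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
search.fit(X, y)
best_params = search.best_params_
```

`best_params` then holds the combination with the highest mean cross-validation score, and `search.best_estimator_` is the model refit with it.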
Cross validation
K-fold cross validation is applied so that the model generalizes well. It is typically used when the data is scarce or not uniformly distributed [29]. Plain K-fold cross validation, however, can suffer from sampling bias. Stratified K-fold should be preferred over K-fold when handling classification tasks with unbalanced class distributions. As a result, stratified K-fold cross validation was employed to assess the models in the current work.
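The difference can be seen directly in code: stratified splitting preserves the class ratio in every fold. The 64:16 class balance below is a hypothetical example, not the dataset's actual counts.

```python
from sklearn.model_selection import StratifiedKFold
import numpy as np

# A hypothetical unbalanced binary target (64 negatives, 16 positives)
y = np.array([0] * 64 + [1] * 16)
X = np.arange(80).reshape(-1, 1)

# Stratified splitting keeps the 64:16 class ratio in every fold
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
fold_positives = [int(y[test_idx].sum()) for _, test_idx in skf.split(X, y)]
```

With plain `KFold` the positives could cluster into a few folds; stratification guarantees each test fold receives a proportional share of the minority class.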
Various Performance Measures
The following performance indicators are used to evaluate predictions for the stated classification problem using the machine learning and deep learning models [30].
Confusion Matrix
An example of a confusion matrix is shown below. It is simply a table with the dimensions "Actual" and "Predicted" and the entries True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN).
• TP − both the actual class and the predicted class of the data point are 1.
• TN − both the actual class and the predicted class of the data point are 0.
• FP − the actual class of the data point is 0 and the predicted class is 1.
• FN − the actual class of the data point is 1 and the predicted class is 0.
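These four counts can be read directly from scikit-learn's confusion matrix; the labels below are hypothetical, purely to illustrate the layout.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical actual and predicted labels (1 = kyphosis present)
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

# For binary labels sklearn orders the matrix as [[TN, FP], [FN, TP]]
tn, fp, fn, tp = confusion_matrix(actual, predicted).ravel()
```

Note the row/column order: scikit-learn places the negative class first, so `ravel()` yields TN, FP, FN, TP rather than starting with TP.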
Classification Accuracy
Classification accuracy is defined as the number of correct predictions made as a ratio of all predictions made.
Accuracy of classification equals (TP + TN) / (TP + FP + FN + TN)
Precision
Precision, a term originating in document retrieval, is defined as the number of correct positive results returned by the ML/DL models.
Precision equals TP / (TP + FP)
Sensitivity or Recall
Recall is defined as the number of actual positives correctly identified by the ML/DL models.
Recall equals TP / (TP + FN)
Specificity
Specificity, in contrast to recall, is defined as the number of actual negatives correctly identified by the ML/DL models.
Specificity equals TN / (TN + FP)
F1-Score
The F1-score is the harmonic mean of recall and precision, so precision and recall have an equal impact on the score.
F1-score equals 2 x (recall x precision) / (recall + precision)
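The five formulas above translate directly into code; the TP/TN/FP/FN counts below are hypothetical values used only to show the arithmetic.

```python
# Computing the listed metrics directly from confusion-matrix counts
tp, tn, fp, fn = 3, 3, 1, 1  # hypothetical counts

accuracy    = (tp + tn) / (tp + fp + fn + tn)
precision   = tp / (tp + fp)
recall      = tp / (tp + fn)
specificity = tn / (tn + fp)
f1_score    = 2 * (recall * precision) / (recall + precision)
```

With these counts every metric happens to equal 0.75, a reminder that a single number can hide very different error patterns, which is why all five are reported.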
AUC- ROC Score
The AUC-ROC metric indicates how capable the model is of distinguishing between the classes.
Balanced Accuracy Score
The balanced accuracy in binary and multiclass classification problems is used to deal with unbalanced datasets. It is defined as the average of the recall obtained on each class. The best value is 1 and the worst value is 0.
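Both of these last two scores are available in scikit-learn; the labels and probability scores below are hypothetical illustrations.

```python
from sklearn.metrics import roc_auc_score, balanced_accuracy_score

# Hypothetical labels, hard predictions, and predicted probabilities
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]
scores    = [0.9, 0.2, 0.4, 0.8, 0.1, 0.6, 0.7, 0.3]

# AUC-ROC is computed from the predicted probabilities, not the hard labels
auc = roc_auc_score(actual, scores)

# Balanced accuracy is the mean of per-class recall:
# (recall + specificity) / 2 in the binary case
bal_acc = balanced_accuracy_score(actual, predicted)
```

Using the probabilities for AUC-ROC matters: it evaluates the model's ranking of cases across all thresholds, while balanced accuracy evaluates the thresholded predictions.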