All data for this study were obtained from the Department of Neurosurgery, Fujian Medical University Union Hospital. The study recorded the medical variables of patients who were hospitalized and underwent TMD between January 2018 and January 2021, including basic information, medical history, physical examination findings, preoperative test results, and preoperative scores. The data were analyzed retrospectively, and machine learning and deep learning algorithms were used to establish a predictive model of one-year postoperative recovery in patients with lumbar disc herniation (LDH).
Inclusion criteria
(1) age at inclusion of 12–85 years; (2) typical sciatica, with or without low back pain and other symptoms; (3) failure of standardized conservative treatment for more than 3 months with serious impact on daily life, or severe pain, cauda equina dysfunction, loss of muscle strength, muscle atrophy, or similar symptoms; (4) straight leg raising test on the affected side ≤ 70°; (5) lumbar disc protrusion confirmed by CT and MRI, with the location of the protrusion matching the corresponding neurological symptoms; (6) treatment with PMTM technology.
Exclusion criteria
(1) missing imaging data or inability to complete follow-up as required; (2) segmental lumbar instability suggested by anteroposterior, lateral, and hyperextension–hyperflexion lumbar X-rays; (3) other serious physical, psychological, or mental diseases; (4) rheumatic immune diseases that may cause similar symptoms; (5) concurrent participation in other clinical trials.
Data collection
To construct and validate the prognostic model, we retrospectively collected clinical data from patients with LDH who met the inclusion and exclusion criteria. The potential predictors comprised 43 variables covering patients' medical history, examinations, and preoperative test results, with the cure rate of the lumbar Japanese Orthopedic Association (JOA) score one year after TMD as the outcome measure.
The following variables were included as factors in the analysis: age, gender, height, weight, body mass index (BMI), high-risk occupation (occupations requiring prolonged sitting or high-intensity physical activity), family history (first-degree relatives affected by LDH), history of lumbar trauma, duration of disease, duration of preoperative conservative treatment, duration of preoperative pain medication, low back pain, underlying diseases (hypertension, diabetes), smoking history, history of alcohol abuse, angle on preoperative physical examination (measured by the straight leg raise test), sensory impairment, muscle strength classification of the affected limb, Barthel scale, serum creatine kinase (CK) and serum albumin (ALB) levels, lumbar degeneration, associated lumbar disc herniation, American Society of Anesthesiologists (ASA) grade, Oswestry Disability Index (ODI) score, preoperative low back pain and leg pain numerical rating scale (NRS) scores, preoperative JOA score, number of surgical segments, operative time, and intraoperative bleeding. These are shown in Table 1. The cure rate of the lumbar JOA score 1 year after TMD was used as the outcome measure. Further details on these factors are provided in Supplementary File 1.
Table 1
Descriptive statistics of the influencing factors in the study population, grouped by whether the improvement in the lumbar JOA score was > 60% one year after TMD.

| Variables | Total (n = 273) | JOA ≤ 60% (n = 117) | JOA > 60% (n = 156) | p |
|---|---|---|---|---|
| gender, n (%) | | | | 0.82 |
| female | 127 (47) | 53 (45) | 74 (47) | |
| male | 146 (53) | 64 (55) | 82 (53) | |
| age (years), n (%) | | | | < 0.001 |
| < 40 | 58 (21) | 15 (13) | 43 (28) | |
| 40–60 | 130 (48) | 52 (44) | 78 (50) | |
| > 60 | 85 (31) | 50 (43) | 35 (22) | |
| height (m), n (%) | | | | < 0.001 |
| < 1.6 | 75 (27) | 44 (38) | 31 (20) | |
| 1.6–1.7 | 112 (41) | 34 (29) | 78 (50) | |
| > 1.7 | 86 (32) | 39 (33) | 47 (30) | |
| weight (kg), n (%) | | | | 0.663 |
| < 50 | 14 (5) | 7 (6) | 7 (4) | |
| 50–60 | 70 (26) | 26 (22) | 44 (28) | |
| 60–70 | 100 (37) | 45 (38) | 55 (35) | |
| 70–80 | 59 (22) | 23 (20) | 36 (23) | |
| 80–90 | 24 (9) | 13 (11) | 11 (7) | |
| > 90 | 6 (2) | 3 (3) | 3 (2) | |
| BMI, n (%) | | | | 0.089 |
| < 18.5 | 6 (2) | 2 (2) | 4 (3) | |
| 18.5–24 | 130 (48) | 46 (39) | 84 (54) | |
| 24–28 | 111 (41) | 56 (48) | 55 (35) | |
| > 28 | 26 (10) | 13 (11) | 13 (8) | |
| trauma, n (%) | | | | 0.763 |
| no | 262 (96) | 113 (97) | 149 (96) | |
| yes | 11 (4) | 4 (3) | 7 (4) | |
| hypertension, n (%) | | | | 0.021 |
| no | 211 (77) | 82 (70) | 129 (83) | |
| yes | 62 (23) | 35 (30) | 27 (17) | |
| diabetes, n (%) | | | | 0.027 |
| no | 245 (90) | 99 (85) | 146 (94) | |
| yes | 28 (10) | 18 (15) | 10 (6) | |
| bibulosity, n (%) | | | | 0.005 |
| no | 253 (93) | 102 (87) | 151 (97) | |
| yes | 20 (7) | 15 (13) | 5 (3) | |
| smoke, n (%) | | | | 0.854 |
| no | 208 (76) | 88 (75) | 120 (77) | |
| yes | 65 (24) | 29 (25) | 36 (23) | |
| ASA, n (%) | | | | 0.002 |
| 1 | 194 (71) | 72 (62) | 122 (78) | |
| 2 | 70 (26) | 37 (32) | 33 (21) | |
| 3 | 8 (3) | 7 (6) | 1 (1) | |
| acesodyne, n (%) | | | | 0.059 |
| no | 168 (62) | 64 (55) | 104 (67) | |
| yes | 105 (38) | 53 (45) | 52 (33) | |
| hormone, n (%) | | | | 0.701 |
| no | 261 (96) | 113 (97) | 148 (95) | |
| yes | 12 (4) | 4 (3) | 8 (5) | |
| CTT (months), n (%) | | | | 0.096 |
| < 3 | 119 (44) | 46 (39) | 73 (47) | |
| 3–6 | 46 (17) | 25 (21) | 21 (13) | |
| 6–12 | 35 (13) | 16 (14) | 19 (12) | |
| 12–24 | 22 (8) | 5 (4) | 17 (11) | |
| > 24 | 51 (19) | 25 (21) | 26 (17) | |
| WLPT (months), n (%) | | | | 0.068 |
| < 3 | 85 (31) | 35 (30) | 50 (32) | |
| 3–6 | 38 (14) | 17 (15) | 21 (13) | |
| 6–12 | 40 (15) | 18 (15) | 22 (14) | |
| 12–24 | 24 (9) | 4 (3) | 20 (13) | |
| > 24 | 86 (32) | 43 (37) | 43 (28) | |
| lumbago, n (%) | | | | 0.137 |
| no | 72 (26) | 25 (21) | 47 (30) | |
| yes | 201 (74) | 92 (79) | 109 (70) | |
| SLETA (°), n (%) | | | | 0.496 |
| < 40 | 73 (27) | 27 (23) | 46 (29) | |
| 40–60 | 118 (43) | 53 (45) | 65 (42) | |
| > 60 | 82 (30) | 37 (32) | 45 (29) | |
| DOS, n (%) | | | | 0.008 |
| none | 144 (53) | 53 (45) | 91 (58) | |
| mild | 103 (38) | 46 (39) | 57 (37) | |
| obvious | 26 (10) | 18 (15) | 8 (5) | |
| MS, n (%) | | | | 0.158 |
| 1 | 1 (0) | 1 (1) | 0 (0) | |
| 2 | 1 (0) | 1 (1) | 0 (0) | |
| 3 | 3 (1) | 2 (2) | 1 (1) | |
| 4 | 68 (25) | 34 (29) | 34 (22) | |
| 5 | 199 (73) | 79 (68) | 120 (77) | |
| Babinski, n (%) | | | | 0.105 |
| negative | 263 (96) | 110 (94) | 153 (98) | |
| positive | 10 (4) | 7 (6) | 3 (2) | |
| CK, n (%) | | | | 0.973 |
| ≤ 198 | 253 (93) | 109 (93) | 144 (92) | |
| > 198 | 20 (7) | 8 (7) | 12 (8) | |
| Albumin, n (%) | | | | 1 |
| ≤ 35 | 259 (95) | 111 (95) | 148 (95) | |
| > 35 | 14 (5) | 6 (5) | 8 (5) | |
| Number, n (%) | | | | 0.013 |
| 1 | 129 (47) | 44 (38) | 85 (54) | |
| 2 | 95 (35) | 45 (38) | 50 (32) | |
| 3 | 29 (11) | 14 (12) | 15 (10) | |
| 4 | 16 (6) | 12 (10) | 4 (3) | |
| 5 | 4 (1) | 2 (2) | 2 (1) | |
| SSN, n (%) | | | | 0.095 |
| 1 | 3 (1) | 2 (2) | 1 (1) | |
| 2 | 2 (1) | 1 (1) | 1 (1) | |
| 3 | 22 (8) | 14 (12) | 8 (5) | |
| 4 | 158 (58) | 69 (59) | 89 (57) | |
| 5 | 88 (32) | 31 (26) | 57 (37) | |
| segments, n (%) | | | | 0.023 |
| 1 | 263 (96) | 109 (93) | 154 (99) | |
| 2 | 8 (3) | 7 (6) | 1 (1) | |
| 3 | 2 (1) | 1 (1) | 1 (1) | |
| Operation time, n (%) | | | | 0.177 |
| ≤ 2 | 85 (31) | 32 (27) | 53 (34) | |
| 2–3 | 121 (44) | 50 (43) | 71 (46) | |
| > 3 | 67 (25) | 35 (30) | 32 (21) | |
| Collapse, n (%) | | | | 0.014 |
| no | 195 (71) | 74 (63) | 121 (78) | |
| yes | 78 (29) | 43 (37) | 35 (22) | |
| LS, n (%) | | | | 0.169 |
| no | 253 (93) | 105 (90) | 148 (95) | |
| yes | 20 (7) | 12 (10) | 8 (5) | |
| Osteoporosis, n (%) | | | | < 0.001 |
| no | 227 (83) | 83 (71) | 144 (92) | |
| yes | 46 (17) | 34 (29) | 12 (8) | |
| Calcification, n (%) | | | | 0.02 |
| no | 126 (46) | 44 (38) | 82 (53) | |
| yes | 147 (54) | 73 (62) | 74 (47) | |
| SD, n (%) | | | | 0.207 |
| ≤ 1.5 | 111 (41) | 42 (36) | 69 (44) | |
| > 1.5 | 162 (59) | 75 (64) | 87 (56) | |
| position, n (%) | | | | 0.136 |
| -2 | 21 (8) | 8 (7) | 13 (9) | |
| -1 | 116 (46) | 44 (40) | 72 (50) | |
| 0 | 117 (46) | 59 (53) | 58 (41) | |
| location, n (%) | | | | 0.827 |
| 1 | 117 (43) | 52 (44) | 65 (42) | |
| 2 | 108 (40) | 47 (40) | 61 (39) | |
| 3 | 44 (16) | 17 (15) | 27 (17) | |
| 4 | 4 (1) | 1 (1) | 3 (2) | |
| Grade, n (%) | | | | 0.612 |
| 1 | 101 (37) | 41 (35) | 60 (38) | |
| 2 | 112 (41) | 47 (40) | 65 (42) | |
| 3 | 60 (22) | 29 (25) | 31 (20) | |
| Modic change, n (%) | | | | 0.978 |
| 1 | 47 (28) | 22 (28) | 25 (28) | |
| 2 | 55 (33) | 26 (33) | 29 (32) | |
| 3 | 66 (39) | 30 (38) | 36 (40) | |
| Pfirrmann, n (%) | | | | < 0.001 |
| 1 | 2 (1) | 0 (0) | 2 (1) | |
| 2 | 29 (11) | 10 (9) | 19 (12) | |
| 3 | 102 (37) | 32 (27) | 70 (45) | |
| 4 | 89 (33) | 37 (32) | 52 (33) | |
| 5 | 51 (19) | 38 (32) | 13 (8) | |
| lumbago NRS, n (%) | | | | < 0.001 |
| 0–2 | 85 (31) | 26 (22) | 59 (38) | |
| 3–4 | 121 (44) | 50 (43) | 71 (46) | |
| 5–6 | 62 (23) | 39 (33) | 23 (15) | |
| 7–8 | 5 (2) | 2 (2) | 3 (2) | |
| leg pain NRS, n (%) | | | | 0.524 |
| 0–2 | 6 (2) | 2 (2) | 4 (3) | |
| 3–4 | 92 (34) | 35 (30) | 57 (37) | |
| 5–6 | 144 (53) | 64 (55) | 80 (51) | |
| 7–8 | 31 (11) | 16 (14) | 15 (10) | |
| preoperative JOA, n (%) | | | | < 0.001 |
| < 10 | 58 (21) | 39 (33) | 19 (12) | |
| 10–15 | 166 (61) | 60 (51) | 106 (68) | |
| 16–24 | 49 (18) | 18 (15) | 31 (20) | |
| preoperative ODI, n (%) | | | | < 0.001 |
| 0–20 | 2 (1) | 1 (1) | 1 (1) | |
| 21–40 | 57 (21) | 18 (16) | 39 (25) | |
| 41–60 | 126 (47) | 41 (36) | 85 (54) | |
| 61–80 | 61 (23) | 39 (35) | 22 (14) | |
| 81–100 | 23 (9) | 14 (12) | 9 (6) | |
| occupation, n (%) | | | | < 0.001 |
| no | 145 (53) | 44 (38) | 101 (65) | |
| yes | 128 (47) | 73 (62) | 55 (35) | |
| Numbness after, n (%) | | | | < 0.001 |
| no | 141 (52) | 38 (32) | 103 (66) | |
| yes | 132 (48) | 79 (68) | 53 (34) | |
| Reduction of lumbago, n (%) | | | | 0.112 |
| no | 140 (51) | 67 (57) | 73 (47) | |
| yes | 133 (49) | 50 (43) | 83 (53) | |
| Reduction of leg, n (%) | | | | < 0.001 |
| no | 90 (33) | 66 (56) | 24 (15) | |
| yes | 183 (67) | 51 (44) | 132 (85) | |
| JOA improvement, n (%) | | | | < 0.001 |
| no | 20 (7) | 20 (17) | 0 (0) | |
| yes | 253 (93) | 97 (83) | 156 (100) | |
| ODI difference, n (%) | | | | < 0.001 |
| no | 45 (16) | 34 (29) | 11 (7) | |
| yes | 228 (84) | 83 (71) | 145 (93) | |
| reoperation, n (%) | | | | < 0.001 |
| no | 258 (95) | 102 (87) | 156 (100) | |
| yes | 15 (5) | 15 (13) | 0 (0) | |
| Proximal lumbar process, n (%) | | | | < 0.001 |
| 0 | 264 (97) | 108 (92) | 156 (100) | |
| 1 | 9 (3) | 9 (8) | 0 (0) | |
| Recurrence, n (%) | | | | < 0.001 |
| 0 | 249 (91) | 93 (79) | 156 (100) | |
| 1 | 24 (9) | 24 (21) | 0 (0) | |

Abbreviations: CTT, conservative treatment time; WLPT, waist and leg pain time; SLETA, straight leg elevation test angle of the affected limb; DOS, disturbance of sensation; MS, muscle strength; Number, number of herniated segments; SSN, surgical segment number; segments, number of operative segments; Collapse, collapse of the intervertebral space; LS, lumbar spondylolisthesis; Calcification, calcification of ligaments / hyperplasia of bone; SD, sagittal diameter; position, sagittal horizontal position of the disc herniation; location, transected herniated disc location; Grade, grading of the transected disc herniation; Numbness after, numbness within 1 year after surgery; Reduction of lumbago, reduction of lumbago NRS ≥ 2 at 1 year after surgery; Reduction of leg, reduction of leg pain NRS > 2 at 1 year after surgery; JOA improvement, JOA improvement rate ≥ 25% at 1 year after surgery; ODI difference, ODI difference > 20 at 1 year after surgery; Proximal lumbar process, proximal lumbar process within 1 year after surgery; Recurrence, recurrence within 1 year after surgery.
Outcome indicators
Cure rate scores for the lumbar JOA score at one year after TMD were calculated with the JOA score assessed in the same way as before the operation. The cure rate was calculated as follows: cure rate = [(post-treatment score − pre-treatment score) ÷ (full score of 29 − pre-treatment score)] × 100%. This rate reflects the improvement in lumbar spine function before and after treatment and is used to evaluate the clinical efficacy of the intervention. A cure rate of 100% indicates complete recovery, and a cure rate greater than 60% is considered significantly effective; improvement rates of 25–60% are categorized as effective, and those below 25% as ineffective. For modeling, patients with an improvement rate of the lumbar JOA score > 60% (cured or significantly effective) one year after TMD were coded as 1, and patients with an improvement rate ≤ 60% (effective but not significant, or ineffective) were coded as 0.
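For concreteness, this binarization can be expressed in a few lines of Python; the following is a minimal sketch, assuming a pandas DataFrame with hypothetical columns pre_joa and post_joa holding the pre- and post-treatment lumbar JOA scores.

```python
import pandas as pd

def joa_cure_rate(pre, post, full_score=29):
    """Cure rate of the lumbar JOA score: observed improvement divided by
    the maximum possible improvement, expressed as a percentage."""
    return (post - pre) / (full_score - pre) * 100

# Hypothetical example: three patients with pre- and post-treatment JOA scores.
df = pd.DataFrame({"pre_joa": [8, 14, 20], "post_joa": [25, 27, 22]})
df["cure_rate"] = joa_cure_rate(df["pre_joa"], df["post_joa"])
# Outcome coding used in this study: 1 if cure rate > 60%, else 0.
df["outcome"] = (df["cure_rate"] > 60).astype(int)
print(df)
```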
Feature Engineering
Feature engineering is the process of transforming raw data into features that are more suitable for modeling. The resulting features capture relevant patterns and thereby improve the predictive accuracy of machine learning and deep learning models on unseen data10. In this study, feature engineering comprised data preprocessing and feature selection. Specifically, missing values were filled in using mean imputation11, 12, and a Spearman correlation matrix heat map was generated to assess the degree of correlation among the features. If a large number of features are highly correlated, feature selection is performed to improve model performance.
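A minimal sketch of the imputation step, assuming the raw features sit in a hypothetical array or DataFrame X_raw with missing entries stored as NaN:

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Mean imputation: every missing entry in a column is replaced by that
# column's mean, computed from the observed values.
imputer = SimpleImputer(missing_values=np.nan, strategy="mean")
X = imputer.fit_transform(X_raw)  # X_raw: raw feature matrix with NaNs
```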
Spearman ρ correlation matrix heat map
We conducted a correlation analysis of the data using a Spearman ρ correlation matrix heat map13. The Spearman correlation matrix is suitable for data that do not follow a normal distribution, as well as for data containing categorical variables. It measures the correlation between any two variables, with a value of +1 indicating a perfect positive correlation, −1 a perfect negative correlation, and 0 no correlation. The results can be visualized as a heat map, in which color encodes the magnitude of the correlation, making the results easier and more intuitive to interpret.
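The heat map can be produced directly from a DataFrame; the sketch below assumes the imputed features are held in a DataFrame df and uses seaborn for plotting (the paper does not name a plotting library, so this choice is an assumption).

```python
import seaborn as sns
import matplotlib.pyplot as plt

# Spearman rho for every pair of variables; values lie in [-1, 1].
corr = df.corr(method="spearman")

plt.figure(figsize=(12, 10))
sns.heatmap(corr, cmap="coolwarm", vmin=-1, vmax=1, center=0,
            square=True, linewidths=0.5)
plt.title("Spearman correlation matrix")
plt.tight_layout()
plt.show()
```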
Machine learning and deep learning
We employed a systematic framework based on machine learning and deep learning to construct prognostic models. To this end, we divided the data into a training dataset for developing the predictive models and a test dataset for evaluating their accuracy14. The data were randomly split in a ratio of approximately 70:30, with 69.9% (191/273) of the samples designated as the training set and 30.1% (82/273) as the test set. Once the training set was defined, an optimal model was developed using nine machine learning algorithms: Decision Tree (DT), Random Forest (RF), Extreme Gradient Boosting (XGBoost), Support Vector Machine (SVM), Naive Bayes (NB), K-Nearest Neighbours (KNN), Ridge Regression (logistic regression with L2 regularisation), logistic regression without regularisation, and Neural Network (NN).
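The split and the nine candidate learners map directly onto scikit-learn (XGBoost via its own package). The following is a minimal sketch under assumed defaults; the random seed and the use of stratification are our assumptions, not details given in the text, and X and y denote the imputed feature matrix and binary outcome from the steps above.

```python
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from xgboost import XGBClassifier

# X: imputed feature matrix; y: binary outcome (1 if JOA cure rate > 60%).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# The nine algorithms named above, instantiated with defaults;
# hyperparameters are tuned by grid search in the next step.
models = {
    "DT": DecisionTreeClassifier(),
    "RF": RandomForestClassifier(),
    "XGBoost": XGBClassifier(eval_metric="logloss"),
    "SVM": SVC(probability=True),
    "NB": GaussianNB(),
    "KNN": KNeighborsClassifier(),
    "Ridge": LogisticRegression(penalty="l2"),
    # penalty="none" in scikit-learn < 1.2 (use penalty=None in newer versions)
    "LR": LogisticRegression(penalty="none"),
    "NN": MLPClassifier(max_iter=1000),
}
```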
To optimize the accuracy of the predictive models, a grid search was conducted over the hyperparameters of each of the nine ML algorithms. Ten-fold cross-validation was employed: the training dataset was divided into 10 equally sized folds, the model was fitted on 90% of the data, and the remaining fold was used to evaluate its accuracy; this process was repeated 10 times so that each fold served once as the validation set15, 16. The area under the receiver operating characteristic (ROC) curve, also known as the area under the curve (AUC), was used as the primary accuracy metric during the grid search17. The AUC is a performance measure that summarizes the discriminative ability of a learner and is widely used in clinical settings to assess the performance of ML algorithms on test datasets18. In addition to the AUC, accuracy, precision, recall, and F1 score were reported to provide a comprehensive picture of each algorithm's performance17.
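As an illustration of the tuning loop, here is how the grid search with 10-fold cross-validation and AUC scoring looks for one of the learners; the parameter grid is hypothetical, since the actual grids are not reported in the text.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

# Hypothetical grid for the random forest; each algorithm gets its own grid.
param_grid = {"n_estimators": [100, 300, 500], "max_depth": [3, 5, None]}

search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid,
    scoring="roc_auc",  # AUC is the primary metric during the search
    cv=10,              # 10-fold cross-validation on the training set
)
search.fit(X_train, y_train)

# Evaluate the tuned model on the held-out test set.
best = search.best_estimator_
y_pred = best.predict(X_test)
y_prob = best.predict_proba(X_test)[:, 1]
print("AUC      :", roc_auc_score(y_test, y_prob))
print("accuracy :", accuracy_score(y_test, y_pred))
print("precision:", precision_score(y_test, y_pred))
print("recall   :", recall_score(y_test, y_pred))
print("F1       :", f1_score(y_test, y_pred))
```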
The modeling and prediction process for deep learning is similar to that of traditional machine learning; the main difference is that deep learning is end-to-end and automatically extracts high-level features, greatly reducing the reliance on the manual feature engineering required by traditional machine learning7.
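The paper does not specify the network architecture, so the following Keras sketch is purely illustrative of the end-to-end setup: the imputed features go in, and the probability of a > 60% cure rate comes out, with no hand-crafted feature selection in between. Layer sizes, dropout, and training settings are all assumptions.

```python
import tensorflow as tf

# Illustrative feed-forward network; the architecture is an assumption.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(X_train.shape[1],)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # P(cure rate > 60%)
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC(name="auc")])
model.fit(X_train, y_train, validation_split=0.1, epochs=100, batch_size=32)
```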
Statistical analysis
All data analyses were conducted using Python version 3.8.3. Each machine learning algorithm was implemented with the scikit-learn library, with the exception of the XGBoost algorithm, which was implemented with its own dedicated Python package. Table 1 was generated using the tableone package in R.