Probabilistic Graphical Model using Bayesian Networks for Predicting Clinical Outcome after Posterior Decompression in Patients with Cervical Spondylotic Myelopathy

The objective of this study was to develop a probabilistic graphical model (PGM) that gives a personalized prediction of clinical outcome after posterior decompression in patients with cervical spondylotic myelopathy (CSM) who have different clinical conditions, and to use the PGM to identify causal predictors of the outcome.
Methods: We included data from 59 patients who had undergone cervical posterior decompression for CSM. The candidate predictive parameters were age, sex, body mass index, trauma history, symptom duration, preoperative and last Japanese Orthopaedic Association (JOA) scores, gait impairment, claudication, bladder dysfunction, Nurick grade, American Spinal Injury Association (ASIA) grade, smoking, diabetes mellitus, cardiopulmonary disorders, hypertension, stroke, Parkinson disease, dementia, psychiatric disorders, arthritis, ossification of the posterior longitudinal ligament, cord signal change in T1-weighted images, postoperative kyphosis, and cord compression ratio. Statistical and Bayesian network analyses were used to create the PGM and identify predictive factors.
Conclusions: The causal predictors of surgical outcome for CSM were sex, dementia, and preoperative JOA score. Use of the PGM with the Bayesian network may be a useful personalized medicine tool for predicting the outcome for each patient with CSM.


Background
Cervical spondylotic myelopathy (CSM) is a common degenerative spine disorder that can cause spinal cord dysfunction [1]. It is controversial whether decompressive surgery should be performed in patients with CSM [2,3]. The natural history of CSM shows that the symptoms of myelopathy will worsen over time without surgical treatment in 20-60% of patients [4,5]. Some studies have reported that decompressive surgery can arrest the natural history of CSM, halt disease progression, and prevent further neurological decline [6,7]. Therefore, surgical decompression to remove the compressive pathology and increase the space available for the spinal cord is considered to be an acceptable treatment for CSM. However, surgical decompression does not always improve the clinical outcome in patients with CSM.
To address the limitations of surgical decompression, researchers have studied the key factors that predict the surgical outcome for patients with CSM. The independent factors identified as predictors include age, Japanese Orthopaedic Association (JOA) scores, Nurick grade, impaired gait, duration of symptoms, diabetes, psychiatric comorbidities, sex, smoking, changes in T1-weighted MRI (T1MRI) images, and the compression ratio of the spinal cord [1,8-12]. It is unclear which patients will experience meaningful improvement after surgical decompression, and it is a challenge for spine surgeons to predict who is likely to benefit the most from surgical decompression.
A prediction tool would be valuable for creating an expectation of the surgical outcome for individual patients with CSM and for supporting the surgeon's decision. To identify statistically meaningful predictive factors, previous studies have used multivariate logistic regression analyses to strengthen the results of univariate analyses [13]. However, these statistical analyses have several disadvantages. Although predictive factors may correlate significantly with other factors, logistic regression analysis shows only parallel relationships between factors and cannot identify causal interactions [14]. In addition, each outcome of interest must be trained using its own individual model [15]. It is difficult to obtain sufficient data for thorough statistical analysis, and the treatment guidelines for some kinds of diseases are not universally accepted or may change rapidly (e.g., malignant diseases or rare diseases) [15]. Surgeons sometimes ask for expert opinion in difficult cases.
It would be helpful to have a prediction tool to overcome obstacles such as incomplete, insufficient, or missing data and to identify causal relationships between factors. Such a prediction tool may help in the treatment of CSM by improving the probability of a successful treatment outcome. Given the limitations of the regression-based approach, it may not be suitable for risk prediction for individual patients using what-if scenarios and effect-to-cause reasoning in the era of precision and personalized medicine [16].
Using machine learning algorithms, Bayesian networks (BNs) can identify causal relationships between variables and show predictive inferences using probabilistic graphical modeling (PGM) [17]. PGM using BNs can be used to predict risk at the individual patient level and to show multiple outcomes and exposures in a single model [15]. Several recent articles have described the use of BNs in medical prediction models [15,18]. In this study, we applied BNs to facilitate understanding of the interaction between the surgical outcome for CSM and the clinical factors associated with the clinical outcome after posterior decompressive surgery. We hoped to show that a PGM using BNs provides a way to predict the clinical outcome of CSM after posterior decompression and to help the spine surgeon make decisions about the surgical treatment of patients with CSM.

Methods
After receiving approval from our Clinical Research Ethics Board (number: 20200330/30-2020-20/043), the medical records of the 59 consecutive patients who underwent posterior decompressive surgery at our hospital between 2012 and 2016 were reviewed retrospectively. The patients were diagnosed with CSM on the basis of their clinical signs and symptoms of cervical myelopathy and concordant MRI findings of cervical cord compression with or without signal changes because of spondylosis. The 59 patients underwent posterior decompressive surgery with a minimum follow-up of 12 months. No patients had demyelinating disease, tumors, previous cervical surgery, or intradural pathology.
Twenty-five variables were assessed: last follow-up JOA score (LastJOA), sex, age, body mass index (BMI), cervical trauma history, smoking, diabetes mellitus (DM), hypertension, arthritis, psychiatric disorder, stroke, Parkinson's disease, dementia, cardiopulmonary disease, ossification of the posterior longitudinal ligament (OPLL), claudication, gait disturbance, bladder dysfunction, symptom duration before surgery, preoperative JOA score (PreJOA), American Spinal Injury Association (ASIA) grade, Nurick grade, cervical cord signal changes in T1MRI images, postoperative kyphosis (PostKyphosis) identified in a postoperative X-ray, and cord compression ratio observed on the preoperative cervical MRI (CR). The CR was calculated by dividing the sagittal diameter by the transverse diameter of the spinal cord at the most compressed level. A smaller CR value indicated greater cervical compression. Twenty-three of the 25 variables, excluding LastJOA and PostKyphosis, were evaluated preoperatively. Thirteen of these variables are known as clinical risk factors associated with surgical outcome in patients with CSM: T1MRI, claudication, arthritis, cardiopulmonary disease, psychiatric disorder, symptom duration, gait impairment, bladder function, PostKyphosis, smoking, age, PreJOA, and CR [1,10,11,19].
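The CR definition above can be written as a one-line calculation. The following is a minimal sketch; the function name and the diameter values are hypothetical and serve only to show that a smaller CR corresponds to a flatter, more compressed cord.

```python
def compression_ratio(sagittal_mm: float, transverse_mm: float) -> float:
    """Cord compression ratio (CR): sagittal diameter / transverse diameter."""
    if sagittal_mm <= 0 or transverse_mm <= 0:
        raise ValueError("diameters must be positive")
    return sagittal_mm / transverse_mm


# Hypothetical measurements (mm) at the most compressed level, not study data.
mild = compression_ratio(6.0, 12.0)
severe = compression_ratio(3.0, 12.0)
print(f"mild CR = {mild:.2f}, severe CR = {severe:.2f}")
```

With the transverse diameter held fixed, halving the sagittal diameter halves the CR, which is why lower values indicate greater compression.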
We used statistical analysis to search for factors significantly associated with the LastJOA and compared them with the factors identified in the PGM using BN analysis of the 25 variables for the 59 patients enrolled in the present study.

Statistical analysis
To identify factors associated with the LastJOA, we selected meaningful variables using least absolute shrinkage and selection operator (LASSO) analysis. We then used multivariable logistic regression with backward elimination for the variables with > 80% selection rates in the LASSO analysis. All statistical analyses were performed using IBM SPSS Statistics (version 26; IBM Corp., Armonk, NY, USA).
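The text does not state how the LASSO selection rates were computed, and the analysis itself was run in SPSS. One common way to obtain such rates is to refit the LASSO on many bootstrap resamples and count how often each coefficient is non-zero. The sketch below illustrates that idea under explicit assumptions: a linear LASSO fitted by coordinate descent, synthetic data, and an arbitrary penalty `lam`. None of these choices is taken from the study.

```python
import random


def soft_threshold(value, lam):
    """Shrink value toward zero by lam (the LASSO proximal step)."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_fit(X, y, lam, n_iter=50):
    """Minimise (1/2n)*||y - Xb||^2 + lam*||b||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = norm = 0.0
            for i in range(n):
                others = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - others)  # partial residual correlation
                norm += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (norm / n) if norm else 0.0
    return beta


def selection_rates(X, y, lam, n_boot=50, seed=1):
    """Fraction of bootstrap resamples in which each coefficient is non-zero."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        beta = lasso_fit([X[i] for i in idx], [y[i] for i in idx], lam)
        for j in range(p):
            counts[j] += beta[j] != 0.0
    return [c / n_boot for c in counts]


# Synthetic data (assumption): 59 rows, only feature 0 drives the outcome.
data_rng = random.Random(0)
X = [[data_rng.gauss(0, 1) for _ in range(3)] for _ in range(59)]
y = [2.0 * row[0] + data_rng.gauss(0, 1) for row in X]

rates = selection_rates(X, y, lam=0.3)
selected = [j for j, r in enumerate(rates) if r > 0.8]  # the > 80% rule
print(rates, selected)
```

Variables passing the 80% threshold would then go forward to the regression step described above.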

Results
Before surgery, 36 patients (68%) had gait disturbance and 17 patients (28%) had bladder symptoms. ASIA grades D and E were observed in 30 and nine patients, respectively. Hypertension and OPLL were found in 54% and 52% of all patients, respectively. Fewer than one-third of the 59 patients had a comorbidity such as DM, psychiatric disease, stroke, arthritis, cardiopulmonary disease, Parkinson disease, or dementia (Table 1). In the LASSO analysis, the PreJOA, ASIA grade, and psychiatric disorders had selection rates > 80% (97.8%, 88.0%, and 83.0%, respectively). Therefore, we included these three variables in the multiple linear regression, which showed that all were significantly related to the LastJOA (p < 0.0001, 0.0020, and 0.0291, respectively) (Table 2).

BN Analysis
We report the three BN structures (S_1, S_2, and S_3), S_od, and the final structure (S_f) chosen, along with the results of the validation of S_f.

BN with all variables
Among the 12 BNs, the one with the best log-likelihood score (S_1) showed significantly better data fit than the second-best BN (S′), where S_1 and S′ are the first- and second-best BNs, respectively, with 25 variables, and D_1 is the dataset with the same 25 variables for the 59 patients. In the follow-up BN learning comprising the 12 BNs, one BN (S_1) with 25 variables showed significantly better fit (> 99.999%) than the second-best BN. Sex, dementia, psychiatric disorders, and the PreJOA were the plausible direct causes (parents) of the LastJOA, and the ASIA grade was the plausible effect of the LastJOA. Gait impairment was the direct cause (coparent node) for the LastJOA and the ASIA grade (Fig. 2A). Subsequently, 16 variables within the second-degree Markov blanket (MB) of the LastJOA in the first BN (S_1) were selected for the second BANJO analysis. The second BN (S_2) that best fit the dataset with 16 variables and 59 patients is shown in Fig. 2B. The parent variables of the LastJOA in the second BN (S_2) were the same as those in the first BN (S_1) except for psychiatric disease.
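The variable selection between BANJO runs relies on the Markov blanket of the LastJOA, that is, its parents, its children, and the other parents of its children. The sketch below computes a first-degree Markov blanket from a list of directed edges. The edge list is one reading of the S_1 relationships described above (with gait impairment taken as a parent of the ASIA grade) and is an assumption for illustration, not the full 25-variable structure.

```python
def markov_blanket(edges, target):
    """Return parents, children, and coparents of `target` in a DAG.

    `edges` is an iterable of (parent, child) pairs.
    """
    parents = {u for u, v in edges if v == target}
    children = {v for u, v in edges if u == target}
    coparents = {u for u, v in edges if v in children and u != target}
    return parents | children | coparents


# Assumed partial edge list based on the description of S_1.
s1_edges = [
    ("Sex", "LastJOA"),
    ("Dementia", "LastJOA"),
    ("Psychiatric", "LastJOA"),
    ("PreJOA", "LastJOA"),
    ("LastJOA", "ASIA"),
    ("GaitImpairment", "ASIA"),
]

blanket = markov_blanket(s1_edges, "LastJOA")
print(sorted(blanket))
```

Given its Markov blanket, a node is conditionally independent of every other node in the network, which is why restricting later runs to blanket variables keeps the information relevant to the LastJOA.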
In the third BANJO analysis, five variables within the first-degree MB of the LastJOA in S_2 were selected. The 12 BNs in the third BANJO analysis had the same log-likelihood score (-222.6719) and structure.
The final BN (S_3) obtained after the third BANJO analysis is shown in Fig. 3A. In S_3, dementia, sex, and the PreJOA were plausible direct causes (parents) of the LastJOA. Because the Order algorithm summarizes the significantly causal BN structures identified, the most likely summarized structure (S_od; Fig. 3B) is similar to S_3. The Order algorithm yielded the following order as the most probable: gait impairment, the PreJOA, sex, psychiatric disease, ASIA grade, and the LastJOA.
However, the likelihood score of S_od was superior to that of S_3, which suggested that S_od reflected the relationship and causality between the LastJOA and other variables in the current data better than S_3.
Therefore, S_od was selected as the final BN structure (S_f). In S_f (Fig. 3B), the sex, dementia, and PreJOA nodes were direct parents of the LastJOA node. Although gait impairment and ASIA grade correlated with the LastJOA, if the PreJOA was conditioned ("if we knew the PreJOA of the selected patient"), the information about gait impairment and ASIA grade for the patient did not influence the LastJOA. Sex was simultaneously a direct parent of both the dementia and the LastJOA nodes.

Learning causal BN parameters
The parameters (probabilities) of the final BN (S_f) with six variables were learned from a new dataset (denoted as D_3) that contained six variables and 59 patients and had been extracted from the dataset D_1 containing the 25 variables from all of the 59 subjects (Fig. 4A). Changing the conditioned state of the LastJOA from 0 to 2 increased the probabilities of state 2 in the PreJOA, of not having dementia, and of being male (Fig. 4B and C). Sex and the PreJOA were direct parent (causal) variables for the LastJOA.
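Parameter learning for a discrete BN amounts to estimating one conditional probability table (CPT) per node from counts in the data. The text does not specify the estimator, so the sketch below assumes simple frequency counting with additive (Laplace) smoothing. The function name, the `alpha` parameter, and the toy records are hypothetical and are not drawn from D_3.

```python
from collections import Counter


def learn_cpt(records, child, parents, child_states, alpha=1.0):
    """Estimate P(child | parents) from dict records with additive smoothing."""
    joint = Counter()   # counts of (parent configuration, child state)
    margin = Counter()  # counts of parent configuration
    for rec in records:
        key = tuple(rec[p] for p in parents)
        joint[(key, rec[child])] += 1
        margin[key] += 1
    cpt = {}
    for key, total in margin.items():
        denom = total + alpha * len(child_states)
        cpt[key] = {s: (joint[(key, s)] + alpha) / denom for s in child_states}
    return cpt


# Toy records (assumption): Sex 0/1, PreJOA and LastJOA in states 0, 1, 2.
toy = [
    {"Sex": 1, "PreJOA": 2, "LastJOA": 2},
    {"Sex": 1, "PreJOA": 2, "LastJOA": 2},
    {"Sex": 1, "PreJOA": 2, "LastJOA": 1},
    {"Sex": 0, "PreJOA": 0, "LastJOA": 0},
    {"Sex": 0, "PreJOA": 0, "LastJOA": 1},
]

cpt = learn_cpt(toy, "LastJOA", ["Sex", "PreJOA"], child_states=[0, 1, 2])
print(cpt[(1, 2)])
```

Smoothing keeps every probability strictly positive, which matters with only 59 patients because many parent configurations are observed rarely or not at all.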
Conditioning the sex variable from state 0 (female) to state 1 (male) decreased the probability of state 0 and increased the probability of state 2 in the LastJOA (Fig. 5A and B). Men with a high PreJOA had a higher probability of a high LastJOA than did women with the same preoperative state (74% vs. 2%, respectively). This suggested that men with CSM may have a higher probability of a good outcome after posterior decompressive surgery than women with CSM.
Changing the PreJOA from state 0 to 2 increased the probability of state 2 in the LastJOA (Fig. 5C and D). Patients with a high PreJOA had a better outcome after the surgical treatment than those with a low PreJOA. Dementia seemed to predict a poor LastJOA outcome (Fig. 6A and B). With dementia conditioned as state 1, a change in the PreJOA did not influence the LastJOA (Fig. 6C and D). These findings suggest that a LastJOA probability of 76% can be expected in men with CSM who have no dementia and a PreJOA > 12 (Fig. 7A). By contrast, the probability of state 1 or 2 was estimated as 31% for the LastJOA (Fig. 7B).
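The what-if queries reported above, conditioning on some nodes and reading off the updated distribution of another, can be reproduced on a small network by exact enumeration. The sketch below uses a simplified four-node version of S_f with binary states. Every probability in it is a made-up placeholder chosen only to mimic the qualitative pattern in the text; none is a learned parameter from the study.

```python
from itertools import product

VARS = ("Sex", "Dementia", "PreJOA", "LastJOA")

# Placeholder CPTs (assumptions). Sex: 0 = female, 1 = male; others: 0 = low/no, 1 = high/yes.
P_SEX = {0: 0.5, 1: 0.5}
P_PREJOA = {0: 0.5, 1: 0.5}
P_DEM_GIVEN_SEX = {0: {0: 0.90, 1: 0.10}, 1: {0: 0.95, 1: 0.05}}
# P(LastJOA = 1 | Sex, Dementia, PreJOA)
P_HIGH = {
    (0, 0, 0): 0.2, (0, 0, 1): 0.5, (0, 1, 0): 0.1, (0, 1, 1): 0.1,
    (1, 0, 0): 0.3, (1, 0, 1): 0.8, (1, 1, 0): 0.1, (1, 1, 1): 0.1,
}


def joint(sex, dem, pre, last):
    """Joint probability from the factorisation Sex -> Dementia, all three -> LastJOA."""
    p_high = P_HIGH[(sex, dem, pre)]
    p_last = p_high if last == 1 else 1.0 - p_high
    return P_SEX[sex] * P_DEM_GIVEN_SEX[sex][dem] * P_PREJOA[pre] * p_last


def query(target, evidence):
    """Posterior distribution of `target` given `evidence`, by full enumeration."""
    totals = {0: 0.0, 1: 0.0}
    for values in product((0, 1), repeat=len(VARS)):
        world = dict(zip(VARS, values))
        if all(world[k] == v for k, v in evidence.items()):
            totals[world[target]] += joint(*values)
    z = sum(totals.values())
    return {state: p / z for state, p in totals.items()}


# Cause-to-effect: outcome for a man without dementia and a high PreJOA.
forward = query("LastJOA", {"Sex": 1, "Dementia": 0, "PreJOA": 1})
# Effect-to-cause: how observing a high LastJOA shifts belief about sex.
backward = query("Sex", {"LastJOA": 1})
print(forward, backward)
```

The same `query` function answers both directions of reasoning, which is the property that distinguishes a BN from a regression model fitted for a single outcome.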

Validation
We used leave-one-out cross-validation (LOOCV) to evaluate the prediction accuracy of the final BN parameterized by D_3.
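In LOOCV, each patient is held out once, the model is fitted to the remaining patients, and the held-out outcome is predicted. The sketch below shows the procedure with a deliberately simple stand-in predictor (the majority outcome for each predictor value) instead of the BN. The predictor, the function names, and the toy data are assumptions for illustration.

```python
from collections import Counter


def fit_majority(train):
    """Stand-in model: predict the most frequent outcome seen for each x."""
    overall = Counter(y for _, y in train).most_common(1)[0][0]
    by_x = {}
    for x, y in train:
        by_x.setdefault(x, Counter())[y] += 1
    table = {x: counts.most_common(1)[0][0] for x, counts in by_x.items()}
    return lambda x: table.get(x, overall)  # fall back to the overall majority


def loocv_accuracy(data, fit):
    """Hold out each record once, refit on the rest, and score the prediction."""
    correct = 0
    for i, (x, y) in enumerate(data):
        model = fit(data[:i] + data[i + 1:])
        correct += model(x) == y
    return correct / len(data)


# Toy (predictor state, outcome state) pairs, not study data.
toy_data = [(0, 0), (0, 0), (0, 0), (0, 1), (1, 1), (1, 1), (1, 1), (1, 0)]
accuracy = loocv_accuracy(toy_data, fit_majority)
print(accuracy)
```

Because every record serves as a test case exactly once, LOOCV uses small datasets efficiently, at the cost of refitting the model once per patient.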

Discussion
The final BN showed that, among the 24 factors used in the analysis, sex, dementia, the PreJOA, and gait impairment were causal factors associated with the LastJOA. ASIA grade was related to the LastJOA and was a child node of gait impairment. Although the final BN showed that dementia, sex, the PreJOA, ASIA grade, and gait impairment were more closely related to the LastJOA than the other 19 factors, sex, dementia, and the PreJOA were direct causal factors for the LastJOA. In the multivariate analysis, the PreJOA, ASIA grade, and psychiatric disorder were significant predictors of the LastJOA. Having a psychiatric disorder was also a parent node of dementia in S_2. Including clinical information about comorbidity in patients with dementia showed that the comorbidity of having a psychiatric disorder did not influence the LastJOA (Fig. 2B). Therefore, having a psychiatric disorder was not included as a variable in the final BN analysis. The final BN structure reflected the statistical relationships between the LastJOA and three variables: the PreJOA, ASIA grade, and having a psychiatric disorder.
Psychiatric comorbidities, sex, PreJOA, ASIA grade, and gait impairment are significant factors related to surgical outcome [10-12]. Depression or bipolar disorder is significantly associated with clinical outcome assessed with the JOA score [11]. Although no study has reported statistical associations between dementia and surgical outcomes of patients with CSM, having dementia may influence cervical stenosis [23]. Because only four of 59 patients had dementia in the present study, dementia seemed not to be a significant variable in the statistical analysis. The first and second BN structures considered 25 and 16 variables simultaneously and showed that dementia was a direct parent node and having a psychiatric disorder was a parent node only of the LastJOA. In the present study, the final BN structure showed that the LastJOA was significantly related to the three direct causal factors: sex, having dementia, and the PreJOA (Figs. 4-7). As a predictive model, the BN may provide a personalized prediction by estimating the probability of the target outcome according to the change in each factor for each patient.
Although the amount of clinical data is increasing and becoming more complicated, data to explain differences in phenotype are incomplete in the medical field, and there will always be uncertainty when analyzing data. It is difficult to use only logistic regression analysis to describe correlations because of the inability to identify causal relationships between predictors. Therefore, BN analysis, which is graphic and intuitive to the clinician, may help to identify layered and causal correlations between predictors more clearly than a graphical model [13,20]. The BN structure showing the entire network between variables may help to identify the organic relationships with a target variable and the important factors to focus on, and to determine the best ways to improve the clinical outcome. Furthermore, BNs with their associated methods are especially suited for reasoning with uncertainty [24]. Our model allows us to intuitively understand the causal correlations between the factors in the final BN (S_f) of the present study.
Although the present study showed an advantage of BN analysis over conventional analysis, our study has several limitations. First, we tried to identify 24 factors as closely related to the LastJOA as possible. However, only 59 patients had no missing data. Therefore, the number of patients with a full dataset of the considered factors was too small to create a complete predictive model. We are planning a future study using multicenter clinical data. Second, although the mean follow-up time was 33 months, the time to obtain the LastJOA was different. Increasing the number of enrolled patients in a future study will allow us to apply the BN model for each year of follow-up. The imaging status such as severity of the T2 signal change in the cervical cord on MRI was not considered in the present study. Adding this imaging variable analyzed by convolutional neural networks may help to improve the BN model's predictive ability.
The predictive model used in the present study may be controversial from the viewpoint of confidence. However, other statistical analyses or predictive models have not provided greater accuracy or more intuitive results than the final BN with the same number of patients and factors. Using information such as sex, dementia, and the PreJOA, the final BN may help to anticipate the probability of the LastJOA after posterior decompressive surgery for each patient with CSM.

Conclusions
The causal predictors of surgical outcome for CSM were sex, dementia, and preoperative JOA score. Use of the PGM with the Bayesian network may be a useful personalized medicine tool for predicting the outcome for each patient with CSM.

Figure 1 An example of a Bayesian network structure. Variable A influences the likelihood of the LastJOA, which influences variables B and C. Conditional independence between nodes means that, given information about the LastJOA, the probabilities of variables B and C are not influenced by variable A. LastJOA = last follow-up Japanese Orthopaedic Association score
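The conditional independence described in the Figure 1 caption can be checked numerically. The sketch below builds the chain A -> LastJOA -> {B, C} with binary nodes and compares P(B = 1) under different values of A, with and without the LastJOA observed. All probabilities are arbitrary placeholders chosen for illustration.

```python
from itertools import product

# Placeholder CPTs (assumptions) for the structure A -> L -> B and L -> C.
P_A = {0: 0.6, 1: 0.4}
P_L_GIVEN_A = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}
P_B_GIVEN_L = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}
P_C_GIVEN_L = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}}


def joint(a, l, b, c):
    """Joint probability implied by the network factorisation."""
    return P_A[a] * P_L_GIVEN_A[a][l] * P_B_GIVEN_L[l][b] * P_C_GIVEN_L[l][c]


def prob_b1(given):
    """P(B = 1 | given), where `given` maps 'A' and/or 'L' to observed values."""
    num = den = 0.0
    for a, l, b, c in product((0, 1), repeat=4):
        world = {"A": a, "L": l}
        if any(world[k] != v for k, v in given.items()):
            continue
        p = joint(a, l, b, c)
        den += p
        if b == 1:
            num += p
    return num / den


# Without the LastJOA (L) observed, A changes the belief about B.
print(prob_b1({"A": 0}), prob_b1({"A": 1}))
# With L observed, A carries no further information about B.
print(prob_b1({"A": 0, "L": 1}), prob_b1({"A": 1, "L": 1}))
```

The first pair of values differs while the second pair is equal, which is the sense in which the LastJOA blocks the influence of A on B and C.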