Perhaps the greatest challenge is the selection of the optimal metric(s) for the diagnosis of instabilityPCS. The remainder of this document will address this challenge. Multiple sagittal plane instabilityPCS metrics and interpretation criteria have been used in clinical diagnosis and research studies. Due to the lack of a validated “gold standard” test for instabilityPCS, all prior research studies were “validated” against unvalidated criteria. Elmose et al recently cataloged the use of various spinal instability definitions.(163) Other review papers have also discussed spinal instability criteria.(5, 6, 13, 26) (7, 164) The following is an attempt to summarize and build upon that work, with the goal of working toward a standardized and practical diagnostic test for instabilityPCS.
Intervertebral rotation and/or translation above a limit
The most common approach to classifying a level as stable or unstable is to use a threshold level of intervertebral rotation and/or translation.(165) The threshold levels of rotation and translation detailed on page 352 of the textbook by White & Panjabi (166) have been used in many studies. These thresholds for intervertebral rotation and translation were originally intended to be used within a point system that includes other factors.(127) They were not intended to be used in isolation, though they commonly are. The White & Panjabi motion thresholds are sagittal plane translation of > 4.5 mm or 15% of the sagittal width of the vertebral body and sagittal plane rotation greater than 15° at L1-2, L2-3, and L3-4, 20° at L4-5, and 25° at L5-S1 on flexion and extension radiography. As White & Panjabi noted, these criteria are based on a study by Posner et al(34) (with Dr. White as a co-author) that used cadaver spines and simulated traumatic rather than degenerative instability. Limitations of these criteria include the following:
- May not be appropriate for use in assessing for instability associated with degenerative changes, since they were based on simulation of traumatic injuries. (167) The diagnostic performance of these thresholds is unknown.
- They are very dependent on patient effort when asked to flex and extend. A patient may exceed the White & Panjabi radiographic criteria when motivated to maximally flex and extend during activities of daily living, but this would not be apparent if they do not maximally flex and extend when radiographs are obtained.
- As previously discussed, unless radiographic magnification is known, errors will exist when translation is measured in millimeters. The magnitude of the error could result in misdiagnosis.
- They were adapted from a study of 7 cadaver spines using methodology that would be considered low-tech by today’s standards, and the study has never been repeated. Spinal segments were tested with combinations of compressive and shear forces that may not be the best representation of common physiologic forces.
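The motion thresholds quoted above can be written as a simple check. The following is only a minimal sketch: the function name and the returned dictionary are hypothetical, and it applies the thresholds in isolation, which is exactly the usage the limitations above caution against.

```python
# Motion thresholds quoted in the text (flexion-extension radiographs).
ROTATION_LIMIT_DEG = {
    "L1-L2": 15.0, "L2-L3": 15.0, "L3-L4": 15.0,
    "L4-L5": 20.0, "L5-S1": 25.0,
}
TRANSLATION_LIMIT_MM = 4.5
TRANSLATION_LIMIT_PCT = 15.0  # percent of sagittal vertebral body width


def exceeds_motion_thresholds(level, rotation_deg, translation_mm=None,
                              translation_pct=None):
    """Report which of the quoted thresholds a level exceeds.

    Hypothetical helper for illustration; a missing translation
    measurement is treated as "not exceeded".
    """
    flags = {"rotation": abs(rotation_deg) > ROTATION_LIMIT_DEG[level]}
    flags["translation_mm"] = (translation_mm is not None
                               and abs(translation_mm) > TRANSLATION_LIMIT_MM)
    flags["translation_pct"] = (translation_pct is not None
                                and abs(translation_pct) > TRANSLATION_LIMIT_PCT)
    # "any" summarizes the three individual checks computed above.
    flags["any"] = any(flags.values())
    return flags


# Illustrative (made-up) measurement for one level.
result = exceeds_motion_thresholds("L4-L5", rotation_deg=17.0,
                                   translation_pct=16.0)
```

Note that the millimeter check is only meaningful when radiographic magnification is known, whereas the percentage check does not depend on image scale.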
As noted, the White & Panjabi criteria were adapted from the Posner et al study. Yone et al. tested a more direct interpretation of the Posner et al. study. (168) Yone et al. used the Posner et al. study to classify a level as unstable using Table 2. Note that Yone et al. used measurements of sagittal plane offset and disc angle from any available lateral radiograph and not the measured translation and rotation that occurs between flexion and extension as was reported by Posner et al.
Table 2
Criteria used by Yone et al(168) to classify levels as unstable.
Level | Sagittal Plane Offset, Anterior (% Endplate Width) | Sagittal Plane Offset, Posterior (% Endplate Width) | Intervertebral Disc Angle (degrees)
L1-L2 to L4-L5 | 8 | 9 | -9
L5-S1 | 6 | 9 | 1
Yone et al found that outcomes were worse when lumbar stenosis patients were treated by decompression alone if the level was unstable by their interpretation of the Posner et al criteria. This interpretation is not an assessment of translation between flexion and extension but an assessment of the maximum amount of spondylolisthesis measured from any single X-ray. The results are also dependent on how much stress the patient applied to the spine when asked to flex or extend. The Yone et al study may best be used in support of the observation that lumbar stenosis surgery outcomes are worse in the presence of spondylolisthesis, for the type of stenosis surgery they used. The Posner et al criteria have also been applied to analysis of the apparent reduction of spondylolisthesis measured by comparing a standing lateral radiograph to a supine MRI exam. (169) In that study, patients were included if they were clinically suspected of having instability. The Posner criteria applied to flexion-extension X-rays were used as the “gold standard”. A total of 45/75 (60%) patients were found to be “unstable” per the Posner criteria applied to flexion-extension X-rays versus 32/75 patients (42.6%) using spontaneous reduction seen on magnetic resonance imaging.
In a paper discussing indications for lumbar spine fusion, Hanley described instability criteria, offering the following as supporting evidence: “Most surgeons define segmental instability as either 10° of angular motion or 4 mm of translation on controlled flexion-extension radiographs.”(170) No other supporting evidence was given in this paper, which is commonly cited to justify the use of the > 10° rotation instability criterion. In another paper, Hanley et al. cited Spratt et al. for the translation criterion and elaborated that 5 mm translation is the threshold that should be used at L5-S1.(171, 172) As documented in multiple reviews, studies of intervertebral rotation in asymptomatic volunteers have found that 10° of angular rotation is well within normal limits. (66, 173, 174) Thus, using the 10° rotation criterion, many asymptomatic people would be diagnosed as having unstable levels. The logic for using a 10° rotation threshold may be that symptomatic patients may be reluctant to flex and extend, and thus, 10° of rotation would be high in symptomatic patients. However, as previously noted (Fig. 2), symptomatic patients can average more than 10° of rotation with good flexion-extension protocols. Thus, the 10° rotation criterion is not supported by scientific evidence.
A threshold level of > 3 mm translation has been used as an indicator of instability.(175, 176) This threshold is referenced in papers by Boden et al, Iguchi et al, and Kanemura et al.(17, 19, 145) None of these papers stated whether corrections were made for variable radiographic magnification. Boden and Weisel measured translation from pre-employment X-rays of 40 male volunteers(145) using a method attributed to Quinnell and Stockdale(177) and concluded that “Normal lumbar vertebral levels should have less than 3.0 mm of dynamic antero-posterior (AP) translation (< 8% of vertebral body width).” These criteria are specific to their implementation of the Quinnell and Stockdale method (which does not appear to be commonly used and was intended to be only an approximation of true displacement). (177) The Boden and Weisel criteria have not been validated for use with other translation measurement protocols. With other methods of measuring translation, translation of 8% endplate width is well within the normal range of translation. (66)
Despite the limited scientific justification, a review by Simmonds et al. cited many studies where radiographic instability is defined as “a disc angle change > 10° or change in translation > 3 mm, from standing or supine radiographs to dynamic radiographs”.(7) More recent studies also use these criteria.(176, 178) Even with these low thresholds for rotation and translation, it must be appreciated that a patient may have > 10 deg of rotation or > 3 mm of translation during activities of daily living, but that may not be detected due to insufficient patient effort when asked to flex and extend. Either high-quality flexion-extension X-rays must be collected or an adjustment must be made for patient effort.
Several authors have investigated the potential of reporting rotation as a percentage of total rotation through the lumbar spine (e.g., L1 to S1 rotation).(52, 149, 150, 174, 179–181) Several published studies provide data documenting that different levels contribute unevenly to total motion or that different levels in the spine are sequentially recruited as motion proceeds between flexion and extension.(82, 156, 182, 183) Thus, the proportion of motion that each level contributes could depend on what proportion of the flexion-extension cycle was captured by the flexion-extension X-rays used to measure rotation. A standardized flexion-extension protocol with validated quality control criteria might help to avoid that limitation. Robust reference data may also help to account for uneven contributions to overall motion.
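Reporting rotation as a percentage of total lumbar rotation is a simple normalization. The sketch below illustrates the arithmetic only; the function name is hypothetical and the rotation values are made up for illustration, not taken from any cited study.

```python
def rotation_share(rotations_deg):
    """Return each level's rotation as a percentage of the summed rotation.

    rotations_deg maps a level label to its flexion-extension rotation
    in degrees (hypothetical input format).
    """
    total = sum(rotations_deg.values())
    if total == 0:
        raise ValueError("total rotation is zero; shares are undefined")
    return {level: 100.0 * r / total for level, r in rotations_deg.items()}


# Made-up rotations for one subject, L1-L2 through L5-S1.
example = {"L1-L2": 8.0, "L2-L3": 10.0, "L3-L4": 12.0,
           "L4-L5": 14.0, "L5-S1": 6.0}
shares = rotation_share(example)
```

In this made-up example the L4-L5 level contributes 28% of the total L1 to S1 rotation. Because the shares depend on the summed rotation, they inherit any dependence on how much of the flexion-extension cycle the two radiographs captured.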
Enlargement of the Neutral Zone
Panjabi et al and others have described the concept of neutral and lax zones in the relative motions between vertebrae.(77, 79–85, 88, 90) The neutral and lax zones are where intervertebral rotation and translation can occur with little or low force. Motion within the normal neutral/lax zones is poorly controlled. Such motion is insufficient to reliably assess the ability of the annulus and ligaments to restrain motion to within normal limits, since the annulus and ligaments are not stressed when motion is within the neutral zone. Once motion is outside of the normal neutral and lax zones, higher forces are required to achieve intervertebral motion in a healthy spine, and it becomes possible to detect if the intervertebral motion restraints are functioning normally. A reliable test for instabilityPCS requires that the spine is loaded to the point where motion would be outside the neutral/lax zones in a healthy spine, and the intervertebral motion restraints would be restraining motion if they are functioning normally.
When intervertebral motion restraints are incompetent, the NZ can become larger, and that phenomenon may serve as the foundation for a diagnostic test for instabilityPCS.(79, 80, 184) The NZ can be measured in the laboratory using cadaver spines where both the applied load and resulting displacements can be accurately measured. Direct clinical use of the NZ to diagnose instabilityPCS would require measurement and analysis of intervertebral motion versus load/moment curves for individual motion segments. No clinically practical methods currently exist to directly measure intervertebral loads or moments in a living person. It may be possible to estimate spinal loading using models(185–187) and combine that with noninvasively measured motion and thereby estimate the NZ in patients. However, that hypothesis and the required methodology have yet to be fully developed and tested. It is also not known if this would provide additional, actionable and clinically efficacious value beyond a simpler analysis of non-invasively measured intervertebral motion. Attempts have been made to measure the neutral zone intraoperatively, but this would have limited clinical utility. (86, 87) Nevertheless, if a spine is stressed sufficiently, by a valid flexion-extension or other protocol, it may be possible to reliably detect that rotations or translations are greater than would be expected if the intervertebral motion restraints were functioning normally and the neutral/lax zones were normal.
Abnormal COR
In two-dimensional images, the center-of-rotation (COR) describes a point about which one vertebra rotates with respect to an adjacent vertebra.(188) The COR can be measured from just two images (e.g., flexion and extension) or between any two frames from a series of images that capture a full flexion-extension cycle. A series of images allows for analysis of movement of the COR during the flexion-extension cycle and thereby assessment of the instantaneous center of rotation (ICR). (189)
The COR is typically reported as anterior-posterior and cranial-caudal coordinates relative to a frame of reference defined by the inferior vertebra. In clinical practice, a clinician would need to know if the coordinates of the COR are normal or abnormal. Reference data for the COR between flexion and extension in a population of asymptomatic volunteers are available.(66, 188, 190–192) There are some consistencies and some differences between studies. The external validity or clinical efficacy of currently available data is unknown.
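For the two-image case, the COR is the fixed point of the rigid motion of the superior vertebra relative to the inferior vertebra. The following is a minimal geometric sketch, not the method used in the cited studies. It assumes two landmarks on the superior vertebra have already been expressed in a frame fixed to the inferior vertebra; the function names and the 1 degree minimum-rotation guard are assumptions made for illustration.

```python
import math


def center_of_rotation(p1, p2, q1, q2, min_rotation_deg=1.0):
    """Fixed point of the 2D rigid motion taking (p1, p2) to (q1, q2).

    p1, p2: (x, y) landmarks on the superior vertebra in flexion.
    q1, q2: the same landmarks in extension.
    All points are in a frame fixed to the inferior vertebra.
    Returns ((cor_x, cor_y), rotation_in_degrees).
    """
    vx, vy = p2[0] - p1[0], p2[1] - p1[1]
    wx, wy = q2[0] - q1[0], q2[1] - q1[1]
    # Signed angle from the flexion segment to the extension segment.
    theta = math.atan2(vx * wy - vy * wx, vx * wx + vy * wy)
    if abs(math.degrees(theta)) < min_rotation_deg:
        # The fixed point is ill-conditioned as rotation approaches zero.
        raise ValueError("rotation too small for a stable COR estimate")
    c, s = math.cos(theta), math.sin(theta)
    # Translation t such that q = R p + t.
    tx = q1[0] - (c * p1[0] - s * p1[1])
    ty = q1[1] - (s * p1[0] + c * p1[1])
    # Solve (I - R) cor = t; det(I - R) = 2 (1 - cos(theta)).
    det = 2.0 * (1.0 - c)
    cor_x = ((1.0 - c) * tx - s * ty) / det
    cor_y = (s * tx + (1.0 - c) * ty) / det
    return (cor_x, cor_y), math.degrees(theta)


def rotate_about(point, center, theta_rad):
    """Rotate a point about a center (used here to build a synthetic check)."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    c, s = math.cos(theta_rad), math.sin(theta_rad)
    return (center[0] + c * dx - s * dy, center[1] + s * dx + c * dy)


# Synthetic check: rotate two landmarks 10 degrees about a known center.
true_center = (10.0, -5.0)
p1, p2 = (0.0, 0.0), (30.0, 0.0)
q1 = rotate_about(p1, true_center, math.radians(10.0))
q2 = rotate_about(p2, true_center, math.radians(10.0))
cor, rotation_deg = center_of_rotation(p1, p2, q1, q2)
```

The guard on small rotations reflects a property of the geometry: as rotation approaches zero the determinant approaches zero, so small landmark errors produce large shifts in the computed COR.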
When measured from a series of radiographic images obtained over the entire flexion-extension cycle, the variability of the COR can be reported, with the hypothesis that during the flexion-extension cycle, the COR will move substantially more in the presence of instabilityPCS.(151, 152, 193, 194) Evidence supports that the 2D coordinates of the COR shift with instability, and motion of the COR during the flexion-extension cycle is wider in degenerated cadaveric spines and in patients suspected to have lumbar spine instability.(89, 151, 156, 190, 192) The movement of the COR as the spine flexes may also depend on exactly how the spine is loaded(195), supporting the need for a standardized loading protocol. The location of the COR is partly determined by facet joint forces,(196) so how the spine is loaded when flexion and extension X-rays are obtained may influence COR data. Thus, patient positioning protocols should be standardized to the extent possible if COR is used to diagnose instabilityPCS. Quantifying the continuous movement pattern of the COR between flexion and extension is challenging. The anterior-posterior width or cranial-caudal height of the COR movement pattern is one possibility. The area that includes all COR points is another option. Whether the COR movement pattern throughout a flexion-extension cycle has greater diagnostic efficacy than the coordinates of a single COR point measured from end-range flexion and extension is unknown. The coordinates of the COR are also correlated with other intervertebral motion metrics. (197) This will be discussed later in this paper. It is not known if COR has advantages over other metrics that may be easier to interpret.
Facet fluid sign or vacuum sign
Numerous investigators have observed and studied what appears in an MRI exam to be an abnormally large amount of fluid in the facet joints and have suggested or investigated an association between this facet fluid sign and instability. (59, 163, 198–208) A recent review concluded that dynamic spondylolisthesis is 8 times more likely in the presence of a facet fluid sign.(209) Other authors have noted or studied the vacuum sign that can appear in the facet joints on a CT exam or even radiographs.(11, 210–214) The hypothesis is that with instabilityPCS, an abnormally large gap can occur between articular processes that comprise a facet joint. That gap can fill with either fluid or gas, and little is known about how to optimize the diagnostic utility of these phenomena.
There are potential limitations to the use of the fluid sign to diagnose instabilityPCS. First, fluid exists in a healthy joint, and criteria must be validated to determine whether the amount of fluid is normal or abnormal. The gap between facet joints can be uneven when viewed in coronal or sagittal plane images, particularly in full flexion or extension or with collapsed disc height. Thus, axial slices in the wide part of the gap may show a thick fluid layer, whereas slices in the narrow part may not. No study has rigorously validated strict interpretation criteria; for example, a fluid gap > 1.5 mm must be detected in at least two slices through the facet joint. Since the orientation of the slice plane relative to the facet joint is highly variable, it may be difficult to obtain two good slices through the facet joint, especially with thick slices and volume averaging. Analysis of the gap from thin slice CT (relative to a normative database) or high-resolution isotropic MRI are potential options. Thus, abnormal facet widening may exist but be undetected in some exams, or normal facet gaps may be diagnosed as abnormal. Without strict criteria, substantial intraobserver error can be expected. Although it may be possible to obtain reasonable observer agreement in controlled research studies(203, 204, 215), assessment reliability in routine clinical practice may be more difficult to achieve(216). In addition, even if good fluid sign agreement can be obtained, the sensitivity and specificity relative to a gold standard test for instability still need to be determined.
Second, a vacuum sign in the facet joints, as observed in a CT exam, is also considered an indicator of instability (although this is not validated against a gold standard).(3, 6, 212, 217–219) However, a vacuum would not appear as a bright fluid sign in an MRI exam. This supports that an abnormally wide gap in a facet joint must first be filled with fluid prior to the MRI exam. If not, the MRI exam could yield a false-negative instabilityPCS diagnosis. It is not known under what conditions and how long it takes for an abnormally wide facet joint gap to fill with fluid.
Third, it is not known whether supine positioning will always correctly stress the spine to provoke facet joint widening in the presence of instabilityPCS. Presumably, there must be sufficient forces between vertebrae to cause abnormal facet gapping in a supine patient with instabilityPCS. Upon review of midsagittal slices from CT exams, it is not clear how this could be a reasonable expectation at all levels from L1-L2 to L5-S1, given the wide variability between patients in supine lordosis and variability in thickness and composition of soft tissues posterior to the spine. Comprehensive validation of the reliability of the facet fluid sign is needed. Despite all of the potential limitations, it has been suggested that the facet fluid sign is the best currently available test for lumbar spine instability.(208) Evidence for the ability of the facet fluid sign to predict clinical outcomes is beginning to emerge.(220) The sensitivity and specificity will not be definitively known until a gold standard test for instabilityPCS is available.
Rotation Dependent Translation (RDT)
Between the flexed and extended positions, the amount of sagittal plane translation between vertebrae, corrected for the amount of rotation (to help control for variability in patient effort), may serve as the basis for a diagnostic test for lumbar instabilityPCS.(13, 59) Healthy facet joints have a very strong capsule that, together with the geometry of the facet joints, allows for only small sagittal plane translations. (42, 161, 221, 222) A healthy intervertebral disc will also limit translation. Since there is no reason why sagittal plane translation would be desirable independent of rotation, it is likely that normal intervertebral translations are limited to what is required to achieve the intervertebral rotations required for activities of daily living. A diagnostic test for abnormal RDT has the potential for detecting abnormally high translations that can occur with instabilityPCS. (13, 59, 66) This is somewhat supported by data documenting that abnormal RDT is associated with the facet fluid sign.(59) Sagittal plane instability is believed to require both laxity of the facet joint and disc degeneration.(223) Some support exists for an association between abnormal translation and facet degeneration.(206, 224) Disc degeneration alone may not result in abnormal sagittal plane shear translation.(225, 226)
There is currently only limited evidence of an association between abnormal RDT and symptoms.(13) This is understandable given the lack of a validated test for abnormal RDTs. Although no strong evidence currently exists, abnormally high translation may irritate nerve roots or facet joint nociceptors, potentially causing inflammation and thus forming an indirect association between abnormally high translation and symptoms. As Kirkaldy-Willis and Farfan hypothesized 40 years ago, “size reduction of the lateral nerve root canal may of itself produce minor symptomatology, but with the increased motion it may become a severe clinical problem.”(1) The association between the degree of stenosis and symptoms is only moderate.(227–230) Lumbar spinal stenosis can be found in asymptomatic people.(231) It is largely unknown why stenosis results in symptoms in only some people. In addition to the pressure on nerve roots that may be caused by stenosis, inflammation or prior irritation contributes to symptoms.(49, 232, 233) It is possible that abnormal RDT results in greater symptoms when the nerve roots are inflamed and that abnormal RDT can be found in asymptomatic people. Thus, analogous to disc degeneration observed on X-ray or MRI, the diagnosis of abnormalities in RDT may be helpful in symptomatic patient management even if abnormal RDT (or disc degeneration) can be asymptomatic. The role of abnormally high (or possibly abnormally low) RDT in patient symptoms can be studied once a diagnostic test for abnormal RDT is validated.
Abnormal RDT can only be diagnosed if normal RDT is documented. Data to help define normal RDT have been published where a specific definition of rotation and translation was used (Fig. 3).(66) RDT is level dependent, requiring level-specific look-up tables to interpret measurements. However, RDT can also be reported as the number of standard deviations from average.(59) This metric can be referred to as the sagittal plane shear index (SPSI). SPSI simplifies the interpretation of RDT. A value of 0 would mean that RDT is exactly average (for the specific level) for asymptomatic and radiographically normal levels. A value between −2 and 2 is within the 95% confidence interval for asymptomatic volunteers. A value of 3 would indicate that RDT is 3 standard deviations above the average for normal levels, and this would be objectively abnormal.
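The SPSI described above is a standard score. The sketch below assumes that RDT is expressed as translation per degree of rotation and that a level-specific mean and standard deviation for normal levels are available; the function names and the reference values in the example are hypothetical placeholders, not published normative data.

```python
def spsi(translation, rotation_deg, normal_mean_rdt, normal_sd_rdt):
    """Standard deviations by which a level's RDT differs from normal.

    translation: sagittal translation (e.g., % endplate width).
    rotation_deg: flexion-extension rotation in degrees (must be > 0).
    normal_mean_rdt, normal_sd_rdt: level-specific reference values
    (hypothetical inputs; real values must come from normative data).
    """
    if rotation_deg <= 0:
        raise ValueError("rotation must be positive to form the ratio")
    rdt = translation / rotation_deg
    return (rdt - normal_mean_rdt) / normal_sd_rdt


def within_normal_range(spsi_value):
    """True when the index lies between -2 and 2, per the text."""
    return -2.0 <= spsi_value <= 2.0


# Placeholder reference values, for illustration only.
index = spsi(translation=6.0, rotation_deg=10.0,
             normal_mean_rdt=0.5, normal_sd_rdt=0.1)
```

With these placeholder numbers the level sits about one standard deviation above average, so it would fall within the normal range.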
Retrospective reanalysis of flexion-extension radiographs was performed to help better understand RDT. The reanalysis was performed using a fully automated emulation of the previously validated Quantitative Motion Analysis (QMA) method (SpineCAMP™, Medical Metrics, Inc., Houston, TX). (234–236) This method uses a pipeline of neural networks and coded logic to produce four anatomic landmarks for each vertebra(237) and determine transformation matrices to move landmarks from the flexion to the extension image. The registered landmarks are then used to calculate the intervertebral motion metrics. Many researchers have developed neural networks to place anatomic landmarks on vertebral bodies in spine radiographs.(238–243) It is expected that the results described below can be reproduced using any method validated to reliably place standardized anatomic landmarks on vertebral bodies. Standard placement of lumbar vertebral landmarks has been previously described.(237)
Based on a reanalysis of flexion-extension radiographs for 162 asymptomatic volunteers (66), the R² was 0.61 between a normalized expression of RDT (SPSI) and the cranial-caudal coordinate of the COR. This relationship with the cranial-caudal coordinate of the center of rotation (COR) is as expected.(189) The strength of the relationship between RDT and COR helps to appreciate that the accumulation of knowledge regarding COR has relevance to the diagnostic tests based on RDT.
The potential for RDT to serve as the basis for a diagnostic test for instabilityPCS would be strengthened if sagittal plane intervertebral translation is linearly related to rotation. Figures showing the relationship between intervertebral translation and rotation support that the relationship between translation and rotation can be approximately linear.(13, 132, 156, 160–162) However, there is also ample evidence to support that translation is not always linearly related to rotation. (111, 154, 180, 244, 245) It may be that in a healthy spine (or in a cadaver spine tested ex vivo), with no confounding effects from the active and neural control elements of spinal stability, translation is approximately linearly related to rotation when the motion segment is outside the neutral zone and moving in the elastic zone. However, if the motion segment is in the neutral zone, particularly if the neutral zone is abnormally large due to instabilityPCS, translation cannot be assumed to change linearly with rotation. It is also likely that muscle spasms, spikes in pain during flexion or extension, or neural control issues (such as uncertainty about how to flex and extend for the test) could cause non-linearity in the relationship between translation and rotation. Thus, although translation can be linearly related to rotation in some cases, that linear relationship cannot be assumed true for all levels in all patients. The linearity of the translation as a function of rotation may also depend on the extent of spondylolisthesis.(159) Finally, the SPSI metric requires dividing translation by the amount of rotation, and this becomes unstable when rotation approaches zero.(13, 66)
Alternatively, a diagnostic test for abnormal RDT can be based on data documenting that translation is linearly related to rotation (outside the neutral zone) across a population of normal healthy spines. With that interpretation, normal translation for any amount of rotation can be determined from a linear regression equation fit to translation versus rotation data for a population of healthy motion segments. The upper and lower limits of the 95% confidence interval for this linear regression can be used to determine if the translation is within or outside of normal for the specific amount of rotation that was measured. That interpretation can be reported as a standardized metric, where a value of zero indicates that translation was exactly the average found in healthy motion segments. A value of 3 would indicate that the translation was 3 standard errors of the forecast above the average found in healthy motion segments. The standard error of the forecast provides a point estimate of the translational variability that exists at a specific amount of rotation and at a specific level.
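The regression-based interpretation can be sketched with ordinary least squares and the textbook standard error of the forecast for simple linear regression. This is an illustrative sketch under stated assumptions, not the implementation used to produce the results in this paper: the function names are hypothetical and the rotation and translation values are made up, standing in for level-specific data from healthy motion segments.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    resid_se = math.sqrt(sse / (n - 2))  # residual standard error
    return {"n": n, "x_bar": x_bar, "sxx": sxx, "slope": slope,
            "intercept": intercept, "resid_se": resid_se}


def forecast_se(model, x0):
    """Standard error of the forecast for a new observation at x0."""
    return model["resid_se"] * math.sqrt(
        1.0 + 1.0 / model["n"] + (x0 - model["x_bar"]) ** 2 / model["sxx"])


def translation_index(model, rotation_deg, translation):
    """Forecast standard errors between measured and predicted translation."""
    predicted = model["intercept"] + model["slope"] * rotation_deg
    return (translation - predicted) / forecast_se(model, rotation_deg)


# Made-up reference data for one level: rotation (deg), translation (%).
rotations = [5.0, 7.0, 9.0, 11.0, 13.0, 15.0, 17.0]
translations = [2.1, 2.6, 3.4, 3.9, 4.8, 5.1, 6.0]
model = fit_line(rotations, translations)
index = translation_index(model, rotation_deg=10.0, translation=5.5)
```

An index of zero means the measured translation equals the regression prediction for that amount of rotation, and an index of 3 means it lies three forecast standard errors above it. Unlike a simple ratio of translation to rotation, this form does not divide by the measured rotation.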
Using previously published data from radiographs of 162 normal and asymptomatic volunteers (66), the amount of sagittal plane translation was observed to be approximately linearly related to the amount of rotation. Figure 4 shows the data for the L4-L5 level. The relationship between translation and rotation was approximately linear for all levels (L1-L2 to L5-S1), with an R² of > 0.55 for L1-L2 to L4-L5 and R² = 0.17 for L5-S1. The observed linear relationship between translation and rotation across a population allows the use of these data to estimate average normal translation and 95% confidence intervals for a specific amount of rotation. Until proven otherwise, it can be assumed that use of this RDT test for instabilityPCS requires that the spine be stressed so it would be outside of the neutral/lax zones in the absence of instabilityPCS. Until better quality control criteria are validated, it can also be argued, based on a review of existing neutral-zone data (77, 79–85, 88, 90), that 5 deg of intervertebral rotation between flexion and extension is sufficient to assure that the motion segment is adequately stressed.
Several different types of sagittal plane intervertebral translation can be measured, such as translation of the posterior-inferior corner of the superior vertebra in the direction defined by the superior endplate of the inferior vertebra, translation of the superior vertebral centroid relative to the inferior vertebra, or translations in the cranial-caudal direction. It is valuable to contemplate the potential physiologic implications of an abnormality for a specific type of translation measurement.
When sagittal plane translation is measured as the translation of the posterior inferior corner of the superior vertebra, in the direction defined by the superior endplate of the inferior vertebra (Fig. 3), it can be described as motion that could particularly affect tissues in the foraminal and spinal canal regions. If the foraminal region translation is reported as an index relative to the average and 95% confidence intervals for an asymptomatic population, then this can be referred to as the Foraminal Sagittal Translation (FST) index. If the FST-index is zero, then translation is exactly average relative to the asymptomatic population at the level being assessed for the amount of rotation that occurred. An abnormally high (e.g., 3) FST index would inform clinicians that the posterior-inferior edge of the superior vertebra is translating in the sagittal plane much more than it should, with respect to the inferior vertebra. The FST-Index may help in assessing the Kirkaldy-Willis and Farfan hypothesis: “size reduction of the lateral nerve root canal may of itself produce minor symptomatology, but with the increased motion it may become a severe clinical problem.”(1) Examples of levels with normal and abnormal FST-Index can be viewed at:
https://www.dropbox.com/sh/7z3vu3i977ip530/AAD8Oc-Ref_PAJd0tEPopkdXa?dl=0
In each online example, the inferior vertebra for the level (e.g., L5 if L4-L5 is the target) will remain in a constant position on the display as the flexion and extension images are alternately displayed. This is referred to as “stabilization” and facilitates interpretation of the relative motion between vertebrae.
One additional advantage of a standardized and normalized metric such as the FST-index is that there is no need to calibrate X-rays to obtain accurate measurements in units of millimeters.
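The reason calibration becomes unnecessary can be illustrated with a small calculation. The sketch assumes the normalized metric is built on translation expressed as a percentage of endplate width measured on the same image, so any uniform magnification cancels in the ratio; the function name and pixel values are hypothetical.

```python
def translation_pct_endplate(translation_px, endplate_width_px):
    """Translation as a percentage of endplate width from the same image."""
    return 100.0 * translation_px / endplate_width_px


# Made-up measurements of the same anatomy at two magnifications.
true_translation, true_width = 2.0, 40.0
pct_at_1_0 = translation_pct_endplate(true_translation * 1.0, true_width * 1.0)
pct_at_1_3 = translation_pct_endplate(true_translation * 1.3, true_width * 1.3)
```

Both calls return the same percentage (to within floating-point rounding), whereas the translation in pixels or millimeters would differ by 30% between the two images.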
A diagnostic test for instabilityPCS should detect abnormalities in populations of patients where abnormalities might be expected (e.g., lumbar spinal stenosis) but not detect abnormalities where instabilityPCS would not be expected (e.g., subjects in disc arthroplasty trials where instability was an exclusion criterion). It is not yet known how high the FST-index (or an alternative metric) must be before it becomes clinically significant. An FST-index > 2 can be just outside of normal limits and may be within the test error. An FST-index > 3 is well outside of normal limits and may prove to be a more efficacious diagnostic threshold.
In its role as an imaging core laboratory, Medical Metrics, Inc. (MMI) has analyzed thousands of flexion-extension exams from studies of treatments for spinal stenosis as well as studies of disc arthroplasty and biologic treatments for disc degeneration. MMI pools data from multiple studies to develop benchmark data that can be used to help identify problems with incoming flexion-extension studies. These flexion-extension exams were retrospectively analyzed using the previously described SpineCAMP, and the resulting intervertebral motion data were used to help understand the prevalence of an abnormal FST index in different populations of patients. In calculating the following prevalence data, only pretreatment data and only levels with > 5 deg rotation are included. The pooled analysis included 7,621 pretreatment flexion-extension studies. Table 3 has the proportion of treatment and adjacent levels where the FST-Index was > 2 (includes borderline abnormalities) and where the FST-index was > 3 (more substantially abnormal). In stenosis, fusion, and dynamic stabilization patients, the high FST-Index is generally at the treatment level, while in disc arthroplasty and biologic treatment patients, the high FST-Index is generally at an adjacent level. InstabilityPCS might be expected in a proportion of stenosis and fusion patients. Instability adjacent to disc arthroplasty levels may affect treatment outcomes.
The proportion of levels that would be classified as abnormal using previously described instability criteria was more variable. Including all pretreatment data (not just those with rotation > 5 deg as with the FST-Index) and defining instability as > 10 deg rotation(170), 9–12% of treatment levels in spinal stenosis and fusion studies would be classified as unstable, and 25–42% of treatment levels in studies of disc arthroplasty or biologic treatments for disc degeneration would be classified as unstable. This difference between study types may be in part due to how different symptoms affect patient willingness to flex and extend, but it suggests that the > 10 deg rotation criterion will misclassify many levels in patient populations where instability would not be expected. Of note, in the asymptomatic population previously discussed, 72% of levels have > 10 deg rotation.
Using the White & Panjabi criteria, depending on the study type, between 0.1 and 2.1% of treatment levels would be classified as unstable if unstable is defined as rotation > 15 deg at L1-L2 to L3-L4, > 20 deg at L4-L5, or > 25 deg at L5-S1. If instability is defined as intervertebral translation > 8% endplate width, then 4–9% of treatment levels would be classified as unstable (all study types combined). The scale factor was not known for all studies, so intervertebral translation instability criteria that are in units of millimeters could not be assessed in the pooled data. These data suggest that criteria based on White & Panjabi may fail to diagnose a proportion of levels with abnormal motion.
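As a rough illustration of how the threshold-based criteria above could be applied to measured motion, the following Python sketch flags a level when rotation exceeds the level-specific White & Panjabi limit, or when translation exceeds 8% of endplate width. The function names and the sample measurements are hypothetical and are not taken from the pooled data; this is not the analysis code used to produce the results above.

```python
# Level-specific sagittal rotation limits (degrees), as quoted in the text.
ROTATION_LIMIT_DEG = {
    "L1-L2": 15.0,
    "L2-L3": 15.0,
    "L3-L4": 15.0,
    "L4-L5": 20.0,
    "L5-S1": 25.0,
}
# Translation limit as a percentage of endplate width, as quoted in the text.
TRANSLATION_LIMIT_PCT = 8.0


def exceeds_rotation_limit(level, rotation_deg):
    """Return True if rotation is strictly above the limit for this level."""
    return rotation_deg > ROTATION_LIMIT_DEG[level]


def exceeds_translation_limit(translation_pct):
    """Return True if translation is strictly above 8% of endplate width."""
    return translation_pct > TRANSLATION_LIMIT_PCT


def percent_flagged(flags):
    """Percentage of True values in a list of booleans."""
    return 100.0 * sum(flags) / len(flags)


# Hypothetical measurements (assumed for illustration only):
# (level, rotation in degrees, translation in % of endplate width).
sample_levels = [
    ("L3-L4", 16.0, 3.0),
    ("L4-L5", 16.0, 9.5),
    ("L4-L5", 8.0, 2.0),
    ("L5-S1", 12.0, 4.0),
]

rotation_flags = [exceeds_rotation_limit(lv, rot) for lv, rot, _ in sample_levels]
translation_flags = [exceeds_translation_limit(tr) for _, _, tr in sample_levels]

print("Rotation criterion, % flagged:", percent_flagged(rotation_flags))
print("Translation criterion, % flagged:", percent_flagged(translation_flags))
```

In this made-up sample, 16 deg of rotation is flagged at L3-L4 but not at L4-L5, and the two criteria flag different levels, which is one reason the prevalence estimates differ between criteria.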
Table 3
Prevalence of FST-Index abnormalities in pooled data from different study types.
| Study Type | Index Levels: % with FST-Index > 2 | Index Levels: % with FST-Index > 3 | Adjacent Levels: % with FST-Index > 2 | Adjacent Levels: % with FST-Index > 3 |
| --- | --- | --- | --- | --- |
| Treatment for Lumbar Stenosis | 11 | 6 | 5.1 | 1.9 |
| Selected for Fusion Surgery | 15 | 7 | 4.1 | 1.2 |
| Selected for Dynamic Stabilization | 16 | 7 | 6.3 | 2.4 |
| Disc Arthroplasty | 2.5 | 0.7 | 5.8 | 2.7 |
| Biologic for disc treatment | 3.4 | 0.9 | 6.1 | 2.3 |
A relationship between intervertebral disc degeneration and segmental stability has been previously hypothesized and studied.(1, 246, 247) This relationship is apparent in the previously described asymptomatic volunteers, as shown in Fig. 5. Disc degeneration was graded by an experienced musculoskeletal radiologist. Although a significant relationship is evident in Fig. 5, this is only a trend, and there is wide variation in the FST-Index within each radiographic grade of disc degeneration, even within Kellgren-Lawrence (KL) grade 0 (Fig. 6). This variation may also reflect limitations of the KL radiographic grading system, which may inadequately detect early stages of degeneration.
Abnormalities in disc height changes with loading
In the early stages of degeneration, the intervertebral disc can become more flexible.(62, 160, 226, 248, 249) This may also be the stage at which biologic treatments to halt or reverse degeneration are most likely to be effective, since the nutrient supply required for cell viability is less impaired.(250, 251) Abnormally high cranial-caudal or “vertical” translations between vertebrae might be diagnostic of loss of pressure in the nucleus, softening of the annulus, incompetence of the longitudinal ligaments, annular avulsions, or other causes.(252–256) Such excessive vertical translations may be associated with the abnormal intervertebral loading patterns found with degeneration that may activate pain-sensing nerves in the facets or endplates.(257) Degeneration has been shown to alter loading across the disc space.(258, 259) Discs can become stiffer with advanced degeneration (Figs. 5 and 6).
Disc “softness” could be measured by comparing disc height in loaded versus unloaded positions and identifying abnormally high disc height changes with loading/unloading. An adjustment may be required to account for the phenomenon of diurnal change in disc height.(260) Disc height changes with loading have been previously reported.(261, 262)
The diagnosis of abnormal vertical translations would also be dependent on whether the spine is adequately stressed. One option is to compare disc space height between a loaded (e.g., upright standing) and a minimally loaded (e.g., supine) position. Detecting disc height compressibility may require a prolonged period of standing in some patients to compress the disc to its lowest possible height.(263) Similarly, prolonged unloading in a supine position may be required for the disc to achieve its maximum possible height, although that has not been adequately studied. Analysis of compressibility with increased loads on the spine has been used to study the effects of backpacks(264) but may be impractical as a routine diagnostic test.
Alternatively, the change in anterior and posterior disc heights between flexion and extension may be diagnostic for abnormal disc compressibility.(265) Using a specific definition of change in disc heights (Fig. 7), the change in anterior and posterior disc heights was found to be linearly related to intervertebral rotation across the previously described population of asymptomatic volunteers. This may be a powerful phenomenological foundation for a diagnostic test: a change in anterior or posterior disc height that is greater (or smaller) than what occurs in healthy discs for the same rotation may be diagnostic for hypercompressible (or hypocompressible) discs.
Similar to the FST-Index, an anterior disc widening index (ADW-Index) and a posterior disc widening index (PDW-Index) can be calculated based on predicting the average normal disc widening for the amount of rotation and the 95% confidence intervals for the standard error of the forecast. These indices may help identify discs where the change in the anterior or posterior disc heights is abnormally high (or abnormally low) for the amount of rotation that was measured. The clinical value of these metrics remains to be determined. The importance of applying quality-control criteria to assure that the spine is adequately stressed is unknown with respect to vertical translations. A similar approach might be possible when comparing disc height changes between the supine and standing positions, but that would require valid normative data.
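The general idea of a rotation-adjusted widening index can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the published ADW-Index or PDW-Index calculation: it assumes a simple least-squares line fitted to normative data and expresses the residual in units of the standard error of the forecast. The exact scaling used for the published indices is not given in this section, and the normative values, function names, and units below are hypothetical.

```python
import math
from typing import NamedTuple, Sequence


class NormativeFit(NamedTuple):
    intercept: float
    slope: float
    see: float      # standard error of the estimate (residual SD)
    n: int          # number of normative levels
    x_mean: float   # mean rotation in the normative data
    sxx: float      # sum of squared deviations of rotation


def fit_normative(rotation_deg: Sequence[float], widening: Sequence[float]) -> NormativeFit:
    """Least-squares fit of disc widening against rotation in healthy levels."""
    n = len(rotation_deg)
    x_mean = sum(rotation_deg) / n
    y_mean = sum(widening) / n
    sxx = sum((x - x_mean) ** 2 for x in rotation_deg)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(rotation_deg, widening))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(rotation_deg, widening))
    see = math.sqrt(sse / (n - 2))
    return NormativeFit(intercept, slope, see, n, x_mean, sxx)


def forecast_se(fit: NormativeFit, rotation: float) -> float:
    """Standard error of the forecast for one new level at this rotation."""
    return fit.see * math.sqrt(1.0 + 1.0 / fit.n + (rotation - fit.x_mean) ** 2 / fit.sxx)


def widening_index(fit: NormativeFit, rotation: float, widening: float) -> float:
    """Residual from the normative prediction, in forecast-SE units (assumed scaling)."""
    predicted = fit.intercept + fit.slope * rotation
    return (widening - predicted) / forecast_se(fit, rotation)


# Hypothetical normative data: rotation (deg) and anterior disc widening (arbitrary units).
normal_rotation = [2.0, 4.0, 6.0, 8.0, 10.0]
normal_widening = [1.1, 1.9, 3.0, 4.1, 4.9]
fit = fit_normative(normal_rotation, normal_widening)

# A hypothetical patient level with 6 deg of rotation and a widening of 3.3.
index = widening_index(fit, 6.0, 3.3)
print(round(index, 2))
```

Because the forecast standard error grows as rotation moves away from the normative mean, the same absolute residual yields a smaller index at extreme rotations than near the center of the normative data.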
Applied to the pooled, pretreatment lumbar spine flexion-extension studies described above, Table 4 provides the prevalence of ADW abnormalities at the treatment and adjacent levels (for levels where the rotation was > 5 deg). PDW-Index abnormalities were rare at any level (treatment or adjacent) in any study type: < 1% in stenosis, fusion, or dynamic stabilization studies; 5% in disc arthroplasty and disc biologics studies.
Table 4
Prevalence of ADW-Index abnormalities in pooled data from different study types.
| Study Type | Index Levels: % with ADW-Index > 2 | Index Levels: % with ADW-Index > 3 | Adjacent Levels: % with ADW-Index > 2 | Adjacent Levels: % with ADW-Index > 3 |
| --- | --- | --- | --- | --- |
| Treatment for Lumbar Stenosis | 27 | 15 | 7.6 | 3.4 |
| Selected for Fusion Surgery | 48 | 26 | 6.9 | 2.8 |
| Selected for Dynamic Stabilization | 40 | 22 | 7.8 | 3.0 |
| Disc Arthroplasty | 11 | 3.9 | 4.6 | 1.3 |
| Biologic for disc treatment | 6.3 | 1.6 | 2.1 | 0.6 |
Composite Metrics
From the perspective of nerve roots and tissues within the spinal canal, lateral recesses, and foramina, some level of mechanical tissue “agitation” may occur from the translational component of motion, and some “agitation” may occur from the rotational component. The foraminal area is known to change with flexion-extension and is smaller in the presence of degeneration.(266) Foraminal “agitation” could be quantified using a metric such as the foraminal agitation area (FAA; Fig. 9). The FAA is dependent on intervertebral rotation and translation but also on disc height. Since intervertebral rotation is dependent on effort exerted by the patient when asked to flex or extend, it may be helpful in clinical practice to correct for that source of variability. In radiographically normal levels in asymptomatic volunteers, the FAA is linearly related to rotation (Fig. 10), and this may be captured by expressing FAA as a Foraminal Agitation Index in units of the standard error of the estimate from the average FAA found in healthy discs. The standard error of the estimate provides the expected variability in translation for a specific amount of rotation.
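One way to read the Foraminal Agitation Index described above is as a residual from a normative regression line, divided by the standard error of the estimate. The Python sketch below illustrates that reading only. The normative FAA values, their units, and the function names are hypothetical, and the authors' actual computation may differ.

```python
import math


def fit_faa_vs_rotation(rotation_deg, faa):
    """Fit FAA = a + b * rotation by least squares; also return the
    standard error of the estimate (SEE) of the fit."""
    n = len(rotation_deg)
    x_mean = sum(rotation_deg) / n
    y_mean = sum(faa) / n
    sxx = sum((x - x_mean) ** 2 for x in rotation_deg)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(rotation_deg, faa))
    b = sxy / sxx
    a = y_mean - b * x_mean
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(rotation_deg, faa))
    see = math.sqrt(sse / (n - 2))
    return a, b, see


def foraminal_agitation_index(rotation, faa, a, b, see):
    """Observed FAA minus the FAA expected in healthy discs at the same
    rotation, expressed in SEE units (assumed definition)."""
    expected_faa = a + b * rotation
    return (faa - expected_faa) / see


# Hypothetical normative data: rotation (deg) and FAA (arbitrary area units).
healthy_rotation = [4.0, 8.0, 12.0, 16.0]
healthy_faa = [10.5, 19.5, 30.5, 39.5]
a, b, see = fit_faa_vs_rotation(healthy_rotation, healthy_faa)

# A hypothetical level with 10 deg of rotation and an FAA of 27.0.
fai = foraminal_agitation_index(10.0, 27.0, a, b, see)
print(round(fai, 2))
```

Dividing by the standard error of the estimate puts levels with different amounts of rotation on a common scale, which is the stated purpose of correcting for patient effort.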
It is reasonable to hypothesize that instabilityPCS may best be diagnosed using a composite score composed of disc morphometry, disc health, and intervertebral motion metrics. Integration with standardized symptom assessments and specific MRI findings may also improve clinical efficacy.(267, 268) It must be appreciated that there can be multiple sources of symptoms in a patient from multiple different levels. It may thus be unreasonable to expect a definitive association between a metric that may diagnose instabilityPCS at a specific level and symptoms. It is possible that patients may have severe symptoms from something other than instabilityPCS and that patients may have instabilityPCS but not be symptomatic. This is analogous to the issue of stenosis, where people can have significant stenosis yet remain asymptomatic.(269) Stenosis is not a definitive indication for treatment, and instabilityPCS will not be a definitive indication for treatment. Nevertheless, it is reasonable to test for an association between pretreatment instabilityPCS and treatment outcomes, since instabilityPCS may prove to be one true indication for the optimal treatment in appropriately symptomatic patients. However, a treatment that generally works for patients with instabilityPCS may also appear to fail due to a different source of symptoms in a particular patient. Development and validation of such a metric to diagnose instabilityPCS will require a large sample size to identify and sort through all the important factors. Multilayer perceptron models or alternative methods may help to learn the clinical scenarios where instabilityPCS is important and useful in diagnosis and treatment planning. 
Multiple large-scale spine registries are currently enrolling patients and collecting potentially valuable standardized patient-reported outcomes.(270, 271) Many of the knowledge deficits noted in this paper could be addressed if these registries would collect high-quality flexion-extension studies for cohorts of enrolled patients, and then obtain validated intervertebral morphometry and motion metrics. This small incremental effort may lead to efficacious new strategies for optimizing patient outcomes.
The effects of translation and rotation on nerve roots and other tissues may be exacerbated by congenitally narrow foramina or spinal canals, loss of disc height, spondylolisthesis, stenosis, and other factors. This hypothesis can be appreciated by viewing examples of intervertebral motion in symptomatic patients:
https://www.dropbox.com/sh/7z3vu3i977ip530/AAD8Oc-Ref_PAJd0tEPopkdXa?dl=0