The significance of group-level change is evaluated to assess treatment efficacy and effectiveness. In addition, group-level minimally important change (MIC) thresholds are used because trivial mean change can be statistically significant if the sample size is large enough. The MIC indicates whether statistically significant group mean differences are large enough to be important or meaningful to patients and clinicians. Identifying those who improve (“responders” to treatment) provides important supplemental information beyond group-level change. This paper reviews approaches for assessing the MIC and for estimating responders to treatment. We note that while group-level MIC thresholds have been used to identify responders to treatment in HRQOL studies [1,2], other approaches are more appropriate.
Estimating the MIC
MIC estimates rely on anchors to provide an external indication of the level of underlying change. The variety of possible anchors makes a single MIC estimate problematic, and it is advisable to use multiple anchors whenever possible. The most commonly used anchor is a retrospective rating of change question such as:
How is your health now compared to 6 weeks ago?
Much better
A little better
About the same
A little worse
Much worse
This example item refers to change in “health.” Depending on the context and the measure being evaluated, the anchor might be worded more specifically, such as “physical functioning,” “pain,” “getting along with family,” etc. The choice of wording is likely to result in different MIC estimates. In addition, there are known limitations of retrospective ratings of change, including a tendency to reflect the patient’s current state more than change, potentially due to recall bias [3,4].
Change on the target measure should be correlated with, and have a monotonic association with, change indicated on the anchor. Mean change on the target measure should be larger for the subgroup of people who report they are much better on the anchor than for the other subgroups, and those who report no change on the anchor should have no more than minimal change on the target measure [5]. The mean group change on a HRQOL measure for those who report being “a little better” (improvement) or “a little worse” (decrement) is the basis for MIC estimates. But sometimes investigators fail to limit the MIC estimate to those who changed a little and instead include everyone reporting any change on the retrospective rating of change item. This was the case in a sample of 123 adult spinal surgery patients [6] and in a study of 223 patients with chronic low back pain [7]. Including all those who changed, rather than focusing on those with minimal but important change, led to MIC estimates that were too large.
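To make the anchor-based calculation concrete, the sketch below (not code from the cited studies) computes mean change on a target measure within each anchor category; the MIC for improvement and for deterioration is taken as the mean change among those reporting “a little better” or “a little worse,” respectively. The DataFrame and column names are hypothetical.

```python
# Hypothetical sketch of anchor-based MIC estimation; assumes a pandas
# DataFrame `df` with columns `change` (follow-up minus baseline on the
# target measure) and `anchor` (retrospective rating of change category).
import pandas as pd

def anchor_based_mic(df: pd.DataFrame) -> dict:
    means = df.groupby("anchor")["change"].mean()
    return {
        # MIC estimates come only from the minimally changed groups
        "mic_improvement": means.get("A little better"),
        "mic_deterioration": means.get("A little worse"),
        # The "about the same" group should show no more than minimal change
        "no_change_group_mean": means.get("About the same"),
        # Inspect all category means to confirm a monotonic association
        "all_category_means": means.to_dict(),
    }
```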
Identifying Responders to Treatment
Individual-level variation and change can be estimated using simulation modeling for time series data, but this requires a minimum of 10 observations in the data stream [8]. Similarly, Moinpour et al. [9] estimated mixed effects models and noted that the PROMIS fatigue computer adaptive test would need 15 total assessments to obtain 0.90 reliability of change. Because of limits on research budgets and concerns about respondent burden, nearly all longitudinal HRQOL studies are limited to a few waves of assessment (e.g., two time points). Guidance for identifying responders to treatment in this environment is needed. Hence, we review approaches for estimating individual change from baseline to a single post-baseline assessment.
Table 1 lists several formulae previously proposed for estimating the significance of individual change that are analogous to between-group t-tests [10,11]. All the formulae include individual change in the numerator and error in the denominator. The methods differ in how they estimate that error term (e.g., the time 1 standard error of measurement versus the standard error of the difference or of prediction; see Table 1).
![](https://myfiles.space/user_files/58677_ec8811c6b4185256/58677_custom_files/img1613667565.png)
Following the conventional p < .05 threshold for group-level research, responders are usually defined by an RCI of 1.96 or larger. A variant of the RCI used for cognitive measures corrects for practice effects [12], though caution has been raised about use of this particular RCI variant [13]. The denominator of the RCI for item response theory (IRT) calibrated measures uses the IRT standard errors at time 1 and time 2 [14]. The coefficient of repeatability indicates the amount of change necessary to be significant on the RCI and is, therefore, equivalent to it. This coefficient is also known as the minimally detectable change, the smallest real difference, and the smallest detectable change [15].
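As a concrete illustration (a minimal sketch, not the exact formulae in Table 1), the code below computes the classic RCI and the coefficient of repeatability from a time 1 SD and a reliability estimate; the illustrative values are the approximate AFEQT figures discussed later in this paper.

```python
# Minimal sketch of the reliable change index (RCI) and coefficient of
# repeatability computed from the time 1 SD and a reliability estimate.
import math

def sem(sd_time1: float, reliability: float) -> float:
    """Standard error of measurement at time 1."""
    return sd_time1 * math.sqrt(1.0 - reliability)

def reliable_change_index(x1: float, x2: float, sd_time1: float, reliability: float) -> float:
    """Individual change divided by the standard error of the difference."""
    se_diff = math.sqrt(2.0) * sem(sd_time1, reliability)
    return (x2 - x1) / se_diff

def coefficient_of_repeatability(sd_time1: float, reliability: float, z: float = 1.96) -> float:
    """Smallest change reaching |RCI| >= z (a.k.a. MDC, SDC, smallest real difference)."""
    return z * math.sqrt(2.0) * sem(sd_time1, reliability)

# Illustration with approximate AFEQT values discussed below (SD ~17.5,
# reliability 0.90): the coefficient of repeatability is ~15 points.
print(round(coefficient_of_repeatability(17.5, 0.90), 1))  # ~15.3
```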
Variations to these methods have been proposed to account for regression to the mean (see Table 1). Regression-based approaches compare observed scores at time 2 with regression predicted scores based on time 1 score and other time 1 variables. This can be useful clinically because time 2 status is compared to what would be expected based on time 1 characteristics.
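A hedged sketch of one regression-based variant follows; it assumes a simple linear regression of time 2 scores on time 1 scores only, and standardizes the residuals as a simplified stand-in for the formal standard error of prediction used in published formulae.

```python
# Simplified regression-based reliable change: compare observed time 2
# scores with scores predicted from time 1, then standardize the residuals.
import numpy as np

def regression_based_change(x1: np.ndarray, x2: np.ndarray) -> np.ndarray:
    slope, intercept = np.polyfit(x1, x2, deg=1)
    predicted = intercept + slope * x1
    residuals = x2 - predicted
    # |z| >= 1.96 flags change beyond what time 1 status would predict
    return residuals / residuals.std(ddof=2)  # ddof=2: two estimated parameters
```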
MIC Thresholds Should Not Be Used to Identify Responders to Treatment
There are two major problems with applying group-based MIC methods to categorize individual patients as having changed or not: one conceptual and one statistical. The conceptual issue concerns using averages derived from groups that may not be relevant to any one patient. MIC estimates are averages of individual-level MICs, implying a distribution of individual MICs; small changes may be meaningful for some and large changes for others [16,17]. Even if such MIC estimates are derived from patient-reported anchors representing the construct of interest, these averages may not represent change that is meaningful to individuals. For example, an individual patient who would consider only a large improvement in physical function to be meaningful is likely not interested in achieving the average improvement, since the average value falls below that individual’s perception of meaningful change. The statistical issue is that group-based MIC methods drastically underestimate the amount of change needed to be significant at the individual level because of the large measurement error around individual change scores [18]. “Any inspection of measured data reveals an order of magnitude difference between the variability in group versus individual changes” [19]. Thus, group-based MIC estimates will often be indistinguishable from individual score error [20].
Abu et al. [21] is a recent example of using MIC thresholds to identify whether patients improved or declined on the Atrial Fibrillation Effect on QualiTy-of-Life (AFEQT) Questionnaire. A five-point change was used as the threshold for “clinically meaningful change.” This threshold was based on group-level MIC estimates from a prior study of the AFEQT MIC that used physician assessment of functional status [22]. The authors concluded that 22% declined and 40% improved from baseline to 1 year later in a sample of 1097 older adults with atrial fibrillation. Table 2 shows the standard deviations, internal consistency reliabilities, and coefficients of repeatability for the four AFEQT scores we computed. The coefficients of repeatability are two to three times larger than the 5-point change threshold the authors used. Ironically, Abu et al. could have adopted the more appropriate SDC estimates (equal to the coefficient of repeatability) reported by Spertus et al. [22].
It is clear that Abu et al.’s [21] paper is among the cases in which the MIC, derived from group-based estimates, falls well below the coefficient of repeatability. When this is the case, Kemmler et al. [20] suggest raising the MIC threshold to the coefficient of repeatability. Terwee et al. [15] recommend examining how measurement error might be reduced: 1) increasing the homogeneity of the study sample’s scores at the first measurement time point, thereby reducing the SD; and/or 2) increasing the reliability of the measure. Both options are difficult when the required SD reduction or reliability increase is substantial.
Using the Abu et al. [21] example, we calculated and plotted the SDs needed at 0.90, 0.95, and 0.99 reliability on the AFEQT. Figure 1 takes the approximate SD (~17.5) and coefficient of repeatability (~15) observed for the AFEQT overall scale at 0.90 reliability as a starting point and then examines scenarios in which the reliability is increased or the SD is decreased. As seen in the plot, at 0.90 reliability the SD must drop to about 5 for the coefficient of repeatability to equal the MIC. If the reliability were 0.99, SDs under 17.5 would result in a coefficient of repeatability at or below the MIC. This example demonstrates the conditions required for an instrument’s coefficient of repeatability to equal its MIC; many instruments will not achieve such low SDs or high reliabilities under any circumstances.
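The arithmetic behind Figure 1 is a simple rearrangement of the coefficient-of-repeatability formula; the sketch below reproduces it under the assumptions stated above (MIC = 5, z = 1.96), so the printed values are approximations rather than exact reproductions of the plotted curves.

```python
# How small must the SD be, at a given reliability, for the coefficient of
# repeatability to shrink to the MIC? Rearranging
#   CR = 1.96 * sqrt(2) * SD * sqrt(1 - reliability)
# gives SD = MIC / (1.96 * sqrt(2) * sqrt(1 - reliability)).
import math

def sd_needed(mic: float, reliability: float, z: float = 1.96) -> float:
    return mic / (z * math.sqrt(2.0) * math.sqrt(1.0 - reliability))

for rel in (0.90, 0.95, 0.99):
    print(rel, round(sd_needed(5.0, rel), 1))
# 0.90 -> ~5.7, 0.95 -> ~8.1, 0.99 -> ~18.0 (close to the values read from Figure 1)
```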
Combining Statistically Significant and Meaningful Individual Change
A clinician or researcher might also regard relative standing on the measure at the follow-up time point to be important. In some areas of medicine, change in clinical status alone is enough to be important. For example, COVID-19 patients who changed to a more positive level on a six-point ordinal scale (not hospitalized; hospitalized but not requiring supplemental oxygen; hospitalized, requiring supplemental oxygen; hospitalized, requiring nasal high-flow oxygen therapy, non-invasive ventilation, or both; hospitalized, requiring invasive mechanical ventilation, extracorporeal membrane oxygenation, or both; dead) were regarded as improved in one study [23]. Or, a primary care physician might be interested in whether a patient ends up within the normal blood pressure range following initiation of high blood pressure medicine. Similarly, a rehabilitation clinician might want to know whether a patient with impaired physical functioning at the beginning of treatment ends up functioning as well as other people with a similar condition. The FDA has suggested that meaningful change needs to be assessed in addition to significant individual change [1]. Some contend that any individual change that is significant at p < .05 is substantial and likely to be meaningful to patients [10,24].
Jacobson and Truax [25] classified change as: 1) recovered (statistically significant and clinically significant); 2) improved (statistically significant but not clinically significant); 3) unchanged (not statistically significant); and 4) deteriorated (statistically significant decrement). In one study, responders were those with significant individual improvement on the Functional Disability Inventory (FDI) and improvement in FDI severity level (no/minimal disability, moderate disability, severe disability) [26]. These change categories offered by Jacobson and Truax may be more appealing than use of either statistically significant change (the coefficient of repeatability) or the MIC alone.
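A schematic of how these four categories might be operationalized is sketched below; it assumes an RCI for the significance criterion and the crossing of a severity cutoff (e.g., an FDI severity band) for the clinical criterion. The function and argument names are illustrative, not taken from the cited studies.

```python
# Illustrative Jacobson-Truax style classification combining statistical
# significance (|RCI| >= 1.96) with a clinical criterion (whether the score
# falls in the clinical range, e.g., above an FDI severity cutoff).
def classify_change(rci: float, clinical_at_baseline: bool, clinical_at_followup: bool) -> str:
    if rci >= 1.96 and clinical_at_baseline and not clinical_at_followup:
        return "recovered"      # significant improvement that crosses the clinical cutoff
    if rci >= 1.96:
        return "improved"       # significant improvement only
    if rci <= -1.96:
        return "deteriorated"   # significant decrement
    return "unchanged"          # change not statistically significant
```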
Secondary Analysis Combining Significant and Meaningful Individual Change
To illustrate how significant individual change and meaningful individual change can be presented together, we conduct a secondary analysis of the Impact Stratification Score (ISS), computed from the PROMIS-29 administered in a prospective comparative effectiveness clinical trial of 750 active-duty U.S. military personnel [27]. The average age of the sample was 31; 76% were male and 67% were white. Most of the participants reported low back pain for more than 3 months.
The ISS was proposed for use with chronic low back pain patients by a National Institutes of Health Pain Consortium research task force. The ISS is the sum of the PROMIS-29 v2.1 physical function, pain interference, and pain intensity scores [28]. The ISS has a possible range of 8 (least impact) to 50 (greatest impact). Physical function (4 items with response options ranging from “without any difficulty” = 1 to “unable to do” = 5) and pain interference (4 items with response options ranging from “not at all” = 1 to “very much” = 5) each contribute 4 to 20 points, and the pain intensity item contributes 0 to 10 points. The task force proposed three categories of ISS severity: 8-27 (mild), 28-34 (moderate), and 35-50 (severe).
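To make the scoring concrete, the sketch below computes the ISS and its severity category as described above; the example item scores are invented for illustration.

```python
# Minimal sketch of the ISS: raw PROMIS-29 physical function and pain
# interference sums (4-20 points each) plus the 0-10 pain intensity item.
def impact_stratification_score(physical_function: list[int],
                                pain_interference: list[int],
                                pain_intensity: int) -> tuple[int, str]:
    iss = sum(physical_function) + sum(pain_interference) + pain_intensity  # range 8-50
    if iss <= 27:
        severity = "mild"       # 8-27
    elif iss <= 34:
        severity = "moderate"   # 28-34
    else:
        severity = "severe"     # 35-50
    return iss, severity

# Example: four physical function items scored 3, four pain interference
# items scored 3, and pain intensity of 6 -> ISS = 12 + 12 + 6 = 30 (moderate).
print(impact_stratification_score([3, 3, 3, 3], [3, 3, 3, 3], 6))
```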
Following guidelines by de Vet et al. [29], Dutmer et al. [7] estimated an SEM of 5.2 for the ISS based on test-retest reliability. But test-retest reliability estimates can be problematic: test-retest reliability can underestimate reliability when there is true underlying change. Reeve et al. [30] noted that:
ISOQOL respondents agreed that as a minimum standard a multi-item PRO measure should be assessed for internal consistency reliability.… However, they did not support as a minimum standard that a multi-item PRO measure should be required to have evidence of test–retest reliability. They noted practical concerns regarding test–retest reliability; primarily that some populations studied in PCOR are not stable and that their HRQOL can fluctuate. This phenomenon would reduce estimates of test–retest reliability, making the PRO measure look unreliable when it may be accurately detecting changes over time. In addition, memory effects will positively influence the test–retest reliability when the two survey points are scheduled close to each other.
We estimated a much smaller SEM of 2.4 using an internal consistency reliability estimate from another study [27]. In this dataset, we examine the significance of individual change on the ISS between baseline and 6 weeks later using the coefficient of repeatability (= 6.6). In addition, we compare the significance of change with self-reports on a retrospective rating of change item administered at 6 weeks: “Compared to your first visit, your low back pain is: much worse, a little worse, about the same, a little better, moderately better, much better, or completely gone?”
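The 6.6-point coefficient of repeatability follows directly from this SEM: coefficient of repeatability = 1.96 × √2 × SEM = 1.96 × 1.414 × 2.4 ≈ 6.6 points of ISS change.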
Thirty-seven percent of the sample improved significantly on the ISS over these 6 weeks, and 59% reported on the retrospective change item that they were better (16% a little better, 14% moderately better, 23% much better, and 6% completely gone). Among those who improved significantly on the ISS, 89% reported they were better on the retrospective rating item. Thirty-three percent of the sample improved significantly and reported improvement on the retrospective change item (statistically and clinically significant), 4% improved significantly but did not report that they were better on the retrospective change item (statistically but not clinically significant), 26% did not improve significantly but reported improvement on the change item, and 37% neither improved significantly nor reported improvement on the change item.
Extending this application to further illustrate how group-based methods of estimating MICs can underestimate significant individual change, we compared two alternative ways of defining improvement on the retrospective rating of change item to identify optimal cut points on the ISS. The first definition is more inclusive: improvement from baseline to 6 weeks later included those who reported on the retrospective change item at 6 weeks that their back pain was a little better, moderately better, much better, or completely gone. The second definition is more restrictive: improvement was limited to those who reported their back pain was moderately better, much better, or completely gone.
The Youden index [31], (sensitivity + specificity) − 1, suggested an optimal cut point of 5 points of change on the ISS from baseline to 6 weeks later for the first definition of improvement: sensitivity of 65%, specificity of 82%, negative predictive value of 62%, and positive predictive value of 84%. For the second definition of improvement, the Youden index indicated an optimal cut point of 7 points of ISS change: sensitivity of 66%, specificity of 85%, negative predictive value of 77%, and positive predictive value of 76%. The group-level threshold estimated for the second definition, which excluded from the improvement group those who said they were only a little better, was closer to the coefficient of repeatability.
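For readers who wish to apply this kind of cut-point search to their own data, the sketch below implements the Youden index, (sensitivity + specificity) − 1, over candidate change-score cut points; the array names are hypothetical and this is not the analysis script used for the trial data.

```python
# Hedged sketch of a Youden index search for the optimal change-score cut
# point against an anchor-defined improvement flag.
import numpy as np

def youden_optimal_cutpoint(change: np.ndarray, improved: np.ndarray) -> tuple[float, float]:
    """change: improvement scores (e.g., baseline minus 6-week ISS);
    improved: 1 if the anchor classifies the person as improved, else 0."""
    best_cut, best_j = None, -np.inf
    for cut in np.unique(change):
        predicted = change >= cut
        sensitivity = predicted[improved == 1].mean()
        specificity = (~predicted)[improved == 0].mean()
        j = sensitivity + specificity - 1.0
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j
```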