A Head-to-head Comparison of the EQ-5D-3L Index Scores Derived from the Two EQ-5D-3L Value Sets for China

doi:10.21203/rs.3.rs-1250025/v1

Research Article

This work is licensed under a CC BY 4.0 License

Background: Two EQ-5D-3L (3L) value sets (developed in 2014 and 2018) co-exist in China. The study examined the level of agreement between index scores for all the 243 health states derived from them at both absolute and relative levels and compared the responsiveness of the two indices.

Methods: The intraclass correlation coefficient (ICC) and the Bland-Altman plot were used to assess the degree of agreement between the two indices at the absolute level. Health gains for 29,403 possible transitions between pairs of 3L health states were calculated to assess the agreement at the relative level. The responsiveness of the two indices to these transitions was assessed using Cohen's effect size.

Results: The mean (standard deviation, SD) value was 0.427 (0.206) and 0.649 (0.189) for the 3L₂₀₁₄ and 3L₂₀₁₈ index scores, respectively. Although the ICC value showed good agreement (i.e., 0.896), 88.9% (216/243) of the points were beyond the minimally important difference limit according to the Bland-Altman plot. The mean health gains for the 29,403 health transitions were 0.234 (3L₂₀₁₄ index score) and 0.216 (3L₂₀₁₈ index score). The two indices predicted consistent transitions in 23,720 (80.7%) of 29,403 pairs. For the consistent pairs, Cohen's effect size was 1.05 (3L₂₀₁₄ index score) and 1.06 (3L₂₀₁₈ index score), and the 3L₂₀₁₄ index score yielded only 0.007 more utility gains. However, the results based on the two measures varied substantially according to the direction and magnitude of health change.

Conclusion: The 3L₂₀₁₄ and 3L₂₀₁₈ index scores are not interchangeable. The choice between them is likely to influence QALY estimates.

EQ-5D-3L

China

value set

index score

comparison

The EQ-5D-3L (3L) is the most widely used utility instrument for valuing health-related quality of life (HRQoL) [1-4] in the calculation of quality-adjusted life years (QALYs). Its classification system consists of five dimensions: mobility (MO), self-care (SC), usual activities (UA), pain/discomfort (PD), and anxiety/depression (AD), each with three functioning levels (no problems, moderate problems, and extreme problems). The system thus defines 243 (3⁵) possible health states [5], each of which can be coded as a five-digit number ranging from “11111” to “33333” (e.g., 12321 means no problems in mobility, moderate problems in self-care, extreme problems in usual activities, moderate problems in pain/discomfort, and no problems in anxiety/depression). A single utility index score can be assigned to each health state by using a value set, which is developed in a valuation study based on the health preferences of the general population. Since health preferences differ across populations [6,7], a number of 3L value sets have been derived in different countries/regions [8]. Some countries (e.g., Korea, USA, and China) have even developed two value sets, each for its own reasons [9-14]. In China, for example, the first value set, developed in 2014 (i.e., the 3L₂₀₁₄ value set), used a sample comprising residents mainly from urban areas, whereas the second value set, developed in 2018 (i.e., the 3L₂₀₁₈ value set), adopted a more representative sample of residents from both rural and urban areas (Table 1).
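As an illustration of the coding scheme described above, the following minimal Python sketch enumerates the 3⁵ = 243 health states and decodes a five-digit state into one description per dimension. The function names (`all_states`, `describe`) are hypothetical and are not part of any EQ-5D software.

```python
from itertools import product

# Dimension order used in the five-digit code: MO, SC, UA, PD, AD.
DIMENSIONS = ("mobility", "self-care", "usual activities",
              "pain/discomfort", "anxiety/depression")
LEVELS = {1: "no problems", 2: "moderate problems", 3: "extreme problems"}


def all_states():
    """Return all five-digit 3L health states from '11111' to '33333'."""
    return ["".join(str(level) for level in levels)
            for levels in product((1, 2, 3), repeat=5)]


def describe(state):
    """Map a five-digit state such as '12321' to one description per dimension."""
    if len(state) != 5 or any(ch not in "123" for ch in state):
        raise ValueError(f"not a valid 3L health state: {state!r}")
    return {dim: LEVELS[int(ch)] for dim, ch in zip(DIMENSIONS, state)}
```

For example, `describe("12321")` maps usual activities to "extreme problems", in line with the example given in the text.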

Despite the availability of the EQ-5D-5L (5L, a new version of the 3L) index score with improved psychometric properties [15-18], the 3L index score remains highly useful because of the need for consistency and continuity in decision making [19]. Indeed, the National Health Service Survey in China continued to use the 3L to measure the HRQoL of Chinese residents even after the publication of the 5L value set for China in 2017 [20]. Moreover, a 3L value set can also be used to generate 5L index scores from 5L responses through a crosswalk function [21], thus taking advantage of the 5L descriptive system.

Similarly, the 3L₂₀₁₄ value set is still more frequently used than the 3L₂₀₁₈ value set, despite the limitation of its sampling method. According to Web of Science, the former had been cited in 190 articles by January 11, 2022, 119 of which cited it after the latter became available. In contrast, the 3L₂₀₁₈ value set had been cited only 25 times since its publication [22,23]. Given the noticeable differences in the coefficients of the scoring algorithms for the two value sets (Table 1), it is unlikely that they would yield identical utility index scores for the same health state. However, it remains unclear to what extent the use of utility scores generated from the two value sets would affect the results of QALY computation, which depends mainly on differences in utility scores rather than on absolute utility scores. Moreover, it is not known whether the difference in the utility scores is clinically important. Our previous study compared the two 3L indices in diabetes patients and found that they had different discriminative power and that the choice between them may affect QALY estimation [24]. Another study compared them in patients with gastric cancer and healthy controls, and showed that the 3L₂₀₁₄ index score was better able to distinguish the patients from the controls [25]. Nevertheless, each of these studies was based on a single disease, so it is not known whether the findings can be generalized to the general population or to other patients in China.

Hence, the study aimed to: 1) examine the level of agreement, at both absolute and relative levels, between the 243 index scores derived from each of the two 3L value sets for China; and 2) compare the responsiveness of the two indices (i.e., their ability to capture real changes in health states over time).

The 3L index scores generated from the two 3L value sets for China

The two 3L value sets were developed using different sampling methods, valuation protocols [26,27], and modeling methods, leading to distinct algorithms for calculating the 3L index scores (Table 1). For example, the utility score for health state “23221” is 0.466 (i.e., 1-0.039-0.099-0.208-0.074-0.092-0.022) according to the 2014 algorithm or 0.568 (i.e., 1-0.077-0.291-0.037-0.027) according to the 2018 algorithm. In the study, both algorithms were used to generate index scores for all 243 3L health states for analysis.
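The two scoring algorithms can be sketched in Python as follows, using the coefficients as printed in Table 1. This is a minimal illustration, not the authors' code: the function names are hypothetical, coefficients are stored in thousandths to avoid floating-point drift, and it is assumed (based on the reported score range and the worked examples) that the 2014 constant applies to every state other than “11111”.

```python
DIMS = ("MO", "SC", "UA", "PD", "AD")

# (level-2, level-3) decrements in thousandths, transcribed from Table 1.
COEF_2014 = {"MO": (99, 246), "SC": (105, 208), "UA": (74, 193),
             "PD": (92, 236), "AD": (86, 205)}
COEF_2018 = {"MO": (77, 267), "SC": (44, 291), "UA": (37, 54),
             "PD": (27, 41), "AD": (36, 177)}


def decrement(state, coef, constant=0, n3=0):
    """Total utility decrement (in thousandths) for a five-digit 3L state."""
    if len(state) != 5 or any(ch not in "123" for ch in state):
        raise ValueError(f"not a valid 3L health state: {state!r}")
    levels = [int(ch) for ch in state]
    total = sum(coef[dim][lvl - 2] for dim, lvl in zip(DIMS, levels) if lvl > 1)
    if total > 0:        # assumption: constant applies to any state except 11111
        total += constant
    if 3 in levels:      # N3 term: at least one dimension at level 3
        total += n3
    return total


def score_2014(state):
    """3L2014 index score (constant 0.039, N3 term 0.022)."""
    return (1000 - decrement(state, COEF_2014, constant=39, n3=22)) / 1000


def score_2018(state):
    """3L2018 index score (no constant, no N3 term)."""
    return (1000 - decrement(state, COEF_2018)) / 1000
```

Under these assumptions, `score_2014("23221")` and `score_2018("23221")` reproduce the worked example in the text (0.466 and 0.568).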

Statistical Analysis

We assessed the distributions of the two indices (i.e., the 3L₂₀₁₄ index score and the 3L₂₀₁₈ index score) using the Shapiro-Wilk test. The t-test or the Wilcoxon rank-sum test was then used to compare their mean values, as appropriate.

A two-way mixed intraclass correlation coefficient (ICC) [28] and the Bland-Altman plot [29] were adopted to assess the degree of agreement between the two indices at the absolute level. The agreement was considered good when the ICC value was higher than 0.7. The Bland-Altman plot was used to visualize and assess the level of agreement across different utility segments, whereby the Y-axis depicts the difference in score between the two indices and the X-axis represents their mean. A limit of 0.074, which is the minimally important difference (MID) of the 3L index score [30], was used to determine whether the magnitude of the difference would be clinically important.
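The Bland-Altman comparison against the MID limit can be sketched as below. This is a hedged illustration rather than the authors' procedure: it uses the Table 1 coefficients as printed (in thousandths), assumes the 2014 constant applies to every state except “11111”, and adds the conventional limits of agreement (bias ± 1.96 SD) for reference. Because the printed coefficients are rounded, counts obtained this way may differ slightly from the reported figures.

```python
from itertools import product
from statistics import mean, stdev

DIMS = ("MO", "SC", "UA", "PD", "AD")
# (level-2, level-3) decrements in thousandths, transcribed from Table 1.
COEF_2014 = {"MO": (99, 246), "SC": (105, 208), "UA": (74, 193),
             "PD": (92, 236), "AD": (86, 205)}
COEF_2018 = {"MO": (77, 267), "SC": (44, 291), "UA": (37, 54),
             "PD": (27, 41), "AD": (36, 177)}
MID = 74  # minimally important difference (0.074) in thousandths


def score(levels, coef, constant=0, n3=0):
    """Index score in thousandths for a tuple of five levels (1 to 3)."""
    loss = sum(coef[d][lvl - 2] for d, lvl in zip(DIMS, levels) if lvl > 1)
    if loss:
        loss += constant  # assumption: constant applies to any state except 11111
    if 3 in levels:
        loss += n3
    return 1000 - loss


def bland_altman_points():
    """Return {state: (mean of both scores, 3L2014 minus 3L2018)} in thousandths."""
    points = {}
    for levels in product((1, 2, 3), repeat=5):
        s14 = score(levels, COEF_2014, constant=39, n3=22)
        s18 = score(levels, COEF_2018)
        points["".join(map(str, levels))] = ((s14 + s18) / 2, s14 - s18)
    return points


def summarize(points):
    """Bias, limits of agreement, and number of points beyond the MID limit."""
    diffs = [diff for _, diff in points.values()]
    bias, sd = mean(diffs), stdev(diffs)
    return {
        "bias": bias / 1000,
        "limits_of_agreement": ((bias - 1.96 * sd) / 1000,
                                (bias + 1.96 * sd) / 1000),
        "n_beyond_mid": sum(abs(diff) > MID for diff in diffs),
    }
```

Calling `summarize(bland_altman_points())` returns the mean difference together with the number of the 243 states whose absolute difference exceeds 0.074.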

To examine the agreement of the two 3L index scores at the relative level, we simulated all the possible health state transitions that may occur over time. All 243 health states were paired to form 29,403 (i.e., C(243, 2) = 243 × 242 / 2) health state combinations, each of which was used to simulate a pair of health states before and after treatment. Within each pair, the health state with the higher index score was assumed to be the state after treatment (post-treatment), and the one with the lower index score the state before treatment (pre-treatment) [31]. Hence, the health gains of our simulated treatment were always positive. However, the index score of the same health state may vary from one value set to the other, so a health state labeled as pre-treatment under the 3L₂₀₁₄ value set may instead represent post-treatment under the 3L₂₀₁₈ value set in the same pair, or vice versa. We considered this an “inconsistent” pair of health states [32], for which the choice of index score would have a substantial impact on health outcomes, i.e., one index may generate a health gain while the other generates a health loss.
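The pairing procedure can be illustrated with the sketch below, which enumerates all unordered pairs of states and checks whether the two value sets agree on which state scores higher. It is a simplified reading of the method, not the authors' code: coefficients are taken from Table 1 as printed (in thousandths), the 2014 constant is assumed to apply to every state except “11111”, and pairs in which either value set gives the two states identical scores are reported separately as “tied”, since the text does not say how such pairs were handled.

```python
from itertools import combinations, product

DIMS = ("MO", "SC", "UA", "PD", "AD")
# (level-2, level-3) decrements in thousandths, transcribed from Table 1.
COEF_2014 = {"MO": (99, 246), "SC": (105, 208), "UA": (74, 193),
             "PD": (92, 236), "AD": (86, 205)}
COEF_2018 = {"MO": (77, 267), "SC": (44, 291), "UA": (37, 54),
             "PD": (27, 41), "AD": (36, 177)}


def score(levels, coef, constant=0, n3=0):
    """Index score in thousandths for a tuple of five levels (1 to 3)."""
    loss = sum(coef[d][lvl - 2] for d, lvl in zip(DIMS, levels) if lvl > 1)
    if loss:
        loss += constant  # assumption: constant applies to any state except 11111
    if 3 in levels:
        loss += n3
    return 1000 - loss


def all_scores():
    """Return {state: (3L2014 score, 3L2018 score)} in thousandths."""
    return {"".join(map(str, lv)): (score(lv, COEF_2014, 39, 22),
                                    score(lv, COEF_2018))
            for lv in product((1, 2, 3), repeat=5)}


def compare_pair(a, b, scores):
    """Label the move a -> b and return its gain under each value set."""
    gain_2014 = scores[b][0] - scores[a][0]
    gain_2018 = scores[b][1] - scores[a][1]
    if gain_2014 == 0 or gain_2018 == 0:
        label = "tied"            # hypothetical category, not in the paper
    elif (gain_2014 > 0) == (gain_2018 > 0):
        label = "consistent"      # both value sets agree on the direction
    else:
        label = "inconsistent"    # one gain, one loss
    return label, gain_2014, gain_2018


def tally():
    """Count consistent, inconsistent, and tied pairs over all C(243, 2) pairs."""
    scores = all_scores()
    counts = {"consistent": 0, "inconsistent": 0, "tied": 0}
    for a, b in combinations(scores, 2):
        counts[compare_pair(a, b, scores)[0]] += 1
    return counts
```

The three counts returned by `tally()` always sum to 29,403; the split between them depends on the tie-handling assumption above and on coefficient rounding.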

In contrast, for a “consistent” pair, the health state representing pre-treatment was the same under both the 3L₂₀₁₄ and the 3L₂₀₁₈ value sets. Given that the magnitude of health gains may vary from one value set to the other, the consistent group was further divided into four subgroups according to the perceived direction and magnitude of the change between pre- and post-treatment: (1) major improvement (i.e., at least one dimension improves from level 3 to level 1 or level 2, and no dimension deteriorates); (2) minor improvement (i.e., at least one dimension improves from level 2 to level 1, no dimension improves from level 3 to level 1 or 2, and no dimension deteriorates); (3) mixed response with minor deterioration (i.e., at least one dimension deteriorates from level 1 to level 2, and no dimension deteriorates from level 1 or 2 to level 3); (4) mixed response with major deterioration (i.e., at least one dimension deteriorates from level 1 or 2 to level 3) [31]. It should be noted that if one dimension deteriorates while others improve in a health transition, the transition is considered a mixed response with some deterioration and is thus assigned to either subgroup 3 or subgroup 4. We then compared the health gains yielded by the two 3L indices for all the transitions, for the consistent transitions, and for each subgroup of the consistent transitions.
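The four subgroup rules depend only on the dimension levels of the pre- and post-treatment states, so they can be sketched without any value set. The function below is an illustrative reading of the definitions above (the name `classify` and the “no change” label are hypothetical); it checks for major deterioration first, then minor deterioration, then major and minor improvement.

```python
from itertools import product


def classify(pre, post):
    """Assign a transition pre -> post (five-digit 3L states) to a subgroup."""
    changes = [(int(p), int(q)) for p, q in zip(pre, post)]
    if any(q == 3 and p < 3 for p, q in changes):
        return "mixed response with major deterioration"
    if any(p == 1 and q == 2 for p, q in changes):
        return "mixed response with minor deterioration"
    if any(p == 3 and q < 3 for p, q in changes):
        return "major improvement"
    if any(p == 2 and q == 1 for p, q in changes):
        return "minor improvement"
    return "no change"  # pre and post are the same state


def count_pure_improvements():
    """Count improvement-only transitions over all ordered pairs of distinct states."""
    states = ["".join(map(str, lv)) for lv in product((1, 2, 3), repeat=5)]
    counts = {"major improvement": 0, "minor improvement": 0}
    for pre in states:
        for post in states:
            label = classify(pre, post)
            if label in counts:
                counts[label] += 1
    return counts
```

Counting improvement-only transitions this way gives 6,752 major and 781 minor improvements, which matches the subgroup sizes reported in the Results. The sizes of the two mixed-response subgroups cannot be obtained from levels alone, because they depend on which state each value set scores higher.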

We also compared the responsiveness of the two 3L indices within the consistent group by using Cohen's effect size [33]. It is commonly used to measure the effect size of a treatment and, unlike a significance test, is independent of the sample size. It is calculated as the difference between the mean post-treatment and pre-treatment scores divided by the standard deviation of the pre-treatment scores. The effect size was categorized as small (0.2–0.5), moderate (> 0.5–0.8), or large (> 0.8) [33]. Given that the hypothetical treatment was fixed in our simulation, the effect size reflects the ability of an index score to discern the change between two known health states. The higher the effect size, the more responsive the index score is. We calculated and compared Cohen's effect size for all the consistent pairs and for each subgroup of the pairs. Microsoft Excel, Stata, and SAS were used for statistical analysis.
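The effect size defined above can be written as a short Python function. This is a minimal sketch: the function names are hypothetical, the sample standard deviation (n - 1 denominator) is an assumption because the text does not specify which form was used, and the label for values below 0.2 is added only so that the function always returns something.

```python
from statistics import mean, stdev


def cohen_effect_size(pre_scores, post_scores):
    """Mean change (post minus pre) divided by the SD of the pre-treatment scores."""
    if len(pre_scores) != len(post_scores):
        raise ValueError("pre and post scores must be paired")
    gains = [post - pre for pre, post in zip(pre_scores, post_scores)]
    return mean(gains) / stdev(pre_scores)  # assumption: sample SD


def categorize(effect_size):
    """Apply the thresholds from the text: small, moderate, or large."""
    size = abs(effect_size)
    if size > 0.8:
        return "large"
    if size > 0.5:
        return "moderate"
    if size >= 0.2:
        return "small"
    return "below the small threshold"  # not defined in the text
```

For instance, `categorize(1.05)` returns "large" and `categorize(0.45)` returns "small".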

The two 3L indices were both normally distributed according to the Shapiro-Wilk test (Figure 1). Overall, the 3L₂₀₁₄ value set generated systematically lower index scores than the 3L₂₀₁₈ value set. The mean (standard deviation, SD) of all the index scores was 0.427 (0.206) for the former and 0.649 (0.189) for the latter, a difference in means of 0.222 (p<0.001) (Table 2); the 3L₂₀₁₄ value set also had lower scores for 239 of the 243 health states. Meanwhile, the difference between the two index scores and its variance were not constant but generally increased with health-state severity (Figure 2). For example, the index score of the second-best health state was 0.887 (state “11211”) under the 3L₂₀₁₄ value set and 0.973 (state “11121”) under the 3L₂₀₁₈ value set, while the minimum index score (for the worst state, “33333”) was -0.149 and 0.170, respectively. Although the overall agreement between the two index scores was good (ICC = 0.896), 88.9% (216/243) of the points were beyond the MID limit according to the Bland-Altman plot (Figure 3).

On the other hand, the difference between the two indices was less pronounced for the 29,403 health transitions: the mean (SD) health gains were 0.234 (0.173) and 0.216 (0.158) for the 3L₂₀₁₄ and 3L₂₀₁₈ index scores, respectively. In 23,720 (80.7%) of the 29,403 transitions, the two indices generated consistent results for health gains before and after a simulated treatment, with a difference in mean health gains for these transitions of only 0.007 (p<0.001) (Table 2). Among the consistent transitions, the number of pairs in each subgroup was 6,752 (major improvement), 781 (minor improvement), 4,515 (mixed response with minor deterioration), and 11,672 (mixed response with major deterioration).

In the major and minor improvement subgroups, the 3L₂₀₁₄ index score yielded larger health gains, at 0.411 and 0.151, respectively (vs. 0.310 and 0.072 from the 3L₂₀₁₈ index score). However, it generated similar or lower health gains than the 3L₂₀₁₈ index score in the subgroups of “mixed response with minor deterioration” (health gains: 0.246 for both index scores) and “mixed response with major deterioration” (health gains: 0.069 vs. 0.118) (Table 3).

The two indices also showed a similar level of sensitivity to change across all the consistent transitions, with Cohen's effect sizes of 1.05 and 1.06, respectively. Nevertheless, the values varied substantially across the subgroups. In the major and minor improvement subgroups, the 3L₂₀₁₄ index score demonstrated higher values than the 3L₂₀₁₈ index score (Cohen's effect size: 2.38 vs. 1.72 for major improvement and 0.88 vs. 0.45 for minor improvement). In the subgroup of mixed response with major deterioration, the result was reversed (Cohen's effect size: 0.37 vs. 0.66), while in the subgroup of mixed response with minor deterioration, the two index scores demonstrated similar responsiveness, with Cohen's effect sizes of 1.48 and 1.42 (Table 3).
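As a quick arithmetic check, the reported effect sizes can be recomputed from the summary statistics in Table 3 by dividing each mean health gain by the SD of the corresponding pre-treatment scores. The sketch below only re-uses the rounded values printed in Table 3; the dictionary layout and function name are illustrative.

```python
# (mean health gain, SD of pre-treatment score), transcribed from Table 3.
TABLE3 = {
    "all consistent transitions": {"3L2014": (0.203, 0.193), "3L2018": (0.195, 0.184)},
    "major improvement": {"3L2014": (0.411, 0.173), "3L2018": (0.310, 0.180)},
    "minor improvement": {"3L2014": (0.151, 0.172), "3L2018": (0.072, 0.159)},
    "mixed, minor deterioration": {"3L2014": (0.246, 0.166), "3L2018": (0.246, 0.173)},
    "mixed, major deterioration": {"3L2014": (0.069, 0.186), "3L2018": (0.118, 0.179)},
}


def effect_sizes(table=TABLE3):
    """Recompute effect size = mean gain / SD of pre-treatment score, to 2 decimals."""
    return {group: {index: round(gain / sd_pre, 2)
                    for index, (gain, sd_pre) in by_index.items()}
            for group, by_index in table.items()}
```

With the rounded inputs from Table 3, the ratios agree with the reported effect sizes to two decimal places (for example, 0.411 / 0.173 ≈ 2.38).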

In this study, we compared the agreement between the two 3L index scores generated from the two 3L value sets for China. We found that the 3L₂₀₁₄ index score was systematically lower than the 3L₂₀₁₈ index score at the absolute level, but that their differences at the relative level varied with the direction and magnitude of the health change.

It is not surprising that the 3L₂₀₁₄ index score was much lower, given that the 3L₂₀₁₄ algorithm has larger values for 8 of the 10 dimension-level parameters and two additional terms (i.e., the constant and N3) that further pull down the scores (Table 1). Both the difference between the two index scores and its variance increased with health-state severity. With regard to the former, the difference in level-3 (L3) parameters between the two algorithms is in general larger than the difference in level-2 (L2) parameters. This, together with the use of the N3 term, leads to the increased difference. The latter can be ascribed to the fact that the 3L₂₀₁₈ algorithm has two L3 parameters (i.e., MO3 and SC3) with larger values than those of the 3L₂₀₁₄ algorithm. As a result, for health states that include these problems, the difference between the index scores may be reduced rather than increased, resulting in larger variance across health states with L3 problems. Differences in algorithm parameters may be attributed to several factors, such as the valuation protocol, the modeling method, and the sample used [13,14]. The sample for the 3L₂₀₁₈ algorithm included the rural population, who may be more likely to have lived with economic hardship over the years. Hence, they may be able to endure more pain and suffering, leading to relatively higher utility estimates for health problems than those of better-off residents. In addition, the 3L₂₀₁₈ value set used an open-ended time trade-off (TTO) question. The developers of the 3L₂₀₁₈ value set believed that, for cultural reasons, death is a taboo in China, especially in rural areas. When using the TTO method, the researchers did not tell the respondents to imagine dying immediately after living in a hypothetical health state for a period of time. Therefore, the respondents may have made varying assumptions about the length and health state of the remaining life, which may have led to an overestimation of the TTO values.

The two indices generated consistent results for the majority (80.7%) of health transitions. For transitions involving improvement only, the results are always consistent regardless of the differences in scoring algorithms. Inconsistent results, on the other hand, arise for transitions that include both improvement and deterioration in different dimensions. Compared to the 3L₂₀₁₄ algorithm, the parameter coefficients of the 3L₂₀₁₈ algorithm display greater variance. The parameter values for L2 and L3 problems in the 3L₂₀₁₈ algorithm range from 0.027 (PD2) to 0.077 (MO2) and from 0.041 (PD3) to 0.291 (SC3), respectively, whereas those in the 3L₂₀₁₄ algorithm range from 0.074 (UA2) to 0.105 (SC2) and from 0.193 (UA3) to 0.246 (MO3) (Table 1). For example, a transition from health state “11131” to “11113” would be considered a health gain according to the 3L₂₀₁₄ algorithm (0.031) but a health loss according to the 3L₂₀₁₈ algorithm (-0.136).
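To see why a mixed transition can change sign between the two algorithms, the health gain can be broken down by term. The sketch below is illustrative only: it uses the Table 1 coefficients as printed (in thousandths), assumes the 2014 constant applies to every state except “11111”, and the function names are hypothetical.

```python
DIMS = ("MO", "SC", "UA", "PD", "AD")
# (level-2, level-3) decrements in thousandths, transcribed from Table 1.
COEF_2014 = {"MO": (99, 246), "SC": (105, 208), "UA": (74, 193),
             "PD": (92, 236), "AD": (86, 205)}
COEF_2018 = {"MO": (77, 267), "SC": (44, 291), "UA": (37, 54),
             "PD": (27, 41), "AD": (36, 177)}


def decrements(state, coef, constant=0, n3=0):
    """Per-term utility decrements (thousandths) for a five-digit 3L state."""
    levels = [int(ch) for ch in state]
    parts = {dim: (coef[dim][lvl - 2] if lvl > 1 else 0)
             for dim, lvl in zip(DIMS, levels)}
    # Assumption: the constant applies to any state other than 11111.
    parts["constant"] = constant if any(lvl > 1 for lvl in levels) else 0
    parts["N3"] = n3 if 3 in levels else 0
    return parts


def gain_breakdown(pre, post, coef, constant=0, n3=0):
    """Contribution of each term to the gain from pre to post (thousandths)."""
    before = decrements(pre, coef, constant, n3)
    after = decrements(post, coef, constant, n3)
    return {term: before[term] - after[term] for term in before}
```

For the transition from “11131” to “11113”, the 2014 breakdown is +0.236 (PD) and -0.205 (AD), a net gain of 0.031, whereas the 2018 breakdown is +0.041 (PD) and -0.177 (AD), a net loss of 0.136; the constant and N3 terms cancel because both states contain a level 3 problem.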

With regard to all the consistent health transitions, the two index scores showed similar health gains and responsiveness overall, but the results varied considerably across the four subgroups. The health gains and responsiveness of the 3L₂₀₁₄ index score were greater than those of the 3L₂₀₁₈ index score in the “major improvement” and “minor improvement” subgroups. On the other hand, in the subgroups of “mixed response with minor deterioration” and “mixed response with major deterioration”, the two index scores generated similar or even reversed results. For subgroups 1 and 2, the 3L₂₀₁₄ algorithm overall has larger parameter values, indicating that the health gain from a transition from extreme or moderate problems to no problems is much greater according to it. Similarly, the difference between L2 and L3 parameters is also generally larger for the 3L₂₀₁₄ algorithm, leading to comparable conclusions for transitions from extreme problems to moderate problems. For subgroups 3 and 4, the 3L₂₀₁₄ algorithm has relatively similar parameter values across the five L2 parameters and across the five L3 parameters. Hence, for a health transition involving both improvement and deterioration, the health gain from the improvement in one dimension may be offset to a large extent by the deterioration in another dimension according to the 3L₂₀₁₄ algorithm. The health gains and responsiveness based on it were therefore no greater than those based on the 3L₂₀₁₈ algorithm in these subgroups.

It should be borne in mind that, in reality, the 243 health states and 29,403 transitions do not occur with equal frequency. For example, the state “11111” has been the most frequently observed in a number of studies in China, which may lead to different conclusions [24]. In addition, the absolute utility score could influence the QALY calculation to some extent. Hence, more empirical studies are warranted to further assess the impact in various settings in China.

Our results suggest a substantial difference between the 3L₂₀₁₄ and 3L₂₀₁₈ index scores at the absolute level, while their differences at the relative level depend on the type of health change. Our findings suggest that the choice of value set used to generate the 3L index score is very likely to influence QALY estimates in China.

ICC, intraclass correlation coefficient; HRQoL, health-related quality of life; QALYs, quality-adjusted life years; MO, mobility; SC, self-care; UA, usual activities; PD, pain/discomfort; AD, anxiety/depression; MID, minimally important difference; SD, standard deviation; TTO, time trade-off.

Ethics approval and consent to participate

Not applicable.

Consent for publication

No individual’s personal data is included.

Availability of data and materials

Please contact author for data requests.

Competing Interests

The authors declare that they have no competing interests in this work.

Funding

This study was funded by the Project of the Key Discipline Construction, Shanghai 3-Year Public Health Action Plan under grant no. GWV-10.1-XK18.

Authors' contributions

P.W. had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. RY.Z., P.W. and W.W. wrote the main manuscript text; RY.Z. analyzed the data. All authors read and approved the final manuscript.

Acknowledgements

Not applicable.

Wang H, Kindig DA, Mullahy J. Variation in Chinese population health related quality of life: results from a EuroQol study in Beijing, China. Qual Life Res. 2005;14(1):119-132.
Fang H, Farooq U, Wang D, Yu F, Younus MI, Guo X. Reliability and validity of the EQ-5D-3L for Kashin-Beck disease in China. Springerplus. 2016;5(1):1924.
Wang HM, Patrick DL, Edwards TC, Skalicky AM, Zeng HY, Gu WW. Validation of the EQ-5D in a general population sample in urban China. Qual Life Res. 2012;21(1):155-160.
Payakachat N, Ali MM, Tilford JM. Can The EQ-5D Detect Meaningful Change? A Systematic Review. Pharmacoeconomics. 2015;33(11):1137-1154.
Brooks R. EuroQol: the current state of play. Health Policy. 1996;37(1):53-72.
Johnson JA, Luo N, Shaw JW, Kind P, Coons SJ. Valuations of EQ-5D health states: are the United States and United Kingdom different? Med Care. 2005;43(3):221-228.
Badia X, Roset M, Herdman M, Kind P. A comparison of United Kingdom and Spanish general population time trade-off values for EQ-5D health states. Med Decis Making. 2001;21(1):7-16.
EuroQol: EQ-5D-3L Valuation. https://euroqol.org/eq-5d-instruments/eq-5d-3l-about/valuation/. Accessed 11 January 2022.
Lee YK, Nam HS, Chuang LH, Kim KY, Yang HK, Kwon IS, et al. South Korean time trade-off values for EQ-5D health states: modeling with observed values for 101 health states. Value Health. 2009;12(8):1187-1193.
Jo MW, Yun SC, Lee SI. Estimating quality weights for EQ-5D health states with the time trade-off method in South Korea. Value Health. 2008;11(7):1186-1189.
Shaw JW, Johnson JA, Coons SJ. US valuation of the EQ-5D health states: development and testing of the D1 valuation model. Med Care. 2005;43(3):203-220.
Shaw JW, Pickard AS, Yu S, Chen S, Iannacchione VG, Johnson JA, et al. A median model for predicting United States population-based EQ-5D health state preferences. Value Health. 2010;13(2):278-288.
Liu GG, Wu H, Li M, Gao C, Luo N. Chinese time trade-off values for EQ-5D health states. Value Health. 2014;17(5):597-604.
Zhuo L, Xu L, Ye J, Sun S, Zhang Y, Burstrom K, et al. Time Trade-Off Value Set for EQ-5D-3L Based on a Nationally Representative Chinese Population Survey. Value Health. 2018;21(11):1330-1337.
Mulhern B, Feng Y, Shah K, Janssen MF, Herdman M, van Hout B, et al. Comparing the UK EQ-5D-3L and English EQ-5D-5L Value Sets. Pharmacoeconomics. 2018;36(6):699-713.
Janssen MF, Bonsel GJ, Luo N. Is EQ-5D-5L Better Than EQ-5D-3L? A Head-to-Head Comparison of Descriptive Systems and Value Sets from Seven Countries. Pharmacoeconomics. 2018;36(6):675-697.
Ferreira LN, Ferreira PL, Ribeiro FP, Pereira LN. Comparing the performance of the EQ-5D-3L and the EQ-5D-5L in young Portuguese adults. Health Qual Life Outcomes. 2016;14:89.
Jin X, Al SF, Ohinmaa A, Marshall DA, Smith C, Johnson JA. The EQ-5D-5L Is Superior to the -3L Version in Measuring Health-related Quality of Life in Patients Awaiting THA or TKA. Clin Orthop Relat Res. 2019;477(7):1632-1644.
NICE: Position statement on use of the EQ-5D-5L valuation set for England (updated October 2019). https://www.nice.org.uk/Media/Default/About/what-we-do/NICE-guidance/NICE-technology-appraisal-guidance/eq5d5l_nice_position_statement.pdf. Accessed 11 January 2022.
Luo N, Liu G, Li M, Guan H, Jin X, Rand-Hendriksen K. Estimating an EQ-5D-5L Value Set for China. Value Health. 2017;20(4):662-669.
van Hout B, Janssen MF, Feng YS, Kohlmann T, Busschbach J, Golicki D, et al. Interim scoring for the EQ-5D-5L: mapping the EQ-5D-5L to EQ-5D-3L value sets. Value Health. 2012;15(5):708-715.
Web of Science. https://www.webofscience.com/wos/alldb/summary/50f2dbbf-8766-453c-8aae-b82a0289b4b6-1e0ea3da/relevance/1. Accessed 11 January 2022.
Web of Science. https://www.webofscience.com/wos/alldb/summary/88935a16-068a-48fe-83dd-6875792ca22f-1e16f622/relevance/1. Accessed 11 January 2022.
Pan CW, Zhang RY, Luo N, He JY, Liu RJ, Ying XH, et al. How the EQ-5D utilities are derived matters in Chinese diabetes patients: a comparison based on different EQ-5D scoring functions for China. Qual Life Res. 2020 Jun 12. Epub ahead of print.
Xia R, Zeng H, Liu Q, Liu S, Zhang Z, Liu Y, et al. Health-related quality of life and health utility score of patients with gastric cancer: A multi-center cross-sectional survey in China. Eur J Cancer Care (Engl). 2020;29(6): e13283.
Dolan P. Modeling valuations for EuroQol health states. Med Care. 1997;35(11):1095-1108.
Kind P. A Revised Protocol for the Valuation of Health States Defined by the EQ-5D-3L Classification System: Learning the Lessons from the MVH Study. York: Centre for Health Economics, University of York, 2009.
Machin D, Fayers PM. Quality of life: the assessment, analysis, and reporting of patient-reported outcomes. 3rd ed. Chichester, UK: John Wiley; 2016.
Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;1(8476):307-310.
Walters SJ, Brazier JE. Comparison of the minimally important difference for two health state utility measures: EQ-5D and SF-6D. Qual Life Res. 2005;14(6):1523-1532.
Nan L, Johnson JA, Shaw JW, Coons SJ. A comparison of EQ-5D index scores derived from the US and UK population-based scoring functions. Med Decis Making. 2007;27(3):321-326.
Kiadaliri AA. A Comparison of Iran and UK EQ-5D-3L Value Sets Based on Visual Analogue Scale. Int J Health Policy Manag. 2017;6(5):267-272.
Cohen J. Statistical power analysis for the behavioral sciences. 2nd ed. Hillsdale, NJ: Lawrence Erlbaum; 1988.

Table 1 Characteristics of the two EQ-5D-3L value sets for China

	3L₂₀₁₄	3L₂₀₁₈
Sample size used	1,222 respondents	6,000 respondents
Sampling area	Beijing, Shenyang, Nanjing, Chengdu, and Guangzhou (urban area)	Jiangsu, Guangdong, Hebei, Chongqing, and Shaanxi (one rural and one urban area)
Sampling method	quota sampling	A multistage, stratified, clustered random sampling
Number of health states directly valuated	97	43
Valuation protocol used	Paris protocol	MVH protocol
The range of index scores	[-0.149,1]	[0.170,1]
Scoring algorithm	1 - (0.039 + 0.099×MO2 + 0.246×MO3 + 0.105×SC2 + 0.208×SC3 + 0.074×UA2 + 0.193×UA3 + 0.092×PD2 + 0.236×PD3 + 0.086×AD2 + 0.205×AD3 + 0.022×N3)	1 - (0.077×MO2 + 0.267×MO3 + 0.044×SC2 + 0.291×SC3 + 0.037×UA2 + 0.054×UA3 + 0.027×PD2 + 0.041×PD3 + 0.036×AD2 + 0.177×AD3)

Paris protocol: a successor of the MVH protocol for valuation of EQ-5D-3L health states

MVH: The Measurement and Valuation of Health protocol

TTO: time trade-off

MO: mobility; SC: self-care; UA: usual activities; PD: pain/discomfort; AD: anxiety/depression; N3: indicator that any level 3 problem is present in the state

2: level 2 problems; 3: level 3 problems

For instance, the utility score for “22213” was 1-0.039-0.099-0.105-0.074-0.205-0.022=0.456 (3L₂₀₁₄ value set)

Table 2 Comparison of the two EQ-5D-3L index scores at absolute and relative levels

	n	Mean	SD	Minimum	Maximum
EQ-5D-3L index score
3L₂₀₁₄	243	0.427	0.206	-0.149	1
3L₂₀₁₈	243	0.649	0.189	0.170	1
3L₂₀₁₄-3L₂₀₁₈	243	-0.222	0.121	-0.529	0.043
health gains from transitions
3L₂₀₁₄	29,403	0.234	0.173	0	1.149
3L₂₀₁₈	29,403	0.216	0.158	0	0.830
3L₂₀₁₄-3L₂₀₁₈	23,720^*	0.007	0.152	-0.521	0.529

Table 3 Responsiveness of the two EQ-5D index scores in simulated transitions between EQ-5D-3L health states

	All Consistent Transitions (n=23,720)		Major Improvement (n=6,752)		Minor Improvement (n=781)		Mixed Response with Minor Deterioration (n=4,515)		Mixed Response with Major Deterioration (n=11,672)
	3L₂₀₁₄	3L₂₀₁₈	3L₂₀₁₄	3L₂₀₁₈	3L₂₀₁₄	3L₂₀₁₈	3L₂₀₁₄	3L₂₀₁₈	3L₂₀₁₄	3L₂₀₁₈
Mean (SD) Pre-treatment score	0.329 (0.193)	0.552 (0.184)	0.213 (0.173)	0.486 (0.180)	0.450 (0.172)	0.692 (0.159)	0.374 (0.166)	0.570 (0.173)	0.370 (0.186)	0.574 (0.179)
Mean (SD) Post-treatment score	0.531 (0.188)	0.748 (0.158)	0.624 (0.182)	0.796 (0.151)	0.601 (0.193)	0.765 (0.170)	0.620 (0.150)	0.815 (0.123)	0.439 (0.156)	0.692 (0.153)
Mean (SD) Health gains	0.203 (0.244)	0.195 (0.215)	0.411 (0.169)	0.310 (0.168)	0.151 (0.073)	0.072 (0.039)	0.246 (0.156)	0.246 (0.152)	0.069 (0.224)	0.118 (0.230)
Cohen Effect size	1.05	1.06	2.38	1.72	0.88	0.45	1.48	1.42	0.37	0.66

No competing interests reported.

Editorial decision: Major revision
23 Feb, 2022
Reviews received at journal
04 Feb, 2022
Reviewers agreed at journal
21 Jan, 2022
Reviews received at journal
20 Jan, 2022
Reviewers agreed at journal
19 Jan, 2022
Reviewers invited by journal
18 Jan, 2022
Editor assigned by journal
18 Jan, 2022
Submission checks completed at journal
13 Jan, 2022
First submitted to journal
11 Jan, 2022
