Study design and data collection
The Cancer Screening Program in Urban China, a major public health service project supported by the central government of China since August 2012, was designed to provide screening for lung, breast, colorectal, liver, stomach, and esophageal cancers. Alongside the program, a multicenter cross-sectional study was conducted in 12 provinces between September 2013 and December 2014, with appropriate screening interventions targeted at specific types of cancer. The study protocol was approved by the Institutional Review Board of the Cancer Hospital of the Chinese Academy of Medical Sciences (Approval No. 15-071/998). All participants gave written informed consent. The EQ-5D-3L is referred to as EQ-5D in the rest of the article.
This study included 243 subjects who met the following inclusion criteria: aged 40-74 years; able to provide written informed consent; diagnosed with lung, breast, stomach, esophageal, colorectal, or liver cancer; and completed both the EQ-5D and the FACT-G scales and subscales. Exclusion criteria were: refusal to sign the consent form; absence of a cancer diagnosis; missing or duplicate responses on the questionnaire; and inability to understand the questions or record an evaluation. The questionnaire survey was conducted through face-to-face interviews between the investigator and the followed-up subjects.
Patients completed a health and demographic questionnaire covering age, sex, marital status, level of education, family size, employment, family financial pressure, and significant life events. The questionnaires were completed face-to-face with community doctors, who had been trained by research assistants. The scales completed by patients were the EQ-5D and the FACT-G.
The EQ-5D scale is a generic preference-based instrument that provides a simple and universal health measurement method for clinical and economic evaluation. It consists of a two-part questionnaire. The first part, the EQ-5D descriptive system, consists of five dimensions (i.e., mobility, self-care, usual activities, pain/discomfort, and anxiety/depression), each with three levels indicating no problems, some or moderate problems, and extreme problems. The EQ-5D index scores were calculated using an algorithm based on societal preferences from a general-population valuation study. We derived utilities for the EQ-5D health states using the time trade-off (TTO) algorithm developed by Gordon G. Liu et al., which is based on a TTO survey of 1147 Chinese respondents. The EQ-5D utility index ranges from -0.149 to 1, where 1 indicates full health, 0 indicates a state equivalent to death, and a negative value indicates a health state considered worse than death. No negative values were observed in this study. In the second part, respondents rate their current state of health on a 20-centimeter visual analog scale (VAS) that ranges from 0 to 100, where 0 is the worst imaginable health state and 100 is the best imaginable health state.
Participants also completed the FACT-QoL questionnaire in the version specific to their tumor type. The FACT-G produces four subscale scores that reflect the patient's QoL: physical well-being (PWB) (7 items), social/family well-being (SFWB) (7 items), emotional well-being (EWB) (6 items), and functional well-being (FWB) (7 items). The FACT-G is scored by summing the subscale scores, with higher scores indicating better QoL. The cancer-specific instruments differ by tumor type: each comprises the 27-item FACT-G plus an additional-concerns subscale for the relevant cancer (a breast cancer, lung cancer, esophageal cancer, colorectal cancer, gastric cancer, or hepatocellular carcinoma subscale). All items were rated on a 5-point Likert scale, with higher scores indicating better HRQoL.
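The summation scoring described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name and input layout are hypothetical, and it assumes that each item has already been coded 0 to 4 with higher values meaning better QoL (that is, any negatively worded items were reverse-scored beforehand).

```python
# Item counts per FACT-G subscale, as listed in the text (27 items in total).
SUBSCALE_ITEMS = {"PWB": 7, "SFWB": 7, "EWB": 6, "FWB": 7}


def score_fact_g(responses):
    """Return subscale scores and the FACT-G total for one respondent.

    `responses` maps each subscale name to a list of item scores.
    Assumption: items are coded 0-4 and already oriented so that a
    higher value means better quality of life.
    """
    scores = {}
    for name, n_items in SUBSCALE_ITEMS.items():
        items = responses[name]
        if len(items) != n_items:
            raise ValueError(f"{name} expects {n_items} items, got {len(items)}")
        if any(not 0 <= x <= 4 for x in items):
            raise ValueError(f"{name} item scores must lie between 0 and 4")
        scores[name] = sum(items)
    # The overall FACT-G score is the sum of the four subscale scores.
    scores["FACT-G"] = sum(scores[name] for name in SUBSCALE_ITEMS)
    return scores
```

The subscale scores returned here are the domain-level predictors, and the "FACT-G" entry is the overall score, that a mapping regression would take as inputs.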
Five regression methods were used to map the FACT-G to the EQ-5D: the ordinary least squares (OLS) model, the generalized linear model (GLM), the Tobit model, the censored least absolute deviations (CLAD) model, and the two-part model (TPM). Previous studies have suggested that the OLS model may not be appropriate when preference-based scores are highly skewed. A ceiling effect may also invalidate the normality assumption of OLS. The GLM with a Gamma family and identity link relaxes the normality assumption of OLS and accommodates a skewed distribution of utility values. The Tobit model is an alternative that accounts for the ceiling effect and thus keeps predictions within a credible range; however, it is sensitive to departures from normality and to heteroscedasticity. The CLAD model minimizes the sum of absolute differences between observed and predicted values; because the median is more resistant than the mean to ceiling effects, it is also a possible solution to the heteroscedasticity problem [22, 32]. The TPM is specifically designed to deal with limited dependent variables and divides the data into two parts: respondents in perfect health and those who are not. In the first part, a logistic regression predicts the probability that EQ-5D utility is at the ceiling; in the second part, a truncated OLS predicts the EQ-5D index for individuals whose utility is below the ceiling; the two parts are then combined to obtain the overall utility value [19, 22].
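To make the combination step of the TPM concrete, the following Python sketch computes an overall predicted utility as the probability of being at the ceiling times the ceiling value, plus the complementary probability times the second-part prediction. The function names and the coefficient tuples are hypothetical placeholders for illustration, not estimates from this study, and a single FACT-G total score is assumed as the only predictor.

```python
import math


def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def tpm_predict(fact_g, part1, part2, ceiling=1.0):
    """Two-part prediction of EQ-5D utility from a FACT-G total score.

    part1 = (intercept, slope) of the logistic model for
            P(utility == ceiling).
    part2 = (intercept, slope) of the linear model fitted to
            respondents whose utility is below the ceiling.
    All coefficients are hypothetical placeholders, not estimates.
    """
    p_ceiling = logistic(part1[0] + part1[1] * fact_g)
    below = part2[0] + part2[1] * fact_g
    below = min(below, ceiling)  # keep the second-part prediction in range
    # Expected utility: weight each part by its probability.
    return p_ceiling * ceiling + (1.0 - p_ceiling) * below
```

For example, with a 50% probability of perfect health and a second-part prediction of 0.5, the combined prediction is 0.5 × 1 + 0.5 × 0.5 = 0.75.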
Five model specifications were used to develop the mapping functions, and each specification was estimated with the OLS, GLM, Tobit, CLAD, and TPM methods. As suggested in the literature, we added squared terms and interaction terms to improve model accuracy. Model 1 regresses the EQ-5D utility index on the FACT-G overall score; Model 2 uses all FACT-G domain scores; Model 3 includes only the statistically significant domains; Model 4 adds to Model 3 the squared terms of the statistically significant domains from Model 2; and Model 5 adds to Model 4 the interaction terms of the statistically significant domains from Model 2.
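The five specifications can be illustrated by building the predictor set for one respondent. The Python sketch below is hypothetical: the function name and term labels are invented for illustration, the list of significant domains must be supplied by the caller (it depends on the fitted Model 2), and reading Model 5 as adding pairwise interaction terms between significant domains is an assumption based on the mention of interaction terms above.

```python
from itertools import combinations

DOMAINS = ("PWB", "SFWB", "EWB", "FWB")


def build_predictors(scores, significant, model):
    """Return a dict of predictors for one respondent under Model 1-5.

    `scores` maps each FACT-G domain to its score; `significant` lists
    the domains found statistically significant in Model 2.
    """
    if model not in (1, 2, 3, 4, 5):
        raise ValueError("model must be an integer from 1 to 5")
    if model == 1:  # overall FACT-G score only
        return {"FACT-G": sum(scores[d] for d in DOMAINS)}
    if model == 2:  # all four domain scores
        return {d: scores[d] for d in DOMAINS}
    feats = {d: scores[d] for d in significant}  # Model 3
    if model >= 4:  # Model 4: add squared terms
        feats.update({f"{d}^2": scores[d] ** 2 for d in significant})
    if model == 5:  # Model 5: add pairwise interactions (assumed reading)
        feats.update({f"{a}*{b}": scores[a] * scores[b]
                      for a, b in combinations(significant, 2)})
    return feats
```

Each returned dict corresponds to one row of the design matrix that would be passed to the OLS, GLM, Tobit, CLAD, or TPM estimator.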
We assessed the goodness of fit of each model to determine how well responses to the FACT-G predicted EQ-5D utility. Goodness of fit was measured using the mean absolute error (MAE), mean squared error (MSE), root mean squared error (RMSE), and mean absolute percentage error (MAPE), which summarize the differences between observed and predicted EQ-5D utility, together with the Akaike information criterion (AIC) and the Bayesian information criterion (BIC); for all of these measures, lower values indicate better model performance. For OLS, the coefficient of determination (R2) and the adjusted R2 also indicate how well the model explains the observed values, but these statistics are not available for the other regression models. Instead, we computed the square of the correlation coefficient (r) between the observed and predicted values of each model, with r2 being equivalent to R2 in OLS. To penalize the complexity of the model, we defined the adjusted r2 as follows: adjusted r2 = 1 - (1 - r2)(n - 1)/(n - p - 1), where n represents the sample size and p is the number of parameters in the model. Predictive ability was evaluated with a paired t-test comparing the observed and mapped EQ-5D utility scores. Differences in the EQ-5D utility scores predicted by the different models across demographic and clinical features were examined by non-parametric analysis. The best-performing models were those with the lowest MAE and RMSE and the highest r2 and adjusted r2. Observed and predicted EQ-5D utility values were also compared across patients with different demographic and clinical characteristics for each model.
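The error-based measures and the correlation-based r2 can be computed directly from paired observed and predicted utilities. The Python sketch below is illustrative only: the function name is hypothetical, the adjusted r2 uses the common form 1 - (1 - r2)(n - 1)/(n - p - 1), which is an assumption about the penalty intended here, and MAPE requires every observed utility to be non-zero. AIC and BIC are omitted because they depend on each model's likelihood.

```python
import math


def fit_metrics(observed, predicted, n_params):
    """Return MAE, MSE, RMSE, MAPE, r2, and adjusted r2 for one model.

    `n_params` is p in the adjusted r2 formula (assumed form, see text).
    Requires n > n_params + 1 and non-zero observed values for MAPE.
    """
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mape = 100.0 * sum(abs(e / o) for e, o in zip(errors, observed)) / n

    # Squared Pearson correlation between observed and predicted values.
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r2 = cov * cov / (var_o * var_p)

    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_params - 1)
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape,
            "r2": r2, "adj_r2": adj_r2}
```

Candidate models could then be ranked by sorting these dicts on MAE or RMSE (ascending) and on r2 or adjusted r2 (descending).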
All statistical analyses were performed in Stata version 14.1. All hypothesis tests were two-tailed, and a p-value < 0.05 was considered statistically significant.