The study was conducted between August 2016 and November 2017 at a tertiary psychiatric hospital that serves the majority of psychiatric patients in Singapore. Patients were included if they were Singapore citizens or permanent residents, aged 21 years and above, literate in English, and had a clinical diagnosis of depressive disorder. A total of 249 participants who completed the PHQ-8, EQ-5D, SF-6D and HUI3 questionnaires were included in the analyses.
The study was approved by the relevant institutional ethics review board, the National Healthcare Group Domain Specific Review Board (DSRB) (Reference no: 2016/00215). Written informed consent was obtained from all study participants.
The eight-item Patient Health Questionnaire depression scale (PHQ-8) is a self-report questionnaire designed to measure depressive symptom severity in research and clinical care. It assesses how often participants experienced each of eight depressive symptoms in the past two weeks. Each symptom is rated on a 4-point Likert scale ranging from 0 (not at all) to 3 (nearly every day), giving total scores that range from 0 to 24. The PHQ-8 has been widely used to measure the severity of depressive symptoms in psychiatric patients in Singapore [8, 9].
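The PHQ-8 total score described above can be sketched as a simple sum of the eight item responses. This is a minimal illustration only: the function name `phq8_total` and the example responses are hypothetical and are not taken from the study.

```python
from typing import Sequence


def phq8_total(item_scores: Sequence[int]) -> int:
    """Sum eight PHQ-8 item responses, each coded 0 (not at all) to 3 (nearly every day)."""
    if len(item_scores) != 8:
        raise ValueError("PHQ-8 requires exactly eight item responses")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each item must be coded 0, 1, 2, or 3")
    return sum(item_scores)


# Hypothetical responses for one participant (illustration only).
print(phq8_total([1, 2, 0, 3, 1, 1, 2, 0]))  # 10
```

The bounds of the function match the range stated in the text: all items at 0 give a total of 0, and all items at 3 give a total of 24.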
The EQ-5D is a generic preference-based measure for describing and valuing health-related quality of life, developed by the EuroQol Group. It has two versions: the EQ-5D-3L and the EQ-5D-5L. The EQ-5D-3L includes five questions on mobility, self-care, pain, usual activities, and psychological status, with three possible answers for each item (1=no problem, 2=moderate problem, 3=severe problem). The utility scores of the EQ-5D-3L were calculated using the scoring algorithm developed in Singapore (Luo et al., 2014). The EQ-5D-5L is a newer version of the EQ-5D comprising five questions on the same dimensions, with five possible responses for each item (1=no problem, 2=slight problems, 3=moderate problems, 4=severe problems, 5=extreme problems). The utility scores of the EQ-5D-5L were obtained using the crosswalk developed by van Hout et al., which maps EQ-5D-5L utility scores from the EQ-5D-3L.
The HUI3 is a generic comprehensive health status classification instrument (Feeny et al., 1995). It generates utility scores using a utility scoring function derived from a representative sample of the general Canadian population based on the standard gamble and visual analogue scale methods (Horsman et al., 2003). The utility score ranges from -0.36 to 1. The HUI3 comprises eight attributes: vision, hearing, speech, ambulation, dexterity, emotion, cognition, and pain, with 5 to 6 levels per attribute, derived from 15 multiple-choice questions. The utility scores obtained from the Chinese and Malay versions of the HUI3 have been demonstrated to be equivalent and valid in Singapore (Luo et al., 2007).
The SF-36 is a generic instrument that can be used to generate SF-6D utility scores using a utility scoring function derived from a representative sample of the general UK population (Brazier et al., 2002). The utility score ranges from 0.29 to 1. The SF-6D has six domains: physical functioning, role limitation, social functioning, pain, mental health, and vitality, with 4–6 levels for each domain. The utility scores derived from the Chinese and English versions of the SF-6D have been demonstrated to be equivalent and valid in Singapore (Wee et al., 2004).
Statistical analyses were carried out using Stata version 13 (StataCorp LP, College Station, TX). Because utility scores derived from generic preference-based measures such as the EQ-5D are often not normally distributed and show a pronounced ceiling effect at a value of 1, we used a beta regression mixture model (betamix) to map the utility scores. The results were compared against two common regression methods, ordinary least squares (OLS) and Tobit. The beta regression mixture model is a two-part model that combines a multinomial logit model with a beta mixture model. Studies have increasingly suggested that this regression method outperforms linear regression [14-16]. To identify the best-performing prediction model, three model specifications were fitted with each regression method. The first model included only the PHQ-8 total score as the predictor of the utility score; the second included the PHQ-8 total score, age and gender; and the third included the PHQ-8 total score, its square, age and gender. The performance of the regression methods was assessed using mean absolute error (MAE) and root mean square error (RMSE) as the main criteria. Values of both indices were ranked, and the ranks were summed to obtain an average ranking. The regression model with the lowest average ranking was considered the best prediction model [6, 16, 17].
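The comparison procedure can be sketched in Python as follows. This is an illustration under stated assumptions, not the study's code: the analyses were run in Stata, the data below are simulated, and only OLS is fitted (the betamix and Tobit estimators are not reproduced). The helper names `mae`, `rmse`, `rank_values` and `select_best` are hypothetical, and errors are computed in-sample for simplicity.

```python
import numpy as np


def mae(observed, predicted):
    """Mean absolute error between observed and predicted utilities."""
    diff = np.asarray(observed, dtype=float) - np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(diff)))


def rmse(observed, predicted):
    """Root mean square error between observed and predicted utilities."""
    diff = np.asarray(observed, dtype=float) - np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def rank_values(values):
    """Rank ascending (1 = lowest error); tied values share the average rank."""
    values = np.asarray(values, dtype=float)
    return np.array([
        1 + np.sum(values < v) + 0.5 * (np.sum(values == v) - 1) for v in values
    ])


def select_best(errors):
    """errors maps model name -> (MAE, RMSE); returns (best name, average ranks)."""
    names = list(errors)
    mae_ranks = rank_values([errors[n][0] for n in names])
    rmse_ranks = rank_values([errors[n][1] for n in names])
    avg = {n: float((m + r) / 2) for n, m, r in zip(names, mae_ranks, rmse_ranks)}
    return min(avg, key=avg.get), avg


# Simulated data (assumption): 249 participants with a ceiling at utility = 1.
rng = np.random.default_rng(0)
n = 249
phq = rng.integers(0, 25, size=n).astype(float)    # PHQ-8 total, 0 to 24
age = rng.integers(21, 70, size=n).astype(float)
female = rng.integers(0, 2, size=n).astype(float)  # hypothetical gender coding
utility = np.clip(0.95 - 0.025 * phq + rng.normal(0, 0.1, size=n), None, 1.0)

# The three model specifications described in the text.
ones = np.ones(n)
designs = {
    "model 1": np.column_stack([ones, phq]),
    "model 2": np.column_stack([ones, phq, age, female]),
    "model 3": np.column_stack([ones, phq, phq ** 2, age, female]),
}

errors = {}
for name, X in designs.items():
    coef, *_ = np.linalg.lstsq(X, utility, rcond=None)  # OLS fit
    pred = X @ coef
    errors[name] = (mae(utility, pred), rmse(utility, pred))

best, avg_rank = select_best(errors)
print(best, avg_rank)
```

Averaging the two ranks gives the same ordering as summing them, so either form identifies the same best model. With nested OLS models evaluated in-sample, the largest specification can never have a higher RMSE than the smaller ones, which is one reason out-of-sample or cross-validated errors are often preferred when models differ in size.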