The analysis in this paper was based on data from the PoMeT trial (15), which investigated the impact of Positive Memory Training on depressive symptoms in patients with schizophrenia (n=100) in the UK between 2014 and 2016. The trial received ethical approval from the Berkshire Research Ethics Committee (REC ref 13/SC/0634). Patients were eligible for inclusion if they were between 18 and 65 years of age, had a DSM-V diagnosis of schizophrenia or schizoaffective disorder, and had at least a mild level of depression, defined as a score of 14 or more on the Beck Depression Inventory-II (BDI-II) (27). Participants were identified by trial research assistants working in collaboration with care coordinators based within community mental health teams. Randomisation was stratified by site and severity of depression (above and below a BDI-II score of 29, i.e. a severe level of depression) using randomly permuted blocks (15). Patients were assessed at four time points over the 9-month study period: baseline, 3 months, 6 months and 9 months. More details about the PoMeT trial can be found in Steel et al. (15).
The OxCAP-MH is a self-reported, 16-item instrument developed in the context of mental health outcome measurement. Items are rated on a 1–5 Likert scale, and each question contributes equally to the overall score. The 16 items cover a broad range of individual wellbeing aspects, including: Overall health, Enjoying social and recreational activities, Losing sleep over worry, Friendship and support, Having suitable accommodation, Feeling safe, Likelihood of discrimination and assault, Freedom of personal and artistic expression, Appreciation of nature, Self-determination and Access to interesting activities or employment (9). The initial OxCAP-MH score (16–80 scale) is converted to a 0–100 scale, referring to minimum and maximum capabilities, using the formula: 100 × (OxCAP-MH total score – minimum possible score)/possible range (17). Higher scores indicate better capabilities; items 2, 4, 5, 6, 9, 10, 11, 12, 13, 14, 15 and 16 are reverse coded. The OxCAP-MH has shown validity (5, 17), responsiveness (5, 17) and feasibility (9) in several settings and mental health disease areas, including schizophrenia and depression (14, 15, 24). It is currently available in English, German (28), Hungarian (29) and Luganda, with further translations ongoing. The OxCAP-MH does not yet have a preference-based value set and has so far been used in economic evaluations as a score; however, research is ongoing to develop a weighting system for its domains.
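The scoring rule described above can be sketched as follows. This is a minimal illustration, not the instrument's official scoring code: the function name `oxcap_mh_index` is hypothetical, and it assumes that raw responses are supplied before reverse coding and that reverse coding on a 1–5 scale maps a response x to 6 − x.

```python
# Items that the text lists as reverse coded (1-based item numbers).
REVERSE_CODED = {2, 4, 5, 6, 9, 10, 11, 12, 13, 14, 15, 16}
N_ITEMS = 16
MIN_TOTAL = N_ITEMS * 1  # lowest possible total score (16)
MAX_TOTAL = N_ITEMS * 5  # highest possible total score (80)


def oxcap_mh_index(responses):
    """Return the 0-100 OxCAP-MH score from 16 raw responses (each 1-5).

    Assumption: `responses` holds raw answers, i.e. before reverse coding.
    """
    if len(responses) != N_ITEMS:
        raise ValueError("expected 16 item responses")
    total = 0
    for item_number, value in enumerate(responses, start=1):
        if value not in (1, 2, 3, 4, 5):
            raise ValueError(f"item {item_number}: response must be 1-5")
        # Reverse coding on a 1-5 scale maps 1<->5 and 2<->4; 3 is unchanged.
        total += 6 - value if item_number in REVERSE_CODED else value
    # 100 x (total score - minimum possible score) / possible range
    return 100 * (total - MIN_TOTAL) / (MAX_TOTAL - MIN_TOTAL)
```

A respondent who answers 3 on every item has a total of 48, which the formula converts to 100 × (48 − 16)/64 = 50.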
The ICECAP-A is a brief self-reported measure for the general adult population with five items, each of which can take one of four levels ranging from full capability to no capability. The domains include Stability (being able to feel settled and secure), Attachment (being able to have love, friendship and support), Autonomy (being able to be independent), Achievement (being able to achieve and progress), and Enjoyment (being able to have enjoyment and pleasure) (18). The ICECAP-A has shown validity (22, 23, 25, 30, 31), reliability (32, 33), responsiveness (34) and feasibility (20) in different populations, including people with depression (35). Besides the original English version, it is also available in German (32), Chinese (36), Welsh, Dutch, Danish, Persian and Italian (37). The ICECAP-A has a preference-based value set derived from the UK general population (30) and is increasingly used in economic evaluations (38). The preference-based scores of the ICECAP-A range between 0 and 1.
The EQ-5D is one of the most commonly used self-reported generic health status measures, and its validity and reliability have been reported in various health conditions and populations (39). The EQ-5D comprises five dimensions: mobility, self-care, usual activities, pain/discomfort and anxiety/depression. Besides the original 3-level version (40), a more sensitive 5-level version has existed since 2009 (41), and both versions have value sets developed in several countries (42). This study used the UK value set developed for the EQ-5D-5L. As part of the EQ-5D-5L, respondents’ self-rated health is also recorded on a vertical visual analogue scale (EQ VAS), where scores range from 0 to 100, referring to the worst and best imaginable health states, respectively. The EQ VAS can capture important information that complements the health state information patients provide when they self-report their health on the EQ-5D descriptive system (43).
The analysis also included comparative information from four condition-specific instruments used in the PoMeT trial. These instruments capture important aspects of the condition from the patient’s perspective and thereby enabled an assessment of how well the OxCAP-MH, ICECAP-A and EQ-5D-5L reflect clinically relevant mental health outcomes. The Beck Depression Inventory (BDI), Generalised Anxiety Disorder scale (GAD), Rosenberg Self-Esteem Scale (RSES) and the Warwick-Edinburgh Mental Well-being Scale (WEMWBS) are all mental health-specific, self-reported measures.
The BDI is a self-reported measure of depressive symptoms and their severity in adolescents and adults according to the Diagnostic and Statistical Manual of Mental Disorders (44). It has 21 items scored on a 4-point polytomous response scale ranging from 0 to 3 (27). Total scores range from 0 to 63, with higher scores representing more severe depression.
The GAD is a self-reported measure of anxiety symptoms over the last two weeks. It consists of seven items scored on a 0–3 scale, with higher scores indicating more severe symptoms (total score range 0 to 21) (45). Cut-off scores of 5, 10 and 15 reflect mild, moderate and severe anxiety symptoms, respectively (46).
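The cut-offs above translate into a simple banding rule, sketched here for illustration. The function name `gad_severity` and the label used for scores below 5 are assumptions; the text only names the mild, moderate and severe bands.

```python
def gad_severity(total_score):
    """Map a GAD total score (0-21) to the severity band given by the cut-offs."""
    if not 0 <= total_score <= 21:
        raise ValueError("GAD total score must lie between 0 and 21")
    if total_score >= 15:
        return "severe"
    if total_score >= 10:
        return "moderate"
    if total_score >= 5:
        return "mild"
    # Label chosen for this sketch; the text does not name this band.
    return "below mild cut-off"
```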
The RSES is a 10-item, self-reported instrument that measures global self-worth by assessing both positive and negative feelings about the self (47). Items are answered on a 4-point polytomous response scale ranging from strongly agree to strongly disagree. Items 2, 5, 6, 8 and 9 are reverse scored.
The self-reported WEMWBS instrument was developed in the UK to assess mental wellbeing, including affective-emotional aspects, cognitive-evaluative dimensions and psychological functioning. It is a 14-item scale with five response categories (‘none of the time’, ‘rarely’, ‘some of the time’, ‘often’, ‘all of the time’) and a total score ranging from 14 to 70. A higher score indicates a higher level of mental wellbeing (48).
The statistical analysis focused on exploring and comparing the measurement properties of the OxCAP-MH, ICECAP-A, EQ-5D-5L and the EQ VAS. Three analyses were carried out: correlations of baseline scores to test and compare construct validity across the scales, exploratory factor analysis (EFA), and an investigation of responsiveness to change. To better understand how these instruments perform, Appendix 1 reports the floor and ceiling effects observed with the four scales.
For all analyses, the level of significance was set at p < 0.05, unless stated otherwise. Group comparisons of mean baseline scores were conducted using the Wilcoxon rank-sum test (49, 50) for two-group comparisons and Kruskal-Wallis one-way ANOVA for multiple-group comparisons (51). Analysis was conducted on complete cases, excluding missing items at the relevant time point, unless stated otherwise. EFA was conducted with the freely available FACTOR software, and Stata version 16 was used for all other analyses.
Convergent validity indicates the degree to which two measures of constructs that theoretically should be related are in fact related (52, 53). The hypothesis that capability instruments and their items correlate more strongly with each other than with a HRQoL instrument was tested by exploring the correlations between baseline scores. Spearman’s rank correlations across the OxCAP-MH, ICECAP-A, EQ-5D-5L, EQ VAS and condition-specific measures were calculated at total-score level at baseline and assessed using Cohen’s effect size classification: < 0.3 is small, 0.3 to < 0.5 is moderate and ≥ 0.5 is large (54).
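To make the correlation step concrete, the sketch below computes Spearman’s rank correlation as the Pearson correlation of average ranks and labels it with the thresholds above. It is illustrative only and not the trial’s Stata code; the function names are hypothetical, and applying the thresholds to the absolute value of the coefficient is an assumption of this sketch.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


def cohen_label(rho):
    """Classify a correlation using the thresholds in the text.

    Assumption: the thresholds are applied to the absolute value of rho.
    """
    size = abs(rho)
    if size < 0.3:
        return "small"
    if size < 0.5:
        return "moderate"
    return "large"
```

For example, `cohen_label(spearman_rho(scores_a, scores_b))` would label the baseline correlation between two instruments, where `scores_a` and `scores_b` are hypothetical lists of complete-case total scores in the same patient order.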
Next, locally weighted scatterplot smoothing (LOWESS) fit lines were used to graphically indicate nonlinear trends in the scatter plots between OxCAP-MH, ICECAP-A, EQ-5D-5L and EQ VAS scores (55). LOWESS is a form of nonparametric regression that plots a line of central tendency between two variables on a scatterplot, thereby visualising the relationship across the possible score ranges. LOWESS captures general patterns in the relationship between two measures without making assumptions about the functional form of that relationship (56).
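The idea behind LOWESS can be illustrated with a simplified sketch: for each observed x, a straight line is fitted to the nearest points, weighted by a tricube function of their distance, and the fitted value at that x is kept. This version omits the robustness iterations of the full procedure, and the function name `lowess` and the default `frac=0.8` (the share of points used in each local fit) are illustrative assumptions rather than the settings used in the analysis.

```python
def lowess(x, y, frac=0.8):
    """Locally weighted linear fit evaluated at each observed x.

    Simplified sketch: tricube weights, no robustness iterations.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have the same length (at least 3)")
    k = max(3, min(n, round(frac * n)))  # neighbours considered in each local fit
    fitted = []
    for x0 in x:
        distances = [abs(xi - x0) for xi in x]
        h = sorted(distances)[k - 1]  # bandwidth: distance to the k-th nearest point
        if h == 0:
            # All k nearest points share the same x; weight only those points.
            weights = [1.0 if d == 0 else 0.0 for d in distances]
        else:
            # Tricube kernel: weight falls smoothly to 0 at distance h.
            weights = [(1 - (d / h) ** 3) ** 3 if d < h else 0.0 for d in distances]
        sw = sum(weights)
        mx = sum(w * xi for w, xi in zip(weights, x)) / sw
        my = sum(w * yi for w, yi in zip(weights, y)) / sw
        sxx = sum(w * (xi - mx) ** 2 for w, xi in zip(weights, x))
        sxy = sum(w * (xi - mx) * (yi - my) for w, xi, yi in zip(weights, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        # Weighted least-squares line evaluated at x0.
        fitted.append(my + slope * (x0 - mx))
    return fitted
```

Plotting the fitted values against x, after sorting by x, gives the smoothed line that would be overlaid on the scatter plot. A smaller `frac` follows local patterns more closely, while a larger one gives a smoother curve.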
Exploratory factor analysis
EFA is a method for uncovering the underlying structure of a set of variables and is therefore used to assess whether an instrument measures what it intends to measure (57). It was used to evaluate construct validity through the factors (or concepts) assessed by the instruments and their relevance to the underlying construct. EFA was conducted on the baseline scores of the OxCAP-MH, ICECAP-A and EQ-5D-5L to examine the overlap between the constructs of the two capability measures and the multidimensional measure of HRQoL, and to study how far they share the same set of underlying factors. This was examined through factor loadings, which can range from -1 to 1. Loadings close to -1 or 1 indicate that a variable is strongly associated with the factor, whilst loadings close to 0 indicate a weak association. Further details on the methods of EFA can be found in Appendix 3.
Responsiveness was defined as the ability to capture clinically important changes over time (58). Patients completed each scale at both baseline and 9 months, which allowed for an exploration of change in mean scores over time across all instruments. Responsiveness was assessed using Spearman’s rank correlations between the instruments’ baseline-to-endpoint change scores.
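The preparation of change scores for this step can be sketched as below. The sketch assumes that change is defined as the 9-month score minus the baseline score, that a missing assessment is coded as `None`, and that only patients with a change score on both instruments enter each correlation; the function names are hypothetical.

```python
from statistics import mean


def change_scores(baseline, endpoint):
    """Endpoint minus baseline per patient; None marks a missing assessment."""
    return [
        None if b is None or e is None else e - b
        for b, e in zip(baseline, endpoint)
    ]


def complete_pairs(changes_a, changes_b):
    """Keep patients who have a change score on both instruments."""
    pairs = [
        (a, b)
        for a, b in zip(changes_a, changes_b)
        if a is not None and b is not None
    ]
    return [a for a, _ in pairs], [b for _, b in pairs]


def mean_change(changes):
    """Mean change over patients with both assessments available."""
    observed = [c for c in changes if c is not None]
    return mean(observed)
```

The two lists returned by `complete_pairs` are aligned by patient and can be passed to any Spearman rank correlation routine.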