Patient recruitment and data collection, management, and analysis were subcontracted to Mapi/ICON plc, which provided a final report to the EORTC detailing the methodology and findings.
Sample
Recruitment was carried out through a UK-based recruitment agency. Patients were eligible to participate if they were 18 years or older; were currently receiving cancer treatment, as confirmed by a clinician; were able to read and understand English; voluntarily agreed to participate in the study; and provided written informed consent.
Pilot testing
Five patients were interviewed to test the acceptability, understanding, and relevance of the instructions for the QLQ-C30 voice script.
Equivalence testing
In addition to the previously described eligibility criteria, patients in the equivalence testing were required to have no treatment changes planned between completion of the paper and phone versions. To demonstrate equivalence between the paper-and-pen and phone administration modes, assuming an ICC >0.70 against a minimally acceptable level of 0.50, a sample size of 63 patients was required [12]. Two waves of recruitment were conducted. In the first wave, 50 patients were recruited, the number appropriate for an equivalence threshold of ICC >0.90. However, protocol deviations were observed: only 26 patients completed the paper and phone versions of the QLQ-C30 within the pre-specified two-day timeframe. A second wave of recruitment was therefore conducted to address this limitation; 37 additional patients were recruited under the same eligibility criteria, bringing the total sample size to 63.
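The logic behind a sample size in this range can be illustrated with a rough approximation based on Fisher's z transform of a correlation. This is a simplification, not the exact two-replicate ICC method of the study's reference [12]; the function name and defaults below are illustrative:

```python
from math import ceil, log
from statistics import NormalDist

def approx_n_for_icc(rho1, rho0, alpha=0.05, power=0.80):
    """Rough sample size for showing ICC > rho0 when the true ICC is rho1,
    using Fisher's z transform of a correlation as an approximation."""
    z1 = 0.5 * log((1 + rho1) / (1 - rho1))  # Fisher z at the expected ICC
    z0 = 0.5 * log((1 + rho0) / (1 - rho0))  # Fisher z at the null bound
    za = NormalDist().inv_cdf(1 - alpha)     # one-sided alpha
    zb = NormalDist().inv_cdf(power)         # power
    return ceil(((za + zb) / (z1 - z0)) ** 2 + 3)

n = approx_n_for_icc(0.70, 0.50)
```

With an expected ICC of 0.70 against a lower bound of 0.50, this crude approximation lands in the mid-60s, in the same neighbourhood as the 63 patients required by the exact method cited in [12].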
Study Design
Pilot testing
Patient interviews were conducted by trained qualitative researchers and audio-recorded for analysis. Interviews lasted approximately 60 minutes and followed a study-specific interview guide, which summarised the methods for conducting the interview and contained semi-structured questions, along with questions on demographic and clinical variables to be captured during the interview. Patients' responses were recorded anonymously on a grid detailing results per patient, and the results were qualitatively reviewed and summarised. The QLQ-C30 phone script was subsequently revised accordingly. The interview recordings were destroyed after completion of the analysis, with an anonymised copy retained for the study files.
Equivalence testing
A randomised, cross-over design was used to compare the self-administered paper version and the interviewer-administered phone version of the QLQ-C30 in patients currently receiving treatment for cancer, following the recommendations of the International Society for Pharmacoeconomics and Outcomes Research (ISPOR) PRO Mixed Methods Task Force [19]. Patients were randomised (1:1) to complete either the paper or the phone-administered version first. After providing informed consent, each patient completed a brief sociodemographic and clinical form. Depending on randomisation, patients were then asked either to complete the paper version of the QLQ-C30 and return it to the recruitment agency in a prepaid envelope, or to respond to the questionnaire by phone, following the phone script as presented by the interviewer, a trained qualitative researcher. The interviewers recorded patients' responses on a paper copy of the QLQ-C30. The paper version was estimated to take approximately 30 minutes to complete, and the administration time for the phone version was recorded for each patient. Any comments or observations made by the patient during the phone administration were recorded on a feedback form.
Two days after the first completion of the QLQ-C30, patients were asked to complete it again using the other mode of administration. The date of completion of the paper version was noted for each patient, to assess compliance with the pre-specified two-day time frame. For patients who completed the phone interview first, the recruitment agency waited for confirmation of interview completion from the study team before sending the paper version by post.
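The 1:1 allocation of administration order in the cross-over design could be sketched as follows. The patient identifiers, seed, and sequence labels are illustrative; the study's actual randomisation procedure is not described in detail:

```python
import random

def allocate_sequences(patient_ids, seed=2024):
    """Balanced 1:1 allocation of administration order: half of the
    patients complete the paper QLQ-C30 first, half the phone version
    first. Illustrative sketch only."""
    rng = random.Random(seed)
    half = len(patient_ids) // 2
    sequences = (["paper-first"] * half
                 + ["phone-first"] * (len(patient_ids) - half))
    rng.shuffle(sequences)  # randomise which patient gets which order
    return dict(zip(patient_ids, sequences))

# Hypothetical IDs for the 63 patients in the equivalence testing sample
allocation = allocate_sequences([f"P{i:03d}" for i in range(1, 64)])
```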
Data Analysis
Patients were described in terms of clinical and socio-demographic variables, as reported during the phone interview (pilot testing) or on the socio-demographic/clinical form (equivalence testing). Age, gender, educational status, and disease history were reported. All data processing and analyses were performed with SAS® software for Windows, Version 9.2 or later (SAS Institute, Inc., Cary, NC, USA).
Pilot testing
Feedback from patients was compiled in an analysis grid, and reported per patient based on a qualitative assessment of the questionnaire, its instructions and individual items, with any additional comments also recorded.
Equivalence testing
All patients who met the inclusion criteria and, at each administration, completed enough QLQ-C30 items for every domain to be scored were included in the equivalence testing analysis. Responses to the QLQ-C30 items were described in terms of completion and distribution of responses per administration mode. Missing data were described in terms of the number and percentage of missing responses per item, along with the number and percentage of missing items per patient, including the number of patients with at least one missing item. Continuous variables were described using frequency, mean, standard deviation, median, first and third quartiles, and minimum and maximum values. Categorical variables were described using the frequency and percentage of each response choice, with missing data included in the calculation of percentages.
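The per-item and per-patient missing-data summary could be sketched as below. The record layout (one dict of item responses per patient, with `None` marking a missing response) and item names are assumptions for illustration:

```python
ITEMS = [f"q{i}" for i in range(1, 31)]  # the 30 QLQ-C30 items

def describe_missing(records):
    """Summarise missing responses: count per item, count per patient,
    and the number of patients with at least one missing item."""
    per_item = {item: sum(1 for r in records if r.get(item) is None)
                for item in ITEMS}
    per_patient = [sum(1 for item in ITEMS if r.get(item) is None)
                   for r in records]
    n_with_missing = sum(1 for m in per_patient if m > 0)
    return per_item, per_patient, n_with_missing

# Tiny fabricated example: the second patient skipped item q5
records = [
    {item: 1 for item in ITEMS},
    {item: (None if item == "q5" else 2) for item in ITEMS},
    {item: 3 for item in ITEMS},
]
per_item, per_patient, n_with_missing = describe_missing(records)
```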
Equivalence testing was performed at both the item and domain score levels, with the primary objective being to evaluate equivalence at the score level between the two modes of administration using the ICC [20]. The widely used benchmark of ICC >0.70 was applied [21], with ICC values between 0.75 and 0.90 indicating good agreement and values greater than 0.90 indicating excellent agreement [22]. Weighted kappa coefficients [23] were used to assess the extent to which both administration modes produced the same patient responses to the QLQ-C30 items (results are reported in Appendix A). Following Fleiss' guidelines [24], a kappa value greater than 0.75 was characterised as excellent, 0.40-0.75 as fair to good, and less than 0.40 as poor. Mean differences in item-level scores were also calculated and are displayed in Appendix B.
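The score-level agreement statistic can be sketched as below. The paper cites [20] but does not state which ICC form was used, so this implements one common choice for mode-equivalence work, ICC(2,1): two-way random effects, absolute agreement, single measurement. The example data are fabricated:

```python
def icc_agreement(scores):
    """ICC(2,1) from a two-way ANOVA decomposition.
    scores: list of (paper_score, phone_score) pairs, one per patient.
    One plausible ICC form; the study's exact choice is not stated."""
    n = len(scores)                          # patients
    k = len(scores[0])                       # administration modes (2)
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)  # patients
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)  # modes
    ss_err = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return ((ms_rows - ms_err)
            / (ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n))

perfect = [(10, 10), (20, 20), (30, 30), (50, 50)]       # identical modes
noisy = [(10, 12), (20, 19), (30, 33), (50, 48), (70, 69)]
```

Identical scores across modes yield an ICC of 1.0, and small mode-to-mode discrepancies relative to the between-patient spread keep the ICC near 1, which is the behaviour the >0.70 benchmark relies on. The item-level weighted kappa analysis is not shown here; it would follow the same paired-scores layout.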
To assess the robustness of results across the two waves of recruitment, a sensitivity analysis compared ICC values between patients included before the study amendment (first wave of recruitment: n=26) and those included after it (second wave of recruitment: n=37), using scores from the paper and phone administration modes of the QLQ-C30. Additional sensitivity analyses were conducted on the full equivalence-testing sample (n=63) to compare ICCs by age (<60 vs. >60) and gender.