In this paper we have demonstrated different approaches to the clinical verification of participant-reported outcomes, used with varying success in our clinical trials unit. We developed each approach on a trial-by-trial basis to accommodate the requirements of the individual trial, the type of participant-reported clinical outcome, and the available resources.
The outcomes of interest have been varied, including related hospitalisations, repeat surgery, further low-trauma fractures and post-catheterisation urinary tract infection (UTI). Because of the diverse nature of the trials in which our case studies were nested, and the different approaches to verification used, we did not set out to assess whether some clinical outcomes were more accurately reported by participants than others. However, limited data from other studies suggest that this may be the case[6-9]. One possible reason is that participants find some outcomes easier to report than others, because of the terminology and language used in the questionnaire, how familiar participants are with the outcome, or what they are told by medical staff. For example, in the CATHETER case study some participants were given prophylactic antibiotics and may have thought these were to treat, rather than to prevent, an infection. We are aware of one previous study comparing participant and clinician reports of UTI, which showed 82% and 84% agreement at 3 and 12 months post-prostatectomy[14]; however, it is not clear what question(s) participants were asked, or what clinical definition of UTI was applied, which makes comparison with our data difficult. In CATHETER, participants were asked “have you had a urine infection” and “have you received antibiotics for a urine infection”. The trial was therefore likely to have identified symptomatic UTIs for which participants sought treatment, and prophylactic antibiotic use, rather than asymptomatic, microbiologically confirmed UTIs.
There may be a further mismatch between what participants and clinical/research staff understand (or misunderstand) by a medical term. For example, in CATHETER participants were asked which antibiotic they had been prescribed; in a number of cases the participant could not recall the name, or described a drug that was either not an antibiotic or would not be used to treat UTIs. An alternative approach would have been to list antibiotics used to treat UTIs and ask participants whether they had received any of these. Alazzawi et al. provide an example of four participants incorrectly reporting that they had had a stroke when they had in fact experienced a transient ischaemic attack (TIA)[9]. In a population-based survey, only 29% of those reporting a stroke had this verified by hospital records[15]. Participants who experience a TIA may understand that they have had a “mini-stroke” but not recognise that this is the same as a TIA, so such misunderstandings are perhaps unsurprising. Furthermore, participants may not be given sufficient information about a medical term to record the outcome of interest accurately. For example, Dushey et al[3] noted 36.7% accuracy of participant-reported major bleed episodes: the clinical verification followed strict criteria, but participants were given no criteria for what constituted a major bleed.
The timing at which participants are asked to report an outcome (relative both to the outcome itself and to the original event) may also affect the accuracy of their reporting. Other factors that may affect the accuracy of participant-reported clinical outcomes include how relevant the outcome is to the participant, and how much impact it has on their life: the need for repeat surgery, for example, is likely to have a bigger impact on a participant’s life than a urinary infection.
Within this work, we set out to verify the clinical events that participants reported. Participants may, of course, have experienced relevant clinical events that they did not report in a questionnaire; when relying solely on participant-reported outcomes, the extent of such under-reporting is unknown. This may be a limitation of the verification methodology more generally.
A further limitation (of our case studies, and of previous work in this area) is that the “clinical verification” provided by medical records may not be a true gold standard; the records themselves may be subject to inaccuracies.
An alternative to verifying participant-reported outcomes against individual patients’ medical records would be to confirm them using routine datasets, but this approach also has limitations. Routine data may themselves be subject to inaccuracies: for example, the Information Services Division (ISD) sets a 90% accuracy standard for routine data, and for general/acute inpatient and day cases (SMR01) reports accuracy of 89-94%[16]. Possible reasons for inaccuracy include operations being miscoded (perhaps owing to a lack of clinical engagement in coding[17]) or being coded to the wrong participant. For certain conditions, there are further limitations to using routine data to verify participant-reported clinical outcomes. Firstly, there may be insufficient information to identify which limb was involved or which specific type of operation was performed. Secondly, operations carried out privately will not be captured in routine data, which may be particularly problematic if these constitute a large proportion of operations. Thirdly, there are potential time-lags both in routine data being coded and made available to researchers, and in the approvals process to access routine data for research purposes. Fourthly, participants may not have given consent at the outset of the trial for the use of routine data. Finally, some outcomes are not routinely captured in national datasets: for the CATHETER outcome (UTI), for example, there is no national register, and although national prescribing data could provide information about antibiotic prescriptions, these data could not be linked to individuals. Although routine data were available for KAT, and data about all participants were requested from GPs for RECORD, we did not attempt to validate these sources in this paper.
Given that there may be inaccuracies both in participant-reported clinical outcomes and in data captured in medical records, decisions have to be made about which source to treat as “correct”. Different approaches were taken in the four case studies presented here. In KAT[10] and REFLUX[11], the information from the clinician (GP or surgeon) was considered “correct”, and unless an event was verified by the GP or surgeon it was not included in the analysis. In RECORD[12], clinical verification of participant-reported fractures was sought from the recruiting site, GP and/or central data sources; fractures that could not be confirmed (n=16) were excluded from the main analysis. In CATHETER[13], where GP information was received it was regarded as definitive; however, for the small number of participant-reported UTIs (8 of 830 participants) that could not be confirmed by the GP (for example, because no response was received or the participant was no longer registered with the practice), the researchers used alternative sources (i.e. responses to other questions within the questionnaire) to verify the data. Before embarking on a process to verify participant-reported clinical outcomes, therefore, strategies for dealing with mismatches in the data should be considered and documented.
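To make this kind of pre-specified reconciliation hierarchy concrete, the sketch below illustrates, in Python, decision logic of the broad shape used in CATHETER: treat the GP record as definitive where a response exists, and otherwise fall back to internal questionnaire cross-checks. The data structure and field names are hypothetical, invented purely for illustration; they are not the trial's actual data model.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ReportedUti:
    """One participant-reported UTI (hypothetical structure)."""
    participant_id: str
    gp_confirmed: Optional[bool]       # None = no GP response received
    reported_antibiotics: bool         # related questionnaire item
    named_plausible_antibiotic: bool   # drug named is one used to treat UTIs

def verify(event: ReportedUti) -> bool:
    """CATHETER-style hierarchy: the GP record is definitive when available;
    otherwise fall back to cross-checks within the questionnaire itself."""
    if event.gp_confirmed is not None:
        return event.gp_confirmed  # GP response treated as the "correct" source
    # No GP response: corroborate using other questionnaire responses
    return event.reported_antibiotics and event.named_plausible_antibiotic

# Example: no GP response, but internally consistent questionnaire answers
print(verify(ReportedUti("P001", None, True, True)))  # True
```

Documenting the rule in this explicit, executable form at the outset would also make clear, before any data arrive, exactly which mismatched events will and will not enter the analysis.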
The potential benefit of verifying participant-reported clinical outcomes is difficult to quantify, and again is not something we set out to do in this paper. There may be advantages in terms of the external generalisability of the results, which can make them appear more relevant to clinicians practising in that area. Equally, there may be merit where there are particular safety concerns and accurate reporting is essential. However, verification may have less impact on the comparison of interventions, particularly in studies where the participant is blinded to the intervention and therefore less likely to be influenced by knowledge of their treatment allocation. Conversely, there may be greater benefit from clinical verification of participant-reported outcomes in studies that are not blinded (where participants may have preconceptions about the intervention they received). Similarly, if the outcome (and particularly any misclassification of the outcome) is not equally distributed between arms of the parent trial, the benefit of verification may be greater.
Whilst the potential benefit may be hard to quantify, the costs of any approach to verification should be carefully considered. Of the 6,882 clinical outcomes reported in participant questionnaires in the KAT trial, only around 6% (roughly 400) were confirmed as relevant (i.e. related to the trial knee) and included in the statistical analysis[10]. In RECORD, where three questions were used to capture the primary outcome (‘broken any bones’, ‘how did you break’ and ‘which bone(s)’), the trial team were better able to identify which reports were potentially relevant and should be subject to further verification[12]. Refining the question asked of KAT participants, for example by asking them to report knee-related hospital readmissions only, or by asking a series of questions to capture additional relevant information, might therefore have increased the proportion of events confirmed as relevant and reduced the time spent coding and/or verifying those later identified as not relevant. Checking individual cases against hospital or primary care records may involve only a time-cost for the trial team and clinical staff, but this cost scales with the number of cases being checked: in REFLUX[11] only 19 surgeons were contacted for further information, whereas in CATHETER[13] the GPs of over 800 participants had to be contacted. To manage the workload in CATHETER, a full-time member of staff was employed, whose main duty was to manage the process of collecting and processing the resulting data over a three- to six-month period. There are also both time and financial costs associated with obtaining routine data and linking it to trial cohorts.
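One rough way to reason about this trade-off is to note that verification workload scales with the number of reports flagged by the questionnaire, not with the number of true events. The sketch below makes this arithmetic explicit; the 30% figure for a refined question is a purely hypothetical assumption for illustration, not a result from KAT or any other trial.

```python
def checks_needed(n_relevant: int, precision: float) -> int:
    """Reports that must be verified to capture n_relevant true events,
    given the 'precision' of the questionnaire filter (the fraction of
    flagged reports that turn out to be relevant). Illustrative only."""
    return round(n_relevant / precision)

# KAT-style broad question: only ~6% of checked reports were relevant,
# so capturing ~413 true events meant verifying ~6,883 reports.
broad = checks_needed(413, 0.06)
# Hypothetical refined question raising precision to 30%: ~1,377 reports.
refined = checks_needed(413, 0.30)
print(broad, refined)
```

Under these assumed figures, a sharper questionnaire filter would cut the verification workload roughly five-fold for the same number of confirmed events, which is the intuition behind refining the questions asked of participants.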
Recommendations for practice
Our primary recommendation is that, during the planning phase of a trial, careful consideration should be given not only to which participant-reported clinical outcomes should be collected and how, but also to whether any verification of these should be undertaken. The decisions reached are likely to vary on a trial-by-trial basis; in our opinion there is no optimal, “one size fits all” solution. However, making decisions about clinical verification of participant-reported outcomes at the outset of a study will help ensure that any costs associated with verification are covered, that the trial timeline includes adequate time for the collection and verification of outcomes, and that the consent sought from participants is sufficient to allow verification. Furthermore, we recommend that consideration is given to how any discrepancies identified during the verification process will be dealt with (for example, whether unverified outcomes will be included in the analysis).
If trial participants are being asked to report clinical outcomes in a questionnaire, it is important that appropriate language is used so that they can respond accurately to the clinical information being requested of them. The use of very “medical” terms may be inappropriate. Careful piloting of such questions may help avoid misinterpretation, confusion or inaccurate responses.
Recommendations for future research
Further reports of approaches to, and the impact of, clinical verification of participant-reported outcomes will help inform researchers. More formal cost-benefit analyses, considering both the impact and the cost of verification, would further inform decisions about the relative value of verification in different settings.