Can a Complex Vocational Rehabilitation Intervention Be Delivered With Fidelity? Fidelity Assessment in the FRESH Feasibility Trial

Determining whether complex rehabilitation interventions are delivered with fidelity is important because differences can occur between sites, between the therapists delivering the intervention and over time, threatening trial outcomes and increasing the risk of Type II and Type III errors. Aims: to (1) evaluate implementation fidelity of vocational rehabilitation delivered in FRESH, a multi-centre feasibility randomised controlled trial; and (2) understand factors affecting delivery. Methods: mixed-methods process evaluation. Fidelity was measured quantitatively using intervention case report forms, fidelity checklists and clinical records. Qualitative data from mentoring records and from interviews with the intervention therapists, participants, their employers and NHS staff at each site explored moderators of implementation fidelity. The quantitative and qualitative data informed data collection tools and analysis. Data were examined against a logic model and benchmarked against an earlier cohort study.


Contributions To The Literature
The 'Conceptual Framework of Implementation Fidelity' guides data collection and analysis of fidelity in a feasibility trial.
The use of a benchmark provided a measure of the degree to which OTs adhered to the intervention as described in the logic model.
Whilst multiple measurement methods provide fidelity results at the end of a study, mentoring provides a useful method for monitoring fidelity throughout.

Background
Every year in the UK, around 160,000 people sustain a traumatic brain injury (TBI) (1), defined as an injury to the brain caused by a trauma to the head (head injury) (2). Causes include road traffic accidents, assaults, falls and accidents (1). TBI can result in impaired mobility, cognition and social skills, problems with mood and managing emotions (3,4) and reduced quality of life (5,6). Such factors can affect the person's ability to regain functional independence and return to work (7). Return to work is an important rehabilitation goal for many survivors of TBI (8). Employment provides economic security and also supports physical, psychological and social health (9). However, vocational rehabilitation (VR) services to support people recovering from brain injury are rare, and data on their effectiveness are limited (9,10).
The Facilitating Return to work through Early Specialist Health-based interventions (FRESH) study (11) was funded by the National Institute for Health Research (NIHR) Health Technology Assessment (HTA) Programme to test the feasibility of conducting a future study to determine whether early VR delivered by occupational therapists (OTs) to support people with TBI in returning to work is effective and cost-effective. It involved a multi-centre, feasibility, parallel group randomised controlled trial with feasibility economic evaluation.
An important component of FRESH was an embedded mixed-methods process evaluation to provide context to the study findings and identify implementation issues for consideration in a definitive trial. The process evaluation involved a comprehensive assessment of intervention fidelity, which is acknowledged as key to understanding how intervention adaptations made during the trial relate to participant and trial outcomes (12) or how they can be used to optimise the intervention before embarking on a fully powered trial (13).
Intervention fidelity is particularly important in trials where a complex intervention, such as VR, is being delivered, because of the potential differences across sites, providers and over time (14-16). It is important to understand whether variable intervention effects could result from differences in intervention fidelity (17). Ensuring clinicians are adequately prepared to implement the intervention and then monitoring fidelity throughout is important to optimise patient outcomes (15,18).
Moving directly from a feasibility study to a large-scale definitive trial without considering implementation issues could perpetuate poor patient outcomes (19,20), prevent trialists from improving research design (21,22) and halt scientific discovery of interventions that have been wrongly determined ineffective (23). Trialists who do not measure fidelity are at risk of ignoring Type III errors, whereby outcomes are attributed erroneously to treatment effectiveness (24).
In FRESH the OTs were trained face-to-face, supported by a manual informed by a logic model describing the intervention components and processes (Additional file 3). The OTs were also supported by a mentor experienced in vocational rehabilitation for the duration of intervention delivery (25,26). The purpose of mentoring was to monitor fidelity (14,27,28) and enhance therapists' adherence (29).
The aim of this study was to systematically evaluate implementation fidelity, and the factors influencing fidelity, of a complex TBI VR intervention delivered by occupational therapists in the FRESH feasibility trial.

Methods
In the FRESH trial (11), 78 participants with TBI were randomly allocated to the VR intervention in addition to their usual NHS rehabilitation (intervention group) or usual NHS rehabilitation alone (control group) in three regions of England (the North West, London, and Yorkshire) over 12 months. The primary outcome was participants' work status, defined as a minimum of an hour per week of paid or unpaid work, analysed at 12 months using an intention-to-treat approach.
Implementation fidelity was measured as part of an embedded process evaluation. Data collection was longitudinal (Additional file 1). Quantitative process data consisted of the content of treatment records, fidelity checklists, mentoring records and clinical occupational therapy records. Qualitative data collection methods included interviews, therapy records and mentoring records. Both types of data enabled evaluation of the implementation of the intervention process and its fidelity during the study, as subsequently recommended by Toomey et al (2017) (40).
The Conceptual Framework for Implementation Fidelity (CFIF) (30) was used to guide both the measurement of fidelity and the understanding of factors affecting its delivery. The CFIF structure facilitated the development of measurement tools, including a fidelity checklist (12,41-43), data collection (44) and analysis (45).
Participants included patients recruited to the FRESH feasibility randomised controlled trial (RCT) who were randomised to receive the FRESH VR intervention. Inclusion criteria were people aged 16 years or above, admitted to one of three major trauma centres for 48 hours or more, with a new TBI (within 8 weeks) and who were in paid or unpaid work or full-time education prior to injury. Full eligibility criteria in the FRESH RCT are explained elsewhere (46). Clinical records and intervention session case report forms (CRFs) were collected for every intervention participant (n = 38). Purposive sampling was used to identify and recruit five participants for telephone interviews from each site with a range of demographics and TBI severity who had received the intervention (n = 15).
The five OTs who delivered the intervention were Health and Care Professions Council registered with expertise in VR. OTs attended two days of training, plus an additional day six months after intervention delivery commenced. This is described elsewhere (25,26). Training was delivered by a team of four OTs with expertise in VR, TBI and research. Training was supplemented by an intervention manual and monthly individual mentoring by a member of the training team to support implementation during the intervention delivery period. OTs could contact their mentor for advice when they needed it.
Employer participants included line managers, human resource professionals, or occupational health professionals of patient participants in employment or teaching staff linked to participants in full-time education. A convenience sample of 15 employers were recruited (five from each site). Only employers of people with TBI (PwTBI) randomised to receive the intervention were eligible.
NHS staff participants included those staff at each site who, in their usual role, were involved in managing, commissioning, or delivering TBI rehabilitation. A convenience sample of 15 NHS staff were recruited (five from each site).
The VR intervention, described using the TIDieR checklist in Additional file 2 and elsewhere (47-49), was delivered to the FRESH participants by an OT (described above). The primary focus of the intervention was preventing job loss and optimising employment and education outcomes. The intervention started within eight weeks of injury and lasted up to 12 months. The logic model for the FRESH intervention is described in Additional file 3.
Data were collected about every session delivered by every OT during the FRESH intervention delivery phase between January 2014 and January 2016. A description of each data collection tool, time points for data collection, data type, which CFIF construct they related to, their purpose and data usage are shown in Table 1. Quantitative adherence data (content, coverage, frequency and duration of intervention) and qualitative data explaining moderators of fidelity (participant responsiveness, intervention complexity, strategies to facilitate implementation, quality of delivery, recruitment, and context) were extracted from the tools and compared, through triangulation, by an independent researcher for data verification and to identify missing data. Each intervention session was recorded on an intervention CRF that was modified from one initially developed by Phillips et al (48), and in clinical records following the OTs' own local policies and procedures.
A fidelity visit checklist (Additional file 4), for use by an expert rater, was developed to measure whether OTs adhered to the core intervention processes. It was based on an observational checklist designed by Hasson et al (44) informed by the theoretical constructs of CFIF. As recommended by others (17,27), the checklist comprised the intervention components and core processes extracted from the FRESH intervention logic model (Additional file 3). Fidelity to each was rated on a 5-point ordinal scale as delivered: 'always', 'often', 'sometimes', 'seldom' or 'never' (where 'always' scored 1 and 'never' scored 5).
Factors that affected the delivery of the intervention were recorded alongside. Guidance notes helped assignment of ratings.
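As an illustration of the scoring described above, the following minimal Python sketch maps the five ordinal ratings to scores (1 = 'always', 5 = 'never') and lists items rated worse than 'often'. The item names, the example ratings and the follow-up cut-off are hypothetical and are not taken from the FRESH checklist.

```python
# Ordinal scale from the fidelity visit checklist: 'always' scores 1, 'never' scores 5.
RATING_SCORES = {"always": 1, "often": 2, "sometimes": 3, "seldom": 4, "never": 5}


def score_checklist(ratings):
    """Convert {item: rating label} into {item: ordinal score}."""
    return {item: RATING_SCORES[label.lower()] for item, label in ratings.items()}


def items_needing_follow_up(scores, cutoff=2):
    """Return items scoring worse (higher) than the cutoff.

    The cutoff of 2 ('often') is an assumption made for this sketch only.
    """
    return sorted(item for item, score in scores.items() if score > cutoff)


# Hypothetical ratings from one fidelity visit (item names are illustrative).
visit = {
    "work-focused goals set": "always",
    "employer contacted": "often",
    "home risk assessment": "sometimes",
}
scores = score_checklist(visit)
flagged = items_needing_follow_up(scores)
print(scores)
print(flagged)
```

Items returned by `items_needing_follow_up` could then be raised as topics in mentoring sessions.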
Each OT received four fidelity check visits by a post-doctoral research OT (JP) who was also a trial mentor. OTs provided anonymised copies of clinical notes and intervention CRFs prior to a visit. The researcher and OT met and discussed the intervention delivered to participants. The researcher systematically rated each component on the checklist. Following the visit, data were recorded on an Excel document by a member of the study team. Fidelity checklists were discussed between the mentoring team and were used to identify non-adherence to intervention delivery, which was then translated into topics for skill-building during ongoing mentor sessions (51).
The content of each monthly mentoring session was recorded on a mentoring CRF by the mentor. Additional mentor support by email and phone was also recorded.
Interviews with OTs were conducted early after training and again later to capture the OTs' varying experience of delivering the intervention. PwTBI, their employers and NHS staff were interviewed at the end of the intervention. Interviews followed a topic guide informed by the theoretical constructs of CFIF (30,44) to capture qualitative data on factors affecting implementation fidelity. Interviews took place by telephone and lasted approximately 45 minutes. They were digitally recorded, fully transcribed and cleaned, and the data were uploaded to QSR NVivo software for analysis.

Data Analyses
All constructs of fidelity, based on the CFIF, were measured to avoid gaps in reporting, and data were compared to the expected content, the proportion of components, required processes, required frequency and intervention duration (dose).
The intervention logic model and a benchmark were used to guide data analysis and interpretation. There is no specific guidance to answer, "How much variation is allowed in fidelity measurement?" but Durlak and DuPre (2008) (52) indicated, in their meta-analysis of 542 interventions between 1976 and 2006, that outcomes were effective when interventions were delivered with 60-80% fidelity. They advised researchers not to expect 100% fidelity and recommended that the variation in fidelity across sites should be reported rather than only presented as a mean, because this can mask expected variation.
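To illustrate why a mean alone can mask variation, the short Python sketch below summarises fidelity across sites by mean, minimum, maximum and range. The site names and percentages are invented for illustration and are not FRESH data; the 60% lower bound is taken from the range cited above.

```python
from statistics import mean


def summarise_fidelity(site_fidelity):
    """Summarise per-site fidelity (%) by mean, minimum, maximum and range."""
    values = list(site_fidelity.values())
    return {
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
    }


# Hypothetical fidelity percentages for three sites (not study data).
site_fidelity = {"Site 1": 95.0, "Site 2": 80.0, "Site 3": 50.0}

summary = summarise_fidelity(site_fidelity)
# Sites falling below the lower bound of the 60-80% range cited in the text.
below_60 = sorted(site for site, value in site_fidelity.items() if value < 60.0)

print(summary)
print(below_60)
```

In this invented example the mean falls inside the 60-80% range even though one site falls well below it, which is only visible when the spread is reported.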
Carroll et al (30) suggest using a benchmark against which to measure fidelity, as this adds to understanding the quality of the intervention delivered. The benchmark used for this comparison was Phillips' (47) description of an early VR intervention for PwTBI, which informed the development of the FRESH intervention. Quantitative data about the proportion of components delivered by the OTs were compared to the same data provided by Phillips (2013) to illustrate how closely they matched.
Fidelity checklists were analysed using the 5-point scale after each monitoring visit. Data obtained from clinical records and intervention CRFs were triangulated to identify variations in fidelity and disagreement between data sources. Where there was disagreement, e.g. in the recording of a session, the clinical record was considered more likely to represent what had occurred because therapists were more familiar with this form of documentation than the intervention CRF. Descriptive statistics were used to describe the quantity and content of the intervention delivered.
Intervention content was analysed by comparing each CRF with the description of the intervention session in the clinical records. The proportion of time spent on each intervention component was calculated from the CRF. Duplicated data were removed from the analysis and missing data were recorded.
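The proportion calculation described above can be sketched as follows. This is a minimal Python illustration, assuming each CRF entry records one component and the minutes spent on it, and that duplicates have already been removed; the entries shown are invented and the field layout is not the actual FRESH CRF format.

```python
from collections import defaultdict


def component_proportions(entries):
    """Return the percentage of total recorded time spent on each component.

    `entries` is a list of (component, minutes) pairs.
    """
    totals = defaultdict(float)
    for component, minutes in entries:
        totals[component] += minutes
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {component: 100.0 * minutes / grand_total
            for component, minutes in totals.items()}


# Hypothetical CRF entries for one participant (not study data).
crf_entries = [
    ("assessment", 30),
    ("work preparation", 60),
    ("return to work", 30),
    ("work preparation", 30),
    ("assessment", 50),
]

proportions = component_proportions(crf_entries)
print(proportions)
```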
The frequency and duration of the intervention were calculated using the dates recorded in the CRFs and clinical records. Total time spent in direct contact and indirect contact with patients was taken from the content proformas. Data from the CRF and clinical records were compared to these timeframes.
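A minimal Python sketch of this step is given below: it derives the duration of intervention delivery from session dates and lists sessions that appear in one record type but not the other. The dates are invented, and matching sessions purely by date is a simplifying assumption of the sketch rather than the study's procedure.

```python
from datetime import date


def duration_in_days(session_dates):
    """Days between the first and last recorded session."""
    return (max(session_dates) - min(session_dates)).days


def unmatched_sessions(crf_dates, clinical_dates):
    """Session dates present in only one of the two record types."""
    crf, clinical = set(crf_dates), set(clinical_dates)
    return {
        "crf_only": sorted(crf - clinical),
        "clinical_only": sorted(clinical - crf),
    }


# Hypothetical session dates for one participant (not study data).
crf_dates = [date(2014, 1, 10), date(2014, 1, 24), date(2014, 3, 7)]
clinical_dates = [date(2014, 1, 10), date(2014, 1, 24), date(2014, 2, 14)]

duration = duration_in_days(crf_dates + clinical_dates)
mismatches = unmatched_sessions(crf_dates, clinical_dates)
print(duration)
print(mismatches)
```

Dates listed under `crf_only` or `clinical_only` would be candidates for manual checking against the source records.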
Text describing factors moderating implementation fidelity (participant responsiveness, intervention complexity, facilitation strategies and quality of delivery) was extracted and triangulated across multiple records (fidelity visit checklists, clinical records, mentoring CRFs and interviews). Interview transcripts were analysed by at least two researchers using the framework method (53).

Results
Quantitative data from intervention CRFs were available for 38 PwTBI participating in the FRESH trial.
Of the 38, 15 consented to interview. They had a mean age of 39.4 years (range 25-61), 80% (n = 12) were male, six had a severe TBI, four a moderate TBI and five a mild TBI. Just under half (n = 7) were injured through falling, five from road traffic collisions, two from assaults and one was unsure. Six had other rehabilitation being delivered and five had occupational health services involved. Whilst all participants consented to the OT communicating with their employer, only seven consented to a workplace visit.
Participants' job roles included electrician, abattoir worker, carer, rigger, restaurant waiter, teacher, business owner, administrator, IT, warehouse worker, estates manager and doctor.
Five OTs (four women) were recruited with a mean age of 39.2 years (range 34 to 47 years). OTs were qualified a mean of 11.4 years (range 12 to 15 years). Two qualified in the UK and three overseas (South Africa, New Zealand and Australia). One held a higher degree in VR. All had experience in the National Health Service (NHS) and with people with neurological conditions (mean 9.7 years, range 3-15 years). Two OTs worked for the NHS (community and acute) and two were private practitioners. One OT left the trial. Two OTs were based in one site; the other two sites had one OT each.
Of the 15 TBI participants, 13 consented to their employer being contacted for interview, one was self-employed and one declined. Six employers consented and were interviewed. Four were line managers of the patient participant, one was a human resources manager and one an occupational health provider. They represented small, medium, and large employers. Two were third-sector organisations, two education facilities, one an NHS Trust and one a restaurant.
Triangulation of data identified missing data from clinical notes and CRFs: letters that were not recorded on the CRF, intervention sessions recorded on the CRFs that were not recorded in the clinical records, and missing intervention session dates. Table 2 combines all the quantitative data sources (intervention CRFs, clinical records and fidelity checklists), illustrates whether each OT delivered the intervention with fidelity according to the adherence constructs of CFIF and indicates which type of moderating factor affected the delivery of the intervention. Overall, OTs delivered the FRESH VR with fidelity. The fidelity checklist suggested the intervention was delivered as intended, with core processes almost 'always' or 'often' followed by all therapists.
Table 2 key: Π - fidelity met; Π* - fidelity met except for n = x cases; ** - missing data; timepoint 1: within 10 days of referral; timepoint 2: OT contact every 1-2 weeks, case manager 6-8 weeks; timepoint 3: on graded RTW, weekly for 4 weeks, then fortnightly for 8 weeks, then checks ≤ 8 weeks; timepoint 4: on full RTW, contact is 4-8 weeks; RTW - return to work.
However, therapists delivered the intervention differently to each other, which was influenced by caseload, participants' needs and circumstances. Because of the intervention's complexity, variation across OTs and different sites was expected. Variation was within 12% of the benchmark (Fig. 2) for most intervention components. An exception was a component (RTW) delivered 30% more than the benchmark by one therapist (OT-D) but within what is considered acceptable (52). Mapping relevant portions of text extracted from clinical records, fidelity checklists, mentoring CRFs and interviews to the CFIF moderating factors constructs highlighted and explained what affected intervention delivery (Table 2).
Coverage refers to ensuring that a sufficient proportion of the targeted population receives the entire intervention. Of the 38 PwTBI receiving the intervention, four (10.5%) withdrew after a mean of 17.5 visits (range 6-39) combined. Reasons included: "disagreement with the therapist regarding safety to drive"; "back at work"; "retired"; "coping" and "moved away and no contact details".
Having contact with the employer was considered a core component of the intervention and OTs had contact with all employers/teachers. OTs had direct contact with 14 (37%) employers and indirect contact (vicariously through the PwTBI) with 24 (63%) employers. In the benchmark study there was direct contact with 28.5% (N = 8) of employers, indirect contact with 55% (N = 16) and no contact with 16.5% (N = 5). OT-A and OT-B had direct contact with their PwTBI's employers for more than half of their caseload (67% (n = 6) and 55% (n = 4), respectively). OT-D had direct contact with 4 (24%, n = 17) employers and OT-C only had indirect contact with employers.
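The coverage and employer-contact percentages for the FRESH intervention group follow directly from the counts reported above, as the short Python check below shows. The variable and function names are chosen for this sketch only; the benchmark percentages are not recomputed here.

```python
def percentage(count, total):
    """Express count as a percentage of total."""
    return 100.0 * count / total


# Counts reported in the text for the FRESH intervention group.
intervention_participants = 38
withdrawn = 4
direct_employer_contact = 14
indirect_employer_contact = 24

withdrawal_pct = round(percentage(withdrawn, intervention_participants), 1)
direct_pct = round(percentage(direct_employer_contact, intervention_participants))
indirect_pct = round(percentage(indirect_employer_contact, intervention_participants))

print(withdrawal_pct, direct_pct, indirect_pct)
```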
The three intervention components most frequently delivered across OTs, in descending order, were work preparation (22%) (benchmark 23%), return to work (19%) (benchmark 14%), and assessment (14%) (benchmark 15%). Due to an adaptation to the intervention CRF from the original study by Phillips (2013), there is no comparator to 'family support'. 'Family support' did not feature as a separate component in the benchmark and was included under the heading 'current issues' (Phillips, J. 2013).
Whereas Fig. 1 illustrates the mean proportion of time delivering each component by all OTs, individual variations are shown in Fig. 2, where data are normalised with 0% representing the benchmark. Not investigating individual differences can "hide" some important findings. For example, calculated as a mean, the fourth most frequently delivered component was "family support" but only OT-B had delivered this.
Whilst most components were delivered close to 10% variation, three components (current issues, RTW and work preparation) were delivered with greater variation. Moderating factors extracted from mentoring CRFs and clinical records explained that this was due to tailoring the intervention to meet participants' needs. OT-D delivered more 'return to work' due to a single participant who successfully returned to work but then experienced new workplace relationship issues requiring proportionally more OT support. Proportionately, OT-C delivered more work preparation than all other OTs and the benchmark because one participant had pre-existing addiction issues and another had neuropsychological symptoms, which explained the additional preparation required for return to work. OT-B delivered more 'current issues' due to a single participant requiring additional support navigating multiple medical appointments.
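To make the benchmark comparison concrete, the Python sketch below expresses the pooled percentages reported above for the three most frequently delivered components as deviations from the benchmark, so that 0 represents the benchmark. Treating the deviation as a simple percentage-point difference is an assumption of this sketch; the normalisation used for Fig. 2 may differ, and per-OT values are not reproduced here.

```python
# Pooled percentages across OTs and benchmark values, as reported in the text.
delivered = {"work preparation": 22, "return to work": 19, "assessment": 14}
benchmark = {"work preparation": 23, "return to work": 14, "assessment": 15}


def deviation_from_benchmark(delivered, benchmark):
    """Percentage-point difference from the benchmark (0 = same as benchmark)."""
    return {component: delivered[component] - benchmark[component]
            for component in delivered if component in benchmark}


deviations = deviation_from_benchmark(delivered, benchmark)
# Component with the largest absolute deviation from the benchmark.
largest = max(deviations, key=lambda component: abs(deviations[component]))

print(deviations)
print(largest)
```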
Frequency and duration (dose) were recorded on the fidelity checklist as four key time points, and findings indicated close adherence to the key time points but with some variation. Intervention CRFs and clinical records revealed variations in the number of direct visits with participants. Based on Phillips' (2013) study, it was anticipated that participants would receive an average of 11 sessions. The frequency of visits per participant was highest in the first month and then declined. The mean number of face-to-face sessions per therapist per whole caseload is shown in Table 3.
1. OT-A had a participant who returned to the same job with the same employer and remained on the OT's caseload for 11 months without receiving intervention; the maximum allowable 12 months was used as a follow-up period in case any problems occurred with job retention.
2. OT-B had a participant who did not return to work and needed referrals to 10 further services to meet trauma-related needs. Lengthy NHS waiting times meant the therapist monitored the PwTBI beyond 12 months until these services were in place.
3. OT-C had a participant who had not been in regular work prior to recruitment. Regular intervention was recorded over eight months and discharge was recorded close to 12 months, but contact was made again at the 16-month point without clear reasons.
4. OT-D had a participant who returned to studying. Clinical records did not provide clear reasons for discharging after the 12-month point.
There was variation in the duration of intervention delivery between therapists and between participants. A difference of 447 days (nearly 15 months) was measured between the participants seen for the shortest (9 days) and the longest (456 days) durations. There was no benchmark comparison for duration.

Moderating Factors
Factors affecting participant responsiveness and acceptability of the intervention varied. Regular contact with participants was required. OTs explained this was difficult in some cases, especially when people had returned to work. Some PwTBI reflected they had busy lives and found it difficult to make time for the intervention.
Communication was affected when participants temporarily stayed out of area, moved out of area or when aphasia interfered with communication. Even when PwTBI provided permission for OTs to contact their employers about RTW, some employers did not reciprocate, which was a barrier to intervention delivery.
The majority of PwTBI interviewed reported that they found the intervention helpful in achieving goals and considered the intervention was delivered appropriately. Mentoring and clinical records indicated that some PwTBI were unhappy with the intervention, or part of it and ceased engagement but did not formally withdraw from the trial. Records also indicated that some felt the intervention was not timed appropriately because of impending surgery or where there was no intention to work in the future.
To facilitate RTW, the OTs ideally worked directly with employers and most PwTBI provided consent, indicating acceptability. Employers interviewed also had positive views of the intervention. Employers valued the OT's expertise and clear communication, which helped to gain their trust and engagement.
Mentoring records revealed that not all PwTBI allowed contact with their employer, but reasons for this were not always recorded.
All OTs were enthusiastic about the intervention and its positive effect on recipients. Some OTs worried about the ethics of providing one group of people with a higher intensity programme compared to the usual caseload, whereas others found the autonomy away from usual work restrictions liberating.
The complexity of the intervention meant that OTs were taught the principles of how to tailor the intervention. Qualitative data showed how this was accomplished by considering individual needs of the PwTBI, the employer and changing needs over time, resulting in variability in delivery. Data indicated the intervention was tailored without detracting from fidelity and OTs valued being allowed to tailor it.
Facilitation strategies included the intervention training package. Interview data indicated OTs felt prepared to deliver the intervention, and the manual was useful for some early after training, whilst mentoring helped OTs develop confidence and expertise. Fidelity visits helped OTs remember and examine their fidelity to the intervention. OTs' own experience helped in overcoming complexity, such as working to meet the needs of both the PwTBI and the employers. Adequate resources from sites, for example admin support, helped manage the OTs' time.
Moderating factors related to context included access to NHS systems such as the IT infrastructure to support electronic transfer of referrals, using an NHS.net email account and a secure space for clinical records away from colleagues that might otherwise cause contamination. Local geography affected delivery; for example, OTs in London spent more time travelling using public transport. Cooperative working between OTs and other community teams was variable, dependent on pre-existing rehabilitation and the experience of the OTs involved. Geographical boundaries and availability of services affected access to specialist services required to support intervention delivery. Limited backfill of the OT's usual role and limited manager support meant OTs were occasionally pulled away from the FRESH VR intervention, for example during winter pressures.

Discussion
The multiple data sources and mixed methods showed that OTs delivered a VR intervention with fidelity to TBI participants in a randomised controlled feasibility trial. Triangulation across and between quantitative and qualitative data sources enabled a detailed and rigorous analysis of fidelity (55). This design helped identify factors that moderated intervention delivery and explain fidelity violations (deviations in process or component delivery) to be addressed before a future large-scale trial. For example, cross-referencing quantitative data (intervention case report forms (CRFs)) indicated that one OT spent more than the expected amount of time dealing with 'current issues', and qualitative sources (clinical records and mentoring CRFs) helped to explain the reasons.
The fidelity measurement methods answered different fidelity questions. Intervention CRFs, clinical records and the fidelity checklist helped determine what and how much intervention was delivered for comparison against the benchmark. Descriptions of intervention delivery in clinical records, fidelity checklists and mentoring records indicated whether intervention processes were followed and explained moderating factors.
Quarterly monitoring visits by the OT mentor to complete fidelity checklists each required an entire day to conduct, but provided insight into factors affecting fidelity. They were also an opportunity to realign intervention fidelity by re-educating the OTs, as recommended by Moore et al. (2014). For example, at one site the fidelity checklist indicated risk assessments of the home were not routinely conducted; at another it suggested interventions were not always explicitly work-focussed. This was communicated to mentors, closely monitored, and extra support provided through mentoring until everyone was satisfied the intervention was being delivered as intended.
Mentoring records provided real-time indicators of fidelity deviations and offered a window into trial processes that could be addressed during the intervention delivery period in the live trial. Mentoring also appeared to be an important facilitating mechanism for supporting intervention fidelity, which is consistent with findings reported elsewhere (56,57).
While multiple data sources corroborated findings and facilitated interpretation of moderating factors from different perspectives, there was redundancy in some qualitative measures of fidelity. For example, interviews with TBI participants indicated issues (moderating factors) relating to how needs changed over time, which were also documented in the OT's clinical records. Trial OTs' frustrations in communicating with participants were reported in both OT interviews and mentoring records. Given the resource implications of conducting, transcribing and analysing interviews, it could be argued that using only records of mentoring may be more appropriate in a future trial (12).
The benchmark, which was derived from the description of the VR intervention delivered in an earlier study (47,48), offered quality assurance that the FRESH intervention was delivered with fidelity despite anticipated variation in delivery. Variation was expected because of complexities associated with TBI, the intervention and work context, and the fact that the VR was delivered by different OTs in different sites.
While some variation is concerned with improving the fit of the intervention (58-60), it may also be seen as non-adherence, negatively impacting on patient outcomes (61). Understanding variations in intervention implementation during trials and potential effects on patient outcomes is important (62).
Using Stirman's (2013) system of coding variations to intervention delivery, the most frequently observed type in this study was 'tailoring'. Therefore, we explored variation both across and between OTs against the benchmark in terms of content and dose, which provided greater clarity about how OTs delivered VR. Trialists should examine this variation to understand implementation in real-world contexts and minimise dilution effects while achieving appropriate adaptation for local contexts (63). In this study, the range of fidelity measurement tools provided reassurance that variation was not non-adherence.
Some agreement is required as to what is an acceptable level of variation. The findings in this study suggest that variation of up to 15% in intervention component delivery across all therapists may be acceptable, but variation in excess of this should be explored. Tailoring a complex intervention may result in variation, and qualitative data should be able to explain this. Provided variation remains below 40%, it remains acceptable, which is consistent with others (52).
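The thresholds suggested above can be expressed as a simple decision rule, sketched here in Python. The function name, the category labels and the handling of values falling exactly on a threshold are assumptions made for illustration only.

```python
def classify_variation(deviation_pct, explore_above=15.0, upper_limit=40.0):
    """Classify a deviation from the benchmark (in %), ignoring its direction.

    Up to `explore_above`: treated as acceptable.
    Between the two thresholds: should be explained with qualitative data.
    At or above `upper_limit`: outside the range suggested as acceptable.
    """
    size = abs(deviation_pct)
    if size <= explore_above:
        return "acceptable"
    if size < upper_limit:
        return "explore with qualitative data"
    return "above suggested limit"


# The 30% value corresponds to the RTW component reported for OT-D;
# the other two values are hypothetical.
examples = {value: classify_variation(value) for value in (-12, 30, 45)}
print(examples)
```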
In this study the reported factors moderating intervention delivery provided important insights for future trials. They indicated that in future there should be additional emphasis in training on the importance of completing intervention CRFs accurately, along with practice sessions and emphasis on the importance of intervention dose. Changes in intervention CRFs should include clearer instructions for therapists to specifically record dates, to allow accurate reporting of the start and end of the intervention, and the introduction of a new CRF to record planned versus actual attendance with reasons for non-attendance.
Changes in the mentoring CRF should include clearer instructions for mentors to routinely discuss fidelity.
This study has indicated which methods are the most useful for measuring fidelity in a future trial, so that we can better understand why a complex intervention works or fails (23) and improve future trial designs (21). However, the intensity of data collection and the need for greater contextual understanding of the trial findings should be balanced with collecting only what is necessary for investigating the effectiveness of the intervention (12).
Although triangulation usefully revealed information about intervention complexity, missing data, the OTs' training needs and the measurement process, which is consistent with others' findings (44), it generated large volumes of data that required hours of analysis. In a future trial, measuring implementation fidelity in a proportion of trial therapists and using only the most useful measurement tools (fidelity checklist, intervention CRFs and mentoring records) is recommended.
Face-to-face fidelity monitoring visits were time consuming, but they were valuable (15). The researcher (JP) who assumed the dual role of trainer and mentor was an experienced OT and academic, which enabled a professional relationship with the trial OTs. This permitted in-depth enquiry about fidelity, which may not have been possible with someone less experienced. In a future trial the responsibility for fidelity monitoring could be held by mentors and data verified by an independent researcher.
Even though the findings report on the experiences of only four OTs who implemented a new complex intervention and are unlikely to be representative of all therapists, they highlight important points for consideration when training OTs to deliver complex interventions and measuring their fidelity in a trial context.

Conclusions
OTs delivered the FRESH VR intervention with fidelity but also with variation, as expected, and this was measured using data from multiple sources. This was useful in a feasibility trial because it identified factors likely to affect intervention fidelity in the future. However, multiple methods answer different questions.
Fidelity checklists answer whether intervention processes were followed and explain the moderators. Intervention CRFs and clinical records answer questions about adherence. Only expert mentoring provides real-time indicators of fidelity deviations and the reasons for them. Mentoring facilitates fidelity and provides a window into trial processes if provided throughout intervention delivery. Focussing resources on providing mentoring to therapists delivering an intervention should be considered an important facilitatory tool for implementation fidelity that affords multiple benefits. Some methods do not add value to fidelity measurement and may be wasteful of resources. Qualitative interviews with OTs, participants, employers and NHS staff did not provide additional useful data for fidelity measurement.
When planning to assess fidelity as part of a feasibility trial, it is important to capture moderating factors to be able to account for possible threats to fidelity in a definitive trial and longer-term clinical implementation in the NHS.

Availability of data and materials
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Competing interests
Jain Holmes received consultancy fees as an employee of Obair Associates to train and mentor FRESH therapists and was subsequently granted PhD Studentship funding from the University of Nottingham and UK Occupational Therapy Research Foundation to work on this study. She was a paid employee of Obair Associates Ltd during the conduct of the study.

Figure caption: Comparison of all OTs and benchmark delivery of intervention components