Data-driven versus theory-driven methods to understand health service implementation problems: A case study of childhood vaccination barriers

Effective implementation requires identifying and addressing behavioural barriers to uptake. This study aimed to compare data-driven and theory-driven approaches to understanding implementation issues. We used childhood vaccination coverage as a case study, with international relevance and wide-ranging barriers contributing to low vaccine uptake. The study utilised the Behaviour Change Wheel framework, which incorporates both individual and system level barriers to behaviour, and is based on several levels of theory: the three components of the COM-B model (capability, opportunity and motivation) can be mapped to the 14 domains of the Theoretical Domains Framework (TDF), which is based on 84 underlying constructs. We first conducted a review of systematic reviews of parent-level barriers to childhood vaccination. Subsequently we: 1) inductively coded these barriers into a data-driven framework, using thematic analysis; and 2) conducted theory-driven mapping of these barriers to TDF domains and constructs. Coding and mapping were undertaken by two authors independently, and discrepancies were resolved through discussion. The data-driven and theory-driven results were compared.

It is unclear whether theoretical approaches to understanding implementation problems, such as the TDF and COM-B, produce different results when compared to data-driven approaches, such as literature reviews.
This paper outlines a new behavioural diagnosis procedure comparing theory-driven and data-driven approaches to understand an implementation problem of global significance.
We illustrate a unique methodology using several levels of theory, and identify new directions to improve the specificity of theoretical behavioural constructs in future research.
The paper illustrates how data-driven and theory-driven approaches synergise to produce a more comprehensive understanding of health service barriers than using either approach alone.

Background
Effective implementation of a health service program, guideline or treatment requires identifying and addressing behavioural barriers to uptake. This may involve reviewing existing literature if the problem is well researched, or conducting original research if the context or problem is new. Incorporating theoretical models or frameworks can ensure all possible behavioural influences are considered 1.
Behaviour change is complex. The use of theoretical models enables us to understand the mechanisms of change in behaviour, which can then be targeted in interventions. Multiple theories are used in healthcare, from simple models of individual health behaviour change like the Theory of Planned Behaviour 2, to broader systems thinking approaches to map the complexity of policy drivers 3. The Behaviour Change Wheel (BCW) is one approach that attempts to bring individual and system level factors together 4, based on the COM-B model that synthesises the 14 more specific behavioural constructs in the Theoretical Domains Framework (TDF) 5 into broader categories.
The TDF summarises the many overlapping constructs in the behaviour change literature, and was developed through expert consensus from 128 theoretical constructs in 33 theoretical models of behaviour 6. It provides an overview of 14 key theoretical constructs that explain health behaviour, and is a descriptive framework rather than a theory of causality. A separate systematic review of 19 frameworks for behaviour change interventions led to the BCW, which aims to guide the development of interventions by connecting the determinants of behaviour with behaviour change techniques 4. Developed in conjunction with the BCW, and at its central core, is the COM-B model, which proposes that behaviour is a product of the interaction between capability (psychological or physical), opportunity (social or physical) and motivation (automatic or reflective) 4,6.
The COM-B and TDF have been mapped to each other, but there is some duplication of the current 14 TDF domains across the COM-B components. Capability is the psychological or physical ability to enact a behaviour, which includes the TDF domains of knowledge, skills, memory/attention/decision processes, and behavioural regulation. Opportunity is the physical and social environment that enables the behaviour, which includes social influences and environmental context/resources. Motivation is the reflective and automatic mechanisms that activate or inhibit a behaviour, which includes the widest range of TDF domains: social/professional role and identity, beliefs about capabilities and consequences, optimism, intentions, goals, reinforcement and emotion. Table 1 summarises this theoretical relationship.

Beliefs about Consequences: Acceptance of the truth, reality, or validity about outcomes of a behaviour in a given situation
Beliefs about Capabilities: Acceptance of the truth, reality, or validity about an ability, talent, or facility that a person can put to constructive use
Intentions: A conscious decision to perform a behaviour or a resolve to act in a certain way
Goals: Mental representations of outcomes or end states that an individual wants to achieve
Motivation: Reflective and Automatic
Social/Professional Role and Identity: A coherent set of behaviours and displayed personal qualities of an individual in a social or work setting
Optimism: The confidence that things will happen for the best or that desired goals will be attained
Motivation: Automatic
Reinforcement: Increasing the probability of a response by arranging a dependent relationship, or contingency, between the response and a given stimulus
Emotion: A complex reaction pattern, involving experiential, behavioural, and physiological elements, by which the individual attempts to deal with a personally significant matter or event

Primary research is often used to identify barriers to implementation in different health service contexts, and this is the approach generally used with the TDF 6. Some issues have been well researched, but this evidence must be synthesised in order to inform comprehensive intervention design 7. Previous reviews have applied theoretical frameworks to help with this. For example, the BCW can be used to describe interventions in terms of broader functions 8, and the COM-B can be used to display barriers and facilitators at multiple levels (patient, provider and system) 8. The TDF can be used together with the COM-B to group barriers and facilitators of health outcomes 9, or as a stand-alone framework 10.
These reviews have not compared the utility of this deductive approach (i.e. theory-driven grouping of barriers according to the existing TDF or COM-B construct categories) versus inductive approaches (i.e. data-driven grouping of barriers based on thematic similarities derived from the barrier descriptions themselves). There are potential advantages and challenges to each approach. A deductive application of these theoretical frameworks ensures that all psychological constructs relevant to behaviour are considered, even if research has not identified every construct. However, since these theoretical frameworks are based heavily on psychological theory, the internal 'motivation' aspect is more clearly defined than the more external 'opportunity' aspect. This imbalance does not necessarily align with the prevalence and significance of practical issues in health service implementation, which might be defined as "physical opportunity".
The aim of this paper is to illustrate how data-driven and theory-driven approaches can complement each other to produce a more comprehensive understanding of health service barriers than using either approach alone, using parent uptake of childhood vaccination (an international issue with wide-ranging barriers identified in multiple reviews) as a case study.

Theoretical approach
The study was based on the BCW framework because it incorporates both individual and system level barriers to behaviour, and is based on several levels of theory: the three components of the COM-B model can be mapped to the 14 domains of the TDF, which is based on 84 underlying constructs 4 .
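
The layering described above can be written down as a simple lookup structure. The sketch below is illustrative only: it follows the grouping of TDF domains under COM-B components given in the Background, assigns each domain to a single component, and ignores both the sub-components (e.g. reflective versus automatic motivation) and the construct level. The names `COM_B_TO_TDF`, `TDF_TO_COM_B` and `com_b_component` are hypothetical and are not part of the BCW, the TDF or the VBAT project.

```python
# COM-B component -> TDF domains, following the grouping in the Background.
# Simplifying assumption: each domain belongs to exactly one component.
COM_B_TO_TDF = {
    "Capability": [
        "Knowledge",
        "Skills",
        "Memory, Attention and Decision Processes",
        "Behavioural Regulation",
    ],
    "Opportunity": [
        "Social Influences",
        "Environmental Context and Resources",
    ],
    "Motivation": [
        "Social/Professional Role and Identity",
        "Beliefs about Capabilities",
        "Beliefs about Consequences",
        "Optimism",
        "Intentions",
        "Goals",
        "Reinforcement",
        "Emotion",
    ],
}

# Reverse lookup: TDF domain -> COM-B component.
TDF_TO_COM_B = {
    domain: component
    for component, domains in COM_B_TO_TDF.items()
    for domain in domains
}


def com_b_component(tdf_domain: str) -> str:
    """Return the COM-B component for a TDF domain (KeyError if unknown)."""
    return TDF_TO_COM_B[tdf_domain]


print(com_b_component("Emotion"))
```

A barrier mapped to a TDF domain can then be rolled up to the COM-B level with one lookup, which is the direction of aggregation used when comparing results across the levels of theory.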

Context: The Vaccine Barriers Assessment Tool (VBAT) project
This analysis is based on data gathered for the Vaccine Barriers Assessment Tool (VBAT) project, which aims to design and validate a survey tool to diagnose the causes of under-vaccination in children under five years. Developed in Australia and New Zealand, VBAT aims to incorporate both access and acceptance barriers in a comprehensive tool which will include both short and long form versions, for different uses. A review of systematic reviews of barriers to childhood vaccination was conducted, and 583 descriptions of parental barriers to childhood vaccination uptake were extracted and categorised into data-driven categories 11. Barriers were extracted if they were reported from or relevant to the specific perspective of parents of children under five years; barriers from the perspective of health professionals or the health system alone were not included. The findings of the review were thematically organised into a data-driven framework of barriers. In a separate theory-driven process, the 583 barrier descriptions were mapped to the 14 domain version of the TDF, to check whether any theoretical determinants of childhood vaccine uptake were missing in the systematic review data. The purpose of this exercise for the VBAT project was to ensure that a comprehensive pool of potential survey questions could be generated that captured both access and psychological or acceptance barriers. The data-driven review and development of the VBAT items will be reported separately. In this paper, specific terms are used to refer to data-driven versus theory-driven concepts as outlined in Table 2.

[...] independently conducted the data-driven coding of barriers in Excel. Two authors with expertise in behavioural science (CB) and vaccination (JT) independently conducted the theory-driven mapping of barriers to domains and constructs in a separate Excel spreadsheet. In both instances, disagreements were resolved through discussion.
The data-driven and theory-driven categories were compared using cross-tabulations. Discrepancies between the two approaches were discussed with the wider VBAT study team, which includes expertise in vaccination programs and public and primary health (MD, DD, LT, ST) and psychometric assessment (DC).
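
As a rough illustration of the cross-tabulation step, the sketch below counts how many barriers fall into each pairing of a data-driven category and a TDF domain. The records are hypothetical, hand-made examples rather than the study's extracted barriers, and the field names (`category`, `tdf_domain`) and function names are assumptions made for this example; the study itself used Excel.

```python
from collections import Counter

# Hypothetical records for illustration only; these are not the study's data,
# and the category and domain assigned to each barrier are assumed.
barriers = [
    {"text": "clinic too far away", "category": "access",
     "tdf_domain": "Environmental Context and Resources"},
    {"text": "inconvenient appointment times", "category": "access",
     "tdf_domain": "Environmental Context and Resources"},
    {"text": "fear of needles", "category": "concerns and beliefs",
     "tdf_domain": "Emotion"},
    {"text": "worry about side effects", "category": "concerns and beliefs",
     "tdf_domain": "Beliefs about Consequences"},
    {"text": "family members oppose vaccination",
     "category": "social or family influence",
     "tdf_domain": "Social Influences"},
]


def cross_tabulate(records, row_key, col_key):
    """Count records for every (row, column) pair that occurs in the data."""
    return Counter((r[row_key], r[col_key]) for r in records)


def format_table(counts):
    """Render the counts as tab-separated text, with zeros for empty cells."""
    rows = sorted({row for row, _ in counts})
    cols = sorted({col for _, col in counts})
    lines = ["\t".join(["category"] + cols)]
    for row in rows:
        cells = [str(counts[(row, col)]) for col in cols]
        lines.append("\t".join([row] + cells))
    return "\n".join(lines)


counts = cross_tabulate(barriers, "category", "tdf_domain")
print(format_table(counts))
```

Rows that spread across many columns suggest a data-driven category that the theory splits up; columns that collect many rows suggest a theoretical domain acting as a catch-all.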
The data-driven coding process produced 74 barriers (reported in detail elsewhere) within 7 categories (access, clinic or health system factors, concerns and beliefs, social determinants (e.g. socio-economic status), health perceptions and experiences, knowledge and information, social or family influence). The theory-driven mapping of barriers was initially based on the 14 domains of the most recent TDF version, but this was expanded to consider the 84 constructs when it was difficult to categorise some barriers. The final mapping was therefore completed using a combination of domain and construct, with definitions documented for how these would be applied to the childhood vaccine context from the specific perspective of parents (not health professionals or health systems).

Results
Mapping data-driven barriers to theory-driven domains
The initial definitions used to compare data-driven barriers with theory-driven domains/constructs led to 89% agreement at the domain level (e.g. all barriers relating to the clinic setting will be under the domain of Environmental Context and Resources). Resolving disagreements for the domains and subsequent constructs required further definitions at the construct level before 100% agreement was reached. Table 3 illustrates this for the domain of Environmental Context and Resources, e.g. we decided that issues relating to how appointment times are managed will be under the construct of Organisational culture/climate, while issues relating to inconvenient access for the parent will be under the construct of Person x Environment Interaction. The full list of definitions is available in Appendix 1.

Figure 1 shows the number of barriers represented in each theoretical domain, and Table 4 shows the relationship between the theory-driven COM-B components and TDF domains, and the data-driven barriers identified in the systematic review. Of the 14 TDF domains, 10 were clearly identified in the data from the barrier reviews. Some domains were not specific enough to differentiate between types of barriers, while other domains were not well covered in the review data. Two domains did not specify the data-driven barriers clearly, with many different concepts grouped together under generic terms (Beliefs within Beliefs about Consequences; Barriers and Facilitators within Environmental Context/Resources). Six domains were not covered by the first coder but two were included after discussion with the second coder (issues relating to good/bad communication between parent and provider were moved to Skills; things that help the parent to vaccinate their child on time, such as including vaccination with another appointment, were moved to Reinforcement).
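
An agreement figure of the kind reported at the start of this section can be computed as simple percent agreement between two coders' labels for the same barriers. The sketch below assumes that definition (the paper does not state how agreement was calculated) and uses hypothetical labels; `percent_agreement` and `disagreements` are illustrative names, not part of any published tool.

```python
def percent_agreement(coder_a, coder_b):
    """Percentage of items that both coders assigned to the same label."""
    if len(coder_a) != len(coder_b):
        raise ValueError("Both coders must label the same number of items")
    if not coder_a:
        raise ValueError("At least one coded item is required")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)


def disagreements(coder_a, coder_b):
    """Indices of items to take forward to discussion."""
    return [i for i, (a, b) in enumerate(zip(coder_a, coder_b)) if a != b]


# Hypothetical domain labels from two coders for ten barriers.
coder_1 = ["Knowledge", "Emotion", "Skills", "Emotion", "Knowledge",
           "Social Influences", "Emotion", "Knowledge", "Skills", "Emotion"]
coder_2 = ["Knowledge", "Emotion", "Skills", "Emotion", "Knowledge",
           "Social Influences", "Emotion", "Knowledge", "Reinforcement",
           "Emotion"]

print(percent_agreement(coder_1, coder_2))
print(disagreements(coder_1, coder_2))
```

Percent agreement does not correct for agreement expected by chance; a chance-corrected statistic such as Cohen's kappa is a common alternative when the number of categories is small.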
Four domains were not clearly covered in the final agreed coding: Optimism, Intentions, Goals and Behavioural Regulation (with the exception of two very general barriers for Intentions and Goals with no further explanation). Within the 14 TDF domains, many specific constructs were not identified in the data, especially those relating to health professional perspectives (e.g. Skills) or interventions (e.g. Behavioural Regulation). This is shown in yellow in Appendix 1.

Emotion: Anxiety about vaccination, fear of needles, psychosocial distress
*Note: These 4 domains were not included in the first round of coding. Intentions and Goals were later included after discussion, with a very lenient interpretation of the data-driven barriers to maximise the number of domains covered, given that the aim of the exercise was to generate questionnaire items covering all possible behavioural influences. No data-driven barriers could be interpreted as Behavioural Regulation or Optimism.
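
The gap check described above amounts to counting barriers per TDF domain and listing the domains with no barriers. The sketch below shows one way to do this; the `mapped` list is hypothetical and deliberately small, so the gaps it reports are not the study's findings, and the function names are illustrative.

```python
from collections import Counter

# The 14 domains of the current TDF version.
TDF_DOMAINS = [
    "Knowledge",
    "Skills",
    "Social/Professional Role and Identity",
    "Beliefs about Capabilities",
    "Optimism",
    "Beliefs about Consequences",
    "Reinforcement",
    "Intentions",
    "Goals",
    "Memory, Attention and Decision Processes",
    "Environmental Context and Resources",
    "Social Influences",
    "Emotion",
    "Behavioural Regulation",
]


def count_by_domain(mapped_domains):
    """Return a count for every TDF domain, including zero-count domains."""
    unknown = set(mapped_domains) - set(TDF_DOMAINS)
    if unknown:
        raise ValueError(f"Unknown TDF domain(s): {sorted(unknown)}")
    observed = Counter(mapped_domains)
    return {domain: observed[domain] for domain in TDF_DOMAINS}


def uncovered_domains(mapped_domains):
    """List TDF domains to which no barrier was mapped."""
    return [d for d, n in count_by_domain(mapped_domains).items() if n == 0]


# Hypothetical mapping output: one TDF domain per barrier description.
mapped = [
    "Knowledge",
    "Emotion",
    "Emotion",
    "Social Influences",
    "Environmental Context and Resources",
    "Beliefs about Consequences",
]

gaps = uncovered_domains(mapped)
print(gaps)
```

Keeping the full domain list fixed, rather than deriving it from the data, is what allows theory to expose gaps: a domain that never appears in the reviews still appears in the output with a count of zero.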

Discussion
Overall, we found it useful to synthesise health service implementation barriers using both data-driven and theory-driven methods to gain a comprehensive understanding of the barriers to childhood vaccination. The data-driven categories represented the review data in a clearer way than the theoretical domains, with better differentiation; but the four missing theoretical domains were useful as a way to identify key gaps to be addressed in the item pool for developing a new tool to diagnose the causes of childhood under-vaccination.
Resolving conflicts at the domain level was relatively easy, with 100% agreement reached quickly for the most relevant domain. However, there were some barriers that could have been placed in 2 or 3 domains, e.g. previous experience of vaccine side effects could be framed as knowledge, beliefs or salient events.
Resolving conflicts at the construct level was more difficult because many constructs within a domain were very similar when applied to the brief barrier descriptions extracted from reviews. For example, the influence of family member opinions could fit within group identity, social norm or social pressure. The decisions made at construct level were arguably more subjective than the domain level, but both needed to be considered to make sense of many barriers that could be framed in different ways.
For this study it was necessary to go into more theoretical detail than the commonly used models: the COM-B and TDF. Importantly, the gaps identified in our data-driven review would not have been found if the analysis had only been done at the COM-B level, as all six components were addressed by the 10 data-driven domains. In addition, the 14 TDF domains were still not specific enough for two coders to reliably map the barrier data, so we were required to go back a step to the 84 theoretical constructs that informed the TDF development. We found it helpful to use a combination of domain and construct level to map the data. A previous review using the TDF identified some issues that could not be mapped to the TDF (e.g. clinician and patient characteristics), but some of these could be mapped at the construct level depending on the framing, e.g. under professional identity, skills, environment x person and resources constructs 12.
This paper provides a methodology for anyone seeking to understand an implementation issue that already has a large amount of qualitative and/or quantitative research, complementing an earlier paper that focuses on how to apply the TDF in primary qualitative research 6. There are several practical implications for other researchers seeking to comprehensively understand implementation barriers using theoretical models in this way. Firstly, you need to decide on very specific framing for a health situation, e.g. only looking at the parent perspective on vaccinating their child determines how you frame barriers relating to the doctors' knowledge. Conducting this process from the health professional perspective would produce different results in terms of the theoretical constructs identified in the literature. Secondly, the COM-B model was not specific enough, with uneven explanation of different barrier types, so you may need to go into more detail at domain and construct level to interpret the data. Thirdly, theory was useful for identifying gaps in a data-driven review of literature, but data-driven categories made more sense for the specific implementation topic. The value of using a theory-driven approach may therefore depend on the purpose of conducting the review. For our purposes, this review will inform the development of a diagnostic tool to measure the causes of under-vaccination, requiring us to include the widest possible range of behavioural drivers. For other projects, it may be more prudent to focus only on the theoretical drivers that are within an organisation's control to address, or to identify data-driven issues from the perspective of key stakeholders to ensure their interest and support.
More generally, this study has implications for theoretical models commonly used in implementation science. Some constructs are vague and became catch-alls (e.g. barriers and facilitators in this case). On the other hand, some constructs are too specific and hard to distinguish (e.g. group vs social norms), so it makes sense to combine these into one TDF domain. In our experience, the decision was often between constructs in different domains, rather than constructs within a domain, suggesting that there are some issues with the way the TDF domains are differentiated. On the other hand, the construct level was often too subjective and detailed to identify clear gaps in data. This suggests that overarching models like the COM-B and TDF need to be supplemented with more context-specific models for different health areas (e.g. prevention versus treatment of infectious disease), targets of behaviour change (e.g. parents versus doctors), and the context (e.g. higher resource settings where psychological barriers may be more important, versus lower resource settings where practical access issues require greater differentiation). Another option would be to use broad implementation frameworks that include practical issues like cost, such as the Consolidated Framework for Implementation Research (CFIR) 13. Other researchers have found it helpful to combine the TDF and CFIR for a more comprehensive approach 14. A third option would be to add more specific domains to the next version of the TDF to better differentiate between issues relating to "Environmental Context and Resources". In our review, this covered a very wide range of issues: socio-economic issues such as having low income, societal issues like the influence of media, health system issues like vaccine supply and cost, and individual access issues like distance and time.
This was found to be a catch-all category in many previous reviews of clinicians and patients using the TDF 12,15-19, so is not limited to the issue of vaccination barriers. For example, a review of barriers to low back pain guidelines found this domain was common to 4/5 clinician behaviours while many other domains were not covered at all 17. Another review on diabetic screening identified 17 barriers in this domain versus 6 for the next most common domain 15. Further development of this construct may need to be specific to different health topics.
The TDF domains that weren't covered well by the vaccination barrier review data (optimism, behavioural regulation, intentions and goals) may have been found in other areas of the literature. Optimism is often researched as a personality-based predictor of health 20, and this conceptualisation is unlikely to be identified as a public health barrier or the target of a public health intervention. Behavioural regulation may not relate as well to occasional behaviours like vaccination, compared to something like eating healthy food, which requires daily monitoring 21. Intentions and goals may be more likely found in theory-based intervention literature where intention is a common outcome 22, rather than the barrier literature where intention is conceptualised more as a product of the barriers. These domains may be appropriate to understand other health contexts, but for this case study they were less relevant. However, the low prevalence of these domains appears to be similar to some previous barrier reviews on different topics (e.g. the reviews on low back pain guidelines and diabetic screening described above 15,17).
This study addressed reliability by using a method of independent coding using both inductive and deductive approaches. Our team included a wide variety of expertise to help contextual framing for theoretical constructs as applied to data-driven barriers. The limitations include restricting our review data to parent barriers only, which affected the way that health professionals' and health system barriers were conceptualised. We also applied only one overarching framework to behaviour change models, and acknowledge that there are many other approaches to this theoretical issue.
In conclusion, using both data and theory approaches can help achieve a more comprehensive understanding of health service implementation problems. However, the process is subjective, so it requires a wide range of expertise to reduce biased interpretation and to maximise utility of the identified barriers for the specified purpose.

Conclusions
Using both data- and theory-driven approaches can help achieve a more comprehensive understanding of barriers to health service implementation. The data-driven categories represented the review data in a clearer way than the theoretical domains, with better differentiation; but the missing domains were useful as a way to identify additional issues to investigate further. Both approaches resulted in a comprehensive list of barriers to vaccination that would not have been achieved using either approach alone. This will inform a diagnostic tool to measure the causes of under-vaccination.

Declarations
Ethics approval and consent to participate
Ethical approval not applicable.

Consent for publication
Not applicable.

Availability of data and materials
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.
CB planned the study, conducted the analysis, and drafted the paper. JT and JK conducted the analysis and were major contributors in writing the manuscript. DC, DD, LT, ST and MD contributed to group discussions about the analysis approach and interpretation of the results, and revised the manuscript. All authors read and approved the final manuscript.

Figure 1
Number of data-driven barriers in each TDF domain
