DOI: https://doi.org/10.21203/rs.3.rs-1082323/v1
Implementation science is a pragmatic and multidisciplinary field, centred on the application of a wide range of theoretical approaches to close ‘know-do’ gaps in healthcare. The implementation science community is made up of individuals on a continuum between academia and practice, but it is unclear to what extent the theoretical deliberations of implementation academics are translated into the work of implementation practitioners and on to patient benefit. This bibliometric study aims to use the field of clinical artificial intelligence (AI) implementation to sample the prevalence and character of theoretically informed implementation practices.
Qualitative research on key stakeholder perspectives on clinical AI published between 2014 and 2021 was systematically identified. Following title, abstract and full-text screening, eligible articles were characterised in terms of their publication, the AI tool and context studied, the theoretical approach (if any) and the research methods and their quality. Descriptive, comparative and regression statistics were applied.
One hundred and eleven studies met the eligibility criteria, with the monthly eligible publication rate increasing from 0.7 to 4.0 between 2014 and 2021. Eligible studies represented 23 different nations and 25 different clinical specialities. A theoretical approach was explicitly employed in 39(35.1%) studies, though six of these(15.4%) described novel theoretical approaches, and the most frequently used theoretical approach was applied only three times. There was no statistically significant trend in the prevalence of theoretically informed research within the study period. Of the 25 theoretically informed studies conducted in Europe or North America, 19(76%) used theories that originated in the same continent.
The theoretical approaches which characterise implementation science are not being put to use as often as they could be, and the means by which they are selected also seems suboptimal. The field may facilitate a greater synergy between theory and practice if its focus shifts from unifying implementation theories on to unifying and expanding the implementation community. By making more theoretical approaches accessible to practitioners and supporting their selection and application, theory could be more effectively harnessed to close healthcare’s ‘know-do’ gaps.
PROSPERO ID 248025
A minority of recent qualitative studies of clinical AI implementation employ established theoretical approaches from implementation science.
Among the studies that draw on theoretical approaches there is great heterogeneity in both the specific approach and its mode of application.
Theoretical approach selection is at least partly motivated by convenience rather than the specific needs of the target implementation case.
There is a need to stop re-inventing unifying implementation theories, embrace the advantages of theory diversity and focus on supporting theoretical approach selection and use for a wider and more impactful implementation science community.
Implementation science is a relatively young field drawing on diverse epistemological approaches.(1) Its pragmatic goal of bridging ‘know-do’ gaps to improve real-world healthcare necessitates this multidisciplinary approach. However, the challenge of aligning qualitative and quantitative methodologies has been a persistent source of agitation within the field.(2) Implementation science centres on the application of theoretical approaches to inform or evaluate implementation in a particular healthcare context. Consequently, defining and standardising the terminology and structure of these theoretical approaches has received considerable attention.(3–8) Evidence synthesis,(4) consensus methods(9) and passive taxonomies(10) have been used on many occasions in attempts to unify the field, with no evidence of growing or imminent agreement. This dogmatic discourse serves the interests of academics, but it risks alienating the practitioners on whom the real-world impact of the field depends.(11) To objectively gauge the disconnect between the two ends of implementation science’s academia-practice continuum, the present study examines the prevalence of theoretically informed work within a representative field. Defining exact boundaries for this multidisciplinary field is both challenging and controversial, a problem that can be avoided by sampling research which indisputably focuses on the process of implementation. This is why qualitative methods were made an inclusion criterion for the present study: they hold the sensitivity necessary to investigate the various contextual factors shaping implementation, as well as the complexity inherent to it. The second motive for a focus on qualitative methods is their foundational relationship with theoretical approaches. An additional analytical layer of theory may be undesirable in quantitative research.
In qualitative research however, as set out by Collins and Stockton, theory provides focus and organisation to the study, exposes and obstructs meaning, connects the study to existing scholarship and terms and identifies strengths and weaknesses.(12)
The relevance of the present study to contemporary implementation science also depends on sampling a prominent and active field of health research. Clinical artificial intelligence (AI) forms such a sample. Computer-based AI was conceived more than 50 years ago and has been incorporated into clinical practice through computerised decision support tools for a few decades.(13, 14) However, advancing computational capacity and the feasibility and potential of deep learning methods have galvanised public and professional enthusiasm for all applications of AI, including healthcare.(15) The acknowledgment of this potential is formalised in the embedding of clinical AI into national healthcare strategic plans and by the recent surge of regulatory approvals issued for ‘software as a medical device’.(16–19) Scientific publication on clinical AI tools has increased dramatically in recent years, though the proportion of these publications using qualitative research methods is yet to be described.(20) Despite this, clinical AI remains an area with little demonstrable benefit to real-world patient care, and so the well-cited ‘AI chasm’ provides an exemplary ‘implementation gap’.(21, 22)
This bibliometric study aims to describe the prevalence of theoretical approaches in contemporary health systems research, by sampling the field of clinical AI implementation. Each eligible study and the theoretical approach it may employ will be characterised to provide insights into the frequency and nature of qualitative methods used in clinical AI research.
This bibliometric study is part of a qualitative evidence synthesis which followed the PRISMA 2020 reporting guidelines (Supplementary materials 1) and was executed without amendments as published in an a priori protocol.(23, 24) A search string combining concepts related to AI, healthcare and qualitative methods was designed and tuned for sensitivity and specificity in MEDLINE(Ovid) (Figure 1).(25)
This search string was translated to Scopus, CINAHL(EBSCO), ACM Digital Library and Science Citation Index (Web of Science) to cover the computer science, allied health, medical and grey literature (Supplementary materials 2). The search was executed in April 2021 and reviewed literature from 2014, which marked the first market authorisations for software as a medical device granted in Europe and the USA.(16) Only English language indexing was required; there were no exclusion criteria relating to full-text language. The initial results were de-duplicated using EndNote X9.3.3 (Clarivate Analytics, PA, USA) and two independent reviewers (JH, MA) performed full title and abstract screening against the criteria (Figure 1) using Rayyan.(26) The process was overseen by an information specialist (FB) and screening disagreements were arbitrated by a separate topic expert (GM). Eligible review and protocol manuscripts were included for reference hand-searching only. Full-text review was performed independently in duplicate by the same two reviewers (JH, MA), with the same topic expert (GM) arbitrating. Corresponding authors were contacted up to three times when additional eligible data or reports were suspected (Figure 2).
Two reviewers (JH, MA) jointly extracted characteristics from 10% of articles before completing extraction independently. These characteristics included a quality score using the Joanna Briggs Institute (JBI) 10-point Checklist for Qualitative Research.(27) Data concerning the year and type of publication, source title and field, source impact factor, implementation context, theoretical approach use, study methods and study participants were also collected. This included the nature of the clinical AI tool studied, categorised as rule-based or not depending on whether it executed a pre-existing, human-dictated decision tree or guideline, or drew on a non-human means of decision making, e.g. regression modelling or machine learning. For each theoretical approach identified, the index article was sourced and underwent full-text review by a single reviewer (JH). Nilsen's taxonomy of theoretical approaches was used to facilitate the organisation of the wide range of approaches employed to study the implementation of clinical AI tools and applications (Figure 3).(10)
Of the 111 eligible reports, 104(93.7%) were full reports in peer-reviewed journals, with three(2.7%) short reports, three(2.7%) theses and one(0.9%) conference abstract. Citations of all eligible papers are available in Supplementary materials 3. Only two articles were not in the English language (one Italian, one German). Among journal publications, the median impact factor was 3.1 (interquartile range (IQR) 2.1–4.5) and 56(58.3%), 6(6.3%) and 34(35.4%) were medical, nursing or midwifery and informatics journals respectively. The median quality score using the 10-point JBI Checklist for Qualitative Research was 8 (IQR 7–8). There was a significant increase in the rate of all formats of eligible publication over time (Figure 4)(Kendall's tau r=0.691, p=0.018). Access to potentially eligible reports from two published protocols was not possible despite attempted correspondence.(28, 29)
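The publication-trend test reported above can be sketched as follows. The annual counts used here are purely hypothetical placeholders (the real monthly rates appear in Figure 4), so this reproduces only the form of the analysis, not the published statistics.

```python
from scipy.stats import kendalltau

# Hypothetical annual counts of eligible publications, 2014-2021.
# Illustrative only: the actual rates are reported in Figure 4.
years = list(range(2014, 2022))
counts = [8, 6, 9, 12, 11, 18, 25, 22]

# Kendall's tau tests for a monotonic association between year and count.
tau, p = kendalltau(years, counts)
print(f"Kendall's tau = {tau:.3f}, p = {p:.3f}")
```

A rank-based test such as Kendall's tau is a reasonable choice here because publication counts over a short window need not change linearly, only monotonically.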
Although there was some representation from developing nations, 88.1% of the 101 reports focusing on a single nation were in countries meeting the United Nations Development Programme’s definition of ‘high human development’(Figure 5).(30) The median human development index of the host nations for these 101 reports was 0.929(IQR 0.926,0.944).
In terms of the clinical AI application studied, 31(27.9%), 24(21.6%) and 56(50.5%) studies considered hypothetical, simulated and clinical applications respectively. The nature of the AI tool under investigation was rule-based in 66(59.5%) of reports, non-rule-based in 41(36.9%) and not specified in 4(3.6%). The tools studied were aimed directly at the public in 5(4.5%) studies, primary care in 45(40.5%), secondary care in 43(38.7%), mixed settings in 3(2.7%) and unspecified settings in 15(13.5%). Application was studied across a broad range of clinical specialties, though primary care dominated (28.8%, Figure 6). The tools used scalar and categorical clinical data in 83(74.8%) studies, imaging data in 9(8.1%) and mixed inputs in 1(0.9%) to perform triage, diagnostic, prognostic, management or unspecified tasks in 17(15.3%), 15(13.5%), 8(7.2%), 47(42.3%) and 23(20.7%) studies respectively.
Clinicians, patients, managers, industry representatives, academics and carers were participants in 95(85.6%), 27(24.3%), 25(22.5%), 15(13.5%), 9(8.1%) and 6(5.4%) studies respectively. Interviews, focus groups, surveys, think aloud exercises, observation and mixed data collection methods were used in 54(48.6%), 19(17.1%), 12(10.8%), 1(0.9%), 1(0.9%) and 24(21.6%) studies respectively. Thematic analysis, framework analysis, content analysis, constant comparative analysis, descriptive analysis, grounded theory approaches, other specified and unspecified data analysis methods were used in 39(35.1%), 11(9.9%), 13(11.7%), 4(3.6%), 6(5.4%), 8(7.2%), 3(2.7%) and 27(24.3%) studies respectively.
In total, 39(35.1%) studies stated some form of application of a theoretical approach. In six(15.4%) cases this was a novel implementation theory, and the most frequently used theoretical approaches had just three applications each (Table 1). Only one study explicitly drew on two separate theoretical approaches. There was no statistically significant change in the frequency of theoretical approach use over time, comparing the 43 studies published before the median year of publication, 2019, (39.5%) with the 68 published thereafter (30.9%)(χ2, p=0.349). Of the 15 European studies that used pre-existing theoretical approaches, 10(66.7%) used theoretical approaches originating in Europe, with the remainder originating in the USA. Of the 10 studies from North America, nine(90.0%) used theoretical approaches originating in the USA, with a single Canadian study using Methontology from Spain. In univariate linear regression with source impact factor as the dependent variable, neither the JBI qualitative research checklist score (unstandardised B = 0.12, 95% confidence interval -0.09, 0.32) nor the use of a theoretical approach (unstandardised B = 0.55, 95% CI -0.23, 0.94) had significant predictive value.
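The before/after-2019 comparison reported above can be reconstructed approximately from the published figures. The cell counts below are derived from the reported percentages (39.5% of 43 ≈ 17 theory-using studies before 2019; 30.9% of 68 ≈ 21 thereafter), so this is a sketch from summary statistics rather than a re-analysis of the raw data.

```python
from scipy.stats import chi2_contingency

# 2x2 contingency table: rows = published before / from the median year (2019),
# columns = used a theoretical approach / did not.
# Counts reconstructed from the reported percentages (39.5% of 43; 30.9% of 68).
table = [[17, 26],   # pre-2019: 17 of 43 theoretically informed
         [21, 47]]   # 2019 onward: 21 of 68 theoretically informed

chi2, p, dof, expected = chi2_contingency(table, correction=False)
print(f"chi2 = {chi2:.3f}, dof = {dof}, p = {p:.3f}")  # p = 0.349
```

The reported p-value of 0.349 is matched only without Yates' continuity correction, which suggests the uncorrected Pearson statistic was used.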
Theoretical approach | Year | Author’s nationality | Nilsen's taxonomy(10) | Frequency
---|---|---|---|---
Awareness-to-Adherence Model(37) | 1996 | USA | process model | 1 |
Behaviour Change Theory(38) | 1977 | USA | classic theory | 1 |
Behaviour Change Wheel(39) | 2011 | UK | implementation theory | 1 |
Biography of Artefact(40) | 2010 | UK | classic theory | 1 |
Consolidated Framework for Implementation Research(4) | 2009 | USA | determinant framework | 3 |
Clinical Performance Feedback Intervention Theory(6) | 2019 | UK | implementation theory | 1 |
Disruptive Innovation Theory(41) | 1995 | USA | classic theory | 1 |
Fit Between Individuals Task and Technology(42) | 2006 | Germany | evaluation framework | 1 |
Flottorp Framework(43) | 2013 | Norway | determinant framework | 1 |
Heuristic Evaluation(44) | 1990 | Denmark | determinant framework | 1 |
Methontology(45) | 1997 | Spain | process model | 1 |
Normalisation Process Model(46) | 2007 | UK | process model | 1 |
Normalisation Process Theory(47) | 2009 | UK | implementation theory | 2 |
Process Evaluation Framework(48) | 2013 | UK | evaluation framework | 1 |
Programme Sustainability Assessment Tool(49) | 2014 | USA | determinant framework | 1 |
Rapid Assessment Process(50) | 2001 | USA | process model | 3 |
Rogers' Theory of Diffusion(51) | 1962 | USA | classic theory | 1 |
Sittig and Singh Framework(52) | 2010 | USA | process model | 2
Strong Structuration Theory(53) | 2007 | UK | process model | 2 |
Technology Acceptance Model(54) | 1989 | USA | determinant framework | 3 |
Theoretical Domains Framework(7) | 2005 | UK | determinant framework | 1 |
Theoretical Framing Theory(55) | 1999 | USA | classic theory | 1 |
Unified Theory of Acceptance and Use of Technology(56) | 2003 | USA | determinant framework | 2 |
Considering the use of theoretical approaches within eligible studies, it is striking that only one third explicitly used a theoretical approach. In contrast to a prior scoping review in the context of guideline implementation, these data did not suggest that this low rate of theory application is increasing.(31) The heterogeneity was also notable, with no single theoretical approach being used more than three times and with highly varied modes of use (Table 1). This is not surprising given the plethora of theoretical approaches already documented within the field of implementation science and our own observation that 15% of the studies applied a novel theoretical approach.(32) However, it does raise the issue of how authors come to select a theoretical approach and whether this process can be optimised. Given the apparent lack of association between the use of theoretical approaches, or research quality, and the impact factor of the publishing journal, there are no clear external incentives for researchers to rigorously construct their approach. If the judicious selection and application of implementation theory is not acknowledged by this crude but widely used marker of esteem, then other motives may influence research practice. Only 14(43.8%) of the studies applying a pre-existing theoretical approach provided any kind of rationale for their selection, and these explanations tended to be vague, e.g. simply stating that the method was commonly used. Taken alongside the observed tendency for authors to apply theoretical approaches that originated within their own country or continent, it seems that researchers' choices are based, at least to some extent, on convenience and familiarity. This status quo seems unlikely to harness the full potential of theoretical approaches. A more methodical pairing of each implementation case's needs with each theoretical approach's value proposition could substantially improve the quality of implementation practices.(33)
These observations support calls to move the focus within implementation science away from the theoretical approaches themselves and on to increasing the accessibility and efficacy of their use.(34) The alternatives, reverential commitment to a handful of dominant theories or continued attempts to create a unifying theory of implementation, appear untenable in the face of an ever-expanding catalogue of theoretical approaches. Embracing the full extent and value of available theory will, however, require a kind of higher-order consolidation, as implementation science cannot expect to achieve its goals if its advocates must absorb all published approaches. Such a requirement would limit meaningful movement of individuals and expertise between the worlds of research and practice. The use of theoretical approaches must instead be supported with an interface that is accessible to individuals at any point on the continuum between implementation academic and practitioner.(11, 33) To be sustainable, this interface will require a persistent incentive for an independent entity to maintain and update a library of theoretical approaches, searchable by the various attributes that describe their suitability to different implementation needs. Such tools, agnostic of any particular theoretical approach, have been applied very successfully in other qualitative methodologies and are becoming established within implementation science.(33, 35) An implementation platform of this kind has the potential to minimise the theoretical knowledge demanded of users and to welcome an even broader audience to the implementation science community.
The present data also illustrate the paucity of qualitative enquiry within the field of clinical AI implementation. A recent bibliometric study, similarly aiming to categorise published work on clinical decision support but without any methodological constraints, identified 88 times as many publications over the same time period.(36) Such imbalances limit understanding of the context into which tools are to be placed. This under-appreciation and under-investigation of context is a major contributor to the ‘know-do’ gaps in healthcare which prompted the establishment of the field of implementation science.(1, 56) If research into the context into which efficacious tools are to be implemented does not become more prominent, the translational ‘AI chasm’ may persist regardless of the surge in policy and industry support.(16, 17, 21) This is because the complexity inherent to implementation cannot be meaningfully characterised, understood and managed without the flexibility afforded by qualitative research methods.(8) Such explorations of the broader contextual factors support the identification of mechanisms of successful implementation and facilitate the scale-up, sustainability and spread of innovations.(5) Encouragingly, there seems to be a notable rise in the rate of qualitative work (Figure 4), which in relative terms outstrips the contemporary proliferation of all-methods research in this area.(57) The need for the pragmatic brand of social science encapsulated by implementation science therefore remains clear, but the data are less encouraging about how implementation science has itself been implemented into research and practice.
While the present study addresses the paucity of assessments of the real-world use of theoretical approaches in implementation science, it has some clear limitations. Implementation science is a multidisciplinary field with poorly described boundaries.(2) A core assumption of the present study is that research into stakeholder perspectives on clinical AI tools using at least some qualitative methods would be broadly acknowledged within the field. This study's value also depends on the assumption that the approaches within the present sample are, to some extent, generalisable to implementation science practice. Analysing the motives for theory selection also proved challenging, as authors rarely justify their choice, and when they do it is often easy to align the same arguments with a different theoretical approach. Further exploration of authors' collaborations and epistemological commitments, or prospective inquiry, may yield more informative insight into these motives.
A real-world sample of implementation science research in the contemporary and prominent field of clinical AI finds that only a minority of research is theoretically informed. The type and mode of theoretical application is highly heterogeneous and is led, at least in part, by convenience rather than by the alignment of theory and implementation need. To engage a greater range of stakeholders and harness the full range and value of theoretical approaches, we recommend a shift in focus from unifying implementation theories to expanding and unifying the implementation community. To achieve this, we propose increasing the prominence of a theoretical-approach-agnostic interface as a gateway between implementation science research and practice.
AI Artificial Intelligence
ACM DL Association for Computing Machinery Digital Library
CI Confidence Interval
CINAHL Cumulative Index to Nursing and Allied Health Literature
EBSCO Elton B. Stephens Company
ENT Ear, Nose and Throat
IQR Interquartile Range
JBI Joanna Briggs Institute
PRISMA Preferred Reporting Items for Systematic Reviews and Meta-Analyses
UK United Kingdom
USA United States of America
As a bibliometric study using exclusively publicly available data, ethical approval was not sought.
Not applicable
Detailed search strategy from all five databases used is included within the supplementary materials (Supplementary materials 2). A full list of the 111 articles that were eligible for analysis is included within the supplementary materials (Supplementary materials 3).
Two of the authors (JH and PK) collaborate with DeepMind, a company involved in the development and application of clinical AI tools.
This study is funded by the National Institute for Health Research (NIHR) through the academic foundation programme for the second author (MA) and through a doctoral fellowship (NIHR301467) for the first author (JH). The funder had no role in the design or delivery of this study.
JH contributed to the conception and design of the work, the acquisition, analysis and interpretation of the data and drafted the manuscript. MA contributed to the acquisition and analysis of the data. PK contributed to the design and conception of the work. FB contributed to the design of the work, the acquisition, analysis and interpretation of data and revised the manuscript. GM contributed to the conception and design of the work, the data acquisition, analysis and interpretation of data and revised the manuscript. All authors approved the submitted version and all authors agree to be personally accountable for their own contributions and to ensure that questions related to the accuracy or integrity of any part of the work are appropriately investigated, resolved and the resolution documented in the literature.
We would like to acknowledge the support of the lay members of the reference groups supporting the design and dissemination of this and other work relating to the same NIHR doctoral fellowship (NIHR301467); Rashmi Kumar, Janet Lunn, Trevor Lunn, Rosemary Nicholls, Angela Quilley and Christine Sinnett.