Similarly, case study design, with its inherent flexibility, provided a sound basis for harmonising the disparities between sites and a platform on which to test our core indicators. Yin[21] and Creswell[30] promote the usefulness of this embedded multi-method design for its ability to add or remove data sources without detriment to the overall analysis. Case study design also gave us a solid data analysis strategy that could accommodate discrepancies across our partner countries and render them comparable and meaningful. Case studies have been used in clinical practice and research for a number of decades in complex settings including integrated care[26][31], as well as within implementation science approaches[32] and in EU studies[33].
In addition, the incorporation of case study design significantly enhanced theory-building opportunities, going some way towards assembling a deeper theoretical understanding of integrated care. Eisenhardt and Graebner (2007)[34] suggest that theory is emergent, situated in and developed by recognising patterns of relationships among constructs within and across cases. Replication logic, assisted by pattern matching, supports theory building in that multiple cases serve as replications, contrasts and extensions of the emerging theories. Within SUSTAIN, case study design supported the development of our propositions and consequent explanatory models. Theories embodied within the propositions could be tested and elaborated, ultimately leading to theory building in relation to our central concepts of person-centredness, prevention orientation, safety and efficiency in care delivery, and what seems to ‘work’ in integrated care improvements (see the SUSTAIN final report, De Bruin et al 2018[46]).
However, our implementation research approach could be described as overly simplistic and lacking clear steps to achieve certain EIT goals, which leaves it open to interpretation. More guidance is needed on how the triangle components relate to each other, and on how the evidence and stakeholders’ input connect to the triangle both individually and as a whole. Importantly, the EIT does not describe sufficiently well how the different context levels should be situated within the triangle, and it omits consideration of the sustainability of the intervention within a constantly changing environment. That said, the simplicity of the EIT could also be described as a strength, in that it is understandable and accessible to participants in the real world, which is vital given the highly participative stance of the framework. Given the variability within our projects, and within integrated care interventions generally, the lack of prescribed process information generated better ‘bottom-up’ plans about how the components would work together and relate to the intervention as a whole. Indeed, Glasgow et al (2012) see other knowledge translation models as too complicated, academic or time-consuming for those who wish to use the evidence. In contrast, they hold the EIT to be applicable and usable in a variety of situations, as we found.
Although stakeholder involvement was a key feature of our approach, its weaknesses must be recognised, and particular challenges arose at some of the sites. In the face of universal health and social care resource constraints, considerable commitment was required for stakeholders not only to develop and implement the improvements with research teams at sites, but also to take part in interviews and assist with obtaining quantitative indicators. Our partnership approach, fostered through the EIT, enabled sustained buy-in to a large extent. However, during the course of the implementation plan roll-out, two sites withdrew owing to competing priorities and a diversion of resources away from the SUSTAIN initiative. We were able to gather valuable data on the context and reasons for this withdrawal to supplement our analysis. Again, the adoption of case study design accommodated these fluctuations during the data collection period and, overall, helped to create useful and transferable results[30].
Case study design also has its critics. One of the most commonly cited disadvantages of case studies is that findings can lack generalisability and scientific credibility because replication is difficult[35]. However, external validity can be stronger in multiple case study designs, the choice made in SUSTAIN, and can be weak in more highly ranked randomised controlled trials (RCTs). Such weaknesses in RCT design have been exposed in a number of systematic reviews and secondary analyses[36].
In practical terms, there are further difficulties that researchers can encounter. For example, there can be a tendency to become overwhelmed with data, and the process can be very time-consuming, particularly with regard to developing and blending thematic statements from the analysed data sources. This occurs especially when propositions are lacking and no attempt has been made to link the data collection to the aims of the study in a focused way, or to set boundaries on data collection[25]. In SUSTAIN, we established clear objectives and propositions, protocols for every aspect of data collection and management, analytical templates to ensure consistency in data analysis, and a shared quality-controlled database. Difficulties still arose, however, so to supplement these measures and optimise the uniformity of our evaluation approach, we arranged regular one-to-one progress and ‘trouble-shooting’ calls with research teams and devoted time at six-monthly consortium meetings to methodological issues.
Moving now to the indicators, the extent to which we were able to develop a core set of applicable measures needs consideration. Given the difficulties of integrated care evaluation, we made efforts in our design to select meaningful and pragmatic instruments through a wide literature search, particularly with respect to measuring service user impact. A number of considerations narrowed the pool of suitable instruments: they had to be applicable to each of the very different integrated care improvements set up within the 14 SUSTAIN sites; they had to be suitable for administration to frail older people; and they had to reflect our central concepts of person-centredness, prevention orientation and safety. Other authors have usefully shed light on the evaluation of integrated care and the utility of associated instruments, many of which were considered during the selection process[37][38]. However, it became clear early on that several existing and validated indicators for frail older people with multimorbidity would be unsuitable. With quality of life measures, for example, this was due to the high likelihood that a relatively short intervention would have little impact; and recommended instruments such as the PACIC (Vrijhoef et al 2009) did not meet all of our criteria sufficiently closely. We therefore narrowed our focus to an examination of improvements to care and the personal impact of care delivery, which included degrees of person-centredness, experiences of co-ordination, and perceived control and independence.
With this in mind, and after much deliberation within the SUSTAIN consortium, we selected the P3CEQ[39][40] and the PCHC[41], the latter validated for our population group. At the time of selection, the P3CEQ was relatively new but seemed suitable for administration across all sites and intervention types. We did, however, experience some repetition between these two questionnaires, and in some sites there were significant problems with recruitment and with fatigue among older people. In response, the PCHC was withdrawn, as the P3CEQ seemed more attuned to the SUSTAIN themes and also included items on control and independence in health and social care. Case study design accommodated this adaptation. The data collection and analysis relating to the P3CEQ were nevertheless not without challenges during the course of SUSTAIN. We found that the instrument required essential preconditions (e.g. face-to-face administration, collection of reasons for non-response) and administration and coding guidelines (e.g. where informal carers support service users to answer questions)[42]. We conclude from our experience that establishing a solid, standard cross-country measure of older service users’ experiences of integrated care remains fraught with complexity and somewhat elusive.
Obstacles were more apparent in obtaining quantitative indicators, owing to the limited availability, accessibility and reliability of appropriate health and social care data across partner countries. This reflects, for example, differences in what data are collected and how, variations in the geographical representation of data, and a general lack of social care data; these problems are persistent. Across Europe, data are scattered across systems and are not interoperable, and privacy concerns and technical challenges block effective data recording and sharing at local, national and European levels[18]. In addressing this somewhat ‘hostile’ environment, we co-created with professionals and managers at the sites a core list of what could be obtained from routine service-level data, clinical notes, care plans or other sources (see Efficiency data in Table 1 for the indicators deemed common across sites).
Collecting directly from clinical notes and care plans had the potential to be a rich source of data[43] and could overcome the limitations of aggregated measures, such as hospital admission data, which are sensitive to small, widely dispersed population groups. Similarly, with the cost data, very few sites were able to extract specific costs related directly to the improvement interventions, but an estimate of staff hours was deemed possible, to give some indication of resource use. However, the use of clinical notes, care plans and staff hours depended on the accurate recording of these events by busy practitioners and managers, which could not be assured, an aspect also acknowledged by Jefferies et al[44]. With clinical notes, recording was variable and, unless prompted, did not always yield the information required. Care plans were not always completed or available; other researchers have reported similar experiences, attributing them to staff time pressures, poor document construction and communication difficulties with service users[7]. With staff hours, although diaries/templates were made available at sites, staff worked across initiatives and were not always able to separate and accurately record the specific hours dedicated to the improvement initiatives. In most cases, therefore, hours were estimated, which greatly reduced our ability to provide a sound cost analysis.
On this last point, the difficulty of measuring cost in integrated care is the subject of much debate within the literature. A lack of standardised outcomes and continuous changes in care delivery, for example, render traditional economic models unusable[45]. While SUSTAIN was keen to avoid health economic methods that fit poorly with the nature of integrated care, it was clear that our more pragmatic approach was also not optimal, and the search for a more reliable and attributable method should continue.
Any deficits in the quantitative data were, however, compensated for by the richness of our qualitative data sources. As well as service user and carer interviews, we obtained professional, managerial and other stakeholder viewpoints, alongside documentary evidence from care plans (where available), steering group meetings and field notes. These perspectives provided valuable insights into the personal impacts of the intervention, contextual influences and more nuanced information about if, how and why improvements made a difference (see De Bruin et al[46]).
The discussion moves, finally, to why we selected implementation research over other approaches such as realist evaluation. For SUSTAIN, gaining a consistent and understandable method across different institutions and contexts, while involving stakeholders not wholly conversant with research, was paramount. While our approach was not fault-free, realist methods also have their challenges, particularly regarding complexity. For example, Greenhalgh et al (2009) noted that a set of more or less well-defined ‘mechanisms of change’ can in reality prove difficult to pin down, and that developing context-mechanism-outcome (CMO) configurations is an interpretive task, achieved through much negotiation and dispute. The authors add that while realist evaluation can draw useful lessons about how particular preconditions make certain outcomes more likely, it cannot produce a simple recipe for success. Given that this latter aspect was a significant factor in our aim of promoting good knowledge transfer, the applicability of realist approaches to our design was limited, and implementation research seemed more suitable.
Nevertheless, similarities are evident between these evaluation approaches. While realist evaluation develops CMO configurations, implementation research also investigates equally important factors affecting implementation (geography, cultural beliefs, poverty), the processes of implementation themselves (multi-disciplinary working, local resource distribution) and the end product or outcome of the implementation (Peters et al 2013). Implementation research does not, however, link these components so tightly, circumventing the lengthy interpretive work of realist approaches. Nor does it, particularly in the case of the EIT, lend itself so readily to theory generation, unlike realist approaches. That said, combining implementation research with case study design offered other opportunities for theory testing and development, as outlined above.