Study context
This study was situated within an interpretivist paradigm and sought to understand how, when, and why the complex phenomenon of programmatic assessment worked by developing a program evaluation framework (a ToC) using data from mixed sources (Rees et al., 2023). The research was conducted across four settings, and ethics approval was obtained at all institutions (Monash University Research Ethics Committee approval no. 28847, Edith Cowan University Research Ethics Committee approval no. 02561, University of Canberra Research Ethics Committee approval no. 9369, and University of Wollongong Human Research Ethics Committee approval no. 2021/333). All team members held academic roles and had experience in HPE. Four members (JJ, CP, RB, SG) had led the design and introduction of programmatic assessment in their respective programs. These members, along with JL, had hands-on experience with both programmatic and traditional assessment approaches at their institutions.
Terminology for contribution analysis
Terminology is used interchangeably and inconsistently within the contribution analysis literature, so we first present the terms and definitions used in this paper (Table 2) (Mayne, 2019). Notably, outputs and outcomes are distinct concepts: outputs refer to the tangible goods and services derived from program activities, while outcomes denote changes in behaviours and actions (Hall et al., 2021). Within the ToC, outcomes are stratified into capacity change, behavioural change, direct benefit, and well-being change (Mayne, 2017).
Table 2. Terms and definitions used in contribution analysis, including alternative terms evident in published literature (Lemire et al., 2012; Mayne, 2015, 2017, 2019).
| Term (alternative term(s)) | Explanation |
|---|---|
| Impact pathway (results chain, causal chain, logic model) | A pathway depicting the sequence of steps or events from activities to outcomes. |
| Assumption | Salient connections, events, and/or conditions necessary for a link within an impact pathway to function as expected and fulfil program outcomes. |
| Theory of change | Structured assembly of impact pathways and assumptions presenting how program activities lead to outcomes. Components are activities, outputs, reach and reaction, capacity change, behavioural change, direct benefit, and well-being change. |
| Activities | Observable actions undertaken as part of the program. |
| Outputs | Tangible goods and services that directly result from the program activities being undertaken. |
| Reach and reaction | Identification of the target group (reach) who are intended to receive program outputs, and their reaction to the program (reaction). |
| Capacity change | Changes in the knowledge, attitudes (beliefs, opinions, feelings, perspectives), skills (mental and physical ability to use new or alternative practices), aspirations (ambitions, hopes, objectives, or desires), and opportunities of the target group who receive or use the program outputs; established using the COM-B model, in which capabilities (C) and opportunities (O) influence motivation (M), all three being necessary for behaviour (B) change. |
| Behavioural change | Changes in practice that occur in the target group due to capacity change. |
| Direct benefit | Improvements in the target group derived from behavioural change. |
| Well-being change | Long-term accrued improvement in the well-being of beneficiaries, who may or may not be the program target group. |
| External influences | Events and conditions unrelated to the program that contribute to the realisation of intended outcomes. |
| Nested theory of change | Additional theories of change that capture a particular component of the complex program. |
| Contribution claim | Statement(s) presenting evidence and describing the mechanism, or lack thereof, for the contribution the program (or a component) makes to observed outcomes. |
| Contribution story | Central narrative explaining how a program, and its components, contribute to the observed outcomes. |
| Relevant Explanation Finder | Structured framework facilitating critical review of collected data (in step 3 of contribution analysis) against the theory of change. |
The following sections describe the six steps of contribution analysis and their application in the present study to evaluate programmatic assessment, including the qualitative multi-centre study undertaken in step 3.
Step 1: set out the cause-effect issue to be addressed
The first step, describing the problem which the evaluand seeks to address and developing cause-effect questions, serves to focus the evaluation (Mayne, 2012) and is usually undertaken by a team with program experience (Biggs et al., 2014; Buregeya et al., 2020; Choi et al., 2023; Riley et al., 2018). Following this approach, we (JJ, CP, MH, SG) used our experience of programmatic assessment to develop cause-effect questions. Three of us had led the design and implementation of programmatic assessment in two dietetic programs (JJ at Edith Cowan University (Jamieson et al., 2017), and CP and SG at Monash University (Palermo et al., 2017)), with the fourth researcher (MH) having extensive experience in teaching programmatic assessment. We had each evaluated programmatic assessment within and across the two programs, providing a comprehensive understanding (Dart et al., 2021; Jamieson et al., 2022; Jamieson et al., 2021). One researcher (JJ) developed cause-effect questions which were reviewed and agreed upon by the other researchers (CP, MH, SG). The cause-effect questions were: (i) what factors have influenced the implementation of programmatic assessment? (ii) what contribution, if any, did programmatic assessment make to intended outcomes? (iii) what conditions were necessary to achieve this contribution to the outcomes? These questions guided the ToC development (in step 2) and the gathering of data (in step 3) by framing focus group questions with key stakeholders.
Step 2: develop the postulated theory of change and risks to it, including rival explanations
An initial ToC is then iteratively developed, often using existing evaluative data (Biggs et al., 2014), stakeholder consultation (Biggs et al., 2014; Delahais & Toulemonde, 2017; Downes et al., 2019; Koleros & Mayne, 2019), and expert discussion (Delahais & Toulemonde, 2012; Delahais & Toulemonde, 2017; Koleros & Mayne, 2019), providing a sound comprehension of program activities, outputs, and intended outcomes (Mayne, 2012). Top-level outcomes (well-being change) are usually identified first and then, working progressively backwards, proposed impact pathways are constructed (Riley et al., 2018). Importantly, a robust and credible ToC is critical as it is the framework against which collected evidence is later evaluated (Mayne, 2012, 2019).
We applied several steps to develop the initial ToC for programmatic assessment. First, we selected the Edith Cowan University dietetic course as a case study, as we possessed extensive experience designing, implementing, and evaluating programmatic assessment in this context and it had been operational for several years. The lead researcher (JJ) first familiarised themselves with the contribution analysis literature and completed training in program theory. The same researcher then conducted a focus group with faculty staff (n = 2) who had co-designed the programmatic assessment at Edith Cowan University and had been key stakeholders in its subsequent implementation over the prior two years. These practical experiences provided valuable insight when building the ToC. Both focus group participants were provided the questions in advance (Online Resource 1), including an explanation of the ToC and key terms, to optimise discussion. As the purpose of this focus group (in step 2) was to develop the initial ToC, the questions were derived from the cause-effect questions established in step 1. During the focus group, the researcher wrote the goals, outcomes, activities, assumptions, and influencing factors identified by participants onto post-it notes. Then, as a group, we iteratively discussed the ordering of links, starting with outcomes and working backwards to determine how outcomes were achieved (this initial mapping is provided in Online Resource 2, item (a)).
After the focus group, one researcher (JJ) identified and unpacked the root causes and consequences of the problem which programmatic assessment sought to address using problem analysis. Consulting published research on competency-based assessment, this involved articulating the core problem and identifying, and untangling, contributing factors and consequences. This produced a Problem Tree which was verified by co-researchers (CP, MH, SG) and, along with the focus group mapping, became the ToC starting point. Next, we (JJ, CP, MH, SG) iteratively developed impact pathways through repeated reading of the Problem Tree, the focus group mapping, the literature on programmatic assessment, and personal experience, and developed and refined the initial ToC using the COM-B model proposed and detailed by Mayne (2019) (Online Resource 2, items (b) and (c)). COM-B is a social science model positing that capabilities (C) and opportunities (O) influence motivation (M), with all three inter-related conditions needed for behaviour (B) change (Mayne, 2017). COM-B was introduced to contribution analysis by John Mayne in 2016 as a structured model to explore and explain drivers of behaviour change, enabling a robust ToC on which to base inferences (Mayne, 2018, 2019). In the final stage, ToC robustness was evaluated by one researcher (JJ) using the ToC Analysis Criteria given by Mayne (2017), resulting in minor revisions which were reviewed and agreed by all other researchers (CP, MH, SG).
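The COM-B relationships described above can be summarised schematically. The following is a minimal sketch of our reading of the model, in which capabilities and opportunities shape motivation and behaviour change requires all three elements; the functional notation is illustrative and ours, not Mayne's:

```latex
% Schematic reading of COM-B (after Mayne, 2017); notation is illustrative.
% Capabilities (C) and opportunities (O) influence motivation (M),
% and behaviour change (B) requires all three inter-related conditions.
\[
  M = g(C, O), \qquad B = f(C, O, M)
\]
```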
Step 3: gather existing evidence on the Theory of Change
Step 3 involves gathering evidence to determine ToC validity (Riley et al., 2018). Sufficient and rigorous evidence is required to determine whether the postulated ToC impact pathways and outcomes occurred as posited (if at all), to validate or challenge assumptions, and to identify influencing factors and insufficiencies; all of which underpin the contribution claims (in step 4) and the contribution story (in step 6) (Mayne, 2012). Evidence can be obtained from existing published and unpublished evaluations (Biggs et al., 2014; Choi et al., 2023), from purpose-designed mixed methods research (Buregeya et al., 2020; Hersey & Adams, 2017; Junge et al., 2020), or from a blend of multiple methods (Delahais & Toulemonde, 2012; Delahais & Toulemonde, 2017; Downes et al., 2019). Due to the paucity of evaluative data on programmatic assessment at the time of this research, we chose to conduct a multi-centre qualitative study in this step. The following paragraphs present the methods for this multi-centre qualitative study, which occurred in step 3 of contribution analysis.
Participant recruitment
In June 2021, two researchers (JJ and CP) delivered a videoconference workshop (Zoom™, Zoom Video Communications Inc) on programmatic assessment, to which representatives from all 16 accredited Australian dietetic programs were invited; attendees represented 12 of the 16 programs. During the workshop, the researchers first presented programmatic assessment using the clustering of the 12 principles into three themes proposed by Bok et al. (2021). Attendees then completed an activity which determined the extent, if any, to which programmatic assessment had been implemented in their respective programs, serving the dual purpose of identifying institutions that were implementing programmatic assessment. At the end of the workshop, participants were notified of the research project and asked to email completed activity responses to the researchers (JJ and CP) if they had an interest in joining as co-researchers. Representatives from the four universities not in attendance at the workshop were also notified of the research by email, provided the opportunity to appraise their programs and express an interest in collaboration, and sent one reminder. Of the 16 accredited programs, eight returned the form, which was reviewed by four researchers (JJ, CP, MH, SG); of these, five were deemed to have implemented programmatic assessment in accordance with the published principles (Heeneman et al., 2021). One researcher (JJ) then held a videoconference meeting with each program's representative to verify adherence to the published principles of programmatic assessment through discussion. After the meeting, one university declined to participate in the research as it exceeded their work capacity at that time. The four remaining programs, at Edith Cowan University, Monash University, University of Canberra, and University of Wollongong, were included in the study and representatives joined the research team (RB at University of Canberra; JL at University of Wollongong; with JJ at Edith Cowan University; and CP, MH, and SG at Monash University). The collaboration with co-researchers was critical to connecting with key stakeholders and providing broad context to explore the research questions.
From these four participating universities, three key stakeholder groups were recruited for the qualitative study: faculty-employed academics who had responsibility for teaching and assessment, graduates who had met requirements for program completion, and workplace supervisors who were employed at a placement provider and oversaw learner tasks during placement. Inclusion criteria required affiliation with one of the four programs in the prior 12 months. Twenty-one focus groups with faculty (n = 19), graduates (n = 15), and supervisors (n = 32) were held across the four universities. Participant characteristics are presented in Table 3. Participants who were employed (n = 57) worked in community or health promotion (n = 23), hospitals (n = 28), education/teaching (n = 22), aged care (n = 5), food service (n = 5), research and development (n = 5), private practice (n = 4), and/or disability (n = 1); participants could report more than one area of practice. Participants reported being employed on a full-time (n = 32), part-time (n = 21), casual (n = 2), or other (n = 2; parental leave, contract) basis.
Table 3. Demographics of focus group participants (from step 3 of contribution analysis), presented by stakeholder group.

| | Faculty (n = 19) | Graduates (n = 15) | Supervisors (n = 32) |
|---|---|---|---|
| University | | | |
| Edith Cowan University | 4 | 6 | 1 |
| Monash University | 5 | 7 | 11 |
| University of Canberra | 5 | 1 | 9 |
| University of Wollongong | 5 | 1 | 11 |
| Age (years), mean ± SD (range) | 47 ± 7 (35–58) | 30 ± 10 (22–50) | 38 ± 8 (28–57) |
| Gender | | | |
| Female | 19 | 12 | 32 |
| Male | – | 2 | – |
| Non-binary | – | 1 | – |
| Employment location | | | |
| Metropolitan | 18 | 5 | 25 |
| Regional | 1 | 1 | 4 |
| Rural or remote | – | – | 3 |
| Not employed | – | 9 | – |
Research setting
All four universities offered a postgraduate dietetic program that mandated 100 days of workplace-based placement where learners undertook authentic activities under the supervision of workplace supervisors. Programmatic assessment had been introduced to the dietetic programs in 2016 at Edith Cowan University, 2018 at Monash University, 2016 at University of Canberra, and 2018 at University of Wollongong. Three of the dietetic programs were postgraduate only, with cohorts of varying size (20 students at Edith Cowan University, 25–40 at University of Canberra, and up to 61 at Monash University). The University of Wollongong had both an undergraduate (30 students) and a postgraduate (20 students) program. Each program adhered to the twelve principles of programmatic assessment (Online Resource 3), as each uniquely utilised an andragogically justified (principle 5) mix (principle 4) of feedback-rich (principle 2) assessment moments, conceptualised as low-stakes data-points (principles 1 and 6). Learners participated in learning meetings (principle 11) where performance was reviewed and discussed based on low-stakes data-points, functioning as intermediate check-in moments and providing an opportunity for individualised remedial action (principle 10). Low-stakes data-points were collated and reviewed by at least two independent assessors who used consensus building to make high-stakes decisions which determined program progression and graduation (principles 3, 7, and 9). All four programs applied the Dietitians Australia National Competency Standards (Palermo et al., 2016) as the framework for designing the programmatic assessment and making high-stakes progression decisions (principle 8), and adopted a learner-centred education paradigm where learner agency was promoted (principle 12).
Data collection
Each co-researcher (JJ, SG, RB, JL) sent the expression of interest email to stakeholders affiliated with their respective programs. The email provided study information and a Qualtrics™ (Provo, UT) survey link for interested individuals to indicate availability for a focus group and provide demographic data (age, gender, geographical location, area(s) of work practice, current workload). Graduates were also contacted using personal messages on LinkedIn to maximise participation. One researcher (JJ) reviewed all Qualtrics survey responses and organised the focus groups.
Separate videoconference semi-structured focus groups were held for each of the four programs and each stakeholder type between October 2021 and February 2023, with interruptions incurred due to the COVID-19 pandemic. Separate stakeholder focus group sessions (i.e., graduates, faculty, or supervisors) for each university were conducted to enable a homogeneous discussion about each programmatic assessment. One researcher (JJ or SG), not affiliated with the program, facilitated each focus group, with the co-researcher affiliated with the respective program and known to participants (JJ, SG, JL, or RB) also in attendance. While we recognise this propinquity may have given rise to pre-determined aspirations and judgements (Berger, 2015), and that the pre-existing relationships may have influenced participants' sharing, we determined that the insider knowledge was important to contextualise discussions and ensure subtleties were not missed. In mitigation, the primary focus group facilitator was external to the program under study, giving an outsider positioning. Perspectives and assumptions were handled through multiple researchers being involved in data analysis and interpretation, whereby researchers checked the findings against elements of the raw data.
Nine focus group questions were developed by researchers (JJ, CP, SG, MH) with consideration of the cause-effect questions (developed in step 1) and the postulated ToC (developed in step 2). After the first seven focus groups (two with faculty, two with graduates, three with supervisors), an additional five questions were added to capture discussions not explicitly explored in the initial questions but deemed relevant to the contribution analysis evaluation. These additional questions further investigated participants' views and experiences by exploring differing opinions regarding high-stakes progression decisions, relationships, negotiating learner underperformance, and employability (Online Resource 4). Focus groups were between 24 and 81 minutes in duration and were audio-recorded. Otter.ai (AISense) was used to transcribe the focus groups, with one researcher (JJ) reviewing and editing all transcriptions for accuracy.
Data analysis
Using the Framework Analysis Method (Gale et al., 2013), two researchers (JJ and SG) abductively coded three focus groups, one from each stakeholder group, in an iterative approach referring to the ToC and research questions. The researchers then discussed the coding and agreed upon an analytical framework by grouping the codes into categories which reflected the components of the ToC. The initial framework had eight codes and 46 sub-codes, each with a definition and illustrative quotation. The framework and all transcribed focus groups were then entered into NVivo™ Version 12 (QSR International) for subsequent analysis by one researcher (JJ). During analysis, the researcher amended the framework by adding four sub-codes relating to the use of professional competency standards, client outcomes, preparing learners for practice, and improved competency-based assessment practices. All changes were reviewed and agreed to by a second researcher (SG). The final framework is given in Online Resource 5. During data analysis, themes were mapped to the ToC, noting modifications to the initial ToC indicated by the data, reporting by stakeholder group (graduates, supervisors, faculty), and data frequency, as needed in step 4 of contribution analysis to verify the ToC and develop the contribution claims. All researchers involved in the focus groups (JJ, SG, RB, JL) reviewed the results and discussed them over two meetings. These conversations confirmed agreement with the analysis and noted the consistency across stakeholders and universities. The researchers also highlighted the key findings which were carried forward into step 4.
Step 4: assemble and assess the contribution claim, and challenges to it
Evidence gathered in step 3 is then analysed to identify and scrutinise links and influencing factors. Influencing factors are contextual conditions that determine an outcome by enabling or hindering the link (Lemire et al., 2012). The Relevant Explanation Finder, introduced to contribution analysis by Lemire et al. (2012) and adapted by others (Biggs et al., 2014; Buregeya et al., 2020), has been applied to structure data analysis and support the construction of contribution claims (Delahais & Toulemonde, 2012). A contribution claim asserts the presence (or lack thereof) of change, the contributing links(s), and influencing factor(s) (Delahais & Toulemonde, 2012). Iteratively, contribution claims are mapped to links within the ToC and the ToC is modified (Biggs et al., 2014; Mayne, 2012). A preliminary contribution story is then developed including a revised ToC and supporting narrative (Delahais & Toulemonde, 2012; Mayne, 2012) which may be reviewed by stakeholders to facilitate identification of new evidence which must be obtained in subsequent steps to strengthen the evaluation (Delahais & Toulemonde, 2012).
One researcher (JJ) used Microsoft Excel to create a Relevant Explanation Finder with an incorporated Evidence Analysis Database (Delahais & Toulemonde, 2012). The structure and application of the Relevant Explanation Finder have been well described elsewhere (Biggs et al., 2014; Delahais & Toulemonde, 2012; Lemire et al., 2012). The same researcher systematically assembled and critically assessed the synthesised data according to each column heading, which was then reviewed by a second researcher (SG). Iteratively, the researcher (JJ) used the data compiled in the Relevant Explanation Finder, with frequent reference to the step 3 data synthesis, to develop contribution claims and revise the ToC. Adapting the approach described by Delahais and Toulemonde (2012), each contribution claim included a mechanism label and description, with any further actions, including ToC revision or the need for additional evidence, noted for future contribution analysis steps. All researchers then met, discussed the contribution claims and revised ToC, and reached agreement. The result was nine contribution claims with nine assumptions, two external influences, and three threats to programmatic assessment (presented in Online Resource 6).
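The exact column headings of our Relevant Explanation Finder are described in the sources cited above rather than reproduced here. Purely as an illustration of the row structure implied by this step (a ToC link, a mechanism label and description, supporting evidence, influencing factors, and any further action), a minimal sketch follows; all field names and example content are hypothetical assumptions, not the instrument's actual headings or our data.

```python
from dataclasses import dataclass, field

# Minimal, hypothetical sketch of one Relevant Explanation Finder row.
# Field names are illustrative assumptions only; the actual column headings
# are given in Lemire et al. (2012) and the adaptations cited in the text.
@dataclass
class REFRow:
    toc_link: str                  # link in the theory of change under review
    mechanism_label: str           # short label for the claimed mechanism
    mechanism_description: str     # narrative description of the mechanism
    evidence: list = field(default_factory=list)             # synthesised step 3 data
    influencing_factors: list = field(default_factory=list)  # enabling/hindering conditions
    further_action: str = ""       # e.g. "revise ToC" or "seek additional evidence"

# Hypothetical usage:
row = REFRow(
    toc_link="outputs -> capacity change",
    mechanism_label="feedback-rich low-stakes data-points",
    mechanism_description="Frequent low-stakes feedback builds learner capacity.",
    evidence=["faculty focus group", "graduate focus group"],
    influencing_factors=["assessor training", "placement workload"],
    further_action="none",
)
```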
Step 5: seek out additional evidence
Additional data is gathered in step 5 to enhance links within the ToC and contribution story credibility (Mayne, 2012). A range of approaches to this penultimate step are reported including merging with earlier steps (Downes et al., 2019; Hersey & Adams, 2017), expert review (Choi et al., 2023; Delahais & Toulemonde, 2012; Delahais & Toulemonde, 2017), further targeted data collection (Koleros & Mayne, 2019), and accessing secondary data sources (Koleros & Mayne, 2019; Riley et al., 2018).
As all components of the ToC and contribution claims were substantiated by evidence collected in step 3, and capacity for further data collection was limited by the study timeline, we decided to conduct a review by stakeholders (from step 3) and obtain secondary data via published evaluations of programmatic assessment, which had increased in number since the research commenced. For the stakeholder review, one researcher (JJ) created a video (17 minutes in length) presenting the ToC and contribution story. The video content was reviewed by two other researchers (MH and SG), minor edits were made, and the video was then recorded. The video was then emailed to all participants from the study in step 3, with an invitation to view it and provide feedback using a Qualtrics™ survey. The survey asked respondents to confirm that they had viewed the video, identify their stakeholder group (graduate, supervisor, faculty), identify three main learnings from the video, comment on whether the findings reflected their experience of (programmatic) assessment, note what needed further clarification, and indicate what areas they would like to know more about. Participants were provided two weeks to respond, after which one researcher (JJ) compiled and reviewed the survey data. Feedback indicated that the ToC and contribution story accurately reflected experiences with programmatic assessment and were sufficiently clear for participants to understand. As such, no further modifications were made to the ToC in response to the participant feedback.
A literature review was then undertaken to strengthen the ToC and contribute to the robustness and transferability of the findings. One researcher (JJ) searched the electronic databases MEDLINE (Ovid), Embase (Ovid), Web of Science, Scopus, and Cumulative Index to Nursing and Allied Health Literature Plus (EBSCO) on 23 February 2023 for the term "programmatic assessment" in the title. Inclusion criteria were empirical evaluation evidence on programmatic assessment within any discipline, written in English, and published after 8 December 2019. This date was selected as Schut et al. (2021) had published a literature review on programmatic assessment providing a synthesis of research to that date. The search yielded 407 publications which were imported into Covidence (Veritas Health Innovation). Covidence was used to identify and remove 279 duplicates, leaving 128 publications for title and abstract screening, at which 41 publications were excluded as they were not about programmatic assessment or did not provide empirical evidence, leaving 87 publications for full text review. Eleven articles met the inclusion criteria, with one (Jamieson et al., 2021) excluded because we (JJ, CP, MH, SG) authored the paper and its findings had been incorporated into the ToC in step 2. The remaining ten articles (Baartman et al., 2023; Baartman et al., 2022; Dart et al., 2021; de Jong et al., 2022; Jamieson et al., 2022; Roberts et al., 2022; Ross et al., 2023; Schut et al., 2020; Schut et al., 2021; Torre et al., 2022) were included. The same researcher (JJ) extracted data (publication year, title, aim, setting, methods, participants, results) into an Excel worksheet (summarised in Online Resource 7), which was then mapped, using colour coding to indicate the source, to the contribution claims. Through iterative reading of the extracted data and the ToC, the contribution claims were revised. Revisions were reviewed by another researcher (CP) and then, along with the ToC, the contribution claims were updated and finalised. After revisions, we had seven contribution claims and eleven assumptions, with no change to the three threats and two external influences. One initially proposed mechanism had insufficient evidence and was discarded (Online Resource 6, contribution claim 9).
Step 6: revise and strengthen the contribution story
Revising, strengthening, and presenting the contribution story is the final step (Mayne, 2012) and often involves critical review by a steering group or expert panel (Choi et al., 2023; Delahais & Toulemonde, 2012; Delahais & Toulemonde, 2017; Downes et al., 2019). Based on the collection, analysis, and integration of additional data in step 5, one researcher (JJ) revised and finalised the contribution story, which was reviewed by all researchers with no further amendments indicated.