We employed a web-based Delphi survey to gather validity evidence, based on consensus ratings, on which CanMEDS key competencies can be feasibly and consistently assessed in workplace-based assessments during GP Training.[13-15] Based on the available literature, we discussed and decided on the steps necessary to ensure methodological rigor. Table 1 provides an overview of the design steps, based on the Guidance on Conducting and REporting DElphi Studies (CREDES) guideline.[16] Below, we elaborate on our methodological decisions for each CREDES design step.
Table 1 Steps for designing a Delphi study based on the CREDES guideline.
• Defining the purpose of the Delphi study
• Definition of a Delphi round
• Definition of (non-)consensus
• Selection of expert panel
• Development and pilot of Delphi instrument
• Guidelines on interpreting results and proceeding between the rounds, including informational input for experts
• Role of research team to prevent bias
• Strategies to improve response rate
Study design
We chose to employ an e-Delphi in order to recruit panellists from diverse geographic locations within Flanders and to reach a larger group in a cost-efficient way (Gill, Leslie, Grech, & Latour, 2013; Graham, Regehr, & Wright, 2003). The online format was also preferred because this study took place during the COVID-19 pandemic. We defined feasibility as what can be measured in the workplace and whether the formulation of a competency is suitable for workplace-based assessment, and consistency as what can be consistently measured across the different training phases in the workplace (Figure 1).[13-15] Consensus was defined as at least 70% of respondents agreeing or strongly agreeing that an item was feasible and consistent for assessment in the workplace.[17] Non-consensus was defined as no major change in consensus ratings and no suggestions for change by the panel after two rounds.
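To make the consensus rule concrete, the sketch below expresses it as a simple calculation. This is not the authors' analysis code: it merely assumes that ratings of 4 ("agree") and 5 ("strongly agree") on the 5-point Likert scale count towards the 70% cut-off defined above.

```python
# Illustrative sketch of the consensus rule (assumption: Likert scores of
# 4 = "agree" and 5 = "strongly agree" count towards the 70% cut-off).

def has_consensus(ratings: list[int], threshold: float = 0.70) -> bool:
    """True if at least `threshold` of panellists rated the item 4 or 5."""
    if not ratings:
        return False
    agreeing = sum(1 for r in ratings if r >= 4)
    return agreeing / len(ratings) >= threshold

# Example: 8 of 10 panellists (80%) agree or strongly agree -> consensus.
print(has_consensus([5, 4, 4, 3, 5, 4, 2, 4, 5, 4]))  # True
```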
To guarantee the iterative nature of our study, we set a minimum of three rounds.[18] After each Delphi round, any CanMEDS key competency for which consensus had been reached was no longer offered for evaluation. Although the traditional Delphi methodology commences with an unstructured round, we chose a semi-structured approach, since our main research goal was to validate the predefined CanMEDS framework.[4] We therefore used a combination of closed and open-ended questions.[19]
In the first round, panellists were asked to rate each CanMEDS key competency as feasible and as consistent on a 5-point Likert scale. They could also provide qualitative comments on each key competency.[7, 14] In the second round, we informed the panellists of the consensus ratings from round 1 and asked them to formulate concrete suggestions for adjustments and to rate the two research criteria again on the 5-point Likert scale. We also added a document addressing the issues that had arisen from the round 1 qualitative remarks. To clarify the formulation of the competencies and to respond to some of these remarks, we provided the CanMEDS enabling competencies corresponding to each key competency to assist the panel with their suggestions. Additionally, we listed and categorized the most frequent qualitative comments to provide an overview, and we clearly communicated decisions about modifications to key competencies.
In the third round, we provided summaries of the ratings from the previous rounds and, at the panellists' request, included a list of examples of how each CanMEDS key competency would translate to the workplace. In this final round, we asked the panellists whether or not they agreed that a CanMEDS key competency was feasible and consistent for assessment in the workplace; if not, they were required to specify their reasons for withholding consensus.[15] Figure 2 shows an overview of each Delphi round.
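The resulting flow between rounds can also be illustrated in code. The sketch below is a hypothetical rendering of the rules described above (competencies that reach consensus are withdrawn from later rounds, and a minimum of three rounds is run); the `collect_ratings` callback stands in for one survey round and is not part of the study.

```python
# Hypothetical sketch of the between-round Delphi flow described above.
# `collect_ratings(round_no, items)` stands in for one survey round and
# should return a mapping from item to a list of Likert scores (1-5).

def run_delphi(items, collect_ratings, min_rounds=3, max_rounds=5,
               threshold=0.70):
    open_items = list(items)   # competencies still under evaluation
    consensus_round = {}       # item -> round in which consensus was reached
    for round_no in range(1, max_rounds + 1):
        ratings = collect_ratings(round_no, open_items)
        for item in list(open_items):
            scores = ratings[item]
            if sum(s >= 4 for s in scores) / len(scores) >= threshold:
                consensus_round[item] = round_no
                open_items.remove(item)   # no longer offered for evaluation
        # the study guaranteed at least three rounds before stopping
        if round_no >= min_rounds and not open_items:
            break
    return consensus_round, open_items    # remaining items lack consensus
```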
Study setting
To create a coherent approach across Flanders, four Flemish universities (KU Leuven, University of Ghent, University of Antwerp, and the Flemish Free University of Brussels) created an interuniversity curriculum for GP Training, which consists of three phases. Practical coordination and decision-making regarding the curriculum are the responsibility of the Interuniversity Centre for GP Training (ICGPT). The ICGPT's responsibilities include allocating clinical internships, organizing exams, arranging fortnightly meetings of GP trainees with tutors, and maintaining trainees' learning portfolios, in which the evaluation of competencies is registered.
Selection of panel
To select panellists, we used purposive sampling.[13, 20] We set three selection criteria: 1) sufficient experience as a GP (>3 years), 2) experience in mentoring and assessing trainees in the workplace, and 3) sufficient time and willingness to participate.[7, 21] To incorporate a wide range of opinions, the panel consisted of both GP trainers and GP tutors.[22] GP trainers were workplace-based trainers assisting trainees during their internship, while GP tutors were associated with a university, providing guidance and facilitating peer learning and support (10-15 trainees per group) twice monthly. Both groups were responsible for assessing trainees in the workplace. Panellists resided in different provinces of Flanders to minimize converging ideas and to ensure reliability.[13, 22] Although there is no consensus on an appropriate sample size for a Delphi design, a panel of 15-30 members can yield reliable results.[22, 23] In our study, we selected panellists who shared the same medical background and held a general understanding of the field of interest. In addition, when determining the sample size, we considered feasibility parameters aimed at obtaining a good response rate, such as allowing ample time for each Delphi round and keeping the required completion time reasonable.
Development and pilot of Delphi survey
Because the panel was Dutch-speaking, the 27 CanMEDS key competencies were translated from English into Dutch. Figure 3 graphically illustrates how the Delphi survey was constructed. First, the CanMEDS competencies were translated independently by five researchers.[24] After discussing and evaluating all translations, we decided to keep the Dutch translation as close as possible to the original English framework. Second, to validate the translation and pilot the instrument, we sent it to a group of medical professionals for comment. Third, once feedback had been received and the Dutch translation was finalized, the Dutch version of the framework was back-translated into English to confirm the accuracy of the translation.[25]
Every Delphi round consisted of an introductory part, the CanMEDS key competencies, and an ending section. The introduction explained the purpose of each round and communicated the decision rules. The ending section gave the panel space for communication and feedback unrelated to the CanMEDS key competencies (e.g., time needed for completion, remarks about the layout). To avoid confusion among the different CanMEDS roles, the key competencies were grouped per role. Figure 4 illustrates how the survey items were displayed before any consensus had been reached.
Data collection and analysis
To collect our data, we used the Qualtrics XM Platform. This online tool maintained anonymity among the panellists.[26] A personal link was sent by email to each panellist, which allowed us to follow up on response rates and send reminders to specific members. Because of the workload caused by the COVID-19 pandemic, each round lasted four weeks, and we opted for a flexible approach towards the panellists to increase the response rate of each round. Weekly reminders were sent to members who had not yet completed the survey.[26] Data collection took place between October 2020 and February 2021. For the quantitative data, we calculated descriptive statistics for every item using SPSS 27 (IBM SPSS Statistics 27). We used Microsoft Excel to list and categorize the qualitative data; panellists' comments were registered anonymously and verbatim. We analysed the qualitative data using content analysis.[27]
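As an illustration of the quantitative side of the analysis, the sketch below computes the kind of per-item descriptive statistics described above. The authors used SPSS 27; this Python equivalent is provided only to show the shape of the computation, and the item names and ratings are hypothetical.

```python
# Hypothetical per-item summary of Likert ratings (the study used SPSS 27;
# this Python version is for illustration only, with made-up data).
from statistics import mean, median, stdev

ratings_per_item = {
    "Communicator - key competency 1": [4, 5, 4, 3, 5, 4, 4],
    "Collaborator - key competency 2": [3, 4, 2, 4, 3, 4, 3],
}

for item, scores in ratings_per_item.items():
    pct_agree = 100 * sum(s >= 4 for s in scores) / len(scores)
    print(f"{item}: n={len(scores)}, mean={mean(scores):.2f}, "
          f"median={median(scores)}, sd={stdev(scores):.2f}, "
          f"agree/strongly agree={pct_agree:.0f}%")
```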
Role of research team to prevent bias
The research team's methodological decisions were grounded in the available literature. We predefined and documented the methodological steps before commencing the study, and we applied, monitored, and evaluated them throughout. The results of each round were discussed by the whole research team, and the qualitative data were interpreted by two researchers to achieve researcher triangulation.[28]