This study developed a competency model for general practitioners (GPs) using a modified Delphi method. The Delphi method is a structured process for building consensus among a diverse group of experts. The approach has commonly been adopted in medical research and remains the most widely used method for selecting quality indicators in healthcare. The process ends when agreement has been reached on the topics under discussion. According to previous studies, two or three rounds are frequently used in the Delphi process [22-23]. This study involved two rounds of questionnaires sent to an expert panel via e-mail from September to November 2020. All methods in the Delphi process were carried out in accordance with previous studies [22-24] and the research guideline for the Delphi survey technique.
This modified Delphi process was conducted in two stages: (1) generating an initial set of relevant competencies derived from a literature review, behavioral observation of GP–patient consultations, and critical incidents interviews of GPs; (2) conducting a 2-round, web-based Delphi survey of experts in general practice to prioritize and gain consensus on the essential competencies of GPs. Figure 1 shows the process of the Delphi study.
A list of eligible experts was initially drawn up to represent the full range of backgrounds, occupational environments, and clinical practices. Experts were invited based on the following inclusion criteria: (1) working as a GP, educator, or administrative leader in a general practice department; (2) having at least 5 years’ working experience in general practice; (3) being familiar with the requirements for GPs; (4) being familiar with “5+3” residency training; (5) being from various geographic regions within China. The experts were asked whether they were willing to take part in the study. In a preliminary recruitment round, 30 eligible experts were invited and 28 agreed to participate.
The potential competencies were generated based on a literature review, behavioral observations, and critical incidents interviews.
A preliminary list of competencies was constructed from three sources identified by literature review. First, the literature was searched in PubMed, EMBASE, Google Scholar, and three Chinese databases (China National Knowledge Infrastructure, Wanfang Data, VIP Chinese Periodical Services) using terms commonly used to describe GPs (e.g., general practitioner, family physician, family doctor, community health worker), competency (e.g., competency, competencies, core competencies), and evaluation (e.g., evaluation, measurement, tool, indicator). A total of 37 published research papers describing domestic and foreign GPs’ competencies were identified from the literature review. Second, 5 published competency models from international general practice organizations were identified: the World Organization of Family Doctors (WONCA), the College of Family Physicians of Canada (CFPC), the Accreditation Council for Graduate Medical Education (ACGME), the Royal College of General Practitioners (RCGP), and the Royal Australian College of General Practitioners (RACGP). Third, 2 published policy documents on the content and requirements of GP residency training in China were reviewed [26,27].
Potential competencies were extracted from these sources and screened by a panel of 2 reviewers (YW and FYW, Ph.D. candidates) according to the following criteria: (1) the indicator was relevant to the requirements of GPs in China; (2) the indicator was measurable. When there was doubt about whether an indicator should be retained, the research team discussed it and reached a decision together. The screening process identified 88 competencies.
Eleven GPs from 5 community health service institutions (CHSIs) in Beijing were invited to participate as a convenience sample. Each participating GP was observed for one workday while providing medical care in general practice consultations, between November 2019 and January 2020. All consecutive patients visiting the recruited GPs on the observation workday were included in the study after giving oral consent. During the observation, the observers recorded patients’ reasons for encounter (RFEs) and the medical services provided by GPs. Three research assistants (YW, FYW, and ZLP, Ph.D. candidates) were hired as observers; all were postgraduate students majoring in general practice and completed a training session before the observation. During the observation, the observers were seated in the least intrusive corner of the consulting room and did not talk to the GPs or patients. The behavioral observation process identified 21 competencies related to GPs’ work content.
Critical incidents interview
The same 11 GPs who took part in the behavioral observation were invited, and 8 of them participated in the critical incidents interview. The three who declined were unable to attend for practical and/or domestic reasons. During the interview, participants were asked to describe incidents with a good outcome and incidents with a bad outcome. Questions followed the "STAR principle": ‘What kind of situation was it at that time?’ (Situation), ‘What was the main task you faced at that time?’ (Task), ‘In that incident, what skills did you display?’ (Action), ‘What was the final result of this incident?’ (Result). The interviews were taped, transcribed, and coded. Three researchers (YW, FYW, and ZLP, Ph.D. candidates) each independently extracted information about GPs’ competencies from the interview data. When there was doubt about whether a description of a competency should be retained, the research team discussed it and reached a decision together. The critical incidents interview process identified 35 competencies.
A total of 144 competencies were identified by the three processes above (88 from the literature review, 21 from behavioral observation, and 35 from the critical incidents interviews). After duplicate competencies were deleted and competencies measuring similar dimensions were merged, a preliminary list of 63 potential competencies remained. The competencies were then discussed one by one in a research team meeting, which focused on whether each competency was measurable and on refining its wording with reference to other competency models. After further removal and integration, 46 potential competencies remained, categorized into 7 domains.
All 46 potential competencies were formatted into the Delphi questionnaire. The importance and feasibility of each competency were rated on a 1-9 Likert scale (1 = not important/feasible; 9 = very important/feasible). Space was left for experts to comment on the existing competencies or to recommend new competencies that they considered should be included.
First round. The first round of the Delphi survey was performed from September to October 2020, lasting 4 weeks. Materials were sent to the experts by e-mail, including the first-round questionnaire, the research background, and a basic demographic information collection form. In the first-round questionnaire, experts were asked to rate the importance and feasibility of each competency and to give their comments.
After the first round of the Delphi survey, data were collected and analyzed. The median scores, the distribution of scores (frequency count of answer choices), and comments were reported. Experts’ comments, including suggested modifications, deletions, and additions, were collated, and those expressed by at least two participants were summarized.
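The per-competency summary described above (median score plus a frequency count of answer choices) can be illustrated with a minimal Python sketch. This is not the authors' analysis, which was run in SPSS; the function name `summarize_item` and the example ratings are hypothetical and serve only to show the calculation.

```python
from collections import Counter
from statistics import median


def summarize_item(ratings):
    """Return the median and the frequency of each answer choice (1-9).

    `ratings` is a list of panel scores for one competency on one
    dimension (importance or feasibility). Hypothetical helper, not
    part of the original study.
    """
    for r in ratings:
        if r not in range(1, 10):
            raise ValueError(f"rating outside the 1-9 scale: {r}")
    counts = Counter(ratings)
    # Report every answer choice, including those nobody selected.
    distribution = {score: counts.get(score, 0) for score in range(1, 10)}
    return {"median": median(ratings), "distribution": distribution}


# Hypothetical importance ratings from eight panelists.
example = [7, 8, 9, 9, 6, 8, 7, 9]
summary = summarize_item(example)
print(summary["median"])
print(summary["distribution"])
```

Reporting zero counts for unused answer choices keeps the distribution comparable across competencies, which is convenient when the results are fed back to the panel as graphs.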
Second round. The second round of the Delphi survey was performed from October to November 2020, lasting 4 weeks. The second-round questionnaire was sent by e-mail to the experts who had completed the first-round questionnaire. Competencies that had reached the consensus level, or that had been modified based on first-round comments, were retained for round 2. New competencies were added when suggested by more than two experts. Competencies were removed when they did not reach the consensus level or when more than two experts recommended their removal. A graph-based report of the first-round results was sent to the experts along with the second-round questionnaire. The importance and feasibility of each competency were rated using the same 1-9 Likert scale as in the first round.
Consensus. There are no definitive consensus criteria for Delphi studies. In this study, consensus was defined by two selection criteria: a median score greater than seven on the nine-point scale and at least 75% of panel ratings in the top tertile (7–9) for importance and feasibility.
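The consensus rule can be expressed as a short check. The sketch below is illustrative only and rests on two assumptions not spelled out in the text: that both criteria must hold separately for importance and for feasibility, and that "greater than seven" is a strict inequality. The function names and example ratings are hypothetical.

```python
from statistics import median


def meets_criteria(ratings, median_cutoff=7, top_share=0.75):
    """Check one dimension: median > cutoff and >= 75% of ratings in 7-9."""
    if not ratings:
        return False
    in_top_tertile = sum(1 for r in ratings if 7 <= r <= 9)
    share = in_top_tertile / len(ratings)
    # Assumption: "greater than seven" is read as a strict inequality.
    return median(ratings) > median_cutoff and share >= top_share


def reaches_consensus(importance, feasibility):
    """Assumption: both dimensions must satisfy both criteria."""
    return meets_criteria(importance) and meets_criteria(feasibility)


# Hypothetical ratings from eight panelists for one competency.
importance = [8, 8, 9, 7, 9, 8, 6, 8]   # median 8, 7 of 8 ratings in 7-9
feasibility = [7, 7, 7, 7, 8, 6, 7, 7]  # median 7, so the median test fails
print(reaches_consensus(importance, feasibility))
```

Under the strict reading, a competency whose median is exactly 7 does not reach consensus even if most ratings fall in the top tertile; a non-strict reading would need `>=` in the median comparison.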
Descriptive analysis was used to describe the characteristics of the participants and the results. Means [with standard deviation (SD)] were used to report continuous variables, while frequencies (%) were used to report categorical variables. Data management and analysis were performed using the Statistical Package for the Social Sciences (SPSS), version 22.0.
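For readers without SPSS, the same descriptive statistics can be reproduced with the Python standard library. The sketch below is an illustration under stated assumptions, not the authors' procedure: the SD is taken to be the sample SD (n - 1 denominator), and the variable names and values are hypothetical.

```python
from collections import Counter
from statistics import mean, stdev


def describe_continuous(values):
    """Return (mean, sample SD) for a continuous variable."""
    # Assumption: sample SD with an n - 1 denominator.
    return mean(values), stdev(values)


def describe_categorical(values):
    """Return {category: (count, percentage)} for a categorical variable."""
    counts = Counter(values)
    total = len(values)
    return {cat: (n, 100 * n / total) for cat, n in counts.items()}


# Hypothetical panel characteristics.
years_of_experience = [5, 10, 15]
roles = ["GP", "GP", "educator", "administrator"]

avg, sd = describe_continuous(years_of_experience)
role_table = describe_categorical(roles)
print(f"{avg:.1f} ({sd:.1f})")
print(role_table)
```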