Developing Evaluation Criteria for Competency-Based Curriculum in Medical Colleges


Background
This study examined the operation status of the Competency-Based Curriculum (CBC), which has become the main curriculum in Korean medical schools, and developed valid, expert-endorsed evaluation criteria to guide its improvement. The evaluation criteria were derived from a model combining two representative educational evaluation models: the Context, Input, Process and Product (CIPP) evaluation model and the Kirkpatrick evaluation model.
 
Methods
First, a literature review was performed and semi-structured interviews were conducted with 5 experts to develop a draft of the evaluation criteria. To verify the validity of the draft, two survey rounds based on the modified Delphi method were conducted with a panel of 20 experts.
 
Results
Based on the literature review and the expert interviews, a draft of the evaluation criteria was derived, comprising 5 evaluation areas, 18 evaluation items and 58 evaluation indicators. Two Delphi rounds were conducted to validate the criteria. Criteria with a relatively low content validity ratio (CVR) were revised in light of the experts' opinions, yielding a final set of 5 evaluation areas, 16 evaluation items and 51 evaluation indicators.
 
Conclusions
The significance of the present study is that an evaluation model and evaluation criteria suited to the curricula of Korean medical schools were developed with expert consensus. The preparation of valid evaluation criteria opens up the possibility of improving evaluation in medical schools, contributing to educational quality and the continuous quality improvement of medical education.



Background
To improve the quality of education, many medical schools have newly introduced the Competency-Based Curriculum (CBC), which has become central to current medical school curricula (1) (2) (3).
The CBC is a curriculum model that defines the competencies learners should be able to exercise in the workplace after completing their education, and plans the educational goals, content, methods and assessment accordingly (4) (5) (6) (7). The CBC began to gain recognition in Korea in the mid-to-late 2000s (8). Now that a decade has passed since the CBC was introduced and established in Korean medical schools, it is necessary to examine whether the education is implemented properly and to make needed improvements.
Previous studies have attempted to assess and improve medical education in Korea (9) (10) (11) (12). However, most were conducted without a systematic framework, and the CBC has not been sufficiently investigated.
Evaluation criteria are important enough to determine the success of an evaluation (13). Therefore, valid evaluation criteria must be prepared to improve the curriculum. Nevertheless, the field of medical education in Korea lacks evaluation criteria prepared through consensus.
Various models for educational evaluation have been suggested. Many scholars, including Worthen et al. (1997) (14), Russ-Eft & Preskill (2009) (15) and Cook (2010) (16), have presented different viewpoints for classifying them. Despite the various classifications, the models whose usefulness has been demonstrated in evaluations across education fields, including medical education, are the Context, Input, Process and Product (CIPP) evaluation model and the Kirkpatrick evaluation model (17). The CIPP evaluation model evaluates education in terms of context, input, process and product to provide the information needed for decision-making. The global medical education community has adopted the CIPP evaluation model, which is widely applied to educational evaluation (18) (19) (20) (21) (22) (23) (24). However, previous studies using the CIPP evaluation model did not reflect the CBC of medical schools.
The 4-level Kirkpatrick evaluation model is one of the popular educational evaluation models applied to evaluating the results of medical education (25) (26) (27). This model reviews the results of education at four levels: reaction, learning, behavior and results (28). However, studies based on this model are limited in that the positive and negative details occurring throughout the educational process may not be examined (29) (30) (31).
The CBC was developed on the basis of behaviorist theory, which is concerned with change in the learner's behavior (32) (33). Behaviorism focuses on the learner's observable behavior (34), which is consistent with Kirkpatrick's (2009) view that no behavior change occurs without learning.
Therefore, the Kirkpatrick evaluation model can provide useful information for the CBC (35) (36). Kim (2018) considered both outcome and process but ultimately focused on the outcome-oriented Kirkpatrick evaluation model (37). When these two models, both particularly useful in the evaluation of medical education, are combined and applied to the development of evaluation criteria, the quality of medical education can be expected to improve.
In summary, improving the CBC of medical schools requires comprehensive and systematic evaluation criteria that verify outcomes across the entire educational process. The CIPP and Kirkpatrick evaluation models should be applied together to support such comprehensive and systematic evaluation.

Methods
The CBC evaluation criteria for medical schools were developed in two stages: drafting of the evaluation criteria, and validation of the evaluation criteria by the Delphi method. These stages are described in detail below.

Drafting of Evaluation Criteria
The drafting stage produced the draft of the evaluation criteria. Previous studies were analyzed with regard to the concept, purpose and characteristics of the CBC and the criteria used to evaluate medical education programs, and semi-structured interviews were conducted with 5 experts in medical education and educational evaluation. Table 1 shows the interviewed experts. All were current professors with careers of at least 14 years in medical education and specialized experience in the introduction, operation and evaluation of the CBC; their majors were medicine and pedagogy. The interviews were conducted between July 29 and August 10, 2020, with each expert individually, for about 1 to 1.5 hours. The interview materials were analyzed and interpreted using the comprehensive analytical procedures employed by Koo (2018) and provided by Lee and Kim (2014) (38,39). The interview recordings were transcribed in full, and the collected materials were read repeatedly from beginning to end to list meaningful statements, establish units and compile them into a list.
<Table 1> Experts with whom the interview was conducted.
The evaluation criteria extracted from previous studies and the analytical results of the expert interviews were consolidated and summarized to derive the draft of the evaluation criteria. The draft was then reviewed and revised by 3 experts who majored in medical education, pedagogy and medicine to establish the draft evaluation criteria.

Validation of Evaluation Criteria
In the present study, after the draft was prepared, the evaluation criteria were validated by applying the modified Delphi method. Since validity in the Delphi method may be assessed by analyzing the levels of opinion convergence and consensus in the expert panel (44) (45) (46), a degree of consensus of 0.75 or higher and a degree of opinion convergence of 0.5 or lower were considered to indicate high validity.
In the Delphi method, the stopping criterion determining the number of rounds was the Coefficient of Variation (CV), which was examined to assess stability. The CV is the standard deviation divided by the arithmetic mean. A CV of 0.5 or lower was considered to require no additional round, 0.5 to 0.8 was considered relatively stable, and 0.8 or higher was considered to require an additional survey (47) (48).
The data from the first and second Delphi rounds were analyzed by calculating the frequency, percentage, mean, standard deviation, median, quartiles, degree of consensus, degree of convergence and CV using Excel 2016.
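The per-item statistics described above can be sketched in Python. The formulas used for the degree of convergence ((Q3 − Q1)/2) and the degree of consensus (1 − (Q3 − Q1)/median) are the definitions commonly used in Delphi studies; the paper does not state them explicitly, so they are assumptions here, as are the sample ratings.

```python
import statistics

def delphi_stats(ratings):
    """Summary statistics for one Delphi item rated on a 5-point scale.

    Convergence/consensus formulas are the ones commonly used in Delphi
    studies (assumed; the paper does not state its exact formulas).
    """
    mean = statistics.mean(ratings)
    sd = statistics.pstdev(ratings)           # population standard deviation
    q1, median, q3 = statistics.quantiles(ratings, n=4)
    convergence = (q3 - q1) / 2               # 0.5 or lower: opinions converge
    consensus = 1 - (q3 - q1) / median        # 0.75 or higher: consensus reached
    cv = sd / mean                            # below 0.5: no further round needed
    return {"mean": mean, "sd": sd, "median": median,
            "convergence": convergence, "consensus": consensus, "cv": cv}

# Hypothetical ratings from a 20-member panel (14 fives, 5 fours, 1 three)
ratings = [5] * 14 + [4] * 5 + [3]
stats = delphi_stats(ratings)
```

For this sample, the item would satisfy all three conditions (consensus 0.8, convergence 0.5, CV well below 0.5), so no additional round would be required for it.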
In addition, the Delphi panel was asked to freely describe any parts of the individual questions that required correction, addition or removal. After each round, the panel's opinions were summarized and the questions were revised: questions flagged by at least two experts were modified directly, while those flagged by a single expert were modified after consultation with the experts who had reviewed the draft evaluation criteria.

Development of Draft Evaluation Criteria
A literature survey and an expert interview were conducted to prepare the draft of the evaluation criteria.
The evaluation criteria of medical education programs were explored through the literature review. Table 3 summarizes the results of the literature review and the expert interviews. Based on these results, the context, input and process areas of the CIPP evaluation model were adopted as evaluation areas without modification. The product area was divided into learning outcome and curriculum quality management, and the Kirkpatrick evaluation model was applied to the learning outcome part, building a mixed model and deriving its evaluation areas. The evaluation items and evaluation indicators were also prepared, yielding the first-round Delphi evaluation criteria shown in Table 5.

Results of First Delphi Survey
The five evaluation areas of the criteria were 1. Context, 2. Input, 3. Process, 4. Learning outcome and 5. Outcome of curriculum. In the first Delphi round, as shown in Table 4, the mean value was 4.0 points or higher in all evaluation areas and the standard deviation ranged from 0.5 to 0.74. The median and the mode were both 5 points, indicating that the validity of the evaluation areas was very high. The overall rate of positive responses was 95% to 100%. The CVR was 0.90 to 1.00, higher than the critical value (0.42) suggested by Lawshe (1975) for an expert panel of 20 members (43).
Therefore, the survey showed that all the evaluation areas presented in the draft evaluation criteria were valid.
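As an illustration of the CVR values reported above: Lawshe's content validity ratio is (n_e − N/2)/(N/2), where n_e is the number of panelists rating an element as valid and N is the panel size. The sketch below uses hypothetical response counts to show how the reported 0.90 to 1.00 range arises with a 20-member panel.

```python
def cvr(n_essential, n_panel):
    """Lawshe's content validity ratio: (ne - N/2) / (N/2)."""
    half = n_panel / 2
    return (n_essential - half) / half

# Hypothetical counts for a 20-member panel:
# 19 of 20 panelists rate an area valid -> CVR = (19 - 10) / 10 = 0.90
# all 20 rate it valid                  -> CVR = (20 - 10) / 10 = 1.00
# Both exceed Lawshe's critical value of 0.42 for N = 20.
```

The ratio is 0 when exactly half the panel rates an element valid, and negative when fewer than half do, which is why values near 1.0 are read as strong agreement.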
The comments of the expert panel on the individual evaluation areas were additionally reviewed; according to their comments, '1. Context' was modified to '1. Educational environment & context', and '5. Outcome of curriculum' to '5. Continuous Quality Improvement (CQI)'.
The mean score of most evaluation items (17 of the 18 items) was 4 points or higher, with one exception.

Results of Second Delphi Survey
The second Delphi survey was conducted to verify the validity of the evaluation area, evaluation items and evaluation indicators of the evaluation criteria modified according to the analysis of the results from the first Delphi survey. Table 4 shows the results of the second Delphi survey.
The mean validity of all the evaluation areas was 4.0 or higher on the 5-point scale. The median and the mode were both 5 points, indicating that the validity of the evaluation areas was very high. The overall rate of positive responses was 95% to 100%. The CVR was 0.90 to 1.00, above the critical value (0.42). Therefore, all the presented evaluation areas were found valid.
In addition, the degree of consensus for all evaluation areas exceeded the criterion (0.75), and their degree of convergence was 0.5 or lower, satisfying the validity conditions. The CV was 0.5 or lower, indicating that the survey results could be treated as final without an additional Delphi round.
Besides the analysis of the basic statistics, the additional comments of the expert panel on the individual evaluation areas were reviewed. As a result, '1. Educational environment & context' was reverted to '1. Context' in accordance with the majority opinion.
The mean score of 16 evaluation items was 4 or higher, while two evaluation items fell below this level.

Discussion
Experts from nearly 50% of the medical schools in Korea participated in preparing the evaluation criteria and verifying their validity. The significance of the present study is therefore that the validity of the developed evaluation tool was verified by experts from across Korea.
The validity of the evaluation criteria selected through the literature review and expert interviews was verified by the Delphi method, which showed that most of the criteria are valid. Compared with the first round, the consensus, convergence and CV values in the second Delphi round were more favorable, indicating that the panel members were gradually approaching consensus. This shows that the defining characteristics of the Delphi method, including repeated rounds, controlled feedback and panel anonymity, were effectively realized in the present study.
Since most of the drafted evaluation criteria proved valid, the CBC of medical schools should be evaluated using criteria spanning the full range of areas: context, input, process, learning outcome and continuous quality improvement.

Study limitations
This study has the following limitations. First, the evaluation criteria were derived from previous research, interviews with 5 experts, and a Delphi survey with 20 participants. Therefore, the presented criteria cannot be said to fully reflect the realities of competency-based curricula in Korean medical schools.
Second, the evaluation criteria were derived by collecting the opinions of experts. However, medical education involves various stakeholders, so the results are limited in that they do not reflect the opinions of stakeholders other than experts, such as students.

Conclusion
The significance of the present study is that an evaluation model and evaluation criteria suited to the curricula of Korean medical schools were developed with expert consensus. The preparation of valid evaluation criteria opens up the possibility of improving evaluation in medical schools, contributing to educational quality and the continuous quality improvement of medical education.

Availability of data and materials
The datasets used and analyzed during the current study are available from the corresponding author upon reasonable request.

Ethics approval and consent to participate
The study protocol was performed in accordance with the Declaration of Helsinki.