Study design and participants
This single-centre, unblinded, randomised controlled trial included fourth-year medical students of Seoul National University College of Medicine (SNUCM), Korea, who enrolled in a 2-week dermatology elective course in 2018. A total of 87 medical students applied for the dermatology electives.
At SNUCM, all fourth-year medical students complete 10 hours of lecture-based dermatology classes and take written tests in February before enrolling in the dermatology electives. Subsequently, they apply to undergo rotations in six clinical departments and attend 2-week rotations in each department for a total of 12 weeks from March to June. Each dermatology rotation includes 14 to 15 students. Students attend outpatient clinics, grand rounds, and medical didactics at three different hospitals (6 days in Seoul National University Hospital, 2 days in Seoul Metropolitan Government-Seoul National University [SMG-SNU] Boramae Medical Center, and 2 days in Seoul National University Bundang Hospital).
Randomisation and blinding
Students were randomly assigned to the experimental, lecture, and no intervention groups (3:3:4) by drawing numbered folded notes at random from a bag on the first day of their dermatology elective course. On the second day of the 2-week elective course, each group underwent a different educational program in one session. Students assigned to the experimental group participated in a 2-hour training session. Students randomised to the lecture group attended a 1-hour lecture covering the same clinical cases as the experimental group, followed by 1 hour in the outpatient clinic. Students assigned to the no intervention group attended the outpatient clinic for 2 hours (Figure 1). All students were informed that the test scores in this study would be excluded from their final grades. Blinding was not feasible because of the different nature of the interventions.
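The bag-drawing allocation described above can be sketched in code as follows. This is a hypothetical illustration only: the function name `allocate`, the group labels, and the explicit seeding are our own, not part of the study's actual procedure.

```python
import random

def allocate(n_students, ratio=(3, 3, 4),
             groups=("experimental", "lecture", "no intervention"),
             seed=None):
    """Draw-from-a-bag allocation in a fixed 3:3:4 ratio (illustrative sketch)."""
    rng = random.Random(seed)
    total = sum(ratio)
    # Build the "bag": repeat each group label in proportion to the ratio,
    # rounding so the bag holds one folded note per student.
    bag = []
    for group, r in zip(groups, ratio):
        bag.extend([group] * round(n_students * r / total))
    # Top up or trim in case rounding leaves the bag off by a note or two.
    while len(bag) < n_students:
        bag.append(groups[-1])
    bag = bag[:n_students]
    rng.shuffle(bag)  # each student then "draws" bag[i] at random
    return bag
```

For example, `allocate(10, seed=1)` yields exactly 3 experimental, 3 lecture, and 4 no-intervention assignments in shuffled order.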
This study was exempted from full review by the Institutional Review Board, and the requirement for obtaining informed consent from the participants was waived by the board, because this study aimed to investigate the effectiveness of instructional techniques in established educational settings; the Bioethics and Safety Act in Korea stipulates that such research can be exempted. This study was registered at clinicaltrials.gov as NCT03472001.
Intervention in the experimental group
The design goal of the experimental group was to encourage students to improve their clinical reasoning by practising thinking like a dermatologist, and to help them build adequate illness scripts for skin diseases. Three to four students and one instructor, a board-certified dermatologist (H.S.Y.), participated in each 2-hour training session. Approximately 10 minutes were required to complete the following steps for each case:
- Written clinical cases with photographs
This study used 10 written clinical cases for practice (training cases in Table 1). All cases were based on real patients whose diagnoses were confirmed by diagnostic work-up or treatment responses. We selected clinical cases that non-dermatologists (e.g., primary physicians or other specialists) had encountered and initially misdiagnosed. Of the 10 cases, 9 were organised to enable students to compare and contrast adjacent diseases, i.e., different diseases with considerable overlap in symptoms or signs. Each case consisted of a brief description of the patient’s medical history in a bulleted list and a photograph of the patient’s skin lesion (Figure 2). Cases were presented as PowerPoint slides. Additionally, for each case, three to five components essential to making the correct diagnosis were defined.
- Abstractions in medical terminology
All students were requested to write down a description of the patient’s skin lesion using abstract terms called semantic qualifiers [16]. Students were able to refer to lists of semantic qualifiers of dermatologic descriptions in their textbooks and notes. Next, students verbalised what they had observed using appropriate semantic qualifiers for the case.
- Initial diagnosis by students
Students were asked to state the most likely diagnosis for the case. Every student had to present his or her own diagnosis.
- Correct diagnosis
The correct diagnosis of the case was shown after all students arrived at their initial diagnoses.
- Reflection and feedback
Students were requested to reflect on the case. They were asked to list findings in the case that were essential to making the correct diagnosis. Next, they had to discuss their thoughts, one after another, using appropriate semantic qualifiers. The instructor provided immediate feedback following each student’s presentation. When a student’s thought was concordant with one of the predefined diagnostic components, the instructor confirmed that it was relevant. When a student’s thought differed considerably from the predefined components, the instructor explained why it was not essential for the diagnosis (e.g., when one student stated that old age was essential to the diagnosis of herpes zoster, the instructor responded that paediatric patients are also commonly encountered, even though herpes zoster is prevalent in elderly patients). If students had difficulty identifying relevant findings, the instructor gave a hint or cue by asking a question (e.g., “Can the diagnosis change if multiple lesions are present?”).
- Further evaluation (optional)
In some cases, students were asked to analyse findings that should have been present in certain skin conditions or laboratory tests that they should perform to confirm the diagnosis (e.g., skin biopsy).
Intervention in the lecture group
We had already conducted the same training course for the experimental group in 2017 and had collected students’ common answers and misunderstandings. During the lecture, the instructor explained the findings acquired through the experimental group’s reflection and feedback, including why each diagnosis was drawn and how to differentiate it from other diagnoses. The intervention in the lecture group used clinical cases identical to those of the experimental group. The lecture was delivered as a video (PowerPoint slides with narration) recorded by the same instructor as in the experimental group to maintain quality. To encourage students to concentrate, a chief resident supervised the session while the video was playing.
Outcome
All students took a test before (baseline test) and after completing the 2-week rotation (final test). The baseline and final tests were administered on the first and last day of each rotation, respectively. Students diagnosed 10 novel cases (i.e., the patients differed from those in the training session) of diseases that had been presented in the training session (training set), and 10 cases of new diseases that had not been included in the training session (control set in Table 1).
During the test, students were requested to read each case and write down the two most likely diagnoses. Students received one point for a case if either of the two diagnoses was correct. The list of diseases was the same in both tests; however, the patients differed between the baseline and final tests. Students were not given the correct answers until after the tests.
Statistical analysis
Instead of calculating a sample size, we enrolled all students participating in the dermatology electives in 2018. Descriptive results were expressed as means±standard deviations (SDs). A linear mixed effects model was used to compare the baseline and final test scores (diagnostic accuracy) among the three groups and within groups for the same subjects. The model included intervention, set (training vs. control), time point (baseline vs. end of rotation), and all of their possible interaction terms. Intervention, time, and set were included as fixed factors, and a random intercept was included as a random factor. The model was assessed by the Bayesian information criterion [17]. Furthermore, differences between the means of the three groups were determined by Tukey’s post hoc pairwise test for multiple comparisons.
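The mixed model described above can be sketched with synthetic data as follows. The analysis was performed in SPSS, so this statsmodels formulation is only an illustrative equivalent; the variable names (score, group, set, time, student) and the placeholder scores are hypothetical.

```python
# Sketch of the linear mixed effects model: three-way fixed effects with all
# interactions, plus a per-student random intercept. Synthetic data only.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
rows = []
for student in range(30):
    group = ["experimental", "lecture", "none"][student % 3]  # balanced here for simplicity
    for case_set in ["training", "control"]:
        for time in ["baseline", "final"]:
            score = rng.normal(10, 2)  # placeholder test score out of 20
            rows.append((student, group, case_set, time, score))
df = pd.DataFrame(rows, columns=["student", "group", "set", "time", "score"])

# Fixed effects: intervention, set, time point, and all interaction terms;
# random effect: a random intercept grouped by student.
model = smf.mixedlm("score ~ group * set * time", df, groups=df["student"])
result = model.fit()
print(result.summary())
```

The `group * set * time` formula expands to the main effects plus every two- and three-way interaction, matching the model specification in the text.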
The effect size of the intervention was calculated as Cohen’s d, with values of 0.2, 0.5, and 0.8 indicating small, medium, and large effects, respectively [18]. SPSS version 20.0 (SPSS Inc., Chicago, IL, USA) was used for the statistical analysis. P < 0.05 was considered statistically significant.
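For reference, Cohen’s d with a pooled standard deviation can be computed as below; this is a plain-Python sketch, and the helper name `cohens_d` is ours rather than anything from the study’s SPSS workflow.

```python
import math

def cohens_d(group_a, group_b):
    """Cohen's d: difference of means divided by the pooled sample SD (n-1 denominators)."""
    na, nb = len(group_a), len(group_b)
    mean_a = sum(group_a) / na
    mean_b = sum(group_b) / nb
    var_a = sum((x - mean_a) ** 2 for x in group_a) / (na - 1)
    var_b = sum((x - mean_b) ** 2 for x in group_b) / (nb - 1)
    pooled_sd = math.sqrt(((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2))
    return (mean_a - mean_b) / pooled_sd
```

For example, `cohens_d([2, 4, 6], [1, 3, 5])` returns 0.5, a medium effect by the thresholds above.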