Trial design
The flipped classroom was used in the intervention group, and the traditional face-to-face classroom was used in the control group. The two groups were compared to examine the usefulness of the flipped classroom. The learning objective of the class was that participants should be able to elicit DTR in five areas: biceps, triceps, flexor brachii, patellar tendon, and Achilles tendon. The participants were allocated to the intervention or control groups. To integrate quantitative and qualitative assessments, we designed a mixed-methods study. This study was conducted based on the CONSORT 2010 guidelines [8] (Supplement 1).
Participants
A total of 83 medical students participated in the study. They were fifth-year students who participated in a clinical clerkship at the Department of General Medicine, Chiba University Hospital, from November 2018 to July 2019. Each group consisted of five to six students, who practiced in this department for two weeks. All participants had learned basic physical examinations, including DTR, in fourth-year classes and had passed a pre-clinical OSCE in which those skills were assessed.
Intervention
In the intervention group, participants learned about DTR through seven minutes of e-learning and then underwent ten minutes of procedural teaching. In the control group, participants attended a face-to-face class about DTR instead of e-learning and then underwent ten minutes of procedural teaching (Fig. 1). A video available on a website was used as the e-learning material [9]. This video describes DTR, the anatomy of the tendons, the locations to strike, and the posture of the examinee. Medical students in the intervention group were able to watch the video repeatedly on their smartphones, tablets, or PCs. In the interest of fairness, the video was made available to students in the control group after this study.
Three faculty members participated in this study, one of whom conducted the procedural teaching. The instructional design took into account the opinions of all faculty members, and teaching skills were standardized before the study began.
Outcome measures
Self-confidence in DTR examination and mastery of the techniques were compared between the two groups. A 5-point Likert scale was used to evaluate self-confidence in the DTR examination before and after the procedural teaching. In the intervention group, the evaluation was performed after prior learning. In the control group, the evaluation was performed before face-to-face teaching. In addition to the self-confidence evaluation, we assessed mastery of the techniques after procedural teaching using the DOPS (Fig. 1). After the procedural teaching, the evaluators directly observed and evaluated the procedures performed on a simulated patient, who was played by another student. The DOPS item “technical ability,” which rates mastery on a scale of 1 to 6, was analyzed. Faculty development was conducted before this study to improve inter-observer reliability. In this session, the scaling criteria were defined as follows: a score of 3 means a borderline student who can elicit part of the DTR; a score of 4 means the student can elicit all DTR; a score of 2 or less indicates below expectation; and a score of 5 or more means above expectation. To ascertain the baseline, the participants’ age, sex, and completion of a clinical clerkship in neurology were investigated. In the intervention group, “accessibility of the prior learning material” was investigated on a 5-point Likert scale as a reference for the material’s validity.
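The DOPS scaling criteria described above can be written as a small lookup. This is only an illustrative sketch: the function name `dops_band` and the band labels are hypothetical wording chosen here, not part of the DOPS instrument or of the study's materials.

```python
def dops_band(score: int) -> str:
    """Map a DOPS 'technical ability' score (1-6) to the band described in the text.

    Function name and label strings are hypothetical, for illustration only.
    """
    if not 1 <= score <= 6:
        raise ValueError("DOPS score must be between 1 and 6")
    if score <= 2:
        return "below expectation"
    if score == 3:
        return "borderline: elicits part of DTR"
    if score == 4:
        return "elicits all DTR"
    return "above expectation"  # scores of 5 or 6
```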
Sample size
All fifth-year students who were not yet practicing at the Department of General Medicine participated because this study was also a part of clinical clerkship education in the department. A total of 83 medical students (15 groups) participated in the study. This exceeded 27, the sample size required for a two-tailed test of the difference in means between two groups, assuming a significance level of 0.05, a statistical power of 0.8, and an effect size of 1.0.
Randomization
The 15 groups were allocated to the intervention or control groups. Neither the participants nor the evaluators were blinded to the allocation. The same person served as both a faculty member and an evaluator.
Statistical method
All statistical analyses were performed using SPSS Statistics for Windows, version 26.0 (IBM Corp., Armonk, NY, USA). The unpaired t-test, with the significance level set to 0.05, was used to analyze the results of the 5-point Likert scale and DOPS.
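For readers who want to see what the unpaired t-test computes, the following is a minimal sketch in Python rather than SPSS. It assumes the pooled-variance (Student) form; the text does not state whether equal variances were assumed. The function name `unpaired_t` and the scores are hypothetical and are not study data. The sketch returns only the t statistic and the degrees of freedom; the p-value would then be read from the t distribution with those degrees of freedom.

```python
import math
import statistics


def unpaired_t(group_a, group_b):
    """Return (t, df) for a pooled-variance unpaired t-test.

    Assumption: equal variances in both groups (Student's form).
    """
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)

    df = n_a + n_b - 2
    # Pooled sample variance, weighted by each group's degrees of freedom.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, df


# Hypothetical Likert scores, for illustration only.
intervention = [4, 5, 3, 4, 4]
control = [3, 2, 3, 3, 4]
t_stat, dof = unpaired_t(intervention, control)
```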
Free description questionnaire
A free description questionnaire was used to qualitatively assess self-confidence in DTR examination before and after procedural teaching. These responses were collected on the same questionnaires as the self-confidence ratings. The free description questionnaire was expected to provide explanations for the quantitative results. Opinions about prior learning or flipped classrooms were also collected in the intervention group.
Focus group interview
In the intervention group, focus group interviews (FGI) [10–12] were conducted for a qualitative assessment. Considering the study objective, we set the theme to assess the effectiveness of the flipped classroom after procedural teaching. Three evaluators (SU, KS, and KI) served as interviewers. The interview guide was prepared after a discussion among the three interviewers (Supplement 2). All students in the intervention group (n = 39, seven groups) were interviewed. Each interview was conducted by a faculty member with one group of five to six students after the procedural teaching. The interviewer asked the following question: “Think of the advantage of the flipped classroom. Why do you feel that it was an advantage?” To analyze the interviews, verbatim transcripts were created from the content recorded on a digital voice recorder. These transcripts were analyzed qualitatively by content analysis. Two researchers (SU, KS) performed the analysis and consensus-building. We performed researcher triangulation to ensure the quality of the analysis. Cohen’s kappa coefficient was used to assess inter-rater reliability [13].
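As a reminder of how Cohen’s kappa corrects raw agreement between two raters for agreement expected by chance, here is a minimal sketch. The function name `cohens_kappa` and the category codes are hypothetical and do not come from the study's content analysis.

```python
from collections import Counter


def cohens_kappa(rater_1, rater_2):
    """Cohen's kappa for two raters who coded the same items."""
    if len(rater_1) != len(rater_2) or not rater_1:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater_1)

    # Observed agreement: share of items given the same code by both raters.
    observed = sum(a == b for a, b in zip(rater_1, rater_2)) / n

    # Chance agreement: product of each rater's marginal proportions per code.
    counts_1, counts_2 = Counter(rater_1), Counter(rater_2)
    expected = sum(
        (counts_1[c] / n) * (counts_2[c] / n) for c in set(counts_1) | set(counts_2)
    )
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical category codes assigned by two researchers to four statements.
coder_a = ["A", "A", "B", "B"]
coder_b = ["A", "A", "B", "A"]
kappa = cohens_kappa(coder_a, coder_b)
```

In this toy example the raters agree on three of four items (0.75) while chance agreement is 0.5, so kappa is 0.5.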
Mixed methods research
To integrate quantitative and qualitative assessments, a mixed methods study was designed as an exploratory sequential design [14]. The qualitative assessment was intended to support and explain the results of the quantitative assessment.