3D skull model based on VR technology
The virtual 3D skull model used in this study was constructed from computed tomography (CT) scans of a human skull from the Peking Union Medical College (PUMC) Anatomy Teaching Collection (Fig. 1). The CT scans were imported into Mimics 17.0 (Materialise NV, Leuven, Belgium) and converted into STL (stereolithography) files. The method used to create a 3D model from CT scans was previously published by Shui et al. Several defective structures (ethmoid plate, crista galli, anterior clinoid process and inferior orbital fissure) on the 3D skull model were modified using 3D Studio Max 2016 (Autodesk Inc, San Rafael, CA). In addition, each bone was isolated from the whole skull and painted in a different colour (Fig. 2c & d). The model was then imported into the Unreal Engine VR platform (Fig. 2a) through the HTC VIVE software development kit (High Technology Computer Corporation, Taiwan) and Unreal Engine 4.15 (Epic Games Inc, Cary, NC), which is compatible with the HTC VIVE CE (High Technology Computer Corporation, Taiwan), a VR head-mounted display (HMD) with a resolution of 2160×1200. Users could rotate and scale the model using handheld controllers. In addition, each cranial bone could be isolated from all other bones, allowing the user to view an individual selected bone and its position in space relative to the other bones. When the isolated structure was placed back in its original position, the model was reset.
Seventy-four clinical undergraduates from PUMC who had just finished a 2.5-year pre-medical programme at Tsinghua University were recruited. These students would spend the subsequent 5.5 years in the undergraduate medical programme, progressing from the basic study of anatomy to clinical internships. The anatomy course combines regional and systematic anatomy and requires 144 study hours for each student. Every theoretical lecture is followed by a cadaver dissection session of equal duration. The course ends with a theoretical test and an identification test for objective assessment.
The students were randomly divided into three groups: the VR skull group (VR group, n=25), cadaveric skull group (cadaver group, n=25), and 2D atlas group (atlas group, n=24). Seventy-three participants completed the trial, while one participant in the atlas group dropped out of the study for personal reasons before the pre-intervention test.
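The text does not describe how the random division was carried out. As a minimal sketch only, assuming a simple shuffle of participant identification numbers, an allocation with the group sizes reported above could look like the following. The function name `allocate`, the ID range 1 to 74 and the seed value are hypothetical and are not taken from the study.

```python
import random


def allocate(participant_ids, group_sizes, seed=None):
    """Randomly split participant IDs into named groups of fixed sizes."""
    ids = list(participant_ids)
    if sum(group_sizes.values()) != len(ids):
        raise ValueError("group sizes must add up to the number of participants")
    rng = random.Random(seed)  # a fixed seed makes the allocation reproducible
    rng.shuffle(ids)
    allocation = {}
    start = 0
    for name, size in group_sizes.items():
        allocation[name] = ids[start:start + size]
        start += size
    return allocation


# Group sizes as reported in the text; IDs and seed are illustrative only.
groups = allocate(range(1, 75), {"VR": 25, "cadaver": 25, "atlas": 24}, seed=1)
print({name: len(members) for name, members in groups.items()})
```

Fixing the group sizes in advance, rather than assigning each student independently at random, is what yields the near-equal split of 25, 25 and 24.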
This study was approved by the Institutional Review Board of the Institute of Peking Union Medical College Hospital (PUMCH) (Project No: ZS-1724).
A flowchart of the study design is displayed in Fig. 3. All participants finished pre-intervention tests. Then, they attended a 30-min PowerPoint-based introductory lecture on cranial anatomy, including the characteristics of each cranial bone, feature structures and spatial relationships. The lecture was taught by a teacher from PUMC whom the students had not met before. During the lecture, each participant received a single sheet of paper with the teaching outline, which could also be used for note-taking. Afterwards, the three groups were assigned to three separate rooms for a 30-min self-directed learning session using the VR skull model, cadaveric skulls, or 2D atlases. The students in the VR group received 2 min of instruction on operating the VR equipment before learning. Study mentors were assigned to each room to prevent intragroup communication and were forbidden to answer questions related to anatomy. The participants in the VR group took turns so that each had 7.5 min to manipulate and observe the model from the first-person perspective, and they observed the 3D model on the computer screen for the remaining 22.5 min. The participants in the cadaver group and atlas group had the same amount of time to hold the cadaveric skull or atlas, while the other participants could only observe, without manipulation. To compensate for the inability to view the paper teaching outline in the simulated environment, a projector was used to project the teaching outline on a screen (Fig. 2b). A post-intervention test was conducted immediately after the learning session to evaluate the educational efficacy of each model. Finally, each participant completed a perception survey.
The pre- and post-intervention tests were composed of the same set of theory and identification tests (Supplementary file 1.1 & 1.2). The theory test consisted of 18 multiple-choice questions that mainly covered basic knowledge of the skull. Each correct answer was awarded 1 point, and the examination lasted 15 min. The identification test consisted of 25 fill-in-the-blank questions on labelled anatomical structures of the skull. All structures were labelled on the cadaveric skulls. The participants had 45 seconds to observe each structure and write down its name. Each correct answer was awarded 1 point. The content was based on the syllabus of the PUMC anatomy course, and all the test questions are available in Supplementary file 1.
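The scoring rule above (1 point per correct answer, 18 theory items and 25 identification items) implies a maximum total of 43 points. The sketch below illustrates that arithmetic and the pre-to-post change in score. The function names and the boolean mark lists are hypothetical; in the study itself, the answer sheets were marked by study staff.

```python
THEORY_ITEMS = 18          # multiple-choice questions, 1 point each
IDENTIFICATION_ITEMS = 25  # fill-in-the-blank questions, 1 point each


def total_score(theory_marks, identification_marks):
    """Sum the points from two lists of booleans (True = correct answer)."""
    if len(theory_marks) != THEORY_ITEMS:
        raise ValueError("expected 18 theory marks")
    if len(identification_marks) != IDENTIFICATION_ITEMS:
        raise ValueError("expected 25 identification marks")
    return sum(theory_marks) + sum(identification_marks)


def score_change(pre_total, post_total):
    """Change in total score from the pre- to the post-intervention test."""
    return post_total - pre_total


# Illustrative marks only, not study data.
pre = total_score([True] * 6 + [False] * 12, [True] * 5 + [False] * 20)
post = total_score([True] * 12 + [False] * 6, [True] * 15 + [False] * 10)
print(pre, post, score_change(pre, post))
```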
To assess the potential efficacy of the teaching tools, in addition to the objective learning efficiency determined by the test scores, a perception survey was designed (Supplementary file 1.3). The questions were based on those included in several previous studies conducted to evaluate the efficacy of other 3D models [8, 20, 21]. The perception survey consisted of five parts that addressed the participants’ enjoyment, learning efficiency, attitude, intention to use, and the tool’s authenticity. A standard five-point Likert scale was used to quantify the responses (1 = strongly disagree, 5 = strongly agree with the statement).
Data collection and marking
Demographic information, including each participant’s age, sex, self-reported VR headset experience and video game experience, was collected during the trial. Participants recorded their group and individual identification numbers on the sign-in sheet. The previous grade point average (GPA) of each participant was obtained from the grade counsellor. The demographic and grouping information was hidden from the test mentor, study mentor and study staff until the trial was completed. The study staff scored each answer sheet, and the results were reviewed twice by the investigators (Zhu J and Cheng C).
Power calculations for this study were performed using the mean total scores and variance of the post-intervention test reported by Chen et al. The calculations revealed that 26 students were required per group (78 students in total) to achieve 80% power to detect a 10% change in the post-intervention total scores at an alpha level of 0.05.
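The text does not state which formula or software was used for the power calculation, and the Chen et al. values are not given here. As an illustration only, the sketch below uses the common normal-approximation formula for comparing two group means, n = 2(z_{1-α/2} + z_{power})² σ² / δ² per group. The function name and the example inputs are hypothetical and do not reproduce the study's calculation.

```python
import math
from statistics import NormalDist


def n_per_group(sd, delta, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided comparison of two means.

    sd    : assumed common standard deviation of the outcome
    delta : smallest difference in means that should be detectable
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = standard_normal.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) * sd / delta) ** 2)


# Hypothetical inputs: a difference of one standard deviation.
print(n_per_group(sd=1.0, delta=1.0))
```

Because the required size scales with (σ/δ)², halving the detectable difference roughly quadruples the number of participants needed per group.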
The previous GPA, test scores and perception survey scores are expressed as medians (interquartile ranges [IQRs]), and the categorical variables are expressed as numbers (%). The participants’ ages are expressed as means (±SDs). A p-value of <0.05 was considered to indicate statistical significance. Statistical analysis was performed using SPSS 23.0 (IBM Corp, Armonk, NY).
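As a small illustration of the median (IQR) summary described above, the following sketch uses Python's standard library rather than SPSS. The helper name `median_iqr` and the sample scores are hypothetical. Quartile conventions vary between packages, so values computed this way may differ slightly from those reported by SPSS.

```python
from statistics import median, quantiles


def median_iqr(scores):
    """Return the median and the (Q1, Q3) quartile pair of a list of scores."""
    q1, _, q3 = quantiles(scores, n=4)  # default 'exclusive' quartile method
    return median(scores), (q1, q3)


# Illustrative scores only, not study data.
print(median_iqr([1, 2, 3, 4, 5, 6, 7]))
```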
The data distributions were assessed using the Kolmogorov-Smirnov test. The between-group differences in the pre- and post-intervention test scores, changes in the scores, and perception survey scores were assessed using the Kruskal-Wallis H test because they were found to be non-normally distributed. If there was a significant difference with the Kruskal-Wallis H test, the Mann-Whitney U test was employed for pairwise comparisons. The participants’ ages were compared with ANOVA. The categorical variables, except for video game experience, were compared with the chi-square test; video game experience was compared with Fisher’s exact test.
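The nonparametric comparison described above can be sketched in plain Python as follows. This is an illustrative implementation, not the SPSS procedure used in the study. It assumes exactly three groups, so that the chi-square approximation has 2 degrees of freedom and the p-value reduces to exp(-H/2), and it uses a normal approximation without continuity correction for the Mann-Whitney U test. The text does not mention any adjustment for multiple pairwise comparisons, so none is applied here. All function names and the sample data are hypothetical.

```python
import math
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def tie_term(values):
    """Sum of t**3 - t over every set of t tied values."""
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    return sum(t ** 3 - t for t in counts.values())


def kruskal_wallis_3(groups):
    """Tie-corrected Kruskal-Wallis H and p-value for exactly three groups."""
    if len(groups) != 3:
        raise ValueError("this sketch only handles three groups (df = 2)")
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    weighted, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * weighted - 3 * (n + 1)
    h /= 1 - tie_term(pooled) / (n ** 3 - n)
    return h, math.exp(-h / 2)  # chi-square survival function for df = 2


def mann_whitney_u(a, b):
    """U statistic for sample a and a two-sided normal-approximation p-value."""
    pooled = list(a) + list(b)
    n1, n2, n = len(a), len(b), len(a) + len(b)
    ranks = average_ranks(pooled)
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    variance = n1 * n2 / 12 * ((n + 1) - tie_term(pooled) / (n * (n - 1)))
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    return u, 2 * (1 - NormalDist().cdf(abs(z)))


# Illustrative data only, not study data.
scores = {"VR": [1, 2, 3], "cadaver": [4, 5, 6], "atlas": [7, 8, 9]}
h, p = kruskal_wallis_3(list(scores.values()))
print(round(h, 3), round(p, 4))
if p < 0.05:  # pairwise follow-up only after a significant overall test
    names = list(scores)
    for i in range(3):
        for j in range(i + 1, 3):
            print(names[i], names[j], mann_whitney_u(scores[names[i]], scores[names[j]]))
```

With samples as small as the illustrative ones, exact p-values would be preferable to these large-sample approximations; the approximations are more reasonable at group sizes near 25.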