Subjects and Setting - In the fall of 2016, Michigan State University’s College of Human Medicine (CHM) implemented a new curriculum called the Shared Discovery Curriculum (SDC). The SDC is organized around patient chief complaints and concerns. Students begin their medical training by learning basic data-gathering and patient communication skills through simulation-based educational experiences that include standardized patient encounters and direct observation and feedback from clinicians. After eight weeks of training, students begin working two half-days a week in clinic settings with medical assistants and nurses. As they gain experience, their clinical responsibilities grow. At the same time, they work to master clinical applications of basic science knowledge independently, in small groups, and in a weekly large-group session. They also continue to receive four hours per week of clinical skills training in the medical school’s simulation centers.
Measures - The students are evaluated for both formative and summative purposes via progress testing twice each semester using both written and clinical skills examinations. Descriptions of the development and piloting of the PCSE used in the SDC have been published elsewhere.5,6 The examination uses an Objective Structured Clinical Examination (OSCE) format7 and consists of eight 15-minute standardized patient (SP) encounters, each followed by a 10-minute post-encounter station. Each encounter assesses some combination of patient interaction skills, hypothesis-driven history gathering, physical examination, counseling, and safety behaviors using checklist and rating items completed by the SP. Post-encounter tasks, which are graded by faculty members, assess the application of medical knowledge, clinical reasoning, and clinical documentation. The PCSE stations are designed to assess EPAs not easily assessed by written examinations. An examinee’s performance is reported as the percentage of possible points achieved across all eight cases in each of the six domains.
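For concreteness, the following is a minimal sketch of how such a domain score could be computed from case-level points. All point values are hypothetical illustrations, not actual PCSE data.

# Sketch: a domain score as the percentage of possible points achieved
# across all eight cases (hypothetical values, one domain).
earned = [12, 9, 14, 11, 10, 13, 8, 12]      # points a student earned per case
possible = [15, 12, 16, 14, 13, 15, 10, 14]  # maximum points available per case

domain_score = 100 * sum(earned) / sum(possible)
print(f"Domain score: {domain_score:.1f}%")  # -> 81.7%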
The standardized patients used in the PCSE are trained to the specific PCSE cases they simulate. Before each PCSE is given, simulation center staff assess both the SPs’ portrayal of the case script and the accuracy of their completion of the checklist/rating forms, including measurements of inter-rater reliability. Adjustments to either the case or the SP training are made when these quality assurance efforts identify a problem.
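The text does not specify which inter-rater statistic the simulation center uses; as one common illustration, a sketch of Cohen’s kappa for binary checklist items might look like the following (all ratings are hypothetical).

# Sketch: Cohen's kappa between an SP's checklist marks and a trained
# QA observer's marks on the same encounter (hypothetical ratings).
sp_rater = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
qa_rater = [1, 1, 0, 1, 1, 1, 1, 0, 0, 1]

n = len(sp_rater)
observed = sum(a == b for a, b in zip(sp_rater, qa_rater)) / n
p_yes = (sum(sp_rater) / n) * (sum(qa_rater) / n)      # chance agreement on "yes"
p_no = (1 - sum(sp_rater) / n) * (1 - sum(qa_rater) / n)  # chance agreement on "no"
expected = p_yes + p_no
kappa = (observed - expected) / (1 - expected)
print(f"agreement = {observed:.2f}, kappa = {kappa:.2f}")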
Design - The PCSE is given as part of a broad-based progress assessment that also includes written examinations. These progress assessments occur twice each semester, for a total of 20 assessments over the course of the medical school curriculum. Third- and fourth-year students are assessed using the PCSE once each semester; depending on their rotational schedule, each is assessed in either the first or the second PCSE given that semester. To pass in a semester, students must pass at least one of the two PCSEs given that semester with scores deemed appropriate for their level of training. Third- and fourth-year students who do not meet course-specific expectations for all skill areas on the PCSE take a make-up examination to demonstrate their competency.
Since students in all four years of training take the same PCSE at roughly the same time, we can potentially observe growth in clinical skills both longitudinally over the course of each student’s medical training and cross-sectionally between students at different levels of training taking the same PCSE. The SP cases for each PCSE are drawn from a pool of cases that is continually being developed. SP cases are eventually reused, but only after the students who were originally assessed with a case have graduated. As a result, students never encounter a case from a previous PCSE in which they were evaluated.
As noted, third- and fourth-year students take a single PCSE each semester, with some students taking the first administration given in a semester and others the second. Given this complication, we chose to focus on first- and second-year student performance for this study. During fall semester 2017 and spring semester 2018, four PCSEs were conducted as part of the SDC progress assessment. Second-year students from the first matriculation class in the SDC and first-year students from the second matriculation class completed the assessments. The scores from these four administrations of the PCSE for the two classes of students were used to assess growth in the students’ clinical skills during the first two years of the curriculum and the psychometric characteristics of the PCSE.
Generalizability Study - We conducted a generalizability analysis8 of the PCSE domain scores separately for first- and second-year students. We considered standardized patient cases to be the only facet in the universe of admissible observations, yielding a crossed student-by-SP-case (p × c) ANOVA design for estimating the variance components used in the generalizability study.
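As a sketch of this estimation step under the stated p × c design, the following computes the three variance components from a hypothetical student-by-case score matrix using the standard expected-mean-square solution for a random-effects crossed design with one observation per cell. This illustrates the design described above; it is not a reproduction of the GENOVA output.

import numpy as np

# Hypothetical case-level scores: rows = students, columns = the eight SP cases.
rng = np.random.default_rng(0)
scores = rng.normal(70, 10, size=(100, 8))

n_p, n_c = scores.shape
grand = scores.mean()
p_means = scores.mean(axis=1)  # student means
c_means = scores.mean(axis=0)  # case means

ss_p = n_c * ((p_means - grand) ** 2).sum()
ss_c = n_p * ((c_means - grand) ** 2).sum()
ss_pc = ((scores - grand) ** 2).sum() - ss_p - ss_c  # residual (p x c, e)

ms_p = ss_p / (n_p - 1)
ms_c = ss_c / (n_c - 1)
ms_pc = ss_pc / ((n_p - 1) * (n_c - 1))

# Expected-mean-square solutions for the random-effects p x c design
var_pc = ms_pc
var_p = (ms_p - ms_pc) / n_c
var_c = (ms_c - ms_pc) / n_p
print(f"var(p) = {var_p:.2f}, var(c) = {var_c:.2f}, var(pc,e) = {var_pc:.2f}")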
As noted above, we are interested both in cross-sectional comparisons of first- and second-year students’ performance and in the longitudinal growth of students’ performance across multiple administrations of the PCSE. These two types of comparisons have different generalizability coefficients and standard errors of measurement.9 In cross-sectional comparisons, the students at each level of training are assessed on the same eight SP cases, and the error variance is equivalent to the error variance as defined in classical test theory.10 Longitudinal comparisons of the same students over multiple examinations, in contrast, are based on different SP cases that are not perfectly parallel. As such, longitudinal comparisons include an additional source of error from the variation among cases and therefore have lower generalizability and larger standard errors of measurement than cross-sectional comparisons. This distinction is also often referred to as the difference between “norm-referenced” and “domain-referenced” measurement.11
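To make the two error definitions concrete, the sketch below computes a relative (norm-referenced) and an absolute (domain-referenced) coefficient from the variance components of a student-by-case design. The formulas are the standard ones for this design; the component values passed in are hypothetical.

# Sketch: relative vs. absolute generalizability for a p x c design
# with n_c cases (hypothetical variance components).
def g_coefficients(var_p, var_c, var_pc, n_c=8):
    # Relative (cross-sectional): all students see the same cases, so the
    # case main effect cancels and only the student-by-case interaction
    # contributes to error.
    g_rel = var_p / (var_p + var_pc / n_c)
    # Absolute (longitudinal): different administrations use different
    # cases, so case variance enters the error term as well.
    g_abs = var_p / (var_p + (var_c + var_pc) / n_c)
    return g_rel, g_abs

g_rel, g_abs = g_coefficients(var_p=25.0, var_c=10.0, var_pc=80.0)
print(f"relative G = {g_rel:.2f}, absolute G = {g_abs:.2f}")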
We used GENOVA to conduct the generalizability analyses.12 As noted, PCSE scores are reported as the percentage of possible points a student achieves in a domain across all eight cases. Because the generalizability analysis is based on case-level data, we conducted it on the number of points achieved for each case. Since generalizability coefficients are ratios of the expected values of variance components, this difference in metric did not affect the coefficients. It did, however, affect the standard error of measurement provided by GENOVA. To avoid this problem, we calculated standard errors of measurement from the observed standard deviation of the domain scores and the generalizability coefficients using a formula provided by Magnusson.10
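The formula in question is of the classical form SEM = SD_observed × sqrt(1 − r), where r is the reliability (here, the generalizability coefficient). A minimal sketch, with hypothetical values for the observed domain-score standard deviation and coefficient:

import math

# Sketch: standard error of measurement from the observed score spread
# and the generalizability coefficient (hypothetical inputs).
def sem(sd_observed, g_coefficient):
    return sd_observed * math.sqrt(1 - g_coefficient)

print(f"SEM = {sem(sd_observed=8.5, g_coefficient=0.72):.2f} percentage points")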
We conducted the analysis separately for first- and second-year students. Because there was no straightforward way to combine estimated variance components across multiple administrations of the PCSE, we based the generalizability analysis on a single administration, using the data from the first PCSE given in spring semester 2018.
Repeated Measures Analysis - To assess cross-sectional differences, longitudinal growth, and their interaction, the data from both classes and the four administrations of the PCSE given over the fall 2017 and spring 2018 semesters were analyzed using repeated measures ANOVA. The two classes of students formed the between-subjects factor, and the four administrations of the PCSE formed the within-subjects (repeated) factor. Orthogonal polynomial contrasts were used to assess the shape of the growth curve over the four administrations. The repeated measures analysis and the generation of summary statistics were done using SPSS Version 25. We considered p < 0.01 statistically significant in the repeated measures analysis.
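As an illustration of the polynomial-contrast step (a simplified analogue of the SPSS within-subjects trend tests, not a reproduction of them, and without the between-class factor), the sketch below computes per-student linear, quadratic, and cubic contrast scores over four administrations and tests each against zero. The score matrix is simulated with a built-in upward trend.

import numpy as np
from scipy import stats

# Simulated scores: rows = students, columns = the four PCSE
# administrations in chronological order, with linear growth added.
rng = np.random.default_rng(1)
scores = rng.normal(60, 8, size=(150, 4)) + np.array([0, 3, 6, 9])

# Orthogonal polynomial contrast weights for four equally spaced occasions
contrasts = {
    "linear":    np.array([-3, -1, 1, 3]),
    "quadratic": np.array([ 1, -1, -1, 1]),
    "cubic":     np.array([-1,  3, -3, 1]),
}

for name, w in contrasts.items():
    per_student = scores @ w  # one contrast score per student
    t, p = stats.ttest_1samp(per_student, 0.0)
    print(f"{name:9s} t = {t:6.2f}, p = {p:.4f}")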
Human Subject Protection - The student performance data and matriculation class were provided to the researchers in a deidentified format by the Office of Medical Education Research and Development (OMERAD) honest broker. Because the PCSE was administered as a normal part of the SDC student evaluation program and the student data were deidentified by the recognized honest broker within the medical school, the data used in the study are not considered human subjects data by the Michigan State University Human Research Protection Program.13