Effectiveness of a One-day Simulation-based Program in Psychiatry for Medical Students: A Controlled Study

Background: Training in psychiatry requires specific knowledge, attitudes, and skills that are obtainable by simulation, which needs further development for medical students. After an analysis of previous research with medical students, we evaluated the effectiveness of a one-day teaching program in psychiatry by simulation and validated a scale measuring Confidence in Psychiatric Clinical Skills (CPCQ). Methods: The population was recruited during the 2019-2020 academic year among the 131 fifth-year undergraduate students at the French University of Versailles Saint-Quentin-en-Yvelines-Paris Saclay (the year of compulsory psychiatric training). A controlled study compared knowledge (university grades on the entire national program of psychiatry) and confidence (CPCQ scores) between a control group who received the usual psychiatric instruction and the simulation group who also participated in the simulation program. In the simulation group, satisfaction (including the quality of the debriefing) was investigated. The CPCQ scale was validated by assessing the factor structure, internal consistency, and test-retest reliability. Results: All fifth-year undergraduate medical students were included: 24 in the simulation group (voluntarily recruited) and 76 in the control group. Their knowledge did not differ before the simulation. After the simulation, knowledge and confidence increased significantly in the simulation group. Two months after the simulation, knowledge and confidence were significantly higher in the simulation group. Satisfaction with the training and debriefing was very high. The CPCQ scale showed good psychometric properties: a single-factor structure, acceptable internal consistency (α=0.73 [0.65 - 0.85]), and good test-retest reliability (ICC=0.71 [0.35 - 0.88]). Several limitations are discussed. Conclusions: Adding a one-day simulation program in psychiatry to the usual teaching improved the knowledge and confidence of medical students, even 2 months later.
The CPCQ scale could be used for the evaluation of educational programs.


Background
In medical education, the use of simulation as a pedagogical method has increased greatly since the introduction of "Harvey", the cardiology patient simulator, in the 1970s, followed by other experiments in surgery, pediatrics, obstetrics, and anesthesia, and it now concerns all medical fields (1). It includes technology-enhanced simulation (virtual reality simulators, mannequin-based simulation, or computer simulation with virtual patients (2)) as well as standardized patients (SPs), notably in psychiatry (3). A major component of experiential learning is the debriefing that follows a simulated or real experience, consisting of a facilitated conversation in which participants analyze their actions, thought processes, and emotional states during three stages: reactions, understanding, and summary (4)(5)(6). It aims to assess the learners' clinical reasoning patterns and foster thinking patterns from direct experience to later action (7).
Two tools for assessing the quality of debriefing are well established: the Debriefing Assessment for Simulation in Healthcare (DASH) (7) and the Objective Structured Assessment of Debriefing (OSAD) (8).
In psychiatry, simulation is yet to be developed (9,10). Training in psychiatry requires particularly specific knowledge, attitudes, and skills that cannot simply be learned theoretically without experiential learning. Simulation, which promotes learning by doing and experiencing (11), provides a great opportunity to develop specific communication (12), psychotherapeutic, and clinical skills for the assessment and management of various psychiatric disorders, as well as technical abilities (13), teamwork skills, and interprofessional collaboration (14).
A recent meta-analysis on simulation training in psychiatry for medical students, post-graduate trainees, and qualified doctors reported a threefold increase in research over the past ten years (15). Several universities have introduced such training as compulsory (16) because medical students are demanding it for several reasons. All students, and not only those who want to become psychiatrists, may benefit from such learning because psychiatry is de facto practiced by many physicians, starting with GPs (17). A significant proportion of medical students never participate in a clerkship in psychiatry due to the limited number of places and, even for those who do, training may be insufficient (9,18). Simulation could reduce fear and stigmatization. It also avoids the potential inconvenience of inexperienced students interacting with vulnerable individuals and, above all, exposing them to large groups of students (19,20).
The standard reference for the assessment of a learning intervention is Kirkpatrick's Training Evaluation Model (21), which measures the impact on five levels: 1) reaction effect: satisfaction/dissatisfaction of participants; 2) learning effect: participants improve their knowledge and change their attitudes, skills, or confidence; 3) behavioral effect: transfer of knowledge, skills, and abilities to everyday practice with patients; 4) patient results; and 5) return on investment (22)(23)(24). The meta-analysis showed the global effectiveness of simulation training in psychiatry on attitudes, skills, knowledge, and satisfaction (15).
We analyzed the results for medical students (48 surveys, 16 RCTs, and 32 controlled studies, 10 with follow-up). A positive impact on satisfaction was reported for level 1 in seven studies (25,26), without any evaluation of the satisfaction with the debriefing. For level 2, an improvement in students' objective knowledge after simulation relative to other pedagogical methods was reported in 14 studies (and none in 6), often for substance use disorders (27,28). Other domains of psychiatry need to be investigated and only N=8 studies investigated the impact at a distance from the time of the intervention. N=27 studies assessed self-reported attitudes: N=15 confidence (without a validated tool), N=10 attitudes, N=2 empathy, and N=1 the Maslach Burnout Inventory. A positive impact on self-reported attitudes and confidence was reported in twenty studies and none in seven. N=17 studies had communication skills with standardized patients rated by independent blinded researchers, approaching an evaluation of the transfer of knowledge and skills (level 3), but they remained in standardized situations and did not reach everyday clinical practice. Ten reported an improvement and seven did not (22,29,30). Levels 3, 4, and 5 have not yet been investigated.
We developed and evaluated an intervention for medical students with three features: 1) a one-day program, 2) coordination between teachers of psychiatry for adults, children, and adolescents and those of general medicine, and 3) presentation of common situations encountered in daily practice in primary care or in the emergency room. We also developed a scale to measure the learners' confidence in their psychiatric clinical skills. Examining one's own practices is a fundamental dimension to characterize skills and clinical performance (31,32). No tool was suitable for our study. Confidence scales from nursing education with excellent psychometric properties (33)(34)(35) were not adapted for medical students. The psychometric properties of general self-assessment scales of psychiatric competence have not been studied (36,37). Finally, confidence scales with satisfactory internal consistency but specific to certain clinical situations, such as suicidal risk (38)(39)(40) and depression (22), could not be used for our clinical situations.
The present study thus had two objectives: 1) to evaluate the effectiveness of the one-day simulation program for medical students in terms of levels 1 and 2 of Kirkpatrick's Training Evaluation Model: satisfaction (including satisfaction with the debriefing), knowledge, self-confidence in clinical skills, and changes in professional practices two months after the program; and 2) to explore the psychometric properties of a new self-report questionnaire on confidence in one's clinical skills in psychiatry.

Design
The study had a mixed design, with comparisons of before and after the intervention and between the intervention and control groups (Figure 1).

Population
The population was recruited during the 2019-2020 academic year among the 131 fifth-year undergraduate students at the University of Versailles Saint-Quentin-en-Yvelines-Paris Saclay (the year of compulsory psychiatric training).

Intervention
The intervention consisted of one day (8 hours) of teaching of psychiatry by simulation with a simulated patient. The scenarios, decided within a group of 10 hospital-university teachers (from university fellow to professor in general medicine, child and adolescent psychiatry, or adult psychiatry), had to: 1) be addressed in the curriculum of the official national program (41), 2) concern a pathology frequently encountered in general practice, and 3) be realistically performable by a team of psychiatry teachers not trained in acting (which excluded a schizophrenia scenario, as it is challenging to play (15)). The four scenarios presented a suicide attempt by drug overdose in the context of borderline personality disorder and alcohol addiction assessed in the emergency room by a psychiatrist and bereavement associated with post-traumatic stress disorder, hypomania, and a refusal to go to school by a 14-year-old adolescent assessed in a general-practice setting.
Each simulation session included a briefing (10 min), the simulation (10-20 min), a structured debriefing (45 min), and a theoretical synthesis slide presentation (20 min). Learners were divided into groups of eight (2 actively participating, 6 watching the live video broadcast in an adjacent room). Three teachers were involved, one playing the role of the patient, one as a potential facilitator, and one staying with the learners.
No randomization was used to assign the intervention or control group status. Students in the intervention group were recruited voluntarily and accepted an optional teaching unit on the condition that they actively participated in one scenario. The control group, recruited from non-participating students, received the same usual psychiatric instruction as the simulation group in the form of a compulsory two-day interactive seminar using the flipped classroom technique (42). All fifth-year undergraduate students were provided with the pedagogical written content of the simulation sessions to ensure that any differences between the groups were related to the simulation teaching technique itself.

Measures
1) Knowledge
Theoretical knowledge was measured using multiple choice questions (MCQs) three times: for all students, two months before the simulation teaching (45 questions before the compulsory psychiatry seminar) and two months after (50 MCQs during the psychiatry examination, covering the entire national program of psychiatry for the university grade) and for the simulation group, before and after the teaching (28 questions). All scores were scaled from 0 to 20.
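Because the three tests have different lengths (45, 50, and 28 questions), the scores only become comparable once rescaled to a mark out of 20. The sketch below assumes a simple linear rescaling of the number of correct answers; the source does not state the marking rule (for example, whether partial credit was given), and the function name and the example counts are hypothetical.

```python
def scale_to_20(correct: int, total: int) -> float:
    """Linearly rescale a raw MCQ count to a mark out of 20 (assumed rule)."""
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need total > 0 and 0 <= correct <= total")
    return 20 * correct / total


# Hypothetical raw counts for the three test lengths named in the text.
examples = [("pre-requisite exam", 36, 45),
            ("final exam", 35, 50),
            ("pre/post simulation test", 21, 28)]
for label, correct, total in examples:
    print(f"{label}: {scale_to_20(correct, total):.1f} / 20")
```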

2) Confidence (Supplementary Information 1)
Confidence was assessed by the specifically created Confidence in Psychiatric Clinical Skills Questionnaire (CPCQ): 12 items, rated on a four-level Likert scale, explored confidence in theoretical knowledge, clinical skills (clinical reasoning and psychiatric interviewing), communication and interpersonal skills (with the patient, the patient's proxies, and other professionals), and the management of psychiatric disorders. The individual mean score was used in the analyses.
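As an illustration of how an individual CPCQ score could be derived, the sketch below averages the 12 item ratings. It assumes the four Likert levels are coded 1 to 4, which the source states only for the separate question on future practice; the function and constant names are hypothetical.

```python
N_ITEMS = 12                    # number of CPCQ items, from the text
LIKERT_LEVELS = (1, 2, 3, 4)    # assumed coding of the four-level scale


def cpcq_score(responses):
    """Return the mean of the 12 item ratings for one respondent."""
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses, got {len(responses)}")
    if any(r not in LIKERT_LEVELS for r in responses):
        raise ValueError("each response must be 1, 2, 3, or 4")
    return sum(responses) / N_ITEMS


# Hypothetical respondent: six items rated 2 and six rated 3.
print(cpcq_score([2] * 6 + [3] * 6))
```

Incomplete questionnaires are rejected here rather than averaged over the answered items; the source does not say how missing items were handled.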
The change in professional practice was evaluated with one question: "How much do you think this teaching of simulation psychiatry will improve your future practice?" It was rated on a 4-level Likert scale ranging from "very unimportant" (coded 1) to "very important" (coded 4).

3) Satisfaction (Supplementary Information 2 and 3)
General satisfaction was rated out of 10. A 10-item questionnaire, rated on a four-level Likert scale, explored various aspects of satisfaction, such as the preference for simulation over another pedagogical modality, the perceived realism of the situation, the importance of being actively involved, etc. In addition, learners who underwent a clerkship in psychiatry were asked to compare it to the simulation. Questions about the scenarios and free comments were collected.
Satisfaction with the briefing and debriefing was assessed using the student version of the DASH (43). This scale, with excellent internal consistency (0.82-0.95) (44)(45)(46), explores the climate, structure of the debriefing, ability to engage in exchange, and strengths and areas for improvement. The mean across all items (6 overall assessments, 23 behavioral assessments) was used.

Statistical analysis
Comparisons between simulation and control groups
First, age and participation in a clerkship in psychiatry (a potential confounding factor for confidence (39)) were compared between the two groups using chi² tests and scores on the pre-requisite exam using Student's t-test. Analyses of covariance (ANCOVA) were then carried out with the mean CPCQ score and the psychiatry final exam score as dependent variables, the group as the independent variable, and, as covariates, the variables that differed significantly between the two groups (clerkship in psychiatry).

Pre/post-simulation comparisons
The average CPCQ and knowledge test scores just before and after simulation were compared using paired-sample Student's t-tests. Satisfaction was measured post-simulation.
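The paired comparison can be sketched as follows: the t statistic is the mean of the within-student differences divided by its standard error, with n - 1 degrees of freedom. This is a minimal standard-library illustration with hypothetical scores, not the authors' analysis code; obtaining the p-value requires the t distribution, for which a library such as SciPy (`scipy.stats.ttest_rel`) would normally be used.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for paired samples."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two samples of equal length >= 2")
    diffs = [a - b for b, a in zip(before, after)]
    # Standard error of the mean difference (sample SD, n - 1 denominator).
    # Raises ZeroDivisionError if every difference is identical.
    se = stdev(diffs) / sqrt(len(diffs))
    return mean(diffs) / se, len(diffs) - 1


# Hypothetical mean CPCQ scores for five students, before and after.
before = [2.1, 2.3, 2.0, 2.4, 2.2]
after = [2.6, 2.9, 2.5, 2.7, 2.8]
t, df = paired_t(before, after)
print(f"t({df}) = {t:.2f}")
```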
Psychometric characteristics of the CPCQ scale
Construct validity was explored by exploratory factor analysis using oblimin rotation and maximum likelihood as the factorization method. Two criteria were used to determine the number of significant factors: first, Cattell's scree test, i.e. factors present to the left of the eigenvalue curve deflection (47), and second, Kaiser's criterion, i.e. factors for which the eigenvalue is > 1 (48). The internal consistency of each identified factor was evaluated using Cronbach's α coefficient (49), with an acceptability threshold set to 0.7 (50). These analyses were carried out on the largest sample for the same time of measurement (final exam) and by pooling the two groups.
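For reference, Cronbach's α for k items is k/(k - 1) × (1 - Σ item variances / variance of the total score). The sketch below implements this textbook formula with the standard library on a small hypothetical data set; it is not the authors' code, and the function and constant names are illustrative.

```python
from statistics import variance

ACCEPTABILITY_THRESHOLD = 0.7  # threshold stated in the text


def cronbach_alpha(rows):
    """rows: one list of item ratings per respondent (same length each)."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2 or any(len(r) != k for r in rows):
        raise ValueError("need >= 2 respondents and >= 2 items per respondent")
    # Sample variance of each item across respondents.
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings: 4 respondents x 3 items.
ratings = [[2, 3, 2], [3, 3, 3], [1, 2, 2], [4, 4, 3]]
alpha = cronbach_alpha(ratings)
print(f"alpha = {alpha:.2f}, acceptable: {alpha >= ACCEPTABILITY_THRESHOLD}")
```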
Test-retest reliability was assessed by the intra-class correlation coefficient (ICC), calculated using a mixed model with a random double effect. It was defined as poor for an ICC < 0.4, acceptable between 0.4 and 0.59, good between 0.6 and 0.74, and excellent between 0.75 and 1 (51). The two times of measurement chosen to calculate it were those for which the least possible change was expected, i.e. just after the simulation and two months later.
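The interpretation bands above can be written as a small lookup. The sketch assumes the bands are contiguous (so a value such as 0.595 falls in "acceptable") and that negative estimates count as poor; neither detail is stated in the source, and the function name is hypothetical.

```python
def classify_icc(icc: float) -> str:
    """Map an ICC estimate to the qualitative bands given in the text."""
    if icc > 1:
        raise ValueError("an ICC cannot exceed 1")
    if icc < 0.4:
        return "poor"        # includes negative estimates (assumption)
    if icc < 0.6:
        return "acceptable"
    if icc < 0.75:
        return "good"
    return "excellent"


# Values of the kind reported for a point estimate and interval bounds.
for value in (0.35, 0.71, 0.88):
    print(value, classify_icc(value))
```

Applying the same bands to the bounds of a confidence interval shows how wide an interval is in qualitative terms.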

Number of required subjects
The number of required subjects was calculated for knowledge (score out of 20 on the usual psychiatric examination). According to the results of the previous year, the average score was 13.3, with a standard deviation of 1.9. Showing a mean difference of 2 points with an alpha risk of 5% and a statistical power of 90% required at least 19 subjects per group.
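The figure of 19 subjects per group can be reproduced with the usual normal-approximation formula for comparing two means, n = 2 × ((z₁₋α/₂ + z₁₋β) × σ / Δ)². The sketch assumes a two-sided test and equal group sizes, which the source does not state explicitly; the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta: float, sd: float,
                alpha: float = 0.05, power: float = 0.90) -> int:
    """Subjects per group to detect a mean difference `delta` (two-sided)."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 5%
    z_beta = std_normal.inv_cdf(power)           # about 1.28 for power = 90%
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


# Values from the text: difference of 2 points, SD of 1.9.
print(n_per_group(delta=2, sd=1.9))  # 19
```

A calculation based on the t distribution rather than the normal approximation would give a slightly larger number.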

Ethics statement
The research was authorized on 20/12/2010 by the Ethics Committee of the University of Paris-Saclay (CER-Paris-Saclay-2019-061). All participants gave written informed consent.

Participants
The 24 places available for the simulation were filled within a few days of the opening of registration and all learners agreed to participate in the research. In the control group, 76 of the 107 students (71.0%) consented to participate in the research.
The simulation and control groups did not differ in terms of either the sex ratio (χ² = 0, p = 0.93) or initial knowledge (t(97) = 1.2, p = 0.24). There were more students with a clerkship in psychiatry in the intervention group (χ² = 3.7, p = 0.056).

Knowledge
The ANCOVA showed better theoretical knowledge on the psychiatry exam (F(1, 96) = 6, p = 0.016) in the simulation group than in the control group.

Confidence
The ANCOVA showed higher confidence on the CPCQ scale (F(1, 89) = 6, p = 0.003) for the simulation group than the control group.
Confidence measured with the CPCQ scale improved after teaching (t(23) = 8.2, p < 0.001). Learners shifted from an average low confidence level (2.2 ± SD 0.3) before instruction to an average high confidence level (2.7 ± SD 0.2) afterwards. This gain was maintained for two months, insofar as confidence on the CPCQ scale did not differ significantly between immediately after the simulation and two months later (t(15) = 0.7, p = 0.52).
Students in the intervention group reported on average that it would change their future practice to an "important" or "very important" degree (3.4 ± SD 0.5).

Satisfaction
The overall satisfaction score was excellent (9.3 ± SD 0.6). The average scores on the satisfaction questionnaire (3.5 ± SD 0.2) showed that learners were satisfied to very satisfied with the teaching. The lowest score was obtained on the question about optional or compulsory teaching (2.7 ± SD 0.8), suggesting a neutral position for the group of learners. The highest scores were obtained for the questions on the preference for simulation over other courses (3.9 ± SD 0.3) and on the realism of the simulations (3.8 ± SD 0.4). All learners who had a clerkship in psychiatry felt that the simulation was more (4/5) or much more (1/5) informative than the clerkship.
The level of difficulty was found to be appropriate on average (3.1 ± SD 0.2). The scenarios were judged to be informative or very informative (3.5 ± SD 0.4).
Free comments were positive and suggested areas for improvement (summarizing an ideal psychiatric interview, furthering theoretical reminders, including other pathologies, such as eating disorders and schizophrenia).
The average total DASH score showed the briefing and debriefing of the simulation sessions to be rated as very good (6.5 ± SD 0.4). Behavioral scores suggested a good sense of security for the learners.

Psychometric characteristics of the CPCQ scale
Factor structure
A scree diagram (Figure 2) showed a single-factor structure of the CPCQ scale according to the Cattell criterion, as the deflection of the curve occurred for two factors. The first factor was the only one with an eigenvalue > 1 (Kaiser criterion) and accounted for 20.5% of the variance.

Internal consistency
With a Cronbach's alpha coefficient of 0.73 [0.65 - 0.85], the internal consistency was satisfactory.

Discussion
We evaluated the effectiveness of a one-day simulation program with standardized patients for medical students as a complement to usual teaching. Knowledge and confidence improved after the simulation. Satisfaction, including that concerning the debriefing, was high.
The effectiveness of our one-day simulation training in psychiatry in improving knowledge is consistent with previous results with medical students (15). The positive impact was maintained a few months after the training (27,28,52,53). This difference may be due to the greater pedagogical time spent with the intervention students (one day) (54). However, simulation in psychiatry may also allow, as an addition to other pedagogical tools, sustainable acquisition of knowledge that cannot be simply learned theoretically and memorized without experiential learning (11). All participants who had both a clerkship in psychiatry and the simulation found the latter to be more informative. Simulation with a standardized patient provides an opportunity for real-time feedback and reflection on performance, which is rarely the case in interactions between medical students and people with psychiatric disorders (10).
The improvement in confidence is also consistent with the results of the majority of previous studies on medical students (15). This is the first study to report that the improvement in self-confidence is maintained two months after receiving the training. Among attitudes, self-confidence has been explored more than empathy or stigmatization (15), and has been associated with better skills, for example in assessing suicide risk (32).
Simulation is popular among students. Our study confirmed a very high level of satisfaction with the content of the teaching and its usefulness for practice. The average DASH score, well above the usual acceptability threshold of four (46), suggests effective briefings and debriefings in a safe educational framework. Despite the lack of teachers trained to act, the simulations appeared to be realistic to the learners. Future studies could use validated scales, such as the Maastricht Assessment of Simulated Patients (55), to reliably assess the quality of standardized patient role-play.
Our study had several limitations, despite a relatively high median quality, as assessed by the MERSQI (Medical Education Research Study Quality Instrument), i.e. 12 (Supplementary Information 4) vs 10.8 for studies reported in the meta-analysis (15). The main limitation was, as in most previous studies (15), the absence of random assignment, which is difficult in the context of elective teaching based on learner preferences. The higher proportion of learners with a clerkship in psychiatry in the simulation group may suggest a selection bias toward individuals with a high level of interest and motivation for the discipline. Second, some measures were missing: prior exposure to simulation experiments for both groups and a measure of pre-intervention confidence for the control group (the simulation group may have had a higher level of confidence than the control group prior to the intervention, in connection with participation in a clerkship in psychiatry). Third, the generalizability of our results is limited by the small sample size and a single teaching site. Fourth, we did not explore levels 3 and 4 of Kirkpatrick's model for simulation, i.e. the transfer of knowledge and skills to clinical practice and the outcomes for the management of individuals with a mental disorder. This would be important to judge whether wider dissemination of this pedagogical technique in the mental health field would be pertinent. We did not find a validated scale to assess skill in psychiatry, despite the efforts of certain authors to develop objective measures of the efficiency of a psychiatric interview (56), and an assessment by teachers was not possible, as the students participated in the simulation only once.
The CPCQ scale measuring medical students' confidence in their clinical skills in psychiatry showed satisfactory psychometric properties (acceptable internal consistency, good test-retest reliability, and a unifactorial structure) and it proved to be an easy and rapid evaluation tool. It is an important addition to tools for which the psychometric properties are not known (57). Given the small sample size for measuring test-retest reliability, the confidence interval obtained was large and the results should be replicated on a larger sample.

Conclusion
Our study shows the effectiveness, in terms of knowledge gained, confidence, and satisfaction, of a one-day program of teaching psychiatry through standardized patient-based simulation as a complement to usual teaching for fifth-year medical students in France. The teaching has the disadvantage of being resource-intensive (58), especially in terms of human resources, with a teacher/learner ratio of 3/8, and we did not measure whether it affected behavior in everyday practice with patients or patient results. The Confidence in Psychiatric Clinical Skills Questionnaire (CPCQ) shows acceptable psychometric properties and may be used by other educational teams involved in teaching psychiatry to medical students.

Consent for publication
Not applicable.

Availability of data and materials
Anonymous data used and analyzed in this study are available upon request.

Competing interest
The authors have no conflict of interest to declare.

Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors. We thank the Centre Hospitalier de Versailles for editorial assistance.

Authors' contributions
PR designed the study and performed the data analysis. NY and PR drafted the manuscript. NY, ALD, MR, PS, FH, FU, NG, MS, CP, and PR made critical revisions and edited the manuscript. All authors reviewed the manuscript.

Statement
All methods were carried out in accordance with relevant guidelines and regulations.

Data availability statements
The datasets used and analysed during the current study are available from the corresponding author on reasonable request.