Moving Forward from the “Boot Camp Method” To Learning-Curve Development in Simulation-Based Paracentesis Training For Undergraduate Medical Students

Background: Abdominal paracentesis is an essential competence for physicians. Simulation-based mastery learning (SBML) programs help medical students develop this skill. Most programs are structured as short boot camps that assess neither mid- and long-term skill retention nor learning curves. Aim: To assess the learning curve of paracentesis through an SBML program for medical students and compare this learning curve-based program with the boot camp method. Methods: A prospective quasi-experimental study was conducted. A cohort of medical students participated in an SBML program with successive sessions until proficiency criteria were met while their learning curve was assessed (LC group). A control group received an SBML boot camp intervention (BC group). As a proficient group, gastroenterology fellows (GF group) performed a paracentesis on the simulated model. The skills of the three groups were compared using technical/proficiency scores. Learning-curve and cost analyses were performed. Results: 100% of the LC group achieved proficiency in the fourth session, with the learning curve flattening between the third and fourth sessions. Comparing the initial and final sessions of the LC group showed a significant improvement in their scores. When comparing the three groups, statistical differences were found in the skill scores, with the BC group having the lowest. The overall cost per participant was highest for the LC group. Conclusion: This study identifies a learning curve for paracentesis with an SBML program. The LC group significantly improved their skills, reaching performance comparable with that of the GF group. SBML focused on a learning curve ensured better skills acquisition than the “boot-camp method.”


Introduction
Ascites is a major complication of liver cirrhosis, occurring in up to 50% of patients [1]. Its development reflects the progression of the disease and is associated with substantially increased mortality [2]. More than 100,000 inpatient admissions for complicated liver cirrhosis occur in the U.S. annually [1]. Bedside paracentesis (diagnostic and/or therapeutic) is one of the cornerstones in the management of patients with ascites. Paracentesis performed early in hospital admission is associated with increased short-term survival [3].
Thus, paracentesis is a significant competence for physicians worldwide. As with other medical procedures, skill is gained through frequent performance of the procedure on patients, according to Halsted's traditional medical learning method [4]. However, exposure to the procedure could be infrequent or non-standardized during the clinical education process, since exposure to patients is not always homogeneous. In consequence, paracentesis is often performed by learners who are not proficient or confident [5,6], which creates a risk for adverse events. According to the literature, a complication rate of 1.6% has been reported, including bowel perforation, hemorrhage, hemoperitoneum, and puncture site infection [4,7,8].
Simulation-based mastery learning (SBML) with deliberate practice provides the opportunity for medical students and residents to develop procedural skills [6,9]. In recent years, different simulation-based programs have been designed and assessed for paracentesis training [1,4,6,10,11], supporting their effectiveness in skills acquisition. In addition, accreditation entities now demand simulation training for students, residents, and fellows [12].
Most of these programs are structured as short boot camps, meaning a brief educational session is immediately followed by an evaluation, without assessing mid- and long-term skill retention [6,10,13,14]. Although these programs have led to changes in medical education curricula worldwide, the failure to assess mid- and long-term acquisition of the desired skills biases estimates of the number of training sessions required to master a procedure. Understanding the learning curve of this procedure could clarify the actual number of training sessions required to achieve proficiency.
This study aims to assess the learning curve of abdominal paracentesis through an SBML program for medical students and to compare two different learning methodologies for paracentesis skills acquisition: the boot camp method and the learning curve-based method.

Study design and participants
A prospective quasi-experimental study was designed and conducted.
Medical students from the Pontificia Universidad Católica de Chile were recruited between 2018 and 2019 to participate in this study, according to our predefined inclusion and exclusion criteria (Table 1).
The first cohort of students was recruited from March to December 2018. This group was defined as the Learning Curve group (LC). The control group was recruited from March to December 2019. This group was defined as the boot camp group (BC).
As a proficient group, gastroenterology fellows (GF) were recruited. They had not received mandatory simulation-based training in paracentesis during their medical education; rather, reflecting the traditional medical learning method without simulation [4], they had practiced directly on a minimum of 20 real patients to ensure adequate experience according to the Core Curriculum recommendation of the Gastroenterology Leadership Council [12]. They were recruited according to our inclusion and exclusion criteria (Table 1).
All participants agreed to participate voluntarily, and informed consent was obtained.
This study was approved by the Ethics Committee of the Pontificia Universidad Católica de Chile in compliance with the Declaration of Helsinki.

Simulated paracentesis model
A previously validated simulated paracentesis model was used in this project [4,15]. The model was designed and developed by a group of experts (gastroenterologists and surgeons) together with specialist designers of simulated models at Pontificia Universidad Católica de Chile. The validation process included a qualitative analysis of how well the model met clinical educational needs, including fidelity, washing capability, transportation, reuse capacity, low cost, safety for students, and use of the Z-traction technique on the abdominal skin (Fig. 1). The innovative characteristics of this paracentesis model led to a patent application, with an overall cost of $1,000 USD for each model [15].

Simulation-based training program
Both the LC and BC groups received previously validated educational support material developed by clinical experts [4]. This material included a video tutorial on the paracentesis technique, encompassing theoretical and practical concepts. The participants were asked to review the educational support material prior to the training sessions. To ensure a homogeneous framework, this educational material, including the video tutorial, was reviewed in a 15-minute briefing by the clinical instructor at the beginning of both the LC and the BC training programs.
The LC group underwent successive training sessions using the simulated paracentesis model and were tutored by an expert who provided them with direct feedback.
The participants attended a half-hour training session once a week until they met the pre-established criteria for proficiency.
The purpose of this format was to reduce any possible forgetfulness bias and, at the same time, to avoid conducting all the training sessions on one day.
The BC group underwent a one-day SBML boot camp program that consisted of four hours of paracentesis training with the simulated model. The workshop was performed in groups of four participants and led by an expert tutor who provided individual feedback. The students were encouraged to observe and evaluate the performance of their peers [4].
The GF group received the same educational material to standardize the theoretical-practical framework. Following the 15-minute briefing with the video tutorial, they performed a single paracentesis puncture on the simulated model.
At the end of the training program, an expert tutor debriefed and gave feedback to all three groups (Fig. 2).

Assessment tools
All the sessions for each group were video recorded. The video recordings were analyzed in random order by a blinded expert who did not participate in any other training or educational activity. The expert evaluated the video recordings based on technical skills and specific procedural milestones. The primary outcomes were the OSATS (Objective Structured Assessment of Technical Skills) score [16] and the DOPS (Direct Observation of Procedural Skills) score. A previously modified and internationally validated OSATS scale was used [17,18] (Table 2). The DOPS scale was adapted by an expert panel (n = 8) to assess compliance with the specific requirements of the paracentesis procedure (Table 3). Both assessment tools were available to participants to guide their learning.
At the end of both training programs, a seven-item survey was administered to participants to assess their perception of their performance. Each item was rated on a five-point Likert scale. This instrument was previously designed and validated by Barsuk and adapted to Spanish by Tejos et al. [4,6] (Table 4).

Proficiency criteria
We established a minimum passing score for paracentesis clinical skills, determined with the input of eight clinical experts using the modified Angoff standard-setting method [12,19]. For the modified OSATS scale, the minimum passing score was set at 23 out of 25 points (equivalent to 92% of the maximum score), and for the DOPS scale, the minimum score was set at 25 out of 27 points (equivalent to 93% of the maximum score). The maximum time to perform the procedure was set at 20 minutes.
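The thresholds above can be expressed as a simple pass/fail check. The following is a minimal illustrative sketch, not the study's scoring software: the `Attempt` record and the `is_proficient` function are hypothetical names, and it assumes that all three criteria (OSATS, DOPS, and time) must be met together in the same attempt, which the text does not state explicitly.

```python
from dataclasses import dataclass

# Thresholds quoted in the proficiency criteria above.
OSATS_MIN, OSATS_MAX = 23, 25
DOPS_MIN, DOPS_MAX = 25, 27
MAX_MINUTES = 20


@dataclass
class Attempt:
    """One recorded paracentesis attempt (hypothetical record layout)."""
    osats: int       # modified OSATS score, maximum 25
    dops: int        # adapted DOPS score, maximum 27
    minutes: float   # time taken to complete the procedure


def is_proficient(attempt: Attempt) -> bool:
    # Assumption: every criterion must hold in the same attempt.
    return (
        attempt.osats >= OSATS_MIN
        and attempt.dops >= DOPS_MIN
        and attempt.minutes <= MAX_MINUTES
    )
```

Under this assumption, an attempt scoring 23 on OSATS and 24 on DOPS would not pass, because the DOPS score falls below the 25-point minimum.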
The LC group were informed at the beginning of the training program that the main objective was the acquisition of skills. If a student did not achieve the minimum required score within a month, they had to continue training until they achieved proficiency. For the BC group, if a participant did not achieve the required score, they were required to attend a second workshop with a different group of peers.

Learning curve analysis
Mixed-effects models with a random intercept were constructed to analyze differences in the consecutive OSATS and DOPS scores of each trainee. Since scores from consecutive training sessions for the same subject were compared, an intra-subject correlation was expected, which would produce biased estimates of the standard errors when estimated using linear regression models [20]. Mixed-effects models can be used to estimate standard errors that take the clustering within subjects into account and have been proven to have higher statistical power than conventional repeated-measures analysis of variance [21]. Also, pairwise comparisons were made between the results of adjacent sessions, with multiple comparisons being adjusted by Bonferroni's correction (to control type I error).
The mixed-effects models allowed an analysis of the LC group [22] that represented the trainees' average OSATS and DOPS scores. Each trainee's trajectory, showing the trainee's individual learning curve, was estimated using Growth Curve Modeling [23], specifying an intercept and a random coefficient model.
Given that the mixed-effects models' residuals did not have a normal distribution, the standard error was estimated using bootstrapping (10,000 replications). Thus, 95% confidence intervals (CI) were obtained using the bias-corrected and accelerated method [24]. Mean scores and 95% CI were expressed for each training session.
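To illustrate the resampling idea behind the bootstrapped intervals, the sketch below computes a 95% CI for the mean score of a single session. It is a deliberately simplified stand-in and not the analysis performed in the study: it uses the plain percentile method instead of the bias-corrected and accelerated method, it resamples raw scores instead of refitting a mixed-effects model, and the function name `bootstrap_mean_ci` is hypothetical.

```python
import random
import statistics


def bootstrap_mean_ci(scores, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean (simplified; not BCa)."""
    rng = random.Random(seed)          # fixed seed for reproducibility
    n = len(scores)
    # Resample with replacement n_boot times and record each mean.
    means = sorted(
        statistics.fmean(rng.choices(scores, k=n)) for _ in range(n_boot)
    )
    # Take the empirical alpha/2 and 1 - alpha/2 quantiles.
    lower = means[int((alpha / 2) * n_boot)]
    upper = means[int((1 - alpha / 2) * n_boot) - 1]
    return statistics.fmean(scores), lower, upper
```

Calling `bootstrap_mean_ci` on the list of scores from one session returns the sample mean together with the lower and upper bounds of the interval.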

Cost-analysis
A cost-analysis comparison between the two training program methodologies was carried out. Medical supplies, infrastructure, simulated model and teaching-time costs were analyzed (Table 5).

Statistical analysis
All analyses were performed using STATA version 16 (StataCorp LLC, College Station, TX, USA). Learning curve statistical analysis was detailed previously. Mann-Whitney U and Kruskal-Wallis tests were performed to compare the groups. All results were expressed in terms of median and IQR or mean ± SD, as appropriate. A p-value of < 0.05 was considered statistically significant.
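For readers unfamiliar with the Mann-Whitney U test, the statistic itself can be computed by counting, over all pairs of scores from two groups, how often a score from one group exceeds a score from the other, with ties counted as one half. The sketch below shows only that counting step, under the assumption that the function name `mann_whitney_u` and its inputs are illustrative. It does not compute a p-value and does not reproduce the STATA routine used in the study.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) by direct pairwise comparison; ties count 0.5."""
    u_a = sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x in group_a
        for y in group_b
    )
    # The two statistics always sum to the number of pairs.
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b
```

The smaller of the two values is conventionally compared against the reference distribution to obtain the p-value.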

Participants
Two hundred thirty-one medical students were invited to participate between 2018 and 2019, of whom 71 were included in the study (a 30.7% acceptance rate). The global mean age was 23 ± 1.4 years, and 48% of the participants were female.
Twenty-one students were included in the LC group, 50 students were included in the BC group, and 10 gastroenterology fellows comprised the proficient group (the GF group). Two participants in the LC group did not complete all the training sessions and were excluded from the analysis (Table 6).
However, there were significant differences between the first session (mean score 21.53 [95% CI 20.55-22.50]) and the second session (p < 0.001), between the first session and the third session (p < 0.001), and between the first session and the fourth session (p < 0.001). Individual trajectories can be observed in Fig. 4b. The learning curve inflection point took place in the second session. However, the majority of the trainees continued to improve in the third session.
All of the participants in the LC group had met the proficiency criteria by the fourth session (LC4).
Interestingly, 84% of the participants met the proficiency criteria in terms of OSATS and 89% in terms of DOPS in the third session of the training program.

Boot camp group (BC) performance
Only 66% of the BC group met the proficiency criteria as measured by OSATS, and only 86% achieved the DOPS minimum required score. Therefore, 17 students had to repeat the workshop. The median OSATS and DOPS scores were 25 points (24-25) and 27 points (26-27), respectively (Fig. 5).

Gastroenterology fellows' (GF) performance
Ninety percent of the fellows met the proficiency criteria, with a median score for OSATS and DOPS of 25 points (24-25) and 26 points (26-27), respectively. Surprisingly, one of the fellows did not achieve the required minimum OSATS and DOPS scores, achieving 23 and 24 points, respectively (Fig. 5).

Assessment of the training program perception
The questionnaire administered to participants showed that they highly valued both training programs, based on a 5-point Likert scale, where 1 = strongly disagree and 5 = strongly agree. The LC group reached a mean of 4.9 ± 0.26 points, and the BC group achieved a mean of 4.8 ± 0.39 points, with no statistical differences (p = 0.621) (Table 4).

Cost analysis
The total cost for the LC group was $121 USD per student for all four sessions; thus, the cost of each session was $30 USD per student. The total cost of the boot camp was $80 USD per student. The cost of the teaching time was the most expensive item for each group (Table 5).
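The per-session figure follows directly from the totals above. The sketch below reproduces that arithmetic and adds one hypothetical scenario that is not part of the study's cost analysis: it assumes that each of the 17 BC students who repeated the workshop incurred a second charge equal to the $80 boot camp cost. That repeat cost is an assumption made purely for illustration, and the function name is hypothetical.

```python
# Figures reported in the text.
LC_TOTAL_USD = 121     # per student, all four sessions
LC_SESSIONS = 4
BC_TOTAL_USD = 80      # per student, one boot camp
BC_STUDENTS = 50
BC_REPEATERS = 17      # students who had to repeat the workshop

# Cost per LC session: 121 / 4 = 30.25, reported as about $30.
lc_per_session = LC_TOTAL_USD / LC_SESSIONS


def bc_average_cost_with_repeats(repeat_cost_usd=BC_TOTAL_USD):
    """Average BC cost per student if repeats are charged.

    ASSUMPTION: the repeat workshop costs `repeat_cost_usd` per
    repeating student; the study did not include this cost.
    """
    return BC_TOTAL_USD + repeat_cost_usd * BC_REPEATERS / BC_STUDENTS
```

Under this assumption the average BC cost would be about $107 per student, which narrows the gap with the LC program but does not close it.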

Discussion
We found that proficiency was achieved by 100% of the LC group at the fourth session (LC4). We found no differences when comparing the LC4 and the GF group's OSATS and DOPS scores. Both the LC and GF groups had significantly superior OSATS results compared to the BC group; there were no differences in the DOPS scores between the three groups.
The main strength of this work is that we evaluated the students' performance by analyzing their learning curve. This matters because one of the main confounding biases in previous educational studies has been the focus on evaluating the outcome of education after a single "point in time" training, exemplified by the boot camp training. Another strength is the study design. Specifically, we believe that the experimental interventions, the blinded expert review of the videos, and the validated scales used to measure the outcomes give strong internal validity to our work.
One of the main limitations of this study is the absence of randomization. This problem was considered during the design and planning of the study; however, we chose a quasi-experimental design because we were not able to randomize the gastroenterology fellows or the medical students, as they were recruited one by one during different periods.
Another limitation is that the simulated model made in our institution does not contain echogenic materials; thus, the paracentesis was performed using classic anatomical landmarks. However, this is compatible with local reality, where only a minority of punctures are ultrasound-guided. We also believe that the absence of paracentesis training, and of learning-curve evaluation, on real patients could be an additional limitation; nevertheless, it seems unethical to train medical students on real patients given the strong evidence supporting the use of simulation as a method of learning technical skills. Moreover, as we noted in our results, even clinical experience of 20 or more procedures without previous standardized training does not guarantee proficiency [6,12].
Knowing and understanding the learning curves of medical and surgical procedures is essential to designing effective training programs and defining the minimum competencies for medical education [17,25]. Therefore, the purpose of this study was to determine the learning curve in acquiring the technical skills to perform paracentesis in a simulated model and compare it with the performance of students trained by the widely accepted boot camp method. We also compared both groups with gastroenterology fellows, who are used to executing this procedure during their daily clinical practice.
Our undergraduate medical students trained through the LC program achieved a level of procedural proficiency comparable to the performance of more experienced professionals, that is, the gastroenterology fellows, who have four to five years of postgraduate training. Conversely, the students trained through the boot camp method (BC) had significantly worse results in terms of the skills they acquired, and 34% did not reach the required proficiency level at the end of their training. Moreover, one of the gastroenterology fellows did not meet the proficiency criteria in his performance, reflecting the importance of standardized training.
The results of this study allow us to claim that a paracentesis simulation-based training program requires at least three to four practical sessions for students to achieve proficiency. Students acquire technical skills progressively and in a standardized methodological sequence, whereas in the boot camp method, they are asked to recall procedural steps seen in a video just a few minutes previously [6]. Thus, we recommend moving from the one-day boot camp method in SBML to short, successive training sessions with deliberate practice on a simulated model, which results in progressive learning [10,13].
Although the overall cost per student of the LC intervention is higher, one-third of the students in the BC group had to repeat the workshop, which was not factored into the cost analysis. Regardless of the learning methodology, the highest cost was related to the teaching time (Table 5).
In future studies, alternatives to optimize teaching and feedback can be explored, such as tele-mentoring. Quezada et al. used this methodology to teach advanced laparoscopic surgical skills through a mobile app, with remote feedback provided by expert tutors [26]. This novel method could optimize the teaching time and has proved as effective as in-person instruction [26]. Also, current studies on simulation-based programs and medical/surgical skills still focus primarily on levels 1 and 2 of the Kirkpatrick model for learning evaluation and educational impact [4,15,27,28]. More research is needed on levels 3 (transfer of skills to real patients) and 4 (costs and service quality).
In conclusion, this study outlines a learning curve for the paracentesis procedure in a simulated model and demonstrates that an SBML program based on that learning curve significantly improves technical skills in medical students. Also, the performance of the students trained with this method was comparable with that of gastroenterology fellows with much more clinical experience.