Procedures
The study consisted of a one-way experimental design completed in two stages (see Figure 2). Ethics approval was granted by the Northern Alberta Institute of Technology Research Ethics Board (#2021-03).
Stage One
In March 2021, second-year Respiratory Therapy (RT) students were scheduled to cover course content on patient advocacy, including instruction on speaking up and challenging authority. During the course, emphasis was placed on the use of CUS, a structured method for escalating concern by stating that you are Concerned, that you are Uncomfortable, and that what is occurring is a Safety issue (Gerstle, 2018). This cohort was targeted because the profession plays a vital role in ensuring patient safety during airway management (Rutherford et al., 2012) and because the students were preparing to begin practicum placements in Fall 2021. Study materials, including the consent form, individual difference measures, and the VS, were hosted on Qualtrics (Qualtrics XM, 2021). No deception was used. Participants were informed that the purpose of the study was to enhance the retention of course material and improve speaking up skills. Participants in the experimental condition completed a gamified virtual asynchronous simulation using a Choose-Your-Own-Adventure format. In the VS, participants made choices for an emergency department respiratory therapist providing treatment to a patient in status asthmaticus. The simulation contained six decision points, two of which were therapeutic decisions and four of which related to IP competencies. At each decision point there was a correct choice, a partially correct choice, and one or two incorrect choices. Feedback was provided after each decision point, and participants could go back and try again if they chose incorrectly. To reach the conclusion of the simulation, participants had to successfully use CUS to convince a doctor to provide appropriate attention and care to a patient. Participants were provided with a written debriefing about the simulation, its purpose, and the need for speaking up (see supplemental material).
Individual Difference Measures
Data were collected on demographic characteristics, procedure-related confidence and experience, and four personality scales. The personality scales were: 1) the Big Five, whose Conscientiousness and Agreeableness subscales have been linked with obedience and conformity (Bègue et al., 2015); 2) the Moral Foundations Questionnaire (MFQ), which can be used to determine how moral dispositions, in particular In-Group Loyalty and Respect for Authority, influence obedience (Doğruyol et al., 2019); 3) the Brief State Humility Scale (BSHS), which correlates with the honesty-humility subscale of the HEXACO (Kruse et al., 2017), where low humility is associated with hierarchy orientation (Lee et al., 2009); and 4) the very short version of the Authoritarianism-Conservatism-Traditionalism (ACT) scale, as authoritarian dispositions have been associated with group loyalty, conformity, and obedience (Bizumic & Duckitt, 2018; Duckitt et al., 2010).
Stage Two
One month after course instruction, in April 2021, participants completed an in-person high-impact compliance simulation. A one-month delay was used to determine the short-term stability of learning. After the simulation, participants were debriefed and asked not to share information about the simulation with classmates, thanked for their participation, and dismissed.
The Scenario
The simulation was an interprofessional airway management scenario. A senior anesthesiologist was on the third attempt in a can’t-intubate scenario, and the participant had to challenge the anesthesiologist to prevent patient harm. Before the simulation, the actors and facilitators received training and rehearsed the scenario. The actors were all Caucasian females between 40 and 50 years old and were instructed to act authoritatively. The actors were all registered RTs who had experience and confidence performing intubation. Facilitators were knowledgeable about patient advocacy and challenging authority and were experienced with conducting simulation and debriefing.
The scenario pre-briefing was presented in an OSCE style. Participants were told they were returning from lunch and that a charge nurse had informed them that Dr. Anderson from anaesthesia was attempting to intubate a patient and needed help. The participant needed to make three strong challenges to complete the scenario successfully. After the second challenge attempt, the doctor would state, “It’s ultimately my responsibility for what happens here. I need to get this tube in.” If the participant challenged again, the scenario would end. An open environment with no specific points for challenge or reply was used to create a naturalistic scenario, so the participant was able to challenge the doctor at any moment. The scenario would end after a successful challenge or after three minutes. Facilitators had the discretion to end the simulation early if the participant was not taking any action. The scenario was identical to the one used by Violato, Witschen, et al. (2021) (see Figure 3 and supplemental material). A CAE Healthcare iStan Mannequin (CAE Healthcare, 2017) was used. Three simulations ran concurrently, and all simulations were audio and video recorded for analysis.
Outcome Measures
The primary outcome measure was a successful challenge coded dichotomously, yes or no. As no uniformly accepted method for engaging in PD/SU exists, no specific phrases constituted a successful challenge. A successful challenge was defined as the participant making an explicit, direct, and persistent challenge to the doctor that removes ambiguity. Challenges must be explicit and direct to elicit change effectively (Bandura, 1999; Violato, Witschen, et al., 2021). A strong challenge could comprise, though was not limited to, statements such as what is occurring is unsafe or is a safety issue, making a statement about stopping, or indicating they would call for help (Violato, Witschen, et al., 2021). As a backup in the case of lost audio or video, the facilitators scored successful challenges during the simulation.
As CUS was emphasized during classroom instruction, instances of using the components of CUS were recorded. CUS was scored as a binary categorical variable, use or no use. Use of one, two, or all three elements of CUS was recorded as a single instance.
Prior research (Delaloye et al., 2017; Pattni et al., 2017; Pian-Smith et al., 2009; Sydor et al., 2013) has scored speaking up using a modified Advocacy Inquiry Scale (mAIS). The mAIS is a scale based on the principles of advocacy (stating one's observation) and inquiry (requesting further information) (Delaloye et al., 2017). The mAIS was used to score behaviour in a continuous manner, with scores ranging from 1 (no action) to 6 (physical action) (see supplemental material).
Video Review
Each video-recorded simulation was reviewed by two independent raters (EV, JW) blind to the randomization of the conditions. Videos were scored for a successful challenge, CUS, mAIS, frequency of questions/suggestions, the number of times participants read the blood oxygen saturation (SpO2), a successful challenge after the physician’s responsibility phrase, and time to challenge.
The raters used a modified version of the confederate hierarchical demeanor rating (HDR) scale (Delaloye et al., 2017; Sydor et al., 2013) to determine the consistency of confederate behaviour between the three simulations. Confederate HDR was calculated by assigning a score of “1” to each hierarchical demeanor statement answered “yes” and a score of “0” to each statement answered “no” (Delaloye et al., 2017) (see supplemental material).
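The HDR scoring rule described above can be illustrated with a minimal Python sketch. The statement labels and answers below are hypothetical placeholders, not items from the HDR scale, and summing the item scores into a total is an assumption about how the per-statement scores were combined.

```python
def hdr_score(responses):
    """Sum item scores for one confederate: 1 per "yes", 0 per "no".

    `responses` maps a statement label to "yes" or "no". The labels are
    hypothetical; the real items are in the supplemental material.
    """
    score = 0
    for statement, answer in responses.items():
        if answer not in ("yes", "no"):
            raise ValueError(f"Unexpected answer for {statement!r}: {answer!r}")
        score += 1 if answer == "yes" else 0
    return score


# Hypothetical ratings for one simulation (placeholder labels).
example = {"statement_1": "yes", "statement_2": "no", "statement_3": "yes"}
print(hdr_score(example))
```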
Analysis
Statistical analysis was performed using jamovi (Version 1.8.1), a point-and-click interface for R (jamovi, 2021), and R (R Core Team, 2019) with the caret package (Kuhn, 2019) and the glmnet package (Friedman et al., 2010). Interrater reliability (IRR) for categorical data was calculated using Cohen’s Kappa; IRR for continuous data was calculated using intraclass correlation coefficients (ICC). Mean scores for the mAIS, HDR, frequency of questions, frequency of suggestions, and number of SpO2 readings were used for all analyses.
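For readers unfamiliar with Cohen's Kappa, the following Python sketch shows how chance-corrected agreement between two raters can be computed. It is illustrative only: the ratings are hypothetical, the function name is introduced here, and it does not reproduce the jamovi/R computation used in the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical codes."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must code the same non-empty set of cases")
    n = len(rater_a)
    # Observed proportion of cases on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        raise ValueError("Kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical codes for a successful challenge (not study data).
rater_1 = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
rater_2 = ["yes", "no", "no", "no", "yes", "no", "yes", "yes"]
print(round(cohens_kappa(rater_1, rater_2), 2))
```

In this toy example the raters agree on 6 of 8 cases (75%), but because half of that agreement is expected by chance, Kappa is lower than the raw percentage.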
Chi-square analysis was used to analyze the binary categorical variables of challenge × condition, challenge × CUS, and condition × CUS. A t-test was used to compare mAIS scores.
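As a sketch of the chi-square test for two binary variables, the Python example below computes the Pearson statistic and its p-value for a 2 × 2 table. The counts are hypothetical, no continuity correction is applied (whether one was used in the original analysis is not stated), and the p-value uses the closed form for one degree of freedom.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2 x 2 table.

    `table` is [[a, b], [c, d]] of observed counts. Returns (chi2, p).
    No continuity correction is applied (an assumption of this sketch).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("Every row and column needs at least one observation")
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: rows = condition, columns = challenge (yes, no).
counts = [[10, 6], [5, 11]]
chi2, p = chi_square_2x2(counts)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")
```

With small expected cell counts, an exact test may be preferable to the asymptotic p-value shown here.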
Elastic-net regression was used to examine individual difference scale scores and demographic variables as predictors of speaking up. Elastic-net regression is designed for cases where p > n and is more effective than traditional regression methods when sample sizes are small (James et al., 2013; Zou & Hastie, 2005). However, due to the small absolute sample size, including all potential individual differences in a single predictive model would produce suboptimal prediction. The Big Five and MFQ represent different constructs, each comprising multiple subscales; as such, each scale was examined separately, as were the BSHS and ACT, each of which yields a single global score. Two models were trained with different resampling methods, one using 5-fold cross-validation (CV) with 5 repeats and one using bootstrapping. A tuning grid was set for alpha from 0 to 1 and lambda from .0001 to 1 with a search length of 100. Because demographic characteristics and airway management and intubation experience involved fewer predictors, binary logistic regression was used for these variables.
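For reference, the roles of alpha and lambda can be seen in the penalized objective that glmnet minimizes for a binary outcome. The form below assumes the binomial (logistic) family, which the text does not state explicitly; n is the number of participants, x_i the predictor vector, and y_i the 0/1 outcome.

```latex
\min_{\beta_0,\,\beta}\;
-\frac{1}{n}\sum_{i=1}^{n}
\left[ y_i\,(\beta_0 + x_i^{\top}\beta)
       - \log\!\left(1 + e^{\beta_0 + x_i^{\top}\beta}\right) \right]
+ \lambda \left[ \frac{1-\alpha}{2}\,\lVert\beta\rVert_2^{2}
                 + \alpha\,\lVert\beta\rVert_1 \right]
```

Here lambda controls the overall strength of shrinkage, while alpha mixes the two penalties: alpha = 0 gives ridge regression, alpha = 1 gives the lasso, and intermediate values give the elastic net.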
Data Preprocessing
Survey data were checked for careless responding. There were no inordinately fast completion times or response patterns indicating careless responding. Each of the four scales and subscales was examined for normality and outliers. Several data points outside 1.5x the interquartile range were identified. The outliers did not have a substantial influence on descriptive statistics or cause deviations from normality as examined through QQ plots and the Shapiro-Wilk test. As the cases were not influential and no careless responding was identified, the data points were included. No significant deviation from normality was identified for the mAIS, Shapiro-Wilk p = .9.
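The interquartile-range screen described above can be sketched in Python as follows. This assumes the conventional rule of flagging values more than 1.5 times the IQR below the first quartile or above the third quartile. The quartile method ("inclusive") is also an assumption, since statistical packages differ in how they compute quartiles. The scores are hypothetical.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR beyond the quartiles."""
    # Quartile convention is an assumption; other software may differ slightly.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical subscale scores (not study data).
scores = [2.1, 2.4, 2.5, 2.7, 2.8, 3.0, 3.1, 3.3, 5.9]
print(iqr_outliers(scores))
```

Flagged points are candidates for inspection rather than automatic exclusion, consistent with the decision above to retain them.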
Two videos of the in-person simulation were lost due to technical issues; for these cases, the facilitator's backup rating was used. Though post-hoc power calculations indicated β = .5 and the sample was small, the sample was adequate for the planned analysis (Campbell, 2007). For successful challenge, the initial agreement between the coders was 24/32 (79%), Kappa = .54, Rater bias ratio = 1, χ2 = 9, p = .003. An iterative process of discussion and re-coding was undertaken, after which there was 100% agreement. For the mAIS, ICC = .93, Rater bias ratio = .09, χ2 = 15.7, p < .001. For the HDR, ICC = .96; for confederate HDR between simulations, a Kruskal-Wallis test indicated no significant difference, χ2 (7) = 4.32, p = .7.