Within supervisor-trainee relationships, entrustment plays a key role in supporting learning and professional growth. Providing trainees with an optimal balance between supervision and autonomy can facilitate their development towards independent practice (ten Cate et al., 2016). Entrustment operationalizes this balance via decisions that rely on a supervisor’s trust in a trainee to perform clinical tasks with varying levels of independence, and encourages feedback on the competencies needed for progressive independence. While recent studies examined supervisor and trainee perspectives on entrustment separately, it would be useful to know how their interpretation of feedback on similar types of clinical tasks (in similar contexts) compares. Such feedback should reflect factors supervisors consider when making entrustment decisions, including both trainees’ competence and personal qualities (Dijksterhuis et al., 2009; Hauer et al., 2014; Kennedy et al., 2007; ten Cate & Chen, 2020). While trainees are aware that these factors influence their supervisors’ trust in them (Caro Monroig et al., 2021; Gin et al., 2021), it is less clear whether trainees regard the same factors as equally important as their supervisors do for earning clinical trust. Supervisors with a performance focus may base their assessments on how well trainees demonstrate clinical competencies, while trainees adapting to the clinical learning environment may be more attuned to how their developing roles and relationships can act as gatekeepers to participation (Castanelli et al., 2021, 2022; Hatala et al., 2022; Pugh & Hatala, 2016). Bridging these two distinct viewpoints of trust – and identifying potential biases that can shape them – are key to developing supervisor-trainee relationships that lead to assessment for learning (AfL) and ensuring equitable implementation of entrustment.
When framing entrustment from a social constructivist standpoint, it becomes paramount to examine both supervisor and trainee viewpoints. Trust within a supervisor-trainee relationship forms collaboratively, developing from verbal and nonverbal interactions between them (Hauer et al., 2014; Mitchell et al., 2018). Hence, entrustment is neither an isolated decision made solely by a supervisor, nor an assessment based only on their judgement of the trainee. The trainee plays a role in the co-construction of trust, and hence their actions and impressions contribute to the entrustment decision (Sklar, 2016). Prior qualitative studies based on retrospective interviews shed light on supervisor and trainee viewpoints separately (Gin et al., 2021; Hauer et al., 2015; Sheu et al., 2016; Sterkenburg et al., 2010). However, their retrospective nature may include hindsight bias and distortion of how those insights apply in moments of actual teaching and practice. Narratives documenting feedback dialogs about observed patient encounters provide a window that is slightly closer to entrustment decisions made in actual practice. Such feedback narratives are sometimes written by supervisors and sometimes by trainees. Comparing narratives written by supervisors and trainees about similar types of clinical tasks (in similar contexts) could provide insight into differences between their perceptions or viewpoints. These narratives can be analyzed for factors related to the selection of entrustment levels, and for the emotional tone expressed in their language (i.e. sentiment). Finally, such feedback may also reflect potential biases related to trainee gender and underrepresented in medicine (UIM) status (Rojek et al., 2019).
When considering similar types of clinical tasks, trainees and supervisors may prioritize different factors when making entrustment decisions. From the supervisor standpoint, studies have focused on factors influencing supervisors’ decisions to entrust trainees. Theoretical studies identified five factors that supervisors consider (trainee, supervisor, context, task, and relationship) (Dijksterhuis et al., 2009; Hauer et al., 2014; Holzhausen et al., 2017; Kennedy et al., 2007), which were supported by empiric studies primarily based on retrospective supervisor interviews (Hauer et al., 2015; Nelson et al., 2023; Sheu et al., 2016). More recently, ten Cate and Chen developed a framework that summarizes trainee qualities for entrustment found in the literature (ten Cate & Chen, 2020). Studies of the trainee viewpoint of entrustment have focused on several manifestations of trust, including self-trust (Sagasser et al., 2017; Sturman et al., 2021), supervisor trust in the trainee (Caro Monroig et al., 2021; Gin et al., 2021; Karp et al., 2019), trainee trust in the supervisor (Castanelli et al., 2021, 2022), and mutual trust (Bonnie et al., 2020). While these aspects of trust are interrelated, in the current study we focus on the trainee’s understanding of their supervisor’s trust, since it is most directly tied to the level of autonomy a trainee is granted (and supervision they are provided) to perform a given clinical task. From retrospective studies on trainees’ perceptions of supervisor trust, we have learned that trainees consider the same five factors that supervisors use to determine trust (Gin et al., 2021) but with an emphasis on the relationship and trainee factors (Caro Monroig et al., 2021). While these differences in supervisor and trainee viewpoints have been suggested by interviews reflecting aggregate experiences in retrospect, it remains of interest to assess whether they are reflected in actual supervisory decisions when they occur in practice.
The sentiment of feedback narratives may also reveal details around the negotiation of entrustment and acceptance of feedback. Prior work on feedback by Ginsburg et al. utilized the lens of politeness theory to demonstrate that social pressures to maintain effective supervisor-trainee relationships led to a lack of directness in the language used by supervisors (Ginsburg et al., 2016). Feedback documented by trainees may not reflect such pressures when directed towards themselves. Trainees were found to make active decisions about whether to accept feedback, based on their judgement of the credibility of the feedback provider (van de Ridder et al., 2015). If a trainee were to document a supervisor’s feedback that they did not agree with, the language that the trainee uses may reflect ambiguity or a lack of agency, since the emotional content of language can reflect a trainee’s perceptions of competence and self-efficacy (Sagasser et al., 2017).
Given trust’s dependence on human judgement and instinct (often based upon little data or experience), these viewpoints are inevitably susceptible to bias (Hauer et al., 2023). Multiple junctures within entrustment are subject to bias, including the entrustment ratings themselves, the content of the narratives, and the language (e.g. sentiment) used in the narratives. Recently, Padilla et al. examined entrustment ratings in a surgical residency context for gender bias (Padilla et al., 2022). They found no such bias in assessments submitted by faculty, but a negative bias in self-assessments submitted by female residents. Dayal et al. examined milestone ratings in an emergency medicine residency, finding a bias in the rate of milestone attainment that favored male residents (Dayal et al., 2017). With respect to content, Mamtani et al. performed a large qualitative study comparing feedback themes in narrative comments given to male and female residents in an emergency medicine setting, finding that female residents were more likely to be told they lacked confidence with procedural skills (Mamtani et al., 2022). Rojek et al. examined adjectives used in medical students’ clinical evaluations, finding biases related to both students’ gender and under-represented minority (URM) status (Rojek et al., 2019). In other studies, similar biases in feedback content and sentiment related to trainee gender have either been suggested but lacked statistical significance (Minter et al., 2005) or been found to be unlikely (Andrews et al., 2021).
While qualitative studies led to retrospective insights about supervisor and trainee viewpoints of entrustment, and quantitative studies found conflicting results on bias in different settings, systematic analysis of a large dataset of entrustment-associated narratives may allow us to perform both analyses simultaneously and assess how they interact. A large dataset taken from an institution-wide experience over several years may thus allow for identification of systematic differences in trainee and supervisor viewpoints, and examination of potential biases represented in the narratives. However, performing consistent text analysis of this nature across a large dataset would be difficult to do via manual coding, and may be prone to bias of the coders themselves. Recently, the development of large language models (LLMs) has facilitated innovations in natural language processing (NLP) and artificial intelligence (AI) that elevate the ability of NLP to characterize themes and emotions (Alaparthi & Mishra, 2021; Boscardin et al., 2023; Zhang et al., 2023). LLMs underlie generative AI applications such as ChatGPT and Bard. LLMs are implemented via artificial neural networks that have been trained to represent language probabilistically, considering interrelationships of words in the context of sentences, paragraphs, bodies of text, and entire corpora. When applied to narrative excerpts, they can be used for many NLP applications, including representing meaning numerically (i.e. via embeddings), producing specific output (e.g. measuring sentiment), and generating new text based on specific prompts (e.g. chatbots).
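To make the notion of representing meaning numerically more concrete, the following minimal sketch compares narrative excerpts by the cosine similarity of their embedding vectors. The three-dimensional vectors and the excerpt labels are hypothetical values chosen for illustration only; in practice an LLM would produce vectors with hundreds of dimensions, and this sketch does not describe the pipeline used in this study.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention for a zero vector: treat as "no similarity"
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings (NOT produced by a real model).
toy_embeddings = {
    "thorough history and exam": [0.9, 0.1, 0.2],
    "complete and careful assessment": [0.8, 0.2, 0.3],
    "arrived late to clinic": [0.1, 0.9, 0.1],
}

reference = toy_embeddings["thorough history and exam"]
for excerpt, vector in toy_embeddings.items():
    print(f"{excerpt!r}: {cosine_similarity(reference, vector):.2f}")
```

Excerpts whose vectors point in similar directions score closer to 1, which is the property that allows embeddings to group narratives by theme.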
In this study we developed and utilized NLP tools based on LLMs to systematically compare the thematic content and sentiment of feedback narratives about observed clinical encounters written either by supervisors or trainees. We developed a gender-neutral sentiment analysis strategy to mitigate algorithmic bias. By examining how supervisors and trainees documented such feedback dialogs on an institution-wide scale over two years, we quantitatively compared how trainees and supervisors differentially: 1) prioritized factors tied to entrustment ratings, 2) used sentiment to convey trust, and 3) reflected potential sources of bias (based on students’ gender identity and UIM status).
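One way a gender-neutral sentiment strategy could be approached is to neutralize gendered pronouns in each narrative before it is passed to a sentiment model, so that the score cannot depend on those words. The sketch below illustrates that preprocessing idea only. The word map and the function name `neutralize_pronouns` are assumptions introduced for illustration and are not the method described in this study.

```python
import re

# Hypothetical mapping of gendered pronouns to neutral forms (an assumption).
# "her" is ambiguous (object vs. possessive); this sketch always maps it to
# "their", which is a simplification.
NEUTRAL_MAP = {
    "he": "they",
    "she": "they",
    "him": "them",
    "his": "their",
    "her": "their",
    "hers": "theirs",
    "himself": "themself",
    "herself": "themself",
}

# Match any mapped word as a whole word, ignoring case.
_PRONOUN_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(word) for word in NEUTRAL_MAP) + r")\b",
    re.IGNORECASE,
)


def neutralize_pronouns(text):
    """Replace gendered pronouns with neutral ones, keeping initial capitals.

    Verb agreement (e.g. "she is" -> "they is") and gendered names or titles
    are not handled in this sketch.
    """

    def swap(match):
        word = match.group(0)
        replacement = NEUTRAL_MAP[word.lower()]
        return replacement.capitalize() if word[0].isupper() else replacement

    return _PRONOUN_PATTERN.sub(swap, text)


print(neutralize_pronouns("She presented her plan clearly."))
```

The neutralized text would then be scored by whichever sentiment model is in use; comparing scores before and after neutralization is one simple check for whether gendered words alone shift the output.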