Consensus on an implicit bias and health disparities curriculum in neonatal medicine: a Delphi study

Despite longstanding and recurrent calls for effective implicit bias (IB) education in health professions education as one mechanism to reduce ongoing racism and health disparities, such curricula for neonatal-perinatal medicine (NPM) are limited. We aim to determine the key curricular elements for educating NPM fellows, advanced practice providers, and attending physicians in the critical topics of IB and health disparities. A modified Delphi study was performed with content experts in IB and health disparities who had educational relationships to those working and training in the neonatal intensive care unit. Three Delphi rounds were conducted from May to November 2021. Experts reached consensus on a variety of items for inclusion in the curriculum, including educational goals, learning objectives, teaching strategies, and educator principles. Essential curricular components of an IB and health disparities curriculum for neonatal medicine were defined using rigorous consensus building methodology.


INTRODUCTION
Infants and families of color or from marginalized and minoritized communities experience health inequities, defined as avoidable inequalities of health between populations, within the neonatal intensive care unit (NICU). These inequities are evident as differences in preterm birth rates [1], care delivery [2,3], and outcomes [4][5][6].
Striking differences in quality of care and health outcomes exist between and within hospitals, including differential administration of antenatal steroids, timeliness of screening eye examinations, provision of human milk on hospital discharge, and rates of healthcare associated infections [7]. Racism, the "system of ignorance, exploitation, and power, whereby one group deems itself superior to all others on the basis of ethnicity, culture, mannerisms, and color," is a powerful social determinant of health that underlies these health inequities [8]. Training and practice in antiracism is fundamental to achieving health equity.
Implicit biases are the judgements and "habits of mind" that form below the level of conscious awareness, without intention, effort, or control [9,10]. These unconscious conclusions may contribute to personally mediated racism, a form of racism described by Dr. Camara Jones [11] that includes human acts of commission and omission built upon unconscious and conscious beliefs and values, which are supported by modern culture and structures. These biases are ubiquitous, observed even among health professions education (HPE) learners [12] and professionals [13], and they often exist in stark opposition to self-reported perceptions of and behaviors toward Black people [14].
IBs have been linked to health disparities [15,16] but are potentially modifiable [17,18], leading to a longstanding call for effective IB education of health professionals and HPE learners [9,19,20,21], including within pediatrics [8] and its subspecialties such as neonatology [22]. Though many have attempted to study IB education in healthcare, empirical studies of IB education within HPE commonly assess only lower-level learning outcomes such as self-assessed awareness or knowledge [23][24][25], and few assess for lasting changes in knowledge, skill, or behavior [26,27]. The instructional approaches to IB education are similarly diverse, frequently developed and implemented at single institutions and/or woven into other curricula (such as cultural competency, health disparities, antiracism, or anti-oppression curricula) [12,23,24,27,28].
Practical considerations also exist, and educators may find it difficult to translate such theoretical models to actual educational interventions even though several expert conceptual frameworks exist for IB education within HPE [29][30][31]. Nonetheless, for IB education to lead to meaningful antiracism outcomes, dedicated IB curricula rooted in theory, developed by experts in content and education science, and with evidence of meaningful outcomes of interest are needed. Such a curriculum, specific to the needs of those within the unique field of neonatal medicine, has not been described to date.
To advance the rigor of IB education and assess its impact within neonatal medicine, we envisioned a scholarly approach to curriculum development [32] aligned with higher-level assessment [33] and evaluation [34] outcomes. To create a teachable, generalizable, and scalable IB and health disparities curriculum for those practicing in modern NICUs, including neonatal-perinatal medicine (NPM) fellows, advanced practice providers (APPs), and attending physician faculty, we set out to define the essential components of an evidence-based IB and health disparities curriculum using Kern's curriculum development framework [32], a conceptual model used widely within medical education. This approach describes curriculum development as a nonlinear, evolving, and dynamic process, and describes the ultimate goal of curricula as "to equip learners to address a problem that affects the health of the public or a given population." In this report, we describe our approach to development of two important curricular components needed for robust curricular design: namely, defining goals and objectives, as well as delineation of educational strategies, with the goal of equipping learners with additional knowledge, skills, and attitudes that may translate to improved health equity in the NICU.

METHODS
Systematic consensus group methods, including Delphi methodologies, attempt to synthesize expert opinion and are particularly useful when empirical evidence is limited or even contradictory [35]. As the ideal approaches to educating neonatal providers on IB and health disparities are not known, we sought to determine what content experts in IB and health disparities considered important and necessary for effective education for NICU trainees and providers. The Delphi method of consensus building relies on anonymity, iteration, controlled feedback, statistical group response, and structured interaction [35]. The modified Delphi accommodates a diverse group of geographically remote experts; this is especially relevant to IB and health disparities topics where inclusivity is key to a balanced outcome. Through anonymity and avoidance of face-to-face discussions, the modified Delphi method also reduces the risk of undue influence of the group by any particular participant [35]. Virtual and asynchronous participation by qualified experts also allows for ample time and space to consider difficult questions of high uncertainty and speculation [36]. Because of these affordances, we chose the modified Delphi method to determine expert-defined IB and health disparities curricular components.

Development of Delphi questionnaire
In January 2021, in accordance with our curricular development framework [32], we began broadly identifying and investigating the current approaches to IB education by conducting a scoping review of implemented curricula for clinical health professions learners and their educators published between January 2003 and December 2020. As part of our general needs assessment to understand the state of modern IB education within HPE, we mapped the varying approaches and recommendations to bias education, and identified gaps in current approaches to design, implementation, learner evaluation, and curricular evaluation (unpublished data). The curricular components advocated for by the studies included in our literature review formed the numerous potential curricular items for our Delphi expert deliberation.
We organized these components into the following categories: curricular goals and learning objectives (encompassing Kern's third step), and educational strategies/methods and educator principles (encompassing Kern's fourth step). Kern describes the curricular step of "educational strategies" as delineation of both content to be included in the curriculum, as well as the ways this will be presented to learners. We added the category of educator principles because we identified numerous considerations for educators, though not necessarily educational methodologies or strategies, espoused in the literature related to bias education. Learning objectives were further organized into objectives related to bias and the self, the impact of race in our society, ideas of power and privilege, and clinician skills and behaviors. Educational strategies/methods were subcategorized as general strategies, those related to use of the Implicit Association Test (IAT) [37], bias reduction strategies, and strategies of how to respond to incidents of racism. Educator principles were further categorized as unintended consequences of the curriculum, participant and educator factors that may influence efficacy, and the ideal learning climate.
We organized the questionnaire with response options indicating level of agreement for item inclusion using a 5-point Likert scale (ranging from strongly agree to strongly disagree). We instructed experts to leave an item blank if they were unfamiliar with the item or felt unable to comment. For every curricular subcategory, we included optional free-text spaces for experts to provide their reasoning for item inclusion and/or exclusion and to propose ideas for additional items not originally addressed. Because of the significance and nature of the topics, we felt it was particularly important to ensure sensitive and inclusive language on items, so we also asked experts for any suggestions for item rewording.
We performed cognitive interviews and pilot tested the survey multiple times with a group of neonatal medicine educators (the executive council of the National Neonatology Curriculum Committee). The final version of the questionnaire was approved by all study authors. Subsequent questionnaires were developed based on the study investigators' review of Delphi expert feedback and used the same Likert agreement scale with free text explanation format.

Expert selection
Delphi methodology requires the selection of experts who have a deep understanding of the identified issue [13]. We used purposive and snowball sampling to identify potential Delphi participants. Our investigator team developed a list of qualified experts who met two broad initial criteria: (1) content expertise in IB and/or health disparities and (2) neonatology educator. We defined content expertise as being identified as a leader in IB/health disparities via the person's leadership, educational, advocacy, social media, or research roles. We also required participants to have had an educational relationship (either historical or present) with NICU learners and/or nurses, allied health professionals, APPs, and neonatologists, as we were interested in those with educational expertise involving NPM health care professionals and learners. None of the investigators participated as a Delphi expert.
Our investigator team used professional networks to identify potential experts meeting the recruitment criteria. For geographic regions not represented in this list, we searched institutional websites for experts in IB and/or health disparities at institutions with an NPM fellowship. We also sought experts with primary roles within general pediatrics and maternal-fetal medicine to gain a diversity of viewpoints for those caring for the parental-infant dyad before, during, and after the NICU stay. Finally, we asked those who agreed to participate to provide the names of other potential IB or health disparities experts with an educational relationship with NICU learners or health professionals and invited the recommended individuals to join the Delphi study.

Consensus definition
We defined consensus a priori in terms of percent agreement amongst experts. Items were included in the final curriculum when ≥70% of respondents provided responses of "strongly agree" or "agree" for item inclusion. We determined that items with <50% of total responses as "strongly agree" and "agree" be excluded from subsequent Delphi rounds. Items reaching indeterminate consensus (50-69% of experts responding "strongly agree" or "agree") required additional deliberation in subsequent Delphi rounds. Because the focus of the Delphi was to develop and include all critical items, we did not use the Delphi process to rank items in terms of importance nor limit the number of items reaching consensus.
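The thresholds above can be expressed as a small decision rule. The following Python sketch is illustrative only and is not the study's analysis code (the authors report using Microsoft Excel). The function name `classify_item` and the label strings are hypothetical, and the sketch assumes that blank responses (recorded here as `None`) are left out of the denominator, which the text does not state explicitly. Agreement values falling between 69% and 70% are treated as indeterminate.

```python
from fractions import Fraction

# Thresholds taken from the consensus definition in the text.
INCLUDE_THRESHOLD = Fraction(70, 100)   # >= 70% agreement: item included
EXCLUDE_THRESHOLD = Fraction(50, 100)   # < 50% agreement: item removed
AGREE = {"strongly agree", "agree"}


def classify_item(responses):
    """Classify one curricular item from its Likert responses.

    `responses` is a list of Likert labels; None marks an item left blank.
    Assumption (not stated in the source): blanks are excluded from the
    denominator when computing percent agreement.
    """
    answered = [r for r in responses if r is not None]
    if not answered:
        return "no responses"
    # Exact rational arithmetic avoids floating-point edge cases at 50% and 70%.
    share = Fraction(sum(r in AGREE for r in answered), len(answered))
    if share >= INCLUDE_THRESHOLD:
        return "include"
    if share < EXCLUDE_THRESHOLD:
        return "remove"
    return "indeterminate"  # re-posed to experts in the next round
```

For example, an item with 7 of 10 answering experts choosing "agree" or "strongly agree" sits exactly at the 70% threshold and is included under this rule, whereas 5 of 10 is indeterminate and 4 of 10 is removed.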

Delphi process
We used Research Electronic Data Capture hosted at Mayo Clinic for data collection and management [38]. Invitations were sent via email to all potential experts to participate in the Delphi process. For each round, experts were sent an initial invitation to participate and up to four reminder emails to participate over the subsequent 4-6 weeks. Prior to consensus building, the purpose and goals of the Delphi process were explicitly described to the experts. In round one, we collected information on expert demographics and asked experts to define their sphere, extent, and scope of expertise in education, research, leadership, and advocacy.
Using statistical group response, we provided aggregate, anonymized feedback after each round, summarizing percent agreement for each item and identifying items that had reached consensus, those that required further deliberation (such as novel items proposed by experts or those reaching indeterminate agreement), and items failing to reach consensus. Experts were only asked to participate in subsequent rounds if they participated in all preceding rounds.
In each successive round, after reviewing the statistical group response, experts provided their level of agreement with all novel suggestions and their input on items requiring additional deliberation. For items that had achieved consensus but had suggested revisions regarding optimal wording, experts were asked to choose between "maintain," "revise," or "either wording is acceptable." Items were maintained in their original form if the percentage of "maintain" responses plus "either is acceptable" responses exceeded 70% of the total responses; items were modified if the percentage of "revise" responses plus "either is acceptable" exceeded 70% of the total responses. In the event of a tie between rewording options, the author group discussed and determined final item wording based on all expert feedback.
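The rewording rule for consensus items can be sketched in the same way. This Python example is an illustration under stated assumptions, not the authors' procedure: the function name `decide_wording` and the returned labels are hypothetical, and because the text does not define a "tie" precisely, the sketch assumes that any case in which both options or neither option exceeds 70% is referred to the author group.

```python
def decide_wording(maintain, revise, either):
    """Decide the wording of a consensus item from response counts.

    maintain, revise, either: numbers of experts choosing "maintain",
    "revise", or "either wording is acceptable".
    Assumption (not defined in the source): when both or neither option
    exceeds 70%, the decision is referred to the author group.
    """
    total = maintain + revise + either
    if total == 0:
        raise ValueError("no responses recorded for this item")
    # Integer comparison implements a strict "exceeds 70%" test without floats.
    keep_ok = (maintain + either) * 100 > 70 * total
    revise_ok = (revise + either) * 100 > 70 * total
    if keep_ok and not revise_ok:
        return "maintain"
    if revise_ok and not keep_ok:
        return "revise"
    return "author group decides"
```

With 15 responses split as 12 "maintain", 2 "revise", and 1 "either," the maintain share is 13/15 (about 87%) and the item keeps its original wording. An even 5/5/0 split exceeds neither threshold and would go to the author group under the assumption above.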
For items reaching indeterminate expert agreement (50-69% agreement), the author group developed potential revisions using experts' free-response feedback and asked experts whether items should be maintained in the original format, revised to a modified item, either maintained or revised, or removed entirely. We included each of these items only when the percentage of experts responding with "maintain" or "revise" was >70%. Items reaching 50-69% agreement that had additional suggestions for rewording were re-posed to experts in the subsequent round, and those with <50% agreement for inclusion were removed.
We conducted the Delphi process iteratively until consensus for inclusion or exclusion was reached for all proposed items. Descriptive statistics, using Microsoft Excel (MS Excel), were used to evaluate demographics and Likert scores.

RESULTS
Participants
Of the 48 experts invited to participate in the Delphi, 24 (50%) accepted and participated in round 1. Experts represented a diverse group of educators, leaders, and researchers within the spheres of IB and health disparities with varying years of experience in clinical medicine (Table 1). Most experts reported educational expertise in IB and/or health disparities (22/24, 92%) and leadership roles (17/24, 71%). Most experts also defined themselves as researchers with publications in peer-reviewed journals for the topics of IB and health disparities (18/24, 75%), with two experts also holding multiple R01 level National Institutes of Health grants related to these topics (2/24, 8%). Three quarters (18/24) of the experts participated in round two, and 15 experts (63%) participated in the final round.

Delphi process
Three Delphi rounds were conducted to reach consensus from May to November 2021. Figure 1 shows a detailed overview of individual curricular item handling during the Delphi process. Percent agreement for each proposed item (expert response of either "strongly agree" or "agree" for item inclusion in the final curriculum) is shown in Appendix 1.
In Round One (5/2021-6/2021), 66 original curricular items described in the literature (7 goals, 18 objectives, 30 educational methods/strategies, and 11 educator principles) were proposed to experts. Four items were removed after Round One due to <50% agreement; 8 items had indeterminate agreement for inclusion (50-69%) amongst experts. Analysis of the free-response comments from experts led to the addition of 1 goal, 6 objectives, 4 educational methods/strategies, and 4 educator principles for deliberation in Round Two. Suggestions for rewording were posed for 4 goals, 12 objectives, and 3 educational methods/strategies. At the end of Round One, consensus was reached for inclusion of 55 items: 7 goals, 18 objectives, 23 educational methods/strategies, and 6 educator principles.
In Round Two (8/2021-9/2021), 9 items were removed due to persistent non-agreement for inclusion, and 19 consensus items were reworded. No new items were proposed. At the end of Round Two, consensus was reached for the inclusion of 64 items: 8 goals, 23 objectives, 26 educational methods/strategies, and 7 educator principles. Two of these objectives had reached consensus but still had free-response comments.
In Round Three (10/2021-11/2021), experts provided clarification regarding the remaining five unresolved items. Two objectives were reworded, and 1 objective was reclassified as an educator principle. Two other educator principles were removed due to persistent non-agreement.
After three Delphi rounds, consensus was reached for the inclusion and language of 8 goals and 23 objectives (Table 2) and 26 educational methods/strategies and 8 educator principles (Table 3).

DISCUSSION
In this national modified Delphi study, a diverse group of subject matter experts in IB and health disparities with established relationships with NPM learners and staff reached consensus for curricular goals, learning objectives, education methods/strategies, and educator principles that will lay the foundation for a targeted neonatal medicine IB and health disparities curriculum. IB exists in healthcare providers [39], can increase throughout medical training [40], and may be a modifiable factor in reducing health disparities due to racism [41]. The IB education literature is characterized by variable approaches to curricular design and evaluation, learner outcomes assessed, and quality of reporting [12,23,24,27,28]. If education on IB and the health disparities they contribute to is to lead to long-term changes in provider knowledge, skills, and attitudes, then a rigorous approach to curricular development is necessary.
Delineation of educational goals is critical to rigorous curriculum development for two reasons. First, goals should reflect the desired learning outcomes, and thus, they emphasize the development of certain knowledge, skills, or attitudes. This has profound implications on the appropriateness of subsequent learner assessment and curriculum evaluations. For example, curricula with goals oriented to provider behavioral change (e.g., communication strategies) align well with the assessment of simulated and real provider behaviors, as well as the impact of these behaviors on patients. Secondly, goals inform which learning objectives are required, and subsequently, which educational strategies will be appropriate to achieve the learning objectives.
In our Delphi process, experts reached consensus on eight overarching goals (Table 2). These goals emphasize self-awareness of bias (G1-2), knowledge of the multiple components of modern health disparities (G3-6), and strategies for provider behavioral change (G7-8). The goals developed in this modified Delphi study reflect consensus opinion that IB education must not stop at acknowledgment and awareness of one's IBs, but that such education must also provide learners with opportunities to practice bias mitigation and interventions when bias is recognized. This is similar to IB frameworks within broader HPE contexts imploring educators and institutions to "move beyond concepts toward applications" [21].
The educational goals reaching consensus gave rise to numerous cognitive, social, and behaviorist-oriented objectives that align with multiple levels of knowledge, as described by Miller [33]. These objectives encompassed concepts related to bias and the self (O1-5), the impact of race in our society (O6-11), power and privilege (O12-16), and clinician skills and behaviors (O17-23). Significant attention was given to the action verbs within the objectives by both the study team and the Delphi experts to ensure objectives were specific, measurable, achievable, and realistic, which are critical to inform appropriate learning and assessment strategies. Because of the nature of the topics, many cognitively oriented objectives (with verbs such as "describe," "identify," and "reflect") were found to be appropriate by the expert panel. These objectives support the foundational level of knowledge (in Miller's pyramid, that a learner "knows") and lend themselves well to written assessments [33]. Although foundational and necessary, this knowledge level is not sufficient for the ultimate goal of reducing provider IB and subsequently reducing health disparities [41]. Additional objectives reaching consensus support application and demonstration of knowledge, specifically in the development and practice of skills to recognize and mitigate IB (O17-23). These skills (with verbs such as "develop," "articulate," and "apply") allow for learning at the level of Miller's "shows" and "does" [33] and lend themselves well to oral, performance, and workplace-based assessments.
Table 2. Expert consensus for included goals and objectives.
Though experts reached very high levels of agreement for goals and objectives that closely aligned with several established IB educational frameworks, the educational methods, strategies, and guiding principles were more controversial. Some educational approaches advocated in the literature were rejected by the expert panel, demonstrating that not all strategies espoused in the literature are recommended as practical and useful by experts. Eight educational methods/strategies and eight educator principles proposed in the literature failed to reach consensus amongst this expert group (see Appendix 1). For example, experts disagreed that this curriculum should incorporate mindfulness training, implore learners to increase their motivation to be fair, use fantasy characters to teach about stigma, or use exercises based on public exposure of privilege. Experts also had divergent opinions about the utility of the IAT [37] and did not reach consensus that the IAT should be a required component of the curriculum. Experts did not agree that taking an IAT leads to changes in learner self-perception, a finding that Gonzalez and colleagues have also reported [12]. Experts also did not reach consensus on the value of the IAT as an assessment tool for long-term learning nor as a program evaluation tool. This finding aligns with the guidance from Project Implicit, which specifically states that the IAT does not meet reliability standards for measurement and discourages its use as an assessment tool within pre-post research designs unless a control group is also used [37]. Nonetheless, experts reached consensus that the IAT has a role as a "catalytic reagent" for inciting self-awareness, though they cautioned against its use in isolation, as knowledge of one's biases alone may lead to discounting of the findings or even hostility in the learner. Experts encouraged educational strategies such as reflection, storytelling, and teach-back for learning about IB and health disparities.
Strongly endorsed bias mitigation strategies (with >90% agreement) included stereotype replacement, counter-stereotyping, individuation, perspective taking, and increasing opportunities for contact. Experts also reached consensus that all five proposed bias-response strategies have a potential role in such a curriculum, with the highest levels of percent agreement for the Step Up/Step Back model (100%) [24] and the Active Bystander model (93%) [42]. The Step Up/Step Back model encourages participants to actively decide whether to speak out for those unable to or to support others to speak out for themselves, whereas the Active Bystander model focuses on recognition of the bystander role, inhibitors and facilitators of action, and a variety of intervention techniques. This suggests that learners may benefit from learning multiple practical strategies for attending to bias.
Finally, the consensus on educator principles suggests that educators, facilitators, and learners all shape the learning environment during IB and health disparities education. Learners approach bias education from a variety of backgrounds and with unique life experiences, and educators may benefit from conducting local needs assessments to determine their own learners' needs, baseline knowledge, skills, and attitudes. In contrast to many curricula in which goals and objectives are to be completed at the end of a finite period, experts agreed that personal and professional growth in IB topics are lifelong pursuits that cannot be addressed in a single session, module, or calendar year. Experts valued individual goal setting and encouraged educators to recognize that learners may progress in their knowledge and skill at different rates. Thus, the ideal curriculum would allow for local educators to determine the relative emphasis, frequency, and duration of topics. Local educators will also be best able to determine if this curriculum should be integrated within other curricula or stand alone based on their local resources and expertise. Experts encouraged facilitators to vigilantly monitor for power imbalances and adverse effects on learners (such as marginalization, adding to the minority tax, shame, or guilt), and discouraged educators from presenting cultural "menus" and perpetuating stereotypes. Finally, though both "safe" and "brave" spaces are advocated for in the literature [24,29,43], experts did not agree that one approach was superior to another; they instead left the choice to the local facilitator.
This study has limitations related to the modified Delphi process. The final product of consensus building methodologies relies on the level of expertise of the expert pool [35] and is influenced by the state of IB and health disparities education in the literature. Thus, the expert recommendations for best educational practices may change over time as knowledge related to these topics increases. Expert attrition during the Delphi study was moderate. We decided to present all potential curricular items to experts as a single initial survey, so that experts could see the wide range of potential approaches to IB and health disparities education before providing their level of agreement with any particular goal, objective, strategy, or principle. While this may have allowed for iterative feedback and consideration of the interactive effects of each curricular component on others, it may also have contributed to expert uncertainty and attrition. We also chose an a priori level for consensus as 70% agreement and did not force experts to rank items in order of importance, both of which may increase the number of final curricular components and make the curriculum longer or more complicated than desired. Yet, the number of items reaching high levels of expert consensus highlights the complexity of this topic, and it is unlikely that any one curriculum would be able to incorporate all these elements. Instead, the compiled lists provide educators a scholarly initial direction to bias education while still allowing them to tailor the curriculum to their local learners, resources, and contexts.
In summary, a rigorous approach to a neonatology-specific IB and health disparities curriculum is a key step toward promoting health equity in the NICU. Delphi experts reached consensus for numerous curricular components, including goals and objectives, educational strategies, and learning environment considerations, for such a curriculum. These curricular components should be considered when creating IB-focused educational materials for NICU providers. Future studies are needed to evaluate the proposed methods of delivering this curriculum and its effects on provider, patient, and family outcomes.

DATA AVAILABILITY
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.