Study participants
Participants were recruited from psychiatry and general internal medicine. As mentioned in the Introduction, the evidence so far on the effects of experience and contact on prejudice is mixed. The specialties of psychiatry and general medicine were chosen because in examining bias against illness of the mind and a characteristic of the physical body (obesity) it was deemed interesting to compare specialists in the mind (psychiatrists) and those who did not specialise in the mind and thus could be expected to focus more on physical characteristics (general internists). Female physicians have been found to display weaker implicit bias against the obese than male physicians, hence gender differences in obesity bias were expected.(35) Participants were recruited from the Geneva University Hospitals and from private practices in Geneva by email or physical mail.
Data collection
All participants provided written informed consent. The experiment was described as an investigation into implicit attitudes using a categorisation task with the aim of improving standards of clinical care. The IAT was not named nor were the words ‘bias’ and ‘prejudice’ mentioned to avoid influence on responses to the clinical vignette. The order of measures (Fig 1) was chosen to mask the characteristics under study for as long as possible. The ethics committee approved this procedure for obtaining consent. Participants first completed a demographic questionnaire and a clinical vignette online. Patient characteristics were randomized in the vignette to create four groups each receiving a different version.
A face-to-face meeting was then held with each participant. Each participant completed a Mental Illness IAT and a Weight IAT, followed by one of two interventions, instructions for the cognitive load condition, or a control condition. Participants then repeated both IATs and finally completed two Feeling Thermometers to rate their feelings towards the obese and the mentally ill. The interviewer asked participants to refrain from discussing the contents of the study with colleagues. Interviews were conducted in French, the local language of the participants, and all materials and tests were in French. All methods were carried out in accordance with relevant guidelines and regulations.
Measures
IATs: The most prevalent measure of implicit biases is the Implicit Association Test (IAT), a computerized task where participants rapidly categorize negatively and positively valenced words with images or words. The relative speed of association of, e.g. in a Race IAT, black faces with positively-valenced words (as compared to the other possible associations), indicates the level of bias.(42)
Mental illness stimuli were taken from a previously tested IAT, comparing words for physical illnesses and words for mental illnesses paired with negatively and positively valenced words.(38) The Weight IAT used stimuli from the Project Implicit website: silhouette images of thin and obese people paired with negatively and positively valenced words.(43) Words were translated from English to French. The D-score interpretation used by Project Implicit treats a score greater than 0.15 as a slight bias, greater than 0.35 as a moderate bias, and greater than 0.65 as a strong bias. Negative scores represent the inverse association, namely an association between either mental health or obesity (as compared to physical health or thinness) with positive rather than negative words.(42,43)
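The D-score cut-offs quoted above can be expressed as a small lookup. The following Python sketch is illustrative only and is not part of the study's analysis pipeline. The function name `interpret_d_score`, the label for scores at or below 0.15, and the use of the same cut-offs for negative scores are assumptions made for the example.

```python
def interpret_d_score(d: float) -> str:
    """Map an IAT D-score to the verbal labels quoted in the text.

    Assumptions (not stated in the source): the same cut-offs are applied
    to the magnitude of negative scores, and scores with magnitude at or
    below 0.15 are labelled "little or no bias".
    """
    magnitude = abs(d)
    if magnitude > 0.65:
        strength = "strong"
    elif magnitude > 0.35:
        strength = "moderate"
    elif magnitude > 0.15:
        strength = "slight"
    else:
        return "little or no bias"
    # Positive scores: target group associated with negative words.
    # Negative scores: the inverse association (with positive words).
    direction = "negative-word association" if d > 0 else "positive-word association"
    return f"{strength} {direction}"
```

For instance, under these assumptions a D-score of 0.40 would be labelled a moderate negative-word association, and a D-score of -0.20 a slight positive-word association.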
Vignette: The vignette was taken from a study that found differences in clinical responses to pain correlated with patient gender.(44) It was translated from Portuguese to French and modified to create four versions: a control version with no medical history, a version where the BMI of the patient was 32 and thus indicated clinical obesity, a version where the patient had experienced depressive episodes, and a version where the patient had a BMI of 32 and had experienced depressive episodes. Six questions asked participants to evaluate pain intensity, clinical severity, clinical urgency and pain credibility on a scale of 1-7 (S1 Appendix).
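The four vignette versions correspond to crossing two binary patient characteristics (obesity and a history of depressive episodes). The Python sketch below illustrates this structure and one possible way of allocating participants to versions. The source does not describe the allocation mechanism, so the balanced round-robin scheme, the fixed seed, and the names `VERSIONS` and `assign_versions` are assumptions made purely for illustration.

```python
import itertools
import random

# Two binary patient characteristics give the four vignette versions:
# control, depression only, obesity only, and both.
VERSIONS = [
    {"obesity": obesity, "depression": depression}
    for obesity, depression in itertools.product([False, True], repeat=2)
]


def assign_versions(participant_ids, seed=0):
    """Shuffle participants, then deal versions round-robin.

    Hypothetical allocation scheme: it yields groups of near-equal size,
    which the source does not claim was the procedure actually used.
    """
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    return {pid: VERSIONS[i % len(VERSIONS)] for i, pid in enumerate(shuffled)}
```

Calling `assign_versions(range(8))` returns a dictionary mapping each participant identifier to one of the four versions, with two participants per version.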
Feeling Thermometer: Each thermometer consisted of a continuous unnumbered line with ‘warm feelings’ written on the left side of the line and ‘cold feelings’ on the right. The range was 0-12, but the increments were not indicated on the actual line where participants marked a cross. Instead, the segments were designated as follows:
0 - 2.75 warm feelings
2.75 - 5.5 slightly warm feelings
5.5 - 6.5 neutral feelings
6.5 - 9.25 slightly cold feelings
9.25 - 12 cold feelings
Participants marked the point that represented their feelings towards the obese and the mentally ill. Lower scores represent warmer feelings. The Feeling Thermometer measures explicit feelings and is therefore a measure of explicit bias.
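The segment boundaries listed above can be applied to a measured position on the line as follows. This Python sketch is illustrative only. Because adjacent segments share their boundary values in the list, the example assumes that a mark falling exactly on a boundary is assigned to the warmer (lower) segment; the source does not specify this, and the function name `thermometer_label` is hypothetical.

```python
from bisect import bisect_left

# Upper bounds of the first four segments on the 0-12 line.
UPPER_BOUNDS = [2.75, 5.5, 6.5, 9.25]
LABELS = [
    "warm feelings",
    "slightly warm feelings",
    "neutral feelings",
    "slightly cold feelings",
    "cold feelings",
]


def thermometer_label(score: float) -> str:
    """Return the segment label for a mark measured on the 0-12 line.

    Assumption: a mark exactly on a shared boundary is assigned to the
    warmer (lower) segment.
    """
    if not 0 <= score <= 12:
        raise ValueError("score must lie between 0 and 12")
    # bisect_left counts the bounds strictly below the score, which is
    # the index of the segment containing it.
    return LABELS[bisect_left(UPPER_BOUNDS, score)]
```

Under this assumption, a mark at 6.0 falls in the neutral segment and a mark at 10.0 in the cold segment.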
Interventions and cognitive load
Few interventions have been tested on implicit weight bias and implicit mental illness bias, and even fewer have produced significant results.(45–50) Video interventions designed to increase physicians’ identification with their obese and mentally ill patients were chosen for their potential to reduce levels of implicit bias. Participants were simply required to watch the short (1.25-minute) videos and retook the IATs immediately afterwards. The videos were extracts from videos made by the National Health Service in the UK to help physicians identify and empathise with their patients, and showed two genuine female patients in the UK talking about their clinical experiences with obesity and depression. The videos were subtitled in French and are available from the following website: https://www.unige.ch/medecine/ieh2/fr/recherche/groupe-samia-hurst-manjo/.
Given concerns regarding the test-retest reliability of the IAT (41,51) and the possibility of a learning effect when participants completed the second IAT, we included a control group. In the control group, participants counted backwards in twos out loud in place of the intervention for the same length of time and then retook the IATs.
A fourth group, in place of the intervention, was instructed on how to proceed with the second set of IATs under a cognitive load condition, which consisted of counting backwards in twos out loud during the intervention. They then proceeded to retake the IATs. The effects of implicit bias are thought to increase under stress and time pressure, common working conditions for physicians that can be simulated (albeit approximately) with a cognitive load. While previous research has shown that conditions of cognitive load increase implicit biases,(52,53) we were not aware of other research that looked directly at the effects of cognitive load while performing an IAT.
It was hypothesized that the video interventions would correlate with reduced levels of implicit prejudice and that the cognitive load condition would correlate with increased levels of implicit prejudice when compared with the control group.
Statistical Analysis
The sample size was based on an initial a priori power analysis for ANOVA, targeting a medium-sized interaction effect between two factors (specialty and experience) with two levels each, with an alpha of 0.05 and a power of 0.80, requiring a total sample size of N = 158 (approximately 40 per cell). Data were entered into the SPSS statistical software package for analysis. Participants’ socio-demographic characteristics were analysed using descriptive statistics. Inter-item reliability (the questionnaire’s internal consistency) was tested using Cronbach’s alpha. In response to reviewer comments, we changed our analysis to treat IAT scores as continuous variables. Linear regression was conducted to test for association of gender, experience, and specialty with responses to vignettes, pre-intervention IATs and explicit attitudes. Linear regression was also conducted to test for association of interventions (or control) with post-intervention IATs and explicit attitudes. To take possible confounding variables into account in this second analysis, we also included gender, experience, and specialty. Reported effect sizes were computed using Cohen’s d. Two-tailed p<0.05 was selected as the significance threshold. The data are publicly available at the following DOI: 10.26037/yareta:md2ryexqsrchhb2fafgor6lcmm.
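For readers who wish to reproduce the reliability and effect size statistics outside SPSS, the following Python sketch shows the standard textbook formulas for Cronbach's alpha and Cohen's d. It is not the SPSS procedure used in the study. The function names are hypothetical, and the pooled-standard-deviation form of Cohen's d is an assumption, since the source does not state which variant was computed.

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of participants' item scores.

    `rows` holds one list of k item scores per participant.
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(totals))
    """
    k = len(rows[0])
    # Sample variance of each item across participants.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each participant's summed score.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled standard deviation (assumed variant)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)
```

Both functions use sample (n - 1) variances. `cronbach_alpha` expects complete data with at least two items and two participants, and `cohens_d` expects at least two observations per group.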