The Impact of Gender on Researchers’ Assessment: A Randomized Controlled Trial

OBJECTIVES
This randomized controlled trial aimed to test whether evaluators would prefer women or men presenting identical curricula vitae (CVs), and to assess the impact of the applicant's career stage on the evaluators' choice.


STUDY DESIGN AND SETTING
A simulated selection process for a post-doctoral position was set up for evaluation. Brazilian research fellows at levels 1 and 2 in the field of Dentistry were invited to act as external reviewers in this process and were randomly assigned to receive a CV from a woman or a man. They were required to rate the CV from 0 to 10 on scientific contribution, leadership potential, ability to work in groups, and international experience.


RESULTS
In every category evaluated, CVs from men received higher scores than CVs from women. Poisson regressions with robust variance showed that men were more likely to receive higher scores in all categories, regardless of the applicant's career stage. For example, CVs from men were nearly three quarters more likely to be seen as showing leadership potential than equivalent CVs from women.


CONCLUSIONS
Gender bias is highly prevalent in academic dentistry, regardless of researchers' career stage. Actions such as implicit bias training must be urgently implemented to prevent (or at least reduce) further harm to women.


Background
Women remain underrepresented in academic Dentistry, and the gap widens at each step of career progression [1,2]. Recent evidence showed that in the United States, for example, almost half of dental school graduates were women, whereas only 22% of the faculty were women [1]. That study suggested the same trend in all the other countries evaluated, including the United Kingdom, France, Germany, and Japan (the country with the largest gap, where more than 40% of dental students but only 4% of professors were women) [1]. The glass ceiling effect is a metaphor for this invisible barrier that women face in advancing their careers to higher levels [3].

The gender gap in research might be related to three main factors: lower professional performance, systemic bias, and individual bias [4]. Women's lower performance can be connected to many underlying challenges they face, such as family and societal pressures and childcare responsibilities, among others [1,4]. Another potential reason for lower professional performance is the unconscious use of more modest language by women, which reduces the chance that an article written by women is accepted by a peer-reviewed journal [4,5]. Systemic bias refers to the way that ecosystems are organized to favor men [4]. In the grant ecosystem, review criteria unfairly favor male principal investigators because of cumulative advantage, which is highly prevalent in research output. For example, men are more often the first and last authors of published papers, and men may have a higher rate of successful grant applications in their profiles [4]. The third factor contributing to this gap is individual bias, which relates to conscious or unconscious gender bias among people with decision-making power, such as editors, ad-hoc grant reviewers, committees, and journal reviewers [4].

Individual bias occurs because human beings are not neutral [5]. Their judgment and behavior are based on associations arising from previous experiences that lead to certain preferences or aversions [5]. Implicit or unconscious bias is the term for discriminatory behavior that occurs without conscious discriminatory intent [6]. It is of utmost importance to investigate the factors underlying how researchers are assessed according to their gender in Dentistry and in STEM (Science, Technology, Engineering, and Mathematics) fields overall. Thus, we developed a randomized controlled trial to test whether evaluators would prefer women or men presenting identical curricula vitae (CVs), and to assess the impact of the applicant's career stage on the evaluators' choice. To this end, a simulated selection process for a post-doctoral position was set up for evaluation.

Protocol Availability and Ethical Approval
The study protocol was approved by the local Brazilian Institutional Review Board (IRB) (Comitê de Ética em Pesquisa da Faculdade de Medicina, Universidade Federal de Pelotas, Brazil; number 10227419.2.0000.5318), and the full research protocol is available on the Open Science Framework platform (https://osf.io/2ut5v/).

Study Design
This study was designed as a randomized (1:1), superiority, parallel-group trial, blinded for assessors and controlled by gender and career stage. It compared researchers' assessments of the same CV presented with a male or a female gender, using a selection process for a post-doctoral position in Dentistry at a southern Brazilian university as a proxy. The study's exposures were the gender of the applicant at two levels (male and female) and the career stage of the applicant at two levels (early-stage or later-stage career). The primary outcome was the evaluators' assessment in each of the four categories evaluated (scientific contribution, leadership potential, ability to work in groups, and international experience) according to the CV's gender and career stage.
This study was reported based on the CONSORT 2010 Statement and its extension for multi-arm randomized trials [7,8].

Eligibility criteria
Eligible participants (i.e., ad-hoc assessors) were level 1 and 2 research fellows in dentistry of the Brazilian National Council for Scientific and Technological Development in the year 2020. Because members of our own university community might have known about the trial, we excluded research fellows from our university from the sample to avoid possible contamination bias.

Sample size
The sample size estimation was based on the results of a previous study [2] and on measures of clinical relevance. We assumed a maximum type 1 error of 0.05, a power of 0.90, and an effect size of -0.81 (mean difference between groups in the final grade) with a standard deviation of 1.1. We obtained a sample size of 78 researchers. Considering an average questionnaire response rate of 10% [9] (i.e., a non-response rate of 90%), we assessed all 211 research fellows in dentistry for eligibility. We randomized all 117 researchers who met the inclusion criteria and agreed to participate.
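The figure of 78 researchers can be checked with a short calculation. The sketch below assumes the usual normal-approximation formula for a two-sided comparison of two independent means with equal group sizes; the text does not state which formula or software was used, so the function name and the choice of formula are our assumptions.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.90):
    """Per-arm sample size for comparing two means (normal approximation).

    Assumed formula: n = 2 * ((z_{1-alpha/2} + z_{power}) * sd / |delta|)^2
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided type 1 error
    z_beta = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / abs(delta)) ** 2)


per_group = n_per_group(delta=-0.81, sd=1.1)
total = 2 * per_group

# Invitations needed if only 10% of invited researchers respond
response_rate_pct = 10
invitations = ceil(total * 100 / response_rate_pct)

print(per_group, total, invitations)  # 39 78 780
```

Under these assumptions the calculation gives 39 researchers per arm, or 78 in total. With a 10% response rate, about 780 invitations would be needed, which exceeds the 211 available research fellows and is consistent with the decision to assess all of them for eligibility.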

Randomization and blinding
Researchers were randomly assigned to receive a female or male CV with a 1:1 allocation by a computer-generated randomization system stratified by career stage (early career or non-early career) using permuted blocks of random sizes. The list of random numbers was generated on a website (www.sealedenvelope.com). Allocation concealment was ensured by a researcher not involved in the study, and another researcher allocated each participant following the allocation sequence. Each researcher received only one CV.
The researchers were not aware of the study. They were invited to act as external peer reviewers in a selection process for a supposed post-doctoral position at a southern Brazilian university.
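The stratified permuted-block allocation described above can be sketched as follows. This is only an illustration of the technique, not the Sealed Envelope implementation: the block sizes (2 and 4), the stratum sizes, the seed, and all function names are hypothetical.

```python
import random


def permuted_block_sequence(n, arms=("female CV", "male CV"),
                            block_sizes=(2, 4), rng=None):
    """Return n allocations built from balanced blocks of randomly chosen size."""
    rng = rng or random.Random()
    sequence = []
    while len(sequence) < n:
        size = rng.choice(block_sizes)  # each size must be a multiple of len(arms)
        block = list(arms) * (size // len(arms))  # equal number of each arm
        rng.shuffle(block)  # random order within the block
        sequence.extend(block)
    return sequence[:n]  # the final block may be cut short


def stratified_allocation(stratum_sizes, seed=None):
    """Build one independent permuted-block sequence per stratum."""
    rng = random.Random(seed)
    return {stratum: permuted_block_sequence(n, rng=rng)
            for stratum, n in stratum_sizes.items()}


# Hypothetical stratum sizes, chosen only for the example
allocation = stratified_allocation(
    {"early career": 10, "non-early career": 12}, seed=2020)
for stratum, sequence in allocation.items():
    print(stratum, sequence)
```

Because every complete block contains each arm equally often, the two arms can differ by at most half of the largest block size within a stratum, while the random block sizes make the next allocation harder to predict.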

Interventions
Each researcher selected according to the eligibility criteria received an e-mail (Appendix 1) with an invitation to act as an ad-hoc reviewer for a supposed post-doctoral fellowship. Upon acceptance, each researcher received a second e-mail (Appendix 2) containing information about the evaluation process in which they were invited to take part. Along with this information, the e-mail contained one of the four possible CVs to be evaluated. The options were: early career female (Appendix 3), early career male (Appendix 4), non-early career female (Appendix 5), and non-early career male (Appendix 6). For more information, see Table 1. This e-mail also contained a document with a simulated call for applications (Appendix 7) to give credibility to the process. The "early career" CV described an applicant who had just concluded a Ph.D. and had 12 published papers, whereas the "non-early career" CV described an applicant who had completed a previous post-doctorate and had more than 20 published papers. The purpose of using CVs at different career stages was to assess whether gender bias occurs more at the beginning of a career or when the career is more consolidated.
Each researcher received the CV of one applicant (gender and career stage selected according to the randomization) and was required to rate each topic from 0 to 10 (0 being insufficient and 10 very sufficient) on a visual analog scale. The topics were scientific contribution, leadership potential, ability to work in groups, and international experience. To allow the blinding of the evaluators and to keep the male and female CVs and profiles equivalent, the full name and the publication list were blinded, as was any external reference that could be cross-checked online, such as Researcher ID, ORCID ID, social media profiles, and grant numbers. The researchers were also not aware that they were participating in a study. However, after they sent the CV assessment, they received an e-mail with a questionnaire (Appendix 8) containing information about the study and requesting authorization to use the previously submitted data. Researchers were also asked whether, at any time, they had doubted the veracity of the selection process for a post-doctoral fellow.
The same researcher (MCF) sent all e-mails containing the invitations, the explanation of the study, and the Free Prior Informed Consent (FPIC) from an institutional e-mail address created for this purpose.

Outcomes
The primary outcome was the final grade given by evaluators according to the CV's gender and career stage. As secondary outcomes, we evaluated each of the four categories (scientific contribution, leadership potential, ability to work in groups, and international experience) and the grades in each category according to the gender of the CV and the gender of the evaluator. Each category could receive a grade from 0 (lowest score) to 10 (excellent score), and the final grade was the arithmetic mean of the four category grades. Grades could also contain decimals (e.g., 9.6).
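As a small worked illustration of how the final grade follows from the four category grades, the sketch below computes the arithmetic mean after checking that every grade lies between 0 and 10. The function name and the example grades are hypothetical and are not taken from the trial data.

```python
from statistics import fmean

CATEGORIES = (
    "scientific contribution",
    "leadership potential",
    "ability to work in groups",
    "international experience",
)


def final_grade(grades):
    """Arithmetic mean of the four category grades, each required to be in [0, 10]."""
    missing = [c for c in CATEGORIES if c not in grades]
    if missing:
        raise ValueError(f"missing categories: {missing}")
    values = [float(grades[c]) for c in CATEGORIES]
    if any(not 0 <= v <= 10 for v in values):
        raise ValueError("each grade must be between 0 and 10")
    return fmean(values)


# Hypothetical grades for one CV
example = {
    "scientific contribution": 9.6,
    "leadership potential": 8.0,
    "ability to work in groups": 7.5,
    "international experience": 9.0,
}
print(round(final_grade(example), 3))
```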

Statistical Methods
Descriptive analyses were used to summarize evaluators' characteristics. Continuous variables were described as mean and standard deviation or interquartile range (IQR). Categorical variables were expressed as point estimates and 95% confidence intervals. Crude bivariate associations between exposures and outcomes were tested with chi-square (χ²) tests at an alpha level of 0.05.
Forward stepwise Poisson regressions with robust variance and log links were used to estimate the associations between the exposure variables of interest (gender and career stage) and the dependent variables (scientific contribution, leadership potential, ability to work in groups, and international experience), both unadjusted and adjusted for CV gender and career stage. All analyses were performed using SPSS Statistics 25 (IBM, New York, USA), and an alpha level of 0.05 was set for inferential analyses.
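The model is not written out in the text. A minimal sketch, assuming a standard log-link Poisson model with the two exposures entered as indicator covariates and fitted separately for each grade category, is given below; the symbols (Y for the grade, male and late for the indicators, x for the covariate vector) are our notation, and the sandwich form shown is the common HC0 estimator, which may differ from the SPSS default in small-sample corrections.

```latex
% Assumed model for grade Y_i in one category (notation is ours)
\log \mathbb{E}[Y_i \mid \mathbf{x}_i]
  = \beta_0 + \beta_1\,\mathrm{male}_i + \beta_2\,\mathrm{late}_i ,
\qquad
\mu_i = \exp\!\left(\mathbf{x}_i^{\top}\boldsymbol{\beta}\right)

% Exponentiated coefficient: ratio of expected grades, male vs. female CV,
% holding career stage fixed
\exp(\beta_1)
  = \frac{\mathbb{E}[Y \mid \mathrm{male}=1,\ \mathrm{late}]}
         {\mathbb{E}[Y \mid \mathrm{male}=0,\ \mathrm{late}]}

% Robust (sandwich) variance of the estimated coefficients
\widehat{\mathrm{Var}}(\hat{\boldsymbol{\beta}})
  = A^{-1} B A^{-1},
\qquad
A = \sum_i \hat{\mu}_i\, \mathbf{x}_i \mathbf{x}_i^{\top},
\qquad
B = \sum_i (y_i - \hat{\mu}_i)^2\, \mathbf{x}_i \mathbf{x}_i^{\top}
```

Under this reading, a statement such as "nearly three quarters more likely" would correspond to an exponentiated gender coefficient somewhat below 1.75; the estimates themselves are those reported in Table 3.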

Results
The RCT was conducted between June and September 2020. Of the 211 researchers assessed for eligibility, 117 met the inclusion criteria, agreed to participate, and were randomized. After learning that the evaluation process was a study on gender bias, 56 participants signed the FPIC and had their evaluations included in the analysis. More details are presented in Figure 1.
Table 2 presents the characteristics of the reviewers invited to make the evaluation and of those who completed it. The participants who completed the evaluation are further divided into those who signed the FPIC, those who did not, and those who did not answer. The majority of invited participants were men (62.2%). Among those who completed the evaluation, 68.4% of women and 56.1% of men signed the FPIC. No women answered "no" on the consent form and 31.6% of women did not answer it, whereas 4.5% of men answered "no" and 39.4% did not answer.
For the early career stage CVs, men received higher scores than women in all four categories, with a difference of at least half a point for ability to work in groups and one point for the other categories. For the non-early career stage CVs, men received higher scores than women in all four categories except the ability to work in groups, with a difference of half a point for scientific contribution, 0.8 for international experience, and 1.2 for leadership potential (Figures 1 and 2). For all categories evaluated, men and researchers at the later career stage received higher grades (Table 3). Female evaluators gave higher grades to male CVs for scientific contribution and final grade. Male evaluators gave higher grades to male CVs for scientific contribution, leadership potential, and final grade (Table 4).

Discussion
As far as we know, this was the first randomized controlled trial to evaluate the impact of gender on researchers' assessments for a post-doctoral Dentistry scholarship. In every category evaluated, CVs from men received higher scores than CVs from women. Even though the interquartile ranges largely overlapped for all variables, the robust variance Poisson regressions showed statistically significant differences for all variables. The Poisson regressions demonstrated that men were more likely to receive higher scores in all categories, regardless of the applicant's career stage. For example, CVs from men were nearly three quarters more likely to be seen as showing leadership potential than CVs from women. These results demonstrate the individual gender bias that women face in academia.
Considering the applicants' career stage, the non-early career stage CV received higher scores in all categories than the early career stage CV. This was expected, since the non-early career stage CV listed a greater number of publications, participation in events, awards received, and so on. Our main objective in creating two types of CV was to assess whether gender bias would be more present at a particular career stage. However, in our study, gender bias occurred similarly for both types of CV.
Overall, it is worth noting that gender affected the scores for all evaluated aspects of the CVs more than career stage did, which is quite remarkable considering the substantial differences between the CV profiles representing the two career stages in the present study.
A descriptive analysis was carried out to see whether the gender of the evaluators could influence the scores given. Both male and female evaluators gave lower grades to women's CVs for scientific contribution and final grade. Male evaluators also gave lower grades to women's CVs for leadership potential. This suggests that gender bias is not committed exclusively by men but by both genders.
Gender norms are not natural. They are constructed, learned, and reproduced socially with the intervention of different institutions such as the State, the family, religion, and the media [10]. This whole social context shapes the way we represent our idea of gender. Often gender norms are so tied to our imagination that gender bias occurs unconsciously. This is the so-called implicit bias [6,10]. Whether conscious or not, gender bias is a severe problem that affects women's careers at different levels and can be decisive for academic success [10]. If we consider, in addition to the individual bias demonstrated in the present study, the systemic bias and the many other disadvantages faced by women, such as being solely responsible for the care of children and family, this seems like a battle already lost [11,12]. As previously mentioned, human beings are not neutral, and our decisions are made according to the context in which we are embedded and our previous experiences [10]. We all have biases, and they need to be discussed and clarified. It is essential to provide implicit bias training (not just on gender) for everyone in academia [6,10]. Decision-makers need to make their choices based solely on competence; teachers also need to be trained to teach the people who will make decisions in the future to do so as fairly as possible.
Despite the strengths of the design, the present study has some limitations. The evaluators needed to agree to the use of the data they had previously sent. Unfortunately, a large number of evaluations could not be used because the evaluators did not sign the FPIC. Of the 104 researchers who completed the assessment, only 63 signed the form; 43.9% of the men and 31.6% of the women who completed the evaluation did not authorize the use of their data. Even though the results of this study showed a statistically significant difference in the evaluation of men and women within academia (with a power of 0.99), the high number of researchers who did not sign the FPIC meant that we could not reach the planned sample size, and the difference presented in the results could be even larger if more researchers had agreed to the use of their data. Another limitation is that gender was treated as a binary variable in this study, so information on gender bias may have been lost by not considering gender-diverse people.

Conclusions
We can conclude from the study findings that individual gender bias is prevalent in academia in the field of Dentistry, regardless of researchers' career stage. Actions such as implicit bias training must be urgently implemented to prevent (or at least reduce) further harm to women.

Declarations

Data Sharing Statement
A completely deidentified data set of this trial will be delivered to an appropriate data archive for sharing purposes.

Table 1. CV possibilities to be evaluated.
*CVs from the same career stage were strictly the same, except for gender identification.

Table 2. Characteristics of reviewers invited and reviewers who did the CVs' evaluation.*

Table 3. Poisson regressions by CVs' gender and career stage for each grade category.
*Each of the grade categories was included in a separate model containing gender and career stage.