Effects of the accent on the grades
This study demonstrates that the NNEA could either positively or negatively affect the global scores in OSCEs at a statistically significant level. However, there was no evidence that the NNEA influenced the scores for checklist items. This indicates that the accent could affect the global score and hence the standard setting process but not necessarily impact the assessment of the individual student.
All analyses indicated that the checklist sums were not affected by the accent variable. Checklist items are easier to observe than domain-based items, and the decision to award a mark is not influenced by the examiner’s subjective judgment to the same degree as for domain-based items or global scales. This could explain the consistency in the checklist sums when the accent variable was changed. When examiners made judgments on the global scores, i.e. the overall impression of the student, they were required to decide whether the student was competent overall. This process would be guided by idiosyncratic experience shaped by cultural and professional norms. This could lead to stereotype activation while a general impression was being formed during marking. The effect of accents on judgment has been supported by previous psychological research but has never been identified in medical education.
This study presented interesting data regarding the direction of the disparity in the global scores. It could be speculated that the NNEA activated the examiners’ negative stereotypes, as described in the literature. However, this is contradicted by the data suggesting that the NNEA also increased the chance of receiving positive global scores. One explanation is that, although the accent could trigger a bias, the mechanism behind it was not exclusively negative out-group categorisation.
People are very sensitive to speech differences and can quickly identify someone as having a ‘foreign’ accent. Beyond this initial assignment of stereotypical group membership, it is necessary to consider what happens next. It has been demonstrated that when native listeners encountered speakers with NNEAs, they accommodated to the features of the accented speech. Native listeners expected non-native speakers to speak more slowly than native speakers, and they preferred speech at the expected rate to speech at a much faster rate.
If examiners in this study recognised the NNEA, they might have accommodated to the difference in accent while they were marking. This accommodation could have helped examiners to follow the student even though intelligibility was reduced, thereby lessening the effect of the NNEA.
Another study found that native listeners were likely to attribute any divergence in speech patterns to the speaker being non-native. This implies that when a non-native person makes a communication error, it is more likely to be attributed to their language background than to their ability. In this study, examiners could have perceived unsatisfactory communication as a consequence of the student being non-native. If this was the case, then the NNEA could have created a positive bias.
This is a slightly different mechanism from the stereotype activation discussed so far. It would be of interest to see how far the active effort to understand speech with an NNEA and the prior recognition of non-native status interact with stereotyping.
Another possibility is that the activated stereotype was associated with positive characteristics such as empathy or friendliness. This would be an example of a stereotype carrying positive associations.
An alternative explanation is the contradictory effect of consciously held stereotypes. If an examiner initially perceives a student with an NNEA as ‘foreign’ and therefore ‘underperforming’, what would happen if the student’s performance disconfirms this stereotype? There has been relevant research in the field of consumer behaviour. When a product’s performance is higher than the pre-purchase expectation, it leads to a positive consumer response. If this theory is applied to this study, students with NNEAs could be assessed more positively than their native peers when both perform similarly well.
Although this study was a small-scale pilot experiment, the findings showed a disparity in the global scores due to an NNEA. This could prompt further discussion of OSCE reliability and bias, and it also carries wider implications for the increasingly diverse medical student population in the UK.
The experiment showed that the accent could lead to both negative and positive bias. As previously mentioned, it is not straightforward to decide whether marking down due to NNEAs is justifiable. It could be argued that decreased comprehensibility due to NNEAs impairs communication with a patient. However, the line between the accent being a real communication barrier and a ‘perceived’ issue for the examiner is hard to draw.
Meanwhile, the presence of a positive bias has an interesting implication. No previous research has described that medical students with NNEAs could be at an advantage. This trend needs to be examined in the same manner as the negative bias. In other words, did the examiners feel that having an accent assisted the clinical examination or interview, or did they unconsciously favour the accent?
This study also initiates further discussion on what the consensus is among the medical education community regarding the communication variability within a standardised clinical assessment. Whether the observed positive and negative bias was unconscious or not, it is crucial to consider how the examiners and educational authorities should treat this disparity.
This study also has implications for current and prospective medical students. Identification of the accent bias may lead non-native students to perceive that they could be either advantaged or disadvantaged despite being similarly competent to their native peers. Burgess et al. described a ‘stereotype threat’: when a student recognises the existence of a negative stereotype towards his/her characteristics, this leads to an unconscious hindrance in performance. Unlike the negative bias towards NNEAs, the positive bias has not previously been acknowledged. It is difficult to predict what implications this finding would have for students. It might be the case that some students with socially favoured accents would maintain their communication and speech styles, whereas others whose accents carry negative associations would suppress their original speech style in OSCEs.
It is not just the diversity of medical students that needs further consideration. Of all doctors who obtained full registration with the General Medical Council (GMC), 58% qualified outside the UK. The proportion of international medical graduates practising in the UK has increased since the 1960s.
These statistics mean that the number of examiners who are international medical graduates could also increase. This could affect how NNEAs influence examiners, given that stereotyping depends on the individual’s social identity. How communication divergence is evaluated also differs depending on the native or non-native status of the listener. It might be the case that the effect of NNEAs in the current OSCE would change as the number of examiners trained abroad increases.
Similarly, there has been a rise in the migrant population in the UK. The significance of NNEAs in patient communication may change, ultimately leading to a shift in the definition of clinical competence.
Due to time restrictions, the number of examiners in the study was low. The analysis demonstrated statistically significant findings in the global scores. However, the number of ‘Pass’ and ‘Borderline’ grades, the two most common grades given, was equal for the videos with and without the NNEA. Therefore, it is likely that the statistical findings arose from a difference of one count in the ‘Good’ and ‘Fail’ grades. Given the small sample size, the possibility that this was due to chance is difficult to exclude even with the use of the p value.
The global score is not the direct determinant of a student’s outcome in an OSCE but is used as the marker in the borderline regression method to set a pass mark. Borderline regression is not reliable when the data set is drawn from a small cohort of fewer than 50. Due to the low number of examiners involved in this study, it is difficult to say whether the observed change in the global scores due to accent would have any actual effect on the OSCE outcome. Conducting a similar experiment with more examiners would be important in identifying a reliable relationship between NNEAs and OSCE marks.
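The role of the global score in standard setting can be illustrated with a minimal sketch of borderline regression: checklist sums are regressed on the numerically coded global grades, and the pass mark is the checklist sum predicted at the ‘Borderline’ grade. The numeric coding of the grades, the function names and all marks below are hypothetical and are not taken from this study; the sketch only shows why a shift in a single global grade can move the pass mark when the number of markings is small.

```python
# Minimal sketch of borderline regression (illustrative only).
# ASSUMPTION: the grade-to-number coding below is hypothetical.
GRADE_SCALE = {"Fail": 0, "Borderline": 1, "Pass": 2, "Good": 3}


def least_squares(x, y):
    """Return (slope, intercept) of the ordinary least-squares line y = a*x + b."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all global grades are identical; slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def borderline_regression_pass_mark(global_grades, checklist_sums):
    """Pass mark = checklist sum predicted at the 'Borderline' global grade."""
    x = [GRADE_SCALE[g] for g in global_grades]
    slope, intercept = least_squares(x, checklist_sums)
    return intercept + slope * GRADE_SCALE["Borderline"]


if __name__ == "__main__":
    # Hypothetical markings: identical checklist sums, one global grade shifted.
    sums = [9, 12, 13, 16, 17, 21]
    grades_a = ["Fail", "Borderline", "Borderline", "Pass", "Pass", "Good"]
    grades_b = ["Borderline", "Borderline", "Borderline", "Pass", "Pass", "Good"]
    print(borderline_regression_pass_mark(grades_a, sums))
    print(borderline_regression_pass_mark(grades_b, sums))
```

With so few markings, each global grade carries considerable weight in the fitted line, which is consistent with the caution above about applying the method to small samples.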
Although the examiners were blinded to the study aim, some might have deduced it during the experiment. There were only two videos for each examiner, so they could have noticed that the speech in the videos was different. This could have introduced bias, as examiners might have made a more conscious effort to minimise the variation in their marking.
As shown in the results, the pattern of the score change for checklist sums was different for group 1 and group 2. It could be argued that the order in which the examiners watched scripts 1 and 2 influenced how the performance was perceived. This might have been a confounding variable.
Using more videos with variations in accents, student characteristics and scripts would improve the blinding. The effect of script order would also be reduced. This was not feasible in this study due to resource constraints. Examiners could also be asked, after submitting their marks, to comment on what they thought the study was investigating. This information would aid in evaluating the validity of the study.
In this study, the language background of the examiners was not recorded. This information would be valuable because native or non-native status is highly relevant to how people assess NNEAs.
It was not possible to look at the effect of the demographic information in this study since two markings from each examiner were used in the analysis. The impact of demographics could be analysed by grouping the data into smaller subsections of the participant group. This would allow the demographic information and the accent variable to be treated as isolated variables. More examiners are required to conduct this analysis reliably.
It would also be valuable to investigate the effect on the simulated patient evaluation. The mark scheme used in the experiment had one item for empathy which the simulated patient was asked to contribute to. Analysis of this item would explore the influence of NNEAs on the assessment marks comprehensively. It would also show the influence of accents on the wider public.
In this study, one type of NNEA was compared with an NEA. Researchers have suggested that perceived stereotypes could differ markedly depending on the type of accent. Therefore, further research into the effect of several types of NNEA on examiners is required.
A quantitative study alone would not provide insight into why NNEAs might be influencing the examiners. Qualitative methods such as interviews should be considered to explore this issue.
Another point to consider is how NNEAs should be interpreted in light of clinical competence. It would benefit students, medical educators and patients to discuss how NNEAs in the clinical context are viewed by all stakeholders. This would inform the training of OSCE examiners in the future.