Background:
The reliability of clinical assessments is known to vary considerably, with inter-rater reliability a key contributor. Many of the mechanisms underlying inter-rater variability, however, remain poorly understood. While research in other fields suggests that the personality of raters can influence ratings, few studies have examined personality factors in clinical assessments. Many schools pair examiners in clinical assessments and ask them to agree a single score, yet little is known about what occurs when these paired examiners interact to generate that score. Could personality factors have an impact?
Methods:
A fully crossed design was employed, with every participant examiner observing and scoring every candidate. A quasi-experimental research design used candidates' observed scores in a mock clinical assessment as the dependent variable. The independent variables were the number of examiners, examiner demographics and examiner personality, with data collected by questionnaire. A purposeful sample of doctors who examine in the Final Medical examination at our institution was recruited.
Results:
Variability between scores given by examiner pairs (N=6) was less than the variability between individual examiners (N=12). 75% of examiners (N=9) scored below average for neuroticism, and 75% also scored high or very high for extroversion. The higher an examiner's personality score for extroversion, the less his/her score changed when paired with a co-examiner, possibly reflecting a more dominant role in the process of reaching a consensus score.
Conclusions:
While the variability between scores given by examiner pairs (N=6) was less than the variability between individual examiners (N=12), the reliability statistics for the two assessment formats were comparable. However, using paired examiners produced a more accurate and robust score than simply averaging two independent examiners' scores. The higher an examiner's personality score for extroversion, the less his/her score changed when paired with a co-examiner, possibly reflecting a more dominant role in reaching a consensus score. These findings could have implications for the organisation and administration of clinical assessments. Further studies with larger numbers of participants might establish whether personality testing should be adopted when selecting examiner pairs.

Figure 1

Figure 2

Figure 3