This study provides valuable evidence from a public sector teaching hospital in Pakistan regarding the reliability of 360-degree evaluation and its potential use in healthcare settings. The assessment scale showed good reliability across all three groups of evaluators, which is consistent with previous evidence (12, 15). The study compared the overall average evaluation scores on the selected scale as rated by three different types of evaluators. Notably, the highest scores were observed for the self-evaluations rated by the RTs themselves, whereas the lowest scores were assigned by the group of evaluators comprising faculty members from the same department. This discrepancy across the evaluator groups may have several explanations, including subjective interpretation of the items on a Likert scale and differences among the evaluators in their knowledge and understanding of interpersonal communication and professionalism. Moreover, faculty might have evaluated RTs more stringently than patients did because they were officially in a position to provide feedback to the RTs and because of their natural tendency as educators to critically appraise trainees' performance as part of the medical education process. Patients, on the other hand, might have given relatively better scores out of fear of negative consequences for their treatment and investigation outcomes if they rated the RTs poorly. A slightly different pattern was observed when the average evaluation scores were compared across evaluators for the scale items assessing professionalism only: the highest average scores were reported by the patients, followed by the RTs and the faculty.
The study also revealed moderate interrater reliability, with a moderate intraclass correlation coefficient for the evaluation scores indicating low variability and substantial agreement among the three evaluator groups. This finding is in line with previous evidence from Pakistan supporting the reliability of 360-degree evaluations conducted by multiple raters. A study conducted at Aga Khan University, Karachi, Pakistan, evaluated a cohort of Internal Medicine residents from all four training years between November 2009 and March 2010. Every resident (n = 49) was evaluated for interpersonal and communication skills by eight raters, including physicians, nurses, and unit staff, and each participant was also asked to provide a self-evaluation. The study calculated the reliability of those evaluations across the evaluator groups and found a correlation coefficient of 0.39, with fair consistency of ratings across different evaluators (15). Another 360-degree evaluation, of sixteen emergency medicine residents in Turkey, involved seven distinct categories of evaluators: residents, faculty members, nurses, ancillary staff, unit clerks, paramedics, and patients. That study included a total of 1088 evaluations and revealed high interrater reliability (> 0.7) across the different types of evaluators. The scores given by nurses were significantly lower than those given by other evaluators, while the highest scores were given by the paramedics (16). However, interrater reliability may vary widely across studies depending on the design and implementation of the evaluation tool and the type of evaluators (11).
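For readers unfamiliar with how an intraclass correlation coefficient is derived from a set of multi-rater scores, the sketch below may be helpful. It is illustrative only: it assumes a two-way random-effects, single-measure, absolute-agreement model (commonly denoted ICC(2,1)) applied to a small fabricated rating matrix, and does not reproduce the study's data or necessarily the exact ICC variant the study used.

```python
import numpy as np

def icc_2_1(ratings):
    """Two-way random-effects, single-measure, absolute-agreement ICC(2,1).

    ratings: 2-D array, rows = targets (e.g. individual RTs),
    columns = raters (e.g. faculty, self, patient).
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape                       # n targets, k raters
    grand = x.mean()
    row_means = x.mean(axis=1)           # per-target means
    col_means = x.mean(axis=0)           # per-rater means

    # Mean squares from the two-way ANOVA decomposition
    ms_rows = k * np.sum((row_means - grand) ** 2) / (n - 1)
    ms_cols = n * np.sum((col_means - grand) ** 2) / (k - 1)
    ss_err = (np.sum((x - grand) ** 2)
              - k * np.sum((row_means - grand) ** 2)
              - n * np.sum((col_means - grand) ** 2))
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)

# Fabricated example: 5 RTs each scored once by 3 evaluator groups
scores = [[4, 4, 5],
          [3, 3, 4],
          [5, 4, 5],
          [2, 3, 3],
          [4, 5, 5]]
print(icc_2_1(scores))  # prints the ICC for this toy rating matrix
```

Higher values indicate that the three evaluator groups rank and score the same performers similarly; a value near the study's reported range would correspond to moderate agreement under this model.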
The current study also computed interrater reliability for individual items in the evaluation scale and found moderate reliability for all ten items, with intraclass correlations ranging from 0.44 to 0.67. However, the faculty reported the lowest average scores for two items related to communication skills: introducing oneself to the patient and effectively explaining the risks and benefits of the procedure to the patient. These low scores indicate a lack of satisfaction among faculty members with the RTs' performance in communication skills. Interpersonal communication skills are believed to be among the most sensitive indicators of performance in healthcare settings because formal education and technical knowledge are not required to detect flaws or deficiencies in interpersonal communication. In addition, interpersonal communication is a major means by which healthcare providers convey empathy and responsiveness toward the patient; hence, any compromise in a provider's interpersonal communication can be identified easily. A retrospective analysis by Salazar et al. evaluated all radiology-related complaints received by their institution's Office of Patient Advocacy from April 1999 to December 2010. The qualitative analysis of patients' complaints revealed that a considerable proportion were related to a lack of patient-centred care and poor interpersonal skills of radiology staff members, identifying these as major areas for improvement (17). Similarly, a previous study conducted at the radiology department of Fayoum University Hospital (FUH), Egypt, between October and December 2016 assessed the quality of healthcare and patient satisfaction with the intent of highlighting areas for improvement. That study revealed that a lack of explanation of the radiological procedure to the patient was a major factor in patient dissatisfaction with radiological services (18). Another study, conducted in a public and a private hospital in Southeast Nigeria, revealed that the professionalism of radiographers influences patients' satisfaction and dissatisfaction with radiology services (19). In the current study, the low rating for introducing oneself to the patient was consistent across the faculty evaluations, the technologists' self-evaluations, and the patients' evaluations. The poor ratings for these items may be explained by the increased workload from high patient turnover, which leaves insufficient time for adequate communication with patients.
This study has several strengths. It evaluated the performance of RTs across a variety of procedures, with approximately 20 observations for each performer, thereby providing a holistic view of performance across procedures involving different machines and imaging protocols. This variety of observations per performer contributed to the high consistency of the assessments. Moreover, all assessments were conducted using a previously tested scale, which facilitated comparison of the study findings with previous evidence.
Nevertheless, this study has certain intrinsic limitations. First, the participants who were observed, i.e., the RTs, were aware that they were being observed, which might have led to deliberate changes or improvements in their usual practices or performance. This Hawthorne effect is an inherent limitation of the 360-degree evaluation approach and might have affected the results of the present study as well; it limits the validity of 360-degree evaluations, especially for behavioural outcomes. Second, despite prior assurance, the scores of evaluations conducted by patients might have been affected by their fear of compromising treatment or service quality and outcomes in the long run if they gave low scores to the service provider (the RT in this study). Third, this was a single-centre study, and the participants included were RTs only; hence, generalizability to other cadres of healthcare providers and to technologists in other specialties requires caution. Furthermore, the use of a five-point Likert scale might have affected the results due to differences in the subjective interpretation of the scale items as well as the natural tendency to avoid extreme options on a Likert scale.