To our knowledge, this systematic review is the first to characterize and appraise leadership development programs specifically among graduate medical education trainees. Of note, one prior systematic review appraised the strength of conclusions using the Best Evidence in Medical Education (BEME) Index, but did not appraise the methodology, framework, and results as a whole. To do so, we employed the MERSQI. The MERSQI is a validated and widely used instrument to assess the quality of educational interventions, and, among the most commonly used instruments (BEME, MERSQI, modified Newcastle-Ottawa Scale [m-NOS]), it is most strongly associated with study quality, as assessed by the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement.32 By using the MERSQI to inspect these studies more critically, our analysis informs educators about how to build upon what has been previously published to better structure leadership development programs in graduate medical education training programs.
The biggest limitation in the design of these studies is the lack of validity evidence (internal structure, content, and relationship to other variables). Only one article (Lee, Tse, and Naguwa, 2004) documented efforts to ensure the validity of the internal structure of its intervention. None sought to validate content or relationship to other variables. We strongly recommend that future studies critically examine and report the steps that they take to ensure validity. Admittedly, this is difficult given the absence of a single definition of leadership and the tendency for leadership to be viewed as a situationally and contextually dependent competency.3,4,5 However, this difficulty should not deter investigators from exploring and analyzing how the variables being measured may link to the concept of leadership.
Likewise, future studies need to examine patient/healthcare and behavioral outcomes. Only one of the 15 studies examined knowledge or skills (Whitman, 1988), while the others studied satisfaction, attitudes, and perceptions as outcomes. Because leadership is so intrinsically tied to behavioral patterns, evaluation of behavioral outcomes is essential.1,2 Similarly, leadership is consistently mentioned in the articles included in our analysis as potentially transformative for healthcare, yet the impacts of these interventions on healthcare outcomes are not measured or documented. This is an understandable limitation given the practical challenges of designing medical education studies, but it is difficult to interpret the significance of these studies without data regarding more meaningful outcomes that are more closely tied to leadership.
Of course, our analysis has some important limitations. First, because leadership encompasses several overlapping concepts, the foci of these studies differed slightly. Some articles did not break down what types of leadership skills were emphasized in these training programs, while others provided significantly more detail. This variability in content and focus underlines the importance of looking critically at leadership as a set of overlapping competencies. Moreover, it reinforces the need to scrutinize the study design and methodology of previously published studies, rather than their specific results, since it is unclear how much overlap there is in content between the curricula of the 15 included studies.
Second, the outcomes reported were largely self-reported through non-validated questionnaires. Except for Kuo’s report of the establishment of a three-year residency program, all of the included studies used either pre- and post-test knowledge-based assessments or self-assessments. Six of the studies that evaluated the course content and composition demonstrated that participants were satisfied, according to the authors’ conclusions. Additionally, six studies demonstrated a positive impact on participants’ own perception that they had learned about leadership skills. While these findings are helpful in determining what was learned and how learners viewed their experiences, they do not necessarily provide information about how leadership training impacts behaviors or institutional culture. The absence of follow-up beyond the initial training course in all but three studies also makes it difficult to determine what lasting impact these training programs had on participants.
Third, inclusion and exclusion criteria were not clearly elucidated in the included studies. In the absence of this information, it is difficult to assess selection bias or drop-out between training sessions. Similarly, demographic information regarding age and gender was missing in all but one study. These omissions preclude generalization of any particular conclusion about leadership training in graduate medical education.
Our systematic review does have certain methodological shortcomings. We limited our search to articles focusing on leadership, but due to the ambiguities regarding the precise definition of leadership, we may have missed articles related to “team leaders,” “managers,” “self-management,” or other topics within the realm of leadership training. It is therefore vital to establish clearer definitions of leadership in the context of healthcare and to articulate what competencies define physician leadership. Clearer definitions of leadership may help investigators better describe their efforts to uphold the validity of content, internal structure, and relationship to other variables.
Strengths of our systematic review include the use of multiple databases and the solicitation of other references both by searching the reference lists and by attempting to contact authors of the published material. The methodological rigor of the review was upheld through strict adherence to the PRISMA statement, and each study was evaluated with the MERSQI, a validated instrument for appraising the methodological quality of studies.