Description of Included Studies
In total, 342 studies met our inclusion criteria (see Supplementary C and N for a complete list). In two cases, two publications were based on the same data but reported different outcome measures; each pair was therefore treated as a single study (Chao et al., 2016, 2017; Machů, 2015; Machů & Lukeš, 2019).
Most studies were conducted in North America (k = 205), followed by Europe (k = 68). The included studies comprised 158 journal articles, 166 dissertations, 12 project reports, and 6 conference papers. In total, the primary studies included 158,713 participants: 62,729 students, 77,787 teachers, and 18,197 members of class and school teams from different professions, such as teaching assistants and administrative staff. Sample sizes in the primary studies ranged from 3 to 31,000 participants, with a median of 77. Twenty-three studies were conducted with preschool teachers, 115 with primary school teachers, and 76 with secondary school teachers; 128 studies did not specify the school type where teachers were employed. Moreover, 188 studies applied a cross-sectional design, 99 a single-group pre-posttest design, 37 an independent-group pre-posttest design, and 18 an independent-group posttest design. Most studies focused on inclusive education for students with special educational needs (k = 288), 76 of which focused on specific special educational needs (e.g., autism, learning disabilities). The remaining 54 studies focused on other diversity features, such as second-language learners and gifted students, or addressed multiple categories of heterogeneity.
The professional development programs investigated in the 154 intervention studies ranged from 2 to 750 hours, with a median of 20 hours, and lasted between half a day and three school years, with a median of three months. Most programs addressed a specific topic (k = 112; primarily specific types of special educational needs) but usually did not target a specific subject (k = 126). Most intervention studies assessed the professional development program’s impact immediately after its end (k = 98); 24 programs offered a certificate to the participants after completing the program, and 51 offered coaching in addition to the training.
In total, 1,123 effect sizes were calculated and distributed as follows among the outcome categories: 88 effect sizes for knowledge, 371 for skills, 461 for assessed beliefs regarding inclusive education, and 203 for influences on student behavior (Fig. 2). No differences were observed for effect sizes calculated from means and standard deviations compared to those calculated from reported test statistics (all F < 3.8, all ps > .05, see Supplement F).
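The equivalence of the two computation routes can be illustrated with a short sketch. This is not the analysis code used in the meta-analysis; it assumes independent groups and the standard small-sample correction for Hedges' g, once computed from means and standard deviations and once recovered from a reported t statistic.

```python
import math

def hedges_g_from_means(m1, m2, sd1, sd2, n1, n2):
    """Standardized mean difference with small-sample correction (Hedges' g)."""
    sd_pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / sd_pooled
    j = 1 - 3 / (4 * (n1 + n2) - 9)  # correction factor J for df = n1 + n2 - 2
    return j * d

def hedges_g_from_t(t, n1, n2):
    """Recover g from a reported independent-samples t statistic."""
    d = t * math.sqrt(1 / n1 + 1 / n2)
    j = 1 - 3 / (4 * (n1 + n2) - 9)
    return j * d
```

For the same underlying data, both routes yield the same g, which is why pooling across reporting formats is unproblematic when no systematic differences are detected.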
Summary Effects
We calculated summary effects for each outcome category and investigated whether the subcategories within a category differed from each other. We observed significant positive effects in all four outcome categories (Fig. 3). The analysis of knowledge showed a large effect (g = 0.93 [0.76; 1.10]), with no difference between self-rated knowledge (g = 0.96 [0.70; 1.22]) and knowledge assessed using tests (g = 0.91 [0.68; 1.15]; F(1, 86) = 0.50, p = .48). A moderate effect was observed on skills to implement inclusive education (g = 0.49 [0.41; 0.56]) and on its subcategories (see Fig. 3); the subcategories (implementation quality, use of inclusive methods, self-efficacy for inclusive teaching) did not differ from each other (F(2, 368) = 0.15, p = .86).
We observed a small but significant positive effect on beliefs toward inclusive education (g = 0.23 [0.17; 0.28]), again with no differences between its subcategories (F(2, 457) = 2.34, p = .10). Positive effects of professional development participation were observed for attitude (g = 0.23 [0.18; 0.28]) and perception of inclusive teaching methods (g = 0.27 [0.16; 0.39]), whereas no significant effect was observed for concerns about inclusive education (g = 0.08 [-0.14; 0.30]). A small-to-moderate effect of teachers' participation in professional development was observed on students' behavior (g = 0.37 [0.23; 0.51]), with no difference between student achievement (g = 0.41 [0.22; 0.61]) and other student behavior (g = 0.29 [0.11; 0.46]; F(1, 201) = 0.001, p = .98).
Next, we investigated heterogeneity within each outcome category; the Q-test was significant for all four (Table 1; for more detailed information, see Supplement E). Most variance was located at the between-study level, except in the beliefs category, where variance was mainly present at the within-study level (53.21%). These results warranted moderator analyses in all four outcome categories.
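The variance split reported in Table 1 can be approximated with a short sketch. This is a simplified illustration of a three-level decomposition (following the common multilevel I² approach with a "typical" sampling variance), not the exact model used here; the variance components and sampling variances below are hypothetical.

```python
def variance_distribution(sigma2_within, sigma2_between, sampling_vars):
    """Approximate three-level variance split: shares of total variance
    attributable to sampling error, within-study heterogeneity, and
    between-study heterogeneity."""
    k = len(sampling_vars)
    w = [1 / v for v in sampling_vars]
    # "typical" sampling variance (Higgins-Thompson style)
    typical_v = (k - 1) * sum(w) / (sum(w) ** 2 - sum(x**2 for x in w))
    total = typical_v + sigma2_within + sigma2_between
    return {
        "sampling (%)": 100 * typical_v / total,
        "within-study (%)": 100 * sigma2_within / total,
        "between-study (%)": 100 * sigma2_between / total,
    }
```

A category such as beliefs, where the within-study share exceeds the between-study share, indicates that effect sizes vary more within studies (e.g., across instruments) than between studies.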
Table 1
Overview of Summary Effects and Variance
| Outcome category | k | N | Effect size | 95% CI | Q | p(Q) | Within-study variance (%) | Between-study variance (%) |
|---|---|---|---|---|---|---|---|---|
| Knowledge | 50 | 88 | 0.926 | 0.76–1.10 | 1482.96 | < .0001 | 10.24 | 83.35 |
| Skills | 153 | 371 | 0.485 | 0.41–0.56 | 7446.14 | < .0001 | 34.19 | 63.76 |
| Beliefs | 200 | 461 | 0.230 | 0.18–0.28 | 2745.30 | < .0001 | 53.21 | 32.59 |
| Student behavior | 51 | 203 | 0.372 | 0.23–0.51 | 3339.81 | < .0001 | 39.36 | 58.51 |

Note. k indicates the number of studies reporting data in the corresponding outcome category; N indicates the number of effect sizes per category.
Moderator Analyses of Study Characteristics
The results of the moderator analyses for the four outcome categories are summarized in Table 2 (see Supplements J–M for more detailed information). Publication year, years since the legal anchoring of inclusive education, and the continent where the studies were conducted did not influence the observed effects. Effects on knowledge and beliefs were not influenced by the control variables describing the study characteristics. Effects on skills differed between intervention studies and cross-sectional studies, with the former reporting significantly larger effect sizes (g = 0.56 [0.46; 0.66]) than the latter (g = 0.36 [0.27; 0.46]).
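Conceptually, each moderator test regresses effect sizes on a study characteristic, weighting each effect by its precision. A minimal fixed-effect sketch with a single continuous moderator (deliberately ignoring the three-level structure and the F-tests reported in Table 2) might look like this; all inputs are hypothetical.

```python
def weighted_meta_regression(effects, variances, moderator):
    """Minimal fixed-effect meta-regression: weighted least squares of
    effect sizes on one moderator, with weights 1/variance.
    Returns (intercept, slope)."""
    w = [1 / v for v in variances]
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, moderator)) / sw
    my = sum(wi * y for wi, y in zip(w, effects)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, moderator))
    sxy = sum(wi * (x - mx) * (y - my)
              for wi, x, y in zip(w, moderator, effects))
    slope = sxy / sxx          # change in g per unit of the moderator
    intercept = my - slope * mx
    return intercept, slope
```

The slope corresponds to the B coefficients reported below (e.g., B = 0.11 for active learning), i.e., the expected change in g per unit increase in the moderator.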
Table 2
Overview of Moderator Analyses in the Outcome Categories
| Moderator | Knowledge F | df | p | Skills F | df | p | Beliefs F | df | p | Student behavior F | df | p |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Study design | | | | | | | | | | | | |
| Publication year | 1.06 | 1, 86 | .31 | 3.22 | 1, 368 | .07 | 0.08 | 1, 458 | .77 | 2.56 | 1, 200 | .11 |
| Years since legal anchoring | 0.22 | 1, 86 | .64 | 0.12 | 1, 368 | .74 | 0.91 | 1, 458 | .34 | 0.60 | 1, 200 | .44 |
| Intervention studies | 3.73 | 1, 86 | .06 | 8.13 | 1, 369 | .005 | 0.04 | 1, 458 | .84 | 0.16 | 1, 201 | .69 |
| Continent | 0.67 | 6, 81 | .68 | 1.03 | 8, 362 | .41 | 1.60 | 8, 451 | .12 | 2.27 | 4, 198 | .06 |
| Data collection | | | | | | | | | | | | |
| Time between training and data collection | 0.38 | 1, 75 | .54 | 1.63 | 1, 257 | .20 | 0.41 | 1, 210 | .52 | 2.72 | 1, 194 | .10 |
| Instrument focus: diversity feature | 6.57 | 1, 86 | .01 | 0.00 | 1, 369 | .95 | 6.21 | 1, 458 | .01 | 1.08 | 1, 201 | .30 |
| Instrument focus: method | 0.03 | 1, 86 | .86 | 0.86 | 1, 369 | .36 | 6.34 | 1, 458 | .01 | 0.15 | 1, 201 | .70 |
| Instrument type | 1.99 | 2, 85 | .14 | 1.40 | 5, 365 | .22 | 1.57 | 3, 456 | .20 | 8.53 | 4, 198 | < .001 |
| Participant characteristics | | | | | | | | | | | | |
| Mean age | 1.66 | 1, 34 | .21 | 0.84 | 1, 89 | .36 | 0.03 | 1, 170 | .87 | 0.07 | 1, 41 | .79 |
| Inclusive teaching experience | 3.89 | 1, 39 | .06 | 2.16 | 1, 214 | .14 | 0.08 | 1, 281 | .77 | - | - | - |
| School type | 0.57 | 3, 84 | .64 | 1.57 | 3, 366 | .20 | 2.66 | 3, 456 | .048 | 2.58 | 3, 198 | .06 |
| Professional development | | | | | | | | | | | | |
| Content focus | 0.01 | 1, 75 | .94 | 0.02 | 1, 257 | .90 | 0.38 | 1, 210 | .54 | 2.55 | 1, 194 | .11 |
| Active learning | 0.00 | 1, 75 | .98 | 7.84 | 1, 368 | .005 | 2.14 | 1, 210 | .15 | 0.02 | 1, 194 | .90 |
| Coherence | 0.70 | 1, 75 | .41 | 2.10 | 1, 368 | .15 | 0.91 | 1, 210 | .34 | 1.68 | 1, 194 | .20 |
| Duration | 0.02 | 1, 75 | .89 | 0.56 | 1, 245 | .46 | 1.05 | 1, 209 | .31 | 0.01 | 1, 188 | .92 |
| Collective participation | 1.89 | 1, 75 | .17 | 0.07 | 1, 257 | .79 | 0.85 | 1, 210 | .36 | 3.86 | 1, 194 | .05 |
| Certification | 8.27 | 1, 75 | .005 | 0.74 | 1, 257 | .39 | 0.93 | 1, 209 | .34 | 0.07 | 1, 194 | .79 |

Note. Degrees of freedom differ based on the information available in the studies.
Moderator Analyses of Data Collection Characteristics
No differences were observed based on the number of weeks between the last session of the professional development program and post-data collection (Table 2; for more detailed information, see Supplements J–M). Effects on knowledge were influenced by the type of instrument used: Studies applying instruments focusing on a specific diversity feature reported smaller knowledge gains (g = 0.75 [0.47; 1.02]) than studies applying instruments focusing on addressing diversity features in inclusive education (g = 1.04 [0.83; 1.24]). Effects on tested knowledge were larger when assessed with (mainly self-developed) surveys (g = 1.10 [0.80; 1.39]) than with questionnaires and single items (g = 0.63 [0.28; 0.98]; F(2, 48) = 3.54, p = .04).
Effects on skills were not influenced by variables describing the data collection. Regarding beliefs, effects differed based on the instruments applied, with larger effect sizes observed when instruments focused on specific teaching methods (g = 0.51 [0.41; 0.60]) than when they focused on the implementation of inclusive education in general (g = 0.22 [0.17; 0.27]). The type of measurement influenced effects on student behavior, with studies applying observational measures reporting larger effects (g = 0.69 [0.34; 1.03]) than studies using teachers' self-reports (g = 0.16 [-0.09; 0.40]).
Moderator Analyses of Participant Characteristics
None of the variables describing participant characteristics influenced the observed effect sizes at the level of the outcome categories (Table 2). At the subcategory level, school type influenced effect sizes for attitudes toward inclusive education (F(3, 333) = 3.29, p = .02): We observed positive effects for primary (g = 0.24 [0.17; 0.32]) and secondary (g = 0.37 [0.25; 0.48]) school teachers but no effect for kindergarten teachers (g = -0.03 [-0.55; 0.49]). Two further subcategories, tested knowledge and use of inclusive teaching methods, were influenced by participant characteristics: The higher the mean age, the smaller the effect observed for tested knowledge (F(1, 23) = 8.50, p = .01, B = -0.11, SE = 0.04), and the more inclusive teaching experience the teachers reported, the smaller the effects on the use of inclusive teaching methods (F(1, 88) = 10.91, p = .001, B = -0.02, SE = 0.007).
Moderator Analyses of Professional Development Design
Content focus did not influence any outcome category, but it did influence a subcategory of student behavior: Changes in student achievement were positively influenced by higher content focus (F(1, 90) = 5.66, p = .02, B = 0.25, SE = 0.11). Active learning was a significant moderator of skills (F(1, 368) = 7.84, p = .005): Programs with more active learning opportunities reported larger effect sizes (B = 0.11, SE = 0.04). Specifically, the indicator alternating versus blocked design explained variance in effect sizes reflecting changes in the use of teaching methods (F(1, 88) = 9.03, p = .004), with larger changes reported in programs alternating input and practice phases (g = 0.78 [0.57; 0.99]) than in programs with a blocked design (g = 0.03 [-0.40; 0.46]).
Coherence with other learning activities had no influence at the category level; however, changes in the perception of inclusive teaching methods were positively influenced by additional coherent learning activities (F(1, 60) = 4.80, p = .03, B = 0.11, SE = 0.07). Analyses of the coherence indicators showed that programs requiring teachers to fulfill prerequisites for participation reported larger changes in the subcategories perception of teaching methods (g = 0.60 [0.33; 0.87]; F(1, 59) = 13.58, p = .005) and student achievement (g = 0.71 [0.18; 1.24]; F(1, 90) = 5.20, p = .03) than programs open to all teachers (g = 0.13 [0.02; 0.24] and g = 0.25 [0.09; 0.42], respectively).
Because of the large differences between programs (range 2–750 hours), we limited the duration analyses to programs lasting up to 200 hours, representing about two-thirds of all effect sizes (64.6%), to reduce the influence of extreme programs. Following this reduction, we did not observe any influence of training duration on the outcome categories or subcategories (Table 2 and Supplements J–M). Collective participation did not influence any outcome category but negatively influenced the subcategory student achievement (F(1, 90) = 7.71, p = .01, B = -0.21, SE = 0.08). When all school personnel participated, no effects on student achievement were observed (g = 0.06 [-0.06; 0.17]); however, small-to-moderate effects were observed when class teams participated (g = 0.30 [0.10; 0.50]), and moderate effects were noted when teachers participated with one colleague (g = 0.66 [0.16; 1.15]) or without colleagues (g = 0.74 [0.10; 1.39]). Studies offering certification after successful completion of the program observed larger effect sizes for knowledge gain (g = 1.39 [1.07; 1.72]) than programs without certification (g = 0.86 [0.65; 1.08]), but certification did not influence the other outcome categories.
Study Quality
The risk of bias in the included studies was generally high (M = 3.74, SD = 1.36, Min = 1, Max = 8.5). Although moderator analyses indicated that study quality did not influence the observed effect sizes in the four outcome categories (all F < 1.4, all ps > .2; see Supplement G), an influence was observed in the subcategories self-rated knowledge (F(1, 35) = 11.06, p = .002, B = -0.25, SE = 0.08) and use of inclusive teaching methods (F(1, 133) = 4.89, p = .03, B = -0.12, SE = 0.05), where lower risk of bias was related to smaller effect sizes.
Publication Bias
Regarding the presence of publication bias, visual inspection of the contour-enhanced funnel plots indicated that the individual effects were roughly symmetrical (see Supplements F, H, and I). Most effects fell within the 99% confidence interval, and outliers stemmed from both published and unpublished studies. Egger regression tests suggested symmetry of the funnel plots for all outcome categories and subcategories (all F < 1.3, all ps > .2). The power-enhanced funnel plots (Fig. 4) illustrate substantial differences between studies in the power to detect the estimated effect sizes. A few studies with very low power were included, but these reported effect sizes within the normal range. Most studies had low power, especially those assessing beliefs (median power = 21.5%) and student behavior (median power = 41.8%); power was moderate in studies assessing skills (median power = 69.2%) and sufficient in studies assessing knowledge (median power = 90.2%). Publication status did not predict effects on knowledge (F(1, 86) = 0.88, p = .35) or skills (F(1, 368) = 2.07, p = .15), but it moderated effects on beliefs (F(1, 457) = 6.85, p = .01) and student behavior (F(1, 200) = 4.77, p = .03): Larger effects were reported in published studies (beliefs: g = 0.30 [0.22; 0.38]; student behavior: g = 0.49 [0.28; 0.71]) than in unpublished studies (g = 0.19 [0.13; 0.24] and g = 0.17 [0.07; 0.28], respectively).
Taken together, these analyses suggest that publication bias is present to a varying degree across the outcome categories and subcategories of the current meta-analysis. However, two typical sources of publication bias do not exert a large influence on the estimated effect sizes, as studies with small sample sizes and low power report effect sizes within the normal range. Moreover, because more than half of the included effect sizes stem from unpublished studies (53%), the risk of publication bias was reduced by design.
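An Egger-type asymmetry check like the one reported above can be sketched as a regression of standardized effects on precision, where an intercept far from zero suggests funnel-plot asymmetry. This simple ordinary-least-squares version ignores the multilevel structure of the data and is purely illustrative.

```python
import statistics

def egger_test(effects, ses):
    """Classic Egger regression: regress z = g / SE on precision = 1 / SE.
    Returns (intercept, slope); a nonzero intercept indicates asymmetry."""
    z = [g / se for g, se in zip(effects, ses)]
    precision = [1 / se for se in ses]
    mean_x = statistics.mean(precision)
    mean_y = statistics.mean(z)
    sxx = sum((x - mean_x) ** 2 for x in precision)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(precision, z))
    slope = sxy / sxx            # estimate of the underlying effect
    intercept = mean_y - slope * mean_x  # asymmetry indicator
    return intercept, slope
```

With a perfectly symmetric funnel (identical true effect at every precision level), the intercept is zero and the slope recovers the common effect, which matches the nonsignificant Egger results reported for all categories here.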