An observation study of the implementation and quality of a policy-based early childhood education program for newly arrived refugee children in Germany

Early childhood education [ECE] can support the development and adjustment of refugee children - though the large numbers of newly arriving refugee families challenge the ECE capacities of host countries. In Germany, a specialized ECE policy funds programs for refugee families at scale. As that policy provides only few regulations, implementation and ECE quality at program level remain unknown. In formative research, we categorized programs and assembled ECE quality measures. We then examined ECE quality in standardized observation procedures. The policy-based programs were implemented differently: in mobile concepts or temporary set-ups, in improvised settings, or in formal settings for education. For ECE quality, we found that structural quality depended on implementation settings while process quality, referring to social-emotional support, was consistently high. Our research process revealed a dearth of tools to comparatively assess ECE quality among heterogeneous programs at population and program levels with respect to the needs of specific target groups. Flexible programs with limited regulations, allowing locally adapted implementation strategies, are likely an innovative approach to support refugee children's development after migration.


Main Text
In Germany, more than 163,000 children under the age of six, fleeing from Syria, Afghanistan, and Iraq, applied for asylum in 2016, the year net immigration peaked (BAMF, 2017). Since then, immigration of families with young children has continued, along with newly arrived mothers giving birth after arrival.
While the early years of life lay foundations for long-term development, chaos, threat and deprivation in conjunction with displacement can threaten early child development [ECD] (Fazel et al., 2012; Park, Katsiaficas & McHugh, 2018). Evidence from Germany indicates that young children from newly arrived refugee families demonstrate higher rates of socio-emotional problems (Buchmüller et al., 2018) and lower levels of cognitive and host country language development, but not of motor development (author, under review). Slow developmental progress puts lifelong equity in education and well-being at risk and calls for additional action to support refugee children's ECD.

ECD programs promote positive youth development
Recent migration of young refugee children has focused the attention of practitioners, policy makers and researchers on the early childhood education [ECE] sector as a means of addressing early developmental disparities. A body of evidence indicates that investments in lifespan development yield the largest long-term benefits if made during the early years (e.g., Anders, 2013; Schweinhart et al., 1993). Previous studies showed that especially disadvantaged children can benefit from ECD programs, which enhance their developmental potential (Sincovich, 2019; Weiland & Yoshikawa, 2013; Winsler et al., 2008). ECD programs subsume a range of child- and caregiver-centered initiatives that aim to facilitate developmental growth and pre-academic learning of children below school age. Therefore, ECD programs not only stimulate children's motor, social, emotional, language and cognitive skills (High, 2008) but also facilitate adjustment processes and provide resources to family systems. Previous studies suggested that fostering socio-emotional learning and host country language acquisition via ECD programs has a positive impact on later academic achievements of immigrant and dual-language learning children (Castro et al., 2011; Votruba-Drzal et al., 2015). Notably, such effects seemed overall larger for immigrant than for non-immigrant populations (Hancock et al., 2012; Weiland et al., 2013). ECD programs could thus contribute to sustainably mitigating the developmental, educational and socio-emotional difficulties that have been found for refugee children.
Although the cited evidence documents ECD program effects on short-term and lifetime developmental trajectories, there is a larger debate on how those effects emerge and how they are determined. ECD programs are overall conceptualized to support young children through direct and indirect pathways. Indirectly, programs support child ecologies, largely through addressing caregivers, and thus especially promote ECD of children at risk (e.g., Lee et al., 2006). Correspondingly, Marti and colleagues (2018) found positive effects of caregivers' program involvement on child outcomes. Directly, ECD programs facilitate ECD through the provision of nurturing environments and learning opportunities in stimulating interactions with program organizers and other children. Dosage effects substantiate such direct pathways: the frequency of child program attendance predicts child development outcomes (Zaslow et al., 2016), especially for children from disadvantaged families. In the case of refugee children, ECD programs moreover facilitate the transfer of socio-cultural knowledge and practices of host countries after arrival (New et al., 2015). Still, evidence is required on refugee-targeted ECD programs and the distinct determinants of program effectiveness (Murphy et al., 2018).

Early childhood education quality determines effectiveness for child development
Previous research demonstrated that the effectiveness of ECD programs overall depends on ECE quality (Burchinal et al., 2000; Büchner & Spiess, 2007; Sammons et al., 2014). Low quality in ECD programs was associated with no or even detrimental effects on child outcomes (Britto, Yoshikawa & Boller, 2011). ECE quality subsumes the structural and process characteristics of a program. Structural quality of the ECE environment includes physical (e.g., group, staff, and equipment), spatial (e.g., location), and temporal characteristics (e.g., schedule and routines; Thomason & Paro, 2009). Process quality encompasses social, emotional, and instructional characteristics, mainly conveyed through caregiver-child interactions (Howes et al., 2008). Process quality can be further separated into instructional support (i.e., cognitive stimulation and pre-academic activity) and social-emotional support (i.e., feelings of comfort and security, positive social interactions). Previous studies demonstrated distinct effects of both structural and process quality on children's academic and socio-emotional development (Anders et al., 2013; Bradley et al., 2001; Trawick-Smith et al., 2016). Beyond main effects, structural quality is considered to lay the groundwork for effects of high process quality (Burchinal, 2018). Process quality was found to be the primary driver of ECD elicited in ECE (Slot, 2015). Still, only a few studies, with mainly qualitative approaches, specifically inform on the relevant structural and process characteristics of ECD programs for refugee children (Hurley et al., 2013; Hurley et al., 2014). Those studies emphasized the importance of distinctive structural characteristics for refugee children, such as clear routines and schedules, frequent use of symbols for communication and self-expression, and links to local social service providers.
Beyond structural characteristics, ECE staff mentioned distinctive components of process characteristics for refugee children. These were high responsiveness and supportive interactions, given children's high risk for socio-emotional problems. As most of the refugee children were dual language learners, staff moreover mentioned that interactions with a focus on language are especially important. Both studies by Hurley and colleagues aggregated idiosyncratic evidence, reflecting experiences of ECE staff working with refugee children in diverse ECD programs with overall unknown ECE quality.

Identifying ECE quality among diverse ECD programs
Measuring ECE quality, however, has been a difficult endeavor for several reasons. First, ECD programs can have rather different conceptual orientations. While some programs have more holistic orientations focusing on indirect effects (i.e., supporting family systems or the child ecologies in which ECD occurs), others are more specific in their goals and exclusively child-directed (i.e., center-based child groups). Second, ECD programs can be universal, for all children, or rather specific, targeting particular groups of children and families at risk. More than in previous ECE research, such conceptual differences need to be considered in the construction and administration of ECE quality measurement. If not, results on ECE quality are likely a function of the program concept or the implementation setting.
Previous ECE quality observation tools adhere to program regulations and are embedded in particular contexts, especially ECE systems that fall within the reach of stakeholder authorities or follow certain ECE paradigms. For example, the Early Childhood Environment Rating Scale (ECERS-R; Harms et al., 2015) is a widely used observation tool designed to examine mainly structural characteristics of state-funded and center-based preschool groups in Western, high-income countries (see Betancur et al., 2021 for a discussion and cross-context adaptation). In contrast, playgroups are more heterogeneous and flexible ECD programs which tend to emphasize social learning goals (e.g., connecting caregivers and children with the community, fostering a sense of belonging) and joyful activities over children's progress in pre-academic learning. Distinct from center-based preschool programs, playgroups are typically set up in informal settings and directly engage caregivers (Sincovich et al., 2019). Substantially less work has investigated integrating playgroup models and standardized tools for measuring ECE quality among playgroups (Commerford & Robinson, 2016). One reason might be that it is more difficult to propose univocal guidelines given the diverse concepts and goals among playgroups to support ECD. Some previous work (Commerford & Hunter, 2017; Jackson, 2013) proposed core playgroup principles generated from workshops and focus groups. Key concepts of those principles are to appropriately stimulate early childhood experiences, increase parental knowledge on ECD and learning, facilitate social networks, support transitioning into education, and provide resources as well as referral to appropriate services. For low- and middle-income countries, international initiatives recently generated sets of items measuring ECE quality of playgroup-like ECE services. Those sets, however, are developed along with specific ECD program curricula or distinctively for low-resource contexts and blend structural with process quality aspects for feasibility and easy administration (UNESCO, 2017).
Taken together, previous evidence on policy-based ECD programs has limitations relevant to our research study. First, ECE quality of diverse ECD programs, ranging from center-based preschool programs to playgroups, is seldom considered specifically for heterogeneous target groups and implementation contexts.
However, considering ECE quality as adaptive, or context-dependent, could better contribute to understanding the range of impact and impact heterogeneity among different ECD programs. This is especially important when implementing programs in emergency contexts (e.g., Child Friendly Spaces, Meltzer et al., 2019) and when serving ethnoculturally diverse refugee populations (e.g., Dybdahl et al., 2001). Adding to this, some authors have raised concerns about ethnocentrism when conventional tools are used in different contexts and with diverse populations (e.g., Urban, 2019). A lack of adaptation of ECE quality measures could convey Western views and thus reinforce inequity by marginalizing different definitions of quality and ECD (Hu, 2015). Second, evidence on specific ECE quality at population level is still scarce. Such evidence could, however, directly inform ECE policies in accordance with international and inclusive ECE guidelines. Beyond efforts on program upscaling and enrollment, the United Nations' Sustainable Development Goals, Target 4.2, more recently expanded the focus to ECE quality for maximizing ECE impact for all children worldwide.

Flexible ECD Program Initiative - Bridging Projects in Germany
The challenge to set up and effectively regulate policy-based ECD programs for refugee children has been emerging in Germany since 2015. The Ministry of Children, Families, Refugees and Integration (MKFFI) of the largest German state, North Rhine-Westphalia (NRW), then introduced an ECE policy to support ECD of newly arrived refugee children. Local stakeholders in ECE, such as the Communal Youth Welfare offices and private ECE agencies, were granted flexibility in implementing a range of ECD programs, so-called "Bridging Projects" (BPs), to adapt to local circumstances and the diverse needs of young refugee children and their families. Based on that policy, more than 1,000 ECE programs with an overall capacity of more than 10,000 children have been funded annually. On average, BPs offered enrollment to 8.6 (SD = 4.05) children per group, had a duration of 33.5 weeks (SD = 14.23), and a caretaking time of 10.41 hours per week (SD = 8.27; own calculations based on registration data for BPs). Attendance is fully subsidized as BP organizers receive a flat rate of €30 per hour for caretaking of one to five children. The few regulations request that at least one staff member per group has a qualification in ECE (i.e., formal training or a degree in an ECE-related subject), and that the staff-child ratio should be 1:5 or better. Volunteers are encouraged to support trained staff. BP organizers are free to choose the location, time, frequency, the age range of children before school entry, and the involvement of parents. Such specialized ECD programs can thus range from highly structured preschool programs to low-barrier mother-child playgroups. At the time of policy implementation, the majority of ECE staff in Germany had no previous experience in teaching larger numbers of refugee children.

Study Aim
Studying the implementation of policy-based and refugee-targeted ECD programs helps to generate meaningful ECE strategies to stimulate ECD of refugee children. Specifically, assessing ECE quality can inform stakeholders (1) on variations between ECD programs, especially when policies provide only few regulations, and (2) on how to refine program guidelines when programs are created locally and regulated at scale. Our study contributes to these pending issues as we investigated the implementation and ECE quality of the BPs. Using a two-phase approach in this study, we (A) explored diverse realizations of BPs and generated a set of measurements to assess ECE quality among diverse BPs. We then (B) examined ECE quality of the various forms of BPs. We discuss our results in the light of addressing refugee children's ECD needs at scale and the challenges arising in ECE quality assessments among diverse ECD programs.

Method
Study Design
In a first and formative study phase, we reviewed registry data provided by state authorities, conducted unstructured field observations and concurrently reviewed available observation tools and guidelines on ECE quality. Goals in this phase were (A) to identify a hypothetical scheme to systematize diverse BPs, and (B) to select a set of indicators to assess ECE quality as relevant for diverse BPs. In a second study phase, we examined ECE quality of BPs (i.e., structural and process quality) in field observations using standardized tools. Based on observation data, we then characterized BPs, explored whether ECE quality varied across the categorization scheme and compared process quality of BPs to regular child-care centers in NRW. We report results of the first study phase along with our study method as it concurrently informed our research process.

Sample
The review of registry data in phase 1 yielded a hypothetical scheme that distinguishes BPs with regard to their implementation settings. Programs were described as located in formal settings for education, in improvised settings, or as more flexibly organized in mobile and temporary set-ups (see Table 1). That scheme could be substantiated in 6 unstructured field observations of seemingly different BP types as identified from the registration list. For the second study phase, we randomly drew BPs from the registration list and requested participation. We stopped the recruitment after N = 50 BPs consented. At this point, a total of 153 BPs had been contacted. Of those BPs that did not participate, some organizers did not respond, others reported no active BP due to relocation of families, and others stated concerns that the study might disturb the safe space atmosphere in groups. As we additionally lost two BPs before phase 2 data collection started (one group closed as scheduled, the other was closed due to decreasing numbers of participants), the final study sample for structured field observations consisted of n = 48 BPs.
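The recruitment figures above imply a simple sample flow. The short Python sketch below recomputes the consent rate and the final sample size; Python is used only for illustration, the counts are taken from the text, and the variable names are hypothetical.

```python
# Sample flow for phase 2, using only the counts reported in the text.
contacted = 153          # BPs contacted from the registration list
consented = 50           # recruitment stopped at this number
lost_before_start = 2    # one closed as scheduled, one closed for low attendance

consent_rate = consented / contacted
final_sample = consented - lost_before_start

print(f"Consent rate: {consent_rate:.1%}")
print(f"Final sample: n = {final_sample}")
```

By this count, about one in three contacted BPs consented.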

Procedure
In the first study phase, we reviewed the brief and unstructured descriptions of the BPs that were part of the funding proposals submitted to state authorities. The first and second study authors and two research assistants filtered out characterizing attributes of BPs, then structured the content during group discussions and identified, in team consensus, a hypothetical scheme to categorize diverse BPs. We subsequently conducted observations in 6 selected BPs to probe our categorization scheme (2 observations per hypothetical category, July to September 2016). During these visits we also explored ECE quality characteristics. We documented our observations in a low-structured observation report form (see Measures) and concurrently discussed observations in further group meetings. The group discussions in those meetings aimed to identify the best approach for examining ECE quality among the variety of different BPs. We decided a priori to split ECE quality into structural and process components. For process quality, the team agreed that key concepts of constructs and tools that focus on staff-child interactions were overall applicable to BPs. However, the research team acknowledged that BPs overall put more emphasis on children's socio-emotional adjustment rather than on instructional support for pre-academic learning and for facilitating the transition into early education or kindergarten. For structural quality, the available instruments were considered not equally applicable to the heterogeneous BPs. In group discussions, applicable indicators for key concepts of structural quality were therefore identified and adapted with regard to widely used observation inventories, that is, the ECERS-R (Harms et al., 2014) and the "Child Care Checklist Physical Environment Checklist" (NICHD, Study of Early Childcare and Youth Development, 2006). Selected indicators were examined in further field observations. During the development and identification of structural quality indicators, we intended to maximize the applicability of indicators to different BPs and their relevance for the ECE principles proposed by stakeholder authorities in NRW (MKFFI, 2016; initiator/funder of the refugee ECE policy) while also maintaining structure and content comparable to established measures of ECE quality. Additionally, the measurement should be feasible with good inter-rater reliability. The first study author facilitated group discussions while the second study author mostly documented the meetings. Between two and four research assistants with at least bachelor's degrees additionally participated in all group discussions. We describe the generated observation tool for structural quality and our selection of process quality indicators in the Measures section.
In the second study phase, members of our research team visited BPs for structured field observations between October 2016 and April 2017. The full observation team for BPs consisted of five graduate students with bachelor's or master's degrees in psychology. The first and the second study authors instructed the team on observation procedures. Four observers (the same research assistants participating in study phase 1), along with the first and second study authors, were officially trained and licensed in the Classroom Assessment Scoring System Pre-K (Pianta et al., 2008), a widely established observation measure of process quality during the early years. The initial six structured BP visits were used for preparing instruments, piloting across diverse BPs and observer training. In the subsequently visited BPs (n = 42), teams of two observers conducted standardized observations. One person assessed structural quality, while the other assessed process quality.
All teachers of BPs involved in the present study provided written informed consent beforehand. All parents of children attending the participating BPs during the period of our studies received written information on the study in addition to verbal information provided by teachers. Families were asked not to attend the BP on the day of the observation if they felt uneasy about the study. No child-level data was used for this study. The Internal Review Board of the <Faculty-University> approved the study protocol (2016-298) in accordance with the ethical guidelines of the German Psychological Society.

Report forms
In study phase 1, we used an open observation report form that guided exploration during unsystematic field observations and facilitated on-site documentation of ECE quality characteristics. This form included three domains: (1) a general project description, (2) structural quality, and (3) process quality. Each domain included a few guiding questions standardizing the exploration process and thus supporting comparative group discussions of ECE quality indicators across BPs. In study phase 2, organizers of BPs completed a report form on group characteristics. This form covered project times, numbers and demographics of enrolled participants, attendance behavior of participants, and basic information on BP staff.

Measuring structural quality
As a result of the first, formative study phase, we created the "Bridging Project Evaluation Scale" (BREVIS) to observe structural quality in diverse ECE environments. BREVIS consists of 24 indicators of structural quality which are assigned to five dimensions: (1) premises, covering structural aspects of the setting such as availability of space for activities, an area for relaxation, or sanitary facilities, (2) equipment, covering the availability of movable furniture and its suitability for young children, (3) structuring of a session, covering the formal structure of the program, including clearly indicated start and ending times and the establishment of rituals, rules, and routines, (4) team coherence, covering characteristics of team climate and the degree of effective staff cooperation, and (5) educational materials for pre-academic activities and play, as well as for language facilitation in multilingual groups. A single-observer completion of the BREVIS took around 30 minutes. Each indicator was rated on a three-point Likert scale (1 - inadequate, 2 - acceptable, 3 - very good). Anchors for each indicator facilitated ratings. Observers could additionally comment on their ratings in a separate column. Ratings with comments were discussed in subsequent group meetings. Inter-rater reliability for the BREVIS indicators was assessed in four double coding sessions. Two-way consistency, single-measure intra-class correlations (ICC) with random effects were calculated. Average ICC was good (mean ICC = 0.724, range = [0.563, 1]); differential ICCs for the subscales demonstrated moderate to excellent inter-rater reliability. Cronbach's alphas showed moderate to good internal consistency at dimension level (see Table 2). An overall good internal consistency (α = .80) and moderate inter-dimension correlations suggested that the BREVIS reliably assessed distinct features of structural quality.
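For readers less familiar with the two reliability statistics named above, the following is a minimal sketch of how a two-way consistency, single-measure ICC and Cronbach's alpha can be computed from a rating matrix. It uses the textbook ANOVA-based formulas and Python's standard library for illustration only; it is not the authors' R code, and the function names and example ratings are hypothetical.

```python
from statistics import mean, variance


def icc_consistency_single(ratings):
    """ICC(C,1): two-way, consistency, single-measure.

    `ratings` is a list of rows, one per rated target (e.g. a BREVIS
    indicator in one session), each row holding one rating per observer.
    """
    n = len(ratings)          # number of rated targets
    k = len(ratings[0])       # number of observers
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Two-way ANOVA sums of squares: targets, observers, residual.
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


def cronbach_alpha(scores):
    """Cronbach's alpha; rows are observed groups, columns are items."""
    k = len(scores[0])
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical data on the 1-3 BREVIS scale (not from the study).
double_coded = [[1, 2], [2, 3], [3, 3], [2, 2], [1, 1], [3, 2]]
items = [[1, 2, 1], [2, 2, 3], [3, 3, 3], [2, 3, 2], [1, 1, 2]]

print(round(icc_consistency_single(double_coded), 3))
print(round(cronbach_alpha(items), 3))
```

The consistency form of the ICC ignores constant offsets between observers, so two observers who differ by a fixed amount on every indicator still reach an ICC of 1.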

Measuring process quality
Based on the first study phase, we decided to use the Classroom Assessment Scoring System Pre-K [CLASS] (La Paro, Pianta, Hamre & Stuhlman, 2002) as a measure of process quality. With this observation tool, an independent observer assesses different aspects of caregiver interactions with preschool-aged children on-site. Given the overall emphasis on social-emotional and behavioral adjustment, we omitted rating dimensions of the CLASS that were more strongly linked to instructional support. The selected dimensions for our study were "positive climate" (e.g., relationships, positive affect), "negative climate" (e.g., punitive control, disrespect), "teacher sensitivity" (e.g., awareness, responsiveness), "behavior management" (e.g., redirection of misbehavior, clear expectations), and "productivity" (e.g., preparation, transitions from one activity to another). One exception was the CLASS dimension "language modeling" (e.g., frequent conversation, self- and parallel talk), which we retained because acquisition of basic host language skills is especially relevant to behavioral adjustment and to navigating social situations. We further added the dimension "teacher involvement" suggested by Agache, Kohl, Bihler, Willard and Leyendecker (2018). Higher ratings on teacher involvement indicated more active engagement and a higher extent of attention to the children's activities. All CLASS dimensions, including teacher involvement, were rated on scales ranging from "1", indicating low, over "4", moderate, to "7", indicating high staff-child interaction quality. One observer per BP conducted two observation cycles of 15 minutes each. Internal consistency for observations was good (α = .84). Prior to the data collection of the second study phase, the four licensed CLASS observers were re-certified as they again passed the official online reliability test, with average rates for inter-rater agreement ranging from 80 to 94% as compared to gold-standard raters.
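The Statistical Analysis section describes a summary score for social support built from five of these dimensions. A minimal sketch of one way such a score could be derived from two observation cycles follows. Two steps are assumptions that the text does not state: averaging the two cycles per dimension, and reverse-scoring negative climate (8 minus the rating) so that higher values always indicate better quality. All names and ratings are hypothetical.

```python
from statistics import mean

SOCIAL_SUPPORT_DIMENSIONS = [
    "positive_climate",
    "negative_climate",
    "teacher_sensitivity",
    "behavior_management",
    "productivity",
]


def social_support_score(cycles):
    """Average each dimension over cycles, then average the dimensions.

    `cycles` is a list of dicts mapping dimension name to a 1-7 rating.
    ASSUMPTION: negative climate is reverse-scored (8 - rating).
    """
    per_dimension = {}
    for dim in SOCIAL_SUPPORT_DIMENSIONS:
        value = mean(cycle[dim] for cycle in cycles)
        if dim == "negative_climate":
            value = 8 - value
        per_dimension[dim] = value
    return mean(per_dimension.values())


# Hypothetical ratings from two 15-minute cycles in one BP.
cycle_1 = {"positive_climate": 6, "negative_climate": 1,
           "teacher_sensitivity": 5, "behavior_management": 6,
           "productivity": 5}
cycle_2 = {"positive_climate": 5, "negative_climate": 2,
           "teacher_sensitivity": 6, "behavior_management": 6,
           "productivity": 6}

score = social_support_score([cycle_1, cycle_2])
print(score)
```

If negative climate were left unreversed, a group with few negative interactions would pull the composite down, which is why the reversal is flagged explicitly here.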
In addition to the interpretation of quality criteria ratings, process quality of the BPs was comparatively interpreted with regard to a representative sample of 177 groups of state-subsidized daycare centers in NRW.In state-subsidized daycare centers the average class size was M=21 children with an average teacher-child ratio around 1:6.49 (SD= 3.6, Median= 5.75; Bihler, Agache, Kohl, Willard & Leyendecker, 2018).

Statistical Analysis
BREVIS ratings were analyzed at indicator level regarding frequencies. We additionally computed means, confidence intervals, and ranks for each BREVIS indicator and calculated sum scores at dimension level. Process quality of BPs was analogously examined based on CLASS ratings. That is, we computed means for each CLASS dimension as well as the mean for a second-stratum score for overall social support by summarizing ratings of the dimensions positive climate, negative climate, teacher sensitivity, behavior management, and productivity. Moreover, we compared CLASS ratings for BPs to those ratings for ECE groups in state-subsidized daycare centers. To investigate structural and process quality of differently implemented BPs (i.e., mobile concepts, improvised settings, formal settings for education), we compared BREVIS and CLASS ratings aggregated to dimension level, separated by BP type. Interpretation of all inferential parameters followed two-sided testing with an alpha-error level of 5%. Cohen's d with pooled variances was additionally reported. All analyses and the graphical computation were run in R using default functions and optional packages, for example, foreign (3.5.0; R Core Team, 2014).
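The group comparisons described above can be sketched as follows. The fractional degrees of freedom reported in the Results are consistent with Welch's unequal-variance t-test, which is the default of R's t.test; that the authors used it is an inference, not something the text states. The sketch is in Python for illustration, covers only the test statistic, degrees of freedom and Cohen's d with pooled variance (no p-value), and uses hypothetical function names and data.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def cohens_d_pooled(a, b):
    """Cohen's d using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


# Hypothetical dimension scores for two BP types (not study data).
formal = [2.8, 2.6, 3.0, 2.4, 2.9]
improvised = [2.1, 2.5, 1.9, 2.6, 2.2, 2.0]

t, df = welch_t(formal, improvised)
d = cohens_d_pooled(formal, improvised)
print(f"t({df:.2f}) = {t:.2f}, d = {d:.2f}")
```

Note that the t statistic here uses separate variances while d uses the pooled variance, mirroring the combination of fractional degrees of freedom and pooled-variance effect sizes in the reported results.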

Results
Bridging Project Groups and Activity Characteristics
Among the 42 BPs we visited in study phase 2, an average of M = 1.56 teachers (SD = .5; Median = 2) and M = 5.65 children (SD = 2.07, Median = 6) were present during field visits of the BPs. The most frequent caregiving constellations were "activities with 2 to 6 children" (27.38%), followed by "one-on-one" interactions (25%) and "activities with more than 6 children" (25%). The average teacher-child ratio during the visits was 1:3.56 (SD = 1.35, Median = 1:3.5). Thirty-two BPs provided additional information on teachers and the country of origin of the participating children. In total, 452 children attended those BPs on a regular basis. The children's major countries of origin were Syria (40.39%), followed by Balkan countries (13.27%), Iraq (13.05%), and Afghanistan (12.61%). Responding staff were on average M = 41.33 years old (SD = 11.84, Median = 40.5, range = [20, 61]) and 12.5% were male. Regarding staff education levels, 15.6% had a college degree or had completed an ECE-related subject in tertiary education. 46.9% had received ECE-related vocational training; 12.5% were childcare assistants. 21.9% of teachers did not report any ECE-related qualification, and 3.1% omitted this information.

Results on Structural Quality
We analyzed indicators of structural quality for 41 BPs using the BREVIS; one observation was incomplete due to early closing and therefore excluded. Quality of premises, the first dimension of BREVIS, was overall acceptable. However, 38% of the observed groups lacked areas for relaxation, and one-quarter lacked adequate sanitary facilities. The second dimension, quality of the equipment, was on average acceptable, while some groups lacked child-friendly furniture (12%). For the third dimension, structuring of a session, observations indicated that several groups had neither a clear structure (10%) nor repetitive elements such as routines, rituals or rules (18%). Overall, team coherence was good in all BPs except for one. Observations for the educational materials suggested that, in general, materials for joyful activities as well as materials to promote competencies in arts and crafts, language, and literacy were sufficiently available. However, several BPs lacked different types of material, especially for quantitative reasoning (12%) and, more specifically for diverse children, for language facilitation in multilingual settings (26%). Detailed results can be found in Table 3.

Results on Process Quality
Process quality for staff interacting with refugee children was analyzed based on the CLASS observations in 41 BPs. The socio-emotional dimensions (positive climate, negative climate, teacher sensitivity, behavior management, and productivity) were overall rated medium to high across dimensions and implementation types (Mean range = [5.10, 6.90]). Language modeling was overall rated within medium ranges. Ratings of teacher involvement revealed that staff in BPs were frequently engaged in activities with the children (M = 5.07; SD = 1.16). We further compared CLASS ratings of the BPs to groups in state-subsidized daycare centers (see Table 4). On average, BPs showed a better staff-child ratio (t(227.96) = -9.44, p < .001, d = .99). For the BPs, we found fewer negative interactions (t(80.24) = 2.78, p < .01, d = 0.40), higher productivity (t(51.97) = 3.12, p < .01, d = .62), and better language modeling (t(48.81) = 3.86, p < .01, d = .88). We found no differences for positive climate, teacher sensitivity, and behavior management. The second-stratum dimension, social support, yielded better ratings for BPs than for daycare centers (t(52.19) = 2.42, p < .05, d = .47).

Comparing Different Bridging Project Types
Our randomly drawn BP sample consisted of 14 BPs in educational settings, 22 BPs in improvised settings and five BPs with mobile concepts or temporary set-ups. Due to the small sample size, BPs with mobile concepts or temporary set-ups were not part of the inferential comparisons. Considering BREVIS at dimension level, we analyzed structural quality comparatively for the different BP types (Table 4). Except for equipment (t(33.28) = -0.93, p > .05, d = 0.30), BPs in settings for education tended to have higher scores on structural quality dimensions when compared to those in improvised settings. We found the largest differences for structuring of a session (t(32.70) = -3.71, p < .001, d = 1.19). Descriptively, BPs with mobile concepts or in temporary set-ups consistently tended to have the lowest ratings on indicators of structural quality. For process quality, we analogously compared CLASS ratings and teacher involvement between BP types (Table 5). Dimensions of social-emotional support did not differ between BPs in formal settings for education and those in improvised settings, except for productivity (t(30.50) = -2.50, p < .05, d = 0.76). The dimensions language modeling and teacher involvement did not reveal differences between the two types. Descriptively, CLASS ratings tended to be slightly lower for BPs with mobile concepts or in temporary set-ups.

Discussion
Subsidized by a flexible ECE policy, differently implemented ECD programs (BPs) for refugee children have been established in Germany with as yet unknown ECE quality. The BPs are located in mobile concepts or temporary set-ups, improvised settings and formal settings for education. While we found that process quality indicators seem applicable across differently implemented BPs, we had to assemble a set of structural quality indicators. Overall structural quality in observed BPs was acceptable or better but differed systematically between implementation types. As can be expected, those BPs located in settings for education were most likely to provide good structural quality. Process quality was consistently high regardless of the implementation setting, also when compared to regular daycare centers. We discuss quality parameters of BPs, the need for adaptive ECE quality measures with regard to context and target group, and how our findings can inform policy decisions to support refugee children's ECD at scale.

ECE quality differs across Bridging Projects
Overall, indicators of structural quality showed acceptable to good quality for ECE despite implementation differences. Our findings thus suggest that fundamental arrangements for ECE can be established among diverse settings. On several indicators, however, structural quality varied between BP settings. BPs with mobile concepts or temporary set-ups were more likely to lack relaxation areas or materials for quantitative reasoning and for language facilitation. As they are implemented in more challenging locations for ECE, those BPs in non-formal settings for education more likely require additional resources to compensate for structural disadvantages. We found the largest differences between implementation types for the dimension structuring of a session. This finding allows for different interpretations. First, BPs in improvised settings were generally less likely to apply curricula or fixed schedules. Second, differences could be due to transactional costs. That is, BP staff might have needed more time to individually prepare sessions in non-education settings or to arrange different activities within one session. Adding to this, ECE staff in Germany had little previous experience in working with refugee families (Chwastek et al., 2021). In settings for formal education, structural premises might support BP organizers in planning and structuring sessions. However, structuring sessions with the target group in mind could be overall more challenging in non-education settings and thus be reflected in BREVIS ratings.
Unlike structural quality, process quality showed few links to implementation settings. BPs consistently yielded moderate to high process quality on the assessed domains, which was overall comparable or slightly superior to that of state-subsidized daycare centers in Germany. Our data, together with previous literature, offer possible explanations for this finding. First, BPs demonstrated a good staff-child ratio with small groups and high staff involvement. Such group characteristics are generally considered important preconditions for better process quality, although the evidence is inconsistent (Pianta et al., 2005; Slot et al., 2015). Second, a study by Singer and colleagues (2014) showed that teacher involvement, as reflected in the CLASS-analogous domain that was rated at high levels in our study, is linked to better interaction quality. Thus, staff involvement could have contributed to the high interaction quality found among diverse BPs. Third, BPs offer a lower dosage of early education compared to regular ECE services, which could have limited teacher fatigue. Notably, previous evidence also supports strong links between teachers' professional training and process quality (Slot et al., 2015), while staff in BPs on average had lower levels of training and volunteers supported trained staff. Those links, however, seemed stronger for instructional domains of staff-child interactions (Pelatti et al., 2016), which were not prioritized in many BPs or within our investigation. Beyond that, previous research also supports a tradeoff between the staff-child ratio and staff professional training for high process quality; thus, better trained staff could achieve high process quality for more children at the same time. We found that BPs usually have small groups and that staff with both high- and low-level training in ECE mostly engage in one-on-one or small group activities. The positive effects of professional training could hence be less crucial for most BPs.
Only the CLASS dimension of productivity (i.e., establishing and enforcing routines and effective transitions between activities) was rated lower for improvised settings. There are two, not mutually exclusive, interpretations for this finding. First, BPs in improvised settings had more flexible concepts and thus put less emphasis on productivity, which builds on session preparation in advance and on establishing recurring procedures. Second, such BPs had more difficulties retaining families for a longer period of time, meaning new children and families continuously needed to learn the routines and activities of the group. Beyond anecdotal evidence from our formative research process, the second interpretation is also backed by previously identified challenges in BPs, namely infrequent attendance, tardiness, and fluctuation of refugee children attending BPs (Busch et al., 2018).
Given that previous studies substantiated links between high process quality and better ECD-related outcomes, BPs likely have a positive impact on refugee children. Focusing on child-directed effects, high process quality is associated with better socio-emotional adjustment and overall stimulates ECD with positive long-term development (Burchinal et al., 2008; Mashburn et al., 2008). Diverse BPs therefore provide opportunities for stimulating interactions that especially serve young refugee children's social-emotional needs. Thus, high process quality of BPs could mitigate increased levels of child behavior problems among newly arrived refugee children enrolling in the BPs (Buchmüller et al., 2018).
Considering the indirect, family-mediated effects, high process quality in ECD programs could facilitate building trustful relationships with refugee families and conveying ECD- and education-related information. Both indirect effects were described for refugee families attending transitional ECE services in Canada (Poureslami et al., 2015).

Flexible ECD programs for refugee families
Heterogeneity among BPs underlines that those ECD programs differ from other policy-based ECE services in Germany (especially from state-subsidized ECE in large groups and center-based settings). Specifically, BPs are designed to individually address the ECD-related needs of a specific target group, that is, children from refugee families during post-migration periods. Also, BPs are not required to meet a set of mandatory quality standards or to use specific implementation settings to receive program funding. That constellation puts local ECE stakeholders in charge and likely explains between-program heterogeneity.
Considering both structural and process quality indicators, overall differences between BPs in formal settings for education and improvised settings concerned the establishment of temporal frameworks, predictable procedures and the continuous enforcement of routines and rituals (as reflected in BREVIS, structuring of a session; CLASS, productivity). From a larger perspective, this could reflect a specific challenge for flexible ECD programs such as the BPs: keeping a balance between establishing routines and reliable structures on the one hand and flexibly reacting to the individual needs of the refugee families on the other.
'Establishing flexible routines' was consistently described by Swedish ECE staff as a challenging strategy to prepare young refugee children for transitioning into preschool, kindergarten or first grade (Lunneblad, 2017).
The large diversity observed in BP implementation could be overall linked to diverging ECD program concepts. It would then, however, be questionable whether high structural quality standards are equally important for all BPs. Structural quality, especially learning routines, might be more relevant for BPs that bridge a pending demand for transitional ECE services, such as during the kindergarten year. Conversely, structural quality among BPs in improvised settings and mobile concepts might have a lower priority due to different foci. As those BPs are often set up near refugee accommodations, their emphasis could be more on adaptive outreach strategies to overcome contact barriers and initiate trust in early education institutions among diverse refugee children, hence addressing families' information deficits or the cultural expectations of families regarding child education (Quintero, 1999; Morantz et al., 2013). Moreover, BPs in improvised settings and with mobile concepts could be better able to respond to sudden changes (e.g., after relocation of families), or to urgent demands by children or families that are often related to broader post-migration challenges (see Busch et al., 2018). In sum, BPs with high structural quality in formal education settings might focus mainly on direct effects in their concepts, including education transitions, while BPs in other settings might focus more strongly on indirect effects, that is, reaching and engaging refugee families.

Assessing and interpreting quality among diverse ECD programs
In our formative research, we found that measuring structural quality in BPs required more adaptation, while process quality could be assessed using dimensions of an established tool. On the one hand, it is intuitive that structural indicators more likely depend on context as they cover the physical characteristics of ECD program environments. On the other hand, there is still a gap between the abstract guidelines proposed for high structural quality by ECE authorities (MKFFI, 2016) and operationalizing measures that reduce structural quality uniformly to the availability of certain items or environmental premises. With the BREVIS, we selected a set of observable indicators, also considering the specific needs of refugee children. While our approach increased feasibility, it still might not sufficiently acknowledge whether structural quality was constrained by implementation settings. Given that we found differences in structural quality between settings of BPs, our findings should thus be interpreted with caution. Beyond that, we cannot conclude based on our data whether variations in structural characteristics are linked to program concepts or reflect a lack of general structural quality. Slot and colleagues (2017) found that structural characteristics in ECE were linked to program curricula, that is, specific program concepts. Thus, our study reaffirms that non-adaptive measurement of structural quality might especially require the consideration of implementation contexts (i.e., environment and concepts) and specific target groups during interpretation. Overall, more flexible approaches to assessing quality that relate ECE concepts to structural characteristics with respect to specific target groups could be the next step in adaptive measurement development on ECE quality.
Independent of structural quality, we found high process quality among the BPs. Previous evidence supported links between structural and process quality in ECD programs, although with substantial between- and within-study variability among such links (Singer et al., 2014; Cabell et al., 2013). Again, different ECD program concepts (especially differing activities and settings) in combination with variable measurement approaches of structural quality could account for such variability. While Singer and colleagues observed process quality among playgroups and found only weak links, Cabell and colleagues studied differing preschool classroom-based situations and found stronger but variable links. In the latter study, links emerged for instructional support, with the overall highest process quality ratings on instructional domains for learning activities administered in large group settings (Cabell et al., 2013). For BPs, which are typically more similar to the playgroups studied by Singer and colleagues, there was a strong emphasis on social-emotional support and mostly one-to-one or small group settings. The previous evidence thus backs our finding that process quality in social-emotional support domains is likely more invariant to context and concept variations. Focusing on context differences in future studies that use a set of social-emotional and instructional process quality domains could further resolve contradictory findings, along with more fundamentally conflicting evidence on the dual structure of process quality measurement tools such as the CLASS Pre-K (e.g., Slot et al.).

Limitations and future research
Lastly, methodological challenges and limitations of our investigation should be acknowledged.
For sampling, we did not use a stratification strategy to investigate ECE quality among diverse BPs. In consequence, the sample was imbalanced across the implementation types. Our observation tool for structural quality (BREVIS) requires validation beyond face and content validity in subsequent studies on diverse programs for refugee children. Most importantly, future research should clarify how variability in BREVIS relates to ECD and process quality characteristics. For process quality, the selected CLASS dimensions narrow the focus to social-emotional support and language modeling in BPs. Some BPs in formal settings for education are, however, likely to also address instruction-oriented support. Subsequent studies need to consider program quality and implementation characteristics of ECD programs in their predictions of socio-emotional development and language acquisition of young refugee children. Even though BPs were differently implemented according to local circumstances and needs, we did not consider systematic information on the BPs' conceptualizations, context-related premises and challenges. In-depth analyses of BPs across different implementation types could be a starting point to further validate interpretations of our findings and to understand links between program concepts, quality characteristics and implementation settings.

Conclusion
Our study offers insights into the implementation of a refugee-targeted ECE policy with few regulations in Germany. Findings suggest that program settings and concepts could matter for understanding program quality, especially with regard to the needs of specific target groups. If flexible ECD programs are regulated at scale and created locally, adaptive measures of ECE quality could best support policy decisions (i.e., examine program implementation and policy effectiveness). More specifically, our study provides preliminary evidence that ECD programs for newly arrived refugee families do not per se require formal settings for education to achieve ECE quality if considered as supporting social-emotional adjustment during resettlement periods. Such ECD programs do not generally provide an equal alternative to state-subsidized early education at daycare centers, which prepares children for transitioning into formal education. The strength of the BPs is that they offer flexible ECD services for resettlement contexts and are easily accessible for refugee families. The BPs can thus inspire stakeholders in refugee-hosting countries to set up early actions at scale aimed at preventing an educational crisis among young refugee children.

Assessing program quality contributes to policy development and increases policy impact
Literature supports that effective policy modifications can increase program quality along with program impact (Melhuish et al., 2019). Structural characteristics are therefore the distal and regulable aspects of ECE and thus the main objective of statutory quality regulations. Our study demonstrates what constitutes program quality based on a funding policy with only few regulations. For programs targeting newly arrived refugees, our findings substantiate that securing a good staff-child ratio in small groups could ensure high process quality despite implementation and concept heterogeneity, even if only a limited number of ECE professionals run the program supported by volunteers. Literature substantiated that achieving high process quality is the primary driver of program impact, and discusses staff-child ratio and staff professional qualification as its facilitators. Such theory was mostly based on highly regulated and center-based programs. Our findings can preliminarily extend the body of policy-guiding work to cases in which policy-based programs are characterized by heterogeneous implementation settings, varying program concepts and specific target groups. The challenge of measuring ECE quality among diverse BPs mirrors the general issue of assuring quality across policy-based ECE programs, as those include a wide range of different services in European, high-income countries (see Melhuish, 2016 for an overview of policy-based services in the UK). While most research focuses on center-based preschool programs, other ECD concepts lack empirical support, especially those in informal and improvised settings. That dearth is especially critical as services for refugee families, and other specific target groups, likely require more diverse ECE approaches (Morantz et al., 2013; Lunneblad, 2017). To refine ECE policies serving all children, we need a better understanding of ECE policy impact and their theories of change supporting ECD on individual levels. This requires refined tools to capture program quality among ECE services at individual, program and population levels with respect to different program concepts.