The study selection process is illustrated in the PRISMA Study Flow Diagram (see Figure 1). Of the 12,259 studies screened, 186 were assessed for eligibility at the full-text stage. A total of 174 full-text articles were excluded for the following reasons: a) 58 articles were not randomized controlled trials; b) 29 articles included exclusively preterm infants with brain abnormalities; c) 57 articles did not measure neurodevelopment or did not measure it with a standardized instrument; d) 26 articles did not evaluate an eligible intervention; e) three articles were excluded for other reasons (language of publication other than French or English, or year of publication before 2002); and f) data were not available for one article. Finally, 12 studies met the inclusion criteria and were included in this systematic review (21-24, 29-36). Five studies were included in the meta-analysis of infants’ neurodevelopment (21, 23, 29, 31, 35). The remaining seven studies were not suitable for meta-analysis because the nature of the intervention or the instrument used to measure neurodevelopment differed, precluding data pooling (22, 24, 30, 32-34, 36).
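The counts in the selection flow can be cross-checked with simple arithmetic. The following is a minimal sketch in Python; the dictionary keys are shorthand labels introduced here for illustration and are not terms from the review protocol.

```python
# Full-text exclusion counts as reported in the text above.
# The keys are illustrative labels, not categories named by the authors.
exclusions = {
    "not_rct": 58,
    "brain_abnormalities_only": 29,
    "no_standardized_neurodevelopment_measure": 57,
    "ineligible_intervention": 26,
    "other_reasons": 3,
    "data_unavailable": 1,
}

assessed_full_text = 186  # articles assessed for eligibility

total_excluded = sum(exclusions.values())
included = assessed_full_text - total_excluded

print(f"excluded: {total_excluded}, included: {included}")
```

The totals agree with the 174 excluded articles and the 12 included studies reported above.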
Characteristics of Included Studies
The main characteristics of the included studies are summarized in Table 1. The 12 studies published between 2002 and February 2020 included 933 preterm infants. Five studies were conducted in the United States (21, 24, 29, 31, 33), one in the Netherlands (22), two in India (30, 32), one in Thailand (23), one in Taiwan (35), and two in Iran (34, 36). Eleven studies were RCTs (21-24, 29-32, 34-36) and one was a pilot RCT with published results (33).
More than half of the studies (n=7) included preterm infants born at or before 32 weeks of gestational age (21-24, 31, 33, 34). The other five studies included infants born at a gestational age greater than 32 weeks (29, 30, 32, 35, 36).
The 12 studies evaluated a variety of interventions: NIDCAP (21, 29, 31), positioning and incubator covers (22), alternative positioning (24), sensory stimulation interventions, either tactile (33) or multisensory (30, 32), parental participation programs (23, 35), music (36) and physical activity and/or hydrotherapy (34).
Four studies did not specify who performed the intervention (22, 30, 32, 36). In three studies, the intervention was delivered by two certified NIDCAP professionals (21, 29, 31). The other interventions were performed by a nurse (34), mothers with guidance from nurses (23), nurses and/or parents when they were at the bedside (24), the principal investigator (a nurse) or a trained research team member (33), or physical therapists, parents and nurses (35). The majority of the studies (n=9) comprehensively described their control group (i.e., standard care or comparator group) (21, 22, 24, 29, 30, 32-35), whereas three described their control group as standard care without providing specifics (23, 31, 36).
The shortest intervention duration was 14 consecutive days (34), whereas the longest was five weeks, with the intervention performed six days per week (33). In some studies, the intervention was performed at a specified dose and frequency during NICU hospitalization (23, 30, 32-36) (see Table 1 for characteristics of studies). Four interventions were delivered in short sessions of five to 15 minutes (30, 33, 34, 36), one in 30-minute sessions (32) and one in one-hour sessions (35), whereas the others were integrated into routine care, with no details about frequency or dose (21, 22, 24, 29, 31).
The studies measured neurodevelopment using different scales, including the APIB (21, 29, 31), the Prechtl Neurological Examination of the Full-term Newborn (22, 29, 31), the INFANIB (30, 32), the NNNS (24, 33), the TIMP (32, 34), the NNE (23, 35), the New Ballard Score (34, 36) and items from the Dubowitz examination (34). Four studies used more than one scale (29, 31, 32, 34). Eight studies included preterm infants whose neurodevelopment was measured during NICU hospitalization or at discharge (22-24, 30-36) and three studies measured the primary outcome at two weeks CA (21, 29, 31). Three studies took their measurements at the end of the intervention (32-34), while one study took its last measurement 28 days after the infants’ birth rather than at the end of the intervention (23).
Risk of Bias of Included Studies
The risk of bias assessment for each included study is presented in Figure 2; a summary figure (Additional File 2 – Figure S1) and a summary table (Additional File 3 – Table S2) are also provided. Six of the 12 studies reported random sequence generation appropriately (22, 24, 30, 33, 35, 36), whereas the process used was not clearly indicated in the other studies. Only two studies adequately described their allocation concealment method (29, 33). The risk of bias from the blinding of personnel was high in three studies (22, 31, 35) because of the nature of the intervention, and it was unclear for all the other studies because of insufficient information. Blinding of outcome assessment was adequately performed in 10 studies; of the remaining two, one was rated high risk (30) and one unclear because it did not address this domain (32). All the studies were judged low risk for incomplete outcome data except for two: one did not provide reasons for missing data (29) and one had imbalanced numbers between groups (31). Risk of selective reporting was high in three studies, because not all prespecified outcomes were reported (30) or because the registered protocol differed from the publication (22, 34), and unclear in all the other studies. We considered six studies to be free from other sources of bias (22, 24, 32-34, 36), whereas two studies were judged high risk because of important threats to study validity (21, 31) and four had unclear risk of bias because insufficient rationale or evidence was provided (23, 29, 30, 35) (see Additional File 3 – Table S2 for a detailed explanation).
Risk of Bias Across Studies
Because fewer than ten studies were included in the meta-analysis, funnel plot asymmetry was not tested: the power of the test would be too low to distinguish asymmetry indicating publication bias from chance (25). However, we used several strategies to reduce potential reporting bias, including a comprehensive search conducted by an expert librarian in nine databases, an online search of several trial registries to identify relevant published trials, and contacting authors by email to obtain missing data.
Synthesis of Results
Developmental Care vs. Standard Care
NIDCAP. Neurobehavioral Development. Three studies (21, 29, 31) that included a total of 229 participants (treatment: n=117, control: n=112) investigated the effects of NIDCAP compared to standard care using the APIB scale. Compared to standard care, NIDCAP significantly improved preterm infants’ autonomic system (MD -0.83; 95% CI -1.28 to -0.37; I2 = 45%; p=0.0004) (see Figure 3), motor system (MD -1.04; 95% CI -1.58 to -0.50; I2 = 66%; p=0.0002) (see Figure 4), state system (MD -0.74; 95% CI -1.06 to -0.42; I2 = 0%; p<0.00001) (Additional File 4 – Figure S2), interaction-attentional system (MD -0.48; 95% CI -0.85 to -0.11; I2 = 0%; p=0.01) (Additional File 5 – Figure S3), and self-regulatory system (MD -0.84; 95% CI -1.17 to -0.51; I2 = 9%; p<0.00001) (Additional File 6 – Figure S4). NIDCAP also significantly improved the examiner facilitation subscale (MD -1.02; 95% CI -1.44 to -0.60; I2 = 0%; p<0.00001) (Additional File 7 – Figure S5).
NIDCAP. Neurological Development. Two studies (29, 31) totalling 137 participants (treatment: n=72, control: n=65) investigated the effects of NIDCAP compared to standard care using the Prechtl Neurological Examination of the Full-term Newborn. NIDCAP significantly improved preterm infants’ neurological development (MD -15.00; 95% CI -25.28 to -4.73; I2 = 74%; p=0.004) (see Figure 5).
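The pooled mean differences, confidence intervals and p-values reported for NIDCAP are linked by a simple relationship. The sketch below assumes a normal (z-test) approximation, which is the usual convention for pooled estimates but is not stated explicitly in the text; the function name `p_from_md_ci` is introduced here for illustration only. It back-calculates the standard error from the 95% CI and derives an approximate two-sided p-value.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% CI


def p_from_md_ci(md, lower, upper):
    """Approximate SE, z and two-sided p from a mean difference and its 95% CI.

    Assumes the interval was built as md +/- Z_95 * SE (normal approximation).
    """
    se = (upper - lower) / (2 * Z_95)
    z = md / se
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)), the two-sided tail area
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Values taken from the text: APIB autonomic system and Prechtl examination
for label, md, lo, hi in [
    ("APIB autonomic system", -0.83, -1.28, -0.37),
    ("Prechtl neurological", -15.00, -25.28, -4.73),
]:
    se, z, p = p_from_md_ci(md, lo, hi)
    print(f"{label}: SE={se:.3f}, z={z:.2f}, p={p:.4f}")
```

Under this assumption, the recovered p-values are close to the reported p=0.0004 and p=0.004; small discrepancies are expected because the published estimates and bounds are rounded.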
Alternative Positioning. Neurobehavioral Development. In one study (24), the effect of alternative positioning was evaluated using the NNNS. Preterm infants in the treatment group showed significantly less asymmetry than those in the control group (MD 0.88; 95% CI 0.45-1.31; p<0.0001). This was the only significant effect reported; no significant effect was found for the other NNNS subscales (i.e., attention, handling, quality of movement, regulation, nonoptimal reflexes, stress/abstinence, arousal, hypotonicity, hypertonicity, excitability and lethargy).
Positioning and Incubator Covers. Neurological Development. Only one study (22) with 148 participants (treatment: n=76, control: n=72) investigated the effects of incubator covers and positioning compared to standard care on preterm infants using the Prechtl Neurological Examination of the Full-term Newborn (normal vs. abnormal). No significant effect between groups was found (RR 0.93; 95% CI 0.70 to 1.22; p=0.58).
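For a risk ratio, the same back-calculation is done on the logarithmic scale, because confidence intervals for ratios are symmetric around log(RR) rather than around RR. This is a minimal sketch under a normal approximation; the function name `p_from_rr_ci` is hypothetical and the method is an assumption, not a description of the software used in the review.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% CI


def p_from_rr_ci(rr, lower, upper):
    """Approximate SE of log(RR), z and two-sided p from an RR and its 95% CI."""
    log_rr = math.log(rr)
    se_log = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    z = log_rr / se_log
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return se_log, z, p


# Values from the text: RR 0.93; 95% CI 0.70 to 1.22
se_log, z, p = p_from_rr_ci(0.93, 0.70, 1.22)
ci_includes_one = 0.70 <= 1.0 <= 1.22

print(f"SE(log RR)={se_log:.3f}, z={z:.2f}, p={p:.2f}")
print(f"95% CI includes 1 (no significant difference): {ci_includes_one}")
```

The recovered p-value is approximately 0.6, in line with the reported p=0.58 given the rounding of the published estimate and bounds, and the interval includes 1, which is consistent with the non-significant result.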
Parental Participation Intervention vs. Standard Care
Neurobehavioral Development. Two studies (23, 35) that included 294 participants (treatment: n=145, control: n=149) investigated the effects of a parental participation program compared to standard care on preterm infants using the NNE. Compared to standard care, the program did not significantly improve neurobehavioral development (MD 5.39; 95% CI -3.43 to 14.20; I2 = 90%; p=0.23) (see Figure 6).
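The I2 statistic reported with each pooled estimate describes the percentage of variability across studies attributable to heterogeneity rather than chance. A minimal sketch of the standard definition, I2 = max(0, (Q - df) / Q) x 100, follows. Cochran's Q is not reported in the text, so the Q value used here is hypothetical and chosen only to show what an I2 of 90% implies when two studies are pooled (df = 1).

```python
def i_squared(q, df):
    """Higgins I2 (in percent) from Cochran's Q and its degrees of freedom."""
    if q <= 0:
        return 0.0
    # Negative values are truncated to zero by convention
    return max(0.0, 100.0 * (q - df) / q)


# Hypothetical Q for illustration only; with two studies, df = 2 - 1 = 1
q_hypothetical = 10.0
print(i_squared(q_hypothetical, df=1))  # 90.0

# When Q is smaller than df, I2 is reported as 0%
print(i_squared(0.5, df=1))  # 0.0
```

With only two studies, an I2 of 90% indicates that the two effect estimates differ far more than sampling error alone would explain, which is one reason the quality of evidence for this comparison was downgraded.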
Sensory Stimulation vs. Standard Care
Tactile. Neurobehavioral Development. One study (33) that included 18 participants (treatment: n=9, control: n=9) investigated the effects of a tactile intervention using the NNNS. No significant difference between groups was found for any of the 12 subscales (i.e., attention, handling, quality of movement, regulation, nonoptimal reflexes, asymmetric reflexes, stress/abstinence, arousal, hypotonicity, hypertonicity, excitability and lethargy).
Multisensory. Neuromotor Development. Only one study (30) that included 50 participants (treatment: n=25, control: n=25) investigated the effects of a multisensory stimulation intervention compared to standard care, assessed with the INFANIB. The effect was significantly in favour of the experimental group (MD 3.08; 95% CI 1.33-4.83; p=0.0005).
Multisensory. Neuromuscular Development. One study (36) with 80 participants (treatment: n=40, control: n=40) evaluated the effects of a multisensory intervention compared to standard care using the New Ballard score. Both groups improved significantly from before to after the treatment, but the improvement was significantly greater in the treatment group (MD 5.60; 95% CI 4.65-6.55; p<0.00001).
Music vs. Developmental Care
Neuromotor Development. In one study (32) that included 36 participants (treatment: n=18, control: n=18), the effect of music compared to developmental care was evaluated using the TIMPS and the INFANIB. Significant effects of music compared with the control group were reported for infants’ neuromotor development measured with the TIMPS (MD 0.39; 95% CI 0.08-0.70; p=0.01) and the INFANIB (MD 1.89; 95% CI 0.42-3.36; p=0.01).
Physical Activity and/or Hydrotherapy vs. Containment
Neuromotor and Neuromuscular Development. One study (34) of 38 preterm infants (treatment: n=19, control: n=19) investigated the effects on neuromotor development of three interventions (physical activity, hydrotherapy, and a combination of physical activity and hydrotherapy) compared to containment, using the TIMP and items from the Dubowitz examination. The ANOVA showed no significant difference between any intervention and containment (p=0.11): physical activity (mean: 50.21), hydrotherapy (mean: 48.05) and physical activity combined with hydrotherapy (mean: 52.00) vs. containment (mean: 51.57). For neuromuscular development, no significant differences were found for the New Ballard score (p>0.05). Of the two Dubowitz items, ankle dorsiflexion did not differ significantly between groups, but leg recoil was significantly better in the physical activity and hydrotherapy groups (p=0.04).
Quality of Evidence
The overall quality of evidence was considered low to very low. The summary of findings table is presented by outcome (see Additional File 8 – Table S3). For the comparison between NIDCAP and standard care, the quality of evidence was rated low to very low for the autonomic system, motor system, state system, interaction-attentional system, self-regulatory system and examiner facilitation (neurobehavioral development), and very low for neurological development. For the comparison between the parental participation program and standard care, the quality of evidence was rated very low. The main reasons for downgrading were high risk of bias, high heterogeneity between studies and small sample sizes. For the other comparisons, each of which included only one study, the findings are also reported by outcome in the same table (see Additional File 8 – Table S3).