This protocol was developed according to the Preferred Reporting Items for Systematic review and Meta-Analysis Protocols (PRISMA-P) 2015 checklist(70) and was registered with PROSPERO (number CRD42020172251).
Criteria for considering studies for this review
Types of studies
Randomised controlled trials and quasi-randomised controlled trials will be included in this study either where the unit of randomisation is the individual (individual randomised controlled trial - IRCT) or where the unit of randomisation is a group/cluster (group randomised controlled trial - GRCT). RCTs are studies in which participants were randomly allocated to an experimental condition (intervention) or a control group (no treatment, treatment as usual (TAU), placebo control group, other intervention). Adjustment for clustering will be conducted where the authors did not do so. The process of adjustment for clustering may lead to greater risk of selection bias. Cross-over trials are study designs where participants receive all treatments (usually active intervention and comparator) in a random sequence, thereby acting as their own control. Cross-over trials of psychological interventions are unlikely to include a “washout” period and often take the form of waiting-list control designs, where an individual or a group is first assigned to a waiting list and then receives the active intervention. Only the first period of cross-over trials (e.g., waiting list and active intervention) will be included in this study.
Study settings included will be medical (e.g., primary care), clinical, community, educational/school-based, home, and online. Where possible, we will classify the type of setting of the intervention for each study.
No limit will be applied with regards to publication year.
Types of participants
Primary caregivers, either male or female, of any age, who are adoptive, biological, foster, single, adolescent, homosexual, divorced or incarcerated will be included in this review. We will include studies with caregivers experiencing common mental health problems (see below for mental health disorders to be excluded). However, if we find a significant difference between interventions used to target caregivers with mental health disorders and those offered universally, we will conduct separate/subgroup analyses according to caregiver mental health status.
Studies with the following participant characteristics will be excluded: relatives of the child who are not in the role of primary caregiver; parents or children with a major medical condition which may lead to a specialised type of intervention (e.g., pre-term conditions, gestational diabetes, major parent or child disabilities, including intellectual disabilities); parents with current confirmed (but not a history of) substance misuse or psychosis, or who are perpetrators of maltreatment of the partner or child. Children with a diagnosis of Autism Spectrum Disorder (ASD) will not be excluded unless they meet any of the above-mentioned exclusion criteria. When a range of child ages is included in a trial, interventions which took place prenatally or when at least 75% of the children were younger than 3 years and 11 months will be included. This proportion will be calculated using the mean, standard deviation and z-scores, assuming a normal distribution of ages. This age limit is applied because the developmental stage of the child will be key in determining the nature of the intervention, and thus it may not be appropriate to pool intervention components for older and younger children.
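The 75% age rule described above can be sketched as follows. This is a minimal illustration, not the review's actual code: the function names are hypothetical, and the cutoff of 47 months (3 years and 11 months) and the use of months as the unit are assumptions made for the example.

```python
from statistics import NormalDist


def proportion_younger_than(mean_months: float, sd_months: float,
                            cutoff_months: float = 47.0) -> float:
    """Estimated share of children below the cutoff, assuming ages are
    normally distributed with the reported mean and SD (in months)."""
    if sd_months <= 0:
        raise ValueError("sd_months must be positive")
    z = (cutoff_months - mean_months) / sd_months  # z-score of the cutoff
    return NormalDist().cdf(z)


def meets_age_criterion(mean_months: float, sd_months: float,
                        cutoff_months: float = 47.0,
                        required_share: float = 0.75) -> bool:
    """True when at least `required_share` of children are estimated to be
    younger than the cutoff (75% in the protocol)."""
    share = proportion_younger_than(mean_months, sd_months, cutoff_months)
    return share >= required_share
```

For example, a trial reporting a mean child age of 36 months (SD 6) would pass this check, whereas a trial with a mean age equal to the cutoff would not, because only half of the children are estimated to fall below it.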
Eligible studies must have directly measured child internalising problems at the end of the trial, using parent, teacher, child or clinician reports, up to age 18 years 11 months. Where baseline measures of internalising problems are reported, these will be extracted. We anticipate that baseline measures will be uncommon given our specifications on child age.
Types of outcome measures
The following primary outcomes will be extracted only when validated measures were used.
Child/adolescent primary outcomes
Child/adolescent internalising problems, for example anxiety and depression symptomatology and disorders, will be included. Studies will be screened and data will be extracted independently by two researchers (IC and EP). Studies using reports of child outcomes from children, parents, teachers or clinical/medical personnel will be included. Where outcomes were assessed from videotaped observations (e.g., Incredible Years - IY), data will be included only when validated scales were used. We will examine child/adolescent outcomes at whatever time they are available. Assessment points will be classified according to the time-point at which they were measured(52):
- post-intervention (at conclusion of programme/ intervention delivery)
- short-term follow-up (within 12 months post-intervention)
- medium-term follow-up (1 to 3 years post-intervention)
- long-term follow-up (3 or more years post-intervention).
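The classification of assessment points above could be applied mechanically as in the sketch below. The handling of boundary values is an assumption, because the stated ranges meet at 12 months and at 3 years: here exactly 12 months is treated as medium-term and exactly 36 months as long-term. The function name and the use of months since the end of the intervention are also illustrative choices.

```python
def classify_assessment_point(months_after_intervention: float) -> str:
    """Map time since the end of the intervention (in months) to the
    follow-up categories listed in the protocol.

    Assumed boundaries: 0 = post-intervention; (0, 12) = short-term;
    [12, 36) = medium-term; 36 or more = long-term.
    """
    if months_after_intervention < 0:
        raise ValueError("assessment cannot precede the end of the intervention")
    if months_after_intervention == 0:
        return "post-intervention"
    if months_after_intervention < 12:
        return "short-term follow-up"
    if months_after_intervention < 36:
        return "medium-term follow-up"
    return "long-term follow-up"
```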
Child/adolescent primary outcome measures
Measures of child/adolescent internalising problems will include, but are not limited to, the following:
- Any standardised diagnostic instrument which assesses children’s internalising symptoms (depressive, anxious symptoms) will be included. Accepted diagnostic criteria include but are not limited to the Diagnostic and Statistical Manual of Mental Disorders (DSM-5)(71), International Classification of Diseases (ICD-10) criteria(72), Diagnostic Classification of Mental Health and Developmental Disorders of Infancy and Early Childhood (DC:0–3)(73) or Diagnostic Classification of Mental Health and Developmental Disorders of Infancy and Early Childhood (DC:0-5)(1).
- Validated questionnaires which assess internalising symptoms in children and adolescents (also at the level of sub-scales) will be included.
When multiple scales were used, we will apply the following decision rules:
- Instruments with the strongest construct and predictive validity (i.e., higher sensitivity/specificity against a gold standard of a child-driven diagnostic instrument) will be prioritised.
- Scales specifically targeting internalising problems will be favoured over scales developed for other aims that include internalising problems as a subscale.
- Scales developed for the age range of the population will be favoured over scales developed for the general population.
- The most used scale across studies will be favoured to reduce heterogeneity due to measurement error.
- Given that we conceptualise internalising problems on a continuum, we will prioritise the total internalising score where available, rather than internalising subscales.
Secondary outcome data will be extracted from studies identified as meeting above-mentioned inclusion criteria.
Primary Caregiver Secondary Outcomes
Secondary outcomes will not be used as inclusion criteria when primary outcomes were not assessed, as the focus of our review is not on externalising disorders or parenting outcomes. However, broader effectiveness of parenting interventions will be considered, where possible, especially for related outcomes which may influence the likelihood of programme success.
Primary caregiver secondary outcomes (process measures) will be:
- Caregiver’s mental health;
- Caregiver’s parenting self-efficacy;
- Parent-child relationship and family relationship measures.
Caregiver’s secondary outcomes will be considered as potential mechanisms of change. Validated measures of outcomes including, but not limited to, those listed above will be used.
Child Secondary Outcomes
Child secondary outcomes will include, but are not limited to, the following:
- Child behavioural problems (e.g., conduct problems, aggressive behaviours, bullying, peer aggression) as measured by validated tools.
- School achievement or attendance (e.g., years of education, drop-out).
- Cognitive measures, where validated tools have been used.
In the current review, externalising problems will be considered only as secondary outcomes for three main reasons. First, as mentioned in the background, there is already a wealth of evidence on parenting interventions aimed at reducing behavioural problems, in contrast to the substantial lack of evidence for internalising problems. Second, externalising problems may underlie internalising symptoms; addressing the former without knowing whether the latter were addressed could lead to an overestimation of the positive effects of these interventions. Third, there is evidence of a specific increase in internalising problems over the past decades compared to behavioural problems(74). This makes the question of the current review particularly important for both its novelty and potential utility.
Adverse and negative outcomes will be considered. Child and parent negative outcomes (e.g., low self-esteem, depression, partner separation, family disruption) will be extracted when available. Attention to the potential adverse effects of an intervention, and not only its possible positive effects, is important for a complete assessment of its effectiveness(75).
Decision rules used for the primary outcomes will also be applied to secondary and adverse outcomes.
In this review, parenting interventions are defined as those that have a central focus on parenting abilities, behaviours and beliefs. Specifically, the intervention should include active training in parenting abilities with or without other foci. They should be somewhat standardised (e.g., based on a structured manual, booklets, or guidelines) in order to ensure reproducibility of the delivery by the staff to the parents(76). Some of the most common parenting interventions include Family Cognitive Behavioural Therapy, Incredible Years (IY), Triple P, Mellow Parenting, Strengthening Families Strengthening Communities, and the Family Links Nurturing Programme (FLNP)(77,78). Specific parenting abilities targeted are expected to be wide and can include attention to nurturing skills, teaching abilities, discipline, monitoring, management, language, parent-child relationship, self-regulatory strategies and others (for a complete list see: “Parenting Matters: Supporting Parents of Children Ages 0-8”(69)). We will therefore exclude any individual medical, psychiatric and psychological therapy administered to the caregiver which does not specifically intervene on parental abilities or behaviours. Interventions which were delivered during pregnancy, post-partum or before the 4th birthday of the child will be included. Treatments that occurred preconception will not be considered in this systematic review. No limitations on the intensity (number and length of sessions) and length of follow-up will be imposed. Parenting programmes will be considered eligible regardless of the theoretical framework.
Given the complexity of parenting interventions, we aim to disaggregate interventions into key components using a component-based network meta-analysis (NMA)(79). Possible components could include different intervention foci, such as the specific behaviours, skills, emotions or cognitions the intervention targets. This approach will enable us to determine which intervention components (or grouped intervention components) are driving any treatment effect. Where intervention components are not reported, authors will be contacted, and missing information will be requested.
All types of control group will be included in this systematic review, whether control participants received another intervention or none. Eligible comparators will include wait list, TAU (treatment as usual), other treatment and no treatment. Other treatments may include information about parenting and infant/child development, information on the management of behavioural/emotional problems, or other forms of psycho-/health education.
Years of publications considered
No limitation on the year of publication will be imposed.
No language limitation will be imposed. In the event that an eligible study has been written in a language other than English, where possible, professional translators will be hired.
Published and unpublished trials will be included in the review. Ongoing studies will be searched in the randomised controlled trial register (see website: https://clinicaltrials.gov/) and considered where relevant.
The search strategy was developed in collaboration with systematic review experts and a medical librarian with expertise in systematic reviews. It combines MeSH terms and keywords, which were harmonised for each database (the Medline search is reported below). Existing systematic reviews and meta-analyses known to the authors will also be hand-searched. The search will be re-run prior to the final analysis.
OVID, MEDLINE from 1946 to present:
Exp *Caregivers/ or exp *parents/ or exp *Legal Guardians/ or caretaker.mp. or custodian.mp. or exp *Pregnancy/ or exp *maternal behavior/ or exp *parent-child relations/ or exp *parenting/ or exp *paternal behavior/ or expectan*.mp. or exp *postnatal/ or exp *post-natal/ or exp *post-partum/ or exp *perinatal/ or exp *prenatal/ or exp *antenatal/ or parent-child relations/ or exp *father-child relations/ or exp *mother-child relations/ or exp *parenting/
exp *preventive health services/ or exp *Prenatal Education/ or exp *Perinatal Care/ or exp *"early intervention (education)"/ or exp *early medical intervention/ or *primary prevention/ or *secondary prevention/ or exp *tertiary prevention/ or exp *Psychotherapy/ or program*.mp. or coach*.mp. or training*.mp.
exp *control groups/ or exp *cross-over studies/ or exp *double-blind method/ or exp *random allocation/ or exp *single-blind method/ or randomi#ed controlled trial.mp. or exp *clinical trials as topic/ or exp *controlled clinical trials as topic/ or exp *randomized controlled trials as topic/
No language or date of publication filter will be applied. Publication type filter will be applied in each database where possible.
The following electronic databases will be searched:
- Cochrane Central Register of Controlled Trials (CENTRAL) (1996-present)
- Ovid Medical Literature Analysis and Retrieval System Online (Medline) (1949-present)
- Ovid Excerpta Medica Database (EMBASE) (1974-present)
- Ovid PsycINFO (1806-present)
- Education Resources Information Center (ERIC) (1966-present)
- ClinicalTrials.gov (1997-present)
RCTs registered at https://clinicaltrials.gov/ but not yet published will be screened and included when eligible.
Searching other resources
Reference lists of relevant systematic reviews and individual studies will be hand-searched.
Grey literature will be searched and known experts in the field will be contacted to explore the possibility of any unpublished research. Where critical information is not reported in published research, study authors will be contacted.
Deduplication will be carried out using EndNote.
Rayyan(80) software will be used for data management and study screening.
Following the deduplication process in EndNote, the selection process will be conducted independently by two review authors (IC and EP), who will screen and identify studies based on inclusion and exclusion criteria (Figure 4. Flow chart of selection process, Table 1).
Table 1. Inclusion and Exclusion Criteria
Participants
- Inclusion: Primary caregivers of infants and toddlers up to 3 years and 11 months.
- Exclusion: Studies including specific groups of caregivers with intellectual disabilities and with current mental health problems such as schizophrenia, substance misuse and abuse, and children born preterm, at low birth weight or with congenital diseases.
Intervention
- Inclusion: Structured psychosocial parenting intervention delivered either antenatal or within the child’s first 3 years and 11 months of life.
- Exclusion: Interventions not focusing specifically on parenting, interventions delivered at preconception, or unstructured interventions.
Comparator
- Inclusion: No restrictions will be imposed.
Outcomes
- Inclusion: Child and/or adolescent internalising problems up to age 18 and 11 months.
- Exclusion: Studies reporting only externalising problems or cognitive or health related outcomes.
Study design
- Inclusion: Randomised controlled trials (RCT) or quasi-RCTs either with individual or group levels of randomisation. Cross-over trials.
- Exclusion: Study designs such as case control, cohort, cross-sectional and systematic reviews.
No restrictions will be imposed.
The first selection of potentially eligible studies will be conducted by screening titles and abstracts in Rayyan. Disagreements between reviewers will be resolved through discussion and consensus. Where consensus is not possible, a third external reviewer (RMP) will resolve any disagreements. After the first round of screening is completed, the full texts of the resulting studies will be screened for eligibility. These articles will in turn be further screened independently by the two reviewers (IC and EP) according to the process outlined above. Detailed reasons for exclusion will be tracked and reported in the PRISMA flow diagram. Multiple reports of the same studies will be considered together. Included articles will receive an identification number before data extraction.
Data extraction and management
Data from included studies will be extracted and risk of bias assessments will be performed independently by two reviewers (IC and EP). Data regarding study design, intervention characteristics (type of intervention, theoretical framework of the intervention, experience of the provider, length, intensity, outcomes), participant characteristics, comparators, delivery mode, setting, attrition rates, outcome measures and effect sizes will be extracted and entered into RevMan(81).
Unadjusted results will be preferred over adjusted results to improve consistency across studies and reduce the potential for selective reporting bias. If missing data cannot be obtained, the Cochrane Practical Methods for Handling Missing Data will be used(82,83).
RCTs, quasi-randomised trials, and cross-over trials at the individual and group level will be eligible for inclusion. Information relevant for assessing risk of bias will be extracted using the Cochrane tool(84) (e.g., allocation method, randomisation).
Participant characteristics (of the caregiver and the child) that may modify treatment effects will be extracted and reported. These will include baseline psychopathology, gender, age, comorbidity status, presence and number of previous mental health related conditions, and psychiatric medication use. The number of participants at baseline and the number lost to drop-out will also be extracted.
Data on other potential intervention modifiers such as treatment length, intensity (frequency of sessions), length of follow-up, expertise of the therapist and measures used will be included.
Data on the type of comparator used as well as data regarding control group participants will be extracted.
Risk of bias in individual studies
The risk of bias assessment will be conducted independently by two reviewers (IC and EP) using the Cochrane Collaboration’s Risk of Bias Assessment Tool(84). In cases of disagreement, consensus will be reached through discussion and where not possible to obtain consensus, a third reviewer (RMP) will be included in the process.
The Cochrane Risk of Bias Assessment Tool(84) assesses the following potential sources of bias:
- random sequence generation
- allocation concealment
- blinding of participants and personnel
- blinding of outcome assessment
- incomplete outcome data
- selective reporting
- other sources of bias
Each category will be rated as being at low, high or unclear risk of bias.
Allocation methods will be assessed to determine the potential of bias due to the creation of incomparable groups.
Adequacy of concealment in the participant allocation process will be assessed. Randomised and quasi-randomised methods of concealment will be considered adequate. Potential for bias due to inadequate concealment of the allocation process will also be assessed.
Due to the nature of the interventions, double blinding is not possible. We will therefore evaluate whether personnel were blinded when allocating participants to the different conditions and whether outcome assessors were blinded as to which intervention group participants were in.
Incomplete outcome data
The intention-to-treat (ITT) approach is widely used in the estimation of treatment effects from RCTs because of its low risk for bias(85). Where studies did not report an intention-to-treat analysis, authors will be contacted in an attempt to obtain missing data. How the authors dealt with incomplete data and how data on attrition and exclusion were reported will also be evaluated. Imputation methods for handling missing data are described below.
Selective outcome reporting
Any evidence of attempts by the authors to omit the reporting of relevant outcomes will also be assessed.
Measures of treatment effect
Dichotomous outcome data
For dichotomous outcomes, intervention effectiveness will be summarised as odds ratios (OR) or risk ratios (RR) and presented with 95% confidence intervals (CIs) and standard deviations (SDs). We will use the Number Needed to Treat (NNT)(68,83) to estimate how many participants need to receive the intervention for one additional participant to achieve the outcome of interest.
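As an illustration of these summary measures, the sketch below computes the RR, OR (with a 95% CI on the log scale) and NNT from the event counts in two arms. The function names and the worked numbers are hypothetical, the CI uses the usual large-sample standard error of the log odds ratio, and no continuity correction for zero cells is applied.

```python
import math


def risk_ratio(events_t: int, n_t: int, events_c: int, n_c: int) -> float:
    """Risk in the intervention arm divided by risk in the control arm."""
    return (events_t / n_t) / (events_c / n_c)


def odds_ratio(events_t: int, n_t: int, events_c: int, n_c: int) -> float:
    """Odds of the event in the intervention arm relative to the control arm."""
    return (events_t * (n_c - events_c)) / (events_c * (n_t - events_t))


def odds_ratio_ci(events_t: int, n_t: int, events_c: int, n_c: int,
                  z: float = 1.96) -> tuple:
    """Approximate 95% CI for the OR (large-sample, no zero-cell correction)."""
    log_or = math.log(odds_ratio(events_t, n_t, events_c, n_c))
    se = math.sqrt(1 / events_t + 1 / (n_t - events_t)
                   + 1 / events_c + 1 / (n_c - events_c))
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


def number_needed_to_treat(events_t: int, n_t: int,
                           events_c: int, n_c: int) -> float:
    """Reciprocal of the absolute risk difference between arms."""
    risk_difference = abs(events_c / n_c - events_t / n_t)
    return math.inf if risk_difference == 0 else 1 / risk_difference
```

With 10 of 100 children meeting criteria for a disorder in the intervention arm and 20 of 100 in the control arm, the RR is 0.5 and the NNT is 10, meaning that about ten families would need to receive the intervention to prevent one additional case.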
Continuous outcome data
For continuous outcomes, data from treatment completers will be pooled by calculating mean differences (MDs) between groups. If trials measured the same outcome using different scales, standardised mean differences (SMDs), also known as Cohen’s d, will be estimated and reported with 95% CIs. The SMD is obtained by subtracting the mean obtained in the control group from the mean obtained in the intervention group and dividing this value by the standard deviation of the outcome among participants. Where it is not possible to calculate SMDs, t-tests, F-tests, χ2, p-values, eta-squared and beta coefficients will be used and reported. When SMDs differ from zero (based on 95% CIs), treatment and control groups will not be considered equivalent. If improvement is associated with lower scores on the outcome measure (e.g., fewer anxious or depressive symptoms), negative SMD values will be interpreted as the treatment being more effective than the control, and vice versa(86). Together with CIs, SMDs will be interpreted as follows: small effect size, SMD = 0.2; medium effect size, SMD = 0.5; and large effect size, SMD = 0.8(87). When baseline information is available, we will compare pre and post-test measures as standardised mean change as indicated in literature(88,89).
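The SMD calculation described above can be sketched as follows. Two points are assumptions made for this example: the denominator is taken to be the pooled within-group standard deviation, and the CI uses a common large-sample approximation for the standard error of Cohen's d. Function names are illustrative.

```python
import math


def pooled_sd(sd_t: float, n_t: int, sd_c: float, n_c: int) -> float:
    """Pooled within-group standard deviation of the two arms."""
    return math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2)
                     / (n_t + n_c - 2))


def smd(mean_t: float, sd_t: float, n_t: int,
        mean_c: float, sd_c: float, n_c: int) -> float:
    """Intervention mean minus control mean, divided by the pooled SD."""
    return (mean_t - mean_c) / pooled_sd(sd_t, n_t, sd_c, n_c)


def smd_ci(d: float, n_t: int, n_c: int, z: float = 1.96) -> tuple:
    """Approximate 95% CI for the SMD (large-sample standard error)."""
    se = math.sqrt((n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c)))
    return d - z * se, d + z * se
```

For a symptom scale where lower scores indicate improvement, an intervention mean of 10.0 and a control mean of 12.5 (both SD 5, n = 50 per arm) give an SMD of -0.5, a medium effect favouring the intervention under the thresholds above.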
Unit of analysis issues
If studies have not taken clustering into account, for example, where only raw or observed means and SDs are reported, methods in section 16.3.4 of the Cochrane Handbook(90) will be used to perform approximately correct analyses. Data from cluster randomised trials will only be included in meta-analyses if clustering has been quantified and reported using an intra-cluster-correlation coefficient (ICC), or if other approximately correct analyses can be performed.
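One widely used approximate adjustment for clustering reduces each arm to its effective sample size by dividing by the design effect, 1 + (m - 1) × ICC, where m is the average cluster size. The sketch below assumes this is the approach taken; the function names and example values are illustrative only.

```python
def design_effect(average_cluster_size: float, icc: float) -> float:
    """Inflation of variance caused by clustering: 1 + (m - 1) * ICC."""
    if not 0 <= icc <= 1:
        raise ValueError("icc must lie between 0 and 1")
    return 1 + (average_cluster_size - 1) * icc


def effective_sample_size(n_participants: int, average_cluster_size: float,
                          icc: float) -> float:
    """Sample size that an individually randomised trial would need to
    carry the same information as the cluster-randomised arm."""
    return n_participants / design_effect(average_cluster_size, icc)
```

For instance, an arm of 200 participants in clusters averaging 21 members with an ICC of 0.05 has a design effect of 2 and is analysed as if it contained 100 participants. Means and SDs are left unchanged under this approach.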
Repeated observations on participants
Studies reporting long-term outcomes will be included. Since the combination of outcomes with variable length of follow-up can lead to unit-of-analysis error in standard meta-analysis, separate analyses based on previously defined length of follow-up(91) will be carried out.
Dealing with missing data
In the case of missingness of relevant data, authors will be contacted to request sufficient information to conduct an intention-to-treat analysis. Where possible we will describe participant characteristics for whom data are missing and we will analyse the proportion of missingness as a function of the number of participants included in the analyses and total number of participants in the study (e.g., how many people were initially included in the study and for how many people outcome data are available). We will employ different methods to handle missing data according to the nature of missingness. Missing data due to dropout/attrition will be included in the risk of bias assessment.
Dichotomous outcome data
Missing dichotomous data will be assumed to be missing not at random (MNAR) or informatively missing (IM)(92). This approach assumes participants have dropped-out for some reason (for example participant’s mental health). We will assume that participants who dropped-out after the allocation process did so for negative reasons, for example non-response to the intervention (e.g., the intervention did not lead to an improvement).
A recommended simple imputation for dichotomous missing data is the best-case and worst-case scenario(93,94). In the best-case scenario, all missing participants in the experimental group are assumed to have had a positive outcome and all missing participants in the control group a negative or null outcome. Conversely, in the worst-case scenario, all missing participants in the experimental group are assumed to have had a negative or null outcome and all missing participants in the control group a positive outcome. In the case of a large amount of missing data, results obtained may be unrealistic because they will reflect two extreme scenarios(85,95).
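The two scenarios can be expressed as a small sketch. Each arm is described by the number of observed positive outcomes, the number of participants with outcome data, and the number randomised; the function names and this representation are assumptions made for illustration.

```python
def impute_arm(positive: int, observed: int, randomised: int,
               missing_as_positive: bool) -> tuple:
    """Return (positive outcomes, total) after imputing every missing
    participant as either a positive or a negative/null outcome."""
    missing = randomised - observed
    if missing < 0:
        raise ValueError("observed cannot exceed randomised")
    imputed_positive = positive + (missing if missing_as_positive else 0)
    return imputed_positive, randomised


def best_case(intervention: tuple, control: tuple) -> tuple:
    """Missing intervention participants improve; missing controls do not."""
    return (impute_arm(*intervention, missing_as_positive=True),
            impute_arm(*control, missing_as_positive=False))


def worst_case(intervention: tuple, control: tuple) -> tuple:
    """Missing intervention participants do not improve; missing controls do."""
    return (impute_arm(*intervention, missing_as_positive=False),
            impute_arm(*control, missing_as_positive=True))
```

With an intervention arm of (30 positive, 40 observed, 50 randomised) and a control arm of (20, 45, 50), the best case gives 40/50 versus 20/50 and the worst case gives 30/50 versus 25/50. The gap between the two resulting effect estimates shows how sensitive the conclusion is to the missing data.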
Continuous outcome data
When continuous outcome data are missing or outcomes were not recorded at time-point of interest, the Last Observation Carried Forward (LOCF) imputation method (using the last observed non-missing values)(96) will be used to fill-in missing values, whenever these data are available. Imputation methods will be considered carefully because of their potential to lead to biased results (e.g., overestimating or underestimating the effectiveness of an intervention).
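A minimal sketch of LOCF for one participant's repeated measurements is given below. Missing values are represented as None, which is a choice made for this example; values missing before any observation are left missing because there is nothing to carry forward.

```python
from typing import List, Optional


def locf(values: List[Optional[float]]) -> List[Optional[float]]:
    """Replace each missing value with the most recent observed value."""
    filled: List[Optional[float]] = []
    last_observed: Optional[float] = None
    for value in values:
        if value is not None:
            last_observed = value
        filled.append(last_observed)
    return filled
```

A participant with scores of 12 and 10 at the first two assessments and no later data would be assigned a score of 10 at every subsequent time-point, which illustrates why LOCF can understate or overstate change when symptoms would have continued to evolve.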
Missing standard deviations (SDs) will be calculated from p-values, t-test statistics, confidence intervals (CIs) and standard errors (SEs).
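The conversions mentioned above could look like the sketch below. It assumes large samples (a normal critical value of 1.96 for 95% CIs and a normal approximation when converting a two-sided p-value to a test statistic) and, for the t-statistic route, a comparison of two independent groups sharing a common SD. For small samples the corresponding t-distribution values should be used instead. Function names are illustrative.

```python
import math
from statistics import NormalDist


def sd_from_se(se: float, n: int) -> float:
    """SD of a single group from the standard error of its mean."""
    return se * math.sqrt(n)


def sd_from_ci(lower: float, upper: float, n: int, z: float = 1.96) -> float:
    """SD of a single group from the CI of its mean (large-sample z)."""
    se = (upper - lower) / (2 * z)
    return sd_from_se(se, n)


def sd_from_t(mean_difference: float, t_statistic: float,
              n_t: int, n_c: int) -> float:
    """Common SD of two groups from the t statistic of their difference."""
    se = abs(mean_difference / t_statistic)
    return se / math.sqrt(1 / n_t + 1 / n_c)


def statistic_from_p(p_two_sided: float) -> float:
    """Approximate test statistic for a two-sided p-value, using the
    normal distribution in place of the t distribution (assumption)."""
    return NormalDist().inv_cdf(1 - p_two_sided / 2)
```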
Assessment of heterogeneity
Clinical and methodological heterogeneity will be assessed by examining variation across studies in participant characteristics, intervention type and delivery mode, outcomes or other relevant study characteristics such as concealment method or blinding procedures.
Results will be analysed using a network meta-analysis (NMA)(97). NMA allows the comparative effectiveness of several competing interventions for a condition to be assessed, as long as all the trials included in the analysis form a connected network of interventions(67,68,98). The idea behind NMA is a simple one: when head-to-head evidence comparing interventions B and C is not available, evidence on the BC intervention effect can be obtained indirectly via trials comparing AB and AC (Figure 5. Illustration of indirect treatment comparison in an NMA). This enables all pairwise effects to be estimated indirectly, even in the absence of direct evidence, whilst respecting the randomised structure of the evidence(99).
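The indirect comparison underlying this idea can be illustrated with a simple frequentist calculation. This is only a sketch of the principle, not the Bayesian model that will be fitted: the effect of C relative to B is the difference between the AC and AB effects, and its variance is the sum of their variances. Effects are assumed to be on an additive scale (e.g., SMD or log odds ratio), and the function name is hypothetical.

```python
import math


def indirect_effect(d_ab: float, se_ab: float,
                    d_ac: float, se_ac: float, z: float = 1.96) -> tuple:
    """Indirect estimate of C versus B from trials of B versus A and
    C versus A, with its standard error and approximate 95% CI."""
    d_bc = d_ac - d_ab
    se_bc = math.sqrt(se_ab ** 2 + se_ac ** 2)
    return d_bc, se_bc, (d_bc - z * se_bc, d_bc + z * se_bc)
```

If intervention B has an SMD of -0.2 (SE 0.3) against control A and intervention C has an SMD of -0.5 (SE 0.4) against A, the indirect estimate for C versus B is -0.3 with an SE of 0.5. The larger SE shows that indirect evidence is less precise than either direct comparison.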
In the first instance, we will fit a model which compares interventions as “clinically meaningful units” i.e., at the whole intervention level. If appropriate, we will also conduct a component-level NMA, where the ‘active ingredients’ of interventions are modelled using a network meta-regression approach(79). We will explore component effects using an additive main effects model, as well as a full interaction (multiplicative) model where each unique combination of components is regarded as a separate intervention.
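The difference between the two component models can be sketched as follows. Under the additive model, the effect of an intervention is the sum of the effects of its components; under the full interaction model, each distinct combination is treated as its own intervention with its own effect. The component names and effect values in the example are hypothetical and are not taken from the review.

```python
from typing import Dict, FrozenSet, Iterable


def additive_effect(components: Iterable[str],
                    component_effects: Dict[str, float]) -> float:
    """Additive main effects model: sum the effects of the components."""
    return sum(component_effects[c] for c in components)


def interaction_treatment(components: Iterable[str]) -> FrozenSet[str]:
    """Full interaction model: the unique set of components defines a
    separate intervention node in the network."""
    return frozenset(components)


# Hypothetical component effects on the SMD scale.
example_effects = {"psychoeducation": -0.10, "behaviour_management": -0.20}
combined = additive_effect(["psychoeducation", "behaviour_management"],
                           example_effects)
```

In this illustration the combined intervention is predicted to have an effect of about -0.30 under additivity, whereas the interaction model would estimate the effect of that combination freely from the trials that used it.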
All statistical analyses will be conducted in a Bayesian framework using OpenBUGS software (www.openbugs.net). OpenBUGS is commonly used software for conducting NMA in a Bayesian framework due to its flexibility and availability of programme code(100).
Statistical heterogeneity is anticipated given potential variation in participant characteristics, intervention type and mode of delivery, and outcome measures. Random effects models, assuming a common between-study variability, will be used to address resulting statistical heterogeneity. The goodness of fit of each model to the data will be assessed using the posterior mean residual deviance, defined as the difference between the deviance for the fitted model and the saturated model, where deviance quantifies model fit using the likelihood function. Models will be compared using the Deviance Information Criterion (DIC), calculated by summing the posterior mean of the residual deviance and the effective number of parameters. The DIC penalises the posterior mean residual deviance (model fit) by the effective number of parameters in the model (model complexity) and therefore takes both model fit and complexity into account.
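The model comparison step can be sketched as below, taking the posterior mean residual deviance and effective number of parameters as already reported by the fitting software. The model names and numbers are hypothetical, and choosing the lowest DIC is a simplification: small differences in DIC are usually not treated as decisive.

```python
from typing import Dict, Tuple


def dic(posterior_mean_residual_deviance: float,
        effective_parameters: float) -> float:
    """DIC = posterior mean residual deviance + effective number of parameters."""
    return posterior_mean_residual_deviance + effective_parameters


def lowest_dic_model(models: Dict[str, Tuple[float, float]]) -> str:
    """Name of the model with the smallest DIC among the candidates."""
    return min(models, key=lambda name: dic(*models[name]))


# Hypothetical values: (posterior mean residual deviance, pD).
candidates = {"fixed effect": (52.1, 10.0), "random effects": (40.3, 18.5)}
```

Here the random effects model fits better (lower residual deviance) but uses more effective parameters; its DIC of 58.8 is still lower than the fixed effect model's 62.1.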
NMA validity depends on the consistency assumption, that is, that there is no intervention effect modification by treatment comparison or that the frequency of effect modifiers is similar across the included studies. This assumption can be examined by assessing the inclusion/exclusion criteria of every intervention in the network to determine whether participants, intervention protocols and administration, etc. are similar in ways that may modify treatment effects. Trial and participant characteristics (such as maternal depression) will therefore be compiled into a table to facilitate visual inspection of the ‘similarity’ of factors we consider likely to modify treatment effects.
Assessment of reporting biases
Potential small-study effects will be examined by including study size as a covariate in meta-regression analyses. Funnel plot asymmetry will be tested(101) to explore potential reporting bias (i.e., publication bias, selective outcome reporting and selective analysis reporting). In addition, we will try to reduce selective outcome reporting bias by not excluding studies based on the outcomes, as recommended in the Cochrane Handbook for Systematic Reviews of Interventions, section 8.7 and 7(102,103), and by checking, where possible, for discrepancies between outcomes reported in protocols and outcomes reported in the published articles.
Subgroup analysis and investigation of heterogeneity
We anticipate that, where possible, we will conduct subgroup analyses on: parent characteristics (e.g., gender, age, previous or present psychopathology); child characteristics (e.g., gender, age, comorbidities); socio-economic status of the household and of the broader environment (e.g., household income, low vs high income country); programme administrator characteristics (e.g., years of expertise); intervention features (e.g., comparator, intensity or length of the intervention, theoretical framework); and outcome characteristics (e.g., short vs long-term follow-up)(104). We will attempt to include all possible relevant modifying factors and exclude prognostic factors. Unfortunately, especially in psychological research, these factors (modifiers and prognostic factors) often overlap and their potential role is difficult to disentangle, particularly at the protocol stage. Finally, an individual’s genetic liability to mental health problems may represent an important modifier and/or prognostic factor. Whilst genetic risk is not modifiable (i.e., it does not represent a viable target of interventions) and is unlikely to be measured in the studies eligible for this review, the use of a well-performed random or quasi-random allocation should minimise imbalances in important prognostic variables or effect modifiers across intervention and control arms, allowing inferences to be made on the effectiveness of parenting interventions on child and adolescent internalising problems.
Once we have obtained the studies, we will examine the presence of potential modifiers and prognostic factors and decide which modifiers have sufficient scientific justification to be included in post hoc analyses, as recommended in the Cochrane Handbook for Systematic Reviews of Interventions, section 9.6.5(90).
In the selection of characteristics to be included in our meta-analysis, we will consider that certain relationships may be confounded. For example, we may not find any effect of the intensity of an intervention because it is closely related to the severity of the condition of the participants, which could bias our findings.
Sensitivity analyses will be conducted by excluding studies at high or unclear risk of bias on allocation concealment and blinding domains per the Cochrane Risk of Bias Assessment Tool(84). Sensitivity analyses will include:
- fixed-effect analyses for the pairwise and network meta-analyses;
- removal of trials in which missing data were imputed and exchangeability assumptions were not met;
- removal of trials that used non-operationalised diagnostic criteria.
Confidence in cumulative evidence
We will use an appropriate tool to summarise and assess the quality of the evidence across included studies(105). This might include: Grades of Recommendations, Assessment, Development, and Evaluation (GRADE)(106), Confidence in Network Meta-Analysis (CINeMA)(107) or threshold analysis(108).