Descriptive Results
The distribution of item responses by sex for the SDQ is shown in Tables S1-S3. Items measuring internalising symptoms were skewed towards fewer internalising traits, whilst pro-social items were skewed towards more pro-social behaviour. Items measuring externalising symptoms were typically skewed towards fewer externalising traits, although three items in boys and two items in girls from the hyperactivity scale were not skewed. Boys were less likely than girls to report emotional problems, whilst girls typically indicated more pro-social behaviour than boys.
Exploratory Factor Analysis and Confirmatory Factor Analysis
A five-factor structure demonstrated the best model fit, whilst also retaining an eigenvalue greater than one in the EFA (Table S4). Geomin rotated loadings indicated a unique factor for emotional symptoms, but with a weaker loading for the item “complain” (0.36) (Table S5). There were cross-loadings for the items “liked” (loading onto both peer problem and pro-social scales), “restless” and “fidget” (hyperactivity and conduct scales), and weak cross-loadings for “impulse” (pro-social and the hyperactivity scales), and “considerate” (pro-social and conduct problem scales).
Fit statistics in the CFA for the five-factor model (model 1) and the second-order model (model 5) were comparable, and both indicated better fit than the three-factor model (model 3; Table 2). Including correlated errors improved model fit in all cases; models 2 and 6 were therefore selected for subsequent validity analyses.
The majority of the standardised factor loadings (Tables S6-S7) for models 2 and 6 were acceptable (> 0.5). Over half of the first-order item loadings (N = 13) exceeded 0.7. Loadings for the emotional symptom factor ranged from 0.49 to 0.83, for conduct problems from 0.65 to 0.81, for the hyperactivity scale from 0.65 to 0.76, for the peer problem scale from 0.58 to 0.78, and for the pro-social scale from 0.56 to 0.79 (Fig. 2). Loadings onto the second-order factors were 0.88 and 0.93 for internalising symptoms, and 0.95 and 0.88 for externalising symptoms (Fig. 3). For all items except one (“complains”), the underlying factor explained > 30% of the item variance in both the first-order and second-order models. The internal consistency of the subscales and the second-order factors, as indicated by the ordinal alpha, was good in all cases (> 0.7) (Table S8).
Table 2
Goodness of Fit Indices for Competing Models in Confirmatory Factor Analysis.
Model | χ2 | df | CFI | TLI | RMSEA | SRMR
1) Baseline five-factor model | 2227.962 | 265 | 0.921 | 0.911 | 0.036 | 0.062
2) Five-factor model with correlated errors | 1710.710 | 255 | 0.942 | 0.931 | 0.032 | 0.057
3) Baseline three-factor model | 2947.547 | 272 | 0.893 | 0.882 | 0.042 | 0.075
4) Three-factor model with correlated errors | 1921.125 | 262 | 0.934 | 0.924 | 0.033 | 0.062
5) Second-order two-factor model | 2247.816 | 268 | 0.921 | 0.911 | 0.036 | 0.064
6) Second-order two-factor model with correlated errors | 1741.387 | 258 | 0.941 | 0.931 | 0.032 | 0.058
Table 2 Footnote: Abbreviations: χ2 – chi-squared; df – degrees of freedom; CFI – Comparative Fit Index; TLI – Tucker-Lewis Index; RMSEA – Root Mean Square Error of Approximation; SRMR – Standardized Root Mean Squared Residual.
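The statement that correlated errors improved fit in every case can be checked directly against the Table 2 values. The following is a minimal sketch in Python, not part of the original analysis; the names `FIT`, `PAIRS`, `fit_changes` and `improved` are illustrative, and "improved" is assumed here to mean higher CFI and TLI together with lower RMSEA and SRMR.

```python
# Fit indices copied from Table 2, keyed by model number: (CFI, TLI, RMSEA, SRMR).
FIT = {
    1: (0.921, 0.911, 0.036, 0.062),
    2: (0.942, 0.931, 0.032, 0.057),
    3: (0.893, 0.882, 0.042, 0.075),
    4: (0.934, 0.924, 0.033, 0.062),
    5: (0.921, 0.911, 0.036, 0.064),
    6: (0.941, 0.931, 0.032, 0.058),
}

# Each baseline model paired with its correlated-errors counterpart.
PAIRS = [(1, 2), (3, 4), (5, 6)]

INDICES = ("CFI", "TLI", "RMSEA", "SRMR")
HIGHER_IS_BETTER = {"CFI", "TLI"}  # assumption: RMSEA and SRMR are better when lower


def fit_changes(base, modified):
    """Change in each index (modified minus baseline), rounded to 3 decimals."""
    return {
        name: round(m - b, 3)
        for name, b, m in zip(INDICES, FIT[base], FIT[modified])
    }


def improved(changes):
    """True if every index moved in the direction that indicates better fit."""
    return all(
        (delta > 0) if name in HIGHER_IS_BETTER else (delta < 0)
        for name, delta in changes.items()
    )


for base, modified in PAIRS:
    changes = fit_changes(base, modified)
    print(f"model {base} -> model {modified}: {changes}, improved={improved(changes)}")
```

This compares descriptive fit indices only; it is not a formal nested-model test.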
Measurement Invariance
Structural (configural), factorial (metric) and strong factorial (scalar) invariance were demonstrated for the first-order five-factor model across both sex and index of multiple deprivation (model 2, Table S9). The differences between the configural and metric models, and between the metric and scalar models, ranged from 0.001 to 0.004 (CFI), 0.004 to 0.008 (TLI), 0.001 to 0.002 (RMSEA) and 0.000 to 0.001 (SRMR). For the second-order factor model (model 6), scalar invariance was demonstrated, as model fit was good or acceptable for all indices across all groups tested (Table 3).
Table 3
Fit Indices for the Scalar Model for the Second-Order Factor
Model | CFI | TLI | RMSEA | SRMR
Sex | 0.946 | 0.943 | 0.029 | 0.065
Index of multiple deprivation | 0.946 | 0.944 | 0.027 | 0.067
Table 3 Footnote: Abbreviations: CFI – Comparative Fit Index; TLI – Tucker-Lewis Index; RMSEA – Root Mean Square Error of Approximation; SRMR – Standardized Root Mean Squared Residual.
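As an illustration of how the differences in fit between nested invariance models for model 2 might be judged, the sketch below compares the largest reported differences against change-in-fit cut-offs. The cut-offs (0.010 for CFI, 0.015 for RMSEA, 0.010 for SRMR) are commonly cited conventions assumed here for illustration; the source does not state which criteria were applied, and all names in the code are hypothetical.

```python
# Largest absolute differences between nested invariance models reported
# for the first-order five-factor model (model 2).
MAX_REPORTED_DELTA = {"CFI": 0.004, "TLI": 0.008, "RMSEA": 0.002, "SRMR": 0.001}

# ASSUMPTION: conventional change-in-fit cut-offs, not taken from the source.
ASSUMED_CUTOFF = {"CFI": 0.010, "RMSEA": 0.015, "SRMR": 0.010}


def within_cutoffs(deltas, cutoffs):
    """For each index that has a cut-off, report whether the change stays within it."""
    return {name: abs(deltas[name]) <= limit for name, limit in cutoffs.items()}


checks = within_cutoffs(MAX_REPORTED_DELTA, ASSUMED_CUTOFF)
for name, ok in checks.items():
    print(f"{name}: max change {MAX_REPORTED_DELTA[name]:.3f} "
          f"vs cut-off {ASSUMED_CUTOFF[name]:.3f} -> within={ok}")
```

Under these assumed cut-offs, every reported change falls within the limit, which is consistent with the conclusion of invariance stated above.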
Average Variance Extracted
For the five-factor model (model 2), AVE scores were above the 0.5 threshold for the emotional problems and conduct problems factors, borderline for the pro-social factor (AVE 0.49), and lower for the peer problems and hyperactivity factors (Table S10). For the second-order factors (model 6), AVE scores for internalising and externalising symptoms were 0.81 and 0.84 respectively (Table 4).
In the five-factor model (model 2), AVE scores for the peer problems and emotional problems latent variables were smaller than their squared correlation (0.72), but larger than any other squared correlation, indicating that they may be measuring a similar construct but are distinct from other constructs. This was also the case for the conduct and hyperactivity factors. When a second-order model was adopted, the AVE scores of internalising (0.81) and externalising (0.84) symptoms were greater than their squared correlation (0.57), indicating that they were measuring separate constructs (Table 4). However, the pro-social AVE score (0.49) was lower than its squared correlation with the externalising symptoms factor (0.51).
Table 4
Average Variance Extracted and Squared Correlations for Second-Order Factor Model
Factor | AVE | Correlation: Internalising | Correlation: Externalising | Correlation: Pro-Social | Squared Correlation: Internalising | Squared Correlation: Externalising | Squared Correlation: Pro-Social
Internalising | 0.81 | | | | | |
Externalising | 0.84 | 0.76 | | | 0.57 | |
Pro-Social | 0.49 | -0.47 | -0.72 | | 0.22 | 0.51 |
Table 4 Footnote: Average variance extracted (AVE) scores are the average R² score, and represent the average variance explained by the factor in the items by which it is measured.
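The comparison of AVE scores with squared correlations described in this section can be sketched as a simple pairwise check. The Python below is illustrative only: it uses the AVE and squared-correlation values reported in Table 4, treats a pair of factors as distinct when both AVE scores exceed the pair's squared correlation, and introduces hypothetical names (`AVE`, `SQUARED_CORRELATION`, `discriminant_check`).

```python
# AVE scores and squared correlations copied from Table 4 (second-order model).
AVE = {"Internalising": 0.81, "Externalising": 0.84, "Pro-Social": 0.49}

SQUARED_CORRELATION = {
    ("Internalising", "Externalising"): 0.57,
    ("Internalising", "Pro-Social"): 0.22,
    ("Externalising", "Pro-Social"): 0.51,
}


def discriminant_check(ave, squared_correlation):
    """Map each factor pair to True when both AVE scores exceed the
    pair's squared correlation (the factors look distinct), else False."""
    return {
        (a, b): ave[a] > r2 and ave[b] > r2
        for (a, b), r2 in squared_correlation.items()
    }


results = discriminant_check(AVE, SQUARED_CORRELATION)
for (a, b), distinct in results.items():
    print(f"{a} vs {b}: distinct={distinct}")
```

With the Table 4 values, the only pair that fails the check is externalising with pro-social, because the pro-social AVE (0.49) is below the squared correlation (0.51).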
Predictive Validity
For the five-factor model (model 2), emotional problems and conduct problems were positively associated with depression in the models adjusted for sex, ethnicity and stratification characteristics (Table S11). Peer problems and hyperactivity positively predicted ADHD diagnosis, whilst emotional problems negatively predicted diagnosis. For autism, peer problems and hyperactivity positively predicted diagnosis, whilst conduct problems and pro-social behaviours negatively predicted it.
For the second-order factors (model 6), a higher level of internalising symptoms was associated with an increased likelihood of depression, and a higher level of externalising symptoms was associated with ADHD (Table 5). Pro-social behaviour was also positively associated with a formal diagnosis of ADHD. Internalising symptoms were positively related to autism diagnosis, whilst pro-social behaviour was negatively related.
Sensitivity Analysis
The items “impulse” and “liked” were removed in the sensitivity analysis (models 7–12), as they cross-loaded onto conceptually distinct factors. The item “considerate” was retained because it loaded weakly onto the conduct problem factor (0.36) and more strongly onto the pro-social factor (0.52). The EFA again found the five-factor model to be the best fitting once the cross-loading items were removed, and the geomin rotated factor loadings are shown in Table S12. In the CFA, as in the main analysis, the five-factor model with correlated errors (model 8) and the second-order model with correlated errors (model 12) had the best fit (Table S13).
The AVE scores for the five-factor structure (model 8) were similar to those in the main analysis. In the second-order model (model 12), AVE scores were slightly higher for internalising symptoms (0.85) and marginally lower for externalising symptoms (0.79) (Tables S14-S15). The AVE score for the pro-social factor (0.49) exceeded its squared correlation with the externalising symptoms factor (0.47).
Results for predictive validity with cross-loadings removed (models 8 and 12, Tables S16-S17) were comparable to the results in the main analysis. There was additional evidence that internalising symptoms were negatively associated with ADHD at age 14. In the five-factor model, hyperactivity also negatively predicted depression, and emotional symptoms were negatively associated with autism.
Table 5
Predictive Validity of the Second-Order Factor Structure
Outcome | Factor | Unadjusted Estimate | Unadjusted SE | Unadjusted P Value | Adjusted Estimate | Adjusted SE | Adjusted P Value
Depression | Internalising Symptoms | 0.37 | 0.09 | < 0.001 | 0.32 | 0.08 | < 0.001
Depression | Externalising Symptoms | 0.05 | 0.1 | 0.66 | 0.07 | 0.09 | 0.42
Depression | Pro-social | 0.1 | 0.07 | 0.13 | 0.05 | 0.06 | 0.47
ADHD | Internalising Symptoms | -0.16 | 0.09 | 0.071 | -0.09 | 0.08 | 0.29
ADHD | Externalising Symptoms | 0.92 | 0.12 | < 0.001 | 0.8 | 0.11 | < 0.001
ADHD | Pro-social | 0.14 | 0.08 | 0.083 | 0.15 | 0.08 | 0.04
Autism/Asperger’s | Internalising Symptoms | 0.54 | 0.07 | < 0.001 | 0.58 | 0.15 | < 0.001
Autism/Asperger’s | Externalising Symptoms | 0.11 | 0.1 | 0.25 | 0.05 | 0.09 | 0.60
Autism/Asperger’s | Pro-social | -0.17 | 0.07 | 0.018 | -0.14 | 0.08 | 0.07
Table 5 Footnote: Standardised probit regression coefficients and standard errors (SE). In unadjusted models, all factors are included in the model simultaneously (internalising symptoms, externalising symptoms, pro-social). In adjusted models, adjustments are made for sex, ethnicity and stratification characteristics.