Dataset
Data used in the current study were drawn from Understanding Society, the UK Household Longitudinal Study (UKHLS), which follows the members of approximately 40,000 households (at Wave 1) in the United Kingdom. Ethical approval was received from the University of Essex Ethics Committee for all waves48. The youth sample was the focus of the current study. This comprised household members aged 10-15 years who, following permission from their guardian, were asked to complete a short self-report survey on aspects of mental health and wellbeing, health, day-to-day activities, school and social life, and aspirations about their future49. At the time of analysis, 10 waves of data collection were available between 2009 and 2019.
Mental health symptoms were assessed once every two years; thus, only five waves were included in the current study: wave 1 (2009-2011; n = 4,899), wave 3 (2011-2013; n = 4,427), wave 5 (2013-2015; n = 3,655), wave 7 (2015-2017; n = 3,629), and wave 9 (2017-2019; n = 2,821). The current sample consisted of 12,046 young people across all five waves. Young people were asked to indicate whether they were male or female. We therefore consider this question to capture their sex. A total of 5,997 (49.8%) young people were female and 6,044 (50.2%) were male. Five participants (0.04%) had inconsistent information on the sex variable (score of 0); these participants either identified with a different sex across time or refused to provide this information. Given the small number of participants, we were unable to form a meaningful group, so they were excluded from analyses. Participants’ age ranged between 10 and 15 with a mean age of 12.5 (wave 1: M = 12.51 (SD = 1.7), wave 3: M = 12.53 (SD = 1.7), wave 5: M = 12.56 (SD = 1.67), wave 7: M = 12.52 (SD = 1.7), wave 9: M = 12.48 (SD = 1.7)).
Variables
Because the network package restricts the number of variables that can be modelled, 17 variables were chosen based on the existing evidence reported above, focusing on satisfaction with life, mental health symptoms, frequency of social media use, and social relationships. The mental health symptoms and bullying items were drawn from the self-report Strengths and Difficulties Questionnaire50, which uses a 3-point response scale (not true, somewhat true, certainly true), with higher scores indicating more mental health difficulties. The Life Dissatisfaction measure, which was developed as part of the Understanding Society study, uses a 7-point scale where categories are represented by a series of visual anchors depicting smiling to frowning faces. Higher scores represent higher life dissatisfaction. The question on social media assesses the frequency of social media use, as perceived by the young person (see Table 1 for a full description of the variables).
Statistical analysis
All network models were estimated using the R package psychonetrics (version 0.9)51 and were visualised using the packages qgraph (version 1.6.9)52 and ggplot2 (version 3.3.5)53. The code used in the current study is available at https://osf.io/scz6x/. The psychonetrics package does not yet allow for the inclusion of covariates. Thus, to control for key stable covariates (age, ethnicity, and family income), we used multiple regression to regress each of the 17 variables onto the three covariates at each of the five waves of data, and used the residuals as the main data in the current analysis (S Epskamp, 2020, personal communication, 12 June). Although this was not part of the study aim, for transparency we also provide the results for networks estimated without covariates in the supplementary material (Table S2, Figures S1-S4) and at https://osf.io/scz6x/. The following time-variant social variables were controlled for by including them in the network model: bullying, lack of family support, and number of supportive friends. Data were treated as continuous and models were estimated using the robust full information maximum likelihood estimator (MLR)54.
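The covariate adjustment described above can be illustrated with a minimal sketch. This is not the study's code (which is in R and available at the OSF link); it is a Python illustration using simulated data, and it assumes complete cases and numerically coded covariates (a categorical covariate such as ethnicity would need dummy coding first). The function name `residualise` and all data are hypothetical. In the study, this step would be repeated separately for each of the five waves.

```python
import numpy as np

def residualise(outcomes, covariates):
    """Return the residuals of each outcome column after an OLS
    regression on the covariates (with an intercept)."""
    # Intercept column plus the covariates form the design matrix
    design = np.column_stack([np.ones(len(covariates)), covariates])
    # Least-squares coefficients for all outcome columns at once
    coefs, *_ = np.linalg.lstsq(design, outcomes, rcond=None)
    # Residual = observed value minus the part predicted by the covariates
    return outcomes - design @ coefs

# Simulated stand-in data for ONE wave (hypothetical, not study data)
rng = np.random.default_rng(0)
n = 200
covariates = rng.normal(size=(n, 3))            # stand-ins for three covariates
outcomes = covariates @ rng.normal(size=(3, 17)) + rng.normal(size=(n, 17))

residuals = residualise(outcomes, covariates)   # 17 adjusted variables
```

By construction, the residuals are uncorrelated with the covariates, so any association estimated between the residualised variables cannot be a linear effect of those covariates.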
All analyses were conducted separately by sex. Three network models were considered in order to identify the optimal network: 1) a baseline model estimated with saturated network structures (i.e. where all edges are included), 2) a pruned model, where non-significant parameters identified in the baseline model were recursively pruned at α = .05, and 3) a model in which parameters that were removed in the pruned model were added back in one at a time through the stepup function, until the optimal Bayesian Information Criterion (BIC) was reached. This data-driven approach is consistent with recommended practice in network modelling55. Model fit was evaluated against recommended thresholds56, with comparative fit index (CFI) and Tucker-Lewis Index (TLI) values > .95 and root mean square error of approximation (RMSEA) < .06 indicating acceptable model fit. However, given that this is a data-driven approach, a good model fit was expected. The three models were nested (model 1 vs. model 3, and model 3 vs. model 2) and were thus compared based on the Akaike information criterion (AIC) and BIC, which penalize for model complexity. Chi-square model difference testing was avoided given its sensitivity to sample size57. The best-fitting model was therefore the one with the lowest AIC and BIC values.
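As an illustration of the comparison step, the following sketch computes AIC and BIC from a model's log-likelihood using their standard definitions (AIC = 2k - 2 ln L; BIC = k ln n - 2 ln L) and selects the model with the lowest values. The log-likelihoods, parameter counts, and sample size are invented for illustration and are not results from the study.

```python
import math

def aic(log_lik, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_lik

def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_lik

# Hypothetical fit summaries for the three candidate models
n_obs = 500
models = {
    "saturated": {"log_lik": -1000.0, "n_params": 60},
    "pruned":    {"log_lik": -1010.0, "n_params": 30},
    "stepup":    {"log_lik": -1002.0, "n_params": 32},
}

# Score every model on both criteria
scores = {
    name: {
        "aic": aic(m["log_lik"], m["n_params"]),
        "bic": bic(m["log_lik"], m["n_params"], n_obs),
    }
    for name, m in models.items()
}

# Lower is better for both criteria
best_by_aic = min(scores, key=lambda name: scores[name]["aic"])
best_by_bic = min(scores, key=lambda name: scores[name]["bic"])
```

Because BIC penalises each additional parameter by ln(n) rather than 2, it favours sparser models than AIC once the sample exceeds about eight observations; the two criteria need not agree in general.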
Three matrices were extracted from the best-fitting model for further testing. These can be interpreted similarly to a latent random-intercept cross-lagged panel model42. First, the within-person temporal matrix encodes predictive effects over time, which are modelled via a regression on the previous measurement occasion (t-1). This matrix includes autoregressions (stability) and cross-lagged effects, which are interpreted as directed partial correlations (standardised beta coefficients). Second, the within-person contemporaneous matrix encodes the average associations between variables within the same measurement occasion after taking temporal effects into account; these are interpreted as undirected partial correlations. Third, the between-person matrix allows for the separation of within- and between-person effects. The parsing out of this information is akin to the key difference between a traditional and a random-intercept cross-lagged panel model. This matrix encodes undirected partial correlations between stable means, which represent stable trait-like differences between individuals58. For example, as described in Figure 1, a positive between-person relationship between the two variables social media and unhappiness would indicate that adolescents with a higher mean on social media use, averaged across all waves, tend to report a higher unhappiness mean than adolescents with a lower social media use mean59. Both within- and between-person effects yield useful insights for intervention, as they can help identify what and who the intervention should focus on, respectively60.
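To make the notion of an undirected partial correlation concrete, the sketch below applies the standard identity that links a precision (inverse covariance) matrix K to partial correlations: the partial correlation between variables i and j is -K_ij / sqrt(K_ii K_jj). The assumption here is that the contemporaneous and between-person networks are obtained from a precision matrix in this way; the function name and the 3-variable matrix are hypothetical.

```python
import numpy as np

def partial_correlations(precision):
    """Convert a precision (inverse covariance) matrix into a matrix of
    partial correlations, with zeros on the diagonal."""
    scale = np.sqrt(np.diag(precision))
    # Standardise and flip the sign of every off-diagonal element
    pcor = -precision / np.outer(scale, scale)
    # A node has no edge with itself in the undirected network
    np.fill_diagonal(pcor, 0.0)
    return pcor

# Hypothetical precision matrix for three variables arranged in a chain:
# variables 0 and 2 are conditionally independent given variable 1
precision = np.array([
    [ 2.0, -1.0,  0.0],
    [-1.0,  2.0, -1.0],
    [ 0.0, -1.0,  2.0],
])

pcor = partial_correlations(precision)
```

A zero in the precision matrix yields a zero partial correlation, which is why an absent edge in such a network is read as conditional independence given all other variables.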
These matrices were plotted for each sex, where circles represent the variables of Table 1 (also called nodes), and lines represent the associations between variables (also called edges). Blue and red edges represent positive and negative associations, respectively. The thicker the edge, the stronger the relationship. Edges in the temporal network (standardised beta coefficients) indicate the association between an outcome variable at time t and a predictor at the previous time point (t-1), after removing the linear effects of all other variables at that previous time point61. Curved arrows in the temporal network represent autoregressions (i.e. stability).
Two indices were calculated to provide more information at the node level. First, the intraclass correlation coefficient (ICC) represents the proportion of variance in a given node that is accounted for by the between-person effects. In other words, it indicates the degree to which stable trait-like differences account for the variability of a given outcome in adolescents. Second, the expected influence provides information about the role each node may play in the activation, persistence, and remission of the network32. In the temporal network this can be used to make predictions about the influence of a node. Specifically, the incoming and outgoing expected influence were calculated; these represent the network’s predictive influence on the node, and the predictive influence of the node on the rest of the network, respectively. This was preferred over commonly used centrality indices (e.g. strength), as it accounts for negative edges that might exist in the network32. Expected influence can be used to identify symptoms or factors in a network that may play an important role in the development of mental health32.
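Both indices can be sketched in a few lines. The example below makes several assumptions that the text does not specify: the ICC is computed as between-person variance divided by total (between plus within) variance; the temporal matrix is oriented so that entry [i, j] is the effect of node i at t-1 on node j at t; and autoregressions (the diagonal) are excluded from expected influence. The function names and the 3-node matrix are hypothetical.

```python
import numpy as np

def icc(between_var, within_var):
    """Proportion of a node's variance attributable to stable
    between-person differences."""
    return between_var / (between_var + within_var)

def expected_influence(temporal):
    """Incoming and outgoing expected influence for each node.

    temporal[i, j] is ASSUMED to be the effect of node i at t-1 on
    node j at t. Signed edge weights are summed, so positive and
    negative edges can offset each other (unlike strength, which
    sums absolute values)."""
    # Drop autoregressions so a node's stability is not counted
    off_diag = temporal - np.diag(np.diag(temporal))
    out_ei = off_diag.sum(axis=1)   # row sums: node -> rest of network
    in_ei = off_diag.sum(axis=0)    # column sums: rest of network -> node
    return in_ei, out_ei

# Hypothetical temporal matrix for three nodes
temporal = np.array([
    [0.4, 0.2, -0.1],
    [0.0, 0.5,  0.3],
    [0.1, 0.0,  0.3],
])

in_ei, out_ei = expected_influence(temporal)
node_icc = icc(between_var=1.0, within_var=3.0)
```

In this toy matrix the first node has outgoing edges of 0.2 and -0.1, giving an outgoing expected influence of 0.1, whereas its strength (the sum of absolute weights) would be 0.3; the difference shows how signed summation treats opposing effects.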