The data used for analysis were taken from the two-wave longitudinal study "Understanding the Lives of Adolescents and Young Adults" (UDAYA), conducted in Uttar Pradesh and Bihar by the Population Council under the guidance of the Ministry of Health and Family Welfare, Government of India. The survey collected detailed information on education, economic activity, household work, migration, mass media and social media exposure, growing up, aspirations, agency, gender role attitudes, awareness of sexual and reproductive health matters, romantic and sexual relationships, and health-seeking behavior. UDAYA adopted a multi-stage systematic sampling design to provide estimates for each state as a whole and for the urban and rural areas of each state (36). The study was designed to establish the levels, patterns and trends in the situation of younger (10–14 years) and older (15–19 years) adolescents in 2015–16 (Wave 1); the participants were followed up about three years later, in 2018–19 (Wave 2), when they were aged 13–22, to shed light on the factors that determine successful transitions to adulthood. The analytical sample consists of both younger (10–14 years) and older (15–19 years) adolescents who were interviewed at baseline in 2015–16 and followed up in 2018–19 at ages 13–17 and 18–22 years, respectively (36). Ethical approval was obtained from the Institutional Review Board of the Population Council. Written consent was obtained from the respondents in both waves. In Wave 1 (2015–16), 20,594 adolescents were interviewed using a structured questionnaire, with a response rate of 92%. In Wave 2 (2018–19), the study re-interviewed participants who had been successfully interviewed in 2015–16 and who consented to be re-interviewed. Of the 20,594 respondents eligible for re-interview, the survey re-interviewed 4567 boys and 12,251 girls (married and unmarried), an attrition rate of 18.3%.
The attrition rate is the percentage of baseline respondents who were not re-interviewed at follow-up. After excluding respondents who gave inconsistent responses on age and education in the follow-up survey (3%), the final follow-up sample covered 4428 boys and 11,864 girls, with a follow-up rate of 74% for boys and 81% for girls. The effective sample size for the present study was 4428 adolescent boys and 11,864 adolescent girls aged 13–23 years in Wave 2. Cases lost to follow-up were excluded so that the panel was strongly balanced, and the dataset was then set up for longitudinal analysis using the xtset command in Stata 14.
We restricted our analysis to unmarried adolescents and present results for boys and girls separately, based on the sample distribution of the variable of interest. The final sample for this study comprised 12,035 adolescents (4428 boys and 7607 girls) aged 10–19 at Wave 1. Details of the sample selection criteria are shown in Figure 1.
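The attrition figure and sample totals reported above can be checked with simple arithmetic. The sketch below uses only the counts stated in the text; the variable names are illustrative and are not part of the original analysis.

```python
# Counts reported in the text above.
eligible_wave1 = 20_594        # adolescents interviewed at Wave 1
reinterviewed_boys = 4_567     # boys re-interviewed at Wave 2
reinterviewed_girls = 12_251   # girls re-interviewed at Wave 2

reinterviewed = reinterviewed_boys + reinterviewed_girls

# Attrition rate: share of baseline respondents not re-interviewed.
attrition_rate = 100 * (eligible_wave1 - reinterviewed) / eligible_wave1

# Final sample of unmarried adolescents: boys plus girls.
analytical_sample = 4_428 + 7_607

print(f"Re-interviewed: {reinterviewed}")
print(f"Attrition rate: {attrition_rate:.1f}%")
print(f"Analytical sample: {analytical_sample}")
```

The computed attrition rounds to the 18.3% reported, and the two sex-specific counts sum to the stated total of 12,035.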
Outcome variable
Depressive symptoms were assessed by asking respondents nine questions about symptoms experienced in the past two weeks only. The questions covered (a) trouble falling asleep or sleeping too much, (b) feeling tired or having little energy, (c) poor appetite or eating too much, (d) trouble concentrating on things, (e) little interest or pleasure in doing things, (f) feeling down, depressed, or hopeless, (g) feeling bad about oneself, (h) moving or speaking slowly, and (i) thoughts that the respondent would be better off dead. Each question was answered on a four-point scale, i.e., 0 “not at all,” 1 “less than once in a week,” 2 “one week or more” and 3 “nearly every day.” A summary score ranging from 0 to 27 was then generated using the egen command in Stata 14 (37). The same method was applied at both waves, with Cronbach's alpha values of 0.85 and 0.81 at Wave 1 and Wave 2, respectively. The variable was recoded into four categories, i.e., (a) no depression (0–4), (b) mild (5–9), (c) moderate (10–14) and (d) severe (15–27). The severe category includes moderately severe and severe. The categories were redefined for analytical purposes.
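The scoring and recoding described above can be sketched as follows. This is a minimal illustration in Python rather than the Stata code used in the study; the function names are hypothetical, and the binary recode follows the definition given under the statistical analysis (any score of 5 or more counts as having depressive symptoms).

```python
def depression_score(item_responses):
    """Sum nine item responses, each coded 0-3, into a 0-27 score."""
    if len(item_responses) != 9:
        raise ValueError("expected exactly nine item responses")
    if any(r not in (0, 1, 2, 3) for r in item_responses):
        raise ValueError("each response must be coded 0, 1, 2 or 3")
    return sum(item_responses)


def depression_category(score):
    """Map a 0-27 score to the four categories used in the text."""
    if not 0 <= score <= 27:
        raise ValueError("score must lie between 0 and 27")
    if score <= 4:
        return "No depression"
    if score <= 9:
        return "Mild"
    if score <= 14:
        return "Moderate"
    return "Severe"  # includes moderately severe and severe


def any_depressive_symptoms(score):
    """Binary recode: 1 for mild, moderate or severe symptoms, else 0."""
    return int(score >= 5)


# Illustrative (made-up) responses to the nine items.
example = [1, 0, 2, 1, 0, 1, 0, 0, 0]
score = depression_score(example)
print(score, depression_category(score), any_depressive_symptoms(score))
```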
Predictor variables
Socio-demographic variables measured at Wave 1 included age (10–14 and 15–19 at baseline, and 13–16 and 17–22 at follow-up), sex (adolescent boys and adolescent girls), mother’s education (illiterate and literate) and wealth index (poorest, poorer, middle, richer and richest).
The survey measured household economic status using a wealth index composed of household asset data on ownership of selected durable goods, including means of transportation, as well as data on access to a number of amenities.
The wealth index was constructed by allocating scores to a household’s reported assets and amenities: type of house, agricultural land owned, irrigated land owned, access to toilet facility, cooking fuel used, access to drinking water facility, access to electricity, and ownership of household assets.
Index scores so constructed ranged from 0 to 57. Households were then ranked according to the index score. This ranked sample was divided into quintiles, with the first quintile representing households of the lowest (poorest) wealth status and the fifth quintile representing households with the highest (wealthiest) status. Wealth quintiles were developed at the state level on the basis of the weighted sample for the whole state.
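One way to turn ranked index scores into weighted quintiles is sketched below. The exact procedure used by the survey is not described here, so this is an assumption: households are ranked by score and cut points are placed where the cumulative share of the sampling weights crosses 20%, 40%, 60% and 80%. The function name, scores and weights are illustrative only.

```python
def weighted_quintiles(scores, weights):
    """Return a quintile (1 = poorest, 5 = richest) for each household.

    Assumption: quintile boundaries are defined on the cumulative
    share of sampling weights among households ranked by index score.
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total = sum(weights)
    # Household positions, ordered from lowest to highest index score.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    quintiles = [0] * len(scores)
    cumulative = 0.0
    for i in order:
        # Place each household by the midpoint of its weight span.
        midpoint = cumulative + weights[i] / 2
        quintiles[i] = min(5, int(5 * midpoint / total) + 1)
        cumulative += weights[i]
    return quintiles


# Made-up index scores (within the 0-57 range) with equal weights.
scores = [3, 57, 20, 41, 12, 30, 8, 49, 25, 36]
weights = [1.0] * len(scores)
print(weighted_quintiles(scores, weights))
```

In this simple version, households with identical scores can fall on either side of a cut point; a production implementation would need an explicit rule for ties.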
Current schooling (never attended, dropout and currently attending), substance use (yes and no) and paid work in the last 12 months (yes and no) were measured at Wave 1 and Wave 2 using the same questions. Social media use was assessed by asking respondents, “Have you ever used social media, for example, Facebook, WhatsApp, Twitter or WeChat?” It was coded as 1 “yes” if the respondent gave an affirmative answer and 0 “no” otherwise. Frequency of social media use, measured at Wave 1, was categorized as “never use” (respondents who do not use any social media), “infrequent user” (respondents who use social media at least once a week, at least once a month, or rarely) and “frequent” (respondents who use social media daily). Duration of social media use, measured at Wave 2 only, was categorized as “zero hour” (respondents who did not use social media in the last 24 hours), “1-2 hour” (respondents who used social media for 1 or 2 hours in the last 24 hours) and “3 or more hour” (respondents who used social media for 3 or more hours in the last 24 hours).
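The frequency and duration categories described above amount to a simple recode, sketched here. The raw response labels (for example "daily" or "rarely") and the assumption that duration was recorded in whole hours are illustrative guesses, not taken from the questionnaire.

```python
# Hypothetical raw response labels mapped to the categories in the text.
FREQUENCY_CATEGORIES = {
    "never": "never use",
    "rarely": "infrequent user",
    "at least once a month": "infrequent user",
    "at least once a week": "infrequent user",
    "daily": "frequent",
}


def frequency_category(reported_frequency):
    """Recode a reported frequency of use (Wave 1)."""
    return FREQUENCY_CATEGORIES[reported_frequency]


def duration_category(hours_last_24h):
    """Recode whole hours of use in the last 24 hours (Wave 2)."""
    if not isinstance(hours_last_24h, int) or not 0 <= hours_last_24h <= 24:
        raise ValueError("hours must be a whole number between 0 and 24")
    if hours_last_24h == 0:
        return "zero hour"
    if hours_last_24h <= 2:
        return "1-2 hour"
    return "3 or more hour"


print(frequency_category("at least once a week"), duration_category(2))
```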
Statistical analysis
To analyze the association between the binary outcome variable and the explanatory variables, binary logistic regression was used. The outcome variable ‘depression’ was recoded as 0 “no” for having no or only minimal symptoms of depression and 1 “yes” for having any form (mild, moderate or severe) of depressive symptoms. Model 1 included social media use and controlled for all explanatory variables other than frequency of social media use. Model 2 included the frequency of social media use and controlled for all explanatory variables other than social media use. Both models were estimated separately for each survey round (Wave 1 and Wave 2).
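For reference, the binary logistic model can be written in its general form as below. The notation is an assumption introduced for illustration: D_i is the binary depression indicator for respondent i, SM_i is the social media measure (use in Model 1, frequency of use in Model 2), and X_ik are the remaining explanatory variables.

```latex
\[
\operatorname{logit}\,\Pr(D_i = 1)
  = \ln\frac{\Pr(D_i = 1)}{1 - \Pr(D_i = 1)}
  = \beta_0 + \beta_1\,\mathrm{SM}_i + \sum_{k} \gamma_k X_{ik},
\qquad
\mathrm{OR}_{\mathrm{SM}} = e^{\beta_1}
\]
```

Under this form, the exponentiated coefficient gives the odds ratio for depressive symptoms associated with the social media measure, holding the other variables fixed.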
A longitudinal cross-lagged path analysis was used to examine whether bidirectional relationships exist between social media use and depression. Cross-lagged path analysis is a type of Structural Equation Modelling (SEM) employed to describe reciprocal relationships, or directional influences, between variables and how they influence each other over time. Cross-lagged path models are estimated using longitudinal data where two or more variables are measured on two or more occasions to examine the causal influences between variables (38). A maximum likelihood estimation procedure was used. The following five models were applied:
1: Unidirectional influence of social media use at Wave 1 on social media use at Wave 2.
2: Unidirectional influence of social media use at Wave 1 on psychological wellbeing at Wave 2.
3: Unidirectional influence of psychological wellbeing at Wave 1 on psychological wellbeing at Wave 2.
4: Unidirectional influence of psychological wellbeing at Wave 1 on social media use at Wave 2.
These models were tested and compared for model fit to determine whether the hypothesized model was the best-fitting model. Model fit was examined using Kline's (2010) guidelines, according to which good model fit is reached when the chi-square value is low and non-significant, comparative fit index (CFI) values are 0.95 or more, and root mean square error of approximation (RMSEA) values are 0.05 or less (0.06–0.08 indicates a mediocre model fit). Chi-square difference testing and the Akaike Information Criterion (AIC) were used to compare competing models, with the lowest AIC indicating the best-fitting model (39). Statistically significant coefficients within the best-fitting model were then examined to interpret the specific paths from social media use to psychological wellbeing and from psychological wellbeing to social media use. Multiple-group structural equation modelling was used to assess whether the cross-lagged associations varied by demographic covariates.
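The fit criteria named above can be illustrated with the standard textbook formulas for RMSEA, CFI and AIC. The sketch below is not the software routine used in the study, the function names are hypothetical, and all input numbers are made up for illustration. Some packages use N rather than N - 1 in the RMSEA denominator.

```python
import math


def rmsea(chi2, df, n):
    """RMSEA = sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model, df_model, chi2_baseline, df_baseline):
    """CFI = 1 - max(chi2_m - df_m, 0) / max(chi2_b - df_b, chi2_m - df_m, 0)."""
    d_model = max(chi2_model - df_model, 0.0)
    d_baseline = max(chi2_baseline - df_baseline, d_model)
    return 1.0 if d_baseline == 0 else 1.0 - d_model / d_baseline


def aic(log_likelihood, n_params):
    """AIC = 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def best_by_aic(aic_by_model):
    """Return the name of the model with the lowest AIC."""
    return min(aic_by_model, key=aic_by_model.get)


# Made-up values, not results from the study.
fit_rmsea = rmsea(chi2=12.0, df=4, n=1001)
fit_cfi = cfi(12.0, 4, 410.0, 10)
good_fit = fit_cfi >= 0.95 and fit_rmsea <= 0.05  # thresholds from the text
print(round(fit_rmsea, 3), round(fit_cfi, 2), good_fit)
print(best_by_aic({"model_a": 1520.4, "model_b": 1498.7}))
```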