We evaluated the association between school suspension and STIs using the National Longitudinal Study of Adolescent to Adult Health (Add Health), which remains the only nationally representative longitudinal study in the United States with STI test results. Add Health comprises a nationally representative sample of adolescents attending public and private high schools and their feeder middle schools in 1994–95 who were interviewed in their homes in 1995 (wave 1, response rate 79.0%), 1996 (wave 2, response rate 88.6%), and 2001 (wave 3, response rate 77.4%). We also used data from interviews with participants’ parents (93% female parents) in 1995 (response rate 82.5%) and school administrators in 1995 (response rate 97.7%) (35s).
We used data from 7748 respondents who participated in waves 1, 2, and 3; reported that they had never received an out-of-school suspension (“Have you ever received an out-of-school suspension from school?”) or been expelled from school (“Have you ever been expelled from school?”); reported their birthdate and household size at baseline; and gave a urine sample for the STI tests. Respondents who gave a urine sample for STI testing received an extra $10 incentive; 91.5% of unmarried high school graduates gave a sample. Figure 1 shows the construction of the full sample and the matched sample. We did not use the survey weights, because they were developed for the entire sample; developing new weights for a highly constrained sub-sample could induce bias (36s).
Predictor: New suspension in 1995-96
We measured a first school suspension between 1995 and 1996 from an affirmative answer to the wave 2 question, “During this school year (during the 1995–96 school year) did you receive an out-of-school suspension from school?” Because the sample was limited to participants without a prior out-of-school suspension or expulsion (see previous section), a suspension reported at wave 2 represents a first lifetime suspension.
Outcomes: positive STI tests
Three STI outcomes were measured in 2001, 5 years after suspension: testing positive for Chlamydia trachomatis (chlamydia), Neisseria gonorrhoeae (gonorrhea), or Trichomonas vaginalis (trichomoniasis). STI tests that did not return results (357 chlamydia, 810 gonorrhea, and 413 trichomoniasis) were coded as missing, so the sample sizes are slightly different for each STI.
Chlamydia and gonorrhea screening used Ligase Chain Reaction amplification technology in the Abbott LCx Probe System. Trichomoniasis was detected with a PCR-ELISA test for Trichomonas vaginalis DNA, which has a sensitivity of 91% in women compared with a combined reference standard of wet mount and culture from vaginal swab and 89% in men compared with urethral swab culture, and an adjusted specificity of 93% in women and 95% in men (19). Chlamydia and gonorrhea tests were FDA-approved, but the trichomoniasis test was not yet FDA-approved, so only chlamydia and gonorrhea test results were made available to participants.
We identified 67 potential confounders of the relationship between suspension and STIs, drawing on Gottfredson and Hirschi’s self-control theory of deviance (37s) and on past research about suspension (5, 38s), educational attainment (39s), and arrest (40s). The confounders covered demographics, socioeconomic status, sexual risk-taking, relationships with adults, educational factors, parents’ risk behavior, substance use, personality and mental health, and deviance. The control variables are listed in full in Appendix 1.
The control variables were measured at baseline, except for father ever in prison, which was measured in 2001. We retained father ever in prison as a control variable because it was unlikely to be a consequence of the child’s school suspension: the father could have gone to prison after the suspension, but his propensity to go to prison likely existed before it.
Randomized laboratory experiments routinely include a negative control: a condition under which a null result is expected; if a negative control condition does not produce a null result, that suggests a problem with the experiment. The propensity matching approach used in this paper mimics a randomized experiment; in this case, we use a negative control to detect residual confounding after matching (41s). We used post-suspension impulsivity, measured in 2001, as a negative control because we do not expect post-suspension impulsivity to be greater in suspended than in matched non-suspended youth. Impulsivity was the sum of nine Likert-type scale items on a scale from 0 to 1 (α = .94). After matching, suspended and non-suspended youth did not differ on baseline constructs in the Gottfredson-Hirschi self-control model, including systematic versus gut-feeling decision-making (37s). The same 9-item impulsivity scale was not available at baseline, so it could not be used for matching.
We conducted analyses in R version 3.5.1.
We identified variables that differed between suspended and non-suspended youth using standardized differences, a measure of effect size defined as the difference in means divided by the standard deviation. The goal of propensity matching methods is to reduce standardized differences to below 0.2, and ideally to below 0.1.
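As an illustration of the balance check described above, the following minimal Python sketch computes a standardized difference for one covariate. It assumes a pooled standard deviation (the square root of the average of the two group variances), which is one common choice; the text above does not say which standard deviation was used. The function name and the grade point average values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def standardized_difference(treated, control):
    """Difference in group means divided by a pooled standard deviation.

    Assumption: the pooled SD is sqrt((var_treated + var_control) / 2).
    """
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    if pooled_sd == 0:
        return 0.0  # no variation in either group, so no measurable difference
    return (mean(treated) - mean(control)) / pooled_sd


# Hypothetical grade point averages, for illustration only.
suspended_gpa = [2.1, 2.5, 2.8, 3.0, 2.4]
non_suspended_gpa = [2.9, 3.2, 3.5, 2.7, 3.1]

d = standardized_difference(suspended_gpa, non_suspended_gpa)
balanced = abs(d) < 0.1  # the stricter of the two thresholds named above
print(f"standardized difference = {d:.3f}, balanced = {balanced}")
```

In this toy example the groups are far from balanced, which is the situation matching is meant to correct.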
Propensity matching method
We used a propensity matching method, implemented in the R MatchIt library (20), to identify non-suspended youth who were similar to suspended youth on the control variables and thereby minimize potential selection bias. The specific method was 3:1 exact and nearest-neighbor Mahalanobis matching with replacement, within propensity score calipers of 0.25 standard deviations; the next two paragraphs explain each term in this name. The matching method identified suspended and non-suspended youth who had similar values of the 67 potential confounders and of the estimated propensity score. The estimated propensity score for each participant is the predicted probability that the individual will be suspended, that is, the fitted value from a logistic regression with suspension as the outcome. The predictors in the logistic regression are specified below; note that the matching procedure can balance the 67 potential confounders even though only a subset of them is included in the propensity score.
The term “3:1 matching” means that we matched 3 non-suspended youth to each suspended youth using the following procedure. The term “exact matching” means that for each suspended youth, exact matching reduced the set of eligible non-suspended youth by requiring that only non-suspended youth with the same daily smoking status and ever-marijuana status could be considered. The term “within propensity score calipers of 0.25 standard deviations” means that we reduced the set of eligible non-suspended youth further to those within 0.25 standard deviations of the estimated propensity score; that is, these non-suspended youth had similar predicted probabilities of suspension. The term “nearest-neighbor Mahalanobis matching” means that we identified the 3 closest youth according to a correlation-adjusted distance measure of age in years (not rounded) and grade point average; the correlation-adjusted distance measure is named for the statistician Prasanta Chandra Mahalanobis. Finally, the term “with replacement” means that a non-suspended youth could serve as a match for more than one suspended youth.
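The steps in this paragraph can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the MatchIt implementation: the caliper is taken as 0.25 standard deviations of the propensity score across the whole sample, the covariance matrix for the Mahalanobis distance is estimated from the whole sample, and all field names and data values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def covariance_2x2(points):
    """Sample variances and covariance of (age, gpa) pairs."""
    xs, ys = [p[0] for p in points], [p[1] for p in points]
    mx, my, n = mean(xs), mean(ys), len(points)
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return sxx, sxy, syy


def mahalanobis(a, b, cov):
    """Correlation-adjusted distance between two (age, gpa) points."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = a[0] - b[0], a[1] - b[1]
    # Quadratic form with the inverse of [[sxx, sxy], [sxy, syy]].
    return sqrt((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)


def match(youths, ratio=3, caliper_sd=0.25):
    caliper = caliper_sd * stdev([y["ps"] for y in youths])  # assumption: whole-sample SD
    cov = covariance_2x2([(y["age"], y["gpa"]) for y in youths])
    controls = [y for y in youths if not y["suspended"]]
    matches = {}
    for t in (y for y in youths if y["suspended"]):
        eligible = [
            c for c in controls
            if c["smokes_daily"] == t["smokes_daily"]        # exact matching
            and c["ever_marijuana"] == t["ever_marijuana"]   # exact matching
            and abs(c["ps"] - t["ps"]) <= caliper            # propensity score caliper
        ]
        eligible.sort(key=lambda c: mahalanobis(
            (t["age"], t["gpa"]), (c["age"], c["gpa"]), cov))
        # With replacement: controls are never removed from the pool.
        matches[t["id"]] = [c["id"] for c in eligible[:ratio]]
    return matches


def youth(id_, suspended, smokes, marijuana, ps, age, gpa):
    return {"id": id_, "suspended": suspended, "smokes_daily": smokes,
            "ever_marijuana": marijuana, "ps": ps, "age": age, "gpa": gpa}


# Hypothetical records, for illustration only.
youths = [
    youth("S1", True, False, False, 0.30, 15.2, 2.5),
    youth("C1", False, False, False, 0.28, 15.3, 2.6),
    youth("C2", False, False, False, 0.32, 15.6, 2.3),
    youth("C3", False, False, False, 0.31, 15.0, 2.4),
    youth("C4", False, False, False, 0.29, 17.5, 2.0),  # eligible but far away
    youth("C5", False, True, False, 0.30, 15.2, 2.5),   # fails exact matching
    youth("C6", False, False, False, 0.70, 15.2, 2.5),  # outside the caliper
]
matches = match(youths)
print(matches)
```

In the toy data, C5 and C6 have the same age and grade point average as the suspended youth but are excluded before any distance is computed, which shows that the exact-matching and caliper steps take priority over the distance ranking.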
We estimated the propensity for each individual to be suspended using a logistic regression with the outcome of a first suspension in 1995–96 and the following predictors: demographic factors (rural residence, Northeast region, lives with both biological parents, male gender, age, born in US, Latino, Asian, and Black race/ethnicity, home language is English); socioeconomic status (SES) (mother high school graduate, mother college graduate, parent is currently employed, per capita household income, parent reports enough money to pay bills, father ever in prison); health and risk behavior factors (experiences with violence, delinquency score, respondent smokes daily, household member smokes, mother smokes, depression score, positive expectancies); educational factors (standardized test score, school attachment, expect to attend college, attend private vs. public school, school is strict on civil order, school is strict on substance use, never truant); and personality factors (parent’s assessment of their child, agreeableness, emotional stability, parental closeness, systematic vs. gut-feeling decision making).
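To make the idea of an estimated propensity score concrete, the sketch below fits a logistic regression to a toy data set and returns each youth's fitted probability of suspension. It is a minimal illustration only: it uses plain gradient ascent on the log-likelihood rather than the fitting routine of any statistical package, only two of the predictors named above (male gender and a delinquency score), and hypothetical data values, function names, and tuning parameters.

```python
from math import exp


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def linear_predictor(w, x):
    # w[0] is the intercept; w[1:] are the coefficients for the predictors in x.
    return w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))


def fit_logistic(X, y, learning_rate=0.1, steps=5000):
    """Fit a logistic regression by gradient ascent (illustrative stand-in)."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(steps):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = yi - sigmoid(linear_predictor(w, xi))
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + learning_rate * g / n for wj, g in zip(w, grad)]
    return w


def propensity(w, x):
    """Fitted probability of suspension for one youth."""
    return sigmoid(linear_predictor(w, x))


# Hypothetical data: (male, standardized delinquency score) and suspension status.
X = [(1, 1.2), (1, 0.3), (0, 0.8), (1, -0.5),
     (0, -0.2), (0, -1.0), (1, 0.1), (0, 0.4)]
y = [1, 1, 0, 0, 0, 0, 0, 1]

w = fit_logistic(X, y)
scores = [propensity(w, x) for x in X]

suspended_scores = [s for s, yi in zip(scores, y) if yi == 1]
other_scores = [s for s, yi in zip(scores, y) if yi == 0]
mean_suspended = sum(suspended_scores) / len(suspended_scores)
mean_non_suspended = sum(other_scores) / len(other_scores)
print(f"{mean_suspended:.2f} vs {mean_non_suspended:.2f}")
```

The scores produced this way are the quantity on which a caliper would be applied during matching.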
Statistical analysis within the matched sample
STIs were rare outcomes, so we used logistic regression to predict chlamydia and trichomoniasis in the unmatched and matched samples, controlling for baseline age, race/ethnicity, gender, and household income tertiles (18). Gonorrhea was especially rare, with only 21 cases, so we estimated only crude odds ratios for it, not adjusted odds ratios. Using control variables measured at baseline avoids bias towards the null from using factors that were intermediate between suspension and STIs (21).
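For readers unfamiliar with crude odds ratios, the following sketch computes one from a 2×2 table of exposure by outcome. The counts are hypothetical and are not study data, and the Wald-type 95% confidence interval on the log odds scale is an assumption chosen for illustration; the text above does not specify how intervals were obtained.

```python
from math import exp, log, sqrt


def crude_odds_ratio(a, b, c, d, z=1.96):
    """Crude odds ratio with a Wald-type confidence interval.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    """
    estimate = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    lower = exp(log(estimate) - z * se_log)
    upper = exp(log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical counts, for illustration only (not study data).
odds_ratio, lower, upper = crude_odds_ratio(a=10, b=190, c=20, d=580)
print(f"OR = {odds_ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

With very few cases, the terms 1/a and 1/c dominate the standard error, which is why intervals for rare outcomes are wide and why adjusting for several covariates at once becomes impractical.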
We used causal mediation analysis to evaluate whether pre-treatment and post-treatment variables mediated the relationship between suspension and STIs (22). This study evaluated 34 post-suspension variables for mediation: marriage; educational attainment and predictors of educational attainment (e.g., full-time college attendance, community versus four-year college matriculation, enrollment gap prior to matriculation); criminal justice outcomes (ever arrested, convicted as adult, arrested as minor, convicted as minor); sexual orientation (identify as lesbian/gay/bisexual (LGB), publicly open as LGB); employment status (full-time, day shift); substance use (ever smoker, current smoker, binge drinking); sexual risk behavior (partner has STI, frequency of sex in the past year, number of partners in past year, number of partners in lifetime, condom use frequency); personality (impulsivity, self-esteem); and expulsion.
We used sensitivity analysis for multiple controls to assess whether observed differences could be attributed to unobserved variables (42s, 43s, 23).