Aims of the study
Using longitudinal data from the Avon Longitudinal Study of Parents and Children (ALSPAC), an ongoing prospective observational population-based birth cohort study, the aims of this study were: (i) to investigate the patterns of multiple cancer risk behaviours across adolescence (age 11–18 years) using both a continuous score of cumulative exposure and longitudinal latent class analysis; and (ii) to explore whether and how these patterns are associated with subsequent cancer risk behaviours in early adulthood (age 24 years).
Design & setting of the study
Data were drawn from ALSPAC, an ongoing prospective observational population-based birth cohort study investigating the effects of a wide range of influences on health and development across the life course. (27, 28) Pregnant women resident in Avon, UK, with expected dates of delivery from 1st April 1991 to 31st December 1992 were invited to take part in the study. The initial number of pregnancies enrolled was 14,541 (for these, at least one questionnaire had been returned or a “Children in Focus” clinic had been attended by 19/07/99). These initial pregnancies comprised a total of 14,676 foetuses, resulting in 14,062 live births and 13,988 children who were alive at 1 year of age. Details of all available questionnaires and data can be found through a searchable data dictionary (http://www.bristol.ac.uk/alspac/researchers/our-data/). Ethical approval for the study was obtained from the ALSPAC Ethics and Law Committee and local Research Ethics Committees. Informed consent for the use of data collected via questionnaires and clinics was obtained from participants following the recommendations of the ALSPAC Ethics and Law Committee at the time.
Exposure measure - adolescent cancer risk behaviours
We used repeated measures of tobacco smoking, alcohol consumption, obesity, sexual risk and physical inactivity at ages ~11, ~14, ~16 and ~18 years (see Table 1). These measures were derived from self-completed questionnaires issued during clinics, self-completed postal questionnaires, and parent- or carer-report questionnaires. Details of the risk thresholds can be found in Supplementary Material 1.
Table 1
Adolescent cancer risk behaviours and their derivation
| Definition/how derived |
Cancer risk behaviours | Age 11 | Age 14 | Age 16 | Age 18 |
Tobacco smoking | Young person has ever smoked. | Young person has smoked cigarettes in past 6 months. | Young person smokes every week. | Young person smokes every week. |
Alcohol consumption | Young person has had a whole drink before age 12 years. | Young person has had whole drink in past 6 months. | Young person has had 6 or more whole drinks in past 30 days. | Young person consumes alcohol ≥ 2–3 times a week or has hazardous alcohol consumption. |
Obesity | Young person has a UK 1990 BMI population reference ≥ 95th centile. | Young person has a UK 1990 BMI population reference ≥ 95th centile. | Young person has a UK 1990 BMI population reference ≥ 95th centile. | Young person has a UK 1990 BMI population reference ≥ 95th centile. |
Sexual risk | Young person has had penetrative sex without the use of a condom on the last occasion they had sex in the past year. | Young person has had penetrative sex without the use of a condom on the last occasion they had sex in the past year. | Young person has had penetrative sex without the use of a condom on the last occasion they had sex in the past year. | Young person has had penetrative sex without the use of a condom on the last occasion they had sex in the past year. |
Physical inactivity | Young person has participated in vigorous physical activity 1–3 times a week or less (parent report). | Young person typically exercises < 5 times a week (self-report) or has participated in vigorous physical activity 1–3 times a week or less (parent report). | Young person typically exercises < 5 times a week (self-report) or has participated in vigorous physical activity 1–3 times a week or less (parent report). | Young person typically exercises < 5 times a week (self-report) or has participated in vigorous physical activity 1–3 times a week or less (parent report). |
Sources of information:
T1/Age 11: data from sources when the participants were aged between 128–154 months, the midpoint of which is 141 months or 11.75 years.
T2/Age 14: data from sources when the participants were aged between 166–171 months, the midpoint of which is 168.5 months or 14 years.
T3/Age 16: data from sources when the participants were aged between 186–200 months, the midpoint of which is 193 months or 16 years.
T4/Age 18: data from sources when the participants were aged between 214–224 months, the midpoint of which is 219 months or 18.25 years.
Outcome measures – early adult cancer risk
The early adult outcome measures were, where possible, more severe presentations of the adolescent cancer risk behaviours. For example, the adolescent smoking exposure ranged from ever smoking to weekly smoking, whereas the early adult outcome measures were daily smoking and nicotine dependence. General obesity, as defined by height and weight, was supplemented by measures of central obesity at age 24 years: high waist circumference (≥ 80 cm for females and ≥ 94 cm for males) and high waist-hip ratio (≥ 0.85 for females and ≥ 1.00 for males).
Early adult cancer risk was based on measurements collected in clinics (measured height and weight to compute body mass index, waist circumference and waist-hip ratio) or responses to questionnaires (harmful drinking, daily smoking and nicotine dependence) by participants at age ~24 years (mean age 24 years and 6 months, SD = 9.78 months). We were unable to include accelerometer-measured physical inactivity, owing to the low number of participants with a valid minimum number of days of wear-time (only 380 participants with 3 days of data). We were unable to estimate sexual risk using data about Chlamydia incidence because perfect prediction from measures integral to the final analysis was observed in the multiple imputation model, which may bias the relation of interest. (29) Binary indicators were derived for: harmful drinking (a score of ≥ 8 in the Alcohol Use Disorders Identification Test-C (AUDIT-C)); daily smoking; nicotine dependence (a score of ≥ 4 in the Fagerström test); obesity (a BMI of ≥ 30 kg/m²); high waist circumference, as defined by the National Institute for Health and Clinical Excellence (NICE) and World Health Organisation (WHO) guidelines (≥ 80 cm for females and ≥ 94 cm for males); and high waist-hip ratio (≥ 0.85 for females and ≥ 1.00 for males). (30, 31)
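As an illustration only, the thresholds listed above can be expressed as a small derivation routine. The sketch below is in Python rather than the Stata code used for the study, and the record layout (`AdultMeasures`) and field names are hypothetical, not ALSPAC variable names. It assumes complete data for every field; missing values would need separate handling.

```python
from dataclasses import dataclass


@dataclass
class AdultMeasures:
    """Hypothetical record of age-24 measures (field names are illustrative)."""
    sex: str            # "female" or "male"
    audit_c: int        # AUDIT-C score
    smokes_daily: bool  # self-reported daily smoking
    fagerstrom: int     # Fagerström test score
    bmi: float          # body mass index, kg/m^2
    waist_cm: float     # waist circumference, cm
    hip_cm: float       # hip circumference, cm


def derive_outcomes(m: AdultMeasures) -> dict:
    """Apply the binary thresholds stated in the text to one participant."""
    # Sex-specific cut-offs for central obesity, as given in the text.
    waist_cut = 80.0 if m.sex == "female" else 94.0
    whr_cut = 0.85 if m.sex == "female" else 1.00
    return {
        "harmful_drinking": m.audit_c >= 8,
        "daily_smoking": m.smokes_daily,
        "nicotine_dependence": m.fagerstrom >= 4,
        "obesity": m.bmi >= 30,
        "high_waist_circumference": m.waist_cm >= waist_cut,
        "high_waist_hip_ratio": (m.waist_cm / m.hip_cm) >= whr_cut,
    }
```

For example, a hypothetical female participant with an AUDIT-C score of 9, a BMI of 31.2, a waist of 82 cm and hips of 100 cm would be flagged for harmful drinking, obesity and high waist circumference, but not for high waist-hip ratio (0.82 < 0.85).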
Confounder measures
We identified potential confounders (common causes of both exposures and outcomes) that occurred before the exposure measures, i.e. before age 11 years. All models were adjusted for: sex, parental socioeconomic status, adverse childhood experiences (ACEs), (32) intelligence quotient (IQ), childhood antisocial behaviour, depressive symptoms, conduct problems, maternal smoking, harmful maternal alcohol use and maternal cannabis use. Models relating to the anthropometric outcomes (obesity, waist circumference and waist-hip ratio) were additionally adjusted for birthweight, gestational age, maternal obesity, maternal physical inactivity, and maternal unhealthy diet (see Supplementary Material 2 for more details of how confounder measures were derived).
Statistical analysis
We summarised exposure to our adolescent cancer risk behaviours of interest (tobacco smoking, alcohol consumption, obesity, unprotected sexual intercourse, and physical inactivity) in two ways. First, we calculated a cumulative continuous score summarising exposure to the five risk behaviours across adolescence, expressed as the area under the curve. This was done by summing, across the four time points between ages ~11 and ~18 years, the product of the total number of risks and the time interval. Second, using the same data, we derived longitudinal latent growth curves to explore whether the same behaviours cluster to produce qualitatively distinct risk profiles (over and above the cumulative score). The processes used to derive the adolescent exposure measures are described in more detail in Supplementary Materials 3 and 4.
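The area-under-the-curve idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's derivation (which is given in Supplementary Material 3): it assumes the trapezoidal rule between adjacent time points and uses the midpoint ages from Table 1 (141, 168.5, 193 and 219 months) as the time axis. The function and constant names are hypothetical.

```python
# Midpoint ages (in months) of the four measurement windows, from Table 1.
AGES_MONTHS = (141.0, 168.5, 193.0, 219.0)


def risk_count(behaviours):
    """Number of risk behaviours present (out of five) at one time point."""
    return sum(1 for present in behaviours if present)


def cumulative_auc(risk_counts, ages_months=AGES_MONTHS):
    """Area under the risk-count curve, in 'risk-years'.

    Assumption: trapezoidal rule, i.e. each interval contributes its
    width (in years) times the mean of the counts at its two ends.
    """
    if len(risk_counts) != len(ages_months):
        raise ValueError("one risk count is needed per time point")
    ages_years = [m / 12.0 for m in ages_months]
    area = 0.0
    for i in range(len(ages_years) - 1):
        width = ages_years[i + 1] - ages_years[i]
        area += width * (risk_counts[i] + risk_counts[i + 1]) / 2.0
    return area
```

Under these assumptions, a participant with one risk behaviour at every time point accumulates 6.5 risk-years (the span from 11.75 to 18.25 years), while a participant whose count rises from 0 to 2, 3 and then 5 accumulates about 16.06.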
We explored the patterning of adolescent cancer risk behaviours, using quartiles of the cumulative score to provide a comparative measure for the latent classes. We compared models with 2 to 7 classes using both complete case and imputed data (see below for imputation method). The optimum model, as determined by the lowest Bayesian information criterion (BIC), was a 6-class latent class growth analysis, for both the imputed and complete case samples. These models produce a class-assignment probability indicating the confidence with which each participant can be allocated to a specific latent class. Entropy summarises this information as a single measure ranging from zero to one (one indicating absolute certainty that individuals have been assigned to the correct class).
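To make the entropy measure concrete, the sketch below computes the relative entropy commonly reported for mixture models, 1 − Σᵢ Σₖ (−pᵢₖ ln pᵢₖ) / (N ln K), from a matrix of class-assignment probabilities. It is an illustrative Python re-implementation, not output from the software used in the study; the function name and input layout are assumptions.

```python
import math


def relative_entropy(posteriors):
    """Relative entropy of class assignment, between 0 and 1.

    posteriors: list of rows, one per participant, each row holding that
    participant's class-assignment probabilities (summing to 1).
    """
    n = len(posteriors)
    k = len(posteriors[0])
    if n == 0 or k < 2:
        raise ValueError("need at least one participant and two classes")
    total = 0.0
    for row in posteriors:
        for p in row:
            if p > 0.0:  # p * ln(p) tends to 0 as p tends to 0
                total -= p * math.log(p)
    return 1.0 - total / (n * math.log(k))
```

If every participant is assigned to one class with probability 1, the function returns 1; if every participant is equally likely to belong to each class, it returns 0.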
Logistic regression analysis was used to examine prospective associations between quartiles of adolescent cancer risk behaviours and early adult cancer risk behaviours at age 24 years. For all outcomes, we ran unadjusted models, including only the exposure and outcome measures, followed by a sequence of adjusted models, which additionally controlled for: (i) sex, IQ and socioeconomic status; (ii) adverse childhood experiences (ACEs); (iii) maternal cannabis use, maternal harmful alcohol use and maternal smoking; and (iv) child depressive symptoms (SMFQ), child total difficulties score (SDQ) and child antisocial behaviour. Models with obesity, waist circumference, and waist-hip ratio outcomes were additionally adjusted for birthweight, gestational age, maternal obesity, maternal physical inactivity, and maternal unhealthy diet.
Missing data
Data on all exposures at one time point were available for 6,351 (46.0%) of ALSPAC participants. In our primary analysis, multiple imputation was used to account for missing data (see below). In sensitivity analyses, we investigated associations for each 24-year outcome in complete case samples, i.e. those with no missing data on any exposure, outcome or confounder measure. The flow diagram for deriving the sample can be found in Supplementary Material 5.
Multivariate imputation by chained equations was carried out using the ‘ice’ routine in Stata. This approach is based on the missing at random (MAR) assumption, i.e. that any differences between the missing and observed values can be explained by differences in the observed data. (33) All variables used in the analyses, including the outcome measures, exposure measures and confounders, were included in the imputation model, along with alternative measures that had been collected at different times. These were included as auxiliary variables to reduce bias by improving the precision of the imputation model. Monte Carlo errors were used to compare the results obtained when imputing 25, 100 and 200 data sets. The imputed results shown have been pooled across the 200 data sets, having satisfied White et al.’s rules of thumb for the number of imputations. (34) All analyses were conducted using Stata version 15 (35) and Mplus version 8. (36)
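Pooling across imputed data sets is conventionally done with Rubin's rules, and the sketch below illustrates that calculation for a single coefficient (for example, a log odds ratio) together with the Monte Carlo error of the pooled estimate. It is a hedged Python illustration of the standard formulae, assuming one estimate and one squared standard error per imputed data set; it is not the Stata code used in the study, and the function name is hypothetical.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed data sets using Rubin's rules.

    estimates: per-imputation point estimates (e.g. log odds ratios)
    variances: per-imputation squared standard errors
    Returns (pooled estimate, pooled standard error, Monte Carlo error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching inputs from at least two imputations")
    q_bar = mean(estimates)       # pooled point estimate
    within = mean(variances)      # average within-imputation variance
    between = variance(estimates) # between-imputation variance (m - 1 divisor)
    total = within + (1.0 + 1.0 / m) * between
    mc_error = math.sqrt(between / m)  # Monte Carlo error of the pooled estimate
    return q_bar, math.sqrt(total), mc_error
```

Because the Monte Carlo error shrinks with the square root of the number of imputations, comparing it across 25, 100 and 200 data sets indicates how far the pooled results would be expected to vary if the imputation were repeated.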