Study sample
Data came from the Health Survey for England (HSE), which is used to monitor progress on numerous national health objectives, including PA [17][18]. Details of the HSE sample design and data collection are described elsewhere [19]. Briefly, the HSE annually draws a nationally representative sample of people living in private households in England using multistage stratified probability sampling, with postcode sectors as the primary sampling unit and the Postcode Address File as the household sampling frame. All adults in selected households are eligible for interview. Fieldwork takes place continuously through the year. Using computer-assisted personal interviewing, trained interviewers assessed participants’ demographic characteristics, self-reported health, and health behaviours including PA; they also measured participants’ height and weight. We used the most recent surveys (2008, 2012, 2016) that included the adult Physical Activity and Sedentary Behaviour Assessment Questionnaire (PASBAQ).
The household response rate ranged from 64% in 2008 to 59% in 2016. This study was restricted to adults (i.e. aged 16 years or over). Participants gave verbal consent for interview. Relevant committees granted research ethics approval for the survey. Overall, 31 399 adults participated in the three surveys, of whom 31 183 had valid PA data. Of these, 6301 had missing income data, leaving an analytical sample of 24 882 adults with complete data.
Assessment of physical activity
PASBAQ data are used to monitor adherence to UK PA recommendations [17][18] and for other epidemiological research [20][21]. The PASBAQ has demonstrated moderate-to-weak convergent validity in comparison with non-synchronous accelerometry [22]. PASBAQ assesses frequency (number of days in the last four weeks) and duration (of an average episode of at least ten minutes) in four leisure-time domains [23]: (i) “light” and “heavy” domestic activity; (ii) “light” and “heavy” manual work (e.g. ‘Do-It-Yourself’ (DIY)); (iii) walking (with no distinction between walking for leisure or travel); and (iv) sports/exercise (ten specific and six ‘other’ activities). “Heavy” domestic and manual activities were classed as moderate-intensity. Walking intensity was assessed by a question on usual walking pace (responses: slow, average, fairly brisk, or fast); a fairly brisk or fast pace was classed as moderate-intensity. Intensity of sports/exercise was determined from the activity's value in the metabolic equivalent (METs) compendium [24][25] and from a follow-up question on whether the activity had made the participant “out-of-breath or sweaty”. In addition to leisure-time PA, participants engaged in any paid or unpaid work answered questions on occupational PA. Our analyses classed three activities (walking; climbing stairs or ladders; and lifting, carrying, or moving heavy loads) as moderate-intensity PA for participants working in occupations identified a priori as moderately intensive [17].
Time spent in domain-specific MVPA was calculated as the product of frequency and duration, converted from the last four weeks to hours/week. For sports/exercise, time in vigorous-intensity activities was multiplied by two when combined with moderate-intensity activities to calculate ‘equivalent’ hours/week as specified in MVPA guidelines [26]. Total MVPA was calculated by summing across the five domains, and was truncated at a maximum of 40 hours/week to minimise unrealistic values.
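The calculation described above can be sketched as follows. This is a minimal illustration rather than the study's own code: the function and parameter names are hypothetical, and it assumes that the four-week recall is converted to a weekly figure by dividing by four and that the 40 hours/week cap is applied to the summed total.

```python
RECALL_WEEKS = 4          # PASBAQ recall period: the last four weeks
MAX_TOTAL_HOURS = 40.0    # cap applied to total MVPA (hours/week)


def weekly_hours(days_in_four_weeks, minutes_per_episode):
    """Frequency x duration, converted from four weeks to hours/week."""
    total_minutes = days_in_four_weeks * minutes_per_episode
    return total_minutes / 60.0 / RECALL_WEEKS


def sports_equivalent_hours(moderate_hours, vigorous_hours):
    """Vigorous time counts double when combined with moderate time."""
    return moderate_hours + 2.0 * vigorous_hours


def total_mvpa(domain_hours):
    """Sum hours/week across domains and truncate at the cap."""
    return min(sum(domain_hours), MAX_TOTAL_HOURS)
```

For example, eight days of 30-minute episodes in the last four weeks corresponds to `weekly_hours(8, 30)`, i.e. one hour per week.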
Socioeconomic position ascertainment and confounders
Household income was our chosen marker of socioeconomic position (SEP). The household reference person reported annual gross household income via a showcard (31 bands ranging from ‘less than £520’ to ‘£150 000+’). Household income was equivalised (McClements scale [27]) and grouped into tertiles. Age (in ten-year bands), smoking status (current, ex-regular, never), self-rated health (‘very good/good’, ‘fair’, or ‘bad/very bad’), and BMI were chosen as potential confounders of the associations between SEP and MVPA [6]. We computed BMI as weight in kilogrammes (kg) divided by height in metres squared (m²), classifying participants into four groups: underweight (<18.5 kg/m²), normal-weight (18.5–24.9 kg/m²), overweight (25.0–29.9 kg/m²), or obese (at least 30.0 kg/m²).
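The BMI computation and grouping can be illustrated with a short sketch. The function names are hypothetical, and the sketch assumes half-open intervals (e.g. 18.5 up to but not including 25.0) so that unrounded values falling between the published band limits are still assigned to a group.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight (kg) divided by height squared (m^2)."""
    return weight_kg / height_m ** 2


def bmi_group(value):
    """Assign a BMI value (kg/m^2) to one of the four groups."""
    if value < 18.5:
        return "underweight"
    if value < 25.0:
        return "normal-weight"
    if value < 30.0:
        return "overweight"
    return "obese"
```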
Statistical analysis
Descriptive estimates
Data were pooled over the three surveys to increase precision (prior analyses revealed no change in associations over time). Differences in age, self-rated health, smoking status, and BMI were estimated by income, using Rao-Scott tests for independence [28]. For total and domain-specific MVPA, we computed descriptive estimates for four outcomes: (i) % doing any; (ii) % ‘sufficiently’ active (i.e. at least 2.5 hours/week MVPA [26]); (iii) average hours/week MVPA (range: 0 to 40 hours/week); and (iv) average hours/week MVPA among those doing any (range: 0.042 to 40 hours/week; hereafter referred to as MVPA-active). Outcomes (iii) and (iv) represent unconditional and conditional (on participation) means, respectively. We decided, a priori, to conduct gender-stratified analyses due to expected differences in inequalities as reported in the literature [7][8][29]. Income-specific estimates were directly age-standardised within gender using the pooled data as the standard population. Pairwise differences between income groups (low-income households as reference) were evaluated on the absolute scale using a linear combination of the coefficients [30].
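Direct age-standardisation amounts to a weighted average of age-specific estimates, with weights taken from the age distribution of the standard population. The sketch below is illustrative only: the function name and inputs are hypothetical, and it ignores the survey weights and complex design that the actual analyses accounted for.

```python
def directly_standardised(age_specific_estimates, standard_counts):
    """Weight each age band's estimate by that band's share of the standard."""
    total = sum(standard_counts.values())
    return sum(
        age_specific_estimates[band] * standard_counts[band] / total
        for band in standard_counts
    )
```

With hypothetical estimates of 0.8 and 0.6 in two age bands that make up 25% and 75% of the standard population, the standardised estimate is 0.8 × 0.25 + 0.6 × 0.75 = 0.65.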
Hurdle models
To handle continuous MVPA data with excess zeros and positive skewness, we used the hurdle model proposed by Cragg, which comprises two parts: a selection/participation model and a latent model [15]. The former determines the boundary points of the continuous outcome (a selection variable equals 1 if not bounded and 0 otherwise), whilst the latter determines its unbounded values (a continuous latent variable which is observed only if the selection variable equals 1). In our analyses, the selection model assessed the influence of income on the binary outcome of participation (any versus none), whilst the latent model assessed its influence on the amount of time spent active, conditional on participation (MVPA-active). We specified a probit model for the former and an exponential form for the latter. Each model contained income (as a three-category variable) and the confounders listed above.
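In symbols, the two parts can be written as below. This is a sketch under stated assumptions rather than the exact specification used: the notation (s, y*, x, γ, β, σ) is introduced here for illustration, the same covariate vector x (income and confounders) enters both parts, and the two error terms are assumed to be independent and normally distributed.

```latex
\begin{aligned}
s &= \mathbf{1}\{\mathbf{x}^{\top}\boldsymbol{\gamma} + u > 0\},
  & u &\sim N(0, 1)
  && \text{(probit selection: any MVPA versus none)} \\
y^{*} &= \exp(\mathbf{x}^{\top}\boldsymbol{\beta} + \varepsilon),
  & \varepsilon &\sim N(0, \sigma^{2})
  && \text{(exponential latent model: hours/week if active)} \\
y &= s \, y^{*}
  &&
  && \text{(observed MVPA, equal to 0 when } s = 0\text{)}
\end{aligned}
```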
Based on the model estimates, three sets of marginal means by income were calculated, evaluated at fixed values of the confounders. These sets correspond to different definitions of the expected value of MVPA [31]: (i) the probability of doing any MVPA; (ii) the average hours/week MVPA for all participants (the unconditional mean), including those who did none; and (iii) the average hours/week MVPA conditional on participation (MVPA-active). Inequalities after confounder adjustment (average marginal effects: AMEs) were quantified by computing the absolute difference in the marginal means (low-income as reference).
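The three expected values and their absolute differences can be sketched as follows. This illustration assumes a probit participation part and a lognormal (exponential-form) latent part with independent normal errors, in which case the conditional mean is exp(xβ + σ²/2). The function names and inputs are hypothetical, and the linear predictors are taken as given rather than estimated from data.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def hurdle_means(xg, xb, sigma):
    """Return (P(any), conditional mean, unconditional mean).

    xg: linear predictor of the probit participation model
    xb: linear predictor of the exponential latent model
    sigma: standard deviation of the latent model's error term
    """
    p_any = normal_cdf(xg)
    conditional = math.exp(xb + 0.5 * sigma ** 2)
    unconditional = p_any * conditional
    return p_any, conditional, unconditional


def absolute_difference(group, reference):
    """Difference in each marginal mean relative to the reference group."""
    return tuple(g - r for g, r in zip(group, reference))
```

Calling `absolute_difference(hurdle_means(...), hurdle_means(...))` with the predictors for a higher-income group and for the low-income reference group, both evaluated at the same confounder values, gives the three differences on the absolute scale.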
All analyses accounted for the complex survey design and used non-response weights. Dataset preparation and analysis were performed in SPSS V20.0 (SPSS IBM Inc., Chicago, Illinois, USA) and Stata V15.0 (College Station, Texas, USA), respectively. HSE datasets are available via the UK Data Service (http://www.ukdataservice.ac.uk) [32][33][34]; statistical code is available from the corresponding author.