This paper combines two distinct datasets to examine the relationship between urban green space and BMI: The Irish Longitudinal Study on Ageing (TILDA), and a land-use database known as Prime2. The datasets and the methods used to link and analyse them are outlined below.
The Irish Longitudinal Study on Ageing (TILDA)
TILDA is a nationally representative survey of those aged over 50 in the Republic of Ireland. Data for Wave 1 (W1), which form the basis of the analysis in the current study, were collected between October 2009 and July 2011. During this period, 8,175 individuals from a sample of 6,279 households were recruited to participate in the study. Respondents' spouses and partners were also invited to participate, regardless of their age, and so the full W1 sample size is 8,504. The data were primarily collected using Computer Assisted Personal Interviewing (CAPI), carried out face-to-face by trained interviewers at each individual’s home. Sensitive questions were included in a supplemental self-completed questionnaire (SCQ), which respondents returned by mail. Wave 1 respondents were also invited to attend a nurse-administered health assessment at a dedicated centre or, where attendance was infeasible or impractical, to complete a modified partial assessment in the home. Follow-up data have been collected at two-year intervals (19,20) but are not used here.
TILDA recruitment followed the RANSAM protocol (21), a method which samples households from the population of residential addresses in the Republic of Ireland. The geo-location of each respondent's residential address is thus known and can form the basis of spatial links to additional external data sources.
Outcome: Body Mass Index (BMI)
BMI, calculated as a person's weight in kilograms divided by the square of their height in metres, serves as the health outcome of interest in this paper. The index is widely used as a tool to classify adult obesity based on the cut-off values defined by the World Health Organization (22). Self-reported measures of height and weight are subject to measurement error (11,12), and so we use objective measurements of height and weight that were collected as part of the TILDA health assessment. After each participant had removed footwear and any heavy outer garments, SECA 240 wall-mounted rods and SECA electronic floor scales were used to record height and weight, respectively (23). Since the health assessment was an optional component of the study, a valid BMI measurement is unavailable for 2,302 respondents in our sample, necessitating their exclusion.[1] Those with a BMI more than three standard deviations from the mean of the distribution (n=63) are also excluded from the analysis, as the recorded values appear biologically implausible. See Figure 1 for full details on how the final sample was constructed. The distribution of BMI values among TILDA respondents in this final sample is presented in Figure A1 in the Appendix. The observed range of BMI scores is 15.88 to 43.89, with a mean value of 28.45.
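The BMI definition and the three-standard-deviation exclusion rule described above can be illustrated with a minimal sketch. The function names and the toy measurements are hypothetical, and the use of the sample (rather than population) standard deviation is an assumption, since the text does not specify which was used.

```python
from statistics import mean, stdev


def bmi(weight_kg: float, height_m: float) -> float:
    """Body Mass Index: weight in kilograms divided by height in metres squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def within_sd(values: list[float], n_sd: float = 3.0) -> list[float]:
    """Keep values no more than n_sd sample standard deviations from the mean.

    Assumption: the sample standard deviation is used as the yardstick.
    """
    centre = mean(values)
    spread = stdev(values)
    return [v for v in values if abs(v - centre) <= n_sd * spread]


# Hypothetical (weight in kg, height in m) pairs, not TILDA data.
measurements = [(70.0, 1.75), (85.5, 1.80), (62.0, 1.60)]
bmi_values = [bmi(w, h) for w, h in measurements]
```

In this sketch the exclusion is applied in a single pass, so the mean and standard deviation are computed once on the full set of values before any are dropped.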
Additional Control Variables
The geography of urban green spaces may be systematically associated with socioeconomic characteristics (24). In particular, those with favourable economic circumstances may have the ability to self-select into more attractive and potentially greener neighbourhoods (25). While the structure of our combined data source does not allow us to capture all such factors, the richness of the TILDA dataset allows us to control for many socioeconomic, demographic, and health-related factors that may jointly determine BMI and exposure to green space. Importantly, we control for income category in all our econometric models. Failure to do so could lead to overestimation of a positive relationship between greenness and health (13). Our full set of control variables closely follows Dempsey et al. (17) and includes age category, urban location, gender, income category, employment status, marital status, highest level of educational attainment, medical cover, smoking status, and a dummy variable that indicates reported difficulty walking 100m. Descriptive statistics for these variables appear in Table 1.
Table 1: Descriptive statistics.
| Variable | Category | Frequency | Percent |
|---|---|---|---|
| Green Space (1600m Network) | Non-urban Settlement | 3,561 | 61.35 |
| | Quintile 1A (lowest) | 449 | 7.74 |
| | Quintile 2A | 449 | 7.74 |
| | Quintile 3A | 450 | 7.75 |
| | Quintile 4A | 447 | 7.70 |
| | Quintile 5A (highest) | 448 | 7.72 |
| Urban Location | Non-Dublin | 4,288 | 73.88 |
| | Dublin | 1,516 | 26.12 |
| Gender | Male | 2,672 | 46.04 |
| | Female | 3,132 | 53.96 |
| Age Category | 50-64 | 3,462 | 59.65 |
| | 65-74 | 1,548 | 26.67 |
| | ≥ 75 | 794 | 13.68 |
| Income Category | 0 - 9,999 | 426 | 7.34 |
| | 10,000 - 19,999 | 1,009 | 17.38 |
| | 20,000 - 39,999 | 1,944 | 33.49 |
| | 40,000 - 69,999 | 1,236 | 21.30 |
| | ≥ 70,000 | 560 | 9.65 |
| | Not reported | 629 | 10.84 |
| Marital Status | Married | 4,197 | 72.31 |
| | Never married | 471 | 8.12 |
| | Sep/divorced | 387 | 6.67 |
| | Widowed | 749 | 12.90 |
| Employment Status | Employed | 2,209 | 38.06 |
| | Retired | 2,144 | 36.94 |
| | Other | 1,451 | 25.00 |
| Smoker | Never | 2,606 | 44.90 |
| | Past | 2,266 | 39.04 |
| | Current | 932 | 16.06 |
| Educational Attainment | Primary/none | 1,519 | 26.17 |
| | Secondary | 2,371 | 40.85 |
| | Third/higher | 1,914 | 32.98 |
| Medical Cover | Not covered | 588 | 10.13 |
| | Medical insurance | 2,631 | 45.33 |
| | Medical card | 2,585 | 44.54 |
| Mobility | No difficulty walking 100m | 5,456 | 94.00 |
| | Difficulty walking 100m | 348 | 6.00 |
| Total | | 5,804 | 100.00 |
Consistent with the overall cohort, females are slightly over-represented, making up 54% of our final sample (23). Despite TILDA’s focus on older people, the W1 cohort is relatively young and active in the labour market, with 59.7% of the sample under the age of 65 and 38% in employment at the time of interview. A broad spectrum of educational attainment and income levels is captured in the data. Smoking is prevalent among the cohort, with past and current smokers combined accounting for 55.1% of respondents. Mobility-limiting disabilities are relatively uncommon at W1, with 6.0% indicating that their ability to walk 100m would be impeded by some physical or mental health condition. Nevertheless, it is important to control for such difficulties, as the relationship between greenness and BMI is likely mediated by an ability to access and utilise the relevant spaces.
Land Use Data: Prime2
The spatial information used to derive the amount of urban green space in the vicinity of TILDA residential addresses is drawn from ‘Prime2’, an object-oriented digital mapping model which standardises a wealth of spatial data for Ireland. The dataset was developed by Ordnance Survey Ireland (OSI), the country’s national mapping agency. Prime2 includes three features that are particularly relevant to the current study: 1) detailed land-use data from which green areas can be identified, 2) a fully connected road network from which the theoretical accessibility of green areas can be imputed, and 3) a complete (albeit disjoint) set of urban footpaths from which the feasibility of walking along a particular route may be approximated. Walkable footpaths are taken to include the set of paths labelled as Sidewalks, Boardwalk, Walk general, Pedestrian Zone, Walk unmarked and Towpath. They exclude those defined as Pedestrian bridge, Pedestrian plaza or Steps, not all of which are accessible to pedestrians. Footpaths within parks are not available in the dataset. Data covering extensive areas surrounding the country’s five primary urban centres (Dublin, Cork, Galway, Limerick, Waterford) were made available for the purposes of the current study. These areas, however, contain large commuting zones that may be quite rural in nature. We therefore calculate the various dimensions of footpath-accessible green space only in regions identified as ‘urban settlements’ in the 2011 Irish Census[2]. Figure 2 provides a map of the areas considered ‘urban’ in the analysis.
Characterising Local Green Space
The strategy we employ to determine the greenness of each urban TILDA respondent’s locality builds on existing methods from the literature, with the specific aim of accounting for urban accessibility factors that may be omitted under traditional research designs. Broadly, we use Geographic Information Systems (GIS) to define a buffer zone around each respondent’s residential address, and subsequently calculate the share of land area within the buffer that is made up of green spaces as a measure of exposure.[3] It is ultimately an empirical question how best to specify these buffer zones such that the green space metric captures what has the greatest potential relevance to respondents’ health outcomes. Indeed, past research has shown that observed associations between greenness and health can be sensitive to researchers’ choice of green space characterisation (8).
Basing the analysis on circular buffers ignores various dimensions of connectivity within the urban space and may misrepresent the extent of the area that can be reached by a respondent on foot. For example, if the urban landscape does not offer a straight-line path between the buffer centre and its edge, then an individual wishing to travel between the two locations necessarily traverses a distance greater than the buffer radius. In such cases, a circular buffer can capture green space that lies beyond an assumed maximum walking distance from the residential address. This issue is accentuated in regions where urban layouts do not follow grid systems (as is the case in Ireland), since straight-line paths between locations are generally uncommon. To overcome this issue, we follow a number of recent studies that have carried out green space analysis within network buffers (27–30). Such buffers are drawn based on a maximum distance travelled across a road network (see Panel A of Figure 3).
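The difference between a circular and a network buffer can be illustrated with a toy example. The sketch below is not the GIS procedure used in the study: it assumes a small hypothetical road graph (the node names, coordinates and segment lengths are invented) and uses Dijkstra's algorithm to find the junctions reachable within a maximum network distance, whereas a real network buffer would also cover partial segments and be drawn as a polygon.

```python
import heapq
import math

# Hypothetical toy network: coordinates in metres and road segment lengths.
COORDS = {"home": (0, 0), "A": (1000, 0), "B": (1000, 1000), "park": (0, 1200)}
EDGES = [("home", "A", 1000.0), ("A", "B", 1000.0), ("B", "park", 1020.0)]


def build_graph(edges):
    """Build an undirected adjacency list from (node, node, length) tuples."""
    graph = {}
    for u, v, length in edges:
        graph.setdefault(u, []).append((v, length))
        graph.setdefault(v, []).append((u, length))
    return graph


def network_reachable(graph, source, max_dist):
    """Return {node: shortest network distance} for nodes within max_dist."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > best[u]:
            continue  # stale heap entry; a shorter route was already found
        for v, length in graph.get(u, []):
            nd = d + length
            if nd <= max_dist and nd < best.get(v, math.inf):
                best[v] = nd
                heapq.heappush(heap, (nd, v))
    return best


graph = build_graph(EDGES)
# Circular buffer: straight-line distance from home of at most 1600 m.
in_circle = {n for n, xy in COORDS.items() if math.dist(COORDS["home"], xy) <= 1600}
# Network buffer: distance travelled along the roads of at most 1600 m.
in_network = set(network_reachable(graph, "home", 1600))
```

In this toy layout the park lies 1200m from home in a straight line, so it falls inside the circular buffer, but the only road route to it is 3020m long, so it falls outside the 1600m network buffer.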
While network buffers offer an improved characterisation of the maximum pedestrian-accessible area around a given residential address, they cannot account for all accessibility issues within the chosen buffer space. For example, it may be impractical to walk along certain roads even when they are proximal to one’s residential address. To this issue, we offer a novel solution. We produce network buffers using only roads with which footpaths are associated. Specifically, a junction-to-junction road segment is only included in a network buffer in this study if a set of footpaths, with a combined length which exceeds half that of the road segment, can be identified within 25 metres of the road segment centreline. As a result, our analysis is restricted to geographic areas where the density of local footpaths is high and, on average, green spaces that are not accessible on foot are excluded. A more formal description of our methodology is provided in the Appendix.
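The inclusion rule for road segments can be written down compactly. The following is a minimal sketch under stated assumptions: the `RoadSegment` class and its field names are hypothetical, and the spatial step that identifies which footpaths lie within 25 metres of a segment centreline is assumed to have been done beforehand in GIS.

```python
from dataclasses import dataclass


@dataclass
class RoadSegment:
    segment_id: str
    length_m: float
    # Lengths of footpaths already identified within 25 m of the centreline.
    nearby_footpath_lengths_m: list[float]


def is_footpath_accessible(segment: RoadSegment, min_share: float = 0.5) -> bool:
    """True if combined nearby footpath length exceeds min_share of the road length."""
    return sum(segment.nearby_footpath_lengths_m) > min_share * segment.length_m


# Hypothetical junction-to-junction segments.
segments = [
    RoadSegment("s1", 200.0, [60.0, 50.0]),  # 110 m of footpath, more than half
    RoadSegment("s2", 200.0, [100.0]),       # exactly half, so not included
    RoadSegment("s3", 150.0, []),            # no footpaths nearby
]
walkable = [s.segment_id for s in segments if is_footpath_accessible(s)]
```

Only the segments that pass this test would then be passed to the network-buffer routine, so that roads without footpaths cannot extend the buffer.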
Even within these areas, which we term ‘footpath-accessible network buffers’, the proximity of green space to the road network itself might have a mediating role in any association between greenness and health. For example, recent work has identified explicit associations between street-side greenery and health outcomes (31). To test the relevance of such greenery (e.g., green common areas in housing estates) in our context, we define a second set of buffer zones which restrict the classification of relevant green spaces to those that fall within 50m of roads provided with footpaths (see Panel B of Figure 3). A comparison of results using the two alternative buffer definitions will allow us to identify which set of green spaces, if any, is most associated with BMI.
The appropriate buffer size is also unclear. A recent survey of the literature by Browning & Lee (13) suggests that, on average, larger buffer sizes (up to 2000m) best predict dimensions of physical health, but that for studies which centre the zones on exact residential addresses (as is the case in the current study), this predictive power might plateau at a much smaller buffer size (500-1000m). Since our observed results may be sensitive to this choice, we perform our analysis using multiple buffer extents. Our main specification follows Dempsey et al. (17) in using a 1600m buffer, which creates a zone roughly appropriate for a 20-minute walk from one’s home address. We then repeat the statistical analysis with a smaller 800m buffer.
Our final analysis thus utilises four varied characterisations of local green space: “Footpath-accessible network buffers” covering 1600m and 800m spaces and “footpath-accessible street-side buffers” of the same sizes. In order to preserve the anonymity of individual TILDA respondents, the final variables enter our statistical models in categorical form. Specifically, the variables used represent the quintile of green space exposure which a respondent receives. The correlations among these measures are shown in Table 2 for the 1600m metrics. The correlation between street-side and network buffers is high; that between these metrics and circular buffers is lower. Respondents who reside in non-urban settlement areas are coded as a separate category to allow a larger sample to be used, permitting more precise estimation of non-green space control variables.
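The categorical coding described above can be sketched as follows. This is an illustration only: the function name and the respondent identifiers are hypothetical, quintiles are assigned by simple rank among urban respondents, and ties are broken by identifier, which may differ from the procedure actually used.

```python
def green_space_categories(green_share):
    """Map respondent id -> 0 (non-urban settlement) or 1..5 (urban quintile).

    green_share maps respondent id -> share of buffer area that is green,
    or None where the respondent lives outside an urban settlement.
    """
    # Sort urban respondents from least to most green (ties broken by id).
    urban = sorted(
        (share, rid) for rid, share in green_share.items() if share is not None
    )
    # Non-urban respondents form their own category.
    categories = {rid: 0 for rid, share in green_share.items() if share is None}
    n_urban = len(urban)
    for rank, (_, rid) in enumerate(urban):
        categories[rid] = rank * 5 // n_urban + 1
    return categories


# Hypothetical inputs: ten urban respondents and two non-urban respondents.
shares = {f"u{i:02d}": i / 20 for i in range(10)}
shares.update({"r01": None, "r02": None})
categories = green_space_categories(shares)
```

Coding non-urban respondents as category 0 keeps them in the estimation sample, so they still contribute to the estimates for the non-green-space covariates.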
Table 2: Spearman rank correlations for green space quintiles, comparing 1600m circular, network and street-side buffers.
| | Circular buffer | Network buffer | Street-side buffer |
|---|---|---|---|
| Circular buffer | 1.00 | | |
| Network buffer | 0.697 | 1.00 | |
| Street-side buffer | 0.641 | 0.841 | 1.00 |
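For readers unfamiliar with the statistic reported in Table 2, a Spearman rank correlation is a Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below is a plain-Python illustration on invented quintile values, not the computation performed on the TILDA data.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions in the tie
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical quintile assignments for ten respondents under two buffer types.
network = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
street = [1, 2, 3, 5, 4, 2, 1, 3, 4, 5]
rho = spearman(network, street)
```

Because quintile variables contain many ties by construction, the average-rank treatment matters here; the shortcut formula based on squared rank differences is only exact when there are no ties.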
Model
We test the association between urban green space and BMI using a generalised linear model (GLM). The GLM framework offers additional model flexibility compared to traditional Ordinary Least Squares (OLS) and is employed when the distribution of the outcome variable may not be normal. In particular, the researcher may specify a functional form that links the outcome variable to a linear index of explanatory variables and make a distributional assumption about the variance of the estimator. The model, as it applies to the current context, is as follows:
[Please see the supplementary files section to view the equations.]
where the link function relates BMI to our independent variables of interest, green_i is a categorical representation of local green space for individual i, and the remaining term represents the vector of k socioeconomic and health-related control variables discussed above. We perform a specification search to identify the most appropriate functional form for the link function and the most appropriate estimator family. In the search process, we allow the link to be the identity (linear), natural log, or square root function, and the family to be Gaussian, Poisson, or Gamma. The variance of the dependent variable (Var) is assumed to vary (∝) according to some function of the mean of the variable. The models chosen are those with the lowest Akaike’s Information Criterion (AIC) (32) and Schwarz Bayesian Criterion (BIC) (33). The Gaussian family model with identity link function emerges as the most efficient. As such, the results reported are equivalent to those produced by a linear OLS model. Robust standard errors, clustered at the household level, are computed to allow for a general form of heteroscedasticity. We run two specifications of each model: one with the full set of covariates as set out above, and a second “parsimonious” model that excludes groups of covariates that are collectively insignificant at the 5 per cent level. This allows us to check the sensitivity of the urban green space coefficients to the set of covariates chosen for inclusion.
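For orientation, a GLM of the kind described can be written in generic form as below. This is a sketch under assumed notation and is not a reproduction of the equations in the supplementary files: the symbols g, β, γ, X and λ are chosen here for illustration, with the green space term expanded into one indicator per category.

```latex
g\bigl(\mathbb{E}[\mathrm{BMI}_i]\bigr)
  = \beta_0
  + \sum_{q} \beta_q \,\mathbf{1}[\mathrm{green}_i = q]
  + \sum_{k} \gamma_k X_{ik},
\qquad
\operatorname{Var}(\mathrm{BMI}_i) \propto \bigl(\mathbb{E}[\mathrm{BMI}_i]\bigr)^{\lambda},
\quad \lambda \in \{0, 1, 2\}

\mathrm{AIC} = 2p - 2\ln\hat{L},
\qquad
\mathrm{BIC} = p\ln n - 2\ln\hat{L}
```

Under this notation, λ = 0, 1 and 2 give the variance functions of the Gaussian, Poisson and Gamma families respectively, p is the number of estimated parameters, n the number of observations, and L̂ the maximised likelihood. With the identity link and λ = 0, the coefficient estimates coincide with those from OLS.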
[1] Our sample represents 73 per cent of TILDA Wave 1 respondents. Those with more education, in better health and in the youngest age groups were more likely to complete the TILDA health assessment (23).
[2] Specifically, the green space surrounding a TILDA residential address is characterised if a) the address is located within a cluster of at least 50 occupied dwellings; b) the cluster contains clear evidence of an urban centre; and c) the distance to the next nearest occupied dwelling does not exceed 100m (24).
[3] Green spaces are derived from the vegetation layer of Prime2. The various land uses that this layer incorporates are detailed in Appendix Table A1.