2.1 Study population
UK Biobank is a large prospective cohort of middle-aged adults designed to support biomedical research aimed at improving the prevention, diagnosis, and treatment of chronic disease; its methods and aims have been reported elsewhere7. In brief, between April 2007 and December 2010, UK Biobank recruited 502,628 participants from the general population (5.5% response rate; most were aged 40–70 years)8. Participants attended 1 of 22 assessment centers across England, Wales, and Scotland, where they completed a touchscreen questionnaire, had physical measurements taken, and provided biological samples. All participants provided written informed consent, and the study was approved by the NHS National Research Ethics Service. This research has been conducted using the UK Biobank Resource under Application Number 56925.
In the present study, we included participants with data on the distance from their residence to the coast (n = 440,874) and excluded those with previous cardiovascular disease (coronary heart disease or stroke, n = 28,980) or cancer (n = 34,544) at baseline, leaving 377,340 participants for analysis (Figure S1).
2.2 Ascertainment of outcome
In UK Biobank, hospital admissions were identified via record linkage to Hospital Episode Statistics records for England and Wales and the Scottish Morbidity Records for Scotland8. Detailed information about the record linkage procedures is available online. Incident MI, comprising fatal and non-fatal ST-segment elevation and non-ST-segment elevation MI, was defined by ICD-10 (International Classification of Diseases, 10th revision) codes I21, I21.4, and I21.9 recorded on hospital admission. At the time of analysis, the last recorded MI was on March 31, 2017, and this date was used as the censoring date. Follow-up therefore ended at the date of incident MI or, for participants with no recorded outcome, at the censoring date, whichever occurred first.
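The outcome definition and censoring rule can be illustrated with a minimal sketch. The record layout (pairs of admission date and ICD-10 code) and the function names are hypothetical, and matching every code that begins with I21 is an assumption made here for brevity; the study itself lists I21, I21.4, and I21.9.

```python
from datetime import date

CENSOR_DATE = date(2017, 3, 31)  # censoring date stated in the text
MI_PREFIX = "I21"                # assumption: any I21 subcode counts as MI


def is_mi_code(code):
    """Return True if an ICD-10 code denotes MI ('I21.4' and 'I214' both match)."""
    return code.strip().upper().replace(".", "").startswith(MI_PREFIX)


def follow_up(baseline, admissions):
    """Return (follow-up in years, event flag) for one participant.

    admissions is a list of (admission_date, icd10_code) pairs; this layout
    is hypothetical and does not reflect the UK Biobank data format.
    """
    mi_dates = [d for d, code in admissions
                if is_mi_code(code) and baseline < d <= CENSOR_DATE]
    if mi_dates:
        end, event = min(mi_dates), True   # first MI ends follow-up
    else:
        end, event = CENSOR_DATE, False    # otherwise censor at the fixed date
    return (end - baseline).days / 365.25, event
```

A participant with no qualifying admission contributes follow-up time up to the censoring date with an event flag of False.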
2.3 Ascertainment of exposures
Environmental indicators attributed to participants were based on home location grid references. Data on the natural environment were linked using CEH 2007 Land Cover Map data. Measures of residential greenspace were estimated for residents of England using the 2005 Generalized Land Use Database for England. This database provides data on land use distribution for 2001 Census Output Areas in England, and its use is consistent with previous related research9, 10. Residential distance to the coast was defined as the distance from the participant's residential address to the coast, measured in kilometers (km). A raster of Euclidean distance from the coastline was calculated at a small grid cell size, and values from the grid were then allocated to UKB point locations. Based on existing literature, distances to the coast were collapsed into five categories: 0–1 km, 1–5 km, 5–20 km, 20–50 km, and over 50 km11. To obtain groups of approximately equal sample size, we additionally divided the data into quintiles for the current analyses.
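The two groupings of distance to the coast can be sketched as follows. This is an illustration only: the function names are hypothetical, and the handling of boundary values (a value on a cut point falls in the upper group) is an assumption, chosen to agree with the "< 1 km" and "≥ 50 km" labels used in the statistical analysis.

```python
from bisect import bisect_right
from statistics import quantiles

BAND_EDGES = [1, 5, 20, 50]  # km, fixed cut points taken from the text
BAND_LABELS = ["<1 km", "1-5 km", "5-20 km", "20-50 km", ">=50 km"]


def distance_band(km):
    """Assign a distance to one of the five literature-based bands.

    bisect_right places a boundary value in the upper band (1.0 -> '1-5 km').
    """
    return BAND_LABELS[bisect_right(BAND_EDGES, km)]


def quintile_cutpoints(distances):
    """Return the four cut points that split the sample into quintiles."""
    return quantiles(distances, n=5)


def quintile(km, cuts):
    """Assign a distance to Q1..Q5 given the sample-based cut points."""
    return "Q%d" % (bisect_right(cuts, km) + 1)
```

The fixed bands are reproducible across studies, while the quintiles depend on the sample distribution and give roughly equal group sizes.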
2.4 Data on potential confounders and effect modifiers
The following were treated as potential confounders: sociodemographic factors (age, gender, ethnicity, Townsend deprivation index, professional qualifications, income, employment, and month of recruitment); health-related variables (overall health rating, mental health, handgrip strength, family history of heart diseases, use of aspirin, cholesterol-lowering, and blood pressure medication, and prevalent diabetes and hypertension at baseline); lifestyle factors (smoking status, drinking status, body mass index, total physical activity, sedentary lifestyle, sleep duration, and dietary intake); residential air and noise pollution (nitrogen oxides, nitrogen dioxide [NO2], particulate matter [PM], traffic intensity, and average daytime and night-time sound level of noise pollution); home area population density classified as urban or rural; and greenspace (domestic garden percentage, greenspace percentage, natural environment percentage, and water percentage).
Age was calculated from dates of birth and baseline assessment. Qualification, average total household income, current employment status, overall health rating, mental health status, family history of heart diseases, medication use, and sleep pattern were recorded using an electronic questionnaire completed by participants. Smoking status and drinking status were each categorized as never, former, or current. Area-based socioeconomic status was derived from the postal code of residence using the Townsend deprivation score12. Dietary information was collected via the Oxford WebQ, a web-based 24-hour recall questionnaire developed specifically for large population studies13. Physical activity was self-reported using the International Physical Activity Questionnaire short form, and total physical activity was calculated as the sum of walking, moderate, and vigorous exercise, expressed in metabolic equivalents (MET-h/week)14. Grip strength was assessed using a hydraulic hand dynamometer while the participant was sitting14, 15. Total time spent in sedentary behaviors was derived as the sum of self-reported time spent driving, using a computer, and watching television. Land use regression (LUR)-based estimates of NO2, PM10, and PM2.5 for 2010 were generated as part of the European Study of Cohorts for Air Pollution Effects (ESCAPE) and linked to the geocoded residential addresses of UK Biobank participants16. Noise estimates were derived from a simplified version of the Common Noise Assessment Methods in the European Union (CNOSSOS-EU) framework17. Home area population density, classified as urban or rural, was derived by combining each participant's home postcode with data from the 2001 census from the Office for National Statistics, using the Geoconvert tool from the Census Dissemination Unit. More details for each variable are available on the UK Biobank website http://www.ukbiobank.ac.uk/.
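As an illustration of how total physical activity in MET-h/week can be derived from questionnaire responses, the sketch below sums the three activity types. The MET weights (3.3 for walking, 4.0 for moderate, 8.0 for vigorous activity) follow the commonly used IPAQ short-form scoring convention and are an assumption here, since the text does not state the weights applied; the input layout and function name are hypothetical.

```python
# Assumed MET weights from the usual IPAQ short-form scoring convention.
IPAQ_MET = {"walking": 3.3, "moderate": 4.0, "vigorous": 8.0}


def total_met_hours_per_week(activity):
    """Sum MET-hours per week across walking, moderate, and vigorous activity.

    activity maps each intensity to (days per week, minutes per day);
    this input layout is hypothetical.
    """
    met_minutes = sum(IPAQ_MET[kind] * days * minutes
                      for kind, (days, minutes) in activity.items())
    return met_minutes / 60.0  # convert MET-minutes to MET-hours
```

For example, 30 minutes of walking on 5 days, 60 minutes of moderate activity on 2 days, and 30 minutes of vigorous activity on 1 day give 1,215 MET-min/week, or 20.25 MET-h/week.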
2.5 Statistical analysis
Baseline characteristics of the 377,340 participants were described as means or percentages and were compared between groups using one-way ANOVA, the χ2 test, or the Kruskal-Wallis test, as appropriate. Missing data were coded as a separate missing-indicator category for categorical variables (such as smoking status) and were replaced with mean values for continuous variables.
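The handling of missing data described above can be sketched as two small helpers. This is a minimal illustration, assuming missing values are represented as None; the function names and the "Missing" label are hypothetical.

```python
from statistics import mean


def fill_categorical(values, label="Missing"):
    """Recode missing categorical values as their own indicator category."""
    return [label if v is None else v for v in values]


def fill_continuous(values):
    """Replace missing continuous values with the mean of the observed ones."""
    observed = [v for v in values if v is not None]
    observed_mean = mean(observed)
    return [observed_mean if v is None else v for v in values]
```

The missing-indicator approach keeps all participants in the model, at the cost of treating "missing" as if it were a meaningful level of the variable.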
The association between residential distance to the coast and MI was explored using Cox proportional hazards models. The proportional hazards assumption was checked by tests based on Schoenfeld residuals. The results were reported as hazard ratios (HRs) together with 95% confidence intervals (CIs). First, distance to the coast was treated as a continuous variable, and HRs were calculated per 1 SD (26.7 km) difference in distance to the coast. Then we categorized distance to the coast into < 1 km, 1–5 km, 5–20 km, 20–50 km, and ≥ 50 km groups and calculated the HRs for the other four groups, taking the first group (< 1 km) as the reference. We also categorized distance to the coast into quintiles (Q1–Q5) based on the sample distribution and calculated the HRs for the upper four quintiles, taking the first quintile (Q1) as the reference. Models were specified a priori to investigate the impact of incremental adjustment. Model 1 adjusted for age, gender, ethnicity, social deprivation, income, employment status, total physical activity, overall health rating, smoking status, drinking status, BMI, and handgrip strength. Model 2 additionally adjusted for family history of heart diseases, use of aspirin, cholesterol-lowering, and blood pressure medication, prevalent diabetes, and hypertension. Model 3 further adjusted for air pollution, noise pollution, sleep duration, dietary intake, and home area population density.
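To make the "per 1 SD" scaling concrete, the sketch below converts a Cox regression coefficient expressed per km into an HR and Wald-type 95% CI for a chosen increment, such as the 26.7 km SD. The coefficient and standard error in the usage note are made-up values for illustration, not results from this study, and the function name is hypothetical.

```python
import math


def hr_per_increment(beta_per_km, se_per_km, increment_km, z=1.959964):
    """Return (HR, lower, upper) for an increment_km difference in exposure.

    beta_per_km and se_per_km are the log-hazard coefficient and its
    standard error per 1 km; z is the normal quantile for a 95% interval.
    """
    log_hr = beta_per_km * increment_km
    half_width = z * se_per_km * increment_km
    return (math.exp(log_hr),
            math.exp(log_hr - half_width),
            math.exp(log_hr + half_width))
```

With an illustrative coefficient of 0.001 per km and standard error of 0.0005, `hr_per_increment(0.001, 0.0005, 26.7)` rescales both to the SD before exponentiating, so the HR and its CI refer to a 26.7 km difference rather than a 1 km difference.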
To examine the overall statistical significance and the non-linearity of the exposure, we used likelihood ratio tests. A multivariable restricted cubic spline with 3 knots was used to describe the dose-response relationship. Based on the result of the restricted cubic spline, we took participants in the intermediate area (32–64 km), which fell within the lowest risk interval, as the reference group and calculated HRs for living in the offshore region (< 32 km) and the inland region (> 64 km) separately, using Cox proportional hazards models with incremental adjustment. We conducted subgroup analyses to assess potential effect modification by the following factors: sex, age, BMI, sedentary behavior, sleep duration, total physical activity, smoking status, drinking status, income, area-based socioeconomic status, mental health status, urban area, air pollution, noise pollution, and hypertension. Effect modification was tested by adding to the fully adjusted model an interaction term between the exposure and each of these variables. A sensitivity analysis excluding MI events occurring within the first 24 months of follow-up was also conducted to reduce the possible impact of reverse causation.
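A restricted cubic spline with 3 knots contributes one linear and one non-linear term to the model. The sketch below builds that basis using the widely used truncated-power parameterization, which is constrained to be linear beyond the outer knots. It is an illustration under stated assumptions: the knot positions in the usage note are hypothetical (the text does not report them), and the statistical package used for the analysis may scale the terms differently.

```python
def rcs_basis(x, knots):
    """Return the restricted cubic spline basis for x: [x, nonlinear terms...].

    With k knots the basis has k - 1 columns; each nonlinear term is zero
    below the first knot and linear in x above the last knot.
    """
    t = sorted(knots)
    if len(t) < 3:
        raise ValueError("a restricted cubic spline needs at least 3 knots")

    def cube(u):
        return max(u, 0.0) ** 3  # truncated cubic (u)_+^3

    scale = (t[-1] - t[0]) ** 2  # keeps the terms on the scale of x
    span = t[-1] - t[-2]
    basis = [x]
    for tj in t[:-2]:
        term = (cube(x - tj)
                - cube(x - t[-2]) * (t[-1] - tj) / span
                + cube(x - t[-1]) * (t[-2] - tj) / span)
        basis.append(term / scale)
    return basis
```

For example, `rcs_basis(10.0, [2.0, 20.0, 60.0])` returns two values that would enter the Cox model as covariates. A likelihood ratio test comparing models with and without the second (non-linear) term then assesses departure from linearity.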
All analyses were performed with SPSS V26 (IBM) and Stata V15 (Stata Corporation, College Station, TX, USA). A two-sided P-value < 0.05 was considered statistically significant.