This study used data from the LIBCSP, a population-based case-control study designed to examine the effects of environmental exposures on breast cancer risk in Nassau and Suffolk counties on Long Island, New York. Details of the parent study have been published previously [30]. Institutional Review Board approval was obtained from all participating institutions. This analysis was reviewed by the UNC IRB and was determined to be exempt.
Study Population
LIBCSP population-based sample: The LIBCSP consists of English-speaking women from Long Island, NY. Study enrollment occurred between August 1, 1996 and July 31, 1997. Cases were women aged 20 years or older diagnosed with either invasive or in situ breast cancer, identified through the pathology departments of 31 hospitals on Long Island and in New York City. Controls were women without breast cancer; those under 65 years of age were identified using probability-based random digit dialing, and those aged 65 years or older through Health Care Financing Administration rosters [30]. Controls were frequency matched to the expected age distribution of breast cancer cases in 5-year age groups. The final study sample included 3,064 women, of whom 1,508 were cases (1,273 with invasive disease) and 1,556 were controls [30]. Ninety-three percent of women identified as White, 5% as Black, and 2% as other. Of the total study population, 4% identified as Hispanic [30]. The racial distribution of the study population is representative of the target population at study enrollment.
LIBCSP baseline assessments: A structured two-hour baseline questionnaire was administered by trained interviewers shortly after diagnosis. The interview assessed breast cancer risk factors before and at diagnosis, along with demographic information. Participation in the interview was higher among cases (82.1%) than among controls (62.7%) [30]. To minimize the influence of treatment, blood samples were obtained before treatment initiation for 77.2% of breast cancer patients [30].
PAH Exposure Sources Assessment
Detailed methods for assessing PAH sources in the LIBCSP have been published previously [31]–[34]. The interviewer-administered questionnaire included questions on active and passive smoking, grilled/smoked meat consumption, synthetic log use, and historical residential addresses [16], [20], [31], [35], [36].
Active/Passive Smoking: Current active smoking (yes, no) was defined as smoking during the previous 12 months. To evaluate residential environmental tobacco smoke (ETS) exposure, study participants were asked whether they had ever lived with a spouse who smoked (n=1,515 controls/1,468 cases) [31].
Grilled/Smoked Food Intake: Women were asked to recount their consumption of four categories of grilled or smoked foods (smoked beef, lamb, and pork; grilled/barbecued beef, lamb, and pork; smoked poultry or fish; and grilled/barbecued poultry or fish) during six decades of life. For breast cancer cases, assessment stopped at the age of diagnosis. Lifetime consumption was defined as the mean number of servings consumed per year and was categorized based on the distribution among controls. Based on previous findings from the LIBCSP, this variable was dichotomized (<55 servings/year, ≥55 servings/year) [16].
Synthetic Log Use: Participants who reported using an indoor stove or fireplace at least three times per year were asked what types of material (wood, coal, synthetic logs, or gas) were burned [20].
Vehicular Traffic: A validated geographic model was developed to estimate study participants' vehicular traffic exposure in 1995 and earlier [33], [49], [196]. BLR software was used to geocode women's current and former residential addresses in Nassau and Suffolk counties. Geocoding was limited to addresses where study participants had resided for at least 1 year. The traffic model incorporated a road network of a half-million streets in the greater New York metropolitan area. The model included historical US automobile PAH emission data; New York metropolitan traffic patterns, including historic traffic counts; weather- and traffic-related emission patterns; weather variables; and pollutant dispersion parameters [33]. In a previous analysis of the LIBCSP, the association with breast cancer was limited to the top 5% of vehicular traffic exposure, so exposure was dichotomized at the 95th percentile (low: <95th percentile; high: ≥95th percentile) [35].
Measurement of Plasma 25(OH)D
Plasma 25(OH)D: Plasma vitamin D was measured in 2,101 study participants (1,026 cases and 1,075 controls). Plasma 25(OH)D was quantified using the DiaSorin radioimmunoassay (RIA), which measures both vitamin D3 produced in the skin and diet-derived vitamin D2 [198]. Samples were assayed between September and December 2007 using 8 distinct assay lots [37].
Adjustment for Seasonal Trend in Vitamin D
To estimate the seasonal trend in vitamin D, data from controls were used to fit the following model:

w(t) = ς(t) + ε

where w(t) is the measured vitamin D concentration for a study participant whose blood was drawn at week t, ς(t) is the seasonal trend in the population, and the error ε is assumed to be approximately Normal(0, σ²) and independent of ς(t) [38]. To remove variation due to season of blood draw, we applied the parameter estimates from this model to the entire study population and added the study-specific mean to the resulting residuals [39]. The adjusted vitamin D measurement was then dichotomized as <30 ng/mL and ≥30 ng/mL based on the distribution of this constructed variable among controls. This cut-point was determined using restricted cubic splines, based on the value above which breast cancer risk began to decrease. The adjusted values were used for all subsequent analyses that involved measured vitamin D.
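The seasonal adjustment described above can be sketched as follows. The source does not state the functional form of the seasonal trend ς(t), so this sketch assumes a single annual sine/cosine pair with a 52-week period; the function names and the small least-squares solver are likewise illustrative and are not the study's SAS code. The trend is fit on controls only, and the adjusted value for every participant is the residual from that trend plus the overall study mean.

```python
import math

def design_row(week, period=52.0):
    """Intercept plus one sine/cosine pair (assumed form of the seasonal trend)."""
    angle = 2.0 * math.pi * week / period
    return [1.0, math.sin(angle), math.cos(angle)]

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_seasonal_trend(weeks, values):
    """Ordinary least squares fit of the trend; pass control data only."""
    rows = [design_row(t) for t in weeks]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(k)]
    return solve_linear(xtx, xty)

def season_adjust(weeks, values, beta):
    """Residual from the fitted trend plus the overall mean of `values`."""
    overall_mean = sum(values) / len(values)
    adjusted = []
    for t, y in zip(weeks, values):
        trend = sum(b * x for b, x in zip(beta, design_row(t)))
        adjusted.append(y - trend + overall_mean)
    return adjusted

def dichotomize(adjusted, cut_point=30.0):
    """True where the adjusted concentration is at or above the cut-point (ng/mL)."""
    return [value >= cut_point for value in adjusted]
```

In use, `fit_seasonal_trend` would be called with the controls' blood-draw weeks and measured concentrations, and `season_adjust` with the weeks and concentrations of all participants, so that cases and controls are adjusted with the same control-derived trend.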
Genotyping Assays
The SNP analysis was limited to White women (967 cases/993 controls) because of sample size and population stratification concerns. Based on a previous analysis in the LIBCSP, we selected 25 SNPs for their known or suspected impact on the vitamin D pathway or because they had shown associations with breast cancer in previous studies. This selection included 13 SNPs in VDR: BsmI (rs1544410), rs2071358, rs2239181, rs2239182, rs2408876, rs2544038, rs3782905, rs7299460, TaqI (rs731236), ApaI (rs7975232), rs10875694, rs11168287, and rs11168314; 10 SNPs in the 24-hydroxylase gene (CYP24A1): rs927650, rs2181874, rs2244719, rs2585428, rs2762939, rs3787557, rs4809960, rs6022999, rs6068816, and rs13038432; and 2 SNPs in the vitamin D-binding protein gene (GC): rs4588 and rs7041 [40].
As described previously, SNPs were genotyped using the fluorogenic 5′-nuclease (TaqMan) assay with the TaqMan Core Reagent Kit (Applied Biosystems, Foster City, California) [30]. The fluorescence profile of each well was measured in an ABI 7500HT Sequence Detection System, and the results were analyzed with Sequence Detection Software [30]. Controls for genotype at each locus and two controls with no DNA were included on each plate [30]. Laboratory personnel were blinded to case/control status.
Genotypes were dichotomized as homozygous for the common allele versus heterozygous or homozygous for the minor allele. For each SNP, the referent group was the genotype category with the presumed lowest risk in the main effects model from the previous LIBCSP analysis [40].
Confounder Identification
A review of the literature was used to develop a directed acyclic graph (DAG) and identify potential confounders [41], [42]. Based on the DAG, the minimally sufficient adjustment set included: age at menarche (modeled with restricted quadratic splines with five knots at equally spaced percentiles); parity (nulliparous, parous); lifetime alcohol intake (non-drinkers, <15 g/day, 15 to <30 g/day, ≥30 g/day); education (high school graduate or less, some college, college or post-college); income (<$35,000, $35,000–$69,999, ≥$70,000); BMI (<25 kg/m², 25 to <30 kg/m², ≥30 kg/m²); months of lactation (continuous); and the frequency matching factor, 5-year age group (20-24 years, 25-29 years, 30-34 years, 35-39 years, 40-44 years, 45-49 years, 50-54 years, 55-59 years, 60-64 years, 65-69 years, 70-74 years, 75-79 years, 80-84 years, 85-89 years).
Statistical Analysis
We used unconditional logistic regression to estimate odds ratios (ORs) and 95% confidence intervals (CIs). Multivariable models included the minimally sufficient confounder adjustment set described above. Effect measure modification (EMM) was evaluated on both the additive and multiplicative scales, considering both measured vitamin D level and vitamin D-related SNPs as potential modifiers of the relationship between PAH exposure and the risk of breast cancer. For EMM on the additive scale, single-referent models, in which the lowest risk group served as the referent, were constructed to compute interaction contrast ratios (ICRs) and corresponding 95% CIs [43], [44]. To analyze EMM on the multiplicative scale, models with and without the interaction term were compared using the likelihood ratio test (LRT) with an α=0.05 statistical significance criterion [43]. We also computed ratios of odds ratios (RORs) and 95% CIs [45]. Because treatment may affect circulating vitamin D levels, we conducted sensitivity analyses restricting the sample to women from whom blood was collected prior to chemotherapy (n=1,041). All analyses were completed in SAS 9.4 (SAS Institute Inc., Cary, NC).
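The interaction measures named above can be written out in a few lines. This is a minimal sketch using the standard definitions, not the study's SAS code. It assumes a dichotomous exposure and a dichotomous modifier, so that a single-referent model yields three odds ratios: exposure only (`or_10`), modifier only (`or_01`), and both (`or_11`). With one interaction term, the LRT has 1 degree of freedom. The function and argument names are hypothetical, and the confidence intervals reported in the study are not reproduced here.

```python
import math

def interaction_contrast_ratio(or_10, or_01, or_11):
    """Additive-scale measure: ICR = OR11 - OR10 - OR01 + 1 (0 means no additive interaction)."""
    return or_11 - or_10 - or_01 + 1.0

def ratio_of_odds_ratios(or_10, or_01, or_11):
    """Multiplicative-scale measure: ROR = OR11 / (OR10 * OR01) (1 means no multiplicative interaction)."""
    return or_11 / (or_10 * or_01)

def lrt_one_df(loglik_reduced, loglik_full):
    """Likelihood ratio test for one added interaction term (assumes 1 degree of freedom).

    Returns the test statistic and its p-value. For a chi-square variable
    with 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    """
    statistic = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    return statistic, math.erfc(math.sqrt(statistic / 2.0))
```

For example, hypothetical odds ratios of 1.5 (exposure only), 2.0 (modifier only), and 4.0 (both) give an ICR of 1.5 and an ROR of about 1.33, both of which point toward a joint effect larger than expected on the respective scale.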
For each of the 25 selected polymorphisms, subjects were divided into three groups based on genotype (homozygous common allele, heterozygous, homozygous minor allele). Deviation from Hardy-Weinberg equilibrium (HWE) was assessed using PROC ALLELE in SAS/Genetics version 9.4 (SAS Institute Inc., Cary, NC) at α=0.05. No SNPs exhibited significant departure from HWE [46]. We next assessed the minor allele frequency in both cases and controls. Finally, the SNPs were assessed for linkage disequilibrium. To address multiple hypothesis testing for the associations between SNPs and breast cancer, we applied the Benjamini-Hochberg false discovery rate (FDR) method to all SNP models [47].
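Two of the checks above lend themselves to a short illustration. The sketch below shows a Pearson chi-square test for departure from HWE based on the three genotype counts, and the Benjamini-Hochberg step-up adjustment of a set of p-values. It is an assumption that the chi-square form of the HWE test was the one used; the source names only the SAS procedure. The function names are hypothetical, and the HWE function assumes both alleles are observed at the locus.

```python
import math

def hwe_chi_square(n_common_hom, n_het, n_minor_hom):
    """Pearson chi-square test (1 degree of freedom) for Hardy-Weinberg equilibrium.

    Assumes a biallelic SNP with both alleles present in the sample.
    Returns the test statistic and its p-value.
    """
    n = n_common_hom + n_het + n_minor_hom
    p = (2 * n_common_hom + n_het) / (2 * n)  # common allele frequency
    q = 1.0 - p
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    observed = [n_common_hom, n_het, n_minor_hom]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, math.erfc(math.sqrt(statistic / 2.0))

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

A SNP would be flagged for departure from HWE when the returned p-value falls below 0.05, and an association would be retained after FDR control when its adjusted p-value falls below the chosen FDR threshold.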