Study design and participants
This study is a longitudinal analysis of the Health Workers Cohort Study (HWCS), an ongoing prospective cohort study established in January 2004 with two cohort follow-up waves at six-year intervals on average. The participants are employees from three different health and academic institutions, as well as their relatives, from the cities of Cuernavaca and Toluca, Mexico.
Details concerning the study population of HWCS and the full study design have been described elsewhere. Briefly, from January 7, 2004 to November 27, 2007, 10,729 participants aged 6-94 years were recruited. However, due to financial constraints, only 2,500 (23.3%) of the initially enrolled participants from Cuernavaca were invited to the first follow-up phase between 2010 and 2013, with a response rate of 83% (n = 2,070). Figure 1 shows the flow chart of the participants included in this study. For our analysis, we excluded participants who at baseline were younger than 19 years old (n=169), who had missing data on soft drinks (n=22), or who were pregnant (n=3). Subjects with missing type 2 diabetes baseline data (n=36) or with previously known or newly diagnosed diabetes (n=127), heart disease (n=43), or cancer (n=6) (except skin or melanoma) at baseline were also excluded. We further excluded 80 participants who completed <75% of the food-frequency questionnaire (FFQ), had missing data in an entire section of the FFQ, or reported implausible energy consumption, defined as below a predefined limit of 500 kcal/d or above 6,400 kcal/d, following the standard deviation method, as previously used in studies from this cohort [16,17]. After excluding 144 participants with incomplete data for disease outcome at six-year follow-up, 1,445 participants formed our analytic sample.
Figure 1. Flow chart of the study participants from the Health Workers Cohort Study included in the analytic sample, from 2004 to 2013.
Soft drinks intake
Soft drinks consumption was assessed at baseline and at the subsequent examinations with a semi-quantitative 116-item FFQ that has been validated in the Mexican population. Participants were asked to report the frequency of consumption of a standard portion of each food in the last 12 months using ten possible responses (never, <1 time/month, 1-3/month, 1, 2-4, 5-6 times/week, 1, 2-3, 4-5, 6 or more times/d). Soft drinks were defined as cola soft drinks and flavored carbonated soft drinks with a standard serving of 355 ml. We converted the reported frequency of soft drinks into a daily intake. The frequency was first converted into four categories of intake (<1/month, 1-4/month, 2-6/week, and ≥1/d) to obtain soft drinks consumption data comparable with previous studies. However, because most participants were in the middle two categories of consumption (74.1%), we reclassified the categories of exposure as follows: <1 time/week, 1-4 times/week, and ≥5 times/week.
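As an illustrative sketch only (the study itself was analyzed in Stata), the conversion from FFQ responses to daily intake and to the three final exposure categories could look like the Python code below. The midpoint weights in `WEEKLY_SERVINGS`, the response labels, and the function names are assumptions made for illustration; the source does not report the conversion factors that were used. The sketch also assumes that the 5-6 times/week response belongs to the highest category.

```python
# Assumed midpoints, in servings per week, for the ten FFQ response options.
# These weights are hypothetical; the source does not state its conversion factors.
WEEKLY_SERVINGS = {
    "never": 0.0,
    "<1/month": 0.5 * 12 / 52,
    "1-3/month": 2.0 * 12 / 52,
    "1/week": 1.0,
    "2-4/week": 3.0,
    "5-6/week": 5.5,
    "1/day": 7.0,
    "2-3/day": 17.5,
    "4-5/day": 31.5,
    "6+/day": 42.0,
}


def daily_servings(response: str) -> float:
    """Convert an FFQ response into servings per day (one serving = 355 ml)."""
    return WEEKLY_SERVINGS[response] / 7.0


def intake_category(response: str) -> str:
    """Map an FFQ response onto the three exposure categories."""
    weekly = WEEKLY_SERVINGS[response]
    if weekly < 1.0:
        return "<1/week"
    if weekly < 5.0:
        return "1-4/week"
    return "5+/week"
```

Storing the weights per week rather than per day keeps the category boundaries (1 and 5 servings/week) exact and avoids floating-point rounding at the cut-points.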
Type 2 diabetes
Incident type 2 diabetes was defined as meeting at least one of the following three criteria during follow-up: self-report of physician-diagnosed type 2 diabetes, new use of hypoglycemic medication, or fasting glucose >126 mg/dL during the examination. A fasting venous blood sample (fasting time ≥ 8 hours) was collected from each participant. We measured fasting glucose with the enzymatic colorimetric method using glucose oxidase with a Selectra XL instrument (Randox, ELITechGroup, Delhi, India). The onset of type 2 diabetes was defined based on either the date of the follow-up examination or the year of physician diagnosis self-reported by the participants. One-year intervals between the two examinations were included in the questionnaire to record the time since type 2 diabetes diagnosis. June 30th was set as the diagnosis date for each year. We estimated the date of physician diagnosis by subtracting the time since type 2 diabetes diagnosis from the date when the completed questionnaires were returned.
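The case definition and the dating rule can be summarized in a short Python sketch. It is not the study's code: the record structure, field names, and function names are hypothetical, and the glucose cut-off is applied exactly as written in the text.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class FollowUpRecord:
    """Hypothetical follow-up record; field names are illustrative only."""
    self_reported_diagnosis: bool
    new_hypoglycemic_medication: bool
    fasting_glucose_mg_dl: Optional[float]  # None if no valid fasting sample


def is_incident_t2d(rec: FollowUpRecord, glucose_cutoff: float = 126.0) -> bool:
    """A participant is a case if any one of the three criteria is met."""
    high_glucose = (
        rec.fasting_glucose_mg_dl is not None
        and rec.fasting_glucose_mg_dl > glucose_cutoff
    )
    return (
        rec.self_reported_diagnosis
        or rec.new_hypoglycemic_medication
        or high_glucose
    )


def onset_date(exam_date: date, diagnosis_year: Optional[int] = None) -> date:
    """Use June 30th of the self-reported diagnosis year when available,
    otherwise fall back to the date of the follow-up examination."""
    if diagnosis_year is not None:
        return date(diagnosis_year, 6, 30)
    return exam_date
```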
At each study wave, participants completed a self-administered questionnaire that included information regarding demographic characteristics (age, sex, and educational level), previous and current illnesses, family history of diabetes, medication use, and lifestyle habits (smoking status and physical activity). We used the same measurement instruments for time-varying covariates to ensure comparability across waves. Educational level was categorized as middle school or less, high school, or college or more. Participants were classified according to smoking status as never, former, or current smokers. Alcohol consumption (in g/d) was estimated from the FFQ and categorized in tertiles. We calculated total energy intake in kilocalories by multiplying the frequency of consumption of each food by the energy content of the food and summing over all foods. Leisure-time physical activity was assessed through a validated physical activity questionnaire. Participants were asked to report the weekly leisure time devoted to 16 activities such as walking, running, and cycling. Participants were classified as active if their leisure-time physical activity was ≥150 min/week.
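A minimal Python sketch of the energy calculation and of the activity classification is given below, under stated assumptions: the function names are hypothetical, frequencies are expressed as servings per day, and the energy values are supplied by the caller rather than taken from any food composition table. The plausibility limits are those stated for the exclusion criteria (500 and 6,400 kcal/d).

```python
from typing import Dict


def total_energy_kcal(
    servings_per_day: Dict[str, float],
    kcal_per_serving: Dict[str, float],
) -> float:
    """Sum over all foods of (daily frequency x energy content per serving)."""
    return sum(
        freq * kcal_per_serving[food]
        for food, freq in servings_per_day.items()
    )


def is_plausible_energy(kcal_per_day: float,
                        lower: float = 500.0,
                        upper: float = 6400.0) -> bool:
    """Intakes below the lower or above the upper limit are implausible."""
    return lower <= kcal_per_day <= upper


def is_active(leisure_minutes_per_week: float) -> bool:
    """Active if leisure-time physical activity is at least 150 min/week."""
    return leisure_minutes_per_week >= 150
```

For example, with made-up values of 1 serving/d of a 150 kcal item and 3 servings/d of a 60 kcal item, `total_energy_kcal` returns 330 kcal/d, which `is_plausible_energy` would flag as implausibly low.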
Medical examinations and anthropometric measurements were also performed. All anthropometric measurements were performed by nurses trained to use standardized procedures. Reproducibility was evaluated, resulting in concordance coefficients between 0.83 and 0.90. Weight was assessed on participants wearing minimal clothing with a previously calibrated electronic TANITA scale. Height was measured with a conventional stadiometer. Body mass index (BMI) was calculated as weight (kg) divided by the square of height (m²). Waist circumference (WC) was measured midway between the lowest border of the rib cage and the upper border of the iliac crest, while the participant was standing up. We defined abdominal obesity as waist circumference >90 cm for men and >80 cm for women. Resting blood pressure (mmHg) was measured twice using an automatic digital blood pressure monitor, and the average of the two measurements was calculated. Subjects with a systolic or diastolic blood pressure of >140 mmHg or >90 mmHg, respectively, as well as those who reported use of antihypertensive medication, were classified as hypertensive.
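The derived clinical variables follow directly from these definitions. The Python sketch below is illustrative only; the function names and the `"male"`/`"female"` coding are assumptions, and the cut-offs are applied as strict inequalities, as written in the text.

```python
from statistics import mean
from typing import Sequence


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by height squared (m^2)."""
    return weight_kg / height_m ** 2


def abdominal_obesity(waist_cm: float, sex: str) -> bool:
    """Waist circumference >90 cm for men and >80 cm for women."""
    cutoff = 90.0 if sex == "male" else 80.0
    return waist_cm > cutoff


def is_hypertensive(systolic: Sequence[float],
                    diastolic: Sequence[float],
                    on_antihypertensives: bool) -> bool:
    """Average the two readings, then apply the >140 / >90 mmHg thresholds;
    reported antihypertensive medication also counts as hypertension."""
    return (
        mean(systolic) > 140
        or mean(diastolic) > 90
        or on_antihypertensives
    )
```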
The study sample characteristics across categories of soft drinks intake were described as means and standard deviations, as medians with interquartile ranges (IQR) for skewed distributions, or as percentages for categorical variables. Because of the frequency of missing data at baseline for smoking status (3.5%), education level (2.4%), and abdominal obesity (1.4%), we used a missing indicator category for these covariates to minimize sample size reduction. We calculated person-years of follow-up from the date of returning the baseline questionnaire to the date of type 2 diabetes diagnosis; participants without a diagnosis were censored on the date of their final follow-up visit. To examine the association of soft drinks consumption at baseline with type 2 diabetes, hazard ratios (HRs) along with 95% confidence intervals (CIs) were estimated using Cox proportional hazards regression with time on study as the time scale. The category of <1 time/week was considered the reference group in all analyses.
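The follow-up time entering the Cox models can be sketched as follows. This is a hedged illustration, not the study's Stata code: the function names are hypothetical, and dividing days by 365.25 to obtain years is an assumption the source does not state.

```python
from datetime import date
from typing import Optional, Tuple

DAYS_PER_YEAR = 365.25  # assumed conversion; not stated in the source


def follow_up(baseline: date,
              final_visit: date,
              diagnosis: Optional[date] = None) -> Tuple[float, bool]:
    """Return (person-years, event indicator) for one participant.

    Time runs from the return of the baseline questionnaire to the date of
    type 2 diabetes diagnosis, or to the final follow-up visit if censored.
    """
    event = diagnosis is not None
    end = diagnosis if event else final_visit
    return (end - baseline).days / DAYS_PER_YEAR, event
```

Summing the first element over all participants gives the total person-years, and the second element is the event indicator required by a time-on-study Cox model.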
Several models were fitted to assess the relationship between soft drinks intake and type 2 diabetes incidence. Model 1 was adjusted only for the age of participants (centered, continuous variable). Model 2 was further adjusted for potential confounders identified after reviewing the literature and by using the causal diagram methodology to select all variables related to the exposure and outcome. We considered the following covariates in the multivariable-adjusted analyses: sex, educational level (middle school or less, high school, college or more, missing), total energy intake (continuous), smoking status (never, former, current, missing), leisure-time physical activity in hours per week (active ≥150 min/week), family history of diabetes (no, yes, unknown), and alcohol intake at baseline (tertiles of g/d). Multivariable model 2 was further adjusted for hypertension status (no/yes) to test the potential confounding effect of hypertension on the association between soft drinks and type 2 diabetes. Some studies have suggested that having hypertension increases the risk of type 2 diabetes, and we assumed that hypertensive individuals may alter their soft drinks consumption [24,25]. To examine the potential confounding effect of obesity, we additionally adjusted model 2 for BMI and abdominal obesity at baseline, separately. People with overweight and obesity are more likely to have more energy-dense diets, including soft drinks, than people with healthy weight. On the other hand, obesity is a leading risk factor for type 2 diabetes.
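In generic notation, the Cox models described here have the form shown below. The symbols are introduced only for illustration and do not appear in the source: x_1 and x_2 are indicators for the two non-reference soft drinks categories, and z is the vector of adjustment covariates, which differs between Model 1, Model 2, and the further-adjusted models.

```latex
h(t \mid x_1, x_2, \mathbf{z})
  = h_0(t)\,\exp\!\left(\beta_1 x_1 + \beta_2 x_2
      + \boldsymbol{\gamma}^{\top}\mathbf{z}\right),
\qquad
\mathrm{HR}_k = \exp(\beta_k), \quad k = 1, 2 .
```

Each hazard ratio compares a consumption category with the reference group (<1 time/week) at fixed values of the covariates in z, and h_0(t) is the unspecified baseline hazard on the time-on-study scale.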
The potential modifying effect of first-degree family history of diabetes, as a proxy for genetic susceptibility to type 2 diabetes [13,28], was evaluated by stratifying on family history of diabetes and comparing the HRs within each stratum. We also examined the overall interaction using the Wald test. This analysis included only participants who responded yes or no to the family history of diabetes item (n=1,339). We conducted tests for a linear trend in the HRs by assigning the median value to each category of soft drinks and modeling this variable as a continuous variable in separate Cox regression models (adjusting for the same covariates). The proportional hazards assumption was assessed by a graphical check of the cumulative log hazard versus time and tested by using Schoenfeld residuals, which test the null hypothesis of zero slopes for individual covariates and globally for each regression model. The assumption of proportional hazards was not violated (P > 0.05).
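The trend variable described above can be constructed as in the following Python sketch. The function name and the input layout (one intake value and one category label per participant) are assumptions for illustration; the resulting score would then be entered as a single continuous term in the Cox model.

```python
from statistics import median
from typing import Dict, List, Sequence


def trend_scores(intakes: Sequence[float],
                 categories: Sequence[str]) -> List[float]:
    """Replace each participant's intake with the median intake of
    the consumption category that participant belongs to."""
    grouped: Dict[str, List[float]] = {}
    for intake, category in zip(intakes, categories):
        grouped.setdefault(category, []).append(intake)
    medians = {cat: median(values) for cat, values in grouped.items()}
    return [medians[cat] for cat in categories]
```

The P-value for the coefficient of this score is the test for linear trend across categories.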
We conducted a complete case analysis using data from those participants with complete follow-up data from 2004 to 2018 (n=600). The complete case approach was not considered as the main analysis because of the smaller sample size and the large loss to follow-up, which could affect estimates through selection bias. We are aware that soft drinks consumption may change over time as the cohort ages. To account for this, we further used Cox proportional hazards models in which soft drinks consumption was updated from the follow-up questionnaire and treated as a time-varying variable.
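One common way to prepare data for a Cox model with a time-varying exposure is the counting-process (start, stop] layout, sketched below in Python. The source does not describe how its data were arranged, so this layout, the function name, and the argument names are all assumptions for illustration.

```python
from typing import List, Tuple

Row = Tuple[float, float, str, bool]  # (start, stop, exposure, event)


def split_follow_up(exposure_baseline: str,
                    exposure_updated: str,
                    t_update: float,
                    t_end: float,
                    event: bool) -> List[Row]:
    """Split one participant's follow-up at the time the exposure is updated.

    Times are years since baseline. The event, if any, is attached only to
    the last interval; earlier intervals end without an event.
    """
    if t_end <= t_update:
        # Follow-up ended before the exposure could be updated.
        return [(0.0, t_end, exposure_baseline, event)]
    return [
        (0.0, t_update, exposure_baseline, False),
        (t_update, t_end, exposure_updated, event),
    ]
```

Each row contributes to the partial likelihood only while the participant is at risk under the exposure value recorded for that interval.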
All P-values were two-tailed and P <0.05 was considered significant. Statistical analysis was performed using Stata version 14.0 (StataCorp, College Station, TX, USA).