Design and study population
Data come from the Mexico City Diabetes Representative Study, a cross-sectional study conducted between May and June 2015. Participants were selected through cluster sampling, using basic geostatistical areas (AGEB, for its Spanish acronym) as the primary sampling unit. From each AGEB, systematic sampling was conducted to select six houses within six blocks. In each house, two adults aged 20–69 years were systematically selected. Trained personnel collected information through face-to-face interviews with validated questionnaires. The response rate for the original study was 69%. Information on 1,334 adults was collected. We excluded adults with invalid or incomplete information on lifestyle and sociodemographic variables. We also excluded individuals with extreme total energy intake (energy intake/basal metabolic rate ratio < 0.05 or energy intake/estimated energy intake ratio > 3 SD), as previously described [10]. Thus, the analyses were conducted with 1,142 individuals (413 men and 729 women).
Sociodemographic and lifestyle factors
Data were collected using survey measures from the National Health and Nutrition Survey in Mexico [11]. Demographic variables included age, sex, years of school attainment, socioeconomic status (SES), marital status, and employment status.
Socioeconomic status was constructed using principal components analysis (PCA). Entered into the PCA were household characteristics (flooring material, ceiling, walls, water source, sewerage, number of persons residing in the household, and domestic appliances). The main extracted factor was divided into tertiles and used as a proxy for low, middle, and high SES. The detailed methodology is described elsewhere [12].
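The construction of the SES proxy can be illustrated with a minimal sketch. It assumes numerically coded household indicators in which higher values mean better conditions; the three example columns, the function names, and the rank-based tertile split are hypothetical and are not taken from the study, whose detailed methodology is in [12]. The first principal component is obtained here by power iteration on the correlation matrix.

```python
import math

def standardize(rows):
    """Column-wise z-scores (population SD); constant columns become zeros."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] if sds[j] > 0 else 0.0 for j in range(p)]
            for r in rows]

def first_component(z, iters=500):
    """Loadings of the first principal component, found by power iteration."""
    n, p = len(z), len(z[0])
    corr = [[sum(r[a] * r[b] for r in z) / n for b in range(p)] for a in range(p)]
    v = [1.0] * p  # start vector; fixes the sign so that loadings sum to > 0
    for _ in range(iters):
        w = [sum(corr[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def ses_tertiles(rows):
    """Score each household on the first component and split ranks into tertiles."""
    z = standardize(rows)
    v = first_component(z)
    scores = [sum(a * b for a, b in zip(r, v)) for r in z]
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    labels = [None] * len(scores)
    for rank, i in enumerate(order):
        labels[i] = ("low", "middle", "high")[min(rank * 3 // len(scores), 2)]
    return scores, labels

# Hypothetical households: [improved floor (0/1), piped water (0/1), appliances owned]
households = [[0, 0, 1], [0, 1, 2], [1, 1, 3], [1, 1, 4], [1, 1, 6], [1, 1, 7]]
scores, labels = ses_tertiles(households)
print(labels)
```

A rank-based split assigns tied scores to tertiles arbitrarily; a survey-weighted analysis would also need to account for the expansion weights, which this sketch ignores.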
Marital status was defined considering four categories: single, living with a partner, divorced or separated, and widowed.
Employment status was defined based on the answers to two questions. First, participants were asked if they had worked for at least one hour in the last week. Those who answered "yes" were asked, "How many hours per week do you spend in your main job?". Working hours per week were categorized using the legal working hours per week in Mexico as the cut-off point (i.e., ≤ 48 hours or > 48 hours per week) [13]. The reference category for this variable included unemployed participants.
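The two-question rule above can be sketched as a small function. The function name, argument names, and category labels are illustrative assumptions; only the 48-hour cut-off and the two questions come from the text.

```python
def employment_status(worked_last_week, hours_per_week=None):
    """Classify employment from the two survey questions (labels are hypothetical)."""
    if not worked_last_week:
        return "unemployed"  # reference category
    if hours_per_week is None:
        raise ValueError("hours_per_week is required for participants who worked")
    # 48 hours per week is the legal cut-off cited in the text
    return "<=48 h/week" if hours_per_week <= 48 else ">48 h/week"

print(employment_status(False))     # unemployed
print(employment_status(True, 48))  # <=48 h/week
print(employment_status(True, 50))  # >48 h/week
```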
Lifestyle variables included smoking status, sleeping habits, alcohol consumption, sitting time, and physical activity levels. Smoking status was assessed considering the questions 1) "Have you smoked 100 cigarettes or more in your life?" and 2) "Are you currently smoking?". Current smokers were those who answered "yes" to both questions; ever smokers answered "yes" to question 1) and "no" to question 2), and never smokers answered "no" to both questions.
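As an illustration, the smoking classification can be written as a decision rule over the two answers. The function and argument names are hypothetical; the combination "no" to question 1 and "yes" to question 2 is not covered by the definitions above, so the sketch leaves it unclassified rather than guessing.

```python
def smoking_status(smoked_100_or_more, currently_smoking):
    """Map the two yes/no answers to the smoking categories defined in the text."""
    if smoked_100_or_more and currently_smoking:
        return "current"
    if smoked_100_or_more and not currently_smoking:
        return "ever"
    if not smoked_100_or_more and not currently_smoking:
        return "never"
    return None  # "no" to question 1, "yes" to question 2: not defined in the text

print(smoking_status(True, True))    # current
print(smoking_status(True, False))   # ever
print(smoking_status(False, False))  # never
```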
Sleeping habits were defined using the question "On average, how many hours do you sleep per day?" with five possible answers: 5 or less, 6, 7, 8, and 9 or more hours. Healthy sleeping habits were defined as sleeping 6–8 hours per day, and unhealthy sleeping habits as sleeping 5 or less or 9 or more hours per day.
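A minimal sketch of this recoding, assuming the five answers are stored as the strings shown below (the storage format and names are hypothetical):

```python
# Mapping from the five answer options to the two sleep categories in the text
SLEEP_CATEGORY = {
    "5 or less": "unhealthy",
    "6": "healthy",
    "7": "healthy",
    "8": "healthy",
    "9 or more": "unhealthy",
}

def sleep_habit(answer):
    """Return 'healthy' or 'unhealthy'; raises KeyError for an unknown answer."""
    return SLEEP_CATEGORY[answer]

print(sleep_habit("7"))          # healthy
print(sleep_habit("9 or more"))  # unhealthy
```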
Alcohol consumption was classified into three categories: never consuming alcohol, consuming 5 drinks per week or less, or more than 5 drinks per week.
Physical activity was measured using the short version of the International Physical Activity Questionnaire (IPAQ). The IPAQ consists of 7 questions that measure physical activity of moderate to vigorous intensity, as well as walking and sitting time. This instrument has good reliability in assessing moderate to vigorous physical activity among Mexican adults [14]. Data were processed following the IPAQ analysis guidelines [15], and metabolic equivalent (MET)-minutes per week for total physical activity were calculated. Participants were classified as having low (< 600 MET-minutes/week or no activity reported), moderate (600 to < 1500 MET-minutes/week), or high (≥ 1500 MET-minutes/week) levels of total physical activity.
The IPAQ also measures sitting time as an indicator of sedentary activity over the past 7 days. Participants were asked to report the hours and minutes spent sitting on one day of the last seven days, and daily sitting hours were estimated from these answers.
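The MET-minutes calculation can be sketched as follows. The MET weights (3.3 for walking, 4.0 for moderate, 8.0 for vigorous activity) are those of the IPAQ scoring protocol and are not stated in this text; the function names and input layout are hypothetical, and the data cleaning and truncation rules of the guidelines [15] are omitted. A value of exactly 1500 is assigned to the high category, following the "≥ 1500" bound.

```python
# MET weights from the IPAQ scoring protocol (not stated in this text)
MET = {"walking": 3.3, "moderate": 4.0, "vigorous": 8.0}

def total_met_minutes(activity):
    """activity maps intensity -> (days per week, minutes per day)."""
    return sum(MET[kind] * days * minutes for kind, (days, minutes) in activity.items())

def activity_level(met_minutes):
    """Categories used in the text, based on total MET-minutes per week."""
    if met_minutes < 600:
        return "low"
    if met_minutes < 1500:
        return "moderate"
    return "high"

# Hypothetical participant: walks 30 min on 5 days, moderate activity 30 min on 2 days
week = {"walking": (5, 30), "moderate": (2, 30), "vigorous": (0, 0)}
total = total_met_minutes(week)  # 3.3*150 + 4.0*60 = 735 MET-minutes/week
print(total, activity_level(total))
```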
Dietary patterns
Diet information was collected with a validated semi-quantitative food frequency questionnaire (SFFQ) used in the National Health and Nutrition Survey in Mexico [11]. The SFFQ recorded the consumption of 140 foods over the seven days before the interview and was administered by trained personnel using standardized data collection and entry procedures [16]. Food consumption was recorded in frequency categories ranging from never to seven days per week and from one to six times per day. For each food, a commonly used portion size is specified on the SFFQ. We converted the reported consumption into grams of intake per day. Energy and nutrient values were estimated using a Food Composition Table compiled by the National Institute of Public Health [16].
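One plausible way to convert a frequency report into daily grams is sketched below. The formula (days per week divided by 7, times eating occasions per day, times portion weight) is an assumption about the conversion, which the text does not spell out, and the names and the example portion weight are hypothetical.

```python
def grams_per_day(days_per_week, times_per_day, portion_g):
    """Average daily intake in grams over a seven-day recall (assumed formula)."""
    if not 0 <= days_per_week <= 7:
        raise ValueError("days_per_week must be between 0 and 7")
    if days_per_week > 0 and not 1 <= times_per_day <= 6:
        raise ValueError("times_per_day must be between 1 and 6")
    return days_per_week / 7 * times_per_day * portion_g

# Hypothetical report: a 30 g portion eaten 3 times per day on all 7 days
print(grams_per_day(7, 3, 30))  # 90.0
```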
Foods and beverages from the SFFQ were categorized into 23 food groups to identify dietary patterns (Supplementary Table 1). Food items were assigned to a food group according to the similarity of their nutrient profiles (e.g., the proportion of lipids, proteins, carbohydrates, or dietary fiber) or the amount of added sugar (e.g., sweetened beverages). We classified tortillas as a separate group because of their culinary use in Mexican culture. Each food group's contribution to total energy intake was calculated as a percentage and standardized (z-score). A cluster analysis using the k-means method [17, 18] was performed on these standardized contributions to derive dietary patterns. Solutions with two to five clusters were tested. The solution that best discriminated across groups while maintaining enough cases in each group was selected. We also considered the Calinski-Harabasz criterion to determine the optimal number of clusters. To define the dietary patterns, we used the food groups whose percentage energy contribution differed significantly between clusters. Patterns were named based on the major food groups and nutrients that characterized each group relative to the others, considering a minimum z-score of 0.30.
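The clustering step can be sketched in plain Python. This is an illustration under stated assumptions, not the study's implementation: the data are hypothetical (two food groups instead of 23), the farthest-point initialization is a choice made here to keep the example deterministic, and the Calinski-Harabasz index is used alone, whereas the study also weighed discrimination and cluster size.

```python
import math

def zscore_columns(rows):
    """Standardize each column to mean 0 and SD 1 (population SD)."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / n) for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, sds)] for r in rows]

def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=100):
    """Lloyd's k-means with a deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # assignments stable: converged
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

def calinski_harabasz(points, labels, centers):
    """Between-cluster over within-cluster dispersion, each per degree of freedom."""
    n, k = len(points), len(centers)
    grand = [sum(col) / n for col in zip(*points)]
    between = sum(labels.count(j) * dist2(centers[j], grand) for j in range(k))
    within = sum(dist2(p, centers[l]) for p, l in zip(points, labels))
    return (between / (k - 1)) / (within / (n - k))

# Hypothetical % energy from two food groups for nine participants
energy_pct = [[30, 5], [32, 6], [31, 4],
              [10, 25], [12, 24], [11, 26],
              [10, 5], [12, 6], [11, 4]]
z = zscore_columns(energy_pct)
solutions, ch = {}, {}
for k in range(2, 6):  # two to five clusters, as in the text
    solutions[k], centers = kmeans(z, k)
    ch[k] = calinski_harabasz(z, solutions[k], centers)
best_k = max(ch, key=ch.get)
print(best_k, solutions[best_k])
```

In this toy example the three-cluster solution has the highest index; with real survey data the choice would also depend on interpretability and on having enough cases per cluster.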
Statistical analysis
The resulting dietary patterns were described according to the mean energy contribution (%). Descriptive statistics were obtained for continuous and categorical sociodemographic and lifestyle variables, in the total sample and by sex. The adjusted relative risk ratio (RRR) of belonging to a dietary pattern was assessed through a multinomial logistic model stratified by sex. We considered the dietary patterns as the outcome variable, using the healthier pattern as the reference category, and sociodemographic (age, sex, years of school attainment, socioeconomic status, marital status, and employment status) and lifestyle factors (smoking status, sleeping habits, alcohol consumption, sitting time, and physical activity level) as independent variables. The significance level was set at α = 0.05. All analyses were performed using expansion weights and adjusted for the survey design with the SVY module of Stata 14®.
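For reference, the multinomial logistic model described above can be written as follows. The notation is illustrative and not taken from the study: pattern 0 denotes the reference (healthier) pattern, k indexes the remaining patterns, and x_i is the vector of sociodemographic and lifestyle covariates for participant i. The RRR for covariate j is the exponentiated coefficient.

```latex
\ln \frac{P(Y_i = k \mid \mathbf{x}_i)}{P(Y_i = 0 \mid \mathbf{x}_i)}
  = \beta_{k0} + \mathbf{x}_i^{\top} \boldsymbol{\beta}_k ,
\qquad
\mathrm{RRR}_{kj} = \exp(\beta_{kj})
```

An RRR above 1 indicates that a one-unit increase in covariate j is associated with a higher relative probability of belonging to pattern k rather than to the reference pattern, holding the other covariates constant.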