Study population
In this nested case–control study, we explored an epigenome-wide association of DNA methylation in people with incident T2DM who were identified from the Rural Chinese Cohort Study, which had 6 years of follow-up. Details of the design, participants, methods, and measurements of the cohort study have been described previously [16]. Incident T2DM cases were participants who did not have T2DM at baseline but developed it during follow-up. Controls were participants who did not have T2DM at either baseline or follow-up. After excluding participants with baseline T2DM (n = 1,499), T1DM (n = 13), cancer (n = 28), stroke (n = 372), myocardial infarction (n = 183), chronic obstructive pulmonary disease (n = 353), or chronic kidney disease (n = 294), 14,523 participants who were non-diabetic at baseline formed the target population for the current analysis. During follow-up, 707 new cases of T2DM were ascertained. From the 707 new-onset cases and 13,816 non-diabetic participants, 293 case–control pairs were selected by using the case–control matching method for DNA methylation and SNP measurement [17]. Controls were matched to cases on a 1:1 basis by age (same year of birth), sex, ethnicity, marital status, and residence (living in the same village). Finally, 293 pairs of cases and controls with complete information and blood samples were included in this nested case–control study (Figure S1).
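The 1:1 exact matching described above can be illustrated with a minimal sketch. This is not the matching procedure of reference [17]; the field names (`birth_year`, `village`, and so on), the random choice among eligible controls, and the function name `match_one_to_one` are all assumptions made for illustration.

```python
import random
from collections import defaultdict

# Hypothetical field names for the five matching factors.
MATCH_KEYS = ("birth_year", "sex", "ethnicity", "marital_status", "village")


def match_one_to_one(cases, controls, seed=0):
    """Pair each case with one unused control sharing all matching factors.

    Cases without an eligible control are left unmatched.
    """
    rng = random.Random(seed)

    # Index the controls by their combination of matching factors.
    pool = defaultdict(list)
    for control in controls:
        pool[tuple(control[k] for k in MATCH_KEYS)].append(control)
    for group in pool.values():
        rng.shuffle(group)  # random order, so the pick below is random

    pairs = []
    for case in cases:
        candidates = pool.get(tuple(case[k] for k in MATCH_KEYS))
        if candidates:
            # pop() removes the control so it cannot be reused.
            pairs.append((case, candidates.pop()))
    return pairs
```

Each control is used at most once (matching without replacement), which is what a 1:1 design requires.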
Data Collection And Relevant Definitions And Diagnosis Criteria
We used a standard questionnaire to collect information on demographics (age, sex, ethnicity, residence, and marital status), lifestyle behaviors (smoking, alcohol drinking, and physical activity), and personal and family medical history at baseline. Smoking was defined as having smoked at least 100 cigarettes during the lifetime [18]. Alcohol drinking was defined as consuming alcohol at least 12 times during the last year [18]. Physical activity level was classified as low, moderate, or high according to the International Physical Activity Questionnaire [19]. The measurement of body weight, height, waist circumference (WC), body mass index (BMI), systolic blood pressure (SBP), diastolic blood pressure (DBP), fasting plasma glucose (FPG), total cholesterol (TC), triglycerides (TG), and high-density lipoprotein cholesterol (HDL-C) has been described previously [20]. BMI was calculated as weight (kg)/height (m)². Low-density lipoprotein cholesterol (LDL-C) was calculated by the Friedewald formula [21].
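The two derived measures can be written out as a short sketch. The text does not state the units of the lipid measurements; the version below assumes mmol/L, for which the Friedewald estimate is TC − HDL-C − TG/2.2 (the divisor is 5 when values are in mg/dL). The function names are illustrative.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def friedewald_ldl_mmol(tc: float, hdl: float, tg: float) -> float:
    """Friedewald LDL-C estimate; all values assumed to be in mmol/L."""
    # The estimate is generally considered unreliable at high triglycerides
    # (about 4.5 mmol/L, i.e. 400 mg/dL, and above).
    if tg >= 4.5:
        raise ValueError("Friedewald estimate is unreliable when TG >= 4.5 mmol/L")
    return tc - hdl - tg / 2.2


if __name__ == "__main__":
    print(round(bmi(70.0, 1.75), 2))                      # 22.86
    print(round(friedewald_ldl_mmol(5.0, 1.2, 1.1), 2))   # 3.3
```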
T2DM was defined as FPG ≥7.0 mmol/L and/or the use of insulin or oral hypoglycemic agents and/or a self-reported history of diabetes, consistent with the diagnostic criteria for T2DM applied at both the baseline and follow-up examinations [22].
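The case definition is a logical OR over three criteria, and incident status combines the baseline and follow-up classifications. A minimal sketch, with hypothetical argument names:

```python
def has_t2dm(fpg_mmol_l: float,
             on_glucose_lowering_drugs: bool,
             self_reported_diabetes: bool) -> bool:
    """True if any one of the three diagnostic criteria is met."""
    return (fpg_mmol_l >= 7.0
            or on_glucose_lowering_drugs
            or self_reported_diabetes)


def is_incident_case(baseline: dict, follow_up: dict) -> bool:
    """Incident T2DM: free of T2DM at baseline, T2DM at follow-up.

    Each dict holds the three arguments of has_t2dm for one examination.
    """
    return not has_t2dm(**baseline) and has_t2dm(**follow_up)
```

For example, a participant with FPG 5.4 mmol/L and no medication or history at baseline, and FPG 7.3 mmol/L at follow-up, is classified as an incident case.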
Quantitative DNA Methylation Measurement
DNA samples were extracted from baseline fasting peripheral-blood leucocytes by using an automated nucleic acid extraction system (BioTeke Corp., Beijing). The target sequence covered 428 bp (Chr16:53703509–53703936) located in the promoter region of FTO. The Sequenom® EpiDesigner system was used to design the primer sequences: forward 5’-aggaagagagTTTGTAGGATTTGGATAGAGATGGT-3’ and reverse 3’-cagtaatacgactcactatagggagaaggctAAATCCAAAAAAAACTACATTTCCC-5’. After genomic DNA was treated with bisulfite, PCR was used to amplify the target sequence, and shrimp alkaline phosphatase was then used to remove 5'- and 3'-phosphate groups from the products. The EpiTYPER workflow thus comprised bisulfite treatment of genomic DNA, PCR amplification of the target region, in vitro transcription, and uracil-specific cleavage. Finally, the mass spectra of the cleavage products were acquired by MALDI-TOF mass spectrometry on the MassARRAY System (Bio Miao Biological Technology, Beijing), and the level of DNA methylation was quantified by MassARRAY EpiTYPER analysis (Agena Bioscience, San Diego, CA). DNA methylation was measured in the case–control pairs.
Tag-SNP Selection And Genotyping
The Tag-SNPs were selected from an extensive review of the literature and from the HapMap and NCBI databases. The selection criterion was minor allele frequency (MAF) > 0.01. Eleven SNPs in FTO (rs72803657, rs9939609, rs1121980, rs17817449, rs8050136, rs9940128, rs9926289, rs11076023, rs1558902, rs1421085, and rs9941349) were selected for our study. All SNPs met Hardy–Weinberg equilibrium (HWE). Finally, four Tag-SNPs were selected on the basis of linkage disequilibrium (LD) analysis (r² ≥ 0.8). Specific primers were designed by using Assay Design 3.1 software (Sequenom Inc). PCR and genotyping were performed with a MassARRAY genotyping system (Agena Inc). For quality control, we re-genotyped a random sample of approximately 4% of specimens. Genotype concordance for the blinded control samples was 100%.
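The HWE check for a biallelic SNP can be sketched as a one-degree-of-freedom chi-square goodness-of-fit test on genotype counts. This is a generic textbook version, not the software used in the study; the function and argument names are illustrative.

```python
import math


def hwe_chi_square(n_hom_major: int, n_het: int, n_hom_minor: int):
    """Chi-square test (1 df) of Hardy-Weinberg equilibrium.

    Returns (chi2, p_value) for the observed genotype counts.
    """
    n = n_hom_major + n_het + n_hom_minor
    if n == 0:
        raise ValueError("no genotypes supplied")

    # Allele frequencies estimated from the genotype counts.
    p = (2 * n_hom_major + n_het) / (2 * n)
    q = 1.0 - p
    if p in (0.0, 1.0):
        raise ValueError("monomorphic SNP: HWE test is undefined")

    # Genotype counts expected under HWE: p^2, 2pq, q^2.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_hom_major, n_het, n_hom_minor)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

Counts of 25, 50, and 25 match the HWE expectation exactly and give a chi-square statistic of 0; a complete absence of heterozygotes (50, 0, 50) gives a statistic of 100.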
Statistical analysis
Baseline characteristics were compared between T2DM cases and controls. Continuous variables with a skewed distribution are shown as median (interquartile range) and were compared by the Wilcoxon rank-sum test. Continuous variables with a normal distribution are shown as mean (SD) and were compared by t-test. Categorical variables are shown as number (%) and were compared by chi-square test. The chi-square test was also used to test for HWE among controls. The association between SNPs and T2DM was assessed by multiple logistic regression under a dominant genetic model for the Tag-SNPs. Haploview was used for LD analysis, and haplotype analysis was used to explore the association of possible haplotypes with T2DM. The methylation level was non-normally distributed and was compared by the Wilcoxon rank-sum test. Because important variables other than the matching factors (age, sex, marital status, and region of residence) were not matched between cases and controls [8], we used unconditional logistic regression models to evaluate the association between methylation level and risk of T2DM. We fitted three models: 1) unadjusted; 2) adjusted for smoking, alcohol drinking, physical activity, SBP, and FPG; and 3) adjusted for smoking, alcohol drinking, physical activity, SBP, BMI, and FPG, TG, and HDL-C levels. Spearman correlation analysis was used to explore the association of methylation level with T2DM-related quantitative phenotypes. The Wilcoxon rank-sum test was used to compare methylation levels across SNP alleles. Finally, we used Generalized Multifactor Dimensionality Reduction (GMDR) to explore potential interactions of Tag-SNPs, anthropometric and lipid indexes (BMI, WC, and TC, TG, HDL-C, and LDL-C levels), and environmental factors (smoking, alcohol drinking, and physical activity) with the CpG locus.
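The dominant genetic model can be illustrated with a simplified sketch. It covers only the unadjusted case: with a single binary predictor, the odds ratio from logistic regression equals the cross-product ratio of the 2×2 table, with a Wald confidence interval on the log scale. The adjusted models need a regression library and are not shown. The allele labels and function names are hypothetical.

```python
import math


def dominant_code(genotype: str, risk_allele: str) -> int:
    """Dominant coding: 1 if at least one copy of risk_allele, else 0."""
    return int(risk_allele in genotype)


def odds_ratio_wald(carrier_cases: int, noncarrier_cases: int,
                    carrier_controls: int, noncarrier_controls: int):
    """Unadjusted odds ratio with a 95% Wald confidence interval."""
    a, b, c, d = (carrier_cases, noncarrier_cases,
                  carrier_controls, noncarrier_controls)
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")

    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se)
    return odds_ratio, lower, upper
```

Under this coding, genotypes "AA" and "AT" are both coded 1 for risk allele "A", and "TT" is coded 0.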