Patients and Sample Selection
The study population included 1083 ESCC cases and 1786 controls. All cases were diagnosed between October 2010 and September 2013 in Taixing, one of the cities with the highest incidence of ESCC in East China. More than 90% of the esophageal cancer patients in this area are referred to the 4 largest hospitals (the People’s Hospital of Taixing, the Second People’s Hospital of Taixing, the Third People’s Hospital of Taixing, and the Hospital of Traditional Chinese Medicine of Taixing). Individuals diagnosed by the endoscopy units in these hospitals were invited to participate. This approach was designed to reduce nondifferential recall bias, given that these patients were unaware of their cancer diagnosis at the time of recruitment and data collection.(29) All personal and clinical data were collected by trained staff using a specifically designed electronic questionnaire. Cases and controls were matched by age and sex. The detailed study design, including quality control and inclusion and exclusion criteria, has been previously reported(10, 30-32). The patient selection process is shown in Figure 1.
SNP Genotyping, Screening and Quality Control
For this study, 101 SNPs from 59 genes were selected. All SNPs had previously been identified as ESCC susceptibility loci in genome-wide association studies. The SNPs were genotyped using a three-round multiplex polymerase chain reaction procedure followed by next-generation sequencing.
To ensure genotyping accuracy, we implemented quality control procedures, including the use of negative controls. In addition, a randomly selected 8% of the samples were genotyped twice, and the concordance was higher than 98%. The average sequencing depth was 1225×. All SNPs had a minor allele frequency of 0.1 or more in both the case and control samples, providing adequate statistical power. Among the 101 SNPs, 4 were monomorphic, 14 deviated from Hardy–Weinberg equilibrium, and 5 had a missing rate of >10%.
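The SNP-level screening criteria can be illustrated with a minimal Python sketch (the study itself used R). It assumes genotype counts per SNP are available and applies the thresholds stated in this section: minor allele frequency of at least 0.1, Hardy–Weinberg P > 0.05 from a 1-df Pearson chi-square test, and a missing rate of at most 10%. The function names and the choice of sample in which to run the test are illustrative assumptions, not part of the original pipeline.

```python
import math

def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Minor allele frequency from counts of AA, AB and BB genotypes."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    return min(p, 1.0 - p)

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson chi-square test (1 df) for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Skip cells with zero expectation (monomorphic SNPs).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

def passes_qc(n_aa, n_ab, n_bb, n_missing,
              maf_min=0.1, hwe_alpha=0.05, max_missing=0.10):
    """Apply the three screening thresholds described in the text."""
    called = n_aa + n_ab + n_bb
    if called == 0:
        return False
    missing_rate = n_missing / (called + n_missing)
    maf = minor_allele_frequency(n_aa, n_ab, n_bb)
    _, p_hwe = hwe_chi_square(n_aa, n_ab, n_bb)
    return maf >= maf_min and p_hwe > hwe_alpha and missing_rate <= max_missing
```

A monomorphic SNP has a minor allele frequency of 0 and is therefore rejected by the first criterion.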
Definition of Tobacco Consumption
We collected detailed data on tobacco consumption, including smoking status (never, ever, and current smoker), cumulative cigarette exposure in pack-years, and deep inhalation during smoking (Yes/No). Moreover, following the National Cancer Institute’s recommendation(33-36), cigarette exposure was recoded as a categorical variable with a cutoff of 30 pack-years: never smokers (0 pack-years), moderate smokers (≤30 pack-years), and heavy smokers (>30 pack-years).
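The categorization can be sketched in Python as follows. The sketch assumes the conventional definition of one pack-year (20 cigarettes per day for one year), which is not spelled out in the text; the function names are illustrative.

```python
def pack_years(cigarettes_per_day, years_smoked):
    """Cumulative exposure, assuming 1 pack = 20 cigarettes."""
    return cigarettes_per_day / 20.0 * years_smoked

def smoking_category(exposure):
    """Map pack-years to the three categories used in this study."""
    if exposure <= 0:
        return "never"      # 0 pack-years
    if exposure <= 30:
        return "moderate"   # more than 0 and at most 30 pack-years
    return "heavy"          # more than 30 pack-years
```

For example, a participant who smoked 20 cigarettes a day for 30 years has 30 pack-years and falls in the moderate category, because the cutoff is inclusive.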
Definition of Other Variables
In this case-control study, participants were interviewed face-to-face using structured questionnaires, and information on basic characteristics was collected, including age, sex, smoking, alcohol intake, education status, wealth score, marital status, and hot tea consumption. Alcohol intake was recorded in three categories (never, former, and current drinker). Wealth score was a continuous indicator calculated by multiple correspondence analysis from each participant's ownership of valuable household items, such as televisions, cars, washing machines, and vacuum cleaners. Hot tea consumption was defined by the interval between pouring boiling water over the tea leaves and drinking the tea. An interval of less than 5 minutes was classified as hot tea (Yes); a longer interval was classified as warm tea (No).
Statistical Analysis
Student's t-test was used to compare continuous variables between the case and control groups, the chi-square test to compare unordered categorical variables, and the Kruskal-Wallis rank-sum test to compare ordinal categorical variables. Odds ratios (ORs) with 95% confidence intervals (CIs) were calculated to quantify the susceptibility to ESCC associated with each SNP.
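For a single binary exposure, an unadjusted OR and its 95% CI can be obtained directly from a 2×2 table. The Python sketch below uses the Wald (log-scale) interval. This is an illustration under that assumption only; the ORs in this study were estimated from logistic regression models, and the function name and cell layout are hypothetical.

```python
import math

def odds_ratio_wald(a, b, c, d, z=1.96):
    """Unadjusted OR with a Wald 95% CI from a 2x2 table.

    a: exposed cases      b: exposed controls
    c: unexposed cases    d: unexposed controls
    All four cells must be non-zero.
    """
    or_ = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se_log_or)
    upper = math.exp(math.log(or_) + z * se_log_or)
    return or_, lower, upper
```

A CI that includes 1 indicates that the association is not statistically significant at the 0.05 level.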
We established two adjusted models to minimize the effects of potential confounding variables. One model was adjusted for age and sex, and the other was adjusted for smoking, alcohol intake, education status, wealth score (which was calculated based on the ownership of valuable home items using a multiple correspondence analysis), tea consumption and teeth brushing frequency. The dose–effect relationship between categorical variables and ESCC risk was evaluated with a chi-square test for trend. The chi-square test was also used to examine Hardy-Weinberg equilibrium (HWE), and P > 0.05 was considered to indicate equilibrium(37).
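A chi-square test for trend across ordered exposure categories can be sketched in Python as below. The sketch assumes the Cochran-Armitage form of the test with equally spaced scores (0, 1, 2, ...) and no continuity or small-sample correction; the text does not specify which variant or scoring was used, so these are assumptions.

```python
import math

def chi_square_trend(cases, totals, scores=None):
    """Cochran-Armitage test for trend (1 df).

    cases[i]  : number of cases in ordered category i
    totals[i] : number of cases plus controls in category i
    scores[i] : ordinal score of category i (defaults to 0, 1, 2, ...)
    """
    if scores is None:
        scores = list(range(len(cases)))
    n_total = sum(totals)
    p_bar = sum(cases) / n_total  # overall proportion of cases
    # Score-weighted deviation of observed from expected case counts.
    t = sum(s * (r - n * p_bar) for s, r, n in zip(scores, cases, totals))
    sum_ns = sum(n * s for s, n in zip(scores, totals))
    sum_ns2 = sum(n * s * s for s, n in zip(scores, totals))
    variance = p_bar * (1 - p_bar) * (sum_ns2 - sum_ns ** 2 / n_total)
    chi2 = t * t / variance
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square tail, 1 df
    return chi2, p_value
```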
Univariate logistic regression was used to assess the association between genotype distribution and ESCC risk under codominant, dominant, recessive and over-dominant models, with the ancestral allele as reference. The Akaike information criterion (AIC) was used to select the best-fitting model for each SNP(38). Inheritance model classification and calculation were conducted using the R package “SNPassoc” (https://cran.r-project.org, package = SNPassoc)(39).
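The model comparison can be illustrated with a short Python sketch. It relies on the fact that a logistic regression with a single categorical predictor and no covariates reproduces the observed case proportion in each group, so its maximized log-likelihood can be computed from genotype counts alone. The genotype is assumed to be coded as the number of non-reference alleles (0, 1, 2). The sketch does not call or reproduce SNPassoc; names are illustrative.

```python
import math

# How each inheritance model groups genotypes (0, 1 or 2 non-reference alleles).
MODELS = {
    "codominant": lambda g: g,              # three separate groups
    "dominant": lambda g: int(g >= 1),      # carriers vs reference homozygotes
    "recessive": lambda g: int(g == 2),     # variant homozygotes vs the rest
    "overdominant": lambda g: int(g == 1),  # heterozygotes vs both homozygotes
}

def _xlogy(x, y):
    """x * log(y), with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)

def aic_by_model(cases, controls):
    """AIC of each model from case/control counts indexed by genotype 0, 1, 2."""
    result = {}
    for name, code in MODELS.items():
        groups = {}
        for g in (0, 1, 2):
            r, n = groups.get(code(g), (0, 0))
            groups[code(g)] = (r + cases[g], n + cases[g] + controls[g])
        loglik = 0.0
        for r, n in groups.values():
            if n > 0:
                loglik += _xlogy(r, r / n) + _xlogy(n - r, (n - r) / n)
        k = len(groups)  # intercept plus one coefficient per extra group
        result[name] = 2 * k - 2 * loglik
    return result
```

The model with the lowest AIC is selected, for example with `min(aic, key=aic.get)` on the returned dictionary.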
Multivariable logistic regression models were used to assess the statistical significance of SNPs and smoking as indicators of ESCC risk. We included the SNP–smoking product term as a measure of interaction on a multiplicative scale, and adjusted for age, sex, alcohol consumption, wealth score, tea consumption and teeth brushing frequency. The significance of the interaction term was tested with a likelihood ratio test comparing the model with the product term against the model without it. The models were as follows:
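A plausible form of the two nested models, consistent with the description above, is given here. The coefficient symbols, the covariate vector Z (age, sex, alcohol consumption, wealth score, tea consumption and teeth brushing frequency) and the log-likelihood symbols are illustrative notation and are not taken from the original formulas.

```latex
\begin{align}
\text{Model 1:}\quad \operatorname{logit} P(\text{ESCC}) &= \beta_0 + \beta_1\,\text{SNP} + \beta_2\,\text{Smoking} + \boldsymbol{\gamma}^{\top}\mathbf{Z} \\
\text{Model 2:}\quad \operatorname{logit} P(\text{ESCC}) &= \beta_0 + \beta_1\,\text{SNP} + \beta_2\,\text{Smoking} + \beta_3\,(\text{SNP}\times\text{Smoking}) + \boldsymbol{\gamma}^{\top}\mathbf{Z} \\
\Lambda &= -2\,(\ell_1 - \ell_2) \sim \chi^2_{q}
\end{align}
```

Here \ell_1 and \ell_2 are the maximized log-likelihoods of Model 1 and Model 2, and q is the number of product-term coefficients added in Model 2 (q = 1 when SNP and smoking each enter as a single term).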
Moreover, as recommended by Andersson et al. and Knol et al.(40, 41), the relative excess risk of interaction (RERI) and the synergy index (S) were also used to evaluate interaction on an additive scale. RERI was calculated with the R package epiR (https://cran.r-project.org, package = epiR).
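Both additive-scale measures are simple functions of three ORs that share a common reference group (neither risk genotype nor smoking). The Python sketch below computes point estimates only; confidence intervals, which epiR also reports, are omitted, and the function and argument names are illustrative.

```python
def additive_interaction(or_10, or_01, or_11):
    """RERI and synergy index S from three odds ratios.

    or_10: genetic factor only     or_01: smoking only
    or_11: both factors present    (reference: neither factor, OR = 1)
    """
    reri = or_11 - or_10 - or_01 + 1
    denominator = (or_10 - 1) + (or_01 - 1)
    # S is undefined when the separate excess risks sum to zero.
    s = (or_11 - 1) / denominator if denominator != 0 else float("nan")
    return reri, s
```

RERI = 0 and S = 1 indicate no interaction on the additive scale; RERI > 0 and S > 1 indicate a joint effect greater than the sum of the separate effects.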
For each individual, a genetic risk score (GRS)(42, 43) was constructed as the sum of effect alleles:

\[ \mathrm{GRS}_{i,j} = \sum_{k=1}^{n_j} G_{i,k} \]

where $\mathrm{GRS}_{i,j}$ is the GRS value for the $i$th individual, $n_j$ is the number of SNPs associated with ESCC risk, and $G_{i,k}$ is the effect-allele dosage at the $k$th SNP for the $i$th individual.
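An unweighted GRS of this kind can be computed as in the following Python sketch. It assumes each genotype is already coded as the count of effect alleles (0, 1 or 2); the dictionary layout and the SNP identifiers in the usage example are hypothetical.

```python
def genetic_risk_score(dosages, risk_snps=None):
    """Unweighted GRS: sum of effect-allele counts over the risk SNPs.

    dosages   : dict mapping SNP id -> effect-allele count (0, 1 or 2)
    risk_snps : SNP ids to include; defaults to every SNP in `dosages`
    """
    snps = list(dosages) if risk_snps is None else list(risk_snps)
    score = 0
    for snp in snps:
        g = dosages[snp]
        if g not in (0, 1, 2):
            raise ValueError(f"invalid dosage {g!r} for {snp}")
        score += g
    return score

# Hypothetical individual genotyped at four SNPs.
individual = {"snp1": 0, "snp2": 1, "snp3": 2, "snp4": 1}
score_all = genetic_risk_score(individual)
score_subset = genetic_risk_score(individual, ["snp2", "snp3"])
```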
All statistical analyses were performed in R (version 3.6.2; https://cran.r-project.org/). P < 0.05 was considered statistically significant in all tests.
The reporting of this study conforms to STROBE guidelines.(44)