The rs28757157 and rs59429575 Polymorphisms in CYP19A1 are Associated with Lung Cancer in Chinese Han Population

Background: Lung cancer is the leading cause of cancer death globally. Recent studies have revealed that the CYP19A1 gene plays a crucial role in cancer initiation and development. Objectives: The aim of this study was to assess the association of CYP19A1 genetic polymorphisms with the risk of lung cancer in the Chinese Han population. Method: This study randomly recruited 507 lung cancer patients and 505 healthy controls. The genotypes of four SNPs in the CYP19A1 gene were identified using the Agena MassARRAY technique. Genetic model analysis was used to assess the association between genetic variation and lung cancer risk. Odds ratios (ORs) and 95% confidence intervals (CIs) were used to evaluate the effect on lung cancer risk. Results: The rs28757157 and rs59429575 polymorphisms of CYP19A1 were significantly associated with the risk of lung cancer. In stratified analysis, rs28757157 was associated with increased cancer risk in males and smokers. Meanwhile, rs59429575 was identified as a risk biomarker in females and lung adenocarcinoma patients (p < 0.05). In contrast, rs28757157 showed a protective effect among people with a BMI greater than 24 (p = 0.033). Conclusions: This study identified two new SNPs (rs28757157 and rs59429575) of CYP19A1 associated with lung cancer in the Chinese Han population. These findings provide supporting data for further functional studies of CYP19A1 in lung cancer.


Introduction
Lung cancer is a malignant tumor with high morbidity and mortality 1. In China, it has the highest mortality rate of any cancer, accounting for about 25% of cancer-related deaths 2. Many risk factors have been found to increase the risk of lung cancer. Among them, smoking appears to be the most strongly associated with lung cancer risk 3. However, recent research shows that about 10-15% of people diagnosed with lung cancer are non-smokers, indicating the importance of other risk factors such as air pollution, environmental factors and genetic factors.
According to the latest statistics, about 8% of lung cancer cases are estimated to be caused by genetics alone 4,5. The 2.4-fold higher risk of lung cancer among immediate family members of lung cancer patients further supports the role of genetic factors in disease risk 6. In addition, among patients with non-smoking related lung cancer, women have a higher risk than men, especially of adenocarcinoma. The incidence of lung cancer in non-smokers is 15% in women and only 10% in men. These findings have drawn attention to the effects of estrogen on lung cancer risk 7. It has been reported that both estrogen receptor and aromatase are present in human lung tumors 8-10. These results suggest that estrogen may play a role in the biological behavior of human lung cancer.
Cytochrome P450 (CYP450) enzymes are pivotal for biological homeostasis. CYP450 enzymes also play a key role in the metabolism of many endogenous substrates and exogenous carcinogens, including aromatic and heterocyclic amines. The resulting metabolites can covalently bind to DNA to form DNA adducts, which in turn may cause cancer 11,12. The CYP450 family 19, subfamily A, polypeptide 1 (CYP19A1) gene encodes aromatase, which is a member of the CYP450 superfamily of enzymes and a key enzyme in oestradiol biosynthesis. Mutations in the CYP19A1 gene can result in either increased or decreased aromatase activity, and aromatase plays an important role in lung cancer 13,14. This suggests that CYP19A1 genetic variation may indirectly affect the progression of lung cancer, but the exact mechanism is unclear. In addition, many studies have reported a close relationship between the CYP19A1 gene and lung-related diseases, including lung cancer 15-17.
Based on previous research, the present study explored the relationship between four single nucleotide polymorphisms (SNPs) of the CYP19A1 gene and lung cancer susceptibility through a case-control study, with particular attention to the sex and smoking status of the participants.

Materials And Methods

Participants
From May 2015 to February 2018, we recruited 507 pathologically confirmed lung cancer patients from the Shaanxi Provincial Cancer Hospital. The control group was composed of 505 healthy subjects who were volunteer blood donors from the physical examination center at the same hospital as the cases.
Eligible study participants were screened using a specialized questionnaire covering demographic characteristics, disease history, lung status, and family history of other types of tumors.
All participants were of Chinese Han ancestry from northwest China.

SNP genotyping
Genomic DNA was extracted from collected peripheral blood samples using a DNA purification

Statistical analysis and Bioinformatics analysis
SPSS software (SPSS 22.0, USA) and Microsoft Excel were used for statistical analysis. The differences in sex and age distribution between the case group and the control group were assessed by the χ2 test and the independent-samples t-test, respectively. The χ2 test was used to determine whether individual polymorphisms were in Hardy-Weinberg equilibrium (HWE). In addition, the χ2 test was used to detect differences in allele and genotype frequencies between cases and controls. The PLINK software (http://www.cog-genomics.org/plink2/) was used to assess the relationship between the polymorphisms and the risk of lung cancer in the Chinese Han population under different genetic models (genotype, dominant, recessive, and additive models).
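The genetic models named above differ only in how a genotype is turned into a predictor. The following is a minimal sketch of that coding for one biallelic SNP, not the PLINK implementation used in the study. The allele labels "A" (major) and "a" (minor), and the function names, are hypothetical and chosen only for illustration.

```python
# Illustrative genotype coding under four genetic models.
# Assumption: "A" is the major allele and "a" the minor allele (hypothetical labels).

def minor_allele_count(genotype: str, minor: str = "a") -> int:
    """Count copies of the minor allele in a two-character genotype such as 'Aa'."""
    if len(genotype) != 2:
        raise ValueError("genotype must contain exactly two alleles")
    return sum(1 for allele in genotype if allele == minor)


def encode_genotype(genotype: str, model: str, minor: str = "a"):
    """Return the predictor value(s) for a genotype under the given model."""
    n = minor_allele_count(genotype, minor)
    if model == "additive":
        # 0, 1 or 2 copies of the minor allele.
        return n
    if model == "dominant":
        # Carriers of at least one minor allele versus major homozygotes.
        return int(n >= 1)
    if model == "recessive":
        # Minor homozygotes versus all other genotypes.
        return int(n == 2)
    if model == "genotype":
        # Two indicators (heterozygote, minor homozygote), each compared
        # against the major homozygote as the reference group.
        return (int(n == 1), int(n == 2))
    raise ValueError(f"unknown model: {model}")


if __name__ == "__main__":
    for g in ("AA", "Aa", "aa"):
        print(g, {m: encode_genotype(g, m)
                  for m in ("genotype", "dominant", "recessive", "additive")})
```

Under this coding, a heterozygote counts as exposed in the dominant model but not in the recessive model, which is why a SNP can reach significance under one model and not another.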
Logistic regression analysis was used to calculate odds ratios (ORs) and 95% confidence intervals (CIs) to evaluate the effect on lung cancer risk 19-21. A p value < 0.05 was considered statistically significant in all tests. The potential function of the candidate SNPs was annotated using the HaploReg v4.1 database (https://pubs.broadinstitute.org/mammals/haploreg/haploreg.php).
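To make the OR and 95% CI concrete, the sketch below computes an unadjusted odds ratio with a Wald confidence interval from a 2x2 table. This is a simplification: the study reports ORs from logistic regression, which may include covariates, so the numbers need not match. The function name and the counts in the usage example are hypothetical and are not taken from the study's tables.

```python
import math


def odds_ratio_wald_ci(exposed_cases, unexposed_cases,
                       exposed_controls, unexposed_controls, z=1.959964):
    """Unadjusted odds ratio with a Wald confidence interval.

    z = 1.959964 corresponds to a two-sided 95% interval.
    All four cell counts must be positive.
    """
    cells = (exposed_cases, unexposed_cases, exposed_controls, unexposed_controls)
    if any(c <= 0 for c in cells):
        raise ValueError("all cell counts must be positive")
    odds_ratio = (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(sum(1.0 / c for c in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    or_, lo, hi = odds_ratio_wald_ci(30, 70, 20, 80)
    print(f"OR = {or_:.2f}, 95% CI: {lo:.2f}-{hi:.2f}")
```

An interval whose lower bound sits at or just above 1 corresponds to a p value close to 0.05, so such results are best read as borderline.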

Study population
This study included a random sample of 507 lung cancer patients (353 males and 154 females) and 505 healthy controls (354 males and 151 females), with mean ages of 60.75 ± 9.98 and 60.40 ± 7.39 years, respectively (Table 1). There was no significant difference in sex and age distribution between the case group and the control group (p > 0.05). In addition, the characteristics of the study population were collected for subsequent analyses, including body mass index (BMI), smoking and drinking history, pathological type, pathological stage, and lymph node metastasis (LNM).

Basic information of the selected SNPs
Four SNPs in CYP19A1 were genotyped among the subjects. The basic information of all candidate SNPs is listed in Table 2. All SNPs are located on chromosome 15, at different positions in the CYP19A1 gene. Deviation from Hardy-Weinberg equilibrium was evaluated in the control group; all candidate SNPs were in equilibrium (p > 0.05) and were therefore retained for further analysis. In addition, under the allele model, there was no significant difference in the allele distribution of any SNP between the lung cancer cases and the control group (p > 0.05). Functional prediction of the SNPs was conducted using the HaploReg v4.1 database to explore their regulatory effects. The results showed that the four SNPs exhibited potential biological functions in gene regulation.
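The Hardy-Weinberg check described above can be sketched as a χ2 goodness-of-fit test with one degree of freedom. This is a minimal illustration under that assumption, not the exact procedure run in SPSS or PLINK. The function name and the genotype counts in the usage example are hypothetical.

```python
import math


def hwe_chi_square(n_major_hom, n_het, n_minor_hom):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = n_major_hom + n_het + n_minor_hom
    if n <= 0:
        raise ValueError("at least one genotype must be observed")
    # Major allele frequency estimated from the observed genotype counts.
    p = (2 * n_major_hom + n_het) / (2 * n)
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("SNP is monomorphic; the test is undefined")
    observed = (n_major_hom, n_het, n_minor_hom)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical control-group genotype counts, for illustration only.
    chi2, p_value = hwe_chi_square(300, 170, 35)
    print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

A p value above 0.05 in controls, as reported for all four SNPs, gives no evidence of departure from equilibrium and is commonly used as a genotyping quality check.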

Genetic model analyses of the selected SNPs
The results of the four genetic model analyses of the relationship between the rs28757157 and rs59429575 polymorphisms and the risk of lung cancer are listed in Table 3. Our results revealed an association between rs28757157 and increased risk of lung cancer in the genotype model (OR = 1.33, 95% CI: 1.03-1.73, p = 0.032).
Rs59429575 was also associated with an increased risk of lung cancer in the recessive model (OR = 2.03, 95% CI: 1.00-4.11, p = 0.049). In addition, we conducted a stratified analysis to explore the risk effects of these SNPs in specific subgroups. Stratified analyses of the rs28757157 polymorphism by clinical characteristics are shown in Table 4. The results indicated that the rs28757157 heterozygote genotype (CT) was associated with increased susceptibility to lung cancer in people under 60 years of age (OR = 1.6, 95% CI: 1.09-
Stratified analyses of the rs59429575 polymorphism by clinical characteristics are summarized in Table 5. Stratified analysis by sex demonstrated a remarkable relationship between enhanced lung cancer risk and the GG-GC genotype of rs59429575 (OR = 5.18, 95% CI: 1.12-24.05, p = 0.036). In addition, in patients with lung adenocarcinoma, rs59429575 was identified as a risk factor for lung cancer development (homozygote: OR = 2.37, 95% CI: 1.01-5.56, p = 0.047; recessive model: OR = 2.55, 95% CI: 1.09-5.95, p = 0.03).
polymorphism is significantly associated with the susceptibility to lung cancer, and is gender-specific.
Our research showed that CYP19A1-rs28757157 was associated with increased cancer risk in males and smokers, and CYP19A1-rs59429575 was identified as a risk biomarker in females and lung adenocarcinoma patients. These two SNPs are located in the intron region of the CYP19A1 gene.
Based on previous studies and database predictions, we speculate that intronic single nucleotide polymorphisms in CYP19A1 may alter mRNA splicing, leading to changes in the activity of CYP19A1 and in related estrogen levels, and may thereby affect disease susceptibility. Because the statistical significance of the association between CYP19A1 gene polymorphisms and the risk of lung cancer is relatively weak, further experimental studies are needed to verify the results of this study.
Our study has several limitations. All subjects were enrolled from the same hospital, so sample selection may limit the accuracy and generalizability of the results. Additional studies that encompass more geographical regions, additional ethnic groups, and larger sample sizes should be performed. To verify the results of this study, the relationship between the CYP19A1 gene and lung cancer needs to be clarified through subsequent functional studies.
In summary, our study identified two new SNPs of CYP19A1 that were significantly associated with lung cancer susceptibility. These variants may be considered as markers in lung cancer risk assessment for the Chinese Han population.