Association of CYP19A1 Polymorphism with Genetic Susceptibility to Lung Cancer in a Chinese population: a case-control study

Abstract
Background: Lung cancer has high morbidity and mortality, and genetic factors contribute to its development. Many studies have shown that CYP19A1 gene polymorphisms are associated with a variety of cancers, but few have examined lung cancer. This study aimed to explore the correlation between CYP19A1 polymorphisms and lung cancer risk in a Chinese population.
Methods: We enrolled 510 lung cancer patients as the case group and 504 healthy people as the control group. Five single nucleotide polymorphisms in the CYP19A1 gene were genotyped by MassARRAY, and association analysis was performed with the chi-square test and logistic regression models.
Results: rs4646 (OR=0.77, p=0.010), rs6493487 (OR=0.76, p=0.006) and rs17601876 (OR=0.69, p=1.15E-04) in the CYP19A1 gene were associated with decreased lung cancer risk, while rs1062033 (OR=1.49, p=0.029) was associated with increased risk. In gender-stratified analysis, female patients with the GG genotype of rs6493487 (OR=0.31, p=0.037) and male patients with the AA genotype of rs17601876 (OR=0.52, p=0.012) had lower lung cancer risk. In age-stratified analysis, among patients ≥58 years, decreased lung cancer risk was associated with genotypes of rs4646 (OR=0.66, p=0.021), rs6493487 (OR=0.65, p=0.021) and rs17601876 (OR=0.39, p=0.001), and increased risk was associated with the GG genotype of rs1062033 (OR=2.09, p=0.003). In pathologic type-stratified analysis, the AA genotype of rs17601876 (OR=0.41, p=0.048) was associated with decreased risk of small cell lung cancer, and the AA genotype of rs4646 (OR=0.41, p=0.027), the GA genotype of rs6493487 (OR=0.65, p=0.024) and the AA genotype of rs17601876 (OR=0.34, p=0.005) were associated with decreased risk of squamous cell carcinoma.
Conclusion: CYP19A1 polymorphisms are associated with lung cancer risk, especially in elderly patients and in patients with small cell lung cancer or squamous cell carcinoma.

Background
GLOBOCAN 2018 estimated 18.1 million new cancer cases and 9.6 million cancer deaths worldwide in 2018. Lung cancer (LC) was the most commonly diagnosed cancer (11.6% of total cases) and the leading cause of cancer death (18.4% of total cancer deaths) [1]. LC results from the interaction of environmental exposure and genetic factors, and most procarcinogens become carcinogens once they are metabolized in the body.
Cytochromes P450 (CYPs) are a superfamily of monooxygenases involved in the metabolism of endogenous and exogenous substances. CYP enzymes can covalently bind nucleic acids and proteins to cause genetic mutations, or mediate signal transduction pathways that induce tumorigenesis and tumor development [2][3][4]. The CYP19A1 gene is located on chromosome 15 at 15q21.2 and encodes aromatase, a CYP enzyme that converts testosterone to estradiol and androstenedione to estrone [5,6]. Most studies of the CYP19A1 gene concern hormone-related cancers, such as breast, prostate, and endometrial cancer, and some have found that these cancers are directly related to endogenous and exogenous steroid hormones that affect cell proliferation [7][8][9][10]. These results suggest that the CYP19A1 gene may be associated with tumor development. In addition, several studies have shown a relation between the CYP19A1 gene and LC. Aberrant activation of alternative CYP19 promoters may lead to upregulation of local aromatase expression in some cases of non-small cell lung cancer (NSCLC) [11]. Immunohistochemical staining showed that LC specimens were positive for aromatase, which was mainly distributed in epithelial cells and infiltrating macrophages, suggesting that estrogen may be released locally in the tumor microenvironment [12]. Ikeda K et al. found that rs3764221 in the CYP19A1 gene contributes to the development of multicentric adenocarcinomas in the peripheral lung by causing higher levels of CYP19A1 expression [13]. We therefore hypothesized that CYP19A1 polymorphisms might be associated with LC.
So far, existing studies of the CYP19A1 gene and LC have mainly focused on NSCLC. The present study instead aimed to reveal the association between the CYP19A1 gene and all pathologic types of LC by analyzing five single nucleotide polymorphisms (SNPs) in the gene, so as to provide a direction for further study of LC.

Study Population
A total of 510 LC patients and 504 healthy people were recruited for this case-control study. All patients came from the First Affiliated Hospital of Xi'an Jiaotong University, had histopathologically confirmed primary LC, and were clinically staged according to the latest edition of the TNM staging system for LC adopted by the International Union Against Cancer (UICC). Cases were recruited without restriction on age, gender, pathologic type or clinical stage, and none had a history of other cancer or had received radiotherapy or chemotherapy. Control subjects were recruited from healthy people who received an annual health checkup at the Medical Examination Center of the First Affiliated Hospital of Xi'an Jiaotong University and were confirmed to have no chronic or serious endocrine or metabolic diseases.

Genotyping
Genomic DNA was extracted from whole blood using the Whole Blood Genome DNA Purification Kit (Xi'an GOLDMAG Biological Company). DNA concentration and purity were determined with the Nanodrop Lite Ultraviolet Spectrophotometer (Thermo Technology Company). Primers for amplification and single base extension reactions were designed with Agena MassARRAY Assay Design 3.0 software according to the forward-strand sequence from the dbSNP database (http://www.ncbi.nlm.nih.gov/SNP/). Primer information for the five SNPs is shown in Supplementary Table 1. The five SNPs were genotyped on the MassARRAY iPLEX platform (Agena Bioscience, San Diego, CA, USA) using matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) mass spectrometry, and the results were output by Agena Bioscience TYPER version 4.0 software. Repeated control samples were included in every genotyping plate, and the concordance rate was >99%. All SNPs were successfully genotyped, with a call rate >99.4%.

Statistical Analysis
Hardy-Weinberg equilibrium (HWE) in the control group was assessed with the Fisher exact test, with p>0.05 taken as consistent with equilibrium. The chi-square (χ²) test was used to evaluate the association between each SNP and LC risk. Odds ratios (OR) and 95% confidence intervals (CI) for each genotype were calculated by logistic regression in the Plink software (http://www.cog-genomics.org/plink2/).
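The two basic calculations described above can be illustrated with a minimal Python sketch. It is not the Plink implementation used in the study: the function names `hwe_exact_p` and `allelic_association` are hypothetical, the confidence interval uses Woolf's log-OR approximation on a 2x2 allele table rather than logistic regression, and all counts are invented for illustration only.

```python
from math import erfc, exp, lgamma, log, sqrt


def hwe_exact_p(n_aa, n_ab, n_bb):
    """Exact Hardy-Weinberg p-value from genotype counts (AA, AB, BB)."""
    n = n_aa + n_ab + n_bb
    n_a = 2 * n_aa + n_ab  # copies of allele A
    n_b = 2 * n_bb + n_ab  # copies of allele B

    def log_prob(het):
        # Log-probability of `het` heterozygotes given the fixed allele counts.
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        return (lgamma(n + 1) - lgamma(hom_a + 1) - lgamma(het + 1)
                - lgamma(hom_b + 1) + het * log(2)
                + lgamma(n_a + 1) + lgamma(n_b + 1) - lgamma(2 * n + 1))

    observed = log_prob(n_ab)
    # Heterozygote counts must share the parity of the allele counts.
    candidates = range(n_a % 2, min(n_a, n_b) + 1, 2)
    # Sum the probabilities of all outcomes no more likely than the observed one.
    total = sum(exp(lp) for lp in map(log_prob, candidates)
                if lp <= observed + 1e-9)
    return min(1.0, total)


def allelic_association(case_minor, case_major, ctrl_minor, ctrl_major):
    """Chi-square (1 df), p-value, odds ratio and Woolf 95% CI for a 2x2 table."""
    a, b, c, d = case_minor, case_major, ctrl_minor, ctrl_major
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = erfc(sqrt(chi2 / 2))  # chi-square survival function for 1 df
    odds_ratio = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    ci = (exp(log(odds_ratio) - 1.96 * se), exp(log(odds_ratio) + 1.96 * se))
    return chi2, p_value, odds_ratio, ci


# Hypothetical counts, not taken from the study.
p_hwe = hwe_exact_p(25, 50, 25)
chi2, p, odds_ratio, ci = allelic_association(20, 80, 40, 60)
print(f"HWE p={p_hwe:.3f}; OR={odds_ratio:.3f}, 95% CI {ci[0]:.2f}-{ci[1]:.2f}, p={p:.4f}")
```

An OR below 1 with a confidence interval that excludes 1 corresponds to the protective associations reported in this study; an OR above 1 corresponds to increased risk.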

Comparison of baseline characteristics for lung cancer (LC) case and control subjects
General characteristics of the case and control groups are listed in Table 1. The mean age was 58.0±10.55 years in the case group and 57.27±10.85 years in the control group, with no statistically significant difference between them (p=0.227). In the case group, 75.3% were males and 24.7% were females; in the control group, 75.6% were males and 24.4% were females. The gender distribution did not differ significantly between the two groups (p=0.911). In the case group, the main pathologic types were small cell lung cancer (SCLC), squamous cell carcinoma (SQC) and adenocarcinoma (ADC), which together accounted for 95.1% of all cases. In addition, there were 14.4% more patients with lymph node metastasis than without, and 48.6% of the case group had advanced-stage disease (stage III or IV).

Association of the minor allele frequencies of CYP19A1 research SNPs with lung cancer (LC) risk
We analyzed five SNPs of CYP19A1 in this study. Table 3 shows the frequency distribution of the minor alleles in cases and controls. The minor allele frequencies of rs4646, rs6493487 and rs17601876 were lower in the case group than in the control group. Allele A of rs4646 (OR=0.77, p=0.010), allele G of rs6493487 (OR=0.76, p=0.006) and allele A of rs17601876 (OR=0.69, p=1.15E-04) were associated with lower LC risk.
Table 4 shows that the genotype distributions of rs4646 (p=0.035), rs6493487 (p=0.023), rs1062033 (p=0.009) and rs17601876 (p=0.001) differed significantly between cases and controls. The AA genotype of rs4646 (OR=0.58, p=0.026), the GG and GA genotypes of rs6493487 (GG, OR=0.56, p=0.023; GA, OR=0.77, p=0.047) and the AA and AG genotypes of rs17601876 (AA, OR=0.44, p=4.75E-04; AG, OR=0.71, p=0.009) were associated with decreased LC risk, while the GG genotype of rs1062033 (OR=1.49, p=0.029) was associated with increased LC risk. rs3751599 showed no association with LC risk. These associations remained after adjustment for age and sex.
Table 5 summarizes the association between the research SNPs and lung cancer under the dominant, recessive, and additive genetic models. As in the genotype model, rs17601876 was associated with reduced LC risk in all three models. However, rs4646 and rs6493487 were associated with lower LC risk only in the dominant and additive models, and rs1062033 was associated with higher LC risk only in the recessive model.
Table 6 shows that, in males, only rs17601876 was associated with LC risk (AA, OR=0.52, p=0.012; AG, OR=0.74, p=0.047). In females, the GG genotype of rs6493487 (OR=0.31, p=0.037) and the AA genotype of rs17601876 (OR=0.24, p=0.009) were associated with lower LC risk.
Table 7 shows that none of the research SNPs was significantly associated with LC in subjects <58 years. In subjects ≥58 years, however, the AC genotype of rs4646 (OR=0.66, p=0.021), the GA genotype of rs6493487 (OR=0.65, p=0.021) and the AA and AG genotypes of rs17601876 (AA, OR=0.39, p=0.001; AG, OR=0.57, p=0.002) were associated with decreased LC risk, while the GG genotype of rs1062033 (OR=2.09, p=0.003) was associated with increased LC risk.
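The dominant, recessive and additive genetic models mentioned above differ only in how the three genotypes are grouped or coded. The following Python sketch illustrates that coding under stated assumptions: the function names are hypothetical, the genotype counts are invented, and the crude OR with a Woolf-type interval stands in for the age- and sex-adjusted logistic regression that the study ran in Plink.

```python
from math import exp, log, sqrt


def collapse(counts, model):
    """Group genotype counts (major_hom, het, minor_hom) into (exposed, reference)."""
    major_hom, het, minor_hom = counts
    if model == "dominant":   # carriers of at least one minor allele vs. major homozygotes
        return het + minor_hom, major_hom
    if model == "recessive":  # minor homozygotes vs. all other genotypes
        return minor_hom, major_hom + het
    raise ValueError(f"unsupported model: {model}")


def model_odds_ratio(case_counts, control_counts, model):
    """Crude odds ratio and 95% CI (log-OR approximation) under the given model."""
    a, b = collapse(case_counts, model)
    c, d = collapse(control_counts, model)
    odds_ratio = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return odds_ratio, exp(log(odds_ratio) - 1.96 * se), exp(log(odds_ratio) + 1.96 * se)


def additive_dosage(genotype, minor_allele):
    """Additive coding: number of minor alleles (0, 1 or 2) in a genotype such as 'AG'."""
    return genotype.count(minor_allele)


# Hypothetical genotype counts (major homozygote, heterozygote, minor homozygote).
cases = (300, 170, 40)
controls = (250, 190, 60)
for model in ("dominant", "recessive"):
    odds_ratio, low, high = model_odds_ratio(cases, controls, model)
    print(f"{model}: OR={odds_ratio:.2f} (95% CI {low:.2f}-{high:.2f})")
print([additive_dosage(g, "A") for g in ("GG", "AG", "AA")])
```

In the additive model the 0/1/2 dosage enters the regression as a single numeric covariate, so the resulting OR is interpreted per copy of the minor allele.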

Discussion
The CYP19A1 gene encodes aromatase, the rate-limiting enzyme in the conversion of androstenedione to estrone and of testosterone to estradiol [14,15]. Aromatase is expressed both in the gonads and in extragonadal tissues, including lung, brain, and liver. Aromatase activity is higher in LC tissues than in normal lung tissues [16], and aromatase expression in NSCLC is associated with estrogen production [17]. CYP19A1 polymorphisms can raise the local level of estrogen in peripheral lung tissue [18]. Estrogen directly causes cell proliferation and DNA damage in lung tissue [19] and regulates the expression of growth factors such as vascular endothelial growth factor (VEGF), which promotes microangiogenesis in LC [20], thereby contributing to the initiation and development of LC.
In the present study, we found that genotypes of rs4646, rs6493487 and rs17601876 were associated with lower LC risk, while rs1062033 may increase LC risk. Olivo-Marston SE et al. found that rs4646 was associated with lower serum estrogen levels among LC patients [16]. We therefore infer that the four SNPs in the CYP19A1 gene may influence LC risk by affecting local estrogen levels in LC tissues. In addition, Kohno M et al. showed that high aromatase expression was associated with poor prognosis, for both recurrence-free survival and overall survival, in lung adenocarcinoma [21]. Hence, we evaluated the correlation between CYP19A1 gene expression in LC tissues and prognosis using the TCGA database (Fig 1). LC patients with low CYP19A1 expression had a higher survival rate than those with high CYP19A1 expression (http://kmplot.com/). These results suggest that CYP19A1 polymorphisms are associated with LC.
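For readers unfamiliar with survival curves of the kind compared above, the Kaplan-Meier estimator behind such plots can be sketched in a few lines of Python. This is a generic illustration, not the kmplot.com implementation; the function name `kaplan_meier` is hypothetical and the follow-up times are invented.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with at least one death.

    times  -- follow-up time for each patient
    events -- 1 if the patient died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical follow-up data (months); the second patient is censored.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(curve)
```

Curves estimated separately for the low- and high-expression groups are then compared with a log-rank test, and a hazard ratio above 1 for the high-expression group indicates a higher risk of death in that group.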
In males, rs17601876 was associated with decreased LC risk, and the other SNPs were not associated with LC. The estrogen receptor is also expressed in the male non-reproductive system and is regulated by estrogen. Verma MK et al. found that co-expression of estrogen receptor (ER) β and aromatase in males can promote the development of LC, suggesting that rs17601876 may be associated with LC risk in males [22,23]. Overall, rs6493487 was associated with lower LC risk, but in gender-stratified analysis the GG genotype of rs6493487 was associated with LC risk only in females. Yang SY et al. found that a TTTA repeat polymorphism in an intron of the CYP19A1 gene was associated with L858R, one of the epidermal growth factor receptor (EGFR) mutations, in female never-smokers [24]. Because rs6493487 is also an intronic variant of CYP19A1, we speculate that rs6493487 may be related to EGFR mutations and may become a target for future targeted therapy of LC in females.
In subjects ≥58 years, genotypes of rs4646, rs6493487 and rs17601876 were associated with decreased LC risk and rs1062033 was associated with increased risk, whereas none of these SNPs was associated with LC risk in subjects <58 years. In postmenopausal women and in men, estrogen is mainly synthesized by aromatase in non-gonadal tissues (e.g. lung) [18], which may explain the difference between the two age groups in the association between CYP19A1 gene polymorphisms and LC risk. The use of exogenous estrogen in perimenopausal and menopausal women has been reported to increase the risk of LC in a time-dependent manner, and anti-estrogen therapy can reduce the incidence of secondary lung cancer in breast cancer patients >50 years old [25]. Additionally, studies have shown that elderly women with NSCLC survive longer than men and young women, which can be partly explained by lower estradiol (E2) levels [26]. These studies suggest an important role for estrogen in LC among older patients and identify a suitable population for the study of the CYP19A1 gene in LC.
The major pathologic types of LC include SQC, ADC and SCLC. At present, most studies hold that estrogen plays an important role in the initiation and development of NSCLC, especially lung adenocarcinoma. Estrogen promotes the growth of lung adenocarcinoma cells expressing ERβ, and antagonizing estrogen does the opposite [27]. However, the present study found no correlation between the CYP19A1 gene and lung adenocarcinoma. Interestingly, we found that genotypes of rs4646, rs6493487 and rs17601876 were associated with lower risk of SCLC and SQC. We therefore believe that other mechanisms may link the CYP19A1 gene and LC, and this requires further study.
The present study demonstrated a correlation between CYP19A1 polymorphisms and LC risk in different genders, age groups and pathologic types, but did not analyze estrogen levels. In future work, we will measure estrogen levels in the different groups and control for the influence of gender, age and other factors on estrogen levels, in order to further clarify the relationship among CYP19A1 gene expression, estrogen levels and the occurrence of LC.

Conclusions
The CYP19A1 gene, which encodes aromatase, is associated with the estrogen level in LC tissues. Genotypes of rs4646, rs6493487 and rs17601876 in the CYP19A1 gene are associated with lower risk of LC in elderly patients and with lower risk of SCLC and SQC, while rs1062033 is associated with increased LC risk in elderly patients. These findings may provide a direction for future research on the use of the CYP19A1 gene in risk prediction and treatment of LC.

Ethics approval and consent to participate
The present study was approved by the Ethics Committee of the First Affiliated Hospital of Xi'an Jiaotong University and informed consent was obtained from each participant after a full explanation of the study.

Consent for publication
Not applicable

Tables
Table 1 Comparison of baseline characteristics for lung cancer (LC) case and control subjects

Figure 1 Survival Percentage of CYP19A1 Gene at Different Expression Levels: The black line represents the survival curve of LC patients with low expression of CYP19A1; the red line represents the survival curve of LC patients with high expression of CYP19A1. The log-rank test gives p<0.05, indicating that the two survival curves differ significantly, and the high-expression group has a higher risk of death than the low-expression group (HR=1.23).
