The association between COPD and LC has garnered significant attention from researchers and clinicians in recent years. Each year, 0.8–1.7% of patients with COPD develop LC, and LC accounts for 33% of all COPD-related deaths [24]. There is therefore growing interest in identifying LC risk in patients with COPD to enable effective management. In this study, we evaluated the association of both genetic susceptibility and clinical variables, including demographic, environmental and lifestyle factors, with LC in COPD. Through univariate analysis, a candidate gene study and multivariate analysis, BMI, smoking pack-years, emphysema and rs56113850 in CYP2A6 were ultimately identified as independent risk factors and were used to predict LC in patients with COPD and in those with serious COPD. Moreover, in addition to patients with coexisting COPD and LC, two control groups (patients with COPD only and patients with LC only) were included to comprehensively assess the association of the studied variables with COPD + LC. To the best of our knowledge, this is the first study to integrate genetic and clinical data to predict LC in patients with COPD.
The chromosomal region 19q13.2 contains the primary nicotine-metabolizing gene, CYP2A6 [25]. CYP2A6 is highly polymorphic and encodes the enzyme that metabolizes nicotine to cotinine and then cotinine to trans-3’-hydroxycotinine (3HC); the nicotine metabolite ratio (3HC/cotinine) reflects the efficiency of nicotine metabolism through CYP2A6. Genetic variants in CYP2A6 have been shown to be associated with nicotine metabolism, smoking behavior, smoking cessation and tobacco-related LC risk [26]. By altering the rate of nicotine metabolism, variation in CYP2A6 activity may influence LC risk through tobacco exposure and procarcinogen activation [27]. For rs56113850, the sentinel SNP in CYP2A6, the effect allele C has been reported to be associated with increased nicotine metabolism activity [28, 29]. An analysis of GWAS summary statistics showed that the effect allele C of rs56113850 was associated with an increased risk of heavier smoking, COPD and LC [16]. A recent single-variant and Mendelian randomization analysis supported a causal pathway connecting rs56113850 in CYP2A6, cigarette consumption and LC susceptibility in smokers [30]. These results are consistent with our findings. In the present study, rs56113850 (risk allele C) was found, for the first time, to increase the risk of LC in COPD in the Chinese Han population after adjustment for confounding factors. rs56113850 was associated with increased disease risk in the COPD + LC group in comparison with both the COPD group and the LC group. Our results suggest that rs56113850 in CYP2A6 might be a potential genetic biomarker of LC in COPD.
In this study, rs1489759 in HHIP was associated with LC in COPD in unadjusted models, but this association was no longer significant after adjustment. The HHIP gene is located at chromosome 4q31 and acts as a negative regulator of Hedgehog signaling by binding to Hedgehog proteins. The Hedgehog signaling pathway is involved in epithelial-mesenchymal transition, mediates cigarette-induced oncogenic transformation of bronchial epithelial cells and is essential for the proliferation of many LC cell lines. In addition, HHIP expression may alter lung repair mechanisms and thereby lead to COPD [15, 31]. Several studies have also reported that rs1489759 may be associated with both COPD and LC [15, 32]. Of note, some of the SNPs identified in other studies as associated with LC in COPD were not significant in this study. These discrepancies may be due to ethnic and regional differences among the study samples.
Regarding clinical data, our results showed that lower BMI, more smoking pack-years and the presence of emphysema were independent risk factors for LC in COPD. BMI is a measure of body fat derived from height and weight. Mounting evidence suggests that BMI is inversely associated with the risk of LC [33, 34]. A recent pooled analysis of 10 prospective cohort studies indicated that leanness (BMI < 18.5 kg/m²) was associated with a higher risk of LC and that every 5 kg/m² increase in BMI was associated with a 21% lower risk of LC [35]. Our study focused on patients with COPD and found that BMI was likewise inversely associated with LC in COPD. COPD and LC are two of the most important smoking-related diseases, and smoking plays a pivotal role in the development of both. Our findings showed that pack-years were much higher in patients with COPD + LC than in patients with COPD only, which is consistent with mechanisms of nicotine-induced carcinogenesis. Emphysema is a type of lung damage related to smoking. Previous studies have demonstrated that the degree of emphysematous lesions is associated with LC development in COPD [24, 36]. Similarly, we found that the prevalence of emphysema was much higher in the COPD + LC group than in the COPD group.
The clinical risk factors identified in this study are similar to those in the COPD-LUCSS prediction tool for LC in COPD, which includes age greater than 60 years, BMI less than 25 kg/m², pack-years greater than 60 and the presence of emphysema [37]. Given the prominent role of genetic factors, we carried out a candidate gene study, identified rs56113850 in CYP2A6 as an independent genetic risk factor for LC in COPD, and ultimately developed a prediction model integrating genetic and clinical data. The AUC of this prediction model was modest, and fewer than 30% of patients would be misclassified by the model. Moreover, the model's predictive ability for LC risk increased significantly in patients with serious COPD, among whom fewer than 20% would be misclassified.
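For readers less familiar with the two performance measures mentioned above, AUC and the proportion of misclassified patients, the following minimal Python sketch shows how both can be computed from a model's predicted risks. It is an illustration under stated assumptions, not the analysis used in this study: the labels and predicted risks are invented values, the function names `auc` and `misclassification_rate` are hypothetical, and the 0.5 classification threshold is an arbitrary choice rather than the cut-off applied to our model.

```python
# Illustrative sketch only. The labels and predicted risks below are invented
# values, NOT data from this study; function names are hypothetical.


def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney formulation.

    Returns the probability that a randomly chosen case (label 1) receives a
    higher predicted risk than a randomly chosen control (label 0); ties
    count as one half.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def misclassification_rate(labels, scores, threshold=0.5):
    """Fraction of patients whose predicted class differs from the true class.

    A patient is predicted as a case when the predicted risk is at or above
    `threshold` (0.5 here is an assumed, arbitrary cut-off).
    """
    predicted = [1 if s >= threshold else 0 for s in scores]
    errors = sum(1 for y, p in zip(labels, predicted) if y != p)
    return errors / len(labels)


# Invented example: 4 patients with COPD + LC (1) and 6 with COPD only (0).
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.6, 0.3, 0.8, 0.4, 0.35, 0.2, 0.2, 0.1]

example_auc = auc(labels, scores)
example_error = misclassification_rate(labels, scores)
print(f"AUC = {example_auc:.3f}, misclassified = {example_error:.0%}")
```

The misclassified proportion depends on the chosen threshold, whereas AUC summarizes discrimination across all thresholds, so the two measures are complementary and should be reported together.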
There are, however, several limitations to this study. All subjects were of Chinese Han ethnicity, so the results of the candidate gene study may not be applicable to other ethnic groups. The predictive ability of the model should be validated in independent cohorts of patients from other settings. Furthermore, a prospective cohort study with a larger sample size and more tag SNPs covering the genes is needed to elucidate the causal relationships.