Landscape of Genetic and Modular Networks Between Non-Small Cell Lung Cancer and Chronic Obstructive Pulmonary Disease

The relationship between non-small cell lung cancer (NSCLC) and chronic obstructive pulmonary disease (COPD) has been demonstrated in many studies. However, the underlying connection is still unknown. The present study explored the molecular landscape shared by the two diseases. Based on literature research and modularized analysis, NSCLC and COPD networks were constructed. We employed functional annotation to measure the relationships among the annotation terms of overlapping modules. The two diseases share 154 overlapping genes, 2374 common biological processes and 601 pathways. MMP9, BCL2, BAX, TP53 and PIK3CA were common hub genes of the top three modules of NSCLC and COPD. The most common significant biological process was inflammatory response. In validation, MMP9 and BCL2 were more highly expressed in NSCLC patients with COPD than in healthy donors, while BAX and TP53 were expressed at lower levels. Our results provide a novel molecular connection between NSCLC and COPD, which may facilitate the diagnosis and treatment of multiple diseases.


Introduction
Non-small cell lung carcinoma (NSCLC) is the most common type of lung cancer, accounting for approximately 85% of lung cancers [1]. Chronic obstructive pulmonary disease (COPD) also has high prevalence and mortality rates [2]. The association between COPD and NSCLC, including gene pathogenicity, endothelial cell remodeling and inflammatory mechanisms, has been observed in multiple studies [3,4].
Compared with healthy people, COPD patients are at a relatively high risk of developing NSCLC [5]. Advanced NSCLC patients are more likely to have accompanying COPD, and they have more respiratory symptoms than those without COPD [6]. Chronic activation of the airway immune system by infections increases the incidence of NSCLC. Meanwhile, chronic airway inflammation occurs throughout the course of COPD and may induce bronchial epithelial carcinogenesis [7]. Smoking, whether active or passive, is recognized as an important risk factor for NSCLC and COPD. It also aggravates respiratory symptoms such as dyspnea, coughing and wheezing. Smoking cessation has a positive effect on the prevention and control of NSCLC, and the respiratory symptoms of COPD can also be alleviated to some extent.
Although the two diseases are closely related, the molecular mechanism between them remains unclear.
Network construction helps to understand the biological interactions of cancer [8]. The regulation network of a single molecule was explored in a previous report and provided guidance for future NSCLC therapies [9]. In this study, we applied gene networks to reveal the systemic biological functions and bridged the connection between the two diseases.

Materials And Methods
Obtaining the genes and network construction
The terms "non small cell lung cancer" or "chronic obstructive pulmonary disease" were entered into the search box of the Online Mendelian Inheritance in Man (OMIM) database (http://www.ncbi.nlm.nih.gov/omim), a knowledge database of human genes and genetic disorders (13).
Disease-associated genes were then submitted to Agilent Literature Search software, version 3.2.2 (http://www.agilent.com/labs/research/litsearch.html), which automatically queries multiple text-based search engines and extracts associations among the genes of interest. The overview network of gene associations was then constructed.

Network analysis
Cytoscape software version 2.71 (http://www.cytoscape.org) was used for visualization of disease-associated networks and analysis of the network properties. Network parameters including the network coefficient, diameter, centralization and radius were calculated.
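As an illustration of the kind of topological parameters listed above, the following is a minimal sketch in plain Python, not the Cytoscape implementation. It assumes a connected, undirected graph, reads "network coefficient" as the average clustering coefficient, and uses Freeman degree centralization; Cytoscape's own definitions may differ in detail. The function names and the toy edge list are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def network_parameters(edges):
    """Basic topology measures for a connected, undirected graph (assumption)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    n = len(adj)

    # Eccentricity of a node: distance to the node farthest from it.
    ecc = {node: max(bfs_distances(adj, node).values()) for node in adj}

    def clustering(node):
        # Fraction of neighbor pairs that are themselves connected.
        nbrs = list(adj[node])
        k = len(nbrs)
        if k < 2:
            return 0.0
        links = sum(
            1
            for i in range(k)
            for j in range(i + 1, k)
            if nbrs[j] in adj[nbrs[i]]
        )
        return 2.0 * links / (k * (k - 1))

    degrees = [len(adj[node]) for node in adj]
    max_deg = max(degrees)
    # Freeman degree centralization, normalized by the value for a star graph.
    if n > 2:
        centralization = sum(max_deg - d for d in degrees) / ((n - 1) * (n - 2))
    else:
        centralization = 0.0

    return {
        "diameter": max(ecc.values()),
        "radius": min(ecc.values()),
        "clustering_coefficient": sum(clustering(v) for v in adj) / n,
        "centralization": centralization,
    }
```

For a toy graph made of a triangle A-B-C with one extra node D attached to C, `network_parameters([("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")])` gives a diameter of 2 and a radius of 1.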
Identification of modules
MCODE (version 1.32) is a program used for network module division (http://baderlab.org/Software/MCODE). After the disease network data were obtained, each disease network was imported and MCODE was used to divide it into several modules using the following parameters: connectivity threshold, 2; core threshold K, 2; node score threshold, 0.2.
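The core threshold K above refers to the k-core of a graph, the maximal subgraph in which every node keeps at least K neighbors. The sketch below shows only that one ingredient in plain Python; it is not the MCODE algorithm, which additionally scores nodes by local density and grows modules from seed nodes. The function name and the toy edge list in the usage note are hypothetical.

```python
def k_core(edges, k=2):
    """Return the nodes of the k-core of an undirected graph.

    Nodes with fewer than k remaining neighbors are pruned repeatedly
    until every surviving node has degree >= k.
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops in this sketch
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    changed = True
    while changed:
        changed = False
        low_degree = [node for node, nbrs in adj.items() if len(nbrs) < k]
        for node in low_degree:
            # Remove the node and detach it from its remaining neighbors.
            for nbr in adj.pop(node):
                adj[nbr].discard(node)
            changed = True
    return set(adj)
```

With K = 2, a triangle A-B-C with a chain C-D-E hanging off it reduces to the triangle: E is pruned first, which leaves D with a single neighbor, so D is pruned in the next pass.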

Functional enrichment analysis
Hypergeometric distribution tests were performed to analyze the function of the modules that contained the most genes in each network (COPD-associated and NSCLC-associated gene networks) using the Database for Annotation, Visualization and Integrated Discovery (http://david.abcc.ncifcrf.gov). The following parameters were used: Count, 2; EASE, 0.01; and species and background, Homo sapiens. Using the Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) annotation, the biological processes and pathways corresponding to the modules were identified, and the P-values were ranked.
Table 4). The cycling conditions were set as follows: predenaturation at 95 °C for 1 min, followed by 40 cycles of denaturation at 95 °C for 5 s, annealing at 60 °C for 30 s and extension at 70 °C for 30 s.
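The hypergeometric test used for the module enrichment analysis can be illustrated with a short sketch. It computes the plain upper-tail probability of seeing at least the observed number of module genes annotated to a term; the EASE score reported by DAVID is a more conservative variant and is not reproduced here. The function name, argument names and the numbers in the usage note are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(module_hits, module_size, term_size, background_size):
    """Upper-tail hypergeometric P-value, P(X >= module_hits).

    module_hits     -- module genes annotated to the term
    module_size     -- genes in the module
    term_size       -- background genes annotated to the term
    background_size -- genes in the background set
    """
    upper = min(module_size, term_size)
    total = comb(background_size, module_size)
    tail = sum(
        comb(term_size, x) * comb(background_size - term_size, module_size - x)
        for x in range(module_hits, upper + 1)
    )
    return tail / total
```

For a background of 10 genes of which 4 carry a term, a 3-gene module with 2 annotated genes gives (6 × 6 + 4 × 1) / 120 = 1/3, which is far from significant, as expected for such a small toy example.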

Data analysis
StepOnePlus V2.0 software was used to calculate the PCR amplification efficiency. The average 2^-ΔΔCT of each reaction was used to compare gene expression by t test. Data were analyzed using SPSS 23.0 software (SPSS Inc., USA).
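The 2^-ΔΔCT calculation mentioned above can be sketched as follows. The sketch assumes a single reference (housekeeping) gene and roughly 100% amplification efficiency for both target and reference; the source does not state which reference gene was used, and the function and argument names are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^-ddCT method.

    Each argument is a cycle threshold (CT) value: target and reference
    gene in the patient sample, then target and reference gene in the
    healthy control.
    """
    # Normalize the target gene to the reference gene within each group.
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    # Compare the normalized sample with the normalized control.
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)
```

A value above 1 indicates higher expression in the sample than in the control, and a value below 1 indicates lower expression. For example, CT values of 24 and 18 in the sample against 26 and 18 in the control give ΔΔCT = -2 and a four-fold increase.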
Disease network topological analysis
Both the NSCLC network (Fig. 1A) and the COPD network (Fig. 1B) spread outward from the center. The topological parameters of the two disease networks are compared in Table 1. The NSCLC network included 1615 associated genes and 4,405 edges (interactions) and was more complex than the COPD network (417 genes and 907 edges). The distribution of node degree (the number of edges connected to a node) in the NSCLC (Fig. 1C) and COPD (Fig. 1D) networks followed a power law.
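One rough way to check a power-law degree distribution is to fit a straight line to the degree counts on log-log axes, since P(k) ∝ k^(-γ) implies log P(k) = -γ log k + c. The sketch below does this with ordinary least squares in plain Python. It is an illustrative assumption rather than the procedure used in this study, which is not described in the text, and least squares on log-log data is known to be only a crude estimate. The function name is hypothetical, and at least two distinct degree values are required.

```python
from collections import Counter
from math import log


def power_law_exponent(degrees):
    """Least-squares slope of log(count) versus log(degree).

    For a power law P(k) ~ k^(-gamma) the returned slope approximates -gamma.
    """
    counts = Counter(d for d in degrees if d > 0)
    xs = [log(k) for k in counts]
    ys = [log(c) for c in counts.values()]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

A synthetic degree list with 64 nodes of degree 1, 16 of degree 2, 4 of degree 4 and 1 of degree 8 follows count = 64·k^(-2), so the fitted slope is -2 up to floating-point error.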
Functional enrichment analysis of NSCLC
DAVID functional annotation was used for enrichment analysis of the genes of the two diseases. After gene ontology annotation, 6327 GO terms and 1702 pathways were obtained, and the number of GO term connections was 317051 (Fig. 2). The most significant functions were regulation of apoptotic process, cellular protein metabolic process, and positive regulation of protein metabolic process. Cellular protein modification process, regulation of cellular macromolecule biosynthetic process, and RNA metabolic process had the largest numbers of genes, with 569, 561 and 546 respectively. The most significant pathways were IRS-mediated signaling, SOS-mediated signaling, and downstream signaling events of B Cell Receptor (BCR) (Table 2).

Functional enrichment analysis of COPD
The COPD network yielded 1515 GO terms and 716 pathways, and the number of GO term connections was 107254 (Fig. 3). The most significant functions were inflammatory response, regulation of apoptotic process, and cellular response to chemical stimulus. Cellular protein modification process, regulation of cellular macromolecule biosynthetic process, and RNA metabolic process had the largest numbers of genes. The most significant pathway was cytokines and inflammatory response (Table 3).

Modules of NSCLC and its functional enrichment analysis
A total of 124 modules were identified from the NSCLC network by MCODE. The top 3 modules of the NSCLC-associated network involved 315 biological processes (Fig. 4), including positive regulation of transcription from RNA polymerase II promoter, negative regulation of apoptotic process and inflammatory response. The top 3 modules contained 12 hub genes (genes with more than 10 connections in the network).

Modules of COPD and its functional enrichment analysis
Then, 35 modules were identified from the COPD network (Fig. 5). The top 3 modules of the COPD-associated network involved 121 biological processes, including negative regulation of apoptotic process, cell proliferation and inflammatory response. They shared 81 pathways, including the AMPK signaling pathway, pathways in cancer and the non-small cell lung cancer pathway.

Validation of hub genes
We examined the expression profiles of the overlapping hub genes in the top three modules of NSCLC and COPD. MMP9 and BCL2 were more highly expressed in the blood of NSCLC patients with COPD than in healthy donors, while BAX and TP53 were expressed at lower levels (Fig. 6).

Discussion
Numerous molecular characteristics were explored by network construction, and the common biological backgrounds bridged the connection between NSCLC and COPD. The overlapping genes and the similar biological processes and pathways provide an efficient way to study the genetic correlation between the two diseases.
MMP9 is a member of the matrix metalloproteinases (MMPs), which degrade the basement membrane and promote the invasion of lung cancer [10]. Analysis of DEGs in lung cancer tissues and construction of regulatory networks showed that MMP9 plays an important role in the development of lung cancer [11]. MMP9 also participates in the progression of COPD: it changes the normal alveolar structure and is involved in airway remodeling. A positive correlation between MMP9 and the COPD assessment test score in COPD patients was previously demonstrated [12].
TP53 mutation occurs in many human malignancies, and TP53 is involved in cell apoptosis, senescence and DNA repair [13]. The expression of mutant TP53 is associated with resistance to platinum-based chemotherapy [14]. Emphysema is one of the pathological types of COPD, and TP53 takes part in its development. Inflammatory factors may lead to the accumulation of TP53 protein and increase the Bax/Bcl-x(L) ratio in the lung tissue of emphysema patients [15]. Recently, a bioinformatic study identified TP53 as a target gene of miRNAs that were significantly downregulated in COPD patients [16]. MMP9 and TP53 are hub genes in the same module of NSCLC and have been widely studied separately. We found that MMP9 was highly expressed and TP53 was expressed at a low level in NSCLC patients with COPD.
Functional enrichment demonstrates that cellular and protein metabolic processes are the biological processes in which they are mainly involved in NSCLC. The apoptotic process is related to both NSCLC and COPD. These findings may supply a new perspective on the connection between MMP9 and TP53 in NSCLC and COPD.
The BCL2 gene family is an important regulator of apoptosis. BCL2 inhibits programmed cell death, whereas BAX is involved in promoting it, and both are closely linked to NSCLC. Overexpression of BCL2 and a low expression level of BAX can decrease lung cancer cell apoptosis [17].
Other non-overlapping hub genes such as KRAS, EGFR and ALK also play important roles in the two diseases. Progression-free survival (PFS) is shorter in patients with KRAS/TP53 or PIK3CA/TP53 co-mutations than in those with a single KRAS or TP53 mutation. KRAS, PIK3CA and TP53 mutations are associated with distant metastases and poor prognosis. Patients with NSCLC should receive routine genetic testing, because ALK, KRAS, PIK3CA and TP53 mutations inform clinical decisions and prognosis [18].
The top three modules of NSCLC and COPD shared 82 common biological processes and 78 pathways. The most significant biological process was inflammatory response. Inflammatory response is an effective prognostic indicator in NSCLC patients [19] and also worsens the condition of COPD patients [20].
According to our findings, inflammatory and apoptotic processes play a major role in both diseases. Activation of the inflammatory or apoptotic process is assumed to promote tumor progression [21]. Anti-inflammatory treatment or increased apoptosis has an anticancer effect in mice with induced cancer [22]. COPD is an inflammatory lung condition that features incompletely reversible airflow obstruction [22]. Thus, therapies targeting the apoptotic and anti-inflammatory signaling pathways regulated by hub genes may represent promising treatments for NSCLC or COPD.
With the rapid growth in the number of studies, potential molecular functions and pathways are being developed into targets for NSCLC therapy. EGFR tyrosine kinase inhibitors (TKIs) and ALK inhibitors are used as first-line treatment for NSCLC patients in the NCCN Guidelines. Nevertheless, there are many limitations to the use of TKIs and ALK inhibitors in NSCLC patients. More targets and more biological functions need to be widely investigated for the treatment not only of NSCLC but also of COPD.

Conclusions
In this study, we provide a detailed overview of the NSCLC gene network and the COPD gene network. We illustrate the possible underlying biological processes shared by NSCLC and COPD. Network analysis helps us to gain new insights into the pathological mechanisms and treatment of the two diseases.

Declarations
Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Availability of data and material
The data used to support the findings of this study are available from the corresponding author upon request.

The expression of hub genes. MMP9 and BCL2 were highly expressed in NSCLC patients with COPD compared with healthy people, while BAX and TP53 were expressed at low levels. *P < 0.05; **P < 0.01; ***P < 0.001.