In lung cancer, many studies have shown that smoking can activate multiple signaling pathways, including the β-adrenergic receptor (β-AR), nicotinic acetylcholine receptor (nAChR), nuclear factor-κB (NF-κB), epidermal growth factor receptor (EGFR) and gamma-aminobutyric acid (GABA) signaling pathways [23]. However, the mechanisms that lead to lung adenocarcinoma in smokers remain difficult to elucidate. At the genetic level, smokers and nonsmokers with LA were accurately distinguished by a support vector machine (SVM) classification model built from 27 characteristic genes that were significantly enriched in the proteoglycans in cancer and Ras signaling pathways [24]. Seven mRNAs (CYP17A1, PKHD1L1, RPE65, NTSR1, FETUB, IGFBP1 and G6PC) have been proposed as prognostic indicators in smokers with LA [9]. However, very few studies have focused on the progression from normal lung tissue to LA tissue in smokers. Therefore, our study aimed to discover potential biomarkers that predict progression to lung adenocarcinoma in smokers.
In this study, a total of 245 DEGs (196 down-regulated and 49 up-regulated genes) were identified. GO and KEGG enrichment analysis of the 196 down-regulated genes showed that their products localize to the extracellular matrix and the endoplasmic reticulum lumen. These genes are mainly involved in extracellular matrix organization and tissue development, are enriched in the protein digestion and absorption and extracellular matrix-receptor interaction pathways, and function as extracellular matrix structural constituents and in growth factor and protease binding. In contrast, GO and KEGG enrichment analysis of the 49 up-regulated genes showed that their products localize to the extracellular matrix, cytoplasmic vesicles, secretory vesicles and the plasma membrane. These genes mediate the binding of various molecules, are enriched in the complement system pathway, and are mainly involved in cell or molecule adhesion, blood vessel formation and development, and the regulation of the inflammatory response. Thus, the 245 DEGs appear to act mainly on the tumor microenvironment, which is essentially composed of genetically abnormal cells surrounded by blood vessels, fibroblasts, immune cells, stem cells and extracellular matrix (ECM) [25]. The complement pathway is a component of innate immunity that complements the action of immunoglobulins and enhances the clearance capacity of immune cells by promoting inflammation and attacking pathogen cell membranes [26]. In the tumor microenvironment, complement regulates both pro-tumor and anti-tumor pathways. It has been shown that complement activation via the C3a receptor pathway mediates lung cancer progression, whereas RNA interference with CD59 synthesis can inhibit the growth and metastasis of lung adenocarcinoma cells [27, 28]. Dysregulated complement activation is a key link between inflammation, the suppression of antitumor immune responses and the promotion of tumorigenesis [29].
Furthermore, it has been suggested that complement inhibitors or activators combined with targeted therapy or immunotherapy hold promise for the treatment of lung adenocarcinoma [26].
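As an illustration of the over-representation logic that typically underlies GO and KEGG enrichment results such as those discussed above, the following minimal sketch computes a one-sided hypergeometric p-value for a single term. It is not the pipeline used in this study: the function name `hypergeom_enrichment_p`, the background size, the term size and the overlap count are all hypothetical values chosen for illustration; only the list size of 196 down-regulated genes comes from the text.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_annotated, n_selected, n_overlap):
    """One-sided over-representation p-value, P(X >= n_overlap).

    X is hypergeometric: n_selected genes are drawn without replacement
    from n_background genes, of which n_annotated carry the term.
    """
    upper = min(n_annotated, n_selected)
    total = comb(n_background, n_selected)
    # Sum the probability mass from the observed overlap up to the maximum.
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 196 down-regulated genes (from the text) tested
# against an assumed background of 20000 genes, an assumed term with 300
# annotated genes, and an assumed overlap of 15 genes.
p_value = hypergeom_enrichment_p(20000, 300, 196, 15)
print(f"enrichment p-value: {p_value:.3e}")
```

In practice such p-values are computed for every term and then corrected for multiple testing, and the choice of background gene set strongly influences the result.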
After a series of screening steps, COL1A1, COL1A2, COL3A1 and DCN were defined as key genes. COL1A1 (Collagen Type I Alpha 1 Chain) and COL1A2 (Collagen Type I Alpha 2 Chain) are protein-coding genes that encode the pro-alpha chains of type I collagen (COL1), which is associated with osteogenesis imperfecta [30]. COL3A1 (Collagen Type III Alpha 1 Chain) is another collagen-coding gene, and mutations in it are responsible for Ehlers-Danlos syndrome type IV [31]. DCN, which encodes decorin, is a protein-coding gene whose mutation is regarded as the main cause of congenital stromal corneal dystrophy (CSCD) [32]. Genetic alterations of these key genes were more common in female LA smokers. The co-expression and pathway network indicated that these 20 genes/proteins function mainly in the extracellular matrix (especially collagen). CD36 was the pathway gene/protein most closely related to COL1A1 and COL1A2. CD36 can mediate pathways that inhibit angiogenesis, which is the basis of tumor growth and metastasis [33]. Thus, our results suggest that COL1A1 and COL1A2 might promote tumor progression by inhibiting CD36-related pathways.
Regarding the immune infiltration analysis, we found that COL1A1, COL1A2 and COL3A1 mainly regulate central memory CD8 T cells, regulatory T cells, memory B cells, natural killer (NK) cells and natural killer T (NKT) cells, while DCN mainly regulates mast cells, type 1 T helper cells, NK cells, macrophages and regulatory T cells. These immune cells play different roles in the tumor immune microenvironment. Central memory CD8 T cells play an important role in cellular immunotherapy, in which they are obtained from the patient, expanded in vitro and then transferred back into the patient to achieve an anti-tumor effect [34]. In patients who were unresponsive to PD-L1 or cytotoxic T lymphocyte-associated protein 4 (CTLA-4) blockade, antibody-mediated depletion targeting the immune checkpoint 4-1BB was reported to act on regulatory T cells and thereby modulate the antitumor immune response [35]. Memory B cells that accumulated in regional lymph nodes, partly as a result of PD-L1 blockade, secreted high levels of immunoglobulin against tumor antigens [36]. Studies have shown that patients with low NK cell activity had an increased risk of lung cancer, and that patients who received multiple injections of allogeneic NK cells tended to have a better prognosis [37, 38]. Invariant NKT cells are potent immune cells for targeting cancer but are present at low levels in cancer patients; hematopoietic stem cell engineering has been used to overcome this limitation, yielding a novel therapy with long-term efficacy and no toxicity in vivo [39]. Mast cells can exert both anti-tumor effects, by infiltrating tumors and directly affecting the proliferation and invasion of tumor cells, and pro-tumor effects, by shaping the tumor microenvironment and regulating the immune response to tumor cells; which effect predominates depends on the cancer type, the stage of tumor progression and the location of the mast cells within the tumor [40, 41].
In particular, it has been suggested that abnormal activation of mast cells leads to lung immune dysfunction in smokers, which also contributes to tumor development and progression [42]. Type 1 T helper cells mainly secrete the cytokine interferon-γ (IFN-γ), which can also benefit tumor cells by facilitating tumor growth, altering tumor immune resistance and promoting an immunosuppressive tumor microenvironment [43, 44]. The ability of macrophages to mount an effective antitumor response is governed by their metabolism, which can be actively reprogrammed by the tumor microenvironment via metabolites, cytokines or other signaling mediators, thereby reducing the anticancer effect of macrophages [45]. Thus, inhibition of this reprogramming process has become a new approach for tumor therapy.
The immune infiltration associated with these genes has potential value in tumor development and treatment. The importance of these genes in tumors prompted us to look for drugs that target them. In the drug target analysis, COL1A1, COL1A2 and COL3A1 shared a targeting drug, collagenase clostridium histolyticum (DB00048), and DCN had a targeting drug, tromethamine (DB03754). However, neither drug has been applied in cancer therapy. Furthermore, single-cell sequencing data were used to explore the expression specificity of these genes. COL1A1 was more often expressed in male LA patients, whereas COL1A2 and DCN were more often expressed in female LA patients with brain metastases, and COL3A1 was specifically highly expressed in female LA patients with brain metastases. Using IHC, Liu Y et al. showed that COL1A1 and COL3A1 were significantly highly expressed in brain tumor tissue metastasized from LA [46]. Owing to its specific expression, COL3A1 has the potential to predict brain metastases in LA. However, no studies have described the relationship between the other genes (COL1A2 and DCN) and brain metastases.
Although potential biomarkers were identified in our study, it has several limitations. First, because the analysis was restricted to smokers, we used only ONCOMINE data to verify the expression of the key genes and did not analyze the correlation between the key genes and tumor stage. Additional validation data and a stage-correlation analysis would further improve the reliability and depth of our study. Second, owing to limited experimental conditions, we did not experimentally verify the key genes, and some of them were verified only by published experimental results. Third, the specific functions and mechanisms by which the key genes induce LA remain unclear. Thus, more studies are needed to clarify these questions.