Lung cancer is a leading cause of cancer-related death globally, and survival largely depends on the stage at diagnosis. Unfortunately, 40–60% of lung cancer patients are not diagnosed until advanced stages [17]. Effective biomarkers and therapeutic targets are therefore needed. Integrated bioinformatics analysis, which combines screening of DEGs, network-based identification of hub genes and survival analysis, has been widely used to identify potential biomarkers related to cancer diagnosis, treatment, and prognosis estimation.
In this study, DEGs in LUAD were identified by bioinformatics analysis of the GEO expression profiles GSE118370 (6 LUAD samples, 6 normal samples), GSE32863 (58 LUAD samples, 58 normal samples) and GSE43458 (80 LUAD samples, 30 normal samples). In total, 312 DEGs were identified, comprising 74 up-regulated and 238 down-regulated genes. We then used the Enrichr online web tool to visualize the results of Gene Ontology (GO) and KEGG pathway enrichment analysis. For biological processes, the up-regulated DEGs were enriched in extracellular matrix organization, extracellular matrix disassembly, collagen fibril organization and so on. For cellular component, these genes mainly encoded components of the endoplasmic reticulum lumen. Their molecular functions included platelet-derived growth factor binding, protease binding, fucosyltransferase activity and so on. In the KEGG analysis, the up-regulated DEGs were highly enriched in the pathways of protein digestion and absorption, ECM-receptor interaction, glycosphingolipid biosynthesis and so on. The down-regulated DEGs were mainly enriched in the KEGG pathways of cell adhesion molecules, tyrosine metabolism, PPAR signaling pathway, drug metabolism and so on. GO analysis showed that the down-regulated DEGs mainly participated in the biological processes of regulation of angiogenesis, sprouting angiogenesis, positive regulation of cell differentiation and so on. In addition, the cellular component results indicated that these genes were mainly components of the plasma membrane, and their molecular functions included amyloid-beta binding, low-density lipoprotein particle binding and so on. Angiogenesis is closely related to the occurrence and progression of cancers [18], and the formation of extracellular matrix is also associated with tumor metastasis and invasion [19, 20].
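As an illustration of how DEG lists from several datasets can be combined, the following is a minimal sketch. It assumes that a gene is retained only when it is called differentially expressed in all three datasets with the same direction of change. This combination rule is an assumption (the exact rule is not stated here), and the gene names and log2 fold changes are hypothetical placeholders, not results of this study.

```python
# Hypothetical per-dataset DEG calls: gene symbol -> log2 fold change (LUAD vs normal).
# Gene names and values are placeholders, not results from this study.
deg_calls = {
    "GSE118370": {"GENE_A": 2.1, "GENE_B": -1.8, "GENE_C": 1.4, "GENE_D": -2.5},
    "GSE32863": {"GENE_A": 1.7, "GENE_B": -2.2, "GENE_C": -1.1, "GENE_D": -1.9},
    "GSE43458": {"GENE_A": 1.9, "GENE_B": -1.5, "GENE_D": -2.0, "GENE_E": 1.3},
}


def consistent_degs(calls):
    """Return (up, down): genes called in every dataset with the same sign."""
    tables = list(calls.values())
    # Keep only genes that appear in the DEG list of every dataset.
    shared = set(tables[0])
    for table in tables[1:]:
        shared &= set(table)
    # Split the shared genes by a consistent direction of change.
    up = {g for g in shared if all(t[g] > 0 for t in tables)}
    down = {g for g in shared if all(t[g] < 0 for t in tables)}
    return up, down


up, down = consistent_degs(deg_calls)
print("up-regulated:", sorted(up))
print("down-regulated:", sorted(down))
```

In this toy input, GENE_C is dropped because it is absent from one dataset, and GENE_E because it appears in only one. A gene shared by all datasets but with conflicting signs would also be dropped.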
According to the KEGG pathway enrichment analysis, tyrosine metabolism was significant in LUAD. Li et al. revealed that activation of tyrosine metabolism in CD13+ cancer stem cells may drive relapse in hepatocellular carcinoma by generating nuclear acetyl-CoA to acetylate and stabilize Foxd3, allowing CD13+ cancer stem cells to sustain quiescence and resistance to chemotherapeutic agents [21]. Nitration of protein tyrosine has been shown to be involved in a variety of biological processes, including signal transduction, protein degradation, energy metabolism, mitochondrial dysfunction, enzyme inactivation, immunogenic response, cell apoptosis and cell death, and plays an important role in the occurrence and metastasis of lung cancer [22]. Therefore, the tyrosine metabolism pathway may be a potential drug therapy target for LUAD.
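For reference, pathway enrichment of the kind discussed above is commonly assessed with an over-representation test. The sketch below computes a one-sided hypergeometric p-value. It is not necessarily the exact statistic reported by the web tool used in this study. The background size, pathway size and overlap are hypothetical values chosen only for illustration; 238 is the number of down-regulated DEGs reported in this study.

```python
from math import comb


def hypergeom_enrichment_p(background, pathway_size, deg_count, overlap):
    """One-sided p-value P(X >= overlap) under the hypergeometric model.

    background:   number of genes in the reference set
    pathway_size: number of background genes annotated to the pathway
    deg_count:    number of DEGs drawn from the background
    overlap:      number of DEGs that fall in the pathway
    """
    upper = min(pathway_size, deg_count)
    total = comb(background, deg_count)
    tail = sum(
        comb(pathway_size, k) * comb(background - pathway_size, deg_count - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical inputs: 20000 background genes, a 36-gene pathway, 5 overlapping DEGs.
p_value = hypergeom_enrichment_p(20000, 36, 238, 5)
print(f"p = {p_value:.3g}")
```

In practice the p-values for all tested pathways would also be corrected for multiple testing before any pathway is called significant.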
Next, a PPI network of the DEGs was constructed using the STRING online database and Cytoscape software. Using the “CytoHubba” plug-in, the top 9 hub genes, KIAA0101, CDCA7, TOP2A, CDC20, ASPM, TPX2, CENPF, UBE2T and ECT2, were identified. Validation with GEPIA and Oncomine showed that the mRNA expression of these 9 hub genes was higher in LUAD samples than in normal lung samples. HPA database data also showed that the protein levels of the hub genes were consistent with their mRNA levels, with most genes overexpressed in lung cancer tissue. Furthermore, to identify prognostic biomarkers of LUAD, we analyzed the influence of hub gene expression levels on the survival of lung cancer patients and found that high expression of all 9 hub genes was associated with poor overall survival. Therefore, all 9 genes may play a functional role in the occurrence and development of lung cancer.
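To make the idea of a hub gene concrete, the following is a minimal sketch that ranks nodes of an undirected interaction network by degree, that is, by their number of distinct interaction partners. Degree is only one of several ranking methods such a plug-in can offer, and the method actually used is not stated here, so this choice is an assumption. The edge list is a hypothetical placeholder, not the STRING network of this study.

```python
from collections import Counter

# Hypothetical undirected PPI edges; placeholders, not the network from this study.
edges = [
    ("GENE_A", "GENE_B"),
    ("GENE_A", "GENE_C"),
    ("GENE_A", "GENE_D"),
    ("GENE_B", "GENE_C"),
    ("GENE_D", "GENE_E"),
    ("GENE_B", "GENE_A"),  # duplicate of the first edge, listed in reverse
]


def top_hubs(edge_list, n):
    """Return the n highest-degree nodes as (gene, degree) pairs."""
    # Collapse duplicate and reversed edges, and drop self-loops.
    unique_edges = {tuple(sorted(e)) for e in edge_list if e[0] != e[1]}
    degree = Counter()
    for u, v in unique_edges:
        degree[u] += 1
        degree[v] += 1
    # Sort by descending degree, then by name so that ties are deterministic.
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


for gene, deg in top_hubs(edges, 3):
    print(gene, deg)
```

Ties in degree are broken alphabetically here purely for reproducibility. A real analysis might instead compare several centrality measures before fixing the hub list.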
The protein encoded by KIAA0101 is a cell-cycle-regulated oncoprotein that regulates DNA synthesis, maintenance of DNA methylation, and DNA-damage bypass through its interaction with the human sliding clamp PCNA [23, 24]. KIAA0101 has been shown to be overexpressed in various solid tumors, including lung cancer [24–28]. Kim et al. found that PAF (PCLAF/KIAA0101), which was highly expressed in LUAD and associated with poor prognosis, could drive cell quiescence exit to promote lung tumorigenesis by remodeling the DREAM complex [27]. Another study confirmed that overexpression of KIAA0101 could promote the progression of NSCLC, whereas KIAA0101 knockdown induced G1 phase cell cycle arrest and inhibited NSCLC cell proliferation and migration [29]. The interaction of KIAA0101 with UbcH10 could also regulate NSCLC cell proliferation by disrupting the function of the spindle assembly checkpoint [30].
Cell division cycle associated 7 (CDCA7) was identified as a c-Myc-responsive gene and behaves as a direct c-Myc target gene. Overexpression of CDCA7 was found to enhance the transformation of lymphoblastoid cells, and it complements a transformation-defective Myc Box II mutant, suggesting its involvement in c-Myc-mediated cell transformation [31]. CDCA7 has been reported to be up-regulated in a number of tumors, such as hepatocellular carcinoma [32], colorectal cancer [33], lymphoma [34] and breast cancer [35], and might be a potential prognostic factor and therapeutic target. Wang et al. found that CDCA7 could promote lung adenocarcinoma proliferation by regulating the cell cycle, and that silencing CDCA7 inhibited cell proliferation in LUAD through G1 phase arrest and induction of apoptosis, which implied that CDCA7 might serve as a new biomarker and potential therapeutic target for LUAD [36].
Topoisomerase II (TOP2) has crucial functions in DNA replication, transcription and chromosome segregation, and a growing number of active anticancer drugs target it [37]. TOP2 has two isozymes, TOP2A and topoisomerase II beta (TOP2B) [38]. TOP2A is the only enzyme able to cleave and re-ligate the double-strand backbone of DNA, which is indispensable for DNA replication, transcription, and repair [39, 40]. Ejlertsen et al. [39] showed that TOP2A was a direct molecular target of anthracyclines and could improve the sensitivity of high-risk breast cancer patients to anthracycline-containing chemotherapy. In malignant peripheral-nerve sheath tumor, TOP2A was the most overexpressed gene compared with benign neurofibromas [41]. High expression of TOP2A was found to be correlated with worse overall survival (OS) in all non-small-cell lung cancer and lung adenocarcinoma patients, but not in lung squamous cell carcinoma patients [42, 43]. miRNAs associated with TOP2A have also been reported to play an important role in lung cancer; for example, according to a comprehensive meta-analysis, down-regulated miRNA-144-3p, whose potential target was TOP2A, was highly enriched in various key pathways such as protein digestion and absorption and the thyroid hormone signaling pathway in non-small cell lung cancer [44].
CDC20, also called Fizzy, contains seven WD40 repeats that are necessary for mediating protein-protein interactions [45]. Mounting evidence has revealed that CDC20 plays an oncogenic role in human tumorigenesis, and overexpression of CDC20 has been observed in a variety of human tumors. For example, CDC20 was overexpressed in pancreatic tumor tissues compared with adjacent normal tissues from pancreatic cancer patients [46]. In addition, Yuan et al. reported that the mRNA and protein levels of CDC20 were significantly higher in breast cancer cells and high-grade primary breast cancer tissues [47]. In lung cancer, multiple studies have indicated that CDC20 is highly expressed and could be a potential prognostic marker in human NSCLC [48]. However, the exact molecular mechanism of CDC20-mediated lung tumorigenesis remains elusive and needs to be explored further.
ASPM is a protein involved in multiple cellular and developmental processes, such as neurogenesis and brain growth. ASPM is widely reported to be expressed in multiple tumor tissues and to be involved in the development and progression of several cancers, including hepatocellular carcinoma, gastric cancer, pancreatic ductal adenocarcinoma, and lung cancer [49–52]. A previous study indicated that ASPM was involved in the development and progression of lung adenocarcinoma and was associated with poor prognosis [53].
TPX2, also known as DIL2 or p100, uses two flexibly linked elements ('ridge' and 'wedge') in a novel interaction mode to simultaneously bind across longitudinal and lateral tubulin interfaces [54, 55]. In ovarian cancer, TPX2 can promote the proliferation and migration of cancer cells by regulating PLK1 expression [56]. In other cancers, TPX2 can also control the proliferation and invasion of bladder cancer cells via the TPX2-p53-GLIPR1 regulatory circuitry [57], regulate the PI3K/AKT signaling pathway to facilitate hepatocellular carcinoma [58], and interact with miRNAs such as miR-485-3p [59], miR-361-5p [60], miR-335-5p [61] and miR-216b [62]. Zhou et al. verified that TPX2 can activate the epithelial-mesenchymal transition process and promote both the expression and activity of matrix metalloproteinase (MMP)2 and MMP9 in non-small cell lung cancer (NSCLC), indicating that TPX2 promotes the metastasis and malignant progression of NSCLC and could thus serve as a marker of poor prognosis in NSCLC [63].
CENPF, which contains 3,210 amino acids and has a full-length molecular weight of 367 kDa, has been shown to be highly expressed in lung adenocarcinoma. Furthermore, the expression of CENPF and ERβ2/5 (estrogen receptor beta 2/5) in LUAD patients has been shown to correlate with TNM staging, providing a basis for exploring the interactions between CENPF and ERβ2/5. Rattner et al. demonstrated that CENPF is involved in mitosis and tumor proliferation [64]. After undergoing gene amplification, CENPF is directly associated with disease outcomes [65]. In prostate cancer, CENPF has been shown to predict survival and tumor metastasis [66].
UBE2T (ubiquitin-conjugating enzyme E2T) plays a significant role in carcinogenesis [67]. UBE2T has been shown to promote the development and progression of numerous types of cancer, including gastric cancer [68], breast cancer [69], hepatocellular cancer [70, 71] and lung cancer [72]. One study showed that UBE2T promoted proliferation, migration, invasion, and radiation resistance in vitro and in vivo by accelerating the G2/M transition and inhibiting apoptosis; mechanistically, UBE2T promotes epithelial-mesenchymal transition via ubiquitination-mediated FOXO1 degradation and activation of the Wnt/β-catenin signaling pathway in NSCLC [72, 73]. Other bioinformatics analyses also found that UBE2T could have potential value as a predictor in early-stage NSCLC [74, 75].
ECT2 (epithelial cell transforming 2) is a guanine nucleotide exchange factor and transforming protein that is related to Rho-specific exchange factors and yeast cell cycle regulators. Its expression is elevated with the onset of DNA synthesis and remains elevated during G2 and M phases [76]. Multiple studies have examined the relationship between ECT2 and lung cancer. Justilien et al. demonstrated that nuclear ECT2 GEF activity was required for Kras-Trp53 lung tumorigenesis in vivo and that ECT2-mediated transformation requires ECT2-dependent rDNA transcription [76]. Furthermore, nuclear PKCι-ECT2-Rac1 and ribosome biogenesis might be a novel axis in lung tumorigenesis [77], and ECT2 could promote lung adenocarcinoma progression through extracellular matrix dynamics and focal adhesion signaling [78].
In summary, our study indicated that KIAA0101, CDCA7, TOP2A, CDC20, ASPM, TPX2, CENPF, UBE2T and ECT2 might be involved in LUAD tumorigenesis and progression. Multiple database analyses and survival analysis suggested that these 9 hub genes may serve as potential prognostic biomarkers and that their overexpression might be associated with reduced overall survival in lung cancer patients. Their mechanisms and mutual regulatory network are worthy of further research and experiments. However, only bioinformatic analysis was performed in the present study, and the role of the identified hub genes in LUAD should be further verified in vivo and in vitro. Nevertheless, our analysis may provide useful direction for potential biomarkers and prognosis evaluation of LUAD.