PLK1 is a Potential Prognostic Factor Associated With the Tumor Microenvironment During Lung Adenocarcinoma Progression

Background: Lung adenocarcinoma (LUAD) accounts for more than 40% of lung cancer cases worldwide, and the 5-year survival rate of LUAD patients is less than 10% due to a lack of reliable therapeutics. Here, we sought to identify a new therapeutic target for LUAD via bioinformatics analysis. Methods: Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis was conducted for the differentially expressed genes (DEGs) identied in 551 samples from the Cancer Genome Atlas (TCGA) database. Gene set variation analysis (GSVA) and the CIBERSORT method were performed to estimate the expression prole of biological pathways and the population of tumor-inltrating immune cells (TICs) in the TCGA dataset. DEGs were further analyzed by protein-protein interaction (PPI) network and Cox regression analyses, followed by RT-PCR and western blotting for conrmation. Results: Cell cycle was the only shared pathway identied by the KEGG and GSVA interaction analyses. Cell cycle score was positively associated with clinical characteristics (age, clinical stage, and metastasis) and negatively associated with overall survival in LUAD patients. PPI and Cox analyses identied PLK1 as a prognostic factor, which was positively correlated with clinical stage and negatively correlated with overall survival in LUAD patients. CIBERSORT analysis indicated that PLK1 expression was signicantly positively correlated with CD8+ and activated memory CD4+ T cells, and negatively correlated with activated natural killer cells. Additionally, PLK1 overexpression resulted in increased immune cytotoxic metrics, such as cytolytic activity score, IFN-γ score, and IFN-γ level. Conclusions: PLK1 may be useful for survival estimation in LUAD patients due to its strong correlation with features of TICs in the tumor immune microenvironment. This work represents a signicant step forward in elucidating the molecular mechanisms behind LUAD progression. PLK1 was revealed as both a potential therapeutic target for LUAD, as well as a novel prognostic factor for LUAD patient survival. PLK1 is strongly correlated with several TICs in the tumor immune microenvironment. Further investigation should be conducted to improve the accuracy of the combined analysis and further illuminate the roles of PLK1 in the TIC community during LUAD progression.

metabolism and has also been shown to promote tumor progression in LUAD. In a recent study, it was suggested that silencing circ-ENO1 reduced glycolysis, repressed cell growth, and subsequently induced apoptosis via suppression of the ENO1 gene expression in LUAD patients [7]. Foster et al. suggested that ATMIN typically had lower expression in LUAD tissues, and heterozygous ATMIN deletion contributed to cancer cell proliferation and cancer grade in a mouse model of LUAD, indicating that ATMIN serves as a tumor suppressor for LUAD [8]. A previous study has identi ed PLK1 as a key gene correlated with poor survival for LUAD [9]. However, the correlation of PLK1 expression with TICs has not been demonstrated.
In recent years, signi cant evidence has been found indicating that TICs in the tumor microenvironment (TME) play essential roles in the regulation of malignance and improvement of clinical outcomes in many tumors [10][11][12]. In particular, CD4+ and CD8+ T cells, regulatory T-cells (Tregs), dendritic cells (DCs), and macrophages appear to be highly predictive of patient outcome [13][14][15]. Zhang et al. found that 55% of ovarian cancer patients had tumor-in ltrating T cells, and the 5-year overall survival rate for those patients with tumor-in ltrating T cells was signi cantly higher than those without [16]. Based on 179 triple-negative breast cancer (TNBC) patients, Vihervuori et al. found that the survival rate of patients with at least a 14% fraction of CD8+ T cells in an 18-year follow-up was signi cantly higher than those with less than 14%, indicating that CD8+ T cells are correlated with better clinical outcomes [17]. Kim et al. also found that a high level of tumor-in ltrating lymphocytes (TILs) was signi cantly negatively correlated with histologic grade of tumors [18]. Meanwhile, LUAD patients with high TILs had better progression-free survival and overall survival compared to those with low TILs [19]. The focus of this paper was to identify different TIC compositions among LUAD and normal lung tissues, verify the association of hub genes with TICs, and further elucidate the underlying mechanisms behind the LUAD.
The detailed work ow of this work is provided in Fig. 1.

Methods
Data collection and identi cation of differentially expressed genes (DEGs) Transcriptome RNA-sequencing data (HTSeq-FPKM) of LUAD patients were retrieved from the Cancer Genome Atlas (TCGA) database (https://cancergenome.nih.gov/), which contained 497 samples from the lungs of LUAD patients and 54 samples from normal lung tissue. The GSE31210 gene expression pro le was downloaded from Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/), including 226 samples from LUAD patients and 20 normal samples. The R packages "edgeR" and "limma" were applied to identify DEGs in TCGA dataset and normalize the GSE31210 matrix data, respectively. An adjusted p-value<0.05 and |logFC|>2 were used for the cutoff criteria for DEGs.

Protein-protein interaction (PPI) network construction and Cox regression analysis
Data containing known interactions among different proteins was retrieved from the Search Tool for the Retrieval of Interacting Genes database (STRING, https://string-db.org/), and a combined score of ≥0.4 was used as a cutoff. Next, a PPI network was visualized by Cytoscape software (version 3.6.0, https://cytoscape.org/). The R package "survival" was used for Cox regression analysis, and genes with p-value<0.001 were shown in the resulting forest plot.

Survival analysis and correlation analysis
The R packages "survival" and "survminer" were used for estimation of overall survival in LUAD patients. The Kaplan-Meier method was used to plot the survival curves and p-value<0.05 was considered signi cant. The Pearson coe cient was used for correlation analysis via Graphpad prism 7.0.

Validation of gene expression
The TCGA dataset and GSE31210 pro le were used to perform statistical analyses for the hub genes. A meta-analysis was developed on the Oncomine database website (https://www.oncomine.org) to identify the expression pattern of hub genes among LUAD and normal lung samples. The immunohistochemical results of corresponding genes from the Human Protein Atlas (HPA, https://www.proteinatlas.org) were adopted to illustrate the protein expression levels. Meanwhile, the mRNA and protein expression levels of certain genes were measured by quantitative real-time polymerase chain reaction (qRT-PCR) and western blotting.
Assessment of immune activity of the TME The CIBERSORT computational method was adopted to estimate the relative fractions of TICs in all LUAD samples. The R package "corrplot" was applied for plotting the heatmap of correlations among TICs. The

qRT-PCR
Total RNA was extracted using Total RNA Extraction Kit (Solarbo, Beijing, China), and reverse transcription was performed using the rst-strand cDNA synthesis kit (Invitrogen, Carlsbad, CA, USA) according to the manufacturers' instructions. RT-PCR was conducted using Premix Ex Taq SYBR Green PCR (TaKaRa, Dalian, China) on an ABI PRISM 7300 plus (Applied Biosystems, Foster City, CA, USA) following the manufacturer's protocols. The primer sequences used in the study were as followed: PLK1 forward primer AAAGAGATCCCGGAGGTCCTA, PLK1 reverse primer GGCTGCGGTGAATGGATATTTC; GAPDH forward primer GGAGCGAGATCCCTCCAAAAT, GAPDH reverse primer GGCTGTTGTCATACTTCTCATGG.

Statistical analysis
Statistical analysis was performed using R software (version 3.6.0) and Graphpad prism 7.0. The correlation analysis was carried out via Pearson's correlation tests. A log-rank test was used to compare statistical differences in the Kaplan-Meier analysis. The results of qRT-PCR and western blotting were analyzed with Student's t-test. All experiments were performed at least three independent times and p-value<0.05 was considered statistically signi cant.

Results
Upregulation of the cell cycle pathway in LUAD GSVA was used to explore different KEGG pathways in samples based on the TCGA dataset. Compared with normal lung samples, a total of 48 KEGG pathways (19 downregulated and 29 upregulated) were signi cantly different in LUAD samples ( Fig. 2a and Supplementary Table S1), the patterns of which were visualized in a heatmap (Fig. 2b). The 1,419 DEGs (310 downregulated and 1,109 upregulated) identi ed from TCGA dataset (Fig. 2c) were mainly enriched in cell cycle, protein digestion and absorption, ECMreceptor interaction, complement, coagulation cascades, and p53 signaling pathway (Fig. 2d and  supplementary Table S2). Cell cycle was the only common pathway identi ed by GSVA and KEGG enrichment analyses (Fig. 2e).
Correlation of cell cycle pathway with overall survival (OS) rate and clinical characteristics in LUAD patients We further analyzed the TCGA dataset to evaluate the possibility of utilizing cell cycle pathway genes as prognostic markers for LUAD patients. The results of this analysis indicated that there were higher cell cycle scores in LUAD samples than normal samples (Fig. 3a). The cell cycle score was signi cantly higher in male (p=0.0036), M1 (p=0.0186), N1-3 (p=0.0007), and stage III&IV (p=0.0007) subgroups compared with other subgroups (Fig. 3b-e). Additionally, the LUAD patients were strati ed into high and low cell cycle score subgroups according to their median cell cycle scores. The high cell cycle score subgroup was distinctively associated with worse OS in LUAD patients as shown in Fig. 3f.

Interaction analysis of PPI network and univariate Cox regression analysis
The 23 DEGs involved in the cell cycle pathway were provided in Table 1 and were used for construction of a PPI network. This analysis revealed that there were 23 nodes and 240 edges in the PPI network (Fig.   4a), and the connectivity degree of each gene in the network was calculated and displayed in the histogram (Fig. 4b). BUB1B, CDC6, CDK1, MCM2, CDC20, CCNA2, CCNB1, CCNB2, CDC45, PLK1, CHEK1, and CCNE1 were the DEGs with the highest degrees of connectivity. Additionally, CCNA2, PTTG1, CDC25C, CCNB1, and PLK1 were signi cantly associated with OS in LUAD patients, as determined by univariate Cox analysis for the 23 DEGs (Fig. 4c). We found that CCNA2, CCNB1, and PLK1 were shared between the twelve leading genes in the PPI network and the ve genes determined by univariate Cox analysis (Fig.  4d).

Identi cation of PLK1 as a prognostic factor for the LUAD patients
To further evaluate the prognostic value of PLK1, we conducted univariate and multivariate Cox regression analyses based on the TCGA dataset and GSE31210 pro le. As indicated by Table 2, PLK1expression and clinical stage had a signi cant association with OS in LUAD patients determined by Cox regression analysis in both the TCGA dataset and GSE31210 pro le (p<0.05). Compared with stage I, patients with stage II, III,and IV had higher PLK1 expression according to the TCGA dataset (p<0.05, Fig.  5a), as well patients with stage II in the GSE31210 pro le (p<0.05, Fig. 5b). PLK1 expression had a signi cantly positive correlation with stage classi cation in both theTCGA dataset (r=0.2851, p<0.0001, Fig. 5c) and GSE31210 pro le (r=0.2996, p<0.0001, Fig. 5d). In addition, the LUAD patients were grouped into subgroups with high and low medianPLK1 expression. The PLK1 high expression subgroup was signi cantly associated with worse OS of LUAD patients in both the TCGA dataset (Fig. 5e) and GSE31210 pro le (Fig. 5f).

High expression of PLK1 in LUAD samples
Both paired and unpaired difference analyses of the TGCA dataset revealed that PLK1 was expressed at a signi cantly higher level in LUAD samples compared to normal samples ( Fig. 6a and b). Meta-analysis indicated that PLK1 expression was signi cantly higher in LUAD tissues in ve different datasets ( Fig. 6c). Immunohistochemical images from HPA indicated that PLK1 was upregulated in LUAD tissues compared to normal samples (Fig. 6d). We also found that PLK1 expression was signi cantly upregulated in LUAD versus normal lung tissues in the GSE31210 pro le (Fig. 6e). Additionally, both qRT-PCR and western blotting suggested that PLK1 was distinctively elevated in A549 and PC-9 cell lines compared to HBE cells ( Fig. 6f and g).

Association of PLK1 with immune activity of TME
To investigate the correlation between PLK1 and LUAD in the immune microenvironment, we further analyzed the TCGA dataset with a focus on TICs, CAS, and IFN-γ score. As shown in Fig. 7, the proportion of 21 kinds of in ltrating immune cells in LUAD samples was calculated with the CIBERSORT algorithm. The TIC pro les were used to generate a histogram and the correlation among TICs was visualizedin a heatmap. The proportion scores of 12 kinds of TICs were distinctively different between high and low PLK1expression subgroups (Fig. 8a). Additionally, 13 kinds of TICs were signi cantly associated with PLK1 expression via Pearson correlation analysis (Fig. 8b). The 12 kinds of common TICs were determined by both difference analyses and correlation tests (Fig. 8c and Supplementary Table S3), and the resulting altered TICs were termed PLK1-related TICs. Compared with normal tissues, CAS, IFN-γ expression, and IFN-γ score were signi cantly elevated in LUAD samples (Fig. 9a-c). Compared to the low PLK1 expression subgroup, CAS, IFN-γ expression, and IFN-γ score were signi cantly increased in high PLK1 expression subgroup (Fig. 9d-f). Additionally, CAS, IFN-γ expression, and IFN-γ score had signi cant associations with PLK1 expression (p<0.0001, p<0.0001, and p=0.0013, respectively) ( Fig. 9g-i).

Discussion
LUAD accounts for more than 40% of lung cancer worldwide and is one of the deadliest cancers [25]. Currently, the most common strategy for treating LUAD involves surgery, which bene ts greatly from early detection. The survival rate of patients who receive surgical intervention in combination with chemotherapy further increases by 5 to 10% [26]. Despite these interventions, the 5-year survival rate for LUAD is still unacceptably low. Additionally, limited knowledge about molecular and cellular mechanisms involved in LUAD have made it di cult to identify new therapeutics and strategies for LUAD patients. In this work, PLK1 was identi ed as a hub gene involved in cell cycle progression which had positive correlations with clinical stage and cytotoxic activity in the TME. PLK1 belongs to the polo-like kinases (PLKs) family, which is a subgroup of serine/threonine protein kinases [27]. It impacts the cell cycle in multiple ways, including acting as a G2/M checkpoint, regulating the centrosome during cell cycle, and coordinating the spindle assembly and chromosome segregation [28]. Several studies have demonstrated that high expression of PLK1 commonly occurs in malignant tumors, and its overexpression is negatively correlated with prognosis in tumor patients [29][30][31]. Based on 25 datasets and 12 publications, Lin et al. found that the mRNA and protein expression levels of PLK1 were signi cantly increased in gastric cancer compared with normal gastric tissues, and high PLK1 expression was correlated with unsatisfactory OS (p<0.001), lymph node metastasis (p=0.013), and advanced TNM stage (p=0.038) [32]. Meanwhile, high expression of PLK1 protein had a prominent association with patients of ovarian clear cell carcinoma (OCCC), and silencing PLK1 sensitized OCCC cell lines (OVTOKO and KK) to the cisplatin treatment via activation of autophagy and apoptosis [33]. Based on 3,173 samples, Takeshita et al. demonstrated that PLK1 was highly expressed in TNBC and its overexpression was dominantly associated with TICs (CD8+ T cells and macrophage) and unfavorable prognosis in the whole cohort [34]. They also found that both cell cycle-related gene sets (G2/M check point and E2F targets) and MYC target gene sets were enriched insamples with high PLK1 expression via Gene Set Enrichment Analysis. It also has been reported that PLK1 up-regulated TGF-β signaling to drive cancer cell invasiveness in vitro and served as a strong predictor for diminished survival rates in metastatic NSCLC patients [35]. Many previous studies have suggested that downregulating PLK1 expression could strongly repress cell growth and lead to apoptosis in many types of tumors, indicating that PLK1 could be a promising treatment target for cancers [36][37][38].
Previous studies have demonstrated that TICs play a signi cant role in the tumor immune microenvironment, and directly or indirectly affect tumor progression and survival of cancer patients [39][40][41]. Based on 212 pancreatic cancer cases, Ino et al. found that patients with high levels of tumorin ltrating CD4+ and CD8+ T cells survived longer than those with low levels [42]. In NSCLC, most tumorin ltrating CD4+ and CD8+ T cells expressed CD69 and a subset also produced CD103, both of which are biomarkers of resident memory T cells [43]. It has been reported that CD103+ T cells not only served as a guard against secondary infections, but also attacked tumor cells [44,45]. Both CD8+ and CD4+ T cells produced inhibitory receptors PD-1, 2B4, and CTLA-4. Particularly, CD103+CD4+ T cells were the most dominant producer of TNF-α and IFN-γ [46][47][48][49]. Additionally, cytotoxic T cells could release two key cytolytic effectors, perforin (PRF1) and pro-apoptotic granzymes (GZMA), to destroy cancer cells via the granzyme-perforin pathway [50]. Based on 280 tumor samples, Mandal et al. found that CD56dim natural killer (NK) cells were the most predominant in ltrating cells in advanced head and neck squamous cell carcinoma (HNSCC), and their presence was positively correlated with survival rate in HNSCC patients [51]. Another study found that the population of both in ltrating NK cells and macrophages served as strong predictors for the survival of stage II&III esophageal cancer patients [52]. Previous studies have also suggested that despite the powerful cytotoxic activity of NK cells in response to tumor cells, this activity might be limited by the TME due to interference with NK cell activation pathways and NK cell in ltration [53,54]. In addition, the TME-mediated interference has also been shown to suppress the functions of in ltrating DCs in maintaining balance between CD8+ T cell immunity and tolerance to tumor antigens, which can further promote tumor progression [55]. This TME-mediated suppression has prevented the development of many tumor immunotherapies, making a better understanding of the mechanisms behind it key to developing new treatments.
Although PLK1 was shown to be associated with clinical stage and the tumor immune microenvironment, this study still contains some limitations. First, although qRT-PCR and western blotting were used to verify PLK1 expression in LUAD cell lines, the proposed mechanism of PLK1 regulation of tumor progression via the cell cycleneeds further validation. Second, clinical samples were not applied in the work because of limitationsonexperimental conditions. Third, the correlation analyses of clinical characteristics were not satisfactory due to the unbalanced clinical data in datasets. Finally, the association of PLK1 with TICs and molecular mechanisms were not fully elucidated.

Conclusions
This work represents a signi cant step forward in elucidating the molecular mechanisms behind LUAD progression. PLK1 was revealed as both a potential therapeutic target for LUAD, as well as a novel prognostic factor for LUAD patient survival. PLK1 is strongly correlated with several TICs in the tumor immune microenvironment. Further investigation should be conducted to improve the accuracy of the combined analysis and further illuminate the roles of PLK1 in the TIC community during LUAD progression. The datasets used and analysed during the current study are available from the corresponding author on reasonable request.

Competing interests
The authors declare that they have no competing interests.

Funding
This research did not receive any speci c grant from funding agencies in the public, commercial, or notfor-pro t sectors.
Authors' contributions ZH performed the data analysis, conducted all experimental procedures, and drafted this manuscript. XL edited and revised the manuscript. ZH and XL were in charge of analysis tools. All authors read and approved the nal manuscript.