PLK1 is A Promising Biomarker for the Prognosis of Lung Adenocarcinoma

Background: As one of the commonly occurred lung cancer, over 40% of lung cancer are lung adenocarcinoma (LUAD) in the world. However, due to the lack of effective treatment method, the prognosis of this disease is poor. In this study, we identied a novel treatment target for LUAD. Differentially expressed genes (DEGs) were extracted from 551 LUAD cases in the Cancer Genome Atlas (TCGA) database and used for Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis. Gene set variation analysis (GSVA) and the CIBERSORT method were applied for the estimation of the biological pathways and the tumor-inltrating immune cells (TICs). Protein-protein interaction (PPI) network and Cox regression analyses were conducted for the analysis of DEGs. Quantitative real-time PCR (qPCR) and immunoblotting were used for the validation of DEGs. Results: Genes related to cell cycle were sorted out. Positive association between cell cycle scores and clinical features including age, clinical stages and metastasis was found. Negative association between cell cycle scores and overall survival of LUAD patients was detected. In PPI and Cox analyses, PLK1 was estimated to be a factor for the prognostic predictions of LUAD patients. In LUAD patients, PLK1 was positively associated to the clinical stages and negatively associated to the overall survival. As suggested by CIBERSORT analysis, signicant positive correlations were found between the transcription level of PLK1 and the function of CD8+ and activated memory CD4+ T cell. Meanwhile, signicant negative correlation between the transcription level of PLK1 and activated natural killer cell was also found out. Moreover, PLK1 overexpression caused elevated immune cytotoxicity, including cytolytic activity score, IFN-γ score and IFN-γ level. Conclusions: PLK1 displayed robust correlations with key features of TICs related to the immune-microenvironment of cancer, and therefore was suggested as a promising biomarker for the prognostic prediction of LUAD. LUAD: lung adenocarcinoma; DEGs: Differentially expressed genes; TCGA: The cancer genome atlas; GSVA: Gene set variation analysis; TICs: Tumor-inltrating immune cells; PPI: Protein-protein interaction; TME: Tumor microenvironment; TILs: Tumor-inltrating lymphocytes; NSCLC: Non-small cell lung carcinoma; PLK1: Polo-like Kinase 1; PFS: Progression-free survival; OS: Overall survival.


Introduction
Lung cancer is the most commonly occurred solid tumor worldwide. 1 Among all lung cancer cases, 85% cases are non-small cell lung carcinoma (NSCLC). NSCLC includes adenocarcinoma, squamous cell carcinoma and large cell carcinoma. 2,3 Among them, the incidence of lung adenocarcinoma (LUAD) accounts for 2 third of all NSCLC cases. 4 Despite recent developments in the multiple targeted treatments and new immunotherapies of LUAD, the 5-year prognosis of LUAD patients is still very poor. 5 Such phenomenon is possibly caused by the early disseminations and metastasis of LUAD. Moreover, the mechanism underlying the onset and development of LUAD are unclear. Meanwhile, how the tumorin ltrating immune cells (TICs) interact with LUAD is also poorly understood.
Previous researches report several LUAD progression-correlated genes. Elevated expression of the p53inducible gene 3 (PIG3) is found to associated with incidence of lymph node metastasis in LUAD patients, and demonstrated to contribute to the development of LUAD through the enhancement of FAK/Src/paxillin pathway activation. 6 Enolase 1 (ENO1) is a protein coding gene for glycolysis enzyme related to glucose metabolism. It is also demonstrated to enhance LUAD tumor development. Latest research suggests that circ-ENO1 silencing causes the decrease in glycolysis, inhibits cell proliferation, and therefore promotes apoptosis through suppressing the transcription level of ENO1 in LUAD patients. 7 Also, low expression level of ATM interactor (ATMIN) gene is reported to associated with LUAD.
Heterozygous ATMIN de ciency is found to participate in promoted tumor proliferation and advance cancer grade in animal study of LUAD, suggesting the tumor suppressive effect of ATMIN in LUAD. 8 Moreover, Polo-like Kinase 1 (PLK1) is previously demonstrated to associated with poor prognosis for LUAD patients. 9 However, it is still not clear how transcription level of PLK1 associates with TICs.
Recently, accumulative evidences suggest the pivotal role played by TICs in regulating the malignance and impacting the treatment outcome of various types of cancer through the regulation of tumor microenvironment (TME). [10][11][12] Speci cally, features immune cells are found to be reliable indicators for the treatment outcomes of patients. [13][14][15] Tumor-in ltrating T cells are found in over half of patients with ovarian cancer. The present of tumor-in ltrating T cells is also found to signi cantly associated with better 5-year prognosis in ovarian cancer. 16 Clinical follow-up study of triple-negative breast cancer (TNBC) indicates that better prognosis is associated with high CD8+ T cell number (over 14%). 17 High tumor-in ltrating lymphocytes (TILs) level is also found to associate with low histologic grades of tumor. 18 In LUAD, patients with high TILs are also found to have better prognostic predictions in both progression-free survival (PFS) and overall survival (OS). 19 In this study, we aimed to demonstrate the differences of TICs between LUAD and healthy lung, con rm the correlation between hub genes and TICs, and unveiling the possible mechanism underlying the onset and development of LUAD. Work ow details of this study was listed as in the Figure 1.

DEGs extraction
Data of RNA-sequencing (HTSeq-FPKM) of LUAD patients were obtained from Cancer Genome Atlas (TCGA, https://cancergenome.nih.gov/). The dataset included 497 LUAD lung tissue samples and 54 normal samples from healthy controls. The GSE31210 gene transcription pro le was obtained from Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/), containing 226 LUAD samples and 20 healthy controls. The R packages "edgeR" were used for the identi cation of DEGs. Normalization of GSE31210 matrix data were carried out by "limma". Cutoff value for DEGs was set as adjusted p-value<0.05 with |log fold change|>2 in this study.

PPI network and Cox regression analysis
Protein-protein interaction (PPI) network analysis was carried out by Search Tool for the Retrieval of Interacting Genes database (STRING, https://string-db.org/) (cutoff value: combined score≥0.4) and illustrated by Cytoscape (version 3.6.0, https://cytoscape.org/). The R package "survival" was applied for Cox regression analysis (p-value<0.001).

Survival and correlation analyses
The R packages "survival" and "survminer" were applied for estimating OS and PFS of patients with LUAD. The Kaplan-Meier method was used the generation of survival curves (signi cance was set as p-value<0.05). Subsequently, Kaplan-Meier Plotter (https://kmplot.com/analysis/) and Gene Expression Pro ling Interactive Analysis (GEPIA) (http://gepia.cancer-pku.cn/) were conducted for survival analysis. The Pearson coe cient was applied for correlation analysis using SPSS 22.0.

Gene expression con rmation
Statistical analysis of Hub genes was carried out by TCGA dataset and GSE31210. Difference in hug gene expression pattern between LUAD and healthy controls was carried out by meta-analysis on Oncomine database website (https://www.oncomine.org). Illustration of the protein levels were carried out based on the immunohistochemical result of gene extracted from Human Protein Atlas (HPA, https://www.proteinatlas.org). Finally, both mRNA and protein level of targeted genes were con rmed using quantitative real-time polymerase chain reaction (qRT-PCR) and immunoblotting.

TME immune activities assessments
Estimation of the portions of TICs in LUAD was carried out by CIBERSORT. Heatmap of TICs correlation was generated by R package "corrplot". Transcript Per Million (TPM) was adapted from the FPKM values of TCGA dataset. The immune cytolytic activity score (CAS) was evaluated according to the average values of GZMA and PRF1 level in TPM. Score of interferon-gamma (IFN-γ) was evaluated according to the average of CD8A, GZMA, GZMB, IFN-γ, EOMES, CXCL9, CXCL10 and TBX21 level in FPKM.

Statistical analysis
All statistical analyses were carried out by R software (version 3.6.0) and SPSS (version 22.0). Pearson's correlation test was used for correlation analysis. Kaplan-Meier analysis was used for the comparison of statistical differences using log-rank test. Analysis of qRT-PCR and immunoblotting were carried out by Analysis of Variance (ANOVA). Each experiment was carried out in triplicates. Statistical signi cance was set as p-value < 0.05.

Up-regulated signaling in LUAD
GSVA was conducted for the exploration of KEGG pathways according to data extracted from TCGA dataset. As compared to healthy controls, 48 KEGG signaling pathways (19 down-regulated and 29 upregulated) were differentially expressed in LUAD ( Figure 2A and Supplementary Table S1). The pattern of the results was illustrated as heatmap ( Figure 2B). KEGG enrichment analysis of a total number of 1419 DEGs (310 down-regulated and 1,109 up-regulated) extracted from the TCGA dataset ( Figure 2C) included cell cycle, protein digestion and absorption, ECM-receptor interaction, complement, coagulation cascades, and p53 signaling pathway ( Figure 2D, supplementary Table S2). Change in signaling pathway related to cell cycle was commonly found in GSVA and KEGG enrichment studies ( Figure 2E).

Cell cycle correlated with OS and clinical features of LUAD
TCGA dataset was studied for the evaluation of the potential usage of genes related to cell cycle as indicators for the prognostic prediction of LUAD. As suggested by the results, higher cell cycle score was found signi cantly related to LUAD patients (Supplementary Figure S1). Also, higher cell cycle scores were found in male patients (p=0.0036) and patients with N1-3 (p=0.0007) and M1 (p=0.0186). However, no signi cant difference of cell cycle score was found related to age or T classi cation subgroups ( Figures 3A-E). Also, cell cycle scores were found to positively correlated with tumor stages of LUAD patients ( Figures 3F, G). We subsequently divided the LUAD patients by thier high or low cell cycle score, and found that the patients with high cell cycle score demonstrated signi cant association with poor OS and PFS (Figures 3H, I).

Combined analysis of PPI network and univariate Cox regression analysis
A total number of 23 DEGs related to signaling pathways of cell cycle were listed as in Table 1. PPI network analysis was carried out using those DEGs, and generated 23 nodes and 240 edges ( Figure 4A). The connectivity degrees of genes were estimated and illustrated in the histogram ( Figure 4B). In the results, DEGs including BUB1B, CDC6, CDK1, MCM2, CDC20, CCNA2, CCNB1, CCNB2, CDC45, PLK1, CHEK1 and CCNE1 were found to have the highest degrees of connectivity. Also, univariate Cox analysis identi ed that CCNA2, PTTG1, CDC25C, CCNB1 and PLK1 were found to correlated with the OS of LUAD patients ( Figure 4C). CCNA2, CCNB1 and PLK1 were commonly found in the results of PPI network (12 genes) univariate Cox analysis (5 genes) ( Figure 4D). 3.4 High PLK1 in LUAD PLK1 was found signi cantly highly expressed in LUAD patients compared to that in healthy controls ( Figures 5A, B). Meta-analysis suggested elevated PLK1 level associated with LUAD in 5 datasets (Beer Lung, 20 Garber Lung, 21 Hou Lung, 22 Stearman Lung, 23 and Su Lung, 24 Figure 5C). Also, revealed by the tmmunohistochemical images from HPA, PLK1 level was found elevated in tissue samples collected from LUAD patients compared to those in healthy controls ( Figure 5D). In the analysis of GSE31210 pro le, PLK1 expressive level was also suggested to be signi cantly higher in LUAD samples than in healthy controls ( Figure 5E). Moreover, as validated by qRT-PCR and immunoblotting analysis, the transcription and protein level of PLK1 were found increased in H1975 and SPC-A-1 cell than BEAS cells (Figures 5F, G).

PLK1 for prognosis of LUAD
Survival analysis and Cox regression analysis were carried out for the evaluation PLK1 as prognostic marker. We found that high PLK1 expression signi cantly indicated poor OS of LUAD patients ( Figures  6A-C). The LUAD patients with high PLK1 expression were negatively associated with PFS ( Figures 6D-F).
Meanwhile, the results of Cox analysis suggested that PLK1 could be used as independent prognostic prediction marker for patients with LUAD ( Table 2

PLK1 Associates with immune responses of TME
For the investigation of the association of PLK1 level in the immune-microenvironment of LUAD, TCGA dataset were analyzed focusing on TICs, CAS and IFN-γ score. Using CIBERSORT algorithm, the fraction of 21 subtypes of in ltrating immune cells was estimated as illustrated in the Supplementary Figure S4. TIC pro les were visualized by histogram, meanwhile the correlations between TICs were demonstrated in the heatmap. Signi cantly difference was found in the proportion scores of the 12 subtypes of TICs between high and low PLK1 level groups ( Figure 8A). Also, as revealed by the Pearson correlation analysis, 13 subtypes of TICs were signi cantly related to PLK1 level ( Figure 8B). The 12 subtypes of TICs were commonly identi ed ( Figure 8C and Supplementary Table S3). The generated results of changed TICs were identi ed as the PLK1-associated TICs. In the results, the expressive levels of CAS and IFN-γ, scores of IFN-γ were increased in LUAD patients ( Figures 9A-C). Moreover, the expressive levels of CAS and IFN-γ, scores of IFN-γ were found higher in LUAD patients with high PLK1 level ( Figure 9D-F 4 Discussions LUAD was one of the cancers with the highest mortalities. 25 In clinic, surgeries were the most widely used method for LUAD treatment, which was is mainly facilitated by early diagnosis. Combined use of surgery and chemotherapy treatments increased 5-10% survival rates. 26 However, the survival rates for LUAD patients were unsatisfactory. Poorly understood mechanisms related to the onset and development of LUAD also signi cantly limited the discoveries of novel effective treatment method for LUAD patients. In this study, we found that PLK1 was a hug gene related to cell proliferation which was positively correlated with the clinical stages and cytotoxic activities in the TME. transcription and protein level with poor prognosis and treatment outcomes in gastric tumor. 31 High PLK1 level was also found strongly correlated with ovarian clear cell carcinoma (OCCC). PLK1 inhibiting promoted the sensitivities of OCCC cell to the cisplatin treatments through the enhancement of autophagy and apoptosis in vitro. 32 High expressive level of PLK1 was found to associate with TNBC according to a clinical study containing 3,173 samples. Overexpression of PLK1 signi cantly correlated to TICs and poor prognostic prediction of the patients. 29 Gene set enrichment study also found that genes related to cell cycle and MYC targets were signi cantly enriched in TNBC patients with high PLK1 expressive level. In in vitro study, PLK1 was demonstrated to promote the signaling of transforming growth factor (TGF)-β and therefore increased the invasiveness of cancer cells. This result was consistent with the clinical ndings that the expression PLK1 robustly predicted the survival of patients with metastatic NSCLC. 33 Moreover, PLK1 was also suggested as a potentially effective treatment target of various types of cancer due to the suppressive effects on cell growth and promoting effects on apoptosis of tumor cells via the inhibition of PLK1 expression. [34][35][36] TICs was found to contribute to the regulation of the immune-microenvironment of tumor, and therefore impacted on the development of tumor and the treatment outcomes of patient with cancers. [37][38][39] High tumor-in ltrating CD4+ and CD8+ T cell levels in pancreatic cancer strongly indicated longer survival time. 40 Two subsets of tumor-in ltrating CD4+ and CD8+ T cells were found in NSCLC respectively expressing CD69 and CD103, both of which were markers of resident memory T cells. 41 Other than functioning as memory cells ready for quick responses to secondary infections, CD103+ T cells was also demonstrated to target tumor cells in previous studies. 42,43 Programmed cell death-1 (PD-1), CD244 and cytotoxic T-lymphocyte Antigen-4 (CTLA-4) were majorly expressed on both CD8+ and CD4+ T cells. Meanwhile, both tumor necrosis factor alpha (TNF-α) and IFN-γ were predominantly produced by CD103+ CD4+ T cells. [44][45][46][47] Also, cytotoxic T cells were able to produce cytolytic effectors including perforin (PRF1) and pro-apoptotic granzymes (GZMA), and participated the immune reactions against tumor cells through the action of granzyme-perforin pathway. 48 Previous study also demonstrated that the dominant cytotoxic effects of CD56dim natural killer (NK) cells against tumor cells in advanced head and neck squamous cell carcinoma (HNSCC). In HNSCC patients, CD56dim NK cells were found to robustly correlated to promising prognostic predictions. 49 Moreover, both in ltrating NK cells and macrophages were found to indicate good survival rates in patients with stage II and III esophageal cancer. 50 Most importantly, TME was demonstrated to produce strong impact on the cytotoxic effects of NK cells against tumor cells possibly through its interaction with various signaling pathways related to the activation and in ltration of NK cells. 51,52 Moreover, the immune-regulatory effects mediated by TME was found in the suppression of in ltrating DCs in maintaining the homeostasis of CD8+ T cell immuno-reaction and tumor antigen tolerance, leading to the promotion of tumor growth. 53 For this reason, the immuneregulatory effect of TME and the molecular mechanisms underlying this effect were proposed as a pivotal factor in the development of new immunotherapy approaches of various cancers.
In this study, we found that the expressive level of PLK1 was correlated to the clinical stages and the regulations of immune microenvironment of tumor cells. However, our ndings were limited due to the following reasons. LUAD cell lines were used for the qRT-PCR and immunoblotting veri cations of PLK1 level. Therefore, further evidences in in vivo and clinical studies were necessary for the validations of the mechanisms suggested by this study. Also, lack of veri cations in clinical samples in this study limited the reliability of our ndings. Moreover, the correlation studies of clinical features were unsatisfactory caused by the unbalanced clinical data obtained from the available datasets. Finally, in our results, the underlying mechanisms of the association between PLK1 and TICs were not fully understood.

Conclusions
Our ndings provided evidences for a further understanding of the onset and development of LUAD. As suggested by our results, PLK1 level could potentially be used as both new treatment target and novel prognosis biomarkers in LUAD patients. PLK1 was found to robustly correlated with the function of TICs in TME of tumor cells. Further studies were needed for validations and mechanism investigations of the role played by PLK1 in the regulation of TICs in LUAD. Declarations Figure 1 Flow chart for the PLK1 as a prognostic factor for lung adenocarcinoma (LUAD).   Validation of PLK1 expression. PLK1 expression is signi cantly upregulated in LUAD samples compared to normal lung samples in both unpaired (A) and paired (B) difference analyses based on the TCGA dataset. PLK1 expression is signi cantly increased in LUAD samples compared with normal samples in a meta-analysis (C), immunohistochemical analysis (D) and GSE31210 pro le (E). Gene expression level (F) and protein expression level (G) of PLK1 were signi cantly increased in H1975 and SPC-A-1 cell lines compared with BEAS cells. n = 3. Data represent mean ± SEM, *p < 0.05, **p < 0.01, ***p < 0.001 vs. BEAS.