Comprehensive Analysis of The Expression and Prognosis For KRTs in LUSC Based on Bioinformatics Analysis

Keratins (KRTs) are a family of genes encoding intermediate filament proteins that are expressed in epithelial cells of various tissues and have been implicated in various tumors. Previous evidence indicated that KRTs are involved in the tumorigenesis and development of lung cancer. However, a comprehensive analysis of KRTs in lung squamous cell carcinoma (LUSC) and their roles in the tumorigenesis and progression of this disease is lacking. Therefore, we investigated the transcriptional level, proteomic level, and prognostic value of KRTs in patients with LUSC using the TCGA, Gene Expression Profiling Interactive Analysis (GEPIA), Human Protein Atlas (HPA), Kaplan-Meier Plotter, and cBioPortal databases. We confirmed that the expression levels of KRT5, KRT6A, KRT6B, KRT6C, KRT13, KRT14, KRT15, KRT16, KRT17, KRT19, and KRT23 were higher in LUSC tissues than in adjacent normal lung tissues. Survival analyses using the Kaplan-Meier Plotter database and the TCGA database revealed that high expression levels of KRT5, KRT6A, KRT6B, KRT6C, KRT13, KRT14, KRT15, KRT16, KRT17, KRT19, and KRT23 were associated with poor prognosis in patients with LUSC. ROC curves indicated that KRTs had limited ability to predict the overall survival (OS) of LUSC patients. KRTs and their 50 most frequently altered neighbor genes were mainly enriched in epidermis development, keratinization, keratinocyte differentiation, the Rap1 signaling pathway, and transcriptional misregulation in cancer. Analyses of clinical data indicated that the expression levels of KRT13, KRT16, KRT17, and KRT19 were positively correlated with nodal metastasis, the expression levels of KRT15 and KRT17 were positively correlated with primary tumor, and high expression of KRT15 was positively correlated with TNM stage. Our study may provide a new theoretical basis and research direction for the discovery of potential therapeutic targets and new prognostic biomarkers in patients with LUSC.
In addition, we downloaded the survival data of patients with LUSC from the TCGA database, combined the survival data with the gene expression data, and divided the patients into a high-expression group and a low-expression group to study the effect of KRTs on prognosis. The results showed that low expression levels of KRT5 (P = 0.00493), KRT6A (P = 0.00163), KRT6B (P = 2e-05), KRT6C (P = 0.00026), KRT13 (P = 0.01032), KRT14 (P = 0.00121), KRT15 (P = 0.00074), KRT16 (P = 0.02083), KRT17 (P = 9e-04), KRT19 (P = 0.00027), and KRT23 (P = 0.00111) predicted significantly better OS in LUSC patients (Fig. 4C), which was consistent with the results from the Kaplan-Meier Plotter database. The heterogeneity of KRT5, KRT6A, KRT6B, KRT6C, KRT13, KRT14, KRT15, KRT16, KRT17, KRT19, and KRT23 in LUSC might play an important role in LUSC oncogenesis and progression, and these genes could also serve as molecular markers to identify high-risk groups.
ROC of the KRTs in patients with LUSC. We obtained gene expression data and clinical data from the TCGA database. ROC curve analysis was performed in RStudio using the 'pROC' package. The results indicated that the KRTs differed in specificity and sensitivity for predicting the prognosis of LUSC patients (Fig. 5). The area under the receiver operating characteristic curve (AUC) of KRT5 was 60.753%, the largest among all KRTs. The sensitivity of KRT13 (74.194%) and the specificity of KRT23 (86.31%) were the highest among all KRTs.
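As a minimal illustration of the ROC quantities reported above (AUC, sensitivity, and specificity), the following Python sketch computes them for a single marker. The study itself used the 'pROC' package in R. The function names, the toy expression values and outcomes, and the use of the Youden index to choose the operating point are assumptions made here for illustration and are not taken from the original analysis.

```python
def roc_points(scores, labels):
    """Return (threshold, sensitivity, specificity) for each distinct score.

    A sample is called positive when its score is >= threshold.
    labels: 1 = event (e.g. death), 0 = no event.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for thr in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < thr and y == 0)
        points.append((thr, tp / n_pos, tn / n_neg))
    return points


def auc(scores, labels):
    """AUC as the probability that a random positive sample scores higher
    than a random negative sample (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_by_youden(points):
    """Pick the threshold maximizing sensitivity + specificity - 1
    (an assumed criterion; the paper does not state which one was used)."""
    return max(points, key=lambda p: p[1] + p[2] - 1)


# Toy data (hypothetical, NOT from TCGA): marker expression and outcome.
expression = [2.1, 3.5, 1.2, 4.8, 3.9, 0.7, 2.8, 4.1]
died = [0, 1, 0, 1, 0, 0, 1, 1]

thr, sens, spec = best_by_youden(roc_points(expression, died))
print(f"AUC = {100 * auc(expression, died):.3f}%")
print(f"threshold = {thr}, sensitivity = {100 * sens:.3f}%, "
      f"specificity = {100 * spec:.3f}%")
```

Reporting the values as percentages mirrors the way the AUC, sensitivity, and specificity are quoted in the text.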
Potential molecular mechanism of the KRTs in patients with LUSC. We used the cBioPortal online tool to analyze alterations of KRTs in the TCGA LUSC cohort. Mutation, which can change protein function by changing the gene sequence, was the most common type of gene alteration in KRTs (Fig. 6A). In the current study, 13% (59/469) of cases had genetic alterations, including in-frame mutation, missense mutation, splice mutation, truncating mutation, amplification, and deep deletion (Fig. 6B). KRT23 (2.8%) was the most frequently altered gene among the KRTs, with missense mutation, splice mutation, amplification, and deep deletion.
The co-expression of KRTs in patients with LUSC. The TCGA database was used to obtain the expression data of KRTs, and RStudio software was used to calculate the pairwise Pearson's correlation between them. The heatmap is shown in Fig. 7.
GO function and KEGG pathway analyses for KRTs and their related genes. We constructed the network for KRTs and their 50 most frequently altered neighbor genes, and found that there were 262 nodes and 1,121 edges in this network (Fig. 8A). The functions of the 262 genes are shown in Table 2. GO enrichment analysis predicted the functional roles of the 262 genes on the basis of three aspects: biological process (BP) terms, cell component (CC) terms, and molecular function (MF) terms. We found that the 262 genes were mainly enriched in epidermis development, keratinization, and keratinocyte differentiation in BP terms (Fig. 8B), in cornified envelope, extracellular exosome, and desmosome in CC terms (Fig. 8B), and in structural molecule activity, structural constituent of cytoskeleton, and cadherin binding involved in cell-cell adhesion in MF terms (Fig. 8B). KEGG pathway analysis demonstrated that the 262 genes were enriched in the P53 signaling pathway, drug metabolism-cytochrome P450, the Rap1 signaling pathway, and transcriptional misregulation in cancer (Fig. 8C).
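The co-expression step described above, in which pairwise correlations between KRT expression profiles are calculated and displayed as a heatmap, can be sketched as follows. This is a minimal Python sketch under stated assumptions: the study performed the calculation in RStudio, and the function names and expression values below are hypothetical toy inputs rather than TCGA data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def correlation_matrix(expr):
    """Map each gene pair to its Pearson r; expr is {gene: [values per sample]}."""
    genes = list(expr)
    return {g: {h: pearson(expr[g], expr[h]) for h in genes} for g in genes}


# Toy expression values across five samples (hypothetical, for illustration only).
toy_expression = {
    "KRT5": [1.0, 2.0, 3.0, 4.0, 5.0],
    "KRT6A": [2.0, 4.0, 6.0, 8.0, 10.0],  # rises with KRT5 in this toy set
    "KRT23": [5.0, 4.0, 3.0, 2.0, 1.0],   # falls as KRT5 rises in this toy set
}

matrix = correlation_matrix(toy_expression)
for gene, row in matrix.items():
    print(gene, {other: round(r, 3) for other, r in row.items()})
```

The resulting symmetric matrix is the input that a heatmap such as Fig. 7 would visualize.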

Discussion
Numerous studies have reported that the expression levels of KRTs are dysregulated in many tumors 16-20, and the roles of KRTs in the occurrence, progression, and metastasis of multiple cancers have been partially confirmed 36-39. However, systematic bioinformatics analyses of the function of KRTs in LUSC have yet to be performed. This is the first study to systematically examine the expression of KRTs in LUSC and their prognostic value. We hope our findings will help to provide new therapeutic measures and improve the prognosis of patients with LUSC.
The KRT family is a group of intermediate filament proteins that are expressed in various types of epithelial cells 40,41. Previous studies have confirmed that KRTs play significant roles in maintaining the structural stability of epithelial cells and are involved in numerous biological activities, including epithelial cell signal transduction, stress response, tumor cell apoptosis, and tumor cell proliferation 42,43. Moreover, Nazarian et al. 44 reported that KRTs were dysregulated in a variety of tumor tissues and played vital roles in the invasion and metastasis of tumors. In summary, these proteins have great potential in the diagnosis, typing, and prognosis prediction of tumors. Therefore, the roles of KRTs as multifunctional regulators of epithelial tumorigenesis are worthy of further exploration. We used TCGA database to
KRT5 is usually specifically expressed in the basal layer of the epidermis, and its normal expression is essential for the protection of epithelial cells 45. Moll et al. 19 reported that KRT5 was specifically expressed in squamous cell carcinoma of multiple tissues. In lung cancer, the expression level of KRT5 in LUSC tissues was significantly higher than in adjacent normal lung tissues 20,46,47, which was helpful for the differential diagnosis of LUSC and LUAD 47. In our study, the TCGA database revealed that the expression level of KRT5 was higher in LUSC tissues than in adjacent normal lung tissues. By analyzing the Kaplan-Meier Plotter database and the TCGA database, we determined the prognostic value of KRT5 in patients with LUSC, and the results indicated that high KRT5 expression was significantly associated with poor OS and PFS in patients with LUSC followed up for 200 months. In addition, the AUC of KRT5 was 60.753%, indicating some predictive value for the prognosis of LUSC.
In our report, we demonstrated that the expression level of KRT13 was higher in LUSC tissues than in normal lung tissues, that high KRT13 expression was correlated with poor prognosis in patients with LUSC, and that the sensitivity of KRT13 was the highest among the KRTs, which seemed consistent with the role of KRT13 as an oncogene and a diagnostic factor in LUSC.
KRT14 is an intermediate filament protein that is usually expressed by basal epithelial progenitor cells located in epithelial niches of healthy adult tissues. A previous study indicated that KRT14 was not detected in normal tissues, was significantly increased in many tumor tissues, and was expressed in squamous cell carcinoma of different origins and degrees of differentiation 70. According to IHC analysis of surgical specimens, the expression level of KRT14 was significantly increased in lung cancer tissues 71, especially in LUSC tissues 46,47, which indicated that KRT14 played an important role in the occurrence and development of LUSC. Chen et al. 47 found that high KRT14 expression could be helpful in the differential diagnosis of LUAD and LUSC. In addition, Huang et al. 72 reported that KRT14 was not expressed in normal cervical tissue, but that its expression level gradually increased with the grade of cervical epithelial neoplasia, which indicated that KRT14 could be used as one of the indicators for the diagnosis of early cervical cancer. In our study, we demonstrated that the expression level of KRT14 was higher in LUSC tissues than in cancer-free lung tissues, and that high KRT14 expression was related to poor OS and PFS in patients with LUSC.
KRT15 is a type I keratin that is mainly expressed in basal keratinocytes of stratified epithelium and plays a significant role in maintaining cytoplasmic stability 73,74. Previous studies confirmed that the overexpression of KRT15 was involved in the formation and progression of tumors, including NSCLC 75,76, breast cancer 77,78, oral squamous neoplasms 79, urothelial cell carcinomas 80, and hepatocellular carcinoma 81. Sanchez-Palencia et al. 82 reported that the KRT15 expression level was higher in LUSC tissues than in cancer-free lung tissues and that KRT15 may be a marker of LUSC. In addition, high KRT15 expression was associated with poor prognosis in colorectal cancer 83 and gastric cancer 84. In the present study, KRT15 was significantly overexpressed in LUSC tissues, and its high expression level was related to poor prognosis in patients with LUSC.
KRT16 is an important type I cytoskeletal keratin. KRT16 was reported to play a role in LUAD tumorigenicity via EMT, and increased KRT16 expression was associated with poor outcomes in tumors including LUAD 85, oral squamous cell carcinoma 36, and metastatic breast cancer 86, which indicated that KRT16 had an oncogenic role in tumors. In our report, we demonstrated that the expression level of KRT16 was higher in LUSC tissues, which was significantly related to poor OS and PFS in patients with LUSC.
KRT17 is a 48 kDa type I keratin 20. Extensive tissue screening showed that KRT17 was expressed at low levels in mature epithelial tissues but was re-expressed at high levels in cancer tissues 87, and high expression of KRT17 indicated poor prognosis in NSCLC 88, which indicated that KRT17 can be a biomarker for predicting progression and poor prognosis in patients with LUSC 87,88. Chen et al. 47 reported that KRT17 was highly expressed in LUSC, at a significantly higher level than in LUAD, suggesting that KRT17 may be a tumor marker for differentiating the LUSC and LUAD subtypes of lung cancer. In addition, KRT17 was also highly expressed in breast cancer 89,90, cutaneous squamous cell carcinoma 91, cervical carcinoma 92, gastric carcinoma 93, and oral sprays 94, which was associated with poor prognosis in these tumors 89-94. In this report, we found that KRT17 was highly expressed in LUSC tissues, which was correlated with poor OS and PFS in patients with LUSC.
KRT19 is a member of the keratin family, which is functionally related to maintaining the structural integrity of epithelial cells, including bronchial epithelial cells 95
... 105, and hepatocellular carcinoma 106. In our report, we demonstrated that the expression level of KRT23 was higher in LUSC tissues than in normal lung tissues, and that high KRT23 expression was significantly correlated with poor prognosis in patients with LUSC. ROC curve analysis found that the specificity of KRT23 was 86.31%, indicating some predictive value for the prognosis of LUSC.
In our study, the analysis of co-expression found that KRTs were co-expressed to different degrees in LUSC, which was consistent with the finding of Travis, who found that KRT5
GO function and KEGG pathway analyses found that the KRTs and their neighbor genes were mainly enriched in epidermis development, keratinization, keratinocyte differentiation, and the Rap1 signaling pathway. This was consistent with the knowledge that cytokeratin is abundant and stable in epithelial cells and has high tissue and cell specificity, so it could be used as an ideal biological indicator to identify the origin of a tumor and predict prognosis 110. Some studies have reported that keratinization of tumors was closely related to poor prognosis in tumor patients 111,112, suggesting the importance of keratinization in neoplastic prognosis 111.

Declarations

Data Availability Statement
High-throughput gene expression data of LUSC tissues and normal lung tissues were extracted from The Cancer Genome Atlas (TCGA) data portal (https://tcga-data.nci.nih.gov/tcga). These RNA-seq data (HTSeq-count) from the Illumina HiSeq RNASeq platform consisted of 502 LUSC samples and 49 adjacent non-cancerous lung samples, and were obtained from the publicly available Genomic Data Commons (GDC) data portal (https://portal.gdc.cancer.gov/).

Author Contribution Statement
Qiangqiang Zheng designed the study, analyzed the data, and wrote the manuscript. All authors reviewed the manuscript.

Conflict of Interest Statement
The authors declare no competing financial or non-financial interests.

Ethics Statement
Not applicable.

The IHC-based protein expression of KRTs in LUSC tissues and normal lung tissues. All the IHC staining images were obtained from the HPA database.