Common targets of CEP and LUSC
We obtained the 3D structure of CEP from the PubChem database (Figure 1a) and predicted 231 and 121 CEP-related target proteins with the PharmMapper and SwissTargetPrediction databases, respectively. After removing duplicate targets, a total of 322 CEP-related target genes were collected (Table S1). A further 7947 LUSC-related target genes were retrieved from the GeneCards database (Table S2). In addition, RNA-seq data for 501 LUSC samples and 49 tumor-adjacent samples from the TCGA database, together with sequencing data for 578 normal lung tissue samples from the GTEx database, were downloaded, and differential expression analysis identified 2194 genes up-regulated in tumor tissues (Figure 1b). Finally, intersecting the three gene sets yielded 41 potential targets through which CEP may regulate LUSC progression (Figure 1c), and a drug-target-disease network was drawn (Figure 1d).
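The set operations behind this screening step can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline actually used: the function names and the toy gene lists are hypothetical, and matching gene symbols case-insensitively is an assumed normalization.

```python
def merge_targets(*target_lists):
    """Pool predicted targets from several tools and drop duplicates.

    Assumption: symbols are compared case-insensitively after trimming.
    """
    merged = set()
    for targets in target_lists:
        merged.update(symbol.strip().upper() for symbol in targets)
    return merged


def common_targets(drug_targets, disease_genes, upregulated_genes):
    """Return genes present in all three sets, sorted alphabetically."""
    return sorted(set(drug_targets) & set(disease_genes) & set(upregulated_genes))


# Toy placeholder lists (hypothetical, not the data in Tables S1 and S2).
pharmmapper = ["AURKA", "CDK1", "MMP9", "ALB"]
swisstarget = ["CDK1", "PLK1", "cdk1", "EGFR"]
genecards = ["AURKA", "CDK1", "PLK1", "EGFR", "TP53"]
upregulated = ["AURKA", "CDK1", "PLK1", "MMP9"]

cep_targets = merge_targets(pharmmapper, swisstarget)
shared = common_targets(cep_targets, genecards, upregulated)
print(len(cep_targets), shared)
```

With the real lists, the merged set would correspond to the 322 CEP-related genes and the three-way intersection to the 41 candidate targets.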
Interaction and enrichment analysis of target proteins
To study the relationship between CEP and LUSC, a protein-protein interaction (PPI) network with 41 nodes and 130 edges was constructed using the STRING database (Figure 2a). To further define the key target proteins in the PPI network, the MCC, Degree, DMNC, MNC and Closeness algorithms in the CytoHubba plug-in were used to calculate node topology parameters (Table S3), and the top 10 targets from each algorithm were intersected. This yielded six key target proteins: AURKA, CCNA2, CCNE1, CDK1, CHEK1 and PLK1 (Figure 2b). We also conducted GO and KEGG functional enrichment analysis of these targets using the R package “clusterProfiler”. GO enrichment analysis comprised three categories: biological process (BP), cellular component (CC) and molecular function (MF). The enriched biological processes mainly included negative regulation of apoptotic process, collagen catabolic process and extracellular matrix disassembly, among others. The enriched cellular components mainly included cytosol, extracellular exosome and cyclin-dependent protein kinase holoenzyme complex, among others. The enriched molecular functions mainly included protein serine/threonine/tyrosine kinase activity, protein kinase activity and endopeptidase activity, among others (Figure 2c). The enriched KEGG pathways mainly included Pathways in cancer, IL-17 signaling pathway, Progesterone-mediated oocyte maturation, Cell cycle, Cellular senescence and p53 signaling pathway, among others (Figure 2d).
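The consensus step used to select key targets (taking the top-ranked nodes under each topology algorithm and keeping those common to all) can be sketched as below. The scores are hypothetical placeholders rather than the values in Table S3, the function names are illustrative, and breaking ties alphabetically is an assumption that may differ from how CytoHubba orders tied nodes.

```python
def top_n(scores, n=10):
    """Return the n highest-scoring genes for one algorithm.

    Assumption: ties are broken alphabetically by gene symbol.
    """
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return {gene for gene, _ in ranked[:n]}


def consensus_hubs(scores_by_method, n=10):
    """Intersect the top-n gene sets across all scoring algorithms."""
    top_sets = [top_n(scores, n) for scores in scores_by_method.values()]
    return sorted(set.intersection(*top_sets))


# Hypothetical toy scores for three of the five algorithms, with n=2.
scores_by_method = {
    "MCC": {"CDK1": 9.0, "PLK1": 8.0, "MMP9": 2.0},
    "Degree": {"CDK1": 12, "PLK1": 10, "MMP9": 11},
    "Closeness": {"CDK1": 0.9, "PLK1": 0.8, "MMP9": 0.5},
}

hubs = consensus_hubs(scores_by_method, n=2)
print(hubs)
```

Requiring agreement across several centrality measures is a conservative choice: a gene that ranks highly under only one measure is excluded.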
Molecular docking
After identifying the six key target proteins, we verified the interaction between CEP and each of them by molecular docking. As shown in Table 1, the lowest binding energies of CEP with AURKA, CCNA2, CCNE1, CDK1, CHEK1 and PLK1 were -9.1 kcal/mol, -8.4 kcal/mol, -9.0 kcal/mol, -9.2 kcal/mol, -7.6 kcal/mol and -8.5 kcal/mol, respectively, all of which indicate strong binding affinity. As shown in Figure 3, CEP binds 1mq4 (AURKA) mainly through hydrophobic interactions with LYS-143, PHE-144 and GLU-260, hydrogen bonds with LYS-143, LYS-162, LYS-258 and TRP-277, and a salt bridge with GLU-260. CEP binds 1fin (CCNA2) mainly through hydrophobic interactions with ILE-182, GLN-313 and THR-316, a hydrogen bond with ASN-173, and a salt bridge with GLU-268. CEP binds 1w98 (CCNE1) mainly through a hydrophobic interaction with GLN-240 and a hydrogen bond with ASN-236. CEP binds 4y72 (CDK1) mainly through hydrophobic interactions with VAL-227, ILE-269, TYR-71 and LYS-274 and a hydrogen bond with TYR-270. CEP binds 1ia8 (CHEK1) mainly through hydrophobic interactions with GLU-33 and ALA-34, hydrogen bonds with TYR-71 and TYR-86, and salt bridges with ASP-139 and GLU-140. CEP binds 1q4o (PLK1) mainly through hydrophobic interactions with LYS-420, ASP-438 and LYS-474, a hydrogen bond with ARG-456, and a salt bridge with ASP-438 (Table S3).
Expression levels of these targets
To further explore the mechanism of CEP in LUSC, we performed bioinformatics analysis of the six key targets. First, analysis of TCGA transcriptional data showed that all six were significantly up-regulated in LUSC tissue compared with normal lung tissue (all P < 0.05) (Figure 4a). Analysis of cell line expression data from the CCLE dataset showed that the six genes were also expressed at varying levels across LUSC cell lines (Figure 4b). In addition, data from the HPA database showed that the protein expression levels of AURKA, CCNA2, CCNE1, CDK1 and PLK1 were markedly higher in LUSC tissues than in normal lung tissues (Figure 4c-g). CHEK1 protein expression is not recorded in the HPA database, but in Grabauskiene's study CHEK1 protein was up-regulated in LUSC[28]. In the survival analysis, high expression of CCNA2 and CHEK1 was associated with shorter overall survival in LUSC patients (P < 0.05), whereas the expression levels of the other four genes were not significantly associated with overall survival (P > 0.05) (Figure 5).
Evaluation of stemness index
To evaluate the possible effect of CEP on cancer stemness, we divided the TCGA LUSC samples into high-expression and low-expression groups according to the median mRNA expression of each target gene. The OCLR machine learning algorithm was used to calculate the mRNA stemness index (mRNAsi), which was then compared between the two groups. For each of AURKA, CCNA2, CCNE1, CDK1, CHEK1 and PLK1, mRNAsi was significantly higher in the high-expression samples than in the low-expression samples (Figure 6).
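The median-based grouping described above can be sketched as follows. The sample identifiers and values are hypothetical, and assigning samples exactly at the median to the low-expression group is an assumption, since the handling of ties is not stated. No statistical test is included because the test used for Figure 6 is not specified here.

```python
from statistics import median


def split_by_median(expression):
    """Split sample IDs into high and low groups around the median expression.

    Assumption: samples equal to the median fall into the low group.
    """
    cutoff = median(expression.values())
    high = [sample for sample, value in expression.items() if value > cutoff]
    low = [sample for sample, value in expression.items() if value <= cutoff]
    return high, low


def group_median(values_by_sample, samples):
    """Median of a per-sample score (e.g. mRNAsi) within one group."""
    return median(values_by_sample[sample] for sample in samples)


# Hypothetical toy data: expression of one target gene and mRNAsi per sample.
expression = {"S1": 1.0, "S2": 2.0, "S3": 5.0, "S4": 8.0}
mrnasi = {"S1": 0.30, "S2": 0.35, "S3": 0.50, "S4": 0.60}

high, low = split_by_median(expression)
print(high, low)
print(group_median(mrnasi, high), group_median(mrnasi, low))
```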
Immune cell infiltration analysis of targets
Cancer stem cells have been shown to have immunosuppressive effects, and previous studies have shown that CEP has immunomodulatory effects. We therefore analyzed the relationship between the expression of CEP targets and the level of immune cell infiltration in LUSC. We estimated infiltration levels for six immune cell types and found that AURKA expression was negatively correlated with the infiltration of B cells, CD4+ T cells, CD8+ T cells, neutrophils, macrophages and myeloid dendritic cells. The expression of the other target genes was likewise negatively correlated with immune cell infiltration (Figure 7). We further analyzed the relationship between target expression and immune subtype, using six subtypes: C1 (wound healing), C2 (IFN-gamma dominant), C3 (inflammatory), C4 (lymphocyte depleted), C5 (immunologically quiet) and C6 (TGF-β dominant). AURKA expression was low in inflammatory samples and high in lymphocyte-depleted samples. The other five targets showed similar associations with immune subtype (Figure 8).
Tumor mutation burden analysis
Antitumor immunity requires T cells to recognize neoantigens arising from somatic mutations. We therefore analyzed the correlation between tumor mutation burden and target expression. The expression of each of AURKA, CCNA2, CCNE1, CDK1, CHEK1 and PLK1 was significantly positively correlated with tumor mutation burden (Figure 9).
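A correlation of this kind can be illustrated with a rank-based (Spearman) coefficient, sketched below in plain Python. The choice of Spearman is an assumption made for illustration, since the correlation method is not stated here, and the expression and tumor mutation burden values are hypothetical.

```python
def average_ranks(values):
    """Assign 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1  # mean of the tied 1-based positions
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy values for one gene across five samples.
gene_expression = [1.2, 3.4, 2.2, 5.0, 4.1]
mutation_burden = [2.0, 6.5, 3.0, 9.0, 5.5]

rho = spearman(gene_expression, mutation_burden)
print(round(rho, 3))
```

A positive coefficient indicates that samples with higher expression of the gene tend to carry a higher mutation burden; assessing significance would require an additional test that is not shown.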