Establishment and validation of the prognostic risk model based on the anoikis related genes in esophageal squamous cell carcinoma

doi:10.21203/rs.3.rs-3978091/v1

Research Article

This work is licensed under a CC BY 4.0 License

Version 1

Background

Esophageal squamous cell carcinoma (ESCC) is a malignant condition in humans. Anoikis related genes (ARGs) are crucial to cancer progression. Therefore, more studies on the relationship between ARGs and ESCC are warranted.

Methods

The study acquired ESCC-related transcriptome data from The Cancer Genome Atlas (TCGA). Differentially expressed ARGs (DE-ARGs) were obtained by differential analysis of the training set, and candidate genes were selected by survival analysis of the high and low expression groups of DE-ARGs. Prognostic genes were determined from the candidate genes by univariate and multivariate Cox and lasso regression and were analyzed by gene set enrichment analysis (GSEA). A risk model was constructed from the expression of the prognostic genes. Immune infiltration analysis was performed to explore how these genes contribute to ESCC development. A miRNA-mRNA-TF regulatory network was constructed based on the prognostic genes. Predicted IC50 values were used to assess the response to chemotherapy drugs. Single-cell analysis was performed on the GSE145370 dataset. Finally, the expression of the prognostic genes was verified by quantitative reverse transcription (qRT)-PCR.

Results

Differential analysis identified 53 DE-ARGs (46 upregulated; 7 downregulated). Survival analysis yielded four candidate genes: PBK, LAMC2, TNFSF10 and KL. Two prognostic genes, TNFSF10 and PBK, were determined by univariate and multivariate Cox and lasso regression. In the hallmark gene set, TNFSF10 was involved in 32 pathways and PBK in 34 pathways. In the immunomic signatures gene set, 4558 enrichment entries were associated with TNFSF10, such as genes downregulated in CD8 T cells, and 4262 with PBK, such as genes downregulated in B cells. Immune infiltration analysis revealed positive associations of PBK with M0 macrophages and of TNFSF10 with M1 macrophages. A miRNA-mRNA-TF network built from the prognostic genes contained regulatory relationship pairs such as hsa-miR-562-TNFSF10-FOXO3 and hsa-miR-216b-5p-PBK-ATM. Chemotherapy drug susceptibility analysis showed that the IC50 values of predicted drugs, such as Tozasertib 1096 and WIKI4 1940, differed significantly between risk groups. Single-cell analysis revealed that TNFSF10 and PBK levels were higher in epithelial cells than in other cells. The qRT-PCR results for the prognostic genes were consistent with the dataset analysis.

Conclusion

The study explored the biomarkers related to anoikis based on bioinformatics technology and established a prognosis model of ESCC. It provided a reference for the research of ARGs in ESCC.

Esophageal squamous cell carcinoma

Biomarker

Anoikis

Esophageal carcinoma is a highly malignant tumor worldwide. Esophageal squamous cell carcinoma (ESCC) accounts for 87% of esophageal carcinoma cases, with a 5-year relative survival of less than 20% [1, 2]. Even with advances in diagnostic and treatment approaches, the survival of ESCC patients has not been appreciably extended [3–5]. Prognostic factors are used to stratify patients’ risk level and predict clinical outcome. Conventional prognostic factors mainly comprise TNM stage and lymph node metastasis, which inform prognostic judgment and guide treatment. However, conventional prognostic factors fail to achieve accurate prediction, because survival outcomes differ greatly among patients with the same clinicopathological characteristics [6]. Therefore, discovering new prognostic factors will improve prognostic evaluation and the development of therapeutic targets.

Anoikis is a form of caspase-mediated apoptosis triggered by the loss of attachment between a cell and the extracellular matrix [7, 8]. Correct attachment to and interaction with extracellular matrix proteins is vital for determining whether a cell is in the correct location. Hence, anoikis is a normal physiological process involved in tissue homeostasis and development, and its deregulation often occurs in several diseases [9, 10]. Numerous studies have indicated that ARGs play a central role in the metastatic cascade and progression of gastric, lung, breast, and endometrial cancer [11–14]. Nonetheless, little is known about the functions of anoikis in ESCC, so it is crucial to understand this relationship in greater detail. We downloaded mRNA expression data from TCGA and ARGs from GeneCards to determine DE-ARGs in ESCC, and then performed survival analysis to determine prognostic hub genes. A two-ARG signature (PBK and TNFSF10) was identified by univariate, lasso and multivariate Cox regression. We evaluated the predictive value of the prognostic model in the TCGA and GSE53622 datasets using KM survival, ROC, and risk curve methods. A nomogram including risk score, N stage and pathological stage was built, and the DCA curve of the combined model showed greater prognosis-predictive ability than any factor alone. In addition, differentially expressed genes (DEGs) between the high- and low-risk groups were identified, and GO and KEGG analyses were performed. The tumor microenvironment was analyzed to explore the correlations between the risk model and ESCC-infiltrating immune cells. Chemotherapeutic response was calculated with the oncoPredict R package. Moreover, we used single-cell analysis to identify associations of the prognostic genes with cell clusters. Finally, qRT-PCR was applied to further validate the expression of the two prognostic ARGs. Overall, our work established an anoikis-related prognostic signature and may provide a new clue for exploring the relationship between ARGs and ESCC.

2.1 Data Collection

From the cancer genome atlas (TCGA; https://about-nci/organization/ccg/research/structural-genomics/tcga), 82 samples of ESCC and 11 samples of normal esophageal tissue were obtained as the training set, which included gene expression information, survival information, clinical data (age, clinical stage, pathological grade, etc.), and mutation information. GSE53622 dataset involving 60 normal samples and 60 ESCC samples was obtained as a validation set from GEO at https://www.ncbi.nlm.nih.gov/gds [15]. ARGs were downloaded from GeneCards (https://genecards.weizmann.ac.il/v3/).

2.2 Screening for DE-ARGs

The DE-ARGs between ESCC samples and normal esophageal tissue samples in TCGA were screened by setting the conditions |log₂FC| > 2.5 and p < 0.001 with limma package [16]. Heatmap was drawn using pheatmap package [17] to show DE-ARGs and volcano map was plotted using the ggplot2 package [18] to show the results. Next, DE-ARGs with p < 0.005 were selected as candidate genes for survival analysis, and survival curves were plotted for visualization. Finally, sample scores were calculated using ssGSEA, and samples were assigned into high and low expression groups of candidate genes based on scores. Candidate genes with high and low expressions were subjected to survival analysis.
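The thresholding step described above can be sketched in a few lines. This is only an illustration of applying the |log₂FC| > 2.5 and p < 0.001 cutoffs to a table of per-gene statistics; it does not reproduce the limma model fit, and the gene names and numbers below are hypothetical placeholders rather than values from the study.

```python
from typing import NamedTuple


class GeneStat(NamedTuple):
    gene: str
    log2_fc: float   # log2 fold change, tumor versus normal
    p_value: float   # p-value from the differential test


def screen_de_genes(stats, fc_cutoff=2.5, p_cutoff=0.001):
    """Split genes passing both thresholds into up- and downregulated lists."""
    up, down = [], []
    for s in stats:
        if abs(s.log2_fc) > fc_cutoff and s.p_value < p_cutoff:
            (up if s.log2_fc > 0 else down).append(s.gene)
    return up, down


# Hypothetical statistics, for illustration only (not values from the study).
example = [
    GeneStat("GENE_A", 3.1, 1e-6),
    GeneStat("GENE_B", -2.8, 5e-4),
    GeneStat("GENE_C", 2.9, 0.01),   # fails the p-value cutoff
    GeneStat("GENE_D", 1.2, 1e-8),   # fails the fold-change cutoff
]
up, down = screen_de_genes(example)
print("up:", up, "down:", down)
```

Both conditions must hold at once, so a gene with a large fold change but a weak p-value (or the reverse) is excluded.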

2.3 Prognostic gene acquisition and risk model development

Prognostic genes were screened from the candidate genes using univariate Cox, multivariate Cox and lasso regression analysis, based on the prognostic information of ESCC samples in TCGA. The risk model was built from the expression of the prognostic genes. We computed risk scores for samples in the training set as riskscore = β1×X1 + β2×X2 + ... + βn×Xn, where Xi is the expression of prognostic gene i and βi is its regression coefficient, and divided ESCC samples into high- and low-risk groups using the median risk score as the cutoff. Principal component analysis (PCA) was done to show how well the risk score separated the samples. Kaplan-Meier (K-M) survival analysis was conducted on the risk groups, and the results were visualized by plotting survival curves. Model accuracy was tested by plotting receiver operating characteristic (ROC) curves. In addition, we used the external validation set GSE53622 to verify the validity of the risk model, and the levels of the prognostic genes in normal and ESCC groups were also validated.
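As a minimal sketch of the scoring and grouping step, the snippet below computes a linear risk score per sample and splits samples at the median. The coefficients, gene names and expression values are hypothetical, and the choice to place scores equal to the median in the low-risk group is an assumption, since the text does not say how ties are handled.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Linear predictor: sum of coefficient * expression over the signature genes."""
    return sum(coefficients[g] * expression[g] for g in coefficients)


def split_by_median(scores):
    """Label a sample 'high' if its score exceeds the median, else 'low' (assumed tie rule)."""
    cutoff = median(scores.values())
    groups = {name: ("high" if value > cutoff else "low") for name, value in scores.items()}
    return groups, cutoff


# Hypothetical coefficients and expression values, for illustration only.
coefs = {"GENE_1": 0.5, "GENE_2": -0.25}
samples = {
    "S1": {"GENE_1": 4.0, "GENE_2": 2.0},
    "S2": {"GENE_1": 6.0, "GENE_2": 4.0},
    "S3": {"GENE_1": 2.0, "GENE_2": 8.0},
    "S4": {"GENE_1": 8.0, "GENE_2": 0.0},
}
scores = {name: risk_score(expr, coefs) for name, expr in samples.items()}
groups, cutoff = split_by_median(scores)
print(scores, cutoff, groups)
```

A positive coefficient raises the score as expression rises, while a negative coefficient lowers it.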

2.4 Gene set enrichment analysis (GSEA) of prognostic genes

First, the hallmark gene set and the immunomic signatures gene set were downloaded as background sets for enrichment analysis. Then, |NES| > 1, p < 0.05 and q < 0.25 were used as the significance criteria for GSEA.

2.5 Correlation analysis of risk scores and clinical features

To understand the correlations between risk scores in the training set and clinicopathological characteristics, we used the Wilcoxon test and the Kruskal-Wallis test to compare risk scores across clinical characteristics (e.g. age, clinical stage, pathological grade), with a threshold of p < 0.05. In addition, risk scores for each subtype under clinical characteristics were analyzed, and K-M survival curves were plotted to show the results.

2.6 Independent prognostic analysis of risk model and the generation of a nomogram

The study included clinical features including age, gender, and ESCC stage in univariate Cox regression. Risk scores and clinical features were then analyzed for independent prognostic value using multivariate Cox regression with a screening threshold of p < 0.05. Forest plots were drawn to present the results. Furthermore, to predict the survival rate of patients at different stages, a nomogram was constructed following the univariate and multivariate Cox regressions. A calibration curve was made to test the predictive accuracy of the nomogram. Decision curve analysis (DCA) was used to evaluate the degree of patient benefit.
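To make the quantity behind a decision curve concrete, the sketch below evaluates the standard net-benefit formula, NB = TP/n − (FP/n) × pt/(1 − pt), at a single threshold probability pt. It assumes a binary outcome at a fixed time horizon with no censoring, which is a simplification of the survival setting used in the study, and the predicted risks and outcomes are hypothetical.

```python
def net_benefit(predicted_risk, event, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    n = len(event)
    tp = sum(1 for p, e in zip(predicted_risk, event) if p >= threshold and e == 1)
    fp = sum(1 for p, e in zip(predicted_risk, event) if p >= threshold and e == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(event, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(event) / len(event)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical predicted risks and observed outcomes (1 = event), for illustration only.
risks = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
events = [1, 1, 0, 1, 0, 0]
nb_model = net_benefit(risks, events, 0.5)
nb_all = net_benefit_treat_all(events, 0.5)
print(nb_model, nb_all)
```

A decision curve repeats this calculation over a range of thresholds and plots the model against the treat-all and treat-none (net benefit 0) strategies.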

2.7 Functional enrichment analysis of risk groups

The limma package was used to screen DEGs between risk groups in the TCGA data with the thresholds |log₂FC| > 0.5 and p < 0.05. We then used the clusterProfiler package [19] to perform GO and KEGG analyses on the DEGs.

2.8 Analysis of tumor microenvironment

To examine the immune cell composition of the training set samples, the CIBERSORT algorithm was used to compute the relative abundance of 22 immune cell types. The distribution of immune cells between risk groups was compared with the Wilcoxon test. Correlations between prognostic gene expression and immune cell abundance were assessed with Spearman correlation.
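For readers unfamiliar with the Spearman coefficient, the following sketch computes it as the Pearson correlation of rank-transformed values, with tied values sharing their mean rank. It returns only the coefficient (no p-value), and the expression and cell-fraction numbers are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical gene expression and immune cell fractions, for illustration only.
gene_expr = [1.0, 2.5, 3.0, 4.2, 5.1]
cell_frac = [0.10, 0.20, 0.15, 0.30, 0.40]
rho = spearman(gene_expr, cell_frac)
print(round(rho, 3))
```

Because only ranks are used, the coefficient is insensitive to monotone transformations of either variable, which suits abundance estimates that are not normally distributed.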

2.9 Establishment of miRNA-mRNA-TF and miRNA-lncRNA regulatory networks

Firstly, we applied TRRUST website to extract transcription factors (TF) of prognostic genes and TF-mRNA relationships. Then, the upstream miRNAs of prognostic genes were predicted by TargetScan and miRTarBase databases. The TF-mRNA relationship pairs and mRNA-miRNAs were intersected to establish miRNA-mRNA-TF regulatory network. Finally, lncRNAs were predicted by LncRNASNP database based on upstream miRNAs, and miRNA-lncRNA regulatory network was built.
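The intersection logic above can be illustrated with plain sets: keep miRNA-mRNA pairs reported by both prediction sources, then attach every TF recorded for the same mRNA. The sketch below is an assumption about how the pairs are combined, not the authors' script, and all miRNA, gene and TF names are hypothetical placeholders.

```python
from collections import defaultdict


def build_triples(mirna_mrna_a, mirna_mrna_b, tf_mrna):
    """Join miRNA-mRNA pairs shared by two sources with TF-mRNA pairs on the mRNA."""
    shared = set(mirna_mrna_a) & set(mirna_mrna_b)
    tfs_by_gene = defaultdict(set)
    for tf, gene in tf_mrna:
        tfs_by_gene[gene].add(tf)
    return sorted(
        (mirna, gene, tf)
        for mirna, gene in shared
        for tf in tfs_by_gene.get(gene, ())
    )


# Hypothetical relationship pairs, for illustration only.
source_a = [("miR-a", "GENE_1"), ("miR-b", "GENE_1"), ("miR-c", "GENE_2")]
source_b = [("miR-a", "GENE_1"), ("miR-c", "GENE_2"), ("miR-d", "GENE_2")]
tf_pairs = [("TF_X", "GENE_1"), ("TF_Y", "GENE_2"), ("TF_Z", "GENE_2")]
triples = build_triples(source_a, source_b, tf_pairs)
for mirna, gene, tf in triples:
    print(f"{mirna}-{gene}-{tf}")
```

Each printed line has the same miRNA-mRNA-TF shape as the relationship pairs reported for the network.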

2.10 Chemotherapy analysis

Using the oncoPredict package [20], the half maximal inhibitory concentration (IC50) was predicted for each patient in the risk groups of the training set based on data from the GDSC website, to assess sensitivity to traditional chemotherapeutic drugs. IC50 values of chemotherapeutic drugs were compared between risk groups with the rank sum test.
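The rank sum comparison can be sketched as a Mann-Whitney U statistic with a two-sided normal approximation. This simplified version omits the tie and continuity corrections that statistical packages usually apply, so its p-values are approximate, and the IC50 values shown are hypothetical.

```python
import math


def rank_sum_test(x, y):
    """Mann-Whitney U for x versus y with a two-sided normal-approximation p-value."""
    n1, n2 = len(x), len(y)
    # U counts the (x, y) pairs in which the x value is larger; ties count one half.
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)  # no tie correction (simplification)
    z = (u - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p_value


# Hypothetical predicted IC50 values for two risk groups, for illustration only.
low_risk_ic50 = [1.2, 1.5, 1.9, 2.1]
high_risk_ic50 = [2.5, 3.0, 3.3, 3.8]
u_stat, p = rank_sum_test(low_risk_ic50, high_risk_ic50)
print(u_stat, p < 0.05)
```

A lower predicted IC50 in one group is read as greater sensitivity of that group to the drug.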

2.11 Single cell analysis

Cell clustering and identification were performed on the GSE145370 dataset (7 normal samples and 7 ESCC samples). The top 20 principal components were used for clustering with the FindNeighbors and FindClusters functions (dims = 1:20), and UMAP was used to visualize cell categories. We calculated the expression of the prognostic genes in each cell class and screened out the cell classes most associated with prognostic gene expression. Expression profiles were plotted for the most relevant cell categories in ESCC and normal samples, together with the correlation between nFeature-RNA and nCount-RNA. Cell clusters within these cell categories were identified by UMAP analysis, and we calculated the correlation between DEGs and prognostic gene expression in the cell clusters.

2.12 qRT-PCR

Ten ESCC tissues and corresponding paracancerous tissues were obtained from Anyang Tumor Hospital. All participants provided informed consent, and this study was approved by the ethics committee of Anyang Tumor Hospital. Total RNA was extracted with TRIzol reagent (Cat. No. 356281, Ambion) and quantified with a NanoPhotometer N50 (Thermo Fisher Scientific, USA). Reverse transcription of mRNA was done with the SweScript RT I First Strand cDNA Synthesis Kit (Cat. No. G3330-50, Servicebio). The diluted cDNA was subjected to qPCR, with GAPDH as the internal reference. Relative mRNA levels were calculated by the 2^−ΔΔCt method. Primer sequences are listed in Table 1.
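The 2^−ΔΔCt calculation can be written out as follows, with ΔCt taken as the target Ct minus the reference (GAPDH) Ct and ΔΔCt as the tumor ΔCt minus the paracancerous ΔCt. The Ct values are hypothetical and the function name is introduced here only for illustration.

```python
def relative_expression(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Fold change of the target gene in sample versus control by the 2^-ddCt method."""
    delta_sample = ct_target_sample - ct_ref_sample      # normalize to the reference gene
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values, for illustration only.
fold = relative_expression(
    ct_target_sample=24.0, ct_ref_sample=18.0,    # tumor: dCt = 6
    ct_target_control=26.0, ct_ref_control=18.0,  # paracancerous: dCt = 8
)
print(fold)
```

A ΔΔCt of −2 therefore corresponds to a fourfold higher relative mRNA level in the tumor sample, under the method's usual assumption of roughly 100% amplification efficiency.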

Table 1

Primer sequences for quantitative Real Time PCR
Primers	Sequence ( 5’ to 3’)
TNFSF10 F	AAGTGGCATTGCTTGTTTCTTA
TNFSF10 R	ATGTGTTGCTTCTTCCTCTGGT
PBK F	GGTTTGTCTCATTCTCCTTGG
PBK R	TTGGCTGGCTTTATATCGTTC
GAPDH F	CGAAGGTGGAGTCAACGGATTT
GAPDH R	ATGGGTGGAATCATATTGGAAC

3.1 Screening of DE-ARGs and candidate prognostic markers

Based on TCGA and GeneCards, we extracted ARG expression data and compared them between 82 ESCC and 11 paracancerous samples. As shown in the volcano plot (Fig. 1A), 53 DE-ARGs (46 upregulated, 7 downregulated; Additional file 1) were screened by differential expression analysis. A heatmap of the DE-ARGs is shown in Fig. 1B. Survival analysis of the DE-ARGs was performed to identify candidate genes with p < 0.005. Ultimately, four genes, PBK, LAMC2, TNFSF10 and KL, were selected for subsequent analysis. Figure 1C-F shows that survival probability was lower in the low PBK and LAMC2 expression groups than in the corresponding high expression groups, while survival probability was lower in the high TNFSF10 and KL expression groups than in the low expression groups. The enrichment of samples on the gene set was calculated by ssGSEA. Survival curves indicated that high levels of the four candidate genes were associated with shorter OS in ESCC patients (Fig. 1G).

3.2 Determination of prognostic genes and establishment of a risk model

To find genes vital to the prognosis of ESCC, univariate Cox regression was performed on the four candidate genes. TNFSF10 and PBK had significant effects on patient prognosis (p < 0.05; Fig. 2A). Lasso regression was used to avoid overfitting when identifying stable markers from the two survival-related candidates (Fig. 2B, C). Ultimately, a prognostic signature comprising the two genes (TNFSF10 and PBK) was constructed using stepwise multivariate regression (Fig. 2D). The prognostic risk model was: riskscore = 0.4861524 × Exp(TNFSF10) − 0.3919675 × Exp(PBK). TNFSF10 was a risk factor with HR > 1, while PBK was a favorable factor with HR < 1.
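The link between the signs of the two coefficients and the direction of the hazard ratios can be checked directly. The sketch below assumes the reported coefficients are Cox log hazard ratios, so that HR = exp(β) per unit increase in expression; the expression values passed to the scoring function are hypothetical.

```python
import math

# Coefficients as reported in the risk model formula.
COEFFICIENTS = {"TNFSF10": 0.4861524, "PBK": -0.3919675}


def reported_risk_score(expression):
    """Apply the reported two-gene formula to one sample's expression values."""
    return sum(beta * expression[gene] for gene, beta in COEFFICIENTS.items())


# Assumption: each coefficient is a Cox log hazard ratio, so HR = exp(beta).
hazard_ratios = {gene: math.exp(beta) for gene, beta in COEFFICIENTS.items()}

# Hypothetical expression values, for illustration only.
score = reported_risk_score({"TNFSF10": 6.0, "PBK": 2.0})
print(hazard_ratios, score)
```

A positive coefficient always gives HR > 1 and a negative one gives HR < 1, which matches the description of TNFSF10 as a risk factor and PBK as a favorable factor.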

3.3 Verification of risk model

To explore the predictive performance of the two-ARG signature (TNFSF10 and PBK), we computed risk scores. Using the median risk score of 2.4077608 as the cutoff, patients in the training set were assigned to high-risk (n = 47) and low-risk (n = 34) groups. PCA showed that samples could be distinguished by the risk score profile (Additional file 2). Patients in the low-risk group survived longer than those in the high-risk group (Fig. 3A). ROC analysis was then used to assess predictive value: the AUCs were 0.780 and 0.791 for 1 and 3 years, indicating that the risk model had some predictive power (Fig. 3C). Fig. 3E presents the risk curve and the expression heatmap of the two hub ARGs. Furthermore, we tested the prognostic performance of the risk model in an independent dataset, GSE53622. K-M survival curves showed that patients in the high-risk group had shorter OS (Fig. 3B). The AUCs for 1- and 3-year OS were 0.684 and 0.627, respectively (Fig. 3D). Patients’ risk scores, survival times, and the levels of the two ARGs are also presented (Fig. 3F). In summary, the risk model performed satisfactorily for OS prediction.
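The K-M curves compared between risk groups rest on the product-limit estimator, which can be sketched as below. This is a bare-bones version without confidence intervals or a log-rank test, and the follow-up times and event indicators are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    events[i] is 1 for an observed death and 0 for a censored observation.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all observations that share the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical follow-up times and event indicators, for illustration only.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
print(curve)
```

Censored patients lower the number at risk without lowering the survival estimate, which is why the curve only steps down at observed deaths.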

3.4 Functional annotation analysis on prognostic gene

To clarify the potential functions and pathways of the two prognostic genes, GSEA was performed based on the hallmark gene set and the immunomic signatures gene set. For TNFSF10, 32 hallmark entries were enriched, such as a subgroup of genes regulated by MYC - version 1 (v1), genes encoding cell cycle-related targets of E2F TFs, and genes involved in DNA repair (Fig. 4A). In addition, 4558 enriched entries were obtained in immunomic signatures, including genes downregulated in CD8 T cells (mock transduced versus wildtype) (Fig. 4B). For PBK, 34 hallmark pathways were enriched, including a subgroup of genes regulated by MYC - version 1 (v1), a subgroup of genes regulated by MYC - version 2 (v2), and genes encoding proteins involved in glycolysis and gluconeogenesis (Fig. 4C). Based on immunomic signatures, PBK was involved in 4262 enrichment pathways, including genes downregulated in dendritic cells and B cells (Fig. 4D).

3.5 Correlation analysis of risk scores and clinical features

To investigate the clinical value of the prognostic signature, the association of risk score with clinical features was examined. Risk scores differed significantly among pathologic-N0, pathologic-N1, pathologic-N2 and pathologic-N3, and differed highly significantly between the alive and dead status categories (Fig. 5A, B). We also validated the prognostic prediction capability in different subgroups. K-M survival curves demonstrated that survival rates in the pathologic-T2, pathologic-N0, pathologic-M0 and tumor stage II subgroups were lower in the high-risk group (Fig. 5C-F). Taken together, the risk signature could serve as a potential indicator for OS prediction in ESCC.

3.6 Independent prognostic analysis and establishment of nomogram

To determine whether our model was an independent prognostic factor for ESCC patients, univariate and multivariate regression analyses were performed. The forest plots showed that risk score, pathological stage and N stage were significantly associated with survival and could serve as independent prognostic factors (Fig. 6A, B). Based on these three factors, we built a nomogram for 1- and 3-year OS prediction (Fig. 6C). The C-index for prognosis prediction was 0.759. Calibration curves showed good agreement between observed and predicted 1- and 3-year OS (Fig. 6D). DCA showed that the combined model including the three independent prognostic factors (riskscore + N + stage) had a higher net benefit for 1- and 3-year OS than each factor alone (Fig. 6E).
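The C-index reported above measures how often the model ranks a pair of patients in the right order. The sketch below implements Harrell's concordance index in its simplest form (a pair is comparable when the patient with the shorter time had an observed event, and tied risk scores count one half); the data are hypothetical and this is not necessarily the exact variant used by the authors' software.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable pairs where higher risk means shorter survival."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Comparable pair: i had an observed event strictly before j's time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable


# Hypothetical survival times, event indicators and risk scores, for illustration only.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.7, 0.1]
c_index = concordance_index(times, events, risk)
print(c_index)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so 0.759 sits between the two.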

3.7 Annotated analysis of the function of DEGs

To further explore biological functions and pathways of ARGs, we analyzed DEGs between high- and low-risk groups in the training set. 1852 DEGs (722 upregulated; 1130 downregulated) were screened by differential analysis (Fig. 7A). A heat map was displayed in Fig. 7B. GO and KEGG analyses presented enrichment of DEGs in 450 GO terms and 39 pathways. GO pointed out the enrichment in biological processes like cellular response to interferon-gamma and type Ⅰ interferon signaling pathway (Fig. 7C). KEGG unveiled enrichment in cell adhesion molecules, and phagosome (Fig. 7D).

3.8 Immuno-infiltration analysis

The tumor microenvironment was analyzed to identify differences in immune-infiltrating cells between risk groups. Using the CIBERSORT algorithm (p < 0.05), the relative proportions of 22 tumor-infiltrating immune cell types were obtained. As illustrated by the box plot in Fig. 8A, T.cells.CD4.memory.resting was increased in the low-risk group, whereas B.cells.memory and T.cells.Follicular.Helper were significantly increased in the high-risk group. To further investigate the correlation between immune-infiltrating cells and the prognostic genes, Spearman’s correlation analysis was applied with thresholds of p < 0.05 and |cor| > 0.3. The correlations are summarized in Fig. 8B. The scatter plots in Fig. 8C show that PBK was positively correlated with M0 macrophages, whereas TNFSF10 was positively correlated with M1 macrophages.

3.9 Establishment of regulatory network

The TRRUST database was used to extract the TFs of the prognostic genes, and 17 mRNA-TF relationship pairs were obtained. miRNA-mRNA interactions were downloaded from miRTarBase and TargetScan to screen for miRNAs interacting with the two prognostic genes; 34 targeting relationship pairs were shared by the two databases. The mRNA-TF and miRNA-mRNA relationships were then intersected, yielding miRNA-mRNA-TF relationship pairs such as hsa-miR-562-TNFSF10-FOXO3 and hsa-miR-216b-5p-PBK-ATM (Fig. 9A, B). lncRNASNP was used to predict lncRNAs targeted by the 34 identified miRNAs. In total, 602 miRNA-lncRNA regulatory relationship pairs were predicted, including hsa-miR-130b-3p-LINC00339 and hsa-miR-26a-5p-lnc-ATP6V1G3-4 (Fig. 9C).

3.10 Chemotherapy drug susceptibility analysis

Using the oncoPredict R package, the association between risk group and drug sensitivity was explored by calculating IC50 values. The box plots show that the IC50 values of predicted drugs, such as Tozasertib 1096 and WIKI4 1940, differed significantly between risk groups (rank sum test, p < 0.05; Fig. 10). Patients in the low-risk group were more sensitive to Tozasertib 1096, while patients in the high-risk group were more sensitive to WIKI4 1940. These results suggest that the risk score could provide a basis for drug selection.

3.11 Single cell analysis

ScRNA sequencing data of seven normal samples and seven ESCC samples were downloaded from the GSE145370 dataset and analyzed to identify and characterize cell populations. Significant DEGs in every cluster were identified with the “FindAllMarkers” function, and cell types were then identified in an unbiased manner from their specific markers. Ten cell clusters, namely NK cells, T cells, monocytes/macrophages, B cells, mDC, pDC, plasma cells, mast cells, fibroblasts, and epithelial cells, were identified by UMAP analysis (Fig. 11A). The expression of the two prognostic genes in each cell type was then visualized with a dot plot (Fig. 11B). TNFSF10 was higher in epithelial cells and mast cells, while PBK was higher in epithelial cells than in other cell types. The violin plot in Fig. 11C shows expression in epithelial and mast cells in ESCC samples and normal samples. Additionally, the correlation between nFeature-RNA and nCount-RNA was 0.93 and 0.96 for epithelial and mast cells, respectively (Fig. 11D). We then further clustered the two cell categories and calculated the correlation between DEGs and prognostic gene expression based on |Cor| > 0.3 and p < 0.05. In total, 964 DEGs were screened for epithelial cells (Fig. 11E, F) and 252 DEGs for mast cells (Fig. 11G, H).

3.12 Validation of prognostic genes expression by qRT-PCR

In the training set, both TNFSF10 and PBK were increased in ESCC compared with paracancerous tissues (Fig. 12A, B). Consistent with the TCGA results, the average expression levels of TNFSF10 and PBK in the validation set were significantly upregulated in cancer tissues (Fig. 12C, D). To confirm the expression pattern of the two biomarkers, qRT-PCR was performed. The expression of both genes was consistent with the data mining results (Fig. 12E, F).

Owing to its complicated molecular mechanisms, ESCC is a rapidly progressive disease with an extremely poor outcome. Therefore, identifying novel, reliable biomarkers to predict the outcome of ESCC and to serve as prognostic targets is urgent. Although a growing number of studies [21–25] have reported anoikis-related prognostic models in several cancers, few have addressed ESCC. Here, we constructed a risk model based on two anoikis-related genes (TNFSF10 and PBK) and verified that it could efficiently distinguish ESCC patients with different prognoses. As a TNF-related apoptosis-inducing ligand, TNFSF10 functions through interaction with its receptors and preferentially triggers cell death in transformed cells [26–28]. Numerous studies have indicated that TNFSF10 is associated with OS in ovarian, prostate, and breast cancer patients [29–32]. Zhang et al. found that TNFSF10 was elevated in esophageal tumor tissues and was related to disease development and progression [33]. Additionally, its overexpression was negatively associated with patient survival and clinical outcomes. This finding is consistent with our results, indicating that TNFSF10 can be a biomarker for ESCC patient prognosis.

PBK is a mitotic serine/threonine kinase that regulates cell proliferation, apoptosis, metastasis and inflammation [34–36]. Prior studies have shown that PBK is upregulated in lung, stomach and ovarian cancers [37–39]. Consistent with our observations, Ohashi et al. indicated that increased PBK is linked to unfavorable prognosis in ESCC [40]. In contrast, Zheng et al. reported that overexpression of PBK predicts better prognosis for ESCC patients [41]. Bias in individual cohort studies might be the cause of the inconsistent outcomes. Collectively, these data suggest that PBK is likely to be a prognostic indicator and treatment target.

GSEA based on the hallmark gene set showed that TNFSF10 was functionally enriched in 32 entries, including a subgroup of genes regulated by MYC - version 1 (v1). c-MYC is a proto-oncogene that affects tumorigenesis and tumor development [42–44]. Moreover, several published studies have shown that c-MYC is overexpressed in a large proportion of ESCC patients, and that suppressing MYC attenuates malignant growth [45–48]. Significant enrichment entries of TNFSF10 among the immunomic signatures include genes downregulated in CD8 T cells (mock transduced versus wildtype). CD8⁺ T cells are key mediators of tumor elimination through the secretion of perforin and granzymes. Thus, downregulation of CD8⁺ T cells diminishes anti-tumor responses, thereby allowing tumor growth. Consistently, a high level of TNFSF10 indicated poorer prognosis in ESCC. As for PBK, hallmark results indicated that 34 pathways were enriched, including genes encoding cell cycle-related targets of E2F TFs. The E2F family of TFs is crucial for cell functions linked to DNA replication and cell cycle progression [49]. E2F1, E2F3, E2F5, and E2F8 have previously been shown to play vital roles in ESCC [50–52]. Based on immunomic signatures, PBK was involved in 4262 enrichment pathways, including genes downregulated in dendritic cells and B cells. Dendritic cells and B cells are vital antigen-presenting cells in the initiation and modulation of innate and adaptive antitumor immune responses [53, 54]. A reduction in dendritic cells may weaken antitumor immunity in ESCC.

Two candidate drugs, Tozasertib_1096 and WIKI4_1940, were identified as differing in sensitivity between high- and low-risk patients. Tozasertib is an inhibitor of Aurora kinases, which play pivotal roles in several mitotic events, including mitotic spindle formation and microtubule-kinetochore attachment [55]. Microtubule binding and tubulin binding molecular functions were identified in the GO analysis, suggesting that these are candidate functions linking the risk signature with chemotherapy drug sensitivity.

The anoikis-related gene signature offers a practical way to evaluate ESCC patients’ prognosis and a viable strategy for selecting chemotherapeutic treatments for individualized treatment plans. However, several limitations must be considered. First, although we validated the prognostic value and prediction accuracy of the risk model in 142 confirmed ESCC patients and confirmed the expression of the two prognostic genes using qRT-PCR, the model still lacks sufficient external validation datasets. Additional datasets and bioinformatics techniques are needed to validate and improve the model. Second, fundamental experiments are required to substantiate that the proposed two-ARG signature functions in anoikis. Third, in vitro and in vivo research is warranted to evaluate the drug sensitivity predictions.

In conclusion, the current study identified an anoikis-related gene signature, comprising PBK and TNFSF10, that provides prognostic value for ESCC patients. In addition, this signature may inform clinical decisions on drug selection. Furthermore, the miRNA-mRNA-TF network analysis contributes to our understanding of the molecular regulatory mechanisms of anoikis in ESCC.

ESCC, Esophageal squamous cell carcinoma; ARGs, Anoikis related genes; TCGA, The cancer genome atlas; GEO, Gene expression omnibus; DE-ARGs, Differentially expressed ARGs; GSEA, Gene set enrichment analysis; ssGSEA, single sample gene set enrichment analysis; PCA, Principal component analysis; qRT-PCR, quantitative reverse transcription PCR; KM, Kaplan-Meier; ROC, Receiver operating characteristic; DCA, Decision curve analysis; TF, Transcription factors; LASSO, Least absolute shrinkage and selection operator; DEGs, Differentially expressed genes; GO, Gene ontology; KEGG, Kyoto encyclopedia of genes and genomes; IC50, Half maximal inhibitory concentration; UMAP, Uniform Manifold Approximation and Projection; OS, Overall survival

Ethics approval and consent to participate

The participants involved in this study all signed the informed consent. This research was approved by the Ethical Review Committee of Anyang Tumor Hospital (approval number: 2024WZ01K02; date of approval: February 19, 2024)

Consent for publication

Not applicable

Availability of data and materials

The training set, including 82 samples of ESCC and 11 samples of normal esophageal tissue, was obtained from TCGA (https://about-nci/organization/ccg/research/structural-genomics/tcga). The dataset GSE53622 can be found in GEO (http://www.ncbi.nlm.nih.gov/geo). The anoikis related genes were downloaded from GeneCards (https://genecards.weizmann.ac.il/v3/).

Competing interests

The authors declare that they have no competing interests

Funding

This work was funded by postdoctoral research grant of Henan Province (202003105) and Major Projects of Science and Technology Department of Anyang City (2023C01SF086)

Authors’ contributions

S.C. carried out all the data analysis and wrote the manuscript. M.L. carried out all the data analysis and revised the manuscript. Z.C., Y.L., W.N. and W.Z. carried out part of the data analysis. J.L. collected ESCC and paracancerous tissues. L.D., S.L. and Z.G. performed qPCR experiments. Y.Z. supervised the study and revised the manuscript. All authors reviewed the manuscript.

Pennathur A, Gibson MK, Jobe BA, Luketich JD. Oesophageal carcinoma. Lancet (London England). 2013;381(9864):400–12.
Siegel RL, Miller KD, Jemal A. Cancer statistics, 2020. CA Cancer J Clin. 2020;70(1):7–30.
Chen W, Zheng R, Baade PD, Zhang S, Zeng H, Bray F, et al. Cancer statistics in China, 2015. CA Cancer J Clin. 2016;66(2):115–32.
Zeng H, Zheng R, Zhang S, Zuo T, Xia C, Zou X, et al. Esophageal cancer statistics in China, 2011: Estimates based on 177 cancer registries. Thorac cancer. 2016;7(2):232–7.
Patel N, Benipal B. Incidence of Esophageal Cancer in the United States from 2001–2015: A United States Cancer Statistics Analysis of 50 States. Cureus. 2018;10(12):e3709.
Liu J, Xie X, Zhou C, Peng S, Rao D, Fu J. Which factors are associated with actual 5-year survival of oesophageal squamous cell carcinoma? European journal of cardio-thoracic surgery: official journal of the European Association for Cardio-thoracic Surgery. 2012;41(3):e7–11.
Fidler IJ. The pathogenesis of cancer metastasis: the 'seed and soil' hypothesis revisited. Nat Rev Cancer. 2003;3(6):453–8.
Paoli P, Giannoni E, Chiarugi P. Anoikis molecular pathways and its role in cancer progression. Biochim Biophys Acta. 2013;1833(12):3481–98.
Danial NN, Korsmeyer SJ. Cell death: critical control points. Cell. 2004;116(2):205–19.
Gilmore AP, Anoikis. Cell Death Differ. 2005;12(Suppl 2):1473–7.
Ye G, Yang Q, Lei X, Zhu X, Li F, He J, et al. Nuclear MYH9-induced CTNNB1 transcription, targeted by staurosporin, promotes gastric cancer cell anoikis resistance and metastasis. Theranostics. 2020;10(17):7545–60.
Jin L, Chun J, Pan C, Kumar A, Zhang G, Ha Y, et al. The PLAG1-GDH1 Axis Promotes Anoikis Resistance and Tumor Metastasis through CamKK2-AMPK Signaling in LKB1-Deficient Lung Cancer. Mol Cell. 2018;69(1):87–99e7.
Buchheit CL, Angarola BL, Steiner A, Weigel KJ, Schafer ZT. Anoikis evasion in inflammatory breast cancer cells is mediated by Bim-EL sequestration. Cell Death Differ. 2015;22(8):1275–86.
Bao W, Qiu H, Yang T, Luo X, Zhang H, Wan X. Upregulation of TrkB promotes epithelial-mesenchymal transition and anoikis resistance in endometrial carcinoma. PLoS ONE. 2013;8(7):e70616.
Chen S, Gu J, Zhang Q, Hu Y, Ge Y. Development of Biomarker Signatures Associated with Anoikis to Predict Prognosis in Endometrial Carcinoma Patients. J Oncol. 2021;2021:3375297.
Wang Y, Wang Z, Sun J, Qian Y. Identification of HCC Subtypes With Different Prognosis and Metabolic Patterns Based on Mitophagy. Front cell Dev biology. 2021;9:799507.
Zhang MY, Huo C, Liu JY, Shi ZE, Zhang WD, Qu JJ, et al. Identification of a Five Autophagy Subtype-Related Gene Expression Pattern for Improving the Prognosis of Lung Adenocarcinoma. Front cell Dev biology. 2021;9:756911.
Min SH, Zhou J. smplot: An R Package for Easy and Elegant Data Visualization. Front Genet. 2021;12:802894.
Yu G, Wang LG, Han Y, He QY. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS. 2012;16(5):284–7.
Maeser D, Gruener RF, Huang RS. oncoPredict: an R package for predicting in vivo or cancer patient drug response and biomarkers from cell line screening data. Brief Bioinform. 2021;22(6).
Chen Z, Liu X, Zhu Z, Chen J, Wang C, Chen X, et al. A novel anoikis-related prognostic signature associated with prognosis and immune infiltration landscape in clear cell renal cell carcinoma. Front Genet. 2022;13:1039465.
Chen Y, Huang W, Ouyang J, Wang J, Xie Z. Identification of Anoikis-Related Subgroups and Prognosis Model in Liver Hepatocellular Carcinoma. Int J Mol Sci. 2023;24(3).
Chi H, Jiang P, Xu K, Zhao Y, Song B, Peng G, et al. A novel anoikis-related gene signature predicts prognosis in patients with head and neck squamous cell carcinoma and reveals immune infiltration. Front Genet. 2022;13:984273.
Chen J, Sun M, Chen C, Kang M, Qian B, Sun J, et al. Construction of a novel anoikis-related prognostic model and analysis of its correlation with infiltration of immune cells in neuroblastoma. Front Immunol. 2023;14:1135617.
Xiao Y, Zhou H, Chen Y, Liu L, Wu Q, Li H, et al. A novel anoikis-related gene prognostic signature and its correlation with the immune microenvironment in colorectal cancer. Front Genet. 2023;14:1186862.
Zhou K, Yan Y, Zhao S. Esophageal cancer-selective expression of TRAIL mediated by MREs of miR-143 and miR-122. Tumour biology: J Int Soc Oncodevelopmental Biology Med. 2014;35(6):5787–95.
Almasan A, Ashkenazi A. Apo2L/TRAIL: apoptosis signaling, biology, and potential for cancer therapy. Cytokine Growth Factor Rev. 2003;14(3–4):337–48.
Cantarella G, Di Benedetto G, Puzzo D, Privitera L, Loreto C, Saccone S, et al. Neutralization of TNFSF10 ameliorates functional outcome in a murine model of Alzheimer's disease. Brain. 2015;138(Pt 1):203–16.
Lancaster JM, Sayer R, Blanchette C, Calingaert B, Whitaker R, Schildkraut J, et al. High expression of tumor necrosis factor-related apoptosis-inducing ligand is associated with favorable ovarian cancer survival. Clin cancer research: official J Am Association Cancer Res. 2003;9(2):762–6.
Horak P, Pils D, Haller G, Pribill I, Roessler M, Tomek S, et al. Contribution of epigenetic silencing of tumor necrosis factor-related apoptosis inducing ligand receptor 1 (DR4) to TRAIL resistance and ovarian cancer. Mol cancer research: MCR. 2005;3(6):335–43.
Anees M, Horak P, El-Gazzar A, Susani M, Heinze G, Perco P, et al. Recurrence-free survival in prostate cancer is related to increased stromal TRAIL expression. Cancer. 2011;117(6):1172–82.
Hung CM, Liu LC, Ho CT, Lin YC, Way TD. Pterostilbene Enhances TRAIL-Induced Apoptosis through the Induction of Death Receptors and Downregulation of Cell Survival Proteins in TRAIL-Resistance Triple Negative Breast Cancer Cells. J Agric Food Chem. 2017;65(51):11179–91.
Zhang H, Qin G, Zhang C, Yang H, Liu J, Hu H, et al. TRAIL promotes epithelial-to-mesenchymal transition by inducing PD-L1 expression in esophageal squamous cell carcinomas. J experimental Clin cancer research: CR. 2021;40(1):209.
Abe Y, Matsumoto S, Kito K, Ueda N. Cloning and expression of a novel MAPKK-like protein kinase, lymphokine-activated killer T-cell-originated protein kinase, specifically expressed in the testis and activated lymphoid cells. J Biol Chem. 2000;275(28):21525–31.
Gaudet S, Branton D, Lue RA. Characterization of PDZ-binding kinase, a mitotic kinase. Proc Natl Acad Sci USA. 2000;97(10):5167–72.
Zhao R, Choi BY, Wei L, Fredimoses M, Yin F, Fu X, et al. Acetylshikonin suppressed growth of colorectal tumour tissue and cells by inhibiting the intracellular kinase, T-lymphokine-activated killer cell-originated protein kinase. Br J Pharmacol. 2020;177(10):2303–19.
Lei B, Liu S, Qi W, Zhao Y, Li Y, Lin N, et al. PBK/TOPK expression in non-small-cell lung cancer: its correlation and prognostic significance with Ki67 and p53 expression. Histopathology. 2013;63(5):696–703.
Ohashi T, Komatsu S, Ichikawa D, Miyamae M, Okajima W, Imamura T, et al. Overexpression of PBK/TOPK relates to tumour malignant potential and poor outcome of gastric carcinoma. Br J Cancer. 2017;116(2):218–26.
Ma H, Li Y, Wang X, Wu H, Qi G, Li R, et al. PBK, targeted by EVI1, promotes metastasis and confers cisplatin resistance through inducing autophagy in high-grade serous ovarian carcinoma. Cell Death Dis. 2019;10(3):166.
Ohashi T, Komatsu S, Ichikawa D, Miyamae M, Okajima W, Imamura T, et al. Overexpression of PBK/TOPK Contributes to Tumor Development and Poor Outcome of Esophageal Squamous Cell Carcinoma. Anticancer Res. 2016;36(12):6457–66.
Zheng L, Li L, Xie J, Jin H, Zhu N. Six Novel Biomarkers for Diagnosis and Prognosis of Esophageal squamous cell carcinoma: validated by scRNA-seq and qPCR. J Cancer. 2021;12(3):899–911.
Wang X, Liu Y, Shao D, Qian Z, Dong Z, Sun Y, et al. Recurrent amplification of MYC and TNFRSF11B in 8q24 is associated with poor survival in patients with gastric cancer. Gastric cancer: official J Int Gastric Cancer Association Japanese Gastric Cancer Association. 2016;19(1):116–27.
Lee KS, Kwak Y, Nam KH, Kim DW, Kang SB, Choe G, et al. c-MYC Copy-Number Gain Is an Independent Prognostic Factor in Patients with Colorectal Cancer. PLoS ONE. 2015;10(10):e0139727.
Jung M, Russell AJ, Liu B, George J, Liu PY, Liu T, et al. A Myc Activity Signature Predicts Poor Clinical Outcomes in Myc-Associated Cancers. Cancer Res. 2017;77(4):971–81.
Zhang HF, Wu C, Alshareef A, Gupta N, Zhao Q, Xu XE, et al. The PI3K/AKT/c-MYC Axis Promotes the Acquisition of Cancer Stem-Like Features in Esophageal Squamous Cell Carcinoma. Stem cells (Dayton. Ohio). 2016;34(8):2040–51.
Yang L, Zhu JY, Zhang JG, Bao BJ, Guan CQ, Yang XJ, et al. Far upstream element-binding protein 1 (FUBP1) is a potential c-Myc regulator in esophageal squamous cell carcinoma (ESCC) and its expression promotes ESCC progression. Tumour biology: J Int Soc Oncodevelopmental Biology Med. 2016;37(3):4115–26.
Xin Z, Xin G, Shi M, Song L, Wang Q, Jiang B, et al. Inhibition of MUC1-C entering nuclear suppresses MYC expression and attenuates malignant growth in esophageal squamous cell carcinoma. OncoTargets therapy. 2018;11:4125–36.
Wang Y, Cheng J, Xie D, Ding X, Hou H, Chen X, et al. NS1-binding protein radiosensitizes esophageal squamous cell carcinoma by transcriptionally suppressing c-Myc. Cancer Commun (London England). 2018;38(1):33.
Xanthoulis A, Tiniakos DG. E2F transcription factors and digestive system malignancies: how much do we know? World J Gastroenterol. 2013;19(21):3189–98.
Li P, Lv H, Wu Y, Xu K, Xu M, Ma Y. E2F transcription factor 1 is involved in the phenotypic modulation of esophageal squamous cell carcinoma cells via microRNA-375. Bioengineered. 2021;12(2):10047–62.
Luo C, Zhao X, Wang Y, Li Y, Wang T, Li S. A novel circ_0000654/miR-375/E2F3 ceRNA network in esophageal squamous cell carcinoma. Thorac cancer. 2022;13(15):2223–34.
Ishimoto T, Shiozaki A, Ichikawa D, Fujiwara H, Konishi H, Komatsu S, et al. E2F5 as an independent prognostic factor in esophageal squamous cell carcinoma. Anticancer Res. 2013;33(12):5415–20.
Zilionis R, Engblom C, Pfirschke C, Savova V, Zemmour D, Saatcioglu HD, et al. Single-Cell Transcriptomics of Human and Mouse Lung Cancers Reveals Conserved Myeloid Populations across Individuals and Species. Immunity. 2019;50(5):1317–34e10.
Song Q, Hawkins GA, Wudel L, Chou PC, Forbes E, Pullikuth AK, et al. Dissecting intratumoral myeloid cell plasticity by single cell RNA-seq. Cancer Med. 2019;8(6):3072–85.
Carmena M, Earnshaw WC. The cellular geography of aurora kinases. Nat Rev Mol Cell Biol. 2003;4(11):842–54.
