A Nomogram Based on Glycolysis-related Gene Expression Prolings Serve as a Novel Prognosis Risk Classiers for Human Hepatocellular Carcinoma

Metabolic pattern reconstruction is an important element in tumor progression. The metabolism of tumor cells is characterized by the abnormal increase of anaerobic glycolysis, regardless of the higher oxygen concentration, resulting in a large accumulation of energy from glucose sources, and contributes to rapid cell proliferation and tumor growth which is further referenced as the Warburg effect. We tried to reconstruct the metabolic pattern in the progression of cancer to screen which genetic changes are specic in cancer cells. A total of 12 common types of solid tumors were enrolled in the prospective study. Gene set enrichment analysis (GSEA) was implemented to analyze 9 glycolysis-related gene sets, which are closely related to the glycolysis process. Univariate and multivariate analyses were used to identify independent prognostic variables for the construction of a nomogram based on clinicopathological characteristics and a glycolysis-related gene prognostic index (GRGPI). The prognostic model based on glycolysis genes has the highest area under the curve (AUC) in LIHC (Liver hepatocellular carcinoma). 8-gene signatures (AURKA, CDK1, CENPA, DEPDC1, HMMR, KIF20A, PFKFB4, STMN1) were related to overall survival (OS) and recurrence-free survival (RFS). Further analysis demonstrates that the prediction model can accurately distinguish between high- and low-risk cancer patients among patients in different clusters in LIHC. A nomogram with a well-tted calibration curve based on gene expression proles and clinical characteristics improves discrimination in internal and external cohorts. Furthermore, the altering expression of metabolic genes related to glycolysis may contribute to the reconstruction of the tumor-related microenvironment. DEPDC1, HMMR, KIF20A, PFKFB4, and STMN1. We developed and optimized a robust prognostic model based on the 8-gene signatures and veried the powerful performance in 3 independent verication cohorts. Our results suggest that gene signatures related to the glycolysis pathway have the ability to accurately predict the poor prognosis and recurrence of HCC patients. Surprisingly, the prognostic model shows more accurate predictive ability and is superior to other pathological features. In an attempt to evaluate the GRGPI-based risk model in the clinical setting, a nomogram integrating multiple important clinicopathological characteristics has been established, which can be used as a powerful and easy-to-use tool for evaluating the survival


Introduction
Cells are authorized to choose energy metabolism patterns for biosynthesis, depending on cell function and availability of metabolites. In addition to oxidative phosphorylation of glucose, other metabolic pathways, including lipid, nucleotide, and amino acid metabolism can also provide energy to meet its biosynthetic requirements for cell growth and proliferation 1,2 . The energy metabolism pattern of tumor cells has changed dramatically, compared with oxidative phosphorylation (OXPHOS) of normal cells. To maintain survival and meet the synthesis of biological macromolecules, energy metabolism tends to another embodiment, which is referred to as glycolysis or Warburg effect [3][4][5] . The Warburg effect represents the transformation of glucose utilization by tumor cells from oxidative phosphorylation to glycolysis, which is now acknowledged as a major feature of tumors 6,7 . This change in energy metabolism is determined by complex factors, including pressure on the tumor microenvironment and genetic changes [8][9][10][11] . The enhanced glycolysis of tumor cells is mainly due to the increased expression or activity of key glycolysis enzymes 12 . In recent years, people are making a concerted effort to target tumors by inhibiting the activity of key enzymes in the tumor glycolysis pathway. Some studies have shown that speci c inhibition of glycolysis is coupled with signi cant tumor suppression, and induces cell death. Glycolysis key enzymes such as glycokinase 2 (HK2), phosphofructosidase (PFK), and M2-type acetone kinase (PKM2) have become tumor markers, and their expression and activity can affect tumor glycolysis, which in turn affects tumors of proliferation [13][14][15][16][17] . However, the ability of glycolysis in tumor cells remains a long-standing ill-de ned puzzle for re ning strati cation and management of cancer patients. Early diagnosis and personalized treatment were considered effective methods to improve the survival time of patients. Histopathology is believed to be able to predict the prognosis and outcome of cancer patients to a certain extent. Due to its limitations, patients with the same pathology have different prognoses due to different molecular subtypes 18,19 . With the advent of high-throughput sequencing nucleotide technology of recent years, we are allowed to better understand the dynamic changes at the molecular level. A single gene failed to predict the outcome accurately of cancer patients.
In contrast, the utility of these biomarkers combinations may be altered by optimizing the sensitivity and speci city of patient outcomes. Multiple biomarkers that are increasingly related to survival and prognosis can identify high-risk patients, ameliorate the prognosis of cancer patients, and assist with appropriate intervention therapy.
Gene set enrichment analysis (GSEA) was generally used in genomic research to identify potential biological mechanisms. In this study, we tried to develop some potential gene signatures through GSEA analysis, which is closely associated with glycolysis metabolic pathways. Credit to TCGA dataset, we evaluated the tumor glycolysis metabolic patterns of 12  developed important GRGPI signatures in LIHC and established multiple risk characteristics that can effectively forecast the prognosis of patients. Surprisingly, glycolysis-related risk characteristics can be applied to recognize patients with dismal outcomes in the high-risk group. Besides, acting following the Cox multivariate hazard ratio analysis, the risk score outperformed other clinical traits in evaluating patient prognosis.

Materials And Methods
Gene expression pro les and patient clinical information. We obtained transcriptome expression pro les from multiple data repositories, including The Cancer Genome Atlas Program (TCGA, https://portal.gdc.cancer.gov/), the International Cancer Genome Consortium (ICGC, http://www.icgc.org), and the Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo/). Datasets without insu cient sample size (< 200) or available clinical information were excluded. The raw counts were transformed into transcripts per kilobase million (TPM) values for subsequent analysis.
Gene Set Enrichment Analysis. The Molecular Signatures Database (MSigDB) was explored to identify gene sets and speci c biological processes that are signi cantly differentially expressed in different groups. This method produced a statistically signi cant improvement in the connectivity between the data expression pattern and the biological process, ignoring the clear differential gene threshold 20 . 9 gene sets associated with glycolysis processes including go glycolytic fermentation, go glycolytic process, hallmark glycolysis. kegg glycolysis gluconeogenesis, module 306, reactome glycolysis, and reactome regulation of glycolysis by fructose − 2-6 bisphosphate metabolism, which was downloaded from MsigDB. For each gene set permutations were performed 1000 times, normalized enrichment scores (NES) and FDR values were used to de ne the enrichment pathways in each phenotype. GSEA was performed to explore whether there is a signi cant difference in glycolysis-related gene sets between tumor tissues and matched normal tissues. P value and FDR value < 0.05 were set as the threshold.
Construction of risk prediction model and statistical analysis. Univariate Cox regression models were constructed to assess the statistical relationships between mRNA expression levels and RFS or OS. We used a linear regression model with a stepwise forward method to predict signi cant variations between variables, with the beta value (β) from univariate Cox regression analysis as weighting factors 21 . We performed a multivariable logistic regression analysis after the LASSO (least absolute shrinkage and selection operator) algorithm, which simultaneously selects the variables and penalizes the model coe cients for overoptimism 22 . Multivariate analysis was applied using the Cox proportional hazards (Cox-PH) model to identify independent predictors of survival that involved the above-mentioned variables. After a series of univariate and multivariate analyses, covariates with a P value < 0.05 were used for subsequent risk prediction model construction. The standardized risk score was calculated using a formula described as follows: Patients with complete clinicopathological characteristics were segmented into a high-and a low-risk group, given the median value of the risk score. Kaplan-Meier curves served to compare the survival probability variation in low-and high-risk groups. The log-rank test P < 0.05 reveals the signi cance of survival time differences. All these analyses are done using R version 3.6.1, along with the corresponding R packages 23 . The P value less than 0.05 was regarded as the threshold of statistical signi cance. Immunohistochemistry (IHC) analysis. Immunohistochemical slides and relative clinical pathology information were approved from the Human Protein Atlas (HPA, https://www.proteinatlas.org/) 24 . The immunohistochemical staining results were evaluated by two independent pathologists, according to the integrated index by multiplying the intensity by the proportion of immunopositive cells of interest.
Weighted gene co-expression network analysis. To uncover the transcriptomic differences between HCC subgroups, weighted gene co-expression analysis was performed under the unique characteristics of the subgroups to identify potential functional modules that can characterize the biological functions of each subgroup. The optimal soft threshold parameter β (β = 7) was used to construct a scale-free coexpression network. Subsequently, based on Pearson's coe cient, genes with the same expression pattern were concentrated into speci c gene modules. The top 2 modules that had the strongest association with subgroups were selected for further analysis. GO and KEGG pathway enrichment analyses were utilized to examine whether genes from various terms are found more frequently than expected in subgroups.

Results
Screening for glycolysis gene sets with signi cant differences between tumor tissues and adjacent normal tissues. As demonstrated in Fig. 1, the panoramic ow chart illustrated our comprehensive analysis process. 12 solid tumors with complete clinical information and gene expression pro les were re ected in the present study, consisted BLCA, BRCA, COAD, HNSC, KIRC, KIRP, LIHC, LUAD, LUSC, OV, PRAD, and THCA. All the above data were derived from TCGA and have undergone normalization before the implementation of GSEA. GSEA was performed to carry out analysis with 9 gene sets associated with the glycolysis process. We attempt to phase whether these gene set variants produced signi cant differences between tumors and their adjacent noncancerous tissues. At least one gene set with an FDR value less than 0.05 was selected for subsequent studies. As presented in Fig (Table S1).
To further investigate whether these core genes participated in the glycolysis process, we performed GO and KEGG pathway analysis by using the R package ClusterPro ler. Further dissection of these genes and pathways indicated enrichment for glucose metabolisms, such as pyruvate metabolic process, pyruvate biosynthetic process, a glycolytic process in GO (Figure S1b-d), and Glycolysis/Gluconeogenesis in KEGG ( Figure S1e). These results indicate that these core genes have a profound effect on glucose metabolism, especially glycolysis.
Construction and validation of the prognostic glycolysis associated-gene signature. Core genes were assessed for correlation with OS through univariate regression analysis and future applied in multivariate Cox-PH regression model with a stepwise procedure to identify those important variables. We identi ed statistically signi cant gene signatures (GRGPI) in BLCA, BRAC, HNSC, LIHC, LUAD, and LUSC, the detailed results were shown in Table S2. These results highlight the power of GRGPIs to identify patients with adverse outcomes who would be classi ed as high risk according to these glycolysis gene-related classi ers. We further investigated the area under the time-dependent ROC curves (AUC) values for each cancer type. The highest AUC was observed in LIHC compared with BLCA, BRAC, HNSC, LUAD, and LUSC at 0.5 (0.852), 1(0.840), 2(0.871), 3(0.830), and 5-year (0.756) in Fig. S3. More speci cally, in a univariate Cox regression analysis, 92 glycolysis-related genes with signi cant correlations with overall survival were regarded as signi cant (Fig. 3a) in LIHC. To avoid over tting and unnecessary complexity, the independent prognostic factors were restricted to those variables that contributed most toward the nal model coe cients based on the AIC and the model χ2 score. The selected features were incorporated into a least absolute shrinkage and selection operator (LASSO) regression model to penalize for model complexity over tting. 8 genes (AURKA, DEPDC1, CDK1, CENPA, HMMR, KIF20A, PFKFB4, and STMN1) remained with their individually nonzero LASSO coe cients ( Fig. 3b and 3c). Multivariate analysis of these variables contributed to their virtual statistical weighting, determining their impact on prognostic risk, using Cox proportional hazard regression. Finally, the risk score of 8 gene signatures was established as follows: Risk score = (0. 1224 * expression of AURKA) + (0.0534 * expression of CDK1) + (0.0920 * expression of CENPA) + (0.1323 * expression of DEPDC1) + (0.1140 * expression of HMMR) + (0.2425 * expression of KIF20A) + (0.1562 * expression of PFKFB4) + (0.0911 * expression of STMN1). Patients with high or low risk were clustered based on the median risk score of the TCGA discovery cohort. The distribution of risk scores, survival status, and gene expression landscapes of patients varied signi cantly within the two subgroups as shown in Fig. 4a. Kaplan-Meier survival analysis disclosed that the survival of the low-risk group was signi cantly longer than the high-risk group (Fig. 4b, P < 0.001). The cumulative event probability curve shows that HCC patients in the high-risk group have a signi cantly higher probability of cumulative events during the entire follow-up period than in low-risk patients ( Fig. 4c, P < 0.001). We applied the classi er to assess whether the 8-mRNA panel can predict an individual or a speci c HCC recurrence. The TCGA dataset containing recurrence events and recurrence time was used as an internal training cohort (TCGA training cohort). Our prognostic evaluations of survival analysis for 8 gene signatures were based on TCGA recurrence-free survival (RFS) outcomes. The distribution of risk score, survival status, and gene expression patterns of patients was demonstrated in Fig. 4d. Patients with low-risk scores also had longer RFS time than patients with high-risk scores ( Fig. 4e, P < 0.001). The cumulative event occurrence curve revealed a signi cant cumulative risk (HR) of HCC patients in the highrisk group. (Fig. 4f)The analysis results show that this 8-gene signature can be used as a prognostic indicator for the outcome and recurrence of HCC patients. Subsequently, we performed another two independent analyses on the datasets from GEO and ICGC datasets. Consistently, as described earlier, in two independent validation sets, the 8-gene model sharply divided two risk subgroups ( Fig. 5a and 5d).
Not surprisingly, the survival analysis and cumulative risk curve indicated that the high-risk group had a shorter OS and higher cumulative risk ( Fig. 5b-c and 5e-f). We have observed the robust prognostic value of the classi er in 3 independent cohorts. Independent predictive value of the 8-mRNA signature. We constructed risk scores and developed predictive models to predict OS and RFS. To verify the assignments of sub-categories, we also performed t-SNE to constraint the dimensionality of the features. The T-SEN analysis revealed that the two risk subgroups are scattered in two discrete directions (Fig. 6a-d). In the TCGA discovery cohort, TCGA training cohort, GEO validation cohort, and ICGC validation cohort, we performed the time-dependent ROC curves analysis to estimate the prognostic accuracy of the 8 glycolysis-related signatures at 0.5-, 1-, 2-, 3and 5-years, for the prediction of OS and RFS, respectively. These results con rmed the high prognostic accuracy of the 8-signature model (Fig. 6e-h).
The explanation for many clinical situations one can identify some standard variables that have previously been demonstrated to have prognostic value and are generally measured for most patients having the particular diagnosis. As mentioned earlier, we then explored tumor-related clinicopathological variables associated with our classi er based on TCGA (Fig. 7a, Table S3), GEO (Fig. 7b, Table S4), and ICGC (Fig. 7c, Table S5) cohorts. Patient clinicopathologic characteristics were listed in Table 1. The results of Pearson's Chi-Square test showed that there were several signi cant correlations between clinicopathological characteristics and HCC risk subtypes in three independent cohorts. More speci cally, for the TCGA discovery cohort, T classi cation (P = 0.0032), Stage (P < 0.001 ), Grade (P = 0.0175), Family Cancer History (P = 0.0359), AFP level (P = 0.0147), Cancer Status (P < 0.001), Recurrence Event (P < 0.001) and patient Status (P < 0.001). In the TCGA training set, there was a signi cant correlation between T classi cation (P = 0.0095) and Stage (P = 0.0033) between HCC subgroups (Table S3). Similarly, in the GEO and ICGC cohorts, some important clinicopathological characteristics also have signi cant correlations between subgroups. More detailed results are shown in Table S4 and Table S5. To prove the independence of GRGPI, a Cox proportional hazard regression analysis was performed in the TCGA, GEO, and ICGC cohorts. The adjustment results of clinical variables suggested that risk score remained an independent prognostic factor, con rming its robust predictive ability for OS (HR = 1.267, P < 0.001) and RFS (HR = 1.027, P < 0.001) of HCC patients ( Fig. 8a and b). In the Cox regression model, some clinicalpathological factors (Cancer Status (P = 0.017), Hepatitis Virus Infection (P = 0.041), and Child-Pugh Score (P = 0.011) for OS; Cancer Status (P < 0.001), Hepatitis Virus Infection (P = 0.017), and BMI (P = 0.009) for RFS) were also considered as independent poor prognostic factors, and could prove valuable for risk strati cation in pathological subgroups in Fig. 8a and Fig. 8b. In addition, KM survival analysis revealed that disease-speci c survival rates are signi cantly different in some pathological subgroups, such as T classi cation (T1-2 Figure S4). These results were consistent with those obtained using a univariate Cox regression for OS with adjustments for the prognostic factors (Fig. 8a, Table S3). In addition, the index can independently predict the OS of GEO (HR (95% CI) = 2.430 (2.054-2.874), P < 0.001) (Fig. 8c) and ICGC cohorts (HR (95% CI) = 1.108 (1.069-1.149), P < 0.001) (Fig. 8d). All in all, these results strongly suggested that GRGPI was an independent prognostic factor for HCC patients.
Subsequently, further strati ed analysis was also performed to investigate the independence of the model within the same subgroups of clinicopathological features. Taking advantage of the clinicopathological parameters, we divided the TCGA discovery cohort into subgroups based on their clinical-pathological features, such as Gender (Male/Female), Age (≤ 65/>65), Grade (G1-2/G3-4), Stage (Stage I-II/III-IV), T classi cation (T1-2/3-4), Tumor Status (Tumor Free/With Tumor), etc. As revealed in Fig. S5 and Fig. S6, regardless of strati cation, 8-mRNA signatures still can make a distinction from high-risk patients. Finally, similar results in the GEO (Fig. S7-S9) and ICGC (Fig. S10-11) cohorts were also statistically signi cant.
To assess the sensitivity and speci city of the OS/RFS prognostic model and other clinical pathology variables, we performed a multiple ROC curve analysis. To estimate its performance, we applied the 8-mRNA model to the TCGA discovery cohort, TCGA training cohort, GEO, and ICGC cohort, and compared its prediction quality by evaluating the area under the ROC curve. Following the multivariate Cox regression and AUC analysis, the prognostic model remained a moderate and independent prognostic indicator (TCGA discovery cohort: AUC = 0.860; TCGA training cohort: AUC = 0.801; GEO validation cohort: AUC = 0.834; ICGC validation cohort: AUC = 0.843; Fig. 8e-h). It has demonstrated that the risk score model outperformed the other clinical pathology variables for the prognostic evaluation of HCC patients. These results indicated that GRGPI signatures have a predominately higher favorable value than other parameters in predicting the OS and RFS of HCC patients.
Development and veri cation of a personalized Nomogram. To provide clinicians with a portable quantitative table to predict the prognosis of liver cancer patients, a nomogram integrating GRGPI and clinicopathological characteristics was constructed in TCGA, GEO, and ICGC cohorts. In the TCGA cohort, the risk score contributed the largest risk point, compared with other clinicopathological characteristics, then followed by T classi cation, Hepatitis virus infection, Child-Pugh Score and Stage, etc (Fig. 9a). As shown in Fig. 9b and 9c, a total of 371 patients were reclassi ed in the new Nomogram model for OS NRI (net reclassi cation index) = 0.415. ROC analysis revealed the accuracy of the Nomo model, which is a good predictor of patient survival, with an AUC value of 0.873 (Fig. 9d). In the decision curve analysis, the novel nomogram showed more net bene t across the range of decision threshold probabilities than the Risk score model and integrated clinicpathology model (Fig. 9e). The calibration curves showed a stable agreement between the prediction by the nomogram and the actual observation for 1-, 2-, and 3-year OS in Fig. 9f. This novel Nomogram model that integrated GRGPI and clinicopathological features maintained good agreement between the predicted and observed survival probabilities in the GEO (AUC = 0.854) and ICGC (AUC = 0.863) cohorts (Fig S12 and Figure S13).
The landscape of immune in ltration in HCC risk subgroups. Due to the signi cant differences between subtypes, immune in ltration was studied to characterize their immunological characteristics. The CIBERSORT algorithm was used to calculate the abundance of 22 immune-related cell types and displayed in the heatmap and box plot in the TCGA (Fig. 10a-b), GEO (Fig. 10c-d), and ICGC ( Fig. 10e-f) cohorts, respectively. Consistently, the frequency of CD8 + T cells in the low-risk group was signi cantly higher, whereas the proportion of M2 macrophages were rather higher in the high-risk group, in 3 independent cohorts. After calculation of tumor immune in ltration levels of each patient, we found high CD8 + T cell levels could predict better survival, while high levels of M2 cells often indicated worse OS and RFS in HCC tissues (Fig. 10g-k).
We performed WGCNA and GSEA analysis to identify gene expression patterns between different subgroups. Here, based on average clustering, we did not detect outlier samples (Fig. 11a). The soft threshold β was set at 7 to determine a scale-free network (Fig. 11b). Subsequently, the genes were assigned to 16 modules, among which, the gray modules included genes that cannot be clustered (Fig. 11c). The two gene modules most correlated with high-(pink, yellow) and low-risk (greenyellow, turquoise) groups were presented in Fig. 11d. We performed GO and KEGG analysis to identify the potential biological signi cance of related TOP2 modules in different subgroups (Fig. 11e-h). Besides, we also performed a GSEA analysis based on the overall TCGA-LIHC expression pro les. The terms existing in both WGCNA and GSEA analysis results were shown in Fig. 11i-l. Speci cally, some terms related to cell cycle transition, chromatin separation, DNA replication, and DNA helicase activity were signi cantly enriched in a high-risk group.
Further veri cation of the 8-gene signature. To further determine the reliability of the observed gene signatures, we conducted additional veri cation at the transcriptome level and protein level. We evaluated the expression of 8 gene signatures, based on the TCGA, GEO, and ICGC databases. Inspection of the results revealed a general trend that these 8 gene signatures are dysregulated in HCC tumor tissues, as is demonstrated in Fig. 12a-c. Furthermore, we also examined the expression levels of 8 gene signatures at the protein level using immunohistochemistry (IHC), based on the Human Protein Altas. IHC con rmed the upregulation of protein in HCC tissue. Moderate or high staining intensity of these 8 proteins in HCC tissues contrasted sharply with the low intensity or lack of staining in normal tissues (Fig. 12d).

Discussion
Hepatocellular carcinoma (HCC) is classi ed as a highly malignant tumor that accounts for approximately 90% of total primary liver cancer 25,26 . It has been recognized as the most common malignancy and the most common cause of cancer mortality by absolute cases globally 27 . China represents an area with a high incidence of HCC. According to the World Cancer Report released by the World Health Organization in 2019, new cases of liver cancer in China account for half of the global new cases, and the total number of death accounts for more than half of the world 28 . Therefore, the treatment of hepatocellular carcinoma has received increasing attention worldwide. Surgical treatment has always been regarded as the main treatment for HCC [29][30][31] . However, most patients cannot be treated surgically because of tumor anatomical location, tumor size, tumor number, insu cient liver residual volume, or extrahepatic metastasis 32 . Nonsurgical treatment is currently available for most liver cancer patients. The recent development of medical technology and equipment has raised the urgent requirements for the management strategy for HCC 33 . Under the regulation of various ontogenetic modi cations, across in ammation, immune suppression, and through direct modulation of host cell behavior. Cancer cells undergo adaptive metabolic programming to maintain their distinctive metabolic state of in nite proliferation. Metabolic rewiring in cancer cells may render them highly dependent on speci c metabolic enzymes or processes, may be a viable strategy to design cancer-speci c therapeutics 34,35 . Glycogen metabolism is a crucial metabolic process in the liver. Compared to non-cancerous cells of origin, reprogramming of glucose metabolism speci cally contributes to facilitates aberrant proliferation and survival in HCC cells [36][37][38] . A variety of enzymes and proteins engaged in the process of HCC can undergo structural, functional, and expression changes to achieve metabolic reprogramming, which in turn controls the entire glycogen metabolism network to make it suitable for HCC growth [39][40][41][42] . Here, we focused in more detail on the abnormalities in their expression patterns of glycolysis-related genes to re ect the metabolic activity of tumors in hypoxic mode. Based on comprehensive bioinformatics analysis, we applied a series of gene sets containing genes responsible for encoding key glycolysis enzymes. Based on the GSEA consequences, we selected those core genes that are signi cantly enriched in tumor tissues for subsequent analysis. Prospectively, we implemented a combination of transcriptomic analyses followed by a K-M analysis to evaluate the correlation of the expression of glycolysisassociated gene signatures with patient prognosis in 12 solid tumors. After multivariate Cox-PH regression analysis, we identi ed 8 independent prognostic signatures in HCC, including AURKA, CDK1, CENPA, DEPDC1, HMMR, KIF20A, PFKFB4, and STMN1. We developed and optimized a robust prognostic model based on the 8-gene signatures and veri ed the powerful performance in 3 independent veri cation cohorts. Our results suggest that gene signatures related to the glycolysis pathway have the ability to accurately predict the poor prognosis and recurrence of HCC patients. Surprisingly, the prognostic model shows more accurate predictive ability and is superior to other pathological features. In an attempt to evaluate the GRGPI-based risk model in the clinical setting, a nomogram integrating multiple important clinicopathological characteristics has been established, which can be used as a powerful and easy-to-use tool for evaluating the survival probability of HCC patients.
Several studies have shown that some gene signatures are derived from glucose metabolism, including several that are critical for glycolysis and are typically overexpressed in glycolytic cancer cells. Aurora kinase A (AURKA) is an important regulatory protein involved in the regulation of chromosome congression/alignment, regulation of chromosome segregation, and regulation of spindle dynamics 43,44 .
Beyond all those effects in the cancer environment, AURKA actively promotes DNA repair and acts as a transcription factor to promote cell migration and invasion 44,45 . It is equally located in the mitochondrial membrane to regulate mitochondrial dynamics and ATP production 46,47 . It has been recognized as a powerful prognostic indicator that probably integrates multiple oncogenic events in the progression of tumors 43,48,49 . CDK1 (Cyclin-dependentkinase 1) is a serine/threonine-like protein kinase that plays an essential role in controlling cell proliferation at the G2/M point of the cell cycle. Some reports have con rmed that high CDK1 expression is an independent predictor for tumor recurrence in one and ve years, and it has been noted that compounds targeting CDK1 could be novel antitumor reagents [50][51][52] .
Growing evidence suggests that CENPA was frequently overexpressed in a variety of cancers, playing an auxiliary but important role in cancer pathogenesis, progression, distant metastases, invasion angiogenesis, etc 53,54 . Previous publications showed that CENPA is abnormally overexpressed in hepatocellular carcinoma (HCC) tumor tissue. It has been declared to be associated with overall survival and ketone bodies 71,72 . Studies in the past decade have found that aerobic glycolysis and the resulting acidi cation of tumor microenvironment (TME) should exert speci c inhibitory effects on the antitumor immune response mediated by T cells and the activity of tumor-in ltrating myeloid cells. Therefore, intervening for sugar metabolism and/or lactic acid production and secretion is an attractive anticancer treatment strategy 73,74 . Also, we evaluated the in ammatory in ltration landscape in HCC tissues based on 22 immune cells using the cibersort 75 . T cells CD8 have emerged as important immunomodulatory cytokines, which play a critical role at the interface between innate and adaptive immunity, especially in antitumor immune response [76][77][78][79][80][81][82] . Macrophages M2 have been proven to promote tumor progression and poor prognosis, especially the formation of metastases in target organs [82][83][84] . We hypothesize that the enhanced glycolytic activity contributes to the highly acidic environment in the TME. It is widely understood that tumor-reactive T cells are suppressed and resulting in loss-of-function in the acidic TME induced by glycolysis activity, representing a critical barrier for the success of cancer immunotherapy 85 .
Most immunotherapies target the immune system but not cancer and, therefore, immunotherapies are believed to be a promising foundation to build treatment regimens for a variety of tumor types. However, the complexity of the metabolic regulation of immune cell subsets and the in uence of the TME may have signi cant implications for the effectiveness of these therapies. Our research revealed 8 gene signatures related to the prognosis of HCC, which are important regulators involved in glucose metabolism and energy production, especially the glycolysis process. Altogether, these observations suggest that the glycolysis pathway is required for the proliferation of most cancer cells and energy production in recreating the microenvironmental characteristics, which enables future efforts for therapeutic optimization to block the glycolysis pathway, thereby controlling tumor progression and improving patients prognosis. These ndings may have remarkable prognostic and therapeutic implications for HCC patients.
Inevitably, we acknowledge that some limitations concerning the current study should be considered.
First, we use univariate Cox regression analysis and the LASSO method to ltrate glycolytic genes associated with clinical outcomes of HCC and build a prognostic model through multivariate Cox-PH regression analysis. In the linear regression model, adjustments were made stepwise in major groups, which did reveal which variables contributed the most to confounding, some important components with similar contributions may be ignored. Second, we developed and validated the prognostic risk model based on public databases, which was not veri ed by prospective clinical trials. Subsequent research should consider some traditionally recognized clinical factors, which have a profound impact on tumor progression and prognosis of HCC patients. There are factors related to a clinical interaction that may be missed, such as tumor volume, TP53 mutation, CTNNB1 mutation, lifestyle, and patient follow-up time, and relevant therapeutic information. The factors mentioned above will have more or less impact on the scienti c prediction of the model. Therefore, the predictive performance of predictive models based on glycolysis-related gene signatures should be more thoroughly explored in subsequent studies.
Importantly, whether clinical decision-making dictated by such approaches leads to improved clinical risk strati cation remains an unanswered question that will require well-designed, prospective, multicenter collaborative trials. In order to overcome these limitations, more in vivo and in vitro studies are urgent to verify our model and explore more complex and in-depth biological mechanisms. These will be the focus of our future research.

Page 16/33
We have developed and optimized a novel 8-gene signature for identifying outcomes and recurring events in HCC patients. This predictive model improves the accuracy of predicting patient prognosis. Also, the 8-         of Nomogram for 1-, 2-, and 3-year. f Calibration curves of 1-, 2-, and 3-year OS for HCC patients in the TCGA discovery cohort.