A Novel Nomogram Based on Lipid Metabolism-related Risk Gene Expression Can Better Predict Overall Survival for Hepatocellular Carcinoma

Metabolic reprogramming has been proven to be a hallmark of cancer. The pathogenic factors involved in hepatocellular carcinoma (HCC) lead to abnormal lipid metabolism that facilitates the malignant transformation of liver cells. However, the association between lipid metabolism and the prognosis of HCC has not been systematically delineated. In this study, the training set comprised 221 patients from The Cancer Genome Atlas (TCGA) with gene expression details, whereas 230 patients from the International Cancer Genome Consortium (ICGC) comprised the validation set. Ten lipid metabolism-related risk genes were screened and found to be significantly related to the prognosis of HCC. The risk score was calculated from these ten genes and was confirmed to be an independent prognostic factor for HCC even when excluding clinical features. Therefore, a novel nomogram integrating the risk score and other proven clinical attributes was constructed. The area under the receiver operating characteristic curve (AUC), C index, and calibration plot supported the better predictive capacity of the nomogram over others. Treatment with metformin significantly and favorably affected the expression of four of the ten genes, which was beneficial to longer overall survival. The results provide new insight into accurate prognostic prediction, as well as into the carcinogenesis and progression of HCC.
showed that ACSL3 contributes to the growth of castration-resistant prostate cancer (CRPC) through intratumoral steroidogenesis 19 . Haarith Ndiaye et al., utilizing immunohistochemical analyses of HCC tissues and subcellular fractionation of cultured HepG2 cells, discovered increased expression of ACSL3 in HCC compared with normal liver 20 . In our study, HCC patients with high ACSL3 expression had worse survival than those with low expression in both the TCGA and ICGC databases.
Suppressor of cytokine signaling 2 (SOCS2) encodes SOCS2 family proteins, which are negative regulators of cytokine receptor signaling via the JAK/STAT pathway. SOCS2 is a well-known cancer suppressor. It inhibits progression and metastasis in colon, breast, and lung cancer 47-49 . An experiment in mice indicated that SOCS2 is a modulator of obesity that regulates metabolic pathways depending on adipocyte size. Moreover, SOCS2 also serves as an inflammatory regulator by controlling cell differentiation or recruitment into adipose tissue and cytokine release during the progression of obesity 50 . Another study proved that SOCS2 plays a protective role in acute liver injury by balancing immune response and oxidative stress 51 . Chen et al. elucidated a mechanism of epigenetic alteration in HCC; SOCS2 expression was suppressed by methyltransferase-like 3 (METTL3) via an m6A-YTHDF2-dependent process 52 . Ren et al. concluded that high expression of SOCS2 inhibits HCC progression via the JAK/STAT pathway, related to downregulation of miR-196a or miR-196b 53 . In our results, high expression of SOCS2 also displayed a protective effect for HCC patients.
Transducin (beta)-like 1X-linked receptor 1 (TBL1XR1), belonging to the WD40 repeat-containing family, shares sequence identity with TBL1X and is required for transcriptional activation. Mutations or recurrent translocations in this gene have been observed in intellectual disability and infrequently in some Several studies showed that the upregulation of TBL1XR1 not only promotes cancer cell (including lung, cervical, ovarian, breast, and gastric cancer) proliferation, migration, invasion, and


Introduction
Liver cancers have the sixth-highest incidence of all cancers and are the fourth leading cause of cancer-related deaths. There are approximately 841,000 new cases and 782,000 deaths each year 1 . Morbidity and mortality are increasing year by year. The American Cancer Society estimated that there would be more than 42,000 new cases and 30,000 deaths by the end of 2020 in the USA alone 2 . Hepatocellular carcinoma (HCC) accounts for 85 to 90 percent of primary liver cancers and has been a hot spot in cancer research. The efficacy of modern diagnostic and therapeutic options against HCC is unsatisfactory 3 . With a 5-year survival of 18%, HCC is second only to pancreatic cancer in lethality 4 . Many studies have endeavored to develop an ideal tool for HCC prognosis prediction. However, an optimal model has not yet been established.
The initiation of HCC is closely related to underlying liver disease or injury, such as hepatitis B or C virus (HBV or HCV) infection, aflatoxin B1 (AFB1) exposure, or alcohol abuse. These factors, singly or synergistically, cause fatty degeneration of liver cells and lipid deposition, which leads to an imbalance in liver lipid metabolism and facilitates malignant transformation of liver cells 5 . Studies have reported that the plasma levels of triglycerides, cholesterol, free fatty acids, high- and low-density lipoproteins, and apolipoproteins were significantly reduced in most liver cancer patients 6,7 . In western countries, nonalcoholic fatty liver disease (NAFLD) may soon become the dominant causative factor in HCC 8 .
Metabolic reprogramming is regarded as a hallmark of cancer 9 . As early as 1953, Medes et al. described increased de novo lipid synthesis as a metabolic alteration in cancer. They concluded that essential lipids for cancer cell growth are obtained from the host 10 . Multiple studies have focused on lipid metabolism and the lipogenic phenotype in cancer cells 11 . Some potential drugs targeting lipid metabolic reprogramming have undergone clinical trials 12 . Metformin is commonly used for blood sugar control in diabetic patients, especially those with excessive body mass index (BMI). It inhibits hepatic gluconeogenesis and reduces hepatic glycogenolysis. Tseng CH found that metformin reduces the risk of HCC in a specific dose-response pattern 13 . Another study showed that type 2 diabetes promotes HCC through insulin resistance 14 . Metformin does not increase insulin secretion by stimulating islet B cells. Instead, it acts directly on the metabolic process of sugar, promotes the anaerobic glycolysis of sugar, and increases glucose uptake and utilization by peripheral tissues such as muscle and fat. This unique mechanism of action may help reduce insulin resistance and further benefit HCC patients.
Our research identified a lipid metabolism-related gene set associated with the prognosis of HCC and calculated an HCC prognostic risk score from the screened lipid metabolism-related genes through LASSO regression analysis. We established a nomogram for HCC prognostic prediction by combining the risk score with clinical factors.

Consensus Clustering Analysis
The lipid metabolism-related gene set was downloaded from the Gene Set Enrichment Analysis (GSEA) website (https://www.gsea-msigdb.org/; gene set: GO_GLYCEROLIPID_METABOLIC_PROCESS). Gene expression data and clinical details were obtained from The Cancer Genome Atlas (TCGA) (https://portal.gdc.cancer.gov/) and International Cancer Genome Consortium (ICGC) (https://icgc.org/) databases. Patients with an unclear pathological diagnosis or a follow-up time < 30 days were excluded. A total of 451 HCC patients were enrolled in the study. Among them, 221 cases from the TCGA were used as the training set and 230 from the ICGC as the validation set (Table supplement 1). Consensus clustering of the metabolic gene set was conducted in the TCGA database using the "ConsensusClusterPlus" package of R (https://www.r-project.org). The cumulative distribution function (CDF) and consensus matrices were used to estimate the optimal number of clusters. The difference in gene expression between the clusters was evaluated using principal component analysis (PCA) with the R function "princomp".

Identification, Selection and Evaluation of DEGs
R and the "limma" Bioconductor package were used to identify differentially expressed genes (DEGs) with |logFC|>1 and FDR<0.05. A PPI network was constructed using STRING (version 10.5, http://string-db.org) with confidence >0.9 as the cutoff criterion for up- or down-regulated genes in both lipid metabolism and HCC. Enrichment analysis was performed to find out which pathways the screened DEGs were enriched in, using DAVID (Database for Annotation, Visualization and Integrated Discovery, version 6.8, https://david-d.ncifcrf.gov), Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO).
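The thresholding step can be illustrated with a minimal Python sketch (the study itself used R and limma, so this is a stand-in and not the authors' code). It assumes that "FDR" refers to Benjamini-Hochberg adjusted p-values, which the text does not state explicitly; the gene names and numbers are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, log_fc, pvalues, lfc_cutoff=1.0, fdr_cutoff=0.05):
    """Keep genes with |logFC| > lfc_cutoff and adjusted p-value < fdr_cutoff."""
    fdr = benjamini_hochberg(pvalues)
    return [
        gene
        for gene, lfc, q in zip(genes, log_fc, fdr)
        if abs(lfc) > lfc_cutoff and q < fdr_cutoff
    ]


# Hypothetical input, for illustration only.
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
log_fc = [2.1, -1.5, 0.4, 1.2]
pvalues = [0.001, 0.004, 0.0005, 0.30]
degs = select_degs(genes, log_fc, pvalues)  # GENE_A and GENE_B pass both cutoffs
```

GENE_C is dropped for its small fold change despite a small p-value, and GENE_D is dropped for its large adjusted p-value despite a sufficient fold change.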

Confirmation of Hub Risk Genes and Patient Grouping According to the Level of Risk Score
All DEGs were further screened using univariate analysis and LASSO regression. The filtered DEGs were then selected to build a risk score formula as follows: Risk score = (coefficient * expression of gene 1) + (coefficient * expression of gene 2) + ... + (coefficient * expression of gene X). The patients were separated into low- and high-risk groups based on the cut-off value defined by the median value of the risk score in both the training and validation sets.
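The risk score formula and the median split can be sketched in Python as follows. The coefficients and expression values are hypothetical placeholders, not the fitted LASSO coefficients, and assigning a score exactly at the median to the low-risk group is an assumption, since the text does not say how ties are handled.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Sum of coefficient * expression over the genes in the signature."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def split_by_median(scores):
    """Label each patient 'high' or 'low' relative to the median score."""
    cutoff = median(scores.values())
    groups = {
        patient: ("high" if score > cutoff else "low")  # ties go to 'low' (assumption)
        for patient, score in scores.items()
    }
    return cutoff, groups


# Hypothetical coefficients and expression values, for illustration only.
coefficients = {"ACSL3": 0.5, "SOCS2": -0.25}
patients = {
    "P1": {"ACSL3": 2.0, "SOCS2": 4.0},
    "P2": {"ACSL3": 4.0, "SOCS2": 2.0},
    "P3": {"ACSL3": 3.0, "SOCS2": 2.0},
    "P4": {"ACSL3": 1.0, "SOCS2": 0.0},
}
scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
cutoff, groups = split_by_median(scores)  # cutoff is 0.75 for this toy data
```

A negative coefficient, as used here for SOCS2, lowers the score as expression rises, which is how a protective gene would enter such a formula.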

Analysis of Screened Risk Genes
Survival curves of the screened DEGs were drawn according to gene expression in the training and validation sets using Kaplan-Meier analysis. Later, the proteins encoded by the screened risk genes were analyzed using The Human Protein Atlas (https://www.proteinatlas.org) to determine whether their distribution differed between HCC and adjacent tissues. We used GSEA (http://software.broadinstitute.org/gsea) to find potential functional annotations for these genes.
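For readers unfamiliar with Kaplan-Meier analysis, the product-limit estimate behind such survival curves can be sketched in a few lines of Python. This is a generic illustration with hypothetical follow-up data, not the software used in the study.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with an observed death.

    events[i] is 1 if the death was observed and 0 if the patient was censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # deaths and censored patients both leave the risk set
    return curve


# Hypothetical follow-up times (months) and event indicators.
times = [5, 8, 8, 12, 20]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
```

Computing one such curve per expression group (for example, high versus low expression of a gene) gives the two step functions that are compared in a Kaplan-Meier plot.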

The Correlation between the Risk Score and Clinical Features
We compared the prognosis and clinical characteristics of the high- and low-risk groups using Kaplan-Meier analysis in both the training and validation sets. The relationship between the risk score and clinical characteristics was assessed using univariate and multivariate Cox analysis.

Construction and Evaluation of the Nomogram
Sex, age, TNM stage, and risk score were selected as prognostic factors to establish a nomogram. The area under the receiver operating characteristic curve (AUC), C index, and calibration plot were used to assess the precision of the nomogram in both the training and validation sets.
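As an illustration of the C index, the sketch below computes a concordance index in the style of Harrell for right-censored survival data. The text does not specify which variant was used, so this choice is an assumption, and the data are hypothetical. A higher predicted risk is expected to go with a shorter survival time.

```python
def concordance_index(times, events, risk_scores):
    """Fraction of comparable patient pairs whose predicted risks are ordered correctly.

    A pair (i, j) is comparable when patient i had an observed event
    and a shorter follow-up time than patient j.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data: the third patient is censored at 12 months.
times = [5, 8, 12, 20]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.6, 0.1]
c_index = concordance_index(times, events, risk)  # 4 of 5 comparable pairs concordant
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering; in a nomogram setting the total nomogram points would take the place of `risk`.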

Effect of Metformin Treatment on Risk Genes
To probe the correlation between risk genes and metformin intake in HCC, the expression changes of all screened risk genes were investigated in GSE69850. This dataset included 9 samples of HepG2 cells treated with metformin and 39 control samples treated with dimethyl sulfoxide (DMSO). The groups were compared using unpaired t-tests in GraphPad Prism 8.0.
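The statistic behind an unpaired t-test can be sketched in Python as follows. The sketch assumes the equal-variance (Student) form; the text does not say whether Welch's correction was applied, and the expression values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def unpaired_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for a pooled-variance t-test."""
    na, nb = len(sample_a), len(sample_b)
    df = na + nb - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled = ((na - 1) * variance(sample_a) + (nb - 1) * variance(sample_b)) / df
    t = (mean(sample_a) - mean(sample_b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


# Hypothetical expression values for one gene.
treated = [1.0, 2.0, 3.0]
control = [3.0, 4.0, 5.0, 6.0]
t_stat, dof = unpaired_t(treated, control)
```

The two-sided p-value then follows from the t distribution with `dof` degrees of freedom; a negative `t_stat` indicates lower mean expression in the treated group.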

Subtype of HCC Owing to the Lipid Metabolism-related Gene Set
The patients were divided into 2 clusters (K=2) by consensus clustering to examine the correlation between lipid metabolism-related gene expression and the outcome of HCC (Figure 1A-C). PCA revealed that the two clusters differed significantly (Figure 1D). A chi-square test showed differences in sex, age, tumor grade, and TNM stage between the two clusters (Table supplement 1).

Selection and Evaluation of the Prognostic Lipid Metabolism-related Genes
A total of 214 lipid metabolism-related DEGs were identified between HCC tissues and adjacent non-tumor tissues (Figure 1E). A PPI network was built with the STRING tool to inspect the interactions among the 214 DEGs (Figure 1F), and GO and KEGG analyses were performed. The GO analysis showed three main terms enriched in the DEGs: glycerolipid metabolic process, plasma lipoprotein particle, and phosphoric ester hydrolase activity (Figure 2A). The KEGG analysis showed three main pathways enriched in the DEGs: glycerolipid, phospholipid, and glycerophospholipid metabolic process (Figure 2B).

Ten Screened Lipid Metabolism-related Genes
Ten risk DEGs (ACSL3, LCLAT1, LPCAT1, PIGU, PLA2G7, PLEKHA8, PON1, PTPMT1, SOCS2, TBL1XR1) were chosen after univariate and LASSO regression analysis (Figure 1G and 1H) to derive a risk score formula. Patients were classified into high- and low-risk groups according to the median risk score (0.71) (Figure supplement 1). The Kaplan-Meier plots showed that the high-risk group had shorter survival than the low-risk group in both sets (Figure 1I and 1J). GSEA revealed that the ten risk genes were involved in ten pathways. The high-risk score group was mainly enriched in the vascular smooth muscle contraction, hypertrophic cardiomyopathy (HCM), neuroactive ligand-receptor interaction, calcium signaling, and dilated cardiomyopathy pathways, whereas the low-risk score group was mainly enriched in the homologous recombination, cell cycle, pyrimidine metabolism, RNA degradation, and spliceosome pathways ( Figure
We also searched The Human Protein Atlas for the ten proteins encoded by the ten key DEGs; eight proteins (ACSL3, LCLAT1, LPCAT1, PIGU, PTPMT1, TBL1XR1, PON1, and SOCS2) were found. Immunohistochemistry showed significant differences in the distribution of all eight proteins between cancers and adjacent tissues (Figure 3K-Z). The associations between all eight related genes and survival rates in our study were statistically significant.

The Risk Score Was An Independent Prognostic Factor for HCC
We investigated the correlation between the risk score and the clinicopathological factors of HCC patients in the two sets. There were significant differences in grade, T stage, and TNM stage but no significant differences in sex and age between the high- and low-risk groups (Table 1). Moreover, poor differentiation, higher T stage, and worse TNM stage were positively correlated with the risk score ( Figure

Metformin Was Conducive to HCC Prognosis
Unpaired t-test results revealed that metformin intake was associated with expression changes of four genes (LCLAT1, PIGU, PON1, PTPMT1) in GSE69850 (p<0.05) (Figure 3AA-AD). DMSO was used as a control.

Discussion
Fast-multiplying cancer cells mainly draw energy from increased aerobic glycolysis, which is known as the "Warburg Effect" 15 20 . In our study, HCC patients with high ACSL3 expression had worse survival than those with low expression in both the TCGA and ICGC databases.
Lysocardiolipin acyltransferase 1 (LCLAT1), a cardiolipin-remodeling enzyme of mammalian mitochondrial cardiolipin, modulates mitochondrial membrane potential, cardiolipin remodeling, reactive oxygen species generation, and apoptosis of alveolar epithelial cells 21 . One study demonstrated that LCLAT1 causes insulin resistance 22 . Another study demonstrated that insulin resistance promotes the progression of HCC. There have been no reports about LCLAT1 in tumors. In our study, metformin intake was related to decreased LCLAT1 expression, and high expression of LCLAT1 predicted a poor prognosis in both the training and validation sets.
Lysophosphatidylcholine acyltransferase 1 (LPCAT1) participates in phospholipid metabolism, particularly in converting lysophosphatidylcholine into phosphatidylcholine in the presence of acyl-CoA. Bi et al. identified LPCAT1 as a hub among signaling, tumor growth, and the expression of genetically driven growth factor receptors 23 . Mounting evidence suggests that alteration in LPCAT activities is involved in pathological processes such as NAFLD, viral infections, and cancer 24 . Several studies found that LPCAT1 is upregulated or overexpressed in human cancers, including colorectal, renal, prostate, lung, and breast cancer 25,26 . Moreover, LPCAT1 upregulation leads to poor prognosis by promoting progression and recurrence of breast and prostate cancer 27,28 . LPCAT1 also stimulates brain metastasis of lung adenocarcinoma 29 . For HCC cells cultured in favorable conditions, LPCAT1 modulated phospholipid composition and distribution. Moreover, LPCAT1 overexpression promoted HCC cell proliferation, invasion, and migration, whereas LPCAT1 knockdown produced the opposite effect 30 . In our study, high expression of LPCAT1 was associated with poor prognosis in the training set, but this was not observed in the validation set.
Phosphatidylinositol glycan anchor biosynthesis class U (PIGU), encoding the fifth subunit of GPI transamidase, was confirmed as an oncogene for bladder cancer 31 . For HCC, PIGU was a significant stage-specific DEG 32 .
Additionally, consistent with our study, PIGU overexpression was reported as an independent predictive factor for poor prognosis in HCC, and the incorporation of PIGU expression with the typical TNM stage was thought to improve prognostic stratification 33 .
Phospholipase A2 group VII (PLA2G7) catalyzes the activation of the platelet-activating factor. PLA2G7 defects lead to platelet-activating factor acetylhydrolase deficiency. Moreover, knocking out PLA2G7 leads to the absence of soluble lipoprotein-associated phospholipase A2 activity 34 . Most studies involving PLA2G7 focus on inflammatory interaction or lipid metabolism in coronary heart disease, stroke, diabetes, and obesity, but few on tumors 35-38 .
Pleckstrin homology domain-containing A8 (PLEKHA8), also known as FAPP2, participates in vesicle maturation and promotes cytoplasmic lipid transfer. Chen et al. demonstrated that PLEKHA8 overexpression promotes human colon cancer cell growth via active Wnt signaling 39 .
Paraoxonase 1 (PON1), whose expression is largely restricted to the liver, exhibits lactonase and ester hydrolase activity. The enzyme is synthesized in the liver and kidney, binds to high-density lipoprotein (HDL) particles after being secreted into the circulation, and hydrolyzes thiolactones and xenobiotics. Sun et al. reported that serum PON1 level could be used to distinguish early hepatocellular carcinoma from liver cirrhosis with a sensitivity of 71.4% and 95.2% and a specificity of 94.7% and 78.9%, respectively 40 . Ding et al. found that the serum level of PON1 was better than AFP for microvascular invasion prediction and did not fluctuate significantly with changes in tumor size in HCC patients 41 . In our study, PON1 showed a significant predictive capability for survival rate. PON1 low expression indicated a better prognosis.
Protein tyrosine phosphatase mitochondrial 1 (PTPMT1) is a crucial intermediate in cardiolipin biosynthesis and hematopoietic stem cell differentiation 42,43 . In pancreatic beta cells, the downregulation of PTPMT1 led to an elevation of insulin production and cellular ATP levels 44 . One study reported that PTPMT1 downregulation promoted cancer cell death 45 . Another reported that modulating PTPMT1 alternative splicing would ameliorate cancer cell radioresistance 46 .
Suppressor of cytokine signaling 2 (SOCS2) encodes SOCS2 family proteins, which are negative regulators of cytokine receptor signaling via the JAK/STAT pathway. SOCS2 is a well-known cancer suppressor. It inhibits progression and metastasis in colon, breast, and lung cancer 47-49 . An experiment in mice indicated that SOCS2 is a modulator of obesity that regulates metabolic pathways depending on adipocyte size. Moreover, SOCS2 also serves as an inflammatory regulator by controlling cell differentiation or recruitment into adipose tissue and cytokine release during the progression of obesity 50 . Another study proved that SOCS2 plays a protective role in acute liver injury by balancing immune response and oxidative stress 51  suggested that the nomogram could better predict HCC prognosis.
Furthermore, we found that metformin intake was associated with decreased LCLAT1, PIGU, and PTPMT1 expression and increased PON1 expression. These trends matched the calculated prognosis trends. Therefore, we speculate that the four genes may offer effective therapeutic targets for HCC with abnormal lipid metabolism. Of note, the lipid metabolism-related risk score remained an independent prognostic factor even with the exclusion of clinical features. Therefore, combining the risk score with other proven features could produce a better prediction of HCC prognosis.
