Regulatory network difference between tumor and normal liver cells
Due to the indispensability between metabolic and regulatory mechanisms, it is reasonable to take regulatory network into consideration when studying cancer metabolism. There are many algorithms designed to infer these networks from sequencing profiles and the results have been validated in model organisms like S.cerevisiae and E.coli. We employed MERLIN and CMIP algorithms (Methods) together with paired RNA-Seq data obtained from GEO database (GSE77314)(10) to construct the regulatory networks of HCC and paired normal tissues. There are in total 15143 pairs and 29127 pairs of regulation between transcriptional factors and target genes deduced from tumor and normal samples (Supplementary Table 1). Among them, 1654 pairs are same. Cytoscape was used to visualize the topology difference (Methods) of these two networks, after deducting the nodes with little influence, the most cored structure is shown in Fig. 2.
NME2 and NFKBIA are clearly the hub TFs playing important roles in normal liver models. In contrast, these two TFs are absolutely absent in HCC tumor model. It is well-known that nuclear factor κB (NF-κB) affects multiple biological process by regulating immune response and inflammation, so it has been recognized as hallmark in cancer progression(16). NFKBIA is a member of a cellular protein family which can mask the nuclear localization signals of NF-κB and block its way to bind DNA. Because of this inhibition ability, it has long been considered as a tumor suppressor(17). Also, NME is a gene family associated with suppressing cancer metastasis and invasion(18). Especially, NME2, which is located on 17q21, its products have been reported to inhibit the metastasis of breast cancer and lung cancer (19, 20). Besides, there are experiments supporting this idea by showing that the RNA expression of NFKBIA and NME2 are significantly decreased in HCC tissues compared to normal tissues. Therefore, the reconstructed regulatory networks effectively revealed the critical differences between liver cancer and normal tissues. It is persuasive to consider NME2 and NFKBIA as tumor suppressor factors for future study.
Integrative regulatory-metabolic network identified abnormality of Hippo signaling as key misregulation in HCC
In our previous work, we developed a framework called Integrated Deduced Regulation And Metabolism (IDREAM) (21), which uses bootstrapping linear regression model on large-scale gene expression dataset to predict transcription factor (TF) regulation on enzyme-encoding genes, and then utilizes a probabilistic regulation of metabolism approach to apply regulatory constraints to the metabolic network. Then the integrated model can predict the influence of each TF knockout on certain objective function, such as cell growth. It has been successfully applied in Saccharomyces cerevisiae to effectively predict the influence of transcriptional regulation on metabolic phenotype. Here we used IDREAM algorithm (Methods) to integrate regulatory network with metabolic models to identify potential transcription factors vital to the growth of HCC cells.
The composition of integrated model for HCC and normal liver tissue are listed in Supplementary Table 1. The basic metabolism part is consistent, while the transcriptional factors and target genes are different. There are 1313 TFs in HCC model and 1312 TFs in normal model, among them, 33 and 32 TFs are unique, respectively. To analyze the 32 specific TFs (including NME2 and NFKBIA mentioned above) not involved in HCC regulatory network, the Reactome database show that they were enriched for YAP1- and WWTR1 (TAZ)-stimulated gene expression pathway. They are transcriptional co-activators interacting with TEAD family genes to promote expression of TFs critical to cell proliferation and apoptosis through Hippo signaling pathway, which means the depletion of these 32 TFs will lead to the abnormality of Hippo signaling and may induce a wide range of cancers. In addition, the 33 specific TFs in HCC integrated model were mainly enriched in pathways in cancer metabolism and transcriptional misregulation. Both conclusions supported that the integrated models we reconstructed are credible.
For each TF knockout, we changed the constraints on corresponding reactions according to activation/inhibition interactions and then simulated the cell growth rate to get the ratio over wild type. We found TFs affecting both tumor and normal cell growth, and TFs only reduce growth of tumor cell particularly (Supplementary Table 2), for example, ETV7, whose knockout will lead to nearly 2/3 reduction in tumor cell growth; CREB3L3, which seems to be closely correlated with poor prognosis in HCC.
Precise stratification of TCGA-LIHC samples based on metabolic and transcriptional gene expression
Although researchers have long been working on identifying genes or pathways available for treatment, the high heterogeneity of HCC has always been an obstacle in the precise clinical diagnosis, it is partly due to the fact that the current TNM stage classification is not that efficient. Therefore, researchers are devoted to try to stratifying HCC patients on molecular level and hope to find corresponding therapeutic targets. Bidkhori et al. utilized metabolic network-based method to divide 179 TCGA-LIHC samples into three subtypes and identified their specific characteristics; Jiang et al. used proteomic data to classify HCC patients and explored the mechanism of early-stage HCC tumor cell. Inspired by them, here we used 315 HCC samples attached with actual clinical stage information from TCGA-LIHC dataset and 3492 genes (Methods) obtained from multiple sources to identify altered metabolism among different subgroups and specific characteristics of poor prognosis subgroup.
Using a non-negative matrix factorization consensus-clustering analysis, three major classes are concluded in TCGA-LIHC cohort, 130, 127 and 58 cases are classified into Class1, Class2 and Class3 respectively (Methods). The survival curves are shown in Fig. 3a,from which we can see that Class2 has an significantly lower overall rate of survival. By collecting 159 overlapped samples used in Bidkhori et al.‘s study and ours, we reached a relatively good agreement in identifying the lowest OS subgroup: 90.57% (48 of 53, their results that are also in ours ) and 70% (49 of 70, our results that are also in theirs). Consequently, we focused on digging out the characteristics of poor prognosis subgroup Class2 at both transcriptome and metabolism level.
A supervised analysis through LIMMA(22) revealed Class2-specific gene expression patterns, comprising 287 up-regulated (including three potential therapeutic targets: ALDOA, G6PD, ACSS1 specific to lowest OS subgroup identified by Bidkhori et al. ) and 112 down-regulated genes (Supplementary Table 3) (Methods), which are enriched in 17 and 13 non-overlapped KEGG pathways respectively, as shown in Fig. 3c.
In order to validate the effectiveness of our stratification strategy, we applied the same strategy on LIRI-JP data (ICGC database) to split it into three subgroups, as we can see in Fig. 4a, the P-value of survival curves are still significant. By analyzing the pathways enriched for differentially expressed genes, the result shows high consistency with the ones of TCGA-LIHC data: the up-regulated genes are mainly enriched in well-known cancer related pathways in improving cell proliferation, such as increased glucose uptake as a principal nutrient source in central carbon metabolism of cancer, cell cycle and fructose metabolism; HIF signaling, which consists of master regulators of oxygen homestasis that not only allow tumor cells to adapt to hypoxic environment by enhancing oxygen delivery but also affect important growth factors like VEGF; let alone viral carcinogenesis and hepatitis B pathways which can be directly link with HCC development. In the meantime, the down-regulated genes are generally found in pathways contributing to drug metabolism, for example, the PPAR signaling pathway, which has also been identified in less aggressive HCC subtypes through proteomics analysis; drug cytochrome P450 metabolism, which is indeed reduced in advanced cancer patients (23).
Integrated models revealed that PI3K-Akt and mTOR signaling pathways are critical to HCC tumor cell growth
By using IDREAM algorithm, we found 8, 13, 5 TFs being vital for HCC cell growth respectively in Class1, Class2 and Class3 of TCGA-LIHC (after excluding TFs that also affect normal tissue). There are 3 TFs common in all three classes, as listed in Table 1. The knockout of ETV7 shows the greatest decrease in growth rate in all three classes. ETV7 is a transcription factor belonging to ETS family, which has been proved to be responsible for the development of different tissues as well as the progression of several cancers, such as HCC(24, 25). Due to its translocation function, the overexpression of ETV7 has been associated with tumorigenic transformation and anti-apoptosis by blocking Mys-induced apoptosis pathway(26–28). Furthermore, there are growing experimental evidences showing that ETV7 also plays a significant role in mTOR signaling pathway by assembling mTOR3 complex, which can stimulate cell proliferation and is not sensitive to rapamycin, a common anti-tumor agent, unlike mTOR1/2(29). So the depletion of ETV7 may cause the inactivation of mTOR3, thus leading to tumor cell death after treatment.
Table 1
Cell-growth ratio by influential TFs knockouts
Ratio after knockout of common TFs in all 3 classes of TCGA-LIHC |
TF | Class1 | Class2 | Class3 |
CTBP1 | 0.926 | 0.926 | 0.926 |
HTATIP2 | 0.926 | 0.926 | 0.926 |
ETV7 | 0.234 | 0.12 | 0.09 |
Ratio after knockout of specific TFs in lowest survival class |
TF | TCGA-LIHC | LIRI-JP |
NR1I3 | 0.978 | 0.978 |
HNF4A | 0.969 | 0.984 |
RORC | 0.935 | 0.888 |
F2 | 0.975 | 0.967 |
CREB3L3 | 0.856 | 0.876 |
In addition, CTBP1 is a well-known cancer hallmark taking responsibility in pro-tumorigenic process as well as affecting regulatory network(30). It can bind to C-terminus of the adenovirus protein E1A to promote cell-proliferation and invasion(31). Besides, the characteristic of cancer cells (high NADH level) makes it possible for CTBP1 to bind to NADP with a high affinity thus triggering a conformational change, leading to hyper-activity in both tumorigenesis and tumor progression.
To explore characteristics of TFs leading to low survival rate and poor prognosis, we selected TFs whose knockout only influence samples in Class2 but not in Class1 or Class3 of TCGA-LIHC dataset. There are 5 TFs specific for Class2 (threshold: ratio < 0.98), in addition, 4 of 5 these TFs are also in the result of the lowest survival subgroup (Class3) of LIRI-JP dataset (the ratio of HNF4A exceeded the threshold a little), as shown in Table 1.
The knockout of CREB3L3, which also appears in the decreases the growth rate of tumor cells by over 15%, but have no effect on normal tissues. CREB3L3 is reported to be activated in the lipid metabolism in liver-specific tissue in a mutual manner with PPARα(32). They both play important roles in the utilization of fatty acid for energy under fasted state, just like that in cell proliferation, so it is not surprising to see its absence will result in the decreased growth rate in tumor cells. The expression of CREB3L3 is link with anti-apoptosis, cell survival and HBV-associated HCC development by regulating hepatic genes in PI3K-Akt and AMPK signaling pathways. With the combination of consistent in-silico analyses and biological knowledge, we suggested CREB3L3 as a therapeutic target, especially for those HCC patients in advanced stage.
Metabolic genes in cholesterol biosynthesis are druggable targets in HCC treatment
We incorporated patient-specific models established by Uhlen et al. to do metabolic analyses, including identification of metabolic genes essential for tumor cell growth and annotation of the specific reactions altered during tumor development. All 315 metabolic models were built at the aim of representing tumor growth. On the basis of genetic human metabolic model HMR2 and RNA-Seq expression data from TCGA-LIHC, a task-driven model reconstruction algorithm called tINIT was employed to construct all models.
We conducted the single gene deletion simulation by a function provided in COBRA Toolbox (33). The total gene number of each model ranges from 1106 to 2169. We first differentiate the essential genes respectively in the three subtypes of TCGA-LIHC samples by the NMF stratification strategy mentioned above. And then collected genes that existed in at least half of the samples in each class. After wiping out those having no influence on tumor cell growth, there are 19, 20, and 18 genes left, respectively. The 18 genes found in Class3 (relatively high overall survival rate) are same with other two classes, and one other gene, ACADSB was shared by Class1 and Class2, while the other gene CMPK1 was identified in Class2 alone. We checkout the total 20 influential genes to see whether there is prior knowledge about their therapeutic potentiality in Drugbank, as shown in Table 2.
Table 2
Lethal metabolic genes as potential targets and corresponding drugs in DrugBank
Lethal Gene | Target Drug | Drug state | Brief Description of drug |
IDI1 | Dimethylallyl Diphosphate | Experimental | |
SQLE | Ellagic Acid | investigational | Antioxidant and anti-proliferative/anti-cancer effects |
FDFT1 | TAK-475 | Investigational | Target rate-limiting enzyme in the hepatic biosynthesis of cholesterol |
CRAT | Levocarnitine | Approved | Treatment of primary systemic carnitine deficiency |
EBP | Tamoxifen | Approved | Treatment of metastatic breast cancer |
ACADSB | Isoleucine Valproic Acid | Approved Approved | Anti-proliferative effects useful in cancer therapy |
SLC22A5 | Levocarnitine | Approved | Treatment of primary systemic carnitine deficiency, affect fatty-acid synthesis |
HMGCR | Lovastatin Cerivastatin Simvastatin Atorvastatin Rosuvastatin Meglutol | Approved Approved Approved Approved Approved Experimental | Lowering LDL cholesterol and triglycerides, hypercholesterolemia; Target rate-limiting enzyme in the hepatic biosynthesis of cholesterol |
CMPK1 | Gemcitabine Lamivudine Sofosbuvir | Approved Approved Approved | Various advanced cancers Treatment of HBV Treatment of HCV Reduce incidence of HCC |
MVK | Farnesyl thiopyrophosphate | Experimental | |
HSD17B7 | NADH | Approved | Treating Parkinson's disease, chronic fatigue syndrome, Alzheimer's disease and cardiovascular disease |
NSDHL | NADH | Approved | Treating Parkinson's disease, chronic fatigue syndrome, Alzheimer's disease and cardiovascular disease |
DHCR7 | NADH | Approved | Treating Parkinson's disease, chronic fatigue syndrome, Alzheimer's disease and cardiovascular disease |
ACACB | Soraphen A | Experimental | Anti HCV viral activity |
LSS | R048-8071 Lanosterol | Experimental Experimental | |
FDPS | | | |
PMVK | | | |
MVD | | | |
CYP51A1 | | | |
SC5D | | | |
According to Drugbank, 9 genes (CRAT, EBP, ACADSB, CMPK1, SLC22A5, HMGCR, HSD17B7, NSDHL, DHCR7) have already been targeted by approved drugs in the treatment of cancer or relative diseases, other 6 genes have corresponding drugs under experimental or investigational state. Both CMPK1 and ACADSB seem to be vital to tumor cell growth in HCC models with lower survival rate, and they actually have long been regarded as prognosis biomarkers associated with worse survival in multiple tumor(34–39). In fact, CMPK1 has also been targeted by three FDA approved cancer drugs in clinical usage, especially in the treatment of diseases induced by virus infection, such as HCC caused by HBV/HCV: Gemcitabine, Lamivudine and Sofosbuvir. Li et al.(2) recently reported that in Kaposi’s sarcoma, a common AIDS-related malignancy caused by infection of KSHV, cells can increase invasiveness and motility by over-expression of CMPK, meanwhile, this theory has been validated by the knockout experiments carried out in cell line.
Among 18 genes lethal in all three classes, there are 15 genes participating in cholesterol biosynthesis via desmosterol (DESMOL pathway), which is the dominant form of liver cholesterol biosynthesis. (40): Firstly, HMGCR, MVK, PMVK, MVD and IDI1 are genes involving in mevalonate pathway to convert acetyl-CoA into dimethylallyl pyrophosphate (DMAPP), and enzyme encoded by FDPS helps DMAPP synthesize farnesyl pyrophosphate (FAPP), then FDFT1 catalyzes the dimerization of two FAPP into squalene (SQNE). In the next step, SQLE and LSS play the important rate-limiting roles in cholesterol biosynthesis by catalyzing SQNE to lanosterol (LNSOL). After that, LNSOL goes through demethylation, oxidation, reduction catalyzed by CYP51A1, NSDHL, HSD17B7 to become zymosterol (ZYMOL), the precursor in DESMOL pathway, in which EBP, SC5D, DHCR7 are taking responsibilities by order to convert ZYMOL into DESMOL. Finally, DESMOL is reduced by DHCR24 to produce cholesterol. So the knockout of any gene will cause the break of cholesterol biosynthesis and lead to the depletion of cholesterol, which is disastrous for tumor cell growth.
There are only five of the predicted essential genes having not been recorded in Drugbank, but the high confidence of existing drug targets hitting suggested that those five metabolic genes are possibly be potential targets and worthy to be explored in depth for future researches.
Enhancement of glutathione and fatty-acid biosynthesis is important metabolic reprogramming associated with poor prognosis
It is widely accepted that tumor cells reprogram certain metabolic pathways consisting of numerous reactions to meet the demand of unlimited cell proliferation, aggressive invasiveness and anti-apoptosis. We investigated 1329 reactions existing in all 315 patient-specific models to see the flux patterns and enzymes differed between the poor-prognosis subgroup (Class2) and the other two classes. We conducted the flux balance analysis with cell growth as objective function to get flux distribution for each patient, and then selected candidate reactions having similar flux change in over half samples of each subgroup. There are four types of flux alteration patterns: 1) from negative flux value in Class1 and Class3 to positive flux in Class2; 2) from positive flux value in Class1 and Class3 to negative flux in Class2; 3) from non-zero flux in Class1 and Class3 to zero flux in Class 2; 4) from zero flux in Class1 and Class3 to non-zero flux in Class 2. The altered reactions, formula, enzymes and corresponding types of flux patterns are shown in Supplementary Table 4.
There are two reactions having flux change of type1 and type2, respectively. According to these four reactions, GSH (glutathione) production is suspected to increase in Class2 samples due to the enhancement of AKG biosynthesis and cysteine accumulation in cytosol. GSH is a key member of cell immune response system, lacking it can easily lead to cell death, several labs have confirmed its common occurrence in all cancers(41) and considered it as potential therapeutic target. Besides, the loss of SLC25A11, the enzyme catalyzing these reactions, has been validated to inhibit tumor cell growth in non-small cell lung cancer (42). Baulies et al. (43) suggested that the over-expression of SLC25A11 works as an adaptive mechanism of HCC to provide enough GSH for vast cell growth, meanwhile, SLC25A11 induced exportation of AKG to cytosol also activates mTOR pathway to promote cell growth and anabolism through EGLNs (egl-9 family hypoxia-inducible factors)(44).
There are eleven reactions carried no flux in Class 2 but having positive flux in Class 1/3 (type 3), four of them belongs to porphyrin metabolism. The enzyme UROD involved in this pathway is a recently identified potential anti-cancer target due to its ability to convert uroporphyrinogen to coproporphyrinogen(45). Another enzyme ALAD has been found over-expressed in breast cancer patients with a favorable clinical outcome for its up-regulation can suppress cell proliferation and invasion(46). In addition, a set of enzymes responsible for carnitine shuttle including SLC22A1, SLC25A20, SLC25A29 and CPT2 are also down-regulated in HCC tumor cells, they mainly play rate-limiting roles in controlling fatty acid oxidation(47). Their low-expression are proved to be significantly associated with worse patient survival(48) and differentiation state by impairing NO production and mTOR signaling pathway mediated by arginine, in some situation, even leading to severe autophagy(49, 50).
There are three reactions having non-zero flux in Class 2 but zero flux in Class 1/3 (type 4), involving fatty acid activation responsible for providing adequate ATP and CoA for tumor cell growth; glycine, serine, threonine metabolism helping reduce ROS pressure through SGOC metabolic network during tumor metastasis(51), and arginine/proline metabolism, which can regulate response to nutrient and oxygen deprivation in oncogenesis thus avoiding cell apoptosis(52). Furthermore, the exploration of enzymes revealed that ACADSB (which is also indicated by previous analyses), ACSL3 and ACSL4 regulate distinct proteins like p-AKT, LSD1 and β-catenin to stimulate tumor cell proliferation(53).
To conclude, the altered reactions specific to Class2 samples generally aim at promoting tumor cell growth and decreasing their sensitivity towards normal apoptosis signals. Several key enzymes have already been regarded as biomarkers in cancers, with deeper exploration, it is convincing to figure out their detailed mechanism and find more available therapeutic targets. Meanwhile, the metabolic reprogramming accounting for poor prognosis also supports that our stratification of the HCC patients are meaningful.
miRNAs regulating the influential genes for HCC cell proliferation
To disclose a more comprehensive interplay between regulation and metabolism of HCC, we further retrieved microRNAs regulating the influential genes highlighted in our previous analyses: including 3 common TFs showing influence in all three Classes; 5 overlapped TFs specifically affecting the lowest survival subgroup of TCGA-LIHC(Class2) and LIRI-JP(Class3); and 20 metabolic genes revealed by single gene deletion result. From MIRNET, we found six highly-connected microRNAs, three of which have been validated to be directly linked with HCC, including miR-124-3p, miR-1-3p, and miR-24-3p (Supplementary Table 5).
MiR-124-3p and miR-1-3p have been reported to be down-regulated in HCC patients compared to normal(54, 55). While miR-24-3p has been considered to be involved in HCC diagnosis panel for its abnormal over-expression.
The specific mechanism of how the loss-function or gain-function of these miRNAs contribute to tumorigenesis remains unclear, however there are some experiment-based theories available. Especially, miRNA-124-3p, which also correlates with a wide range of other cancers including breast cancer, lung cancer, colorectal cancer, etc. appears to be the key microRNA during oncogenesis(56–58). Zheng et al.(59) believes that miR-124-3p participates in reducing tumor cell motility and invasion by controlling epithelial–mesenchymal cell transition as well as cytoskeletal events through cpG-island methylation(60). As for miR-1-3p, Zhang et al.(61) suggests that its over-expression can inhibit cell proliferation and induce apoptosis by targeting PI3K-Akt and mTOR pathways through ETV7. While the down-expression of mir-24-3p can assist this process by deactivating Fas-receptor in NOTCH pathway and inhibiting HNF4A to drive a feedback loop to lead to cancer-related inflammatory reaction (62). Plus, studies by Wang et al.(63) and Chen et al.(64) indicates the regulatory impact of miR-24-3p not only on altering cell-cycle by inducing p53 mutation but also on avoiding cell death by targeting Fas-receptor in NOTCH pathway(65).
In addition, miR-26b-5p, which has the maximal connection (9) in regulating 28 highlighted genes, has been experimentally validated to under-express in HCC patients with worse prognosis. Because it can suppress tumor invasion as well as induce apoptosis by targeting SMAD1(66), which is consistent with our conclusion about SMAD gene. Nevertheless, three genes obtained by our integrated regulatory-metabolic analysis in previous section, CMPK1, ACADSB, RORC are directly regulated by miR-26b-5p, and the fact that they are all specific genes for Class2 (the worst overall survival rate) proved our conclusion is credible. By combining the in-silico results in our work together with prior knowledge obtained from literatures, Fig. 5a shows the core network comprising of microRNAs, TFs and genes involved in HCC, which can provide basis for further studies in HCC development.
To take 28 candidate genes and 6 top-connected miRNAs into whole consideration, we conducted IPA (Ingenuity Pathway Analysis) to explore the biological connection among them. As shown in Fig. 5b, EGFR is inferred and densely linked with our core gene set. EGFR is one of the most crucial genes responsible for cancer cell growth, its over-expression can lead to unlimited cell proliferation, just like that in tumor cells, so it has long been focused and treated as therapeutic targets in cancer therapy. Multiple FDA-approved drugs such as Gefitinib, Lapatinib have been proved to be effective in EGFR-related non-small-cell lung cancer and several other cancers. So the IPA result indicated that our core-gene set concluded from integrated regulation-metabolism analyses are convincible and worthy further researches.