Functional drug–target–disease network analysis of gene–phenotype connectivity for curcumin in cancer

The anti-tumor properties of curcumin have been elucidated in many cancer types. However, a systematic functional and biological analysis of its target proteins has yet to be fully documented. The aim of this study was to explore the underlying mechanisms of curcumin and broaden the perspective of targeted therapies. Direct protein targets (DPTs) of curcumin were searched in the DrugBank database. Using the STRING database, the interactions between curcumin, its DPTs, and indirect protein targets (IPTs) were documented. The protein–protein interaction (PPI) network of curcumin-mediated proteins was visualized using Cytoscape. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis was performed for all curcumin-mediated proteins. Furthermore, the cancer targets were searched in the Comparative Toxicogenomics Database (CTD). The overlapping targets were studied using Kaplan–Meier analysis to evaluate cancer survival. Further genomic analysis of overlapping genes was conducted using the cBioPortal database. Lastly, CCK-8, qPCR, and Western blot (WB) analyses were used to validate the predicted results in hepatocellular carcinoma (HCC) cells.


Background
Curcumin, also known as diferuloylmethane, is the active ingredient of the dietary spice occurring in the rhizomes of Curcuma longa, a plant belonging to the ginger family (Fig. 1A). Extensive studies investigating curcumin over the past few decades revealed the health benefits of curcumin, including anticancer, anti-inflammatory, antioxidant, and hypoglycemic effects [1]. In recent years, curcumin has been increasingly recognized for its anti-tumor and cancer chemopreventive properties, especially in gastrointestinal tumors [2]. For example, curcumin has shown chemopreventive effects in animal models of colon cancer [3], stomach cancer [4], and hepatocellular carcinoma (HCC) [5]. Several phase I and II clinical trials with curcumin have been conducted for the treatment of different types of cancer [6,7].
However, the precise molecular mechanisms underlying the anti-tumor and cancer chemopreventive activities of curcumin have yet to be elucidated.
Recent studies using data mining techniques for the analysis of specific bioinformatics domains have made great progress [8]. Accordingly, a strategy to systematically explore the rich data publicly available and the underlying connectivity between gene and phenotype mediated by curcumin should be helpful.
With advances in genomics, data network analysis has been used to effectively analyze candidate genes linked to experimentally verifiable pathways via mining of web-accessible, open portal databases [9]. In addition, an in-depth analysis of the databases may uncover hidden, previously unknown relationships between drugs, targets, and cancers. These results can be used to generate authentic and rational leads for further investigations. Recently, the combination of related databases including DrugBank, STRING, cBio Cancer Genomics Portal (cBioPortal), and Comparative Toxicogenomics Database (CTD) has been utilized for drug-target-cancer network analysis [10,11]. In this study, we employed DrugBank to broadly analyze curcumin and drug-target data to obtain related direct protein targets (DPTs). The interactions between DPTs and indirect protein targets (IPTs) were predicted with the STRING database. The resulting DPTs and IPTs were further analyzed for functionality via Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis with STRING. The pivotal genes were obtained from the intersection of the CTD and STRING results, and their genomic alterations were investigated using the cBioPortal database. Our findings provide information for a better understanding of the anti-tumor mechanisms of curcumin and identify new targets in the treatment of HCC by curcumin.

Search for DPTs of curcumin
The DrugBank database (https://www.drugbank.ca) is a richly annotated resource that combines detailed drug data with comprehensive drug target and drug action information [12]. In this study, the latest release of the DrugBank database (version 5.1.4) [13] was employed to search for the interaction between curcumin and its DPTs and generate a curcumin-target network. The word "Curcumin" was searched as a keyword under the drug classification entry.

PPI network generation
The STRING database (http://www.string-db.org/) provides a critical assessment and integration of protein-protein interactions, including direct (physical) as well as indirect (functional) associations [14].
Using the search function of the latest STRING database (version 11.0) [15], data underlying the interaction between DPTs and IPTs were first generated for curcumin by setting a minimum required interaction score of 0.5 and a maximum number of interactors of 50. The data were integrated into a curcumin-mediated network and visualized using Cytoscape (version 3.7.1) [16], which is an open-source software platform for visualization of complex networks and their integration with any attribute [17].
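As a rough illustration of the query settings described above, the following Python sketch builds a request URL for the public STRING REST API. This is an assumption-laden sketch, not the workflow reported here, which used the STRING web search function. The endpoint name (`interaction_partners`) and parameters (`identifiers`, `species`, `required_score`, `limit`, `caller_identity`) follow the current STRING API as we understand it, the interaction score of 0.5 is written as 500 on the API's 0-1000 scale, and the caller name is hypothetical. In the API, `limit` applies per query protein, so results need not match the web-interface network or version 11.0.

```python
from urllib.parse import urlencode

# The five DPTs reported for curcumin in this study.
DPTS = ["PPARG", "VDR", "ABCC5", "CBR1", "GSTP1"]


def build_string_query(genes, min_score=0.5, max_partners=50):
    """Return a STRING REST URL requesting interaction partners of `genes`."""
    params = {
        # Assumption: multiple identifiers are separated by carriage returns.
        "identifiers": "\r".join(genes),
        "species": 9606,  # NCBI taxonomy ID for Homo sapiens
        "required_score": int(round(min_score * 1000)),  # 0.5 -> 500
        "limit": max_partners,  # maximum partners returned per query protein
        "caller_identity": "curcumin_fan_sketch",  # hypothetical client name
    }
    return "https://string-db.org/api/tsv/interaction_partners?" + urlencode(params)


url = build_string_query(DPTS)
print(url)
```

The function only constructs the URL; fetching and parsing the tab-separated response is left out so that the sketch runs without network access.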

KEGG pathway enrichment analysis and overlapping HCC genes
Biochemical pathways linked to curcumin-DPT/IPT interactions were explored with the KEGG pathway enrichment analysis tool in the STRING database. The top 15 pathways with a false discovery rate (FDR) of less than 1.00E-10 were selected. To identify disease targets for HCC, we searched for "Hepatocellular carcinoma" in CTD (http://www.ctdbase.org/). CTD is a robust, publicly available database that aims to advance the understanding of environmental exposures and their impact on human health. It provides manually curated information about chemical-gene/protein interactions, and chemical-disease and gene-disease relationships [18]. Genes shared by the cancer-related pathways, the HCC pathway, and the CTD targets for HCC were selected for further analysis.
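The pathway selection rule can be sketched in a few lines of Python. This is a minimal sketch that assumes pathways are ranked by ascending FDR, which the text does not state explicitly. The function name and the example rows are hypothetical and are not values from this study.

```python
def top_pathways(rows, fdr_cutoff=1.0e-10, top_n=15):
    """Keep pathways with FDR below the cutoff, ranked by ascending FDR."""
    kept = [row for row in rows if row["fdr"] < fdr_cutoff]
    kept.sort(key=lambda row: row["fdr"])
    return kept[:top_n]


# Hypothetical rows for illustration only; not results from the study.
example_rows = [
    {"pathway": "pathway_A", "fdr": 3.2e-40},
    {"pathway": "pathway_B", "fdr": 5.0e-08},  # fails the 1.00E-10 cutoff
    {"pathway": "pathway_C", "fdr": 1.1e-12},
]

selected = [row["pathway"] for row in top_pathways(example_rows)]
print(selected)
```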

Exploring cancer genomics data linked to curcumin
The cBioPortal (https://www.cbioportal.org/) for Cancer Genomics provides a web resource for exploration, visualization, and analysis of multidimensional cancer genomics data [19,20]. In this study, the genes screened in the foregoing investigation were assessed in all HCC studies available in cBioPortal. Using the portal search function, curcumin-related genes in HCC were evaluated for genomic alterations, network analysis was performed, and mutually exclusive or concurrent relationships between gene pairs of the same gene set were identified. The study with the largest sample size was then chosen to analyze the screened proteins in cBioPortal and further interpret the results.

Analysis of curcumin-associated genomics datasets and HCC survival
The Kaplan-Meier method (http://kmplot.com/) [21] was used to evaluate the overall survival (OS) in 364 HCC samples with altered genes. The OS was defined as the time to death. A two-tailed P value of less than 0.05 was considered statistically significant.
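For readers unfamiliar with the estimator behind such OS curves, the following Python sketch implements the Kaplan-Meier product-limit estimate. It illustrates the general method only and is not the code used by the web tool cited above. The function name and the toy cohort are hypothetical, and no significance test is included.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  -- follow-up time for each patient
    events -- 1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        at_risk -= removed
    return curve


# Hypothetical toy cohort (months, event flag); not data from the study.
curve = kaplan_meier([5, 8, 8, 12, 20], [1, 1, 0, 1, 0])
for t, s in curve:
    print(t, round(s, 3))
```

Comparing two such curves (for example, high versus low expression of a gene) would additionally require a log-rank test, which is omitted here.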

Cell culture
HepG2 and Hep3B cells were purchased from Cell Cook. They were authenticated using short tandem repeat matching analysis and cultured in DMEM supplemented with 10% v/v FBS and 1% v/v penicillin/streptomycin in a CO2 incubator at 37 °C and 95% relative humidity.

Cell viability test
The CCK-8 assay was used to assess cell viability. A 100 µL cell suspension (about 5000-10,000 cells) was seeded into each well of a 96-well culture plate, which was then placed in the incubator overnight for pre-culture (37 °C, 5% CO2). According to the drug concentration, the cells were divided into five groups. After pre-culture, 10 µL of drug solution was added to every well (the drug concentrations were 0, 5, 10, 20, and 30 µM) and incubated at 37 °C. After that, 10 µL of CCK-8 solution was added to each well at 24, 48, and 72 h and incubated at 37 °C for 4 h. We measured the absorbance of each well at 450 nm and calculated the cell viability as follows: Cell viability (%) = (OD of experimental group - OD of blank group)/(OD of control group - OD of blank group) × 100%.
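The viability formula above can be written as a small Python helper. This is a minimal sketch: the function name and the absorbance readings are hypothetical and are not measurements from this study.

```python
def cell_viability_percent(od_experimental, od_control, od_blank):
    """Cell viability (%) from blank-corrected absorbance at 450 nm."""
    denominator = od_control - od_blank
    if denominator == 0:
        raise ValueError("control and blank OD are identical; viability is undefined")
    return (od_experimental - od_blank) / denominator * 100.0


# Hypothetical OD450 readings for one treated well, the untreated control, and the blank.
viability = cell_viability_percent(od_experimental=0.65, od_control=1.05, od_blank=0.25)
print(round(viability, 1))
```

In practice the OD values would be means over replicate wells for each concentration and time point.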

Extraction of total RNA and reverse transcription-quantitative PCR (RT-qPCR)
After being treated and cultured, 5-10 × 10^6 cells per group were collected, and total RNA was isolated using TRIzol® reagent according to the manufacturer's protocol. A total of 1 µg of extracted RNA was used to synthesize cDNA in a reaction using oligo(dT) primers and M-MLV reverse transcriptase according to the manufacturer's instructions. PCR amplification was performed with a GoTaq® qPCR Master mix on an RT-qPCR instrument. The thermocycling conditions were as follows: initial denaturation at 95 °C for 120 s; 40 cycles of 95 °C for 15 s, and a final extension at 60 °C for 30 s. Relative mRNA expression was calculated with the 2^-ΔΔCq method, using GAPDH as the internal control.
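The 2^-ΔΔCq calculation can be sketched as follows. This is a minimal Python example under the usual assumption of roughly 100% amplification efficiency for both the target and the reference gene. The function name and the Cq values are hypothetical and are not measurements from this study.

```python
def relative_expression(cq_target_treated, cq_ref_treated,
                        cq_target_control, cq_ref_control):
    """Fold change of a target gene by the 2^-ddCq method.

    The reference gene (GAPDH in this study) normalizes each sample.
    """
    delta_treated = cq_target_treated - cq_ref_treated   # dCq, treated sample
    delta_control = cq_target_control - cq_ref_control   # dCq, control sample
    delta_delta = delta_treated - delta_control          # ddCq
    return 2.0 ** -delta_delta


# Hypothetical Cq values: the target crosses threshold one cycle later after treatment.
fold = relative_expression(26.0, 18.0, 25.0, 18.0)
print(fold)
```

A fold change below 1 indicates lower expression in the treated sample relative to the control, and a value above 1 indicates higher expression.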

Western blot analysis
After being treated and cultured, the cells were harvested, and protein extracts were prepared with a modified RIPA buffer (Beyotime Institute of Biotechnology) with 0.5% SDS in the presence of a proteinase inhibitor cocktail (Beyotime Institute of Biotechnology). A bicinchoninic acid protein concentration kit was used to determine protein concentration. A total of 20 µg protein/lane was separated using 6% or 12% SDS-PAGE. Proteins were then transferred to a PVDF membrane. The membranes were blocked with 5% BSA in TBS (containing 0.05% Tween-20) for 2 h at room temperature and incubated overnight at 4 °C with the primary antibodies. The membranes were washed three times with PBS containing 0.1% Tween-20 for 5 min each and then incubated with horseradish peroxidase-linked immunoglobulin G secondary antibody for 2 h at room temperature. The membranes were developed using an enhanced chemiluminescence system.

Characterization of curcumin DPTs
We queried DrugBank using curcumin as the keyword to identify the bioactivities and to determine the DPTs of curcumin and retrieve relevant information. The result showed an accession number of DB11672 and classified curcumin as a highly pleiotropic molecule with anti-tumor, antibacterial, anti-inflammatory, hypoglycemic, antioxidant, wound-healing, and antimicrobial activities. Clinical data showed that curcumin is undergoing clinical trials for colon cancer, prostate cancer, breast cancer, and lung cancer. Subsequent screening demonstrated five DPTs in human beings. Table 1 summarizes the five DPTs of curcumin: PPARG, VDR, ABCC5, CBR1, and GSTP1. In addition, the interactions between the five DPTs were analyzed by STRING and illustrated in Fig. 1B.

Characterization of curcumin IPTs and visualization of PPI network construction
Expanding our search using the STRING database, we detected a total of 204 target proteins of curcumin, including 199 IPTs related to the five DPTs (Table S1). The dataset obtained was integrated to construct a biological network using Cytoscape. As shown in Fig. 2A, 1962 PPI pairs were found in the network of curcumin-mediated proteins (Table S2). In this network, nodes represent proteins, and edges denote protein-protein associations. In addition, the node degree, which indicates the centrality of proteins, was calculated using CentiScaPe 2.2 (Table S3). As a result, 14 proteins including four DPTs (GSTP1, VDR, CBR1, and PPARG) with a degree value ≥ 50 are shown in Table 2 and displayed in Fig. 2A with larger node sizes. To identify the functional features of curcumin-mediated targets, the KEGG pathway enrichment analysis was performed using STRING. Target genes were found in 121 molecular pathways in KEGG enrichment (Table S4). As shown in Table 3 and Fig. 2B, the top 15 KEGG pathways connected to curcumin-mediated proteins included metabolism of xenobiotics by cytochrome P450 (52 genes), chemical carcinogenesis (50 genes), drug metabolism-cytochrome P450 (44 genes), retinol metabolism (32 genes), steroid hormone biosynthesis (30 genes), drug metabolism-other enzymes (31 genes), pentose and glucuronate interconversions (22 genes), ascorbate and aldarate metabolism (20 genes), metabolic pathways (65 genes), glutathione metabolism (20 genes), porphyrin and chlorophyll metabolism (19 genes), arachidonic acid metabolism (18 genes), pathways in cancer (36 genes), HCC (22 genes), and fluid shear stress and atherosclerosis (19 genes). These KEGG enrichment pathways showed functional features of curcumin gene sets and indicated that curcumin-mediated proteins were mainly associated with basal metabolism and cancer-related pathways.
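Node degree, the centrality measure used above, is simply the number of distinct interaction partners of each protein. The following Python sketch illustrates the idea on a toy edge list. It is not CentiScaPe, the edges are hypothetical rather than rows of Table S2, and the threshold of 2 stands in for the degree ≥ 50 cutoff applied to the full network.

```python
from collections import defaultdict


def node_degrees(edges):
    """Count distinct neighbours per node in an undirected PPI edge list."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {node: len(partners) for node, partners in neighbours.items()}


def hub_nodes(degrees, min_degree):
    """Return nodes whose degree is at least `min_degree`, sorted by name."""
    return sorted(node for node, degree in degrees.items() if degree >= min_degree)


# Hypothetical toy edges; the duplicate GSTM1-GSTP1 pair is counted once.
toy_edges = [
    ("GSTP1", "GSTM1"),
    ("GSTP1", "CBR1"),
    ("GSTM1", "GSTP1"),
    ("PPARG", "VDR"),
]
degrees = node_degrees(toy_edges)
hubs = hub_nodes(degrees, min_degree=2)
print(degrees, hubs)
```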
Based on the anti-tumor properties of curcumin reported in diverse malignant tumors, two pathways were selected: pathways in cancer (36 genes) and the HCC pathway (22 genes). This result suggested that HCC might be used as a phenotype connected to curcumin-mediated proteins. In addition, to further identify disease targets for HCC, the keyword "Carcinoma, Hepatocellular" was searched in CTD, and a total of 505 genes with either a curated association or an inferred association via a curated chemical interaction with the disease were identified (Table S5). These genes were marked with "T," which indicates a gene that is or may be a therapeutic disease target, or "M," which denotes a gene that may be a disease biomarker or involved in the etiology of a disease. Five overlapping genes (TP53, RB1, TGFB1, GSTP1, and GSTM1) resulting from the intersections between pathways in cancer (36 genes), the HCC pathway (22 genes), and targets for HCC by CTD (505 genes) were visualized using Venn diagrams and STRING (Fig. 3A and B).

[Table 3 fragment, partial gene list: ADH1A, ADH1B, ADH1C, ADH5, AKR1C1, AKR7A2, ALDH3A1, CBR1, CBR3, CYP1A1, CYP1A2, CYP1B1, CYP2A13, CYP2A6, CYP2B6, CYP2C9, CYP2D6, CYP2E1, CYP2F1, CYP3A4, CYP3A5]
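The three-way overlap between the two KEGG pathways and the CTD gene list amounts to a set intersection, which can be sketched in Python as follows. The sets below are deliberately truncated and hypothetical: they contain the five overlapping genes named in the text plus placeholder entries, not the actual 36, 22, and 505 genes.

```python
def overlapping_genes(*gene_sets):
    """Return the sorted genes present in every input collection."""
    sets = [set(genes) for genes in gene_sets]
    if not sets:
        return []
    return sorted(set.intersection(*sets))


# Truncated, hypothetical gene sets for illustration only.
shared = {"TP53", "RB1", "TGFB1", "GSTP1", "GSTM1"}
pathways_in_cancer = shared | {"PLACEHOLDER_A"}
hcc_pathway = shared | {"PLACEHOLDER_B"}
ctd_hcc_targets = shared | {"PLACEHOLDER_A", "PLACEHOLDER_C"}

overlap = overlapping_genes(pathways_in_cancer, hcc_pathway, ctd_hcc_targets)
print(overlap)
```

PLACEHOLDER_A appears in two of the three sets and is therefore excluded, which mirrors the requirement that a gene be present in all three sources.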

Genetic alterations connected with curcumin-associated genes in HCC
To further validate the link between curcumin-associated genes and HCC, cBioPortal was used to explore the five genes (TP53, RB1, TGFB1, GSTP1, and GSTM1) associated with curcumin in HCC. Seven HCC studies were included in cBioPortal [22-27], and the five selected overlapping genes were queried. The results showed that 363 (33%) of the 1135 samples in the 7 studies had alterations in one or more of these genes. Alteration frequencies ranged from 0.2% to 29% for the gene sets submitted for analysis (Fig. 4).
Among the five genes (TP53, RB1, TGFB1, GSTP1, and GSTM1), five gene pairs showed mutually exclusive alterations, while another five gene pairs showed concurrent alterations (with the TP53-RB1 pair being statistically significant, Table 4). Furthermore, the Liver HCC (TCGA, Provisional) study, which had the largest sample size, was selected individually to analyze the screened proteins in cBioPortal. The results showed that 156 (35%) patients/samples carried an alteration in at least one of the five genes queried using OncoPrint; the frequency of alteration in each gene is shown in

Curcumin-associated genes and survival in HCC
The Kaplan-Meier analysis was used to perform survival analysis of patients with HCC based on the five selected genes (TP53, RB1, TGFB1, GSTP1, and GSTM1). A total of 364 patients were involved in this analysis for OS. As shown in Fig. 6A−E, high mRNA levels of TP53, RB1, and GSTM1 indicated increased OS in HCC, whereas elevated mRNA levels of TGFB1 were correlated with poor prognosis in the same group of patients.

Curcumin-induced decrease in the activity of HepG2 and Hep3B cells
The test results showed that cell viability appeared to depend on the concentration of curcumin. For Hep3B cells, curcumin at 10, 20, and 30 µM significantly decreased cell viability after 24, 48, and 72 h (Fig. 7A). For HepG2 cells, curcumin at 20 and 30 µM significantly decreased cell viability after 48 and 72 h (Fig. 7B). According to these results, we chose curcumin at a concentration of 20 µM and 48 h as the duration of treatment for subsequent testing.

Curcumin regulation of TGFB1 and GSTP1 expression in Hep3B cells
The expression of TP53, RB1, TGFB1, GSTP1, and GSTM1 was assessed using qPCR in Hep3B cells treated with 20 µM curcumin for 48 h. The results showed that 20 µM curcumin could regulate the expression of TGFB1 and GSTP1 in Hep3B cells, but it had no significant effect on the expression of the other genes (Fig. 7C). After treatment of Hep3B cells with 0 and 20 µM curcumin for 48 h, the levels of TGFB1 protein were detected via WB. The results showed that 20 µM curcumin could moderately inhibit the expression of TGFB1 in Hep3B cells (Fig. 7D, 7E).

Discussion
Over the past several years, numerous studies have evaluated the effects of curcumin and its analogs in diverse cancers in vitro and in vivo. According to these studies, the potential use of curcumin as a chemopreventive and therapeutic agent in cancers depends on its potent antioxidant and anti-inflammatory activities as well as its ability to modulate various molecular signaling mechanisms [1,2,6,7]. Nevertheless, the mechanism underlying the wide range of anti-cancer effects of curcumin remains incompletely understood. Current knowledge about curcumin function and mechanisms is based on conventional experiments, which have yet to be fully integrated and understood. Therefore, new analytical methods are needed to correlate curcumin with its target proteins and the observed biological effects. Functional/activity network (FAN) analysis [9] is a new analytical method that elucidates the molecular mechanisms of a drug and its association with clinical outcomes in cancer by using a set of web-based tools, such as DrugBank, STRING, cBioPortal, CTD, or Cytoscape. Using this method, we conducted the functional drug-target-cancer network analysis of gene-phenotype connectivity associated with curcumin. Our study bridged curcumin with its primary or secondary targets and illustrated the underlying mechanisms of curcumin and its clinical outcomes in HCC.
Primary liver cancer is the sixth most common cancer and the second largest cause of cancer mortality in the world [28]. HCC constitutes approximately 80% of all primary liver cancer [29]. The effects of curcumin and its analogs have been a subject of investigation over the past decade in preclinical models of HCC [30], but research in this field is far from complete. A search of PubMed using "curcumin" as the keyword returned more than 13,000 publications, whereas a search using "curcumin and hepatocellular carcinoma" has retrieved only 192 publications to date. By using FAN analysis, more direct mechanisms may be mined with reasonable experimental feasibility to validate hypotheses explaining the effects of curcumin in HCC.
In this study, we demonstrated the feasibility of FAN analysis underlying the connectivity between curcumin and HCC. As a result, we identified a network including five genes (TP53, RB1, TGFB1, GSTP1, and GSTM1) as targets of curcumin in HCC via a functional drug-target-cancer network analysis.
Among the five genes, TP53 and RB1 are tumor suppressor genes, which is consistent with the correlation between their high transcriptional expression and better OS in HCC in this study. As in other cancers, TP53 and RB1 were among the most commonly inactivated or mutated genes in HCC (Fig. 5).
Moreover, mutual exclusivity analysis revealed a tendency toward concurrent alterations of TP53 and RB1 in our analysis. Previous studies indicated that curcumin may inhibit cancer growth and induce apoptosis in colon cancer cells [31], improve the in vitro efficacy of carfilzomib in human multiple myeloma cells [32], promote apoptosis in non-small cell lung cancer [33], and inhibit cell growth in nasopharyngeal carcinoma via the TP53 signaling pathway [34]. Debata et al. reported that the sunitinib-curcumin combination was effective in restoring the tumor suppressor activity of the RB gene in renal cancer cells [35]. Furthermore, Su et al. reported that curcumin significantly enhanced p53 or markedly inhibited the RB pathway by suppressing RB phosphorylation in the signaling pathways of glioblastoma [36]. The foregoing studies demonstrated the anti-tumor activity of curcumin and prompted further investigation of curcumin in HCC.
In the case of HCC, most of the genetic alterations in TGFB1, GSTP1, and GSTM1 were amplifications (Fig. 5), which may cause increased expression. However, the prognostic significance varied in this study.
The amplification of TGFB1 indicated a worse OS in HCC, while amplifications of GSTM1 and GSTP1 (not statistically significant) showed better prognosis. TGFB1 is a pleiotropic gene with a dual role in hepatocarcinogenesis: apoptosis induction in early phases, but promotion of tumorigenesis in cells with mechanisms to overcome the suppressor effects [37]. Recent studies found that the expression of TGFB1 was downregulated in breast cancer cells treated with curcumin [38]. Glutathione S-transferases (GSTs) are a family of phase II detoxification enzymes that catalyze the conjugation of a wide variety of endogenous and exogenous toxins. A previous study showed that inter-individual GST variation plays a central role in reducing cell exposure to carcinogens [39]. For example, reduced GSTP1 expression may contribute to oxidative stress in HCC [40]. Similar results were observed in breast cancer cells, in which curcumin activated GSTP1 expression via the antioxidant response element [41]. However, studies linking curcumin and GST genes in HCC are rare, underscoring the need for further investigation in this field.
These results were partially confirmed by in vitro experiments. HepG2 and Hep3B cells were used as experimental models because they grow quickly and are easy to passage. Our study confirmed the inhibitory effect of curcumin on liver cancer cells, and this inhibition increased with time and concentration. In the Liver HCC (TCGA, Provisional) study, three genes (TGFB1, GSTP1, and GSTM1) showed amplification. For this reason, changes in these amplified genes were more easily detected using PCR. This was confirmed in our study when we found that two genes that were prone to amplification were suppressed in HCC cells treated with curcumin. The other genes, however, did not show changes in expression because their main alterations were not amplifications but truncating mutations and deep deletions. In terms of the relationship between curcumin and protein expression, only one protein change, in TGFB1, was detected by WB. This may be because of the complexity of posttranscriptional regulation of mRNA, which suggests that changes in mRNA levels may not be consistent with protein expression. The specific molecular mechanism by which curcumin regulates hepatocellular carcinoma cells remains to be established.

Conclusions
In summary, by using the DrugBank, STRING, CTD, and cBioPortal databases, we identified connectivity between curcumin and HCC. Curcumin has the potential to become an alternative chemotherapy or chemoprevention for HCC. The drug-target-cancer network analysis utilized in this study facilitated the testing and validation of reasonable hypotheses explaining curcumin-induced gene alterations in cancers by applying the available biological information in studies from bedside to bench. As advances in curcumin research using traditional experimental approaches continue, additional drug targets will undoubtedly be identified, leading to an improved understanding of curcumin-related genetic networks, signaling pathways, and cancer types.

Availability of data and materials
All of the data generated or analyzed during this study are included in this published article and its supplementary information files.

Figure 4
Overview of genetic alterations related to curcumin-associated genes in genomics data sets available in 7 different HCC studies in cBioPortal databases.

Figure 5
A visual heatmap of mRNA-level alterations based on 5 genes (TP53, RB1, TGFB1, GSTP1, and GSTM1) across an HCC study (data taken from the Liver HCC (TCGA, Provisional) study) in cBioPortal databases. Each row represents a gene, and each column represents a tumor sample.

Figure 7
(E) Quantitative analysis of TGFB1 protein expression.
