Data collection, processing and differential expression analysis
Gene microarray expression data(GSE76427), and associated clinical data, were obtained from the National Center of Biotechnology Information(NCBI) GEO database(http://www.ncbi.nlm.nih.gov/geo). The GSE76427 dataset contains complete expressional data for 115 HCC samples and 52 non-tumor samples. The data was downloaded and analyzed using R software（v. 3.6.1）（www.r-project.org/）. Differential expression analysis was performed using the Limma package in R (v.3.30.13), with the DEG threshold set as P<0.05 and (|LogFC|)>0.585.
Weighted gene co-expression network construction
DEGs were utilized to construct the co-expression networks using the WGCNA package in R. Pairwise correlation analyses were performed and an adjacency matrix was calculated using an appropriates soft threshold to generate a scale-free network using the integrated function (pick Soft Threshold) in the WGCNA package. For the clustering analysis, aminimum module size=30(genegroup) was utilized for the gene dendrogramand, and a height cut off=0.25 was used for the module dendrogram.
Identification of clinically relevant modules and candidate hub genes
To identify clinically relevant modules, correlations between the clinical data, including age, sex, tumor-node-metastasis(TNM)staging classification, overall survival (OS) status, and OS time, and then the identified modules was examined.Two methods were used to test the correlation between each module and the development of HCC. Modules with a higher gene significance (GS) or higher correlation of module eigengene(ME) and GS were considered to be more related to the development of the disease and were selected for further analysis. Next, Hub genes were identified based on module connectivity (gene Module Membership>0.8) and clinical trait relationship(gene Trait Significance>0.2).
Correlation analysis between hub genes and survival and tumor TNM stage
The optimal cut-off value was determined by the surv_cutpoint function in survminer (R package). Based on the cutoff value, patients were divided into high expression or low expression groups and a Kaplan-Meier survival curve was plotted. To determine possible correlations between hub genes and TNM stage, a linear regression analysis was performed.
Functional enrichment analysis
To further characterize the underlying mechanisms of hub gene, the co-expression genes of hubgene were screened and Gene Ontology(GO) annotation and Kyoto Encyclopedia of Genes and Genomes(KEGG) pathway analysis were performed using Database for Annotation, Visualization and Integrated Discovery(DAVID; http://david.abcc.ncifcrf.gov/).
TCGA validation dataset
To further validate the findings based on the GEO dataset, a HCC dataset was downloaded from the TCGA database, with the associated clinical data obtained from UCSC XENA(https://xena.ucsc.edu/). A total of 343 samples were included in the analysis, with some samples removed due to incomplete survival and/or TNM stage information. The 343 samples were then divided into high expression and low expression groups based on the median expression of the hub gene. Then, a Kaplan-Meier survival curve and a regression analysis were performed to determine correlations between gene expression and survival or TNM stage.
Validation of the hub gene via quantitative real-time PCR (qPCR)
A total of 24 HCC patients were recruited to obtaining tumor and adjacent non-tumor tissues. The patients provided informed written consent and this study was approved by the Ethics Committee of the Second Clinical Medical College at Jilin University, Changchun, Jilin, China. Total RNA was extracted from the matched samples using TRIzol reagent (Sangon Biotech),The obtained total RNA(1 µg)was reverse-transcribed to cDNA using a 1st strand cDNA Synthesis Kit (Sangon,China) with the following primers synthesized by Sangon Biotech:PCK2(forward:5’-GCCATCATGCCGTAGCATC-3’,reverse:5’-AGCCTCAGTTCCATCACAGAT-3’)and GAPDH(forward:5’-GACAGTCAGCCGCATCTTCT-3’,reverse:5’-ACCAAATCCGTTGACTCCGA-3’). Next, qPCR was performed using2× SG Fast qPCR Master Mix (Sangon,China) according to the manufacturer’s protocols. Briefly, 5ng of cDNA in a final volume of 20µl was amplified under the following cycles: an initial 3 min at 95°C, followed by 40 cycles of 95°C for 3 s and 60°C for 30 s in a LightCycler480 II Fast Real-Time PCR System (Roche, Indianapolis, IN, USA). Relative gene expression was calculated using the comparative Ct method (ΔΔCt), with PCK2 expression normalized to GAPDH expression levels. Gene expression is displayed as a fold change (2-△△CT) relative to the non-tumor samples.
Construction of cell lines over-expressing PCK2 and cell-proliferation assay
SMMC7721 and Huh-7 cell lines were obtained from the College of Basic Medical Sciences, JilinUniversity, Changchun, Jilin, China.The cells were cultured in DMEM medium supplemented with 10% fetal bovine serumand were incubated at 37ºC with 5% CO2. The cells were collected at the logarithmic phase and were seeded into a 96-well plate at 5,000 cells/well.Each cell line was then transfected with PCK2 overexpression plasmids(OriGeneTechnologies,Inc.China) or with empty plasmids using Trans Intro EL Transfection Reagent(Transgen,China) according to the manufacturer’s protocols.Successful transfection was verified by examining PCK2 expression via qPCR as described above.
To evaluate the effect of PCK2 overexpression on cellular proliferation, a MTT assay was performed. At 24, 48, and 72h post-transfection, 100μl of MTT solution (0.5mg/ml)was added to each well. The samples were then incubated for 4h at 37°C, with the solution then discarded and replaced with 150μl of dimethylsulfoxide (DMSO).The absorbance was measured at 490nm using a plate reader.
Data obtained from the qPCR and MTT analyses were analyzed using GraphPad Prism (version7.0),with differences among groups statistically evaluated by using a two-tailed Student’s t-tests. Differences were considered statistically significant at P<0.05, with all samples examined in triplicate.