Identification of potential significant genes in HBV related hepatocellular carcinoma via bioinformatical analysis

Background: The mortality rate of hepatocellular carcinoma (HCC) is the third highest among cancers worldwide. Infection with hepatitis B virus (HBV) is an important risk factor for the development of HCC. The lack of an available targeted drug for HCC highlights the need to further explore its underlying mechanism. Methods: The gene expression profiles GSE121248, GSE55092 and GSE62232 were obtained from the GEO database. From 129 HCC tissues and 138 normal tissues in the three datasets, we identified differentially expressed genes (DEGs) using GEO2R and Venn diagram software, analyzed Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway and gene ontology (GO) enrichment of the DEGs with DAVID software, modeled the interactions between DEGs using the plotting function of the STRING database, and constructed a protein-protein interaction (PPI) network with Cytoscape software. Significant genes associated with poor prognosis were then selected using UALCAN and validated in Gene Expression Profiling Interactive Analysis (GEPIA). Results: Of the 103 DEGs common to the three datasets, 26 were up-regulated genes enriched in regulation of attachment of spindle microtubules to kinetochore, protein localization to kinetochore, mitotic cytokinesis, cytokinesis, positive regulation of cytokinesis, Cell cycle and p53 signaling pathway, while 77 were down-regulated genes enriched in Retinol metabolism, Caffeine metabolism, Drug metabolism - cytochrome P450, Metabolism of xenobiotics by cytochrome P450, Chemical carcinogenesis, oxidation-reduction process, exogenous drug catabolic process, xenobiotic metabolic process, monocarboxylic acid metabolic process, epoxygenase P450 pathway and drug metabolic process. Analyzing the PPI network with the Molecular Complex Detection (MCODE) plug-in, we found 14 hub genes (TOP2A, CCNB1, RACGAP1, DTL, PBK, NEK2, PRC1, CDK1, RRM2, BUB1B, ECT2, ANLN, HMMR and ASPM), of which 13 (all except PRC1) were associated with significantly worse prognosis based on UALCAN analysis.
All of the 13 genes


Introduction
The mortality rate of HCC is the third highest worldwide [1]. HBV is a predominant causative agent of chronic hepatitis, cirrhosis, and HCC, with approximately 257 million chronic carriers all over the world, especially in East Asia [2]. The HBV regulatory protein HBx has been shown to be implicated in HBV-associated oncogenesis by targeting the epigenetic control of cellular gene expression.
The five-year overall survival rate of HCC patients is only 50-70% [3] owing to the lack of effective targeted drugs, although surgical treatment may be effective in the early stages. Therefore, it is crucial to understand the molecular mechanisms involved in the carcinogenesis and progression of HBV (+) HCC, which would facilitate effective diagnosis and treatment. However, the key genes in the carcinogenesis and progression of HBV (+) HCC remain largely unknown. More reliable prognostic biomarkers should thus be explored as targets for improving treatment and better understanding the underlying mechanism.
Bioinformatics analysis based on microarray and deep sequencing technology has been widely used to reveal molecular heterogeneity between different samples at the genomic level, and it helps identify differentially expressed genes (DEGs) and abnormal pathways involved in the carcinogenesis and progression of HBV (+) HCC. Our investigation contributes to identifying potential key genes and therapeutic targets in the carcinogenesis and progression of HBV (+) HCC.

Microarray data information
In GEO [4] (https://www.ncbi.nlm.nih.gov/geo), a free public database of microarray and gene expression profiles, we found three gene expression profiles, GSE121248, GSE55092 and GSE62232, containing paired HBV (+) HCC tumor and adjacent HBV (+) liver tissues. From these we selected 37 normal tissues and 70 HCC tissues, 91 normal tissues and 49 HCC tissues, and 10 normal tissues and 10 HCC tissues, respectively. Ethical approval was waived since this study used only publicly available data and did not involve any experiment on animals or humans.

Data processing of DEGs
DEGs between HBV (+) HCC tumor tissue and HBV (+) liver tissue were identified with the GEO2R online tool (www.ncbi.nlm.nih.gov/geo/geo2r), using |logFC| > 2 and adjusted P value < 0.05 as cutoffs. The online Venn tool (http://bioinformatics.psb.ugent.be/webtools/Venn/) was used to detect DEGs common to the three datasets. DEGs with logFC < 0 were considered down-regulated genes, while DEGs with logFC > 0 were considered up-regulated genes.
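The filtering and intersection steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the GEO2R or Venn tool implementation: the row layout, the gene names (GENE_A to GENE_D) and all numeric values are hypothetical, and requiring the same direction of change in every dataset is an assumption of this sketch.

```python
from typing import Dict, List, Tuple

# Hypothetical row layout: (gene symbol, logFC, adjusted P value).
Row = Tuple[str, float, float]


def select_degs(rows: List[Row], lfc_cut: float = 2.0, p_cut: float = 0.05) -> Dict[str, str]:
    """Return {gene: 'up' or 'down'} for rows with |logFC| > lfc_cut and adjusted P < p_cut."""
    degs: Dict[str, str] = {}
    for gene, log_fc, adj_p in rows:
        if abs(log_fc) > lfc_cut and adj_p < p_cut:
            degs[gene] = "up" if log_fc > 0 else "down"
    return degs


def common_degs(per_dataset: List[Dict[str, str]]) -> Dict[str, str]:
    """Keep genes called as DEGs in every dataset with the same direction (assumption)."""
    first = per_dataset[0]
    shared = set(first)
    for degs in per_dataset[1:]:
        shared &= set(degs)
    return {g: first[g] for g in sorted(shared)
            if all(d[g] == first[g] for d in per_dataset)}


# Toy input with made-up values, standing in for three GEO2R result tables.
ds1 = [("GENE_A", 3.1, 0.001), ("GENE_B", -4.0, 0.0005), ("GENE_C", 1.2, 0.01)]
ds2 = [("GENE_A", 2.5, 0.010), ("GENE_B", -3.2, 0.0020), ("GENE_D", 2.8, 0.20)]
ds3 = [("GENE_A", 2.2, 0.030), ("GENE_B", -2.6, 0.0100), ("GENE_C", 2.4, 0.01)]

common = common_degs([select_degs(d) for d in (ds1, ds2, ds3)])
print(common)
```

In this toy input, GENE_C fails the |logFC| cutoff in the first table and GENE_D fails the adjusted P cutoff, so neither survives the intersection.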

Gene ontology and pathway enrichment analysis
Gene ontology (GO) is a common bioinformatics tool to annotate genes and identify unique biological properties of high-throughput transcriptome or genome data, which can be classified into biological process (BP), cellular component (CC), and molecular function (MF) [5]. Kyoto Encyclopedia of Genes and Genomes (KEGG) is a useful database for understanding high-level functions and utilities of biological systems from large-scale molecular profiles [6]. GO function and KEGG pathway enrichment of DEGs were analyzed with the Database for Annotation, Visualization and Integrated Discovery (DAVID; version 6.8; https://david.ncifcrf.gov). An adjusted P value < 0.05 was considered statistically significant.
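To make the idea of term enrichment concrete, the following sketch computes an over-representation P value with a plain hypergeometric test and then applies a Benjamini-Hochberg adjustment. Both choices are assumptions made for illustration: DAVID uses its own modified Fisher exact statistic, and the text above does not state which adjustment method was applied. The function names are hypothetical.

```python
from math import comb
from typing import List


def hypergeom_enrichment_p(n_background: int, n_term: int, n_degs: int, n_overlap: int) -> float:
    """P(X >= n_overlap) when n_degs genes are drawn without replacement from
    n_background genes, n_term of which are annotated to the term."""
    total = comb(n_background, n_degs)
    upper = min(n_term, n_degs)
    tail = sum(comb(n_term, k) * comb(n_background - n_term, n_degs - k)
               for k in range(n_overlap, upper + 1))
    return tail / total


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy numbers: 10 background genes, 3 in the term, 4 DEGs, 2 of them in the term.
p_toy = hypergeom_enrichment_p(10, 3, 4, 2)
adj_toy = benjamini_hochberg([0.01, 0.04, 0.03])
```

A term would then be reported as enriched when its adjusted value falls below 0.05, matching the threshold stated above.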

PPI network and module analysis
PPI information can be evaluated with an online tool, STRING (Search Tool for the Retrieval of Interacting Genes) [7]. The STRING app in Cytoscape [8] was then applied to examine the potential correlation between these DEGs (maximum number of interactors = 0 and confidence score ≥ 0.4). In addition, the MCODE app in Cytoscape was used to check modules of the PPI network (degree cutoff = 2, max depth = 100, k-core = 2, and node score cutoff = 0.2).
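Two of the parameters above, the confidence score cutoff and the k-core value, can be illustrated with a short sketch. It is not MCODE itself, which additionally weights vertices and grows modules from seed nodes using the node score cutoff; it only shows edge filtering at confidence ≥ 0.4 followed by 2-core pruning. The protein names P1 to P5 and their scores are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Hypothetical edge layout: (protein A, protein B, combined confidence score).
Edge = Tuple[str, str, float]


def build_graph(edges: Iterable[Edge], min_score: float = 0.4) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from edges with score >= min_score."""
    graph: Dict[str, Set[str]] = defaultdict(set)
    for a, b, score in edges:
        if score >= min_score and a != b:
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def k_core(graph: Dict[str, Set[str]], k: int = 2) -> Set[str]:
    """Repeatedly drop nodes that have fewer than k neighbours still in the graph."""
    alive = set(graph)
    changed = True
    while changed:
        changed = False
        for node in list(alive):
            if len(graph[node] & alive) < k:
                alive.discard(node)
                changed = True
    return alive


# Toy network with made-up scores.
toy_edges = [("P1", "P2", 0.9), ("P2", "P3", 0.8), ("P1", "P3", 0.7),
             ("P3", "P4", 0.6), ("P4", "P5", 0.3)]
toy_graph = build_graph(toy_edges)
core_nodes = k_core(toy_graph, k=2)
print(sorted(core_nodes))
```

In the toy network the P4-P5 edge is discarded by the score cutoff, and P4 is then pruned because it keeps only one neighbour, leaving the P1, P2, P3 triangle as the 2-core.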

Survival analysis and RNA sequencing expression of core genes
UALCAN is a comprehensive, user-friendly, and interactive web resource for analyzing cancer OMICS data. It is built on PERL-CGI with high-quality graphics using JavaScript and CSS [9]. The log rank P value and hazard ratio (HR) with 95% confidence intervals were computed and shown on the plot. To validate these DEGs, we used the GEPIA website to analyze RNA sequencing expression data based on thousands of samples from the GTEx project and TCGA [10].
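For readers unfamiliar with the log rank P value mentioned above, a minimal two-group log-rank test is sketched below. It is a textbook formulation written for illustration, not the code used by UALCAN or any other web tool, and it does not compute the hazard ratio. The data layout and the group names (high and low expression) are assumptions.

```python
from math import erfc, sqrt
from typing import List, Tuple

# Hypothetical observation layout: (follow-up time, True if death observed, False if censored).
Obs = Tuple[float, bool]


def logrank_test(high: List[Obs], low: List[Obs]) -> Tuple[float, float]:
    """Return (chi-square statistic, P value) comparing survival of two groups."""
    event_times = sorted({t for t, event in high + low if event})
    observed = expected = variance = 0.0
    for t in event_times:
        n1 = sum(1 for time, _ in high if time >= t)  # at risk, high-expression group
        n2 = sum(1 for time, _ in low if time >= t)   # at risk, low-expression group
        d1 = sum(1 for time, event in high if event and time == t)
        d2 = sum(1 for time, event in low if event and time == t)
        n, d = n1 + n2, d1 + d2
        observed += d1
        expected += d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed - expected) ** 2 / variance
    # Survival function of a chi-square variable with one degree of freedom.
    return chi2, erfc(sqrt(chi2 / 2))


# Toy cohorts with made-up follow-up times; every subject has an observed event.
chi2_toy, p_toy_lr = logrank_test([(1, True), (2, True)], [(3, True), (4, True)])
```

A gene would be flagged as prognostic when the resulting P value falls below the chosen significance level, with patients split into high- and low-expression groups beforehand.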

Then, we used Venn diagram software to identify the common DEGs in the three datasets. A total of 103 common DEGs were detected, including 77 down-regulated genes (logFC < 0) and 26 up-regulated genes (logFC > 0) in the HCC tissues (Table 1 & Fig. 1).
... (Table 2).
Table 2 Gene ontology analysis of differentially expressed genes in HBV (+) HCC
... (Fig. 2A). Then we applied Cytoscape MCODE for further analysis, and the results showed that 14 central nodes, all of which were up-regulated genes, were identified among the 96 nodes (Fig. 2B).

Analysis of core genes by the Kaplan Meier plotter and GEPIA
Kaplan Meier plotter (http://kmplot.com/analysis) was used to obtain survival data for the 14 core genes.
It was found that 13 genes were associated with significantly worse survival, while one was not (P < 0.05, Fig. 3). GEPIA was then used to compare the expression levels of the 13 genes between cancer patients and normal people. All 13 genes were highly expressed in HCC samples compared with normal samples (P < 0.05, Fig. 4). To understand the possible pathways of the 13 selected hub genes, KEGG pathway enrichment was reanalyzed via DAVID (P < 0.05). The results showed that two genes (CDK1 and CCNB1) were enriched in the p53 signaling pathway.
CDK1, a member of the Ser/Thr protein kinase family, plays an essential role in the G1/S and G2/M phase transitions of the eukaryotic cell cycle by interacting with CCNB1. Cyclin B1 (CCNB1), a regulatory protein, plays an important role in controlling the G2/M transition phase during mitosis. CCNB1 upregulation has been reported to be a significant prognostic marker for poor outcome in HCC [11].
Zhang Yong et al. [12] found that CCNB1 gene knockout can significantly increase the sensitivity of hepatocellular carcinoma HepG2 cells to the chemotherapeutic drug daunorubicin, and that the proliferation of hepatocellular carcinoma cells with low CCNB1 expression is significantly inhibited.
Thus, the clonal proliferation ability of HepG2 cells is indeed affected by high expression of CCNB1. Notably, inhibiting CCNB1 expression not only effectively inhibits the proliferation of liver cancer cells but also increases their sensitivity to chemotherapy drugs.
Zhang et al. [13] found that silencing CCNB1 can induce p53 reactivation and regulate apoptosis-related proteins, reduce the proliferation capacity of pancreatic cancer cells and the proportion of liver cancer cells in S phase, significantly enhance the apoptosis and senescence of pancreatic cancer cells, and increase the proportion of cells in G0/G1 phase. Similarly, other studies have reported that abnormal CCNB1 expression can affect tumor biology through the p53 signaling pathway [14,15].
However, the role of these genes in HBV (+) HCC is unclear, and further research is needed. In conclusion, this study revealed that CDK1 and CCNB1 were two potential key genes for hepatocarcinogenesis, and may be candidate biomarkers and potential therapeutic targets for HBV ... and poor patient survival.
DTL depletion inhibited liver cancer cell growth, increased senescence, and reduced tumorigenesis. Moreover, DTL silencing inhibited the growth of patient-derived primary cultured HCC cells [19].
Never in mitosis gene-A (NIMA)-related expressed kinase 2 (NEK2) has been recently reported to play a role in tumor progression, drug resistance and tumorigenesis. NEK2 was overexpressed in human HCC. NEK2 overexpression was significantly associated with liver noncapsulation and predicted poor survival outcomes in HCC patients after hepatectomy. In addition, NEK2 significantly enhanced HCC cell invasive ability [20].
PRC1 expression is associated with early recurrence of liver cancer and poor prognosis in patients. In HCC, PRC1 promotes tumorigenesis, and its expression and distribution are dynamically regulated by Wnt3a signaling [21].
Liu X et al. found that RRM2 might be targeted for HBV inhibition, and the RRM2-targeting compound osalmid and its derivative YZ51 could be a novel class of anti-HBV candidates with potential use for hepatitis B and HBV-related HCC treatment [22].
Silencing BUB1B expression in HepG2, a hepatocellular carcinoma cell line, decreased the proliferation and invasion ability of the cells. The survival rate of HCC patients with high BUB1B expression is worse than that of HCC patients with low BUB1B expression. Further analysis found that among HCC patients with a history of hepatitis virus infection (n = 150), the level of BUB1B expression has no effect on prognosis; only in HCC patients not infected with hepatitis virus (n = 167) can BUB1B expression be used as a molecular marker to predict prognosis. The BUB1B gene is highly expressed in HCC patients and promotes the proliferation and invasion of liver cancer cells [23].
The upregulation of ECT2 is significantly associated with early recurrent HCC disease and poor survival. Knockdown of ECT2 markedly suppressed Rho GTPases activities, enhanced apoptosis, attenuated oncogenicity and reduced the metastatic ability of HCC cells. Also, ECT2 is closely associated with the activation of the Rho/ERK signalling axis to promote early HCC recurrence [24].

Conclusion
Taken together, our bioinformatics analysis identified two DEGs (CDK1 and CCNB1) between HCC tissues and normal tissues on the basis of three different microarray datasets. The results showed that these two genes could play key roles in the progression of HCC. However, these predictions should be

Availability of data and materials
The data that support the findings of this study are available from GEO database, DAVID, STRING, and GEPIA database, as is mentioned in the "Methods" section.

Author contributions
YZ and YYG collected and analyzed the data. YZ wrote the manuscript. WYY contributed significantly to analysis and manuscript preparation; YYW, DXC, YB OY and XN L designed and supervised the study. All authors read and approved the final manuscript.

Ethics approval and consent to participate
This study used and analyzed public datasets, so ethical approval was not needed.

Consent for publication
No conflict of interest.
Figure 6 The prognostic information of the 14 core genes. Kaplan Meier plotter online tools were used to identify the prognostic information of the 14 core genes, and 13 of the 14 genes had a significantly worse survival rate (P < 0.05).
Figure 7 Significantly expressed 13 genes in HBV (+) HCC patients compared to healthy people. To further identify the genes' expression levels between HBV (+) HCC patients and normal people, the 13 genes related to poor prognosis were analyzed with the GEPIA website. All genes had significantly different expression levels in HCC specimens compared to normal specimens (*P < 0.05). Red indicates tumor tissues and grey indicates normal tissues.