Genomics and Prognosis Analysis of Circadian Clock Genes in Hepatocellular Carcinoma

Background: Circadian clock genes have been reported to regulate the carcinogenesis and progression of numerous cancers. Nevertheless, the specific relationship between hepatocellular carcinoma (HCC) and circadian rhythm-associated genes remains to be clarified. We therefore evaluated the prognostic value of circadian clock genes in HCC using the online datasets of The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC).

Methods: RNA-seq data for the selected core circadian genes in HCC patients, together with the relevant clinical data, were acquired from the online TCGA and ICGC databases. Analyses were performed with R software and the cBioPortal website.

Results: Among the 22 typical circadian clock genes, 16 showed statistically significant differential expression between HCC and adjacent normal tissues. Eleven clock genes, weighted by their regression coefficients, were used to construct a new risk score formula, which was related to prognosis in HCC. Moreover, a new nomogram consisting of the risk score and several clinical traits could be applied to predict overall survival (OS) time for the patients. Finally, we identified a novel nomogram related to OS in HCC patients through a comprehensive analysis of circadian clock genes and other clinical characteristics. This is also the first time we have systematically demonstrated the relationship between clock genes and HCC prognosis, which may contribute to the treatment of HCC.

Conclusions: The current study demonstrates the potential of circadian clock genes as clinically relevant biomarkers for prognosis prediction in HCC, which may contribute significantly to further investigations of HCC progression.


Hepatocellular carcinoma (HCC) is one of the most frequent causes of mortality from abdominal malignancies worldwide (1). As the most prevalent type of primary liver cancer, HCC accounts for over 70% of liver carcinomas (2), and its incidence has increased over the past several decades (3), mainly as a result of hepatitis, alcohol abuse and several metabolic syndromes (4). Current treatment strategies for HCC consist of radical treatments for local lesions, such as curative resection and liver transplantation (5), and palliative therapies including ablation, chemotherapy and others (6). However, owing to the limitations of existing medical examinations and the asymptomatic progression of the disease, patients tend to be diagnosed at an intermediate or even advanced stage, which inevitably affects the prognosis of HCC. Previous studies have identified several markers related to the development of HCC (7-9), which help clinicians make reasonable treatment choices and evaluate prognosis. Nevertheless, owing to limited sample sizes and selection bias, marker-targeted treatment may be affected to a certain degree. It is therefore imperative to identify reliable biomarkers in order to develop a more accurate model for early diagnosis and prognosis prediction.

As an endogenous oscillator, the circadian clock is located in the suprachiasmatic nucleus of the hypothalamus and is controlled by a couple of transcription-translation feedback loops (10, 11). The genes encoding these feedback loops are the so-called circadian clock genes, and their expression varies over the 24-hour oscillation (12). The core circadian clock genes are CLOCK, ARNTL, PER1, PER2, PER3, CRY1, and CRY2; however, the core components of the mammalian circadian clock are more extensive and comprise 22 genes, including the core clock genes plus BTRC, CSNK1D, CSNK1E, CUL1, DBP, FBXL21, FBXL3, NFIL3, NR1D1, NR1D2, PRKAA1, PRKAA2, RORA, RORB, and SKP1 (13, 14), which were involved in [...]

The TCGA data, normalized as log2(TPM+1), were selected as the training set, and the ICGC dataset, transformed as log2(RNA-seq+1), was used as the validation set. A total of 664 specimens were analyzed in our research. Expression analysis of the selected core circadian clock genes was carried out on 374 liver cancer tissues and 50 adjacent normal tissues from the TCGA database, as well as 240 primary liver cancer tissues from the ICGC database. The basic clinical characteristics of the HCC patients are shown in Table 1.
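The log2(x+1) transformation mentioned above can be illustrated with a minimal Python sketch. The function name `log2_plus_one` and the input values are hypothetical and chosen only for illustration; the study itself used R, and nothing here reproduces its actual pipeline.

```python
import math


def log2_plus_one(values):
    """Return log2(x + 1) for each non-negative expression value.

    Adding 1 before taking the logarithm keeps zero counts defined
    (log2(0 + 1) = 0) and compresses the dynamic range of the data.
    """
    return [math.log2(v + 1.0) for v in values]


# Hypothetical TPM values for one gene across four samples.
tpm = [0.0, 1.0, 3.0, 1023.0]
transformed = log2_plus_one(tpm)
```

Applying the same transformation to both cohorts places the training and validation expression values on a comparable logarithmic scale, although it does not by itself remove platform or batch differences.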

All the data analyzed in this research are publicly available online.

Table 1.

The clustering ability of the circadian clock gene-based risk score was confirmed by [...]

Subsequently, to evaluate the specificity of the risk score, we performed receiver operating characteristic (ROC) analysis with the survivalROC package in R. Prognostic accuracy was described by the area under the curve (AUC). A P value < 0.05 was regarded as statistically significant.

The nomogram was constructed with the rms package in R. Based on the TCGA database, age, gender, tumor grade, tumor stage, and the risk score were selected as risk factors for prognosis prediction in HCC. The discrimination of the nomogram was measured by the concordance index (C-index). The C-index ranges from 0.5 (no discrimination) to 1 (perfect discrimination), and a higher C-index indicates better discrimination of the prognostic model.
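To make the C-index concrete, the following is a minimal Python sketch of Harrell's concordance index for right-censored survival data. It is an illustration under stated assumptions, not the rms implementation used in the study: the function name `concordance_index`, the convention that tied risk scores count as 0.5, and the toy data are all assumptions introduced here.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (illustrative sketch).

    A pair (i, j) is comparable when subject i has the shorter follow-up
    time and experienced the event (events[i] == 1). The pair is
    concordant when subject i also has the higher risk score. Tied risk
    scores contribute 0.5 (an assumed, commonly used convention).
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i] == 1:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Hypothetical toy cohort: survival time, event indicator (1 = death,
# 0 = censored), and a risk score per patient.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.7, 0.4, 0.1]

c_perfect = concordance_index(times, events, risk)
c_reversed = concordance_index(times, events, risk[::-1])
c_uninformative = concordance_index(times, events, [1.0, 1.0, 1.0, 1.0])
```

In this toy cohort, higher risk scores always accompany shorter survival, so every comparable pair is concordant; reversing the scores makes every pair discordant, and a constant score yields the no-discrimination value of 0.5.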

Rhythm-related genes were differentially expressed in HCC.

To explore the expression of circadian clock genes in HCC, we selected 22 core clock genes according to current research (25, 26) and downloaded the patient samples from the TCGA dataset. As illustrated in the heat map in Figure 1A [...]

Patients with elevated risk scores had poorer survival.

To further confirm the credibility of the risk score, we studied its prognostic value. Based on the TCGA database, the risk score was an independent prognostic factor for overall survival. Patients in the low-risk group showed considerably more favorable overall survival (p = 1.79e-06) than those in the high-risk group (Figure 6A and B). Moreover, the ROC analysis showed that the risk score had higher predictive accuracy than the clinical characteristics, with AUC values of 0.754, 0.691 and 0.639 for one-, three- and five-year survival, respectively (Figure 6C). The ICGC dataset verified these results as well. As presented in Figure 6D [...]

In our study, we attempted for the first time to explore robust prognosis prediction [...] with their corresponding regression coefficients, which separated the patients into a low-risk group and a high-risk group according to the median of the risk score. Subsequently, the PCA suggested that there were significant differences between the two groups.
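The risk score construction described above, a sum of gene expression values weighted by regression coefficients followed by a median split, can be sketched as follows. This is a minimal illustration in Python, not the authors' R code: the gene names `GENE_A` and `GENE_B`, the coefficients, the expression values, and the choice to assign scores strictly above the median to the high-risk group are all hypothetical assumptions.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Weighted sum of expression values: sum(coef_g * expr_g)."""
    return sum(coefficients[g] * expression[g] for g in coefficients)


def split_by_median(scores):
    """Label each patient 'high' or 'low' relative to the median score.

    Assumption: scores strictly above the median are 'high'; the source
    does not state how scores equal to the median are handled.
    """
    cutoff = median(scores.values())
    return {p: ("high" if s > cutoff else "low") for p, s in scores.items()}


# Hypothetical coefficients and log-scale expression values.
coefficients = {"GENE_A": 0.5, "GENE_B": -0.25}
patients = {
    "P1": {"GENE_A": 2.0, "GENE_B": 4.0},  # 0.5*2.0 - 0.25*4.0 = 0.0
    "P2": {"GENE_A": 4.0, "GENE_B": 0.0},  # 2.0
    "P3": {"GENE_A": 2.0, "GENE_B": 0.0},  # 1.0
    "P4": {"GENE_A": 6.0, "GENE_B": 0.0},  # 3.0
}

scores = {p: risk_score(expr, coefficients) for p, expr in patients.items()}
groups = split_by_median(scores)
```

A positive coefficient raises the score as expression increases, while a negative coefficient lowers it, so the sign of each coefficient indicates whether higher expression of that gene is associated with higher or lower estimated risk.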

In addition, several vital clinical characteristics were found to be related to the risk score. Specifically, the correlation analysis between the risk score and clinical traits showed that tumor grade and tumor stage were positively correlated with the risk score, whereas no correlation was found between the risk score and age or sex.

It is well known that tumor stage and grade are related to patient survival, which further supports our analysis. As expected, the risk score was an independent prognostic factor for overall survival, and an increased risk score indicated poor prognosis. Moreover, the ROC analysis showed that the risk score had higher predictive accuracy than the clinical characteristics.

Finally, by combining the risk score with age, gender, tumor grade and stage, a novel nomogram was developed to assess prognosis in diagnosed HCC patients. The [...]

Figure 1