3.1 The human nuclear receptors (NR) could effectively separate HCC samples with different prognosis
Based on the cumulative distribution function of clustering (Fig. 1A and B), we performed consensus clustering analysis with the mRNA levels of 48 NRs by using the "ConsensusClusterPlus" function package in R language, and divided all HCC samples into four categories (k = 4). The consistency matrix (Fig. 1C) and the heatmap of consensus matrix (Fig. 1D) all showed that the consistency clustering based on the mRNA level of NRs could clearly distinguished these four categories. The results of PCA analysis also showed that the differences among the four groups of samples were significant (Fig. 1E). The survival analysis based on Kaplan-Meier method was performed and indicated that there were significant differences in overall survival among the four types of samples, and cluster3 exhibited a worst prognosis (Fig. 1F). These results indicated that the mRNA level of NRs could efficiently separate HCC samples with different prognosis.
3.2 Prognostic significance of NRs in HCC
In order to determine the prognostic role of NRs in HCC, the Univariate cox regression analysis with training set samples based on the mRNA level of 48 NRs was performed, and the hazard ratio (HR) of each NR was calculated with P < 0.05 as the significant threshold. The results indicated that these six NRs including PPARD (HR = 1.3, 95% CI: 1 - 1.6, p = 0.016), PPARG (HR = 1.2, 95% CI: 1 - 1.5 , p = 0.021), NR1H3 (HR = 1.4, 95% CI: 1.1 - 1.7, p = 0.0057), VDR (HR = 1.3, 95% CI: 1.1 - 1.6, p = 0.011), NR2C1 (HR = 1.3, 95% CI: 1 - 1.6, p = 0.018) and NR6A1 (HR = 1.4, 95% CI: 1.1 - 1.7, p = 0.0047) were significantly related to the overall survival in HCC samples, and were risk genes that can result in the poor prognosis (Fig. 2A). Meanwhile, the three NRs including NR1I2 (HR = 0.81, 95% CI: 0.68 - 0.96, p = 0.013), ESR1 (HR = 0.77, 95% CI: 0.63 - 0.94, p = 0.012) and AR (HR = 0.82, 95% CI: 0.69 - 0.99, p = 0.038) were also significantly related to the overall survival in HCC samples, but these three NRs were protective genes that can be favorable for prognosis (Fig. 2A). Then, LASSO Cox regression analysis on training set samples based on the selected 9 NRs was performed, eight optimal NRs (NR1H3, ESR1, NR1I2, NR2C1, NR6A1, PPARD, PPARG and VDR) were determined based on the lowest lambda value of each gene (Fig. 2B and C).
Next, to obtain a uniform threshold to successfully divide all HCC patients into high risk group and low risk group across different sample sets, we standardized the expression values of 8 genes both in the TCGA dataset and ICGC dataset into the values with an average value of 0 and a standard deviation (SD) of 1. Then we weighted the normalized expression of each nuclear receptor with the regression coefficient of LASSO Cox regression analysis and established a risk score model for predicting patient survival by the following formula: Risk Score = 0.1765 * express value of NR1H3 - 0.11 * express value of ESR1 - 0.1501 * express value of NR1I2 + 0.0495 * express value of NR2C1 + 0.1377 * express value of NR6A1 + 0.0917 * express value of PPARD + 0.0004 * express value of PPARG + 0.1276* express value of VDR. Based on the formula, risk score of each patient was calculated. And the samples of TCGA training set, TCGA testing set and ICGC verifying set were divided into high-risk group and low-risk group according to the calculated optimal cut-off point (0.0326), the risk score distribution of samples in three data sets was shown in the left panel of Fig. 3. Meanwhile, the expression of eight NR in the model exhibited significant differences between high risk group and low risk group (the second from left of Fig. 3). The survival curve showed that the HCC samples of high-risk group had poor overall survival than low-risk group in the three data sets (the third from left of Fig. 3). In addition, the time dependent ROC curve showed that the AUC of 1-year, 3-year and 5-year survival of the training set is 0.732, 0.701 and 0.678; the AUC of 1-year, 3-year and 5-year survival of the testing set was 0.719, 0.651 and 0.57; and the AUC of the 1-year, 3-year and 5-year survival of ICGC verifying set is 0.522 0.615, and 0.593, respectively (the right panel of Fig. 3). The results indicated that prognostic model constructed based on the eight NRs (NR1H3, ESR1, NR1I2, NR2C1, NR6A1, PPARD, PPARG and VDR) could effectively predict the prognosis of HCC patients in three sets of data.
3.3 Immune status of HCC samples between the high and low risk groups
We estimated the differences of the immune infiltration including 22 immune cells in HCC samples between the high and low risk groups through comprehensive analysis based on CIBERSORT and LM22 eigenmatrix. The result of immune cells infiltration in 365 patients with HCC was summarized in Fig. 4A, and changes in the proportion of tumor infiltrating immune cells among different patients might represent the intrinsic characteristics of individual differences. Meanwhile, the results of analysis indicated that the proportion of infiltration of different types of immune cells was weakly correlated (Fig. 4B). In addition, there were significant differences in the infiltration proportion of nine types of immune cells including B cells memory, dendritic cells resting, macrophages M0, macrophages M2, mas cells resting, monocytes, NK cells resting, T cells follicular helper and T cells regulatory (Fig. 4C).
The expression of immune checkpoint has becoming a promising biomarker for the selection of immunotherapy for liver cancer patients . We found a close correlation between risk score of HCC patients and the expression of key immune checkpoints composed of CTLA4, PD1, TIM3, LAG3 and TIGIT (Fig. 4D). Meanwhile, expression of CTLA4, PD1, TIM3, LAG3 and TIGIT in HCC patients between the high and low risk groups were analyzed, and the results indicated that the expression of CTLA4, TIM3, LAG3 and TIGIT in high risk group were obviously lower than low risk group (p < 0.05) (Fig. 4E), suggesting that the poor prognosis of HCC patients with high risk might be due to the immunosuppressive microenvironments in liver cells.
3.4 Risk score is an independent prognostic signature for HCC patients
The Multivariate Cox regression analysis was performed to determine whether the risk score is an independent prognostic signature bringing into multiple factors including age, gender, TNM Stages, grade, vascular tumour invasion and risk score. The results indicated that risk score was still significantly correlated with overall survival (Fig. 5A). Meanwhile, the samples with high risk score had a greater risk of death, which was an adverse prognostic factor with the low risk group as a reference (HR=2.114, 95% CI: 1.329 - 3.36, P =0.0016) (Fig. 5A).
To further explore the prognostic value of risk Score in HCC samples with different clinicopathologic factors including age, gender and TNM Stage), we regrouped HCC samples according to these above factors and performed Kaplan-Meier survival analysis. For samples of age < = 61 (Fig. 5B), samples of age > 61 (Fig. 5C), female samples (Fig. 5D), male samples (Fig. 5E), samples of early cancer (Stage I+II) (Fig. 5F) and samples of late cancer (Stage III+IV) (Fig. 5G), the overall survival in the high risk group was significantly lower than that in the low risk group, indicating that risk score could predict the prognosis of patients with HCC as an independent prognostic signature.
3.5 The Nomogram model could efficiently predict the long-term survival of HCC patients
We constructed the nomogram model based on the two independent prognostic factors, TNM Stage and Risk Score (Fig. 6A). For each patient, we draw three lines up to determine the point which was obtained from each factor in the nomogram. The sum of these points was located on the "Total Points" axis, and then a line is drawn down from the "Total Points" axis to determine the probability of the survival for 1, 3 and 5 years in HCC patients. The results indicated that the corrected curve is close to the ideal curve (a 45-degree line with slope of 1 through the origin of the coordinate axis), suggesting that the prediction is in better agreement with the actual results (Fig. 6B). And compared with nomogram containing one independent prognostic factor, the nomogram containing all independent prognostic factors had the largest AUC of survival for 3-year or 5-year in HCC patients (Fig. 6C), indicating that the nomogram constructed based on the all independent prognostic factors could efficiently predict the long-term survival of HCC patients compared with the nomogram based on a single independent prognostic factor.