Construction and validation of mRNA model
320 HCC samples with complete mRNA expression profiling and survival information were kept as a training set. 267 DE-mRNAs (including 184 up-regulated and 83 down-regulated mRNAs in HCC) were identified (Figure S1A, 1B, and Table S1) to perform univariate Cox regression analysis. Among those DE-mRNAs, 124 mRNAs were significantly associated with OS (Table S2). Then 91 mRNAs with HR > 1 and up-regulated in HCC, and 31 mRNAs with HR < 1 and down-regulated in HCC were selected to perform LASSO Cox analysis for key mRNAs selection (Figure 2A). Parameter log (λ) = -3.154 (λ = 0.04269) chosen by 10-fold cross validation method with minimum criteria was regarded as the best value (Figure S1C). 16 key mRNAs with nonzero coefficients (Figure 2B), associated with HCC OS (Figure S1D), and significantly changed in HCC samples (Figure S1E1, S1E2) were used to build the mRNA model (Figure S1F). The mRNA risk score for each patient was computed as follows: mRNA risk score = ∑βi × exp-mRNA, where exp-mRNA is the expression level of key mRNA and β is the regression coefficient derived from the LASSO COX analysis (Table S9). The mRNA model was evaluated with C-index, ROC curve, and survival analysis (Figure 2C, 2D), which showed a relatively good predictive ability (C-index = 0.752). 160 HCC samples were randomly selected as a test set to validate the mRNA model, which still had a good performance (C-index = 0.744) (Figure 2E, 2F). Besides, the mRNA model was externally verified in the LIRI-JP dataset with survival analysis, which also had a decent performance (P value = 0.00016) (Figure 2G).
Construction and validation of lncRNA model
320 HCC samples with complete lncRNA expression profiling and survival information were retained as a training set. 540 up-regulated and 27 down-regulated lncRNAs in HCC were identified (Figure S2A, 2B, and Table S3) to perform univariate Cox regression analysis (Table S4).137 lncRNAs with HR > 1 and up-regulated in HCC, and 9 lncRNAs with HR < 1 and down-regulated in HCC were selected to perform LASSO COX analysis for key lncRNAs selection (Figure 3A). Parameter log (λ) = -2.99 (λ = 0.05027) chosen by 10-fold cross validation method with minimum criteria was regarded as the best value (Figure S2C). 20 key lncRNAs with nonzero coefficients (Figure 3B), associated with OS (Figure S2D), and significantly changed in HCC samples (Figure S2E1, S2E2) were used to build the lncRNA model (Figure S2F). The lncRNA risk score for each patient was computed as follows: lncRNA risk score = ∑βi × exp-lncRNA, where exp-lncRNA is the expression level of key lncRNA and β is the regression coefficient derived from the LASSO Cox analysis (Table S9). In the training set, the AUC of the lncRNA model at 1, 3, and 5 year OS was 0.854, 0.796, and 0.806 respectively, while the C-index was 0.753(Figure 3C). In the test set, the AUC at 1, 3, and 5 year OS was 0.807, 0.785, and 0.767 respectively, while the C-index was 0.727(Figure 3E). In addition, the log-rank analysis revealed that scoring using the lncRNA risk score could discriminate the risk groups in the training set and test set (P value < 0.0001) (Figure 3D, 3F).
Construction and validation of miRNA model
321 HCC samples with complete miRNA and survival information were retained as a training set. 16 up-regulated and 70 down-regulated miRNAs in HCC were identified (Figure S3A, 3B, and Table S5). 5 miRNAs with HR > 1 and up-regulated in HCC, and 5 miRNAs with HR < 1 and down-regulated in HCC were selected via univariate Cox regression analysis (Figure 4A, Table S6). Through the stepwise Cox analysis, 5 key miRNAs were selected to build the miRNA model (Figure 4B, S3C, and S3D). The miRNA risk score for each patient was computed as follows: miRNA risk score = ∑βi × exp-miRNA, where exp-miRNA is the expression level of key miRNA and β is the regression coefficient derived from the stepwise Cox analysis (Table S9). Survival analysis showed that the high risk group has poor outcome in the training set (P = 0.00037) and test set (P = 0.027) (Figure 4D, 4F). Besides, whether in the training set or the test set, the AUC values of the miRNA model at 1,3, and 5 year point were all more than 0.68, and the C-index values were over 0.65(Figure 4C, 4E).
Construction and validation of CNV model
324 HCC samples with complete CNV and survival information were retained as a training set. 5006 genes with different copy number alteration between HCC and non-tumor samples (Figure 5A, Table S7) were selected to perform univariate Cox regression analysis, and 357 CNV genes significantly associated with OS were identified(Table S8). Then LASSO COX analysis was performed for key CNV genes selection. Parameter log (λ) = -2.634269 (λ = 0.07177142) chosen by 10-fold cross validation method with minimum criteria was regarded as the best value (Figure S4A). 5 key CNV genes with nonzero coefficients (Figure 5B), significantly different in HCC samples (Figure S4B) and associated with OS (Figure S4C) were used to build CNV model (Figure S4D). The CNV risk score for each patient was computed as follows: CNV risk score = ∑βi × CNV gene status, where β is the regression coefficient derived from the LASSO Cox analysis (Table S9). The CNV model was evaluated with survival analysis in the training set (Figure 5D) and test set (Figure 5F), which showed a worse prognosis in the high risk group (at least one key CNV gene with copy number alteration). Moreover, the AUC values of the CNV model at 1,3, and 5 year OS were all over 0.65, and the C-index values were more than 0.63(Figure 5C, 5E).
Construction and validation of SNP model
313 HCC samples with complete SNP and survival information were retained as a training set. 85 high-frequency SNPs (Figure 6A, Figure S5A) in HCC were selected to perform univariate Cox analysis, and 10 high-frequency SNPs significantly associated with OS were identified (Figure S5B). Through stepwise Cox analysis, 7 key SNPs were used to build the SNP model (Figure S5C). The SNP risk score for each patient was computed as follows: SNP risk score = ∑βi × SNP status, where β is the regression coefficient derived from the stepwise Cox analysis (Table S9). In the training set, the AUC of the SNP model at 1, 3, and 5 year OS was 0.799, 0.703, and 0.745 respectively, while the C-index was 0.709(Figure 6B). In the test set, the AUC at 1, 3, and 5 year OS was 0.745, 0.660, and 0.737 respectively, while the C-index was 0.683(Figure 6D). In addition, survival analysis showed that the high risk group (at least one key SNP with non-synonymous mutation) has poor prognosis in the training set (P < 0.0001), test set (P < 0.0001), and external validation set (P = 0.029) (Figure 6C, 6E, and 6F).
Construction and validation of multi-omics model
302 HCC samples with complete mRNA, lncRNA, miRNA, CNV, SNP, and survival information were retained as a training set. Through multiple Cox regression analysis, the 5 single-omic models were integrated to construct a multi-omics model and visualized as a nomogram (Figure 7A). A fairly good agreement was observed between the expected and observed outcomes for 1, 3, and 5 year OS in the calibration curves of the multi-omics model (Figure 7B). Whether in the training set or the test set, the AUC values of the multi-omics model at 1,3, and 5 year point were all over 0.840, and the C-index values were more than 0.800 (Figure 7C, 7H), which were significantly greater than those of the 5 single-omic models (all P values are less than 0.05)(Figure 7G). DCA analysis showed that the multi-omics model has better performance in predicting prognosis than the 5 single-omic models (Figure 7E, 7F). In addition, we stratified patients into low, medium, and high risk groups based on the total points of nomogram (cut-off points were selected at each tertile point), and found that scoring using the nomogram effectively discriminated the risk groups in the training set and test set (P < 0.0001) (Figure 7D, 7I).