3.1 High-throughput sequencing of mRNAs from CON, CHB, LF/LC and HCC
Figure 1 shows the overall design of the study. A total of 332,720,614, 405,844,258, 425,724,230 and 430,992,628 raw reads were sequenced from CON, CHB, LF/LC, and HCC samples, respectively. For CON samples, 314,022,012 clean reads were obtained, accounting for 94.4% of the raw reads. Likewise, 384,396,180, 404,504,412 and 407,007,164 clean reads were obtained for CHB, LF/LC and HCC samples, respectively. We sought to obtain the sequences of mRNAs shared among the four groups.
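As a quick arithmetic check, the clean-read percentage for each group can be recomputed from the counts given above. The sketch below uses only the reported read counts; the helper name `clean_fraction` is ours and is not part of the original pipeline.

```python
# Raw and clean read counts as reported in the text above.
raw_reads = {
    "CON": 332_720_614,
    "CHB": 405_844_258,
    "LF/LC": 425_724_230,
    "HCC": 430_992_628,
}
clean_reads = {
    "CON": 314_022_012,
    "CHB": 384_396_180,
    "LF/LC": 404_504_412,
    "HCC": 407_007_164,
}


def clean_fraction(group):
    """Return the percentage of raw reads retained as clean reads."""
    return 100.0 * clean_reads[group] / raw_reads[group]


for group in raw_reads:
    print(f"{group}: {clean_fraction(group):.1f}% clean reads")
```

For the CON group this reproduces the 94.4% stated in the text.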
3.2 Identification of candidate biomarkers for the progression of HBV-related diseases
Bioinformatic analysis was then performed on all sequenced mRNAs to identify key biomarkers across the four groups. Compared with the CON group, 1,581 differentially expressed mRNAs (901 upregulated and 680 downregulated) were detected in the CHB group, 2,141 (1,493 upregulated and 648 downregulated) in the LF/LC group, and 1,588 (1,130 upregulated and 458 downregulated) in the HCC group (Fig. S1A). We then used Venn diagrams and volcano plots to analyze the differentially expressed mRNAs (Fig. S1B-E). To determine the key genes and modules involved in HBV-related liver diseases, WGCNA was conducted to reveal highly correlated gene pairs and co-expression networks. As shown in Fig. 2A-C, the soft-thresholding power β was set to 6 to establish a scale-free network, and the hierarchical clustering dendrogram identified 30 gene modules. Because we were interested in potential biomarkers that would be practical and convenient for predicting the progression of HBV-associated liver diseases, the green-yellow, green, turquoise and white modules were selected. Subsequently, differentially expressed mRNAs were clustered by the STEM method, and the results demonstrated that 25 mRNAs were highly expressed across the CON, CHB, LF/LC and HCC groups (Fig. 2D, E). By combining these results with the WGCNA analysis, 9 progressively overexpressed mRNAs, namely SHC1, SLAMF8, IL-32, ITGB2, MANF, tandem C2 domains, nuclear (TC2N), synaptopodin (SYNPO), UDP-glucose pyrophosphorylase 2 (UGP2) and EGF-containing fibulin extracellular matrix protein 1 (EFEMP1), were chosen for further analyses. KEGG pathway and GO enrichment analyses of these 9 mRNAs indicated that their functions were primarily related to natural killer cell mediated cytotoxicity, protein binding, and metal ion binding (Fig. 2F, G).
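To illustrate what a soft-thresholding power of 6 does, the sketch below raises the absolute Pearson correlation between two genes to the power β, which is the standard adjacency for an unsigned WGCNA network. Whether the authors used an unsigned or signed network is not stated, so the unsigned form is an assumption here, and the expression values and gene names are hypothetical toy data.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def soft_threshold_adjacency(r, beta=6):
    """Unsigned soft-threshold adjacency |r|^beta (assumed network type)."""
    return abs(r) ** beta


# Hypothetical toy expression profiles across four samples.
expression = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.1, 3.9, 6.2, 8.0],
    "gene_c": [3.0, 1.0, 4.0, 2.0],
}

genes = list(expression)
for i, g1 in enumerate(genes):
    for g2 in genes[i + 1:]:
        r = pearson(expression[g1], expression[g2])
        print(g1, g2, round(r, 3), round(soft_threshold_adjacency(r), 4))
```

Raising correlations to a power above 1 suppresses weak correlations much more strongly than strong ones (a correlation of 0.5 becomes an adjacency of about 0.016), which is what pushes the network toward a scale-free topology.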
3.3 Verification of progressively elevated mRNA abundance through qRT-PCR
The expression levels of the 9 progressively elevated mRNAs were verified by qRT-PCR in liver tissues (n=5 per group). Of these, 8 mRNAs (SHC1, SLAMF8, IL-32, ITGB2, MANF, TC2N, SYNPO and EFEMP1) showed a gradually increasing trend across the four groups (Fig. 3). We further performed qRT-PCR on PBMCs (n=20 per group), and 3 mRNAs (SHC1, SLAMF8 and IL-32) had the same expression trends as those observed in liver tissues (Fig. 3). EFEMP1 showed an increasing trend in liver tissues, but its expression in PBMCs was too low to be detected, so these results are not shown in the figure. Recent studies have shown that ITGB2 is elevated in many malignant tumors [13, 14], and Liang Zhang et al. [15] suggested that the expression of ITGA2 was higher in liver cancer tissues than in para-carcinoma tissues and that ITGB2 was closely related to ITGA2. We therefore asked whether the expression of ITGB2 was elevated in hepatocellular carcinoma. Jingyi He et al. [16] showed that MANF was overexpressed in HCC, whereas Jun Liu et al. [17] found that MANF mRNA and protein levels were lower in HCC tissues than in adjacent noncancerous tissues. We therefore also asked whether the expression of MANF was higher in liver cancer. In summary, we selected five candidate mRNAs (SHC1, SLAMF8, IL-32, ITGB2 and MANF) for further verification.
3.4 IHC staining for SHC1, SLAMF8, IL-32, ITGB2 and MANF in normal and pathologically altered livers
The expression of SHC1, SLAMF8, IL-32, ITGB2 and MANF in liver tissues was detected by IHC staining (n=5 per group). As the severity of HBV-related liver disease increased, the IHC intensities of SHC1 and SLAMF8 gradually rose (Fig. 4). IOD analysis indicated that hepatic expression of IL-32 was highest in HCC patients, followed by LF/LC patients, CHB patients and control subjects (Fig. 4). In addition, the expression of ITGB2 and MANF was markedly higher in LF/LC tissues than in CHB and healthy tissues. However, ITGB2 and MANF levels were lower in HCC tissues than in LF/LC tissues (Fig. 4).
3.5 Preliminary validation of the candidate biomarkers via ELISA
As shown in Table 1, the proportion of male participants, age, aspartate transaminase (AST), ferritin, FIB-4, and APRI levels increased across the four groups, whereas the platelet count (PLT), white blood cell count (WBC) and albumin (ALB) levels decreased. Notably, plasma SHC1 levels were higher in HCC patients than in LF/LC patients, CHB patients and control subjects (10.7±2.0 vs. 9.7±1.7, 7.5±1.7 and 4.6±1.3 ng/mL, respectively, P<0.001, Table 1). The median SLAMF8 values also increased with disease progression (P<0.001, Table 1): 8.8, 6.9, 6.1 and 4.9 in the HCC, LF/LC, CHB and healthy control groups, respectively. The mean IL-32 concentrations were 78.3±17.5 pg/mL for HCC samples, 62.3±16.0 pg/mL for LF/LC samples, 49.1±13.8 pg/mL for CHB samples and 32.3±11.7 pg/mL for control samples, and differed significantly among the groups (P<0.001, Table 1).
3.6 Verifying the predictive mRNA panel in the validation set
Ten variables (gender, age, WBC, PLT, ALB, AST, ferritin, SHC1, SLAMF8, and IL-32) were identified by univariate ordinal regression analysis (all P<0.001, Table 2). These variables were further subjected to multivariate ordinal regression analysis, and significant differences were observed for the following variables: age (OR=1.071, 95%CI=1.036-1.107, P<0.001), PLT (OR=0.988, 95%CI=0.981-0.995, P=0.002), ferritin (OR=1.003, 95%CI=1.001-1.006, P=0.014), SHC1 (OR=2.077, 95%CI=1.648-2.618, P<0.001), SLAMF8 (OR=2.104, 95%CI=1.583-2.796, P<0.001), and IL-32 (OR=1.054, 95%CI=1.025-1.083, P<0.001).
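To help read the odds ratios above, the following sketch converts each reported OR and 95% CI back to a regression coefficient, an approximate standard error, and a Wald z statistic. It assumes the intervals are Wald-type (symmetric on the log-odds scale), which the text does not state; because the inputs are rounded to three decimals, the recovered P values are only approximate and may differ slightly from those reported.

```python
import math

# Odds ratios and 95% CIs as reported in the multivariate analysis above:
# variable -> (OR, CI lower bound, CI upper bound)
reported = {
    "age": (1.071, 1.036, 1.107),
    "PLT": (0.988, 0.981, 0.995),
    "ferritin": (1.003, 1.001, 1.006),
    "SHC1": (2.077, 1.648, 2.618),
    "SLAMF8": (2.104, 1.583, 2.796),
    "IL-32": (1.054, 1.025, 1.083),
}


def wald_summary(odds_ratio, ci_low, ci_high):
    """Recover coefficient, SE, z and two-sided P from an OR and its 95% CI.

    Assumes a Wald-type interval: log(OR) +/- 1.96 * SE.
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return beta, se, z, p_value


for name, (odds_ratio, low, high) in reported.items():
    beta, se, z, p_value = wald_summary(odds_ratio, low, high)
    print(f"{name}: beta={beta:.4f}, SE={se:.4f}, z={z:.2f}, P~{p_value:.3g}")
```

An OR above 1 (for example SHC1) corresponds to a positive coefficient, meaning higher values are associated with a more advanced disease category, whereas an OR below 1 (PLT) corresponds to a negative coefficient.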
When differentiating between CHB and healthy control, the AUC values of age, PLT, ferritin, SHC1, SLAMF8 and IL-32 were 0.744, 0.699, 0.688, 0.900, 0.744 and 0.821, respectively (Table 3, Fig. 5). Accordingly, the cut-off values of age, PLT, ferritin, SHC1, SLAMF8 and IL-32 were 36 years old, 195×10⁹/L, 90.60 ng/mL, 5.03 ng/mL, 4.94 ng/mL and 48.29 pg/mL, respectively. When discriminating CHB patients from LF/LC patients, age alone yielded an AUC of 0.592 with 58.0% sensitivity and 62.0% specificity at the threshold of 42 years old; PLT alone yielded an AUC of 0.812 with 66.0% sensitivity and 88.0% specificity at the threshold of 140×10⁹/L; ferritin alone yielded an AUC of 0.587 with 54.0% sensitivity and 72.0% specificity at the threshold of 126.30 ng/mL; SHC1 alone yielded an AUC of 0.812 with 82.0% sensitivity and 64.0% specificity at the threshold of 8.11 ng/mL; SLAMF8 alone yielded an AUC of 0.684 with 44.0% sensitivity and 92.0% specificity at the threshold of 7.30 ng/mL; and IL-32 alone yielded an AUC of 0.741 with 48.0% sensitivity and 90.0% specificity at the threshold of 62.37 pg/mL (Table 3, Fig. 5). When distinguishing HCC patients from LF/LC patients, SLAMF8 was the most outstanding diagnostic parameter in the validation set (AUC=0.802), which was superior to age, PLT, ferritin, SHC1 and IL-32 (AUC=0.758, 0.625, 0.636, 0.646 and 0.761, respectively).
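The AUC, sensitivity and specificity figures above can be understood through the following minimal sketch. It computes the AUC as the probability that a randomly chosen diseased subject has a higher marker value than a randomly chosen comparison subject, and evaluates sensitivity and specificity at a fixed cut-off. The marker values are hypothetical toy data, not study measurements; only the cut-off of 5.03 ng/mL is taken from the text (SHC1, CHB vs. healthy control). The sketch assumes that higher values indicate disease, so a marker such as PLT, which decreases with progression, would need the comparison reversed.

```python
def auc(diseased, comparison):
    """AUC as the fraction of (diseased, comparison) pairs ranked correctly.

    Ties count as half a correct pair (Mann-Whitney formulation).
    """
    correct = 0.0
    for d in diseased:
        for c in comparison:
            if d > c:
                correct += 1.0
            elif d == c:
                correct += 0.5
    return correct / (len(diseased) * len(comparison))


def sensitivity_specificity(diseased, comparison, cutoff):
    """Classify values >= cutoff as positive (assumed convention)."""
    sensitivity = sum(d >= cutoff for d in diseased) / len(diseased)
    specificity = sum(c < cutoff for c in comparison) / len(comparison)
    return sensitivity, specificity


# Hypothetical toy SHC1 values in ng/mL (not data from this study).
chb_values = [6.1, 7.4, 5.2, 8.0, 4.8]
control_values = [4.1, 5.0, 3.9, 5.5, 4.4]

print("AUC:", auc(chb_values, control_values))
print("Sens/Spec at 5.03:", sensitivity_specificity(chb_values, control_values, 5.03))
```

Whether the study treated values exactly equal to the cut-off as positive is not stated; the `>=` convention here is an assumption.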
Furthermore, a logistic regression model was constructed to determine the best diagnostic mRNA panel. The predicted probability of the 3-mRNA panel in combination with clinical parameters was calculated as follows: APFSSI = -0.1871 + 0.012 × age - 0.0027 × PLT + 0.0005 × ferritin + 0.1361 × SHC1 + 0.1234 × SLAMF8 + 0.0108 × IL-32. At the threshold of 1.624, the sensitivity and specificity of APFSSI for distinguishing CHB patients from healthy controls were 92.0% and 94.0%, with an AUC of 0.966. Using an APFSSI value of 2.470 as the decision threshold, LF/LC patients could be differentiated from CHB patients with 90.0% sensitivity and 84.0% specificity. At a score cut-off of 3.349, the diagnostic model exhibited 84.0% sensitivity and 86.0% specificity for distinguishing HCC patients from LF/LC patients (Table 3, Fig. 5).
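The APFSSI formula and its three cut-offs can be sketched as follows. Because the reported thresholds exceed 1, the sketch treats APFSSI as a linear score rather than a probability. Applying the three pairwise cut-offs as a single cascade (the `stage` helper) is our simplification for illustration and is not described in the text, as is the `>=` convention at each threshold. In the example, the SHC1, SLAMF8 and IL-32 inputs are the group means or medians reported in Section 3.5, while the age, PLT and ferritin inputs are hypothetical.

```python
def apfssi(age, plt, ferritin, shc1, slamf8, il32):
    """APFSSI score using the coefficients reported in the text.

    Units follow the text: age in years, PLT in 10^9/L, ferritin and
    SHC1 in ng/mL, IL-32 in pg/mL (SLAMF8 units are not stated).
    """
    return (-0.1871
            + 0.012 * age
            - 0.0027 * plt
            + 0.0005 * ferritin
            + 0.1361 * shc1
            + 0.1234 * slamf8
            + 0.0108 * il32)


# Reported pairwise cut-offs, ordered from most to least advanced stage.
CUTOFFS = [(3.349, "HCC"), (2.470, "LF/LC"), (1.624, "CHB")]


def stage(score):
    """Hypothetical cascade: return the most advanced stage whose cut-off is met."""
    for cutoff, label in CUTOFFS:
        if score >= cutoff:
            return label
    return "CON"


# Biomarker values: reported HCC and control group values (Section 3.5).
# Age, PLT and ferritin: hypothetical illustrative inputs.
hcc_like = apfssi(age=55, plt=120, ferritin=300, shc1=10.7, slamf8=8.8, il32=78.3)
control_like = apfssi(age=35, plt=230, ferritin=80, shc1=4.6, slamf8=4.9, il32=32.3)

print(round(hcc_like, 3), stage(hcc_like))
print(round(control_like, 3), stage(control_like))
```

With these inputs the HCC-like profile scores above 3.349 and the control-like profile scores below 1.624, in line with the direction of the reported cut-offs.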
3.7 Establishing the mRNA panel in the test set
Parameters estimated from the validation set (n=200) were used to evaluate the diagnostic performance of the mRNA panel in the independent test set (n=400). The variables with gradually increasing or decreasing trends were consistent with those in the validation set (Supplemental Table 2). Moreover, the mRNA panel yielded the highest AUCs and outperformed the other parameters in diagnosing HCC, LF/LC and CHB (all P<0.05, Supplemental Table 3, Fig. S2). In differentiating CHB from healthy controls, SLAMF8 was an excellent diagnostic parameter (AUC=0.911), but it was still inferior to the APFSSI model (AUC=0.980) (Supplemental Table 3, Fig. S2). The mRNA panel showed strong diagnostic potential for distinguishing LF/LC patients from CHB patients, with 94.0% sensitivity and 90.0% specificity. The discriminatory power of the APFSSI model for HCC vs. LF/LC (AUC=0.930) was far superior to that of SHC1 (AUC=0.744), SLAMF8 (AUC=0.791) and IL-32 (AUC=0.852) (Supplemental Table 3, Fig. S2).