Patients
This retrospective study was approved by the institutional review board of Affiliated Cancer Hospital & Institute of Guangzhou Medical University. We reviewed the liver MRI, clinical, and pathology data of 675 consecutive patients with cirrhosis from January 2013 to October 2018. The inclusion criteria were as follows: (1) at least one nodule with a diameter of 3 cm or smaller; (2) availability of dynamic enhancement and diffusion-weighted (DW) imaging; (3) pathological confirmation by surgical resection; and (4) no treatment before MRI. A total of 525 patients were excluded for the following reasons: (1) presence of a nodule with a diameter larger than 3 cm (n = 220); (2) unavailability of dynamic enhancement or DW imaging data (n = 27); (3) lack of pathology data (n = 245); and (4) receipt of treatment prior to MRI (n = 33). Finally, 111 patients with 112 HCCs and 39 patients with 44 benign nodules (150 patients in total) were included. The patient inclusion flowchart is shown in Fig. 1.
Image Acquisition
Sixty-eight patients underwent gadoxetic acid-enhanced MRI (Gd-EOB-MRI) and 82 patients underwent gadopentetate dimeglumine-enhanced MRI (Gd-DTPA-MRI). MR images were obtained using a 3.0-T whole-body MR system (Achieva; Philips Healthcare) with a 16-channel phased-array coil. The scanning sequences included a dual gradient-recalled echo T1-weighted sequence; an axial T2-weighted fat-suppression (FS) turbo spin-echo (TSE) sequence; dynamic contrast-enhanced MRI, performed as either Gd-EOB-MRI (unenhanced, arterial [20–35 s], portal [60 s], transitional [3 min], and hepatobiliary [20 min] phases) or Gd-DTPA-MRI (unenhanced, arterial [20–35 s], portal [60 s], and equilibrium [3 min] phases); and DW imaging with b-values of 0 and 800 s/mm². Apparent diffusion coefficient (ADC) maps were created automatically on a voxel-by-voxel basis from the two b-values. The detailed MRI parameters are summarized in Table 1.
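The text does not state how the scanner derives ADC from the two b-values. A common choice is the monoexponential model, S_b = S_0 · exp(−b · ADC), which gives ADC = ln(S_0 / S_800) / (800 − 0). The following is a minimal sketch under that assumption; the function names and the handling of non-positive signals are illustrative and do not reproduce the vendor software.

```python
import math


def adc_from_two_b_values(s_low, s_high, b_low=0.0, b_high=800.0):
    """Monoexponential ADC (mm^2/s) for one voxel from signals at two b-values."""
    if s_low <= 0 or s_high <= 0:
        # Assumption: voxels without usable signal are treated as background.
        return 0.0
    return math.log(s_low / s_high) / (b_high - b_low)


def adc_map(image_low, image_high, b_low=0.0, b_high=800.0):
    """Apply the voxel-wise formula to two equally sized 2D images (lists of rows)."""
    return [
        [adc_from_two_b_values(lo, hi, b_low, b_high)
         for lo, hi in zip(row_low, row_high)]
        for row_low, row_high in zip(image_low, image_high)
    ]
```

For example, a voxel whose signal falls from 1000 at b = 0 to 1000 · exp(−0.8) at b = 800 s/mm² has an ADC of 1.0 × 10⁻³ mm²/s under this model.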
Table 1. MRI Sequences and Parameters
| Sequence | FS | TR/TE (ms) | FA | ST (mm) | FOV (cm) | Matrix |
|---|---|---|---|---|---|---|
| T1-w dual gradient recalled echo | | | | | | |
| in-phase | No | 10/2.5 | 10° | 5 | 30–38 | 256 × 224 |
| opposed-phase | No | 10/3.55 | 10° | 5 | 30–38 | 256 × 224 |
| Breath-hold FS T2-w | Yes | 2096/72 | 90° | 5 | 30–38 | 324 × 256 |
| DWI | Yes | 1600/70 | 90° | 5 | 30–35 | 100 × 100 |
| T1-w dynamic enhanced | Yes | 3.1/1.5 | 10° | 2 | 32–38 | 228 × 211 |
FS fat suppression, TR repetition time, TE echo time, FA flip angle, ST slice thickness, FOV field of view, T1-w T1 weighted, T2-w T2 weighted.
Qualitative Image Analysis
Two radiologists (observer 1, JSL, with 15 years of experience; and observer 2, BGL, with 10 years of experience) independently analyzed all MR images and then reached a consensus. The radiologists were informed that this study aimed to evaluate the contribution of LI-RADS v2018 to HCC detection, but they were blinded to the patients’ clinical data and pathologic diagnosis.
First, LI-RADS categories were assigned based on major imaging features (lesion size, arterial phase hyperenhancement, enhancing “capsule,” and nonperipheral “washout”) and the observations were categorized as LR-3, LR-4, and LR-5 [5, 6]. The growth threshold was eliminated from the assessment, because more than 6 months of follow-up were performed in only 10 patients. The detailed algorithm based on major imaging features is shown in Supplementary Table 1.
Second, the radiologists were requested to upgrade or downgrade the final LI-RADS categories based on the presence of ancillary features. Unlike major features, ancillary features are optional imaging features applied at the radiologist’s discretion. The ancillary features applied in this study are shown in Supplementary Table 2. According to the evaluation criteria of LI-RADS v2018 [5], the rules for applying ancillary features to adjust LI-RADS categories are as follows: (a) if there are conflicting ancillary features, the category should not be adjusted; (b) ancillary features favoring HCC in particular or malignancy in general may be used to upgrade an observation by a maximum of one category, up to LR-4; upgrade from LR-4 to LR-5 is not permitted; and (c) ancillary features favoring benignity may be used to downgrade an observation by a maximum of one category. Finally, LI-RADS categories based on the combination of major and ancillary features were documented for each lesion assessed.
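The three adjustment rules can be written as a small decision function. This is a minimal sketch for illustration only: categories are encoded as the integers 1 to 5 for LR-1 to LR-5, the function name and boolean flags are hypothetical, and the floor at LR-1 for downgrades is an assumption, since the text only discusses observations starting at LR-3 to LR-5.

```python
def adjust_category(category, favors_malignancy, favors_benignity):
    """Adjust a major-feature LI-RADS category (1-5) using ancillary features.

    favors_malignancy: at least one ancillary feature favoring HCC/malignancy.
    favors_benignity:  at least one ancillary feature favoring benignity.
    """
    if not 1 <= category <= 5:
        raise ValueError("category must be between 1 (LR-1) and 5 (LR-5)")
    # Rule (a): conflicting ancillary features leave the category unchanged.
    if favors_malignancy and favors_benignity:
        return category
    # Rule (b): upgrade by one category, but never beyond LR-4.
    if favors_malignancy:
        return category + 1 if category < 4 else category
    # Rule (c): downgrade by at most one category (floor at LR-1 is assumed).
    if favors_benignity:
        return max(category - 1, 1)
    return category
```

With this encoding, an LR-3 observation with features favoring malignancy becomes LR-4, whereas an LR-4 observation with the same features stays at LR-4.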
Radiomics Analysis
Image Segmentation and Feature Extraction:
Lesion outlining and texture feature extraction on MRI were performed using the free open-source software package MaZda 4.6 (http://www.eletel.p.lodz.pl/programy/mazda/). Axial in-phase T1-WI, fat-suppressed (FS) T2-WI, and ADC maps in the “.dicom” format were imported into MaZda 4.6 for feature extraction. Two radiologists (XZ and BGL, with 5 and 10 years of experience in medical image segmentation, respectively) manually drew a region of interest (ROI) for each nodule on the image section that depicted the maximum area (Fig. 2a–2c). If a nodule was difficult to identify in these sequences, T1W contrast-enhanced images were used for accurate ROI placement. The gray levels of each ROI were normalized to the range μ ± 3σ (μ, gray-level mean; σ, gray-level standard deviation) to minimize the impact of contrast and brightness variation [19, 20]. For each ROI, 279 quantitative texture features derived from six statistical image descriptors were extracted (Fig. 2d): histogram, gray-level co-occurrence matrix, run-length matrix, wavelet, absolute gradient, and autoregressive model [21]. The detailed feature names and numbers are summarized in Supplementary Table 3. Ultimately, for each lesion, a total of 837 texture features (279 features from each of the three sequences) were obtained for subsequent selection.
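The mean ± 3 standard deviation normalization described above can be illustrated as follows. This is a hedged sketch, not MaZda's implementation: it assumes that intensities outside the range are clipped, that the range is mapped linearly onto integer gray levels, and that the population standard deviation is used. The function name and the `levels` parameter are hypothetical.

```python
import statistics


def normalize_roi(pixels, levels=256):
    """Clip ROI intensities to [mean - 3*sd, mean + 3*sd] and rescale to 0..levels-1."""
    mu = statistics.mean(pixels)
    sigma = statistics.pstdev(pixels)  # assumption: population standard deviation
    if sigma == 0:
        # A perfectly uniform ROI carries no texture; map everything to level 0.
        return [0 for _ in pixels]
    low, high = mu - 3 * sigma, mu + 3 * sigma
    normalized = []
    for value in pixels:
        clipped = min(max(value, low), high)
        normalized.append(round((clipped - low) / (high - low) * (levels - 1)))
    return normalized
```

Because the mapping depends only on the ROI's own mean and standard deviation, a global brightness offset or contrast scaling of the image leaves the normalized gray levels essentially unchanged, which is the stated purpose of this step.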
Feature Selection and Radiomics Signature Construction:
Feature selection was performed to identify the texture features that discriminate HCCs from benign nodules. First, features were screened for reproducibility and redundancy with reference to previous studies [22-24]. The inter-observer reproducibility of each extracted feature was estimated using the intraclass correlation coefficient (ICC). Texture features with ICC values ≥ 0.80 were considered highly reproducible and were retained for further selection. Second, the remaining features were compared between groups using the Mann–Whitney U test, and features with a P-value less than 0.05 were retained.
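The reproducibility filter can be sketched as below. The text does not state which ICC form was used, so this example assumes the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1). The function names and the input layout (one row of reader values per lesion) are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per lesion and one column per reader."""
    n = len(ratings)       # number of lesions
    k = len(ratings[0])    # number of readers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    denominator = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    if denominator == 0:
        return 0.0  # no variance at all: treat as not reproducible
    return (ms_rows - ms_error) / denominator


def reproducible_features(feature_ratings, threshold=0.80):
    """Keep the names of features whose ICC reaches the threshold."""
    return [name for name, ratings in feature_ratings.items()
            if icc_2_1(ratings) >= threshold]
```

Features that pass this filter would then go on to the Mann–Whitney U test, which is not implemented here.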
Using the retained features, the final feature selection was performed using least absolute shrinkage and selection operator (LASSO) logistic regression analysis with 10-fold cross-validation based on minimum criteria [25]. Because the use of LASSO analysis alone may cause overfitting and bias, we added backward elimination, as in previous studies, to reduce the number of final features [22, 26]. A formula was created using a linear combination of the final selected features weighted by their respective LASSO coefficients. The formula was then applied to calculate a radiomics score for each liver nodule to reflect the probability of HCC. The performance of the radiomics signature was assessed by receiver operating characteristic (ROC) curve analysis.
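The radiomics score is a weighted sum of the selected features. The sketch below illustrates that calculation. The feature names and coefficient values in the usage note are hypothetical placeholders rather than the study's fitted values, and both the intercept term and the logistic transform to a probability are assumptions that the text does not specify.

```python
import math


def radiomics_score(features, coefficients, intercept=0.0):
    """Linear combination of selected feature values weighted by their coefficients."""
    return intercept + sum(coefficients[name] * features[name]
                           for name in coefficients)


def score_to_probability(score):
    """Logistic transform of a score into a probability between 0 and 1 (assumed link)."""
    return 1.0 / (1.0 + math.exp(-score))
```

For instance, with hypothetical coefficients `{"feature_a": 0.5, "feature_b": 1.5}`, an intercept of 0.5, and feature values `{"feature_a": 2.0, "feature_b": -1.0}`, the score is 0.5 + 1.0 − 1.5 = 0, which corresponds to a probability of 0.5.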
Construction of the Radiomics Nomogram:
A radiomics nomogram combining the LI-RADS category and the radiomics signature was constructed using multivariate logistic regression analysis, and the nomogram was plotted based on the coefficients of the logistic regression model. A calibration curve was drawn to appraise the calibration of the radiomics nomogram, accompanied by the Hosmer–Lemeshow test to assess the goodness-of-fit of the nomogram. The performance of the nomogram was assessed using ROC analysis.
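The Hosmer–Lemeshow test compares observed and expected event counts within groups of lesions ordered by predicted probability. The sketch below computes only the test statistic, which is conventionally compared against a chi-square distribution with (groups − 2) degrees of freedom. The function name is hypothetical, the default of 10 equal-sized groups is a common convention assumed here, and the grouping may differ in detail from the R package used in the study.

```python
def hosmer_lemeshow_statistic(probabilities, outcomes, groups=10):
    """Hosmer-Lemeshow chi-square statistic for predicted probabilities and 0/1 outcomes."""
    pairs = sorted(zip(probabilities, outcomes))
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        # Split the ordered cases into consecutive, nearly equal-sized groups.
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected_pos = sum(p for p, _ in chunk)
        observed_pos = sum(y for _, y in chunk)
        expected_neg = size - expected_pos
        observed_neg = size - observed_pos
        if expected_pos > 0:
            statistic += (observed_pos - expected_pos) ** 2 / expected_pos
        if expected_neg > 0:
            statistic += (observed_neg - expected_neg) ** 2 / expected_neg
    return statistic
```

A small statistic (and thus a large P-value) indicates no evidence of a mismatch between predicted and observed frequencies.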
Statistical Analysis
All statistical analyses were performed using R software (version 3.5.3, http://www.rproject.org/) and SPSS 16.0 (SPSS Inc., Chicago, IL, USA), and statistical significance was set at P < 0.05. LASSO logistic regression was performed in R with the "glmnet" package. The nomogram and calibration plots were created using the "rms" package, and the Hosmer–Lemeshow test was conducted using the "generalhoslem" package. The other statistical analyses were performed using SPSS 16.0. Inter-reader variability between the two observers for LI-RADS categories was appraised using kappa statistics. The diagnostic performance of each model was assessed using ROC analysis. The Mann–Whitney U test was used for continuous variables, and the Pearson chi-square test (or Fisher's exact test) was used for categorical variables.
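The kappa statistic for inter-reader agreement can be illustrated with the following sketch. It assumes the unweighted Cohen's kappa for two readers, since the text does not say whether a weighted variant was used. The function name and the example category labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(reader1, reader2):
    """Unweighted Cohen's kappa for two readers' category assignments."""
    if len(reader1) != len(reader2) or not reader1:
        raise ValueError("both readers must rate the same non-empty set of lesions")
    n = len(reader1)
    # Observed agreement: fraction of lesions given the same category.
    observed = sum(a == b for a, b in zip(reader1, reader2)) / n
    # Chance agreement from each reader's marginal category frequencies.
    counts1, counts2 = Counter(reader1), Counter(reader2)
    expected = sum(counts1[c] * counts2[c] for c in counts1) / (n * n)
    if expected == 1.0:
        return 1.0  # both readers used a single identical category throughout
    return (observed - expected) / (1.0 - expected)
```

For example, if reader 1 assigns LR-3, LR-4, LR-5, LR-5 and reader 2 assigns LR-3, LR-4, LR-5, LR-4 to the same four lesions, the observed agreement is 0.75, the chance agreement is 0.3125, and kappa is 0.4375 / 0.6875 ≈ 0.64.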