DOI: https://doi.org/10.21203/rs.3.rs-203679/v1
OBJECTIVE.
The purpose of this study was to evaluate the potential value of CT radiomics in predicting the β-catenin mutation status of patients with hepatocellular carcinoma (HCC).
MATERIALS AND METHODS.
In this retrospective study, 43 patients with HCC (28 without β-catenin mutation and 15 with β-catenin mutation) were identified in The Cancer Genome Atlas Liver Hepatocellular Carcinoma (TCGA-LIHC) database. To create stable models, the data were augmented to a total of 202 labeled samples (131 without β-catenin mutation and 71 with β-catenin mutation) by obtaining up to five different samples per patient. A large number of image features were extracted from portal phase contrast-enhanced CT images with an open-source software package (Pyradiomics, version 2.1.2). Reproducibility analysis (intraclass correlation coefficients [ICCs], calculated in SPSS 18.0) was based on segmentations by two radiologists. The classification task was the prediction of β-catenin gene mutation status. Machine learning-based classification was performed using the Pycaret software (version 2.1.2). The main performance metric was the AUC value.
RESULTS.
Of 828 extracted texture features, 759 had excellent reproducibility. Using 10 selected features, the Extra Trees Classifier algorithm correctly classified 93.4% of the HCCs in terms of β-catenin mutation status (AUC value, 0.9741); the CatBoost Classifier algorithm correctly classified 91.9% (AUC value, 0.9692); and the Gradient Boosting Classifier algorithm correctly classified 91.1% (AUC value, 0.9722). All three algorithms achieved an accuracy above 90%.
CONCLUSION.
Machine learning-based high-dimensional quantitative CT radiomics analysis might be a feasible method for predicting β-catenin mutation status in patients with HCC.
Hepatocellular carcinoma (HCC) is a major type of primary malignant hepatic tumor and the third leading cause of cancer-related mortality worldwide [1]. The diagnosis and treatment of HCC have been significantly affected by the genomic characteristics of the tumors [2][3].
Activation of the WNT/β-catenin signaling pathway is known to be important in hepatic carcinogenesis [4]. Abnormal WNT/β-catenin signaling due to β-catenin mutation has been found in 30–40% of patients with HCC [5]. β-catenin mutation has been shown to accelerate bile production in more highly differentiated HCCs [6]. HCC with β-catenin mutation may be a special subtype with specific pathologic and clinical features, so β-catenin mutation might be associated with distinctive radiologic features in clinical settings [7].
Most HCCs show decreased uptake of gadoxetic acid disodium (Gd-EOB-DTPA; Primovist, Schering, Berlin, Germany) compared with normal liver tissue in the hepatobiliary phase [8]. A small proportion of HCC nodules, however, are reported to take up more Gd-EOB-DTPA [9]. Therefore, the imaging diagnosis of HCCs with β-catenin mutation may be important in daily clinical practice.
The term ‘radiogenomics’ is used herein to denote the relationship between tissue-scale cancer imaging features and molecular features of malignancies, such as gene expression [10]. An emerging related quantitative technique is computed tomography texture analysis (CTTA), which characterizes the heterogeneity of a lesion inside a specific region of interest (ROI) by using pixel attributes and image histograms to obtain quantitative texture parameters [11]. This technique has demonstrated its utility as an imaging biomarker, as a predictor of patient outcomes and overall survival, and as an estimator of therapy response for multiple tumors [12]. We therefore looked for a connection between β-catenin mutation and imaging biomarkers from CTTA-based radiomics.
To the best of our knowledge, a correlation between CTTA-based radiomics and β-catenin mutation has not yet been investigated in the open literature. We explored whether such a correlation exists by investigating the relationship between CTTA-derived quantitative parameters and β-catenin mutations in patients with HCC.
The diameters of the lesions on axial portal phase contrast-enhanced images were as follows: (1) HCC with β-catenin mutation: mean ± standard deviation (SD), 4.95 ± 1.38 cm; median, 4.87 cm; interquartile range (IQR), 3.88–5.40 cm; (2) HCC without β-catenin mutation: mean ± SD, 5.02 ± 1.45 cm; median, 5.07 cm; IQR, 4.06–6.66 cm. The numbers of voxels in the ROIs, each containing a tumor, were as follows: (1) HCC with β-catenin mutation: mean ± SD, 4574.86 ± 6698.65; median, 1875; IQR, 860–5578; (2) HCC without β-catenin mutation: mean ± SD, 7093.93 ± 5658.79; median, 5984; IQR, 1761.5–10430.5.
The study flow-chart is displayed in Fig. 1.
In this study, the target variable was the β-catenin mutation status (yes = 1, no = 0). The input data were the radiomics parameters with an ICC of 0.9 or greater. In total, 759 feature columns and 192 samples (ID) were included in the subsequent analysis. Of these, 58 samples were used as the test/hold-out set (train/test = 70/30).
In total, 18 models were trained and evaluated using cross-validation. To compare their performance, all models in the model library were trained and scored using stratified cross-validation.
The top 3 models are shown in Table 2 with a score grid of average accuracy, AUC, recall, precision, F1, kappa, and MCC on the training dataset. Of 828 texture features, 759 had excellent reproducibility (ICC ≥ 0.9). Hence, these features were included in the additional dimension reduction steps.
After classifier-specific feature selection, the number of selected features was reduced to 10. To estimate model performance on unseen data, 10 sample records (5% of the 202 segmentations) were withheld from the original dataset and used for predictions.
The plot function takes a trained model object and returns a plot based on the test/hold-out set; it can be used to analyze performance across different aspects such as the AUC, confusion matrix, and decision boundary.
The AUC plots, precision-recall curves, and feature importance plots of the Extra Trees Classifier, CatBoost Classifier, and Gradient Boosting Classifier models are displayed in Fig. 2 and Fig. 3.
Figure 2 displays the receiver operating characteristic (ROC) curves obtained from CTTA of portal phase contrast-enhanced CT of HCC using the Extra Trees, CatBoost, and Gradient Boosting classifiers. For the top three models, the top 10 selected features were all parameters of wavelet-transformed images. The wavelet-derived features were thus the most significant features in all three top models.
The feature importance plots for the Extra Trees Classifier, CatBoost Classifier, and Gradient Boosting Classifier models are presented in Fig. 3.
After predictions were made on the test/hold-out set and the evaluation metrics were reviewed, the finalize-model function refit each model on the complete modeling dataset, including the test/hold-out sample (30% in this study, 58 samples); the finalized models were then used to predict the unseen samples.
The two best-performing models were the Extra Trees Classifier and CatBoost Classifier. Their accuracies on the test data and unseen data were as follows: (1) the accuracy of the Extra Trees Classifier on the test/hold-out set was 0.9138, compared with 0.9341 in cross-validation (Table 2), and its accuracy on the unseen data (10 samples) was 0.9; (2) the accuracy of the CatBoost Classifier on the test/hold-out set was 0.9138, compared with 0.9187 in cross-validation (Table 2), and its accuracy on the unseen data (10 samples) was 1.0.
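The reported accuracies can also be read as counts of correctly classified samples. The sketch below is a back-of-the-envelope check rather than part of the study's pipeline: it searches for the integer counts that are consistent with an accuracy rounded to four decimals. The helper name `matching_counts` is hypothetical.

```python
def matching_counts(accuracy, n, digits=4):
    """Return every count k of correct predictions (0..n) for which
    k / n rounds to the reported accuracy."""
    return [k for k in range(n + 1) if round(k / n, digits) == accuracy]


# Test/hold-out set: 58 samples, reported accuracy 0.9138 for both models.
holdout_counts = matching_counts(0.9138, 58)

# Unseen data: 10 samples, reported accuracies 0.9 and 1.0.
unseen_extra_trees = matching_counts(0.9, 10)
unseen_catboost = matching_counts(1.0, 10)

print(holdout_counts, unseen_extra_trees, unseen_catboost)
```

Under this reading, a hold-out accuracy of 0.9138 corresponds to 53 of 58 samples, and an accuracy of 0.9 on the unseen data corresponds to 9 of 10 samples.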
This work explored the effectiveness of machine learning (ML) applied to high-dimensional CT radiomics for predicting the β-catenin mutation status of patients with HCC. Our results indicate that CT radiomics using different ML classifiers (the Extra Trees and CatBoost classifiers) is potentially useful for predicting whether the β-catenin mutation is present in HCCs.
Radiomics is a technique that applies data characterization algorithms to radiographic medical images to extract a large number of features [23]. CT-based radiomics analysis has been used to predict the survival of patients with metastatic colorectal cancer [24]. Radiomics could also be used to predict the response of individual HER2-amplified colorectal cancer liver metastases, as well as the biomarkers of molecular subtype prognosis [25].
Because radiogenomics can reveal the relationship between imaging features and genomic features [26], it could be used to bridge imaging and genomics. Our current study may therefore have practical and clinical implications.
The β-catenin mutation in HCCs may promote immune escape and might affect responsiveness to therapeutic procedures [27]. Evaluating the genetic mutations of liver cancer could prove impractical if implemented for every patient. Radiomic features derived from CT texture analysis might nevertheless provide potential biomarkers for predicting the β-catenin mutation status of HCCs, once such biomarkers have been validated in larger datasets. Moreover, we anticipate that new biomarkers and models could be developed through future research involving larger datasets, different feature selection algorithms, and other ML schemes.
In our analysis, the radiomics parameters in each ML-based model were similar. We used cross-validation and testing on unseen data to assess and optimize model performance. In total, 18 types of classifiers were evaluated. Several experiments with various ML classifiers might be needed to find the best ML scheme when limited data are available.
One of the well-known challenges in the field of radiomics is the interpretation of the selected features in model development, even after they have been validated [28]. Regarding radiomics of HCCs for identifying β-catenin mutation status [29], the selected features might represent information associated with pathological stage or differentiation grade, which are correlated with β-catenin mutation status [30].
A few limitations of this study should be addressed. First, because of its retrospective design, this study provides a lower level of evidence than a prospective study. Second, ML-based classifiers risk overfitting because of the small and imbalanced patient population. We sought to reduce this expected overfitting by applying data augmentation to increase the number of labeled samples, a method commonly used against overfitting in ML-based classification. Third, although 3D segmentation could represent radiomics information more effectively, we used only the largest 2D slice and its adjacent consecutive upper and lower slices for CT radiomics, because most previous clinical research on HCCs was based on a single segmentation or segmentations of a few slices. Fourth, we derived the imaging data from TCGA-LIHC on The Cancer Imaging Archive (TCIA) website, which includes patients from different centers and sources using different image acquisition protocols, as in standard clinical practice. To minimize these sources of variability, all image samples underwent normalization and pixel rescaling as described in the methods section. These techniques have been reported to reduce both variability and bias. Fifth, we used the same dataset for training, validation, and testing, which could be viewed as a source of bias; we therefore implemented a 10-fold cross-validation procedure to minimize this potential bias. Independent external datasets are needed to validate the performance of the classifiers in any further exploration. Sixth, we included only the portal phase images in the analysis because they are widely available. Further research is warranted for unenhanced CT and arterial phase-enhanced CT.
Seventh, we evaluated only the β-catenin mutation status because the corresponding patient group had sufficient imaging data that satisfied our criteria and because this mutation had clinical usefulness with an effective prognostic value in this study. Eighth and finally, all radiogenomic studies suffer from the same common problem, namely the possibility of some discrepancy between the data present on imaging studies and the small sample used for genomic analysis [31].
In conclusion, CT radiomics based on machine learning appears to be a feasible and potentially useful method for predicting β-catenin mutation status in HCC patients. Because enhanced CT images are routinely acquired, we cautiously propose that this radiogenomics approach be evaluated as a future clinical decision support tool in larger, prospective trials.
All materials and methods were performed in accordance with relevant guidelines and regulations. All experimental protocols were approved by the institutional review board at the JinZhou Medical University (Authorization Number: JMU20210217). All patients in the study were deidentified. The data were publicly available for scientific purposes.
All genomic and clinical data in this study were obtained from The Cancer Genome Atlas Liver Hepatocellular Carcinoma (TCGA-LIHC) database [13]. Available pretreatment imaging studies were downloaded in DICOM format from The Cancer Imaging Archive website [14]. The TCGA-LIHC database included 429 patients with HCC (total number of β-catenin mutation 103, frequency of mutation about 26.8%). However, the imaging data of only 97 patients were available for use.
To create a uniform imaging protocol, the study included only those patients with available preoperative portal phase contrast-enhanced CT (CECT) images that were acquired with a maximum tube voltage of 140 kV, a slice thickness of 5 mm or less, and no slice overlap. Because arterial phase and delayed phase images were not available in the patients’ imaging data, we included only portal phase images.
Patients were excluded from the study because of poor-quality CECT images (significant image noise, significant artifacts, or other quality issues) or multiple tumors (because the information in the database left it uncertain which tumor had the β-catenin mutation). Fewer than five patients’ imaging studies were excluded to minimize the heterogeneity of the imaging protocol.
In total, 43 patients with 43 HCC tumors met the eligibility criteria of the study. The demographic and clinical characteristics of the patients are displayed in Table 1. The list of included patients, together with their corresponding patient codes in both the TCGA-LIHC database and The Cancer Imaging Archive website, is presented in the Appendix.
Data augmentation is considered a powerful method for reducing overfitting when only a small amount of data is available. It has been successfully applied in many different machine learning-based classification tasks [15]. Because the small number of patients in our study might lead to overfitting, we augmented the labeled data by sampling different levels of the tumors (as shown in Fig. 1).
HCCs were sampled on 3–5 different, consecutive slices around the center slice with the largest diameter, unless the last slice of the tumor was affected by partial volume [15]. The augmentation resulted in 202 labeled segmentations (131 without the β-catenin mutation and 71 with the β-catenin mutation) from 43 HCCs (28 without the β-catenin mutation and 15 with the β-catenin mutation). We chose to use actual data derived from multiple segmentations or samplings, rather than artificial or synthetic data.
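The slice-sampling scheme can be illustrated with a minimal sketch. It assumes that the cross-sectional tumor area is known for every axial slice; the function name `select_slices` and the example areas are hypothetical, and the exclusion of end slices affected by partial volume is not modeled.

```python
def select_slices(areas, max_slices=5):
    """Return the indices of up to `max_slices` consecutive slices centered
    on the slice with the largest tumor cross-section (hypothetical helper)."""
    if not areas:
        return []
    # Index of the slice with the largest cross-sectional area.
    center = max(range(len(areas)), key=lambda i: areas[i])
    half = max_slices // 2
    # Clip the window to the slices on which the tumor is visible.
    first = max(0, center - half)
    last = min(len(areas) - 1, center + half)
    return list(range(first, last + 1))


# Illustrative areas (mm^2) for a tumor visible on seven axial slices.
areas = [120, 340, 610, 720, 650, 400, 150]
selected = select_slices(areas)
print(selected)
```

Each selected slice yields one labeled segmentation that inherits the mutation status of its tumor, which is how 43 tumors can give rise to more than 43 labeled samples.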
Reference Standard
The tumor segmentations were performed manually using 3D Slicer software (version 4.8.1) [17]. Up to five segmentations were obtained for each lesion, with about 2 mm of margin shrinkage from the lesion contour. The initial segmentation was done on the axial image slice representing the largest cross-sectional area of the tumor. The additional segmentations were then performed on the adjacent consecutive upper and lower slices. Shrinkage was performed using the margin shrinkage function of the software, which applies the shrinkage equally in every direction.
The DICOM image format was used in each step of the analysis. Before texture feature extraction, all images were normalized, rescaled, and discretized [18]. To minimize inter-scanner effects, all datasets were normalized by centering the pixel intensity values at the mean and scaling by the SD, with the scaling factor set to 1. All image slices were resampled to a pixel spacing of 1 × 1 mm² because many texture features require the same spatial resolution and comparable pixel sizes. Cubic B-spline interpolation was used for resampling. Gray-level discretization was applied to the gray levels in the segmentation with a bin width of 0.01.
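The normalization and discretization steps can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the Pyradiomics implementation: the population SD is assumed for the normalization, bins are assumed to be numbered from 1 starting at the bin that contains the minimum value, and the function names and intensity values are hypothetical.

```python
import math
import statistics


def normalize(values, scale=1.0):
    """Center intensities at the mean and divide by the (population) SD,
    then multiply by the scaling factor."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [scale * (v - mean) / sd for v in values]


def discretize(values, bin_width=0.01):
    """Fixed-bin-width discretization: bin 1 is the bin that contains the
    minimum value, and each bin spans `bin_width` intensity units."""
    offset = math.floor(min(values) / bin_width)
    return [math.floor(v / bin_width) - offset + 1 for v in values]


# Illustrative ROI intensities (hypothetical values).
roi = [35.0, 60.0, 85.0, 110.0, 135.0]
z = normalize(roi)    # mean 0, SD 1 after normalization
bins = discretize(z)  # integer gray levels used to build texture matrices
print(bins)
```

With a scaling factor of 1, normalized intensities typically span only a few units, so a bin width of 0.01 yields on the order of a few hundred gray levels within a segmentation.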
Texture features were extracted using an open source software package for the extraction of radiomic data from medical images (Pyradiomics, version 2.2.0) [18]. The features were extracted from the original, filtered, and wavelet-transformed images [19]. The Laplacian of Gaussian filter was used for image filtration, with values of 1 mm, 3 mm, and 5 mm denoting fine, medium, and coarse patterns, respectively.
The extracted texture features included first-order features and features derived from the gray-level dependence matrix, gray-level co-occurrence matrix, gray-level run-length matrix, gray-level size zone matrix, and neighboring gray-tone difference matrix, as well as wavelet-based texture features. The total number of features extracted per lesion was 828. Detailed descriptions and mathematical formulas for these features are available elsewhere [18].
To assess the reproducibility of the texture features [20], two radiologists independently segmented 43 randomly selected tumors. Both radiologists were blinded to the β-catenin mutation status. Intraclass correlation coefficient (ICC) values [21] were calculated for each texture feature with statistical software (SPSS 18.0). Only the features with an ICC of 0.9 or greater, which indicated excellent reproducibility, were included in the additional dimension reduction steps.
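The reproducibility filter can be sketched as follows. The text does not state which ICC form was computed in SPSS, so the two-way random-effects, absolute-agreement, single-measure form, ICC(2,1), is assumed here purely for illustration; the function name and the feature values of the two readers are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.
    `ratings` holds one row per tumor and one column per reader."""
    n = len(ratings)      # number of tumors
    k = len(ratings[0])   # number of readers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Sums of squares of the two-way ANOVA decomposition.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# One texture feature measured by two readers on five tumors (hypothetical).
reader_a = [0.82, 1.10, 0.95, 1.40, 1.22]
reader_b = [0.80, 1.12, 0.97, 1.38, 1.25]
icc = icc_2_1(list(zip(reader_a, reader_b)))
keep_feature = icc >= 0.9  # threshold for excellent reproducibility
print(round(icc, 3), keep_feature)
```

Applying the same test to each of the 828 features and keeping those at or above 0.9 is the filtering step described above.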
Machine learning-based classification was performed using the Pycaret software, version 2.1.2 [22]. In total, 18 classifier models were built with 10-fold cross-validation. The performance of the classifiers was evaluated and compared mainly on the basis of the AUC value. Accuracy, sensitivity, specificity, precision, the F-measure, and the Matthews correlation coefficient (MCC) were also calculated.
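Stratified k-fold assignment, the idea behind the cross-validation used here, can be illustrated with a stdlib-only sketch. It is not the Pycaret implementation; the function name is hypothetical, and the labels simply reproduce the class ratio of the full augmented dataset (131 negative, 71 positive), whereas in the study the folds were formed within the training partition.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Assign every sample index to one of k folds so that each fold keeps
    roughly the overall class ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin into the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


labels = [0] * 131 + [1] * 71
folds = stratified_folds(labels)
for fold in folds:
    positives = sum(labels[i] for i in fold)
    print(len(fold), positives)
```

In each round, one fold serves as the validation set and the remaining nine are used for training; the reported metrics are averages over the ten rounds.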
The classification module in Pycaret is a supervised machine learning module used to classify elements into binary groups based on various techniques and algorithms. In the current study, the classification problem was the detection of β-catenin gene mutation (positive vs. negative).
To demonstrate prediction on unseen data, a sample of 10 records (5% of the total samples) was first withheld from the original dataset to be used for predictions. The second step was to create the transformation pipeline that prepares the data for modeling and deployment; the target column indicated the β-catenin gene mutation status. The third step was to compare all models to evaluate performance. The output shows the average accuracy, AUC, recall, precision, F1, kappa, and MCC across the 10 folds, along with training times. More than 15 models were trained using cross-validation. Models were ranked by accuracy (highest to lowest) for model selection. The AUC plot, feature importance plot, and confusion matrix were used to analyze performance across different aspects based on the test/hold-out set. The last step was to finalize the model and predict on unseen data.
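The sample accounting behind these steps can be checked with a short sketch. It assumes simple random partitioning with the rounding conventions noted in the comments, which were chosen because they reproduce the 10 unseen and 58 test/hold-out samples reported for the 202 segmentations; the function name `partition` is hypothetical and the code is not the Pycaret pipeline itself.

```python
import math
import random


def partition(n_samples, unseen_fraction=0.05, test_fraction=0.30, seed=0):
    """Split sample indices into train, test/hold-out, and unseen sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Assumption: the unseen set size is rounded to the nearest integer.
    n_unseen = round(n_samples * unseen_fraction)
    unseen, modeling = indices[:n_unseen], indices[n_unseen:]
    # Assumption: the test set size is rounded up.
    n_test = math.ceil(len(modeling) * test_fraction)
    test, train = modeling[:n_test], modeling[n_test:]
    return train, test, unseen


train, test, unseen = partition(202)
print(len(train), len(test), len(unseen))
```

Under these assumptions, 192 samples remain for modeling after the unseen records are withheld, of which 134 are used for training with cross-validation and 58 form the test/hold-out set.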
Characteristics | Value |
---|---|
Mean age (year) | 59.74 |
Sex | |
Female | 14 |
Male | 29 |
Disease stage | |
Stage I–II | 29 |
Stage III–IV | 14 |
Histopathologic nuclear grade | |
Grades 1–2 (low) | 27 |
Grades 3–4 (high) | 16 |
β-catenin mutation | |
Absent | 28 |
Present | 15 |
Vascular tumor invasion | |
Present | 10 |
Absent | 33 |

Model | Accuracy | AUC | Recall | Precision | F1 score | Kappa | MCC |
---|---|---|---|---|---|---|---|
Extra Trees Classifier | 0.9341 | 0.9739 | 0.88 | 0.935 | 0.9022 | 0.8531 | 0.8586 |
CatBoost Classifier | 0.9187 | 0.9692 | 0.84 | 0.935 | 0.8744 | 0.8166 | 0.8285 |
Gradient Boosting Classifier | 0.911 | 0.9722 | 0.9 | 0.8798 | 0.8843 | 0.8124 | 0.819 |

Note: TP = true positives; TN = true negatives; FP = false positives; FN = false negatives.
Accuracy = (TP + TN)/(TP + FP + FN + TN);
AUC = area under the curve;
Recall = TP/(TP + FN);
Precision = TP/(TP + FP);
Specificity = TN/(TN + FP);
F1 score = 2 × Precision × Recall/(Precision + Recall).
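
The definitions in the note can be written out as a short function. This is a minimal sketch: the confusion-matrix counts are hypothetical and are not taken from the study, and the function assumes that no denominator is zero. Cohen's kappa and the MCC, which appear in the score grid but not in the note, are added using their standard definitions.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Compute the score-grid metrics from confusion-matrix counts.
    Assumes every denominator is non-zero."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    # Agreement expected by chance, used for Cohen's kappa.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "specificity": specificity,
        "f1": f1,
        "kappa": kappa,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=40, tn=50, fp=5, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because the score grid reports values averaged across folds, an F1 score computed from the averaged precision and recall need not match the tabulated F1 score exactly.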