Diagnostic Accuracy of Contrast-enhanced Ultrasonography Liver Imaging Reporting and Data System in Hepatocellular Carcinoma: A Systematic Review

Background: The Contrast-enhanced Ultrasonography Liver Imaging Reporting and Data System (CEUS LI-RADS), released by the American College of Radiology, is a widely used reporting system for patients at risk for hepatocellular carcinoma (HCC). In CEUS LI-RADS, the categories are definitely benign (LR-1), probably benign (LR-2), intermediate probability of malignancy (LR-3), probably HCC (LR-4), definitely HCC (LR-5), probably or definitely malignant but not necessarily HCC (LR-M), and definite tumor in vein (LR-TIV). Methods: We searched MEDLINE, Web of Science, Cochrane, Embase, and Chinese databases for eligible studies reporting on the diagnostic performance of CEUS LI-RADS in patients at risk for HCC. Results: Twelve studies were included in the analysis, comprising 5275 patients, 5739 observations, and 4066 HCCs. For the LR-5 category as a predictor of HCC, the pooled sensitivity and specificity were 70% (95% confidence interval [CI], 65%-74%) and 94% (95% CI, 91%-96%), respectively. For the LR-M category as a predictor of non-HCC malignancy, the pooled sensitivity and specificity were 83% (95% CI, 71%-91%) and 94% (95% CI, 88%-97%), respectively. The pooled proportions of HCCs were 1% (95% CI, 0%-6%) for LR-2, 20% (95% CI, 9%-34%) for LR-3, 78% (95% CI, 67%-88%) for LR-4, 97% (95% CI, 94%-99%) for LR-5, 40% (95% CI, 23%-58%) for LR-M, and 100% (95% CI, 93%-100%) for LR-TIV. Conclusion: CEUS LI-RADS is an important tool for the diagnosis of HCC.


Introduction
Hepatocellular carcinoma (HCC) accounts for the majority of primary hepatic neoplasms, poses a severe threat to human health, and ranked fourth among causes of cancer death in 2018 [1]. For patients with liver cirrhosis, HCC is the main cause of death. The prognosis of advanced liver cancer is relatively poor, with a low 5-year survival rate ( 20%) [2]. If the disease is diagnosed early, the prognosis can be greatly improved (5-year survival 50%) [3,4].
Imaging plays an important role in the early diagnosis of HCC. It is widely recommended that histological biopsy is not necessary when appropriate imaging shows the typical features of HCC in patients at high risk [5]. Contrast-enhanced ultrasound (CEUS) is routinely used in the clinic to distinguish benign from malignant liver nodules and provides useful diagnostic information [6,7]. The American College of Radiology has developed the CEUS Liver Imaging Reporting and Data System (CEUS LI-RADS), a diagnostic algorithm that aims to standardize the interpretation of imaging results for liver lesions in patients at risk of HCC and that can help clinicians decide whether to proceed to a more aggressive examination [8,9]. The initial version of CEUS LI-RADS was launched in 2016 and updated in 2017. According to the algorithm, liver nodules in patients at high risk for HCC are classified into five major categories (from LR-1 to LR-5), reflecting the probability of HCC. Additionally, the LR-M category refers to lesions that are probably or definitely malignant, but not necessarily HCC.
Many studies have shown that CEUS LI-RADS has high sensitivity and specificity in diagnosing HCC [10]. However, some had relatively small samples because the algorithm has been available for only a short time compared with LI-RADS for CT/MRI. The diagnostic performance of CEUS LI-RADS and the percentage of HCC and overall malignancy in each category have not been thoroughly evaluated. We therefore carried out a meta-analysis to investigate the diagnostic performance of the LR-5 category for HCC, the performance of the LR-M category as a predictor of non-HCC malignancy, and the pooled proportions of HCC and overall malignancy for each CEUS LI-RADS category.

Methods
This study followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guideline for conduct and reporting. It used analytical methods currently recommended by the Diagnostic Test Accuracy Working Group of the Cochrane Collaboration and the Agency for Healthcare Research and Quality. The protocol was registered in PROSPERO (CRD 42020175194).

Search Strategy
We systematically searched the MEDLINE, Embase, Web of Science, Cochrane CENTRAL, and Scopus databases, as well as Chinese databases, from August 2016 to April 2020, with no language restrictions. The literature search strategy combined specific keywords with Boolean operators. The keywords included "CEUS or contrast-enhanced ultrasound," "LI-RADS," and "HCC or Hepatocellular carcinoma." The detailed search terms are listed in Appendix 1.

Selection Criteria
Studies were eligible if they met the following criteria: (a) patients at high risk of HCC presenting with untreated liver nodules; (b) CEUS as the index test; (c) lesions evaluated with LI-RADS version 2016 (v.2016) or v.2017; (d) a reference standard based on pathological diagnosis or clinical follow-up; (e) original research, either prospective or retrospective; (f) reported true-positive, false-positive, true-negative, and false-negative counts for the LR-5 category as a predictor of HCC; and (g) reported proportions of HCCs and non-HCC malignancies in each LI-RADS category. Studies were excluded if (a) the liver observations had been treated before the index test (CEUS); (b) the data were insufficient to calculate sensitivity and specificity; or (c) the data possibly duplicated those of another study. Initially, two reviewers (H.F.L. and R.H.Z.) independently screened titles and abstracts. The potentially relevant articles were then assessed by full-text review according to the selection criteria. Any disagreements were resolved through discussion with a third reviewer (W.H.).

Data Extraction
The data extracted included (1) study characteristics (study design, publication year, journal, country where the study was performed, etc.); (2) patient characteristics (number, age, and gender); (3) liver observation characteristics (number, size range, chronic liver disease); (4) type of US probe and type of contrast agent; (5) accuracy data for CEUS findings: the numbers of true positives, false positives, false negatives, and true negatives; (6) image review method; and (7) reference standards. Two reviewers (H.F.L. and R.H.Z.) independently extracted data using a standardized form. Disagreements were resolved by consensus through discussion or by a third reviewer (W.H.).
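The accuracy counts in item (5) map directly onto the per-study accuracy measures. The following is a minimal sketch in Python, not the authors' extraction form: the class name `TwoByTwo` and the example counts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """2x2 counts for one study (hypothetical container, not from the paper)."""
    tp: int  # LR-5 observations that are HCC
    fp: int  # LR-5 observations that are not HCC
    fn: int  # HCCs not categorized as LR-5
    tn: int  # non-HCC observations not categorized as LR-5

    def sensitivity(self) -> float:
        # Proportion of HCCs correctly categorized as LR-5
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of non-HCC observations correctly not categorized as LR-5
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        # Proportion of LR-5 observations that are HCC
        return self.tp / (self.tp + self.fp)


# Hypothetical counts, for illustration only
study = TwoByTwo(tp=70, fp=6, fn=30, tn=94)
print(study.sensitivity(), study.specificity(), study.ppv())
```

Keeping the four raw counts rather than the derived percentages lets sensitivity, specificity, and predictive values all be recomputed at the pooling stage.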

Assessment Of The Quality Of The Included Studies
The quality of the included studies was assessed with the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool, which covers the risk of bias in four domains (patient selection, index test, reference standard, and flow and timing) and applicability concerns in the first three domains [11]. Two reviewers (H.F.L. and R.H.Z., both experienced in radiology) independently used the QUADAS-2 tool to assess quality. The reviewers were not blinded when reviewing these articles. Discrepancies were resolved through discussion with a third reviewer (W.H.).

Data Synthesis And Statistical Analysis
To estimate the summary sensitivities and specificities of LR-5 and LR-M and their 95% confidence intervals (95% CIs), a meta-analysis was performed using a random-effects model. To present the variation between studies in sensitivity and specificity, study results were shown in forest plots and in receiver operating characteristic (ROC) plots with a 95% estimate region.
To determine the pooled proportions of HCCs and overall malignancies for each LI-RADS category, a random-effects model was applied, with forest plots and 95% CIs. Study heterogeneity in each LI-RADS category was evaluated with the Higgins I2 statistic, with values above 50% regarded as indicating possible substantial heterogeneity.
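The text does not state which transformation or between-study variance estimator was used for the pooled proportions. The following Python sketch therefore rests on stated assumptions: a logit transformation with a 0.5 continuity correction, the DerSimonian-Laird estimator, and a normal-approximation 95% CI. The function name `pool_proportions` and the example counts are hypothetical; this is not the authors' R or STATA code.

```python
import math


def pool_proportions(events, totals):
    """Random-effects pooled proportion (DerSimonian-Laird) with Higgins I^2."""
    if len(events) != len(totals) or not events:
        raise ValueError("events and totals must be non-empty and of equal length")

    # Assumption: logit transform with a 0.5 continuity correction in every
    # study, so that proportions of 0% or 100% stay finite.
    y, v = [], []
    for e, n in zip(events, totals):
        a, b = e + 0.5, n - e + 0.5
        y.append(math.log(a / b))     # logit of the corrected proportion
        v.append(1.0 / a + 1.0 / b)   # approximate variance of the logit

    # Fixed-effect (inverse-variance) estimate and Cochran's Q
    w = [1.0 / vi for vi in v]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1

    # DerSimonian-Laird between-study variance and Higgins I^2 (in percent)
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects estimate on the logit scale
    w_re = [1.0 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    def expit(x):
        # Back-transform from the logit scale to a proportion
        return 1.0 / (1.0 + math.exp(-x))

    return {
        "pooled": expit(mu),
        "ci": (expit(mu - 1.96 * se), expit(mu + 1.96 * se)),
        "q": q,
        "tau2": tau2,
        "i2": i2,
    }


# Hypothetical HCC counts among observations in one category across four studies
result = pool_proportions(events=[12, 30, 8, 25], totals=[60, 90, 70, 80])
print(result["pooled"], result["ci"], result["i2"])
```

An `i2` above 50 corresponds to the threshold for possible substantial heterogeneity used in this section. Other transformations (for example, Freeman-Tukey double arcsine) are common for proportions and would give somewhat different estimates.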
Heterogeneity was also assessed with the Q test (P < 0.05 indicating significant heterogeneity). We conducted meta-regression analyses to explore potential covariates of heterogeneity based on the study characteristics. The covariates were as follows: (1) study type and design (case-control, cohort study, To assess publication bias, we used funnel plots and Egger's test. Statistical analyses were performed using R version 3.3.2 (The R Foundation for Statistical Computing, Vienna, Austria) and STATA software (Version 12.0; Stata Corporation).
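Egger's test for funnel plot asymmetry can be illustrated with a short sketch. It assumes the classic formulation, in which each study's standardized effect (effect divided by its standard error) is regressed on its precision (1 divided by the standard error) and the intercept is tested against zero. The function name `egger_test` and the input values are hypothetical, and the analysis reported here was run in R and STATA rather than with this code.

```python
import math


def egger_test(effects, standard_errors):
    """Egger's regression: returns intercept, its SE, t statistic, and df."""
    k = len(effects)
    if k != len(standard_errors) or k < 3:
        raise ValueError("need at least three studies with matching lengths")

    x = [1.0 / se for se in standard_errors]                  # precision
    z = [e / se for e, se in zip(effects, standard_errors)]   # standardized effect

    x_bar = sum(x) / k
    z_bar = sum(z) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    sxz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))

    # Ordinary least squares fit of z on x
    slope = sxz / sxx
    intercept = z_bar - slope * x_bar

    # Residual variance and standard error of the intercept
    df = k - 2
    rss = sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z))
    se_intercept = math.sqrt(rss / df * (1.0 / k + x_bar ** 2 / sxx))

    # The intercept is compared with a t distribution on k - 2 degrees of freedom
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, se_intercept, t_stat, df


# Hypothetical effect sizes (e.g. log diagnostic odds ratios) and standard errors
intercept, se_intercept, t_stat, df = egger_test(
    effects=[2.1, 2.6, 1.9, 3.0, 2.4],
    standard_errors=[0.30, 0.55, 0.25, 0.70, 0.40],
)
print(intercept, se_intercept, t_stat, df)
```

The two-sided P value follows from comparing `t_stat` with a t distribution on `df` degrees of freedom; an intercept close to zero suggests a symmetric funnel plot.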

Results

Literature Search
Our electronic search initially identified 406 studies, 70 of which were duplicates. Fifty-one studies remained relevant after title and abstract screening. After full-text review, 12 articles [10, 12-22] with sufficient detail that met our research design were included. Details of the excluded studies are shown in Appendix 1, and the study selection process is shown in Fig. 1.

Characteristics Of The Included Studies
The key characteristics of the included studies are listed in Table 1. A total of 5275 patients with 5739 lesions were evaluated (Table 2). All studies were full-text publications containing enough data and detail for analysis. All included studies except one [20] were carried out in a single center. Most studies had a retrospective design; only two were prospective. Among the retrospective studies, one [16] had a case-control design. Liver nodules were analyzed according to LI-RADS v.2017 in 9 studies [10, 12, 14-17, 19, 21, 22] and according to v.2016 in the other 3 [13, 18, 20].
Most studies were conducted in Asia, mainly in China and South Korea; one was conducted in Italy [20] and one in Germany [18]. Three of the 12 studies were published in English, and the rest in Chinese [12-14].

Methodological Quality
Our evaluation of study quality with the QUADAS-2 tool is reported in Appendix 2. Overall, the included studies were considered to be at high risk of bias, with high concerns about applicability. Only one study showed low risk of bias in all four domains and low applicability concerns [15]. The major risks of bias and applicability concerns related to patient selection and the reference standard. Two articles [16, 22] selectively enrolled study subjects (one [16] was a case-control study). This was considered high risk because selective enrollment may affect the prevalence of HCC in the study.
Regarding the index test, one study used both CEUS and CT/MRI [22], which may be a source of bias. The reference standard was a potential source of bias because the articles used different reference standards to diagnose liver nodules. For example, using pathology alone could lead to the inappropriate exclusion of some patients [23]. Most studies did not report whether the index test was assessed blinded to the reference standard. In addition, most articles did not report on the flow and timing domain, so the risk of bias there was unclear.

Meta-regression Analysis
We explored potential heterogeneity using the country of publication, study design, subject enrollment, LI-RADS version, imaging reviewer, and reference standard as covariates. For the LR-5 category, all of these factors except imaging reviewer significantly influenced sensitivity or specificity. For the LR-M category, only the reference standard had a significant impact on the pooled sensitivity (P 0.01). The results are summarized in Table 3 and Table 4.

Publication Bias
The linear regression test of funnel plot asymmetry indicated no significant publication bias (P = 0.50).

Discussion
CEUS LI-RADS, released by the American College of Radiology in 2016 and updated in 2017, has been widely used and recognized by international scientific societies. However, the diagnostic performance of CEUS LI-RADS has not been widely evaluated and validated.
The findings showed a substantial number of HCCs and malignant lesions in the higher LI-RADS categories (LR-1 to LR-5). In this pooled analysis, the sensitivity and specificity of the CEUS LI-RADS LR-5 category for HCC were 0.70 (95% CI, 0.65-0.74) and 0.94 (95% CI, 0.91-0.96), respectively. The pooled sensitivity and specificity of LR-M as a predictor of non-HCC malignancy, based on ten studies, were 70% and 94%, respectively. To our knowledge, CEUS LR-5 is intended to provide a specificity close to 100% for diagnosing HCC in patients at risk. Our results therefore indicate that CEUS LI-RADS offers good diagnostic value for HCC. Moreover, HCC guidelines also recommend CEUS as a diagnostic tool for HCC in expert centers [5].
Based on current evidence from the 10 included studies, LR-M as a predictor of non-HCC malignancy showed a high pooled sensitivity of 94% (95% CI, 88%-97%) but a relatively low specificity of 83% (95% CI, 71%-91%). Notably, we found a high pooled proportion of HCCs (40%) in the LR-M category, which implies a low positive predictive value for non-HCC malignancy. The LR-M category needs to be modified to better distinguish HCC from non-HCC malignancies [16].
We conducted a meta-regression analysis to explore the main sources of the high heterogeneity. We observed considerable heterogeneity, caused mainly by the country of publication, study design, and reference standard. Notably, there was a statistically significant difference in the specificity and sensitivity of the LR-5 category between articles originating from Asia and those from outside Asia (P 0.01). This may correlate with the epidemiology of HCC, such as the etiology of chronic liver disease and the presence of cirrhosis: HBV is endemic and cirrhosis is less common in Asia, unlike in Western countries. In addition, our results show that the pooled sensitivity and specificity of the LR-5 category for diagnosing HCC were higher in the retrospective studies (n = 10) than in the prospective studies (n = 2), possibly because retrospective studies can introduce selection bias. Another essential factor contributing to heterogeneity was the use of different reference standards (P 0.05). Because LR-1 to LR-3 correspond to benign or intermediate probability of malignancy, these patients are generally followed up. Thus, the two studies that used pathology alone as the reference standard are subject to increased selection bias. Despite these limitations, our meta-analysis still demonstrates the usefulness of CEUS LI-RADS.
In the present analysis, the pooled proportions of HCCs in the 12 included studies were 0% for LR-1, 1% for LR-2, 20% for LR-3, 78% for LR-4, and 97% for LR-5. This result is consistent with the definition of LI-RADS: as the category increases, so does the probability of malignancy. According to the CEUS LI-RADS classification, LR-3 and LR-4 are defined as intermediate probability of malignancy and probably HCC, respectively. However, the pooled proportions of HCC in LR-3 and LR-4 showed high heterogeneity (P 0.01). This could be due to the inappropriate exclusion of some LR-3 and LR-4 lesions because of a lack of histology. The rate of HCC in LR-3 also suggests that LR-3 observations should be considered for biopsy in combination with the patient's medical history.
Moreover, we found that the pooled proportion of malignancies was 97% for LR-M, consistent with the LR-M classification of malignancy not specific for HCC.
However, our meta-analysis had several important limitations. First, the reporting of the primary studies was poor. For example, the limited number of prospective studies in this meta-analysis may introduce a bias toward increased diagnostic sensitivity. Moreover, the reference standard in the included studies was either histology or a composite clinical reference standard. Some studies used pathology alone as the reference standard, which may cause selection bias because some enrolled patients may have been excluded inappropriately. We therefore conducted meta-regression to explore the potential factors. Second, this meta-analysis is restricted by the date of the last search (April 2020), so it is unknown whether more high-quality studies have been published in the interim. Third, the number of included studies is small, which may influence the outcome. For this reason, we did not conduct subgroup analyses to fully explore the possible sources of heterogeneity according to the characteristics of the articles.

Conclusions
In conclusion, our meta-analysis shows that CEUS LI-RADS is a useful tool for diagnosing HCC and that the specificity of LR-5 (94%) for HCC is relatively high. From LR-1 to LR-5, the probability of malignancy increases. LR-M observations are highly suspicious for malignancy but include a high proportion of HCCs (40%).

Availability of data and materials
The datasets used in the study are available from the corresponding author upon request.

Competing interests
Not applicable.