Non-Invasive Assessment of Liver Steatosis: A Comprehensive Comparison of Biomarker- and Ultrasound-Based Techniques

Background & Aims: In view of the limited reliability of biopsy in the assessment of liver fat, a non-invasive, trustworthy and more accessible method for estimating the degree of steatosis is urgently needed. While the controlled attenuation parameter (CAP) is used to quantify hepatic fat, its availability in routine practice is limited. Therefore, the aim of this study was to compare the diagnostic accuracy of biomarker- and ultrasound-based techniques for the diagnosis and grading of hepatic steatosis. Methods: This was a prospective study of 167 adults with and without non-alcoholic fatty liver disease. Using CAP as the reference, we assessed Hamaguchi’s score and the hepatorenal index (HRI), and the following biochemical measures: the fatty liver index, hepatic steatosis index and lipid accumulation product scores during a single out-patient visit. Area under the receiver operating characteristic curve (AUROC) analyses were used to evaluate the diagnostic accuracy of each test and to calculate optimal thresholds for the ultrasound techniques. Results: All non-invasive methods displayed high accuracy in detecting steatosis (mean AUROC value ≥ 0.90), with Hamaguchi’s score and the HRI as the most precise. These two tests also had the highest sensitivity and specificity (82.2% and 100%; 86.9% and 94.8%, respectively). We propose new thresholds for Hamaguchi’s score and HRI for hepatic steatosis grading, indicated by optimal sensitivity and specificity. Conclusion: Ultrasound-based techniques are the most accurate for assessing liver steatosis compared to other non-invasive tests. Given the accessibility of ultrasonography, this finding is of practical importance for the assessment of liver steatosis in clinical settings.


Introduction
Non-alcoholic fatty liver disease (NAFLD) has become the most common liver disorder in Western countries, and its global prevalence is estimated at 25.2%. NAFLD may progress from liver steatosis to non-alcoholic steatohepatitis (NASH) and then to liver cirrhosis, which is associated with an increased risk of hepatocellular carcinoma and other cirrhosis-related complications. 1 The main risk factors for NAFLD include components of the metabolic syndrome such as diabetes, obesity, dyslipidaemia and hypertension, and patients with these co-morbidities should be actively screened for NAFLD, even if their liver enzymes are within normal ranges. [2][3][4] Currently, effective screening for NAFLD in clinical practice is hindered by a lack of clear guidelines concerning non-invasive diagnostic tools. Liver biopsy (LB) remains the gold standard for the diagnosis of NASH and assessment of fibrosis, but not for hepatosteatosis. 1 Furthermore, biopsies have known disadvantages that include a risk of complications due to the invasive nature of the procedure, sampling variability because of the small size of tissue obtained, and the heterogeneous distribution of histological changes in liver parenchyma. 5 Moreover, due to the extent of the condition (e.g., 80 million Americans are affected by NAFLD 6 ) routine LBs to confirm NAFLD may be unwarranted and could even be considered unethical. Undoubtedly, LB remains the only tool available to confirm NASH, even though its prevalence among NAFLD patients is estimated at 1.5%-6.5%. Therefore, LBs should only be considered in those patients at high risk for the progressive type of the disease. 7 To find a new standard for liver steatosis screening, and to replace the use of LBs in most patients at low risk for NASH, non-invasive diagnostic methods (biomarkers and imaging-based techniques) are required.
The controlled attenuation parameter (CAP), integrated into the FibroScan system (Echosens, Paris, France), a modality estimating liver steatosis and fibrosis, remains one of the best quantitative tools that has been validated against LBs. CAP calculates the attenuation of an ultrasound beam traversing the liver tissue. It is observer independent and evaluates an area 100 times larger than an LB. CAP is considered to be an accurate tool for the diagnosis and staging of hepatic steatosis, with mean area under the receiver operating characteristic curve (AUROC) values for the diagnosis of mild, moderate and severe steatosis of 0.9, 0.8 and 0.7, respectively. 4 However, the equipment needed to undertake this assessment is usually unavailable in non-hepatological centres, such as general practice clinics or peripheral hospitals.
During routine practice, liver steatosis is typically screened using abdominal ultrasound, despite its limitations, which include subjective evaluations, operator dependency and the ability to only recognize fatty liver infiltration that is greater than 15-20%. To enhance objectivity, Hamaguchi et al 8 proposed an alternative, semi-quantitative, ultrasound-based, steatosis-assessment score that has 91.7% sensitivity and 100% specificity. Alternatively, the hepatorenal index (HRI) designed by Webb et al 9 is another quantitative, ultrasound-based, hepatic-steatosis measure that correlates with LBs and has an AUROC of over 0.9 for all steatosis grades. 10 To identify simpler and cost-effective approaches for the diagnosis of NAFLD, several scores based on easily measurable biochemical and clinical parameters such as the fatty liver index (FLI), 11 hepatic steatosis index (HSI) 12 and lipid accumulation product (LAP) 13,14 have also been developed. All of these non-invasive tools are potentially useful screening methods for clinical practice; however, the choice of an optimal screening modality as part of a daily clinical routine is made difficult by a lack of comparative studies that assess their accuracy, as well as by the absence of clear guidelines for clinicians. Therefore, the aim of this study was to undertake a comparative assessment of the diagnostic accuracy for the detection and quantification of hepatic steatosis using both ultrasound-based and biochemical techniques. Specifically, we investigated two validated ultrasound methods (Hamaguchi's score and the HRI) and three biochemical panels (FLI, HSI and LAP) using CAP as the reference method.

Study population and design
A total of 177 consecutive adult patients were prospectively recruited between March 2018 and February 2020 in a single out-patient centre located in Szczecin, Poland. The study was designed following the Standards for Reporting of Diagnostic Accuracy (STARD) guidelines (Suppl.

Clinical assessment
Demographic and anthropometric measurements, including body mass index (BMI) and waist and hip circumferences, were obtained, together with questionnaires regarding medical history and current and past alcohol ingestion. The presence and severity of steatosis were evaluated using CAP as the reference method. CAP and abdominal ultrasound for obtaining Hamaguchi's score and the HRI were performed during the same appointment by the same trained operator (KKP) after patients had fasted for at least 4 hours.
Fasting blood samples were collected to obtain biochemical data.

CAP measurements
Liver stiffness and CAP measurements were performed using FibroScan®. Measurements were obtained using both M (3.5 MHz) and XL (2.5 MHz) probes, depending on skin-to-liver capsule distance (≤ 25 mm or > 25 mm), and the probe selection was guided by an integrated tool. Steatosis grades were established using the following cut-off values for low-, intermediate- and high-grade steatosis (S1, S2, S3): 234, 269 and 301 dB/m, respectively. 15 These are the preferred cut-off values integrated in FibroScan for quantifying NAFLD.
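The grading rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `cap_grade` is hypothetical, and treating every cut-off as inclusive (value ≥ cut-off) is an assumption, supported by the text only for the S1 boundary (CAP ≥ 234 dB/m).

```python
# CAP cut-offs (dB/m) quoted in the text for S1, S2 and S3,
# listed from the highest grade down so the first match wins.
CAP_CUTOFFS = ((301, "S3"), (269, "S2"), (234, "S1"))


def cap_grade(cap_db_per_m: float) -> str:
    """Map a CAP reading (dB/m) to a steatosis grade S0-S3.

    Assumption: each cut-off is inclusive (value >= cut-off).
    """
    for cutoff, grade in CAP_CUTOFFS:
        if cap_db_per_m >= cutoff:
            return grade
    return "S0"  # below the S1 cut-off: no steatosis by CAP
```

For example, under these assumptions a reading of 280 dB/m would be assigned to S2.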

Ultrasound examination: Hamaguchi's score and the HRI
The hepatic ultrasound was performed using a high-resolution B-mode tomographic ultrasound system (Aixplorer, SuperSonic Imagine, Aix-en-Provence, France) with a convex SC6-1 probe in abdominal mode. Patients were examined in the dorsal position when Hamaguchi's score and the HRI were calculated. Hamaguchi's ultrasound steatosis score is a six-point scale used to assess four variables: hepatorenal echo contrast, liver parenchyma brightness, vessel blurring and attenuation depth. 8 Hepatic steatosis is defined by a score ≥ 2 and moderate/severe steatosis by a score ≥ 4. 16 The HRI is the ratio of the average liver parenchyma brightness to the average renal cortex brightness in a B-mode sonogram.

Fatty liver disease algorithms
Three liver steatosis algorithms (the FLI, HSI and LAP) were calculated using clinical, anthropometric and laboratory data obtained at the same appointment as the ultrasound examination.
Technical details of non-invasive techniques applied in the study are presented in Supplementary Material 2.
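For orientation, the three biochemical algorithms can be sketched as below. The formulas are the commonly published forms of the FLI, HSI and LAP; the exact definitions and units used in this study are given in Supplementary Material 2 and may differ, so the function names, argument names and units here are assumptions.

```python
import math


def fatty_liver_index(tg_mg_dl, bmi, ggt_u_l, waist_cm):
    """FLI: logistic transform of a linear predictor, scaled to 0-100.

    Assumed units: triglycerides in mg/dL, GGT in U/L, waist in cm.
    """
    y = (0.953 * math.log(tg_mg_dl) + 0.139 * bmi
         + 0.718 * math.log(ggt_u_l) + 0.053 * waist_cm - 15.745)
    return 100.0 * math.exp(y) / (1.0 + math.exp(y))


def hepatic_steatosis_index(alt_u_l, ast_u_l, bmi, female, diabetes):
    """HSI: 8 x ALT/AST + BMI, plus 2 if female and plus 2 if diabetic."""
    bonus = (2.0 if female else 0.0) + (2.0 if diabetes else 0.0)
    return 8.0 * alt_u_l / ast_u_l + bmi + bonus


def lipid_accumulation_product(waist_cm, tg_mmol_l, female):
    """LAP: (waist - 58) x TG for women, (waist - 65) x TG for men.

    Assumed units: waist in cm, triglycerides in mmol/L.
    """
    offset = 58.0 if female else 65.0
    return (waist_cm - offset) * tg_mmol_l
```

Note that the FLI and the LAP conventionally take triglycerides in different units (mg/dL and mmol/L, respectively), which is an easy source of error when both are computed from the same laboratory record.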

Statistical analysis
The Kolmogorov-Smirnov test was used to assess the distribution of variables. Qualitative variables are presented as counts and percentages. Continuous variables are shown as mean ± standard deviation (SD) or as medians with interquartile ranges (IQRs). Parametric tests (Student's t-test and ANOVA) were used for the assessment of differences between numerical variables with normal distributions, and nonparametric tests (Mann-Whitney or Kruskal-Wallis tests) were used for variables with non-normal distributions. Pearson's coefficient (ρ) was used to evaluate the association between two continuous variables. To evaluate the effectiveness of different scores in predicting (S > 0) and grading (S1, S2, S3) NAFLD, receiver operating characteristic (ROC) curves were constructed, and sensitivity and specificity were calculated, using CAP as the reference. The AUROCs, with 95% confidence intervals (CIs), were recorded and used to establish new optimal thresholds for detecting and grading liver steatosis. Spearman's rank coefficient was calculated to analyse inter-rater reliability between ordinal diagnostic scales and liver steatosis severity, as established by CAP. Multi-variate forward and backward stepwise logistic regression analyses (using CAP as the dependent variable and the scores, and the biochemical and anthropometric parameters, as the independent variables) were used to evaluate algorithms that strongly correlated (R) with hepatic steatosis. Two-sided P values < 0.05 were considered significant. All statistical analyses were performed using STATA 11 software (StataCorp, College Station, TX, USA).
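As an illustration of the ROC-based step, the sketch below computes an AUROC and picks a threshold from paired scores and reference labels. The analyses in this study were run in STATA, and the text does not name the criterion used to select the optimal thresholds; maximising Youden's J (sensitivity + specificity − 1) is an assumption made here, and the function names are hypothetical.

```python
def auroc(scores, labels):
    """AUROC as the probability that a positive case outranks a negative one.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_threshold(scores, labels):
    """Return (threshold, sensitivity, specificity) maximising Youden's J.

    Assumption: a case is called positive when score >= threshold.
    If several thresholds tie, the lowest one is kept.
    """
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if not y and s < t)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1.0
        if best is None or j > best[0]:
            best = (j, t, sens, spec)
    return best[1], best[2], best[3]
```

Here `labels` would be 1 for participants whose CAP value meets the relevant cut-off and 0 otherwise, and `scores` would hold the index under evaluation (for example, the HRI).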

Characteristics of the study population
Among 177 patients initially screened, 167 were included in the final data analysis (Suppl. Figure 1).
Patients were divided into two groups according to their CAP results using Karlas et al's 15 cut-off values. Fifty-eight participants without hepatic steatosis (S0) were included as a control group, and 109 patients with CAP-confirmed liver steatosis (≥ S1) constituted the study group. Participant characteristics are presented in Table 1. The majority of patients were female (61.7%), and the average age was 53 ± 12 years.

Correlation between CAP and non-invasive indexes
All tests displayed acceptable accuracy in discriminating the presence of steatosis as defined by CAP, and all were significantly correlated with hepatic fat content as measured with CAP (Fig. 1). The diagnostic performance data for the detection of steatosis (CAP ≥ 234 dB/m) are presented in Table 2, with the ROC curves shown in Fig. 2. The sensitivity and specificity of the methods were derived from the ROC analysis and are listed in Table 2. The highest sensitivity was achieved by the LAP and the highest specificity by Hamaguchi's score. As presented in Fig. 3, the ultrasound-based grading scores showed the highest inter-rater reliability (R) in comparison to CAP (R = 0.79), with biochemical algorithms having slightly lower values. All participants from the control group (S0) achieved 0 points in Hamaguchi's score and 95% had HRIs < 1.49. Forty-three patients (73%) of those diagnosed with S3 steatosis had 4 or more points using Hamaguchi's score. When comparing Hamaguchi's score and the HRI versus CAP for the identification of steatosis of any severity (≥ S1), these methods were able to correctly identify 82% and 87% of patients using thresholds of 2 points and 1.49, respectively.

Optimal thresholds
Based on the AUROC results, new optimal thresholds for the HRI and Hamaguchi's score, according to CAP, were calculated and are listed in Table 3. The estimated cut-off values reached high sensitivity and specificity values for steatosis grading. The only value with a sensitivity lower than 70% was the HRI threshold for detecting severe steatosis.

Optimal steatosis prediction model
Based on the multi-variate forward and backward stepwise logistic regression analysis, different imaging, biochemical and anthropometric parameters and scores were evaluated to find a combination model that strongly correlated with hepatic steatosis (Table 4). The application of Hamaguchi's score together with the HSI achieved a correlation rate of 0.87. An even better result was found for the diagnostic model that relied on Hamaguchi's score, BMI, GGTP and ferritin levels, with a correlation rate of 0.89.

Main findings
In this study, we performed a comparative analysis of ultrasound-based and biochemical techniques for the non-invasive assessment of liver steatosis with CAP as a reference modality. We report that all tests attained high accuracy in detecting steatosis in comparison to CAP. Furthermore, we have demonstrated that ultrasound-based techniques (Hamaguchi's score and the HRI) were more accurate than biochemical indexes. Of the biochemical panels, the FLI reached the highest accuracy for detecting NAFLD. Our results are similar to those reported in previous studies that have validated these tests against LBs or magnetic resonance imaging proton density fat fraction (MRI-PDFF). We have also proposed threshold values for ultrasound methods that allow for the diagnosis and grading of liver steatosis.
Recent American Association for the Study of Liver Diseases Practice Guidelines on NAFLD do not recommend LBs in patients with NAFLD unless there is a strong suspicion of advanced fibrosis. 1 Therefore, a non-invasive measurement of liver steatosis plays a crucial role in the assessment of this pathology. In routine practice, liver steatosis is typically diagnosed with ultrasound, an easily accessible and inexpensive modality. Nonetheless, this method is subjective and imprecise in follow-ups. Therefore, this study attempted to validate two simple-to-perform alternative ultrasound-based methods and three well-known biochemical panels for the evaluation of liver steatosis. Moreover, we investigated whether these tests could also be used for the quantification of steatosis.

Ultrasound-based techniques
Hernaez et al 17 published a meta-analysis based on forty-nine studies (4720 participants) investigating the diagnostic accuracy of ultrasonography for the detection of moderate to severe fatty livers in comparison to histology. The overall sensitivity and specificity for the ultrasound methods were 84.8% (95% CI, 79.5-88.9) and 93.6% (95% CI, 87.2-97.0), respectively, which was similar to that of other imaging methods (i.e., computed tomography and MRI). Our study showed that Hamaguchi's score and the HRI demonstrated high diagnostic accuracy for the detection of steatosis (AUROC = 0.94). Furthermore, performance in terms of sensitivity, specificity and Spearman's coefficient (ρS) was good to excellent for the detection of steatosis (CAP ≥ 234 dB/m) using optimal cut-off values. With regard to the detection of steatosis, the sensitivity and specificity were 82.2% and 100.0% for Hamaguchi's score and 86.9% and 94.8% for the HRI, and both methods achieved a high grading correlation with CAP (ρS = 0.79). These results are in agreement with previous studies that have validated these methods against LBs.
Hamaguchi et al 8 reported a 91.7% sensitivity and 100% specificity for Hamaguchi's score. Therefore, we conclude that Hamaguchi's score performs well for the detection of steatosis. However, the optimal thresholds to quantify all of its degrees have not been previously estimated. Therefore, based on our results, we propose new cut-off values, where a score of 0 points excludes steatosis, 1 or 2 points indicates low-grade steatosis (S1), 3 points suggests intermediate steatosis (S2) and a score of 4 points or greater indicates high-grade (S3) steatosis.
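The proposed Hamaguchi cut-offs amount to a simple lookup, sketched below. The function name `hamaguchi_grade` is hypothetical, and the accepted range of 0 to 6 points is an assumption based on the usual definition of the score rather than something stated in this section.

```python
def hamaguchi_grade(score: int) -> str:
    """Map a Hamaguchi score to the steatosis grade proposed in the text.

    0 points -> S0, 1-2 points -> S1, 3 points -> S2, >= 4 points -> S3.
    Assumption: valid scores are the integers 0 to 6.
    """
    if not 0 <= score <= 6:
        raise ValueError("Hamaguchi score is assumed to lie between 0 and 6")
    if score == 0:
        return "S0"
    if score <= 2:
        return "S1"
    if score == 3:
        return "S2"
    return "S3"
```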
The HRI appeared to be a highly accurate modality to detect low and moderate steatosis, but our results indicate that it demonstrated poor sensitivity when distinguishing between moderate and severe steatosis. Thus far, only a few studies (based on small groups) have validated the HRI, and they have suggested different optimal cut-off values for steatosis grading. Webb et al 9 were the first to describe a correlation between the HRI and LBs, with an AUROC of over 0.9 for all steatosis grades and 1.49 as the optimal cut-off value. 10 Marshall et al, 18 in a study of 101 patients with biopsy-diagnosed NAFLD, reported an HRI sensitivity and specificity of 100% and 54%, respectively, with a cut-off value of 1.27. A similar optimal cut-off value of 1.24 was described by Borges et al 19 in a small sample of 42 participants, with a sensitivity and specificity of 93%. Chauhan et al 20 estimated an optimal threshold of 2.01 to detect steatosis, with a sensitivity of 62.5% and specificity of 95.2%. Given the differences between the above-mentioned studies, it appears that further investigations are needed to establish optimal HRI cut-off values. In our study (based on AUROC results) we calculated new optimal thresholds for the HRI (1.41 for S ≥ S1, 1.56 for S ≥ S2 and 2.015 for S ≥ S3) (Table 3) and report a sensitivity, specificity and Spearman's coefficient of 91.6%, 86.2% and ρ = 0.78 for S ≥ S1; 94.0%, 80.2% and ρ = 0.75 for S ≥ S2; and 57.6%, 90.6% and ρ = 0.52 for S ≥ S3.
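To make the HRI thresholds concrete, the sketch below computes the index as the ratio of mean liver to mean renal cortex brightness and then applies the cut-offs reported here (1.41, 1.56 and 2.015). The function names, the use of plain lists of pixel intensities for the two regions of interest, and the inclusive comparison (HRI ≥ cut-off) are all assumptions for illustration; the actual measurement procedure is described in Supplementary Material 2.

```python
from statistics import fmean

# HRI cut-offs reported in this study for S >= S1, S >= S2 and S >= S3,
# listed from the highest grade down so the first match wins.
HRI_CUTOFFS = ((2.015, "S3"), (1.56, "S2"), (1.41, "S1"))


def hepatorenal_index(liver_roi, kidney_roi):
    """Ratio of mean liver brightness to mean renal cortex brightness.

    Each argument is a sequence of pixel intensities from a
    (hypothetical) region of interest in the same B-mode image.
    """
    return fmean(liver_roi) / fmean(kidney_roi)


def hri_grade(hri: float) -> str:
    """Map an HRI value to a steatosis grade, assuming inclusive cut-offs."""
    for cutoff, grade in HRI_CUTOFFS:
        if hri >= cutoff:
            return grade
    return "S0"
```

Because the S3 threshold had a sensitivity of only 57.6% in this study, a grade of S2 from such a rule would not reliably exclude severe steatosis.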

Biomarkers
In our study, the FLI and HSI performed similarly to what was originally described by Bedogni et al 11 and Lee et al. 12 In several studies comparing two of the biochemical scores tested in our study versus liver histology 21 or MRI, 14,22 the FLI and HSI performed as well as, or slightly worse than, in our study for the detection of liver steatosis. Koehler et al, 23 in their retrospective study on 2652 patients with ultrasound-detected NAFLD, reported an AUROC of 0.81 for the FLI. Good performance of the FLI for the diagnosis of NAFLD has also been confirmed in other studies, 11,16,24 including a report by Motamed et al 25 that found an AUROC of 0.86 (95% CI: 0.85-0.87). Our study confirmed this conclusion, as the correlation rates for the FLI and waist circumference were 0.71 and 0.72, respectively. Waist circumference appears to be a simpler and more accessible measure that has a similar performance to the FLI.
There are only a few published studies on the LAP, as originally described by Bedogni et al. 13 Despite a promising sensitivity of 93.1%, the specificity of the LAP was estimated at only 62.1%. The sensitivity and specificity of the FLI and HSI were 77.8% and 86.2%, and 76.7% and 86.2%, respectively.

Strengths and limitations of the study
The strengths of this study include its prospective design, the well-defined characteristics of participants with and without NAFLD, and screening using standardised liver assessments to exclude patients with other causes of chronic liver disease, including excessive alcohol consumption. In addition, all participants underwent consecutive CAP and ultrasound assessments by the same certified operator, and biomarker evaluations occurred on the same day. To our knowledge, this is the first prospective study to assess the diagnostic accuracy of these five non-invasive methods in the general population in comparison to CAP, as well as the first to establish steatosis grading cut-off values for Hamaguchi's score.
However, we acknowledge the following limitations of this study. First, LBs were not performed, as we found it unethical to perform them on patients with simple steatosis. LB is the reference method for the diagnosis of NAFLD, but is hindered by misdiagnosis and inaccuracies in staging, which is partially due to the small sizes of tissue specimens (1/50,000 to 1/65,000 of total liver volume) and the heterogeneous distribution of histological changes in the liver parenchyma. 5,27 Therefore, steatosis may be unevenly distributed and sampling error remains a major challenge for LBs. Hence, we used a highly accurate, widely-available, non-invasive quantitative modality, CAP, that has been validated against LBs and PDFF-MRI in numerous studies, [28][29][30][31] and against ultrasound. 16 It evaluates an area 100 times larger than LBs and has emerged as a novel biomarker for assessing hepatic steatosis. Second, in our study, we used both probes (M and XL) and probe selection was guided by an integrated tool according to skin-to-liver capsule distance. The CAP thresholds used were the same for both probes, as the literature had suggested that there is no difference between measurements. However, in a recently published study by Caussy et al, 32 it was demonstrated that CAP values were significantly lower when obtained using the M probe as compared to the XL probe in the same participant, even when the probe was selected according to the participant's BMI. Therefore, the authors concluded that different thresholds for the detection of NAFLD should be applied depending on the type of probe used for the CAP measurement. Moreover, in addition to the study by Caussy et al, significantly different CAP thresholds for the diagnosis and staging of liver steatosis have been proposed over the past few years, depending on the reference method used. 15,28-31 These discrepancies could have affected our results, but would have increased the calculated accuracy, as we used the lowest cut-off values as opposed to the newly proposed ones. Undoubtedly, the optimal cut-off values need to be validated using large cohorts for reliable diagnoses and staging of steatosis.
Currently, diagnostic confusion may seriously affect further clinical decisions, as well as the determination of the risk of NAFLD progression and its complications.

Implications for clinical use
In this study, we have demonstrated that all five non-invasive methods of steatosis detection (Hamaguchi's score, HRI, FLI, HSI and LAP) were able to discriminate between the presence and absence of steatosis with an overall good diagnostic performance. Using a prospective study design, we showed that ultrasound assessment using Hamaguchi's score or the HRI was also a good diagnostic tool for the quantification of hepatosteatosis degrees. However, further cohort studies are needed to evaluate optimal diagnostic thresholds.
Our findings suggest that the above-mentioned tests can be useful screening tools for the detection of NAFLD, especially in patients with risk factors. PDFF-MRI and CAP remain relatively expensive and are not easily available. This is in contrast to ultrasound-based and biochemical diagnostic methods, which are feasible in routine clinical practice. Our findings support the use of ultrasound as the imaging technique of choice for screening for NAFLD in the general population, especially given its low cost, non-invasive nature and the lack of radiation exposure involved. Moreover, simple biochemical algorithms may prove helpful in diagnosing hepatic steatosis, particularly in the general practice setting, and in selecting patients who require further liver diagnostics.