More than a dozen models have been proposed to detect NAFLD, and most performed well in their original studies, including the SteatoTest [22], FLI [10], FLD index [14], hepatic steatosis index (HSI) [23], Framingham steatosis index (FSI) [24], homeostasis model assessment of insulin resistance (HOMA-IR) [25], and index of NASH (ION) [26]. However, some indicators in these models are costly and unavailable in most laboratories in less developed countries, such as α2-macroglobulin and apolipoprotein A-I in the SteatoTest and the insulin test in HOMA-IR and ION. To identify a simple, accurate and cost-effective model for large-scale NAFLD screening, we selected NAFLD-related models built from simple indicators. We validated eight such models (FLI, FLD, ZJU, LAP, CAP, WHtR, TyG and VAI) and compared their performance in an eastern Chinese community population. All of these models showed moderate discrimination, and FLI and FLD performed better than the other models in this population.
FLI, which combines TG, GGT, BMI and WC, was developed by Bedogni et al. [10] in northern Italy in 2006. It has been extensively validated and has been shown to correlate with insulin resistance, coronary heart disease, and early atherosclerosis [27, 28]. Studies have shown that FLI has moderate discrimination (AUC, 0.83–0.88) in Taiwan and in northern and western mainland China [17, 29, 30]. Our study confirmed the feasibility of FLI in eastern mainland China and showed that it is applicable to the community population. FLD, based on BMI, TG, the ALT to AST ratio and FPG, was proposed by Fuyan et al. in eastern China with an AUC of 0.82 [14]. It was recently validated by Jinzhou et al. in western China with an AUC of 0.87, but it was no better than FLI (AUC, 0.88) [29]. Our study indicated that FLD has the same diagnostic performance as FLI. In addition, the decision curve analysis suggested that FLI and FLD have a higher net benefit than the other models. The population in Jinzhou et al.'s study was western Chinese whereas ours was eastern Chinese, which may account for differences between the two studies.
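For readers who wish to reproduce the index, the following is a minimal sketch of how FLI can be computed from its four components. The coefficients and units (TG in mg/dL, GGT in U/L, WC in cm, BMI in kg/m²) are those commonly cited for the original publication [10] and are not restated in this section, so they should be treated as assumptions and checked against that source. The function name and the example subject are hypothetical.

```python
import math


def fatty_liver_index(tg_mg_dl: float, bmi: float, ggt_u_l: float, wc_cm: float) -> float:
    """Return FLI on a 0-100 scale.

    Assumption: coefficients as commonly cited for the original FLI
    publication [10]; verify against that source before any real use.
    """
    # Linear predictor: log-transformed TG and GGT, untransformed BMI and WC.
    lp = (0.953 * math.log(tg_mg_dl)
          + 0.139 * bmi
          + 0.718 * math.log(ggt_u_l)
          + 0.053 * wc_cm
          - 15.745)
    # Logistic transform, rescaled to 0-100.
    return 100.0 / (1.0 + math.exp(-lp))


# Hypothetical subject; values are invented for illustration only.
fli = fatty_liver_index(tg_mg_dl=150, bmi=25, ggt_u_l=30, wc_cm=90)
print(round(fli, 1))
```

Because the score is a logistic transform of routinely measured variables, it can be calculated from a standard health check-up record without additional laboratory tests.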
ZJU also performed well in this study, but not as well as FLD in the whole population. ZJU was developed from BMI, FPG, TG, the ALT to AST ratio and gender, with an AUC of 0.82 in eastern China; in the validation cohort with pathological data, the ZJU index showed good accuracy (AUC, 0.896) for the detection of steatosis [15]. Some recent studies have validated ZJU and compared it with other models, including FLI, in large populations, with conflicting results [29, 31]. The underlying reason may be differences in NAFLD prevalence and population characteristics.
LAP, CAP, WHtR, TyG and VAI had AUCs above 0.75 in the whole population in our study. WC, a good surrogate parameter of visceral fat [32, 33], is a common component of FLI, LAP, WHtR and VAI. Visceral adiposity is significantly associated with increased free fatty acids, which can be transported to the liver and promote hepatic fat accumulation, insulin resistance and inflammation [34]. CAP was developed from 141 patients with histologically diagnosed NAFLD and has higher accuracy than FibroScan [16]. The sample size of the primary study may be too small to yield a model applicable to the general population. To our knowledge, CAP has not been externally validated. TyG correlates strongly with the degree of NAFLD, but its AUC for detecting NAFLD does not match the strength of this correlation. TyG has been used to reflect insulin resistance, which is very important in the development of NAFLD [35]. Apart from insulin resistance, transaminases and anthropometric indicators also play vital roles in the prediction of NAFLD. This may be why the diagnostic performance of TyG is not in line with its relationship to NAFLD and is not as good as that of FLI and FLD.
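To make the composition of the simpler indices concrete, the sketch below computes WHtR, TyG and LAP using their commonly cited definitions. These formulas and units are not given in this section and are therefore assumptions: TyG uses TG and FPG in mg/dL, and LAP uses WC in cm and TG in mmol/L with gender-specific WC offsets. The function names and example values are hypothetical.

```python
import math


def whtr(wc_cm: float, height_cm: float) -> float:
    """Waist-to-height ratio (both in the same unit)."""
    return wc_cm / height_cm


def tyg(tg_mg_dl: float, fpg_mg_dl: float) -> float:
    """Triglyceride-glucose index: ln(TG * FPG / 2), both in mg/dL (assumed definition)."""
    return math.log(tg_mg_dl * fpg_mg_dl / 2.0)


def lap(wc_cm: float, tg_mmol_l: float, male: bool) -> float:
    """Lipid accumulation product (assumed definition).

    Men: (WC - 65) * TG; women: (WC - 58) * TG, with WC in cm and TG in mmol/L.
    """
    offset = 65.0 if male else 58.0
    return (wc_cm - offset) * tg_mmol_l


# Hypothetical subject; values are invented for illustration only.
print(whtr(90, 180), round(tyg(150, 100), 2), lap(90, 1.5, male=True))
```

WHtR and LAP depend directly on WC, which is consistent with the role of visceral adiposity discussed above, whereas TyG uses only serological measurements.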
The cut-off value is an important consideration when applying a model to a specific population, so we also determined the cut-off values of these models in this population. The cut-off values of FLI and FLD differ by gender and are inconsistent with previous studies. The optimal cut-off points of FLI in the present study were 20.6, 25.3 and 8.4 in the total, male and female populations, respectively. Studies in Western countries have mostly classified FLI < 30 as non-NAFLD and > 60 as NAFLD, without gender-specific thresholds [10]. Li C et al. determined the optimal cut-off point of FLI as 20 in a northern Chinese population, which is similar to our finding [30], but they did not report gender differences in the cut-off values. A validation study of FLI in Taiwan also showed lower cut-off values for NAFLD than in Western populations [17]. Body size, body composition and fat distribution differ among races and ethnic groups because of environmental, nutritional and cultural factors [36, 37], which leads to diversity in anthropometric and serological measurements. Women have a higher percentage of fat mass, more extremity fat, and lower lean mass than men of the same age and BMI [38]. This may explain the lower cut-off values in female subjects. The optimal cut-off values of FLD were more stable across subgroups (28.7, 29.0 and 26.1 for the total, male and female populations, respectively) than those of FLI, although the cut-off value for female subjects was again lower than that for male subjects. It is therefore essential to apply population-specific cut-off values.
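This section does not state how the optimal cut-off points were derived. One common approach, assumed here purely for illustration, is to choose the threshold that maximises Youden's index (sensitivity + specificity − 1). The following minimal sketch shows that calculation; the function name and the toy data are hypothetical and are not taken from the study.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximising Youden's J = sensitivity + specificity - 1.

    A subject is classified as positive when score >= cutoff.
    labels: 1 = NAFLD on the reference test, 0 = no NAFLD.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    best_cutoff, best_j = None, -1.0
    # Every observed score is a candidate threshold.
    for c in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= c)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < c)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:  # ties keep the lowest threshold
            best_cutoff, best_j = c, j
    return best_cutoff, best_j


# Toy data invented for illustration only.
toy_scores = [5, 10, 15, 22, 30, 45, 60, 70]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]
print(youden_cutoff(toy_scores, toy_labels))
```

Gender-specific cut-offs of the kind reported above could be obtained under this assumption by running the same function separately on the male and female subsets.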
Our results showed that FLI and FLD can be used to screen for NAFLD in an eastern Chinese community population. The cost of applying these models in eastern China has not been studied. Jinzhou et al. compared the cost of several NAFLD-related models in western China [29]. In that study, FLI cost 20 Yuan per capita, which was lower than FLD and ZJU. FLI may therefore have advantages in cost and accessibility over the other models. The cost-effectiveness of these models should be studied further.
The strengths of this study include its large population, comprehensive analysis of the variables and head-to-head comparison of the included models. Our study also has some limitations. Ultrasonography has limited sensitivity as a diagnostic method for NAFLD [39]; however, it is a preferable method for large-scale screening of asymptomatic individuals such as a community population. Another limitation is that we did not adjust for other factors such as physical activity, diet and smoking, which may be correlated with the risk of metabolic symptoms. Additionally, our study was retrospective; further prospective studies are needed to evaluate the practical value of these models.