Machine Learning Soft Voting Algorithm for Prediction and Detection of Nonalcoholic Fatty Liver Disease

DOI: https://doi.org/10.21203/rs.3.rs-2025654/v1

Abstract

Nonalcoholic fatty liver disease (NAFLD) is one of the most commonly diagnosed chronic liver diseases in the world and has become an important public health problem. Machine learning algorithms were introduced to evaluate the best predictive clinical model for NAFLD. This paper proposes a machine learning Voting algorithm that combines a Genetic Algorithm, a Neural Network, Random Forest, and Logistic Regression for NAFLD detection and diagnosis. First, 2,522 of the 10,508 samples met the diagnostic criteria for NAFLD. The distribution of missing values was visualized, and the KNN algorithm was used to fill the missing values. The Kolmogorov-Smirnov Z test was performed and a heatmap of 19 variables was drawn. The PPFS feature selection method was used to perform feature selection, and 11 features were retained. Alanine aminotransferase (ALT), body mass index (BMI), triglycerides (TG), γ-glutamyl transpeptidase (γGT), and low-density lipoprotein cholesterol (LDL) were the top 5 features contributing to NAFLD. Ten basic machine learning algorithms were used, and the four with the highest accuracy were the Genetic Algorithm, Neural Network, Random Forest, and Logistic Regression. These four algorithms were fused into the proposed Voting algorithm through the soft voting method of ensemble learning, and 10-fold cross-validation was used in the classification. To verify the proposed Voting algorithm, it was compared with the 10 basic machine learning algorithms. It achieved accuracy, recall, precision, \({F}_{1}\) score, and AUC of up to 0.846212, 0.573248, 0.725806, 0.640569, and 0.894010, respectively. According to these results, the proposed Voting algorithm demonstrated the best performance.

1. Introduction

Non-alcoholic fatty liver disease (NAFLD), a disease caused by fat accumulating in the liver, is one of the most commonly diagnosed chronic liver diseases in the world and has become an important public health problem [1, 2]. NAFLD comprises a broad clinical spectrum of progression, including simple steatosis, nonalcoholic steatohepatitis (NASH), and fibrosis. Simple steatosis is considered to have a benign course, but NASH may develop into fibrosis, which is a result of chronic liver injury and can further progress to cirrhosis and hepatocellular carcinoma [3]. It is crucial to obtain an early diagnosis, which will lead to better prevention and management of NAFLD.

Liver biopsy is the gold standard for NAFLD detection and fibrosis staging, but it is costly and limited by sampling errors and the risk of complications [4]. Much attention has therefore been focused on whether noninvasive methods can identify NAFLD patients with an elevated risk of progressive disease. Bedogni et al. proposed the Fatty Liver Index (FLI), a combination based on triglycerides, body mass index (BMI), gamma-glutamyl transpeptidase (γGT), and waist circumference (WC), which is broadly adopted as a biomarker index for NAFLD [5]. Wang et al. proposed the ZJU index, which is a good predictor of NAFLD in the Chinese population [6]. Lee et al. proposed the hepatic steatosis index (HSI), consisting of ALT, AST, BMI, gender, and history of diabetes, which can effectively identify NAFLD [7]. Ultrasonography is noninvasive, reasonably accurate, and widely used in the clinical diagnosis of NAFLD; however, it is not sensitive enough to detect mild steatosis [8].

Machine learning is a field of artificial intelligence that applies statistical approaches to classifying data. Several machine learning techniques have been applied in clinical settings to predict diseases and have shown higher diagnostic accuracy than classical methods [9-12]. Here, we propose a new machine learning Voting algorithm to build a useful predictive model for NAFLD; 10 basic machine learning algorithms are also compared. For more thorough validation, the proposed study evaluates not only accuracy but also other indicators such as precision, recall, \({F}_{1}\) score, and AUC.

The paper is organized as follows. Section 2 presents the materials and method, including a description of the dataset used in the research, the PPFS feature selection method, and an introduction of the proposed Voting algorithm and the evaluation indicators. Section 3 provides the results, including missing data filling, correlation analysis, feature selection using PPFS, machine learning performance assessment, and a comparison with existing studies. Section 4 provides the discussion. Finally, Section 5 presents the conclusions of the work.

2. Materials And Method

2.1. Dataset Used in Research

Data were obtained from 10,508 participants who attended the 2010 annual health examination at the First Affiliated Hospital of Zhejiang University School of Medicine, China [9]. Informed consent was obtained from all subjects involved in the study. The study was approved by the Ethics Committee of the Guilin University of Technology and complied with the Declaration of Helsinki. All methods were performed in accordance with the approved guidelines. The variables consisted mainly of 4 basic characteristics of the subjects and 15 biochemical indicators of the subjects' blood. The four basic characteristics were age, gender, height, and weight. The biochemical variables included liver enzymes, lipids, uric acid, and glucose. The diagnosis of NAFLD was based on criteria from the Chinese Liver Disease Association [13]. Ultrasound examinations were performed by trained sonographers. When the variable Ultrasound equals 1, NAFLD is present; when it equals 0, NAFLD is absent. The detailed variables and descriptions are shown in Table 1.

Table 1

The variables and descriptions of the data.

| Num. | Variable | Description | Num. | Variable | Description |
| --- | --- | --- | --- | --- | --- |
| 1 | Age | Age | 11 | IB | Indirect bilirubin |
| 2 | Gender | Gender | 12 | TC | Total cholesterol |
| 3 | Height | Height | 13 | TG | Triglycerides |
| 4 | Weight | Weight | 14 | HDL | High-density lipoprotein cholesterol |
| 5 | ALT | Alanine aminotransferase | 15 | LDL | Low-density lipoprotein cholesterol |
| 6 | AST | Glutamic oxaloacetic transaminase | 16 | Bun | Blood urea nitrogen |
| 7 | ALP | Alkaline phosphatase | 17 | Cr | Creatinine |
| 8 | γ-GT | γ-Glutamyl transferase | 18 | Glu | Fasting plasma glucose |
| 9 | TB | Total bilirubin | 19 | Uric | Serum uric acid |
| 10 | DB | Direct bilirubin | 20 | Ultrasound | Ultrasound |

2.2. Feature Selection

Predictive Permutation Feature Selection (PPFS) [14] is a novel feature selection algorithm based on the concept of the Markov Blanket (MB), illustrated in Fig. 1. An MB contains all the information related to the target node, so non-MB nodes can be safely discarded to achieve feature selection. The green G node is the target node, the pink nodes form an MB of the G node, and the G node is independent of any node outside the rectangle [15].

The PPFS algorithm selects a subset of features based on their performance both individually and as a group; it automatically decides how many features to keep and tries to find the optimal combination of features [14]. The PPFS algorithm is implemented using the PPIMBC function in the PyImpetus package for Python.
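The published implementation lives in the PyImpetus package; as a rough, self-contained illustration of the underlying permutation idea only (not the full Markov-Blanket procedure of PPFS), the following sketch uses sklearn's permutation_importance on synthetic data, with the 0.01 threshold chosen arbitrarily:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# Synthetic stand-in for the health-check data: 8 features, 3 informative.
X, y = make_classification(n_samples=500, n_features=8, n_informative=3,
                           n_redundant=2, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Permutation importance: shuffle one feature at a time and measure the
# drop in score; features whose shuffling barely hurts can be discarded.
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
selected = [i for i, imp in enumerate(result.importances_mean) if imp > 0.01]
```

Unlike this one-shot scoring, PPFS additionally tests features conditionally as a group, which is what lets it recover a Markov Blanket rather than a ranked list.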

2.3. Proposed Voting Algorithm

Ten machine learning algorithms are used, namely, Logistic Regression (LR) [16], Random Forest (RF) [17], Support Vector Machine (SVM) [18], Decision Tree (DT) [19], LightGBM [20], CatBoost [21], Neural Network (NN) [22], K-Nearest Neighbor (KNN) [23], Bayesian Network (BN) [24], and Genetic Algorithm (GA) [25]. The four most accurate algorithms (GA, LR, NN, RF) were combined into a Voting algorithm [26] by the ensemble learning soft voting method.

The flowchart of the proposed Voting algorithm is shown in Fig. 2. The framework consists of three phases: data preparation, model construction, and model prediction. In the data preparation phase, missing data are filled, and correlation analysis and feature selection are performed. In the model construction phase, the four most accurate machine learning algorithms (GA, LR, NN, RF) are combined into a Voting model by the ensemble learning soft voting method. In the model prediction phase, the proposed Voting algorithm is applied to predict whether a new patient will progress to NAFLD.

GA is a randomized search optimization algorithm that simulates the crossover, mutation, and selection phenomena that occur in natural selection and genetics. Beginning with a random initial population, a population of individuals better suited to the environment is generated by random selection, crossover, and mutation operations [27].

Ensemble learning is a machine learning technique that is broadly used in classification and regression problems. It combines multiple identical or different machine learning algorithms to solve a single problem. Voting-based ensemble learning builds multiple models and applies basic statistical methods to combine their predictions. Voting algorithms comprise hard voting and soft voting; soft voting uses the class probabilities output by each algorithm, and the predicted outcome is the class with the largest sum of probabilities across all voters [28].
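The soft-voting combination described above can be sketched with sklearn's VotingClassifier. This minimal example combines LR, RF, and NN on synthetic data; the GA base learner (TPOT) is omitted for brevity, and the hyperparameters are placeholders rather than the tuned values of Table 4:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=600, n_features=11, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# voting="soft" averages the class probabilities of the base models and
# predicts the class with the largest summed probability.
clf = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=200)),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("nn", MLPClassifier(max_iter=300, random_state=0)),
    ],
    voting="soft",
).fit(X_tr, y_tr)

proba = clf.predict_proba(X_te)  # averaged class probabilities per sample
acc = clf.score(X_te, y_te)
```

With voting="hard" the same class would instead be chosen by majority of the base models' label predictions, discarding their confidence.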

The packages and functions used by the 11 algorithms are shown in Table 2.

Table 2

The packages and functions used by the 11 algorithms.

| Algorithm | Package | Function |
| --- | --- | --- |
| LR | sklearn | LogisticRegression |
| RF | sklearn | RandomForestClassifier |
| SVM | sklearn | SVC |
| DT | sklearn | DecisionTreeClassifier |
| LightGBM | lightgbm | LGBMClassifier |
| CatBoost | catboost | CatBoostClassifier |
| NN | sklearn | MLPClassifier |
| KNN | sklearn | KNeighborsClassifier |
| BN | pyAgrum | BNClassifier |
| GA | tpot | TPOTClassifier |
| Voting | sklearn | VotingClassifier |

2.4. Evaluation Indicators

Based on the method used in a previous study [29], we calculated the accuracy, precision, recall, \({F}_{1}\) score, and AUC to evaluate the performance of the different algorithms.

$$\left\{\begin{array}{l}\text{TP} = \text{True positive}\\ \text{FP} = \text{False positive}\\ \text{FN} = \text{False negative}\\ \text{TN} = \text{True negative}\end{array}\right\}$$
1

Accuracy represents the number of correctly classified test instances as a percentage of the total number of test instances and is calculated as [30]

$$Accuracy =\frac{TP+TN}{TP+FP+FN+TN}$$
2

Recall represents the ratio of the number of correctly classified positive cases to the actual number of positive cases and is calculated as [31]

$$\text{Recall}\text{ }=\frac{TP}{TP+FN}$$
3

Precision represents the ratio of the number of correctly classified positive instances to the number of instances classified as positive and is calculated as [32]

$$\text{Precision }=\frac{TP}{TP+FP}$$
4

The \({F}_{1}\) score is based on the harmonic mean of Recall and Precision, which evaluates Recall and Precision together and is calculated as [33]

$${F}_{1}\text{ }=\frac{2*\text{ Recall }*\text{ Precision }}{\text{ Recall }+\text{ Precision }}$$
5

The true positive rate (TPR) indicates the percentage of all actually positive samples that are correctly identified as positive. The false positive rate (FPR) indicates the rate at which actually negative samples are incorrectly identified as positive. TPR and FPR are calculated as [34]

$$\text{TPR }=\frac{TP}{TP+FN}$$
6
$$\text{FPR }=\frac{FP}{FP+TN}$$
7

The receiver operating characteristic (ROC) [34] curve plots FPR on the X-axis against TPR on the Y-axis. The area under the curve (AUC) [35] is the area under the ROC curve and gives the average performance value of the classifier.
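The indicators defined by Eqs. (2)-(7) can be computed directly from the four confusion-matrix counts; the counts in the following sketch are illustrative only, not the paper's experimental results:

```python
def metrics(tp, fp, fn, tn):
    """Compute the evaluation indicators of Eqs. (2)-(7) from raw counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    recall = tp / (tp + fn)          # also the TPR of Eq. (6)
    precision = tp / (tp + fp)
    f1 = 2 * recall * precision / (recall + precision)
    fpr = fp / (fp + tn)             # Eq. (7)
    return accuracy, recall, precision, f1, fpr

# Illustrative counts only.
acc, rec, prec, f1, fpr = metrics(tp=80, fp=20, fn=30, tn=170)
```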

3. Results

3.1. Missing data filling

First, the missingno package in Python is used to visualize the distribution of missing values in the data; the result is shown in Fig. 3. The two values on the left axis are the beginning and end of the sample range (from 1 to 10508). The number 3 on the right indicates that 3 columns have no missing values, and the number 20 at the bottom right indicates that there are 20 columns in total. More white lines mean more missing values. The three variables Age, Gender, and Ultrasound have no white lines, which means they have no missing values.

We can also calculate the missing rate of the data; the result is shown in Fig. 4. The missing rate of each variable is low: Height and Weight, the two variables with the highest missing rates, are both under 0.5%. Age, Gender, and Ultrasound have no missing values, so their missing rate is 0. The KNN algorithm in Python's fancyimpute package is used to fill the missing values. We then construct the body mass index (BMI), calculated as weight divided by height squared, which is used as a standard for diagnosing overweight and obesity.
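fancyimpute's KNN filler can be approximated with sklearn's KNNImputer, which implements the same nearest-neighbour filling idea; the toy height/weight records below are illustrative only:

```python
import numpy as np
from sklearn.impute import KNNImputer

# Toy height (cm) / weight (kg) records with missing entries (np.nan).
data = np.array([
    [170.0, 65.0],
    [165.0, np.nan],
    [np.nan, 80.0],
    [180.0, 85.0],
    [160.0, 55.0],
])

# Each missing value is replaced by the mean of its k nearest neighbours,
# measured on the observed coordinates.
filled = KNNImputer(n_neighbors=2).fit_transform(data)

# BMI = weight (kg) / height (m)^2, used to flag overweight and obesity.
bmi = filled[:, 1] / (filled[:, 0] / 100.0) ** 2
```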

3.2. Correlation analysis

The data are divided into two categories by the Ultrasound variable, where an Ultrasound of 1 means NAFLD is present and an Ultrasound of 0 means NAFLD is absent. The data characteristics of the two types are viewed by the describe function in python. The result is shown in Table 3.


 
Table 3

The data characteristics of the two types.

| Variable | NAFLD present (n = 2522) | NAFLD absent (n = 7986) | Kolmogorov-Smirnov Z value | Kolmogorov-Smirnov P value |
| --- | --- | --- | --- | --- |
| Age (year) | 50.86 (12.75) | 47.00 (14.96) | 5.768 | < 0.001 |
| Gender (male/female) | 1907/615 | 4971/3015 | 5.853 | < 0.001 |
| BMI (kg/m²) | 26.02 (2.74) | 22.48 (2.72) | 21.99 | < 0.001 |
| ALT (U/L) | 23.00 (16.00–34.00) | 13.00 (10.00–19.00) | 17.451 | < 0.001 |
| AST (U/L) | 23.00 (19.00–30.00) | 20.00 (16.00–24.00) | 11.35 | < 0.001 |
| ALP (U/L) | 83.00 (71.00–99.00) | 77.00 (64.25–91.00) | 5.837 | < 0.001 |
| γ-GT (U/L) | 31.00 (22.00–47.00) | 17.00 (13.00–26.00) | 18.006 | < 0.001 |
| TB (µmol/L) | 12.90 (10.20–16.40) | 12.20 (9.60–16.10) | 2.93 | < 0.001 |
| DB (µmol/L) | 4.10 (3.50–5.10) | 3.90 (3.20–4.80) | 4.935 | < 0.001 |
| IB (µmol/L) | 8.80 (6.70–11.50) | 8.60 (6.30–11.30) | 2.026 | 0.001 |
| TC (mmol/L) | 5.08 (4.51–5.72) | 4.72 (4.17–5.30) | 7.579 | < 0.001 |
| TG (mmol/L) | 1.63 (1.18–2.23) | 0.96 (0.71–1.36) | 18.113 | < 0.001 |
| HDL (mmol/L) | 1.34 (1.18–1.53) | 1.53 (1.32–1.78) | 11.252 | < 0.001 |
| LDL (mmol/L) | 2.85 (2.35–3.35) | 2.60 (2.14–3.08) | 6.456 | < 0.001 |
| Bun (mmol/L) | 4.98 (4.24–5.85) | 4.93 (4.18–5.82) | 1.33 | 0.058 |
| Cr (mmol/L) | 68.00 (59.00–77.00) | 66.00 (56.00–75.00) | 3.314 | < 0.001 |
| Glu (mmol/L) | 5.11 (4.75–5.65) | 4.88 (4.57–5.24) | 8.037 | < 0.001 |
| Uric (µmol/L) | 367.32 (80.48) | 312.36 (53.31) | 12.196 | < 0.001 |

From Table 3, we can see that there are 2522 samples with NAFLD present, accounting for about a quarter of the total, so the data are not severely unbalanced. When there is only one number in parentheses, it is the standard deviation and the number outside is the mean; for example, when NAFLD was present, the mean BMI was 26.02 with a standard deviation of 2.74. When there are two numbers in parentheses, they are the lower and upper quartiles and the number outside is the median; for instance, when NAFLD was absent, the median ALT was 13, with lower and upper quartiles of 10 and 19. When NAFLD was present, there were more males than females, and the medians of all other variables were larger, except for HDL.

Performing the Kolmogorov-Smirnov Z test on the data, we found a Z value of 1.33 and a P value of 0.058 (> 0.05) for the Bun variable, indicating no significant difference between the presence and absence of NAFLD for Bun. The P values for the remaining variables were less than 0.05, indicating significant differences between the presence and absence of NAFLD for those variables.
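A two-sample Kolmogorov-Smirnov comparison of this kind can be reproduced with scipy's ks_2samp; the sketch below uses synthetic BMI-like values loosely based on the means and standard deviations of Table 3, not the actual study data:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
# Synthetic stand-in: BMI-like values for NAFLD-present vs. NAFLD-absent
# groups, using roughly the Table 3 means and standard deviations.
bmi_present = rng.normal(26.0, 2.7, size=500)
bmi_absent = rng.normal(22.5, 2.7, size=1500)

# Two-sample KS test: a small P value means the two empirical
# distributions differ significantly.
stat, p_value = ks_2samp(bmi_present, bmi_absent)
```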

Using the heatmap function in Python's seaborn package, correlation analysis was then conducted on the 19 variables. The heatmap of the 19 variables is shown in Fig. 5.

We chose the Pearson correlation coefficient, which ranges from −1 to 1. The larger the absolute value of the coefficient, and the darker the color in the heatmap, the stronger the correlation. From Fig. 5, we can see that the correlations among the three variables IB, DB, and TB are strong: the correlation between IB and TB is 0.98, between DB and TB is 0.87, and between IB and DB is 0.78. LDL and TC were highly correlated, with a correlation of 0.89. The correlation between ALT and AST was also very high, reaching 0.81.
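The Pearson coefficients that the heatmap visualizes can be computed with numpy's corrcoef; the sketch below uses synthetic bilirubin-like columns (with TB constructed as IB + DB) simply to show why the three bilirubin variables correlate strongly:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic columns mimicking related analytes: indirect (IB) and direct
# (DB) bilirubin, with total bilirubin (TB) as the sum of its fractions.
ib = rng.normal(8.8, 2.0, size=1000)
db = rng.normal(4.1, 0.8, size=1000)
tb = ib + db

# np.corrcoef treats each row as a variable and returns the symmetric
# Pearson correlation matrix (ones on the diagonal).
corr = np.corrcoef(np.vstack([ib, db, tb]))
```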

3.3. Feature Selection using PPFS

Variable selection is applied to the 18 independent variables (with Ultrasound as the dependent variable) using the PPFS algorithm. The PPFS algorithm automatically decides how many features to keep and attempts to find the best feature combination. Finally, 11 variables were retained; their feature importance is shown in Fig. 6.

Using the pairplot function in Python's seaborn package, a pairplot of the five most important variables is obtained, as shown in Fig. 7. Both the distribution plots on the diagonal and the class-colored scatter plots show that the distributions of the five variables ALT, BMI, TG, γGT, and LDL differ clearly between classes. In other words, these attributes help identify NAFLD.

3.4. Machine Learning

The data are normalized by the StandardScaler function in sklearn's preprocessing module. Then, the train_test_split function in the sklearn package randomly assigns 70% of the data to the training set and the remaining 30% to the testing set.
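A minimal sketch of this preprocessing step, on stand-in data for the 11 selected features (as one common practice, the scaler here is fitted on the training split only to avoid leaking test statistics):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 11))   # stand-in for the 11 selected features
y = rng.integers(0, 2, size=1000)

# 70% training / 30% testing split, as in the text.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Fit the scaler on the training set only, then apply it to both splits.
scaler = StandardScaler().fit(X_tr)
X_tr_s, X_te_s = scaler.transform(X_tr), scaler.transform(X_te)
```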

Ten basic machine learning models were used to classify the data. The cross_val_score and GridSearchCV functions in the sklearn package were used for parameter tuning to determine the optimal parameters achieving the highest score. GA, LR, RF, and NN had the highest accuracy among the 10 basic algorithms, so these four algorithms were integrated by the ensemble learning Voting method, using soft voting. The parameters of the GA, LR, RF, and NN algorithms are shown in Table 4.


Table 4

The parameters of the GA, LR, RF, and NN algorithms.

| Algorithm | Item | Parameter |
| --- | --- | --- |
| GA | generations | 5 |
| | population_size | 50 |
| | verbosity | 2 |
| LR | penalty | l1 |
| | solver | liblinear |
| | C | 0.9 |
| | max_iter | 200 |
| RF | n_estimators | 110 |
| | min_samples_split | 100 |
| | max_depth | 15 |
| | min_samples_leaf | 100 |
| NN | max_iter | 300 |
| | learning_rate | constant |
| | learning_rate_init | 0.001 |
| | alpha | 0.001 |
| | activation | relu |
| | solver | adam |
| | batch_size | auto |
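The tuning step can be sketched with GridSearchCV; the grid below covers only a few illustrative LR settings around the values of Table 4, scored by 10-fold cross-validation on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=400, n_features=11, random_state=0)

# Exhaustive search over a small parameter grid, with each candidate
# scored by 10-fold cross-validated accuracy.
grid = GridSearchCV(
    LogisticRegression(solver="liblinear", max_iter=200),
    param_grid={"penalty": ["l1", "l2"], "C": [0.5, 0.9, 1.5]},
    cv=10,
    scoring="accuracy",
).fit(X, y)

best_params = grid.best_params_   # e.g. the winning penalty and C
best_score = grid.best_score_     # its mean cross-validated accuracy
```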

Using the plot_confusion_matrix function in the sklearn package, the confusion matrices of the 11 algorithms were obtained; they are shown in Fig. 8.

The evaluation metrics of each algorithm were calculated from its confusion matrix. The results for the 11 machine learning algorithms are shown in Table 5.


Table 5

The results of the evaluation metrics for the 11 machine learning algorithms.

| Algorithm | Accuracy | Recall | Precision | F1 | AUC |
| --- | --- | --- | --- | --- | --- |
| LR | 0.839741 | 0.517516 | 0.733634 | 0.606909 | 0.884480 |
| RF | 0.833270 | 0.500000 | 0.716895 | 0.589118 | 0.884459 |
| SVM | 0.811953 | 0.713376 | 0.587927 | 0.644604 | 0.866256 |
| DT | 0.778074 | 0.552548 | 0.534669 | 0.543461 | 0.713757 |
| LightGBM | 0.826037 | 0.585987 | 0.651327 | 0.616932 | 0.874366 |
| CatBoost | 0.791397 | 0.824841 | 0.541841 | 0.654040 | 0.890978 |
| NN | 0.843167 | 0.571656 | 0.715139 | 0.635398 | 0.888196 |
| KNN | 0.811572 | 0.527070 | 0.625709 | 0.572169 | 0.822654 |
| BN | 0.778835 | 0.804140 | 0.524403 | 0.634821 | 0.867345 |
| GA | 0.840502 | 0.511146 | 0.741339 | 0.605090 | 0.891156 |
| Voting | 0.846212 | 0.573248 | 0.725806 | 0.640569 | 0.894010 |

The algorithms performed differently. Among the 11 algorithms, the proposed Voting algorithm achieves the best accuracy (0.846212) and the best AUC (0.894010). GA achieves the best precision (0.741339), while CatBoost achieves the best recall (0.824841) and the best \({F}_{1}\) score (0.654040). The AUC is the most important evaluation metric, as further explained in the discussion section. Of the 11 machine learning algorithms, the proposed Voting algorithm demonstrated the best overall performance, achieving accuracy, recall, precision, \({F}_{1}\) score, and AUC of up to 0.846212, 0.573248, 0.725806, 0.640569, and 0.894010, respectively.

From the ROC curves in Fig. 9, the DT algorithm performs worst, followed by the KNN algorithm. When the ROC curves of classifiers are close to each other, the curves alone do not clearly indicate which classifier is better, so the AUC, the area under the ROC curve, is used as the evaluation criterion: the larger the AUC, the better the classifier. From the AUC values in the lower right corner of Fig. 9, the highest AUC among the single classification algorithms is that of GA (0.8912), while the proposed Voting algorithm achieves the highest AUC overall (0.8940).
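The ROC curve and its AUC can be computed with sklearn's roc_curve and auc; the labels and predicted probabilities below are illustrative only:

```python
from sklearn.metrics import auc, roc_auc_score, roc_curve

# Illustrative labels and predicted positive-class probabilities.
y_true = [0, 0, 1, 1, 0, 1, 0, 1, 1, 0]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.3, 0.9, 0.6, 0.5]

# roc_curve sweeps the decision threshold and returns the (FPR, TPR)
# points of the curve; auc integrates the area under it.
fpr, tpr, _ = roc_curve(y_true, y_score)
roc_auc = auc(fpr, tpr)
```

For binary problems, roc_auc_score(y_true, y_score) computes the same quantity in one call.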

A randomly selected sample from the test data is input into the Voting model, and the LimeTabularExplainer function of Python's lime package is used to obtain interpretation information about the model. The interpretation of the Voting model is shown in Fig. 10. The positive predictive probability of this sample was 0.84 (> 0.5), so it was classified as NAFLD present. The top four features contributing to classifying this sample as positive were BMI, ALT, TG, and γGT, and the feature contributing most to classifying it as negative was LDL.

3.5. Comparison with Existing Studies

The proposed study was compared with relevant studies to demonstrate its reliability in the screening diagnosis of NAFLD, and Table 6 shows this comparison. According to the comparison results, the proposed Voting algorithm demonstrated the best performance.


Table 6

Comparison with existing studies.

| Year | Reference | Result |
| --- | --- | --- |
| 2018 | [9] | The highest accuracy is 0.8341, for the LR algorithm |
| 2019 | [36] | The AUC of the LR algorithm is 0.73 |
| 2020 | [37] | The highest AUC is 0.824809, for the RF algorithm |
| 2021 | [38] | The accuracy of the SVM algorithm is 0.71 |
| 2021 | [39] | The accuracy of the NN algorithm is 0.77; its AUC is 0.82 |
| 2022 | [40] | The highest accuracy is 0.79 (RF); the highest AUC is 0.84 (ElasticNet) |
| | Proposed Voting algorithm | The accuracy is 0.846212 and the AUC is 0.8940 |

4. Discussion

We used 11 of the most advanced machine learning algorithms to evaluate the best clinical prediction model for NAFLD. Based on the PPFS variable weight scores, the top 5 most discriminative features were ALT, BMI, TG, γGT, and LDL; therefore, we can focus more on these 5 features. The machine learning prediction results show that the proposed Voting algorithm has the best performance. It achieved accuracy, recall, precision, \({F}_{1}\) score, and AUC of up to 0.846212, 0.573248, 0.725806, 0.640569, and 0.894010, respectively.

Machine learning methods can identify patterns from data and build accurate predictive models for classification. Ensemble learning is a new machine learning method that can integrate multiple identical or different machine learning algorithms to obtain an optimal classification prediction model. Based on soft voting, four machine learning models, GA, LR, RF, and NN, are integrated to obtain the optimal Voting model. Nevertheless, our models have some limitations, such as the lack of model interpretability.

In this research, NAFLD was diagnosed by ultrasonography. Ultrasonography does not assess the severity of NAFLD and is not the gold standard for its diagnosis. Despite these limitations, ultrasonography is the most commonly used method and is reasonably accurate. In a future study, we intend to validate the predictive ability of the machine learning model against biopsy results.

5. Conclusions

The new machine learning methods can provide good screening and prediction of NAFLD. The application of these machine learning methods may enhance empirically based clinical decision-making, improve early diagnosis rates and reduce terminal complications.

Declarations

Acknowledgements: This research was funded by the National Natural Science Foundation of China (61763008, 71762008, 62166015) and the Guangxi Science and Technology Planning Project (2018GXNSFAA294131, 2018GXNSFAA050005).

Author Contributions: G. C. contributed to the conception and design of the work. G. C. and H. Z. contributed to the acquisition, analysis, and interpretation of the data and drafted the manuscript. All authors reviewed the manuscript.

Competing Interests: The authors declare no competing interests.

Institutional Review Board Statement: The study was approved by the Ethics Committee of the Guilin University of Technology, China.

Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement: The data used to support the findings of this study are available from the corresponding author upon request.

References

  1. Chalasani N, Younossi Z, Lavine J E, et al. The diagnosis and management of non‐alcoholic fatty liver disease: Practice Guideline by the American Association for the Study of Liver Diseases, American College of Gastroenterology, and the American Gastroenterological Association[J]. Hepatology, 2012, 55(6): 2005-2023.
  2. Williams C D, Stengel J, Asike M I, et al. Prevalence of nonalcoholic fatty liver disease and nonalcoholic steatohepatitis among a largely middle-aged population utilizing ultrasound and liver biopsy: a prospective study[J]. Gastroenterology, 2011, 140(1): 124-131.
  3. Sanyal A J, Brunt E M, Kleiner D E, et al. Endpoints and clinical trial design for nonalcoholic steatohepatitis[J]. Hepatology, 2011, 54(1): 344-353.
  4. Estes C, Anstee Q M, Arias-Loste M T, et al. Modeling nafld disease burden in china, france, germany, italy, japan, spain, united kingdom, and united states for the period 2016–2030[J]. Journal of hepatology, 2018, 69(4): 896-904.
  5. Bedogni G, Bellentani S, Miglioli L, et al. The Fatty Liver Index: a simple and accurate predictor of hepatic steatosis in the general population[J]. BMC gastroenterology, 2006, 6(1): 1-7.
  6. Wang J, Xu C, Xun Y, et al. ZJU index: a novel model for predicting nonalcoholic fatty liver disease in a Chinese population[J]. Scientific reports, 2015, 5(1): 1-10.
  7. Lee J H, Kim D, Kim H J, et al. Hepatic steatosis index: a simple screening tool reflecting nonalcoholic fatty liver disease[J]. Digestive and Liver Disease, 2010, 42(7): 503-508.
  8. Wieckowska A, Feldstein A E. Diagnosis of nonalcoholic fatty liver disease: invasive versus noninvasive[C]//Seminars in liver disease. © Thieme Medical Publishers, 2008, 28(04): 386-395.
  9. Ma H, Xu C, Shen Z, et al. Application of machine learning techniques for clinical predictive modeling: a cross-sectional study on nonalcoholic fatty liver disease in China[J]. BioMed research international, 2018, 2018.
  10. Yoo T K, Kim S K, Kim D W, et al. Osteoporosis risk prediction for bone mineral density assessment of postmenopausal women using machine learning[J]. Yonsei medical journal, 2013, 54(6): 1321-1330.
  11. Choi S B, Kim W J, Yoo T K, et al. Screening for prediabetes using machine learning models[J]. Computational and mathematical methods in medicine, 2014, 2014.
  12. Lee C L, Liu W J, Tsai S F. Development and Validation of an Insulin Resistance Model for a Population with Chronic Kidney Disease Using a Machine Learning Approach[J]. Nutrients, 2022, 14(14): 2832.
  13. Fan J G, Jia J D, Li Y M, et al. Guidelines for the diagnosis and management of nonalcoholic fatty liver disease: update 2010:(published in Chinese on Chinese Journal of Hepatology 2010, 18: 163-166)[J]. Journal of digestive diseases, 2011, 12(1): 38-44.
  14. Hassan A, Paik J H, Khare S, et al. PPFS: Predictive Permutation Feature Selection[J]. arXiv preprint arXiv:2110.10713, 2021.
  15. Wang Y, Gao X, Ru X, et al. Identification of gene signatures for COAD using feature selection and Bayesian network approaches[J]. Scientific Reports, 2022, 12(1): 1-13.
  16. Sumner M, Frank E, Hall M. Speeding up logistic model tree induction[C]//European conference on principles of data mining and knowledge discovery. Springer, Berlin, Heidelberg, 2005: 675-683.
  17. Breiman L. Random forests[J]. Machine learning, 2001, 45(1): 5-32.
  18. Mining W I D. Data mining: Concepts and techniques[J]. Morgan Kaufmann, 2006, 10: 559-569.
  19. Jiang L, Li C, Cai Z. Learning decision tree for ranking[J]. Knowledge and Information Systems, 2009, 20(1): 123-135.
  20. Ke G, Meng Q, Finley T, et al. Lightgbm: A highly efficient gradient boosting decision tree[J]. Advances in neural information processing systems, 2017, 30.
  21. Veronika Dorogush A, Ershov V, Gulin A. CatBoost: gradient boosting with categorical features support[J]. arXiv e-prints, 2018: arXiv: 1810.11363.
  22. Glorot X, Bengio Y. Understanding the difficulty of training deep feedforward neural networks[C]//Proceedings of the thirteenth international conference on artificial intelligence and statistics. JMLR Workshop and Conference Proceedings, 2010: 249-256.
  23. Mining W I D. Data mining: Concepts and techniques[J]. Morgan Kaufmann, 2006, 10: 559-569.
  24. Jiang L, Li C, Wang S. Cost-sensitive Bayesian network classifiers[J]. Pattern Recognition Letters, 2014, 45: 211-216.
  25. Le T T, Fu W, Moore J H. Scaling tree-based automated machine learning to biomedical big data with a feature set selector[J]. Bioinformatics, 2020, 36(1): 250-256.
  26. Yang N C, Ismail H. Voting-based ensemble learning algorithm for fault detection in photovoltaic systems under different weather conditions[J]. Mathematics, 2022, 10(2): 285.
  27. Yan B, Ye X, Wang J, et al. An Algorithm Framework for Drug-Induced Liver Injury Prediction Based on Genetic Algorithm and Ensemble Learning[J]. Molecules, 2022, 27(10): 3112.
  28. Husain A, Khan M H. Early diabetes prediction using voting based ensemble learning[C]//International conference on advances in computing and data sciences. Springer, Singapore, 2018: 95-103.
  29. Mining W I D. Data mining: Concepts and techniques[J]. Morgan Kaufmann, 2006, 10: 559-569.
  30. Koller D, Friedman N. Probabilistic Graphical Models: Principles and Techniques, ser. Adaptive computation and machine learning[J]. MIT Press, 2009, 11: 16-19.
  31. Cover T M. Elements of information theory[M]. John Wiley & Sons, 1999.
  32. Friedman N, Geiger D, Goldszmidt M. Bayesian network classifiers[J]. Machine learning, 1997, 29(2): 131-163.
  33. Systique H. Machine learning based network anomaly detection[J]. Int. J. Recent Technol. Eng, 2019, 8: 542-548.
  34. Fawcett T. An introduction to ROC analysis[J]. Pattern recognition letters, 2006, 27(8): 861-874.
  35. Hand D J, Till R J. A simple generalisation of the area under the ROC curve for multiple class classification problems[J]. Machine learning, 2001, 45(2): 171-186.
  36. Canbay A, Kälsch J, Neumann U, et al. Non-invasive assessment of NAFLD as systemic disease—a machine learning perspective[J]. PloS one, 2019, 14(3): e0214436.
  37. Bangash A H. Leveraging AutoML to provide NAFLD screening diagnosis: Proposed machine learning models[J]. medRxiv, 2020.
  38. Panigrahi S, Deo R, Liechty E A. A New Machine Learning-Based Complementary Approach for Screening of NAFLD (Hepatic Steatosis)[C]//2021 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC). IEEE, 2021: 2343-2346.
  39. Sorino P, Campanella A, Bonfiglio C, et al. Development and validation of a neural network for NAFLD diagnosis[J]. Scientific Reports, 2021, 11(1): 1-13.
  40. Noureddin M, Ntanios F, Malhotra D, et al. Predicting NAFLD prevalence in the United States using National Health and Nutrition Examination Survey 2017–2018 transient elastography data and application of machine learning[J]. Hepatology Communications, 2022.