Machine learning has recently been widely applied in clinical medicine, for example in outcome prediction, diagnosis, and image interpretation [14–18]. In the present study, we applied machine learning models to predict IVIG resistance to the initial KD treatment in the Yamanashi cohort study, in which clinical data of 996 cases were available. Because the dataset was imbalanced with respect to IVIG resistance, we applied SMOTE [28, 29] and confirmed good discriminating ability for predicting IVIG resistance. To bring the prediction accuracy of the machine learning model into clinical practice, we established a new scoring system (Yamanashi score) based on the findings in the SHAP plot [30–32] of the random forest model. Considering correlations among features, we selected the following five features from among the top six features with high SHAP values: days of illness at initial therapy and serum levels of CRP, sodium, total bilirubin, and total cholesterol. Surprisingly, this simple scoring system using five top-ranked features of the random forest model predicted IVIG resistance as accurately as the random forest model itself. Of the five features in the Yamanashi score, four were also included in the three major scoring systems [11–13]: serum CRP level was included in all three scoring systems (Gunma [11], Kurume [12], and Osaka [13]), days of illness at initial therapy was included in two scoring systems (Gunma [11] and Kurume [12]), serum sodium level was included in the Gunma score [11], and serum total bilirubin level was included in the Osaka score [13]. In contrast, serum total cholesterol level was not included in any of the three previously established scoring systems. Using the 450 cases of the Yamanashi cohort study, we confirmed that the Yamanashi score was as reliable as the Gunma score and more reliable than the Kurume score and the Osaka score.
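To illustrate the oversampling idea behind SMOTE, the following is a minimal sketch in plain Python, not the implementation used in the present study. It generates each synthetic minority-class sample by interpolating between a randomly chosen minority sample and one of its nearest minority-class neighbours. The function name `smote`, its parameters, and the toy feature values are hypothetical and chosen only for illustration.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Illustrative SMOTE-style oversampling (hypothetical helper).

    Each synthetic sample lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class with two made-up feature values per sample.
resistant = [[1.0, 2.0], [1.5, 2.5], [2.0, 2.0], [1.2, 1.8]]
new_points = smote(resistant, n_synthetic=6, k=3)
```

Because every synthetic sample is a convex combination of two observed minority samples, the new points stay within the range of the observed minority class. In a real analysis, oversampling of this kind would normally be applied to the training data only, so that the test set remains free of synthetic cases.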
Among the five features in the Yamanashi score, the serum level of total cholesterol was distinctive in that it was not included in any of the three commonly used scoring systems [11–13]. In the SHAP dependence plot of the present study, a serum total cholesterol level lower than approximately 130 mg/dL was associated with a higher risk of IVIG resistance. This machine learning finding appears to be consistent with a previous finding that serum total cholesterol levels decreased in the acute phase of KD due to abnormal lipid metabolism [35]. In particular, a recent report by Shao et al [36] revealed, in a single-center prospective cohort study, that the serum total cholesterol level before the initial IVIG treatment was significantly lower in cases of IVIG resistance. Although the mechanism underlying the association between dyslipidemia and the severity of systemic inflammation in KD remains unclear, a recent study by Zhang et al [37] revealed that dyslipidemia during the acute phase of KD was associated with aberrant levels of adipokines including adiponectin, omentin-1, and chemerin. In the above study by Shao et al [36], alterations in other lipid parameters were also associated with IVIG resistance: a higher level of triglyceride and lower levels of high-density lipoprotein cholesterol, low-density lipoprotein cholesterol, and apolipoprotein A. Thus, although the lipid profile was not fully evaluated in the present study, dyslipidemia due to systemic inflammation in the acute phase of KD may be a rational explanation for the usefulness of the serum total cholesterol level as one of the predictors of IVIG resistance in the Yamanashi score.
This study has several limitations. First, the prediction values of each machine learning model were similar to those of the logistic regression model and the Gunma score. To further improve the prediction values of the random forest model, we also used 50% of the samples as the training set. However, only partial improvement was observed with this half-split train-test design in our cohort (Supplemental Fig. 3, Supplemental Table 5). Second, since the majority of the subjects in the present study were of Japanese ethnicity, further validation is required before the present scoring system can be applied to other ethnicities and different populations. Third, although the patients were treated with a standardized protocol, the study was based on retrospective data collection from a number of hospitals. Fourth, several known predictive factors such as the neutrophil-to-lymphocyte and platelet-to-lymphocyte ratios [38] were not evaluated. Recently, the utility of the coagulation profile [39] and genetic variants of the interleukin gene [40] has also been reported. Thus, machine learning using these factors as additional variables might improve the accuracy. Feature engineering of clinical variables is another possible way to further improve the accuracy [41]. Fifth, insufficient reduction in the serum CRP level was additionally included in the definition of IVIG resistance in the present study, whereas only persistent fever was evaluated in many studies [42, 43].
In conclusion, we implemented machine learning algorithms to predict IVIG resistance in KD patients and confirmed their potential. Moreover, using five top-ranked features of the random forest model, we designed a simple scoring system to predict IVIG resistance. Of note, in spite of its simplicity, the scoring system predicted IVIG resistance as accurately as the machine learning approach. It should also be noted that the widely used Gunma score is just as reliable as the machine learning models at this stage.