A Simple Scoring Model Based on Machine Learning Predicts Intravenous Immunoglobulin Resistance in Kawasaki Disease

doi:10.21203/rs.3.rs-1215051/v2

Research Article

This work is licensed under a CC BY 4.0 License

Version 2

In Kawasaki disease (KD), accurate prediction of intravenous immunoglobulin (IVIG) resistance is crucial to reduce the risk of developing coronary artery lesions. To establish a simple and accurate scoring model for predicting IVIG resistance, we conducted a retrospective cohort study of 996 KD patients diagnosed at 12 facilities over 10 years, of whom 225 (22.6%) were resistant to initial IVIG treatment. We performed machine learning with a random forest model using 30 clinical variables at diagnosis, with 796 and 200 cases as the training and test datasets, respectively. The random forest model accurately predicted IVIG resistance (AUC: 0.75, sensitivity: 0.54, specificity: 0.80). Next, using the top five influential features in the random forest model (days of illness at initial therapy and serum levels of C-reactive protein, sodium, total bilirubin, and total cholesterol), we designed a simple scoring system. In spite of its simplicity, the scoring system predicted IVIG resistance (AUC: 0.73, sensitivity: 0.55, specificity: 0.83) as accurately as the random forest model itself. Moreover, the accuracy of our scoring system with five clinical features was almost identical to that of the Gunma score with seven clinical features (AUC: 0.73, sensitivity: 0.53, specificity: 0.83), a well-known logistic regression scoring model, and superior to that of two other widely used scores (Kurume score: 0.67, 0.46, and 0.76, respectively; Osaka score: 0.69, 0.33, and 0.84, respectively).

Conclusions: Both machine learning itself and our simple scoring system derived from its findings appear to be useful for accurately predicting IVIG resistance in KD patients.

Kawasaki disease

machine learning

random forest

Intravenous Immunoglobulin resistance

Shapley additive explanation

What is Known

・Approximately 20% of KD patients are refractory to IVIG therapy and at a high risk of developing coronary artery lesions.

・To reduce development of coronary artery lesions in KD patients, accurate prediction of IVIG resistance is a key issue.

What is New

・The scoring system using five clinical features (days of illness at initial therapy, serum levels of CRP, sodium, total bilirubin, and total cholesterol at diagnosis) identified by the random forest model of machine learning efficiently predicted IVIG resistance.

Kawasaki disease (KD) is an acute febrile illness of infants and children. Of clinical importance, it is characterized by systemic vasculitis that affects the small arteries, especially the coronary arteries [1, 2]. To avoid the development of coronary artery lesions (CAL), high-dose (2 g/kg) intravenous immunoglobulin (IVIG) therapy has been established as the standard initial treatment for KD patients in the acute phase [2, 3]. However, approximately 20% of KD patients are resistant to the initial IVIG treatment [3], and IVIG resistance is a typical risk factor for developing CAL [1, 4–7]. Under these circumstances, several recent studies have shown a possible clinical benefit of intensive initial IVIG therapy combined with other anti-inflammatory agents for high-risk KD patients [6, 8–10]. For effective pre-treatment risk stratification, it is crucial to establish a scoring system that accurately predicts IVIG resistance at the time of clinical diagnosis of KD. Currently, there are several widely used Japanese scoring models for predicting IVIG resistance: the Gunma score proposed by Kobayashi et al [11], the Kurume score proposed by Egami et al [12], and the Osaka score proposed by Sano et al [13]. These scoring systems were developed by logistic regression analysis of clinical profiles and laboratory findings before initial treatment, which were selected based on statistical assumptions.

To establish a more reliable and simple scoring system for the prediction of IVIG resistance in KD patients, an alternative approach using large data repositories is required. The recently developed machine learning approach has shown great potential for assisting clinical diagnosis and predicting outcomes [14–19]. Two recent studies applied machine learning to predict IVIG resistance in KD patients and confirmed its usefulness [14, 15]. However, both studies had limitations: one included only a limited number of KD patients (n = 98) from a single institute [14], and the other included a relatively large number of KD patients (n = 497) who were treated with two different IVIG protocols [15]. Moreover, in clinical practice, even if machine learning has a high degree of accuracy, a simple scoring system is more convenient for risk-stratified treatment.

In the present study, we applied machine learning to predict IVIG resistance in 996 KD cases treated with a single IVIG protocol at multiple institutes. Subsequently, using the five most important features associated with IVIG resistance in the machine learning model, we developed a new scoring system and confirmed its utility by comparison with the three representative scoring systems.

Study participants

The study is a retrospective review of a multicenter registration database of 1134 consecutive KD patients who were diagnosed between June 2010 and December 2020 at 12 inpatient facilities for the care of pediatric patients, as listed in Supplemental Table 1. The diagnosis of KD was retrospectively confirmed based on the criteria defined in the fifth edition of the Japanese Kawasaki Disease Diagnostic Guidelines [20]. In brief, a diagnosis was made when the patient had at least five of the six major symptoms (fever, conjunctival congestion, oral mucosa alteration, cervical lymphadenopathy, swelling of extremities, and polymorphous rash), or when the patient had four major symptoms with the development of CAL. Development of CAL was defined by quantifying the internal coronary artery dimension as per the Japanese Ministry of Health Criteria (a maximum absolute internal diameter > 3 mm in children < 5 years of age, or > 4 mm in children 5 years and older, or a segment 1.5 times greater than an adjacent segment, or the presence of luminal irregularity) and whenever the body surface area-adjusted Z score of any coronary artery (including the left main, left anterior descending, left circumflex, and right coronary arteries) was ≥ +2.5 [2]. The facilities included all 11 pediatric inpatient facilities in Yamanashi Prefecture and 1 facility in Nagano Prefecture in Japan. The registration database was constructed from anonymized clinical records of all diagnosed KD cases in each hospital, which were collected at the end of every year. The study was performed with the approval of the Research Ethics Committee of University of Yamanashi Hospital (Approval Number 1698).
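The diagnostic rule summarized above can be written as a short check. This is only an illustrative sketch: the symptom identifiers and the function name `meets_kd_criteria` are hypothetical, and the presence of CAL is passed in as a flag rather than derived from echocardiographic measurements.

```python
# Illustrative sketch of the diagnostic rule described in the text.
# Symptom names and the function name are hypothetical labels.
MAJOR_SYMPTOMS = frozenset({
    "fever",
    "conjunctival_congestion",
    "oral_mucosa_alteration",
    "cervical_lymphadenopathy",
    "swelling_of_extremities",
    "polymorphous_rash",
})


def meets_kd_criteria(symptoms, has_cal):
    """Return True if at least five of the six major symptoms are present,
    or if four are present together with coronary artery lesions (CAL)."""
    n_major = len(MAJOR_SYMPTOMS & set(symptoms))
    return n_major >= 5 or (n_major == 4 and has_cal)


four = ["fever", "polymorphous_rash", "conjunctival_congestion",
        "oral_mucosa_alteration"]
print(meets_kd_criteria(four, has_cal=True))
print(meets_kd_criteria(four, has_cal=False))
```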

Treatment of Kawasaki disease

All of the patients were treated identically with a first-line regimen of 2 g/kg/dose of IVIG in combination with 30 mg/kg of oral aspirin immediately after the diagnosis of KD was made based on the above criteria. The standardized treatment workflow was confirmed by each facility at a meeting every year. IVIG therapy was completed within 24 hours after the diagnosis of KD in all of the patients. The response to the initial treatment was evaluated 48 hours after initiation of IVIG administration and was considered as ‘IVIG resistance’ when the body temperature was over 37.5°C and the serum level of C-reactive protein (CRP) was higher than half of the peak value. Patients resistant to the initial IVIG were treated with second-line therapy comprising an additional 2 g/kg/dose of IVIG or 5 mg/kg of intravenous infliximab [21]. In addition, when patients were considered to be resistant to the second-line therapy, plasma exchange was carried out after the patient was transferred to University of Yamanashi Hospital [22].
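The definition of IVIG resistance used here combines two conditions, which can be expressed as a small predicate. The function and parameter names are hypothetical, and it is an assumption that the CRP value compared against half of the peak is the one measured at the 48-hour evaluation.

```python
# Illustrative sketch of the resistance definition; names are hypothetical.
def is_ivig_resistant(temp_c_at_48h, crp_at_48h, crp_peak):
    """Return True when, 48 hours after starting IVIG, the body temperature
    is over 37.5 degrees C AND CRP is still higher than half of its peak."""
    persistent_fever = temp_c_at_48h > 37.5
    insufficient_crp_drop = crp_at_48h > 0.5 * crp_peak
    return persistent_fever and insufficient_crp_drop


# Hypothetical patients (values are not taken from the study).
print(is_ivig_resistant(38.2, 7.0, 10.0))   # fever persists, CRP above half of peak
print(is_ivig_resistant(38.2, 3.0, 10.0))   # fever persists, CRP fell below half
```

Because both conditions are required, a patient with persistent fever but a CRP level that has fallen to half of the peak or lower is not classified as resistant under this definition.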

Machine learning

The predictors for IVIG resistance were chosen from routinely available data, including 6 demographic variables, 22 laboratory data, and 2 echocardiographic parameters at diagnosis, as listed in Table 1. Any missing laboratory or echocardiographic value was imputed with the median value for the machine learning. We used the random forest model, which is a tree-based, nonparametric method requiring no assumption about data distribution [23]. We also performed logistic regression analysis, support vector machine (SVM) analysis [24], eXtreme Gradient Boosting (XGBoost) [25], and a neural network [26]. We trained each machine learning model on the training set (approximately 80% of the random sample) using scikit-learn in Python (version 3.8.3). The optimal parameters (the number of trees and the maximum depth of the tree) were determined by k-fold cross-validation (k = 10) according to the best area under the ROC curve (AUC) in the validation set (approximately 20% of the random sample), as in previous studies [14, 15] (Supplemental Fig. 1) [27]. Because the dataset was imbalanced with respect to the IVIG response, we used the synthetic minority over-sampling technique (SMOTE), which over-samples the minority class [28, 29].
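Two of the preprocessing steps described above, median imputation and minority over-sampling, can be illustrated in plain Python. This is a simplified sketch and not the study's code: the helper names `impute_median` and `smote_like` are hypothetical, the data are made up, and `smote_like` only mimics the core idea of SMOTE (interpolating between a minority sample and one of its nearest minority neighbours).

```python
import random
from statistics import median


def impute_median(rows):
    """Replace None in each column with the median of that column's observed values."""
    n_cols = len(rows[0])
    medians = [median(r[j] for r in rows if r[j] is not None) for j in range(n_cols)]
    return [[medians[j] if r[j] is None else r[j] for j in range(n_cols)] for r in rows]


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority rows by linear interpolation between a
    randomly chosen minority row and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (r for r in minority if r is not base),
            key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


# Hypothetical rows of [CRP (mg/dL), sodium (mmol/L)]; None marks a missing value.
rows = [[8.2, 133.0], [None, 135.0], [5.1, None], [12.4, 130.0]]
imputed = impute_median(rows)
print(imputed)

minority = [[12.4, 130.0], [10.1, 131.0], [9.0, 129.0]]
print(smote_like(minority, n_new=3))
```

Each synthetic row lies on a line segment between two real minority rows, so it stays within the range of the observed minority data.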

Development of the scoring system

To develop the simple scoring system for predicting IVIG resistance, we selected the features that most influenced the prediction of the random forest model. We used Shapley additive explanation (SHAP), which is a unified approach for explaining the output of a machine learning model [30–32]. SHAP values quantify the contribution of each feature to the model output, and a higher SHAP value indicates that a feature has a larger impact on the model [15, 33]. To determine the cutoff level of each variable, we used the SHAP dependence plot, which shows how the value of each feature relates to its contribution to the output of the random forest model [18]. Based on the SHAP values, we constructed a new predictive scoring model (Yamanashi score). To validate the accuracy of the new scoring system, we applied it to the above Yamanashi study cohort and compared it with three previously established scoring systems.
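The idea behind Shapley values can be shown on a toy model. The sketch below computes exact Shapley values by averaging each feature's marginal contribution over all feature orderings, relative to a single baseline patient. It is not the SHAP implementation used for the random forest in this study; `toy_model`, its coefficients, and both patients are hypothetical.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline),
    obtained by averaging marginal contributions over all feature orderings."""
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        prev = predict(current)
        for j in order:
            current[j] = x[j]          # switch feature j from baseline to x
            new = predict(current)
            phi[j] += new - prev       # marginal contribution of feature j
            prev = new
    return [p / len(orderings) for p in phi]


def toy_model(features):
    """Hypothetical risk function with an interaction between start day and CRP."""
    start_day, crp, sodium = features
    risk = 0.10
    if start_day <= 4:
        risk += 0.20
    if crp >= 8:
        risk += 0.15
    if start_day <= 4 and crp >= 8:
        risk += 0.10
    if sodium <= 132:
        risk += 0.05
    return risk


patient = [3, 10.0, 130]      # hypothetical high-risk patient
reference = [6, 4.0, 138]     # hypothetical baseline patient
phi = shapley_values(toy_model, patient, reference)
print(phi)
print(sum(phi), toy_model(patient) - toy_model(reference))
```

The contributions add up to the difference between the prediction for the patient and the prediction for the baseline, which is the additivity property that makes SHAP values comparable across features. The interaction term is shared equally between the two interacting features.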

Statistical analysis

Statistical analyses were performed using EZR software (version 1.41) [34] and Python software (version 3.8.3). Spearman's correlation coefficient was used to analyze the correlation of variables. Creation and comparison of the Receiver Operating Characteristic (ROC) curves were performed by using EZR software.
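For reference, Spearman's correlation coefficient is the Pearson correlation of the ranks of the two variables. The following is a minimal sketch with tie handling by average ranks; the function names and the example values are hypothetical and are not taken from the study data.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1          # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation coefficient."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical sodium and chloride values (mmol/L).
sodium = [130, 133, 135, 138, 140]
chloride = [95, 99, 98, 102, 105]
print(spearman(sodium, chloride))
```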

Prediction of IVIG resistance by machine learning

From June 2010 to December 2020, 1134 consecutive KD cases were enrolled in the Yamanashi study cohort. In the present study, 138 cases were excluded from further analyses due to a diagnosis of incomplete KD (n = 129), severe lack of laboratory data (n = 1), or delayed IVIG treatment after 10 days of onset (n = 8) (Fig. 1). Of the remaining 996 cases, 225 cases (22.6%) were resistant to the first course of IVIG treatment. In the demographics of the 12 facilities (Supplemental Table 2), variations in day of illness at initial therapy (median: 5.2 days; range: 4.9–5.8) and IVIG resistance (22%, 14–34%) were largely acceptable. Prediction values of IVIG resistance in the three representative scores are summarized in Supplemental Table 3. We ran each machine learning model using the 6 demographic variables, 22 laboratory data, and 2 echocardiographic parameters at diagnosis listed in Table 1. The data of the 996 cases were divided at random into a training dataset of 796 cases (approximately 80%) and a test dataset of 200 cases (approximately 20%). Because the relatively low frequency of IVIG resistance made the dataset imbalanced for machine learning, we applied SMOTE [28, 29]. Prediction values and ROC curves for IVIG resistance in each model are summarized in Table 2 and Fig. 2a, respectively. The highest area under the ROC curve (AUC) was observed in the random forest model (0.75) (Fig. 2a). In the random forest model, global accuracy, sensitivity, specificity, positive predictive value, negative predictive value, positive likelihood ratio, and negative likelihood ratio were 0.74 [95% confidence interval (CI): 0.67–0.80], 0.54 (0.39–0.69), 0.80 (0.73–0.86), 0.45 (0.31–0.59), 0.85 (0.79–0.91), 2.70 (1.79–4.07), and 0.57 (0.41–0.79), respectively. These observations demonstrated that machine learning models achieved good discriminating abilities to predict IVIG resistance in KD patients.
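The prediction values reported above are all derived from a 2 × 2 confusion matrix. The sketch below shows the standard formulas; the function names are hypothetical and the counts in the first example are made up for illustration, not taken from the study. The second example applies the likelihood ratio formulas to the rounded sensitivity and specificity reported for the random forest model.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic accuracy measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# Hypothetical counts, for illustration only.
print(diagnostic_metrics(tp=6, fp=2, fn=4, tn=8))

# Using the rounded values reported in the text (sensitivity 0.54, specificity 0.80).
lr_pos, lr_neg = likelihood_ratios(0.54, 0.80)
print(round(lr_pos, 2), round(lr_neg, 3))
```

With the rounded inputs, the positive likelihood ratio is approximately 2.70 and the negative likelihood ratio is approximately 0.575, in line with the reported 2.70 and 0.57; small differences can arise because the reported values were presumably computed from unrounded counts.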

Development of scoring system to predict IVIG resistance

We next evaluated the top 20 features among the 30 items tested in the random forest model using SHAP (Fig. 2b). In the SHAP summary plot, the higher the SHAP value of a feature, the higher the probability of IVIG resistance. For each feature, each dot represents the feature attribution value of one patient, and red and blue dots represent higher and lower feature values, respectively. The feature with the highest SHAP value was days of illness at initial therapy (start day) (SHAP value [average of absolute value]: 0.069). Serum levels of CRP (0.051), sodium (0.047), chloride (0.033), total bilirubin (0.026), and total cholesterol (0.025) were the other five of the top six features. For the top six features, we evaluated Spearman's correlation coefficient for each pair of parameters (Fig. 2c). Significant correlation (R² > 0.2) was exclusively observed between serum chloride level and serum sodium level (R² = 0.46). Thus, to create a new scoring system to predict IVIG resistance, we selected the following five features: start day, CRP, sodium, total bilirubin, and total cholesterol. Then, to determine the cutoff level for each variable, we evaluated the SHAP dependence plots of the five features (Fig. 3). For days of illness at initial therapy (start day), day 4 or earlier was a high-risk value. In the same manner, the SHAP dependence plots revealed the following high-risk cutoff levels for the laboratory data: CRP ≥ 8 mg/dL, sodium ≤ 132 mmol/L, total bilirubin ≥ 0.8 mg/dL, and total cholesterol ≤ 130 mg/dL. Based on the results of the SHAP analysis, we constructed a simple scoring model (Yamanashi score, Supplemental Fig. 2). Among the five variables, 2 points were scored for each of the top two variables (start day and CRP), while 1 point was given for each of the other three variables (sodium, total bilirubin, and total cholesterol). Thus, the maximum total score was 7 points.
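The scoring rule described in this paragraph can be sketched as a small function. The function and parameter names are hypothetical, and the definitive form of the score is the one given in Supplemental Fig. 2; the cutoffs and point weights below are taken from the text. Treating a total of 4 points or more as high risk is an assumption about the direction of the 4-point cutoff used in the validation.

```python
def yamanashi_score(start_day, crp_mg_dl, sodium_mmol_l,
                    total_bilirubin_mg_dl, total_cholesterol_mg_dl):
    """Sketch of the Yamanashi score (0 to 7 points) from the cutoffs in the text."""
    score = 0
    score += 2 if start_day <= 4 else 0                  # day 4 of illness or earlier
    score += 2 if crp_mg_dl >= 8 else 0                  # CRP >= 8 mg/dL
    score += 1 if sodium_mmol_l <= 132 else 0            # sodium <= 132 mmol/L
    score += 1 if total_bilirubin_mg_dl >= 0.8 else 0    # total bilirubin >= 0.8 mg/dL
    score += 1 if total_cholesterol_mg_dl <= 130 else 0  # total cholesterol <= 130 mg/dL
    return score


def is_high_risk(score, cutoff=4):
    """Assumption: a total score at or above the cutoff is classified as high risk."""
    return score >= cutoff


# Hypothetical patients (values are not taken from the study).
s1 = yamanashi_score(3, 10.0, 130, 1.0, 120)
s2 = yamanashi_score(5, 9.0, 131, 0.4, 150)
print(s1, is_high_risk(s1))
print(s2, is_high_risk(s2))
```

A patient who meets every criterion reaches the maximum of 7 points, while a patient treated on day 5 with only elevated CRP and low sodium scores 3 points and falls below the 4-point cutoff.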

Validation of scoring systems to predict IVIG resistance

We validated the accuracy of the Yamanashi score in the prediction of IVIG resistance by comparing it with three representative scoring systems in the cohort of the Yamanashi study. Among the 996 KD cases, 546 cases were excluded from the validation because at least one of the variables required for the 4 scoring systems was missing, and thus the remaining 450 cases were available for further analyses. Among the 450 cases, 108 cases (23.5%) were resistant to initial IVIG treatment. In the ROC curve of the Yamanashi score, AUC was 0.73 (95%CI: 0.68–0.79). With a cutoff of 4 points for the total score, sensitivity and specificity were 0.55 (0.45–0.64) and 0.83 (0.78–0.86), respectively. Of note, although the subjects for the two analyses were partly different, the accuracy (AUC, 0.73; sensitivity, 0.55; specificity, 0.83) of the Yamanashi score using the top five features in the random forest model was almost identical to that (AUC, 0.75; sensitivity, 0.54; specificity, 0.80) of the random forest model using 30 clinical variables. Next, we compared the prediction accuracy of the Yamanashi score with the three previous scoring systems (Fig. 4, Supplemental Table 4). Interestingly, although two of the five variables of the Yamanashi score were different from the seven variables of the Gunma score [11], the ROC curve and AUC of the Yamanashi score were almost identical to those of the Gunma score (Fig. 4a, AUC: 0.73). When the Gunma score was applied with a cutoff of 5 points for the total score, sensitivity and specificity were 0.53 (95%CI: 0.43–0.63) and 0.83 (0.78–0.87), respectively. In the 450 cases of the Yamanashi cohort study, the total Yamanashi score was strongly correlated with that of the Gunma score (R² = 0.52). In contrast to the Gunma score, although the correlation coefficient (R²) with the Yamanashi score was 0.49, the ROC curve and AUC of the Kurume score [12] (Fig. 4b, AUC: 0.67) were significantly inferior (p = 0.006) to those of the Yamanashi score. Similar to the Kurume score, although the correlation coefficient (R²) with the Yamanashi score was 0.51, the ROC curve and AUC of the Osaka score [13] (Fig. 4c, AUC: 0.69) were marginally inferior (p = 0.048) to those of the Yamanashi score. Moreover, most of the prediction values of the Yamanashi score were similar to those of the Gunma score (Supplemental Table 4), while the majority of the prediction values of the Yamanashi score were significantly better than those of the Kurume and Osaka scores. These observations revealed that the simple scoring system using the top five features in the machine learning model predicted IVIG resistance as accurately as the machine learning model itself and the widely used Gunma score, at least in the Yamanashi cohort study.
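For an integer score, the AUC can be understood as the probability that a randomly chosen resistant case has a higher score than a randomly chosen responsive case, with ties counted as one half. The sketch below computes the AUC in this way, together with sensitivity and specificity at a given cutoff. The function names and the toy scores and labels are hypothetical; this does not reproduce the ROC analysis performed with EZR in the study.

```python
def auc_from_scores(scores, labels):
    """AUC as the fraction of (positive, negative) pairs in which the positive
    case has the higher score; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sens_spec_at_cutoff(scores, labels, cutoff):
    """Sensitivity and specificity when score >= cutoff is called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical total scores; label 1 = IVIG resistant, 0 = responsive.
scores = [7, 5, 4, 3, 2, 4, 1, 0, 3, 2]
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
print(auc_from_scores(scores, labels))
print(sens_spec_at_cutoff(scores, labels, cutoff=4))
```

Because the AUC summarizes all possible cutoffs while sensitivity and specificity describe a single one, two scores with the same AUC can still differ at their chosen cutoffs.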

Recently established machine learning has been widely applied in clinical medicine for purposes such as outcome prediction, diagnosis, and image interpretation [14–18]. In the present study, we applied machine learning models to predict resistance to the initial IVIG treatment of KD in the Yamanashi cohort study, in which clinical data of 996 cases were available. Considering the imbalance of IVIG resistance in the dataset, we applied SMOTE [28, 29] and confirmed a good discriminating ability to predict IVIG resistance. To apply the accurate prediction ability of the machine learning model to clinical practice, we established a new scoring system (Yamanashi score) based on the findings in the SHAP plot [30–32] of the random forest model. Considering correlations among features, we selected the following five features among the top six features with high SHAP values: days of illness at initial therapy and serum levels of CRP, sodium, total bilirubin, and total cholesterol. Surprisingly, this simple scoring system using the top five features of the random forest model predicted IVIG resistance as accurately as the random forest model itself. Among the five features of the Yamanashi score, four features were also included in the three major scoring systems [11–13], as follows: serum CRP level was included in all three scoring systems (Gunma [11], Kurume [12], and Osaka [13]), days of illness at initial therapy was included in two scoring systems (Gunma [11] and Kurume [12]), serum sodium level was included in the Gunma score [11], and serum total bilirubin level was included in the Osaka score [13]. In contrast, serum total cholesterol level was not included in any of the three previously established scoring systems. Using the 450 cases of the Yamanashi cohort study, we confirmed that the Yamanashi score was as reliable as the Gunma score and more reliable than the Kurume score and the Osaka score.

Among the five features in the Yamanashi score, the serum level of total cholesterol was distinctive in that it was not included in any of the three commonly used scoring systems [11–13]. In the SHAP dependence plot of the present study, a serum total cholesterol level lower than approximately 130 mg/dL was associated with a higher risk of IVIG resistance. Our machine learning finding seems to be consistent with a previous finding that levels of serum total cholesterol decreased in the acute phase of KD due to abnormal lipid metabolism [35]. In particular, a recent report by Shao et al [36] revealed that the serum total cholesterol level before the initial IVIG treatment was significantly lower in cases of IVIG resistance in a single-center prospective cohort study. Although the underlying mechanism for the association between dyslipidemia and the severity of systemic inflammation in KD remains unclear, a recent study by Zhang et al [37] revealed that dyslipidemia during the acute phase of KD was associated with aberrant levels of adipokines, including adiponectin, omentin-1, and chemerin. In the above study by Shao et al [36], alterations in other lipid parameters were also associated with IVIG resistance: a higher level of triglyceride and lower levels of high-density lipoprotein cholesterol, low-density lipoprotein cholesterol, and apolipoprotein A. Thus, although the lipid profile was not fully evaluated in the present study, dyslipidemia due to systemic inflammation in the acute phase of KD may be a rational explanation for the usefulness of the serum total cholesterol level as one of the predictors for IVIG resistance in the Yamanashi score.

This study has several limitations. First, prediction values in each machine learning model were similar to those in the logistic regression model and the Gunma score. To further improve prediction values in the random forest model, we also used 50% of the samples as the training set. However, only partial improvement was observed with this half-split of training and test sets in our cohort (Supplemental Fig. 3, Supplemental Table 5). Second, since the majority of the subjects in the present study were of Japanese ethnicity, further validation is required before the present scoring system can be applied to other ethnicities and different populations. Third, although the patients were treated with a standardized protocol, the study was based on retrospective data collection from a number of hospitals. Fourth, several known predictive factors such as the neutrophil-to-lymphocyte and platelet-to-lymphocyte ratios [38] were not evaluated. Recently, the utility of the coagulation profile [39] and of genetic variants of the interleukin gene [40] has also been reported. Thus, machine learning using these factors as additional variables might improve the accuracy. Feature engineering of clinical variables is another possibility to further improve the accuracy [41]. Fifth, insufficient reduction in the serum CRP level was additionally included in the definition of IVIG resistance in the present study, while only persistent fever was evaluated in many studies [42, 43].

In conclusion, we implemented the machine learning algorithm to predict IVIG resistance in KD patients and confirmed its potential. Moreover, using the top five features of the random forest model, we designed a simple scoring system to predict IVIG resistance. Of note, in spite of its simplicity, the scoring system predicted IVIG resistance as accurately as the machine learning approach. Moreover, it should be noted that the widely used Gunma score is just as reliable as the machine learning models at this stage.

AUC: area under the ROC curve

CAL: coronary artery lesions

CRP: C-reactive protein

IVIG: intravenous immunoglobulin

KD: Kawasaki disease

ROC: Receiver Operating Characteristic

SHAP: Shapley additive explanation

SMOTE: synthetic minority over-sampling technique

Authors’ contributions:

All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Drs Masashi Yoshizawa, Yosuke Kono, Yohei Hasebe, Keiichi Koizumi and Minako Hoshiai. Drs Takako Toda, Atsushi Watanabe, Nobuyuki Katsumata and Prof Eiryo Kawakami coordinated and supervised data collection. The first draft of the manuscript was written by Dr Yuto Sunaga and Prof Takeshi Inukai and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Funding: N/A

(The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.)

Conflicts of interest/Competing interests: N/A

(The authors have no relevant financial or non-financial interests to disclose.)

Availability of data and material: N/A

Code availability: N/A

Ethics approval:

This study was performed in line with the principles of the Declaration of Helsinki. Approval was granted by the Research Ethics Committee of University of Yamanashi Hospital (Approval Number 1698).

Consent to participate: Informed consent was obtained from the individual patients or their parents.

Consent for publication: Informed consent was obtained from the individual patients or their parents.

Acknowledgments

All authors express our sincere gratitude to all the members in Yamanashi Kawasaki Disease Research Group, who supported acquisition of data. In addition to those listed as authors, the following investigators participated and cooperated in the acquisition of data for this study: Tomohiro Saito (Yamanashi Prefectural Central Hospital), Sho Hokibara (Kofu Municipal Hospital), Koji Kobayashi (Yamanashi Kosei Hospital), Tomoaki Sano, Toshie Nishijima (Yamanashi Red Cross Hospital), Hiroki Sato, Hiroaki Kanai (Suwa Central Hospital), Miwa Goto (National Hospital Organization Kofu National Hospital), Makoto Tsuruta (Kofu-Kyoritsu Hospital), Satoru Kojika, Makoto Nakamura (Fujiyoshida Municipal Hospital), Sonoko Mizorogi, Kinuko Saito (Nirasaki City Hospital), Masanori Ohta, Kazuya Takahashi (Tsuru Municipal General Hospital), Kazumasa Sato, Mie Mochizuki (Kyonan Medical Center Fujikawa Hospital).

Kawasaki T, Kosaki F, Okawa S, Shigematsu I, Yanagawa H (1974) A new infantile acute febrile mucocutaneous lymph node syndrome (MLNS) prevailing in Japan. Pediatrics 54(3):271–276
McCrindle BW, Rowley AH, Newburger JW, Burns JC, Bolger AF, Gewitz M, Baker AL, Jackson MA, Takahashi M, Shah PB et al (2017) Diagnosis, Treatment, and Long-Term Management of Kawasaki Disease: A Scientific Statement for Health Professionals From the American Heart Association. Circulation 135(17):e927–999. DOI: 10.1161/CIR.0000000000000484
Newburger JW, Takahashi M, Beiser AS, Burns JC, Bastian J, Chung KJ, Colan SD, Duffy CE, Fulton DR, Glode MP et al (1991) A single intravenous infusion of gamma globulin as compared with four infusions in the treatment of acute Kawasaki syndrome. New Engl J Med 324:1633–1639. DOI: 10.1056/NEJM199106063242305
Tremoulet AH, Best BM, Song S, Wang S, Corinaldesi E, Eichenfield JR, Martin DD, Newburger JW, Burns JC (2008) Resistance to intravenous immunoglobulin in children with Kawasaki disease. J Pediatr 153:117–121. DOI: 10.1016/j.jpeds.2007.12.021
Muta H, Ishii M, Furui J, Nakamura Y, Matsuishi T (2006) Risk factors associated with the need for additional intravenous gamma-globulin therapy for Kawasaki disease. Acta Paediatr 95:189–193. DOI: 10.1080/08035250500327328
Kobayashi T, Saji T, Otani T, Nakamura T, Arakawa H, Kato T, Hara T, Hamaoka K, Ogawa S, Miura M et al (2012) Efficacy of immunoglobulin plus prednisolone for prevention of coronary artery abnormalities in severe Kawasaki disease (RAISE study): a randomised, open-label, blinded-endpoints trial. Lancet 379:1613–1620. DOI: 10.1016/S0140-6736(11)61930-2
Burns JC, Capparelli EV, Brown JA, Newburger JW, Glode MP (1998) Intravenous gamma-globulin treatment and retreatment in Kawasaki disease: US/Canadian Kawasaki Syndrome Study Group. Pediatr Infect Dis J 17:1144–1148. DOI: 10.1097/00006454-199812000-00009
Ogata S, Ogihara Y, Honda T, Kon S, Akiyama K, Ishii M (2012) Corticosteroid pulse combination therapy for refractory Kawasaki disease: A randomized trial. Pediatrics 129:e17–23. DOI: 10.1542/peds.2011-0148
Tremoulet AH, Jain S, Jaggi P, Jimenez-Fernandez S, Pancheri JM, Sun X, Kanegaye JT, Kovalchin JP, Printz BF, Ramilo O, Burns JC (2014) Infliximab for intensification of primary therapy for Kawasaki disease: a phase 3 randomised, double-blind, placebo-controlled trial. Lancet 383:1731–1738. DOI: 10.1016/S0140-6736(13)62298-9
Burns JC, Koné-Paut I, Kuijpers T, Shimizu C, Tremoulet A, Arditi M (2017) Found in Translation: International Initiatives Pursuing Interleukin-1 Blockade for Treatment of Acute Kawasaki Disease. Arthritis Rheumatol 69:268–276. DOI: 10.1002/art.39975
Kobayashi T, Inoue Y, Takeuchi K, Okada Y, Tamura K, Tomomasa T, Kobayashi T, Morikawa A (2006) Prediction of intravenous immunoglobulin unresponsiveness in patients with Kawasaki disease. Circulation 113(22):2606–2612. DOI: 10.1161/CIRCULATIONAHA.105.592865
Egami K, Muta H, Ishii M, Suda K, Sugahara Y, Iemura M, Matsuishi T (2006) Prediction of resistance to intravenous immunoglobulin treatment in patients with Kawasaki disease. J Pediatr 149(2):237–240. DOI: 10.1016/j.jpeds.2006.03.050
Sano T, Kurotobi S, Matsuzaki K, Yamamoto T, Maki I, Miki K, Kogaki S, Hara J (2007) Prediction of non-responsiveness to standard high-dose gamma-globulin therapy in patients with acute Kawasaki disease before starting initial treatment. Eur J Pediatr 166(2):131–137. DOI: 10.1007/s00431-006-0223-z
Kuniyoshi Y, Tokutake H, Takahashi N, Kamura A, Yasuda S, Tashiro M (2020) Comparison of Machine Learning Models for Prediction of Initial Intravenous Immunoglobulin Resistance in Children With Kawasaki Disease. Front Pediatr 8:570834. DOI: 10.3389/fped.2020.570834
Wang T, Liu G, Lin H (2020) A machine learning approach to predict intravenous immunoglobulin resistance in Kawasaki disease patients: A study based on a Southeast China population. PLoS ONE 15(8):e0237321. DOI: 10.1371/journal.pone.0237321
Takeuchi M, Inuzuka R, Hayashi T, Shindo T, Hirata Y, Shimizu N, Inatomi J, Yokoyama Y, Namai Y, Oda Y et al (2017) Novel Risk Assessment Tool for Immunoglobulin Resistance in Kawasaki Disease: Application Using a Random Forest Classifier. Pediatr Infect Dis J 36(9):821–826. DOI: 10.1097/INF.0000000000001621
Kawakami E, Tabata J, Yanaihara N, Ishikawa T, Koseki K, Iida Y, Saito M, Komazaki H, Shapiro JS, Goto C et al (2019) Application of Artificial Intelligence for Preoperative Diagnostic and Prognostic Prediction in Epithelial Ovarian Cancer Based on Blood Biomarkers. Clin Cancer Res 25(10):3006–3015. DOI: 10.1158/1078-0432.CCR-18-3378
Tseng PY, Chen YT, Wang CH, Chiu KM, Peng YS, Hsu SP, Chen KL, Yang CY, Lee OK (2020) Prediction of the development of acute kidney injury following cardiac surgery by machine learning. Crit Care 24(1):478. DOI: 10.1186/s13054-020-03179-9
Rajkomar A, Dean J, Kohane I (2019) Machine Learning in Medicine. N Engl J Med 380(14):1347–1358. DOI: 10.1056/NEJMra1814259
Ayusawa M, Sonobe T, Uemura S, Ogawa S, Nakamura Y, Kiyosawa N, Ishii M, Harada K et al (2005) Revision of diagnostic guidelines for Kawasaki disease (the 5th revised edition). Pediatr Int 47:232–234. DOI: 10.1111/j.1442-200x.2005.02033.x
Koizumi K, Hoshiai M, Katsumata N, Toda T, Kise H, Hasebe Y, Kono Y, Sunaga Y, Yoshizawa M, Watanabe A et al (2018) Infliximab regulates monocytes and regulatory T cells in Kawasaki disease. Pediatr Int 60(9):796–802. DOI: 10.1111/ped.13555
Koizumi K, Hoshiai M, Moriguchi T, Katsumata N, Toda T, Kise H, Hasebe Y, Kono Y, Sunaga Y, Yoshizawa M et al (2019) Plasma Exchange Downregulates Activated Monocytes and Restores Regulatory T Cells in Kawasaki Disease. Ther Apher Dial 23(1):92–98. DOI: 10.1111/1744-9987.12754
Breiman L (2001) Random Forests. Mach Learn 45:5–32
Noble WS (2006) What is a support vector machine? Nat Biotechnol 24(12):1565–1567. DOI: 10.1038/nbt1206-1565
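Chen T, Guestrin C (2016) XGBoost: A Scalable Tree Boosting System. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp 785–794. DOI: 10.1145/2939672.2939785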
Kriegeskorte N, Golan T (2019) Neural network models and deep learning. Curr Biol 29(7):R231–R236. DOI: 10.1016/j.cub.2019.02.034
Kohavi R (1995) A study of cross-validation and bootstrap for accuracy estimation and model selection. IJCAI 14(2):1137–1145
Chawla NV, Bowyer KW, Hall LO (2002) SMOTE: Synthetic Minority Over-sampling Technique. J Artif Intell Res 16:321–357. DOI: 10.1613/jair.953
Fernández A, Garcia S, Herrera F, Chawla NV (2018) SMOTE for Learning from Imbalanced Data: Progress and Challenges, Marking the 15-year Anniversary. J Artif Intell Res 61:863–905. DOI: 10.1613/jair.1.11192
Shapley LS (1953) A Value for n-Person Games. In: Kuhn HW and Tucker AW (eds) Contributions to the Theory of Games II, Princeton University Press, Princeton 28:307–317
Rodríguez-Pérez R, Bajorath J (2020) Interpretation of machine learning models using shapley values: application to compound potency and multitarget activity predictions. J Comput Aided Mol Des 34:1013–1026. DOI: 10.1007/s10822-020-00314-0
Lundberg SM, Lee SI (2017) A unified approach to interpreting model predictions. https://papers.nips.cc/paper/2017/hash/8a20a8621978632d76c43dfd28b67767-Abstract.html
Lundberg SM, Erion GG, Lee SI (2018) Consistent Individualized Feature Attribution for Tree Ensembles. https://arxiv.org/abs/1802.03888
Kanda Y (2013) Investigation of the freely available easy-to-use software ‘EZR’ for medical statistics. Bone Marrow Transplant 48:452–458. DOI: 10.1038/bmt.2012.244
Salo E, Pesonen E, Viikari J (1991) Serum cholesterol levels during and after Kawasaki disease. J Pediatr 119(4):557–561. DOI: 10.1016/s0022-3476(05)82404-7
Shao S, Zhou K, Liu X, Liu L, Wu M, Deng Y, Duan H, Li Y, Hua Y, Wang C (2021) Predictive value of serum lipid for intravenous immunoglobulin resistance and coronary artery lesion in Kawasaki disease. J Clin Endocrinol Metab 10:dgab230. DOI: 10.1210/clinem/dgab230
Zhang XY, Yang TT, Hu XF, Wen Y, Fang F, Lu HL (2018) Circulating adipokines are associated with Kawasaki disease. Pediatr Rheumatol Online J 16(1):33. DOI: 10.1186/s12969-018-0243-z
Kanai T, Takeshita S, Kawamura Y, Kinoshita K, Nakatani K, Iwashima S, Takizawa Y, Hirono K, Mori K, Yoshida Y et al (2020) The combination of the neutrophil-to-lymphocyte and platelet-to-lymphocyte ratios as a novel predictor of intravenous immunoglobulin resistance in patients with Kawasaki disease: a multicenter study. Heart Vessels 35(10):1463–1472. DOI: 10.1007/s00380-020-01622-z
Shao S, Yang L, Liu X, Liu L, Wu M, Deng Y, Duan H, Li Y, Hua Y, Luo L et al (2021) Predictive value of coagulation profiles for both initial and repeated immunoglobulin resistance in Kawasaki disease: A prospective cohort study. Pediatr Allergy Immunol 32(6):1349–1359. DOI: 10.1111/pai.13495
Amano Y, Akazawa Y, Yasuda J, Yoshino K, Kojima H, Kobayashi N, Matsuzaki S, Nagasaki M, Kawai Y, Minegishi N et al (2019) A low-frequency IL4R locus variant in Japanese patients with intravenous immunoglobulin therapy-unresponsive Kawasaki disease. Pediatr Rheumatol 17(1). DOI: 10.1186/s12969-019-0337-2
Zheng A, Casari A (2018) Feature Engineering for Machine Learning: Principles and Techniques for Data Scientists. O'Reilly Media, Sebastopol
Hamada H, Suzuki H, Onouchi Y, Ebata R, Terai M, Fuse S, Okajima Y, Kurotobi S, Hirai K, Soga T et al (2019) Efficacy of primary treatment with immunoglobulin plus ciclosporin for prevention of coronary artery abnormalities in patients with Kawasaki disease predicted to be at increased risk of non-response to intravenous immunoglobulin (KAICA): a randomised controlled, open-label, blinded-endpoints, phase 3 trial. Lancet 393:1128–1137. DOI: 10.1016/S0140-6736(18)32003-8
Miyata K, Miura M, Kaneko T, Morikawa Y, Sakakibara H, Matsushima T, Misawa M, Takahashi T, Nakazawa M, Tsuchihashi T et al (2021) Risk Factors of Coronary Artery Abnormalities and Resistance to Intravenous Immunoglobulin Plus Corticosteroid Therapy in Severe Kawasaki Disease: An Analysis of Post RAISE. Circulation: Cardiovascular Quality and Outcomes 14:e007191. DOI: 10.1161/CIRCOUTCOMES.120.007191

Tables 1 and 2 are available in the Supplementary Files section.

SupplementalFigure1.pdf
Flowchart of k-fold cross-validation. The 996 cases were randomly divided into a training dataset (approximately 80%) and a test dataset (approximately 20%). Generalization performance on the training dataset was evaluated by stratified k-fold cross-validation (k=10).
SupplementalFigure2.pdf
Variables for each score. The Yamanashi score consisted of five variables, while the Gunma, Kurume, and Osaka scores consisted of seven, five, and three variables, respectively. Total cholesterol level was not included in the Gunma, Kurume, or Osaka score.
SupplementalFigure3.pdf
ROC curves for IVIG resistance in the random forest model using the half-split train-test. The horizontal axis indicates the false positive rate (1-specificity), and the vertical axis indicates the true positive rate (sensitivity). The AUC is indicated at the bottom.
SupplementalTable1.pdf
List of facilities
SupplementalTable2.pdf
Demographics in each facility
SupplementalTable3.pdf
Prediction values of three representative scores in Yamanashi cohort
SupplementalTable4.pdf
Prediction values in each score
SupplementalTable5.pdf
Prediction values in the random forest model using the half-split train-test
Table1.pdf
Table2.pdf
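
The data partitioning described in the caption of Supplemental Figure 1 (an approximately 80%/20% train-test split followed by stratified 10-fold cross-validation on the training data) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the random seed, and the class balance of the placeholder labels are assumptions made only for illustration, and the sketch uses the Python standard library rather than any particular machine learning package.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Assign sample indices to k folds while keeping class ratios similar."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    offset = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal indices round-robin so each fold gets its share of this class.
        for j, idx in enumerate(indices):
            folds[(offset + j) % k].append(idx)
        offset += len(indices)
    return folds


def stratified_train_test_split(labels, test_fraction=0.2, seed=0):
    """Hold out one of round(1 / test_fraction) stratified folds as the test set."""
    k = round(1 / test_fraction)
    folds = stratified_folds(labels, k, seed)
    test = sorted(folds[0])
    train = sorted(i for fold in folds[1:] for i in fold)
    return train, test


def cross_validation_splits(train_indices, labels, k=10, seed=0):
    """Yield (fit, validation) index lists for stratified k-fold cross-validation."""
    train_labels = [labels[i] for i in train_indices]
    folds = stratified_folds(train_labels, k, seed)
    for held_out in range(k):
        val = [train_indices[j] for j in folds[held_out]]
        fit = [train_indices[j]
               for f, fold in enumerate(folds) if f != held_out
               for j in fold]
        yield fit, val


# Placeholder labels: 996 cases as in the study, but the number of
# positive (IVIG-resistant) cases here is hypothetical, not study data.
labels = [1] * 200 + [0] * 796
train, test = stratified_train_test_split(labels)
splits = list(cross_validation_splits(train, labels, k=10))
print(len(train), len(test))  # 796 200
```

Each of the ten (fit, validation) pairs would be used to train and evaluate a model in turn, and the held-out test indices would be touched only once, for the final evaluation.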
