Study population
Data analysed in this study were retrieved from PhysioNet, the Research Resource for Complex Physiologic Signals. The dataset is a restricted-access resource that can be downloaded free of charge after passing the ethical examination and signing the data use agreement for the project, in accordance with the website’s protocols[17]. The study dataset is a retrospective cohort of 2,008 patients with heart failure (HF) consecutively admitted to Zigong Fourth People’s Hospital in Sichuan Province from December 2016 to June 2019. It is available at https://physionet.org/content/heart-failure-zigong/1.2[6].
Inclusion and exclusion criteria
In our study, HF was diagnosed according to the 2016 European Society of Cardiology (ESC) criteria[18]. Patients with a diagnosis of HF on hospital admission were identified by International Classification of Diseases (ICD)-9 codes and selected from the inpatient electronic health record system. Details of the ICD-9 codes used for the diagnosis of HF are provided in the original publication[6]. Participants with any missing data were excluded, leaving a total of 1,532 patients in the final statistical analysis.
Data collection
Data collected for the study fell into three broad categories: demographic data, baseline characteristics and laboratory findings. Demographic data were collected from the first sheet of the medical records and included age, sex and body mass index (BMI). Baseline characteristics were measured on the day of hospital admission and included mode of admission (emergency or non-emergency), body temperature (T), pulse rate (PR), respiratory rate (RR), systolic blood pressure (SBP), diastolic blood pressure (DBP), Charlson Comorbidity Index score (CCI), type of HF (left, right or both), New York Heart Association (NYHA) cardiac function classification, Killip grade, Glasgow Coma Scale (GCS) and fraction of inspired oxygen (FiO2). Laboratory findings were obtained on day one of hospital admission and included creatinine (CREA), uric acid (UA), glomerular filtration rate (GFR), cystatin-C (cys-C), white blood cell count (WBC), coefficient of variation of red blood cell distribution width (RDW-CV), standard deviation of red blood cell distribution width (RDW-SD), lymphocyte count (LYM), mean corpuscular hemoglobin (MCH), mean corpuscular hemoglobin concentration (MCHC), mean platelet volume (MPV), basophil count (BASO), eosinophil count (EON), hemoglobin (HGB), platelet count (PLT), platelet distribution width (PDW), plateletcrit (PCT), neutrophil count (NEUT), D-dimer (D-Di), high-sensitivity troponin T (hs-TnT), brain natriuretic peptide (BNP), albumin (ALB), total cholesterol (TC), low-density lipoprotein cholesterol (LDL-C), triglyceride (TG) and high-density lipoprotein cholesterol (HDL-C). Further details are given in the original reference[6]. The primary outcome in this study was readmission within 90 days, measured from the index hospital admission.
Ethical permission and informed consent
The planning, conduct, and reporting of the original study were in accordance with the Declaration of Helsinki, as revised in 2013. In the original study, ethical approval was obtained from the ethics committee of Zigong Fourth People’s Hospital (Approval Number: 2020-010), which also waived the requirement for informed consent. Informed consent was not required for the present study because it uses secondary data that were analysed anonymously [19, 20].
Data analysis
All data analyses were performed using R version 4.2.1 (https://www.r-project.org, The R Foundation). Continuous variables with a normal distribution were expressed as the mean ± standard deviation, whereas continuous variables with a skewed distribution were reported as the median (interquartile range). Categorical variables were expressed as frequency (percentage).
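The three summary formats described above can be illustrated with a minimal sketch. The study itself used R; the Python code below is only an illustration, and the function names (`mean_sd`, `median_iqr`, `freq_pct`), the one-decimal rounding and the inclusive quartile method are assumptions made for this example rather than details taken from the study.

```python
import statistics


def mean_sd(values):
    """Summarise a normally distributed variable as 'mean ± SD' (sample SD)."""
    return f"{statistics.mean(values):.1f} ± {statistics.stdev(values):.1f}"


def median_iqr(values):
    """Summarise a skewed variable as 'median (Q1-Q3)'."""
    # The 'inclusive' quartile method is an assumption; other software
    # defaults may give slightly different quartiles on small samples.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return f"{q2:.1f} ({q1:.1f}-{q3:.1f})"


def freq_pct(values, level):
    """Summarise one level of a categorical variable as 'n (percent)'."""
    n = sum(1 for v in values if v == level)
    return f"{n} ({100 * n / len(values):.1f}%)"
```

For example, `median_iqr([1, 2, 3, 4, 5])` gives `3.0 (2.0-4.0)` under these assumptions.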
The participants were randomly divided into a training set (N = 1068) and a test set (N = 464) at a ratio of 7:3; the training set was used to establish the readmission prediction models and the test set to assess their accuracy (ACC). Baseline characteristics of the training and test sets were compared using the independent t test, Mann-Whitney U test, or Chi-square test, as appropriate. Variable selection was performed in the training set using LASSO (least absolute shrinkage and selection operator) regression. Ten-fold cross-validation was used to compute the value of the shrinkage coefficient lambda that minimized the cross-validated error (lambda min) and the largest value of lambda whose error was within one standard error of that minimum (lambda 1se). Using the training dataset, we trained and developed two prediction models, an XGBoost model and a logistic regression model (generalized linear model), as classifiers for outcome prediction. The trained logistic regression model is detailed at https://shengsong.shinyapps.io/readmission_at_3_months_in_HF_patient. To tune the hyperparameters of the XGBoost classifier and evaluate its performance, we obtained the 10-fold cross-validation performance for each iteration and selected the iteration value that generated the best performance[21]. We extracted the gain, cover, frequency and importance values from the XGBoost output to evaluate the importance of the features. SHapley Additive exPlanations (SHAP), a model-agnostic explanation technique derived from cooperative game theory, was also used to quantify the importance of clinical features and their relationship to readmission in the XGBoost model [22].
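The one-standard-error rule for choosing lambda can be sketched as follows. This is a minimal illustration that assumes per-fold cross-validation errors are already available for each candidate lambda. The function name `select_lambdas` and its input layout are hypothetical, and the standard error is computed as the sample standard deviation divided by the square root of the number of folds, which may differ in detail from the R implementation used in the study.

```python
import math
import statistics


def select_lambdas(lambdas, fold_errors):
    """Return (lambda_min, lambda_1se).

    lambdas: candidate penalty values.
    fold_errors: fold_errors[i] holds the per-fold CV errors for lambdas[i].
    """
    means = [statistics.mean(errs) for errs in fold_errors]
    # Standard error of the mean CV error for each candidate lambda.
    ses = [statistics.stdev(errs) / math.sqrt(len(errs)) for errs in fold_errors]

    # lambda_min: the candidate with the lowest mean cross-validated error.
    best = min(range(len(lambdas)), key=lambda i: means[i])

    # lambda_1se: the largest (most strongly penalised) candidate whose mean
    # error is still within one standard error of the minimum.
    threshold = means[best] + ses[best]
    lambda_1se = max(lam for lam, m in zip(lambdas, means) if m <= threshold)
    return lambdas[best], lambda_1se
```

Because a larger lambda shrinks more coefficients to zero, the 1se choice generally yields a sparser model than the minimum-error choice, at a cost in cross-validated error that stays within the noise of the estimate.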
The XGBoost model and the logistic regression model were evaluated on the separate test set. Prediction performance of the trained models in the test set was estimated using the area under the curve (AUC), kappa score, ACC, balanced ACC, the ACC > no information rate (NIR) metric of McNemar’s Chi-square test, sensitivity, specificity, positive predictive value, negative predictive value, precision, recall, F1 score, detection rate and detection prevalence. Kappa was classified as poor (< 0.00), slight (0.00–0.20), fair (0.21–0.40), moderate (0.41–0.60), substantial (0.61–0.80), or almost perfect (0.81–1.00). AUC values were classified as follows: AUC = 0.5, no discrimination; 0.5 < AUC < 0.7, poor discrimination; 0.7 ≤ AUC < 0.8, acceptable discrimination; 0.8 ≤ AUC < 0.9, excellent discrimination; AUC ≥ 0.9, outstanding discrimination. The predictive performance of the XGBoost model was compared with that of the logistic regression model in the test set using the metrics described above, and the AUCs of the two models were compared with the DeLong test.
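Several of the threshold-based metrics listed above follow directly from a 2×2 confusion matrix. The sketch below is an illustration under stated assumptions, not the study's R code: the function names are hypothetical, readmission is taken as the positive class, and AUC values below 0.5, which the scheme above does not cover, are grouped with "no discrimination".

```python
def confusion_metrics(tp, fp, fn, tn):
    """Compute threshold-based metrics from a 2x2 confusion matrix."""
    n = tp + fp + fn + tn
    acc = (tp + tn) / n
    sens = tp / (tp + fn)  # sensitivity, identical to recall
    spec = tn / (tn + fp)
    ppv = tp / (tp + fp)   # positive predictive value, identical to precision
    npv = tn / (tn + fn)
    # Agreement expected by chance, from the row and column totals.
    p_e = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    return {
        "accuracy": acc,
        "kappa": (acc - p_e) / (1 - p_e),
        "sensitivity": sens,
        "specificity": spec,
        "ppv": ppv,
        "npv": npv,
        "f1": 2 * ppv * sens / (ppv + sens),
        "balanced_accuracy": (sens + spec) / 2,
        "detection_rate": tp / n,
        "detection_prevalence": (tp + fp) / n,
        # No information rate: accuracy of always predicting the majority class.
        "nir": max(tp + fn, fp + tn) / n,
    }


def kappa_label(kappa):
    """Map a kappa value to the agreement categories used in the text."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


def auc_label(auc):
    """Map an AUC value to the discrimination categories used in the text."""
    if auc <= 0.5:  # assumption: values below 0.5 are grouped here
        return "no discrimination"
    if auc < 0.7:
        return "poor"
    if auc < 0.8:
        return "acceptable"
    if auc < 0.9:
        return "excellent"
    return "outstanding"
```

As a worked example, a test set with 40 true positives, 10 false positives, 20 false negatives and 30 true negatives gives an accuracy of 0.70 against a chance agreement of 0.50, hence a kappa of 0.40, which falls in the "fair" category.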