Machine learning to predict post-operative acute kidney injury stage 3 after heart transplantation

Background Acute kidney injury (AKI) stage 3, one of the most severe complications in patients with heart transplantation (HT), is associated with substantial morbidity and mortality. We aimed to develop a machine learning (ML) model to predict post-transplant AKI stage 3 based on preoperative and perioperative features.
Methods Data from 107 consecutive HT recipients in the provincial center between 2018 and 2020 were included for analysis. Logistic regression with L2 regularization was used for the ML model building. The predictive performance of the ML model was assessed using the area under the curve (AUC) in tenfold stratified cross-validation and was compared with that of the Cleveland-clinical model.
Results Post-transplant AKI occurred in 76 (71.0%) patients, including 15 (14.0%) stage 1, 18 (16.8%) stage 2, and 43 (40.2%) stage 3 cases. The top six features selected for the ML model to predict AKI stage 3 were serum cystatin C, estimated glomerular filtration rate (eGFR), right atrial long-axis dimension, left atrial anteroposterior dimension, serum creatinine (SCr) and FVII. The predictive performance of the ML model (AUC: 0.821; 95% confidence interval [CI]: 0.740–0.901) was significantly higher than that of the Cleveland-clinical model (AUC: 0.654; 95% CI: 0.545–0.763, p < 0.05).
Conclusions The ML model, which achieved an effective predictive performance for post-transplant AKI stage 3, may be helpful for timely intervention to improve the patient's prognosis.
Supplementary Information The online version contains supplementary material available at 10.1186/s12872-022-02721-7.


Background
Heart transplantation (HT) remains a life-sustaining treatment choice for numerous patients with end-stage heart disease [1]. Despite the advancement of various immunosuppressive therapies and treatment programs, the incidence rates of acute kidney injury (AKI) and of severe AKI requiring renal replacement therapy (RRT) in patients with HT have remained high in recent years [2]. AKI most commonly occurs in the first week after HT, with an incidence of 22-76%, and is associated with high rates of morbidity and mortality [3][4][5][6].
Open Access † Tingyu Li and Yuelong Yang contributed equally to this study, Co-first authors. † Min Wu and Hui Liu contributed equally to this study, Co-corresponding authors.
*Correspondence: wumin0011@gdph.org.cn 1 Guangdong Cardiovascular Institute, Guangdong Provincial Key Laboratory of South China Structural Heart Disease, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, 106 Zhongshan 2nd Road, Guangzhou 510080, Guangdong, China. Full list of author information is available at the end of the article.
Early AKI detection after HT is meaningful for interventions that prevent further kidney damage and preserve kidney function, because AKI is associated with a mortality rate of more than 60% among hospitalized postsurgical patients who receive intensive care [7]. Moreover, AKI, especially stage 3, is correlated with subsequent progressive chronic kidney disease (CKD) and with decreased survival of HT recipients. Various features associated with AKI stage 3, such as drugs, immunosuppressive therapies, hemodynamics, and some anesthesia- and surgery-related factors, have been identified by traditional models in previous studies [6,8]. However, the predictive performance of these models is relatively limited because of the small amount of patient information used and the inclusion of some features with conflicting effects. For these reasons, a novel and efficient model to predict AKI stage 3 is needed.
As a powerful tool for intelligent data analysis, machine learning (ML) can be utilized to model medical data. Computational algorithms are constructed to develop a model to correlate a spectrum of features of the given datasets with the outcome. ML has been commonly used in medical data analysis for diagnosis and prognosis of a variety of tumors, such as breast [9] and prostate cancers [10]. Furthermore, there is clear evidence that ML can be used for analysis in other medical fields as well. For instance, a recent study on predicting 5-year all-cause mortality in patients with suspected coronary artery disease showed that ML had superior predictive performance compared with traditional clinical or coronary computed tomography angiography metrics alone [11]. We hypothesize that ML adds incremental value to the prediction of adverse events. Therefore, the objective of this study was to evaluate the feasibility and accuracy of ML to predict AKI stage 3 in HT patients and then to compare the performance to that of existing clinical metrics.

Data collection
Data of all HT patients in Guangdong Provincial People's Hospital were collected and analyzed from January 2018 through September 2020. All patients had undergone primary orthotopic deceased-donor HT due to various causes. Exclusion criteria were recipient age < 18 years at the time of operation, retransplantation, or RRT prior to HT (Fig. 1). We obtained patient data from the hospital database or electronic records. This retrospective study was approved by the Institutional Review Board of the Guangdong Provincial People's Hospital and was conducted in accordance with the Declaration of Helsinki. The need for informed consent was waived given the retrospective nature of the study.

Study features
We retrospectively reviewed the patients' medical records and collected clinical data including demographic features, pretransplant renal function features, liver function features, the use of invasive hemodynamic support therapies, echocardiography features, donor characteristics, aortic clamp time, cardiopulmonary bypass time, and blood transfusion. All features were divided into three subsets: preoperative, perioperative, and donor characteristics.

Study outcomes
The study outcome was post-transplant AKI, defined according to the Kidney Disease: Improving Global Outcomes (KDIGO) criteria [12]: an increase in serum creatinine (SCr) by ≥ 0.3 mg/dl (≥ 26.5 µmol/L) within 48 h or to ≥ 1.5 times baseline within the first 7 postoperative days. AKI was classified into 3 stages depending on the level of SCr: stage 1, SCr increase by ≥ 0.3 mg/dl (≥ 26.5 µmol/L) within 48 h or 1.5-1.9-fold increase from the baseline; stage 2, 2.0-2.9-fold increase from the baseline; stage 3, ≥ 3.0-fold increase from the baseline, increase in SCr to ≥ 4.0 mg/dl (≥ 354 µmol/L), or the start of RRT. Baseline SCr was defined as the last SCr value before HT. The estimated glomerular filtration rate (eGFR) was calculated with the Chronic Kidney Disease Epidemiology Collaboration equation [13].
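The staging rules above can be sketched as a small function. This is only an illustration under stated assumptions, not the authors' code: the function name and arguments are hypothetical, SCr is taken in mg/dl, the 48 h criterion is compared against the pre-transplant baseline, and the 4.0 mg/dl criterion is read as an absolute SCr level, as in the KDIGO guideline wording.

```python
def kdigo_stage(baseline_scr: float, peak_scr_48h: float,
                peak_scr_7d: float, rrt_started: bool = False) -> int:
    """Return the KDIGO AKI stage (0 = no AKI) from SCr values in mg/dl.

    Hypothetical helper: baseline_scr is the last value before HT,
    peak_scr_48h / peak_scr_7d are the highest values within 48 h / 7 days.
    """
    if baseline_scr <= 0:
        raise ValueError("baseline SCr must be positive")

    ratio = peak_scr_7d / baseline_scr
    absolute_rise_48h = peak_scr_48h - baseline_scr

    # Starting RRT is stage 3 regardless of the SCr values.
    if rrt_started:
        return 3

    # AKI definition: rise of >= 0.3 mg/dl in 48 h or >= 1.5x baseline in 7 days.
    if not (absolute_rise_48h >= 0.3 or ratio >= 1.5):
        return 0

    if ratio >= 3.0 or peak_scr_7d >= 4.0:
        return 3
    if ratio >= 2.0:
        return 2
    return 1
```

For example, a patient with a baseline SCr of 1.0 mg/dl whose SCr peaks at 2.5 mg/dl within the first week would be assigned stage 2.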

Machine learning
Feature selection, model building, and model evaluation were all part of the ML system (Fig. 2). It was mostly implemented in WEKA 3.8.
To evaluate the worth of a feature, Pearson's correlation between the feature and the class was calculated. The features were then ranked in descending order of this correlation (Fig. 3), and we selected those most closely related to postoperative AKI stage 3. The details of feature selection are shown in Additional file 1: Table S1.
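A minimal sketch of correlation-based feature ranking follows. It is not the WEKA implementation used in the study; the function names are hypothetical, the class is assumed to be coded 0/1 (1 = AKI stage 3), and ranking by the absolute value of the correlation is an assumption made here for illustration.

```python
import math
from typing import Dict, List, Tuple


def pearson_r(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant feature carries no linear information
    return cov / math.sqrt(var_x * var_y)


def rank_features(features: Dict[str, List[float]],
                  labels: List[int]) -> List[Tuple[str, float]]:
    """Rank features by |r| with the 0/1 class label, strongest first."""
    scored = [(name, pearson_r(values, labels))
              for name, values in features.items()]
    # Assumption: the magnitude of the correlation decides the ranking.
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)
```

Calling `rank_features` with a dictionary of feature columns and the outcome vector returns `(name, r)` pairs, from which the top-ranked features can be kept for model building.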
We compared different classifiers (e.g., logistic regression with L2 regularization, logistic regression, random forest, naïve Bayes, and support vector machine) built on the features highly correlated with AKI stage 3. Detailed information about the model selection is available in Additional file 2: Table S1. After the model selection procedure, logistic regression with L2 regularization was the selected model. L2 regularization shrinks the regression coefficients towards 0 by placing a penalty on the sum of the squared coefficients. Although regularization may lead to biased regression estimates, it results in a more stable model that tends to predict better, particularly when applied to external datasets (further details about the algorithm are available in Additional file 3) [14].
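To make the effect of the L2 penalty concrete, the sketch below fits a logistic regression by plain gradient descent on the mean log-loss plus a ridge term proportional to the sum of squared coefficients. It is a toy illustration, not the WEKA model from the study: the function names, the learning rate, the iteration count and the choice to leave the intercept unpenalized are all assumptions.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(row: List[float], weights: List[float], bias: float) -> float:
    """Predicted probability of the positive class for one patient."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, row)))


def fit_ridge_logistic(X: List[List[float]], y: List[int],
                       l2_penalty: float = 1.0,
                       learning_rate: float = 0.1,
                       n_iter: int = 2000) -> Tuple[List[float], float]:
    """Minimise mean log-loss + (l2_penalty / (2n)) * sum(w_j ** 2)."""
    n, p = len(X), len(X[0])
    weights = [0.0] * p
    bias = 0.0
    for _ in range(n_iter):
        grad_w = [0.0] * p
        grad_b = 0.0
        for row, target in zip(X, y):
            error = predict_proba(row, weights, bias) - target
            grad_b += error
            for j, value in enumerate(row):
                grad_w[j] += error * value
        for j in range(p):
            # Log-loss gradient plus the ridge term that pulls w_j towards 0.
            weights[j] -= learning_rate * (grad_w[j] + l2_penalty * weights[j]) / n
        bias -= learning_rate * grad_b / n  # the intercept is not penalised
    return weights, bias
```

Fitting the same data with a larger `l2_penalty` yields coefficients of smaller magnitude, which is the shrinkage behavior described above.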
A tenfold cross-validation was used to assess the performance of the ML model. The dataset was randomly divided into ten folds with approximately the same number of patients in each fold. Nine folds served as the training set, while the remaining fold served as the validation set. In all, each fold was used nine times as a training set and once as a validation set. Thus, the outcome of each patient was predicted once.
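The fold construction can be sketched as below. The abstract describes the cross-validation as stratified, so this example deals the patients of each outcome class across the folds in turn. The function names and the fixed seed are hypothetical; the actual splitting was performed by the software used in the study.

```python
import random
from collections import defaultdict
from typing import Dict, Iterator, List, Tuple


def stratified_folds(labels: List[int], n_folds: int = 10,
                     seed: int = 0) -> List[List[int]]:
    """Assign patient indices to folds, keeping class proportions similar."""
    rng = random.Random(seed)
    by_class: Dict[int, List[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds: List[List[int]] = [[] for _ in range(n_folds)]
    position = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)  # random order within each class
        for index in indices:
            # Deal indices round-robin so fold sizes differ by at most one.
            folds[position % n_folds].append(index)
            position += 1
    return folds


def train_validation_splits(
        folds: List[List[int]]) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (training, validation) index lists, one pair per fold."""
    for k, validation in enumerate(folds):
        training = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield training, validation
```

With 107 patients, 43 of whom had AKI stage 3, each fold holds 10 or 11 patients, and every patient appears in exactly one validation fold.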

Statistical analysis
Continuous features were tested for normality; those with a normal distribution were presented as mean ± standard deviation, those with skewed distributions were presented as median and interquartile range (IQR), and categorical features were presented as frequency (percentage). Receiver operating characteristic curves were used to evaluate the performance of the ML model and of the reported Cleveland-clinical model in predicting post-transplant AKI stage 3.
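As a reminder of what the area under the receiver operating characteristic curve measures, the sketch below computes it as the probability that a randomly chosen patient with the outcome receives a higher predicted score than a randomly chosen patient without it, with ties counted as one half. The function name is hypothetical, and the study itself used statistical software rather than this code.

```python
from typing import List


def roc_auc(scores: List[float], labels: List[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for s, label in zip(scores, labels) if label == 1]
    negatives = [s for s, label in zip(scores, labels) if label == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes are required")

    concordant = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                concordant += 1.0
            elif pos_score == neg_score:
                concordant += 0.5  # ties count as half a concordant pair
    return concordant / (len(positives) * len(negatives))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.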
In the Cleveland-clinical model, preoperative serum creatinine level, serum albumin level, insulin-requiring diabetes, and cardiopulmonary bypass time were reported, based on a large-sample study, as independent predictors of postoperative AKI [15]. Differences between areas under the curves (AUCs) were compared using the method of DeLong et al. [16]. The accuracy, sensitivity, and specificity of the model at the optimum cutoffs were computed. All statistical analyses were performed with SPSS version 22.0 software (SPSS, Chicago, Illinois, USA) and R statistical software (R Foundation, Vienna, Austria) using RStudio Server version 1.3. All presented significance levels were two-sided, and p < 0.05 was considered significant.
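The text does not state how the optimum cutoff was chosen. One common choice, assumed here purely for illustration, is the cutoff that maximizes the Youden index (sensitivity + specificity - 1). The function names below are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def confusion_at(scores: List[float], labels: List[int],
                 cutoff: float) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) when scores >= cutoff are called positive."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= cutoff else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def best_cutoff_youden(scores: List[float],
                       labels: List[int]) -> Dict[str, float]:
    """Scan observed scores and keep the cutoff with the largest Youden J."""
    if 0 not in labels or 1 not in labels:
        raise ValueError("both outcome classes are required")
    best: Optional[Dict[str, float]] = None
    for cutoff in sorted(set(scores)):
        tp, fp, tn, fn = confusion_at(scores, labels, cutoff)
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        youden_j = sensitivity + specificity - 1.0
        # Assumption: ties are resolved in favour of the lowest cutoff.
        if best is None or youden_j > best["youden_j"]:
            best = {
                "cutoff": cutoff,
                "sensitivity": sensitivity,
                "specificity": specificity,
                "accuracy": (tp + tn) / len(labels),
                "youden_j": youden_j,
            }
    return best
```

The returned dictionary gives the cutoff together with the sensitivity, specificity and accuracy obtained when patients with a predicted score at or above it are classified as high risk.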

Discussion
In the present study, the results suggested that the ML model could be an effective tool for risk stratification and prediction of post-transplant AKI stage 3 for individual patients. The performance of the ML model was superior to that of the reported clinical model validated in large samples at the Cleveland Clinic Foundation. As far as we know, this study is the first to evaluate the predictive capability of ML methods for the assessment of severe postoperative AKI in patients undergoing HT. Early identification and prevention of AKI in patients undergoing HT may play an important role in selecting treatment regimens and thus improving prognosis, given the high short- and long-term mortality risks associated with AKI after HT. When acute renal failure occurs, short-term mortality increases 3.5-fold and 1-year mortality 2.3-fold [1]. However, accurately identifying high-risk patients who may develop AKI is a major challenge in clinical practice. Although traditional risk factors for the prediction of post-transplant AKI have been identified, they are population-based tools [7,17], which are less effective for individual risk evaluations. Furthermore, the traditional features used to predict post-transplant AKI in existing models have relatively limited predictive performance [18], highlighting the need for a more precise model for personalized treatment decisions.
Analyzing and integrating numerous risk features in each individual patient can be a challenging task for the clinician. The increasing number of clinical features affecting risk stratification from various medical checks amplifies the intricacy of assessment and makes it more difficult for clinicians to make a correct decision involving risk stratification in each patient. Moreover, the unanticipated aspects of possible interactions between a few weaker risk features in an individual patient are frequently underestimated [11]. Machine learning, both supervised and unsupervised, can overcome these challenges by deep integration of the experimental and clinical datasets to build powerful risk models and reclassify patient groups [19].
Our results demonstrated that, by integrating clinical information, experimental datasets, and ultrasonography-derived metrics, the ML model (AUC: 0.821) showed superior risk prediction for AKI stage 3 compared with the Cleveland-clinical model (AUC: 0.654). The features in the Cleveland-clinical model had been identified as predictors of AKI by logistic regression analysis in previous studies [15]. In our study, the ML model showed good prognostic performance while considering 53 features and potential feature-feature interactions in patients. This characteristic permits a deep exploration of all available data for non-linear patterns that could predict the risk stratification of a particular individual [14].
As reported in previous studies, the occurrence of AKI is the consequence of multifactorial interactions that cannot be explained by a single etiologic factor [18,20,21]. In our findings, CysC, eGFR, RA-l, LA-ap, SCr and FVII were the predictive factors included in the ML model for predicting the development of AKI stage 3. In particular, CysC, a biomarker for the quantification of kidney function loss, was the predictive factor most strongly related to AKI stage 3, and it may be able to detect AKI one to two days before the rise of SCr with higher accuracy and precision [22]. Furthermore, except for acute renal failure, no other factors were found to alter CysC levels, enhancing its effectiveness as an endogenous marker for predicting AKI. In our findings, the predictive value of eGFR ranked lower than that of CysC. One explanation may be that CysC reflects GFR changes more sensitively than SCr, and eGFR, which is widely used in clinical practice instead of GFR, was calculated from SCr in this study [22].
Cardiac features can reflect the confluence of heart-kidney interactions through hemodynamic dimensions. The difference between arterial perfusion pressure and venous outflow pressure must be sufficiently large to maintain adequate renal blood flow and glomerular filtration. Under this concept, impaired left ventricular function, reflected by a reduced left ventricular ejection fraction (LVEF), lowers forward flow and consequently leads to prerenal hypoperfusion. Interestingly, we found that LVEF had no significant effect on the development of AKI stage 3. This is supported by previous studies; for example, Jin et al. [23] demonstrated that LVEF was not independently or significantly associated with the development of AKI after cardiac operations. This may be explained by a relative preservation of eGFR through efferent arteriolar constriction mediated by the renin-angiotensin system, which compensates for the decreased LVEF. In patients with markedly reduced renal blood flow exceeding renal autoregulatory capacity, this compensatory increase in eGFR is lost and AKI may develop. Alternatively, elevated central venous pressure resulting from changes in right heart structure, such as an increased RA-l diameter, can increase renal resistance; the kidneys may subsequently become more susceptible to AKI. This mechanism has been shown in clinical studies of patients with cardiac dysfunction using invasive hemodynamic measurements [24,25].
The relationship between coagulation factors and the incidence of AKI should be further verified in larger samples. FVII turned out to be a predictor of post-transplant AKI in this study, which is consistent with prior work showing that higher numbers of transfusions, particularly of blood and cryoprecipitate, were associated with the incidence of AKI [26]. However, Jocher et al. found no differences in intraoperative pRBC, FFP, platelets, or coagulation factors between the no-AKI and AKI groups, suggesting that transfusion was not a risk factor for AKI [27]. The decision to transfuse is influenced by unmeasured factors, such as the severity of intraoperative bleeding and pre-existing comorbidities.
There was a high incidence of AKI (71%) in this study, near the upper end of the 22-76% range reported in prior studies [3][4][5][6]. We speculate that there are several reasons. First, our cohort had a long cardiopulmonary bypass (CPB) duration, which has been associated with a higher incidence of postoperative AKI [26][27][28]. Several mechanisms may play crucial roles, including renal hemodynamic changes (hemodilution, hypothermia, and non-pulsatile flow) and hemolysis caused by turbulent flow and occlusive roller pumps, leading to the generation of reactive oxygen species [21]. In addition, most of our patients were admitted to hospital for acute heart failure, and the incidence of right heart failure in particular was relatively high; right ventricular (RV) function is a central determinant of cardiorenal syndrome hemodynamics [20]. Finally, patients after HT undergo intrinsic oxidative stress as well as systemic and intrarenal inflammation, which are related to AKI [29,30]. This could be because renal tubular epithelial cells are extremely susceptible to oxidative stress, particularly during the ischemia-reperfusion phase.

Study limitations
This study has several limitations. First, it was a single-center study with a relatively limited sample size. Although logistic regression with L2 regularization can mitigate the over-fitting that may occur with a small sample size, a multicenter study is needed to confirm our findings. Second, although we appraised 53 diverse features with the ML algorithm, we did not consider additional features that may contribute to better risk prediction, such as cardiac magnetic resonance metrics, because of the retrospective nature of the study. Third, we did not conduct external validation with an independent dataset from other centers to verify the robustness of our results; this is a direction for our future research.

Conclusions
In summary, the ML model based on preoperative and perioperative features can serve as an effective tool for the prediction of post-transplant AKI stage 3. Through the model, the risk of an individual patient with potential AKI stage 3 after HT could be identified accurately, enabling a timely intervention.