Given the current increase in the global incidence of CKD, classifying patients at risk becomes a relevant tool for doctors to achieve an early diagnosis of the disease. In addition, XAI could improve such prediction models by meeting healthcare professionals’ demand to understand the decisions the models make. With more explainable CKD prediction models, doctors could make more data-driven decisions and focus on controlling the underlying features or indicators to slow the progressive damage to the kidney.
This paper describes a CKD prediction model developed to tackle early diagnosis, seeking not only high accuracy but also an analysis of the explainability of its results. To the best of our knowledge, this research thus enlarges the body of work dedicated to AI for CKD diagnosis from a novel perspective that focuses on the model’s explainability. By using post-hoc explainability techniques, this work aims to “open” the black box of ensemble tree classifiers when predicting CKD.
The development of the explainable CKD prediction model is based on a data management pipeline that automatically infers the configuration yielding the best classification performance: the most appropriate ensemble tree algorithm, the feature selection method and the relevant features it selects, and the data imputation technique. Moreover, the pipeline evaluates the model’s performance on new unseen data (30% of the original dataset), which emulates a deployment in a real clinical environment. However, the model’s performance might differ in practice, since actual medical records are not usually as curated as the dataset employed.
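As an illustration, the kind of joint search such a pipeline performs could be sketched with scikit-learn as follows. The dataset, parameter grids, and estimator choices here are hypothetical stand-ins, not the actual SCI-XAI implementation:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline

# Synthetic stand-in for the 24-feature CKD dataset.
X, y = make_classification(n_samples=400, n_features=24, n_informative=5,
                           random_state=0)

# Hold out 30% of the data to emulate new unseen patients.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.30,
                                          stratify=y, random_state=0)

pipe = Pipeline([
    ("impute", SimpleImputer()),         # data imputation step
    ("select", SelectKBest(f_classif)),  # feature selection step
    ("clf", RandomForestClassifier(random_state=0)),
])

# Jointly search the imputation strategy, the number of retained
# features, and the ensemble-tree classifier.
param_grid = {
    "impute__strategy": ["mean", "median"],
    "select__k": [3, 8, 24],
    "clf": [RandomForestClassifier(random_state=0),
            GradientBoostingClassifier(random_state=0)],
}
grid = GridSearchCV(pipe, param_grid, cv=5, scoring="f1")
grid.fit(X_tr, y_tr)

# Accuracy of the best configuration on the held-out 30%.
test_accuracy = grid.best_estimator_.score(X_te, y_te)
```

Wrapping the three steps in a single `Pipeline` ensures that imputation and feature selection are fitted only on the training folds, avoiding leakage into the held-out evaluation.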
Considering our classification results, this work achieves a performance on par with the state of the art of CKD prediction models found in the literature, especially when comparing the number of features selected. The SCI-XAI pipeline’s feature selection step has therefore proven valuable by substantially reducing the original number of features, leaving 3 out of 24 when using the XGBoost classifier, the smallest feature set among CKD prediction models in the literature. Furthermore, 3 out of the 4 ensemble learning algorithms considered obtain their best classification results with only 33% of the original features, showing the capability of the pipeline to detect relevant features when building the prediction model.
To the best of our knowledge, this paper is the first in the literature to address an explainability analysis of a CKD prediction model selected from an accuracy-explainability trade-off perspective. Thus, albeit not obtaining the best classification performance, XGBoost is selected as the most balanced model, exemplifying the tension between accuracy and explainability faced by prediction models intended for domains where understanding the results is crucial (e.g., healthcare).
Regarding the analysis of feature importance in the prediction model, the hemo (hemoglobin) feature stands out as the most relevant in all the post-hoc analysis techniques considered, followed by sg (specific gravity) and then htn (hypertension). It is worth highlighting the utility of partial dependence plots (PDPs) to identify thresholds at which a certain feature modifies the predicted probability. For instance, this work establishes thresholds at 12.3 gms for hemo and 1.015 for sg, above which the predicted probability of CKD starts to decrease, implying that doctors could set up a treatment to keep the patient above these values and reduce the probability of CKD. Moreover, the local explainability results exemplify how XAI could contribute to promoting personalized medicine by showing the relevance of the different features for an individual prediction.
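A minimal sketch of how such a threshold could be read off a partial dependence curve with scikit-learn’s `partial_dependence` (synthetic stand-in data, not the CKD records; the steepest-change heuristic is an illustrative assumption, not the paper’s method):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import partial_dependence

# Synthetic stand-in for the three retained clinical indicators.
X, y = make_classification(n_samples=300, n_features=3, n_informative=3,
                           n_redundant=0, random_state=1)
model = GradientBoostingClassifier(random_state=1).fit(X, y)

# Average model response as feature 0 sweeps over its value grid.
pd_result = partial_dependence(model, X, features=[0], kind="average")
key = "grid_values" if "grid_values" in pd_result else "values"  # renamed in sklearn 1.3
grid_points = pd_result[key][0]
avg_response = pd_result["average"][0]

# Crude threshold estimate: the grid point where the partial
# dependence curve changes most steeply.
threshold = grid_points[int(np.argmax(np.abs(np.diff(avg_response))))]
```

On the real model, plotting `avg_response` against `grid_points` for hemo or sg is what reveals the 12.3 and 1.015 inflection values reported above.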
With the results described in this work, the added value of explainability to a clinical prediction model is exhibited. Besides, the feature selection approach is valuable not only for improving the explainability of clinical prediction models but also for reducing the cost of the diagnosis, since fewer clinical indicators must be extracted. Thus, since this explainable CKD prediction model requires only 3 features (hemo, sg, and htn), the cost of extracting them, following the price list defined by Salekin et al. [42], is 1.65 USD for hemo (a hemoglobin test) and no cost for specific gravity (sg) and hypertension (htn). Therefore, the cost associated with an early diagnosis of CKD using this explainable prediction model would be around 1.65 USD, which would have an important impact in developing countries where medical access is more difficult [43].
Our research presents some limitations. First, the present study employs a widely used CKD dataset from the UCI Machine Learning repository which, although it allows benchmarking against related research works, prevents a fully independent evaluation. Since the number of patients is relatively small, a K-fold cross-validation approach has been adopted to foster the model’s generalization ability. However, to conclusively validate the results, more CKD data from a clinical setting different from the original would be needed, which is planned as future work.
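For reference, the adopted validation strategy corresponds to the standard K-fold scheme, sketched here with scikit-learn on synthetic stand-in data (K and the scoring metric are illustrative assumptions):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Small synthetic cohort standing in for the CKD dataset.
X, y = make_classification(n_samples=200, n_features=24, random_state=0)

# Stratified folds keep the CKD/non-CKD class ratio in every fold,
# which matters when the cohort is small.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(RandomForestClassifier(random_state=0),
                         X, y, cv=cv, scoring="f1")
mean_f1 = scores.mean()
```

Averaging the score over all folds gives a less optimistic estimate of generalization than a single train/test split on a small dataset.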