Development and external validation of an online clinical prediction model for augmented renal clearance in adult mixed critically ill patients: the ARC predictor

Background Augmented renal clearance (ARC) might lead to subtherapeutic plasma levels of drugs with predominant renal clearance. Early identification of ARC remains challenging for the intensive care unit (ICU) physician. We developed and validated the ARC predictor, a clinical prediction model for ARC on the next day during ICU stay, and made it available via an online calculator. Its predictive performance was compared with that of two existing models for ARC.

Methods A large multicenter database including medical, surgical and cardiac ICU patients (n = 33258 ICU days) from three Belgian tertiary care academic hospitals was used for the development of the prediction model. Development was based on clinical information available during ICU stay. We assessed performance by measuring discrimination, calibration and net benefit. The final model was externally validated (n = 10259 ICU days) in a single-center population.

Results ARC was found on 19.6% of all ICU days in the development cohort. Six clinical variables were retained in the ARC predictor: day from ICU admission, age, sex, serum creatinine, trauma and cardiac surgery. External validation confirmed good performance, with an area under the receiver-operating characteristics curve of 0.88 (95% CI 0.87-0.88), and a sensitivity and specificity of 84.1% (95% CI 82.5-85.7) and 76.3% (95% CI 75.4-77.2), respectively, at the default threshold probability of 0.2.

Conclusions The ARC predictor is a clinical prediction model for ARC on the next day, using routinely collected clinical information that is readily available at the bedside. The ARC predictor is available at www.arcpredictor.com.

Introduction
Augmented renal clearance (ARC), referring to enhanced renal elimination, has been identified over the last decade as being of significant clinical importance in patients admitted to the intensive care unit (ICU). Reported incidences of ARC in the ICU vary between 16 and 100%, depending on the subset of patients and the definition employed for ARC (1-4). ARC has the potential to result in subtherapeutic plasma levels and consequent therapeutic failure, as the kidney is the primary excretory pathway for many (hydrophilic) drugs (5-12). Most extensive ARC research has been conducted for antimicrobial drugs, although anti-epileptics and anticoagulants are also susceptible to therapeutic failure due to ARC, as none of these drugs can be titrated directly to their clinical effect. To avoid therapeutic failure, increasing the dose and performing therapeutic drug monitoring might be warranted when prescribing drugs with predominant renal excretion (1,2,4,10,13-15).
Currently, a 24-hour urine collection to calculate creatinine clearance (CrCl24h) is considered the best available standard to measure renal function in daily ICU practice. Formulae estimating kidney function have been shown to be inaccurate in ICU patients, especially in patients with ARC (16-20). Nevertheless, estimating formulae are still often used in ICU routine to monitor kidney function, owing to the complex and time-consuming process of a CrCl24h measurement (20). Moreover, a CrCl24h result only becomes available the next day, which makes it difficult to anticipate ARC. Therefore, efforts have been made to identify risk factors in order to allow identification of patients at high risk of ARC (1,2,4).
Identification of a patient at high risk for ARC can prompt the clinician to empirically increase the antibiotic dose and to order a CrCl24h for the next day to confirm ARC.
In spite of numerous publications reporting risk factors for ARC (1-4,10,14,20,21), prediction models for ARC have been developed by only two groups. These models were, however, developed in small and selected cohorts of ICU patients, and have not been externally validated or assessed for net benefit (16,22). As a result, there is still a need for a robust, clinically relevant and externally validated prediction model for ARC that is applicable in a heterogeneous population of critically ill patients.
The aim of this study was the development and external validation of the ARC predictor, a clinical prediction model for ARC on the next day, in a large and heterogeneous population of critically ill patients. We wanted the ARC predictor to be applicable in a broad ICU setting and easy to implement in clinical practice.
The performance of the ARC predictor was compared to the existing models for ARC, the "ARC score" (22) and the "ARCTIC scoring system" (16).

Methods

Development cohort
The M@tric database is a large high-quality multicenter database, containing data from all adult patients admitted to three academic hospitals in Belgium (University Hospitals Antwerp, Ghent, and Leuven) (23).

ARC definition and variable selection
CrCl24h was calculated using the 24-h timed urinary volume (UV, mL) collected over one complete ICU day (7AM-7AM, 1440 min), the mean urinary creatinine concentration (UCr, mg/dL) and the mean serum creatinine concentration (SCr, mg/dL) over this ICU day, and was corrected for an average body surface area using the Du Bois formula: CrCl24h = (UCr × UV) / (SCr × 1440) × 1.73 / (0.007184 × height (cm)^0.725 × weight (kg)^0.425). ICU days on which the necessary data to calculate a CrCl24h on the next day were not available were excluded from the final development cohort.
ARC was defined as a CrCl24h ≥ 130 mL/min/1.73 m², in accordance with the consensus definition of ARC in the current literature (1,3). The predictors used in the development of the ARC predictor were selected based on the current literature (1-4), expert consensus and data availability. The selected predictors were day from ICU admission, age, sex, SCr, urinary output, vasopressor use, mechanically assisted ventilation, comorbidities, trauma, neurotrauma, surgery, cardiac surgery and sepsis (detailed description in Additional file 1: Table S1).
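For illustration, the calculation and definition above can be sketched in a few lines of R (the language used for all analyses in this study); the function and variable names below are ours, not part of the study code.

```r
# Minimal sketch of the CrCl24h calculation described above
# (illustrative names, not the study code).
crcl24h <- function(ucr, uv, scr, height_cm, weight_kg) {
  # Du Bois body surface area (m^2)
  bsa <- 0.007184 * height_cm^0.725 * weight_kg^0.425
  # Unindexed creatinine clearance (mL/min) from the 24-h collection
  raw <- (ucr * uv) / (scr * 1440)
  # Index to a standard body surface area of 1.73 m^2
  raw * 1.73 / bsa
}

# ARC flag, using the consensus definition (CrCl24h >= 130 mL/min/1.73 m^2)
is_arc <- function(crcl) crcl >= 130

# Example with hypothetical values for one ICU day
crcl <- crcl24h(ucr = 85, uv = 2400, scr = 0.6, height_cm = 180, weight_kg = 75)
is_arc(crcl)  # TRUE: this ICU day would be labeled as ARC
```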

Model development
The development cohort was randomly divided, at ICU day level, into a training set (80%) and an internal validation set (20%). The ARC predictor was developed by applying generalized estimating equation (GEE) logistic regression with ARC on the next day as outcome (with ICU stay as clustering variable) and backward feature selection on the training set (Additional file 2: detailed model development) (24). At each step, decision curve analysis (DCA) was performed to evaluate the model's net benefit in the internal validation set. Net benefit is the number of true positives identified by a prediction model, corrected for the number of false positives weighted by the odds of the threshold probability. Net benefit should be larger than that of the alternative strategies (i.e. 'all ARC', meaning "assume all days show ARC", or 'none ARC', meaning "assume none of the days show ARC") over the range of threshold probabilities that would be used in clinical practice. A threshold probability is the predicted probability above which a patient would be classified as showing ARC on the next day (25-27). Performance of the model was subsequently assessed in the internal validation set.
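A minimal R sketch of these two building blocks is given below, using the geepack package for the GEE fit and a hand-rolled net benefit function. The data frame, variable names and working correlation structure are assumptions for illustration; the paper's exact specification (including the full candidate predictor set before backward selection) may differ.

```r
library(geepack)

# GEE logistic regression: ARC on the next day as outcome, ICU stay as
# clustering variable. `icu_days` is a hypothetical data frame (one row
# per ICU day, ordered within stay); only the retained predictors are shown.
fit <- geeglm(
  arc_next_day ~ day_from_admission + age + sex + scr + trauma + cardiac_surgery,
  id = stay_id,
  data = icu_days,
  family = binomial(link = "logit"),
  corstr = "exchangeable"
)

# Net benefit at threshold probability pt:
#   NB(pt) = TP/n - FP/n * pt / (1 - pt)
# i.e. true positives corrected for false positives, weighted by the odds
# of the threshold probability.
net_benefit <- function(y, p, pt) {
  pos <- p >= pt
  tp  <- sum(pos & y == 1)
  fp  <- sum(pos & y == 0)
  n   <- length(y)
  tp / n - fp / n * pt / (1 - pt)
}

# Decision curve: net benefit over a range of thresholds, compared with
# 'all ARC' (classify every day as ARC) and 'none ARC' (net benefit 0).
p_hat    <- predict(fit, type = "response")
pt_grid  <- seq(0.01, 0.71, by = 0.01)
nb_model <- sapply(pt_grid, function(pt) net_benefit(icu_days$arc_next_day, p_hat, pt))
nb_all   <- sapply(pt_grid, function(pt) net_benefit(icu_days$arc_next_day, rep(1, length(p_hat)), pt))
```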
Receiver-operating characteristics (ROC) curves and the area under the ROC curve (AUROC) were used to assess discrimination. Calibration was assessed using calibration plots (25). Furthermore, sensitivity, specificity, negative predictive value, positive predictive value, negative likelihood ratio and positive likelihood ratio were calculated. The Youden index (28) was estimated to determine the threshold probability at which sensitivity and specificity are maximized. If DCA showed net benefit at this threshold probability, it was used as the default threshold probability for further assessment of performance. For all performance parameters, bootstrap 95% confidence intervals were calculated.
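The discrimination and Youden-index steps can be illustrated with the pROC package; `y` (observed ARC on the next day) and `p` (predicted probabilities) are hypothetical stand-ins for the internal validation set, and the bootstrap replicate count follows the 2000 replicates reported with Table 2.

```r
library(pROC)

# ROC curve and AUROC on hypothetical validation-set vectors `y` and `p`
roc_obj <- roc(response = y, predictor = p)
auc(roc_obj)                                          # discrimination (AUROC)
ci.auc(roc_obj, method = "bootstrap", boot.n = 2000)  # bootstrap 95% CI

# Youden index: threshold maximizing sensitivity + specificity - 1
coords(roc_obj, x = "best", best.method = "youden",
       ret = c("threshold", "sensitivity", "specificity"))
```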

Validation cohort
For external validation, a single-center retrospective study was performed. All adult patients admitted to the ICUs of the University Hospitals Leuven, Belgium, between January 2016 and December 2016 were screened for eligibility. The same inclusion and exclusion criteria as described above for the development cohort were applied.
The data needed to calculate CrCl24h, and the predictors retained in the ARC predictor were retrieved from the clinical patient data management system database (Metavision®; IMD-Soft®, Needham, MA, USA) from the University Hospitals Leuven, and were pseudonymized.

External validation and comparison with ARC score and ARCTIC score
Performance was assessed as described above for the internal validation set, at the same threshold probability.
To compare the ARC predictor with the ARC score, a subset from the validation cohort was selected for which the sequential organ failure assessment (SOFA) score was available, as this is needed to calculate the ARC score. As suggested by Akers et al. (29) and Barletta et al. (16), who evaluated the diagnostic accuracy of the ARC score, a cutoff of 7 or higher was considered as a positive prediction for ARC.
For comparison with the ARCTIC score, a subset from the validation cohort with trauma related diagnosis on admission was selected, as this score was developed in trauma patients. As suggested by the authors of the ARCTIC score, a cutoff of 6 or higher was considered as a positive prediction for ARC (16).
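In code, both comparators reduce to simple cutoff rules on precomputed score values; the vectors below are hypothetical per-day scores, assumed already calculated from their respective component variables.

```r
# Positive ARC prediction by each comparator score at its published cutoff
pred_arc_score <- arc_score_value >= 7     # ARC score, Udy et al. (22)
pred_arctic    <- arctic_score_value >= 6  # ARCTIC score, Barletta et al. (16)
```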

Ethical approval
M@tric data collection has ethical committee (EC) approval from the participating university hospitals. The need for a written informed consent was waived.

Statistical analysis
All statistical analyses were performed in R (version 3.5; The R Foundation for Statistical Computing, Vienna, Austria). The two-sided significance level was set at 0.05. Continuous data were presented as median and interquartile range; categorical data were presented as count and percentage. Sample size was not deemed an issue, as we anticipated a very large number of inclusions and a relatively high number of events; we therefore performed all statistical analyses as complete-case analyses.

Results

Model building and internal validation
For the development and internal validation of the ARC predictor, we included data from 4267 ICU stays, representing 33258 patient days (Additional file 3: Figure S1). The ARC predictor presented a clear benefit over the 'all ARC' and 'none ARC' strategies over a broad range of threshold probabilities (0.01-0.71) (Additional file 5: Figure S2 (D)). Maximized sensitivity and specificity were found at a threshold probability of 0.2, corresponding to the pretest probability of ARC in the development cohort. Hence, this was used as the default for further assessment. At the default threshold probability of 0.2, the ARC predictor performed very similarly to the model before feature selection (Additional file 6: Table S3). Fig. 1 (A) shows that the ARC predictor is well calibrated over a broad range of probabilities in the internal validation set. The intercept is 0.12 and the calibration slope is 0.95.
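Calibration intercept and slope such as these are conventionally obtained by logistic recalibration, regressing the outcome on the logit of the predicted probabilities. A minimal sketch, assuming hypothetical validation-set vectors `y` and `p`, and not necessarily the authors' exact implementation:

```r
# Logistic recalibration on observed outcomes `y` and predictions `p`
lp <- qlogis(p)  # logit of the predicted probabilities

# Calibration slope (ideal value 1): coefficient of the linear predictor
slope <- coef(glm(y ~ lp, family = binomial))["lp"]

# Calibration intercept (ideal value 0): estimated with the slope fixed at 1
intercept <- coef(glm(y ~ offset(lp), family = binomial))["(Intercept)"]
```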

External validation
The validation cohort included 1713 ICU stays, representing 10259 days. ARC was found on 19.4% of the patient days, similar to the development cohort. The predictors retained in the ARC predictor were present in proportions comparable to the development cohort, except for higher incidences of trauma- and cardiac surgery-related diagnoses on admission in the validation cohort (Additional file 7: Table S4).
As depicted in Fig. 3 (B), DCA confirmed the net benefit of the ARC predictor in the validation cohort over a broad range of threshold probabilities (0.01-0.71). Clinical performance similar to that in the internal validation set was obtained at the default threshold probability of 0.2, as shown in Table 2. Fig. 1 (B) shows that the ARC predictor is well calibrated over the whole range of probabilities in the validation cohort. The intercept is 0.07 and the calibration slope is 1.06.

Comparison with ARC score and ARCTIC score
The ARC predictor clearly outperforms the ARC score, as shown in Fig. 2 (A/B). The ARC score (22) developed by Udy et al. showed a higher specificity at the cost of a very low sensitivity, and a lower AUROC, as reported in Table 2.
The ARC predictor performs similarly to the ARCTIC score, as shown in Fig. 2 (C/D). Although showing a higher specificity, the ARCTIC score (16) developed by Barletta et al. showed a lower AUROC and a lower sensitivity, as reported in Table 2.

Online ARC calculator
The online ARC calculator is available at www.arcpredictor.com.

Discussion
The ARC predictor is a GEE logistic regression model that predicts ARC on the next day in a heterogeneous population of critically ill patients admitted to the ICU.
Hence, the ARC predictor allows clinicians to anticipate ARC, and to increase dosing or order therapeutic drug monitoring when prescribing drugs with predominant renal clearance (e.g. many antimicrobials). The prediction model uses six universally available clinical predictors: day from ICU admission, age, SCr on the previous day, sex, trauma and cardiac surgery. The ARC predictor was built on a large multicenter prospective database and was internally validated in a previously unseen subset of this development cohort. It was externally validated in a large, separate, single-center retrospective cohort. Upon external validation, the ARC predictor had good discrimination, was well calibrated, and demonstrated superior net benefit across a broad range of threshold probabilities. The ARC predictor is easily accessible via an online calculator and as such is readily applicable in daily bedside clinical practice in the ICU.
Extensive research has been published on the prevalence of, and risk factors for, ARC over the last decade (1-4,10,14,20,21). However, a reliable predictor for ARC in daily ICU routine had not been introduced so far. Our results confirm that ARC is prevalent in a heterogeneous ICU population. The majority of ARC episodes lasted at least two days. ARC was found on 19.6% and 19.4% of the days in the development and validation cohorts, respectively, which is at the lower end of the incidences reported in previous literature (16-100%) (1-4). This difference is most likely explained by the different case-mix in the present study, which included all critically ill patients, whereas other studies focused on subsets of ICU patients with an already elevated risk of ARC and often excluded patients with decreased CrCl (elevated SCr values) (1,2,4). The incidence of ARC may have been underestimated in our cohort owing to the exclusion of short ICU stays (fewer than two complete days), inherent to the study design. In addition, the present study investigated ARC for each ICU day separately, whereas most studies report ARC incidence at patient level (1-4). However, ARC is a dynamic state that can (dis)appear during ICU stay (1,21). Therefore, a GEE model was used in this study, which allows prediction at ICU day level while taking into account clustering per ICU stay.
When using the ARC predictor, it might be challenging for the clinician to choose a threshold probability above which patients are classified as showing ARC on the next day. The rationale for this choice is given in Fig. 3, and an illustration is provided in Additional file 8. In short, the ARC predictor shows net benefit over the alternative strategies across a broad range of threshold probabilities (0.01-0.71), confirming its clinical usefulness when used within this range. The default threshold probability is set at 0.2, as the ARC predictor showed good performance at this threshold, and it is adequate for many clinical decisions concerning antimicrobial dosing in the ICU.
The threshold probability reflects the trade-off between the harm associated with a false positive and the benefit associated with a true positive. Hence, the default threshold probability of 0.2 reflects that benefit is valued 4 times greater than harm ((1-0.2)/0.2 = 4), i.e. up to 4 false positives are accepted per true positive. In the online calculator, the user can change the threshold probability according to the clinical context. The user should never select a threshold probability above 0.71.
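As a quick sanity check of this arithmetic, the harm-to-benefit odds implied by any threshold can be computed directly:

```r
# Number of false positives accepted per true positive at threshold pt
threshold_odds <- function(pt) (1 - pt) / pt
threshold_odds(0.2)  # 4: a true positive is valued 4 times a false positive
threshold_odds(0.5)  # 1: harms and benefits weighted equally
```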
We have shown that the ARC predictor performs better than the ARC score and similarly to the ARCTIC score. The ARC score was developed in a sample of 71 septic or multi-trauma patients (22). The ARCTIC score was developed in 133 trauma patients (16). Both studies excluded patients based on SCr, and neither evaluated clinical utility or calibration, nor performed an external validation. Hence, the ARC predictor outperforms both scores in terms of clinical performance and generalizability.
This study and the ARC predictor have several strengths. First, the ARC predictor was developed in a very large and heterogeneous multicenter adult ICU population, making it widely applicable. All types of ICU-related diagnoses were included; presence of renal replacement therapy (RRT) and unavailability of CrCl24h on the next day were the only exclusion criteria. Second, the ARC predictor contains six variables that are readily available to the clinician at any time during ICU stay. The online calculator makes bedside implementation of the ARC predictor easy. Third, the ARC predictor reports a prediction for ARC on the next day, which allows the clinician to anticipate the (dis)appearance of ARC during ICU stay. Fourth, net benefit was accounted for by performing DCA. Fifth, the ARC predictor was compared with the two existing scoring systems for ARC, outperforming them in terms of clinical performance.
Finally, in an external validation cohort, good performance of the ARC predictor was confirmed, increasing its generalizability.
This study and the ARC predictor also have several limitations. First, the ARC predictor was not prospectively validated. Second, there is always an inherent risk when building a model on a split-sample dataset. As the development cohort was split at ICU day level, some patients' data were included in both the training and the internal validation set. However, external validation was performed in an independent validation cohort. Still, the danger of overfitting has been assessed and is probably low. Nevertheless, the ARC predictor should be used with care in non-academic hospitals and outside Belgium, since it has not been trained and validated outside this setting. In addition, burn patients were not included in the development cohort. Third, there might be selection bias due to the selection of ICU days with a CrCl24h on the next day, hence excluding short ICU stays of fewer than two complete days. The development and validation cohorts were derived from clinical databases, and short-stay patients, or patients in whom the treating physicians judged a measured CrCl24h unnecessary, might still show ARC. Fourth, even though M@tric is a high-resolution dataset in which many ICU parameters are registered, it cannot be excluded that other important, non-registered predictors of ARC have been missed. For instance, the SOFA score was not included in the model development because it was unavailable or incomplete on many days; in a subset of 16694 days from M@tric, the SOFA score did not contribute significantly to the prediction of ARC. Finally, as with every prediction model, a predicted probability should not be used as a surrogate for the presence or absence of the diagnosis, i.e. ARC. Moreover, in real life, ARC is not binary; hence, the ARC predictor might help to predict ARC, but measuring the actual CrCl24h remains important to confirm and monitor ARC.

Conclusion
The ARC predictor is a clinical prediction model for ARC on the next day that is applicable in a heterogeneous population of ICU patients. It is a user-friendly model, made publicly available via an online calculator (www.arcpredictor.com), allowing bedside implementation in every ICU. Moreover, good clinical performance and net benefit were confirmed in an external validation cohort.

Ethics approval and consent to participate
Approval for the present study was obtained from the ethical committee of the University Hospitals Leuven (S61364) for the use of the M@tric dataset, as well as the retrospective Leuven dataset. The need for a written informed consent was waived.

Consent for publication
Not applicable.

Availability of data and materials
The datasets used and analyzed during the current study are available from the M@tric research group (www.matric.be) on reasonable request.

Competing interests
The authors declare that they have no competing interests.

Table 2 footnotes: a Bootstrap 95% confidence intervals (2000 replicates); b subset from the validation cohort with sequential organ failure assessment score available; c subset from the validation cohort with trauma-related diagnosis on admission. N = number of ICU days; AUROC = area under the receiver-operating characteristics curve.

Figures
Figure 1 Calibration curves for the ARC predictor in the internal validation set (A) and the external validation cohort (B).
Figure 2 Receiver operating characteristics analysis: comparison of the ARC predictor with the ARC score (A/B) and the ARCTIC score (C/D).
Figure 3 Decision curves for the ARC predictor in the internal validation set (A) and the external validation cohort (B).