We used ML algorithms to predict the requirement for revascularization in patients with CAD using only basic clinical information at the time of admission. The AUC for the predictive value was 0.83 in the training set and 0.79 in the external validation set. These encouraging results suggest that our ML algorithm can help develop treatment strategies for individual patients with CAD.
We found that (1) ML was 75% accurate in predicting strategies with an AUC of 0.83 in the training set; in the external data validation, ML reached 76% accuracy with an AUC of 0.79; (2) ML had 88% precision for predicting treatment strategies, especially for medication-treated patients in the training set; and (3) ML reached 86% precision and 81% recall for predicting medication-treated patients in the external data validation.
In the training set, ML's precision for patients treated with medication was high at 0.88; however, the recall was lower at 0.66. For patients treated with revascularization, precision was 0.66, recall was 0.88, and the overall AUC was 0.83. When validated with external data, the ML prediction model performed well for patients treated with medication, with precision and recall of 0.86 and 0.81, respectively; however, the prediction for revascularization therapy was poor, performing less well than the prediction for medication therapy. The prediction of overall patient outcome, with an area under the ROC curve of 0.83 for ML, was better than that of the GRACE score (0.68). We also found that, with proper calibration, the prediction of outcome events can be enhanced. Implementing ML models in clinical settings could automate the selection of candidates who might benefit most from additional diagnostic testing while avoiding time-consuming and unnecessary routine clinical steps.
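The comparison between the ML model and the GRACE score rests on the area under the ROC curve. As a minimal sketch of what that number measures, the AUC can be computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The function name `roc_auc` and all labels and scores below are hypothetical toy values chosen for illustration; they are not study data and do not reproduce the reported 0.83 or 0.68.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied pairs count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT study data): 1 = revascularization, 0 = medication.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
model_scores = [0.1, 0.3, 0.4, 0.6, 0.5, 0.7, 0.8, 0.9]     # stand-in for an ML output
baseline_scores = [0.2, 0.5, 0.6, 0.7, 0.3, 0.4, 0.8, 0.9]  # stand-in for a risk score

auc_model = roc_auc(labels, model_scores)
auc_baseline = roc_auc(labels, baseline_scores)
print(f"model AUC = {auc_model:.4f}, baseline AUC = {auc_baseline:.4f}")
```

In this toy example the first score ranks the two classes more consistently than the second, which is the sense in which one predictor has the higher AUC.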
Correctly identifying patients at high risk who may benefit from appropriate treatment will improve patient clinical outcomes. The GRACE risk score is a validated predictor of adverse outcomes in CAD patients, and recent studies showed that the GRACE score could assess the severity of coronary artery stenosis in patients with CAD[10, 11]. Current guidelines recommend the GRACE risk score to perform risk stratification in CAD, especially for patients with acute coronary syndrome[14]. Even though the GRACE score is easy to implement, the score in isolation was associated with significant over- and under-treatment, suggesting the need for more accurate assessments using a wider range of clinical variables[3, 4].
Integrating a patient's various clinical information for risk scoring is a challenge for cardiovascular physicians, and the complexity of assessment is increasing as additional clinical variables need to be considered. As a result, predicting risk in individual patients is difficult. In the present study, we showed that our ML model overcame these challenges, providing deep integration of comprehensive clinical data.
Our study differs from previous studies in several respects. Most previous studies were designed to predict clinical outcomes after coronary artery revascularization, and most relied on data from non-invasive (coronary computed tomography angiography) or invasive (coronary angiography, CAG) coronary angiography and on assistive technologies such as cardiac magnetic resonance, intravascular ultrasound, or fractional flow reserve[12, 13]. In the present study, by contrast, we used an ML model to predict whether patients with CAD could be treated with immediate revascularization based only on clinical data, history, and laboratory findings in the emergency department.
We used an ML approach, a form of artificial intelligence that differs from traditional prognostic methods in that it makes no a priori assumptions regarding the cause of disease. This characteristic permits agnostic exploration of available data that may predict the risk to individuals (i.e., precise risk stratification). This approach diverges from the "hypothesis-driven" approach in standard prognostic risk assessment[15, 16].
We found that the precision for class 0 (medication treatment) and the recall for class 1 (revascularization treatment) were both high in the two data sets, especially in the training set. The recall was also high for class 0 (medication treatment) in the validation set. The high precision for class 0 means that true class 0 instances account for a high proportion of all instances predicted as class 0, so the model rarely misjudges class 1 as class 0. The high recall for class 1 means that the instances correctly identified as class 1 account for a high percentage of all true class 1 instances, so the model rarely misses class 1 cases. The same interpretation applies to the high recall for class 0.
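The per-class definitions used above can be made concrete with a short sketch. For a class c, precision is the number of correct class c predictions divided by all predictions of class c, and recall is the number of correct class c predictions divided by all true class c instances. The function name `per_class_precision_recall` and the label lists are hypothetical toy values for illustration only; they are not study data.

```python
from collections import Counter


def per_class_precision_recall(y_true, y_pred, labels=(0, 1)):
    """Return {label: (precision, recall)} from paired true/predicted labels."""
    pairs = Counter(zip(y_true, y_pred))  # counts of (true, predicted) pairs
    report = {}
    for c in labels:
        tp = pairs[(c, c)]
        predicted_c = sum(n for (t, p), n in pairs.items() if p == c)
        actual_c = sum(n for (t, p), n in pairs.items() if t == c)
        precision = tp / predicted_c if predicted_c else 0.0
        recall = tp / actual_c if actual_c else 0.0
        report[c] = (precision, recall)
    return report


# Hypothetical toy labels (NOT study data): 0 = medication, 1 = revascularization.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]

report = per_class_precision_recall(y_true, y_pred)
for label, (precision, recall) in report.items():
    print(f"class {label}: precision = {precision:.2f}, recall = {recall:.2f}")
```

In this toy example, class 0 has higher precision than recall and class 1 has higher recall than precision, the same qualitative pattern described for the training set.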
The neutrophil-to-lymphocyte ratio showed the highest predictive weight for the outcome. Because ML does not require prior assumptions about the relationship between cause and outcome, it avoided overlooking important but unexpected predictor variables or interactions and allowed us to identify clinically essential risks in patients with multiple marginal risk factors. Machines can quickly and seamlessly integrate new data to update and optimize their algorithms, thereby improving their predictive performance over time.
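The text does not state how predictive weight was derived. One common model-agnostic option is permutation importance: shuffle one variable at a time and measure how much accuracy drops. The sketch below illustrates that general technique only and is not the method used in this study. The stand-in `toy_model`, its arbitrary threshold of 3.0, and all rows and labels are hypothetical.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, n_features, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(n_features):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild each row with only feature j replaced by a shuffled value.
            shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical stand-in classifier: uses feature 0 only, ignores feature 1."""
    return 1 if row[0] > 3.0 else 0  # arbitrary threshold, not a clinical cutoff


# Hypothetical toy data (NOT study data): each row is (feature 0, feature 1).
rows = [(1.0, 5.0), (2.0, 1.0), (2.5, 4.0), (4.0, 2.0), (5.0, 3.0), (6.0, 6.0)]
labels = [0, 0, 0, 1, 1, 1]

importances = permutation_importance(toy_model, rows, labels, n_features=2)
print(importances)
```

A variable the model relies on loses accuracy when shuffled, whereas a variable the model ignores has an importance of zero; ranking variables by this drop is one way a "highest predictive weight" could be identified.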
In general, our ML approach provided incremental gains in prognostic performance while managing 40 variables and numerous patient-specific variable-variable interactions. This process permits individualized risk assessment and circumvents several of the limitations inherent in the standard statistical approach.
Our findings have considerable clinical importance. ML may help generate more accurate cardiovascular risk stratification for individual patients.
Classical statistical approaches hand-pick features based entirely on medical domain knowledge and then use statistical methods to calculate the importance of each feature and construct prediction models. ML methods start from the data and do not refer to traditional risk factors or weighted factors, nor do they pay attention to the interpretability of the model. It remains a challenge to fuse medical domain knowledge and ML methods to build highly interpretable predictive models.
ML uses feature extraction and feature representation methods to derive features from enormous data sets and build models without reference to known weights and risk factors. Therefore, the models are less interpretable than traditional disease prediction methods. Nevertheless, ML can identify risk factors different from those generated by traditional methods, allowing for more in-depth prospective studies to determine etiology and interactions. These advantages may eventually lead to new therapeutic targets[15, 17].