3.1 Baseline analysis of clinical data of ACS patients in the training set and validation set
A total of 1,359 patients were included in the study. In the training set, the 90-day overall mortality rate for ACS patients was 17.31% (n=228). Most patients were white (63.24%), male (66.86%), and overweight (BMI > 23.9; 88.35%). Chronic congestive heart failure was common (57.69%), and most patients developed some degree of AKI during their ICU stay (84.05%). The clinical characteristics of the entire study population are presented in Table 1.
Data are expressed as median (IQR) or n (%). Analysis of variance (or the Kruskal-Wallis test) and the Chi-square (or Fisher’s exact) test were used for comparisons among groups. Statistical significance was defined as P<0.05.
APS III, Acute Physiology Score III; CCI, Charlson Comorbidity Index; SAPS II, Simplified Acute Physiology Score II; SOFA, Sequential Organ Failure Assessment; BMI, Body Mass Index; AG, Anion Gap; Scr, Serum Creatinine; BUN, Blood Urea Nitrogen; MBP, Mean Blood Pressure; AKI, Acute Kidney Injury; RR, Respiratory Rate; HR, Heart Rate; MT, Malignant Tumor; INR, International Normalized Ratio.
3.2 Feature selection and model development
We selected 32 clinical features as independent variables and analyzed them with LASSO regression. Using 10-fold cross-validation, we determined the optimal value of λ (lambda.min) and identified 24 variables with non-zero coefficients: age, SOFA, CCI, APS III, SAPS II, Scr, BUN, AG, K+, Ca2+, PLT, Hb, T, RR, MBP, INR, PT, race, congestive heart failure, chronic lung disease, MT, cerebrovascular disease, BMI, and gender. These variables are listed in Table 2. Figures 2 and 3 depict the variable selection path and the cross-validation plot, respectively.
This table shows the 24 variables with non-zero coefficients in the LASSO regression.
APS III, Acute Physiology Score III; CCI, Charlson Comorbidity Index; SAPS II, Simplified Acute Physiology Score II; SOFA, Sequential Organ Failure Assessment; AG, Anion Gap; Scr, Serum Creatinine; BUN, Blood Urea Nitrogen; MBP, Mean Blood Pressure; AKI, Acute Kidney Injury; RR, Respiratory Rate; HR, Heart Rate; MT, Malignant Tumor; INR, International Normalized Ratio.
The coefficient trend lines illustrate the association between the 32 features and 90-day mortality.
The tuning parameter (lambda) in the LASSO regression was selected on the basis of deviance, using the minimum criterion and the 1-standard-error rule.
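The selection step described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the analysis code used in the study: the data are synthetic, the choice of scikit-learn's `LogisticRegressionCV` is ours, and its `C` parameter is an inverse penalty strength that plays the role of λ only up to a rescaling.

```python
import numpy as np
from sklearn.linear_model import LogisticRegressionCV
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the cohort (NOT the study data): 32 candidate
# features, of which only the first 5 truly influence the outcome.
rng = np.random.default_rng(0)
n_patients, n_features = 600, 32
X = rng.normal(size=(n_patients, n_features))
true_beta = np.zeros(n_features)
true_beta[:5] = [1.2, -1.0, 0.8, 0.8, -0.6]
log_odds = X @ true_beta - 1.5
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-log_odds)))

# The L1 penalty is scale-sensitive, so features are standardized first.
X_std = StandardScaler().fit_transform(X)

# L1-penalized logistic regression; the penalty strength is chosen by
# 10-fold cross-validation on the log-loss (proportional to the deviance).
model = LogisticRegressionCV(
    Cs=20,
    cv=10,
    penalty="l1",
    solver="liblinear",
    scoring="neg_log_loss",
    random_state=0,
)
model.fit(X_std, y)

# Features whose coefficients were not shrunk to exactly zero are "selected".
selected = np.flatnonzero(model.coef_.ravel())
best_C = float(np.ravel(model.C_)[0])
print(f"chosen C (inverse penalty): {best_C:.4g}")
print(f"{selected.size} of {n_features} features kept:", selected.tolist())
```

The number of features kept depends on the data and on the cross-validation folds, so this sketch is not expected to reproduce the 24 variables reported here.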
Table 3 summarizes the results of the logistic regression analysis conducted on the training dataset. Further selection through bidirectional stepwise logistic regression, using the minimum Akaike Information Criterion (AIC) as the standard, retained the following 10 variables: age (odds ratio [OR]: 1.053, 95% confidence interval [CI]: 1.03–1.077; P<0.001); SOFA (OR: 1.091, 95% CI: 1.01–1.178; P=0.026); CCI (OR: 1.081, 95% CI: 0.972–1.198; P=0.146); APS III (OR: 1.023, 95% CI: 1.008–1.037; P=0.002); Scr (OR: 0.768, 95% CI: 0.577–0.985; P=0.052); BUN (OR: 1.02, 95% CI: 1.008–1.034; P=0.001); AG (OR: 1.086, 95% CI: 1.026–1.149; P=0.004); RR (OR: 1.064, 95% CI: 1.028–1.101; P<0.001); INR (OR: 1.662, 95% CI: 1.205–2.316; P=0.002); and race, with Black/African American (OR: 0.808, 95% CI: 0.112–7.867; P=0.841), white (OR: 2.483, 95% CI: 0.495–19.739; P=0.319), and other (OR: 3.061, 95% CI: 0.591–24.842; P=0.227). Among these predictors, race is a categorical variable, while the rest are continuous variables.
These variables were then entered into a bidirectional stepwise regression to choose the logistic model with the lowest AIC value.
APS III, Acute Physiology Score III; CCI, Charlson Comorbidity Index; Scr, Serum Creatinine; BUN, Blood Urea Nitrogen; AG, Anion Gap.
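For readers less familiar with how odds ratios such as those in Table 3 relate to the fitted coefficients, the sketch below shows the standard Wald calculation: OR = exp(β), with a 95% interval of exp(β ± 1.96·SE). The coefficient and standard error in the example are hypothetical and are not taken from the study. Intervals computed by other methods (for example, profile likelihood) are not symmetric on the log scale, so a Wald calculation will not necessarily reproduce reported values exactly.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def odds_ratio_with_ci(beta, se, z=Z_95):
    """Return (OR, lower, upper) for a logistic-regression coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def wald_p_value(beta, se):
    """Two-sided p-value for H0: beta = 0, from the Wald z statistic."""
    z_stat = beta / se
    return math.erfc(abs(z_stat) / math.sqrt(2.0))


# Hypothetical coefficient (log-odds per unit) and standard error.
beta_example, se_example = 0.05, 0.011
or_, lower, upper = odds_ratio_with_ci(beta_example, se_example)
p_value = wald_p_value(beta_example, se_example)
print(f"OR = {or_:.3f}, 95% CI {lower:.3f} to {upper:.3f}, P = {p_value:.2g}")
```

An OR above 1 means that the odds of death rise with each unit increase in the predictor. When the 95% interval excludes 1, the corresponding Wald test is significant at the 0.05 level.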
We ultimately selected the 7 variables with P<0.05 to construct the predictive model: age, SOFA, APS III, BUN, AG, RR, and INR. Based on this model, we constructed a nomogram to predict 90-day mortality in patients with ACS (Figure 4). To use the nomogram, a vertical line is drawn upward from each predictor's value to the points axis to obtain the points for that predictor. The points for all predictors are summed and located on the “Total Points” axis, and a vertical line drawn from that position to the probability axis gives the predicted probability of 90-day mortality for the patient.
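The arithmetic behind a nomogram of this kind can be sketched as follows. Each predictor's contribution to the log-odds is rescaled so that the predictor with the largest possible effect spans 0 to 100 points, and the total is mapped back to a probability through the logistic function. The intercept, coefficients, and axis ranges below are hypothetical placeholders chosen for illustration; they are not the fitted values of the model described here, and only three predictors are shown.

```python
import math

# Hypothetical intercept and per-unit log-odds coefficients with axis ranges,
# given as name: (beta, axis minimum, axis maximum). NOT the study's values.
INTERCEPT = -9.0
PREDICTORS = {
    "age": (0.05, 18.0, 100.0),
    "SOFA": (0.09, 0.0, 20.0),
    "RR": (0.06, 8.0, 40.0),
}


def reference_value(beta, low, high):
    """Predictor value that scores 0 points (the lowest-risk end of its axis)."""
    return low if beta >= 0 else high


# Largest log-odds range of any single predictor; it is assigned 100 points.
MAX_EFFECT = max(abs(beta) * (high - low) for beta, low, high in PREDICTORS.values())


def points(name, value):
    """Points read off the points axis for one predictor value."""
    beta, low, high = PREDICTORS[name]
    return 100.0 * beta * (value - reference_value(beta, low, high)) / MAX_EFFECT


def probability_from_points(total_points):
    """Map a total-points value to a predicted probability."""
    baseline = INTERCEPT + sum(
        beta * reference_value(beta, low, high)
        for beta, low, high in PREDICTORS.values()
    )
    linear_predictor = baseline + total_points * MAX_EFFECT / 100.0
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def probability_direct(patient):
    """Same probability computed directly from the logistic model."""
    linear_predictor = INTERCEPT + sum(
        PREDICTORS[name][0] * value for name, value in patient.items()
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


patient = {"age": 70.0, "SOFA": 6.0, "RR": 22.0}  # hypothetical patient
total = sum(points(name, value) for name, value in patient.items())
print(f"total points = {total:.1f}")
print(f"probability via points = {probability_from_points(total):.4f}")
print(f"probability via model  = {probability_direct(patient):.4f}")
```

The two printed probabilities agree because the points scale is a linear rescaling of the model's linear predictor. The nomogram therefore adds no information beyond the underlying logistic model; it only makes the calculation possible by hand.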
As depicted in Figure 5, our predictive model achieved an AUC (Area Under the Receiver Operating Characteristic Curve) of 0.842 on the training set (95% CI: 0.809-0.875) and 0.855 on the validation set (95% CI: 0.815-0.894). We also compared our model with traditional scoring systems, namely APS III and SOFA. Their AUC values were 0.779 and 0.678, respectively, on the training set (Figure 6) and 0.801 and 0.692, respectively, on the validation set (Figure 7). These results indicate that our model may outperform these traditional scoring systems in terms of predictive performance. Using the Youden Index, we identified 0.16 as the optimal cut-off point in the training set, corresponding to a sensitivity of 80.4% and a specificity of 75.1%. For the validation set, the cut-off was 0.109, with a sensitivity of 92% and a specificity of 67%, suggesting that the model's discriminative performance carried over to the validation set.
Regarding calibration, the curves showed a reasonable fit for both the training set (Figure 8) and the validation set (Figure 9). The Hosmer-Lemeshow test P-values were 0.1626 and 0.4008, respectively; neither was statistically significant, so there was no evidence of poor agreement between the predicted and observed values. Furthermore, the Brier scores were 0.107 for the training set and 0.103 for the validation set, indicating good overall accuracy of the predicted probabilities.
DCA curves were plotted for both sets, with our model represented by the red line. The model showed a higher net benefit across the full range of risk thresholds than established scoring systems such as SOFA and APS III (Figure 10 and Figure 11).
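The three summary measures used above (the Youden Index cut-off, the Brier score, and the net benefit plotted in a DCA curve) can be sketched in plain Python. The outcomes and predicted probabilities below are a made-up toy example, not study data, and the function names are our own.

```python
def youden_cutoff(y_true, y_prob):
    """Return (J, cutoff, sensitivity, specificity) maximizing J = sens + spec - 1.

    A patient is classified as positive when the predicted probability is
    greater than or equal to the cutoff.
    """
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    best = None
    for cutoff in sorted(set(y_prob)):
        tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= cutoff)
        tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < cutoff)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1.0
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true, y_prob, threshold):
    """Net benefit at one risk threshold, as plotted in decision curve analysis."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= threshold)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Toy example (made up): observed outcomes and model-predicted probabilities.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
y_prob = [0.05, 0.10, 0.12, 0.20, 0.30, 0.35, 0.60, 0.80]

j, cutoff, sens, spec = youden_cutoff(y_true, y_prob)
print(f"Youden J = {j:.2f} at cutoff {cutoff:.2f} (sens {sens:.2f}, spec {spec:.2f})")
print(f"Brier score = {brier_score(y_true, y_prob):.4f}")
print(f"Net benefit at threshold 0.20 = {net_benefit(y_true, y_prob, 0.20):.4f}")
```

A Brier score of 0 corresponds to perfect predictions, and lower values are better. In a DCA plot, the net benefit of the model at each threshold is compared with the "treat all" and "treat none" strategies; "treat none" has a net benefit of 0 by definition.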