Serum albumin and survival in GBM. Serum albumin is a recognized prognostic marker in many acute and chronic illnesses, and low levels are associated with less favorable survival outcomes in multiple solid organ cancers. The study by Borg et al. [6] explored the predictive value of serum albumin for survival in the context of central nervous system tumors. It assessed post-operative survival in 549 GBM patients, stratified into three categories by pre-surgical serum albumin (ALB) concentration: hypoalbuminemia (ALB < 30 g/L), lower range of normal (ALB 30–40 g/L), and upper range of normal (ALB > 40 g/L). The study found significantly reduced post-surgical survival in patients with hypoalbuminemia (ALB < 30 g/L) compared to patients with normal albumin levels. Additionally, survival for patients with lower-normal serum albumin was substantially shorter than for those with upper-normal albumin. The study therefore concluded that the pre-surgical serum albumin level is a powerful predictor of survival in GBM patients.
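The three-way stratification described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code used in either study; the function name and category labels are hypothetical, and assigning the boundary values of exactly 30 and 40 g/L to the lower-normal group is an assumption based on the stated ranges.

```python
def albumin_category(alb_g_per_l: float) -> str:
    """Assign a pre-surgical serum albumin value (g/L) to one of three strata.

    Assumption: values of exactly 30 and 40 g/L fall in the lower-normal group.
    """
    if alb_g_per_l < 30:
        return "hypoalbuminemia"
    if alb_g_per_l <= 40:
        return "lower normal"
    return "upper normal"


# Hypothetical albumin values, for illustration only.
values = [27.5, 30.0, 35.2, 40.0, 43.1]
for value in values:
    print(value, albumin_category(value))
```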
Using the MDClone platform on the Sheba data lake, we identified 452 patients who fulfilled the inclusion criteria of the original study. The demographic characteristics of the synthetic population closely aligned with the real-world patient population, with a mean serum albumin of 34.7 ± 5.2 g/L in the original dataset and 35.6 ± 5.5 g/L in the primary synthetic dataset (Table 1). The primary synthetic data points followed an approximately Gaussian distribution consistent with the real dataset, as depicted in Fig. 1 (real data n = 549, P = 0.06; synthetic data n = 452, P = 0.10, Kolmogorov-Smirnov statistic D = 0.06). Furthermore, the original and synthetic study cohorts showed equivalent distributions across the three serum albumin categories (χ² P = 0.99).
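As an illustration of the category comparison above, the following Python sketch computes a Pearson χ² test of homogeneity for two cohorts split into three categories. It is a minimal sketch under stated assumptions, not the authors' analysis: the function name is hypothetical, the counts are invented for illustration, and the closed-form p-value is valid only for 2 degrees of freedom (two cohorts, three categories).

```python
import math


def chi_square_homogeneity(counts_a, counts_b):
    """Pearson chi-square test that two cohorts share one category distribution.

    Returns (statistic, degrees_of_freedom, p_value). The p-value uses the
    closed form exp(-x / 2), which is exact only for 2 degrees of freedom;
    for any other number of categories it is returned as None.
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("both cohorts need the same number of categories")
    total_a, total_b = sum(counts_a), sum(counts_b)
    grand_total = total_a + total_b
    statistic = 0.0
    for obs_a, obs_b in zip(counts_a, counts_b):
        column_total = obs_a + obs_b
        if column_total == 0:
            raise ValueError("a category is empty in both cohorts")
        expected_a = total_a * column_total / grand_total
        expected_b = total_b * column_total / grand_total
        statistic += (obs_a - expected_a) ** 2 / expected_a
        statistic += (obs_b - expected_b) ** 2 / expected_b
    dof = len(counts_a) - 1  # (2 cohorts - 1) * (k categories - 1)
    p_value = math.exp(-statistic / 2) if dof == 2 else None
    return statistic, dof, p_value


# Hypothetical counts per albumin category (low, lower normal, upper normal).
real_counts = [80, 370, 99]
synthetic_counts = [60, 320, 72]
print(chi_square_homogeneity(real_counts, synthetic_counts))
```

A large p-value here indicates only that the test found no evidence of a difference between the two category distributions.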
Table 1
Patient characteristics in the original (Borg et al.) and synthetic datasets.
Characteristics | Real data (N = 549) | Synthetic data 1 (N = 452) | Synthetic data 2 | Synthetic data 3 | Synthetic data 4 | Synthetic data 5 | P values* |
Age | 60.1 ± 5.0 | 64.4 ± 11.3 | 64.4 ± 11.3 | 64.7 ± 11.1 | 64.4 ± 11.6 | 64.2 ± 11.6 | 0.98 |
Male:Female Ratio | 1.58:1 | 1.79:1 | 1.75:1 | 1.81:1 | 1.78:1 | 1.69:1 | 0.98 |
Surgery: Biopsy | 48.6% | 36.1% | 35.1% | 36.9% | 37.4% | 37.8% | 0.45 |
Surgery: Debulking | 47.3% | 63.9% | 64.9% | 63.1% | 62.6% | 62.2% | |
Surgery: Not ascertained | 4.1% | 0% | 0% | 0% | 0% | 0% | |
Adjuvant therapy: None | 36.6% | 25.4% | 28.1% | 28.5% | 27.7% | 28.0% | 1.00 |
Adjuvant therapy: Radiotherapy | 48.0% | 2.7% | 3.3% | 2.8% | 3.0% | 3.0% | |
Adjuvant therapy: Chemotherapy | 0% | 30.8% | 28.5% | 29.0% | 29.7% | 29.0% | |
Adjuvant therapy: Chemoradiotherapy | 15.3% | 41.2% | 40.1% | 39.7% | 39.6% | 40.0% | |
Patients with albumin < 30 g/L (%) | 15% | 12% | 12% | 14% | 13% | 13% | 0.91 |
Patients with albumin 30–40 g/L (%) | 68% | 72% | 70% | 69% | 70% | 70% | 0.95 |
Patients with albumin > 40 g/L (%) | 18% | 16% | 18% | 17% | 17% | 17% | 1.0 |
Median Albumin (g/L) | 35 ± 5.2 | 36 ± 5.4 | 35 ± 5.5 | 36 ± 5.4 | 36 ± 5.5 | 36 ± 5.5 | 0.93 |
Mean Albumin (g/L) | 34.7 ± 5.2 | 35.6 ± 5.4 | 35.4 ± 5.5 | 35.3 ± 5.4 | 35.4 ± 5.5 | 35.4 ± 5.5 | 0.93 |
Survival (months) (Median[range]) | NA | 13.0 [0.1–59.2] | 12.8 [0.1–58.0] | 13.0 [0.1–59.8] | 13.0 [0.1–59.0] | 12.4 [0.1–56.4] | 0.90 |
*P-values indicate the probability that the value of the specific variable is not significantly different between synthetic datasets.
With regard to survival, our analysis of the primary synthetic data demonstrated a significant association between a GBM patient's serum albumin level and post-surgical survival (P < 0.0001) (Fig. 2).
Patients in the hypoalbuminemia group exhibited a significantly shorter post-operative survival, with a median duration of 7.0 months (HR 2.27 [95% CI 1.55–3.31], P < 0.01). Moreover, patients with hypoalbuminemia experienced increased peri-operative (≤ 1 year post-operatively) mortality (61.1% vs. 26.1%; OR 4.18 [95% CI 2.20–7.99], P < 0.01). In contrast, the groups with lower-normal and upper-normal albumin levels demonstrated significantly longer median survival times of 12.9 and 16.2 months, respectively (Fig. 2). The median survival times reported in the original article differ slightly from those derived from the synthetic data. This difference can be attributed to recent advances in evidence-based surgical and oncological practice in multidisciplinary healthcare settings, which have significantly improved short-term survival for neurosurgical patients. Notably, the prevailing treatment strategy for GBM currently combines adjuvant chemotherapy and radiotherapy. This shift is reflected in the synthetic data, in which 41.3% of patients underwent a multimodal treatment regimen, whereas only 15.3% received multimodal treatment in the original study cohort (χ² P < 0.01). Contemporary research indicates that combined chemoradiotherapy results in a median survival of 14.6 months (range: 13.2–16.8 months) [8], which aligns with the median survival of 15.4 months [95% CI 3.1–23.8] observed in the primary synthetic dataset for patients receiving the same multimodal treatment. Therefore, the median survival times observed in the synthetic cohort are consistent with the current literature once treatment differences between cohorts are taken into account.
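For readers unfamiliar with how an odds ratio and its confidence interval, such as those quoted above for peri-operative mortality, are typically derived from a 2×2 table, a minimal Python sketch follows. It uses the standard Woolf (log) method, which is an assumption here since the text does not state the method used. The function name is hypothetical and the counts are invented for illustration, so the output is not expected to reproduce the reported OR of 4.18.

```python
import math


def odds_ratio_ci(exposed_events, exposed_total, control_events, control_total, z=1.96):
    """Odds ratio with a Woolf (log-based) confidence interval.

    z = 1.96 gives an approximate 95% interval. All four cells of the
    2x2 table must be positive for the standard error to be defined.
    """
    a = exposed_events                    # e.g. deaths with hypoalbuminemia
    b = exposed_total - exposed_events    # survivors with hypoalbuminemia
    c = control_events                    # deaths with normal albumin
    d = control_total - control_events    # survivors with normal albumin
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    log_se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * log_se)
    upper = math.exp(math.log(odds_ratio) + z * log_se)
    return odds_ratio, lower, upper


# Hypothetical counts: 33 of 54 deaths versus 104 of 398 deaths within one year.
print(odds_ratio_ci(33, 54, 104, 398))
```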
Across all five synthetic datasets, overall survival times and demographic characteristics were highly consistent, with very little variability between datasets (Table 1). The five synthetic datasets yielded between 452 and 468 eligible patients each, all with a median albumin level of 35–36 g/L (P = 0.93). The patient characteristics showed no significant differences across the five synthetic datasets. Furthermore, median overall survival (OS) times were uniform across all five datasets (Table 1, P = 0.90).
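The between-dataset consistency described above can also be summarized descriptively. The short Python sketch below takes the five median survival values listed in Table 1 and reports their mean, absolute range, and range relative to the mean. It is only a descriptive illustration with a hypothetical helper name; it is not the statistical test behind the reported P values.

```python
from statistics import mean

# Median overall survival (months) of the five synthetic datasets, from Table 1.
median_os_months = [13.0, 12.8, 13.0, 13.0, 12.4]


def spread_summary(values):
    """Return the mean, absolute range, and range relative to the mean."""
    centre = mean(values)
    span = max(values) - min(values)
    return {"mean": centre, "range": span, "relative_range": span / centre}


summary = spread_summary(median_os_months)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```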
Systemic inflammation scores correlate with survival prognosis in patients with newly diagnosed brain metastases. The Starzer et al. [7] study focused on using inflammatory markers as prognostic factors for survival in patients with brain metastasis. A group of 1250 patients was selected from the electronic medical record of the Medical University of Vienna between 1990 and 2019. Data collection included the source of brain metastases, brain metastases correlation with the primary tumor, and the presence of extra-cranial disease. Other factors included treatment methods such as radiotherapy, surgery, and chemotherapy. The authors collected data on systemic inflammatory markers, which included the neutrophil-to-lymphocyte ratio (NLR), platelet-to-lymphocyte ratio (PLR), monocyte-to-lymphocyte ratio (MLR), and the C-reactive protein-to-albumin ratio. These markers were measured at the time of diagnosis of the brain metastasis (± 14 days).
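The ratio-based markers listed above are simple quotients of routine laboratory values. The following Python sketch shows one way to compute them for a single patient. The function and parameter names are hypothetical, and the units (cell counts in g/L, CRP in mg/L, albumin in g/L, as labelled in Table 3) are an assumption about how the inputs would be supplied.

```python
def inflammation_ratios(neutrophils, lymphocytes, platelets, monocytes, crp, albumin):
    """Compute NLR, PLR, MLR and CRP/Alb from one set of laboratory values."""
    if lymphocytes <= 0 or albumin <= 0:
        raise ValueError("lymphocyte count and albumin must be positive")
    return {
        "NLR": neutrophils / lymphocytes,   # neutrophil-to-lymphocyte ratio
        "PLR": platelets / lymphocytes,     # platelet-to-lymphocyte ratio
        "MLR": monocytes / lymphocytes,     # monocyte-to-lymphocyte ratio
        "CRP/Alb": crp / albumin,           # C-reactive protein-to-albumin ratio
    }


# Hypothetical laboratory values for one patient, for illustration only.
print(inflammation_ratios(
    neutrophils=6.0, lymphocytes=1.5, platelets=240.0,
    monocytes=0.6, crp=20.0, albumin=40.0,
))
```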
Results showed a notable correlation between the progression of extra-cranial disease and elevated systemic inflammatory markers, particularly the platelet-to-lymphocyte ratio (PLR) and C-reactive protein-to-albumin ratio (CRP/Alb), which were found to be significantly higher in patients with advancing extra-cranial disease. On the other hand, lower serum inflammatory markers were associated with longer overall survival.
The synthetic population contained 1320 patients, compared to 1250 patients in the original study. The synthetic dataset included similar categorical features, such as cancer entity and first-line treatment of the brain metastasis (Table 2). Systemic inflammatory markers were also extracted from the synthetic dataset and showed ranges comparable to the real dataset (Table 3).
Table 2
Patient characteristics in the original (Starzer et al.) and synthetic datasets.
Characteristics | Real data (N = 1250) | Synthetic data 1 (N = 1320) | Synthetic data 2 (N = 1320) | Synthetic data 3 (N = 1320) | Synthetic data 4 (N = 1320) | Synthetic data 5 (N = 1320) | P values* |
Age at diagnosis of BM (Median[range]) | 62 [23–91] | 67.2 [25.6–95.4] | 67.01 [25.6–95.5] | 66.8 [26.0–95.8] | 66.6 [25.4–95.4] | 67.3 [26.7–95.7] | 1.0 |
Sex (Male/Female) | 662 / 588 | 724 / 596 | 668 / 596 | 670 / 594 | 668 / 596 | 669 / 596 | 0.89 |
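Cancer entity: Lung cancer | 994 | 499 | 499 | 499 | 498 | 498 | 0.99 |
Cancer entity: Breast cancer | 86 | 94 | 94 | 94 | 94 | 94 | |
Cancer entity: Melanoma | 106 | 146 | 147 | 147 | 147 | 147 | |
Cancer entity: Renal cell carcinoma | 7 | 28 | 28 | 28 | 28 | 28 | |
Cancer entity: Colorectal cancer | 11 | 60 | 60 | 60 | 60 | 60 | |
Cancer entity: Others | 36 | 493 | 494 | 494 | 495 | 495 | |
First-line treatment to BM: Surgery | 87 | 83 | 83 | 82 | 83 | 83 | 1.0 |
First-line treatment to BM: Radiotherapy | 973 | 676 | 676 | 676 | 675 | 675 | |
Adjuvant chemotherapy: Yes | 346 | 914 | 914 | 913 | 915 | 914 | 1.0 |
Adjuvant chemotherapy: No | 904 | 406 | 406 | 407 | 405 | 406 | |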
*P-values indicate the probability that the value of the specific variable is not significantly different between synthetic datasets.
Table 3
Systemic inflammatory markers in the original (Starzer et al.) and synthetic datasets.
Marker (Median[range]) | Real data | Synthetic data 1 | Synthetic data 2 | Synthetic data 3 | Synthetic data 4 | Synthetic data 5 | P values* |
Survival (months) | N/A | 4.2 [1.00–72.46] | 4.21 [1.01–76.46] | 4.20 [1.02–73.43] | 4.20 [0.80–72.78] | 4.15 [0.87–73.33] | 1.0 |
Leucocytes (g/L) | 8.69 [0.89–48.8] | 8.61 [1.16–53.63] | 8.59 [1.17–53.41] | 8.59 [1.21–53.62] | 8.61 [1.18–53.47] | 8.61 [1.17–53.35] | 1.0 |
Neutrophils (g/L) | 6.15 [0.24–66.0] | 6.44 [0.64–44.26] | 6.47 [0.63–45.22] | 6.59 [0.63–45.07] | 6.51 [0.6–44.36] | 6.47 [0.62–43.39] | 1.0 |
Lymphocytes (g/L) | 1.30 [0.1–26.5] | 1.05 [0.14–11.35] | 1.04 [0.14–11.12] | 1.06 [0.14–11.25] | 1.06 [0.14–11.43] | 1.06 [0.14–10.9] | 1.0 |
Platelets (g/L) | 272.0 [2.0–894.0] | 242.0 [13–769] | 239.0 [13–769] | 239.0 [13–756] | 238.0 [13–769] | 239.0 [1.17–53.35] | 1.0 |
Monocytes (g/L) | 0.70 [0.01–2.42] | 0.58 [0.02–2.09] | 0.59 [0.02–2.11] | 0.57 [0.02–2.09] | 0.58 [0.02–2.1] | 0.58 [0.02–2.11] | 1.0 |
CRP (mg/L) | 1.22 [0–43.7] | 2.23 [0.2–39.06] | 2.16 [0–39.49] | 2.09 [0–39.52] | 2.22 [0–39.55] | 2.12 [0–39.53] | 1.0 |
PLR | 207.57 [2.0–894.0] | 228.80 [7.43–2471.43] | 230.09 [4.74–3657.14] | 228.44 [4.59–2506.25] | 227.83 [3.30–2162.50] | 226.63 [5.44–3077.78] | 0.99 |
LLR | 6.39 [0.12–997.5] | 7.91 [1.0–133.06] | 7.86 [1.09–131.57] | 7.79 [1.16–158.60] | 7.87 [1.47–87.92] | 7.79 [1.14–111.0] | 1.0 |
MLR | 0.52 [0.01–3.06] | 0.53 [1.02–5.33] | 0.53 [0.02–5.22] | 0.53 [0.02–4.69] | 0.53 [0.02–5.43] | 0.52 [0.02–3.89] | 1.0 |
NLR | 4.76 [0.07–98.0] | 6.01 [0.30–143.35] | 6.09 [0.37–131.86] | 6.03 [0.40–139.37] | 6.01 [0.39–86.96] | 6.05 [0.44–104.0] | 1.0 |
CRP/Alb | 2.41 [0.02–121.64] | 6.26 [0.04–205.56] | 6.03 [0.04–190.26] | 5.94 [0.04–187.61] | 6.08 [0.04–168.33] | 5.99 [0.04–197.51] | 1.0 |
*P-values indicate the probability that the value of the specific variable is not significantly different between synthetic datasets.
In the primary synthetic dataset, a lower NLR (cut-off: <5.0 g/L) was associated with longer overall survival (OS): 5.1 months compared to 3.7 months in patients with a higher NLR (P < 0.01; mean of differences 2.2 [95% CI 1.0–3.4]). Furthermore, patients with lower LLR (cut-off: <5.7 g/L; 5.1 vs 3.8 months, P < 0.01; mean of differences 2.5 [95% CI 1.1–3.8]), lower PLR (cut-off: <3.5 g/L; 4.8 vs 3.5 months, P < 0.04; mean of differences 1.3 [95% CI 0.07–2.7]), lower MLR (cut-off: <0.5 g/L; 4.7 vs 3.7 months, P < 0.01; mean of differences 1.6 [95% CI 0.6–2.7]), and lower CRP/Alb (cut-off: <10.0 g/L; 5.2 vs 3.4 months, P < 0.01; mean of differences 3.5 [95% CI 2.5–4.5]) had a more favorable survival prognosis (Fig. 3). The original article showed a comparable pattern, with minor differences in the survival times.
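The cut-off comparisons above follow a common pattern: patients are split into a low and a high group by a marker threshold, and survival is summarized in each group. The Python sketch below illustrates that pattern with invented data. The function name is hypothetical, and the sketch uses a plain median that ignores censoring, which is a simplifying assumption and may differ from the survival analysis actually performed.

```python
from statistics import median


def median_survival_by_cutoff(records, cutoff):
    """Split (marker_value, survival_months) pairs at a cut-off.

    Returns (median survival below the cut-off, median survival at or
    above it). Censoring is ignored in this simplified illustration.
    """
    low_group = [months for marker, months in records if marker < cutoff]
    high_group = [months for marker, months in records if marker >= cutoff]
    if not low_group or not high_group:
        raise ValueError("both groups must contain at least one patient")
    return median(low_group), median(high_group)


# Hypothetical (NLR, survival in months) pairs, for illustration only.
records = [(2.0, 6.0), (3.5, 5.0), (4.9, 4.0), (5.0, 3.0), (8.0, 4.5), (12.0, 2.0)]
low_median, high_median = median_survival_by_cutoff(records, cutoff=5.0)
print(low_median, high_median)
```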
An advantage of synthetic data in this case is the ability to select different time frames and generate data quickly. All systemic inflammatory markers were taken within 14 days of brain metastasis diagnosis. Had a different window been required, another synthetic dataset covering that time frame could have been generated rapidly.
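The time-window criterion described above amounts to a simple date filter. The Python sketch below is a generic illustration of selecting laboratory results within a configurable number of days around the diagnosis date. It does not represent the MDClone query interface; the function name, parameters, and example values are all hypothetical.

```python
from datetime import date, timedelta


def labs_within_window(diagnosis_date, lab_results, window_days=14):
    """Keep (lab_date, value) pairs taken within +/- window_days of diagnosis."""
    window = timedelta(days=window_days)
    return [
        (lab_date, value)
        for lab_date, value in lab_results
        if abs(lab_date - diagnosis_date) <= window
    ]


# Hypothetical CRP measurements (mg/L) around a hypothetical diagnosis date.
diagnosis = date(2019, 3, 15)
crp_results = [
    (date(2019, 2, 28), 12.0),  # 15 days before diagnosis: excluded
    (date(2019, 3, 1), 8.5),    # 14 days before diagnosis: included
    (date(2019, 3, 29), 4.1),   # 14 days after diagnosis: included
    (date(2019, 3, 30), 3.9),   # 15 days after diagnosis: excluded
]
selected = labs_within_window(diagnosis, crp_results)
print(selected)
```

Changing `window_days` gives the kind of alternative time frame mentioned in the text.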
Variability between the five datasets generated for this study was negligible. All five synthetic datasets exhibit nearly identical median overall survival times, attesting to the reliability of the MDClone software in generating multiple cohorts from a single query (Table 3, P = 0.99). Furthermore, the inflammatory markers were consistent across all five datasets, showing no significant differences (Table 3).