Cohort characteristics and DNA extraction data analysis
A flow diagram summarizing the study design is shown in Figure 1. In the discovery phase, 181 tissue samples were used to identify candidate variant markers. A total of 196 participants were recruited to establish the diagnostic model, of whom 83 BC cases with pathologically confirmed tumors and 67 healthy controls were included in the analysis. Detailed demographic and clinical characteristics of the included participants are shown in Table S1. The ucfDNA concentration was 6.47 ng/ml in controls and 199.28 ng/ml in cases (Figure S1A). A higher concentration of uexDNA was also observed in cases (53.02 ng/ml) than in controls (20.78 ng/ml) (Figure S1B). In the validation phase, genomic sequencing and CNV data of 281 BC tumor specimens and 393 normal tissue samples from the BLCA_TCGA portal and 11 UTUC patients (Table S7) from PKUH were included in the analysis. Nineteen patients without and 12 patients with residual disease after surgery were used to evaluate the potential MRD detection ability of the novel diagnostic model (Table S8). No patients had any other known active cancer diagnosis at the time of surgery.
Analysis and clinical utility of the mutation profile in UC
The genetic profiles of 130 tissues from BLCA_TCGA and 51 tissues from 2ndHATMU were analyzed to find genes that could cover the maximum number of patients with the minimum number of variants (Table S2). In urine, these genes detected at least one mutation in 63.9%, two mutations in 39.6%, and more than two mutations in 8.4% of the 83 tumor cases, whereas no mutation was found in the 67 healthy controls (Figure 2A). Across the 83 tumor cases, the two most commonly mutated regions were the TERT promoter and TP53 (Figure 2A): TERT promoter mutations were present in 47% (39/83) of cases and TP53 mutations in 18% (15/83) of cases. Other genes mutated at high frequency in our cohort included ERBB2, ERCC2, and FGFR3 (Figure 2A, Table S2). Using the mutation features derived from the analysis of primary bladder tumor DNA, the modeling cohort was split 7:3 into training and test sets, and the best AUC was 0.819 (63.9% sensitivity at 100% specificity) in the modeling cohort (Figure 2B).
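The goal of covering the maximum number of patients with the minimum number of variants can be illustrated with a greedy set-cover heuristic. The sketch below is only an illustration under stated assumptions: the source does not say which selection procedure was used, and the function name `greedy_gene_panel`, the `max_genes` parameter, and the toy patient IDs are all hypothetical.

```python
from typing import Dict, List, Set


def greedy_gene_panel(mutated: Dict[str, Set[str]], max_genes: int) -> List[str]:
    """Greedy set cover: repeatedly add the gene that is mutated in the most
    patients not yet covered by the panel.

    This is an assumed heuristic for illustration, not the study's method.
    """
    uncovered: Set[str] = set().union(*mutated.values())
    panel: List[str] = []
    while uncovered and len(panel) < max_genes:
        # Iterate genes in alphabetical order so ties resolve deterministically.
        best = max(sorted(mutated), key=lambda g: len(mutated[g] & uncovered))
        if not mutated[best] & uncovered:
            break  # no remaining gene adds coverage
        panel.append(best)
        uncovered -= mutated[best]
    return panel


# Toy input: gene -> set of patients carrying a mutation (invented IDs).
toy = {
    "TERT": {"p1", "p2", "p3", "p4"},
    "TP53": {"p3", "p4", "p5"},
    "FGFR3": {"p5", "p6"},
    "ERBB2": {"p1"},
}
panel = greedy_gene_panel(toy, max_genes=3)
covered = set().union(*(toy[g] for g in panel))
print(panel, len(covered))
```

In this toy input two genes already cover all six patients, so the loop stops before reaching `max_genes`. Greedy selection is a common approximation for this kind of coverage problem but does not guarantee the smallest possible panel.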
CNV_score distinguishes malignancy from nontumor controls
Large CNVs were analyzed using shallow whole-genome sequencing (WGS) data of uexDNA. A genome-wide schematic overview of CNVs in the discovery phase is shown in Figure 3A. Chromosomal losses and gains were frequently identified in tumor cases but not in healthy controls (Figure 3A). For each sample, we standardized the data and calculated the CNV_score as described in the Methods. The CNV_score was significantly higher in patients than in healthy controls in our cohort (p < 0.001) (data not shown). We then tested the performance of the CNV_score at different segment cutoffs in the modeling cohort, split 7:3 into training and test sets. The CNV_score discriminated tumor cases from healthy controls with high accuracy, with a best AUC of 0.934 (sensitivity 86.75%, specificity 97.01%) in the modeling cohort (Figure 3B, Table S3). We also compared the CNV_score model based on all autosomes with a UroVysion model restricted to the chromosomes targeted by the UroVysion FISH assay. The UroVysion model reached an AUC of 0.864 (sensitivity 79.52%, specificity 91.04%) in our cohort, which was worse than that of the all-autosome model (p = 0.029) (Table S4).
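For readers less familiar with the AUC values reported for a single score such as the CNV_score, the AUC can be read as the probability that a randomly chosen case scores higher than a randomly chosen control. The following minimal sketch computes it by pairwise comparison; the function name and the scores are made up for illustration and are not study data.

```python
from typing import Sequence


def auc_pairwise(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """AUC as the fraction of (case, control) pairs in which the case has the
    higher score; tied pairs count as one half (Mann-Whitney formulation)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Invented scores: higher values are meant to indicate tumor-like profiles.
cases = [0.9, 0.8, 0.7, 0.4]
controls = [0.5, 0.3, 0.2, 0.1]
auc = auc_pairwise(cases, controls)
print(auc)
```

Here 15 of the 16 case-control pairs are ordered correctly, giving an AUC of 0.9375. The quadratic loop is adequate for cohorts of this size; a rank-based implementation would be preferable for large datasets.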
Dual demonstration model construction
Herein, we constructed a two-dimensional ensemble stacked ML approach that applies two different base models to two optimized features, aiming to provide an ultrasensitive and cost-effective model for detecting UC. In the training phase, three ML methods were considered: random forest (RF), support vector machine (SVM), and logistic regression without regularization (LR). The data were split 7:3 into training and test sets, and hyperparameters were tuned by grid search with 10-fold cross-validation. The models were robust, with stable accuracy across the three methods (Figure 4A, S2; Table S5). The SVM model achieved an overall sensitivity of 92.78% and a specificity of 96.00% in the training set (Figure 4A), and a sensitivity of 85.71% and a specificity of 100.00% in the test set (Figure 4B). As a measure of relative importance, the proportional contributions of the features to the variance of the algorithm score were calculated: the large-CNV feature contributed 60% and the mutation feature contributed 40%.
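The tuning procedure described above (a 7:3 split followed by grid search with 10-fold cross-validation) might look roughly like the following scikit-learn sketch for the SVM case. Everything specific here is an assumption: the data are synthetic, the two features merely stand in for a mutation flag and a CNV-like score, and the hyperparameter grid, kernel, scaling step, and scoring metric are not taken from the study. The stacking step is not reproduced.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in data (invented values): 83 "cases" and 67 "controls".
y = np.array([1] * 83 + [0] * 67)
mutation_flag = np.where(y == 1, rng.random(150) < 0.6, False).astype(float)
cnv_like = rng.normal(loc=np.where(y == 1, 2.0, 0.0), scale=1.0)
X = np.column_stack([mutation_flag, cnv_like])

# 7:3 split, stratified so both sets keep the case/control ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Assumed grid; the study's actual hyperparameter ranges are not given.
pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
param_grid = {"svc__C": [0.1, 1.0, 10.0], "svc__gamma": ["scale", 0.1, 1.0]}

# 10-fold cross-validation on the training set only, scored by ROC AUC.
search = GridSearchCV(pipeline, param_grid, cv=10, scoring="roc_auc")
search.fit(X_train, y_train)

# With scoring="roc_auc", score() returns the AUC on the held-out test set.
test_auc = search.score(X_test, y_test)
print(search.best_params_, round(test_auc, 3))
```

Keeping the test set out of the grid search is the point of the design: hyperparameters are chosen using the training folds alone, so the held-out 30% gives a less optimistic estimate of performance.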
Comparison between ML models and urine cytology in the training cohort
Urine cytology is a routine method for the detection of UC. To compare the performance of the ML models with that of urine cytology, 42 NMIBC/MIBC patients in the modeling cohort were included for further analysis. An overview of the comparison is shown in Figure 5A. The SVM model yielded 2-fold more positive results than cytology (Figure 5B). Moreover, the SVM model detected 82.6% (19/23) of MIBC patients, compared with 69.6% (16/23) detected by urine cytology. In NMIBC patients, the sensitivity of the utLIFE-UC model (94.7%, 18/19) was 3-fold higher than that of cytology (31.6%, 6/19) (Figure 5B). The sensitivity of the other two models was comparable to that of SVM (data not shown). Collectively, compared with urine cytology, our ML models exhibited markedly improved sensitivity, including in patients with NMIBC.
Validation cohorts proved the diagnostic value of ML models
Although the ML models were constructed from Chinese BC patients and Chinese healthy controls, we evaluated these models in a BLCA_TCGA validation cohort and a UTUC cohort, as shown in Figure 1. The characteristics of the UTUC validation cohort are shown in Table S7. Healthy individuals from the modeling cohort, matched by sex and age to the UTUC patients, were used as controls in this cohort. The SVM model was selected for further analysis because it showed superior or comparable accuracy and specificity at the same sensitivity relative to the other two models (Table S5). The utLIFE-UC (SVM) model showed high accuracy in distinguishing BCs from controls (AUC 0.942, sensitivity 94.31%, specificity 98.73%) (Figure 6A). Confusion matrices of the utLIFE-UC model for the BLCA_TCGA validation cohort are shown in Table S6. The utLIFE-UC model classified UTUCs with a sensitivity of 90.91% and a specificity of 90.91% (Figure 6B). The risk probabilities of the BC and UTUC groups derived from the model were distinctly higher than those of the control groups (Figure 6C, 6D). These results indicate that the utLIFE-UC model has high accuracy and strong clinical utility in the detection of UC.
Application of utLIFE-UC to detect residual disease
In the MRD training cohort, the clinical characteristics of the 16 patients who underwent radical resection or transurethral resection of bladder tumor (TURBT) are shown in Table S8. Eight patients achieved pCR, and the other 8 patients still had residual disease detected in the surgical sample (PR or SD, defined as non-pCR). The utLIFE-UC score did not differ significantly between the 2 groups in the baseline samples, whereas the score was significantly decreased in the pCR group compared with the non-pCR group (p < 0.05) (Figure 7A, S3). We constructed an MRD model in these 16 patients, which achieved a sensitivity of 100%, a specificity of 87.5%, and a negative predictive value (NPV) of 100%. A total of 87.5% (7/8) of patients with pCR were MRD negative, and 100% (8/8) of non-pCR patients were MRD positive (Figure 7B). In pCR patients, the MRD-negative rate was 75.0% (6/8) at the second time point (during treatment) and 87.5% (7/8) on the day before surgery; in non-pCR patients, the MRD-positive rate was 75.0% (6/8) at the second time point and 100% (8/8) on the day before surgery (Figure 7C), indicating that the MRD score may reflect the therapeutic effect in real time. Fifteen patients were used as an independent MRD validation cohort (Table S8), in which the model achieved a sensitivity of 100%, a specificity of 80%, and an NPV of 100%, and the MRD probabilities of the non-pCR group were significantly higher than those of the pCR group.
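The training-cohort figures above follow directly from the reported counts, treating non-pCR as the positive class: 8 of 8 non-pCR patients were MRD positive and 7 of 8 pCR patients were MRD negative. The short sketch below reproduces the arithmetic; the function and type names are illustrative only.

```python
from typing import NamedTuple


class Metrics(NamedTuple):
    sensitivity: float
    specificity: float
    npv: float


def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> Metrics:
    """Standard definitions from a 2x2 confusion table.

    Assumes every denominator is non-zero (at least one positive, one
    negative, and one negative call), which holds for the counts below.
    """
    return Metrics(
        sensitivity=tp / (tp + fn),  # positives correctly called positive
        specificity=tn / (tn + fp),  # negatives correctly called negative
        npv=tn / (tn + fn),          # negative calls that are truly negative
    )


# MRD training cohort, positive class = non-pCR:
# 8/8 non-pCR were MRD positive -> tp=8, fn=0
# 7/8 pCR were MRD negative     -> tn=7, fp=1
training = diagnostic_metrics(tp=8, fn=0, tn=7, fp=1)
print(training)
```

This gives a sensitivity of 1.0, a specificity of 0.875, and an NPV of 1.0, matching the reported 100%, 87.5%, and 100%. With only 8 patients per group, a single reclassified patient shifts each rate by 12.5 percentage points.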
Twenty of the 31 patients in the MRD training and validation cohorts underwent urine cytology or FISH assays before surgery. An overview of the diagnostic status is shown in Figure 7D. The utLIFE-UC model was approximately 3-fold more sensitive than cytology and 2-fold more sensitive than FISH (Figure 7E). These results further corroborated utLIFE-UC MRD detection as the sole predictor of pathologic response, with a clear separation between patients with pCR and those with non-pCR.