The systematic literature search yielded 775 records. After screening and eligibility assessment, 24 studies met the inclusion criteria. One trial contained both a prospective and a retrospective cohort, which were analyzed as two separate datasets. The pooled analysis was therefore carried out on 25 cohorts for a total of 1966 patients (Fig. 1). The main features of the selected studies are summarized in Table 1 and Additional Table 1.
To investigate heterogeneity and address specific clinical questions, we split participant data into subgroups according to tumor burden, sample size, diagnostic technique, sampling time, biological subtype, and hotspot mutation (Table 2).
Tumor burden. After extracting data from cohorts that evaluated different disease stages separately, 4 and 23 cohorts were assigned to the early and advanced subgroups, for a total of 55 and 1836 patients, respectively (Additional Table 3). In the advanced setting, we observed an AUC of 0.92, indicating excellent discrimination between mutated and wild-type patients (Additional Fig. 6 and Table 2). Although they could not be evaluated in terms of diagnostic accuracy because of missing data, we also examined disease distribution and the number of metastatic lesions, which were available from 9 and 8 cohorts, respectively. Most of the examined population had visceral involvement and at least two metastatic lesions (Additional Table 6). Pooled diagnostic values for the early subgroup were comparable to those of the advanced subgroup, although they were derived from a very limited sample size (Additional Fig. 6 and Table 2). Absolute sensitivity rates were lower in the earlier stages (ref. 25), but the pooled diagnostic values remained similar to those of the advanced setting (Table 2).
Sample size. Using the median number of included patients (45 individuals) as the cutoff, 12 and 13 studies were assigned to the low- and high-size subgroups, respectively; ctDNA performance was highest in the low-size studies according to the diagnostic values (Additional Figs. 7 and 8). Notably, smaller studies yielded higher pooled specificity and DOR than the larger, heterogeneous studies (0.96 and 40.42 versus 0.85 and 27.11, respectively) (Additional Fig. 9).
Diagnostic technique. The most frequently used techniques were ddPCR/BEAMing (12 cohorts, 1485 patients), followed by NGS (9 cohorts, 307 patients) and PCR (5 cohorts, 174 patients) (Additional Table 5). The ctDNA PIK3CA MAF was reported as the median and/or mean of all mutated cases, or was calculated from supplementary data (7/25 studies) (Additional Table 9). NGS appeared to outperform ddPCR/BEAMing and PCR in terms of sensitivity (0.83 versus 0.74 and 0.51, respectively) (Additional Fig. 10 and Table 2). The ddPCR/BEAMing subgroup showed a lower pooled specificity (0.84) than NGS (0.98) and PCR (0.96). Overall, NGS outperformed PCR-based assays in terms of sensitivity, specificity, and AUC (0.98), with no resulting heterogeneity for specificity (Additional Fig. 10a) and with PLR, NLR, and DOR values that favored NGS over PCR-based methodologies (Table 2).
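For readers less familiar with the diagnostic measures compared across subgroups, the following minimal sketch shows how sensitivity, specificity, PLR, NLR, and DOR follow from a single cohort's 2x2 table, taking tissue genotyping as the reference standard. The class and function names, the 0.5 continuity correction, and the counts are illustrative assumptions, not the analysis pipeline or data of this study. Pooled estimates from a meta-analytic model need not satisfy these single-table identities exactly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts for one cohort, with tissue genotyping as the reference standard."""
    tp: float  # ctDNA mutated, tissue mutated
    fp: float  # ctDNA mutated, tissue wild-type
    fn: float  # ctDNA wild-type, tissue mutated
    tn: float  # ctDNA wild-type, tissue wild-type


def diagnostic_metrics(t: TwoByTwo) -> dict[str, float]:
    """Return per-cohort accuracy measures from a 2x2 table."""
    # Assumption: add 0.5 to every cell when any cell is zero,
    # so that the ratios below stay finite.
    if min(t.tp, t.fp, t.fn, t.tn) == 0:
        t = TwoByTwo(t.tp + 0.5, t.fp + 0.5, t.fn + 0.5, t.tn + 0.5)
    sens = t.tp / (t.tp + t.fn)   # sensitivity
    spec = t.tn / (t.tn + t.fp)   # specificity
    plr = sens / (1 - spec)       # positive likelihood ratio
    nlr = (1 - sens) / spec       # negative likelihood ratio
    return {
        "sensitivity": sens,
        "specificity": spec,
        "PLR": plr,
        "NLR": nlr,
        "DOR": plr / nlr,         # equals (tp * tn) / (fp * fn)
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
example = diagnostic_metrics(TwoByTwo(tp=30, fp=2, fn=10, tn=58))
```

A higher DOR indicates better overall discrimination, which is why it is reported alongside sensitivity and specificity when techniques are compared.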
Sampling time. Among 20 studies, tissue biopsies were mainly performed on the primary site, with 4 studies performing tissue biopsy on metastatic lesions (Additional Table 6). According to data available for 13 cohorts, the time between tissue and plasma sampling was variable, ranging from 0 days to over 15 years (Additional Table 10). Patients were assigned to low- and high-time subgroups (≤ 18 and > 18 days, respectively) according to the median time between tissue and plasma collection. The best ctDNA performance in terms of sensitivity, specificity, and AUC (0.85, 0.99, and 0.94, respectively) was observed in the low-time subgroup, which also showed favorable PLR, NLR, and DOR values (16.24, 0.21, and 101.50, respectively) with acceptable heterogeneity (Additional Fig. 11 and Table 2).
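The median-based assignment described above can be sketched as follows. This is an illustration only: it assumes one representative tissue-to-plasma interval per cohort, and the function name, cohort labels, and interval values are hypothetical rather than taken from the included studies.

```python
from statistics import median


def split_by_median(intervals_days: dict[str, float]) -> tuple[float, list[str], list[str]]:
    """Split cohorts into low-time (<= median) and high-time (> median) subgroups."""
    cutoff = median(intervals_days.values())
    low = [name for name, days in intervals_days.items() if days <= cutoff]
    high = [name for name, days in intervals_days.items() if days > cutoff]
    return cutoff, low, high


# Hypothetical cohort labels and intervals (in days), for illustration only.
intervals = {
    "cohort_A": 0,
    "cohort_B": 7,
    "cohort_C": 18,
    "cohort_D": 120,
    "cohort_E": 900,
}
cutoff, low_time, high_time = split_by_median(intervals)
```

With a cutoff defined this way, a cohort whose interval equals the median falls in the low-time subgroup, matching the ≤ and > convention stated in the text.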
Biological subtype. The H+/HER2- and HER2+ subgroups were included in 5 and 10 studies, respectively (Additional Table 11), with very few data available on triple-negative BCs. ctDNA performance was comparable between the two subgroups for AUC (0.87 and 0.86, respectively) and the other diagnostic values, although sensitivity was higher in the H+/HER2- than in the HER2+ subgroup (0.73 versus 0.57, respectively) (Additional Fig. 11 and Table 2).
Hotspot mutation. Considering the most frequently involved PIK3CA mutations within exons 9 and 20, 12 and 10 studies were pooled for the H1047X and E542/545X subgroups (520 and 421 patients, respectively) (Additional Table 4). ctDNA assays tended to be slightly more accurate in detecting H1047X than E542/545X in terms of sensitivity, specificity, and AUC (0.74, 0.98, and 0.93 versus 0.70, 0.95, and 0.88, respectively) (Additional Fig. 12a-b and Table 2).