The Screening Value of Mammography for Breast Cancer: An Overview of 22 Systematic Reviews with Evidence Mapping

Background: Several meta-analyses have evaluated the screening value of mammography for breast cancer, but the overall results have remained mixed or inconclusive. Methods: A comprehensive literature search for systematic reviews (SRs) was conducted in the Chinese Biomedical Databases (CBM), Cochrane Library, EMBASE, and PubMed up to July 10, 2020. SRs with meta-analysis that reported the benefit and performance of mammography screening were included. Two reviewers independently extracted data and assessed methodological quality using the Risk Of Bias In Systematic Reviews (ROBIS) tool. The characteristics of the included SRs, the results of the risk of bias (RoBs) assessment, and the pooled estimates of effect size were descriptively summarized using systematically structured tables and evidence mapping. Results: Twenty-two systematic reviews with meta-analysis were included. Only 13.6% of SRs were assessed as low risk of bias according to the overall ROBIS rating. Pooled estimates for the reduction in breast cancer mortality attributable to mammography screening ranged from 0.51 (OR, 95% CI: 0.46-0.55) to 1.04 (RR, 95% CI: 0.84-1.27). The sensitivity of the different mammography modalities ranged from 55% to 91%, and the specificity ranged from 84% to 97%. The included SRs suggested statistically significant findings that digital breast tomosynthesis (DBT) increased the cancer detection rate (CDR) and reduced the recall rate compared with digital mammography (DM), that DM increased the CDR compared with screen-film mammography (SFM), and that adding DBT to digital or synthetic mammography increased sensitivity, specificity, and CDR compared with DBT alone. Conclusions: Further studies should investigate the value of different imaging technologies in breast cancer screening.


Study selection
The identified records were imported into EndNote X8 (Thomson Reuters (Scientific) LLC Philadelphia, PA, US) for management. After duplicate records were removed, potential reviews were first selected by title and abstract by two independent authors (JYS and LNX). Records that did not meet the inclusion criteria were excluded. The full text of each potential review was then obtained and assessed by two reviewers (JYS and YYK) to determine whether it met the eligibility criteria. Discrepancies between the two authors were resolved by discussion with a third reviewer (JHT).

Study selection and data extraction
We used EndNote X8 (Thomson Reuters (Scientific) LLC Philadelphia, PA, US) to manage the literature search records. Two independent reviewers (JYS, YYK) screened titles and abstracts for possibly relevant studies and excluded records that did not meet the inclusion criteria. The same two reviewers then evaluated the full texts to select the studies that met the inclusion criteria. Any disagreements were discussed and resolved by a third reviewer (JHT).

Data extraction and management
All authors involved in this study had previously piloted the form on a random sample of three included SRs to ensure agreement on the interpretation of data items. Two reviewers (JYS and JL) extracted data from the included studies using a data extraction sheet, and two further reviewers (NL and JHT) verified the extracted data. The extracted data included: authors, year of publication, country of the corresponding author, databases searched, target population, intervention/exposure, comparison/control, outcome measures, study design, the methodological quality assessment tool used, number of included studies and patients, follow-up, and the pooled results with their 95% confidence intervals (CI). Any disagreements were adjudicated by the third reviewer (NL).

Assessment of methodological quality
The Risk Of Bias In Systematic Reviews (ROBIS) tool was recently published and aims to help assessors judge the risk of bias (RoBs) in the review process, results, and conclusions; it can be applied to SRs including both RCTs and non-RCTs [11,12]. All reviewers involved in this study had previously piloted the form on a random sample of three included SRs to ensure agreement on each ROBIS criterion before employing the tool. Two review authors (J.Y.S. and H.J.) performed quality assessments independently using ROBIS, which is completed in three phases: (1) Phase 1 assesses the relevance of the review and is considered optional; (2) Phase 2 includes four domains: study eligibility criteria, identification and selection of studies, data collection and study appraisal, and synthesis and findings; (3) Phase 3 assesses the overall RoBs. The results of each domain and of phase 3 were rated as "high risk", "low risk" or "unclear risk". Any disagreements were adjudicated by the third reviewer (J.H.T.).

Data synthesis
The pooled estimates of effect size from the meta-analyses and their 95% CIs were expressed as odds ratio (OR), relative risk (RR), risk difference (RD), or event rate (ER), depending on what the authors had reported. The characteristics of the included SRs, the results of the RoBs assessment, and the pooled estimates of effect size were descriptively summarized using systematically structured tables and evidence mapping.
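As a minimal illustration of the effect measures named above, the sketch below computes OR, RR, and RD from a single 2x2 table, together with a Wald 95% CI for the OR. The counts and function names are hypothetical and are not taken from any included SR; the pooled estimates in the reviews come from meta-analytic models, not from a single-table calculation like this one.

```python
import math

def effect_measures(a, b, c, d):
    """Effect measures from one 2x2 table.

    a, b: deaths / survivors in the screened group (hypothetical counts)
    c, d: deaths / survivors in the unscreened group
    """
    risk_screened = a / (a + b)
    risk_control = c / (c + d)
    return {
        "OR": (a * d) / (b * c),
        "RR": risk_screened / risk_control,
        "RD": risk_screened - risk_control,
    }

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Wald confidence interval for the OR, built on the log scale."""
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)

def percent_reduction(ratio):
    """Relative reduction (in percent) implied by a ratio measure below 1."""
    return (1 - ratio) * 100

# Hypothetical table: 30/1000 deaths with screening, 50/1000 without.
measures = effect_measures(30, 970, 50, 950)
low, high = odds_ratio_ci(30, 970, 50, 950)
print(measures, (low, high))

# An OR of 0.51 corresponds to roughly a 49% reduction in odds.
print(percent_reduction(0.51))
```

Values below 1 for OR and RR (or below 0 for RD) favor screening, which is why a pooled OR of 0.51 is read as a reduction while an RR of 1.04 is not.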

Search results
The search for this overview retrieved 826 records; 191 articles were excluded after removing duplicates and screening titles and abstracts. Of the remaining 54 records, 9 were excluded because they lacked the target outcomes of this overview, 11 because no meta-analysis was conducted, and 12 because they did not focus on BC screening. Ultimately, 22 studies were included in this overview [2,. Details of the PRISMA flow chart of literature selection for this overview are presented in Fig. 1.
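The full-text stage of the selection flow can be checked with simple arithmetic. The sketch below uses only the counts reported above; the variable names are illustrative.

```python
# Records assessed in full text and the reasons for exclusion, as reported.
full_text_assessed = 54
excluded_by_reason = {
    "no target outcome": 9,
    "no meta-analysis": 11,
    "not focused on BC screening": 12,
}

# Reviews remaining after full-text exclusions.
included = full_text_assessed - sum(excluded_by_reason.values())
print(included)
```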

Characteristics of included SRs
The detailed characteristics of the 22 included SRs with meta-analysis, which included between 4 and 24 primary studies each, are presented in Table 1. Nineteen SRs were from developed countries and three from developing countries; 17 SRs reported the databases searched, and only one searched Chinese databases. 11 assessed the
Nine SRs were identified that compared mammography with no screening with respect to BC mortality. Pooled estimates for the reduction in BC mortality attributable to mammography screening, stratified by study design, ranged from 0.51 (OR, 95% CI: 0.46-0.55) to 1.04 (RR, 95% CI: 0.84-1.27). An SR of 10 case-control studies found an average 49% reduction in BC mortality for women who were screened (OR 0.51, 95% CI: 0.46-0.55), which was similar to SRs with quasi-RCTs conducted by Gøtzsche (Table 2) [25,26,32].
In the SRs of RCTs stratified by age, a significant reduction in BC mortality among women aged 40-49 invited to screening mammography was observed with the latest follow-up data by Hendrick

Performance of mammography screening technology
Eight SRs reported the accuracy of BC screening conducted using mammography (Table 3). The sensitivity of the different mammography modalities ranged from 55% to 90.77%, and the specificity ranged from 84% to 97%.
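For readers less familiar with the accuracy measures reported here, the following sketch shows how sensitivity and specificity are derived from the four cells of a diagnostic 2x2 table. The counts are hypothetical and chosen only for illustration; they do not come from any included SR.

```python
def sensitivity(true_pos, false_neg):
    """Share of women with cancer whose screen was positive."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Share of women without cancer whose screen was negative."""
    return true_neg / (true_neg + false_pos)

# Hypothetical screening round: 100 cancers, 1000 women without cancer.
sens = sensitivity(true_pos=80, false_neg=20)
spec = specificity(true_neg=900, false_pos=100)
print(f"sensitivity={sens:.0%}, specificity={spec:.0%}")
```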

Discussion
To our knowledge, this overview is the first to systematically evaluate the methodological quality of evidence from SRs in order to provide an evidence-based assessment and summarize the accuracy and efficacy of BC screening. A comprehensive literature search was performed and 22 SRs, published from 1995 to 2020, were identified; the number of studies included in each SR ranged from 4 to 24.

Risk of bias in systematic reviews
High-standard SRs can provide the highest level of evidence for evidence-based decision making, whereas poorly designed or conducted SRs can bias results and mislead clinical practice. It is therefore crucial to assess the quality of SRs before they play a role in decision making. We applied the ROBIS tool to assess the quality of the included SRs, and the results were not optimistic. None of the SRs met all four domains in phase 2.
Only 36.4% of SRs had low RoBs in study eligibility criteria (domain 1). SRs without a prior protocol may have reduced transparency and increased RoBs in outcome reporting, and previous studies have found that the quality of SRs could be improved by an a priori protocol. However, only six SRs reported a prior study protocol or registration, which is similar to previous studies. Although some SRs clearly described the question, objective, and criteria, it was still difficult to judge whether these were stipulated in advance and thus avoided RoBs.
Only 18.2% of SRs were at low risk for every signaling question in domain 2 (identification and selection of studies): thirteen SRs searched three or more databases; six reported a detailed search strategy, identified unpublished studies, and searched registry platforms; eight did not report the follow-up time; and seven did not involve at least two reviewers independently selecting studies for inclusion.
For domain 3 (data collection and study appraisal), 45.4% of SRs were at low risk; 50% of SRs assessed the RoBs or methodological quality using an appropriate tool, and in only eight was the RoBs of included studies assessed independently by at least two reviewers.
According to domain 4 (synthesis and findings), 59.1% of SRs had potential methodological flaws. First, an inadequate search of the existing literature may miss the results of some studies. Second, the lack of a bias risk assessment process and the inclusion of biased primary studies may lead to biased and unreliable results. Third, without protocol and registration information it is difficult to determine whether the data synthesis and analysis methods were specified in advance and followed. Last, substantial heterogeneity was observed, but further analyses were not conducted to explore the source of the variation.
Only 13.6% of SRs were assessed as low risk of bias according to the overall RoBs rating.
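The percentages in this subsection can be translated back into approximate numbers of reviews. The sketch below assumes that every percentage is a share of the 22 included SRs; that denominator is an assumption made for illustration, and the resulting counts are back-calculated, not reported by the overview.

```python
N_INCLUDED = 22  # assumed denominator for every percentage below

reported_percent = {
    "low risk overall": 13.6,
    "low risk, domain 1": 36.4,
    "low risk, domain 2": 18.2,
    "low risk, domain 3": 45.4,
    "appropriate RoB tool used": 50.0,
    "potential flaws, domain 4": 59.1,
}

# Nearest whole number of SRs implied by each reported percentage.
implied_counts = {
    label: round(pct / 100 * N_INCLUDED)
    for label, pct in reported_percent.items()
}

for label, count in implied_counts.items():
    print(f"{label}: about {count} of {N_INCLUDED} SRs")
```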
Nine SRs explored the effectiveness of mammography screening; pooled estimates for the reduction in BC mortality ranged from 0.51 to 1.04. SRs that included RCTs found a lower BC mortality than SRs that included other study designs, which was similar to the subgroup analysis of the Cochrane analysis. In the SRs stratified by age, three SRs of RCTs found a statistically significant reduction in mortality among women aged 40 to 49 years with screening mammography, whereas the pooled estimates for this age group were not statistically significant in two other SRs. According to the Cochrane analysis, the subgroup that combined only sub-optimally randomized trials showed a statistically significant reduction in mortality in women younger than 50 years [32]. The SR conducted by Nelson et al. found that BC mortality is generally reduced with screening among women aged 50-59 and 60-69 [2]; however, pooled estimates were not statistically significant for women aged 70 to 74 years. Both Kerlikowske et al. and the Cochrane analysis found that screening mammography significantly reduces BC mortality in women aged 50 to 74 years, although the Cochrane analysis found no statistically significant reduction when only adequately randomized trials were combined [28,32]. In general, most of the included SRs suggested that mammography screening benefits BC mortality reduction, although the results were not statistically significant in all SRs. Moreover, the included SRs contradicted one another on the benefit of mammography screening for BC mortality reduction among women aged 40 to 49.

Performance of mammography screening technology
Improvements in mammographic technology have driven the transition from film mammography (FM) to DM and DBT, which may increase the benefit of screening by increasing the detection accuracy of BCs. Five studies reported the sensitivity and specificity of mammography, which ranged from 55% to 90.77% and from 84% to 97%, respectively. Hodgson et al. found that DBT plus full-field digital mammography (FFDM) has higher sensitivity and specificity than FFDM alone, and Song et al. observed that the performance of DM was similar to that of FM [17,20]. Zhu et al. found that contrast-enhanced spectral mammography (CESM) has high diagnostic accuracy, but significant heterogeneity was observed and no potential source of heterogeneity could be analyzed because of partial loss of data.
Six studies compared the CDR and recall rate of different mammography modalities. The SRs conducted by Iared et al. and Farber et al. both found that the cancer detection rate of DM was statistically significantly higher than that of FM; however, the difference in recall rate was not significant in the former, whereas the latter found a significant increase [14,23]. Vinnicombe et al. reported no evidence of differences between FFDM and FM in CDR and recall rate [15]. Marinovich et al. found that DBT statistically significantly improved the CDR and reduced recall compared with DM [21]. The use of DBT plus digital or synthetic mammography could improve the CDR of BC, while the recall rate was reduced according to Phi et al. and not significantly improved according to Giampietro et al. [16,22]. Yun et al. found that adding DBT to FFDM is more effective in detecting BC than FFDM alone, especially in increasing the detection of early invasive BC specificity, and CDR than DBT alone [19]. Most of the included reviews suggested that DBT increased the CDR and reduced the recall rate compared with DM, that DM increased the CDR compared with FM, and that adding DBT to digital or synthetic mammography increased sensitivity, specificity, and CDR compared with DBT alone. However, potential methodological flaws and differing reference standards challenge the interpretation of these results. More high-quality SRs and primary studies are needed to comprehensively determine the performance of different mammography modalities.
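The two screening-performance measures compared in this paragraph can be sketched as follows. The definitions used here (CDR as cancers detected per 1,000 screening examinations, recall rate as the share of screened women recalled for further assessment) are the conventional ones and are assumed, since the overview does not spell them out; the counts are hypothetical.

```python
def cancer_detection_rate(cancers_detected, screens, per=1000):
    """Cancers detected per `per` screening examinations."""
    return cancers_detected / screens * per

def recall_rate(recalled, screens):
    """Share of screened women recalled for further assessment."""
    return recalled / screens

# Hypothetical programme: 10,000 screens, 50 cancers found, 400 recalls.
cdr = cancer_detection_rate(cancers_detected=50, screens=10_000)
recall = recall_rate(recalled=400, screens=10_000)
print(f"CDR={cdr:.1f} per 1000, recall={recall:.1%}")
```

A modality is favored in these comparisons when it raises the CDR without raising the recall rate, since a higher recall rate means more women undergo additional work-up.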

Strengths and limitations
To our knowledge, this is the first overview to investigate the methodological quality of SRs using the ROBIS tool and to assess both the benefit of mammography screening for BC mortality reduction and the accuracy of mammography. The present overview has some limitations. First, the included SRs were published between 1995 and 2020; the estimated reduction in BC mortality may have been increased by improved treatment of more advanced cancer and by greater BC awareness, and the decision to participate in screening programs should carefully weigh the trade-offs, including over-diagnosis, costs, and other influences. Second, both Chinese and English databases were searched, but only English-language studies met our criteria, which may introduce language bias. Finally, the high RoBs of the included SRs might affect the reliability of the evidence summarized.

Conclusion
Most of the included reviews concluded that breast cancer mortality is generally reduced with mammography screening, with the exception of women under the age of 50 and those aged 70 to 74. With the development of advanced imaging technology, the diagnostic performance of mammography has been improving: most of the included SRs suggested that DM increased the CDR compared with FM, that DBT increased the CDR and reduced the recall rate compared with DM, and that adding DBT to digital or synthetic mammography increased sensitivity, specificity, and CDR compared with DBT alone. However, most of the included reviews were rated as being at high risk of bias; given the publication of methodological tools and reporting checklists, improvement is urgently needed.