The lack of systematic collection and reporting of BCDM means the number of patients living with BCDM is largely unknown, with data and intelligence lacking to effectively develop policies, plan and deliver Health and Social Care (HSC) services, including services within the charitable sector, for these patients (44,45).
We developed an algorithm to detect BCDM using datasets from the population-based cancer registry (NICR), the inpatient hospital administrative system (PAS) and death registration (GRO).
The rules-based algorithm detected patients with BCDM across NI with high levels of sensitivity (95.1%), specificity (99.2%), PPV (96.2%) and NPV (98.9%) (Table 1) and compared favourably to other algorithms estimating BCDM incidence utilising PBCR data (21,22,24,27,28).
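As a reminder of how the four validation measures quoted above are derived, the following minimal sketch computes them from confusion-matrix counts. The function name and the example counts are hypothetical and chosen for illustration only; they are not the values reported in Table 1.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from confusion-matrix counts.

    tp: algorithm-positive, truly BCDM      fp: algorithm-positive, not BCDM
    fn: algorithm-negative, truly BCDM      tn: algorithm-negative, not BCDM
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true BCDM cases detected
        "specificity": tn / (tn + fp),  # share of non-BCDM cases correctly ruled out
        "ppv": tp / (tp + fp),          # share of algorithm positives that are true BCDM
        "npv": tn / (tn + fn),          # share of algorithm negatives that are truly non-BCDM
    }


# Hypothetical counts for illustration only (not the study's validation data).
example = diagnostic_metrics(tp=90, fp=10, fn=5, tn=895)
for name, value in example.items():
    print(f"{name}: {value:.1%}")
```

Because PPV and NPV depend on how common BCDM is in the validation cohort, they are not directly comparable across studies with different case mixes, whereas sensitivity and specificity are less affected.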
To date, few studies have assessed the incidence and prevalence of BCDM in the UK. We estimate that in 2020 there were 911 patients living with BCDM in NI, of whom one third had de novo BCDM and two thirds had progressive BCDM (See Table 2).
A study by Yip et al (2015) used mortality data to determine the number of people living with BCDM in the UK per annum. This study estimated there were 35,000 BC survivors in the UK, comprising those with DM and not in their last year of life (65.7%) and those in their last year of life (34.3%).
Clements et al (2012) also developed a model to estimate BCDM prevalence based on mortality and incidence data, with their model predicting approximately 3 to 4 prevalent BCDM cases for every BC death (46). Applying this to the NI population, where the average annual number of BC deaths for 2016–2020 was 314, would equate to between 942 and 1,256 prevalent cases annually. The annual prevalent population we estimated, of 911 in 2020, was slightly below the lower end of this range, but importantly we excluded cases with another primary cancer either before the primary BC diagnosis or between the primary BC diagnosis and the progressive BCDM (6.3% of the total BC population).
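The range above follows from simple multiplication, sketched here for transparency. The function and variable names are illustrative; the ratios (3 to 4 prevalent cases per BC death), the 314 average annual deaths and the 911 estimate are taken from the text.

```python
def prevalence_range(annual_bc_deaths, low_ratio=3, high_ratio=4):
    """Expected prevalent BCDM cases, given a prevalent-cases-per-death ratio range."""
    return annual_bc_deaths * low_ratio, annual_bc_deaths * high_ratio


low, high = prevalence_range(314)  # average annual BC deaths in NI, 2016-2020
print(low, high)  # 942 1256

our_estimate = 911  # prevalent BCDM population estimated for 2020
print(our_estimate < low)  # True: the estimate sits just below the modelled range
```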
In a recent study that used secondary and tertiary care records from the English NHS Hospital Episode Statistics (HES) database to directly identify BCDM cases, Palmieri et al reported a prevalent population in England of 57,215 patients, and an incident population of 7,580, for the financial year 2020/21. However, this study included C77.3 (axillary node) cases within its definition of BCDM, did not exclude cases with other primary cancers and did not use a validation dataset. One of the strengths of our study is the exclusion of C77.3 cases from our definition of BCDM. We estimate that had C77.3 cases been included, total incidence would be over-estimated by 75%, with specificity decreasing to 77.1% and PPV decreasing to 48.7% (Supplementary Tables 2 and 3) (47).
As outlined above, previous studies have developed estimates of BCDM using a range of methods, including rule-based algorithms and machine learning models, and have used a range of datasets, which largely reflect their availability and suitability in the study’s region or country. Datasets that have been used to develop estimates include, for example, hospital-based and administrative datasets. Our study is unique in that it used population-based cancer registry (NICR) data and linked this to inpatient hospital administrative system (PAS) and death record (GRO) datasets. The NICR process of staging for primary BC patients involves manual review by NICR staff of each patient’s electronic records, including pathology and radiology, with high levels of staging completeness in recent years, which supports confidence in the accuracy of de novo BCDM patient data in this study (48). We used a novel method of free-text mining of GRO death records for DM descriptors to identify patients with BCDM. Free-text mining of death records between 2000–2020 yielded 77% of BCDM cases identified from GRO death notifications, with only 23% derived from GRO ICD10 secondary codes (See Fig. 1), highlighting the important contribution this technique makes to BCDM identification and estimation. Another study strength was the large validation dataset used, with 1,028 primary BC patients manually followed up for a period of 8 years, leading to the identification of 184 persons developing DM. Very few previous studies used similarly sized validation datasets (22), with the majority validating against substantively smaller datasets (24,27,28,34).
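To illustrate the general idea of free-text mining of death records, the sketch below flags cause-of-death text containing a DM descriptor. It is an assumption-laden example, not the rules used in this study: the descriptor list, the pattern and the function name are hypothetical, and the actual search terms applied to GRO records are not reproduced here.

```python
import re

# Hypothetical DM descriptors for illustration; the study's actual term list is not shown here.
DM_DESCRIPTORS = [
    r"metasta\w*",         # metastasis, metastases, metastatic
    r"secondar(?:y|ies)",  # "secondary", "secondaries"
    r"disseminated",
]
DM_PATTERN = re.compile(
    r"\b(?:" + "|".join(DM_DESCRIPTORS) + r")\b", re.IGNORECASE
)


def mentions_dm(cause_of_death_text):
    """Return True if the free-text cause of death contains any DM descriptor."""
    # Treat missing text (None or empty) as "no mention".
    return bool(DM_PATTERN.search(cause_of_death_text or ""))


print(mentions_dm("Metastatic carcinoma of breast"))  # True
print(mentions_dm("Carcinoma of breast"))             # False
```

A simple keyword match of this kind does not handle negation (for example "no metastases") or confirm that the metastatic disease relates to the breast primary, so any practical implementation would need further rules and validation against manually reviewed records.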
Limitations
Our algorithm found stable numbers of newly diagnosed BCDM cases per year from 2009–2020, but lower numbers in the years before 2009. However, prior to 2009 the GRO and NICR datasets were not as complete, particularly in the years before 2000. Prior to 2009 there were lower levels of staging completeness in the NICR dataset, whilst between 1993 and 1999 the GRO death database did not allow for free-text mining, as NICR did not receive cause of death text information at this time (See Fig. 2). As a result, some patients who developed BCDM in the period 1993–2008 and were still alive in 2020 may not have been identified, which may lead to an underestimate of the period prevalent population in 2020.
This study presents estimates of BCDM in NI which, in the absence of routine systematic collection of BCDM incidence in NI (45), depend on clinical coding of DM by HSC Trusts and which are not linked to the original primary BC diagnosis. However, as virtually all DM cases in NI are diagnosed in a secondary or tertiary care hospital within a universal healthcare system, this recording is likely to have high levels of completeness and accuracy in recent years. The timeliness of BCDM recording by HSC staff was also not known, and delays in recording BCDM by HSC staff may lead to an underestimate of more recent cases. However, as this effect will likely be distributed across the annual prevalent figures reported here, its impact is likely to be small. Further information on diagnostic and treatment pathway data for these patients may help improve the timeliness of BCDM diagnoses.
Another possible limitation is that patients with another primary cancer prior to their primary BC, or with another primary cancer prior to their BCDM record, were excluded. This represented 6.3% of all LR (Stage I-III) and “Stage not known” patients (See Fig. 1). Manual note review for this cohort may help ascertain whether the metastatic event was associated with the primary breast cancer rather than the other primary cancer, but this level of detail was not available for this study.