Estimating Uninsured and Underinsured Women Eligible for Minnesota’s Breast Cancer Screening Program

The mission of the National Breast and Cervical Cancer Early Detection Program (NBCCEDP) is to improve access to mammography and other health services for underserved women. Since its inception in 1991, this national program has improved breast cancer screening rates for women who are uninsured and underinsured. However, the literature has shown that NBCCEDP screenings are decreasing and reach only a portion of eligible women. Reliable estimates at the sub-county level are needed to identify and reach eligible women. Our work builds upon previous estimates by integrating insurance status into spatially adaptive filters. We use spatially adaptive filters to create small area estimates of standardized incidence ratios describing the utilization rate of NBCCEDP services in Minnesota. We integrate the American Community Survey (2010–2014) insurance status data to account for the probability that an individual is uninsured. We test five models that integrate insurance status by age, sex, and race/ethnicity. Our composite model, which adjusts for age, sex, and race/ethnicity insurance statuses, reduces estimation error by 95%. We estimate that approximately 49,914 women in Minnesota are eligible to receive services. We also create small geography (i.e., county and sub-county) estimates for Minnesota. The integration of the insurance data improved our utilization estimate. The development of these methods will allow state programs to use their resources more efficiently and understand their reach.


Introduction
The National Breast and Cervical Cancer Early Detection Program (NBCCEDP) is a national program administered by the Centers for Disease Control and Prevention that provides funding for breast cancer screening for low-income women (Health et al., 1995). Early detection of breast cancer through screening is critical to finding and treating cancer early, when treatment is more likely to be successful (Ginsburg et al., 2020).
The program has provided over 15.7 million mammograms and annually provides screening services for over 250,000 women (Centers for Disease Control and Prevention, 2022). However, the literature has not established, at either the national or the local level, how many women are eligible to receive NBCCEDP services.
Tangka et al. first described the nation's state-level estimates for NBCCEDP breast cancer screening services (Tangka et al., 2006). Using census population data, they estimated the number of women aged 40-64 at both the state and national levels and the percentage of those women who received services.
Howard and colleagues (2015) updated these national estimates by integrating the Current Population Survey to specifically estimate women's uninsured status by race/ethnicity and age group. Their results compared NBCCEDP screening from 2002-2003 and 2011-2012. Nationally, the NBCCEDP screened more women, but the number of eligible women increased dramatically, lowering the proportion of women reached by the program. Subramanian and colleagues (2015) continued this work by examining the effectiveness of the NBCCEDP state-run programs. A contribution of their work is the integration of Behavioral Risk Factor Surveillance System (BRFSS) data. The BRFSS mammography prevalence measure is used as a proxy measure for women who were current with mammography. In contrast to the national approaches, Hughes et al. (2021) used a geographic small-area estimation technique to estimate the number of women eligible for Minnesota's NBCCEDP. This work provided estimates and utilization rates of mammography services at the sub-county level. A limitation of this work was its inability to provide reasonable uninsured estimates.
The lack of an accurate population description for uninsured and underinsured women has been a consistent problem for estimating the population eligible for NBCCEDP services. Creating reliable estimates is elusive because publicly available insurance estimates are aggregated by geographic location, race, and income. The Small Area Health Insurance Estimates (SAHIE) provides reliable 1-year estimates at the county level but has not yet been used for creating NBCCEDP estimates. We address this gap in the literature through the integration of detailed uninsured and underinsured rates by age, race, and income. However, instead of integrating the one-year SAHIE data, we integrate the 5-year uninsured data from the American Community Survey (ACS), which are available for small areas (i.e., census tracts).

Methods

Study Context
The Minnesota NBCCEDP, "Sage" screens approximately 15,700 women a year, across a network of over

Population Data
The Research Triangle Institute (RTI) 2010 U.S. Synthesized Population dataset was used as the base population from which we defined different eligible populations (Rineer & Bobashev, 2020). The synthetic population provides "individuals" with age, sex, and race information, as well as household size and income. Each individual is linked to a household and is provided a geographic location. Using this information, we can calculate the population estimates of women in Minnesota aged 40 and older who meet Sage's criteria for mammography screening.
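The selection of the base population can be sketched as a simple filter over synthetic individuals. This is a minimal illustration, not the authors' implementation: the `SyntheticPerson` fields and the `income_ok` predicate are hypothetical names, and Sage's actual income criteria are not given in the text, so the predicate is left as a parameter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyntheticPerson:
    # Hypothetical record layout; the text only says that age, sex, race,
    # household size, income, and a geographic location are available.
    person_id: int
    sex: str  # "F" or "M"
    age: int
    race: str
    household_income: float
    household_size: int
    lon: float
    lat: float


def base_population(people, min_age=40, income_ok=lambda person: True):
    """Keep women aged `min_age` or older who pass the income predicate.

    `income_ok` stands in for Sage's screening criteria, which are an
    assumption here because the text does not state them.
    """
    return [
        p for p in people
        if p.sex == "F" and p.age >= min_age and income_ok(p)
    ]
```

Passing a different `income_ok` predicate lets the same filter define the alternative eligible populations mentioned above.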

Health Insurance Data
We used the publicly available American Community Survey (ACS) 5-year estimates for 2010 to 2014 for estimates of individuals who have insurance. Insurance estimates are provided in three separate datasets by age-sex, race, and income (Manson et al., 2022). While breast cancer affects both men and women, for our analysis we used only women from the age-sex insurance information and refer to it as age. ACS insurance estimates are available at both the census tract and ZIP code levels. We chose ZIP Code Tabulation Areas (ZCTAs) as the geographic unit for the uninsured estimates in this study. ZCTAs were chosen to avoid biasing the results, because some cases provided only their ZIP code.

Integrating Health Insurance into Spatially Adaptive Filters
Spatially adaptive filters are a geographic small area interpolation technique that controls for errors in estimate calculation by using a population threshold (Beyer & Rushton, 2009; Haynes et al., 2022; Tiwari & Rushton, 2005). The filters grow in size to ensure that each area calculated has the minimum population. The reference grid and population threshold used for creating the spatially adaptive filters are the same as in our previous work (Hughes et al., 2021). We integrated health insurance into the adaptive filters by applying to each individual the uninsured percentage for their age, race, and income in the ZIP code in which they reside.
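The way a filter grows until it reaches the population threshold can be sketched as follows. This is a simplified illustration under stated assumptions, not the published implementation: it uses planar Euclidean distance, a single grid point, and a hypothetical function name, and each point carries a weight (for example, an individual's probability of being uninsured).

```python
import math


def adaptive_filter_radius(center, points, threshold):
    """Return (radius, captured_weight) for the smallest circle around
    `center` whose summed weight reaches `threshold`, or None if the
    threshold cannot be met.

    `points` is an iterable of (x, y, weight) tuples. Planar distance is
    an assumption made for brevity.
    """
    cx, cy = center
    by_distance = sorted(
        (math.hypot(x - cx, y - cy), weight) for x, y, weight in points
    )
    total = 0.0
    i = 0
    while i < len(by_distance):
        radius = by_distance[i][0]
        # Add every point at this distance so ties fall inside the circle.
        while i < len(by_distance) and by_distance[i][0] == radius:
            total += by_distance[i][1]
            i += 1
        if total >= threshold:
            return radius, total
    return None
```

Repeating this for every point on the reference grid yields filters that are small in dense areas and large in sparse ones, which is what keeps each estimate backed by at least the minimum population.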
To do this, we used the synthetic population dataset, which facilitates integration with spatial and aspatial datasets. In our case, we needed to assign each eligible woman a probability that she is uninsured. We did this by spatially joining each woman's residence to a ZCTA. Once linked, she was assigned the values of that ZCTA. The synthetic dataset reports each individual's age, race, and income characteristics, which allowed us to assign the appropriate ACS insurance estimates for each of these categories. Each insurance category (i.e., age, race, income) is subdivided into additional subcategories. The age category has five subcategories (35-44, 45-54, 55-64, 65-74, and 75+), income also has five subcategories (<25K, 25K-49K, 50K-74K, 75K-100K, and >100K), and race has six subcategories (i.e., White-nonHispanic, Black, Asian, Pacific Islander, some other race, and two or more races). The synthetic population dataset supports the same race definitions but does not have a category for ethnicity. Therefore, we align the synthetic population designated as White with the White-nonHispanic ACS insurance estimates. Aligning these categories allows us to provide precise insurance estimates for individuals based on their age, race, and income characteristics and on the ZCTA where each person resides.
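The category alignment described above can be sketched as a lookup. This is an illustrative sketch only: it assumes the spatial join has already attached a ZCTA code to each person, the dictionary layout and function names are hypothetical, and the exact band boundaries are an assumption because the source lists them only loosely.

```python
# The text aligns the synthetic "White" category with the ACS
# "White-nonHispanic" estimates; other race labels pass through unchanged.
RACE_ALIGNMENT = {"White": "White-nonHispanic"}


def age_band(age):
    """Map an age to one of the five ACS age bands (boundaries assumed)."""
    if age < 35:
        raise ValueError("age is below the youngest insurance band")
    if age >= 75:
        return "75+"
    low = 35 + 10 * ((age - 35) // 10)
    return f"{low}-{low + 9}"


def income_band(income):
    """Map a household income to one of the five income bands (assumed cut points)."""
    if income < 25_000:
        return "<25K"
    if income < 50_000:
        return "25K-49K"
    if income < 75_000:
        return "50K-74K"
    if income <= 100_000:
        return "75K-100K"
    return ">100K"


def assign_uninsured_rates(person, zcta_tables):
    """Look up the age, race, and income uninsured rates for one person.

    `person` is a dict with "zcta", "age", "race", and "income" keys.
    `zcta_tables` maps a ZCTA code to {"age": {...}, "race": {...},
    "income": {...}}; this layout is hypothetical.
    """
    tables = zcta_tables[person["zcta"]]
    race = RACE_ALIGNMENT.get(person["race"], person["race"])
    return {
        "age": tables["age"][age_band(person["age"])],
        "race": tables["race"][race],
        "income": tables["income"][income_band(person["income"])],
    }
```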
To determine the insurance estimates for the composite model, we simply add together, for each individual, the uninsured estimates by age, race, and income, plus a constant of 0.1 for all individuals who do not identify as American Indian or Alaska Native (Fig. 1). Adding together each of the insurance categories allows for a comprehensive understanding of all insurance estimates. The constant we apply is a conservative estimate of women that are underinsured for SAHIE data (Schoen et
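The composite calculation described in this paragraph can be sketched as a per-person sum. This is a minimal sketch under stated assumptions: the function and constant names are hypothetical, and the text does not say whether the summed value is capped, so no cap is applied here.

```python
UNDERINSURED_CONSTANT = 0.1  # constant stated in the text


def composite_weight(age_rate, race_rate, income_rate, is_aian):
    """Sum the three uninsured rates for one person.

    The 0.1 underinsured constant is added only for individuals who do
    not identify as American Indian or Alaska Native, as the text states.
    """
    weight = age_rate + race_rate + income_rate
    if not is_aian:
        weight += UNDERINSURED_CONSTANT
    return weight


def estimated_eligible(weights):
    """Treat each weight as an expected count and sum over the population
    (summing as expected counts is an assumption of this sketch)."""
    return sum(weights)
```

Summing the weights over all women in an area gives an expected number of eligible women for that area, which can then serve as the denominator of a utilization rate.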

Results
The model that we have developed allows us to simulate different insurance scenarios that predict the population eligible for the NBCCEDP. Compared with all of the other models, Fig. 2E (Composite), which combines all of the insurance estimates, has the least error upon visual inspection. Figure 2E also has the utilization spatial pattern most similar to the original model (Fig. 2A) of all the new models. In particular, there are low utilization areas in the northwest and southwest portions of the state in Fig. 2A. The composite model (Fig. 2E) reflects a similar spatial pattern for these same areas. Table 2 provides summary statistics of the utilization rate for each of the models in Fig. 2. Overall, the models show that the state utilization rate of breast screening services varies from as low as 26.52% to as high as 53.93%. The largest standard deviation is attributed to the age model because this model had the lowest estimates of women eligible for the NBCCEDP, which resulted in the largest underestimates.
We also identified how often our utilization was over 100%. All of the maps have some residual error. The constant insurance model had about 5% of its cells with a utilization rate of over 100%. The age, income, and race models all increased the total amount of error in the data set. This is because these models underpredict the number of women eligible for the NBCCEDP in particular areas. Of the three models, the income model has the fewest cells with a value over 100%.
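The error measure used here, the share of cells whose utilization exceeds 100%, can be sketched as follows. This is an illustrative sketch with hypothetical function names; it assumes each grid cell carries a count of screened women and an estimate of eligible women, and it skips cells with no eligible women.

```python
def utilization_rate(screened, eligible):
    """Screened women divided by estimated eligible women, or None when
    the cell has no eligible women."""
    if eligible <= 0:
        return None
    return screened / eligible


def share_over_100(cells):
    """Fraction of cells with a utilization rate above 1.0 (i.e., 100%).

    `cells` is an iterable of (screened, eligible) pairs. A rate above
    100% means the model underestimated the eligible population there.
    """
    rates = [utilization_rate(s, e) for s, e in cells]
    valid = [r for r in rates if r is not None]
    if not valid:
        return 0.0
    return sum(r > 1.0 for r in valid) / len(valid)
```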
These results also show that our composite model, which integrates all three insurance statuses for an individual, is the best model. It improves our estimates by reducing estimation error by 95.2%. The composite model has less than 1% of cells with any error. In comparison, the other three insurance models (i.e., age, income, race) increase the overall error. We estimate that 49,914 uninsured and underinsured women are eligible to receive services within the state of Minnesota.
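A relative error reduction of this kind can be computed as below. This is a sketch of one plausible calculation, assuming that "error" is the number (or share) of cells with utilization over 100% in the original and composite models. The text does not state the exact formula, and the function name is hypothetical.

```python
def relative_reduction(baseline_error, new_error):
    """Fractional reduction in error relative to the baseline model.

    Both arguments are counts or shares of cells with utilization over
    100%. That definition of error is an assumption of this sketch.
    """
    if baseline_error <= 0:
        raise ValueError("baseline error must be positive")
    return (baseline_error - new_error) / baseline_error
```

A negative return value would indicate that a model increased the error, as reported for the age, income, and race models.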
While utilization rates are helpful for understanding the uptake of a particular resource, estimated counts are important for programmatic goal setting. Because the resulting eligible population uses synthetic geographic locations, it can be spatially aggregated into geographic boundaries. Sage works with providers in every county in Minnesota, and Fig. 3A describes, for each county, the number of underinsured or uninsured women who are eligible to use Sage screening services. A detailed description of the entire state's eligible population by county is included (Appendix 1). Figure 3A also illustrates that women eligible to receive services are distributed across both urban and rural counties. The two counties with the largest number of women eligible for NBCCEDP services (i.e., Hennepin and Ramsey) are the two most populated counties in Minnesota. These two counties belong to the highest category, meaning that over 5,000 women are eligible for NBCCEDP services in each county. The counties that are immediately adjacent (i.e., Wright, Anoka, Washington, Carver, Scott, and Dakota) are members of the second and third categories, meaning 501 to 5,000 women are eligible in each county. However, there are rural areas in the central northern portion of the state that also have a large number of women eligible for NBCCEDP services. Beltrami County, which has a large American Indian population, has over 1,000 women eligible for NBCCEDP services. Several counties in the northern central portion of the state (i.e., Becker, Otter Tail, Cass, Itasca, and Crow Wing) each have at least 500 women eligible for NBCCEDP services.
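Aggregating the eligible population into geographic boundaries can be sketched as a grouped sum. This is a minimal sketch with a hypothetical function name. It assumes each synthetic woman has already been assigned to an area (for example, by a point-in-polygon join) and carries an eligibility weight.

```python
from collections import defaultdict


def eligible_by_area(records):
    """Sum eligibility weights per area.

    `records` is an iterable of (area_name, weight) pairs, where the
    weight is a person's estimated probability of being eligible.
    """
    totals = defaultdict(float)
    for area, weight in records:
        totals[area] += weight
    return dict(totals)
```

The same function works for counties or for smaller units such as county divisions, since only the area label attached to each person changes.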
County-level estimates may be effective for state-wide initiatives but insufficient for local clinic efforts. Figure 3B highlights the variation of eligible women in the seven-county metro area previously described. Figure 3B uses census county divisions (CCDs), which are established geographic boundaries cooperatively developed by the Census Bureau and state and local governments (Federal Register, 2018). County division units were chosen over ZCTAs because our models integrated insurance data at the ZCTA level. The CCDs further highlight the spatial variation of individuals eligible for NBCCEDP services at the sub-county level and the importance of spatial disaggregation. Hennepin and Ramsey counties each have a single CCD (the designated areas of Minneapolis and St. Paul, respectively) in the highest category, which has over 1,001 individuals eligible for NBCCEDP services. The visual-spatial pattern in Fig. 3B indicates a trend in which CCDs with large numbers of women eligible for NBCCEDP services tend to be near similar CCDs.

Discussion
Our work builds upon previous work by Hughes and colleagues (Hughes et al., 2021) by integrating insurance status into spatially adaptive filters. The integration of these datasets has reduced the error in our original model by 95.2% and resulted in less than 1% error remaining in our mammography utilization estimates. Additionally, this work establishes small-area estimates for a state NBCCEDP. These methods fill a critical gap in the literature by providing detailed, high-spatial-resolution estimates at multiple geographic levels of the number of women eligible to receive NBCCEDP resources. This provides a foundation for future research, which can further examine the personal and structural barriers whose elimination allows for high utilization rates of mammography services in particular communities.
Mammography rates continue to be lower among low-income women and racial and ethnic minorities (Davis et al., 2014). Howard et al. (2015) estimated that while the total number of women using the NBCCEDP had increased, the overall percentage had decreased. This is why creating a reliable estimate of women eligible for the NBCCEDP is critical. The NBCCEDP serves a critical role in providing access to mammography for low-income women and racial and ethnic minorities, who often have irregular access to healthcare (Centers for Disease Control and Prevention, 2019; NBCCEDP Screening Program Summaries, 2022). Women who do not have regular access to care often delay mammography, which has a rippling effect on underserved populations and ultimately leads to worse survival and outcomes (Mootz et al., 2020). These underserved populations are more likely to be diagnosed with advanced cancer, in part due to missed opportunities for timely screening. In the US, African-Americans are more than twice as likely, and Hispanics 1.2 times as likely, as non-Hispanic whites to be diagnosed with metastatic disease (Clegg et al., 2009).

Limitations
Our work created estimates for only one state's NBCCEDP, and therefore it is unknown how generalizable the model is to other states. Future work should extend this research to multiple states to determine the generalizability of the model. Additionally, the estimates are based on the 2010 population and may need to be updated. However, the RTI synthetic population dataset used in this study is no longer freely available for the 2020 census.

Conclusions
Our geographic small area estimation approach demonstrates one potential critical step for accurately characterizing the population eligible for NBCCEDP services at the sub-county level. In particular, mammography utilization maps can be used by state officials to evaluate the efficacy of partnered clinics and plan interventions in communities. The estimates and maps are relevant resources that public health decision-makers could use for the deployment of limited resources and evaluation of the program. This work estimates the women eligible and demonstrates the utilization rates of mammography resources at the sub-county level.

Figure 1. Illustration of calculating uninsured and underinsured for Sage