Data
The data source for this study was the IBM MarketScan databases, which consist of the MarketScan Commercial Claims and Encounters database and the MarketScan Medicare Supplemental and Coordination of Benefits Database, for the years 2012 to 2017. The MarketScan Commercial Claims and Encounters database contains data for several million individuals annually who are covered by employer-sponsored private health insurance in the United States.[22] There are nearly 300 contributing employers and 25 contributing health plans. The population covers employees, their spouses, and dependents. The MarketScan Medicare Supplemental and Coordination of Benefits Database includes data on Medicare enrollees with employer-sponsored retiree plans that supplement their Medicare coverage.[22] Plans and employers from all 50 states and Washington, DC, contribute to both databases used in this study. This analysis was based on commercially available administrative claims data and did not involve direct contact with patients; therefore, Institutional Review Board approval was not required.
Study Population
We examined commercially insured adults older than 18 years of age and categorized them into the following cohorts: 1) cancer survivors with and without mental health condition(s); 2) enrollees diagnosed with chronic conditions other than cancer with and without mental health condition(s); and 3) non-cancer and non-chronic condition controls with and without mental health condition(s). For the cancer survivor cohort, we included enrollees having at least one of the cancer diagnoses (including primary cancer or metastases and excluding in situ or benign neoplasms) defined in eTable1 during the study period (January 1, 2012 to December 31, 2017). Because this study used claims data, cancer survivors could be identified only by whether enrollees had a cancer diagnosis in the study period, so enrollees whose only cancer diagnoses preceded the study period are not classified as cancer survivors. Inclusion was limited to enrollees who were continuously enrolled in a contributing health plan for at least 365 days immediately before and 365 days immediately after their first cancer diagnosis in the study period (index date).
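The index date and continuous-enrollment criteria can be illustrated with a minimal sketch. It assumes each enrollee has a single unbroken enrollment span, which is a simplification (claims data typically record several spans that would need to be merged first), and the function names are hypothetical rather than taken from the study code.

```python
from datetime import date, timedelta

STUDY_START = date(2012, 1, 1)
STUDY_END = date(2017, 12, 31)

def find_index_date(diagnosis_dates):
    """Return the first cancer diagnosis date inside the study period, or None."""
    in_period = [d for d in diagnosis_dates if STUDY_START <= d <= STUDY_END]
    return min(in_period) if in_period else None

def is_continuously_enrolled(enroll_start, enroll_end, index_date, window_days=365):
    """Check that one enrollment span (an assumption) covers the
    window_days before and after the index date."""
    window_start = index_date - timedelta(days=window_days)
    window_end = index_date + timedelta(days=window_days)
    return enroll_start <= window_start and enroll_end >= window_end

# Usage: a diagnosis before 2012 is ignored when choosing the index date.
idx = find_index_date([date(2014, 5, 1), date(2013, 3, 15), date(2011, 1, 1)])
eligible = is_continuously_enrolled(date(2012, 1, 1), date(2014, 12, 31), idx)
```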
We then constructed two matched cohorts of enrollees, one with any chronic condition other than cancer and the other with neither cancer nor chronic conditions. Enrollees were matched to the cancer survivor cohort on age, sex, type of health plan, and whether they had a chronic condition. For matching, the age and health insurance plan type of each cancer survivor were set as of December 31 of the year of the cancer diagnosis (index date). For the non-cancer chronic conditions cohort, henceforth referred to as ‘chronic condition enrollees,’ we included those who had at least one diagnosis for a chronic condition as specified in eTable2 during the study period and did not have a cancer diagnosis during the study period. For the cohort of non-cancer controls without any chronic conditions, henceforth referred to as ‘healthy enrollees,’ we included those who did not have any cancer diagnosis or any chronic condition during the study period, and who had at least one medical claim in the 365 days immediately before or after the index date of the cancer survivor with whom they were matched.
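As an illustration only, exact matching on the stated characteristics could be sketched as follows. The matching ratio and whether matching was done with replacement are not stated here, so the sketch assumes 1:1 matching without replacement; the record fields (`id`, `age`, `sex`, `plan_type`) are hypothetical.

```python
from collections import defaultdict

def match_key(person):
    """Characteristics used for exact matching (hypothetical field names)."""
    return (person["age"], person["sex"], person["plan_type"])

def exact_match(survivors, candidates):
    """Pair each cancer survivor with one unused candidate sharing the same key.

    Assumes 1:1 matching without replacement; survivors with no
    available candidate are left unmatched.
    """
    pool = defaultdict(list)
    for candidate in candidates:
        pool[match_key(candidate)].append(candidate)
    matches = {}
    for survivor in survivors:
        bucket = pool[match_key(survivor)]
        if bucket:
            matches[survivor["id"]] = bucket.pop(0)["id"]
    return matches

survivors = [
    {"id": "s1", "age": 52, "sex": "F", "plan_type": 1},
    {"id": "s2", "age": 52, "sex": "F", "plan_type": 1},
    {"id": "s3", "age": 40, "sex": "M", "plan_type": 2},
]
candidates = [
    {"id": "c1", "age": 52, "sex": "F", "plan_type": 1},
    {"id": "c2", "age": 40, "sex": "M", "plan_type": 3},
]
matches = exact_match(survivors, candidates)
```

In this toy input only the first survivor finds a match, because the single candidate with the same key is consumed once and the remaining candidate differs on plan type.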
Variables
The dependent variables examined are total costs and out-of-pocket (OOP) costs for 1) all healthcare services and 2) mental health services. Total costs were computed as the sum of health plan costs and OOP costs. Health plan costs are the costs paid to the provider by the health plan, or by one or more health plans in the case of Medicare supplemental coverage. OOP costs are the costs paid by the patient, computed as the sum of coinsurance, copayment, and deductible. The aggregate categories of mental health conditions and mental health services used in the analysis are presented in eTables3 and 4. Annual costs for each enrollee were calculated using the following formula:

All reported costs are adjusted to 2019 US$ using the Personal Health Care Deflator by the Centers for Medicare and Medicaid Services (CMS).[23]
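The cost components and the inflation adjustment can be sketched as below. This does not reproduce the annual-cost formula referred to above; it only shows the stated sums and a standard price-index rescaling. The function names are hypothetical, and the deflator values are placeholders, not actual CMS figures.

```python
def oop_cost(coinsurance, copayment, deductible):
    """Out-of-pocket cost: sum of the three patient-paid components."""
    return coinsurance + copayment + deductible

def total_cost(health_plan_paid, oop):
    """Total cost: health plan payments plus out-of-pocket payments."""
    return health_plan_paid + oop

def to_2019_dollars(nominal, year, deflator):
    """Rescale a nominal amount to 2019 dollars using a price index by year."""
    return nominal * deflator[2019] / deflator[year]

# Placeholder index values for illustration only (NOT the CMS deflator).
example_deflator = {2015: 100.0, 2019: 110.0}

oop = oop_cost(coinsurance=20.0, copayment=30.0, deductible=50.0)
total = total_cost(health_plan_paid=400.0, oop=oop)
total_2019 = to_2019_dollars(total, 2015, example_deflator)
```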
Age was categorized as 18-39, 40-49, 50-64, 65-74, 75-84, and 85+. Health plan type was categorized into three groups: 1) Comprehensive/Major Medical Plans, Comprehensive Plans, and Preferred Provider Organization (PPO) Plans; 2) Health Maintenance Organization (HMO) Plans, Exclusive Provider Organization (EPO) Plans, and Point-of-Service (POS) Plans; 3) Consumer-Driven Health Plans (CDHP) and High-Deductible Health Plans (HDHP). Control variables used to generate adjusted costs include age, sex, and number of unique comorbid chronic conditions. Number of unique comorbid chronic conditions is a count of all chronic conditions except cancer, as listed in eTable2, that the enrollee was diagnosed with during the study period.
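The categorizations above could be coded along the following lines. The age bands and plan groupings follow the text; the short plan-type codes and function names are hypothetical stand-ins for whatever coding the claims data actually use.

```python
AGE_BANDS = [
    (18, 39, "18-39"),
    (40, 49, "40-49"),
    (50, 64, "50-64"),
    (65, 74, "65-74"),
    (75, 84, "75-84"),
]

# Hypothetical short codes mapped to the three plan groups described in the text.
PLAN_GROUPS = {
    "COMP": 1, "PPO": 1,
    "HMO": 2, "EPO": 2, "POS": 2,
    "CDHP": 3, "HDHP": 3,
}

def age_category(age):
    """Map an age in years to the study's age band label."""
    if age < 18:
        raise ValueError("study population is limited to adults")
    for low, high, label in AGE_BANDS:
        if low <= age <= high:
            return label
    return "85+"

def plan_group(plan_code):
    """Map a plan-type code to one of the three plan groups."""
    return PLAN_GROUPS[plan_code]

def comorbidity_count(conditions):
    """Count unique chronic conditions, excluding cancer."""
    return len(set(conditions) - {"cancer"})
```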
Other independent variables used to stratify the results include, at the enrollee level, cancer type by system, cancer type by survival time, and timing of mental health condition diagnosis relative to cancer diagnosis. Area-level variables are defined at the Metropolitan Statistical Area (MSA) level. These include metropolitan status, region, poverty rate, population aged 65+, education, number of hospital beds, and number of psychiatrists. Details regarding operationalization of these variables are presented in eTable5.
Statistical Analysis
Because of the skewness in the distribution of costs, we used a two-part model. First, we used logistic regression to estimate the probability of an enrollee having any expenditures for the applicable set of healthcare services. Second, using the subset of enrollees with positive costs for the applicable set of healthcare services, we used a generalized linear model with gamma distribution and log-link to estimate the annual costs of all healthcare services and mental health services. For each demographic and socioeconomic stratification, we calculated adjusted annual costs using the two-part model with the mental health condition indicator as well as age, sex, and number of unique comorbid chronic conditions as independent variables. If the dependent variable in a particular stratification had only positive costs, costs were estimated using only a generalized linear model. The adjusted costs are presented as predictive margins, which standardize these estimates to the covariate distribution of the overall population. For any estimates calculated using the two-part model, standard errors (SEs) used to compute the confidence intervals (CIs) were calculated using the delta method.[24] Statistical significance was set at P ≤ .05, using two-tailed tests. All analyses were conducted using SAS 9.4 (SAS Institute, Cary, North Carolina) and Stata 16 (StataCorp LLC).
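To make the prediction step concrete, the sketch below shows how the two parts combine into an expected cost and how a predictive margin averages that prediction over the covariate distribution. It is not the SAS or Stata code used in the study: the coefficients are made up for illustration, model fitting is omitted, and the delta-method SEs are not shown.

```python
import math

def linear_predictor(coefs, row):
    """Intercept plus the sum of coefficient * covariate value."""
    return coefs["intercept"] + sum(
        beta * row[name] for name, beta in coefs.items() if name != "intercept"
    )

def expected_cost(row, part1, part2):
    """Two-part prediction: P(cost > 0) * E[cost | cost > 0]."""
    p_any = 1.0 / (1.0 + math.exp(-linear_predictor(part1, row)))  # logit link
    mean_if_positive = math.exp(linear_predictor(part2, row))      # log link
    return p_any * mean_if_positive

def predictive_margin(rows, part1, part2, mh_value):
    """Average prediction with the mental health indicator set to mh_value
    for every enrollee, leaving the other covariates as observed."""
    preds = [expected_cost({**row, "mh": mh_value}, part1, part2) for row in rows]
    return sum(preds) / len(preds)

# Hypothetical coefficients (NOT estimates from the study).
part1 = {"intercept": 0.0, "mh": 0.0, "age": 0.0}
part2 = {"intercept": math.log(1000.0), "mh": math.log(2.0), "age": 0.0}
rows = [{"mh": 0, "age": 30}, {"mh": 1, "age": 70}]

margin_with_mh = predictive_margin(rows, part1, part2, mh_value=1)
margin_without_mh = predictive_margin(rows, part1, part2, mh_value=0)
```

With these toy coefficients every enrollee has a 0.5 probability of any cost, and the mental health indicator doubles the conditional mean, so the two margins differ by a factor of two.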