Rurality, Cardiovascular Risk Factors, and Early Cardiovascular Disease among Childhood, Adolescent, and Young Adult Cancer Survivors

Background and Aims: Cardiovascular risk factors (CVRFs) later in life potentiate the risk of late cardiovascular disease (CVD) from cardiotoxic treatment among survivors. This study evaluated the association between baseline CVRFs and CVD in the early survivorship period. Methods: This analysis included patients aged 0–29 years at initial diagnosis who were reported in the institutional cancer registry between 2010 and 2017 (n = 1228). Patients who died within five years (n = 168), those not seen in the oncology clinic (n = 312), and those with CVD within one year of diagnosis (n = 17) were excluded. CVRFs (hypertension, diabetes, dyslipidemia, and obesity) within one year of initial diagnosis were constructed and extracted from the electronic health record based on discrete observations, ICD-9/10 codes, and RxNorm codes for antihypertensives. Results: Among survivors (n = 731), 10 incident cases (1.4%) of CVD were observed between one year and five years after the initial diagnosis. Public health insurance (p = 0.04) and late effects risk strata (p = 0.01) were positively associated with CVD. Among survivors with public insurance (n = 495), two additional cases of CVD were identified from claims data, for an incidence of 2.4%. Survivors from rural areas had a 4.1 times greater risk of CVD compared with survivors from urban areas (95% CI: 1.1–15.3), despite adjustment for late effects risk strata. Conclusions: Constructing clinically computable phenotypes for CVRFs among survivors through informatics methods was feasible. Although CVRFs were not associated with CVD in the early survivorship period, survivors from rural areas were more likely to develop CVD. Implications for Survivors: Survivors from non-urban areas and those with public insurance may be particularly vulnerable to CVD.


INTRODUCTION
Remarkable progress in the overall survival of children, adolescents, and young adults with cancer comes at a significant cost with late therapy-associated toxicities, and there is a critical need for risk-based care. Seventy-four percent of survivors developed at least one chronic health condition in adulthood, with a thirty-year cumulative incidence of 42% for a severe or life-threatening condition or death. 1 Adolescents and young adults also face unique challenges after cancer treatment. 2,3 Survivorship-focused, evidence-based care is essential to mitigate sequelae, such as heart failure and other cardiovascular diseases (CVD), to optimize the health of survivors and promote health equity. 4
Cardiovascular risk factors (CVRFs) potentiate cardiotoxicity from anthracyclines and chest radiation among survivors. CVRFs later in life elevate the risk for CVD in a near multiplicative fashion, with an excess relative risk for heart failure of 44.5 due to the interaction of hypertension and anthracyclines. 5 The inclusion of CVRFs, when added to treatment-related exposures, further refines risk prediction of subsequent heart failure among survivors. 6 For adults with cancer, baseline cardiovascular risk assessment prior to initiation of cardiotoxic chemotherapy aims to ameliorate cardiovascular complications. 7 [10][11][12] Indeed, CVRFs during childhood are associated with an increased risk of fatal and nonfatal cardiovascular events in adulthood. 13 [16][17] Moreover, fragmented healthcare systems for adolescent and young adult (AYA) survivors and those from non-urban areas underscore the role of data standards to surmount siloed data challenges through interoperability, with the ultimate goal to improve patient care. 18 Recent advances in data science, such as clinically computable phenotypes for hypertension and diabetes, offer strategies to leverage real-world data and accelerate population health research. 19,20
The objectives of this study were to implement a clinical informatics approach to identify CVRFs prior to or during cancer treatment among children, adolescents, and young adults (CAYA) with cancer and then analyze the impact of CVRFs on the subsequent development of CVD in the early survivorship period. As a secondary aim, survivors were linked to the Oklahoma Health Care Authority (OKHCA) data to evaluate potential inequities among survivors from non-urban areas and ameliorate underdetection bias from institutional data.

Survivor Cohort Construction
The institutional cancer registry at the Stephenson Cancer Center at the University of Oklahoma reports all newly diagnosed cases to the National Cancer Database (NCDB), as per Commission on Cancer accreditation standards. 21,22 The cancer registry contained the necessary demographic information (age at diagnosis, gender, and ZIP code to determine rurality). The cohort included children (ages 0-12 years), adolescents (ages 13-18 years), and young adults (ages 19-29 years) who, at the time of initial diagnosis, were evaluated in the academic pediatric oncology or medical oncology clinics and received their first course of treatment at their respective centers. To reflect the reliability of cancer registry data for these age groups and ensure a longitudinal follow-up of five years, we included five-year survivors diagnosed between January 1, 2010, and December 31, 2017. This research was submitted to and approved by the University of Oklahoma Health Sciences Review Board (IRB#14731) on June 15, 2022.
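As an illustration only, the age bands and diagnosis window described above could be expressed as the following Python sketch. The function names and the use of whole-year ages are assumptions made for this example; they are not taken from the registry's own tooling, and the sketch does not cover the clinic-visit or survival criteria.

```python
from datetime import date
from typing import Optional

# Diagnosis window stated in the text: January 1, 2010 to December 31, 2017.
WINDOW_START = date(2010, 1, 1)
WINDOW_END = date(2017, 12, 31)


def age_group(age_years: int) -> Optional[str]:
    """Map age at diagnosis (whole years) to the age bands named in the text."""
    if 0 <= age_years <= 12:
        return "child"
    if 13 <= age_years <= 18:
        return "adolescent"
    if 19 <= age_years <= 29:
        return "young adult"
    return None  # outside the 0-29 year range of the cohort


def meets_entry_criteria(age_years: int, diagnosis_date: date) -> bool:
    """Hypothetical helper: checks only the age band and the diagnosis window."""
    in_window = WINDOW_START <= diagnosis_date <= WINDOW_END
    return in_window and age_group(age_years) is not None
```

A registry extract could be filtered with `meets_entry_criteria` before the survival and clinic-visit exclusions are applied.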

Disease Classification and Late Effects Risk Stratification
As part of NCDB standards, the International Classification of Diseases-Oncology, third edition (ICD-O-3) was used to group diagnoses into primary malignancy categories based on the International Classification of Childhood Cancer, third edition (ICCC-3). 23 Coding for bone tumors, central nervous system tumors, Hodgkin's lymphoma, non-Hodgkin's lymphoma, leukemia, neuroblastoma, retinoblastoma, sarcoma, Wilms tumor, and other categories was previously reported. 24 The cancer registry captures whether patients received chemotherapy, surgery, radiation, or transplant as dichotomous variables. 21 Late effects risk stratification, based on primary diagnosis and dichotomous treatment exposures, was conducted based on the British Childhood Cancer Survivor Study risk groups. 25

Cardiovascular Risk Factors and Cardiovascular Disease
The Clinical Research Data Warehouse Team at the University of Oklahoma Health Sciences Center used Structured Query Language (SQL) to extract key data elements for CVRFs and race/ethnicity from the electronic health record (EHR). The primary CVRFs for this analysis included hypertension, diabetes, obesity, and dyslipidemia. The Common Terminology Criteria for Adverse Events (CTCAE, v5.0) were used to classify CVRFs. 26 For hypertension, CTCAE Grade ≥ 2 was defined as a diagnosis consistent with hypertension and an outpatient prescription for an antihypertensive medication (Supplemental Material 1). The Observational Medical Outcomes Partnership Common Data Model (OMOP CDM) provides a critical framework for reliable data standards and supports research with real-world data. 27,28 For medication data, OMOP CDM utilizes RxNorm codes, and previous research supports the utility of this model to classify antihypertensive medications automatically extracted from the EHR. 29 We leveraged the RxNorm Concept Unique Identifier (RxCUI) to ascertain survivors with outpatient prescriptions for antihypertensive medications prior to diagnosis or within the first year of initial diagnosis. Grade ≥ 2 diabetes was defined as an ICD-9/10 code consistent with diabetes or hemoglobin A1c (HbA1c) ≥ 6.5% from discrete observational lab data. For obesity, discrete data elements were extracted from the EHR to classify survivors as obese according to CTCAE Grade ≥ 3, with a body-mass index ≥ 30 (or > 95th percentile based on age- and sex-specific distributions) prior to diagnosis or within one year of initial diagnosis. Finally, dyslipidemia was defined based on ICD-9/10 coding. Supplemental Material 1 provides a detailed account of the clinically computable phenotypes, based on discrete observations, ICD-9/10 codes, and RxNorm codes, used to define CVRFs for this analysis.
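The phenotype rules summarized above lend themselves to a compact rule-based classifier. The following Python sketch is a simplified illustration, not the implementation documented in Supplemental Material 1. The record fields are hypothetical, code lists are assumed to have been resolved to boolean flags upstream, and treating either the BMI or the percentile criterion as sufficient for obesity is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Thresholds quoted in the text.
HBA1C_THRESHOLD = 6.5            # percent
BMI_THRESHOLD = 30.0             # kg/m^2
BMI_PERCENTILE_THRESHOLD = 95.0  # age- and sex-specific percentile


@dataclass
class BaselineRecord:
    """Hypothetical per-survivor summary of baseline EHR data."""
    hypertension_dx: bool        # diagnosis consistent with hypertension
    antihypertensive_rx: bool    # outpatient antihypertensive (RxNorm-derived)
    diabetes_icd: bool           # ICD-9/10 code consistent with diabetes
    dyslipidemia_icd: bool       # ICD-9/10 code consistent with dyslipidemia
    max_hba1c: Optional[float] = None
    bmi: Optional[float] = None
    bmi_percentile: Optional[float] = None


def classify_cvrfs(rec: BaselineRecord) -> Dict[str, bool]:
    """Apply the four phenotype rules; missing values never trigger a flag."""
    hypertension = rec.hypertension_dx and rec.antihypertensive_rx
    diabetes = rec.diabetes_icd or (
        rec.max_hba1c is not None and rec.max_hba1c >= HBA1C_THRESHOLD
    )
    # Assumption: either criterion alone is sufficient for the obesity flag.
    obesity = (rec.bmi is not None and rec.bmi >= BMI_THRESHOLD) or (
        rec.bmi_percentile is not None
        and rec.bmi_percentile > BMI_PERCENTILE_THRESHOLD
    )
    return {
        "hypertension": hypertension,
        "diabetes": diabetes,
        "obesity": obesity,
        "dyslipidemia": rec.dyslipidemia_icd,
    }
```

Requiring both a diagnosis and a prescription for hypertension mirrors the Grade ≥ 2 definition in the text, whereas diabetes accepts either a code or a laboratory value.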
Heart failure or cardiomyopathy was the primary CVD outcome for this analysis. A previously published methodology, based on ICD-9/10 codes, was utilized (Supplemental Material 2). 13,30 The date of initial diagnosis was used to landmark the date of diagnosis for CVD. Survivors with CVD prior to diagnosis or within one year of diagnosis were excluded from the analysis. An incident case of CVD was defined as an ICD-9/10 code consistent with cardiomyopathy or heart failure one to five years after the initial cancer diagnosis.
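The landmark logic can be made concrete with a short Python sketch. The function names and category labels are hypothetical, and the handling of events falling exactly on the one-year or five-year anniversary is an assumption, since the text does not specify boundary behavior.

```python
from datetime import date
from typing import Optional


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years, mapping February 29 to February 28 if needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:  # February 29 in a non-leap target year
        return d.replace(year=d.year + years, day=28)


def classify_cvd(cancer_dx: date, first_cvd_dx: Optional[date]) -> str:
    """Classify a survivor's first CVD code relative to the cancer diagnosis."""
    if first_cvd_dx is None:
        return "no_event"
    one_year = add_years(cancer_dx, 1)
    five_years = add_years(cancer_dx, 5)
    if first_cvd_dx < one_year:
        # CVD before diagnosis or within the first year: excluded from analysis.
        return "excluded"
    if first_cvd_dx <= five_years:
        # Assumption: both anniversaries count as inside the incident window.
        return "incident"
    return "outside_window"
```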

Oklahoma Health Care Authority (OKHCA) Data
The cancer registry and institutional EHR data were linked to Medicaid records from OKHCA. The Medicaid number was used as the primary identifier for linkage and, for survivors without a match, was supplemented by other identifiers such as date of birth and first and last names. The ICD-9/10 codes for CVD, as described above, were used and landmarked by the dates of cancer and CVD diagnosis to detect incident cases during the early survivorship period.
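A two-pass deterministic linkage of this kind might look like the Python sketch below. It is a minimal illustration under stated assumptions: the field names (`registry_id`, `claims_id`, `medicaid_id`, `dob`, `first_name`, `last_name`) are hypothetical, the fallback requires an exact match on date of birth and case-insensitive names, and no probabilistic matching or manual review is modeled.

```python
from typing import Dict, List, Optional, Tuple

Record = Dict[str, Optional[str]]


def _demo_key(rec: Record) -> Tuple[str, str, str]:
    """Normalized fallback key: date of birth plus case-insensitive names."""
    return (
        str(rec["dob"]),
        str(rec["first_name"]).strip().casefold(),
        str(rec["last_name"]).strip().casefold(),
    )


def link_to_claims(registry: List[Record], claims: List[Record]) -> Dict[str, Optional[str]]:
    """Return registry_id -> claims_id (or None when no match is found)."""
    by_id = {c["medicaid_id"]: c for c in claims if c.get("medicaid_id")}
    by_demo = {_demo_key(c): c for c in claims}
    links: Dict[str, Optional[str]] = {}
    for r in registry:
        # Pass 1: Medicaid number as the primary identifier.
        match = by_id.get(r["medicaid_id"]) if r.get("medicaid_id") else None
        # Pass 2: demographic identifiers for survivors without a match.
        if match is None:
            match = by_demo.get(_demo_key(r))
        links[str(r["registry_id"])] = match["claims_id"] if match else None
    return links
```

In practice, duplicate demographic keys would need explicit handling; here a later claims record silently overwrites an earlier one with the same key.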

Statistical analyses
Descriptive statistics including mean, standard deviation, median, and interquartile range (IQR) were calculated for continuous variables (age at diagnosis). Percentages and counts were calculated for categorical variables (age group, sex, race/ethnicity, rurality, primary diagnosis, late effects risk group, hypertension, diabetes, dyslipidemia, and obesity). The chi-square test was used to examine the association between each predictor and CVD status if all cell counts were greater than 5. Fisher's exact test was used if any cell count was less than or equal to 5. The unadjusted risk ratio (RR), the RR adjusted for late effects risk strata, and the corresponding 95% confidence intervals (CI) for the association between rurality and CVD in the early survivorship period were calculated using a modified Poisson regression model with robust error variance. Manual backward variable selection was used with an alpha threshold of 0.05. Confounding was assessed between predictors if the removal of one characteristic influenced a change of 20% or more in the remaining characteristics. Collinearity was assessed with a threshold of 0.70. 31 Missing values were excluded from the analysis, all of which were in the group without a cardiac event (0.3% were missing race/ethnicity, 0.7% were missing rurality, and 3.3% were missing late effects risk stratification due to incomplete exposure documentation). All analyses were performed using SAS 9.4.
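The rule for choosing between the chi-square and Fisher's exact tests can be sketched in Python as follows. The analyses in this study were run in SAS, so this is an illustrative re-expression rather than the code used. The function names are hypothetical, and the example table is an assumption reconstructed from figures reported in this paper (731 survivors, 495 with OKHCA coverage, and all 10 institutional CVD events among covered survivors).

```python
from math import comb
from typing import List

Table = List[List[int]]  # 2x2 contingency table: [[a, b], [c, d]]


def needs_fisher(table: Table) -> bool:
    """Rule from the text: use Fisher's exact test if any cell count is <= 5."""
    return any(count <= 5 for row in table for count in row)


def fisher_exact_two_sided(table: Table) -> float:
    """Two-sided Fisher's exact p-value for a 2x2 table.

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(
        prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9)
    )
    return min(1.0, total)


# Rows: CVD event / no event. Columns: OKHCA coverage / no coverage.
# Counts are reconstructed from the text and are an assumption.
coverage_table = [[10, 0], [485, 236]]
p_value = fisher_exact_two_sided(coverage_table) if needs_fisher(coverage_table) else None
```

The small relative tolerance in the comparison guards against floating-point ties between equally likely tables.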

Cardiovascular Risk Factors and Cardiovascular Disease among Survivors
Between 2010 and 2017, there were 1228 children, adolescents, and young adults with cancer reported to the institutional cancer registry who completed their first course of treatment at the Jimmy Everest Center or the Stephenson Cancer Center. Among those with established oncology-related care (n = 916), an overall five-year survival of 82% was observed. The analytic cohort excluded those with early documented death (n = 168) and those not seen in an oncology-related clinic (n = 312) (Fig. 1). To establish a temporal relationship between CVRFs and the detection of CVD, survivors with CVD prior to or during the first year of treatment (n = 17) were also excluded (Fig. 2). Among the analytic survivor cohort, there were 10 incident cases (1.4%) of CVD observed between one year and five years after the initial diagnosis. Grade ≥ 2 hypertension was observed in 106 survivors (14.5%), 37 met the criteria for diabetes (5.1%), eight had dyslipidemia (1.1%), and 226 were obese (30.9%). All ten of the cardiac events were observed in survivors with OKHCA coverage, while 67% of survivors without an event had OKHCA coverage (p = 0.04). Survivors at high, moderate, and low risk had a cumulative incidence of cardiac events of 5.2%, 1.1%, and 0.4%, respectively; the percentage of patients in each risk category differed significantly between survivors with and without CVD (p = 0.01). There were no statistically significant associations between CVRFs and CVD in the early survivorship period (Table 1).

Oklahoma Health Care Authority Data Analysis
Data linkage of the analytic survivor cohort (n = 731) with OKHCA claims data showed that 67.7% of survivors had Medicaid coverage (n = 495). The inclusion of claims data identified two additional cases of CVD one to five years after initial diagnosis that were not captured by institutional data, which yielded a cumulative incidence of 2.4% (n = 12). Among survivors with OKHCA coverage, those from small town/isolated rural areas accounted for 50% of the incident cases of CVD, despite representing 17.5% of all survivors. Survivors from rural areas had a cumulative CVD incidence of 6.9% compared with 1.9% and 1.3% of survivors from large town and urban areas, respectively (p = 0.02). Similar to the full cohort, there was a significant association between late effects risk strata and CVD (p = 0.006). Demographics, such as age, gender, and race/ethnicity, as well as CVRFs were not significantly associated with CVD in the early survivorship period among those with OKHCA coverage (Table 2). Age and age group at diagnosis had high collinearity (r = 0.83); thus, continuous age was chosen as the preferred indicator for the model. However, age was not significantly related to CVD when included alongside other predictors and was dropped. Late effects risk strata was determined to be a confounder and was retained in the final model. Multivariable modified Poisson regression modeling showed a persistent association between rurality and CVD: survivors from small town/isolated rural areas had a 4.1 times greater risk of CVD (95% CI: 1.1–15.3) compared with survivors from urban areas after adjustment for late effects risk strata (Table 3).
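The cohort flow and the headline proportions reported above can be cross-checked with simple arithmetic. The Python sketch below uses only counts stated in the text; the variable and function names are illustrative.

```python
# Counts as reported in the text.
reported_to_registry = 1228
not_seen_in_clinic = 312
early_death = 168
early_cvd = 17

established_care = reported_to_registry - not_seen_in_clinic
analytic_cohort = established_care - early_death - early_cvd


def percent(numerator: int, denominator: int, digits: int = 1) -> float:
    """Proportion expressed as a percentage, rounded for reporting."""
    return round(100 * numerator / denominator, digits)


five_year_survival = percent(established_care - early_death, established_care, 0)
incidence_institutional = percent(10, analytic_cohort)      # EHR-detected cases
medicaid_coverage = percent(495, analytic_cohort)           # linked to OKHCA
incidence_with_claims = percent(12, 495)                    # EHR plus claims
```

Each derived value agrees with the corresponding figure in the text (916 with established care, 731 in the analytic cohort, 82% survival, and 1.4%, 67.7%, and 2.4% for the three proportions).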

DISCUSSION
In a single-institution cohort of CAYA survivors, clinical informatics tools based on discrete data elements from the EHR were leveraged to construct clinically computable phenotypes and to evaluate the prevalence of CVRFs prior to diagnosis and during treatment. This represents a feasible approach to identify CVRFs on a population health level for at-risk survivors. No significant associations were observed between CVRFs and CVD in the early survivorship period for this cohort, yet this analysis and its methods inform efforts to harness real-world data to drive improvement in survivorship-focused care.
Furthermore, the presented analyses identified that survivors at high risk for late effects in general and those with OKHCA coverage were at increased risk of CVD in the early survivorship period. Claims data augmented the detection of cardiac events among survivors with OKHCA coverage, and the analysis of this subcohort suggested that those from rural areas were at increased risk of CVD even after adjustment for late effects risk strata. Rural-urban differences in the general population, particularly inequities in cardiovascular health, underscore the need to prevent CVD, especially for CAYA survivors at risk for late morbidity and mortality.
The disproportionate burden of CVRFs and CVD in rural areas in the United States is well documented. In 2020, the American Heart Association released a call to action to reduce longstanding inequities in CVD among rural populations with a focus on individual factors, social determinants of health, and health delivery systems. [34] The evidence of recent progress on closing the rural-urban gap is mixed, and the persistence of these geographic disparities in the general population should inform healthcare delivery interventions for survivors. 17,35 Stratification by OKHCA coverage, particularly with the highlighted differences in race/ethnicity, rurality, hypertension, and obesity, further controls for these potential confounders and helps characterize this vulnerable population. The observed increased risk of CVD among survivors of CAYA cancer from rural areas in Oklahoma with public insurance suggests that, even as soon as one to five years after the initial diagnosis, there is an opportunity to intervene and mitigate risk.
Data science and the development of clinical informatics tools have the potential to catalyze improvements in health services research, guide population health management, and drive systems-level changes to promote equity for all survivors of CAYA cancer. The presented methodology, derived from data standards such as RxNorm's RxCUI codes for antihypertensive medications, and the novel creation of clinically computable phenotypes support the feasibility of such tools to characterize modifiable risk factors among survivors at a population health level. 29 The analyses of the Oklahoma cohort failed to identify significant associations between CVRFs and CVD, which may reflect limitations in this cohort or may suggest that further refinement of clinically meaningful phenotypes to predict CVD is needed.
Nevertheless, data standards are foundational to ensure the interoperability of key information between health systems, both from a research and a clinical operations viewpoint. 36 Moreover, the Childhood Cancer Data Initiative (CCDI) seeks to address the fragmented data ecosystem and has made progress toward an infrastructure to facilitate data sharing to learn from every child, adolescent, and young adult with cancer. 37 More than a decade after the Health Information Technology for Economic and Clinical Health (HITECH) Act, lessons across the healthcare field in various specialties and domains offer insights to adapt evidence-based technologies for oncology and survivorship-focused care. 38,39
The observations and analyses from the CAYA survivor cohort require contextualization for potential limitations. First, this cohort represented a single institution. While the majority of children in the state are treated at Oklahoma Children's Hospital, young adults may have received treatment at community-based oncology centers, and there is one other site in Oklahoma that cares for children with cancer. Therefore, the data may not be representative of the state of Oklahoma or generalizable to the national population. Data linkage with claims data uncovered rural-urban differences in CVD, which likely reflects detection bias from institutional data, as the absence of diagnosis records does not necessarily mean the absence of disease. 40 Alternatively, the observed differences may only exist in the Medicaid population. Underdetection of CVRFs, such as dyslipidemia or diabetes, is also possible if they are not routinely assessed or documented in EHR-based data. The lack of robust historical data prior to 2009 and the moderate cohort size may have contributed to insufficient power to detect potential associations between CVRFs and CVD. Additionally, in this cohort, acute cardiotoxicity was observed and events within a year of diagnosis were excluded from analysis, as assessment of baseline CVRFs prior to diagnosis was likely incomplete and would have muddled the temporal relationship.
[45][46] Previously developed and validated natural language processing (NLP) algorithms, such as EchoExtractor, serve as an example for open-source informatics to automatically extract echocardiogram parameters. 47 [49][50] The sole reliance on ICD-9/10 coding, while based on methods from large multi-institutional cohorts, may also lead to misclassification of cardiac events, which could be amenable to more precise measurements from echocardiograms. 13 Even with the implementation of such tools, underdetection bias may still persist if echocardiogram reports are unavailable. Adolescent survivors in Oklahoma were previously identified as approximately five times more likely to receive suboptimal guideline-adherent echocardiogram surveillance. 51
In conclusion, clinical informatics tools to integrate data from various sources for cohort construction and apply data standards to characterize CVRFs highlight opportunities to leverage data to improve survivorship-focused care for CAYAs impacted by cancer. Survivors from rural areas may be at increased risk for CVD, even in the early survivorship period. Modifiable CVRFs at baseline and during treatment merit additional investigation to determine their impact on later CVD for survivors. This study provides a framework to adapt clinical informatics-based approaches for CAYA survivors to promote interoperability based on data standards, facilitate interinstitutional collaborations to detect relevant predispositions to CVD, and, ultimately, improve care for equitable outcomes among all survivors.

Declarations
This work was presented as an oral presentation at the 2023 International Society of Paediatric Oncology Congress.
Conflicts of Interest: The authors declare no competing interests.

Figures
Figure 1

Table 1
Cardiovascular Risk Factors During Treatment and Cardiovascular Disease in the Early Survivorship Period
a Variables collapsed due to small numbers and concern for confidentiality
b Based on the British Childhood Cancer Survivor Study risk groups (Frobisher et al 2017)

Table 2
Cardiovascular Risk Factors During Treatment and Cardiovascular Disease in the Early Survivorship Period among Survivors with Oklahoma Health Care Authority (OKHCA) Coverage
a Variables collapsed due to small numbers and concern for confidentiality
b Based on the British Childhood Cancer Survivor Study risk groups (Frobisher et al 2017)
Individually significant predictors related to CVD were age, age group, rurality, and late effects risk strata.

Table 3
Modeling for Association between Rurality and CVD in the Early Survivorship Period

(Table 4). Although primary diagnosis and late effects risk groups were not associated with OKHCA coverage, rurality was significantly associated with OKHCA coverage, as 78.4% and 75.5% of survivors from small town/isolated rural areas and large towns had OKHCA coverage, respectively, compared with 63.7% of survivors from urban areas (p < 0.01). Moreover, race/ethnicity (p < 0.01), female sex (p = 0.03), and young adult age group at diagnosis (p = 0.02) significantly differed by OKHCA coverage status. Regarding CVRFs, 17% of survivors with coverage had hypertension compared with 9% of those without