Existing stroke prediction models
We selected the problem of predicting stroke in patients with atrial fibrillation because it has been well studied and is one of the few prediction problems to have been extensively validated. This gives us ample benchmarks against which to compare the results of this study. The existing models we replicated were ATRIA, CHADS2, CHA2DS2-VASc, Framingham and Q-Stroke.
The ATRIA [6] model was developed on a cohort of 7,284 patients who were 18+ and had an atrial fibrillation outpatient diagnosis during 1997 or 1998. ATRIA was internally validated on a 3,643-patient hold-out set, obtaining a c-statistic of 0.72. In the same paper, the authors also externally validated the model on a cohort of 33,247 patients aged 21+ with inpatient or outpatient atrial fibrillation or flutter during 2006-2009, obtaining a c-statistic of 0.7. The CHADS2 score [8] was developed by combining two other stroke prediction models (using the variables from these models and assigning points) and was validated on 1,733 patients aged 65 to 95 years who had nonrheumatic atrial fibrillation. The CHADS2 score obtained a c-statistic of 0.81 on this population. The CHA2DS2-VASc score [9] is another score-based model that was developed using knowledge of risk factors. The model was validated on a cohort of 1,577 patients from 35 countries who were 18+ and had atrial fibrillation during 2003 to 2004. The model obtained a c-statistic of 0.61 for this patient population. The Framingham score [7] was based on a Cox model developed using data from 705 patients aged 55 to 94 with initial atrial fibrillation. The internal validation, using a bootstrap approach, showed a c-statistic of 0.66. The Q-Stroke [10] model was developed using primary care data from the UK consisting of 3,549,478 patients aged 25-84 with no prior stroke or anticoagulation use (except aspirin) and was internally validated on 1,897,168 similar patients. When the model was applied to predict the 10-year risk of stroke in female patients with atrial fibrillation at baseline, the c-statistic was 0.65.
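To illustrate what "assigning points" means for the two score-based models, the following is a minimal Python sketch of the commonly published CHADS2 and CHA2DS2-VASc point schemes. The `Patient` class and its field names are hypothetical and introduced only for illustration; the predictor definitions actually used in this study are those in Appendix A.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    """Hypothetical container for the risk factors used by both scores."""
    age: int
    female: bool
    heart_failure: bool
    hypertension: bool
    diabetes: bool
    prior_stroke_or_tia: bool
    vascular_disease: bool


def chads2(p: Patient) -> int:
    """CHADS2: 1 point each for heart failure, hypertension, age >= 75
    and diabetes; 2 points for prior stroke or TIA."""
    return (
        int(p.heart_failure)
        + int(p.hypertension)
        + int(p.age >= 75)
        + int(p.diabetes)
        + 2 * int(p.prior_stroke_or_tia)
    )


def cha2ds2_vasc(p: Patient) -> int:
    """CHA2DS2-VASc: age >= 75 and prior stroke/TIA score 2 points;
    age 65-74, female sex and the remaining factors score 1 point."""
    age_points = 2 if p.age >= 75 else (1 if p.age >= 65 else 0)
    return (
        int(p.heart_failure)
        + int(p.hypertension)
        + age_points
        + int(p.diabetes)
        + 2 * int(p.prior_stroke_or_tia)
        + int(p.vascular_disease)
        + int(p.female)
    )


example = Patient(age=78, female=True, heart_failure=False, hypertension=True,
                  diabetes=True, prior_stroke_or_tia=False, vascular_disease=False)
print(chads2(example), cha2ds2_vasc(example))
```

In a population restricted to patients with no prior stroke, the prior-stroke component contributes no points, so the scores are driven by the remaining factors.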
The existing models include a small number of variables; Table 1 summarizes the variables included in each model. Some of the variables are unlikely to be available in claims data and these are marked with the + symbol. A large number of Q-Stroke variables are not commonly recorded in claims data (or are UK specific), so this model is difficult to replicate in external non-UK databases. For example, US claims data contain incomplete measurement records and rarely record family history, but many of the Q-Stroke predictors were recent measurements or family history. Table 2 presents the internal performance and published external validation performance for the five models. Although the internal validation c-statistic for some of the models was as high as 0.8, independent external validation studies tend to show that the models achieve c-statistics between 0.6 and 0.7.
Validation Prediction task
Within a target population of female patients with newly diagnosed atrial fibrillation and no prior stroke, predict who will develop a stroke 1 to 365 days after the initial diagnosis of atrial fibrillation.
Sources of Data:
We validated the existing models using a retrospective cohort design and various observational healthcare datasets (e.g., claims data and electronic healthcare data). The datasets used to evaluate the models are:
IBM MarketScan® Commercial Database (CCAE) is a United States employer-sponsored insurance health plans claims database. The database contains claims (e.g. inpatient, outpatient, and outpatient pharmacy) from private healthcare coverage to employees, their spouses, and dependents, so patients are aged 65 or younger. The database contains data collected between 2000-2018.
IBM MarketScan® Medicare Supplemental Database (MDCR) represents health services of retirees in the United States with primary or Medicare supplemental coverage through privately insured fee-for-service, point-of-service, or capitated health plans. The patients are aged 65 or older. The database contains data collected between 2000-2018.
IBM MarketScan® Multi-State Medicaid Database (MDCD) contains adjudicated US health insurance claims for Medicaid enrollees from multiple states and includes hospital discharge diagnoses, outpatient diagnoses and procedures, and outpatient pharmacy claims as well as ethnicity. The database contains data collected between 2006-2018.
Optum© De-Identified Clinformatics® Data Mart Database – Socio-Economic Status (Optum Claims) is an adjudicated administrative health claims database for members with private health insurance. The population is primarily representative of US commercial claims patients (0-65 years old) with some Medicare patients (65+ years old); however, ages are capped at 90 years. The database contains data collected between 2000-2018.
Optum© de-identified Electronic Health Record Dataset (Optum EHR) is a US electronic health record database containing clinical information, inclusive of prescriptions as prescribed and administered, lab results, vital signs, body measurements, diagnoses, procedures, and information derived from clinical notes using Natural Language Processing (NLP). The database contains data collected between 2006-2018.
Stanford Translational Research Integrated Database Environment (STRIDE) is a clinical data warehouse that supports clinical and translational research at Stanford University. This resource includes the EHR data of approximately 2 million adult and pediatric patients cared for at either the Stanford Hospital or the Lucile Packard Children’s hospital. This study was completed on an OMOP-CDM adherent instance of STRIDE. The database contains data collected between 2000-2018.
Columbia University Medical Center’s (CUMC) data come from New York Presbyterian hospital’s clinical data warehouse. The database comprises EHR data on approximately 5 million patients and includes information such as diagnoses, procedures, lab measurements and prescriptions. The database contains data collected between 1980-2018.
Ajou University School of Medicine (AUSOM) is a database containing the entire EHR data from 1994 to 2018 of a Korean tertiary hospital, Ajou University Hospital. It contains the medical records of about 2.9 million patients. The database contains data collected between 1994-2018.
The Integrated Primary Care Information (IPCI) is an electronic health care database containing patients of Dutch general practitioners (primary care). The database contains data collected between 1996-2018.
Each site either had institutional review board approval for the analysis or used deidentified data, in which case the analysis was determined not to be human subjects research. Informed consent was not deemed necessary at any site.
Participants
The existing models were applied to two target populations. Both target populations consisted of female patients newly diagnosed with atrial fibrillation and with no prior stroke or anticoagulant use, but target population 1 was restricted to patients aged 65 to 95, whereas target population 2 included all ages.
Target population 1: The target population was defined as females aged 65-95 with any of:
- 2 atrial fibrillation records
- 1 atrial fibrillation in an inpatient setting
- 1 atrial fibrillation with an electrocardiogram (ECG) within 30 days prior
and at least 730 days of prior database observation, no prior stroke and no prior anticoagulant.
Target population 2: The target population was defined as females with any of:
- 2 atrial fibrillation records
- 1 atrial fibrillation in an inpatient setting
- 1 atrial fibrillation with an ECG within 30 days prior
and at least 730 days of prior database observation, no prior stroke and no prior anticoagulant.
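The eligibility rules for the two target populations can be sketched as a single check with an optional age range. This is an illustrative Python sketch, not the study code: the `History` and `AfRecord` classes are hypothetical, the index date is assumed to be the earliest atrial fibrillation record, age is approximated from the year of birth, and "prior" is assumed to mean strictly before the index date.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class AfRecord:
    day: date
    inpatient: bool


@dataclass
class History:
    """Hypothetical per-patient summary of the records needed for eligibility."""
    female: bool
    birth_year: int
    observation_start: date
    af_records: List[AfRecord]
    ecg_days: List[date] = field(default_factory=list)
    stroke_days: List[date] = field(default_factory=list)
    anticoagulant_days: List[date] = field(default_factory=list)


def qualifies_as_af(h: History) -> bool:
    """Any of: 2 AF records, 1 inpatient AF record, 1 AF record with an ECG
    in the 30 days before it."""
    if len(h.af_records) >= 2:
        return True
    for record in h.af_records:
        if record.inpatient:
            return True
        if any(timedelta(0) <= record.day - ecg <= timedelta(days=30)
               for ecg in h.ecg_days):
            return True
    return False


def eligible(h: History, min_age: Optional[int] = None,
             max_age: Optional[int] = None) -> bool:
    """Target population 1 uses min_age=65, max_age=95; population 2 uses no limits."""
    if not h.female or not qualifies_as_af(h):
        return False
    index = min(record.day for record in h.af_records)  # assumed index date
    age = index.year - h.birth_year                     # approximate age at index
    if min_age is not None and age < min_age:
        return False
    if max_age is not None and age > max_age:
        return False
    if (index - h.observation_start).days < 730:
        return False
    has_prior_stroke = any(d < index for d in h.stroke_days)
    has_prior_anticoagulant = any(d < index for d in h.anticoagulant_days)
    return not has_prior_stroke and not has_prior_anticoagulant


patient = History(female=True, birth_year=1940, observation_start=date(2005, 1, 1),
                  af_records=[AfRecord(date(2010, 3, 1), inpatient=True)])
print(eligible(patient, min_age=65, max_age=95), eligible(patient))
```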
The target populations may contain different types of patients per database (e.g., patients from different regions such as the US, Europe or Asia, and different types of records such as inpatient and outpatient). The different databases used in this study are detailed in section ‘Sources of data’.
Outcome
We predicted stroke occurring from 1 day to 365 days after the initial atrial fibrillation start date. The stroke outcome was defined as:
- An ischemic or hemorrhagic stroke recorded with an inpatient or ER visit
The code sets used to define atrial fibrillation, ECG and ischemic or hemorrhagic stroke are presented in Appendix B. The full analysis code (data creation and model evaluation) is available at: https://github.com/OHDSI/StudyProtocolSandbox/tree/master/ExistingStrokeRiskExternalValidation
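As a small illustration of the time-at-risk window, the outcome label for one patient can be derived as follows. This Python sketch is not taken from the linked repository; the function name and arguments are hypothetical, and the stroke dates are assumed to have already been restricted to qualifying inpatient or ER records.

```python
from datetime import date
from typing import Iterable


def has_outcome(index: date, stroke_days: Iterable[date],
                start: int = 1, end: int = 365) -> bool:
    """True if any qualifying stroke falls between `start` and `end` days
    (inclusive) after the index date. A stroke on the index date itself
    (day 0) is outside the window."""
    return any(start <= (day - index).days <= end for day in stroke_days)


index = date(2010, 3, 1)
print(has_outcome(index, [date(2010, 3, 2)]))  # day 1, inside the window
print(has_outcome(index, [date(2011, 3, 2)]))  # day 366, outside the window
```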
Sensitivity analysis
Patients with a high risk of future stroke are often given anticoagulants as a preventive measure. If a high-risk patient is given an anticoagulant intervention during the 1-year time-at-risk, this may prevent the stroke. We therefore performed a sensitivity analysis to remove patients who had an anticoagulant during the 1-year time-at-risk that may have prevented a stroke. For the sensitivity analysis, the target populations were modified by censoring patients at the point an anticoagulant was recorded, so any patient with an anticoagulant during the time-at-risk period was effectively removed from the target population unless they had a stroke prior to the anticoagulant.
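The censoring rule can be sketched per patient as below. This is an illustrative Python sketch under stated assumptions, not the study implementation: the function name is hypothetical, and a stroke recorded on the same day as the first anticoagulant is assumed not to count as "prior" to it.

```python
from datetime import date
from typing import Iterable, Optional


def sensitivity_label(index: date, stroke_days: Iterable[date],
                      anticoagulant_days: Iterable[date],
                      start: int = 1, end: int = 365) -> Optional[bool]:
    """Return the outcome label, or None when the patient is removed.

    Patients with no anticoagulant in the time-at-risk keep their usual label.
    Patients with an anticoagulant in the time-at-risk are kept (as outcome
    patients) only if a stroke was recorded strictly before the first one.
    """
    def in_window(day: date) -> bool:
        return start <= (day - index).days <= end

    strokes = sorted(d for d in stroke_days if in_window(d))
    anticoagulants = sorted(d for d in anticoagulant_days if in_window(d))
    if not anticoagulants:
        return bool(strokes)
    if strokes and strokes[0] < anticoagulants[0]:
        return True
    return None  # censored before any stroke: removed from the population


index = date(2010, 3, 1)
print(sensitivity_label(index, [date(2010, 4, 20)], [date(2010, 6, 9)]))
print(sensitivity_label(index, [date(2010, 6, 9)], [date(2010, 4, 20)]))
```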
Predictors
We calculated existing model predictors using phenotype definitions specified in the paper describing the development of the model when provided. If the development paper did not provide a definition, we used our own. The definitions for each predictor can be found in Appendix A.
Missing Data
Age and gender are required by the OMOP common data model used by OHDSI and will never be missing.
For each condition (diabetes, chronic heart failure, stroke, hypertension, proteinuria, end stage renal disease (ESRD), vascular disease, liver disease, coronary heart disease (CHD), atrial fibrillation, rheumatoid arthritis, chronic renal disease and valvular heart disease), we considered no records of the condition in the database to mean the patient does not have the condition. Ethnicity is often missing completely from a database and when missing we did not include it. Smoking status and family history are rarely recorded in claims data, so we imputed 0 (never smoker and no family history) when the predictor was missing. Townsend deprivation score is specific to the UK and was not included as a predictor in our validation. The blood pressure and cholesterol measurements are rarely recorded in claims data and were not included as predictors in our validation.
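The missing-data rules above amount to three cases: predictors where absence is read as 0, a predictor that is dropped when absent, and predictors that are never used. A minimal Python sketch follows; the predictor names are hypothetical placeholders and only a few conditions are listed for brevity.

```python
from typing import Dict, Optional

# Hypothetical predictor names. Absence of a record is treated as 0
# (condition not present, never smoker, no family history).
ABSENCE_MEANS_ZERO = ("diabetes", "hypertension", "smoking_status", "family_history")
# Included only when the database records it.
OPTIONAL = ("ethnicity",)
# Townsend score, blood pressure and cholesterol are never copied across,
# because only the predictors named above are read from the input.


def prepare_predictors(raw: Dict[str, Optional[float]]) -> Dict[str, float]:
    """Apply the missing-data rules to one patient's raw predictor values."""
    prepared: Dict[str, float] = {}
    for name in ABSENCE_MEANS_ZERO:
        value = raw.get(name)
        prepared[name] = 0 if value is None else value
    for name in OPTIONAL:
        if raw.get(name) is not None:
            prepared[name] = raw[name]
    return prepared


print(prepare_predictors({"diabetes": 1, "cholesterol": 5.2}))
```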
Statistical analysis
The performance of each prediction model was evaluated using the area under the receiver operating characteristic (AUROC) curve, which is equivalent to the c-statistic for binary classification. Confidence intervals were also calculated when the number of outcome patients was fewer than 1000. As the models are being used to predict 1-year risk in diverse patients, we recalibrated the models for each database. The models were recalibrated by fitting a linear model to the predicted scores to learn a database-specific intercept and gradient. We present the calibration plots for each of the five models recalibrated in each of the datasets. For each decile, we calculated the mean recalibrated predicted risk and plotted it against the observed fraction of patients who had the outcome.
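The evaluation steps described above can be sketched in plain Python. This is an illustration under assumptions rather than the study code: "linear model" is read here as an ordinary least-squares fit of the binary outcome on the score (a logistic recalibration would be a natural alternative), the function names are hypothetical, and the toy data are invented for the example.

```python
from typing import List, Sequence, Tuple


def auroc(scores: Sequence[float], outcomes: Sequence[int]) -> float:
    """Probability that a random outcome patient has a higher score than a
    random non-outcome patient, counting ties as one half. The pairwise loop
    is quadratic and meant for illustration only."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def recalibrate(scores: Sequence[float], outcomes: Sequence[int]) -> Tuple[float, float]:
    """Least-squares intercept and gradient of the outcome on the score."""
    n = len(scores)
    mean_s = sum(scores) / n
    mean_y = sum(outcomes) / n
    sxx = sum((s - mean_s) ** 2 for s in scores)
    sxy = sum((s - mean_s) * (y - mean_y) for s, y in zip(scores, outcomes))
    gradient = sxy / sxx
    return mean_y - gradient * mean_s, gradient


def calibration_by_group(risks: Sequence[float], outcomes: Sequence[int],
                         groups: int = 10) -> List[Tuple[float, float]]:
    """Mean predicted risk and observed outcome fraction per risk-ordered group
    (deciles when groups=10)."""
    ordered = sorted(zip(risks, outcomes))
    n = len(ordered)
    rows = []
    for g in range(groups):
        chunk = ordered[g * n // groups:(g + 1) * n // groups]
        if chunk:
            rows.append((sum(r for r, _ in chunk) / len(chunk),
                         sum(y for _, y in chunk) / len(chunk)))
    return rows


# Toy data: integer scores and 1-year outcomes for eight patients.
scores = [0, 1, 1, 2, 2, 3, 3, 4]
outcomes = [0, 0, 0, 0, 1, 0, 1, 1]
intercept, gradient = recalibrate(scores, outcomes)
risks = [intercept + gradient * s for s in scores]
print(auroc(scores, outcomes))
print(calibration_by_group(risks, outcomes, groups=2))
```

Because the recalibration is a monotone transformation of the score when the gradient is positive, it changes calibration but not the AUROC. A linear fit can also produce risks outside the range 0 to 1 for extreme scores, which is one reason a logistic form is sometimes preferred.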
Development vs Validation
We selected participants who matched the eligibility criteria of all five existing models being validated, but for many of the models this may be a subset of the patient population used for development. Many of the predictors for the Q-Stroke model were not available in our data and the measurements for Framingham were also not available. The outcome in this validation study was 1 year following index, but many of the models were developed for 10-year risk.