Machine Learning Based Prediction Model for Using Non-steroidal Anti-inflammatory Drugs on Risk of Adverse Events

Juan Lu, University of Western Australia
Ling Wang, University of Western Australia
Mohammed Bennamoun, University of Western Australia
Isaac Ward, University of Western Australia
Senjian An, Curtin University
Ferdous Sohel, Murdoch University
Benjamin JW Chow, University of Ottawa
Girish Dwivedi, Harry Perkins Institute of Medical Research
Frank M Sanfilippo (frank.sanfilippo@uwa.edu.au), University of Western Australia


Introduction
Machine learning is increasingly common in big data science, especially in the analysis of electronic medical record (EMR) data [1-3]. Machine learning models have shown advantages in risk prediction based on a wide array of patients' EMR data [4, 5]. Predicting the risk associated with drug response has drawn close attention in public health. Adverse drug reactions (ADRs) have long been recognised as a potential outcome of taking medicines. It is practically impossible for doctors to reduce patients' risk of ADRs by manually screening millions of health records while prescribing. We therefore set out to build a set of machine learning models, based on a large volume of patients' EMR data, to predict patients' risk of ADRs.
ADRs are common in older adults and present as a variety of reactions, and a significant proportion of ADRs result in hospital admission [6, 7]. Moreover, older adults are nearly seven times more likely to be hospitalised due to ADR-related problems than their younger counterparts [6, 8]. Accurate ADR risk prediction models are therefore needed in clinical practice to help doctors reduce the risk of ADRs in the elderly. Many studies have aimed to identify the key factors that increase a person's risk of ADRs [9, 10]. However, these are not suitable for predicting individual risk because of the considerable differences in disease and drug history between patients. This motivates the design of machine learning-based ADR risk prediction models that draw on the disease and drug history recorded in patients' EMR data. This paper focuses on the use of non-steroidal anti-inflammatory drugs (NSAIDs), which are reported to be associated with a dose-related increased risk of cardiovascular (CV) events.
NSAIDs are extensively prescribed for the treatment of musculoskeletal disorders, rheumatoid arthritis (RA), osteoarthritis (OA), nociceptive pain, headache, and inflammation [11, 12]. A large number of structurally diverse NSAIDs with similar therapeutic effects have been developed, and NSAIDs are among the most widely used drugs, both over the counter (OTC) and on prescription [13-15]. However, their potential adverse effects are also well known. Multiple previous studies have reported an increased risk of CV events from the use of NSAIDs [11, 15-21]. For example, Rofecoxib, one of the NSAIDs, was withdrawn from the market in October 2004 after a randomised placebo-controlled trial showed an increased risk of CV events among Rofecoxib users [21]. Importantly, the population that commonly takes NSAIDs consists of elderly individuals who suffer from CV diseases [11, 15]. We therefore propose a machine learning-based model to predict the risk of adverse events (AEs) associated with NSAID use. Admission due to acute coronary syndrome (ACS) is one of the most common ADRs of NSAIDs.
The objective of this study was to build a machine learning model to predict the risk of AEs in elderly patients who took NSAIDs in Western Australia. We used patients' comorbidity and medication histories for model development. All records come from the Pharmaceutical Benefits Scheme (PBS), linked with the Hospital Morbidity Data Collection (HMDC) and the death register dataset in Western Australia. We compared the performance of different machine learning models and analysed the impact of individual features on the predictions.

Data sources
The study datasets consisted of public and private hospital admissions for heart disease in Western Australia during 2003-2008 from the HMDC, with linked admission records going back to 1980 and forward to 2014 [22]. These were linked to matching records from the Western Australian death registry to 2014, and to PBS data from mid-2002 to mid-2011 from the Australian Department of Human Services. The HMDC and mortality data are two of the core datasets of the Western Australian Data Linkage System [23].
The PBS dataset contains patient-level information for medications dispensed from community pharmacies and PBS hospitals, including details such as drug name and strength, quantity supplied, and supply date.

Inclusion criteria and selection
In this paper, we sampled patients from the PBS dataset who were aged 65 or above and were supplied with NSAIDs at least once between 1 Jan 2003 and 31 Dec 2004. All drugs were identified by their Anatomical Therapeutic Chemical (ATC) code. Because Rofecoxib was withdrawn from the market in October 2004, this sampling period ensured that we could capture records for all NSAIDs. The PBS dataset only records medications that attract a government benefit and does not include out-of-pocket purchases. Previous research has shown that patients aged 65 or above are mostly concessional beneficiaries, and their dispensing records in the PBS data are mostly complete [24]. Furthermore, most patients taking NSAIDs are elderly, and ADRs are more common and more serious in the elderly. The study was therefore restricted to patients aged 65 and above. Figure 1 shows the sampling timeline. Study patients were selected from records between 1 Jan 2003 and 31 Dec 2004. Each patient's comorbidity history covers the preceding 10 years, and their drug history covers the preceding six months. AEs were recorded based on the patient's status within one year after taking an NSAID. The features and outcomes of the study cohort are detailed in the following subsections.
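The cohort rules and time windows described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function names `index_date` and `study_windows` are hypothetical, the index date is assumed to be the last NSAID supply inside the sampling period, and 10 years, six months and one year are approximated as 3,650, 182 and 365 days.

```python
from datetime import date, timedelta

# Sampling period and age cut-off as stated in the text.
SAMPLING_START = date(2003, 1, 1)
SAMPLING_END = date(2004, 12, 31)
MIN_AGE = 65


def index_date(nsaid_supply_dates, age_at_last_supply):
    """Return the last NSAID supply date in the sampling period, or None
    if the patient has no supply in the period or is younger than 65."""
    in_period = [d for d in nsaid_supply_dates
                 if SAMPLING_START <= d <= SAMPLING_END]
    if not in_period or age_at_last_supply < MIN_AGE:
        return None
    return max(in_period)


def study_windows(index):
    """Return (start, end) dates for each window around the index date.
    The day counts are assumptions approximating 10 years, 6 months, 1 year."""
    return {
        "comorbidity_lookback": (index - timedelta(days=3650), index),
        "drug_lookback": (index - timedelta(days=182), index),
        "follow_up": (index, index + timedelta(days=365)),
    }
```

A patient with supplies in 2002, 2003, 2004 and 2005 would get the 2004 supply as the index date under these assumptions, and all three windows are then anchored on that date.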

Input features
The features in our model comprise (1) patients' demographic information, (2) comorbidity history and (3) drug history. Demographic information includes age, gender, marital status, and indigenous ethnicity. These are very common features in medical records and are considered to be strongly related to a patient's health. Age and gender were defined at the last supply date of the NSAIDs for the study cohort. Marital status and indigenous ethnicity were defined at the last admission before the patient's last NSAID supply. Comorbidity history and drug history were recorded according to the timeline design, so information from outside the lookback windows was not included in the features. History of comorbidities was determined from the diagnosis codes (both ICD-9-CM and ICD-10-AM) in the HMDC dataset with a 10-year lookback period from the last supply date (Supplementary Table 1 lists the ICD codes). Comorbidities comprised 13 features: ischaemic heart disease, hypertension, atrial fibrillation, diabetes, chronic obstructive pulmonary disease, peripheral vascular disease, stroke, chronic kidney disease, cancer, dementia, depression, heart failure, and cardiomyopathy. We treated comorbidity history as continuous variables representing the number of previous admissions. Drug history was identified using a 6-month lookback from the last supply date using the PBS data, and drugs were grouped into 16 features corresponding to the first character of the ATC code [25]. We also encoded NSAID history as 13 features, one for each of 13 NSAIDs. Drug history was represented as continuous variables giving the total number of scripts supplied to the patient.
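The count-based features can be illustrated with a short sketch. It assumes events are available as (date, label) pairs and that the lookback window includes the index date itself. The helper names, the example patient, and the day counts (3,650 and 182) are hypothetical and are not taken from the study.

```python
from collections import Counter
from datetime import date, timedelta


def count_in_lookback(events, index, lookback_days):
    """Count events per label within [index - lookback_days, index].

    `events` is an iterable of (event_date, label) pairs.
    """
    start = index - timedelta(days=lookback_days)
    return Counter(label for when, label in events if start <= when <= index)


def atc_group(atc_code):
    """Map an ATC code to its first character, e.g. 'C10AA05' -> 'C'."""
    return atc_code[0].upper()


# Hypothetical patient with an index (last supply) date of 1 Feb 2004.
index = date(2004, 2, 1)
admissions = [
    (date(1990, 1, 5), "heart_failure"),   # older than 10 years: ignored
    (date(1999, 3, 2), "heart_failure"),
    (date(2003, 7, 9), "heart_failure"),
    (date(2002, 11, 20), "diabetes"),
]
scripts = [
    (date(2003, 6, 1), "C10AA05"),   # more than 6 months before: ignored
    (date(2003, 10, 1), "C10AA05"),
    (date(2003, 12, 1), "N02BE01"),
    (date(2004, 1, 15), "M01AH01"),
]

# Number of previous admissions per comorbidity (10-year lookback).
comorbidity_features = count_in_lookback(admissions, index, 3650)
# Number of scripts per first-character ATC group (6-month lookback).
drug_features = count_in_lookback(
    [(when, atc_group(code)) for when, code in scripts], index, 182)
```

Keeping the counts rather than collapsing them to presence or absence is what makes these features continuous in the sense used in the text.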

Outcome
We focused on patients' risk of ACS and all-cause mortality, as previous studies have reported the risks of NSAIDs in CV events [11, 15-21]. ACS admission was identified from the principal discharge diagnosis field of the HMDC records using the International Classification of Diseases and Related Health Problems, Tenth Revision, Australian Modification (ICD-10-AM) codes I20.0 for unstable angina and I21 for myocardial infarction. We also classified patients who died from cardiovascular disease as ACS. Patients who died before their last supply date were excluded (Fig. 2). All-cause mortality was identified from the death registry. We also examined a composite outcome comprising both ACS admission and all-cause mortality. Follow-up began after the patient's last supply date and finished 365 days after it. Among the records we obtained, some patients had identical input features but different outcomes (with or without the event), which would interfere with the predictions. We therefore excluded these records before training the machine learning models.
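The exclusion of records with identical features but different outcomes could look like the sketch below. It assumes that every record in a conflicting group is removed and that exact duplicates with the same outcome are kept. The text does not spell out either detail, and the function name `drop_conflicting` and the example records are hypothetical.

```python
from collections import defaultdict


def drop_conflicting(records):
    """Remove every record whose feature vector occurs with both outcomes.

    `records` is a list of (features, outcome) pairs, where `features` is a
    hashable tuple and `outcome` is 0 (no event) or 1 (event).
    """
    outcomes_seen = defaultdict(set)
    for features, outcome in records:
        outcomes_seen[features].add(outcome)
    return [(features, outcome) for features, outcome in records
            if len(outcomes_seen[features]) == 1]


records = [
    ((70, 1, 0), 0),
    ((70, 1, 0), 1),   # same features, different outcome: both dropped
    ((82, 0, 3), 1),
    ((82, 0, 3), 1),   # exact duplicates with the same outcome are kept
    ((66, 1, 1), 0),
]
cleaned = drop_conflicting(records)
```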

Machine Learning Method
We developed three machine learning models for risk prediction: a gradient boosting classifier model (GBM) [26], a multi-layer neural network (MLNN) model and a support vector machine (SVM) model. These models have performed well in clinical risk prediction [27-29]. However, no literature has explored their performance in predicting the risk associated with NSAID use. All records were split into a training set and a testing set with a ratio of 0.75:0.25. The predictive performance of the models was compared using sensitivity, specificity, and the area under the receiver operating characteristic curve (AUC-ROC).
We used the Youden index [32] to identify the optimal threshold for the ML model predictions, balancing sensitivity and specificity. For all models, we randomly split the dataset using different random states, trained and evaluated the models 50 times, and calculated the mean of each performance metric with its 95% confidence interval. Once the best-performing model was identified, we conducted a sensitivity analysis in which the test set was divided according to the NSAID supplied to each patient. The model was then compared with Cox regression models based on the same features to validate our modelling and performance. We built two Cox regression models: one used the same continuous variables as the machine learning models, and the other used the same features converted to binary variables. Feature importance plots were generated from the GBM for inspection.
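For reference, the three evaluation metrics can be computed from true labels and predicted scores as in the sketch below. This is a pure-Python illustration with made-up data, not the evaluation code used in the study. It assumes both classes are present and treats a score at or above the threshold as a positive prediction.

```python
def sensitivity_specificity(y_true, y_score, threshold):
    """Classify score >= threshold as positive and return (sens, spec)."""
    tp = fn = tn = fp = 0
    for label, score in zip(y_true, y_score):
        predicted = score >= threshold
        if label == 1:
            tp += predicted
            fn += not predicted
        else:
            fp += predicted
            tn += not predicted
    return tp / (tp + fn), tn / (tn + fp)


def auc_roc(y_true, y_score):
    """AUC-ROC as the probability that a random positive outranks a random
    negative, counting ties as one half (O(n^2), fine for small examples)."""
    positives = [s for y, s in zip(y_true, y_score) if y == 1]
    negatives = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Made-up labels and scores for illustration only.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
sens, spec = sensitivity_specificity(y_true, y_score, threshold=0.5)
auc = auc_roc(y_true, y_score)
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC-ROC summarises ranking quality across all thresholds, which is why a separate threshold-selection step is needed.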
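The Youden index is J = sensitivity + specificity - 1, and the selected threshold is the cut-off that maximises J. The sketch below shows this selection together with one way of summarising repeated runs. The confidence interval uses a normal approximation (mean ± 1.96 standard errors), which is an assumption, since the text does not state how the intervals were computed. All data and function names here are made up for illustration.

```python
import math
import statistics


def youden_threshold(y_true, y_score):
    """Return (threshold, J) maximising J = sensitivity + specificity - 1.

    Every distinct score is tried as a cut-off; score >= cut-off is positive.
    On ties in J, the lowest threshold is kept.
    """
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    best_t, best_j = None, -math.inf
    for t in sorted(set(y_score)):
        tp = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(y_true, y_score) if y == 0 and s < t)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


def mean_ci95(values):
    """Mean and normal-approximation 95% CI (mean +/- 1.96 * SE)."""
    mean = statistics.mean(values)
    half_width = 1.96 * statistics.stdev(values) / math.sqrt(len(values))
    return mean, mean - half_width, mean + half_width


# Made-up labels and scores for illustration only.
y_true = [0, 0, 0, 0, 1, 1, 1]
y_score = [0.1, 0.2, 0.3, 0.6, 0.5, 0.7, 0.9]
threshold, j = youden_threshold(y_true, y_score)

# Made-up AUC values standing in for repeated random splits.
aucs = [0.78, 0.80, 0.82]
mean_auc, ci_low, ci_high = mean_ci95(aucs)
```

In the study design described above, the list passed to `mean_ci95` would hold one value per random split (50 in total) for each metric.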

Results
Cohort characteristics
Figure 2 shows the results of each sampling step. There were 109,101 patients supplied with NSAIDs during 2003 and 2004, of whom 40,212 were excluded because they were aged < 65 years or died before their last supply (Fig. 2). We therefore identified 68,889 patients in the cohort, of whom more than 40% used Celecoxib and 35% used Rofecoxib. Table 1 shows patient characteristics for the study groups. The mean age was 76 years, and more than 50% of the cohort was female. More males developed ACS, and older patients were more likely to develop an adverse outcome. Cardiovascular diseases such as ischaemic heart disease and heart failure were more common among patients who developed ACS than among those who did not. The frequency of previous comorbidities was higher in patients who died during follow-up.

Performance of machine learning models
Table 3 shows the performance of the GBM for patients supplied with different NSAIDs. When predicting the risk of ACS, it achieved its highest AUC for patients supplied with Sulindac (AUC 0.84), and its performance was lower for patients supplied with Piroxicam (AUC 0.66). For all-cause mortality, the average AUC was similar across NSAIDs, with a slightly lower AUC (0.79) for patients supplied with Ketoprofen. For the composite outcome, the AUC was higher for patients supplied with Sulindac and Tiaprofenic acid.

Feature importance
Figure 3 shows the ranked feature importance of the GBM when predicting AEs. Age was the most important predictor among all features. Previous cardiovascular diseases such as ischaemic heart disease and heart failure ranked at the top when predicting ACS, followed by the drug groups Cardiovascular system (C) and Nervous system (N) (Fig. 3(a)). Cancer and heart failure history were important features associated with death, as were the drug groups Nervous system (N) and Musculo-skeletal system (M) (Fig. 3(b)).
Cyclooxygenase-2 (COX-2) inhibitors ranked highest among all NSAIDs when predicting patients' risk of ACS. Naproxen, Ibuprofen and Ketoprofen ranked lower than the other NSAIDs. Similar results were found for the composite outcome (Fig. 3(c)).

Discussion
This study presents a set of machine learning models for predicting the risk of AEs after taking NSAIDs, using PBS and HMDC data from Western Australia. We focused specifically on elderly patients (aged ≥ 65) who were supplied with at least one NSAID. The predictions are based on features including age, sex, medication history and disease history, which are routinely considered in clinical practice. This approach covers a wide array of patients and so reflects the population of patients taking NSAIDs in Western Australia. The machine learning-based predictive models for AEs showed greater sensitivity, specificity and AUC-ROC than the classical Cox regression approach, and the GBM gave the best predictive performance among the machine learning models we tested.
Several studies have reported the risk of AEs with NSAIDs, and Rofecoxib was withdrawn from the market due to its increased CV risk. Our models were built to predict CV-related AEs, death and overall AEs. Prediction of death performed best, with AUC-ROC values ranging from 0.67 to 0.81. This does not mean that the deaths were caused by NSAIDs; rather, it demonstrates that predictive models built on PBS and HMDC data work well and can predict the risk of death.
NSAIDs comprise a range of medicines. Our experimental data include all NSAIDs supplied to more than 100 patients. The AUC-ROC values of risk prediction for different NSAIDs range from 0.60 to 0.88. The proposed GBM model can be used to predict the risk of AEs after taking any NSAID, and it performs particularly well for Rofecoxib and Celecoxib, for which the AUC-ROC exceeds 0.8.
Machine learning models have been widely applied to EMR data for prediction, for example predicting suicide death in a nationwide cohort [33], predicting graft survival in kidney transplant recipients [34], and predicting the risk of AEs following spine surgery [5]. These studies found that the machine learning approach did not perform better than a classical generalised regression approach. In our data, however, machine learning models tended to outperform the Cox regression model. This could be because most of the input features in our models are continuous variables, and machine learning models tend to perform better on complex variables. We observed minimal performance improvement when using binary variables indicating the presence or absence of previous comorbidities or the use of specific drugs. However, the ML models outperformed Cox regression when continuous variables were used to represent patients' use of different medications and their comorbidity history. This may be because machine learning approaches do not assume a linear predictor-outcome association and are therefore more adept at generating predictions from continuous variables [39].
Our machine learning model ranked COX-2 inhibitors higher than other NSAIDs in ACS risk prediction. Multiple previous studies have reported an increased risk of CV events from the use of selective COX-2 inhibitors [11, 15, 17-21]. Rofecoxib was withdrawn from the market based on evidence of an increased risk of ACS [21]. A previous study has also confirmed that heart failure substantially increases the risk of mortality [40]. This agreement with the literature supports the reliability of our machine learning model in ranking feature importance.
Despite the merits of this study, there are some limitations. As with all administrative database studies, this study relies on the accuracy of administrative coding of procedures, diagnoses, and records. The PBS dataset did not capture all supplies of NSAIDs such as ibuprofen, which is also available over the counter. Moreover, the PBS dataset did not contain information about the actual drug dosage; hence, we counted the total number of supplied scripts rather than the dose used. We used state-level linked data to predict patients' AEs after their NSAID supply, and the models could be extended to national linked data in the future. For general applicability, the models could also potentially be extended to other drugs or drug groups and to different outcomes, which can be tested in future studies.
Implementing ML models on linked administrative data, including pharmacy claims (e.g. PBS), morbidity, and mortality data, has the potential to identify patients supplied with NSAIDs who may be at high risk of adverse outcomes. These patients can then be monitored closely by humans. Further investigation with additional data is required to validate the ML prediction performance for patients' risk of CV adverse events using population-level linked data.