Patient Population
Participants included all patients (n=476) who underwent their first pulmonary vein isolation ablation between June 2011 and December 2017 at a tertiary medical center and who had at least one ECG or 24-hour Holter monitor recording performed after the 90-day blanking period but within 1 year of ablation. In the sampled clinical population, patients were followed post-operatively by either their referring provider or a clinical electrophysiologist. Recurrent symptomatic AF was managed by either the procedural electrophysiologist or the patient’s referring physician. Patients were excluded if they had undergone a previous MAZE procedure. One patient was excluded because she died less than one year after her ablation procedure. Data from these patients was collected and analyzed retrospectively after approval from the Institutional Review Board.
Data Gathering Procedure
The status of each clinical variable was taken from information charted prior to the CA, except for AF during the blanking period and post-ablation antiarrhythmic drug status. A retrospective chart review was performed, and data was collected on the following variables: age at the time of ablation, sex, body mass index, AF type, method of CA energy delivery (cryoballoon vs. radiofrequency), moderate or worse valvular heart disease, moderate or worse left ventricular concentric hypertrophy, coronary artery disease, history of myocardial infarction, evidence of prior reduced left ventricular ejection fraction (LVEF), heart failure with preserved EF (HFpEF), hypertension, prior transient ischemic attack, prior failure of antiarrhythmic drug, prior cardiac surgery, end-stage renal disease, coexistence of atrial flutter, antiarrhythmic drugs prescribed prior to ablation, antiarrhythmic drugs prescribed for at least one year following ablation, AF during the post-procedural blanking period, and time since initial AF diagnosis. The LVEF, left atrial diameter (LAD), and left atrial volume index (LAVI) were included in the analysis only when an echocardiogram had been performed less than 6 months before the ablation. AF type was categorized as paroxysmal (intermittent AF episodes lasting less than 1 week), persistent (AF episodes lasting more than 1 week but less than 1 year), or long-standing persistent (AF episodes lasting more than 1 year). Clinical success was defined as the absence of any documented atrial arrhythmia lasting more than 30 seconds between the end of the 90-day blanking period and 12 months after ablation.
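The AF type categories and the definition of clinical success given above can be expressed as simple rules. The following Python sketch is illustrative only and is not the procedure used for the chart review. The function names, the treatment of 1 week as 7 days and 1 year as 365 days, and the handling of durations that fall exactly on a boundary are assumptions that the definitions above do not specify.

```python
def classify_af_type(longest_episode_days):
    """Map the longest continuous AF episode (in days) to an AF type.

    Assumption: an episode of exactly 7 days is treated as persistent and
    one of exactly 365 days as long-standing persistent.
    """
    if longest_episode_days < 0:
        raise ValueError("episode duration cannot be negative")
    if longest_episode_days < 7:
        return "paroxysmal"
    if longest_episode_days < 365:
        return "persistent"
    return "long-standing persistent"


def clinical_success(episodes, blanking_days=90, follow_up_days=365, min_seconds=30):
    """Return True when no qualifying arrhythmia was documented.

    `episodes` is a hypothetical list of (day_after_ablation, duration_seconds)
    pairs. An episode counts against success only if it occurs after the
    blanking period, within the follow-up window, and lasts more than
    `min_seconds`.
    """
    for day, seconds in episodes:
        if blanking_days < day <= follow_up_days and seconds > min_seconds:
            return False
    return True
```

For example, under these assumptions an arrhythmia documented on day 45 falls inside the blanking period and does not count as a failure, whereas a 45-second episode on day 120 does.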
Model Generation
Five variables (LVEF, LAD, LAVI, LV concentric hypertrophy, and months since initial AF diagnosis) had missing observations, and their missing values were randomly imputed. The observed data was used to identify appropriate models for the imputation process. The variables with complete data were compared between patients with arrhythmia recurrence and those who remained free of arrhythmia. For categorical variables, relative frequencies, odds ratios (OR) relative to a specified baseline category, 95% confidence intervals for the OR, and p-values computed using Fisher’s exact test are reported. For continuous variables, the mean, 95% confidence interval, and p-value computed using a t-test are reported. All analyses were performed in R version 3.5.0 (R Foundation for Statistical Computing, Vienna, Austria). Statistical significance was assessed at α = 0.05.
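As an illustration of the categorical comparison, the sketch below computes a sample odds ratio and a two-sided Fisher’s exact p-value for a 2x2 table. It is written in Python using only the standard library, whereas the analyses reported here were run in R, so it should be read as a sketch of the technique rather than the code that was used. The function names and the example counts are hypothetical, and the two-sided p-value follows the common convention of summing the probabilities of all tables no more likely than the observed one.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Sample odds ratio for the 2x2 table [[a, b], [c, d]].

    Raises ZeroDivisionError when b * c == 0; no continuity correction
    is applied in this sketch.
    """
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more likely than the observed table;
    # the small tolerance guards against floating-point ties.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


# Hypothetical counts: rows = factor present / absent,
# columns = recurrence / no recurrence.
example_or = odds_ratio(3, 1, 1, 3)
example_p = fisher_exact_two_sided(3, 1, 1, 3)
```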
Next, a logistic regression model was developed to identify factors associated with the recurrence of arrhythmia. To avoid sampling artifacts in the variables for which imputed data was used, 1,000 imputed data sets were generated, and the regression model was fit on each of them. Starting from all available variables, the logistic regression model was fit using forward stepwise variable selection to determine which variables maximized the ability of the model to correctly predict AF recurrence. Bagging was used to combine the results across these fits, and 6 variables were selected for inclusion to estimate the probability of recurrence of atrial arrhythmia within 12 months of the procedure. The following equation (1) gives the estimated model to predict the probability of recurrence of atrial arrhythmia:

where $X_{\mathrm{AF}}$ = AF documented during the blanking period; $X_{\mathrm{CTI}}$ = coexistence of atrial flutter; $X_{\mathrm{ESRD}}$ = end-stage renal disease; $X_{\mathrm{PriorEF}}$ = prior reduced left ventricular ejection fraction; $X_{\mathrm{FailedDrugs}}$ = prior failure of antiarrhythmic drugs; and $X_{\mathrm{VHD}}$ = presence of valvular heart disease.
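In general form, a logistic regression model with these six predictors can be written as follows. The $\beta$ terms are generic coefficient symbols introduced here for illustration; they are not the fitted estimates.

```latex
\hat{p} = \frac{1}{1 + \exp\!\left[-\left(
    \beta_0
    + \beta_{\mathrm{AF}} X_{\mathrm{AF}}
    + \beta_{\mathrm{CTI}} X_{\mathrm{CTI}}
    + \beta_{\mathrm{ESRD}} X_{\mathrm{ESRD}}
    + \beta_{\mathrm{PriorEF}} X_{\mathrm{PriorEF}}
    + \beta_{\mathrm{FailedDrugs}} X_{\mathrm{FailedDrugs}}
    + \beta_{\mathrm{VHD}} X_{\mathrm{VHD}}
\right)\right]}
```

Here $\hat{p}$ denotes the predicted probability of recurrence of atrial arrhythmia within 12 months, and each $X$ takes the value 1 when the corresponding factor is present and 0 otherwise, assuming binary coding of the predictors.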
Model Validation
To assess the model’s strength in predicting the recurrence of atrial arrhythmia, the patients were divided randomly into two groups: training data, consisting of 80% (n=380) of the patients, used to construct the model; and testing data, consisting of the remaining 20% (n=96) of the patients. The division into training and testing data sets was performed after random imputation. To avoid sampling artifacts, we considered 1,000 randomly imputed data sets. For each imputed data set, the samples were divided randomly into 1,000 training and testing data sets. For the models, we used only the variables selected through the forward selection procedure (as described previously). After the model coefficients were estimated using the training data, the probability of recurrence of atrial arrhythmia was predicted for the patients in the testing data set. Patients with a predicted probability greater than 50% were classified as having recurrence of atrial arrhythmia. Prediction accuracy for each data set was calculated as the percentage of patients correctly classified when compared against the observed recurrence of atrial arrhythmia. The accuracies of the ensemble models were combined using bagging, and the mean accuracy of the 1,000 training/testing data sets was recorded.
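The splitting, thresholding, and accuracy steps described above can be sketched as follows. This is a minimal Python illustration using only the standard library, not the R code used for the analysis. The function names are hypothetical, the model-fitting step is omitted, and rounding the training set size down to a whole number of patients is an assumption (it reproduces the 380/96 split for 476 patients).

```python
import random


def train_test_split_indices(n_patients, train_fraction=0.8, rng=None):
    """Randomly partition patient indices into training and testing sets."""
    rng = rng or random.Random()
    indices = list(range(n_patients))
    rng.shuffle(indices)
    n_train = int(n_patients * train_fraction)  # assumption: round down
    return indices[:n_train], indices[n_train:]


def classification_accuracy(predicted_probs, observed, threshold=0.5):
    """Fraction of patients whose predicted class matches the observed outcome.

    A patient is classified as having recurrence only when the predicted
    probability is strictly greater than the threshold.
    """
    predicted = [p > threshold for p in predicted_probs]
    correct = sum(pred == bool(obs) for pred, obs in zip(predicted, observed))
    return correct / len(observed)


def mean_accuracy(accuracies):
    """Average the accuracies obtained over repeated training/testing splits."""
    return sum(accuracies) / len(accuracies)


# One split of the 476 patients; the seed is arbitrary and used only
# to make the illustration reproducible.
train_idx, test_idx = train_test_split_indices(476, rng=random.Random(0))
```

In a full run, a model would be fit on the patients in `train_idx`, its predicted probabilities for the patients in `test_idx` would be passed to `classification_accuracy`, and the resulting accuracies would be averaged with `mean_accuracy` over the repeated splits.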
Ethical Approval
Given the retrospective nature of this study, the Augusta University Institutional Review Board exempted us from obtaining patient consent for the use of patient-level data in this protocol (IRB-Net-ID: 1174793-1). The experimental protocols used in this study were approved by the Augusta University Institutional Review Board, and its guidelines were followed throughout the creation of this manuscript.