FORCE Risk Stratification Tool for Pediatric Cardiac Rehabilitation and Fitness Programs

Risk stratification is required to set an exercise prescription for cardiac rehabilitation, but an optimal scheme for congenital heart disease (CHD) is unknown. We piloted a system based on hemodynamic rather than anatomic factors: function, oxygen level, rhythm, complex/coronary anatomy, and elevated load (FORCE). Feasibility, efficacy, and safety of the FORCE tool were evaluated. Patients < 22 years old participating in the Cardiac Fitness Program at Boston Children’s Hospital between 02/2017 and 12/2021 were retrospectively analyzed. Assigned FORCE levels, anatomy, adverse events, fitness and exercise test data were collected. Of 63 attempts at FORCE classification, 62 (98%) were successfully classified while one with restrictive cardiomyopathy was not. Thirty-nine (62%) were FORCE 1, 16 (25%) were FORCE 2, and seven (11%) were FORCE 3. Almost half of FORCE 1 patients had simple or complex CHD and the majority of FORCE 2 patients had single ventricle CHD. FORCE 3 patients were more likely to have serious arrhythmias or cardiomyopathy than those in FORCE 1 or 2 (p < 0.001). Postural orthostatic tachycardia syndrome patients appeared in FORCE 1 only. No adverse events occurred over 958 total sessions. The total number of fitness sessions/participant was similar across FORCE levels. It was feasible to risk stratify patients with CHD using a clinical FORCE tool. The tool was effective in categorizing patients and simple to use. No adverse events occurred with fitness training over nearly 1000 exercise training sessions. Adding diastolic dysfunction to the original model may add utility.


Introduction
Cardiac rehabilitation is well established for adults with ischemic heart disease [1,2], but the elements of a program designed for congenital and pediatric acquired heart disease are only just being explored. There is a call for systematic assessment of program elements designed for this population [3,4]. A critical first step for patients entering a cardiac rehabilitation program is risk stratification to define parameters for safe exercise training intensities. Adult cardiac rehabilitation programs have developed various risk stratification algorithms that set appropriate exercise training zones and center on the risk of ischemic events [1,5-7]. No such risk stratification schema exists for congenital heart disease.
We thus developed a tool adapted from European recommendations for physical activity and sport in congenital heart disease, which are based on hemodynamic classification rather than anatomic criteria, and cross-referenced it with the Bethesda guidelines [8-10]. The European criteria and exercise training ranges were designed to encourage participation in competitive and recreational sport among older teenagers and adults with congenital heart disease. We adapted the schema for cardiac rehabilitation and fitness programs, relying on clinical judgment where evidence-based data were lacking. The tool uses readily available clinical data on Function, Oxygen level, Rhythm considerations, Coronary/Complex risk, and Elevated load (FORCE) to categorize patients into one of three training intensity levels. The FORCE tool was used in our initial pilot group of patients undergoing cardiac rehabilitation at Boston Children's Hospital between 2017 and 2021. We sought to assess the feasibility, safety, and efficacy of the FORCE tool in this initial cohort.

Materials and Methods
A single-center retrospective chart review was conducted to identify participants < 22 years of age who started the Boston Children's Hospital Cardiac Fitness Program (CFP) after the adoption of the FORCE risk stratification tool in February 2017 and completed the program by December 2021. Exclusion criteria were age ≥ 22 years, starting but not completing the CFP within the study period, and absence of a cardiac diagnosis. One additional patient was excluded because a heart transplant midway through the program invalidated comparison of pre- and post-program data. This study was approved by the Boston Children's Hospital Institutional Review Board with waiver of informed consent.
All patients were stratified at the time of program entry by a single cardiologist (NG). The data used to stratify patients into FORCE categories were taken from chart review at the time of initial consultation. Imaging done in the course of usual care within six months of program intake served as the source data for FORCE classification. Table 1 lists the clinical criteria used for each FORCE variable, and Fig. 1 demonstrates the method for risk stratification. The first step in utilizing the FORCE tool is to consider each variable independently as meeting Class A (normal to mild), Class B (moderate), or Class C (severe) criteria (  [12]. Patient data included age, sex, race, BMI, diagnosis classification (simple congenital heart disease, complex congenital heart disease, single ventricle anatomy, postural orthostatic tachycardia syndrome, or arrhythmia/cardiomyopathy/transplant), assigned FORCE level (1, 2, or 3), cardiopulmonary exercise test (CPET) data at baseline and post-program, and standard fitness metrics taken at baseline and at 60 days into the program. CPET data included baseline and discharge percent predicted peak workload, percent predicted peak oxygen consumption (VO2), peak respiratory exchange ratio (RER), percent predicted peak oxygen pulse, percent predicted peak heart rate, VE/VCO2 slope, and the ventilatory anaerobic threshold. Only maximal cardiopulmonary exercise tests, defined as RER > 1.09, were included for analysis. Standard fitness metrics were collected every 30 days and consisted of sit and reach distance to assess flexibility, plank hold time to exhaustion to assess core strength, and number of modified push-ups standardized to a block to assess upper body strength.
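The two-step stratification described above can be sketched in code. This is a minimal illustration only: the text names the five variables and the three per-variable classes, but the rule that combines them into a FORCE level is given in Fig. 1, not in the text. The sketch therefore assumes that the most severe class among the five variables sets the level (all A gives 1, any B gives 2, any C gives 3). The names `Severity`, `FORCE_VARIABLES`, and `force_level` are hypothetical, and the example patient is invented.

```python
from enum import IntEnum
from typing import Dict, Optional


class Severity(IntEnum):
    """Per-variable class from the text: A (normal to mild), B (moderate), C (severe)."""
    A = 1
    B = 2
    C = 3


# The five FORCE variables named in the text.
FORCE_VARIABLES = ("function", "oxygen", "rhythm", "complex_coronary", "elevated_load")


def force_level(classes: Dict[str, Optional[Severity]]) -> Optional[int]:
    """Return a FORCE level (1, 2, or 3), or None if a variable cannot be classified.

    ASSUMPTION: the overall level is set by the most severe class among the
    five variables. The actual combination rule is in Fig. 1 of the paper,
    so this rule is illustrative, not the authors' published algorithm.
    """
    values = [classes.get(name) for name in FORCE_VARIABLES]
    if any(v is None for v in values):
        return None  # a variable has no applicable criteria for this patient
    return int(max(values))


# Hypothetical patient: all variables Class A except a moderate oxygen-level finding.
patient = dict.fromkeys(FORCE_VARIABLES, Severity.A)
patient["oxygen"] = Severity.B
print(force_level(patient))
```

Returning `None` rather than forcing a level keeps unclassifiable cases visible, which is how feasibility is counted in this study.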
Over the study period, time spent in the program varied from about 2 to 4 months. To standardize across patients, the 60-day metrics rather than program completion metrics were chosen for comparison; this allowed enough time to expect a training effect without comparing patients who had trained longer than others. Patients had to have both baseline and 60-day fitness assessments to be included in the analysis. The metric used for lower body strength changed several times during the study period as clinical experience with field testing grew; therefore, no lower body strength metric was analyzed in this study. Major and minor adverse events (listed in the Appendix) were tabulated from the start through completion of the program to assess safety. Categorical variables are summarized using frequencies and percentages, and continuous variables using medians with ranges or interquartile ranges as noted. Feasibility was calculated as the percent of patients able to be placed into a FORCE level and is presented with a 95% confidence interval. Safety was assessed by considering the number of adverse events occurring in the study cohort. Patient characteristics were compared across the three FORCE levels using Fisher's exact test for categorical variables and the Kruskal-Wallis test for continuous variables. For CPET and fitness parameters, absolute and percent changes were calculated from pre- to post-program, and changes were assessed using the Wilcoxon signed-rank test. These analyses were performed separately for patients in FORCE level 1 and FORCE level 2.
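The paired pre/post comparison can be illustrated with a short sketch. The text does not say whether exact or asymptotic p-values were used, nor how the summary percent change was computed. The sketch below assumes an exact two-sided Wilcoxon signed-rank test, which is feasible at the small sample sizes in this study, and assumes the percent change is the median of per-patient percent changes. The plank-hold values are invented for illustration and are not study data; the function names are hypothetical.

```python
from itertools import product
from statistics import median


def percent_change(pre, post):
    """Median of per-patient percent changes from baseline (assumed summary)."""
    return median(100.0 * (b - a) / a for a, b in zip(pre, post))


def wilcoxon_signed_rank_exact(pre, post):
    """Two-sided exact Wilcoxon signed-rank p-value for paired data.

    Zero differences are dropped and tied |differences| get average ranks.
    All 2**n sign patterns are enumerated, so this suits small n only.
    """
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    n = len(diffs)
    if n == 0:
        return 1.0
    # Assign 1-based average ranks to the absolute differences.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean_w = sum(ranks) / 2  # expected sum of positive ranks under the null
    observed = abs(w_plus - mean_w)
    # Under the null, each rank is equally likely to carry either sign.
    extreme = sum(
        1
        for signs in product((0, 1), repeat=n)
        if abs(sum(r for r, s in zip(ranks, signs) if s) - mean_w) >= observed - 1e-9
    )
    return extreme / 2 ** n


# Hypothetical plank-hold times in seconds (NOT study data).
pre = [20, 15, 30, 22, 18, 25, 12]
post = [41, 30, 52, 48, 35, 50, 28]
print(percent_change(pre, post), wilcoxon_signed_rank_exact(pre, post))
```

With seven pairs that all improve, the smallest achievable two-sided exact p-value is 2/128, which shows how little room such small samples leave for reaching conventional significance thresholds.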

Baseline Patient Characteristics
Between February of 2017 and December of 2021, 59 unique patients aged < 22 years completed the Cardiac Fitness Program at Boston Children's Hospital 63 times, yielding 63 attempts at risk stratification. Baseline patient characteristics are listed in Table 2. One patient had diastolic dysfunction as the primary hemodynamic issue (diagnosis of restrictive cardiomyopathy) and could not be stratified into a FORCE level, so 98.4% (95% CI 91.5%, 100%) of attempts were successfully classified. Of those classified (n = 62), there were 39 in FORCE 1, 16 in FORCE 2, and seven in FORCE 3. The small sample size in FORCE 3 limited inferences about this group.
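The feasibility estimate and its interval can be checked with a short calculation. The text does not name the interval method; the sketch below assumes an exact (Clopper-Pearson) binomial interval, computed by bisection using only the standard library. The function names are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_increasing(f, target, iters=100):
    """Bisection for f(p) = target on [0, 1], where f is increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for a binomial proportion."""
    # Lower bound: P(X >= k | p) = alpha / 2.
    lower = 0.0 if k == 0 else _solve_increasing(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2)
    # Upper bound: P(X <= k | p) = alpha / 2, i.e. P(X > k | p) = 1 - alpha / 2.
    upper = 1.0 if k == n else _solve_increasing(
        lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2)
    return lower, upper


classified, attempts = 62, 63  # counts reported in the text
lower, upper = clopper_pearson(classified, attempts)
print(f"{classified / attempts:.1%} ({lower:.1%}, {upper:.1%})")
```

Under this assumption the interval agrees with the reported 91.5% to 100% at the printed precision, which suggests, but does not confirm, that an exact method was used.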
Four unique patients completed the program twice during the study period; two underwent FORCE re-classification due to intervening clinical change, and two remained at the same FORCE level (Table 3). Of the two who were re-classified, one initially had failing single ventricle heart disease and returned later, after major surgery, with improved hemodynamics but in need of rehabilitation; the other, with complex CHD, participated in the program initially, later had clinical decline due to a sudden-onset intractable arrhythmia, and returned to the program after a heart transplant. Of the two who remained at the same FORCE level, one had complex CHD with no interval clinical change between the first and second time in the program and returned because of a major social change and loss of motivation with deconditioning. The other had single ventricle palliation with intervening Fontan revision surgery and similar hemodynamic status at the start of both periods of the rehabilitation program.

Comparison of Patient Characteristics Between FORCE Levels
A breakdown of each FORCE group is listed in Table 4. Patients in FORCE 3 were slightly older than those in FORCE 1 and 2, with higher weight and body mass index, whereas FORCE 1 and 2 patients were similar. Of the 39 patients in the FORCE level 1 category, almost half had structural congenital heart

Adverse Events
There were no major or minor adverse events in any FORCE group over the course of the training program (Table 4). Exposure was similar across each FORCE group. There were 958 fitness sessions held over the study period, of which 652 were for FORCE 1 (median number per participant was 17 [10, 23]), 220 for FORCE 2 (median number per participant was 17 [6, 19]), and 86 for FORCE 3 (median number per participant was eight [4, 22]).
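To put zero observed events in context, one can compute an upper confidence bound on the per-session event rate. This calculation is not part of the study's analysis; it is a sketch that assumes sessions are independent with a constant per-session risk, which is a simplification because each participant contributed many sessions. The function name is hypothetical; the session counts are those reported above.

```python
def zero_event_upper_bound(n_sessions, confidence=0.95):
    """One-sided exact upper bound on the per-session event rate given 0 events.

    Solves (1 - p) ** n_sessions = 1 - confidence for p.
    ASSUMPTION: sessions are independent with a constant event probability.
    """
    return 1 - (1 - confidence) ** (1 / n_sessions)


# Session counts reported in the text, keyed by FORCE level.
sessions_by_level = {1: 652, 2: 220, 3: 86}
total_sessions = sum(sessions_by_level.values())  # 958

upper = zero_event_upper_bound(total_sessions)
rule_of_three = 3 / total_sessions  # common approximation to the same bound
bounds = {lvl: zero_event_upper_bound(n) for lvl, n in sessions_by_level.items()}

print(f"all levels: exact {upper:.3%}, rule of three {rule_of_three:.3%}")
for lvl, b in bounds.items():
    print(f"FORCE {lvl}: {b:.3%} per session")
```

Under these assumptions the pooled bound is roughly 0.3% per session, while the bound for FORCE 3 alone is about ten times wider because only 86 sessions were observed in that group.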

Comparison of Pre to Post-program Fitness Metrics for FORCE 1 and FORCE 2 Patients
Baseline and 60-day fitness metrics for FORCE 1 and FORCE 2 patients are summarized in Table 5. There were no statistically significant differences in baseline characteristics between patients who had changes measured and those who did not. FORCE 3 patients were not analyzed due to small sample size.
For the FORCE 1 patients measured at both time points, there were significant increases in each fitness variable.  [19,31] cm or by 23% from baseline to 60 days (n = 8, p = 0.023). Core strength as assessed by plank hold increased by 118% from baseline to 60 days (median 22 [11,35] to 48 [22,50] seconds) but did not reach statistical significance in this small sample (n = 7, p = 0.078). Upper body strength as measured by number of push-ups increased from seven [4,10] to 13 [5,15], or by 38%, from baseline to 60 days, but also did not reach statistical significance (n = 6, p = 0.13). On the cardiopulmonary exercise test, the percent predicted peak workload increased from 68 [67, 69] to 79 [55,83], or by 5%, from baseline to discharge (n = 5, p = 0.44), and the percent predicted peak VO2 increased from 64 [52, 66] to 69 [52,78], or by 7% (n = 6, p = 0.19); neither change reached statistical significance in the setting of small sample sizes.

Discussion
Risk stratification is a critical step for patient entry into a cardiac fitness and rehabilitation program to set appropriate parameters around exercise intensity. Effectively selecting patients who can safely exercise at clearly prescribed levels is both a practical goal of the program and the primary liability concern of the providers. Since no protocol had been established for patients with congenital and pediatric acquired heart disease, the FORCE tool was developed to categorize patients into risk levels taking into account baseline hemodynamic parameters that would be further stressed by the demands of exercise. The tool needed to be feasible, safe, and effective. After gaining experience for the first five years of use, we examined its performance in our initial cohort.
The FORCE tool was feasible and simple to use. The vast majority (98%) of our cohort were easily categorized with readily available clinical data that were already part of the existing workflow. The only patient who was not classified had restrictive cardiomyopathy with isolated diastolic dysfunction, which was not specifically addressed in the original model. To strengthen the FORCE tool, we propose adding indices of diastolic dysfunction to the "elevated load" category and establishing definitions for mild, moderate, and severe, preferably using imaging parameters for accessibility and ease of use. The cut-offs used for risk stratification for the other variables in the FORCE tool were derived from several sources [8,9] and are consensus-based rather than evidence-based. The original intent of the thresholds was to guide sports participation, not specifically to set limits for supervised exercise training. We co-opted these values as reasonable proxies for cardiac rehabilitation, and they are open to interpretation. Since cardiac rehabilitation is medically supervised whereas sports participation is not, these cut-offs may prove to be conservative. More widespread use of the FORCE tool with systematic data collection will be important to improve the model over time.
Adult cardiac rehabilitation programs have decades of practice using various schemas, and there is still no single risk stratification scheme in use for patients with ischemic heart disease [5]. There is a trade-off between simplicity of use and missing potential prognostic signs of adverse events [1,6,13]. Globally, no one option has been uniformly adopted by adult programs [7]. Additionally, some authors have looked at adding further modifiers, such as a comorbidity index [14], for factors beyond cardiovascular elements. The FORCE tool does not yet examine factors beyond cardiac hemodynamics, although neuromuscular or genetic factors are intriguing candidates for future consideration, and we anticipate further iterations as experience is gained.
Exercise testing has been shown to be safe for adults with ischemic heart disease [15] and with congenital heart disease [16], and there is ample evidence that exercise training and cardiac rehabilitation are also safe for both populations [4,17-22]. In our patients, classified with the FORCE tool, there were no major or minor adverse events over 958 supervised exercise sessions. This included patients with complex congenital heart disease who were classified into FORCE 1 and underwent high-intensity exercise, as well as patients who were trained more conservatively in FORCE 2 and 3. Adverse events are rare in adults with ischemic heart disease, who are arguably more vulnerable but also potentially not trained at the same intensity as our young patients. The American Heart Association reports well-established safety for older adults meeting criteria for adult cardiac rehabilitation programs, with only two deaths per 1.5 million patient-hours of exercise and one major event per 50,000-120,000 patient-hours [2]. The volume of patients with congenital and pediatric acquired heart disease is considerably smaller than that of patients with ischemic heart disease, so we anticipate needing many more years of assessment to better estimate safety, but we were reassured by our initial data and bolstered by the adult experience.
While safety is paramount, we also want to train our patients with enough intensity, volume, and progression to improve in their fitness and exercise capacity. For the FORCE tool to be useful, it needs to be effective in not only categorizing patients into appropriate groups, but also setting an appropriate exercise prescription level for each group. We examined efficacy in two ways: (1) whether using the FORCE tool, which looks at hemodynamic parameters only, subsequently resulted in anatomic distribution of patients that was logical and made clinical sense, and (2) whether the exercise prescriptions that were generated by the FORCE levels resulted in improvements of fitness.
In terms of categorization, we found our FORCE 3 patients were more likely to have an arrhythmia or cardiomyopathy than those in FORCE 1 or 2. This is largely by design, as significant ventricular dysfunction or concerning arrhythmias would be most likely to result in a FORCE 3 categorization. Patients in FORCE 2 were more likely to have single ventricle anatomy than those in FORCE 1, and less likely to have simple or complex congenital heart disease, which again is the anatomic breakdown one would anticipate from using hemodynamic parameters for categorization. For adult cardiac rehabilitation programs, the range of diagnoses is not as great, but there may be subjectivity in grouping as well. One small study suggested potential intra-observer variability in classifying patients within various frameworks [23]; whether any harm or increased risk resulted from this variation in classification is unclear, but the low prevalence of adverse events in the adult population is reassuring. Our study was limited by the fact that only one provider performed all classifications, and there is certainly room for interpretation, as clinical judgment is built into the model. However, we found that the FORCE tool provided parameters specific enough for classification, and we hope others will use it, test it, and build upon this initial experience to improve its value. Without any such tool, rational and standardized exercise prescriptions for our patient population will not be possible.
Exercise training intensity, as defined by training heart rate zones or levels of perceived exertion, was scaled for each FORCE level. We were unable to analyze data on FORCE 3 patients due to small sample size. For FORCE 1 patients, there were significant increases in measures of strength (plank hold, 51% increase; push-ups, 78% increase) and flexibility (6% increase; Table 5), and for FORCE 2 patients, the fitness metrics increased markedly as well (plank hold, 118%; push-ups, 38%; flexibility, 23%), although the strength gains did not reach statistical significance in this small sample. Since strength and flexibility training are not necessarily as hampered by hemodynamic limitations, it was not surprising to see these training effects. Looking at aerobic parameters, for FORCE 1 there were significant increases in percent predicted peak workload with borderline significant increases in percent predicted peak VO2, while FORCE 2 patients did not have statistically significant changes in these parameters (Table 4). It is unclear whether this reflected a sample too small to detect differences, training that did not reach sufficient intensity to move these measures, or an expected finding, given that peak VO2 may be unlikely to change because of the hemodynamic limitations that placed these patients in FORCE 2 in the first place.

Limitations
The main limitation of our study was its small sample size. An additional potential limitation was that a single provider applied the FORCE tool, so variability in classification could not be assessed, although consistency of use was a benefit. Also, the program evolved over the study period, with several iterations of the training and outcome metrics used. However, the FORCE classifications themselves did not change from inception through the end of data collection. The present study was designed to assess the efficacy of the FORCE tool in categorizing patients appropriately and setting exercise prescription zones, but it did not assess adherence to those zones, volume of sessions per patient or FORCE level, or equivalency of physical activity between supervised training sessions. In particular, we did not independently measure the amount of exercise performed between supervised sessions, and there may have been significant differences between patients. Last, the fitness outcomes were standardized to the 60-day mark, a relatively short duration. This was likely, but not necessarily, enough time to demonstrate a training effect, although the percent increases seen from baseline to 60 days are encouraging. Future studies will need to address each of these considerations in detail, and evaluating the delivery and effectiveness of the program will be the subject of future work.

Conclusions
The FORCE risk stratification tool is feasible, determined safe exercise training zones as observed over the 958 supervised sessions during the study period, and is effective in categorizing patients and providing the basis for an exercise prescription. The original FORCE tool did not account for diastolic dysfunction, and the model could be improved by adding these parameters. Future studies are needed to better understand adherence to the exercise prescription in terms of frequency, intensity, volume, and progression, and to determine whether the risk model preserves safety without undercutting the level of training the patient needs to become more fit.

Author Contributions
All authors contributed to the study conception and design. Material preparation and data collection were performed by NG MD, LR BS, TC PhD, and JO'N MS, and analysis by KG ScD. The first draft of the manuscript was written by NG MD and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Funding
There are no financial or non-financial interests that are directly or indirectly related to the work submitted for publication. Funding for the program was provided by a Boston Children's Hospital Heart Center Strategic Investment Grant.