Cost-Utility Analysis of a Chronic Back Pain Multidisciplinary Biopsychosocial Rehabilitation (MBR) compared to Standard Care for Privately Insured in Germany


Introduction: Multidisciplinary biopsychosocial rehabilitation (MBR) is highly recommended to treat chronic lower back pain (CLBP). However, its economic benefit remains to be clearly demonstrated. Objective: To analyse the effect of a 12-month MBR with behaviour-change coaching and device-supported exercise on direct medical costs, sick leave and health-related quality of life (HRQOL) at 24 months. Methods: A cohort of privately insured individuals in Germany was evaluated using administrative and trial data. After removing dissimilarities in characteristics between MBR and control via propensity score matching, treatment effects were calculated with a Difference-in-Difference approach. Results: The base-case analysis comprised 112 MBR participants and 111 members in the standard care group. With an incremental cost-effectiveness ratio (ICER) of €8,428 per quality-adjusted life year (QALY) gained, the intervention was classified as cost-effective. Economically unaccounted sick leave due to back pain (BP) in the last six months was reduced by 17.5 days (p = 0.001) in the MBR group. Development of HRQOL was positive for the MBR (0.046, p = 0.026). Subgroup analysis with major impaired participants demonstrated the possibility of a dominant intervention with an ICER of -€6,861 per QALY. Savings were driven by a reduction in BP specific costs of €1,733 (p = 0.035). The difference in sick leave was 27 days (p = 0.006) in favour of the MBR group. Conclusion: This is the first cost-utility study with combined data from a private health insurer and a controlled trial to demonstrate that long-term MBR is cost-effective in the treatment of CLBP. Subgroups with major impairment benefit more from the intervention than participants with minor impairment. MBR significantly reduces sick leave in all participants, making it a profitable intervention from society’s point of view.


Introduction
Lower back pain (BP) is the leading cause of years lived with disability (YLD) worldwide [1]. For those affected, frequent absences from work and severe pain-related disability are common. BP causes high direct and very high indirect costs [2]. Estimates of the economic burden of BP range from annual costs of AU$9.17 billion in Australia to $91 billion in the USA [3,4]. In Germany, costs are estimated to be up to €49 billion per year [5].
Chronic lower back pain (CLBP) is an urgent global public health concern [6]. It is often conceptualised as a biopsychosocial problem, i.e. a complex and dynamic interaction between physical, psychological and social elements [7]. To treat CLBP, multidisciplinary biopsychosocial rehabilitation (MBR), a combination of physical and behavioural or social components, is recommended in clinical guidelines [8]. lack of exercise [22,23]. It therefore seems to be too narrow to consider only BP specific costs in the economic evaluation.
The working groups of Groessl [24], Lambeek [25] and Williams [26] also analysed structured interventions for CLBP and compared them with usual care. Their studies are not included in the model of Herman et al. and used all direct health care costs for the QALY calculation.
Groessl et al. [24] investigated a 12-week progressive yoga intervention with a total of 150 participants. They reported a QALY gain of 0.043 and additional healthcare costs of $193, resulting in an ICER of $4,488.
Lambeek et al. [25] reported on a workplace intervention consisting of a graded activity programme with cognitive behavioural principles for sick-listed patients with CLBP. Sixty-six participants received the intervention, and 68 were treated according to usual care. After 12 months, a QALY gain of 0.09 and additional direct costs of £217 were described, leading to an ICER of £2,411. Williams et al. [26] analysed the effectiveness of a healthy lifestyle intervention consisting of brief advice, a clinical consultation and referral to a 6-month telephone-based coaching service. The intervention participants (n=80) had a QALY gain of 0.02 and increased total direct costs from a provider perspective of AU$386 compared to usual care (n=79), resulting in an ICER of AU$19,036.
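The ICERs quoted for these studies are the incremental cost divided by the incremental QALYs. The following minimal Python sketch illustrates that ratio with the rounded figures reported above for Groessl et al. and Lambeek et al.; the function name `icer` is a hypothetical helper introduced for illustration only.

```python
def icer(incremental_cost: float, incremental_qaly: float) -> float:
    """Incremental cost-effectiveness ratio: extra cost per QALY gained."""
    if incremental_qaly == 0:
        raise ValueError("ICER is undefined when the QALY difference is zero")
    return incremental_cost / incremental_qaly


# Rounded inputs as quoted in the text; the currency differs per study.
groessl = icer(193, 0.043)  # US dollars per QALY
lambeek = icer(217, 0.09)   # pounds sterling per QALY

print(round(groessl), round(lambeek))  # prints: 4488 2411
```

A negative ICER needs separate interpretation: it arises when the intervention is both cheaper and more effective (a dominant intervention), but also when it is more expensive and less effective, so the sign of each component should be checked rather than the ratio alone.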
Müller et al. [27] presented a cost-effectiveness study of a long-term multimodal intervention with a device-oriented exercise approach and broad access options for a German population. The intervention group consisted of 1,829 participants and the usual care control group of 495 members. After a follow-up of two years, they observed a reduction in direct medical costs of €763 and BP specific cost savings of €239. The largest effect was seen in the most affected group according to chronicity grades, with savings of €4,535. The authors did not provide information about the development of QALYs. As neither a psychological nor a social dimension was included in the intervention analysed, it cannot be classified as an MBR. To the author's knowledge, no cost-effectiveness analysis of a long-term MBR in Germany has been published so far.
Four systematic reviews of cost-benefit studies in the treatment of CLBP exist [9,28-30]. All of them called for more high-quality site-specific studies to reduce the uncertainty of cost effects in BP therapies, asked for observation periods longer than 12 months and recommended using high reporting standards.
This paper addresses this gap and fulfils the Consolidated Health Economic Evaluation Reporting Standards (CHEERS) criteria in analysing whether a 12-month outpatient MBR intervention consisting of behaviour-change coaching and device-supported exercise with low entrance barriers can prove cost-effective compared to standard care in patients with CLBP in a private health insurance (PHI) setting in Germany.
Routine data from the PHI provider Generali Deutschland Krankenversicherung AG (Generali Germany Health Insurance, formerly known as "Central Krankenversicherung") were evaluated. Since 2014, Generali has been running a one-year MBR for people with CLBP. The intervention was accompanied by a controlled trial in Zelen's design, registered at the German Clinical Trials Register under DRKS00015463. Results showed medical effectiveness at the 12-month follow-up [31,32]. To find out whether the intervention is also cost-effective, inpatient as well as outpatient costs, the development of sick days due to BP and the overall health status of the participants enrolled in the aforementioned study were collected and analysed in detail.
By isolating BP specific costs, conclusions about long-term effectiveness on BP outcomes can be drawn. Incremental costs in addition to the effectiveness of the MBR are calculated, complemented by insights about the effect on sick leave due to BP. This is the first cost-utility evaluation, in accordance with the CHEERS statement, of a long-term MBR for patients with CLBP in Germany.

Target population and subgroups
The present economic evaluation reuses data from a two-arm randomised controlled evaluation study in parallel-group Zelen's design, for which recruitment took place between April and October 2015 [31,32], and supplements them with administrative data on direct health costs. Searches in the database of the PHI identified adults with administrative indications of CLBP. The full inclusion criteria can be found elsewhere [32].
One of the main results of the previous effectiveness analysis was that the effect of the intervention depends strongly on the level of impairment due to BP [32]. Therefore, we divided participants into subgroups based on their overall result in the Chronic Pain Grade Questionnaire [33,34]. Two parameters were used to classify BP severity levels: the characteristic pain intensity (score 0-100), calculated as the mean of the current, average and maximum pain intensity, and the pain-related impairment (0-6 points), calculated from the number of impairment days and the extent of the impairment experienced in everyday life, leisure and work. These led to four hierarchical grades of the Graded Chronic Pain Scale (GCPS): Grade I, low disability-low intensity; Grade II, low disability-high intensity; Grade III, high disability-moderately limiting; and Grade IV, high disability-severely limiting [33].
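The grading logic can be sketched in a few lines of Python. The cut-offs used here (fewer than 3 disability points for Grades I and II, an intensity score of 50 separating Grade I from Grade II, 3-4 points for Grade III and 5-6 points for Grade IV) are not stated in this paper; they are assumed from the commonly used scoring of the chronic pain grade and should be checked against [33]. The function names are hypothetical, and the pain-free Grade 0 is not modelled.

```python
def gcps_grade(pain_intensity: float, disability_points: int) -> int:
    """Return the GCPS grade (1-4 for Grades I-IV).

    pain_intensity: characteristic pain intensity, 0-100.
    disability_points: pain-related impairment, 0-6 points.
    Thresholds are an assumption taken from the usual scoring scheme.
    """
    if not 0 <= pain_intensity <= 100:
        raise ValueError("pain_intensity must lie between 0 and 100")
    if not 0 <= disability_points <= 6:
        raise ValueError("disability_points must lie between 0 and 6")
    if disability_points >= 5:
        return 4  # high disability, severely limiting
    if disability_points >= 3:
        return 3  # high disability, moderately limiting
    # Low disability: the intensity score decides between Grade I and II.
    return 2 if pain_intensity >= 50 else 1


def impairment_group(grade: int) -> str:
    """Collapse the four grades into the two subgroups used in the analysis."""
    return "minor" if grade <= 2 else "major"


print(impairment_group(gcps_grade(40, 1)), impairment_group(gcps_grade(70, 4)))
```

Note that disability points take precedence over intensity: a participant with 3 or more points is graded III or IV regardless of the intensity score.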
In the following analysis, the grades were combined: Grades I and II as minor impaired (functional chronic pain) and Grades III and IV as major impaired (dysfunctional chronic pain) due to BP. Table 1 shows the characteristics of the population before and after data processing and matching (see analytic methods). After the processing, participants in the intervention group (IG) and in the control group (CG) are almost equally distributed. Group size (n: IG = 112, CG = 111) and characteristics are very similar in both groups at the beginning of the study period, so that the costs incurred can be compared well.
Health care coverage in Germany is divided. About 90 % of the population are members of the statutory health insurance (SHI), and 10 % are privately insured. The latter entails a comprehensive health insurance plan that serves as a substitute for the mandatory SHI. Usually, members of the PHI are self-employed, civil servants or employees with a salary above the compulsory insurance threshold (currently €64,350). In general, members of the PHI belong to a higher socio-economic class.
Due to the nature of the system ('lock-in effect'), there is little fluctuation within the membership [37]. The insurer has an interest in healthy members and therefore considers investments over a longer time period. The payer is therefore interested in optimal, evidence-based care for its insured members. Unlike for other chronic diseases, no uniform, guideline-based disease management programme (DMP) for BP exists in Germany yet [38]. Individual insurance companies have developed and implemented different approaches, which are, however, rarely evaluated scientifically [32,39]. As an innovation driver moving from payer to health care partner, a PHI company in Germany can pave the way and test the effectiveness and efficiency of a structured treatment programme [40]. This could be a good blueprint for the introduction of a DMP for BP in the SHI.
This study was set within the Generali Germany Health Insurance. In 2019, Generali listed 308,088 fully insured members and 1,431,522 with supplementary insurance [41] and was one of the largest PHI providers in Germany. Only fully insured members were eligible to participate in the offered health programme.

Study perspective
The analysis was conducted from the payer's perspective. Total health costs were evaluated, as well as costs that could be attributed to chronic back pain (ICD M40-M54 diagnosis).
The costs were divided into outpatient and inpatient costs. In order to achieve better comparability with the statutory system, additional elective services (e.g. the supplement for a one- or two-bed room) and the entire costs of dental treatment were excluded. Costs from the following areas were included: general hospital services, GP and specialist care, medicines, remedies, alternative practitioners (e.g. chiropractor), aids and private medical treatment.
In a PHI setting, reimbursement of health care bills depends on the respective tariff. The study population consisted of fully insured participants as well as policyholders eligible for governmental aid. Therefore, the billed amount was used as the cost, not the amount refunded by the insurance.
Thus, the actual costs were compared with each other without taking into account which payer (health insurance, subsidy or individual supplementary) reimbursed the costs.
In order to allow further analysis, the number of sick days due to BP was also analysed. Since not every participant in the study held a daily sickness benefit insurance (in contrast to the statutory system), no monetary value was assigned to the days of sick leave.

Comparators
The intervention combined two care components. BP was treated by networks of general practitioners, orthopaedic surgeons and pain therapists who worked according to the FPZ concept and were located as close as possible to the patient's home [42]. This concept is based on a training programme for the spine-stabilising musculature developed at the German Sport University Cologne, which has been further advanced by the Cologne Research and Prevention Centre (FPZ) for the reconditioning of patients with sub-acute and chronic back pain (for details, see http://www.fpz.de). A therapy plan, drawn up following a functional biomechanical analysis, comprised up to 24 one-hour equipment-supported training units in FPZ back centres to build up the spine-stabilising musculature. After completing the training, IG members could receive €100 twice as a movement bonus for the use of further freely selectable health sport offers.
Each participant received personal telephone support from an external health coach provided by Thieme TeleCare. The aim of the coaching, which initially accompanied the therapy and was then followed up for six months, was to support behaviour changes, thus contributing to the continuation of physical activities.
Members of the CG did not receive any offer for training or coaching but were treated according to current practice, receiving standard care, for which recommendations exist in the National Clinical Practice Guideline for Non-Specific Low Back Pain [43].

Time horizon
The original study had a duration of 24 months [32]. The present cost comparison covered a period of four years. From the individual start time of the study (t-0), the costs were compared 24 months before and 24 months after the index date. Since Difference-in-Difference (DiD) estimation was chosen as the method of comparison (see methods section), the pre-intervention and post-intervention observation periods were selected to be equal.
The follow-up period of 24 months was chosen to compare the measured endpoint of health developments with the economic effects, thus calculating ICER using QALYs. No further tracking of cost developments into the future was carried out, as no longer-term data on the individual health status of the participants was collected.

Discount rate and currency
Following the recommendations of the German Institute for Quality and Efficiency in Health Care (IQWiG), the discount rate was set at 3 % per year [44]. Sensitivity analyses were performed for discount rates of 0 % and 5 %.
The results varied only slightly with the discount rate applied. Since IQWiG recommends 3 % and the overall results did not change much, only the results with 3 % discounting are presented in the analysis. Changes in statistical significance due to discounting are marked.
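To illustrate how the discount rate enters the calculation, the sketch below discounts a stream of yearly amounts at 0 %, 3 % and 5 %. The paper does not state the exact convention; this example assumes annual discounting in which the first follow-up year is left undiscounted and the second is discounted by one year. The function names and the amounts are hypothetical.

```python
def present_value(amount: float, rate: float, years_from_index: int) -> float:
    """Discount a single amount back to the index date."""
    return amount / (1.0 + rate) ** years_from_index


def discounted_total(yearly_amounts, rate: float = 0.03) -> float:
    """Sum yearly amounts after discounting.

    Assumption: the first year is undiscounted (t = 0), the second year is
    discounted by one year (t = 1), and so on.
    """
    return sum(present_value(a, rate, t) for t, a in enumerate(yearly_amounts))


# Hypothetical yearly costs of 1,000 each over a two-year follow-up.
for rate in (0.0, 0.03, 0.05):
    print(f"{rate:.0%}: {discounted_total([1000.0, 1000.0], rate):.2f}")
```

The same function can be applied to yearly QALY gains, since costs and effects are typically discounted at the same rate.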
All costs were converted to 2017 Euros (€) using consumer price indices.

Choice of health outcomes
Outcomes of patients receiving standard care were compared with those of a cohort receiving standard care combined with the intervention over two periods of 24 months. The dataset provided by the insurance contained longitudinal patient-level information on medical diagnoses, direct medical costs (in 2017€) and healthcare utilisation between 2010 and 2017.
Values to calculate the QALYs, the number of sick days and the health status at the beginning of the study and after 24 months were obtained through an online questionnaire. The exact method is described elsewhere [32].
The study outcomes can be divided into direct medical costs (overall as well as BP specific), the number of sick days due to BP ("Approximately how many days in the last six months were you unable to carry out your normal activities (work, school/study, housework) due to your back pain?") and the overall health developments (EQ-5D). BP related costs were identified through ICD-10 M40 to M54.9 diagnoses. All outcomes represent the average values over the 24-month baseline and follow-up period.
To calculate the QALYs, the EQ-5D was used. Since the original study used the SF-12, the EQ-5D value was calculated using Lawrence's algorithm [45].

Study Size
A total number of 189 participants in the IG and 254 participants in the CG took part in the reference study [32]. After data cleaning and matching, 112 participants in the IG and 111 in the CG were included for further analysis. The selection criteria are shown in Table 2.
Participants who did not exercise physically due to a large distance to the training centre were excluded from the base-case analysis (n=34) but controlled for in the second sensitivity analysis [1]. Participants who dropped out at a later stage (while training or during coaching) were treated according to the intention-to-treat principle and kept in the study. Four further participants had to be excluded. One participant from the IG was excluded as he enrolled himself in the programme a second time before the end of the follow-up period. In the CG, three participants had to be excluded as they enrolled themselves in the intervention before the end of the study period.
As the data was provided by a PHI, there was the additional obstacle of managing the individual yearly threshold of costs before payment of expenses (deductible). To reduce the bias introduced by the tariff, 27 participants who did not hand in an invoice in one of the four examined years were excluded (yearly average of invoices = 42).
Data management and statistical analysis were carried out using the software R 3.6.0 [47] together with the packages listed in the bibliography [48-54].

Difference-in-Difference regression
The aim of the DiD method is to estimate the average effect of treatment on the treated (ATT). Changes in cost over time between the IG and the CG were compared using a regression model with an interaction term between period and treatment (Y = β0 + β1*[Period] + β2*[Treatment] + β3*[Period*Treatment]), in which β3 is the DiD estimate of the ATT. The outcomes in the baseline period were measured over the two years before the respective index date. The time-series dimension of the two-year baseline period was removed by averaging the values over the two years to avoid biased standard errors due to serial correlation [55]. Thus, a single value per outcome measure was generated for the baseline period and for the follow-up period.
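For the model without further covariates, the least-squares estimate of the interaction coefficient β3 equals the change in the treated group minus the change in the control group. The following Python sketch computes it from the four cell means using only the standard library; it is an illustration with hypothetical toy data, not the R code used in the study, and it does not produce standard errors or p-values. The name `did_att` is a hypothetical helper.

```python
from statistics import mean


def did_att(y, period, treatment):
    """Difference-in-Difference estimate of beta3 from the 2x2 cell means.

    y: outcome per observation (e.g. cost per participant and period).
    period: 0 for baseline, 1 for follow-up.
    treatment: 0 for control group, 1 for intervention group.
    """
    def cell_mean(p, t):
        values = [yi for yi, pi, ti in zip(y, period, treatment)
                  if pi == p and ti == t]
        if not values:
            raise ValueError(f"no observations for period={p}, treatment={t}")
        return mean(values)

    change_treated = cell_mean(1, 1) - cell_mean(0, 1)
    change_control = cell_mean(1, 0) - cell_mean(0, 0)
    return change_treated - change_control


# Hypothetical costs: controls rise from 1,100 to 1,200 on average,
# treated participants fall from 1,600 to 1,100 on average.
y = [1000, 1200, 1100, 1300, 1500, 1700, 1000, 1200]
period = [0, 0, 1, 1, 0, 0, 1, 1]
treatment = [0, 0, 0, 0, 1, 1, 1, 1]

print(did_att(y, period, treatment))  # prints: -600
```

A negative value indicates that costs developed more favourably in the intervention group than in the control group over the same period.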
To examine whether the costs developed equally over time (parallel trend assumption) [56], BP specific costs were plotted over a time horizon of four years prior to the intervention. Outcomes were considered quarterly to test the parallel trend over 16 data points. The graph (Fig. 1) shows that the parallel trend assumption could be accepted. BP specific costs in both groups developed similarly before the start of the intervention. The average cost before the intervention could be divided by BP impairment.
Propensity score matching
Table 1 shows that IG and CG differed in the distribution of participants according to their pain levels. As the pain stage has been shown to have an impact on BP specific costs [5], nearest neighbour propensity score matching (PSM) with a caliper of 0.1 was performed to achieve adequate covariate balance at baseline [57]. Covariates used for the matching were sex, CCI, GCPS, STarT-Back, health cost before the intervention, age, sick days prior and risk for chronic BP as identified in the data selection [32].
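The matching step can be illustrated with a short Python sketch. It assumes that propensity scores have already been estimated (for example with a logistic regression on the covariates listed above) and applies greedy one-to-one nearest neighbour matching without replacement. Whether the study matched with or without replacement, and whether the caliper of 0.1 refers to the raw propensity score or to its standard deviation, is not stated in the text; the sketch assumes the raw score. The function name and the scores are hypothetical.

```python
def match_nearest_neighbour(treated_scores, control_scores, caliper=0.1):
    """Greedy 1:1 nearest neighbour matching without replacement.

    Returns a list of (treated_index, control_index) pairs. Treated units
    whose nearest available control lies outside the caliper stay unmatched.
    Assumption: the caliper is an absolute distance on the propensity score.
    """
    available = set(range(len(control_scores)))
    pairs = []
    for i, score in enumerate(treated_scores):
        if not available:
            break
        # Nearest remaining control; ties are broken by the lower index.
        j = min(available, key=lambda k: (abs(control_scores[k] - score), k))
        if abs(control_scores[j] - score) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs


# Hypothetical propensity scores.
treated = [0.30, 0.62, 0.90]
controls = [0.28, 0.35, 0.60, 0.10]

print(match_nearest_neighbour(treated, controls))  # prints: [(0, 0), (1, 2)]
```

In this toy example the third treated unit remains unmatched because no control lies within the caliper, which is how matching can reduce the analysed sample size.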
As DiD regression is sensitive to high-leverage observations [58], extreme outliers (BP specific cost before intervention ≥ €13,000) were excluded (n = 25). After excluding high-cost cases, the regression results became more robust.

Analysis of covariance (ANCOVA)
A one-sided covariance analysis (ANCOVA) was carried out for the outcomes of secondary interest here, i.e. the reduction of sick days due to BP and the general state of health (EQ-5D).
ANCOVA is a common, statistically robust method with model assumptions that should be respected (e.g. linearity between the covariate and the outcome, homogeneity of regression slopes, normally distributed outcome variable, homoscedasticity of residual variance) [59]. In the analysis of the EQ-5D, all model assumptions of the ANCOVA were met. For the range of sick days, however, the residuals were not normally distributed. This was, however, negligible as violations of these assumptions do not decisively . This was not the case here. Therefore, an ANCOVA can be applied.
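As an illustration of what the ANCOVA estimates, the sketch below computes the covariate-adjusted difference between two groups, which equals the coefficient of the group indicator in a linear model of the form outcome ~ group + covariate. The text does not state which covariates were used; this example assumes a single covariate, the baseline value of the outcome. It returns the point estimate only, without test statistics, and the function name and data are hypothetical.

```python
from statistics import mean


def ancova_adjusted_difference(baseline, outcome, group):
    """Adjusted mean difference (group 1 minus group 0) in `outcome`,
    controlling for `baseline` with a common (pooled within-group) slope."""
    data = {0: ([], []), 1: ([], [])}
    for x, y, g in zip(baseline, outcome, group):
        data[g][0].append(x)
        data[g][1].append(y)

    sxy = 0.0
    sxx = 0.0
    means = {}
    for g, (xs, ys) in data.items():
        mx, my = mean(xs), mean(ys)
        means[g] = (mx, my)
        sxy += sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sxx += sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("covariate has no within-group variation")

    slope = sxy / sxx  # pooled within-group regression slope
    (mx0, my0), (mx1, my1) = means[0], means[1]
    # Raw difference in outcome means, corrected for the baseline imbalance.
    return (my1 - my0) - slope * (mx1 - mx0)


# Hypothetical sick days: follow-up = 2 + baseline - 5 * group.
baseline = [10, 20, 30, 20, 30, 40]
followup = [12, 22, 32, 17, 27, 37]
group = [0, 0, 0, 1, 1, 1]

print(ancova_adjusted_difference(baseline, followup, group))  # prints: -5.0
```

The raw difference in follow-up means in this toy data is +5 days, but after adjusting for the higher baseline values in group 1 the estimated effect is -5 days, which is why baseline adjustment matters when groups differ at the start.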
[1] Invitations were not controlled for the distance between the FPZ training centre and home. Some participants therefore only realised after their enrolment that the distance to the training centre was not manageable on a regular basis. The drop-out of these participants did not depend on their pain levels, as the percentage was comparable, with 4.7 % and 6.6 % dropout in the respective pain groups.

Results
Incremental costs and outcomes
Table 3 presents the cost data of the 223 participants analysed. Total medical costs were reduced in the IG and slightly increased in the CG, resulting in a statistically non-significant difference of -€769.16 (p = 0.78). BP specific costs were reduced in the IG and slightly increased in the CG, resulting in an ATT of -€1,096.56 (p = 0.039). The main driver of this reduction was inpatient costs, with an ATT of -€717.00 (p = 0.025). The number of sick days due to BP in the last six months was reduced in the IG, whereas it increased in the CG. The estimated mean treatment difference was -17.5 days (p = 0.001) in favour of the IG. The development in the EQ-5D was more positive in the IG than in the CG (0.046, p = 0.026). The sick-day and EQ-5D results were not calculated with a DiD regression but with an ANCOVA.
Table 3 - Discounted outcomes for the intervention (IG) and control group (CG) in the baseline (2 years) and follow-up period (2 years) with the respective Difference-in-Difference estimator and its standard error (SE)

Subgroup analysis
In general, BP specific costs differ depending on GCPS status: the higher the status, the higher the costs [5]. To see whether this applied to this data set, IG and CG participants were divided according to their BP impairment into minor and major impaired groups, as presented in Table 4.
In the minor impaired group, neither the total medical costs (€570.09, p = 0.85) nor the back specific costs (-€538.10, p = 0.42) differed significantly. Sick leave was significantly lower in the IG than in the CG (-9.49 days, p = 0.024). Overall health status improved in both groups, but the difference between them did not reach statistical significance (0.039, p = 0.119).
Calculating the ratio with cost differences in BP specific costs only reduces the ICER to €4,652 for the main group and to -€2,627 for the subgroup of major impaired participants.

Characterising uncertainty
One source of uncertainty was the BP specific costs. These were counted as soon as a diagnosis of the ICD group M40 to M54 was indicated on the bill. However, other diagnoses were often listed on the same invoices, resulting in ambiguity in cost allocation. In cases of doubt, it was not possible to distinguish precisely which costs were to be allocated to which diagnosis on the bill. This problem concerned members of IG and CG in equal shares and arose mainly in the outpatient sector. In order to check whether cost-intensive co-diagnoses were coded more frequently in one group, the BP specific invoices were analysed additionally.
Table 5 - Characterising uncertainty of BP specific invoices comparing median of BP invoices and distribution and frequency of co-listed diagnosis at baseline and follow-up period
The main comorbidities were the same in the first two groups and only differed from rank four onwards.
In the group of other M diagnoses, gonarthrosis was mentioned most frequently, in the E group diabetes mellitus, in the I group mainly essential hypertension, in the R group mainly other chronic pain, in the K group fatty liver, in the G group polyneuropathy and in the F group an unspecified depressive episode. Table 5 shows how many invoices with M40-54 diagnoses were taken into account on average, how many invoices contain only an M40-54 diagnosis and the most frequently listed comorbidity groups.
As can be seen in Table 5, the included bills are highly comparable to each other. It is noticeable that the IG had a median of two more bills containing an M diagnosis before the start of the programme. In the follow-up, however, this difference had diminished.
Given the minor differences in the distribution of the M diagnoses and the co-listed groups, it can be assumed that the BP specific costs in both groups were not influenced by a wrong allocation.

Characterising heterogeneity
The results of the main analysis indicated that participation in an MBR could save costs in the long run. The reduction in BP specific costs is significant both in the overall comparison and in the subgroup with major impairment. Due to the large standard deviation, the effect on total costs could not be clearly interpreted. In order to understand the development of the total costs better, a sensitivity analysis was carried out in which only participants indicating an improvement in BP (measured as a lower STarT-Back raw value accompanied by no regression in GCPS) were included ("profiteer" subgroup).
A second sensitivity analysis was conducted with all participants who enrolled in the programme, regardless of whether they finished all components or not (ITT). Those who could only take part in the coaching and not in the physical training were explicitly included.
As matching in itself was, due to the small study group, a possible source of bias, a third sensitivity analysis compared the results without PSM.
Sensitivity Analysis I: Results of the outcome-based analysis
Table 6 presents the first sensitivity analysis with participants who improved their BP. In the IG, 62 of the 111 (56 %) participants achieved an improvement of their BP after two years. In the CG, only 44 of 112 (39 %) did. The total medical costs in the baseline period of the "profiteer" subgroup showed a range of €118,000. To exclude outliers, the groups were divided into low-cost and high-cost members (quantiles 0.1-0.5 and 0.51-0.9, respectively).
In the low-cost group, no significant difference in costs between IG and CG could be identified. Only the improvement in overall health status differed, in favour of the IG (0.11, p = 0.034).
In the high-cost group, total costs were reduced by €5,431 (p = 0.08) in the IG compared to the CG (undiscounted: -€6,762, p = 0.04).
The number of sick days decreased significantly in the low- and high-cost groups, and to a greater extent in the IG than in the CG (low: -5.7, p = 0.44; high: -18.3, p = 0.06). The difference between IG and CG is statistically significant at the 0.1 level for the high-cost group. These results were not calculated with a DiD regression but with an ANCOVA.
Table 6 - Sensitivity analysis of participants who improved their BP with the respective Difference-in-Difference estimator (ATT) and its standard error (SE) in the baseline and follow-up period
Sensitivity Analysis II: Results Intention-to-treat
Table 7 shows the second sensitivity analysis with all participants who enrolled in the programme, regardless of whether they finished all components or not (ITT). The second sensitivity analysis showed no significant treatment effect on costs or health. The reduction of sick leave due to BP was significant. In the complete group, the IG had an ATT of 13.6 days (p = 0.002). The major impaired group (n = 62) reduced their sick leave due to BP by 22 days (p = 0.012) more than the CG (n = 57). In contrast, the minor impaired group (n = 74) reduced their sick leave due to BP by 7.2 days (p = 0.085) compared to the CG (n = 80).
Since the results and their significance level seem to depend on the matching, a third sensitivity analysis (second part of Table 7) was run with the original evaluation group [32]. The complete data preparation was omitted (no consideration of dropouts, tariff, or high-cost cases). Only those with no billing information over the course of four years (n = 8) were excluded. In the overall population and the major impaired group, there was no significant cost difference between participants and non-participants in the programme. The minor impaired group had a favourable ATT of €4,982 (p = 0.065).
The health status and the number of sick days were in favour of the IG. In the whole group, the ATT for sick days was close to 13 days (p = 0.001), and the health gain was 0.03 (p = 0.039). In the major impaired group, participation in the health programme reduced sick days by almost 23 days (p = 0.001), and the health status increased by 0.05 points (p = 0.030).

Summary
The evaluation presented included real-world evidence from an outpatient, long-term MBR with behaviour-change coaching and device-supported exercise for chronic BP in the health care system of a PHI company in Germany between 2013 and 2017. It was demonstrated that the classification of the participants according to their individual degree of impairment at the beginning, represented by the GCPS, is a very good discriminator of effectiveness.
Significant cost savings were achieved in the isolated BP specific costs, especially in the major impaired group (-€1,732). It is noteworthy that participants in the IG needed at least one additional doctor's visit before the start of the training because a medical prescription for the FPZ training was required, often combined with an MRI to rule out medical contraindications. Despite these extra charges, the IG still reduced BP related costs significantly, indicating that the intervention saves costs in the longer term.
This is in line with the results of Müller et al. [27]: in a multimodal intervention with a similar exercise therapy in Germany's SHI, the savings depended on the BP impairment at baseline. Their intervention was particularly cost-saving for participants with GCPS grade IV. In contrast to Müller's study, the intervention examined here can be classified as an MBR with fewer training units but with an additional coaching component that also focuses on behavioural change. We conclude that BP specific cost savings can already be achieved from GCPS grade III onwards.
With a QALY gain of 0.046, the intervention achieved the second-best benefit compared to all forms of therapy presented by Herman et al. [15]. The MBR duration of one year resulted in high programme costs of €1,500. Savings of €769 in total costs and €1,097 in BP specific costs resulted in ICERs of €8,428 and €4,652 per QALY gained, respectively. The intervention considered in this study can therefore be classified as cost-effective.
If the insurance companies were able to steer participants better before enrolment and assign them appropriately to the minor or major impaired group, a dominant intervention with an ICER of -€6,861 could be achieved. Herman et al. have shown that most interventions for CLBP are cost-effective from the perspective of the payer, whereas they are dominant from the perspective of society [15]. Focusing on direct medical costs results in cost-effectiveness, but additionally considering the days of sick leave saved (-17.5 in the last six months) turns the intervention into a cost-saving instrument from society's perspective. Furthermore, it could be shown that consistently using the GCPS to determine participation in the programme makes it possible to create a dominant intervention from the payer's point of view.
The study observed that BP persisted after two years in 61 % of the participants in the CG, who received standard care. This is consistent with BP being considered a long-term condition with a variable course [65]. The exact recurrence rate is unclear, but 33 % to 67 % of people with BP can be expected to have permanent recurrent episodes [6]. The present study adds to the existing knowledge that if BP is improved, the outcome differs strongly in terms of cost. The total costs are reduced in both "profiteer" groups. However, the IG has more savings in total costs than the CG (-€5,431). By focusing on exercise and self-efficacy, the high total cost can be significantly reduced. Judging from a pure cost perspective, the payer should increase its efforts to persuade the severely affected and cost-intensive insured members to participate in an MBR, and find an alternative offer for the minor impaired and low-cost groups.
However, if the perspective of society is also taken into account, participation in a back programme is worthwhile for all those affected. Across all groups and sensitivity analyses, the reduction in sick days due to BP was significantly higher in the IG than in the CG (ranging from 9.5 to 27 days). Wagner et al. reported a reduction of 44.3 days in the duration of incapacity to work and BP specific savings of €1,284 (daily sickness allowance excluded) after the completion of a 20-day short-term IMPT [13]. The effects were observed after one year; however, their programme is more intensive in time and costs, with more than 100 hours of treatment and twice the expenses. The target group is comparable to the major group, and the results are similar, even if the period under consideration is different. If one takes into account that there was a reduction of 27 sick days in the last six months, it can be assumed that a long-term MBR is not inferior to a focused, more expensive IMPT in terms of sick days and cost reduction. Considering the results of the major impaired group, this intervention can become superior in access, cost savings and reduction in sick leave if the provider finds a way to allocate participants in a more targeted way.
Sensitivity analyses II and III indicate that the payer should also aim to ensure that the MBRs created are carried out in full and analysed thoroughly with regard to cost effects. The special cost background of the privately insured, with the possibility of not submitting incurred costs on the one hand and the heterogeneity of the population with regard to the overall health burden on the other, needs to be taken into account. This reinforces the need to conduct the cost analysis with a larger population that can be separated more clearly.
The sensitivity analyses nevertheless confirm the positive effect of the intervention on sick days and general health, which suggests that these effects are robust.
A DMP for BP in Germany is long overdue. It has been proven that significant improvements in health and savings of direct and societal costs are possible through improved management of CLBP. With an excellent QALY gain, cost savings in the treatment of BP and high reductions in sick days, the present cost-effective MBR can serve as a blueprint for a treatment programme that applies the clinical guidelines and is both medically and economically effective.

Limitations
The study has three limitations: the calculation of BP specific costs, the sample size and the time perspective.
The first limitation is the calculation of the BP specific costs. By analysing the comorbidities in detail, an attempt was made to keep the influence of other diseases as low as possible. A residual doubt remains that could only be remedied in this setting by focusing solely on inpatient visits due to BP. Since a large proportion of the costs occur in the outpatient sector and only a small number of the study population was treated as inpatients, this was not done here in the interest of group size. With a larger group, considering only inpatient costs could be a feasible approach to clarifying BP specific costs. It should also be noted that higher total costs are to be expected in the SHI system. In contrast to the PHI, sickness benefits must be paid after a certain period of absence. Therefore, it can be assumed that the cost-benefit effect of a long-term MBR treatment would be better in the SHI.
Even though the sample size of 443 participants was larger than in most other published studies [9], it was still challenging to present clear results, which is the second limitation. The sample size was reduced by the exclusion of dropouts, switchers, tariff-related exclusion, truncation and PSM. In the course of the analysis, the groups, which were still heterogeneous in cost, needed to be separated according to BP status. The analysis suggests savings from participation in the MBR in many places but occasionally does not meet significance levels. Thus, those findings should be verified in a study with a larger sample size.
The third limitation is the time perspective. A study period of four years exceeds other published analyses but might still exclude savings that occur further down the line. Generali's business case is calculated for five years after the start of the programme, which suggests that more savings are expected later on. For a complex intervention like the one in the present study, the Medical Research Council recommends a lifetime horizon to demonstrate the sustainability of outcome changes [66]. Since no information was available on the self-assessed health status after the follow-up period, this could not be done in the present study. However, this is the first health cost-utility study with data from a PHI in Germany on a health intervention that is actually offered, run and followed up thoroughly and accompanied by a trial study. The presented results therefore are able to address the aforementioned uncertainty gap about the cost-effectiveness of long-term MBR.

Generalisability
The results can be generalised to a limited extent. The data used was provided by a PHI provider, which in general insures healthier individuals than an SHI [67]. It can be assumed that the high improvement rate in BP in the CG is due to the composition of health insurance risk profiles. Good insurance risks are known to take care of their own health and believe in self-efficacy [68,69]. The effect in favour of the IG might be higher in the general population, where a more passive CG can be expected.
With a gender ratio of 65 (M) to 35 (F), there is a male surplus, which is not representative of the overall German population but can be considered representative for PHI [70]. To achieve better comparability with the statutory system, specific individual costs (e.g. one- or two-bed room) were not considered. Additionally, the influence of the individual's tariff was kept as low as possible by taking the billed amount as cost and not the respective amount of refund.
The analysis was conducted from the payer's point of view; indirect medical costs incurred by the individual, their relatives, society or the employer were not taken into account. The apparent savings in sick days and the improvement in general health status suggest that further savings beyond direct medical costs are conceivable.
The data used in the analysis was obtained from a single clinical trial, which potentially limits the generalisability of the findings for several reasons [71]. However, since the study was multicentric and the participants and study centres were spread across Germany, this factor was mitigated.
The effects shown in this study using a relatively small sample size suggest that participation in an outpatient MBR with behaviour-change coaching and device-supported exercise leads to cost savings, a reduction of sick days and improvements in health status. The different effects were backed up with extensive sensitivity analyses, which increased the robustness of the findings. To remove the remaining doubt, an evaluation with a larger group is recommended.

Conclusion
This is the first cost-utility study with combined data from a PHI and a controlled trial in Germany to demonstrate that long-term MBR is a cost-effective instrument in the treatment of CLBP at 24-month follow-up. Subgroups with major impairment measured by Korff's Chronic Pain Grade Questionnaire benefit more from the intervention than patients with minor impairment due to BP. When the findings are compared to more expensive IMPT in terms of sick days and cost reduction, the intervention analysed here can become superior in access, cost savings and reduction in sick leave. It is therefore recommended to use MBR for the major impairment group. For the minor impaired, another solution needs to be developed. Providers should try to identify BP impairment in their administrative data so that they can make targeted offers. It should be stressed that MBR significantly reduces sick leave in minor and major impaired participants in the long term, making it a profitable intervention from society's point of view. However, it remains to be seen how those positive results develop over a lifetime horizon.

Declarations
Funding: In this study, data from the published medical evaluation are linked with economic administrative data to carry out a cost-benefit analysis. The medical evaluation study was funded by the Generali Deutschland Krankenversicherung AG. It paid a grant for the scientific evaluation of the programme to the researchers of the University of Lübeck. publicly available. The data is however available from the authors upon reasonable request and with permission including a signed data access agreement of Generali Deutschland Krankenversicherung AG and University of Lübeck.
Code availability: Data management and statistical analysis were carried out using the software R in application of the packages listed in the bibliography. The code that supports the findings of this study is available from Generali Deutschland Krankenversicherung AG but restrictions apply. It is available from the authors upon reasonable request and with permission including a signed access agreement of Generali Deutschland Krankenversicherung AG.
Authors' contributions: MH planned the analyses, analysed the data, interpreted the results and wrote the manuscript. VA supervised the project. MW contributed to the implementation of the research. JPS verified the analytical methods. PR contributed to the interpretation of the results. All authors provided critical feedback and helped shape the research, analysis and manuscript. All authors read and approved the final manuscript.
Trial registration: The trial of the evaluation study was registered at the German Clinical Trials Register under DRKS00015463 retrospectively (dated 4 Sept 2018).
Ethics approval and consent to participate: The independent research ethics committee of the University of Lübeck gave approval for the medical evaluation study (Re.-No.14 -249, dated 20.11.2014). As the participants had already consented to the usage of the data for further analysis, no new ethics vote was sought for the present evaluation. Written informed consent was obtained from all study participants.
Consent for publication: Not applicable