Target population and subgroups
The present economic evaluation reuses data from a two-arm, parallel-group randomised controlled evaluation study with Zelen's design, for which recruitment took place between April and October 2015 [31, 32], and supplements them with administrative data on direct health costs. Searches in the database of the private health insurance (PHI) identified adults with administrative indications of chronic low back pain (CLBP). The full inclusion criteria can be found elsewhere [32].
One of the main results of the earlier effectiveness analysis was that the effect of the intervention depended strongly on the level of impairment due to back pain (BP) [32]. We therefore divided participants into subgroups based on their overall result in the Chronic Pain Grade Questionnaire [33, 34]. Two parameters were used to classify BP severity: the characteristic pain intensity (score 0-100), calculated as the mean of the current, average and maximum pain intensity, and the pain-related impairment (0-6 points), calculated from the number of impairment days and the extent of the impairment experienced in everyday life, leisure and work. These yielded the four hierarchical grades of the Graded Chronic Pain Scale (GCPS): Grade I, low disability-low intensity; Grade II, low disability-high intensity; Grade III, high disability-moderately limiting; and Grade IV, high disability-severely limiting [33].
In the following analysis, the grades were combined: Grade I and Grade II as minor impaired (functional chronic pain) and Grades III and IV as major impaired (dysfunctional chronic pain) due to BP.
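The grading and the two-level grouping described above can be sketched in Python. This is a minimal illustration, not the study's code: the function names are hypothetical, the three pain items are assumed to be rated 0-10 and rescaled to 0-100, and the cut-offs (intensity 50, disability points 3 and 5) are an assumption taken from the commonly published Von Korff scoring rather than from this section.

```python
def characteristic_pain_intensity(current, average, worst):
    """Mean of three 0-10 pain ratings, rescaled to 0-100 (assumed item scale)."""
    return (current + average + worst) / 3 * 10


def gcps_grade(intensity, disability_points):
    """Return the GCPS grade (1-4) from intensity (0-100) and disability points (0-6).

    The cut-offs are an assumption based on the usual Von Korff scoring.
    """
    if disability_points >= 5:
        return 4  # high disability, severely limiting
    if disability_points >= 3:
        return 3  # high disability, moderately limiting
    # Low disability: the grade depends on pain intensity only.
    return 2 if intensity >= 50 else 1


def impairment_group(grade):
    """Collapse the four grades into the two subgroups used in the analysis."""
    return "minor" if grade <= 2 else "major"


# Hypothetical participant: ratings 4, 5 and 7 with 2 disability points.
grade = gcps_grade(characteristic_pain_intensity(4, 5, 7), disability_points=2)
print(grade, impairment_group(grade))
```

Keeping the grade and the subgroup in separate functions makes it easy to rerun an analysis with a different grouping of the four grades.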
Table 1 shows the characteristics of the population before and after data processing and matching (see Analytic methods). After processing, the intervention group (IG) and the control group (CG) are almost equal in size (n: IG = 112, CG = 111) and very similar in their characteristics at the beginning of the study period, so that the costs incurred can be compared well. The mean age is 55.19 years; 35 % of the participants are female and 65 % male. Of the study population, 53 % are categorised as minor and 47 % as major impaired by BP. According to the Keele STarT Back Screening Tool [35], 55 % are at low, 33 % at medium and 12 % at high risk of persisting disabling symptoms.
Based on the weighted Charlson Comorbidity Index (CCI) score [36], which classifies comorbidities by the likelihood of mortality or high resource use, the groups are comparable at baseline.
Table 1 - Baseline Characteristics of IG and CG before and after data processing
| Characteristic | IG (before) | CG (before) | p (before) | IG (after) | CG (after) | p (after) |
|---|---|---|---|---|---|---|
| n | 189 | 254 | | 112 | 111 | |
| Age (mean (SD)) | 53.86 (8.13) | 54.14 (8.65) | 0.725 | 55.68 (7.34) | 54.69 (8.45) | 0.354 |
| Gender = Female (%) | 70 (31.7) | 103 (40.6) | 0.072 | 39 (34.8) | 40 (36.0) | 0.960 |
| GCPS (%) | | | 0.062 | | | 0.948 |
| I | 64 (33.9) | 115 (45.3) | | 41 (36.6) | 42 (37.8) | |
| II | 26 (13.8) | 36 (14.2) | | 18 (16.1) | 19 (17.1) | |
| III | 52 (27.5) | 49 (19.3) | | 28 (25.0) | 24 (21.6) | |
| IV | 47 (24.9) | 54 (21.3) | | 25 (22.3) | 26 (23.4) | |
| STarT-Back (%) | | | 0.068 | | | 0.942 |
| 1 | 100 (52.9) | 161 (63.4) | | 62 (55.4) | 60 (54.1) | |
| 2 | 64 (33.9) | 71 (28.0) | | 36 (32.1) | 28 (34.2) | |
| 3 | 25 (13.2) | 22 (8.7) | | 14 (12.5) | 13 (11.7) | |
| EQ-5D (mean (SD)) | 0.61 (0.18) | 0.63 (0.20) | 0.253 | 0.61 (0.18) | 0.60 (0.19) | 0.78 |
| CCI | | | 0.661 | | | 0.6 |
| 0 | 76 (49.4) | 127 (50) | | 56 (50) | 47 (42.3) | |
| 1-2 | 54 (35.1) | 97 (38.2) | | 40 (35.7) | 46 (41.4) | |
| 3-4 | 20 (13.0) | 23 (9.1) | | 14 (12.5) | 14 (12.6) | |
| >=5 | 4 (2.6) | 7 (2.8) | | 2 (1.8) | 4 (3.6) | |
Setting and location
Health insurance in Germany is divided into two systems. About 90 % of the population are members of the statutory health insurance (SHI), and about 10 % are privately insured. Private insurance is a comprehensive health insurance plan that substitutes for the mandatory SHI. Members of the PHI are usually self-employed, civil servants or employees with a salary above the compulsory insurance threshold (currently €64,350). In general, members of the PHI belong to a higher socio-economic class.
Due to the nature of the system ('lock-in effect'), membership fluctuation is low [37]. The insurer has an interest in healthy members and therefore considers investments over a longer time period. As the payer, it is therefore interested in optimal, evidence-based care for its insured members. Unlike for other chronic diseases, no uniform, guideline-based disease management programme (DMP) for BP exists in Germany yet [38]. Individual insurance companies have developed and implemented different approaches, which, however, are rarely evaluated scientifically [32, 39]. As an innovation driver moving from payer to health care partner, a PHI company in Germany can pave the way and test the effectiveness and efficiency of a structured treatment programme [40]. This could serve as a blueprint for the introduction of a DMP for BP in the SHI.
This study was set within Generali Germany Health Insurance. In 2019, Generali listed 308,088 fully insured members and 1,431,522 members with supplementary insurance [41], making it one of the largest PHI companies in Germany. Only fully insured members were eligible to participate in the health programme offered.
Study perspective
The analysis was conducted from the payer's perspective. Total health costs were evaluated, as well as costs attributable to chronic back pain (ICD-10 diagnoses M40-M54).
The costs were divided into outpatient and inpatient costs. To improve comparability with the statutory system, additional elective services (e.g. supplements for a single or twin room) and all costs of dental care were excluded. Costs from the following areas were included: general hospital services, GP and specialist care, medicines, remedies, alternative practitioners (e.g. chiropractors), aids and private medical treatment.
In a PHI setting, reimbursement of health care bills depends on the respective tariff. The study population consisted of fully insured participants as well as policyholders eligible for governmental aid. Therefore, the invoiced amount, not the amount refunded by the insurer, was used as the cost. The actual costs were thus compared without regard to which payer (health insurance, subsidy or individual supplementary insurance) reimbursed them.
To allow further analysis, the number of sick days due to BP was also analysed. Since not every participant in the study held a daily sickness benefit insurance (in contrast to the statutory system), no monetary value was assigned to the days of sick leave.
Comparators
The intervention combined two care components. First, BP was treated by networks of general practitioners, orthopaedic surgeons and pain therapists who worked according to the FPZ concept and were located as close as possible to the patient's home [42]. This concept is based on a training programme for the spine-stabilising musculature developed at the German Sport University Cologne, which has been further advanced by the Cologne Research and Prevention Centre (FPZ) for the reconditioning of patients with sub-acute and chronic back pain (for details, see http://www.fpz.de). Following a functional biomechanical analysis, a therapy plan was drawn up comprising up to 24 one-hour, equipment-supported training units in FPZ back centres to build up the spine-stabilising musculature. After completing the training, IG members could twice receive €100 as a movement bonus for further, freely selectable health sport offers.
Second, each participant received personal telephone support from an external health coach provided by Thieme TeleCare. The coaching initially accompanied the therapy and then continued for six months; its aim was to support behaviour change and thus contribute to the continuation of physical activity.
Members of the CG were not offered training or coaching but received standard care according to current practice, for which recommendations exist in the National Clinical Practice Guideline for Non-Specific Low Back Pain [43].
Time horizon
The original study had a duration of 24 months [32]. The present cost comparison covered a period of four years: costs in the 24 months before each participant's individual index date (t0) were compared with costs in the 24 months after it. Since Difference-in-Difference (DiD) estimation was chosen as the method of comparison (see Analytic methods), the pre-intervention and post-intervention observation periods were chosen to be of equal length.
The follow-up period of 24 months was chosen so that the measured health outcomes could be compared with the economic effects and an incremental cost-effectiveness ratio (ICER) could be calculated using quality-adjusted life years (QALYs). Cost developments were not tracked further into the future, as no longer-term data on the individual health status of the participants were collected.
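As a reminder of how an ICER relates cost and health effects, the standard definition (difference in mean costs divided by difference in mean QALYs between IG and CG) can be written as a short Python sketch. The function name and all numbers are hypothetical and are not results of this study.

```python
def icer(cost_ig, cost_cg, qaly_ig, qaly_cg):
    """Incremental cost-effectiveness ratio: extra cost per QALY gained (IG vs. CG)."""
    delta_cost = cost_ig - cost_cg
    delta_qaly = qaly_ig - qaly_cg
    if delta_qaly == 0:
        raise ValueError("ICER is undefined when the QALY difference is zero")
    return delta_cost / delta_qaly


# Hypothetical mean costs (EUR) and QALYs over the 24-month follow-up.
print(round(icer(cost_ig=5200.0, cost_cg=5000.0, qaly_ig=1.30, qaly_cg=1.25)))
```

A negative ratio needs care in interpretation, because it arises both when the intervention is cheaper and more effective and when it is more expensive and less effective.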
Discount rate and currency
Following the recommendations of the German Institute for Quality and Efficiency in Health Care (IQWiG), the discount rate was set at 3 % per year [44]. Sensitivity analyses were performed for discount rates of 0 % and 5 %.
The results varied only slightly with the discount rate applied. Since IQWiG recommends 3 % and the overall results did not change much, only the results for 3 % discounting are presented in the analysis. Changes in statistical significance due to discounting are marked.
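The effect of the three discount rates can be illustrated with a small Python sketch. It assumes, purely for illustration, that costs are aggregated per follow-up year and that costs in the first year after the index date are left undiscounted; the timing convention and the cost figures are hypothetical and not taken from the study.

```python
def discount(amount, years_from_index, rate=0.03):
    """Present value of an amount incurred `years_from_index` full years after the index date."""
    return amount / (1 + rate) ** years_from_index


# Hypothetical costs (EUR) in follow-up year 1 and year 2.
yearly_costs = [1000.0, 1000.0]

# Base case (3 %) and the two sensitivity analyses (0 % and 5 %).
for rate in (0.0, 0.03, 0.05):
    total = sum(discount(cost, year, rate) for year, cost in enumerate(yearly_costs))
    print(f"rate = {rate:.0%}: discounted total = {total:.2f}")
```

With a two-year horizon only the second year is affected, which is consistent with results that change little across the three rates.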
All costs were converted to 2017 Euros (€) using consumer price indices.
Choice of health outcomes
Outcomes of patients receiving standard care were compared with those of a cohort receiving standard care combined with the intervention over two periods of 24 months. The dataset provided by the insurance contained longitudinal patient-level information on medical diagnoses, direct medical costs (in 2017 €) and healthcare utilisation between 2010 and 2017.
Values to calculate the QALYs, the number of sick days and the health status at the beginning of the study and after 24 months were obtained through an online questionnaire. The exact method is described elsewhere [32].
The study outcomes can be divided into direct medical costs (overall as well as BP-specific), the number of sick days due to BP ("Approximately how many days in the last six months were you unable to carry out your normal activities (work, school/study, housework) due to your back pain?") and the overall health development (EQ-5D). BP-related costs were identified through ICD-10 diagnoses M40 to M54.9. All outcomes represent the average values over the 24-month baseline and follow-up periods.
QALYs were calculated from the EQ-5D. Since the original study used the SF-12, the EQ-5D values were derived from it using Lawrence's algorithm [45].
Analytic methods
Study Size
A total of 189 participants in the IG and 254 participants in the CG took part in the reference study [32]. After data cleaning and matching, 112 participants in the IG and 111 in the CG were included in the further analysis. The selection criteria are shown in Table 2.
Participants who did not exercise because of the large distance to the training centre were excluded from the baseline (n = 34) but controlled for in the second sensitivity analysis [1]. Participants who dropped out at a later stage (during training or coaching) were treated according to the intention-to-treat principle and kept in the study.
Table 2 - Data preparation processes for selection of study population
| Data processing steps showing the number of participants | Overall | IG | CG | Used in |
|---|---|---|---|---|
| Evaluation study group | 443 | 189 | 254 | Publication in [32] |
| Exclusion of participants without any billing invoice available | 435 | 186 | 249 | Sensitivity Analysis III |
| Exclusion due to unstable unit treatment | 431 | 185 | 246 | |
| Exclusion due to deductible/tariff | 404 | 172 | 232 | |
| Propensity score matching (Group ITT) + truncation | 273 | 136 | 137 | Sensitivity Analysis II |
| Exclusion of non-exercising participants | 223 | 112 | 111 | Main Analysis |
| Analysis of participants who improved their BP at the end of follow-up | 106 | 62 | 44 | Sensitivity Analysis I |
Considering the stable unit treatment value assumption (SUTVA) for the estimation of treatment effects [46], four further participants had to be excluded. One participant from the IG was excluded because he enrolled in the programme a second time before the end of the follow-up period. In the CG, three participants had to be excluded because they enrolled in the intervention before the end of the study period.
As the data were provided by a PHI, an additional obstacle was the individual yearly deductible, i.e. the cost threshold that must be reached before expenses are reimbursed. To reduce the bias introduced by the tariff, 27 participants who did not submit an invoice in at least one of the four examined years were excluded (yearly average of invoices = 42). Data management and statistical analysis were carried out in R 3.6.0 [47] using the packages listed in the bibliography [48–54].
Difference-in-Difference regression
The aim of the DiD method is to estimate the average effect of treatment on the treated (ATT). Changes in cost over time between the IG and the CG were compared using a regression model with an interaction term between period and treatment: Y = β0 + β1·Period + β2·Treatment + β3·(Period × Treatment), where the interaction coefficient β3 is the DiD estimate. The outcomes in the baseline period were measured over the two years before the respective index date. The time-series dimension of the two-year baseline period was removed by aggregating the values over the two years, which avoids biased standard errors due to serial correlation [55]. Thus, a single value per outcome measure was generated for each of the baseline and follow-up periods.
To examine whether the costs developed equally over time (parallel trend assumption) [56], BP-specific costs were plotted over the four years prior to the intervention. Outcomes were considered quarterly, so that the parallel trend was tested over 16 data points. The graph (Fig. 1) shows that BP-specific costs in both groups developed similarly before the start of the intervention, so the parallel trend assumption was accepted. The average cost before the intervention could also be broken down by BP impairment.
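To make the role of the interaction term concrete, the following Python sketch computes the four coefficients for a model without further covariates. In this saturated two-by-two case the ordinary least squares coefficients equal differences of the four cell means, so no regression library is needed. The function name, the 0/1 coding and the cost values are hypothetical; the study itself fitted its models in R.

```python
from statistics import mean


def did_coefficients(records):
    """Coefficients of Y = b0 + b1*Period + b2*Treatment + b3*(Period*Treatment).

    records: iterable of (period, treatment, cost), with period and treatment coded 0 or 1.
    Without further covariates, the OLS estimates are differences of the cell means.
    """
    cells = {}
    for period, treatment, cost in records:
        cells.setdefault((period, treatment), []).append(cost)
    m = {key: mean(values) for key, values in cells.items()}
    b0 = m[(0, 0)]                                          # CG, baseline period
    b1 = m[(1, 0)] - m[(0, 0)]                              # time trend in the CG
    b2 = m[(0, 1)] - m[(0, 0)]                              # baseline gap IG vs. CG
    b3 = (m[(1, 1)] - m[(0, 1)]) - (m[(1, 0)] - m[(0, 0)])  # DiD estimate
    return b0, b1, b2, b3


# Hypothetical costs (EUR): two participants per group and period.
records = [
    (0, 0, 900.0), (0, 0, 1100.0), (1, 0, 1000.0), (1, 0, 1200.0),
    (0, 1, 1000.0), (0, 1, 1200.0), (1, 1, 900.0), (1, 1, 1100.0),
]
print(did_coefficients(records))
```

In this toy data set the CG costs rise by 100 while the IG costs fall by 100, so the interaction coefficient is negative, i.e. a cost reduction relative to the control trend.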
Propensity score matching
Table 1 shows that, before matching, the IG and the CG differed in the distribution of participants according to their pain levels. As the pain grade has been shown to have an impact on BP-specific costs [5], nearest neighbour propensity score matching (PSM) with a caliper of 0.1 was performed to achieve adequate covariate balance at baseline [57]. The covariates used for the matching were sex, CCI, GCPS, STarT-Back, health costs before the intervention, age, prior sick days and risk for chronic BP as identified in the data selection [32].
As DiD regression is sensitive to high-leverage observations [58], extreme outliers (BP-specific costs before the intervention ≥ €13,000) were excluded (n = 25). After excluding these high-cost cases, the regression results became more robust.
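The idea of nearest neighbour matching with a caliper can be sketched as follows. This is a simplified, greedy one-to-one version without replacement that starts from already estimated propensity scores. It is not the R implementation used in the study: the function name and scores are hypothetical, and the caliper is applied here on the raw score scale, whereas matching software may define it in standard deviations of the score.

```python
def nearest_neighbour_match(treated, controls, caliper=0.1):
    """Greedy 1:1 nearest neighbour matching without replacement.

    treated, controls: dicts mapping a participant id to a propensity score.
    Returns a list of (treated_id, control_id) pairs; treated participants
    without a control inside the caliper stay unmatched.
    """
    available = dict(controls)
    pairs = []
    # Matching from the highest score downwards is one common ordering (an assumption here).
    for t_id, t_score in sorted(treated.items(), key=lambda item: item[1], reverse=True):
        if not available:
            break
        c_id = min(available, key=lambda cid: abs(available[cid] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control is used at most once
    return pairs


# Hypothetical propensity scores.
treated = {"t1": 0.80, "t2": 0.55, "t3": 0.20}
controls = {"c1": 0.75, "c2": 0.50, "c3": 0.45, "c4": 0.05}
print(nearest_neighbour_match(treated, controls))
```

A narrower caliper improves balance at the price of discarding more participants, which is the trade-off behind the reduced sample size after matching.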
Analysis of covariance (ANCOVA)
A one-sided covariance analysis (ANCOVA) was carried out for the outcomes of secondary interest here, i.e. the reduction of sick days due to BP and the general state of health (EQ-5D).
ANCOVA is a common, statistically robust method whose model assumptions should be respected (e.g. linearity between the covariate and the outcome, homogeneity of regression slopes, normally distributed outcome variable, homoscedasticity of residual variance) [59]. In the analysis of the EQ-5D, all model assumptions of the ANCOVA were met. For the number of sick days, however, the residuals were not normally distributed. This was negligible, as violations of these assumptions do not decisively influence either the probability of a type I error or the statistical power [60, 61]. Covariance analyses are only contraindicated if the regression slopes are heterogeneous, the samples are unequal in size, and the residuals are not normally distributed [62]. This was not the case here. Therefore, an ANCOVA could be applied.
[1] Invitations were not controlled for the distance between the FPZ training centre and home. Some participants therefore only realised after their enrolment that the distance to the training centre was not manageable on a regular basis. The drop-out of these participants did not depend on their pain levels, as the drop-out rates in the respective pain groups were comparable (4.7 % and 6.6 %).
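For intuition, the baseline-adjusted group difference that an ANCOVA estimates can be computed by hand. The sketch below assumes the simplest model, follow-up value explained by group and a single baseline covariate with a common slope; the study's actual covariate set is not stated in this section, and the function name and data are hypothetical.

```python
from statistics import mean


def ancova_adjusted_difference(ig, cg):
    """ig, cg: lists of (baseline, follow_up) pairs, one list per group.

    Returns (pooled within-group slope, adjusted difference IG - CG), which are
    the OLS estimates for the model follow_up = a + tau * group + b * baseline.
    """
    sxy = sxx = 0.0
    group_means = []
    for group in (ig, cg):
        x_bar = mean([x for x, _ in group])
        y_bar = mean([y for _, y in group])
        group_means.append((x_bar, y_bar))
        sxy += sum((x - x_bar) * (y - y_bar) for x, y in group)
        sxx += sum((x - x_bar) ** 2 for x, _ in group)
    slope = sxy / sxx  # common slope, assuming homogeneous regression slopes
    (x_ig, y_ig), (x_cg, y_cg) = group_means
    # Remove the part of the raw difference that is explained by baseline imbalance.
    return slope, (y_ig - y_cg) - slope * (x_ig - x_cg)


# Hypothetical (baseline, follow-up) values per participant.
ig = [(0.0, 3.0), (2.0, 4.0), (4.0, 5.0)]
cg = [(1.0, 2.5), (3.0, 3.5), (5.0, 4.5)]
print(ancova_adjusted_difference(ig, cg))
```

The common slope is exactly where the homogeneity-of-slopes assumption enters: if the slopes differed between groups, a single adjusted difference would no longer summarise the group effect.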