Study Design
This was a cross-sectional study. Participants were insured members of the Generali Deutschland Krankenversicherung AG who signed up for long-term MBR for CLBP between July 2014 and March 2021.
All MBR participants underwent a digital assessment at the beginning of the intervention, which collected, among other things, the information needed to calculate the GCPS. There were two ways to sign up for the programme. The standard path (I) was an invitation sent out by the insurance company based on the specific disease history described below. The alternative path (II) was based on the clients’ own initiative: they directly requested participation in the health programme (hereafter referred to as self-selected). For invited policyholders (I), the CC was available at the date of invitation; it was calculated from the medical bills submitted in the 12 months before the invitation. The CC of insured persons who enrolled within three months of invitation was compared with the GCPS at the point of enrolment. For the secondary outcome, the cost analysis, participants who proactively requested participation (II) were additionally taken into account. Data management and statistical analyses were carried out using the software R [39] and the listed packages [40–43].
Participants
Invited participants (I) were selected according to CC [37]. For the calculation of the CC, the following routinely collected data from the 12 months prior to invitation were taken into account:
- Number of BP specific ICD-10 diagnoses (M40 – M54)
- Incapacity to work due to BP and its duration
- Use of strong opioids (ATC Group: N02A) as an indication of chronic pain and
- Psychiatric ICD-10 F-diagnoses: F32*, F33*, F34.1, F34.8, F34.9, F38, F41.2, F45.4, F48.0, F43.20, F43.21, F43.22, F54, F62.80
The three chronicity classes were assigned as:
1) Without evidence of chronicity:
Two M40 to M54 diagnoses and not CC group 2 or 3.
2) Evidence of risk of chronicity:
Two M40 to M54 diagnoses combined with fewer than two opioid prescriptions and either a) incapacity to work due to an M40 to M54 diagnosis of less than six weeks or b) at least two F diagnoses.
3) Evidence of chronicity:
Two M40 to M54 diagnoses combined with either a) incapacity to work of at least six weeks or b) at least two opioid prescriptions within six months.
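The three class definitions above can be sketched as a small decision function. This is an illustrative Python sketch, not the insurer's implementation (the study itself used R). The function and parameter names are hypothetical, and three readings of the text are assumptions: "two diagnoses" is taken as "at least two", class 2a is taken to require some incapacity to work (more than zero weeks), and the opioid count is taken to refer to prescriptions within the six-month window.

```python
def assign_chronicity_class(n_m_diagnoses, weeks_incapacity, n_opioid_rx, n_f_diagnoses):
    """Return the chronicity class (1, 2 or 3), or None if the minimum
    requirement of two M40-M54 diagnoses is not met.

    Assumptions (not stated explicitly in the text):
    - "two diagnoses" means at least two;
    - class 2a requires more than zero weeks of incapacity to work;
    - n_opioid_rx counts prescriptions within the six-month window.
    """
    if n_m_diagnoses < 2:
        return None  # not eligible for any class

    # Class 3: evidence of chronicity
    if weeks_incapacity >= 6 or n_opioid_rx >= 2:
        return 3

    # Class 2: evidence of risk of chronicity.
    # "Fewer than two opioid prescriptions" already holds here,
    # because two or more would have returned class 3 above.
    if 0 < weeks_incapacity < 6 or n_f_diagnoses >= 2:
        return 2

    # Class 1: two M diagnoses, but neither class 2 nor class 3
    return 1


print(assign_chronicity_class(2, 0, 0, 0))  # class 1
print(assign_chronicity_class(3, 2, 1, 0))  # class 2 via short incapacity
print(assign_chronicity_class(2, 0, 2, 0))  # class 3 via opioid prescriptions
```

Checking class 3 first keeps the classes mutually exclusive without repeating the opioid condition in the class 2 branch.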
As the insurance company subdivided participants by BP severity in the digital assessment and assigned a suitable programme variant, persons with all three CC levels were invited. Therefore, the minimum requirement to be invited was the presence of two ICD-10 diagnoses in the range of M40 to M54 within the last 12 months. Excluded from invitation were individuals with any condition that precluded participation in an intensive physical intervention (e.g., stroke or need for care). The complete exclusion list can be consulted elsewhere [24].
The aim of the study was to validate the CC algorithm. To this end, the GCPS [10] was used as a comparator for the CC. Participants were asked about the duration and intensity of their BP and the resulting impairment in the six months prior to the date of enrolment. Depending on the answers to the seven questions, every participant was assigned one of four GCPS grades:
- Grade I: low disability-low intensity
- Grade II: low disability-high intensity
- Grade III: high disability-moderately limiting and
- Grade IV: high disability-severely limiting.
Variables
Primary Outcome
The primary outcome was the criterion validity of the CC, i.e. an evaluation of the accuracy of the prediction model of BP chronicity classes using claims data as developed by Freytag and colleagues. The GCPS was used as a reference value for the classification of chronicity. The predicted chronicity class (CC) was compared with the actual chronicity grade (GCPS) for all invited participants of the MBR. To compare the four-level GCPS with the three-level CC, the GCPS needed to be reduced by one level. GCPS grades I and II were combined and compared with CC 1 (“without evidence of chronicity”). GCPS grade III was compared with CC 2 (“evidence of risk of chronicity”) and GCPS grade IV with CC 3 (“evidence of chronicity”).
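The reduction of the four GCPS grades to three levels amounts to a fixed lookup. A minimal Python sketch follows, with grades I to IV encoded as the integers 1 to 4; the encoding and the names are assumptions made for illustration.

```python
# GCPS grade (I-IV encoded as 1-4) -> level comparable with CC 1-3
GCPS_TO_THREE_LEVEL = {
    1: 1,  # grade I   -> compared with CC 1
    2: 1,  # grade II  -> compared with CC 1
    3: 2,  # grade III -> compared with CC 2
    4: 3,  # grade IV  -> compared with CC 3
}


def collapse_gcps(grade):
    """Map a GCPS grade (1-4) to the three-level scale used for comparison."""
    return GCPS_TO_THREE_LEVEL[grade]


print([collapse_gcps(g) for g in (1, 2, 3, 4)])  # [1, 1, 2, 3]
```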
In a first step, the correlation between CC and the newly categorised GCPS was assessed using Spearman’s rho rank correlation coefficient with 95 % confidence intervals (CI). Strength of correlation was interpreted as weak (rho < 0.1), modest (rho 0.1 – 0.3), moderate (rho 0.31 – 0.5), strong (rho 0.51 – 0.8) or very strong (rho > 0.8) [44]. In a second step, the agreement between CC and the categorised GCPS was assessed using Cohen’s weighted Kappa. The agreement was interpreted as poor (Kappa < 0.2), fair (Kappa 0.21 – 0.4), moderate (Kappa 0.41 – 0.6), substantial (Kappa 0.61 – 0.8) or almost perfect (Kappa 0.81 – 1) [45].
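Both statistics can be illustrated with a short, dependency-free Python sketch. It is not the code used in the study (which relied on R packages), and it omits the confidence intervals. The text does not say whether linear or quadratic weights were used for Kappa; linear weights are assumed here. All function names are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)  # undefined if either input is constant


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the tie-corrected ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def weighted_kappa(rater_a, rater_b, categories=(1, 2, 3)):
    """Cohen's weighted kappa with linear weights (an assumption)."""
    k = len(categories)
    idx = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[idx[a]][idx[b]] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    obs_dis = exp_dis = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)  # linear disagreement weight
            obs_dis += weight * observed[i][j]
            exp_dis += weight * row[i] * col[j]
    return 1 - obs_dis / exp_dis  # undefined if expected disagreement is 0


cc = [1, 1, 2, 3, 3, 2]
gcps3 = [1, 2, 2, 3, 2, 1]
print(round(spearman_rho(cc, gcps3), 3), round(weighted_kappa(cc, gcps3), 3))
```

Because both scales have only three levels, ties are the norm, which is why the ranks are averaged before the correlation is computed.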
Furthermore, the GCPS and the CC were both dichotomised into severe and non-severe BP cases. Grades I and II were previously defined as functional chronic pain, and Grades III and IV as non-functional chronic pain [10]. To allow easier comparison and interpretation, GCPS grades I and II, which had already been combined, were relabelled as non-severe cases and grades III and IV as severe cases. Likewise, CC classes 1 and 2 were relabelled as non-severe and CC 3 as severe BP cases. The results were presented in a 2x2 confusion matrix.
The confusion matrix cross-tabulated the actual severity of each MBR participant (severe or non-severe BP according to the GCPS) against the predicted severity (according to the CC). As a result, every participant belonged to one of the following four categories:
- True positive (TP) were actual severe BP cases that were correctly predicted as severe
- True negative (TN) were actual non-severe BP cases that were correctly predicted as non-severe
- False positive (FP) were actual non-severe BP cases that were wrongly predicted as severe
- False negative (FN) were actual severe BP cases that were wrongly predicted as non-severe
Sensitivity (i.e. the proportion of participants with severe BP who were correctly classified as severe by the model), specificity (i.e. the proportion of participants without severe BP who were correctly classified as non-severe by the model) and the Matthews correlation coefficient (MCC) [46] (i.e. the correlation between actual and predicted severity) were estimated to evaluate the model’s performance. MCC was chosen over accuracy and the F1 score because it is more reliable, taking all four confusion matrix categories into account [47]. As the MCC is a special case of the Pearson correlation coefficient for two binary variables, the strength of correlation was interpreted in the same way: very weak relation (MCC 0.01 – 0.29), fair relation (MCC 0.3 – 0.59), moderately strong relation (MCC 0.6 – 0.79) or very strong relation (MCC ≥ 0.8) [48]. Cohen’s weighted Kappa was again estimated as a concordance statistic.
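The dichotomisation and the three performance measures can be sketched as follows. This is an illustrative Python example with made-up input values, not study data or study code. GCPS grades are encoded as 1 to 4, and all names are hypothetical.

```python
from math import sqrt


def gcps_is_severe(grade):
    """Actual severity: GCPS grades III and IV (3, 4) count as severe."""
    return grade >= 3


def cc_is_severe(cc):
    """Predicted severity: only CC 3 counts as severe."""
    return cc == 3


def confusion_counts(actual_severe, predicted_severe):
    """Return (TP, TN, FP, FN) for paired boolean labels."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(actual_severe, predicted_severe):
        if actual and predicted:
            tp += 1
        elif not actual and not predicted:
            tn += 1
        elif predicted:
            fp += 1  # non-severe case predicted as severe
        else:
            fn += 1  # severe case predicted as non-severe
    return tp, tn, fp, fn


def sensitivity(tp, fn):
    return tp / (tp + fn)


def specificity(tn, fp):
    return tn / (tn + fp)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 if any margin is empty."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Illustrative values only
gcps = [1, 2, 3, 4, 4, 2]
cc = [1, 3, 3, 2, 3, 1]
counts = confusion_counts([gcps_is_severe(g) for g in gcps],
                          [cc_is_severe(c) for c in cc])
tp, tn, fp, fn = counts
print(counts, sensitivity(tp, fn), specificity(tn, fp), mcc(tp, tn, fp, fn))
```

Returning 0.0 when a row or column of the matrix is empty is a common convention for the MCC, adopted here as an assumption.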
Participant characteristics potentially associated with the grade of BP chronicity
Demographic information of the participants (e.g. age, sex), overall health (e.g. weighted Charlson Comorbidity Index Score (CCI) [49], self-assessed overall health status using the first item of SF-12), possible psychological comorbidities (PHQ-4 score and its subscales [50], ICD-10 F-diagnoses) and direct effects of BP (ICD-10 M-diagnoses, everyday impairment, average pain level, number of days restricted in everyday activities within the last six months) were selected. These variables were descriptively compared across CC classes and GCPS grades.
Not every participant was enrolled in a daily sickness benefit insurance in addition to their regular PHI policy at this provider. It is likely that most participants were insured against sick leave at another provider. However, no information was available on this insurance status. Therefore, in contrast to the SHI system, there was no general incentive for the insured to report incapacity to work to Generali. Since the days of incapacity to work played a major role in the calculation of the CC, the daily sickness benefit insurance status of the insured was regarded as a possible confounder and analysed separately in a sensitivity analysis. However, it was assumed that those insured against sick leave at this provider also reported absence.
Secondary outcome
The secondary outcome was an updated representation of the costs of care for CLBP in the German PHI setting. Overall health costs as well as BP-specific inpatient and outpatient costs in the 12 months before enrolment were considered. Costs were descriptively compared across CC classes and GCPS grades.
Costs from the following areas were included: general hospital services, GP and specialist care, medicines, remedies, alternative practitioners (e.g., chiropractor), aids and private medical treatment. Additional elective services (e.g., supplements for one- or two-bed rooms) and all costs of dental treatment were excluded.
In a PHI setting, reimbursement follows the principle of refund of expenses, i.e., clients pay the health care bill in advance, submit the bill to their insurance company afterwards and receive the reimbursement according to the insurance tariff they have taken out. Reimbursement of health care bills therefore depends on the respective tariff. The study population consisted of fully insured participants with different levels of deductible as well as policyholders eligible for governmental aid. Therefore, the cost component was defined as the total bill amount instead of the refund amount paid. Thus, actual costs were compared without taking into account which payer (health insurance, subsidy or individual supplementary) reimbursed the costs. As costs were presented for a period of 12 months, no discounting was applied. All costs were converted to 2020 Euros (€) using consumer price indices.
As healthcare costs tend to be highly skewed and heavily right-tailed [51], a truncated mean was calculated in addition to the average costs per category. For this, all upper outliers (high-cost cases) were identified using Tukey’s method with 1.5 × the interquartile range (IQR) [52]. Low-cost cases were defined as participants who did not submit an invoice from the respective area in the 12 months before enrolment.
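Tukey’s upper fence and the resulting truncated mean can be sketched in a few lines of Python. The cost values are made up for illustration, the function names are hypothetical, and the quartile definition is an assumption: `method="inclusive"` interpolates linearly in the same way as R's default quantile type. The text does not specify whether low-cost cases were also removed from the truncated mean, so this sketch removes only the upper outliers.

```python
from statistics import mean, quantiles


def tukey_upper_fence(costs, k=1.5):
    """Upper fence Q3 + k * IQR; values above it are high-cost cases."""
    q1, _, q3 = quantiles(costs, n=4, method="inclusive")
    return q3 + k * (q3 - q1)


def truncated_mean(costs, k=1.5):
    """Mean of all costs at or below Tukey's upper fence."""
    fence = tukey_upper_fence(costs, k)
    return mean(c for c in costs if c <= fence)


# Illustrative annual costs in Euros, including one high-cost case
costs = [0, 100, 200, 300, 400, 5000]
print(mean(costs))               # 1000, pulled up by the outlier
print(tukey_upper_fence(costs))  # 750.0
print(truncated_mean(costs))     # 200
```

In this toy example a single high-cost case raises the plain mean fivefold, which is the motivation for reporting both figures.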
Data Source/Measurement
For this study, two data sources were used. The information to calculate the CC and its connected variables (e.g., diagnoses, sick-days, opioid use, CCI), as well as all cost data, were obtained from claims data of the insurance. The information to calculate the GCPS was obtained through participants’ responses in the standardised, self-administered digital assessment during enrolment. Participants were questioned about their current health status to a) assign the best type of intervention and b) control for individual developments with follow-up measurements.
Bias
No potential source of bias was identified in the routinely collected data. For the data collected with the standardised, self-administered questionnaire, there were two potential sources of bias: a) recall bias and b) demand characteristics. Recall bias was possible since the GCPS was calculated using the development of BP within the last six months. However, the GCPS is widely used to assess CLBP [8, 11, 53, 54] as well as other types of anatomically defined pain conditions [55, 56]. It has been validated several times [57, 58] and is an internationally recognised tool in self-administered pain assessment [59], so a possible effect of recall bias was considered negligible.
A second possible source of bias was demand characteristics [60], i.e. respondents answering the questionnaire tactically in order to receive the most comprehensive care possible. However, participants were asked to answer truthfully in order to receive an intervention tailored to their individual needs. Since all participants were pain patients who volunteered for the intervention, which was always free of charge, it could be assumed that their answers were reasonably accurate. Moreover, the specific steering logic was not mentioned in writing. Therefore, a bias due to demand characteristics also seemed unlikely.
Study Size
Different samples were required to answer the two research questions. The selection criteria are shown in Table 1. The study population consisted of 3,629 participants for whom the GCPS grade at enrolment was available. As the data were provided by a PHI, some participants had an individual yearly threshold of costs before expenses were reimbursed (deductible). Insured persons with a fixed deductible usually only submit their invoices for a year if they exceed that amount. To reduce the potential bias introduced by the tariff, 122 participants who did not submit any invoice in the 12 months before enrolment (yearly average of invoices = 27) were excluded.
To answer the first research question on criterion validity, all participants of the MBR who enrolled in the standard way (n = 2,722) were taken into account. The time between the initial invitation and enrolment was calculated. As the CC was only available at the date of the invitation, participants who took longer than 90 days to register were excluded (n = 326) in order to rule out temporal effects and thus potential changes of the CC. The final group size for the first research question was 2,396.
For the estimation of the cost of CLBP, participants who signed up on their own initiative were additionally considered (n = 872). The size of the group used to answer the second research question increased to 3,506.
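The 90-day criterion can be expressed as a simple date filter. This is a minimal Python sketch with a hypothetical function name and illustrative dates; treating exactly 90 days as still included follows from "longer than 90 days" being the exclusion rule, while requiring enrolment on or after the invitation date is an assumption.

```python
from datetime import date


def enrolled_within(invited_on, enrolled_on, max_days=90):
    """True if enrolment happened 0 to max_days days after the invitation."""
    return 0 <= (enrolled_on - invited_on).days <= max_days


print(enrolled_within(date(2020, 1, 1), date(2020, 3, 31)))  # day 90: kept
print(enrolled_within(date(2020, 1, 1), date(2020, 4, 1)))   # day 91: excluded
```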
Table 1 - Data preparation processes for selection of study population
| Data processing steps showing the number of participants | Overall | Used in |
|---|---|---|
| Study size | 3629 | |
| Exclusion of participants without any billing invoice available | 3506 | Research question II |
| Enrolment after invitation by insurance | 2722 | |
| Enrolment within 90 days after invitation | 2396 | Research question I |
| Enrolment within 90 days after invitation plus insured against sick leave | 1114 | Sensitivity analysis |