The Core Outcome Measures Index (COMI) is a short, multidimensional instrument that has one question each on back pain intensity, leg/buttock pain intensity, function, symptom-specific well-being, general quality of life, work disability and social disability, scored as a 0–10 index. The COMI has been described in detail by Mannion et al.
Translation and cross-cultural adaptation
The translation and cross-cultural adaptation of the original English version of the COMI into Slovene was carried out in accordance with previously published guidelines.
Two bilingual translators whose first language was Slovene independently translated the original English version of the COMI into Slovene. The first translator (T1) was an expert in the field (an orthopedic surgery resident). The second translator (T2) was a high school teacher of English who was not familiar with the concepts or the clinical content of the questionnaires. The two translators compared and discussed their versions and produced a consensus version (common Slovene translation T-12).
Back translation of T-12 into English was performed independently by two native English speakers who were also fluent in Slovene. Both back-translators were blind to the original English version and had no medical knowledge. They were both working in a high school as assistant teachers.
A committee was formed consisting of one of the translators, one of the back-translators, three spine surgeons and one methodologist research scientist. The committee examined all the translations and reached a consensus on the pre-final Slovene version. All stages of the translation process were documented in written form.
Fifteen surgical patients with chronic LBP were asked to fill out the pre-final version of the COMI. After completing the questionnaire, they were asked about its content and structure. The findings were then discussed and a final Slovene version was produced accordingly.
The study was approved by our Institutional Ethical Review Board. After giving their written informed consent, the patients received a booklet of questionnaires including items on demographic variables, the final Slovene COMI and the Oswestry Disability Index (ODI).
The sample consisted of 353 patients from our Orthopedic clinic department who were administered the COMI questionnaire between January 2017 and March 2019. All patients indicated that they had problems with back pain, leg/buttock pain, or sensory disturbances in the back/leg/buttocks (e.g. tingling). Both sexes were relatively equally represented in the sample (47.0% male, 53.0% female) and the average age was 65.1 years (SD = 12.5; range: 25–87). Overall, 129 (36.5%) patients indicated that back pain was the problem that troubled them the most, 116 (32.9%) leg/buttock pain, and 108 (30.6%) sensory disturbances.
Some analyses were performed on subsamples whose basic demographic characteristics (gender, age, and chief complaint) were similar to those of the full sample presented above. One part of the construct validity analysis was performed on a subsample of patients who had also filled out the Oswestry Disability Index on the same day (before surgery). The responsiveness part of the analysis was conducted on a subsample of participants who filled out the questionnaire again approximately 3 months after surgery (M = 81.1 days, SD = 13.6, range: 60–115 days after surgery). Additionally, the reliability (stability) part of the analysis was performed on a subsample of patients who filled out the questionnaire once more, approximately 3 months later (M = 81.8 days, SD = 5.6, range: 60–90 days later). All the subsamples are depicted in Figure 1.
The overall COMI score was computed as previously described. It can range from 0 (best health status) to 10 (worst health status). Scores for the Oswestry Disability Index were calculated as described by Fairbank and Pynsent, and ranged from 0 to 100, with higher scores indicating a higher severity of disability.
Missing data were analyzed for each COMI item and the overall score; specifically, we divided the number of missing values by the number of respondents in the sample. Floor and ceiling effects were assessed by calculating the percentage of respondents who, respectively, exhibited maximum and minimum possible scores on individual COMI items and the overall score. Floor and ceiling effects can make it impossible to detect deterioration or improvement in the participants’ status (e.g. if the value already indicates the best possible status, improvement cannot be detected). When interpreting these values, floor and ceiling effects larger than 70% are often considered to be adverse and effects smaller than 15% are often considered to be ideal [14, 15].
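The missing-data and floor/ceiling percentages described above can be sketched in a few lines of Python. This is only an illustration, not the SPSS procedure used in the study: the function names, the use of `None` for a missing answer, the choice to compute the extreme-score percentages among patients who answered the item, and the example responses are all assumptions.

```python
def missing_rate(values):
    """Percentage of respondents whose answer is missing (None)."""
    return 100.0 * sum(v is None for v in values) / len(values)


def extreme_score_rates(values, lowest, highest):
    """Percentage of answered responses at the lowest and at the highest possible score.

    Assumption: the denominator is the number of non-missing answers.
    """
    answered = [v for v in values if v is not None]
    pct_lowest = 100.0 * sum(v == lowest for v in answered) / len(answered)
    pct_highest = 100.0 * sum(v == highest for v in answered) / len(answered)
    return pct_lowest, pct_highest


# Hypothetical responses to a 0-10 pain item (None = item left blank).
back_pain = [0, 3, 10, None, 7, 10, 5, 0, 8, 6]

missing_pct = missing_rate(back_pain)
pct_lowest, pct_highest = extreme_score_rates(back_pain, lowest=0, highest=10)
print(f"missing: {missing_pct:.1f}%")
print(f"at lowest score: {pct_lowest:.1f}%, at highest score: {pct_highest:.1f}%")
```

Which of the two extremes is reported as the floor and which as the ceiling depends on the convention adopted for the instrument, so the sketch returns both percentages under neutral names.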
We also performed statistical analyses aimed at investigating construct validity, which refers to the degree to which scores on one instrument relate to other measures in a manner that is consistent with theoretically derived hypotheses [15, 16]. In other words, two questionnaires that measure the same construct or highly similar constructs are expected to be (strongly) positively correlated. In the present study, these analyses were done by testing the relationship between individual COMI items, the overall COMI score, and a previously established instrument, the Oswestry Disability Index. Spearman's rho (ρ) corrected for ties was used in the correlational analyses, and the following thresholds were used to interpret the calculated validity coefficients: ρ > .80 as excellent, .61–.80 very good, .41–.60 good, .21–.40 fair, and .00–.20 poor. Based on previous studies that examined the relationship between ODI and COMI, we expected fair to good correlations between individual COMI items and the overall ODI score. Additionally, we expected to find a very good to excellent correlation between the overall COMI and ODI scores [7, 18].
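As an illustration of the correlational analysis above, the following sketch computes a tie-corrected Spearman's rho as the Pearson correlation of average ranks and maps it onto the interpretation bands. It is a hand-rolled sketch rather than the SPSS routine used in the study. The function names and the example scores are hypothetical, and two details are assumptions: the bands are applied to the absolute value of ρ, and values that fall between the published bands (e.g. between .60 and .61) are assigned to the higher band.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho with ties handled via average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


def interpret(rho):
    """Map |rho| onto the interpretation bands quoted in the text."""
    size = abs(rho)
    if size > 0.80:
        return "excellent"
    if size > 0.60:
        return "very good"
    if size > 0.40:
        return "good"
    if size > 0.20:
        return "fair"
    return "poor"


# Hypothetical overall COMI (0-10) and ODI (0-100) scores for five patients.
comi = [2.0, 4.5, 4.5, 7.0, 9.0]
odi = [10, 30, 24, 50, 80]

rho = spearman_rho(comi, odi)
print(f"rho = {rho:.3f} ({interpret(rho)})")
```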
Responsiveness, one of the key attributes that needs to be considered when evaluating new questionnaires, is defined as the ability of a questionnaire to detect clinically important changes over time, even if these changes are small. A large number of methods have been proposed for assessing responsiveness. In the present study, we used approaches that have already been used in previous COMI validation studies. The change in group median scores from pre-surgery (baseline) to 3 months post-surgery was tested using the Wilcoxon Signed Ranks Test, the non-parametric equivalent of the Paired Samples T-Test. We also calculated effect sizes (r) for the change scores, with values of 0.1 indicating small effects, 0.3 medium effects, and 0.5 large effects [8, 9, 18]. Additionally, we further explored the change in median scores based on the “global treatment outcome” question (i.e. “Overall, how much did the operation help your back problem?”). In other words, we aimed to find out whether the median change in the overall COMI score differed between patients who perceived the therapeutic intervention as being helpful or very helpful and those who perceived it as less efficacious [8, 9]. To compare median changes, we performed a non-parametric Mann-Whitney U test.
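The text does not give the formula behind the effect size r. A common convention for the Wilcoxon Signed Ranks Test is r = |Z| / √N, and the sketch below assumes that convention. It further assumes that N is the number of non-zero paired differences (some authors use the total number of observations instead) and that Z comes from the normal approximation with a tie correction and no continuity correction. The function names and the example scores are hypothetical.

```python
from collections import Counter
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def wilcoxon_z(before, after):
    """Z statistic of the signed-rank test (normal approximation, tie-corrected).

    Pairs with a zero difference are dropped; returns (z, number of pairs used).
    """
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    magnitudes = [abs(d) for d in diffs]
    ranks = average_ranks(magnitudes)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean_w = n * (n + 1) / 4
    tie_term = sum(t ** 3 - t for t in Counter(magnitudes).values()) / 48
    var_w = n * (n + 1) * (2 * n + 1) / 24 - tie_term
    return (w_plus - mean_w) / sqrt(var_w), n


def effect_size_r(z, n):
    """Assumed convention: r = |Z| / sqrt(N)."""
    return abs(z) / sqrt(n)


# Hypothetical COMI scores at baseline and at follow-up for six patients.
baseline = [8, 7, 9, 6, 7, 8]
follow_up = [3, 5, 4, 6, 2, 9]

z, n_pairs = wilcoxon_z(baseline, follow_up)
r = effect_size_r(z, n_pairs)
print(f"Z = {z:.3f}, N = {n_pairs}, r = {r:.3f}")
```

With only a handful of pairs the normal approximation is rough; the example is meant to show the arithmetic, not to stand in for the exact test.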
Lastly, we performed test-retest analyses, which explore a questionnaire’s stability over time. A common method to evaluate this form of reliability is to calculate intraclass correlation coefficients (ICC) and their 95% confidence intervals. Intraclass correlation coefficients can take values between 0.0 and 1.0, with values of 0.6–0.8 generally indicating good reliability and values above 0.8 indicating excellent test-retest reliability. While several previous COMI validations show that the COMI is a very reliable measure, it is worth noting that the time lag between the two COMI applications (with no therapeutic intervention in between) is longer in our study (approximately 3 months as opposed to 2 weeks). As such, we expected slightly lower, but still satisfactory, test-retest values for each individual item as well as the overall COMI score. Standard errors of measurement (SEM) were also calculated and, in the next step, used to obtain the minimum detectable change (MDC95%), that is, the degree of change required in a patient’s score in order to establish it as being a real change, over and above measurement error. At the 95% confidence level, this is defined as 1.96 × √2 × SEM, which is equivalent to 2.77 × SEM.
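The MDC95% formula above is simple enough to check numerically. The sketch below applies 1.96 × √2 × SEM as stated in the text. The text does not say how the SEM itself was derived, so the helper `sem_from_icc` uses one widely used formula, SEM = SD × √(1 − ICC), purely as an assumption; the function names and the example SD and ICC values are hypothetical.

```python
from math import sqrt


def sem_from_icc(sd, icc):
    """Standard error of measurement; assumed formula SEM = SD * sqrt(1 - ICC)."""
    return sd * sqrt(1.0 - icc)


def mdc95(sem):
    """Minimum detectable change at the 95% level: 1.96 * sqrt(2) * SEM."""
    return 1.96 * sqrt(2.0) * sem


# Hypothetical values: SD of the overall COMI score and its test-retest ICC.
sem = sem_from_icc(sd=2.0, icc=0.84)
print(f"multiplier = {1.96 * sqrt(2.0):.2f}")  # the 2.77 quoted in the text
print(f"SEM = {sem:.2f}, MDC95 = {mdc95(sem):.2f}")
```

Under these example values, a change in an individual patient's score would need to exceed roughly 2.2 points before it could be distinguished from measurement error.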
All statistical analyses were performed with IBM SPSS 23.0 software; p values of less than .050 were considered significant.