Cross-cultural Adaptation and Validation of the Polish Version of the Lower Limb Functional Index

Background: Patient reported outcome measures (PROMs) are recommended to enable the standardization of collected data and provide accurate representation of the patients' subjective opinions of their functional capabilities. The purpose of this study was to perform linguistic and cross-cultural adaptation to establish a Polish version of the Lower Limb Functional Index (LLFI), and to evaluate the psychometric properties of internal consistency, reliability, error score, validity, and factor structure with standardized criteria PROMs in a population with lower limb problems.
Methods: Linguistic and cultural adaptation complied with the International Society for Pharmacoeconomics and Outcomes Research (ISPOR) guidelines to produce the Lower Limb Functional Index-Polish version (LLFI-PL). The study recruited subjects (n = 125, age = 52.86 ± 19.53 years, range 20-87, 56% female, injury duration = 17.69 ± 18.39 weeks, range 5-71). Baseline reliability and criterion validity included the LLFI-PL, Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC), EuroQol Health Questionnaire 5-Dimensions (EQ-5D-5L), and an 11-point pain Numerical Rating Scale, with retest at 3-7 days. Practicality for readability was considered within the face and content validity. Completion and scoring time were also calculated.
Results: Statistical analysis showed excellent internal consistency (α = 0.94) and high test-retest reliability (ICC = 0.96). Measurement error was SEM = 3.49% with MDC90 = 8.11%. Validity analysis showed a strong correlation between the LLFI-PL and the WOMAC (r = -0.81) and a moderate correlation with the EQ-5D-5L (r = -0.63). Exploratory factor analysis confirmed a single-factor structure. Times for completion (172 ± 33 seconds) and scoring (20 ± 9 seconds) were determined.
Conclusions: The LLFI-PL is a psychometrically sound questionnaire for Polish-speaking patients with lower limb musculoskeletal conditions. The results support the findings of previous original-English, Spanish, and Turkish versions for internal consistency, validity, reliability, error score, and factor structure.


Introduction
Lower limb problems and dysfunctions are an increasing concern in society, regardless of age.
Problems, including pain on movement and at rest, plus impaired functions limiting activities of daily living (ADL) and participation in social life, lead to decreased quality of life [1]. Patients' opinions about their own health and functional status may differ from objective evaluations provided by different professionals. Consequently, patient reported outcome measures (PROMs) are recommended to enable standardization of the data collected and provide accurate representation of the patients' subjective opinions of their functional capabilities. It is increasingly emphasized that the overall assessment of health status should include the subjective consideration of functional status, particularly the use of PROMs, together with objective testing [2]. These PROMs enable detailed planning of treatment and rehabilitation programs as well as the determination of the effectiveness of medical or rehabilitation interventions [3]. However, the measurements made are only as good as the tools that are used, and the clinimetric considerations of both the psychometric and practical properties must be fully investigated using international guidelines such as the Consensus-based Standards for the selection of health status Measurement Instruments (COSMIN) [4].
In English-speaking countries, many PROMs have been created that can be used to assess the functional status of specific joints, conditions or region-specific conditions [5-7]. However, in Poland, only a limited number of questionnaires are available to assess the function of the lower limbs.
Foreign language PROMs must be adapted according to the methodology available in the scientific literature in order to be used. This requires a translation, cross-cultural adaptation and subsequent validation. Examples of such available PROMs include the Knee Injury and Osteoarthritis Outcome Score (KOOS), which has been comprehensively validated in Polish on patients with anterior cruciate ligament (ACL) injury and osteoarthritis undergoing total knee replacement (TKR) [5,6]. Similarly, the Polish version of the Knee Outcome Survey Activities of Daily Living Scale (KOS-ADLS) demonstrates good reliability, validity, and responsiveness for use in patients at the end-stage of knee osteoarthritis who have undergone TKR [7]. The Lysholm scale and International Knee Documentation Committee (IKDC) form are also available for knee problems, but the authors have only evaluated internal consistency [8]. The Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) is validated to assess function with degenerative changes in the hip and knee [9]. However, none of these PROMs is applicable to the whole lower limb as a single kinetic chain, which would enable the use of a single tool for all joints and conditions. Only the Lower Extremity Functional Scale (LEFS) [10] and, more recently, the Lower Limb Functional Index (LLFI) [11] are validated for regional use, but neither is available in the Polish language.
The LLFI was developed to assess function within the domains based on the World Health Organization (WHO) International Classification of Functioning, Disability and Health (ICF). This included body functions, structures, activities, participation, and environmental factors [11,12]. The original English version has demonstrated strong clinimetric properties. These include the psychometric characteristics for internal consistency, reliability, error measurement, validity, and responsiveness. From the perspective of practical characteristics, the LLFI demonstrated brevity, ready transferability to a 100-point scale, ease and rapidity of completion and scoring, low missing responses, and suitable readability [11]. These clinimetric properties were reinforced with the scale's adaptation to Spanish [13] and Turkish [14].
The purpose of this study was to perform a translation and cross-cultural adaptation of the LLFI to establish a Polish version (LLFI-PL), and to evaluate its psychometric properties.

Methods

Material
Patients were recruited during all visits made with a specialist rehabilitation physician or orthopedic surgeon at the Specialist Hospital in Rudna Mała, Poland from January to May in 2018. All patients were diagnosed by a specialist based on: the results of an interview, physical examination, and imaging studies (depending on the needs: USG, MRI, CT, X-ray).
People with a variety of lower limb conditions were included if they met the following inclusion criteria: age > 18 years, native speaker of Polish, informed consent to participate in the study, and an injury duration of > 4 weeks. Exclusion criteria were coexisting neurological disease, failure to provide written consent, and an inability to read Polish. A total patient sample of n = 125, i.e. 69% of patients

Design
This was a two-stage, repeated cross-sectional study. Stage 1 involved translation and cross-cultural adaptation of the LLFI. This was completed in accordance with the International Society for Pharmacoeconomics and Outcomes Research (ISPOR) guidelines, an approach approved by regulatory agencies such as the Food and Drug Administration and the European Medicines Agency [15].
It consisted of nine steps, each of which was documented with a written report (Fig. 1).
Step 1. Two forward translations were performed by two independent translators whose native language is Polish. This process allows Polish language equivalents to be introduced in place of terms that are otherwise difficult to translate.
Step 2. Reconciliation meeting between the two forward translators and the authors of the Polish adaptation (experts in health/physiotherapy, expert with experience in instrument development and translation) → common version of the two forward translations.
The team analyzed the individual items, the sets of answers to questions, the instructions for questionnaire completion and score calculation. There were some acceptable differences between the two translations, resulting from the many Polish language equivalents which can be used by the translators.
Step 3. Back translation by an independent English native speaker, fluent in Polish, who was not familiar with the original version or different language versions of the LLFI → back translation version.
Step 4. Back translation review. The English back translation version was compared to the original English version by the author of the original LLFI, the "back" translator, and the authors of the Polish adaptation. A change was made to one item (#24). A minor problem arose with the phrase "unaccustomed footwear" and matching its best meaning in Polish to the concept of the questionnaire item.
Step 6. Cognitive debriefing. In this stage, the resulting version was pilot-tested on a group of five symptomatic patients (two women, three men) with lower extremity problems that had been present for > three months (ACL injury, gonarthrosis, knee joint endoprosthesis, coxarthrosis, and patellar chondromalacia) to assess face and content validity through the accuracy of the questions and clarity of the wording. The group assessed whether each item of the scale was fully understood or raised doubts using a three-point (2, 1, 0) scale where 2 = completely understood, 1 = partially understood, and 0 = completely incomprehensible. If a question was incomprehensible to the respondent, they were asked to indicate the reason for the lack of understanding. Analysis of the answers obtained from the five respondents gave an average of 2.0. Consequently, no corrections were made.
Step 7. Review of cognitive debriefing results and finalization → final Polish version of the LLFI (the LLFI-PL).
At this stage, no changes of the received version of the LLFI-PL were introduced by the Polish research team. The final version of the LLFI-PL was approved.
In this stage, a Polish language teacher checked the LLFI-PL for any minor errors (spelling, grammatical or other), which could have occurred during the translation process. There were no such errors reported.
The final report provided a description of all translations and cultural adaptation decisions and was sent to the author of the original version of the LLFI.
Stage 2 involved a prospective evaluation of the essential psychometric properties of the LLFI-PL. The subjects were evaluated twice with an initial baseline examination that consisted of completing the Polish versions of all questionnaires by the respondents: the LLFI-PL, the WOMAC, the EQ-5D-5L, and the pain-NRS. The second (re-test) assessment was performed three to seven days later (average = six days), during a period of non-treatment, which is considered adequate and reasonable [4,16].
At retest, patients completed the LLFI-PL and the pain-NRS. Over this interval there is a low probability of changes in symptoms, and the recollection of original responses is reduced [16].

Research tools
The LLFI assesses the impact of any lower limb problem on everyday activities. It is a 25-item regional PROM with a three-point response option of 'Yes' (points = 1), 'Partly' (points = 1/2) and 'No' (points = 0) with a raw score range of 0-25 points. The final score is calculated by simple addition of the responses from the 25 items. The sum is multiplied by four and then subtracted from 100 to generate a 0-100% score (100%= no disability) [11].
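The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the official scoring tool: the function name `llfi_score` and the lowercase response labels are assumptions made for the example, and the handling of missing responses is not covered because the text does not describe it.

```python
# Response values as described in the text: Yes = 1, Partly = 1/2, No = 0.
RESPONSE_POINTS = {"yes": 1.0, "partly": 0.5, "no": 0.0}


def llfi_score(responses):
    """Return the LLFI percentage score (100 = no disability).

    `responses` is a list of 25 strings, one per item (hypothetical input format).
    """
    if len(responses) != 25:
        raise ValueError("the LLFI has 25 items")
    # Raw score: simple addition of the 25 item values (range 0-25).
    raw = sum(RESPONSE_POINTS[r.lower()] for r in responses)
    # Multiply by four and subtract from 100 to obtain the 0-100% score.
    return 100.0 - 4.0 * raw


# Hypothetical respondent: 3 'Yes', 4 'Partly', 18 'No' -> raw score 5.
example = ["yes"] * 3 + ["partly"] * 4 + ["no"] * 18
print(llfi_score(example))  # 80.0
```

A respondent answering 'No' to every item therefore scores 100% and one answering 'Yes' to every item scores 0%.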
The WOMAC was used to subjectively assess the functional status of patients. It includes 24 questions on a five-point (0-4) scale that determines symptom intensity in three domains: pain (five items), stiffness (two items) and limb function (17 items). The final score is obtained by summing all items (0-96) where the higher the score, the worse the functional status and more serious degenerative changes are suspected. The data could be standardized to a range of values from 0-100 on a percentage scale, where 0 represents the best health status and 100 the worst health status. The WOMAC has been adapted into multiple languages including Polish [9,17-19].
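As a rough illustration of the WOMAC total and its percentage form, the sketch below sums the 24 items and rescales the total linearly. The linear rescaling (total / 96 × 100) is an assumption, since the text states only that the data can be standardized to 0-100; the function name `womac_scores` is also hypothetical.

```python
def womac_scores(items):
    """Return (total, percent) for 24 WOMAC item scores, each 0-4.

    The percentage uses a simple linear rescaling of the 0-96 total,
    which is an assumption and not a formula given in the source.
    """
    if len(items) != 24:
        raise ValueError("the WOMAC has 24 items")
    if any(not 0 <= v <= 4 for v in items):
        raise ValueError("each item is scored 0-4")
    total = sum(items)               # 0 (best) to 96 (worst)
    percent = 100.0 * total / 96.0   # 0 = best health status, 100 = worst
    return total, percent


# Hypothetical respondent scoring 2 on every item.
print(womac_scores([2] * 24))  # (48, 50.0)
```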
The EuroQol 5-Dimensions, five-level version questionnaire (EQ-5D-5L) consists of two parts. The questions in the first part are grouped into five domains: mobility, self-care, usual activities, pain/discomfort, and anxiety/depression. Each question is assessed on five levels, from 1 = no problems to 5 = extreme problems or unable to perform. The second part consists of a 0-100 point visual analogue scale (VAS) where 0 = worst health and 100 = best health. Results can be presented in two ways that follow the structure of the EQ-5D-5L: as the EQ-index value and as the EQ-VAS, a measure of overall self-rated health status. The questionnaire has been adapted to a Polish version [20,21].
An 11-point pain-NRS was used, where 0 = no pain, 5 = moderate pain, and 10 = the worst imaginable pain [22].

Statistical analysis
All statistical analyses were conducted using SPSS Statistics software version 24. The level of statistical significance was assumed at p < 0.05. Normal distribution of the results of this study was verified using the Shapiro-Wilk test.

Sample size
The sample size was pre-selected on the basis of a literature review concerning the creation of the LLFI questionnaire and other language validations [11,13,14]. The sample size was also based on Altman's recommendations of > 50 subjects in a methods comparison study [23].

Internal consistency
Internal consistency was determined using Cronbach's alpha (α) coefficient, where α should be ≥ 0.70 and < 0.95 [16,24]. Data from the first examination were included in the analysis (n = 125).
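For readers who want to see how the coefficient is obtained, the following sketch computes Cronbach's α from an item-response matrix using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of the total score). The study itself used SPSS; the function names and the toy data here are hypothetical and serve only to illustrate the calculation.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for `rows`, a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the summed score across respondents.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def in_recommended_range(alpha):
    """Check the criterion stated in the text: 0.70 <= alpha < 0.95."""
    return 0.70 <= alpha < 0.95


# Hypothetical toy data: four respondents, three items scored 0, 0.5 or 1.
toy = [[1, 1, 0], [0, 0, 0], [1, 0.5, 1], [0.5, 0.5, 0]]
print(round(cronbach_alpha(toy), 3))
print(in_recommended_range(0.94))  # True for the alpha reported in this study
```

When every item is perfectly correlated with the others, the function returns 1, which is why values at or above 0.95 are usually read as a sign of item redundancy.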

Reliability
The intraclass correlation coefficient (ICC(2,1), 95% CI) was used to assess test-retest reliability in patients who completed the LLFI-PL twice and for whom the difference on the pain-NRS between baseline and retest was 0 ± 1 point (n = 94). In addition, Pearson's correlation coefficient (PCC) was estimated between the two LLFI-PL measurements. Fisher's F test was used to assess the statistical significance of the PCC. Reliability was considered good when the ICC was ≥ 0.70 and the PCC was r ≥ 0.70 [25].
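The ICC(2,1) named above is the two-way random-effects, absolute-agreement, single-measurement form described by Shrout and Fleiss. The sketch below computes it from the ANOVA mean squares; it is an illustration of the formula only, not the SPSS procedure used in the study, and the function name and toy scores are hypothetical. Confidence intervals are not computed here.

```python
def icc_2_1(data):
    """ICC(2,1) for `data`: a list of subjects, each a list of k repeated scores."""
    n = len(data)         # number of subjects
    k = len(data[0])      # number of measurement occasions (2 for test-retest)
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between occasions
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical test and retest LLFI-PL percentage scores for five patients.
scores = [[80, 78], [52, 56], [96, 96], [30, 34], [64, 60]]
print(round(icc_2_1(scores), 3))
```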

Error Score
Measurement error was assessed using the standard error of measurement (SEM) and the minimal detectable change at the 90% confidence level (MDC90). The sample comprised all participants who completed the LLFI-PL at baseline and reassessment (n = 125).
The SEM was calculated using the formula: SEM = SD × √(1 − R), where SD is the standard deviation of the repeated test and retest measurements and R is the reliability parameter (ICC(2,1)) [26].
The MDC is the minimum change in a patient's score that ensures the change is not the result of measurement error. The MDC90 was calculated using the formula: MDC90 = SEM × 1.645 × √2, where 1.645 is the z-value corresponding to the 90% confidence interval (CI) of no change, and √2 accounts for the two measurements used to assess change [27].
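The two formulas can be checked numerically with a short sketch. The function names are hypothetical, and the only study value used is the reported SEM of 3.49%; the SD of the sample is not given in this section, so the first function is shown with an illustrative SD only.

```python
import math


def sem(sd, reliability):
    """Standard error of measurement: SD * sqrt(1 - R)."""
    return sd * math.sqrt(1.0 - reliability)


def mdc90(sem_value):
    """Minimal detectable change at the 90% level: SEM * 1.645 * sqrt(2)."""
    return sem_value * 1.645 * math.sqrt(2.0)


# Illustrative SD of 10 points with R = 0.96 gives an SEM of about 2 points.
print(round(sem(10.0, 0.96), 2))

# Using the SEM reported in this study (3.49%).
print(round(mdc90(3.49), 2))
```

With the rounded SEM of 3.49% the formula gives roughly 8.1%, in line with the reported MDC90 of 8.11%; any small difference in the second decimal would presumably come from rounding of the SEM before multiplication.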

Construct Validity
To evaluate the construct validity of the LLFI-PL, the PCC was calculated between the LLFI-PL and the WOMAC, the EQ-5D-5L index value, and the EQ-5D-5L-VAS (n = 125). The LLFI-PL was expected to correlate highly with the WOMAC, which is also used to assess the function of the joints of the lower extremities. It was expected to correlate moderately with the EQ-5D-5L index value because that questionnaire was designed to measure functional state, pain, and emotional state. The LLFI-PL was also expected to correlate moderately with the EQ-5D-5L-VAS because this part of the questionnaire assesses the overall sense of health.
Therefore, the a-priori hypotheses were proposed as follows:
1. The LLFI-PL should correlate highly with the WOMAC;
2. The LLFI-PL should correlate moderately with the EQ-5D-5L index value;
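The correlations used to test these hypotheses are ordinary Pearson coefficients. A minimal sketch of the calculation is given below; the function name and the paired scores are hypothetical, and no thresholds for "high" or "moderate" are encoded because this section does not state them.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired scores: LLFI-PL percentage (100 = no disability)
# and WOMAC total (higher = worse) for six patients.
llfi = [92, 80, 64, 56, 40, 28]
womac = [6, 20, 30, 41, 60, 70]
print(round(pearson_r(llfi, womac), 2))
```

Because the LLFI percentage rises as function improves while the WOMAC total rises as function worsens, a strong relationship between the two, scored as described in the text, would be expected to appear as a negative coefficient.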

Stage 1: LLFI translation and cross-cultural adaptation
The LLFI translation and cross-cultural adaptation process described previously produced the LLFI Polish version (LLFI-PL, Appendix 1). The PROMs' absolute values are presented in Table 2.

Table 2 The absolute values of all scores

Internal consistency
Cronbach's α values after removal of a given item from the questionnaire ranged from 0.931 to 0.937.

Reliability and measurement error
The value of ICC(2,1) was very high (0.962, 95% CI 0.941-0.975) with SEM = 3.49% and MDC90 = 8.11% (Table 3). In addition, the correlation (PCC) between the two LLFI-PL measurements was also high (r = 0.843, p < 0.0001, Fisher's F test), which also indicated good test-retest reliability.

Construct Validity
Table 4 shows the construct validity, using the PCC, for the LLFI-PL and the reference questionnaires.

Factor Structure
Exploratory factor analysis (EFA) was conducted to assess factor structure and indicate construct validity. Initially, the factor analysis was performed without the single-factor extraction option. The Kaiser-Meyer-Olkin (KMO) test was adequate (0.88) and Bartlett's Test of Sphericity was significant (p < 0.0001).
A total of five factors with eigenvalues > 1 were extracted from the raw data analysis (Fig. 2, horizontal line indicates E = 1). However, only the single-factor solution satisfied all three a-priori requirements for a single-factor structure (Table 5).

Practical Considerations
The time to complete the questionnaire was 172 ± 33 seconds, and scoring was 20 ± 9 seconds.
Missing responses were minimal with items 3, 5, 11, and 18 missed once. Respondents did not indicate these items were missed due to issues with understanding the question or item comprehension. Consequently, no further corrections to the LLFI-PL were necessary.

Discussion
The United States FDA defines PROMs as "any report of the patient's health condition that comes directly from the patient, without interpretation of the patient's response by a clinician or anyone else" [29]. A comprehensive assessment of patient health status should, consequently, combine objective clinical and biological data with the patient's subjective opinion. The PROMs allow clinically focused health professionals to reliably assess the impact of interventions and effectively select optimal therapy solutions and interventions [30,31].
Each of the proposed hypotheses was supported. The LLFI-PL had high, but not excessive, internal consistency and high test-retest reliability. The criterion validity was stronger between the LLFI-PL and WOMAC than between the LLFI-PL and the generic EQ-5D-5L. The EFA of the LLFI-PL confirmed a single-factor structure, though the presence of a possible inflection at point 2 in the scree plot suggests that a modification to the questionnaire may be necessary, such as shortening to remove potentially redundant items. This is consistent with the recommendations of previous authors [11,13,14].
The cultural and linguistic adaptation that produced the LLFI-PL complied with recognized standards [15], ensured the linguistic equivalence of the concepts used, and accounted for the slight discrepancies arising from the number of synonyms available for individual words. The process confirmed the strong psychometric properties of the original-English, Spanish, and Turkish versions. The decision to complete this study was justified as it provided a regional lower limb PROM in the Polish language that could be applied to a wide range of patients with various functional problems in the lower limbs of varying severity and duration.
The criterion validity, assessed by the PCC, was higher with the joint- and condition-specific WOMAC (r = 0.81) than with the generic EQ-5D-5L (r = 0.63) and the EQ-5D-5L-VAS (r = 0.57). These differences were expected, with the higher correlation reflecting the greater relevance and specificity of a joint/condition-related PROM compared to a general health and quality of life PROM. The values are slightly higher than the Spanish findings for the WOMAC (PCC, r = 0.77) and similar to those for the EQ-5D-3L (r = 0.62) and EQ-5D-3L-VAS (r = 0.58) [13]. They are also similar to the Turkish findings, where the SF-36 subscales were used, with a high-moderate finding for the physical dimensions (from r = 0.43 to r = 0.76) but a moderate-low finding (from r = 0.20 to r = 0.66) for the mental dimension [14].
The findings in all four versions of the LLFI (English, Spanish, Turkish, and Polish) support a preferred single-factor structure. This was achieved consistently with the recommended MLE and Varimax rotation format [11,13,14]. From the perspective of parsimony, this confirms that the questionnaire items measure the construct of lower limb functional status as a single kinetic chain and can be summarized with a single summated score. However, each previous study also found that multiple-factor structures were potentially possible from the raw data analysis. This suggests a shortened version is preferred and an eight-item preference has been recommended and is currently
The LLFI-PL questionnaire is easy and quick to complete and score. The questions are simple and clearly defined, so the patient and therapist burden is minimized. The times to complete (172 ± 33 seconds) and score (20 ± 9 seconds) are marginally longer than those determined in the original study (131 ± 23 and 17 ± 5 seconds respectively) [11].

Limitations and Strengths
The current study limitations include a lack of assessment of the responsiveness of the LLFI-PL. Further, the test-retest period of three to seven days may have been too long for the acute participants (32.0%), as change can occur within a shorter period for these patients and this may increase the change scores. However, the high ICC(2,1) value reflects that of previous versions and, consequently, this difference may not have greatly affected the psychometric properties. Ideally, a regional criterion such as the LEFS [10] should be used but, as there is none available in Polish, this was not possible. The substitution of the WOMAC was not ideal but provided a regional indication.
The study strengths include the use of standardized methods for both cross-cultural adaptation and assessment of the psychometric properties. Further strengths are the prospective nature and diversity of conditions affecting each lower limb sub-region with varied degrees of severity and duration.

Future Considerations
The lack of determination of responsiveness suggests that future studies need to consider the ability of the LLFI-PL to detect minimal clinically important differences (MCIDs) over a period of time longer than two weeks. In addition, a larger study population (~1000) or data pooling should be used to definitively clarify the factor structure through the use of confirmatory factor analysis (CFA).

Informed consent was obtained from all individual participants included in the study.

Consent for publication
Not applicable.