The original FIHOA consists of 10 questions and assesses functional disability related to hand OA. We translated and cross-culturally adapted the FIHOA following established guidelines[22, 23]. Subsequently, a prospective observational multicenter study was undertaken for the validation process. We evaluated the measurement properties of the J-FIHOA among Japanese hand OA patients within the classical test theory framework, referring to the consensus-based standards for the selection of health measurement instruments (COSMIN) risk of bias checklist to scrutinize the methodological quality of our study[34].
Translation and cultural adaptation
An expert committee was convened. It comprised two language professionals (A.I. and D.S.), three health professionals (Y.N., S.K. and H.H.), and an American native English-speaking research assistant (J.C.). Forward translation was performed independently by one professional translator who had no prior knowledge of the study and by the two language professionals serving on the expert committee. All forward translators were native Japanese speakers and fluent in English. The results were assessed and synthesized into a preliminary version of the J-FIHOA. This version was translated back into English independently by two professional translators who were native English speakers, one with a medical background and one without. Both were blinded to the study aims and FIHOA concepts throughout the process. Additional forward and backward translations were undertaken to resolve specific points such as linguistic problems. We consulted with the FIHOA developers (E.M. and R.L.D.) about certain discrepancies and issues of interpretation. The translated FIHOA was pretested on 10 Japanese hand OA patients to identify potentially difficult words or phrases. We added kana script above difficult Chinese characters to facilitate comprehension, as is common in written Japanese. We also inserted a question and answer example illustrating how to mark responses. The committee submitted written reports to the developers that documented all processes and how we reached consensus. After their approval, the translation and cultural adaptation process was completed (Table 1).
Validation
Participant recruitment
Our university hospital and 16 other hospitals recruited hand OA patients at their outpatient departments from September 2017 to December 2018. New or already followed hand OA patients who were Japanese natives and over 20 years old were eligible. The American College of Rheumatology (ACR) classification criteria for hand OA were used for the diagnosis[35]. Patients with other rheumatic diseases or post-traumatic OA were excluded. Participants received conventional therapies for hand OA. Written consent was obtained from each participant. The study protocol was approved by the review board of each participating hospital.
Data collection
At the enrollment visit, a medical history (including duration of hand pain/stiffness and previous treatment for hand OA), postero-anterior radiographs of both hands, and the following patient-reported questionnaires were collected: J-FIHOA, Hand20, Japanese version of the Stanford Health Assessment Questionnaire (HAQ), numerical rating scale for pain (NRS pain), and Japanese version of the Short Form 36 Health Survey (SF-36).
Participants were followed for up to one year to collect longitudinal data sets. For the test-retest reliability, we obtained J-FIHOA data from those whose symptoms and treatment had been unchanged for at least 3 months, to avoid disease flares or therapeutic modifications. The test-retest interval was one to two weeks. Examinees were allowed to answer the retests either at a face-to-face visit or, for their convenience, via postal mail. To assess the responsiveness, we selected symptomatic hand OA participants who started or changed to certain new systemic pharmacological treatments, limited to oral acetaminophen, NSAIDs and/or tramadol. We recorded J-FIHOA and other questionnaire scores immediately before the treatment and at a 4-week follow-up visit (+/− 2 weeks). At the follow-up visit, these participants also evaluated the change in the clinical state of their hands using a 7-point Likert scale (global rating of change [GRC]).
Questionnaires
The FIHOA consists of 10 questions, one of which (Question 7) requests a separate response from females (“Are you able to sew?”) and males (“Are you able to use a screwdriver?”). In this study, we removed “for women” and “for men” from Question 7 to obtain all 11 responses regardless of gender. Participants answered all 11 items, each from 0 (possible without difficulty) to 3 (impossible), and we calculated the J-FIHOA scores in two different ways. One was to sum the 10 items as the original FIHOA does, called the “total score” (range, 0 to 30). Participants with total scores of 5 or more were defined as having symptomatic hand OA. The other was to sum all 11 items, called the “11-item model.” We used the 11-item model only when performing exploratory factor analysis and examining the internal consistency of Question 7.
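The two scoring schemes described above can be sketched as follows. This is a minimal illustration, not the study's scoring procedure: the item keys (`q1`, `q7_sew`, and so on) are hypothetical names, and the sketch assumes that the 10-item total score uses the Question 7 variant matching the respondent's sex, as in the original FIHOA.

```python
# Hypothetical item keys; the questionnaire itself does not define these names.
COMMON_ITEMS = ["q1", "q2", "q3", "q4", "q5", "q6", "q8", "q9", "q10"]


def total_score(responses, sex):
    """10-item total score (range 0 to 30).

    Assumption: the sex-matched Question 7 variant is counted,
    i.e. sewing for females and the screwdriver item for males.
    """
    q7_key = "q7_sew" if sex == "female" else "q7_screwdriver"
    return sum(responses[k] for k in COMMON_ITEMS) + responses[q7_key]


def eleven_item_score(responses):
    """Sum of all 11 items (range 0 to 33), used for the 11-item model."""
    return (
        sum(responses[k] for k in COMMON_ITEMS)
        + responses["q7_sew"]
        + responses["q7_screwdriver"]
    )


def is_symptomatic(total):
    """Symptomatic hand OA was defined as a total score of 5 or more."""
    return total >= 5
```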
The Hand20 is composed of 20 illustrated questions that assess disorders of the upper limb, including the hands. Scoring for each item ranges from 0 to 10, with higher numbers indicating greater disability. The total score is obtained by dividing the sum of all item scores by two (range, 0 to 100). Explanatory illustrations and short, easy-to-understand questions facilitate good response rates, especially among elderly people[36, 37].
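As a small worked example of the Hand20 arithmetic, the sketch below halves the sum of the 20 item scores. The function name and the input checks are illustrative assumptions; handling of missing items is not covered here.

```python
def hand20_total(item_scores):
    """Hand20 total score: sum of 20 items (each 0 to 10) divided by two."""
    if len(item_scores) != 20:
        raise ValueError("Hand20 expects exactly 20 item scores")
    if any(not 0 <= s <= 10 for s in item_scores):
        raise ValueError("each item score must lie between 0 and 10")
    return sum(item_scores) / 2
```

With every item at its maximum of 10, the sum is 200 and the total score is 100, which matches the stated range.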
- Japanese version of the Stanford Health Assessment Questionnaire (HAQ)
The HAQ is a widely used instrument to assess functional disability, especially in RA[38]. To account for cultural differences, 3 questions in the Japanese version of the HAQ have been modified: “get in and out of bed” to “get up and down from futon,” “cut your meat” to “use chopsticks for meal,” and “a 5 pound object” to “a 2 liter plastic bottle”[39]. In any category in which the patient used a device or relied on help from another person, the score is raised to 2 if it was lower.
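The adjustment rule for devices and help can be written as a one-line check. This sketch covers only that rule; the function and parameter names are hypothetical, and how category scores are derived or averaged is outside its scope.

```python
def adjust_category_score(score, used_device_or_help):
    """Raise a category score to 2 when a device or another person's help
    was used and the score was lower; otherwise leave it unchanged."""
    if used_device_or_help:
        return max(score, 2)
    return score
```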
- Numerical rating scale for pain (NRS pain)
Global pain of the affected hand(s) was assessed using a numerical rating scale. Participants were asked to mark the level of their pain on a horizontal scale from “0 = no pain” to “10 = worst pain imaginable.”[40]
- Japanese version of the Short Form 36 Health Survey (SF-36)
The SF-36 is a questionnaire to measure general health status with 36 questions consisting of eight scales that can be summarized into components. It has been validated in Japanese[41]. A three-component model is used when analyzing results of the Japanese version of SF-36 scores: physical component summary (PCS), mental component summary (MCS) and role-social component summary (RCS)[42].
- Global rating of change (GRC) scale in clinical state of the hands
Global rating of change (GRC) scales are commonly used to evaluate the responsiveness and to calculate the minimal clinically important change of the scale[43]. We used a GRC scale for symptomatic hand OA patients who started or changed to the new pharmacological treatments. At the 4-week follow-up visit, patients were asked “How would you describe your hand condition compared to before you took the new drug?” and scored their change using a 7-point scale: very much improved, much improved, a little improved, no change, a little deteriorated, much deteriorated, or very much deteriorated. We categorized patients based on the GRC scale and performed subgroup analyses.
Data analysis
Descriptive analyses were performed to summarize patient characteristics and questionnaire scores. Incomplete J-FIHOA items were also examined. We compared the characteristics of female and male participants using Student’s t-test for continuous variables and the chi-square test for categorical variables. Checking the distribution of each questionnaire with the Kolmogorov-Smirnov test showed that the only normally distributed measure was the SF-36 MCS. We therefore used the Mann-Whitney U test to examine gender differences in each item and in the total score, and Spearman’s rank correlation coefficients to measure the strength of the associations among items and/or questionnaires. Correlations were categorized as none (r=0–0.29), weak (r=0.30–0.49), moderate (r=0.50–0.69) or strong (r=0.70–1.00). All analyses were carried out using IBM SPSS software, version 24. The level of significance was set at p values of less than 0.05.
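For illustration, Spearman's coefficient and the four correlation categories can be sketched in plain Python. The study itself used SPSS; this sketch makes two assumptions that the text does not state: negative coefficients are categorized by their absolute value, and the category boundaries fall exactly at 0.30, 0.50 and 0.70.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j share this rank
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def categorize(r):
    """Map a coefficient to none / weak / moderate / strong.

    Assumption: the sign is ignored and cut-offs sit at 0.30, 0.50, 0.70.
    """
    size = abs(r)
    if size < 0.30:
        return "none"
    if size < 0.50:
        return "weak"
    if size < 0.70:
        return "moderate"
    return "strong"
```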
Validation
Using the 11-item model, factor analysis was performed with the maximum likelihood method to explore the scale structure of the J-FIHOA. Since the original FIHOA is a hand OA-specific scale, the Japanese version was expected to be unidimensional. We determined the number of relevant factors based on eigenvalues larger than one (the Kaiser criterion) and visual inspection of the scree plot[44].
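The Kaiser criterion can be illustrated by counting the eigenvalues of the item correlation matrix that exceed one. The sketch below assumes NumPy is available and covers only this counting step; it does not perform the maximum likelihood factor extraction that the study ran in SPSS, and the function name is hypothetical.

```python
import numpy as np


def kaiser_factor_count(item_scores):
    """Return (number of eigenvalues > 1, eigenvalues in descending order).

    item_scores: 2-D array-like, rows = respondents, columns = items.
    The sorted eigenvalues are also the values shown on a scree plot.
    """
    data = np.asarray(item_scores, dtype=float)
    corr = np.corrcoef(data, rowvar=False)        # item-by-item correlations
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]  # symmetric matrix, descending
    return int(np.sum(eigenvalues > 1.0)), eigenvalues
```

For a unidimensional scale, one would expect a single eigenvalue well above one, followed by a sharp drop in the scree plot.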
Internal consistency was evaluated with Cronbach’s alpha and with the correlation between each individual item and the total J-FIHOA score calculated without that item (item-total correlation). To examine gender differences, we compared the score of each item and performed an additional investigation to explore the measurement properties of Questions 7A and 7B, the two gender-role-specific items. Cronbach’s alpha and the item-total correlations were re-calculated in the 11-item model.
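A minimal sketch of both internal consistency statistics follows. It uses the standard formula for Cronbach's alpha, k/(k−1) × (1 − Σ item variances / variance of the total). For the item-total correlation it uses a Pearson coefficient for brevity, which is an assumption: the text does not say which coefficient was used, and a rank-based version would first convert both series to ranks.

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: list of respondents, each a list of k item scores."""
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def _pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def corrected_item_total(rows, item_index):
    """Correlation between one item and the sum of all the other items."""
    item = [row[item_index] for row in rows]
    rest = [sum(row) - row[item_index] for row in rows]
    return _pearson(item, rest)
```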
We selected participants in stable condition, whose symptoms and treatments were unchanged. They were asked to answer the J-FIHOA twice, repeating it after a one- to two-week interval. The intraclass correlation coefficient (ICC) was used to assess test-retest reliability.
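The text does not state which form of the ICC was used. As one possible illustration, the sketch below computes ICC(2,1) in the Shrout and Fleiss notation (two-way random effects, absolute agreement, single measurement) from the two-way ANOVA mean squares; the choice of this form is an assumption, not the study's documented method.

```python
def icc_2_1(scores):
    """ICC(2,1) for a table with one row per participant and one column
    per administration (for test-retest data: [test, retest])."""
    n = len(scores)      # participants
    k = len(scores[0])   # administrations
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                 # between participants
    ms_cols = ss_cols / (k - 1)                 # between administrations
    ms_error = ss_error / ((n - 1) * (k - 1))   # residual

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
```

Because this form penalizes systematic shifts between test and retest, it is stricter than a consistency-type ICC when scores drift between administrations.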
To assess construct validity, we performed hypothesis testing by focusing on correlations between the J-FIHOA and the other validated scales. Six hypotheses were established prior to data collection: Hand20 correlation would be the strongest among instruments; HAQ and NRS pain correlations would be moderate; SF-36 PCS correlation would be moderate but MCS and RCS would be weak or none.
To analyze responsiveness, we recruited symptomatic participants whose total J-FIHOA scores were 5 or more and who were starting oral analgesic drugs. We used scores immediately before the treatment and at the 4-week follow-up visit (+/− 2 weeks). We compared the scores using the Wilcoxon signed-rank test. The effect size (ES) and standardized response mean (SRM) were also evaluated. ES was obtained by dividing the mean change of scores by the standard deviation (SD) of initial scores. SRM was obtained by dividing the mean change of scores by the SD of that change.
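The two responsiveness indices defined above can be sketched directly from their definitions. The sign convention is an assumption of this sketch: change is taken as baseline minus follow-up, so that a drop in disability score gives a positive value. The function names are illustrative.

```python
from statistics import mean, stdev


def effect_size(baseline, follow_up):
    """ES: mean change divided by the SD of the baseline scores."""
    change = [b - f for b, f in zip(baseline, follow_up)]
    return mean(change) / stdev(baseline)


def standardized_response_mean(baseline, follow_up):
    """SRM: mean change divided by the SD of the change scores."""
    change = [b - f for b, f in zip(baseline, follow_up)]
    return mean(change) / stdev(change)
```

The two indices share a numerator and differ only in the denominator, so the SRM exceeds the ES whenever the change scores vary less than the baseline scores do.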
We evaluated responsiveness using two different approaches: comparisons of ES and SRM between the J-FIHOA and other measurements, and subgroup analyses of the J-FIHOA. We expected the J-FIHOA to have the largest ES and SRM among all the PROMs, except for NRS pain. We used the GRC scale for subgroup analysis. At the end of data collection, thirty data sets were available for responsiveness analysis. Almost all patients reported their changes as either a little improved (n=14) or much improved (n=12), and there were no complaints of deterioration. The remaining patients reported very much improved (n=1) or no change (n=3). We therefore divided the patients into two subgroups based on the GRC scale: a major change group (very much improved and much improved, n=13) and a minor change group (a little improved and no change, n=17). We hypothesized that the major change group would have larger ES and SRM on the J-FIHOA than the minor change group.
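The subgroup assignment can be checked with a short sketch that maps each GRC rating to its group and tallies the counts reported above. The group labels and the `unassigned` fallback for deterioration ratings are illustrative choices; no deterioration was reported in the study.

```python
from collections import Counter

MAJOR = {"very much improved", "much improved"}
MINOR = {"a little improved", "no change"}


def subgroup(rating):
    """Assign a GRC rating to the major or minor change group."""
    if rating in MAJOR:
        return "major change"
    if rating in MINOR:
        return "minor change"
    return "unassigned"  # deterioration ratings, not observed in this sample


# Counts as reported in the text (30 data sets in total).
reported = {
    "very much improved": 1,
    "much improved": 12,
    "a little improved": 14,
    "no change": 3,
}

sizes = Counter()
for rating, n in reported.items():
    sizes[subgroup(rating)] += n
```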