We created the J-FIHOA through a process of forward and backward translations to address the linguistic and cultural differences. We then performed a prospective multicenter study to validate its measurement properties with 210 Japanese hand OA patients with widely ranging conditions from well-controlled to severely symptomatic.
The translation process followed guidelines and publications from comparable initiatives[22, 23]. The justification for cultural adaptation depends on the concepts of the instrument and the populations concerned. Although the FIHOA consists of simple questions focusing on daily activities, several points needed to be scrutinized, especially the experiential equivalence of Question 10 “Would you accept a handshake without reluctance?” As a cultural convention, Japanese people, the elderly in particular, shake hands infrequently. Yet, the committee and developers agreed that Question 10 was unique and irreplaceable because it assessed aspects of aesthetics and communication.
Exploratory factor analysis revealed that the J-FIHOA was a unidimensional scale and all loadings satisfied the minimum requirement, usually set at 0.5[44]. All questions contributed to a single factor, which we considered representative of “physical dysfunction.” Cronbach’s alphas were above 0.9 and each item-total correlation exceeded 0.5. These findings indicated that the J-FIHOA was a unidimensional scale with good internal consistency[45]. Question 10 also showed feasibility and consistency with only one missing response.
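The internal consistency statistics mentioned above can be illustrated with a minimal sketch. It assumes the standard formula for Cronbach's alpha and a corrected item-total correlation (each item against the sum of the remaining items); the text does not state whether the reported item-total correlations were corrected, so that choice, the function names, and the toy responses are illustrative assumptions rather than the study's analysis code or data.

```python
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cronbach_alpha(responses):
    """Cronbach's alpha; responses is a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def corrected_item_total(responses):
    """Correlation of each item with the sum of all other items (assumed variant)."""
    k = len(responses[0])
    result = []
    for j in range(k):
        item = [row[j] for row in responses]
        rest = [sum(row) - row[j] for row in responses]
        result.append(pearson_r(item, rest))
    return result


# Hypothetical toy data: 4 respondents x 3 items scored 0-3 (not study data).
toy = [[0, 0, 1], [1, 1, 1], [2, 1, 2], [3, 2, 3]]
alpha = cronbach_alpha(toy)
item_total = corrected_item_total(toy)
print(round(alpha, 3), [round(r, 3) for r in item_total])
```

With real questionnaire data, an alpha above 0.9 and item-total correlations above 0.5 would correspond to the thresholds discussed in the paragraph above.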
We also examined another concern regarding gender differences using the 11-item model[32]. Females had a statistically significantly higher score (i.e. greater disability) on Question 4 “Are you able to lift a full bottle with the hand?” We assumed this might be due to inherent differences, such as in muscle strength[46]. Questions 7A and 7B had the strongest item-total correlations (i.e. 7A, “for women”, 0.885 among males, and 7B, “for men”, 0.809 among females), suggesting that the gender-specific parts were irrelevant and could be removed. This gave rise to another concern regarding how to treat Question 7: measure both items, delete either, or combine them. We could not reach a definitive conclusion, but we were disinclined to deviate from the original and kept “for women” and “for men” in the J-FIHOA. Additional investigation might be necessary, such as differential item functioning (DIF) analysis, which assesses whether the probability of responding to certain items differs among groups (ideally with more than 100 patients per group)[34].
Several validated PROMs were used to examine associations with the J-FIHOA. In addition to the FIHOA, the Australian/Canadian Hand OA Index (AUSCAN) is the other hand OA-specific assessment tool[47]. AUSCAN has excellent measurement properties and has been frequently applied in hand OA clinical trials[26, 27, 32]. Although a number of language versions are available, AUSCAN has not been translated into Japanese and validated. In addition, the tool is not freely available. Therefore, we did not include AUSCAN in this study.
Construct validity was assessed by testing hypotheses based on the fundamental assumptions that the FIHOA scores reflected the severity of physical dysfunction of the hands and that the Japanese version was equivalent to validated versions of the FIHOA. Hand20 assesses upper limb dysfunction and has some items similar to those of the J-FIHOA, such as “Do up shirt buttons with both hands” (cf. Question 8 “Are you able to fasten buttons?”). It showed the strongest correlation (r = 0.82), as expected. Although the FIHOA has no pain-related items, pain has been reported to be moderately correlated with the FIHOA, and our data were consistent with this (r = 0.58)[24–26, 28–31]. Since mental and social status are dimensions distinct from physical condition, no correlations were observed with the SF-36 MCS and RCS (r = −0.22 and −0.23, respectively). Our results showed that the HAQ had a stronger correlation (r = 0.73) and the SF-36 PCS had a weaker correlation (r = −0.36) than in previous reports, where both the HAQ and SF-36 PCS were moderately correlated with the FIHOA (r = 0.57 to 0.73 and r = −0.57 to −0.67, respectively)[26, 29–31]. We concluded that our results were not inconsistent with the construct of the J-FIHOA because these correlations largely depended on patient characteristics or conditions.
Longitudinal data enabled an evaluation of test-retest reliability (135 participants) and responsiveness (30 participants). In the test-retest analysis, we allowed the participants to answer the retest either at a face-to-face visit or by postal mail. Most chose the latter. Although the different forms of administration might have affected the reliability, the ICC was 0.83. Even the lower bound of the 95% CI was greater than 0.70, the minimal requirement for reliability, indicating that the J-FIHOA had good test-retest reliability[44].
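As an illustration of how a test-retest ICC can be obtained, the sketch below computes a two-way random-effects, absolute-agreement, single-measurement coefficient, often written ICC(2,1). The text does not state which ICC form was used, so this choice is an assumption, as are the function name and the toy scores; the confidence interval is not computed here.

```python
def icc_2_1(scores):
    """ICC(2,1) from a list of subjects, each a list of k repeated scores.

    Assumed form: two-way random effects, absolute agreement, single measurement.
    """
    n = len(scores)      # number of subjects
    k = len(scores[0])   # number of administrations (2 for test-retest)
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical test and retest totals for five respondents (not study data).
toy_retest = [[4, 5], [10, 9], [15, 17], [22, 20], [7, 7]]
print(round(icc_2_1(toy_retest), 3))
```

A different ICC form (for example, a consistency rather than absolute-agreement definition) would change the denominator, so the form should be matched to the one reported in a given study.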
ES and SRM are widely used to evaluate responsiveness. However, COSMIN suggests that both are inappropriate in some situations. One reason is that ES and SRM are highly dependent on the SD of the initial scores and of the change scores, respectively. If the target population is homogeneous or the variation in treatment effect is small, these values can be large. We therefore assessed the responsiveness of the J-FIHOA using two construct approaches: comparison with other PROMs and division of the patients into two subgroups[34]. The J-FIHOA showed the highest ES and SRM among all PROMs, except for NRS pain. We assumed that this was because the J-FIHOA is a hand OA-specific scale and could detect subtle differences in physical dysfunction. It was unsurprising that NRS pain showed the best responsiveness among the questionnaires. We used short-duration oral analgesic drugs to assess responsiveness because no disease-modifying drugs are available and evidence does not support surgical intervention, especially for osteoarthritis of the interphalangeal joints[48]. Previous clinical trials also revealed that pain scoring is the most sensitive tool in hand OA symptom assessment[49]. Subgroup analysis was performed by dividing the patients into two groups based on the GRC scale. As expected, the patients who reported greater improvements had a larger ES and SRM. Although the results indicated that the J-FIHOA had good responsiveness, the number of longitudinal data sets was relatively small (n = 30) and inadequate for further analyses, such as estimation of the minimal clinically important difference[50].
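The dependence of ES and SRM on their respective SDs can be made concrete with a minimal sketch. It assumes the common definitions (ES = mean change divided by the SD of baseline scores; SRM = mean change divided by the SD of change scores) and takes change as baseline minus follow-up so that improvement on a disability scale is positive. The function names and toy scores are hypothetical and are not the study's data.

```python
from statistics import mean, stdev


def effect_size(baseline, follow_up):
    """ES: mean change divided by the SD of the baseline scores."""
    changes = [b - f for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(baseline)


def standardized_response_mean(baseline, follow_up):
    """SRM: mean change divided by the SD of the change scores."""
    changes = [b - f for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


# Hypothetical scores for four patients; everyone improves by 2 or 3 points.
baseline = [10, 12, 14, 16]
follow_up = [8, 9, 12, 13]

es = effect_size(baseline, follow_up)
srm = standardized_response_mean(baseline, follow_up)
print(round(es, 2), round(srm, 2))
```

In this toy example the ES is about 0.97 while the SRM is about 4.33, because the change scores vary very little between patients. This is the situation described above in which a small variation in treatment effect inflates the statistic.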
The strengths of this study were the number and diversity of our hand OA patients, who ranged from well-controlled to awaiting scheduled surgery with severe symptoms. This enabled us to estimate internal consistency precisely and to generalize our results to a wide range of clinical and research settings. To our knowledge, with its 17 participating university and community hospitals, this is the first multicenter study conducted for hand OA research in Japan. We are confident that this framework will function effectively in future clinical investigations.
This study has several limitations. We did not evaluate measurement invariance by comparing Japanese and Western patients directly. Some cultural differences, such as in daily activities, personality traits and perception of physical dysfunction, may have influenced responses to each question in various ways. Data from the two populations and DIF analysis would reveal differences at the item level. Another limitation is that we have not verified the diagnostic cut-off value of the J-FIHOA for defining symptomatic hand OA among Japanese patients. We neither recruited healthy individuals nor clearly defined non-symptomatic hand OA based on criteria other than the J-FIHOA scores. We anticipate that future international trials or Japanese cohort studies will accumulate more evidence regarding the measurement properties of the J-FIHOA.