The SF-36 is one of the most commonly used generic HRQoL questionnaires worldwide, both in studies that measure the impact of a disease on HRQoL in different groups of patients [22–31] and in studies that assess the effect of certain therapeutic interventions on HRQoL [32–38]. It has also been used as a reference in the validation of new instruments [39–44]. The SF-36 has been used to measure HRQoL in patients with C1-INH-HAE [25–31] and to assess the effect of some therapeutic interventions [35,36,38,45–47]. However, we have found no studies of its psychometric properties in patients with C1-INH-HAE and, to the best of our knowledge, it has yet to be validated for use in C1-INH-HAE.
The psychometric analysis in this study yields satisfactory results overall and provides support for validating the SF-36 as a tool for assessing HRQoL in C1-INH-HAE patients. The extremely low rate of unanswered questions indicates that the questionnaire was suitable for patients with C1-INH-HAE. However, further analysis reveals an elevated ceiling effect in the majority of individual items. This suggests either that a greater choice of answers should be included at the top of the scale or that respondents did not consider those items to be relevant to C1-INH-HAE. In either case, this would limit the content validity of the SF-36 in C1-INH-HAE.
The SF-36 showed good internal consistency, with all Cronbach’s α coefficient values being higher than 0.7. Similar data were observed for the eight domains in other studies [42–44,48].
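As an illustration of the internal consistency statistic referred to above, the following is a minimal sketch of how Cronbach's α can be computed for one domain from item-level responses. The function name `cronbach_alpha` and the response data are hypothetical and are not taken from the study; the sketch assumes complete responses (no missing items) and uses the sample variance.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for one domain.

    item_scores: list of respondents, each a list with one score per item.
    Assumes every respondent answered every item (hypothetical, complete data).
    """
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sample variance of each item across respondents
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Sample variance of each respondent's summed score
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 5 respondents, 3 items on a 1-5 scale
responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 5],
]
alpha = cronbach_alpha(responses)
meets_threshold = alpha > 0.7  # the 0.7 cut-off mentioned in the text
```

In this sketch, a domain would be regarded as internally consistent when `meets_threshold` is true.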
The ceiling effect is present in 5 out of the 8 SF-36 domains (“RE”, “SF”, “PF”, “RP”, “BP”). However, it should be taken into account that we adopted a very strict definition of this effect (more than 15% of respondents obtaining the highest possible score), in comparison to other studies in which the threshold was as high as 60% [49]. The presence of the ceiling effect indicates that there may be a lack of response options for items at the top of the scale, which would imply a limited content validity. Consequently, patients with the highest score cannot be distinguished from one another, and reliability would thus be reduced. It could also indicate that these domains are not relevant to C1-INH-HAE patients. In contrast, no floor effect was found in the SF-36 domains, which suggests that there is no lack of response options at the bottom of the scale. The SF-36 therefore has certain content validity limitations that may affect its use in C1-INH-HAE. Similar findings have already been described in a study in which the author found a low sensitivity of the SF-36 when assessing subtle variations of functional status and emotional functioning in patients with brain tumors.
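The effect of the chosen threshold on whether a ceiling effect is declared can be sketched as follows. This is an illustrative example only: the function names and the score values are hypothetical, and domain scores are assumed to lie on the usual 0 to 100 SF-36 scale.

```python
def extreme_score_shares(scores, lowest=0, highest=100):
    """Return (floor_share, ceiling_share): the proportions of respondents
    with the lowest and the highest possible domain score."""
    n = len(scores)
    if n == 0:
        raise ValueError("no scores supplied")
    floor_share = sum(1 for s in scores if s == lowest) / n
    ceiling_share = sum(1 for s in scores if s == highest) / n
    return floor_share, ceiling_share


def effect_present(share, threshold=0.15):
    """An effect is declared when the share exceeds the threshold
    (15% by default, the strict definition described in the text)."""
    return share > threshold


# Hypothetical domain scores for 10 respondents
domain_scores = [100, 100, 100, 50, 75, 80, 90, 60, 70, 85]
floor_share, ceiling_share = extreme_score_shares(domain_scores)

strict_ceiling = effect_present(ceiling_share, threshold=0.15)
lenient_ceiling = effect_present(ceiling_share, threshold=0.60)
```

With these made-up scores, 30% of respondents are at the top of the scale, so a ceiling effect is declared under the 15% definition but not under a 60% threshold.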
As there is no single gold standard assessment tool for HRQoL, we analysed convergent criterion validity by comparing data from the SF-36 and HAE-QoL questionnaires. In our study, correlations between the SF-36 domains and summary scores and the HAE-QoL total and domain scores were mostly mild to moderate (> 0.40) and statistically significant, which indicates some agreement between the two instruments. The strongest correlations were seen between the HAE-QoL total score and the “BP” and “RF” domains, as well as the “MCS” of the SF-36. Higher correlations were also observed between related domains of the two questionnaires (such as the “MH” domain of both questionnaires, “Physical functioning and health” with “RP” and “Emotional and Social roles” with “SF”) than between unrelated domains. Based on these results, we can assume that coherence and equivalence are verified for the quality-of-life concept, as assessed by these two instruments. This indicates that both scales coincide in the subjective and objective aspects that make up the construct, although their conceptual structures and items differ. Furthermore, the lack of strong correlations might be due to the fact that the SF-36 is a generic questionnaire, while the HAE-QoL is specific to patients with C1-INH-HAE. This would also explain the low correlations observed between the “Concern about offspring” domain and the SF-36 domains and their physical and mental summaries, as this aspect is specific to C1-INH-HAE and other hereditary diseases and could not be adequately captured by a generic questionnaire such as the SF-36.
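The kind of correlation analysis described above can be illustrated with a short sketch. The choice of Spearman's rank coefficient here is an assumption made for illustration, since the type of coefficient is not specified in this section; the function names and the scores are likewise hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical paired scores for 5 patients
hae_qol_total = [60, 75, 90, 110, 120]
sf36_bp = [35, 55, 50, 80, 90]
rho = spearman(hae_qol_total, sf36_bp)
```

In a convergent validity analysis, one such coefficient would be computed for each pair of SF-36 and HAE-QoL scores and compared against the expected pattern of related and unrelated domains.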
For construct validity, the recommended quality criterion that at least 75% of pre-established hypotheses be confirmed [16] was fulfilled using the combination of “a priori” and “post hoc” defined criteria, with 87.5% (7/8) of hypotheses confirmed. It is worth noting that patients who presented some factors that could a priori be considered determinants of the impact on HRQoL (such as having undergone intubation or a tracheotomy at least once) showed no significant differences. Past intubation or tracheotomy procedures may have no impact on current HRQoL, as they may have been performed years earlier and, as a result, are no longer of concern at the time of questioning. Therefore, this would not be a good criterion on which to assess the construct validity of the instrument. This issue also arose in the psychometric study of the HAE-QoL [5]. With respect to other factors, such as the effect of long-term prophylaxis (LTP), no significant differences were observed in the “RP”, “RE”, and “SF” domains, in which there was a ceiling effect, or in the “VT” domain, which had neither floor nor ceiling effects. Having had angioedema symptoms in the last 6 months was associated with no significant differences in the “PF”, “SF”, and “RE” domains, all of which exhibited a ceiling effect.
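The 75% criterion for construct validity mentioned above amounts to a simple proportion check, sketched here with the 7 of 8 confirmed hypotheses reported in the text. The function name is hypothetical.

```python
def hypotheses_criterion(confirmed, total, required_share=0.75):
    """Return the share of confirmed hypotheses and whether it reaches
    the required minimum (75% by default)."""
    if total <= 0 or not 0 <= confirmed <= total:
        raise ValueError("invalid hypothesis counts")
    share = confirmed / total
    return share, share >= required_share


# Counts reported in the text: 7 of 8 hypotheses confirmed
share, criterion_met = hypotheses_criterion(7, 8)
```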
Analysis of the discriminant validity of the SF-36 shows that discrimination was good among patients with different levels of C1-INH-HAE severity in the last 6 months. There were significant differences among the 3 scoring groups across all domains and the two summaries, with HRQoL being lower when the severity of the disease was higher. These data show the capacity of the SF-36 to distinguish among these known groups.
An examination of test-retest reliability shows that the generic SF-36 questionnaire is stable in patients with C1-INH-HAE, as it meets the recommended standards of the GA2LEN taskforce for assessing Patient-Reported Outcomes in allergy [50].
The MCID, calculated by two different distribution-based methods, shows that the generic SF-36 questionnaire could be useful as a tool for detecting real changes in HRQoL in patients with C1-INH-HAE. The MCID has been evaluated less often than other psychometric properties in studies that validate the SF-36 in other diseases.
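For readers unfamiliar with distribution-based MCID estimates, the sketch below shows two commonly used approaches: half a standard deviation of the baseline scores, and the standard error of measurement (SEM). These are assumptions chosen for illustration, as this section does not name the two methods applied; the function names, scores, and reliability value are hypothetical.

```python
from math import sqrt
from statistics import stdev


def mcid_half_sd(baseline_scores):
    """MCID estimated as 0.5 x the sample SD of baseline scores."""
    return 0.5 * stdev(baseline_scores)


def mcid_sem(baseline_scores, reliability):
    """MCID estimated as one SEM: SD x sqrt(1 - reliability).

    reliability: a reliability coefficient between 0 and 1
    (for example a test-retest coefficient or Cronbach's alpha).
    """
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must lie between 0 and 1")
    return stdev(baseline_scores) * sqrt(1 - reliability)


# Hypothetical baseline domain scores and reliability coefficient
baseline = [40, 50, 60, 70, 80]
half_sd_estimate = mcid_half_sd(baseline)
sem_estimate = mcid_sem(baseline, reliability=0.75)
```

Note that the two estimates coincide when the reliability is 0.75, because sqrt(1 - 0.75) = 0.5; with higher reliability the SEM-based estimate is smaller than the half-SD estimate.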
The main limitations of the study are its post hoc design and the different sample sizes among participating countries.
Despite these limitations, the internationally accepted scientific recommendations for the validation of HRQoL measurement instruments have been followed [15,17], and data on reliability and content and construct validity have been highly acceptable. Moreover, as an international multicentre study, it provides results on which to base generalization, unlike studies with less diverse patient samples.