Measurement invariance of the Seattle Angina Questionnaire in coronary artery disease

The Seattle Angina Questionnaire (SAQ) is a widely used patient-reported measure of health status in patients with coronary artery disease. Comparisons of SAQ scores among population groups and over time rely on the assumption that its factorial structure is invariant. This study evaluates the measurement invariance of the SAQ across different demographic and clinical groups and over time. Data were obtained from the Alberta Provincial Project on Outcome Assessment in Coronary Heart Disease registry, a registry of patients who received a coronary angiogram in Alberta, Canada. The study cohort consists of adult patients who completed the paper-based 16-item Canadian version of the SAQ (SAQ-CAN) 2 weeks and 1 year post-coronary angiogram between 2009 and 2016. Multi-group confirmatory factor analysis was used to assess configural, weak, strong, and strict measurement invariance across age groups, sex, angina type, treatment, and over time. Model fit was assessed using the comparative fit index and root mean square error of approximation. Of the 8101 patients included in this analysis, 1300 (16.1%) were at least 75 years old, 1755 (21.7%) were female, 5154 (63.6%) were diagnosed with acute coronary syndrome, 1177 (14.5%) received coronary artery bypass graft treatment, and 3279 had complete data on the SAQ-CAN at both occasions. There was evidence of strict invariance across age, sex, angina type, and treatment groups, but only partial strict invariance was established over time. The SAQ-CAN can be used to compare the health status of coronary artery disease patients across population groups and over time.


Introduction
Heart disease is the leading cause of premature mortality and affects more than 126 million individuals globally [1]. In addition to conferring increased risks of premature mortality and other major adverse cardiac events, chronic heart disease leads to significant ongoing disabling symptoms and associated impairment in functional status and health-related quality of life [1][2][3]. Professional societies [4], such as the American Heart Association, have called for the integration of patient-reported outcomes as an important endpoint in the delivery of cardiovascular care. Patient-reported outcome measures (PROMs), which are patients' appraisals of their health status and quality of life, are useful for comparing the health status of population groups and evaluating the effectiveness of treatment interventions in routine clinical care [5][6][7]. However, such comparisons rely on the assumption that the health-related quality of life construct being measured is invariant across groups or over time (i.e., measurement invariance) [8,9]. Specifically, measurement invariance determines the extent to which a PROM construct is equivalent across groups of interest. Violation of this assumption may lead to biased and flawed conclusions about population group differences or temporal changes [6,10,11].
The Seattle Angina Questionnaire (SAQ), a cardiac disease-specific measure of quality of life, is a commonly used PROM in patients with coronary artery disease (CAD) [12][13][14]. Originally developed in a population of United States veterans, the SAQ has been translated into more than 52 languages, with several studies confirming its construct validity and reliability. The psychometric properties of the SAQ have been well-documented, with studies reporting adequate internal consistencies ranging between 0.70 and 0.98 [12][13][14][15][16]. However, a number of studies have reported suboptimal factorial validity of the SAQ and have recommended different variants of the SAQ measure in different populations. For example, Kimble et al. [16] validated the SAQ in a predominantly female sample with stable angina and showed the emergence of new subscales (e.g., division of the physical limitation subscale into two separate factors) and misfit of one of the SAQ items. Similarly, the translation and validation of the Farsi version of the SAQ yielded a five-factor solution with subscales that were not identical to the original SAQ subscales [17]. Garrath et al. [18] also reported that the original factorial structure of the SAQ was not replicated in a sample of patients with stable angina but resulted in the emergence of the 15-item United Kingdom version of the SAQ with 3 subscales. Recent work by our group also revealed that the original factorial structure of the SAQ was invalid in a Canadian cohort with stable angina. Instead, a four-factor measurement model consisting of 16 items provided an optimal fit called the Canadian version of the SAQ (SAQ-CAN) [19]. In addition, we are not aware of any study that has previously examined the measurement invariance of the 19-item SAQ or modified versions of the measure across population subgroups.
This study aims to evaluate measurement invariance properties of the SAQ across demographic (sex, age), clinical (treatment, angina type) groups, and time using data from a cohort sample of patients with coronary artery disease. Specifically, we hypothesized that the 16-item SAQ would be invariant across these demographic and clinical subgroups and longitudinally. This investigation has important research and clinical implications for the use of SAQ for describing population group differences and evaluating longitudinal changes in patient outcomes in coronary artery disease.

Data source
Data for this study were obtained from the Alberta Provincial Project for Outcome Assessment in Coronary Heart Disease (APPROACH) registry, a prospective population-based registry of all patients who had a coronary angiogram and/or revascularization in Alberta, Canada since 1995 [20][21][22]. The registry captures detailed clinical information, including information on demographics, comorbidities, medications, indications for the procedure, the use of invasive coronary procedures (coronary artery bypass graft [CABG] or percutaneous coronary intervention [PCI]), as well as mortality, based on linkage with provincial administrative databases. Data collected during coronary angiogram included demographic characteristics (sex, age, and address), number of comorbid conditions, measures of disease severity, coronary angiography results, and quality of life in patients who consent to follow-up. Patient-reported outcomes measures, including SAQ, were mailed to patients (with return postage) 2 weeks, 1 year, 3 years, and 5 years after cardiac catheterization. Details about the APPROACH registry have been published elsewhere [21,22].
This study cohort consisted of adult patients (≥ 18 years) who (1) underwent coronary angiogram for coronary artery disease between January 1, 2009, and December 31, 2016, (2) had stable angina or acute coronary syndrome, and (3) received invasive or medical management for coronary artery disease between 2 weeks and 1 year after coronary angiogram. Only patients who had complete data on the SAQ-CAN [16] were included in the analyses. This measure consists of 16 items in four subscales: indoor physical limitation (IPL; 3 items), outdoor physical limitation (OPL; 4 items), angina symptoms burden (ASB; 6 items), and treatment-related experience (TRE; 3 items). The SAQ-CAN has been shown to be valid and responsive in the Canadian population [19]. The items are scored on 5- or 6-point Likert scales, and the sum of item scores in each subscale is then transformed to a score ranging from 0 (no functioning) to 100 (highest level of functioning). Ethics approval for this study was obtained from the University of Calgary Conjoint Health Research Ethics Board (REB15-1195).
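The subscale scoring described above can be illustrated with a minimal sketch. It assumes the conventional linear rescaling (subtract the lowest possible sum, divide by the possible range, multiply by 100); the text states only that summed item scores are transformed to a 0 to 100 range, so the exact formula, the function name `subscale_score`, and the example responses are assumptions made for illustration.

```python
def subscale_score(responses, n_categories):
    """Rescale summed Likert responses to 0-100 (assumed linear transformation).

    responses    -- item responses, each coded 1..k for that item
    n_categories -- number of response categories (k) for each item
    """
    if len(responses) != len(n_categories):
        raise ValueError("one category count is needed per item")
    for r, k in zip(responses, n_categories):
        if not 1 <= r <= k:
            raise ValueError(f"response {r} outside 1..{k}")
    lowest = len(responses)      # every item at its minimum code (1)
    highest = sum(n_categories)  # every item at its maximum code (k)
    return 100.0 * (sum(responses) - lowest) / (highest - lowest)


# Hypothetical 3-item subscale scored on 5-point items
print(subscale_score([5, 5, 5], [5, 5, 5]))  # all items at the top category
print(subscale_score([1, 1, 1], [5, 5, 5]))  # all items at the bottom category
print(subscale_score([3, 4, 2], [5, 5, 5]))  # mixed responses
```

Passing the category count per item lets the same function handle subscales that mix 5- and 6-point items.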

Statistical analysis
Descriptive analyses based on frequencies and percentages were used to summarize the distribution of patients' responses to the 16 items of the SAQ-CAN at the first assessment (2 weeks post-angiogram). Between-group mean differences and longitudinal differences in SAQ-CAN subscale scores were also described using (paired) t-tests and/or analysis of variance. The fit of the theoretical factorial structure for the SAQ-CAN items was evaluated using confirmatory factor analysis based on the diagonally weighted least squares estimator to account for the ordinal nature of the SAQ-CAN items. Multiple indices were examined to determine model fit: (a) the chi-square test, (b) the comparative fit index (CFI), and (c) the root mean square error of approximation (RMSEA) with its 90% confidence interval (90% CI). To interpret these indices, we used the critical values previously recommended [23][24][25]. Specifically, CFI values > 0.90 and RMSEA values < 0.08 were considered benchmarks for acceptable fit.
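As a rough illustration of the fit benchmarks above, the following sketch applies the CFI > 0.90 and RMSEA < 0.08 criteria and shows the textbook single-group formulas for both indices. The function names are hypothetical, and the formulas are the standard unscaled definitions; software using weighted least squares estimators may report scaled versions that differ from these.

```python
import math


def rmsea(chi2, df, n):
    """Textbook single-group RMSEA point estimate from a chi-square statistic."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model, df_model, chi2_baseline, df_baseline):
    """Textbook CFI: 1 minus the ratio of model to baseline noncentrality."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_baseline - df_baseline, d_model, 0.0)
    return 1.0 if d_base == 0 else 1.0 - d_model / d_base


def acceptable_fit(cfi_value, rmsea_value, cfi_min=0.90, rmsea_max=0.08):
    """Apply the benchmarks used in this study: CFI > 0.90 and RMSEA < 0.08."""
    return cfi_value > cfi_min and rmsea_value < rmsea_max


# Fit indices reported in this study for the four-factor item-level model
print(acceptable_fit(0.989, 0.061))

# Hypothetical chi-square values, for illustration only
print(round(rmsea(150.0, 100, 501), 3), round(cfi(150.0, 100, 2100.0, 120), 3))
```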
Hypotheses about measurement invariance were tested for the following grouping variables: (a) age group (< 75 years vs. ≥ 75 years), (b) sex (female vs. male), (c) angina type (acute coronary syndrome [ACS] vs. stable angina), (d) treatment received (CABG vs. PCI vs. medical management), and (e) time (2 weeks vs. 1-year follow-up). Previous studies have reported age differences in quality of life and risk of adverse health outcomes in older heart disease patients compared to younger patients [22,26,27]. Multi-group confirmatory factor analysis (MGCFA) with diagonally weighted least squares estimation was used to examine four forms of measurement invariance on the SAQ-CAN items: configural, weak, strong, and strict invariance. A separate series of multi-group confirmatory factor analyses was conducted for each grouping variable. The MGCFA begins with the examination of configural invariance, which involves fitting the factor structure of the SAQ-CAN across subgroups of each grouping variable while freely estimating parameters (i.e., thresholds, factor loadings, error variances) for each subgroup. Configural invariance is satisfied if the factor structure of the SAQ-CAN is a good fit for the data in all groups (i.e., CFI > 0.90; RMSEA < 0.08).
Then, weak, strong, and strict invariance were tested by sequentially placing constraints on the parameters (i.e., factor loadings, thresholds, and error variances) of the configural invariance model. Weak invariance assesses the extent to which the magnitudes of the item factor loadings are the same between groups or over time. When weak invariance is satisfied, the latent factor is being measured the same way across groups or over time. For strong invariance, the thresholds and factor loadings of the configural measurement model are constrained to be equal across groups or over time. Strong measurement invariance is a prerequisite for making valid between-group comparisons; when it holds, comparisons of mean scores on the SAQ-CAN can be considered valid. Strict factorial invariance holds if the factor loadings, intercepts, and error variances are invariant across groups or over time [28].
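The nested sequence of constraints can be written compactly as follows. This is a sketch in assumed notation that does not appear in the original text: g indexes groups or occasions, j indexes items, c indexes response categories, and for ordinal items the thresholds play the role that intercepts play for continuous items.

```latex
% Assumed notation: \Lambda = factor loadings, \tau = thresholds,
% \Theta = error (residual) covariance matrix, \eta = latent factors,
% with \tau_{j,0} = -\infty and \tau_{j,C_j} = +\infty for an item with C_j categories.
\begin{align*}
\mathbf{y}^{*(g)} &= \boldsymbol{\Lambda}^{(g)} \boldsymbol{\eta}^{(g)} + \boldsymbol{\varepsilon}^{(g)},
\qquad \boldsymbol{\Theta}^{(g)} = \operatorname{Cov}\bigl(\boldsymbol{\varepsilon}^{(g)}\bigr), \\
y_j^{(g)} &= c \quad \text{if } \tau_{j,c}^{(g)} < y_j^{*(g)} \le \tau_{j,c+1}^{(g)}, \\[4pt]
\text{Configural:}\quad & \text{same pattern of zero and nonzero loadings in every } \boldsymbol{\Lambda}^{(g)}, \\
\text{Weak:}\quad & \boldsymbol{\Lambda}^{(g)} = \boldsymbol{\Lambda} \ \text{for all } g, \\
\text{Strong:}\quad & \boldsymbol{\Lambda}^{(g)} = \boldsymbol{\Lambda},\ \tau_{j,c}^{(g)} = \tau_{j,c} \ \text{for all } g, \\
\text{Strict:}\quad & \boldsymbol{\Lambda}^{(g)} = \boldsymbol{\Lambda},\ \tau_{j,c}^{(g)} = \tau_{j,c},\ \boldsymbol{\Theta}^{(g)} = \boldsymbol{\Theta} \ \text{for all } g.
\end{align*}
```

Each level adds one set of equality constraints to the previous one, which is why the models are nested and can be compared through changes in fit.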
Changes in goodness-of-fit statistics and published cutoff criteria have been proposed for assessing measurement invariance [28][29][30]. For example, the chi-square difference test, which compares the χ² test statistics of the unconstrained and constrained models (i.e., Δχ²), is one way to test for measurement invariance in ordinal data. Differences in comparative fit index (CFI) values for nested models (i.e., ΔCFI) are alternative measures of invariance recommended in the literature. An absolute value of ΔCFI less than or equal to 0.01 indicates that the null hypothesis of invariance should not be rejected, while an absolute value greater than 0.01 indicates a likely difference in fit between constrained and unconstrained models [28]. Overall, ΔCFI was given more weight than the chi-square difference test when the two statistics disagreed [30].
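The ΔCFI decision rule can be sketched as a walk through the nested models. The function name `highest_invariance` and the CFI values below are hypothetical; the sketch assumes each model is compared with the one immediately before it, assumes the configural model has already met the fit benchmarks, and does not cover the partial invariance models that are obtained by relaxing individual constraints.

```python
LEVELS = ["configural", "weak", "strong", "strict"]


def highest_invariance(cfi_by_level, max_drop=0.01, decimals=3):
    """Return the most restrictive level whose CFI change from the previous
    model does not exceed max_drop (the |delta CFI| <= 0.01 rule)."""
    supported = LEVELS[0]  # assumes the configural model already fits acceptably
    for previous, current in zip(LEVELS, LEVELS[1:]):
        delta = abs(cfi_by_level[current] - cfi_by_level[previous])
        # Round so floating-point noise does not push a 0.010 change over the cutoff
        if round(delta, decimals) > max_drop:
            break
        supported = current
    return supported


# Hypothetical CFI values, for illustration only
example = {"configural": 0.988, "weak": 0.987, "strong": 0.985, "strict": 0.984}
print(highest_invariance(example))

# A change of 0.020 at the strong step stops the sequence at weak invariance
stalled = {"configural": 0.980, "weak": 0.978, "strong": 0.958, "strict": 0.957}
print(highest_invariance(stalled))
```

Rounding to three decimals mirrors how fit indices are usually reported and keeps a change of exactly 0.010 on the "invariant" side of the cutoff.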
Similarly, MGCFA based on maximum likelihood estimation was used to assess measurement invariance at the subscale level across sex, age, angina type, and treatment groups. All descriptive analyses were conducted using R software [31], while the MGCFA analyses were conducted in Mplus version 8.4 [32].
Table 3 shows that female patients consistently reported significantly lower SAQ-CAN subscale scores than male patients. Similarly, older patients reported significantly lower scores on the indoor physical limitations and outdoor physical limitations subscales than younger patients. While ACS patients generally reported higher average SAQ-CAN subscale scores on outdoor physical limitations, angina stability/burden, and [...]
[Flow chart: Excluded patients: missing disease indication (n = 11788)]

Factorial validity of the SAQ-CAN
The original four-factor structure of the SAQ-CAN provided a good overall fit to the data (CFI = 0.989; RMSEA = 0.061 (90% CI = [0.059, 0.063])) at the item level. See Fig. 2 and ESM Tables A2 and A3 for more details. Using the same process as described above, successively stricter constraints were tested to evaluate configural, weak, strong, and strict measurement invariance across male and female patients. Configural invariance was supported by fit indices meeting the requirements for good/acceptable fit (CFI = 0.989; RMSEA = 0.061 (90% CI = [0.059, 0.063])). The assumptions of weak, strong, and strict measurement invariance were satisfied, as evidenced by acceptable model fit and negligible changes in fit for the more constrained models (∆CFI < 0.01; ∆RMSEA < 0.01).

Cross-sectional measurement invariance
To examine measurement invariance of the SAQ-CAN across types of angina, the measurement models were fit to subgroups of individuals with acute coronary syndrome (ACS) and those with stable angina. Configural invariance of the four-factor measurement model was supported by good/acceptable fit indices across the two angina types (CFI = 0.989; RMSEA = 0.061 (90% CI = [0.059, 0.063])). Successively stricter constraints on the factor loadings, intercepts, and residuals were then tested, and weak, strong, and strict invariance could be assumed across the ACS and stable angina subgroups, as evidenced by a negligible change in model fit for the stricter models (∆CFI ≤ 0.01; RMSEA = 0.047 (90% CI = [0.045, 0.048])).
Furthermore, we evaluated the four forms of measurement invariance across subgroups of patients who received CABG, PCI, or medical management for their disease. Configural invariance across these treatment groups was supported, as evidenced by good/excellent fit (CFI = 0.988; RMSEA = 0.061 (0.059, 0.063)). Successively stricter constraints on the factor loadings, intercepts, and residual variances revealed that weak, strong, and strict measurement invariance were supported by the data (CFI = 0.984; RMSEA = 0.056 (0.055, 0.058)), with negligible changes in fit indices across the three groups (∆CFI ≤ 0.01). This result suggests that comparisons of the SAQ-CAN across these three treatment groups are valid.
Table 5 describes the goodness-of-fit statistics for the assessment of measurement invariance of the SAQ-CAN subscales across age, sex, angina type, and treatment subgroups. Specifically, configural invariance across sex groups was supported by fit indices meeting the requirements for good/acceptable fit (CFI = 0.994; RMSEA = 0.055 (90% CI = [0.041, 0.071])). The assumptions of weak, strong, and strict measurement invariance were satisfied, as evidenced by acceptable model fit and negligible changes in fit for the more constrained models (∆CFI < 0.01; ∆RMSEA < 0.01). Similarly, strict measurement invariance was established across age groups, sex, and type of angina with acceptable model fit indices. However, the assumption of strong invariance across treatment groups was not supported despite acceptable fit indices. Instead, partial strong invariance was established across these groups when the constraints on the OPL and ASB subscale intercepts were relaxed (CFI = 0.984; RMSEA = 0.068 (90% CI = [0.061, 0.075])).

Longitudinal measurement invariance
The longitudinal measurement invariance analysis was conducted on individuals with complete data on the SAQ-CAN items two weeks following coronary angiogram and at 1-year follow-up (N = 3279). Table 5 describes the model fit indices for the four forms of measurement invariance over time at the item level and at the subscale level. First, our analysis revealed that the configural measurement model, in which all parameters were freely estimated across occasions, had an acceptable fit (i.e., CFI = 0.955; RMSEA = 0.038 (90% CI = [0.036, 0.039])) (see Fig. 3). Successively stricter constraints on the factor loadings, intercepts, and residual variances revealed that both weak and strong measurement invariance were supported by the data with negligible changes in fit indices across time.

Discussion
This study investigates the measurement invariance of the SAQ-CAN across demographic groups, type of angina, treatment groups, and over time in a sample of patients with coronary artery disease. To our knowledge, this is the first study to evaluate between-group and longitudinal measurement invariance of any instrument in the SAQ family in coronary artery disease. These findings provide preliminary evidence in support of the factorial validity of the SAQ-CAN in a heterogeneous sample of individuals with coronary disease and support the findings from an earlier investigation of its psychometric properties in individuals with stable angina [19] (Table 6).
Our analyses revealed notable differences in conclusions about measurement invariance of the SAQ-CAN between the item-level and subscale-level analyses across subgroups and over time. The item-level analyses confirm strict invariance of the SAQ-CAN across all subgroups but only partial strict longitudinal invariance over the 1-year follow-up period. In contrast, the subscale-level analyses confirm strict invariance of the SAQ-CAN longitudinally and across sex, age, and angina subgroups but not across treatment groups. Instead, partial strong invariance was established across the treatment groups. According to Baumgartner and Steenkamp [28], weak and strong invariance are necessary and sufficient conditions for valid comparisons of measures between groups and over time. Consequently, these results provide evidence that the latent construct being measured is invariant over time and/or across subgroups and that the subgroup and/or longitudinal differences are meaningful. These findings also support previously reported subgroup comparisons of SAQ scores. For example, epidemiological studies have consistently shown that, among patients who receive coronary angiograms, women report poorer SAQ scores than men, and older patients report lower SAQ scores than younger patients [33][34][35][36]. Such differences have previously been attributed to differences in how women and men experience cardiac symptoms [30]. However, these results suggest that female patients interpret the questions in a similar manner to male patients. On the other hand, studies have also reported treatment-related differences in HRQOL scores. In particular, individuals who received CABG or PCI reported significantly higher scores than those who were medically managed [37,38].
Furthermore, the confirmation of partial strict longitudinal measurement invariance of the SAQ-CAN items in this sample provides evidence in support of the validity of the SAQ-CAN over time. Previous research reported overall improvement in patient-reported physical functioning, angina stability, and satisfaction with treatment post-coronary angiogram [39,40]. Also, heterogeneity in longitudinal changes in health-related quality of life has been noted in coronary artery disease patients, with more than 25% reporting a significant decrease on all subscales over time despite an overall average increase in PRO subscale scores [35]. Our study findings support the conclusions from these studies.
The lack of full strict longitudinal measurement invariance of the SAQ-CAN in this study can be explained by non-invariant error variances over time, which can be attributed to how individuals with coronary artery disease adapt to their disease following coronary angiogram. Extensive research has shown that such adaptations can influence how patients respond to the same questions about their health and well-being, leading to a change in one's internal standards of measurement, a change in one's values, or a change in one's definition of the construct, a phenomenon commonly referred to as response shift [41][42][43]. Because response shift is a form of longitudinal measurement non-invariance, the confirmation of only partial strict longitudinal measurement invariance indicates the presence of non-uniform recalibration response shift [44,45], which occurs when participants change their internal standards of measurement and downgrade (or upgrade) their ratings at the follow-up assessment. Future research will seek to quantify the magnitude of the non-uniform recalibration response shift effect in this population.
A notable strength of this study is its use of a large cohort of patients with coronary artery disease. However, this study is not without its limitations. First, our examination of measurement invariance of the SAQ-CAN across sex, age, treatment, angina type, and time was motivated partly by previous research that has mostly reported measurement non-invariance across demographic groups and partly by commonly reported comparisons of health-related quality of life across disease/treatment groups and over time in the literature. However, our study conclusions might not be generalizable to all subgroup comparisons of the SAQ-CAN. For example, the measurement invariance of the SAQ-CAN across acute coronary syndrome and stable angina groups might not be generalizable to all types of angina. Also, the measurement invariance of the SAQ-CAN across age subgroups (< 75 years and ≥ 75 years) might not be generalizable to other categorizations of the age distribution. Future research will seek to replicate these findings in other populations of people with coronary artery disease. Second, our analyses examined measurement invariance in the SAQ-CAN using grouping variables that are known a priori to be associated with patient-reported HRQOL in individuals with coronary artery disease. It is possible that there are other salient variables or interactions among variables across whose subgroups the SAQ-CAN is not invariant. Future research should examine the use of latent variable mixture models to examine measurement non-invariance when relevant grouping variables are not known a priori [46][47][48]. Third, the validation of the SAQ-CAN has relied on data from the APPROACH registry. Future research should seek to replicate these findings in other cohorts of patients with coronary artery disease. Finally, of the 8101 patients included in our study sample, only 3279 (40.5%) had complete data at 1-year post-coronary angiogram follow-up.
Our assessment of measurement invariance was based on complete case analysis with case-wise deletion of observations with missing data on at least one item. But complete case analysis might result in reduced statistical power to detect longitudinal measurement non-invariance at both item and subscale levels. Nevertheless, we expect our longitudinal analyses, with a sample size of over 3200, to be adequately powered to detect parameter non-invariance across subgroups and longitudinally. Future research will examine the impact of missing data, different assumptions about missing data, and missing data methods on the statistical power of tests of longitudinal measurement invariance in MGCFA.
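A minimal sketch of the complete case (case-wise deletion) rule described above is given below. The record layout, the item names, and the helper `complete_cases` are hypothetical and stand in for the registry data, which is not reproduced here.

```python
def complete_cases(records, items):
    """Keep records with no missing (None) response on any listed item."""
    return [r for r in records if all(r.get(item) is not None for item in items)]


# Hypothetical data: two items measured at two occasions (t1 = 2 weeks, t2 = 1 year)
records = [
    {"id": 1, "q1_t1": 4, "q2_t1": 5, "q1_t2": 5, "q2_t2": 5},
    {"id": 2, "q1_t1": 3, "q2_t1": None, "q1_t2": 4, "q2_t2": 4},
    {"id": 3, "q1_t1": 2, "q2_t1": 3, "q1_t2": None, "q2_t2": None},
]
items_t1 = ["q1_t1", "q2_t1"]
items_t2 = ["q1_t2", "q2_t2"]

# Longitudinal analyses need complete items at both occasions
longitudinal = complete_cases(records, items_t1 + items_t2)
print([r["id"] for r in longitudinal])

# Retention reported in this study: 3279 of 8101 patients, as a percentage
print(round(100 * 3279 / 8101, 1))
```

A record missing a single item at either occasion is dropped from the longitudinal analysis, which is how a cohort can lose more than half of its members even when item-level missingness is modest.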
In conclusion, these findings provide evidence in support of the measurement invariance properties of the SAQ-CAN across sex, age, angina type, and treatment groups and longitudinally in a sample of people with coronary artery disease. We recommend that measurement invariance analysis be conducted as part of the preliminary analyses when researchers are interested in comparisons of PROM scores among population subgroups and over time.