Instrument improvement and domain structure
Item and confirmatory factor analyses were used to determine the final structure of the CSAT. Table 2 shows the improved psychometrics during the instrument development process. The baseline model is used as a comparison for the pilot and final models. The pilot model includes all 49 initial items contained in seven subscales. After psychometric analyses, 14 items were dropped from the pilot version of the framework and tool. Items were dropped if they had lower loadings on the latent factors, had poor variance (i.e., restricted range), or had excessive missing data. The final CSAT comprises 35 items organized within seven subscale domains, with five items per domain. This simple and balanced structure facilitates training and scoring with practices and groups.
The domain structure and final items of the CSAT can be seen in Additional file 1. The first domain, Engaged Staff & Leadership, includes items assessing the extent to which the clinical practice has the support of internal frontline staff and management within the organization. Engaged Stakeholders assesses the extent to which the practice has support among external stakeholders. Organizational Readiness measures whether the organization has the internal supports and resources needed to effectively manage the practice. Workflow Integration refers to whether the practice has been designed to fit into existing workplace processes, policies, and technologies (e.g., EMR systems). Implementation & Training reflects whether the organization promotes processes and learning that appropriately guide the direction, goals, and strategies of the clinical practice. Monitoring & Evaluation assesses the extent to which the organization monitors the clinical practice and uses data to inform quality improvement. Finally, Outcomes & Effectiveness refers to whether and how the organization measures practice outcomes and impacts. Each subscale is scored separately, and an overall CSAT score can be obtained, ranging from 1 to 7. Higher scores indicate greater organizational capacity for clinical sustainability.
Table 2 also shows the relatively good fit of the 7 factor CFA model to the data. That is, the 35 item CSAT does a credible job of measuring seven important clinical sustainability domains that were identified in the literature and in the concept mapping phase of the study. Although the CFI could be larger, the RMSEA of 0.084 is just above the threshold for an acceptable fit (0.08), and the SRMR of 0.075 is below the 0.08 cutoff, indicating a good fit [33, 34]. The most important pattern in the CFA results presented in Table 2 is the improvement in model fit as we move from a single factor model, to a seven subscale model with all items, and finally to the seven subscale model with the reduced number of items. The overall results suggest that the CSAT measures distinctive aspects of clinical sustainability with a relatively modest number of items, which promotes ease of use.
Subscale reliability
Table 3 presents the subscale reliabilities (internal consistency) for the CSAT. The average internal consistency of the seven subscales is 0.88, with values ranging from 0.82 to 0.94. These values indicate excellent scale reliability, especially given the small size of each subscale (5 items) [35]. (Internal consistency tends to increase as items are added, so the most desirable goal is the fewest items that still maintain high reliability [33].) Furthermore, the item loadings show consistently high intercorrelations with their respective subscales, indicating good fit of individual items with overall subscale scores (full results available from authors).
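The text reports internal consistency without naming the coefficient. As an illustration only, the sketch below assumes it was estimated with Cronbach's alpha, a common choice for subscales of this kind. The function name and the example ratings are hypothetical and are not taken from the study data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for one subscale.

    rows: list of respondents, each a list of item ratings
    (e.g., five ratings on the 1-7 scale). Assumes complete data.
    """
    k = len(rows[0])  # number of items in the subscale
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(r) != k for r in rows):
        raise ValueError("every respondent must rate every item")

    # Sample variance of each item across respondents.
    item_variances = [variance([r[i] for r in rows]) for i in range(k)]
    # Sample variance of each respondent's summed score.
    total_variance = variance([sum(r) for r in rows])
    if total_variance == 0:
        raise ValueError("summed scores have zero variance")

    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings: four respondents, five items each.
example = [
    [5, 6, 5, 6, 5],
    [3, 3, 4, 3, 3],
    [6, 7, 6, 6, 7],
    [2, 3, 2, 2, 3],
]
alpha = cronbach_alpha(example)
```

With the number of items fixed at five, alpha rises only when the items covary more strongly, which is why high values on such short subscales are notable.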
Preliminary CSAT results and validation
Subdomain scores are the simple means of the five items in each domain. The total CSAT score is then the mean of the seven subdomain scores. Table 4 presents descriptive statistics for the seven subscales and the total scores. Organizational Readiness has the lowest average score, while Outcomes & Effectiveness is rated the highest. The standard deviations, ranges, and the density plot shown in Figure 2 show that there is good variability of scores, and no major problems with restricted range. The CSAT total scores range from 2.3 to 6.9. The standard deviation for the total scores is lower than for the subdomain scores, which is expected given that the total score is the mean of the seven subdomain scores.
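The scoring rule described above can be sketched in a few lines. This is a minimal illustration rather than the official scoring tool: the function name, the input layout, and the assumption of complete responses on the 1 to 7 scale are ours.

```python
DOMAINS = [
    "Engaged Staff & Leadership",
    "Engaged Stakeholders",
    "Organizational Readiness",
    "Workflow Integration",
    "Implementation & Training",
    "Monitoring & Evaluation",
    "Outcomes & Effectiveness",
]
ITEMS_PER_DOMAIN = 5


def score_csat(responses):
    """Return (subscale_scores, total_score) for one respondent.

    responses: dict mapping each domain name to a list of five
    item ratings on the 1-7 scale (hypothetical input layout).
    """
    subscale_scores = {}
    for domain in DOMAINS:
        items = responses[domain]
        if len(items) != ITEMS_PER_DOMAIN:
            raise ValueError(f"{domain}: expected {ITEMS_PER_DOMAIN} items")
        if any(not 1 <= rating <= 7 for rating in items):
            raise ValueError(f"{domain}: ratings must be between 1 and 7")
        # Subdomain score: simple mean of the five items.
        subscale_scores[domain] = sum(items) / len(items)

    # Total score: mean of the seven subdomain scores.
    total_score = sum(subscale_scores.values()) / len(subscale_scores)
    return subscale_scores, total_score


# Hypothetical respondent: rating 4 on every item except one domain.
responses = {domain: [4, 4, 4, 4, 4] for domain in DOMAINS}
responses["Organizational Readiness"] = [1, 2, 3, 4, 5]
subscales, total = score_csat(responses)
```

Because every domain has the same number of items, the mean of the seven subdomain means equals the mean of all 35 items when no responses are missing.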
In addition to the CSAT scores, basic characteristics of the clinical setting were collected (i.e., patient, setting, and environment type) as well as two characteristics of the respondent (i.e., job position and training profession). As part of a set of preliminary validation analyses, CSAT total and subscale scores were examined across these five setting and respondent characteristics. In terms of setting characteristics, total CSAT scores varied significantly by setting type (F = 3.16, p = .047) and environment (F = 2.93, p = .038), but not by patient type (F = 1.09, p = .299). CSAT scores did not vary by respondent’s profession (F = 0.93, p = .449) or job position (F = 1.69, p = .175).
Figure 3 shows the CSAT subscale score profile plots for the three setting (top row) and two respondent (bottom row) characteristics. Domain scores were very similar based on patient age category but showed significant differences based on outpatient vs. inpatient clinical setting and academic vs. community vs. private practice environment. Outpatient settings report consistently lower sustainability scores across all seven domains than inpatient settings. Type of environment is a little more complicated, but it appears that community hospitals and academic medical centers report higher sustainability capacity than community health centers and private practice settings.
Usability testing
We also collected data regarding ease of use and asked participants to report on their experience using the CSAT. On average, it took participants just under 20 minutes to complete the longer initial 49 item version of the CSAT and just under 10 minutes to complete the final 35 item tool. Participants also rated the experience of completing the CSAT in a positive manner: 85% of participants rated the tool as easy to use; 75% felt very confident about their ability to use the tool; 90% thought that most other people would be able to learn quickly how to use the tool. Importantly, only 35% thought that they would need external support to use the tool effectively.
In addition to the usability data from our pilot participants, we assessed the final 35 item version of the CSAT using the Psychometric and Pragmatic Evidence Rating Scale (PAPERS) [36]. The CSAT rates good or excellent in all five of the PAPERS pragmatic categories: brevity, cost, ease of training, ease of interpretation, and language. The CSAT rates good in brevity, with fewer than 40 items, and in language, with a reading level below 10th grade. It rates excellent in cost, ease of training, and ease of interpretation because it is free, requires no training to use, and calculates scores automatically.