Reliability, Validity, and Cutoff Point of the Chinese Version of the Chelsea Critical Care Physical Assessment Tool in Critically Ill Patients

Purpose: To translate and cross-culturally adapt the Chelsea Critical Care Physical Assessment Tool (CPAx) into a Chinese version (“CPAx-Chi”), test the reliability and validity of CPAx-Chi, and verify the cutoff point for the diagnosis of intensive care unit-acquired weakness (ICU-AW). Material and methods: Translation and cross-cultural adaptation of CPAx into CPAx-Chi were based on the Brislin model. Participants were recruited from the general ICUs of five third-grade class-A hospitals in western China. Adult patients (n = 200) were included after at least 48 h of intensive care (median age, 53 years; 64% males). Patients were assessed using two scales: the Medical Research Council Muscle Score (MRC-Score) and CPAx-Chi. Results: The item-level content validity was 0.889. The scale-level content validity was 0.955. Taking the MRC-Score as the standard, the criterion validity of CPAx-Chi was r = 0.758 (p < 0.001) for Researcher A and r = 0.65 (p < 0.001) for Researcher B. Cronbach’s α was 0.939. The inter-rater reliability was 0.902 (p < 0.001). The AUC of CPAx-Chi for diagnosing ICU-AW based on MRC-Score ≤ 48 was 0.899 (95% CI 0.862–1.025) for Researcher A and 0.874 (95% CI 0.824–0.925) for Researcher B. For Researcher A, the maximum value of the Youden Index was 0.643, the best cutoff point of CPAx-Chi for the diagnosis of ICU-AW was 31.5, the sensitivity was 87%, and the specificity was 77%; the corresponding values for Researcher B were 0.621, 31.5, 75%, and 87%. The consistency was high when taking CPAx-Chi ≤ 31 and MRC-Score ≤ 48 as the cutoff points for the diagnosis of ICU-AW: kappa = 0.845 (p = 0.02) for Researcher A and 0.839 (p = 0.04) for Researcher B. Conclusions: CPAx-Chi had good content validity, criterion-related validity and reliability. CPAx-Chi showed its best accuracy in the assessment of patients at risk of ICU-AW, with good sensitivity and specificity, at a recommended cutoff of 31.


Introduction
Intensive care unit-acquired weakness (ICU-AW) is a severe and debilitating complication in critically ill patients. The prevalence of ICU-AW in patients receiving mechanical ventilation for more than 4–7 days has been reported to be 33–82% [1][2][3][4][5]. The prevalence of ICU-AW in sepsis patients has been reported to reach 100% [3][4][5]. Early identification, assessment and active prevention are crucial to reduce ICU-AW risk because the pathophysiological mechanism of ICU-AW is not clear, and efficacious pharmacotherapy is lacking [1,6].
A "gold standard" for ICU-AW is not available. The Medical Research Council Muscle Score (MRC-Score) is the most widely used diagnostic tool for ICU-AW [7]. The MRC-Score evaluates strength subjectively in three muscle groups of all four limbs according to the Oxford Muscle Strength Grading Scale. However, the MRC-Score is affected by several factors and cannot evaluate respiratory function. Several studies have shown that diaphragmatic dysfunction is correlated significantly with ICU-AW [8][9][10], and that the function of respiratory muscles may be related to the occurrence and development of ICU-AW.
The Chelsea Critical Care Physical Assessment Tool (CPAx) could be the optimal tool for predicting and evaluating ICU-AW. CPAx includes not only physical function, mobility and grip strength, but also respiratory function and cough ability [11][12][13]. CPAx has been translated into several languages and is used in the UK, Sweden, Denmark and other countries [14][15]. However, neither a Chinese version of CPAx nor a CPAx cutoff point for the diagnosis of ICU-AW is available. Therefore, we undertook translation and cross-cultural adaptation of CPAx into "CPAx-Chi", tested its reliability and validity, and verified the cutoff point for the diagnosis of ICU-AW.

Ethical approval of the study protocol
The study protocol was approved by the Ethics Committee of the First Hospital of Lanzhou University (LDYYLL2019-232) in Lanzhou, China. This institution stated that written informed consent was not required.

Translation, cross-cultural adaptation and pre-testing
The translation of the original CPAx tool into Chinese was completed with the consent and assistance of the original author (EJ Corner). Translation, cross-cultural adaptation and pre-testing were done based on the model described by Brislin and colleagues [16][17].

Translation and back-translation
Three bilingual authors with Chinese as their native language undertook the forward translation of CPAx from English to Chinese. One was a physician experienced within the specialty of critical illness; one was a nurse experienced within the specialty of critical illness; one was a graduate student in nursing with College English Test 6 certification who was unfamiliar with clinical medicine. A seminar was conducted to discuss and synthesize the results of the three translators. Different opinions were resolved through group consultation, and then integrated into the Chinese version of CPAx, which was named "CPAx-Chi-Forward".
Three bilingual translators with English as their native language translated CPAx-Chi-Forward back into English. One was a doctoral student in nursing based in the UK; one was a doctoral student in physiotherapy based in Canada; one was a certified English linguist. They were unfamiliar with and blinded to the original CPAx version. A seminar was conducted to discuss and compare CPAx-Chi-Forward with the original CPAx. Discrepancies between the three translations were discussed until consensus was reached, and then the final synthesized back-translated English version was named "CPAx-Eng-Back". The researchers provided a final report to the author of the original CPAx that included the translators' annotations about their rationale for translation, choices, and linguistic considerations.
Nine experts revised the items of CPAx-Chi-Forward based on their theoretical knowledge, practical experience, subjective feelings, and expression in the Chinese language. Two were specialists in critical care medicine, five were nursing specialists in critical care, one was a respiratory therapist, and one was a physiotherapist. During the process, some words were rephrased or adjusted due to linguistic, grammatical, terminological or cultural differences between English and Chinese. Changes from the original CPAx version to the synthesized back-translated English version were discussed with and accepted by the original author.

Pre-testing
Forty ICU nurses from the First Hospital of Lanzhou University applied CPAx-Chi-Forward to assess ICU patients. A dichotomous method was also used to assess whether the written expression in CPAx-Chi-Forward was easy to understand, clear, and consistent with Chinese usage, and the nurses could make suggestions. CPAx-Chi-Forward had good cross-cultural adaptation, and responses did not differ significantly by sex, nationality, professional title, or time working in the ICU (p > 0.05 for all). Adjustments were not deemed necessary, and the final Chinese version of CPAx (CPAx-Chi) was created.

Verification of CPAx-Chi

Participants and sample size
Adult critically ill patients were recruited pragmatically from the general ICUs of five third-grade class-A hospitals in western China from September 2019 to June 2020.
The inclusion criteria were: (i) critically ill and seriously ill patients eligible for ICU admission; (ii) age ≥ 18 years; (iii) duration of ICU stay ≥ 48 h; (iv) Glasgow Coma Scale (GCS) score ≥ 11; (v) volunteered to participate in our study.
The exclusion criteria were: (i) unstable fracture, limb deformity or limb dysfunction; (ii) myasthenia gravis or neuromuscular dysfunction.

Study design
This was a cross-sectional observational study, and the flowchart is shown in Fig. 1. Two investigators simultaneously and independently assessed eligible patients using the MRC-Score and CPAx-Chi. SPSS 22.0 (IBM, Armonk, NY, USA) was employed for statistical analyses. Dichotomous variables are reported as frequencies and percentages. Continuous variables are reported as the mean ± standard deviation. Content validity and criterion-related validity were employed to test the validity of CPAx-Chi. Cronbach's α coefficient and inter-rater reliability were used to test the reliability of CPAx-Chi. The MRC-Score was taken as the standard to calculate the receiver operating characteristic (ROC) curve and area under the ROC curve (AUC) of CPAx-Chi. The cutoff point of CPAx-Chi was determined by the maximum value of the Youden Index (YI). The kappa test was used to test the consistency of the MRC-Score and CPAx-Chi. p < 0.05 was considered significant.

Content validity
The nine specialists mentioned above were invited to evaluate the importance and applicability of items. The median age of specialists was 38 (interquartile range (IQR) 33-50) years. The median time the specialists had been working in the ICU was 13 (IQR 6-23) years. There were 9 specialists that included 1(11.11%) undergraduate, 4(44.

Criterion validity
The correlation coefficient for ICU-AW assessment by Researcher A between the MRC-Score and CPAx-Chi was 0.60 (p < 0.001). The correlation coefficient for ICU-AW assessment by Researcher B between the MRC-Score and CPAx-Chi was 0.65 (p < 0.001) (Table 1).

Best cutoff point for the diagnosis of ICU-AW using CPAx-Chi
The ROC curve for ICU-AW diagnosis with CPAx-Chi was drawn taking MRC-Score ≤ 48 as the standard for the diagnosis of ICU-AW. An MRC-Score ranging from 0 to 48 was coded as "1" (ICU-AW group). An MRC-Score > 48 was coded as "0" (non-ICU-AW group).
The AUC for Researcher A was 0.899 (95% confidence interval (CI) 0.862–1.025). The AUC for Researcher B was 0.874 (95% CI 0.824–0.925) (Fig. 2). The best cutoff point was determined by the maximum value of the YI. The maximum YI for Researcher A was 0.643, the cutoff point was 31.5, the sensitivity was 87%, and the specificity was 77%. The maximum YI for Researcher B was 0.621, the cutoff point was 31.5, the sensitivity was 75%, and the specificity was 87%.
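The cutoff selection described above can be illustrated with a short Python sketch. It relies on the standard definition YI = sensitivity + specificity − 1; with the rounded values reported for Researcher A (87% and 77%) this gives about 0.64, in line with the reported 0.643. The function names and the toy scores are hypothetical and are not the study data, and the scan over midpoints between observed scores is an assumption about how the candidate cutoffs are generated.

```python
def youden_index(sensitivity, specificity):
    """Youden Index: YI = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


def best_cutoff(scores, has_icuaw):
    """Return (cutoff, YI, sensitivity, specificity) with the largest YI.

    A patient is classified as ICU-AW when the score is <= cutoff.
    Candidate cutoffs are midpoints between adjacent distinct scores
    (an assumption made for this sketch).
    """
    values = sorted(set(scores))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    n_pos = sum(has_icuaw)
    n_neg = len(has_icuaw) - n_pos
    best = None
    for c in candidates:
        tp = sum(1 for s, y in zip(scores, has_icuaw) if y == 1 and s <= c)
        tn = sum(1 for s, y in zip(scores, has_icuaw) if y == 0 and s > c)
        sens, spec = tp / n_pos, tn / n_neg
        yi = youden_index(sens, spec)
        if best is None or yi > best[1]:
            best = (c, yi, sens, spec)
    return best


# Hypothetical toy data, NOT the study data: scale scores and
# reference labels (1 = ICU-AW by MRC-Score <= 48, 0 = no ICU-AW).
toy_scores = [12, 20, 25, 29, 30, 31, 28, 32, 35, 40, 44, 47]
toy_labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

cutoff, yi, sens, spec = best_cutoff(toy_scores, toy_labels)
print(cutoff, round(yi, 3), round(sens, 3), round(spec, 3))
```

Because the scale yields whole-number totals, a cutoff of 31.5 on the ROC curve corresponds to classifying scores of 31 or less as ICU-AW.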

MRC-Score and CPAx-Chi were consistent for the diagnosis of ICU-AW
We took 31 as the best cutoff point to diagnose ICU-AW using CPAx-Chi. Hence, a total CPAx-Chi score of 0 to 31 was coded as 1 (ICU-AW group), and a total CPAx-Chi score of 32 to 50 was coded as 0 (non-ICU-AW group). The total score differed significantly between the ICU-AW group and non-ICU-AW group for Researcher A (F = 4.53, p = 0.035) and Researcher B (F = 6.51, p = 0.011). The test for consistency suggested that, taking CPAx-Chi ≤ 31.5 and MRC-Score ≤ 48 as the best cutoff points for the diagnosis of ICU-AW, kappa was 0.845 (p = 0.02) for Researcher A and 0.839 (p = 0.04) for Researcher B (Table 3).

Discussion

Translation
The present study is the first to translate CPAx from English to Chinese using the Brislin model to guarantee sufficient equivalency [16][17]. We included a multidisciplinary committee to remedy content variance, as well as two Chinese nurses with English certifications who were native speakers of Chinese and were studying in the UK and Canada, respectively. In addition, we undertook tests for criterion validity and reliability for the completed translation.

Validity of CPAx-Chi
Validity is the degree to which a measured result reflects the measured content. The more consistent the measured result is with the measured content, the higher the validity [18][19]. According to scale-development guidelines, when more than five experts are involved, an item-level content validity index (I-CVI) above 0.78 is considered good, and the experts must be authoritative and coordinated [19][20].
The present study involved nine ICU multidisciplinary experts with deep theoretical knowledge and clinical experience. The Expert Authority Coefficient ranged from 0.75 to 0.95. The Kendall Synergy Coefficient was 0.061 (p = 0.842) and I-CVI ranged from 0.889 to 1. Therefore, CPAx-Chi had good content validity [21][22].
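As a worked illustration of the content validity figures, the sketch below computes an I-CVI as the proportion of experts who rate an item as relevant, and a scale-level average of the I-CVIs. With nine experts, eight "relevant" ratings give 8/9 ≈ 0.889, which matches the lower bound reported above. The 4-point relevance scale with ratings of 3 or 4 counted as relevant is a common convention assumed here; the source does not state the rating scale, and the ratings and function names are hypothetical.

```python
def item_cvi(ratings, relevant_from=3):
    """I-CVI: share of experts whose rating is >= relevant_from.

    Assumes a 4-point relevance scale where 3 and 4 mean "relevant"
    (an assumption for this sketch, not stated in the source).
    """
    return sum(1 for r in ratings if r >= relevant_from) / len(ratings)


def scale_cvi_average(items):
    """Scale-level CVI computed as the mean of the item-level I-CVIs."""
    values = [item_cvi(ratings) for ratings in items]
    return sum(values) / len(values)


# Hypothetical ratings from nine experts, NOT the study data.
item_a = [4, 4, 3, 4, 3, 4, 4, 3, 2]  # 8 of 9 experts rate it relevant
item_b = [4, 4, 4, 4, 4, 4, 4, 4, 4]  # all 9 experts rate it relevant

print(round(item_cvi(item_a), 3))
print(round(scale_cvi_average([item_a, item_b]), 3))
```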
Corner and colleagues demonstrated that the CVI of CPAx was 1 (p < 0.05) [11][12]. They also showed that CPAx has good predictive validity, and that the CPAx score could be used as an alternative indicator of functional prognosis in critically ill patients by analyzing the relationship between the CPAx score and patient outcomes [13]. Other colleagues demonstrated the criterion validity of CPAx taking the scores for the MRC, Short Form (SF)-36, Sequential Organ Failure Assessment (SOFA), and GCS as a standard [23]. They found that the correlation coefficient between the CPAx score and MRC-Score was 0.65 (p < 0.001). The correlation coefficient for the right upper limb, left upper limb, right lower limb, and left lower limb with the CPAx score was, respectively, 0.69, 0.64, 0.69 and 0.67. The correlation coefficient between the CPAx score and SOFA score was 0.68 (p < 0.001). The correlation coefficient between the CPAx score and GCS was 0.74 (p < 0.001). The correlation coefficient between the physical-function item of SF-36 and the CPAx score was 0.72 (p = 0.013). The correlation coefficient between the mental-function component of SF-36 and the CPAx score was 0.024 (p = 0.95). In the present study, the correlation coefficient between the CPAx-Chi score and the items of the MRC-Score ranged from 0.60 to 0.65 (p < 0.001). Therefore, CPAx-Chi had good validity.

Reliability of CPAx-Chi
Cronbach's α mainly reflects the internal consistency of a scale [18][19]. In general, Cronbach's α should be > 0.7; a value < 0.6 indicates that the items of the scale must be revised. From the perspective of psychometrics, the "ideal" Cronbach's α should be > 0.8 [24][25][26]. The inter-rater reliability mainly demonstrates the consistency of evaluation results among different evaluators, and the stability of scales used among different evaluators [27][28]. An inter-rater correlation coefficient > 0.7 indicates that the inter-rater reliability is good. An inter-rater correlation coefficient ranging from 0.8 to 0.9 indicates that the inter-rater reliability is high [14,18–20,28]. In the present study, Cronbach's α for CPAx-Chi was 0.939, and the inter-rater reliability of the CPAx-Chi score was 0.902 (p < 0.001). The inter-rater correlation coefficient was > 0.8 for the items of respiratory function, transfer from bed to chair, and grip strength. The inter-rater correlation coefficients of the other items of CPAx-Chi were all > 0.7. Therefore, CPAx-Chi had good reliability.
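For readers who want to see how the internal-consistency statistic is obtained, the following sketch implements the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores), where k is the number of items. The study itself used SPSS; this function, its name, and the toy item scores are hypothetical and only illustrate the calculation.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of patients, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores)),
    using sample variances (n - 1 in the denominator).
    """
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data, NOT the study data: 4 patients x 3 items.
toy_rows = [
    [1, 2, 1],
    [2, 3, 2],
    [3, 4, 4],
    [5, 5, 4],
]

print(round(cronbach_alpha(toy_rows), 3))
```

Values close to 1 arise when the items rise and fall together across patients, which is the sense in which α reflects internal consistency.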

Best cutoff point, sensitivity and specificity of CPAx-Chi
Typically, evaluation of diagnostic performance is based on the ROC curve and AUC. If the AUC of a scale is 1, then it is considered to be a "perfect" diagnostic tool, but a perfect tool does not exist in the real world. If the AUC of a scale ranges from 0.85 to 0.95, then the measurement effect of the scale is very good. If the AUC of a scale ranges from 0.5 to 0.7, then the measurement effect of the scale is considered to be undesirable. If the AUC of a scale is 0.5, then the measurement effect of the scale is barely functional [29][30][31]. Our experts regarded an MRC-Score ≤ 48 as the standard to diagnose ICU-AW for three reasons. First, some studies have demonstrated the value of diagnosing ICU-AW using the Barthel Index [32], grip strength [33], ICU Mobility Scale [34], de Morton Mobility Index [35], and the Physical Function Intensive Care Test [36] using MRC-Score ≤ 48 as the standard. Second, the best cutoff point, sensitivity and specificity of neuromuscular ultrasound, electrophysiological recordings, electromyography, and other objective diagnostic methods used to diagnose ICU-AW have been verified using MRC-Score ≤ 48 as the criterion [23,37–39]. Third, scholars have constructed several models of early prediction of ICU-AW by taking MRC-Score ≤ 48 as a diagnostic criterion [40][41][42]. In the present study, the best cutoff point for the diagnosis of ICU-AW with CPAx-Chi was 31 points. This was verified by taking MRC-Score ≤ 48 as the criterion, and the sensitivity and specificity were good.
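One way to read the AUC values discussed here: because a lower CPAx-Chi score indicates ICU-AW, the AUC equals the probability that a randomly chosen ICU-AW patient scores lower than a randomly chosen patient without ICU-AW, with ties counted as one half. The sketch below computes the AUC through this pairwise comparison. The study obtained its AUC from SPSS; the function name and toy data are hypothetical.

```python
def auc_lower_score_positive(scores, has_icuaw):
    """AUC when LOWER scores indicate the positive (ICU-AW) class.

    Counts, over all (positive, negative) pairs, how often the positive
    patient has the lower score; ties contribute 0.5.
    """
    pos = [s for s, y in zip(scores, has_icuaw) if y == 1]
    neg = [s for s, y in zip(scores, has_icuaw) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p < q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, NOT the study data.
toy_scores = [12, 20, 25, 29, 30, 31, 28, 32, 35, 40, 44, 47]
toy_labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

print(round(auc_lower_score_positive(toy_scores, toy_labels), 3))
```

A value of 0.5 means the scale orders such pairs no better than chance, and a value of 1 means every ICU-AW patient scores below every patient without ICU-AW.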
The kappa statistic quantifies inter-rater reliability for ordinal and nominal measures. In general, a kappa value between 0.40 and 0.60 indicates "moderate" agreement, a value between 0.61 and 0.80 denotes "substantial" agreement, and a value > 0.81 reflects "excellent" agreement; a negative value for kappa represents disagreement [43][44]. Agreement, as measured by kappa, was high when taking MRC-Score ≤ 48 and CPAx-Chi ≤ 31 as the best cutoff points to diagnose ICU-AW for Researcher A and Researcher B.
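The agreement analysis can be sketched as follows: each patient is dichotomized by CPAx-Chi ≤ 31 and by MRC-Score ≤ 48 (the cutoffs stated in the text), and Cohen's kappa is computed as (observed agreement − chance agreement) / (1 − chance agreement). The study's kappa values came from SPSS; the function names and the toy scores below are hypothetical and are not the study data.

```python
def icuaw_by_cpax(score, cutoff=31):
    """1 = ICU-AW when the CPAx-Chi total is <= cutoff, else 0."""
    return 1 if score <= cutoff else 0


def icuaw_by_mrc(score, cutoff=48):
    """1 = ICU-AW when the MRC-Score is <= cutoff, else 0."""
    return 1 if score <= cutoff else 0


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two label lists of equal length.

    Undefined (division by zero) if chance agreement equals 1,
    for example when both lists contain a single identical label.
    """
    n = len(labels_a)
    categories = set(labels_a) | set(labels_b)
    observed = sum(1 for a, b in zip(labels_a, labels_b) if a == b) / n
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)


# Hypothetical toy data, NOT the study data: paired scores per patient.
toy_cpax = [20, 28, 31, 35, 40, 45, 30, 33]
toy_mrc = [36, 44, 48, 52, 56, 60, 50, 46]

cpax_labels = [icuaw_by_cpax(s) for s in toy_cpax]
mrc_labels = [icuaw_by_mrc(s) for s in toy_mrc]
kappa = cohen_kappa(cpax_labels, mrc_labels)
print(kappa)
```

In this toy example the two classifications agree on six of eight patients, while chance alone would produce agreement on half of them, so kappa falls in the "moderate" band described above.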

Strengths of our study
First, two researchers assessed and collected data independently, which improved the reference value of the validation data. Second, the best cutoff point for the diagnosis of ICU-AW using CPAx-Chi was determined to be 31 points according to MRC-Score ≤ 48.

Weaknesses of our study
First, our findings were limited by the use of a non-randomized pool of participants chosen primarily by their availability during the study period: this may have reduced the generalizability of our findings. Second, there were specific exclusion criteria that may have prevented potential "ceiling and floor" effects of CPAx-Chi from being tested. Therefore, to further confirm the clinical value of CPAx in assessing and diagnosing ICU-AW, it must be applied together with the MRC-Score, ultrasound, electrophysiology, and electromyography. Also, multicenter, large-sample, and randomized trials are needed to verify the best cutoff point for CPAx.

Conclusions
We showed that CPAx-Chi had high criterion validity and reliability for assessing ICU-AW in adult patients in the ICU. CPAx-Chi showed good sensitivity and specificity in the assessment of patients at risk of ICU-AW at a recommended cutoff of 31 points.