2.1 Translation process
The forward- and back-translation process was adopted from principles established in previously published guidelines. The process included five steps: (1) the survey was forward-translated into Vietnamese by two independent bilingual translators; (2) a committee approach involving the two translators and the bilingual researcher was used to reach consensus on the final Vietnamese version; (3) the Vietnamese version was blindly back-translated into English by two different independent bilingual translators, and the committee approach was again used to reach consensus on the English back-translation; (4) the English back-translation was compared with the original English version by two independent native English-speaking experts, who rated whether the meaning of the two versions was similar on a 5-point Likert scale (1 = strongly disagree, 2 = disagree, 3 = neutral, 4 = agree, and 5 = strongly agree). If uncertainties and differences could not be resolved, steps 1 through 4 were repeated. (5) The Vietnamese version was sent to three experts who were familiar with the clinical nursing setting. They were asked to judge each item of the instrument for translation and content equivalence using a 4-point Likert scale (1 = not relevant, 2 = unable to assess relevance, 3 = relevant but needs minor alteration, and 4 = very relevant and succinct). The proportion of agreement among the experts was used to assess the equivalence of the translated instrument.
2.2 Psychometric evaluation
The validity of the N-CT-4 Practice (V-v) was assessed in terms of both content and construct validity. Content validity was assessed with the item-level content validity index (I-CVI), following the method suggested by Lynn (1986) and Polit, Beck and Owen (2007) [26, 27]. Three experts rated the relevance of each item on a 4-point Likert scale, from (1) not relevant to (4) very relevant. According to Lynn (1986), the I-CVI must be 1.0 when there are five or fewer experts. Construct validity was evaluated by confirmatory factor analysis (CFA). Model fit was examined with several indices because different authors have recommended using multiple indicators to judge the fit of a model [28, 29].
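The I-CVI calculation can be illustrated with a minimal sketch. It assumes the usual definition from the cited method, namely the proportion of experts who rate an item 3 or 4 on the 4-point relevance scale. The function name and the ratings below are hypothetical and are not data from this study.

```python
def item_cvi(ratings):
    """Return the I-CVI for one item: the share of experts rating it 3 or 4."""
    if not ratings:
        raise ValueError("at least one expert rating is required")
    relevant = sum(1 for r in ratings if r >= 3)
    return relevant / len(ratings)


# Hypothetical ratings from three experts (illustration only, not study data).
ratings_by_item = {
    "item_1": [4, 4, 3],
    "item_2": [4, 3, 2],
    "item_3": [4, 4, 4],
}

for name, ratings in ratings_by_item.items():
    cvi = item_cvi(ratings)
    # With five or fewer experts, the criterion described above is I-CVI = 1.0.
    verdict = "meets" if cvi == 1.0 else "falls short of"
    print(f"{name}: I-CVI = {cvi:.2f} ({verdict} the 1.0 criterion)")
```

With three experts, a single rating of 1 or 2 lowers the I-CVI to about .67, so full agreement is needed for an item to meet the criterion.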
The reliability of the N-CT-4 Practice (V-v) was evaluated in terms of both internal consistency and test-retest reliability. Internal consistency was assessed with Cronbach's alpha coefficient. The scale was considered to display acceptable, good, or excellent internal consistency when this coefficient was more than .7, .8, or .9, respectively. Test-retest reliability was evaluated with the intraclass correlation coefficient (ICC), and a minimum value of .7 was considered satisfactory.
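As an illustration of the internal consistency criterion, the following sketch computes Cronbach's alpha from the standard formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores), and applies the cut-offs stated above. The study itself used SPSS; the function names and the small response matrix here are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """scores: one list of item scores per respondent, all of equal length."""
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def interpret_alpha(alpha):
    """Apply the 'more than .7 / .8 / .9' cut-offs described in the text."""
    if alpha > 0.9:
        return "excellent"
    if alpha > 0.8:
        return "good"
    if alpha > 0.7:
        return "acceptable"
    return "below acceptable"


# Hypothetical responses: 5 respondents x 4 items (not study data).
responses = [
    [4, 4, 3, 4],
    [3, 3, 3, 3],
    [2, 2, 1, 2],
    [4, 3, 4, 4],
    [1, 2, 1, 1],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.3f} ({interpret_alpha(alpha)})")
```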
The participants were clinical nurses recruited by convenience sampling from the internal medicine, surgery, intensive care unit (ICU), emergency department (ED), and anesthesiology and recovery departments of three general hospitals located in Can Tho City, in southwestern Vietnam. The required sample size was 545, based on 5 participants per variable, with each of the 109 items in the N-CT-4 Practice (V-v) questionnaire treated as one variable. Nurses were eligible if they (1) worked as a clinical nurse, (2) were 20 years of age or older, and (3) were employed full time. All nurses working in these departments were invited to participate in this study. Nurses absent during data collection were excluded.
2.4 Ethical considerations and study procedure
This study adhered to the ethical principles of the Declaration of Helsinki and received ethical approval from the ethical review board of the first author's institution.
Once approval was granted, the researcher contacted the three hospitals and obtained a list of nurses' names from each hospital. The research group contacted these nurses and invited them to participate in the study. Potential participants were given both verbal and written information about the purpose, benefits, and risks of the research, as well as the procedures used to ensure anonymity, confidentiality, and voluntary participation. Once they agreed, they signed the consent form, and the questionnaire was given to them directly. Participants needed approximately 20 minutes to complete the N-CT-4 Practice (V-v) questionnaire and provide their demographic characteristics.
The N-CT-4 Practice questionnaire was developed by Zuriguel-Pérez (2017) and is based on the theoretical model of Alfaro-LeFevre (2016). It is a tool developed specifically to measure the level of critical thinking ability of nurses in clinical practice environments. The scale has 109 items with a 4-point Likert response format (1 = never or almost never, 2 = occasionally, 3 = often, and 4 = always or almost always). There are four dimensions: personal characteristics (Prs, 39 items); intellectual and cognitive abilities (Int, 44 items); interpersonal abilities and self-management (Atg, 20 items); and technical abilities (Tcn, 6 items). The total score ranges from 109 to 436, and the level of critical thinking is categorized as low (score < 329), moderate (score between 329 and 395), or high (score > 395). In the original validation with 399 clinical nurses, the scale had an I-CVI of .85 by expert evaluation, a total Cronbach's alpha coefficient of .96, and an ICC of .77. The goodness-of-fit indices in CFA were χ2/df = 1.95, RMSEA = .055, SRMR = .65, CFI = .629, and TLI = .621, indicating that the N-CT-4 Practice was in keeping with the four-dimensional model proposed by Alfaro-LeFevre.
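The scoring rules described above can be sketched as follows. The item counts, response range, and cut-off scores are taken from the description of the instrument; the function and variable names are hypothetical and are not part of the published questionnaire.

```python
# Items per dimension, as described for the N-CT-4 Practice.
DIMENSIONS = {"Prs": 39, "Int": 44, "Atg": 20, "Tcn": 6}
N_ITEMS = sum(DIMENSIONS.values())                 # 109 items in total
MIN_SCORE, MAX_SCORE = N_ITEMS * 1, N_ITEMS * 4    # 109 to 436


def total_score(responses):
    """Sum the item responses after checking count and 1-4 response range."""
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses, got {len(responses)}")
    if any(r not in (1, 2, 3, 4) for r in responses):
        raise ValueError("each response must be 1, 2, 3, or 4")
    return sum(responses)


def critical_thinking_level(score):
    """Map a total score to the low / moderate / high categories."""
    if not MIN_SCORE <= score <= MAX_SCORE:
        raise ValueError(f"score must lie between {MIN_SCORE} and {MAX_SCORE}")
    if score < 329:
        return "low"
    if score <= 395:
        return "moderate"
    return "high"


# Hypothetical respondent who answers "often" (3) on every item.
score = total_score([3] * N_ITEMS)
print(score, critical_thinking_level(score))
```

A respondent who answers "often" on every item scores 327, which falls just below the moderate threshold of 329.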
2.6 Statistical analysis
SPSS for Windows version 22.0 (IBM Corp., Armonk, NY, USA) was used to analyze the data. Descriptive statistics were used to summarize the characteristics of the participants. The I-CVI was calculated in Microsoft Excel to assess the content validity of the N-CT-4 Practice (V-v). CFA was conducted using AMOS version 22.0 to evaluate construct validity. Model fit was first assessed with the chi-square test (χ2), for which a nonsignificant result indicates good fit. Because chi-square is sensitive to sample size, goodness of fit was also evaluated using the ratio of chi-square to the degrees of freedom (χ2/df; < 3), the root mean square error of approximation (RMSEA; < .06), the standardized root mean square residual (SRMR; < .08), the comparative fit index (CFI; > .95 indicates a good fit), and the Tucker–Lewis index (TLI; > .95 indicates a good fit, and values between 0 and 1 can be acceptable) [28, 29]. Cronbach's alpha coefficient was used to evaluate internal consistency, and a value of α ≥ .7 was considered acceptable. The ICC was used to assess test-retest reliability over a 2-week interval, and an ICC ≥ .7 was considered satisfactory.
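The fit criteria listed above can be expressed as a simple check. This is only a sketch of the decision rules. The index values are hypothetical, chosen for illustration, and are not results from this study or output from AMOS.

```python
# Good-fit cut-offs as stated in the text; the dictionary keys are hypothetical labels.
CRITERIA = {
    "chi2_df": lambda v: v < 3,
    "RMSEA": lambda v: v < 0.06,
    "SRMR": lambda v: v < 0.08,
    "CFI": lambda v: v > 0.95,
    "TLI": lambda v: v > 0.95,
}


def check_fit(indices):
    """Return, for each supplied index, whether it meets its good-fit cut-off."""
    return {name: CRITERIA[name](value) for name, value in indices.items()}


# Hypothetical fit indices (illustration only).
example = {"chi2_df": 2.10, "RMSEA": 0.045, "SRMR": 0.050, "CFI": 0.96, "TLI": 0.94}

for name, ok in check_fit(example).items():
    print(f"{name}: {'meets' if ok else 'does not meet'} the good-fit criterion")
```

In this hypothetical case the TLI of .94 misses the good-fit cut-off but still lies in the 0 to 1 range that the text describes as potentially acceptable.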