Development and Validation of the Perceived Stress Scale in Emergency Medical Teams During the Epidemic of COVID-19

Background: During the COVID-19 epidemic in China, emergency medical teams faced serious stress on the front line. To our knowledge, no study has tested the applicability and measurement properties of the 10-item Chinese Perceived Stress Scale (CPSS-10) in emergency medical teams. Methods: From March 17 to 27, 2020, an online survey was conducted among the emergency medical teams of Liaoning Province who were supporting Wuhan. The CPSS-10 was used to measure the stress of medical workers. Classical test theory (CTT), a bifactor model and the multidimensional graded response model (MGRM) were used to analyze the measurement characteristics and differential item functioning (DIF) of the CPSS-10. Results: The Cronbach's alpha coefficient of the CPSS-10 was 0.86. The bifactor model confirmed that the CPSS-10 had a two-factor structure. The MGRM showed ordered response categories for the CPSS-10. Item 8 could distinguish individual stress, but its slope was very large (7.97, well above 4), indicating local dependence. There was significant DIF by age, but no DIF by gender. After removing items 2, 5, and 8, the CPSS-7 showed high reliability, no DIF by age or gender, and no local dependence. The MGRM provided useful measurement information about the CPSS-10 and CPSS-7. The MGRM found that the CPSS-10 did not fully conform to item response theory (IRT). The CPSS-7 proved to be a more effective and reliable tool for assessing the perceived stress of emergency medical teams.


Introduction
The twenty-first century has witnessed the challenges of infectious diseases, with new or re-emerging infectious pathogens remaining major causes of morbidity and mortality [1]. COVID-19 attracted attention after the report of unexplained pneumonia in Wuhan, China [2,3]. It is caused by infection with the SARS-CoV-2 virus, and subsequently spread to many other parts of the world through global travel [4]. In January 2020, the WHO designated COVID-19 as a public health emergency [5]. COVID-19 has since developed into a global pandemic [6]. According to incomplete global statistics reported by the WHO, as of 10 January 2021 there had been 88.38 million confirmed cases of COVID-19, including 1.92 million deaths [7]. The wide spread of COVID-19 might monopolize government activities and even cause fear and hysteria [8].
During the COVID-19 epidemic, people all over the world have suffered from fear, anxiety, and panic, especially health care workers, who face even greater emergency medical pressure [9]. Health care workers work long shifts, lack personal protective equipment, fear bringing infection to their families, and face a high risk of infection with COVID-19 [10], all of which aggravate their perceived stress. Previous studies have found that the COVID-19 pandemic has a serious impact on the mental health of health care workers and the general population [11][12][13][14]. According to research on the outbreaks of severe acute respiratory syndrome (SARS) and Middle East respiratory syndrome (MERS), anxiety and fear are the first symptoms among health care workers, but the emergence of depression and post-traumatic stress symptoms causes serious consequences and has a long-term impact on their mental health [15,16].
Medical emergency rescue is an important resource for dealing with public health emergencies. By February 23, 2020, more than 330 emergency medical teams and 41,600 health care workers had supported Hubei Province [17]. Emergency medical teams were at high risk of being infected with SARS-CoV-2 while treating COVID-19 patients. As of February 11, 2020, a total of 1716 health care workers were confirmed to be infected with COVID-19 in mainland China, accounting for 3.8% of all confirmed cases [18]. A recent meta-analysis found a high prevalence of anxiety, depression and insomnia among health care workers who participated in the diagnosis and treatment of patients infected with COVID-19 [19]. The higher the frequency of contact with COVID-19 patients, the higher the risk of mental disorders [20]. Previous studies found that health care workers in emergency medical teams had higher psychological pressure [21,22]. Therefore, it is necessary to study the mental health of medical workers in emergency medical teams by means of stress measurement scales.
As one of the most popular stress measurement scales [23], the Perceived Stress Scale (PSS) has been used to assess the mental health of people in different countries and of different races during the COVID-19 epidemic [24,25]. Although the PSS-10 is widely used in studies of the mental health of emergency medical teams, few have examined its applicability and measurement properties, and research verifying the measurement properties of the 10-item Chinese Perceived Stress Scale (CPSS-10) in emergency medical teams is even scarcer. Meanwhile, most psychological assessments measure two or more latent traits, that is, they are multidimensional, and the CPSS-10 is a Likert scale whose items are scored on ordered categories. To overcome the limitations posed by the ordinal data and multidimensionality of the CPSS-10, the multidimensional graded response model (MGRM) should be used for item analysis [26]. Therefore, based on mental health survey data from front-line medical staff in Hubei Province, this study used classical test theory (CTT), a bifactor model and the MGRM to evaluate the psychometric properties of the CPSS-10, which could provide a reference for appropriate revision of the scale. Accurate PSS scores could help the government understand the psychological pressure on front-line medical staff more effectively.

Study design and participants
From March 17 to 27, 2020, an online survey was conducted among the medical workers of the emergency rescue medical teams, including doctors, nurses, and team leaders, from Liaoning Province who were sent to alleviate pressure on Wuhan, Hubei Province, during the COVID-19 epidemic. The study team communicated with the medical rescue team leaders of Liaoning Province many times and made a detailed plan for the distribution and collection of the questionnaire. We then developed an electronic questionnaire and sent it to the emergency rescue medical teams through the Web-based survey tool Sojump (https://www.wjx.cn/, accessed on 13 March 2020). We trained the medical team leaders online through WeChat (Tencent Corp), and the trained team leaders guided their teams to complete the electronic questionnaire with good quality. The data were collected in Excel format through the Sojump platform.
A total of 352 emergency medical team workers (300 females) were included in the study. All health care workers who filled in the electronic questionnaire signed informed consent forms.

Measures And Instrument
The 10-item Perceived Stress Scale (PSS-10), a popular self-report tool, represents the perceived stress of participants in the past month [23]. The PSS-10 includes two factors. Six negatively worded items (items 1, 2, 3, 6, 9 and 10) constitute the first factor, and four positively worded items (items 4, 5, 7 and 8) constitute the second factor [27]. Specifically, the first factor represents the perceived helplessness subscale (PHS), and the second factor reflects the perceived self-efficacy subscale (PSES) [28]. Items 1, 2, 3, 6, 9 and 10 are scored from 0 to 4; items 4, 5, 7 and 8 are reverse scored from 4 to 0. The total PSS score lies between 0 and 40 [23]. In this study, the CPSS-10 [29] was used to measure the perceived stress of front-line medical staff in emergency medical teams.
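The scoring rule described above can be illustrated with a minimal Python sketch. The function and constant names are hypothetical and are not taken from the study's own analysis code; the sketch only assumes that each raw answer is coded 0 to 4 and that items 4, 5, 7 and 8 are reversed before summing.

```python
# Illustrative PSS-10 scoring; all names here are hypothetical.
PHS_ITEMS = (1, 2, 3, 6, 9, 10)   # negatively worded items, scored 0..4 as answered
PSES_ITEMS = (4, 5, 7, 8)         # positively worded items, reverse scored (4..0)


def score_pss10(responses):
    """Score one respondent; `responses` maps item number (1-10) to a raw answer 0-4."""
    if set(responses) != set(range(1, 11)):
        raise ValueError("expected answers for items 1 to 10")
    if any(answer not in (0, 1, 2, 3, 4) for answer in responses.values()):
        raise ValueError("each answer must be an integer from 0 to 4")
    # Reverse the positively worded items so that higher always means more stress.
    scored = {
        item: (4 - answer) if item in PSES_ITEMS else answer
        for item, answer in responses.items()
    }
    return {
        "PHS": sum(scored[i] for i in PHS_ITEMS),
        "PSES": sum(scored[i] for i in PSES_ITEMS),
        "total": sum(scored.values()),
    }
```

Under this coding, a respondent who answers 4 on every negatively worded item and 0 on every positively worded item reaches the maximum total of 40.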

Factor Variables Of Differential Item Functioning
Age and gender are the most important factor variables. In our study, age was transformed into a binary variable, so participants were classified as adults (aged 22-34 years) or middle-aged (aged 35-59 years).

Statistical Analysis
The Bartlett sphericity test [30] and the Kaiser-Meyer-Olkin (KMO) test [31] were performed to offer empirical support for the adequacy of the research data. The null hypothesis of Bartlett's sphericity test is that there is no significant difference between the correlation matrix and the identity matrix, which would mean that the variables are uncorrelated and unsuitable for structure detection [30]. In practice, a KMO value greater than 0.8 indicates that the data are well suited to factor analysis [31]. The Cronbach's alpha coefficient was used to test scale reliability.
To determine the appropriate factor structure of the CPSS-10, an exploratory factor analysis (EFA) was conducted on the emergency medical team sample using maximum likelihood estimation and oblimin rotation with Kaiser normalization [32]. Confirmatory factor analysis (CFA) [33] was further used to test the dimensions of the CPSS-10 and verify the applicability of the two-factor and one-factor models [34]. Based on the factor structures reported in the PSS literature, our study set up three models of the CPSS-10: (a) a one-factor model [34], assuming that all 10 items measure a single stress factor; (b) a two-factor model [29], specified as a two-factor structure; and (c) a bifactor model [35], whose specific factors consist of the six negative items and the four positive items. The bifactor model allows researchers to test whether the CPSS-10 also has a general factor of perceived stress. See Fig. 1 for details.
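As a minimal sketch of the reliability index named above, Cronbach's alpha can be computed from the item variances and the variance of the total score, alpha = k/(k - 1) × (1 - Σ item variances / total-score variance). The function name and the input layout (one list of scored items per respondent) are assumptions made for illustration; the study itself used R.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for `rows`, a list of respondents, each a list of k scored items."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Transpose so that each tuple holds all respondents' scores on one item.
    item_columns = list(zip(*rows))
    sum_item_var = sum(variance(column) for column in item_columns)
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum_item_var / total_var)
```

Items that move together perfectly give an alpha of 1, whereas items that are uncorrelated give an alpha near 0; reverse-scored items should be recoded before calling the function.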
As indices of unidimensionality in the bifactor model, we calculated the explained common variance (ECV) and the percentage of uncontaminated correlations (PUC). ECV is the proportion of common variance attributable to the general factor [36]. A higher ECV value indicates that the data have a strong general factor relative to the specific group factors. When the ECV value is greater than 0.70, the scale can be regarded as unidimensional; otherwise the bifactor model is supported. PUC reflects the proportion of item correlations that are influenced only by the general factor in the bifactor model [36]. When PUC > 0.8, fitting a unidimensional model to data with a bifactor structure yields approximately unbiased estimates; that is, the scale can be treated as unidimensional.
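Both indices can be sketched in a few lines of Python. The sketch assumes the usual definitions: PUC is the share of item pairs that do not belong to the same specific factor, and ECV is the sum of squared general-factor loadings divided by the sum of all squared loadings. Function names and the loadings used in the usage note are hypothetical.

```python
def puc(group_sizes):
    """Percentage of uncontaminated correlations for a bifactor model.

    `group_sizes` lists how many items load on each specific factor.
    """
    n_items = sum(group_sizes)
    total_pairs = n_items * (n_items - 1) // 2
    # Pairs within the same specific factor are "contaminated" by that factor.
    within_pairs = sum(n * (n - 1) // 2 for n in group_sizes)
    return (total_pairs - within_pairs) / total_pairs


def ecv(general_loadings, specific_loadings):
    """Explained common variance: share of common variance due to the general factor."""
    general = sum(loading ** 2 for loading in general_loadings)
    specific = sum(loading ** 2 for loading in specific_loadings)
    return general / (general + specific)
```

For a 10-item scale with specific factors of six and four items, `puc([6, 4])` gives (45 - 21) / 45 ≈ 0.53, which depends only on the item grouping and not on the data.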
In this study, a set of statistical criteria was used to determine goodness of fit [37], including: Chi-square/df (≤ 3), the comparative fit index (CFI, ≥ 0.90), the Tucker-Lewis index (TLI, ≥ 0.90), the root mean square error of approximation (RMSEA, ≤ 0.08), and the standardized root mean square residual (SRMR, ≤ 0.08). When the goodness-of-fit indices meet the above conditions, the model is considered to fit well.
To further evaluate the item fit of the CPSS-10, the MGRM was also used to analyze the items. The MGRM is a probability model that describes the response of a participant to any given item according to the level of the participant's latent trait [26]. DIF analysis is an indispensable part of psychometric analyses, aiming to assess measurement invariance across sample groups, e.g., males and females. DIF in this study was tested using the MGRM [26].
The CPSS-10 scale was revised according to the following strategies. First, an item was deleted if its DIF was statistically significant after correction (P < 0.005). Second, an item was deleted if its discrimination estimate, obtained by marginal maximum likelihood estimation, was larger than 3.00 [38]. Third, following Wright and Linacre's suggestion, an item was considered to fit poorly when its information-weighted fit statistic (infit) mean square (MNSQ) or outlier-sensitive fit statistic (outfit) MNSQ value was greater than 1.3 or less than 0.7 [39]. Finally, the performance of the CPSS-10 and the revised version was evaluated based on goodness of fit.
All statistical analyses and graphic plotting were performed using R version 4.0.3 (The R Foundation for Statistical Computing, Vienna, Austria). The R packages "mirt" and "ltm" were used to build the MGRM and bifactor model.
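The cut-offs listed above can be expressed as a simple checklist. The following Python sketch is only an illustration of how the criteria combine; the dictionary keys and the function name are hypothetical, and the thresholds are those stated in the text.

```python
# Cut-offs as stated in the text: (direction, threshold) per fit index.
FIT_CRITERIA = {
    "chisq_df": ("max", 3.0),
    "cfi": ("min", 0.90),
    "tli": ("min", 0.90),
    "rmsea": ("max", 0.08),
    "srmr": ("max", 0.08),
}


def check_fit(indices):
    """Return {index name: True/False} for each criterion in FIT_CRITERIA."""
    result = {}
    for name, (direction, threshold) in FIT_CRITERIA.items():
        value = indices[name]
        result[name] = value <= threshold if direction == "max" else value >= threshold
    return result


def fits_well(indices):
    """A model is considered well fitted only when every criterion is met."""
    return all(check_fit(indices).values())
```

For example, a model with Chi-square/df = 1.62, CFI = 0.99, TLI = 0.99, RMSEA = 0.07 and SRMR = 0.07 satisfies every criterion in this checklist.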
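To make the model concrete, the following Python sketch computes category probabilities and the expected item score under a graded response model written in slope-intercept form, where the probability of responding in category k or higher is logistic(a·θ + d_k). This is a generic textbook formulation given under stated assumptions, not the study's fitted model; the function names and any parameter values passed to them are hypothetical.

```python
import math


def category_probabilities(theta, slopes, intercepts):
    """Probabilities of response categories 0..K for one item.

    theta: latent trait values, one per dimension.
    slopes: item slopes (discriminations), one per dimension.
    intercepts: K strictly decreasing intercepts d_1 > d_2 > ... > d_K.
    """
    if any(lower >= higher for higher, lower in zip(intercepts, intercepts[1:])):
        raise ValueError("intercepts must be strictly decreasing")
    linear = sum(a * t for a, t in zip(slopes, theta))
    # Cumulative probabilities P(X >= k); P(X >= 0) = 1 and P(X >= K + 1) = 0.
    cumulative = [1.0] + [1.0 / (1.0 + math.exp(-(linear + d))) for d in intercepts] + [0.0]
    return [cumulative[k] - cumulative[k + 1] for k in range(len(intercepts) + 1)]


def expected_score(theta, slopes, intercepts):
    """Expected item score: sum over categories of k * P(X = k)."""
    probabilities = category_probabilities(theta, slopes, intercepts)
    return sum(k * p for k, p in enumerate(probabilities))
```

With four intercepts the function returns five probabilities that sum to 1, matching the five response options of a 0-4 Likert item, and the expected score rises as the latent trait on a dimension with a positive slope increases.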
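The three screening rules can be summarized in a short Python sketch. It assumes that the corrected DIF threshold of 0.005 is a Bonferroni adjustment of 0.05 over the 10 items, and it treats the discrimination rule as an absolute value above 3.00. The function names are hypothetical, and any DIF P value passed in is a placeholder rather than a reported result.

```python
def bonferroni_threshold(alpha=0.05, n_tests=10):
    """Per-item significance level after Bonferroni correction."""
    return alpha / n_tests


def flag_item(dif_p, slope, infit, outfit, n_items=10):
    """Return the list of reasons (possibly empty) for which an item is flagged."""
    reasons = []
    if dif_p < bonferroni_threshold(0.05, n_items):
        reasons.append("DIF")
    if abs(slope) > 3.00:
        reasons.append("discrimination")
    # Infit or outfit MNSQ outside [0.7, 1.3] indicates poor item fit.
    if not (0.7 <= infit <= 1.3) or not (0.7 <= outfit <= 1.3):
        reasons.append("misfit")
    return reasons
```

As a usage example, an item with a slope of 7.97, infit MNSQ of 0.064 and outfit MNSQ of 0.220 would be flagged for both discrimination and misfit even if its (placeholder) DIF P value did not pass the corrected threshold.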

Results
Exploratory factor analysis (EFA)
The KMO value of the CPSS-10 in this study was 0.88, and Bartlett's test of sphericity was significant (Chi-square = 1703.26, P < 0.01), which met the prerequisites for EFA. EFA results showed that the eigenvalues of the first two components were greater than 1 (4.62 and 1.91), so a two-factor structure was suitable. Factor 1 was composed of the six negative items (the PHS subscale), accounting for 46.15% of the variance. Factor 2 was composed of the four positive items (the PSES subscale), explaining 19.13% of the variance (Table 1). There was no double loading in the pattern matrix, and the loadings of all valid items were greater than 0.5. In addition, PHS and PSES were correlated (r = 0.39, P < 0.05).
Note. PSS-10: 10-item Perceived Stress Scale; PHS: Perceived Helplessness Subscale (first common factor); PSES: Perceived Self-Efficacy Subscale (second common factor).
Table 2 showed that the bifactor model of the PSS-10 had the best fit (Chi-square/df = 1.62, CFI = 0.99, TLI = 0.99, RMSEA = 0.07 (95% CI [0.06, 0.09]), SRMR = 0.07), which was better than that of the two-factor CFA model (Chi-square = 73.385, P < 0.001). Figure 1 shows schematic representations of the one-factor model (a), two-factor model (b) and bifactor model (c) of the CPSS-10. The ratio of the first factor eigenvalue to the second factor eigenvalue was 2.58 (i.e., < 3). The two-factor structure of the CPSS-10 was found in both EFA and CFA, as shown in Table 2. In addition, the Martin-Löf test yielded LR = 365.186, P < 0.001, which indicated that the CPSS-10 may not be a unidimensional scale. The bifactor model showed PUC = 0.53 (< 0.8) and ECV = 0.65 (< 0.70), indicating that there were not only a general factor but also specific factors. These results confirmed the multidimensionality of the CPSS-10 from different aspects.

Reliability
In our study, the Cronbach's alpha coefficients of the overall scale, the PHS subscale, and the PSES subscale were 0.86, 0.91 and 0.77, respectively (Table 1). The reliability of the overall scale and the PHS subscale was high, while the reliability of the PSES subscale was only acceptable. After removing item 5 or item 7, the Cronbach's alpha coefficient increased to 0.863 (95% CI [0.833, 0.886]) and 0.865 (95% CI [0.836, 0.889]), respectively, while deleting any other item reduced the Cronbach's alpha coefficient. This result indicated that the CPSS-10 could be further optimized (Table 3).
Note. PSS-10: 10-item Perceived Stress Scale; CI: confidence interval; Infit: information-weighted fit statistic; Outfit: outlier-sensitive fit statistic.

Multidimensional Graded Response Model (MGRM) Analysis
Infit and outfit MNSQ values were used to evaluate item fit. The infit and outfit MNSQ values for the MGRM were 0.807-1.077 and 0.717-1.204, respectively, which showed an overall good fit of the CPSS-10. However, item 8 fit poorly, with infit and outfit MNSQ values of 0.064 and 0.220, respectively (Table 3).
CFA combined with MGRM analysis was used to test the structure of the CPSS-10 in the emergency medical team. The loadings of all items were greater than 0.60, and no item showed disordered thresholds (Table 4 and Fig. 2). Items 1, 2, 3, 6, 9, and 10 had higher loadings on λ1 (PHS subscale), while items 4, 5, 7, and 8 had higher loadings on λ2 (PSES subscale), which confirmed the stability of the two-factor structure of the CPSS-10. The MGRM showed that the correlation between PHS and PSES was 0.535. In addition, the category probability curves of items 5 and 6 are provided in Fig. 2A and 2B, which showed that the items could distinguish person ability and item difficulty. The other items of the CPSS-10 showed similar category probability curves (Additional file 1: Figure S1).
Note. λ indicates factor loadings; α is the item discrimination (slope) parameter; b1 to b4 are item response category difficulty (threshold, location) parameters; bolded items indicate the items chosen for the short form of the CPSS-10; se: standard error.
Table 4 summarizes the slope (α, discrimination) and difficulty (b1 to b4, threshold or location) parameters of the 10 items in the MGRM analysis. Item 8 and item 2 had high slopes (7.97 and 3.23), while items 1, 3, 6, 9, and 10 had medium slopes (2.54-2.94). Item 8 had the largest slope and item 2 the second largest. Although item 8 was the most effective item for distinguishing individual stress on the CPSS-10, it had a very high slope compared with the other items (i.e., > 4), and the MGRM results showed local dependence.

Expected Score And Item Information Function Of CPSS-10
As shown in Fig. 2C and 2D (items 5 and 6), the expected score (the cumulative probability of the category probability curves) increased with the latent trait (θ) of emergency medical team members. The item information functions (IIF) of items 5 and 6 of the CPSS-10 are shown in Fig. 2E and 2F. Generally, the IIF reached its maximum when the latent trait (θ) was between −2 and 3. When the latent trait (θ) was close to −3 or 3, the IIF was smallest (close to zero). This result indicated that the CPSS-10 discriminated better among emergency medical workers with a medium ability level than among those with the lowest or highest ability levels. The expected scores and item information functions of the other CPSS-10 items are shown in the Supplementary information (Additional file 1: Figures S2 and S3).
The DIF of the CPSS-10 was examined by gender (male/female) and age (younger: 22-34; middle-aged: 35-59). We used the likelihood ratio Chi-square test to test DIF. MGRM analysis found that the DIF of items 2, 5, and 8 was statistically significant across age groups. However, the DIF of items 2 and 8 was not statistically significant after Bonferroni adjustment of the α value (0.05/10 = 0.005), whereas the DIF of item 5 remained statistically significant across age groups. No significant DIF was found for gender (Table 5).

Revision Of CPSS-10 Scale
Item 5 was deleted according to DIF (corrected P < 0.005). Then items 2 and 8 were deleted because their discrimination fell outside the interval [−3.00, 3.00]. According to infit and outfit MNSQ values, the items with poor fit were deleted. Finally, the CPSS-7 (composed of 7 items) was obtained. The Cronbach's alpha coefficient of the CPSS-7 was 0.82, and the CPSS-7 was highly correlated with the CPSS-10 (r = 0.98, P < 0.001). The CPSS-7 was found to be more effective in identifying stress perception than the CPSS-10. In addition, no DIF by gender or age was found for the CPSS-7 (Table 5).

Discussion
In this study, we aimed to evaluate the reliability and validity of the CPSS-10 and to test its psychometric properties in the medical staff of emergency medical teams during the COVID-19 epidemic. In previous studies, the CPSS-10 was found to have a high level of internal consistency and reliability [29]. CTT and IRT provided sufficient evidence to confirm the multidimensional, two-factor structure of the CPSS-10 [35,40]. In our study, PCA, the bifactor model and the MGRM all supported the two-dimensional structure of the CPSS-10. MGRM analysis showed that the response categories were ordered by severity. Although item 5 showed significant DIF by age, the other items of the CPSS-10 showed no significant DIF by gender or age. In addition, this study provided sufficient psychometric evidence that the revised CPSS-7 was reliable and effective.
The Cronbach's alpha coefficients of the CPSS-10 and its PHS and PSES subscales were 0.86, 0.91 and 0.77, respectively. This indicated that the CPSS-10 had good internal consistency reliability, in line with the results of other PSS-10 studies (reliability across versions ranges from 0.78 to 0.91) [29,41,42]. In addition, a positive correlation was found between the PHS and PSES subscales of the CPSS-10; this result was consistent with other validation studies in China [43] and South Korea [44], but contrary to the results in Greece [45].
Our study showed that the CPSS-10 might have a bifactor structure. The results of EFA, CFA and the bifactor model indicated that the bifactor model fit best. The bifactor model of the CPSS-10 had a general stress factor and two specific factors [28,46]. The six negative items loaded on the perceived helplessness factor, and the four positive items loaded on the perceived self-efficacy factor. Our result was in accordance with Wiriyakijja's view [46]. However, it differed both from the two-factor solution proposed by Pereira et al. [47] and from the bifactor structure of five negative items and five positive items put forward by Park et al. [44].
IRT models, such as the MGRM, are not common in health studies. Typically, the measurement characteristics and dimensions of a scale are evaluated by CTT and factor analysis [48]. The IRT analysis in our study showed that the items of the PHS and PSES were well fitted by the MGRM. The item parameters showed that both the PHS and PSES can effectively measure and distinguish the latent perceived stress level from graded data [26]. Our study demonstrated ordered thresholds in the category probability curves, which meant that medical workers reporting more stress on an item did have more anxiety than those who claimed to have less trouble. In other words, the CPSS-10 could accurately distinguish low-level and high-level items [49]. However, the discrimination of items 2 and 8 of the CPSS-10 was beyond the recommended range, and item 8 showed strong local dependence (slope > 4) [50]. In contrast, some previous studies did not find local dependence in the PSS-10 [44,51].
This study found that items 2, 5, and 8 showed significant DIF by age (P < 0.05). In addition, the items of the CPSS-10 did not show DIF by gender. However, an American study [52] showed that the PSS-10 had no significant DIF by either gender or age. Based on the findings of DIF by gender and age, it was necessary to verify and revise the CPSS-10.
After deleting some items from a scale, some redundancy can be eliminated, and the performance of the scale may be improved [53]. A previous study showed that when the K10 (0.93) and K6 (0.89) were compared, their Cronbach's alpha coefficients indicated some redundancy [54]. Our analysis confirmed that the Cronbach's alpha coefficient of the CPSS-7 (0.83) was almost equivalent to that of the CPSS-10 (0.86). In addition, the CFI, TLI and SRMR were 1.00, 1.01 and 0.06 for the CPSS-7 and 0.99, 0.99 and 0.07 for the CPSS-10, respectively. All of these indicators showed the good fit of the CPSS-7. Therefore, the CPSS-7 could replace the CPSS-10 to measure the perceived stress of emergency medical teams. In addition, our study found that the revised CPSS-7 did not show DIF by either gender or age, which meant that there was no gender or age bias in the CPSS-7 [26]. Therefore, the CPSS-7 developed and validated by the MGRM is an effective measurement tool with good discrimination ability and high item information, which can be used to quantify the non-specific psychological severity of emergency medical team members during the COVID-19 epidemic.
There are some limitations in this study. First, during the COVID-19 epidemic, an online survey was adopted because it was impossible to conduct face-to-face surveys with the respondents. Although quality control was emphasized, the difference between an online survey and a traditional field survey was not considered. Second, all assessments in this study concerned mental health symptoms/states, not mental disorders. Third, we could not calculate the response rate because participants were recruited anonymously. Finally, our study did not investigate the pre-existing stress and anxiety of medical team members. Some respondents may have had such baseline stress and anxiety, so the reported scores may not be entirely attributable to COVID-19.

Conclusion
CTT, the bifactor model and MIRT showed that the CPSS-10 had high reliability and validity in emergency medical teams during the COVID-19 epidemic. However, owing to cultural differences and the strict requirements of MIRT, the CPSS-10 needed to be modified to measure stress in Chinese medical workers. The CPSS-7 provided stronger measurement performance than the CPSS-10. The CPSS-7 met most of the assumptions of the MGRM, showed no DIF by gender or age, and had no local dependence. This study conducted a more comprehensive analysis of the measurement properties of the Mandarin version of the CPSS during the COVID-19 epidemic, and recommended it as a first-line stress measurement and evaluation tool for emergency medical teams.

Consent for publication
Not applicable.

Availability of data and materials
All authors had full access to the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
