Participants and procedure
A convenience sample of physicians and nurses from across mainland China was recruited using a snowball sampling method between March 27 and April 26, 2020. The inclusion criteria were: (1) being a physician or nurse; and (2) having practiced for at least two years. The exclusion criteria were: (1) an extended break from practice of six months or more for any reason during the past two years; (2) inability to use the internet or mobile devices because of a visual or other disability that prevented completion of an online questionnaire; and (3) not being formally licensed to practice medicine or nursing.
Potential participants were provided with a link to an online questionnaire through a popular social media platform (WeChat). Those who responded to the invitation were encouraged to forward the invitation letter to colleagues and post it on social media sites.
The invitation letter was initially sent to 19,583 potential participants via the WeChat network, of whom 4,003 responded to the invitation; 28 declined to participate after reading the informed consent form, resulting in 3,975 completed questionnaires (Fig. 1). Of those, 969 records were excluded during the data cleaning process, leaving a final sample of 3,006 (583 nurses and 2,423 physicians) for inclusion in the analysis.
Two-week test-retest reliability was determined by asking 100 physicians from three hospitals to complete a questionnaire version of the full survey on two occasions, of whom 73 completed the survey at both times.
Sociodemographic characteristics. Information was collected on age, gender, marital status, educational attainment, ethnicity (Chinese Han vs. minority ethnicity), area of specialty, work area (general medical ward, ICU, emergency room), and length in practice.
Moral injury. The Moral Injury Symptoms Scale-Health Professional (MISS-HP) is a measure of moral injury (MI) symptoms that assesses betrayal, guilt, shame, moral concerns, loss of trust, loss of meaning, difficulty forgiving, self-condemnation, religious struggle, and loss of religious/spiritual faith. Response options for each of the ten items range from 1 to 10, signifying agreement or disagreement with each statement, with a total score ranging from 10 to 100. Higher scores indicate a greater number and severity of MI symptoms.
To assess convergent validity, the 4-item Expressions of Moral Injury Scale-Short Form (EMIS-SF) was administered. Developed by Currier and colleagues, this measure has been used widely to assess MI in military personnel. Items were rated on a Likert scale from 1 (strongly disagree) to 5 (strongly agree). Higher total scores indicate a greater number and severity of MI symptoms, reflecting maladaptive behaviors and internal experiences associated with the moral challenges of delivering clinical care.
Mental health. The 9-item Patient Health Questionnaire (PHQ-9) and the 7-item Generalized Anxiety Disorder scale (GAD-7) were used to measure depressive symptoms and anxiety symptoms, respectively. These two instruments are short screening measures frequently used in medical and community settings. Each item on these measures is rated on a 4-point Likert scale (from 0 to 3) indicating how often each symptom has occurred within the past two weeks. Total scores range from 0 to 27 for the PHQ-9 and from 0 to 21 for the GAD-7, with higher scores indicating more severe symptoms. The Chinese versions of the PHQ-9 and GAD-7 both have strong internal and test-retest reliability as well as strong construct and factor structure validity in medical patients and in the general population [29, 30].
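The possible score ranges follow directly from the number of items and the 0 to 3 response format. The following is a minimal Python sketch of sum scoring under that assumption; the function names are illustrative and are not part of either instrument or of the software used in this study.

```python
def score_range(n_items, item_min, item_max):
    """Return the (lowest, highest) possible sum score for a scale."""
    return n_items * item_min, n_items * item_max


def total_score(responses, n_items, item_min=0, item_max=3):
    """Sum the item responses after checking their count and range."""
    if len(responses) != n_items:
        raise ValueError("unexpected number of items")
    if any(not (item_min <= r <= item_max) for r in responses):
        raise ValueError("response outside the allowed range")
    return sum(responses)


phq9_range = score_range(9, 0, 3)  # nine items scored 0-3
gad7_range = score_range(7, 0, 3)  # seven items scored 0-3
```

Under these assumptions a respondent who endorses the highest option on every item reaches the top of the range for that scale.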
Well-being. The 12-item Secure Flourishing Index (SFI) was used to measure six domains of well-being: happiness and life satisfaction, physical and mental health, meaning and purpose, character and virtue, close social relationships, and financial and material stability. Each item was measured on an 11-point visual analogue scale (from 0 to 10), where higher scores indicate higher levels of well-being in each of these areas. Two items assess each of the six domains, and these are averaged to give domain-specific scores; the total SFI score is calculated as the average of all six domain scores with equal weighting. The Chinese version of the SFI has been shown to have acceptable validity and reliability in a Chinese sample.
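The two-stage averaging described above can be sketched in Python as follows. This is an illustration only: it assumes the 12 responses are ordered so that consecutive pairs belong to the same domain, which is a convenience of the example and not a statement about the item order of the instrument.

```python
from statistics import mean

# Domain names as listed in the text; the pairing of items is assumed.
DOMAINS = [
    "happiness and life satisfaction",
    "physical and mental health",
    "meaning and purpose",
    "character and virtue",
    "close social relationships",
    "financial and material stability",
]


def sfi_scores(items):
    """Return (domain_scores, total) from 12 responses scored 0-10."""
    if len(items) != 12:
        raise ValueError("the SFI has 12 items")
    # Average the two items of each domain.
    domain_scores = {
        name: mean(items[2 * i: 2 * i + 2]) for i, name in enumerate(DOMAINS)
    }
    # Equal-weight average of the six domain scores.
    total = mean(list(domain_scores.values()))
    return domain_scores, total


example_domains, example_total = sfi_scores([8, 6, 7, 7, 10, 8, 5, 5, 9, 7, 4, 2])
```

Because every domain has exactly two items, the equal-weight average of domains equals the simple mean of all 12 items; the two would differ only if domains had unequal numbers of items.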
Burnout. A modified Maslach Burnout Inventory-Human Services Survey for Medical Personnel (MBI‐HSMP) was used to measure the three dimensions of burnout that include emotional exhaustion, depersonalization, and reduced personal accomplishment. Each item on the 22-item scale is scored on a 7‐point Likert scale from 0 (never) to 6 (daily). Higher scores on each subscale and the overall scale indicate higher levels of burnout. The Chinese version of MBI‐HS has been translated following a standard procedure and shown to have acceptable reliability and validity in a sample composed of participants from a range of occupations.
Workplace violence. Workplace violence was measured by asking, “Have you ever been attacked by your patients or their close relatives, either physically or verbally?” Response categories were yes or no.
Translation of instruments
The 4-step procedure recommended by the WHO was used to guide the translation of instruments in this study into Chinese [35, 36]. First, the original English MISS-HP was translated into Chinese by two health professionals from our research team who were fluent in both Chinese and English. The two translations were then compared and discrepancies reconciled to arrive at a draft Chinese version. Second, a bilingual expert panel consisting of three health professionals (including the original translators) and two social science researchers reviewed the draft Chinese translation separately, making cultural adaptations as necessary. Third, the draft Chinese version was back-translated into English by two bilingual health professionals (different from the translators in the first step). The back-translated English version was then compared with the original English version and reviewed by the original author to ensure that the questions had been translated correctly, and discrepancies were resolved at this stage. Fourth, the draft version of the scale was administered to 11 physicians from two hospitals for pre-testing. These physicians were asked to send back comments about ease of administration, clarity of wording, and time burden. Necessary changes in language were then made by consensus to arrive at the final Chinese version of the MISS-HP (Supplementary Table 1).
Missing values. When computing scale scores, the mean substitution method was used to replace missing values. If two items or fewer on a scale were missing, we substituted the average of items answered on the scale for the missing item score. If more than two items were missing, the scale score was considered missing and no substitutions made.
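The substitution rule can be illustrated with a short Python sketch. It assumes that scale scores are item sums and that a missing response is coded as None; both are conventions of this example, and the function name is hypothetical.

```python
def scale_score(responses, max_missing=2):
    """Sum score with mean substitution; None marks a missing item.

    Returns None when more than `max_missing` items are missing.
    """
    answered = [r for r in responses if r is not None]
    n_missing = len(responses) - len(answered)
    if n_missing > max_missing or not answered:
        return None  # scale score treated as missing
    item_mean = sum(answered) / len(answered)
    # Each missing item is replaced by the mean of the answered items.
    return sum(answered) + n_missing * item_mean


two_missing = scale_score([1, 2, 3, None, None, 2, 2, 2, 2, 2])
three_missing = scale_score([1, 2, None, None, None, 2, 2, 2, 2, 2])
```

In the first case the eight answered items average 2, so each of the two gaps is filled with 2; in the second case three items are missing and no score is computed.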
Statistical analyses. Descriptive analyses were performed for all participants, using summary statistics appropriate to whether each variable was categorical or continuous. Differences in sociodemographic characteristics between nurses and physicians were tested using Student's t-test for continuous variables and the chi-square test for categorical variables. Differences in MISS-HP total scores between demographic groups were examined using one-way analysis of variance (ANOVA). General linear regression was used to control for covariates.
Convergent/divergent validity was determined by examining correlations between the MISS-HP score and other measures. A correlation matrix was constructed using Pearson correlation coefficients. Cronbach's alpha was used to assess the internal consistency of the MISS-HP, where alphas equal to or greater than 0.70 are considered acceptable. The intra-class correlation coefficient (ICC) was used to determine 2-week test-retest reliability, where ICCs between 0.41 and 0.60 indicate moderate reliability, those between 0.61 and 0.80 represent good reliability, and those higher than 0.80 indicate excellent reliability. Internal reliability tests were performed separately for the total sample, nurses, and physicians.
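For readers unfamiliar with these statistics, the sketch below shows the standard formula for Cronbach's alpha, k/(k-1) × (1 − Σ item variances / variance of total scores), together with the ICC interpretation bands stated above. It is written in plain Python for illustration; the analyses in this study were run in SPSS, and the function names and example data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of k item scores per respondent."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def icc_label(icc):
    """Map an ICC to the bands used in the text.

    Values below 0.41 are not classified in the text.
    """
    if icc > 0.80:
        return "excellent"
    if icc >= 0.61:
        return "good"
    if icc >= 0.41:
        return "moderate"
    return "not classified"


# Toy data in which the three items move together perfectly.
toy_alpha = cronbach_alpha([[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]])
```

With perfectly parallel items, as in the toy data, alpha reaches its maximum of 1; real questionnaire data give lower values, which are then compared with the 0.70 threshold.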
Exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) were conducted to identify and confirm the factor structure of the scale. The total sample of physicians was split randomly into two groups. EFA was performed using principal components analysis with Promax rotation (an oblique rotation method allowing factors to correlate with each other) using group 1 (n = 1,198). The Kaiser-Meyer-Olkin (KMO) index was used to measure sampling adequacy, where KMO values of 0.6 or higher indicate adequacy. Bartlett's test of sphericity was used to assess whether the correlations between variables were appropriate for factor analysis. For nurses, the full sample was used for both EFA and CFA.
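A random split of this kind can be sketched as follows. The mechanism actually used is not described in the text; the example assumes that each record was assigned independently to one of two groups with equal probability, which would be consistent with the unequal group sizes reported. The seed and identifiers are placeholders.

```python
import random


def random_split(ids, seed=2020):
    """Assign each id to group 1 or group 2 with probability 0.5."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    group1, group2 = [], []
    for pid in ids:
        (group1 if rng.random() < 0.5 else group2).append(pid)
    return group1, group2


physician_ids = list(range(2423))  # placeholder ids for 2,423 physicians
efa_group, cfa_group = random_split(physician_ids)
```

Every record ends up in exactly one group, so the two group sizes always sum to the number of physicians, but the sizes themselves vary with the seed.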
CFA using the maximum likelihood method was performed to assess the stability of the factor structure (using group 2, n = 1,225 for physicians). Model adequacy was determined using the chi-square test with degrees of freedom (df). A non-significant chi-square (p-value greater than 0.05) indicates acceptable fit, and smaller χ2 values relative to df indicate better fit. Indices of model fit included the comparative fit index (CFI), normed fit index (NFI), incremental fit index (IFI), and root mean square error of approximation (RMSEA). The Akaike information criterion (AIC) was also calculated. Values of CFI > 0.90, NFI > 0.90, IFI > 0.90, and RMSEA < 0.08 indicate that the model fit is acceptable. All statistical analyses were performed using IBM SPSS version 23.0 (SPSS Inc., Chicago, IL, USA).
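The fit criteria listed above amount to a simple set of threshold checks, sketched here in Python. The function name and the input values are hypothetical and serve only to show how the cut-offs are applied; they are not results from this study.

```python
def check_fit(cfi, nfi, ifi, rmsea):
    """Compare fit indices with the cut-offs stated in the text."""
    checks = {
        "CFI > 0.90": cfi > 0.90,
        "NFI > 0.90": nfi > 0.90,
        "IFI > 0.90": ifi > 0.90,
        "RMSEA < 0.08": rmsea < 0.08,
    }
    # The model is treated as acceptable only if every criterion is met.
    return checks, all(checks.values())


# Hypothetical index values for illustration only.
good_checks, good_ok = check_fit(cfi=0.95, nfi=0.93, ifi=0.95, rmsea=0.06)
poor_checks, poor_ok = check_fit(cfi=0.95, nfi=0.93, ifi=0.95, rmsea=0.10)
```

Returning the individual checks alongside the overall verdict makes it easy to see which index falls short when a model is rejected.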