Development and Psychometric Testing of a Generic Student Engagement Scale for Contemporary Healthcare Education

Background: Student engagement receives heightened research attention in traditional and contemporary education. This study aimed to develop and validate a Generic Student Engagement Scale (GSES) for use in contemporary healthcare education, including online, distance learning and e-learning. Methods: A two-phase methodological study employed a cross-sectional design with repeated measures. Phase 1 aimed to develop the items of the GSES for use in the generic learning environment of nursing students through item development/revision, content validation, face validation, and statement standardization. Phase 2 aimed to test the psychometric properties of the newly developed GSES in terms of internal consistency, stability, and factorial validity. Data were collected in 2016-2018. Results: Based on the 45 items of the Distance Student Engagement Scale (i.e., a 3-factor model), 39 items were revised and adapted to fit the generic learning environment in nursing education. Seven experts and 20 nursing students examined each item and indicated satisfactory content validity (CVI = 0.71–1.00 except item G9) and face validity (100% comprehensibility and interpretability). Analyzing 451 valid data sets from nursing students, confirmatory factor analysis did not support the 3-factor model of the GSES. Exploratory factor analysis was then used to explore the new internal structure of the GSES. A 26-item 4-factor solution accounting for a total of 41.0% of variance was found to be optimal. The factors were labelled "self-regulated learning", "cognitive strategy use", "teacher-student interaction", and "experienced emotion". The internal consistency and stability of this new model were re-examined with satisfactory results. The newly developed 26-item 4-factor GSES is a reliable and valid scale measuring student engagement among nursing students.

Since online learning is more dependent on students' autonomy and self-regulation, student engagement in the online mode of learning is a concern for all educators [4].

Student Engagement
Student engagement is defined by Kuh as the time and energy students devote to educationally sound activities, which is regarded as a key factor that facilitates school completion, enhances achievement motivation [5], and is associated with students' academic performance [6]. From the educational and institutional perspectives, student engagement is considered an important factor that leads to educational reform and evaluation, and that mediates the influence of curricular and instructional reforms on student achievement [7]. Student engagement has gained greater importance in the field of higher education because higher education highlights independence, autonomy, and freedom in learning [8]. Both Kuh [9] and Coates [10] consistently suggested that student engagement should be regarded as an indicator of higher education institutional performance by inducing students to participate in educationally purposeful activities. With the advent of online education, an increasing number of higher education institutions have emphasized the importance of promoting online student engagement to improve student achievement in the online environment, and as a way to address the boredom and dropout problems of online learning [11,12]. Consequently, the measurement of online student engagement has gradually become a focus of higher education research, particularly in healthcare disciplines, for which a bachelor degree or higher diploma is commonly the entry qualification.

Measurement of Student Engagement
There are several existing scales measuring student engagement in the literature [13]; however, these rarely reflect the increased use of technology and online tools in both teaching and learning. The multifaceted nature of engagement has been reflected and documented in the education literature over the last few decades. However, the constructs of engagement vary with both the theoretical perspective of the researcher and the educational context in which students engage themselves [14]. Recently, with the advancement of technological innovations in education, the learning environment has shifted from the classroom to a hybrid learning environment that blends Web and face-to-face components in the same course. Hybrid learning environments such as the integration of e-learning [15], mobile learning [16], and simulation [17] are also common in nursing education. More importantly, online learning has become the sole mode and a new trend for sustaining the education of healthcare disciplines during pandemics [11].
Fredricks and Paris [18] proposed and tested a three-dimensional construct of engagement that appropriately covers the most essential elements and encompasses "behavioral engagement, emotional engagement, and cognitive engagement" (p. 62), which is the more recognized conceptualization of engagement in the literature [14]. Behavioral engagement represents student behaviors while involved in school activities (including following the rules) and academic tasks. Emotional engagement refers to "students' affective reactions" (p. 63) both inside the classroom and on the school campus in general. Cognitive engagement is described as a preference for challenge, strategy use, and self-regulation. The three-dimensional construct of student engagement is widely accepted in the literature because it adequately encompasses the key dimensions reflecting student engagement and is applicable to any teaching mode, including traditional classroom teaching and online education.
For these new initiatives, many traditional instruments are now considered inadequate in capturing student engagement in the new learning environment, thus requiring the development of new engagement models and their corresponding instruments. However, the literature review shows that there are few reliable and valid scales for capturing student engagement in the online learning environment. A scale named the Online Student Engagement (OSE) Scale, revised from Handelsman et al.'s Student Course Engagement Questionnaire [6], was claimed to fit the measurement of engagement in the online environment [11]. This 19-item scale measures engagement in four factors: skills, emotion, participation and performance. However, both the theoretical framework and how the items were constructed remain questionable. For example, the adequacy of the scale is doubtful because the cognitive aspect mentioned in the three-dimensional construct of engagement is missing. Moreover, incorporating learning outcomes (the "performance" factor of the OSE) as items (i.e., item 15: Getting a good grade; item 16: Doing well on the quizzes) for reflecting student engagement was controversial and invited much criticism [18]. More importantly, this scale was generic rather than specifically developed for measuring student engagement in the online learning environment.
The Distance Student Engagement Scale (DSES) was developed based on the three-dimensional construct of engagement [18] to measure student engagement among students enrolled in universities and online educational institutions in China [19]. Unlike other instruments measuring engagement in traditional learning settings, the DSES contains items that reflect the use of contemporary technologies in the blended learning mode, such as online platforms, discussion forums, and audiovisual materials. The blended learning perspective of the DSES largely matches the situation of Hong Kong higher education and is relevant to international universities.
The DSES is a 45-item self-administered questionnaire presented in simplified Chinese [the written form of Chinese used in mainland China] [20]. A three-factor structure (behavioral, emotional and cognitive engagement) has been identified using exploratory factor analysis. The behavioral engagement subscale measures students' involvement in learning-related activities such as active participation and sharing, whereas the emotional engagement subscale measures students' emotional input in states such as happiness, curiosity and boredom. The cognitive engagement subscale reflects students' mental affordance such as elaboration and metacognition. The psychometric properties of the DSES, including internal consistency, criterion-related validity, and construct validity (assessed by confirmatory factor analysis), are all satisfactory (e.g., Cronbach's alpha = 0.88-0.96 for the scale and subscales; Comparative Fit Index = 0.81, Tucker-Lewis Index = 0.80, Root Mean Square Error of Approximation < 0.08) (for details see [19]). The authors of [13] emphasized that students from different socioeconomic statuses and educational contexts may require distinct items and instruments that can accurately reflect their engagement. Even though the psychometric properties of the DSES have been examined in a group of healthcare students in mainland China, cross-cultural adaptation and retesting of the psychometric properties are still necessary when the DSES is applied to different target groups and cultures. To carefully examine the applicability of this scale to contemporary healthcare education in Hong Kong, the current study aimed to revise and translate (into English) the DSES to adapt it for measuring student engagement in online higher education. Furthermore, the psychometric properties of the revised DSES were examined among a group of students in healthcare disciplines.

Study Design
This was a methodological study employing a cross-sectional design. The process employed a recognized instrument development design that consists of two phases, namely item generation and psychometric testing [21,22]. Table 1 presents an overview of the study plan. Data for the two phases were collected in 2016-2018.
Step 1 was to revise the items of the DSES for a generic learning environment; the revised scale was renamed the Generic Student Engagement Scale (GSES) to better reflect its nature.
Step 2 was to perform content validation. A panel of seven experts was invited to examine the relevance and adequacy of the first draft of the GSES.
Step 3 was to perform face validation.
A group of 20 target respondents (i.e., 20 randomly selected students studying in higher education) was invited to comment on the comprehensibility and interpretability of the drafted GSES items, a method suggested in the literature [22,23,29].
Step 4 was the standardization of item statements. Items were rephrased in a uniform manner and with similar sentence structure to facilitate respondents' comprehension. The major principles observed in item construction were maintenance of clarity, preference for short statements, avoidance of double negatives, avoidance of double-barreled statements, and avoidance of factual statements, in line with typical item-design methods [22,23,24].
Phase 2 aimed to perform psychometric testing of the newly developed GSES in measuring student engagement among a group of nursing students studying in higher education. The reliability of the GSES was reflected through its internal consistency and stability. For construct validity, confirmatory factor analysis (CFA) was used to evaluate whether a pre-specified factor model of the GSES provides a good fit to the data [25]. This method, which examines the internal structure of the target instrument, provides evidence of construct validity.

Setting and Sample
Phase 1 involved a panel of at least seven experts (including educators working in higher education and researchers with experience in education or social science) and 20 randomly selected students studying in higher education.
In Phase 2, a convenience sample of about 650 full-time students majoring in healthcare subjects (e.g., nursing, dietetics, physiotherapy) was recruited. They were studying at a local university in Hong Kong. Based on Li and Yu [19], a 30% attrition rate was estimated. As Li and Yu [19] conducted CFA using a sample size of 443, this study invited 650 participants to ensure a comparable sample size. Then, a subsample of 70 participants was selected to respond to the GSES four weeks later to examine test-retest reliability [26].

Ethical consideration
The study obtained approval from the ethics committee of the University. Through an information sheet, all participants were fully informed about the aim and procedure of this study. The researchers explained and highlighted anonymity and confidentiality. Respondents were estimated to spend about 20 minutes completing the questionnaire.

Measurement
The questionnaire used in Phase 2 consisted of two parts. Part one was demographic questions including gender, age, and year of study. Part two was the GSES.

Data Collection
Prior to data collection, ethical approval was obtained and participants were informed as described under Ethical consideration. Respondents completed the questionnaire in about 20 minutes.
Data Analysis

IBM® SPSS Statistics version 22 was used to analyze the quantitative data. Descriptive statistics, for example, frequency, percentage and mean, were used to describe and summarize the questionnaire data. Inferential statistics, including Cronbach's alpha and the intraclass correlation coefficient, were used to examine internal consistency and stability, respectively. For confirmatory factor analysis, IBM® SPSS AMOS was used to test the hypothesized factor model of the GSES.
Both descriptive and inferential statistics were adopted to conduct psychometric testing as follows. The significance level was set at 0.05.

Content validation
Seven experts were invited to evaluate the relevance of the items using a 4-point Likert scale (i.e., 1 = not relevant; 2 = somewhat relevant; 3 = quite relevant; 4 = highly relevant) and the adequacy of the entire scale on a dichotomous scale (i.e., yes/no). For problematic items receiving a rating of 2 or below, written comments were required. Experts were asked to suggest additional items if they considered the scale inadequate. The content validity index (CVI), comprising the item-level CVI (i.e., the proportion of experts who gave the item a relevance rating of 3 or 4) and the scale-level CVI (i.e., the average of the item-level CVIs for all items on the scale), was computed and presented. These indices indicate the proportion of relevant responses. According to Polit, Beck and Owen [27] and Portney and Watkins [28], when using an expert panel of six members, the item-level CVI and scale-level CVI should be greater than or equal to 0.80 to indicate the validity of instrument content.
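As an illustration, the two CVI indices described above can be computed as in the following sketch; the item names and expert ratings here are hypothetical, invented only for demonstration.

```python
# Hypothetical relevance ratings (1-4) from a panel of 7 experts for 3 items
ratings = {
    "G1": [4, 4, 3, 4, 3, 4, 4],
    "G2": [3, 4, 4, 3, 4, 4, 3],
    "G9": [2, 3, 1, 2, 4, 2, 3],
}

def item_cvi(scores):
    # Item-level CVI: proportion of experts rating the item 3 or 4 (relevant)
    return sum(s >= 3 for s in scores) / len(scores)

item_cvis = {item: item_cvi(scores) for item, scores in ratings.items()}
# Scale-level CVI (averaging approach): mean of the item-level CVIs
scale_cvi = sum(item_cvis.values()) / len(item_cvis)
```

With this toy panel, an item rated 3 or 4 by all seven experts gets an item-level CVI of 1.00, while an item rated relevant by only three experts gets 3/7 ≈ 0.43, mirroring the pattern reported for item G9.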

Face validation
Twenty students from several healthcare courses were invited to evaluate the instrument "for assuring the expression of the items was understandable words and style" [29]. The participants inspected and commented on each item to confirm comprehensibility (i.e., a Yes/No nominal scale), and rephrased each item in their own words (i.e., interpretability). The researchers then determined whether the rephrased content was appropriate on a 4-point scale (i.e., 1 = fully correct; 2 = generally correct; 3 = partially wrong; 4 = totally wrong) [28]. Participants would be asked to make suggestions through discussion if there was a great discrepancy (i.e., a score of 3 or 4) in interpretation.

Internal consistency
Cronbach's alpha was calculated to determine internal consistency, which was considered adequate if the value was greater than 0.7 [28].
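For reference, Cronbach's alpha can be computed from a respondents-by-items score matrix as sketched below; this is a minimal illustration of the standard formula, not the SPSS procedure used in the study, and the toy data are invented.

```python
import numpy as np

def cronbach_alpha(X):
    """Cronbach's alpha for a respondents-by-items score matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    X = np.asarray(X, dtype=float)
    k = X.shape[1]                          # number of items
    item_vars = X.var(axis=0, ddof=1)       # sample variance of each item
    total_var = X.sum(axis=1).var(ddof=1)   # variance of respondents' total scores
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Toy data: 5 respondents answering 4 Likert-type items
scores = [[4, 4, 3, 4],
          [3, 3, 3, 2],
          [5, 4, 5, 5],
          [2, 2, 1, 2],
          [4, 5, 4, 4]]
alpha = cronbach_alpha(scores)
```

Strongly inter-correlated items, as in the toy data, push alpha toward 1; the 0.7 threshold cited above is then applied to the computed value.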

Stability
For stability, four-week test-retest reliability was assessed: initial GSES scores were compared with those obtained from the same participants four weeks later. An intraclass correlation coefficient (ICC) greater than 0.75 would indicate good test-retest reliability [28].
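A test-retest ICC is commonly obtained from a two-way ANOVA decomposition of the subjects-by-sessions score matrix. The sketch below implements the single-measures consistency form, ICC(3,1); note this specific form is an assumption for illustration, as the paper does not state which ICC model was used, and the data are invented.

```python
import numpy as np

def icc_3_1(scores):
    """ICC(3,1): two-way mixed, single measures, consistency, computed from
    an n_subjects x k_sessions matrix via the ANOVA mean squares."""
    Y = np.asarray(scores, dtype=float)
    n, k = Y.shape
    grand = Y.mean()
    ss_rows = k * ((Y.mean(axis=1) - grand) ** 2).sum()    # between-subjects
    ss_cols = n * ((Y.mean(axis=0) - grand) ** 2).sum()    # between-sessions
    ss_err = ((Y - grand) ** 2).sum() - ss_rows - ss_cols  # residual
    ms_rows = ss_rows / (n - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)

# Toy test-retest data: 4 subjects scored at baseline and 4 weeks later
icc = icc_3_1([[10, 11], [14, 15], [18, 17], [22, 23]])
```

Subjects who keep their rank order across the two sessions, as here, yield an ICC well above the 0.75 criterion.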

Construct validation
The internal structure of the GSES was tested by CFA, which is regarded as the most rigorous method for this purpose [28]. It examines the degree of fit of the data to a hypothesized model. Correlation matrices were entered into the analyses, and the maximum likelihood estimation procedure in AMOS was used to conduct each analysis. The goodness-of-fit measures used to assess model fit included the chi-square/degrees of freedom ratio (χ2/d.f.), the root mean square error of approximation (RMSEA), the comparative fit index (CFI), and the incremental fit index (IFI), all of which are frequently reported indices [21,28]. The goodness-of-fit criteria were χ2/d.f. < 3.00, CFI and IFI > 0.90, and RMSEA < 0.08 [30]. If CFA failed to validate the internal structure of the GSES, exploratory factor analysis (EFA) would be used to explore the new factor structure of student engagement among nursing students in Hong Kong higher education [28].
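To illustrate how such indices relate to the model chi-square, the sketch below computes χ2/d.f., RMSEA, and CFI from their standard formulas. The baseline (independence-model) chi-square here is a hypothetical placeholder; in a real analysis all values are taken from the fitted AMOS output.

```python
import math

def fit_indices(chi2, df, n, chi2_base, df_base):
    """chi2/df ratio, RMSEA and CFI from model and baseline (independence-model)
    chi-square statistics for a sample of size n."""
    chi2_df = chi2 / df
    # RMSEA = sqrt(max(chi2 - df, 0) / (df * (n - 1)))
    rmsea = math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))
    # CFI = 1 - max(chi2 - df, 0) / max(chi2_base - df_base, chi2 - df, 0)
    cfi = 1 - max(chi2 - df, 0.0) / max(chi2_base - df_base, chi2 - df, 0.0)
    return chi2_df, rmsea, cfi

# Model chi-square, df and n as reported for the GSES; baseline value is invented
chi2_df, rmsea, cfi = fit_indices(3793.80, 942, 451, chi2_base=10000.0, df_base=990)
```

Applying the RMSEA formula to the chi-square and sample size reported in the Results reproduces the χ2/d.f. ≈ 4.03 and RMSEA ≈ 0.08 values given there.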

Results
Of the 45 items of the DSES, 39 were reworded and contextualized, to different extents, to a generic learning environment in higher education. Many items were revised or rephrased to increase comprehensibility due to the change from the written language commonly used in Beijing to the one commonly used in Hong Kong. For example, "would" in the original version (n = 30) was deleted because this instrument aimed to reflect student engagement through actual learning experiences rather than expectations. Some items (i.e., items D6, D12, D16, D17, D18, D30, D31, D32, D33, D34, D35, D38) were revised to remove the emphasis on the distance learning mode. Examples of various learning materials, devices and platforms were added in brackets to increase the relevance and adaptability of the GSES for fitting into different learning modes (i.e., items D1, D2, D11, D24, D28, D29). A few items (i.e., items D24, D27, D31, D33, D34) were shortened to increase clarity.
In step 2, seven experts, including academic staff in nursing and social science, commented on the relevance and adequacy of the 45-item GSES. The item-level CVI ranged from 0.80 to 1.00, except item G9 with 0.43 and some items (G12, G17, G23, G28, G34, G46) with 0.71. Regarding item G9, "discussion of nonacademic related matters with teachers and students", the experts questioned why the discussion of "nonacademic matters" should be relevant to an engagement scale. This item was used to reflect the construct "behavioral engagement" by assessing students' habits of sharing daily matters with someone in school.
Like the National Survey of Student Engagement (NSSE, http://nsse.indiana.edu/), some items assessed engagement in student life, including "activities other than coursework", "management of non-academic responsibilities", and "attending campus activities and events" [6]. Therefore, item G9 was retained. The items with an item-level CVI of 0.71 were considered good because of the increased number of experts [27]. Overall, the scale-level CVI was 0.92.
In step 3, twenty students studying for a bachelor degree or a higher diploma in healthcare courses reviewed and commented on the comprehensibility and interpretability of the 45-item GSES. They first indicated whether they could comprehend each item and were then requested to describe and explain each item in their own words. The results indicated that all items achieved 100% comprehensibility and interpretability.
In step 4, the researchers reworded some items (e.g., items G7, G9, G16, G17, G18) in a uniform manner or with similar sentence structure. For example, the term "teachers or students" is consistently used to refer to "people who come into contact during the learning process in school". Table 2 presents the overall description of the item modifications and the justifications for the abovementioned steps.
In Phase 2, a total of 665 students from different healthcare courses participated in the study.
After the data cleansing procedure, 214 data sets (attrition rate = 32.1%) were removed because of habitual responses and incomplete data, leaving 451 data sets for analysis (i.e., 163 studying higher diploma year 1, 91 studying higher diploma year 2, 112 studying bachelor degree year 1, 91 studying bachelor degree year 2). Their ages ranged from 18 to 33 years (mean = 20.6). Females (n = 348) accounted for 77.2% of participants.
The GSES demonstrated satisfactory reliability. Cronbach's alpha, reflecting the internal consistency of the scale, was 0.93; the values were 0.80, 0.84 and 0.90 for the behavioral, emotional and cognitive engagement subscales, respectively. For stability, the 4-week test-retest on the randomly selected subsample of 70 participants yielded an ICC of 0.88 (95% confidence interval = 0.81-0.92), suggesting good stability.
Face, content and construct validation of the GSES were all performed. The results of face and content validation were presented in the Phase 1 study. For construct validation, the results of CFA were unsatisfactory: CFA did not support the hypothesized 3-factor structure of the GSES, with unacceptable goodness-of-fit indices (i.e., χ2/df (3793.80/942) = 4.03, CFI = 0.64, IFI = 0.65, RMSEA = 0.08). Table 3 illustrates the psychometric properties of the GSES in comparison with the DSES. As the fit statistics did not support the three-factor model of the GSES in students of healthcare disciplines, EFA was then used to explore any new structure of the GSES. First, the suitability of the data for factor analysis was assessed using Bartlett's test of sphericity and the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy [31]. Bartlett's test of sphericity reached statistical significance (chi-squared = 18693.51, df = 990, p < .001) and the KMO value of 0.89 (≥ 0.60) supported the factorability of the correlation matrix [31]. Based on the results of Kaiser's criterion and the scree plot test, the researchers examined the item loadings of 2-, 3-, 4- and 5-factor models using various extraction methods (i.e., principal component analysis for general use, maximum likelihood for normally distributed data, or principal axis factoring for non-normally distributed data) with two rotation methods (i.e., promax for oblique rotation or varimax for orthogonal rotation) [32]. A total of 24 factor solutions were computed, examined and compared. Eventually, the 26-item 4-factor solution obtained by principal component analysis with varimax rotation, as shown in Table 4, was identified as optimal because it maintained conceptual adequacy with fewer cross-loadings [32]. Indeed, the internal structure of this version was the most interpretable and clear, with items measuring similar dimensions of engagement clustered together in a similar way as suggested by the literature [18,29].
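The Bartlett statistic used in this factorability check has a standard closed form, chi2 = -(n - 1 - (2p + 5)/6) ln det(R) with df = p(p - 1)/2; the sketch below shows that computation from a correlation matrix (a minimal illustration; the study itself ran the test in SPSS).

```python
import numpy as np

def bartlett_sphericity(R, n):
    """Bartlett's test of sphericity for a p x p correlation matrix R and
    sample size n: chi2 = -(n - 1 - (2p + 5)/6) * ln(det(R)), df = p(p-1)/2."""
    R = np.asarray(R, dtype=float)
    p = R.shape[0]
    chi2 = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    df = p * (p - 1) // 2
    return chi2, df

# For 45 items and n = 451 (as in the GSES data), df matches the reported 990.
# An identity matrix (zero inter-item correlation) gives chi2 = 0, i.e., no
# evidence of factorability; real item correlations inflate chi2.
chi2, df = bartlett_sphericity(np.eye(45), 451)
```

A significant chi-square rejects the hypothesis that the correlation matrix is an identity matrix, which is the precondition for proceeding with EFA as described above.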
This 4-factor structure accounts for a total of 41.02% of the variance. The factors were then labelled: "self-regulated learning" (factor 1, 9 items, 26.42% of variance), which refers to learning cognition that is regulated proactively by oneself; "cognitive strategy use" (factor 2, 6 items, 5.59% of variance), which refers to learning strategies effectively used by oneself; "teacher-student interaction" (factor 3, 6 items, 4.88% of variance), which refers to one's learning behavior through interaction with teachers and students; and "experienced emotion" (factor 4, 6 items, 4.13% of variance), which refers to typical emotions experienced through the process of learning.
Examples of item revisions (from Table 2):
- D8 "I would actively respond to questions, calls for help, or posts from teachers or classmates." → G8 "I actively respond to questions and calls for help from teachers or students." (Step 1: deletion of "would" to increase comprehensibility in Cantonese; Step 4: standardized use of "teachers or students".)
- G9 "I always discuss extracurricular matters with my teachers or students." (Step 1: deletion of "would"; Step 4: standardized phrasing.)
- D10 "I would always discuss course-related learning matters with my classmates." → G10 "I always discuss course-related learning matters with my classmates." (Step 1: deletion of "would"; Step 4: standardized phrasing.)
- G33 "I try to arrange for a comfortable environment for my studying." (Step 1: deletion of "would"; removal of the emphasis on the distance learning mode; item shortened to increase clarity.)
- G38 "I use some strategies to alleviate the stress from learning." (Step 1: deletion of "would".)
- G28 "I always make use of memorization strategies to study during the course and review before exams (e.g., learning by rote, using images and mind maps)." (Cognitive strategy use.)

The internal consistency of this 26-item 4-factor GSES was re-examined with satisfactory results (Cronbach's alpha = 0.90 for the entire scale, 0.85 for factor 1, 0.80 for factor 2, 0.77 for factor 3 and 0.77 for factor 4). The stability, re-computed for the 26-item GSES, also obtained satisfactory results (ICC = 0.85, 95% confidence interval = 0.77-0.87). The final version of the GSES is provided in the supplementary material.

Discussion
Understanding student engagement not only facilitates school completion but also shapes educational reform and evaluation [19]. However, the measurement of student engagement needs to be modified according to the learning environment [13]. The newly developed GSES is a timely response to this emerging need, i.e., online and e-learning during pandemics.

Issues in instrument development
A conventional instrument development process involves item generation, item reduction, and psychometric testing [22,28]. The present study developed a new engagement scale by adapting the most relevant existing scale, the DSES. This method combined the steps of item generation and reduction and had the advantage of saving time and additional work in designing item structure and response modality [23,24]. Unlike many published studies of instrument development, this study offers an explicit description of, and justifications for, the process of item generation, which will facilitate other researchers in this field adopting the GSES, performing cross-cultural validation, or conducting psychometric testing in another population [23,28,29].

Issues in psychometric testing
The results of internal consistency, stability, and hypothesis testing provided satisfactory evidence for the reliability and validity of the 45-item GSES. Comparing the psychometric testing conducted in this study with that in Li and Yu's [19] study, two issues are worth discussing. First, several significant psychometric properties such as stability, face validity and content validity were investigated in this study. These results contribute to the knowledge of engagement measurement. Second, based on the same factor structure of the GSES and DSES, the Cronbach's alphas were comparable at the scale level (0.93 vs 0.96) and subscale levels (0.80-0.90 vs 0.88-0.93). Nevertheless, the CFA results differed greatly between the GSES and DSES. Unlike the satisfactory CFA results of the DSES, the originally hypothesized 3-factor structure of the GSES was not supported by this robust structural examination. Instead, EFA was used to explore the latent structure of the GSES among students in healthcare disciplines. The results suggested a 26-item 4-factor solution. The different factor structures of the GSES and DSES are discussed in the next section.
Although it is believed that the relevance and interpretability of the 45-item and 26-item versions should not differ, only the internal consistency and stability of this new structure were re-examined. This calls for a further study to re-examine the factor structure through CFA using a new data set.
Conceptual comparison of 26-item GSES, 45-item DSES, and existing instruments

When examining the four latent factors, it is observed that items with similar meaning are grouped into one factor. Table 5 presents the comparison of the 26-item GSES and 45-item DSES. The factor "self-regulated learning" encompasses nine items that originate entirely from the cognitive engagement factor of the 3-factor model. More importantly, all items clearly reflect how students perform self-regulated learning through proper management of time and resources, self-reflection and self-evaluation, and adaptive learning methods. As in other engagement scales, self-regulated learning grouped as a single factor (or subscale) is common and comparable to the "self-efficacy" subscale of the Motivation and Engagement Scale-High School [33] and the "reflective & integrative learning" subscale of the NSSE.
The second factor, "cognitive strategy use", includes six items that come from the cognitive engagement factor of the original model. This factor covers various study strategies that are cognitively embedded in the learning process (e.g., memorization, connecting old and new knowledge, and use of real-life examples). The literature indicates that the concepts identified as "cognitive strategy use" are frequently mentioned in other engagement scales, for example, the "learning strategy section" subscales of the Motivated Strategies for Learning Questionnaire (College version) [34] and the "learning strategies" and "quantitative reasoning" subscales of the NSSE.
The factor "teacher-student interaction" encompasses concepts, including many occasions of interaction between teachers and students, similar to those of the behavioral engagement factor of the original model. Although one item (G20) was originally regarded as emotional engagement, its meaning describes the sense of being respected during teacher-student interaction, which on many occasions would be considered a social dimension. This factor condensed the four aspects of behavioral engagement in the original model into a single concept, which might suggest that interaction with people forms the major component of behavioral engagement among students of healthcare disciplines. Being similar to the "teacher-student relationships" and "peer support for learning" subscales of the Student Engagement Instrument [5], the "participation/interaction engagement" subscale of the Student Course Engagement Questionnaire [6], and the "learning from peers" and "student-faculty interaction" subscales of the NSSE, the identified factor "teacher-student interaction" is well justified.
Last, the factor "experienced emotion" covers the essential emotional engagement experienced throughout the learning process, including curiosity, happiness and boredom. The items of this factor are those of the emotional engagement factor of the original model. Indeed, the concept of emotional engagement is frequently mentioned in the literature and recognized as one of the essential domains shaping student engagement measurement [6]. EFA was used to remove the items with low factor loadings (i.e., identified as noise in concept measurement) and increase the measurement quality.

Limitations and recommendations
Several limitations are worth discussing. Measurement of student engagement sometimes depends on the student learning environment [14]. Although the GSES was developed by modifying the latest engagement scale to ensure relevance and adequacy in contemporary higher education, it is still uncertain whether the GSES is appropriate for measuring student engagement in other cultures and educational settings. Indeed, the 3-factor model of the 45-item GSES, which has been recognized as a well-accepted model for engagement measurement, was challenged when applied to students with different healthcare backgrounds. Therefore, the psychometric properties of either the 45-item or the 26-item GSES should be re-examined in other learning environments. The second limitation concerns the comprehensiveness of the psychometric testing in this study. For the 26-item 4-factor GSES, satisfactory reliability was well demonstrated; nevertheless, concurrent validation and construct validation, namely CFA and hypothesis testing, have not yet been performed. Further examination in this area should be conducted by recruiting a new group of participants. Last, the literature recommends a sample size of 300 subjects to be regarded as acceptable for CFA [25,34]. However, a complex model (i.e., item number ≥ 30 [Hair et al., 2010]) should be computed with a larger sample size to increase the accuracy of the goodness-of-fit indices [25,34]. A further study may consider conducting CFA on the 45-item GSES with an increased sample size [25].

Conclusions
Student engagement is important in higher education as well as in the online mode of learning. This study developed the 26-item GSES by modifying the original 45-item DSES. EFA identified a new 4-factor model (i.e., self-regulated learning, cognitive strategy use, teacher-student interaction, and experienced emotion) explaining engagement among students of healthcare disciplines in higher education. The content and face validity of these 26 items were examined in the process of item generation. Furthermore, the results of internal consistency, stability and factorial validity provided satisfactory evidence to support the reliability and validity of the 26-item GSES. Despite the limitations stated above, the current study provides an initial attempt to develop a generic student engagement scale that fits the diverse learning modes in higher education.

Authors' contributions

SCL, SL, SCNS, JYSC, HCYL, EYMT and KCL conceived the study. SCL, SL and SCNS performed data analysis and drafted the manuscript. JYSC, HCYL and EYMT collected the data. KCL helped to revise the manuscript. All authors have read and approved the manuscript.