This study developed an oral presentation scale (OPS) to objectively evaluate nursing students’ oral presentations. We conducted the study in two phases: development of the initial items of the OPS (Phase I); and psychometric testing of the scale, including internal consistency reliability and three-week test-retest reliability (Phase II). Approval was obtained from the Chang Gung Medical Foundation institutional review board (ID: 201702148B0) prior to initiation of the study. Informed consent was obtained from all subjects before data collection. All study methods were carried out in accordance with relevant guidelines and regulations.
Phase I: Item development and content validity index
A review of the literature regarding oral performance, self-efficacy, and characteristics of oral communication was used to determine categories considered important for the objective evaluation of oral presentations [2, 6, 7, 10-13]. Three categories were determined to be important: preparation, presentation, and post-presentation.
We determined key elements of these presentation categories through individual face-to-face semi-structured interviews with a sample of teachers (n = 8) and nursing students (n = 11). Nursing students give oral presentations to meet curriculum requirements; therefore, the teachers were university tutors experienced in coaching nursing students in preparing for and giving an oral presentation. Teachers were recruited if they had at least ten years’ experience coaching university students; students were included if they had given at least one oral presentation. All participants provided signed informed consent indicating willingness to be audiotaped during the interview.
The teachers were asked the following questions: 1. What has been your reaction to oral reports or presentations given by your students? 2. What problems commonly occur when students are giving oral reports or presentations? 3. In your opinion, what do you consider a good presentation, and could you describe its characteristics? 4. How do you evaluate the performance of students’ oral reports or presentations? Are there any difficulties or problems in evaluating the oral reports?
Students were asked two questions: 1. Would you please tell me about your experiences of giving an oral report or presentation? 2. In your opinion, what is a good presentation and what are some of the important characteristics? Interviews lasted approximately 20-30 minutes.
Analysis of the interview data provided characteristics important to each of the three categories. Characteristics of good preparation included: the presenter is well prepared before the presentation; the presenter prepares materials suitable for the target audience; the presenter practices giving the presentation in advance; and the presenter discusses the content of the presentation with classmates and teachers. The presentation category included the following characteristics: obtain the attention of the audience; provide materials that are reliable and valuable; express confidence and enthusiasm; interact with the audience; and respond to questions from the audience. The third category, post-presentation, involved feedback and evaluation from teachers and peers in order to improve performance: discuss the content of the presentation with teachers; and gain feedback from the audience.
Content validity of the 28 items of the OPS was established with a panel of eight expert instructors in oral presentation. All instructors had over ten years’ experience in coaching students in giving an oral presentation that would be evaluated for a grade. For the item-level content validity index (I-CVI), the experts were provided with a description of the research purpose and a list of the proposed items, and were asked to rate each item on a 4-point Likert scale (1 = not representative, 2 = item needs major revision, 3 = representative but needs minor revision, 4 = representative). Based on the suggestions of the experts, six items of the OPS were reworded for clarity; for example, item 12 was revised from “The presentation is riveting” to “The presenter’s performance is brilliant; it resonates with the audience and arouses their interests”. Two items were removed by merging duplicates: “demonstrates confidence” and “presents enthusiasm” were combined into item 22, “demonstrates confidence and enthusiasm properly”; and “the presentation allows for proper timing and sequencing” and “the length of time of the presentation is well controlled” were combined into item 9, “The content of presentation follows the rules, allowing for the proper timing and sequence”. Thus, a total of 26 items were included in the OPS. The I-CVI values ranged from .88 to 1.00 and the scale-level CVI/universal agreement was .75, indicating that the OPS was an acceptable instrument for measuring an oral presentation [14].
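The two indices above follow the standard computation: an item's I-CVI is the proportion of experts rating it 3 or 4, and the S-CVI/UA is the proportion of items on which all experts agree. A minimal sketch, using hypothetical expert ratings (not the study's actual data):

```python
# Illustrative computation of content validity indices.
# The expert ratings below are hypothetical; the study used 8 experts.

def i_cvi(ratings):
    """Item-level CVI: proportion of experts rating the item 3 or 4."""
    return sum(1 for r in ratings if r >= 3) / len(ratings)

def s_cvi_ua(rating_matrix):
    """Scale-level CVI / universal agreement: proportion of items on
    which every expert rated 3 or 4 (i.e., the item's I-CVI equals 1)."""
    item_cvis = [i_cvi(item) for item in rating_matrix]
    return sum(1 for v in item_cvis if v == 1.0) / len(item_cvis)

# Example: 4 items, each rated by 8 experts (hypothetical data)
ratings = [
    [4, 4, 4, 4, 4, 4, 4, 4],  # all ratings >= 3 -> I-CVI = 1.0
    [4, 4, 3, 4, 4, 4, 4, 2],  # one rating below 3 -> I-CVI = .875
    [3, 4, 4, 4, 3, 4, 4, 4],  # all ratings >= 3 -> I-CVI = 1.0
    [4, 3, 4, 4, 4, 4, 4, 4],  # all ratings >= 3 -> I-CVI = 1.0
]
print([round(i_cvi(r), 3) for r in ratings])  # [1.0, 0.875, 1.0, 1.0]
print(s_cvi_ua(ratings))                      # 0.75
```

With these hypothetical ratings, three of four items achieve universal agreement, giving an S-CVI/UA of .75.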
Phase II: Psychometric testing of the OPS
Reliability and validity of the developed scale were determined with exploratory factor analysis (EFA), confirmatory factor analysis (CFA), internal consistency reliability, 3-week test-retest reliability, and criterion-related validity. The items in the scale for EFA and CFA were presented in random order and were not nested according to constructs.
Participants
A sample of nursing students was recruited purposively from a university in Taiwan for the EFA and CFA. Students were included if they were: (a) full-time students; (b) had declared nursing as their major; and (c) were in their sophomore, junior, or senior year. First-year (freshman) students were excluded. A bulletin about the survey study was posted outside of classrooms; a total of 707 students attended these classes. The bulletin included a description of the inclusion criteria and instructions to appear at the classroom on a given day and time if students were interested in participating in the study. Students who appeared at the classroom on the scheduled day (N = 650) were given a packet containing a demographic questionnaire (age, gender, year in school), a consent form, and the OPS instrument; the documents were labeled with an identification number in order to anonymize the data. These 650 surveys were divided into two groups, based on the demographic data: one for EFA (the calibration sample, n = 325) and one for CFA (the validation sample, n = 325), using the SPSS random case selection procedure (Version 23.0; SPSS Inc., Chicago, IL, USA). The selection procedure was repeated until homogeneity of baseline characteristics was established between the two groups (p > .05). The mean age of the participants was 20.5 years (SD = 0.98) and 87.1% were female (n = 566). Most participants were third-year students (40.6%, n = 274), followed by fourth-year (37.9%, n = 246) and second-year (21.5%, n = 93) students.
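The split-and-check procedure above can be sketched in plain Python. This is only an illustration of the idea, not the SPSS procedure the study actually used; the counts in the example are hypothetical, and the homogeneity check shown is a simple two-proportion z-test (normal approximation) on one categorical baseline characteristic:

```python
import random
from math import erf, sqrt

def split_sample(ids, seed=None):
    """Randomly split participant IDs into equal calibration/validation halves."""
    rng = random.Random(seed)
    shuffled = ids[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def two_proportion_p(x1, n1, x2, n2):
    """Two-sided p-value of a two-proportion z-test (normal approximation),
    e.g. for comparing the share of female students across the two halves."""
    p_pool = (x1 + x2) / (n1 + n2)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / se
    # Phi(z) via the error function; two-sided p = 2 * (1 - Phi(|z|))
    phi = lambda v: 0.5 * (1 + erf(v / sqrt(2)))
    return 2 * (1 - phi(abs(z)))

calibration, validation = split_sample(list(range(650)), seed=42)
# A balanced split should give p > .05 on each baseline characteristic,
# e.g. 283 vs 283 females in two hypothetical groups of 325:
print(round(two_proportion_p(283, 325, 283, 325), 3))  # 1.0
```

In practice the split would be redrawn, as the study describes, until every baseline comparison (age, gender, year in school) exceeds the p > .05 threshold.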
Exploratory factor analysis
Data from the 325 students designated for EFA were used to determine the construct validity of the OPS. The Kaiser-Meyer-Olkin measure of sampling adequacy and Bartlett’s test of sphericity demonstrated that factor analysis was appropriate (Nunnally & Bernstein, 1994). Principal component analysis was performed on the 26 items to extract the major contributing factors; varimax rotation was used to determine relationships between the items and contributing factors. Factors with an eigenvalue > 1 were further inspected. A factor loading greater than .50 was regarded as significantly relevant (Hair et al., 2006). Items were deleted one at a time, and the EFA model was respecified after each deletion, reducing the number of items in accordance with the a priori criteria. In the EFA phase, the internal consistency of each construct was examined using Cronbach’s alpha, with a value of .70 or higher considered acceptable (DeVellis, 2003).
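The Cronbach's alpha criterion applied to each construct follows the standard formula, alpha = k/(k-1) × (1 − Σ item variances / variance of total scores). A minimal sketch with hypothetical item data (three items, five respondents, not the study's data):

```python
# Cronbach's alpha from raw item scores (standard formula).
from statistics import pvariance

def cronbach_alpha(item_scores):
    """Cronbach's alpha from a list of per-item score lists
    (item_scores[i][j] = respondent j's score on item i)."""
    k = len(item_scores)
    item_vars = sum(pvariance(item) for item in item_scores)
    totals = [sum(resp) for resp in zip(*item_scores)]
    total_var = pvariance(totals)
    return k / (k - 1) * (1 - item_vars / total_var)

# Hypothetical 3-item, 5-respondent construct:
items = [
    [4, 3, 5, 2, 4],
    [4, 2, 5, 3, 4],
    [5, 3, 4, 2, 5],
]
print(round(cronbach_alpha(items), 2))  # 0.89 -- above the .70 threshold
```

Population variances (`pvariance`) are used for both item and total scores; the n versus n−1 divisor cancels in the ratio, so either convention gives the same alpha.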
Confirmatory factor analysis
Data from the 325 students designated for CFA were used to validate the factor structure of the OPS. In this phase, items with a factor loading less than .50 were deleted (Hair et al., 2006). Goodness of model fit was assessed using the following: absolute fit indices, including the goodness of fit index (GFI), adjusted goodness of fit index (AGFI), standardized root mean squared residual (SRMR), and root mean square error of approximation (RMSEA); relative fit indices, including the normed and non-normed fit indices (NFI and NNFI, respectively) and the comparative fit index (CFI); and the parsimony NFI, parsimony CFI, and likelihood ratio (χ²/df; Bentler, 1992).
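Two of these indices reduce to simple formulas once the model chi-square is known: the normed chi-square is χ²/df, and the RMSEA point estimate is sqrt(max(χ² − df, 0) / (df·(N − 1))). A sketch with hypothetical fit values (the study does not report these numbers here):

```python
# Simple model-fit indices computed from a model chi-square.
from math import sqrt

def normed_chi_square(chi2, df):
    """Likelihood-ratio chi-square divided by its degrees of freedom."""
    return chi2 / df

def rmsea(chi2, df, n):
    """RMSEA point estimate: sqrt(max(chi2 - df, 0) / (df * (N - 1)))."""
    return sqrt(max(chi2 - df, 0) / (df * (n - 1)))

# Hypothetical fit values for a sample of N = 325:
chi2, df, n = 360.0, 180, 325
print(round(normed_chi_square(chi2, df), 2))  # 2.0
print(round(rmsea(chi2, df, n), 3))           # 0.056
```

The `max(..., 0)` clamp reflects the convention that RMSEA is set to zero when the chi-square falls below its degrees of freedom.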
In addition to the psychometric testing, a research team, which included a statistician, determined the appropriateness of either deleting or retaining each item. Convergent validity (the internal quality of the items and factor structures) was further verified using standardized factor loadings, with values of .50 or higher considered acceptable, and average variance extracted (AVE), with values of .50 or higher considered acceptable (Hair et al., 2006). Composite reliability (CR) was assessed using the construct reliability estimate from the CFA, with values of .70 or higher considered acceptable (Fornell & Larcker, 1981). The AVE and the correlation matrix among the latent constructs were used to establish discriminant validity of the instrument: the square root of the AVE of each construct was required to be larger than the correlation coefficients between that construct and the other constructs (Fornell & Larcker, 1981).
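The AVE and CR criteria follow the Fornell–Larcker formulas: for standardized loadings λ, AVE = Σλ²/k and CR = (Σλ)² / ((Σλ)² + Σ(1 − λ²)). A minimal sketch with hypothetical loadings for one construct (not the study's estimates):

```python
# Fornell-Larcker convergent/discriminant validity quantities
# from standardized factor loadings.
from math import sqrt

def ave(loadings):
    """Average variance extracted: mean of the squared loadings."""
    return sum(l * l for l in loadings) / len(loadings)

def composite_reliability(loadings):
    """Construct (composite) reliability: (sum(l))^2 / ((sum(l))^2 + sum(1 - l^2))."""
    s = sum(loadings)
    err = sum(1 - l * l for l in loadings)
    return s * s / (s * s + err)

# Hypothetical standardized loadings for a four-item construct:
lam = [0.72, 0.68, 0.75, 0.70]
print(round(ave(lam), 2))                    # 0.51 -> meets the .50 AVE threshold
print(round(composite_reliability(lam), 2))  # 0.81 -> meets the .70 CR threshold
# Discriminant validity: sqrt(AVE) must exceed the construct's
# correlations with every other construct.
print(round(sqrt(ave(lam)), 2))              # 0.71
```

With these loadings, discriminant validity would hold as long as no inter-construct correlation exceeds .71.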
Criterion validity of the OPS
Criterion validity was determined by examining the relationship of the developed OPS with two scales for assessing performance of an oral presentation: the Personal Report of Communication Apprehension (PRCA) scale, developed by McCroskey (1977), and the Self-Perceived Communication Competence (SPCC) scale [15].
The 24-item PRCA scale. The PRCA scale is a self-report instrument for measuring communication apprehension, an individual’s level of fear or anxiety associated with either real or anticipated communication with a person or persons [16]. The 24 scale items consist of statements concerning feelings about communicating with others. Four subscales cover different situations: group discussion, interpersonal communication, meetings, and public speaking. Each item is scored on a 5-point Likert scale from 1 (strongly disagree) to 5 (strongly agree); scores range from 24 to 120, with higher scores indicating greater communication anxiety. The PRCA has been demonstrated to be a reliable and valid scale across a wide range of related studies [5, 17-20]. The Cronbach’s alpha for the scale is .90 [21]. We received permission from the owner of the copyright to translate the scale into Chinese. Back-translation was used to ensure the semantic validity of the translated scale. The Cronbach’s alpha value in the present study was .93.
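The 24–120 score range follows directly from summing 24 five-point responses. A minimal scoring sketch; note that the published PRCA-24 reverse-keys some items, and since those item numbers are not listed here, the `reverse_items` parameter is left as an assumption for the scorer to fill in:

```python
# Hedged sketch of totaling a 24-item, 5-point Likert instrument.
# Which items are reverse-keyed in the actual PRCA-24 is NOT encoded here.

def prca_total(responses, reverse_items=()):
    """Sum 24 five-point responses into a 24-120 total; items listed in
    reverse_items are reverse-scored as 6 - response before summing."""
    if len(responses) != 24:
        raise ValueError("PRCA expects 24 item responses")
    score = 0
    for i, r in enumerate(responses, start=1):
        score += (6 - r) if i in reverse_items else r
    return score

# The extremes reproduce the published range of 24 to 120:
print(prca_total([1] * 24), prca_total([5] * 24))  # 24 120
```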
The 12-item SPCC scale. The SPCC scale evaluates a person’s self-perceived competence in a variety of communication contexts and with a variety of types of receivers. Each item is a situation that requires communication, such as “Present a talk to a group of strangers” or “Talk with a friend”. Participants respond to each situation by rating their level of competence from 0 (completely incompetent) to 100 (completely competent). The Cronbach’s alpha for reliability of the scale is .85. The SPCC has been used in similar studies [17, 22]. We received permission from the owner of the copyright to translate the scale into Chinese. Back-translation was used to ensure the semantic validity of the translated scale. The Cronbach’s alpha value in the present study was .941.
Stability
To determine the stability of the OPS, test-retest reliability was conducted with 89 of the participants enrolled in this study. The interval between the first and the second test was 3 weeks.
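Test-retest stability of this kind is commonly summarized by correlating the two administrations' total scores; a Pearson correlation is one such coefficient (the study does not specify which statistic it used). A minimal sketch with hypothetical scores for five of the 89 retested participants:

```python
# Pearson correlation between first-test and retest totals (hypothetical data).
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two paired score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical OPS totals at the first test and the 3-week retest:
test1 = [88, 92, 75, 81, 95]
test2 = [86, 94, 78, 80, 96]
print(round(pearson_r(test1, test2), 2))  # 0.97
```

A high coefficient across the 3-week interval would indicate that OPS scores are stable over time.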
Statistical analysis
In addition to the previous descriptions of statistical analysis, all data were analyzed using SPSS for Windows 23 (SPSS Inc., Chicago, IL, USA).