4.1. Test results
RQ1: In what types of reading comprehension questions do students find success when using an online ER program?
In this study, Rasch analysis was used to compare student ability with item difficulty. According to the person-item map in Figure 2, which plots the students’ reading comprehension on the left and item difficulty on the right, the students performed poorly on the detail questions (Items 7 and 37) and the inference question (Item 4). The vocabulary question (Item 9) and the detail question (Item 35) were the easiest, while the students found the remaining questions neither difficult nor easy.
4.2. Reliability and fitness of the reading comprehension test
RQ2: What is the reliability and fitness of the reading comprehension test?
Based on the findings, the average measures (logits) for the items and individuals in experimental group 1 were 74.10 and 30.90, respectively, while in experimental group 2, they were 73.10 and 28.10, respectively. In the control group, the average measures for the items and individuals were 56.70 and 27.80, respectively. All of these values had positive standard deviation values, indicating that the data collected from the sample could be used to examine the influence of online ER on students’ reading comprehension achievement [32].
According to Table 1, the item and person reliabilities were deemed satisfactory. In addition, the internal consistency reliability scores of both the experimental and control groups were over 0.70 when using Cronbach’s alpha (KR-20). Overall, the test items in this study exhibited good reliability [33].
Table 1. Reliability and fit statistics of the test for both experimental and control groups.
| Construct | Experimental Group 1: Items | Experimental Group 1: Persons | Experimental Group 2: Items | Experimental Group 2: Persons | Control Group: Items | Control Group: Persons |
|---|---|---|---|---|---|---|
| Number | 50 | 120 | 50 | 130 | 50 | 100 |
| Mean | 74.1 | 30.9 | 73.1 | 28.1 | 56.7 | 27.8 |
| Standard Deviation | 22.5 | 8.4 | 27.2 | 7.2 | 35.3 | 5.1 |
| Reliability (Cronbach’s alpha) | 0.95 | 0.86 | 0.96 | 0.81 | 0.94 | 0.74 |
| Separation | 4.30 | 2.43 | 4.98 | 2.06 | 3.89 | 1.71 |
| MNSQ (infit) | 0.99 | 0.98 | 0.99 | 0.99 | 0.99 | 1.00 |
| MNSQ (outfit) | 1.03 | 1.03 | 1.05 | 1.05 | 1.00 | 0.98 |
| ZSTD (infit) | -0.1 | 0.0 | -0.2 | -0.1 | -0.1 | -0.4 |
| ZSTD (outfit) | 0.2 | 0.1 | 0.2 | 0.0 | 0.0 | -0.1 |
| Chi-squared (X2), per group | 6174.62 | | 6980.01 | | 3506.78 | |
| df, per group | 5831 | | 6321 | | 4242 | |
This table also indicates that the mean square (MNSQ) infit/outfit values and their corresponding standardized Z-score (ZSTD) values were within the permissible ranges of 0.6 to 1.5 and 0.6 to 2.0, respectively [33]. Meanwhile, the ratio of each chi-square value to its degrees of freedom was less than 3 (X2/df < 3). Overall, the reading comprehension test was found to be suitable for evaluating the reading comprehension levels of the students.
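The chi-square criterion can be checked directly from the chi-square and df values reported in Table 1. The following is a minimal sketch; the function name `chi2_df_ratio` and the dictionary layout are illustrative assumptions and are not part of the original analysis.

```python
# Chi-square and degrees of freedom as reported in Table 1.
# The dictionary layout and the function name are illustrative assumptions.
TABLE1_FIT = {
    "Experimental Group 1": (6174.62, 5831),
    "Experimental Group 2": (6980.01, 6321),
    "Control Group": (3506.78, 4242),
}


def chi2_df_ratio(chi2: float, df: int) -> float:
    """Return the ratio of a chi-square statistic to its degrees of freedom."""
    if df <= 0:
        raise ValueError("df must be positive")
    return chi2 / df


ratios = {
    group: chi2_df_ratio(chi2, df) for group, (chi2, df) in TABLE1_FIT.items()
}

for group, ratio in ratios.items():
    # The cut-off of 3 is the threshold stated in the text.
    print(f"{group}: X2/df = {ratio:.3f}, below 3: {ratio < 3}")
```

With the reported values, all three ratios fall close to 1, well under the cut-off of 3.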
4.3. Analysis of covariance (ANCOVA)
RQ3: What is the effect of online ER on students’ reading comprehension?
In this study, the ANCOVA used student achievement (the post-test scores) as the dependent variable, the pre-test scores as the covariate, and the online ER treatment (groups) as the independent variable. The homogeneity of slopes assumption was also assessed. There was no statistically significant interaction between the online ER treatment and the pre-test scores (p = 0.160, p > 0.05), indicating that this assumption was met. Moreover, after accounting for initial differences in the pre-test scores, a statistically significant difference in reading comprehension performance was observed among the experimental and control groups (F(2, 349) = 6.944, p = 0.001) (see Table 2).
Table 3 presents the means and standard deviations of reading comprehension for the experimental and control groups. According to the findings, the experimental groups, which received the online ER treatment, outperformed the control group in reading comprehension. The researchers also computed an effect size, specifically partial eta squared, to evaluate the difference among the groups. The results revealed an effect size of 0.038 for the online ER treatment (see Table 2).
Table 2. Analysis of covariance for reading achievement (post-test) as a function of groups, using pre-test scores as a covariate.

| Source | df | Mean Square | F-Value | p-Value | η2 (Eta Square) |
|---|---|---|---|---|---|
| Pre-test | 1 | 100.705 | 1.986 | 0.160 | 0.006 |
| Groups | 2 | 352.069 | 6.944 | 0.001*** | 0.038 |
| Error | 349 | 50.704 | | | |
Note: *** p < 0.001.
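The F-values and effect sizes in Table 2 follow from the reported mean squares and degrees of freedom. The sketch below recomputes them using the standard definitions F = MS_effect / MS_error and partial eta squared = SS_effect / (SS_effect + SS_error), where SS = MS × df. The function names are illustrative assumptions, and the inputs are the rounded values from Table 2.

```python
def f_ratio(ms_effect: float, ms_error: float) -> float:
    """F statistic: effect mean square divided by error mean square."""
    return ms_effect / ms_error


def partial_eta_squared(
    ms_effect: float, df_effect: int, ms_error: float, df_error: int
) -> float:
    """Partial eta squared: SS_effect / (SS_effect + SS_error)."""
    ss_effect = ms_effect * df_effect  # sum of squares for the effect
    ss_error = ms_error * df_error     # sum of squares for the error term
    return ss_effect / (ss_effect + ss_error)


# Values taken from Table 2 (rounded as published).
MS_ERROR, DF_ERROR = 50.704, 349
SOURCES = {"Pre-test": (100.705, 1), "Groups": (352.069, 2)}

results = {}
for source, (ms, df) in SOURCES.items():
    results[source] = (
        f_ratio(ms, MS_ERROR),
        partial_eta_squared(ms, df, MS_ERROR, DF_ERROR),
    )
    f_value, eta_sq = results[source]
    print(f"{source}: F = {f_value:.3f}, partial eta squared = {eta_sq:.3f}")
```

Rounded to three decimals, the recomputed values agree with the F-values and η2 entries in Table 2.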
Table 3. Adjusted and unadjusted group means and variability for reading achievement (post-test), using pre-test values as a covariate.

| Group | Number | Unadjusted Mean | Unadjusted Standard Deviation | Adjusted Mean | Adjusted Standard Error |
|---|---|---|---|---|---|
| Experimental 1 | 120 | 31.05 | 8.45 | 30.96 | 0.64 |
| Experimental 2 | 130 | 28.12 | 7.18 | 28.07 | 0.62 |
| Control | 100 | 27.69 | 4.99 | 27.85 | 0.71 |
4.4. Structural equation modeling
RQ4: What are the structural associations between affective aspects, English reading behavior, and English reading motivation?
Descriptive statistics
Table 4 presents the descriptive statistics for the latent variables in this study. All of the means were greater than 2.00 on the four-point Likert scale, indicating that the students reported active emotional and behavioral engagement. Additionally, the standard deviations ranged from 0.534 to 0.815, suggesting a narrow spread around the means.
Table 4. Descriptive statistics of the study constructs

| Variables | Mean | SD |
|---|---|---|
| Reading Self-Efficacy | 2.269 | 0.636 |
| Self-Concept | 2.871 | 0.599 |
| Intrinsic Motivation | 2.359 | 0.671 |
| Extrinsic Motivation | 2.840 | 0.634 |
| English Reading Behavior | 2.019 | 0.534 |
| English Reading Comprehension | 3.019 | 0.815 |
Evaluation of the measurement model
Following Hair et al. [34], a reflective measurement model was evaluated in terms of the factor loadings, the internal consistency of the items, convergent validity, and discriminant validity. Hair et al. [34] recommended that factor loadings should be > 0.70. According to Table 5, the factor loadings of the items ranged from 0.609 to 0.903. Internal consistency was examined using composite reliability instead of Cronbach’s alpha. Hair et al. [34] also stated that a composite reliability between 0.60 and 0.70 is considered acceptable, while that between 0.70 and 0.90 is satisfactory. In this study, the composite reliability ranged from 0.734 to 0.965. Convergent validity refers to the extent to which the measures of a construct converge, and it is supported when the average variance extracted (AVE) is equal to or higher than 0.5. As shown in Table 5, the AVE of all the latent variables was equal to or higher than 0.5, which met these guidelines.
Table 5. Convergent validity of the construct

| Latent Variables | Item | Factor Loading | Average Variance Extracted | Composite Reliability |
|---|---|---|---|---|
| Reading Self-Efficacy | | | 0.743 | 0.929 |
| | SE1 | 0.837 | | |
| | SE2 | 0.875 | | |
| | SE3 | 0.852 | | |
| | SE4 | 0.876 | | |
| | SE5 | 0.814 | | |
| Self-Concept | | | 0.743 | 0.921 |
| | SC1 | 0.831 | | |
| | SC2 | 0.854 | | |
| | SC3 | 0.890 | | |
| | SC4 | 0.873 | | |
| Intrinsic Motivation | | | 0.736 | 0.965 |
| | IM1 | 0.895 | | |
| | IM2 | 0.879 | | |
| | IM3 | 0.903 | | |
| | IM4 | 0.824 | | |
| | IM5 | 0.799 | | |
| | IM6 | 0.827 | | |
| | IM7 | 0.883 | | |
| | IM8 | 0.870 | | |
| | IM9 | 0.827 | | |
| | IM10 | 0.865 | | |
| Extrinsic Motivation | | | 0.549 | 0.829 |
| | EM1 | 0.745 | | |
| | EM2 | 0.787 | | |
| | EM3 | 0.784 | | |
| | EM4 | 0.637 | | |
| English Reading Behavior | | | 0.500 | 0.734 |
| | ERB1 | 0.727 | | |
| | ERB2 | 0.609 | | |
| | ERB3 | 0.737 | | |
Note. EM = extrinsic motivation, ERB = English reading behavior, ERC = English reading comprehension, IM = intrinsic motivation, RSE = reading self-efficacy, SC = self-concept.
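For readers who wish to see how AVE and composite reliability relate to the factor loadings, the sketch below applies the standard formulas for standardized loadings: AVE is the mean of the squared loadings, and composite reliability is (Σλ)² / ((Σλ)² + Σ(1 − λ²)). The function names are illustrative assumptions, and the example uses the Self-Concept loadings from Table 5. Because the published loadings are rounded, values recomputed this way need not match the reported figures exactly for every construct.

```python
from typing import Sequence


def average_variance_extracted(loadings: Sequence[float]) -> float:
    """AVE: mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)


def composite_reliability(loadings: Sequence[float]) -> float:
    """Composite reliability from standardized loadings.

    Uses (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances),
    where each error variance is taken as 1 - loading^2.
    """
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1 - l ** 2 for l in loadings)
    return squared_sum / (squared_sum + error_variance)


# Self-Concept loadings (SC1 to SC4) as reported in Table 5.
sc_loadings = [0.831, 0.854, 0.890, 0.873]

sc_ave = average_variance_extracted(sc_loadings)
sc_cr = composite_reliability(sc_loadings)

print(f"Self-Concept AVE = {sc_ave:.3f}")
print(f"Self-Concept composite reliability = {sc_cr:.3f}")
```

For Self-Concept, the recomputed values are within rounding of the 0.743 and 0.921 reported in Table 5.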
According to Fornell and Larcker [35], assessing discriminant validity involves comparing the correlations between the factors with the square root of the AVE. If this square root exceeds the correlations among the factors, discriminant validity is supported; otherwise, potential multicollinearity is indicated [34]. According to Table 6, the square root of the AVE (shown in parentheses) was greater than the correlations between the factors. Hence, the discriminant validity of the factors was deemed adequate.
Table 6. Discriminant validity
| Variables | EM | ERB | IM | RSE | SC |
|---|---|---|---|---|---|
| EM | (0.741) | | | | |
| ERB | 0.192** | (0.794) | | | |
| IM | 0.223** | 0.780** | (0.958) | | |
| RSE | 0.289** | 0.738** | 0.913** | (0.851) | |
| SC | 0.436** | 0.130* | 0.214** | 0.272** | (0.862) |
Note. * p < 0.05, ** p < 0.001, EM = extrinsic motivation, ERB = English reading behavior, ERC = English reading comprehension, IM = intrinsic motivation, RSE = reading self-efficacy, SC = self-concept.
Hypothesis testing
In general, hypothesis testing encompasses the coefficient of determination (R2), the blindfolding-based cross-validated redundancy (Q2), and the statistical significance and relevance of the path coefficients. We analyzed our data based on the hypothesized model shown in Figure 1. As shown in Figure 3, EM, IM, and ERB collectively explained 1.4% of the variance in ERC (R2 = 0.014). EM and IM together accounted for 61.3% of the variance in ERB (R2 = 0.613).
EM and IM were also examined in relation to RSE and SC. The results indicated that RSE and SC accounted for 22.8% (R2 = 0.228) of the variance in EM and 83.5% (R2 = 0.835) of the variance in IM. According to Hair et al. [34], in terms of blindfolding-based cross-validated redundancy, the value of Q2 must exceed zero in order to demonstrate predictive accuracy for a particular endogenous construct. In this regard, Q2 values above 0, 0.25, and 0.5 are indicative of low, medium, and high prediction accuracy, respectively. In the present study, the prediction accuracies of RSE and SC for EM and IM were found to be medium (Q2 = 0.117) and high (Q2 = 0.609), respectively. For ERB, the prediction accuracy of EM and IM was moderate (Q2 = 0.289). Finally, the prediction accuracy of EM, IM, and ERB for ERC was low (Q2 = 0.003).
Furthermore, this study assessed the statistical significance of the path coefficients. Figure 3 depicts the model, in which the standardized path coefficients illustrate the relationships between the constructs. RSE had a significant positive effect on both EM (β = 0.197, p = 0.001) and IM (β = 0.923, p < 0.001), while SC had a significant positive effect on EM (β = 0.385, p < 0.001) but not on IM (β = -0.035, p = 0.190). The findings also indicate that there was no significant relationship between EM and either ERB (β = 0.029, p = 0.523) or ERC (β = 0.061, p = 0.354). IM had a significant positive effect on ERB (β = 0.776, p < 0.001), but not on ERC (β = -0.153, p = 0.108). Meanwhile, ERB was not significantly associated with ERC (β = 0.054, p = 0.585). Regarding the indirect effects, only RSE had a significant indirect effect on ERB (β = 0.716, p < 0.001) through the mediating factor of IM. The remaining indirect effects were not significant (see Table 7).
Table 7. The results from the hypothesis test
| Path | Estimate | t-value | p-value | Result |
|---|---|---|---|---|
| EM -> ERB | 0.029 | 0.639 | 0.523 | Not supported |
| EM -> ERC | 0.061 | 0.926 | 0.354 | Not supported |
| ERB -> ERC | 0.054 | 0.546 | 0.585 | Not supported |
| IM -> ERB | 0.776 | 34.010 | 0.000 | Supported |
| IM -> ERC | -0.153 | 1.610 | 0.108 | Not supported |
| RSE -> EM | 0.197 | 3.205 | 0.001 | Supported |
| RSE -> IM | 0.923 | 79.492 | 0.000 | Supported |
| SC -> EM | 0.385 | 6.328 | 0.000 | Supported |
| SC -> IM | -0.035 | 1.310 | 0.190 | Not supported |
| RSE -> IM -> ERC | -0.141 | 1.605 | 0.109 | Not supported |
| RSE -> IM -> ERB | 0.716 | 30.189 | 0.000 | Supported |
| SC -> IM -> ERC | 0.005 | 0.945 | 0.345 | Not supported |
| IM -> ERB -> ERC | 0.042 | 0.543 | 0.587 | Not supported |
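In a path model, the point estimate of a specific indirect effect is the product of the path coefficients along the mediating chain. The sketch below recomputes the indirect estimates in Table 7 from the direct estimates in the same table. The `PATHS` dictionary and the function name `indirect_effect` are illustrative assumptions. The t-values and p-values of the indirect effects cannot be obtained this way, as they depend on the resampling procedure used in the original analysis.

```python
# Direct path estimates taken from Table 7 (rounded as published).
# The dictionary layout and function name are illustrative assumptions.
PATHS = {
    ("RSE", "IM"): 0.923,
    ("SC", "IM"): -0.035,
    ("IM", "ERB"): 0.776,
    ("IM", "ERC"): -0.153,
    ("ERB", "ERC"): 0.054,
}


def indirect_effect(*nodes: str) -> float:
    """Multiply the direct path coefficients along a chain of constructs."""
    if len(nodes) < 3:
        raise ValueError("an indirect effect needs at least one mediator")
    effect = 1.0
    for source, target in zip(nodes, nodes[1:]):
        effect *= PATHS[(source, target)]
    return effect


chains = [
    ("RSE", "IM", "ERC"),
    ("RSE", "IM", "ERB"),
    ("SC", "IM", "ERC"),
    ("IM", "ERB", "ERC"),
]

for chain in chains:
    print(f"{' -> '.join(chain)}: {indirect_effect(*chain):.3f}")
```

The products reproduce the four indirect estimates in Table 7 to three decimals.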