The extracted data included 1,052,807 log records, of which 790,956 remained after removal of non-learning events (e.g., clicks on profile pages). The number of students per course had a median of 92 and ranged from 84 to 99. The number of events per course offering (in the final data set after removal of non-learning events) had a median of 15,345 and ranged from 3,925 to 36,417. Per student per course offering, the median number of forum consumption events was 48.5, the median number of forum contributions was 50, the median number of sessions was 48, and the median duration of online time was 4.99 hours.
Table 1
Comparison (ANOVA) of the three clusters according to their mean activity indicators
| Indicator | State | Mean | SD | p | ε² |
| --- | --- | --- | --- | --- | --- |
| Frequency of course browsing | Actively engaged | 8.47 | 1.487 | < .001 | 0.686 |
| | Averagely engaged | 4.91 | 1.820 | | |
| | Disengaged | 1.95 | 1.211 | | |
| Frequency of lecture access | Actively engaged | 7.31 | 2.413 | < .001 | 0.304 |
| | Averagely engaged | 5.17 | 2.506 | | |
| | Disengaged | 2.94 | 2.057 | | |
| Frequency of forum reading | Actively engaged | 7.95 | 2.074 | < .001 | 0.456 |
| | Averagely engaged | 4.89 | 2.218 | | |
| | Disengaged | 2.67 | 1.901 | | |
| Frequency of forum contribution | Actively engaged | 7.25 | 2.421 | < .001 | 0.277 |
| | Averagely engaged | 5.13 | 2.509 | | |
| | Disengaged | 3.09 | 2.298 | | |
| Active days | Actively engaged | 8.51 | 1.432 | < .001 | 0.726 |
| | Averagely engaged | 4.95 | 1.738 | | |
| | Disengaged | 1.79 | 0.938 | | |
| Session count | Actively engaged | 8.71 | 1.095 | < .001 | 0.796 |
| | Averagely engaged | 4.87 | 1.553 | | |
| | Disengaged | 1.70 | 0.851 | | |
| Total online time | Actively engaged | 8.02 | 1.995 | < .001 | 0.500 |
| | Averagely engaged | 4.94 | 2.187 | | |
| | Disengaged | 2.45 | 1.663 | | |
| Regularity | Actively engaged | 8.15 | 1.832 | < .001 | 0.609 |
| | Averagely engaged | 5.10 | 2.034 | | |
| | Disengaged | 1.96 | 1.032 | | |
To answer RQ1, an LCA model was fitted using students’ learning activities. We tested models with two to ten clusters (similar to [9]). The three-cluster model had the lowest AIC (43,748.7) and BIC (44,891), clearly separable clusters, and high epsilon-squared effect sizes (ranging from 0.3 to 0.8). Table 1 shows the three identified clusters, which can be described as follows:
- Actively engaged cluster: Students in this cluster had the highest values of activity indicators (between the 7th and 9th decile): frequent access to course resources and frequent forum posting and reading. They had the highest frequency of active days, longer sessions, and the highest regularity.
- Averagely engaged cluster: Students in this cluster had moderate values of activity indicators (mostly around the 5th decile): average frequency of access to course resources, forum posting, forum reading, active days, and regularity.
- Disengaged cluster: Students in this cluster had the lowest levels of activity, which lay between the 1st and the 3rd decile.
Throughout this manuscript we will refer to each cluster as an engagement state as it describes the students’ state of engagement in each course.
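To make the effect size in Table 1 concrete, the following is a minimal sketch of an epsilon-squared computation for a one-way ANOVA. It assumes the common definition ε² = (SS_between − df_between · MS_within) / SS_total; the source does not state which formula or software was used, and the function name and the decile scores below are hypothetical, not the study data.

```python
from statistics import mean


def epsilon_squared(groups):
    """One-way ANOVA epsilon-squared (assumed definition):
    (SS_between - df_between * MS_within) / SS_total."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ss_total = ss_between + ss_within
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    ms_within = ss_within / df_within
    return (ss_between - df_between * ms_within) / ss_total


# Hypothetical decile scores for three clusters (illustration only).
active = [8, 9, 7, 9, 8]
average = [5, 4, 6, 5, 5]
disengaged = [2, 1, 3, 2, 2]
eps2 = epsilon_squared([active, average, disengaged])
```

Values close to 1 indicate that cluster membership accounts for most of the variance in an indicator, which is how the larger ε² values in Table 1 can be read.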
To answer RQ2, the engagement states were used to construct a sequence object in each of the two study groups (Early engagement and Early disengagement), which was mined using SPM. We then compared the two groups and the characteristics of each of their trajectories.
Early disengagement group: The sequence index plot (Fig. 1.A) shows each student’s trajectory as a horizontal sequence of colored stacked bars, where the colors reflect the engagement states. The hierarchical clustering (Fig. 1.A) shows two distinct groups of students: G1, a mostly disengaged group (n = 9) who eventually dropped out, and G2, a group with fluctuating engagement trajectories that can be further divided into two subgroups: G2a, with fluctuating engagement trajectories dominated by disengagement states, and G2b, a relatively stable subgroup dominated by active engagement states. Overall, within the Early disengagement group, the total number of students who had dropped out was two at the 3rd course, six at the 6th course, nine at the 9th course, and 13 by the 15th course. Notably, students who were able to catch up and engage again in the second course were more likely to maintain an engaged state (11/13), while students who were disengaged for two successive courses had more dropouts (11/28). The distribution plot (Fig. 1.B) shows the distribution of engagement states at each time point, highlighting that dropout occurred immediately after the first term. The mean time plot (Fig. 1.C) shows that these students spent an average of 7.7 courses as disengaged or dropout and 7.3 as averagely or actively engaged.
Early engagement group: The index plot (Fig. 2.A) shows that the students in this group had more stable trajectories that were dominated by average and active engagement states, with very infrequent disengagement states (an average of 13.6 courses per student were in an active or average engagement state, Fig. 2.C). Hierarchical clustering revealed two distinct subgroups: G1, a subgroup with mostly average engagement and frequent transitions to engagement states, and G2, a subgroup with mostly engaged states and infrequent transitions to other engagement states. The distribution plot (Fig. 2.B) confirms that students in this group were mostly highly engaged or averagely engaged. Only a single student dropped out, at course 12.
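The summaries behind the mean time and distribution plots can be sketched as follows. This is a minimal illustration, assuming each trajectory is stored as an equal-length list of state labels (one per course); the function names, the shortened state labels, and the toy trajectories are hypothetical and are not the study's code or data.

```python
from collections import Counter

STATES = ["Active", "Average", "Disengaged", "Dropout"]


def mean_time_in_state(sequences):
    """Average number of time points (courses) a student spends in each state."""
    totals = Counter()
    for seq in sequences:
        totals.update(seq)
    return {s: totals[s] / len(sequences) for s in STATES}


def state_distribution(sequences, t):
    """Share of students in each state at time point t (0-based)."""
    counts = Counter(seq[t] for seq in sequences)
    return {s: counts[s] / len(sequences) for s in STATES}


# Hypothetical toy trajectories over four courses (illustration only).
toy = [
    ["Disengaged", "Disengaged", "Dropout", "Dropout"],
    ["Disengaged", "Active", "Active", "Average"],
]
mean_time = mean_time_in_state(toy)
first_course = state_distribution(toy, 0)
```

The mean times sum to the trajectory length, which is why the 7.7 and 7.3 courses reported for the Early disengagement group add up to 15.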
The dynamics of engagement trajectories
Comparing both trajectories helps understand the dynamics of events (transitions, sequences of transitions, stability, and persistence). Firstly, we compare the transition probabilities between engagement states to investigate how each group changes across time. Secondly, we compare the frequent transition subsequences that are characteristic of each group. Thirdly, we compare the stability of trajectories in each group using transversal-entropy curves.
Students in the Early disengagement group were more likely to descend from the “actively engaged” state to the “disengaged” state (transition probability 28%) than students in the Early engagement group (transition probability 11%), which highlights the vulnerability of the former group to transitioning to disengagement states in the future (Table 2). Similarly, students in the Early disengagement group were more likely to stay in a “disengaged” state, with a probability of 59% compared to 39% in the Early engagement group. The top statistically significant discriminating subsequences in the Early disengagement group were characterized by persisting in a “disengaged” state, descending from an engagement state (averagely or actively engaged) to a “disengaged” state, or ascending to an engagement state.
Table 2
Transition probability matrix among engagement states for each engagement group.
| Early disengagement | Actively engaged | Averagely engaged | Disengaged | Dropout |
| --- | --- | --- | --- | --- |
| Actively engaged > | 0.60 | 0.09 | 0.28 | 0.03 |
| Averagely engaged > | 0.30 | 0.55 | 0.11 | 0.04 |
| Disengaged > | 0.35 | 0.04 | 0.59 | 0.02 |
| Dropout > | 0.00 | 0.00 | 0.00 | 1.00 |

| Early engagement | Actively engaged | Averagely engaged | Disengaged | Dropout |
| --- | --- | --- | --- | --- |
| Actively engaged > | 0.65 | 0.24 | 0.11 | 0.00 |
| Averagely engaged > | 0.30 | 0.69 | 0.02 | 0.00 |
| Disengaged > | 0.45 | 0.16 | 0.39 | 0.00 |
| Dropout > | 0.00 | 0.00 | 0.00 | 1.00 |
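A transition probability matrix like the one in Table 2 can be estimated by counting consecutive state pairs and normalizing each row. The sketch below assumes trajectories are lists of state labels; the function name and toy sequences are hypothetical and are not the study's implementation or data.

```python
def transition_matrix(sequences, states):
    """Row-normalized counts of transitions from one time point to the next."""
    counts = {a: {b: 0 for b in states} for a in states}
    for seq in sequences:
        # Pair each state with its successor in the same trajectory.
        for origin, target in zip(seq, seq[1:]):
            counts[origin][target] += 1
    matrix = {}
    for a in states:
        row_total = sum(counts[a].values())
        # A state with no outgoing transitions gets a row of zeros.
        matrix[a] = {
            b: (counts[a][b] / row_total if row_total else 0.0) for b in states
        }
    return matrix


states = ["Active", "Average", "Disengaged", "Dropout"]
# Hypothetical toy trajectories (illustration only).
toy = [
    ["Disengaged", "Disengaged", "Dropout", "Dropout"],
    ["Disengaged", "Active", "Active", "Average"],
]
probs = transition_matrix(toy, states)
```

Each row is read as "given the state in the current course, the probability of each state in the next course", so every non-empty row sums to 1 up to rounding.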
Figure 3 shows the Chi-squared test for discriminating subsequences. The (Disengaged) and (Disengaged)-(Disengaged > Actively engaged) subsequences were significantly more frequent in the Early disengagement group, whereas (Averagely engaged > Actively engaged) subsequences were less likely in that group. The transversal-entropy plot shows how stable the groups are (Fig. 4). The Early disengagement group had higher entropy values at each time point, highlighting its instability. In summary, students in the Early disengagement group were more likely to persist in a “disengaged” state and to descend from an engagement state to a disengagement state, and they showed an unstable trajectory. These findings add to the previous ones: not only is being in a disengaged state an alarming distress signal, but persisting in or transitioning to a disengaged state is similarly alarming.
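Transversal entropy can be illustrated as the Shannon entropy of the cross-sectional state distribution at each time point. The sketch below divides by the logarithm of the number of states so that values fall between 0 and 1; that normalization, the function name, and the toy sequences are assumptions made for illustration, not details taken from the study.

```python
import math
from collections import Counter


def transversal_entropy(sequences, n_states):
    """Normalized Shannon entropy of the state distribution at each time point."""
    n = len(sequences)
    entropies = []
    for t in range(len(sequences[0])):
        counts = Counter(seq[t] for seq in sequences)
        # Sum p * log(1/p) over the states observed at time t.
        h = sum((c / n) * math.log(n / c) for c in counts.values())
        entropies.append(h / math.log(n_states))
    return entropies


# Hypothetical toy trajectories (illustration only).
toy = [
    ["Disengaged", "Disengaged", "Dropout"],
    ["Disengaged", "Active", "Active"],
]
curve = transversal_entropy(toy, n_states=4)
```

An entropy of 0 means all students share the same state at that time point; higher values mean students are spread across more states, which is the sense in which a group is described as less stable.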
To investigate the likelihood of persisting in the program, we estimated the survival probability (probability of completing the program) using KM curves, comparing the Early engagement and Early disengagement groups (Fig. 5.A). The survival probability of the Early disengagement group was 0.86 (CI: 0.76 to 0.97) at the end of the 1st year and dropped to 0.81 (CI: 0.7 to 0.94) at the end of the 2nd year, while it was 1.00 in the Early engagement group. By the end of the program, the survival probability was 0.7 (CI: 0.57 to 0.85) in the Early disengagement group and 0.98 (CI: 0.95 to 1.0) in the Early engagement group. The Log-rank, Gehan, Tarone-Ware, and Peto-Peto tests were all statistically significant at the level of p < 0.001, emphasizing the difference between the groups. In summary, the Early disengagement group had a statistically significantly higher probability of dropping out of the program.
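The KM survival probability is a product-limit estimate: at each time a dropout occurs, the running survival is multiplied by the share of at-risk students who did not drop out. The sketch below shows the point estimate only (no confidence intervals or group tests); the function name and toy data are hypothetical and are not the study's data.

```python
def kaplan_meier(durations, events):
    """Product-limit survival estimate.

    durations: course index at which each student dropped out or was censored.
    events: True if dropout was observed, False if the student was censored.
    Returns a list of (time, survival probability) at each dropout time.
    """
    survival = 1.0
    curve = []
    dropout_times = sorted({d for d, e in zip(durations, events) if e})
    for t in dropout_times:
        # Students still enrolled just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        dropped = sum(1 for d, e in zip(durations, events) if d == t and e)
        survival *= 1 - dropped / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical toy cohort: two dropouts, three students completing 15 courses.
durations = [3, 6, 15, 15, 15]
events = [True, True, False, False, False]
km = kaplan_meier(durations, events)
```

Students who complete the program are treated as censored at the last course, so they stay in the at-risk set for every earlier dropout time.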
Lastly, the results of the WMW test revealed that the performance (measured as GPA) of students in the Early engagement group (85.15/100) was significantly higher than that of students in the Early disengagement group (79.73/100), with a medium effect size (rank-biserial correlation coefficient of -0.38). These findings indicate that early engagement not only predicts persistence in the program but is also a catalyst of higher performance.
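The rank-biserial correlation reported with the WMW test can be illustrated through the simple difference formula: the proportion of cross-group pairs in which the first group scores higher minus the proportion in which it scores lower. The sketch below assumes that definition; the function name and the grades are hypothetical. The sign depends on which group is passed first, so a negative value here does not by itself say which group performed better.

```python
def rank_biserial(x, y):
    """Share of (x, y) pairs with x > y minus share with x < y; ties count for neither."""
    greater = sum(1 for a in x for b in y if a > b)
    less = sum(1 for a in x for b in y if a < b)
    return (greater - less) / (len(x) * len(y))


# Hypothetical grades out of 100 (illustration only).
group_a = [70, 80, 90]
group_b = [75, 85, 95]
r = rank_biserial(group_a, group_b)
r_swapped = rank_biserial(group_b, group_a)
```

Swapping the argument order flips the sign but not the magnitude, which is the quantity used when an effect size is described as small, medium, or large.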