A total of 593 candidates registered for the exam at the four Flemish universities. Of these, 472 (79.6%) used the supervisor app to take the exam off campus and 121 (20.4%) took it on campus (Table 2). Most candidates registered at the KU Leuven (227) and at the University of Ghent (203) chose to take the exam off campus. The results of the off-campus and on-campus groups were comparable (average score 72/100 versus 72.8/100, P > 0.15).
Table 2 Participation and test result off versus on campus

| | Off campus n (%) | On campus n (%) | T-test pooled |
|---|---|---|---|
| Total number of candidates (n=593) | 472 (79.6%) | 121 (20.4%) | |
| Number of candidates per university | | | |
| - Leuven | 227 (84.1%) | 43 (15.9%) | |
| - Antwerp | 29 (35.8%) | 52 (64.2%) | |
| - Brussels | 13 (50%) | 13 (50%) | |
| - Ghent | 203 (94%) | 13 (6%) | |
| Average test result | 72/100 | 72.8/100 | P > 0.15 |

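The percentages in Table 2 can be recomputed from the raw counts. The following is a minimal sketch that only uses the counts reported in the table; the helper name `off_campus_share` is introduced here for illustration and is not part of the study.

```python
# Counts copied from Table 2: (off campus, on campus) per university.
counts = {
    "Leuven": (227, 43),
    "Antwerp": (29, 52),
    "Brussels": (13, 13),
    "Ghent": (203, 13),
}


def off_campus_share(off: int, on: int) -> float:
    """Percentage of a group's candidates who took the exam off campus."""
    return 100 * off / (off + on)


# Column totals should match the first row of the table (472 and 121).
total_off = sum(off for off, _ in counts.values())
total_on = sum(on for _, on in counts.values())

for university, (off, on) in counts.items():
    print(f"{university}: {off_campus_share(off, on):.1f}% off campus")
print(f"All universities: {off_campus_share(total_off, total_on):.1f}% off campus")
```

The per-university counts add up to the reported totals, which confirms that the row and column figures in the table are internally consistent.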
Overall, we registered and solved 15 technical issues in the off-campus context (Table 3). Eight of these were software problems, in particular with loading a reading text in a new tab. For two candidates, technical issues had a negative impact on exam performance. The development team switched one of them to the SEB mode to complete the exam. Based on the post-exam analyses and after deliberation, the exam coordinator exempted the other candidate from the jury exam.
Table 3 Comparison of exam procedure and outcome off versus on campus

| | Off campus | On campus |
|---|---|---|
| Technical issues | 15 | 1 |
| - with impact on exam | 2 | 0 |
| Type of issue | | |
| - Internet failure | 2 | 0 |
| - Hardware issue | 4 | 1 |
| - Camera crash | 1 | 0 |
| - Software issue | 8 | 0 |
| Average suspicious score | 0.4 | NA |
| Median suspicious score | 0.3 | NA |
| Number of suspicious candidates | | |
| - Detected by the app | 22 (4%) | NA |
| - Flagged by supervisors | 2 (0.4%) | NA |
| - With non-critical events <1 | 455 (96%) | NA |
| - Without events | 15 (3%) | NA |
| - With noise event | 472 (100%) | NA |
| Interventions during exam | | |
| - Technical intervention | 8 | 1 |
| - Warning to candidate | 2 individuals | 0 |
| | 1 group (background noise) | 0 |

In total, the app detected 22 (4%) candidates with a suspicious level >1. All of these cases involved one or more noise events (background noise). The other, non-critical events consisted of leaving the web page, closing a page or typing text. Live monitoring and a post-exam review of the records showed that all these events were accidental and not intended as fraud.
The monitoring supervisors flagged two candidates who were typing more than expected for a multiple-choice exam. A review of the records showed that these candidates were using ‘control find’ (Ctrl+F) to search for words in the reading text. During the exam, the supervisors intervened eight times for a technical issue, warned two candidates to stop talking to themselves and sent one group message asking candidates to reduce background noise.
Of the 472 candidates who used the supervisor app, 304 completed the post-test questionnaire (213 women and 91 men). All of them were taking the proficiency test for the first time. Seventy-two percent of the respondents (219) had chosen GP training as their first-choice postgraduate medical training (Figure 2). Ten percent (30) preferred a medical training in Internal Medicine but took the proficiency test so that they could register in the AMGP as a second choice.
An exploratory factor analysis with oblique rotation was initially conducted on the six items of the questionnaire. However, one item, “I was well informed about the supervisor app”, had to be omitted because its communality was lower than .40.(6) The other five items (communality > .40) were included in a second analysis. This analysis yielded a Kaiser-Meyer-Olkin (KMO) value of .63, verifying the sampling adequacy (Table 4). Two factors had eigenvalues above Kaiser’s criterion of 1 and together explained 82.46% of the variance. The scree plot also justified retaining two factors (Figure 3). Table 5 shows the factor loadings after rotation. The items that cluster on the same factor suggest that factor 1 represents candidates’ appreciation of the supervisor app, while factor 2 represents emotional distress because of the supervisor app. Our sample size was appropriate for a factor analysis, since it satisfies the 10:1 subject-to-item ratio rule of thumb.(6) The reliability of the questionnaire, calculated as Cronbach’s alpha, was 0.72 (Table 6).
Table 4 KMO and Bartlett's test

| Measure | Value |
|---|---|
| Kaiser-Meyer-Olkin measure of sampling adequacy | 0.632 |
| Bartlett’s test of sphericity: approx. chi-square | 667.418 |
| Bartlett’s test of sphericity: df | 10 |
| Bartlett’s test of sphericity: Sig. | .000 |

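The degrees of freedom in Table 4 follow directly from the number of items. As a minimal sketch, assuming the standard form of Bartlett's test of sphericity, the statistic and its degrees of freedom can be computed from the sample size, the number of items and the determinant of the item correlation matrix. The function name `bartlett_sphericity` and the determinant value used below are hypothetical; the determinant is not reported in this study.

```python
import math


def bartlett_sphericity(n: int, p: int, det_r: float) -> tuple[float, int]:
    """Return (chi-square, df) for Bartlett's test of sphericity.

    n: number of respondents
    p: number of items
    det_r: determinant of the p x p item correlation matrix
    """
    chi_square = -((n - 1) - (2 * p + 5) / 6) * math.log(det_r)
    df = p * (p - 1) // 2
    return chi_square, df


# det_r = 0.5 is a made-up value for illustration only.
chi_square, df = bartlett_sphericity(n=304, p=5, det_r=0.5)
print(f"chi-square = {chi_square:.3f}, df = {df}")
```

With five retained items, the degrees of freedom are 5 × 4 / 2 = 10, which matches the value in Table 4. A determinant of 1 (uncorrelated items) gives a statistic of zero, so a large chi-square indicates that the items are correlated enough for a factor analysis.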
Table 5 Summary of exploratory factor analysis results for the supervisor app questionnaire (n=304)

Rotated factor loadings

| Item | Appreciation of the supervisor app | Emotional distress because of the supervisor app |
|---|---|---|
| Q2: I was more nervous than usual before the exam because of the supervisor app. | -.017 | **.851** |
| Q3: I was more nervous than usual during the exam because of the supervisor app. | -.116 | **.963** |
| Q4: The supervisor app had an impact on my results. | .121 | **.692** |
| Q5: I found the supervisor app reassuring. | **.973** | .035 |
| Q6: I would use the supervisor app again in the future for other exams. | **.738** | -.023 |
| Eigenvalue | 2.450 | 1.673 |
| % of total variance | 49% | 33.46% |

Note: Factor loadings over 0.40 appear in bold.

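The variance percentages in Table 5 can be traced back to the eigenvalues. The sketch below assumes that the items are standardized, so that the total variance equals the number of retained items (five); the variable names are illustrative and only values reported in the text and in Table 5 are used.

```python
# Values reported in Table 5 and in the text.
eigenvalues = [2.450, 1.673]
n_items = 5          # items retained in the second analysis
n_respondents = 304  # completed questionnaires

# Assumption: standardized items, so total variance equals the number of items.
explained = [100 * ev / n_items for ev in eigenvalues]

# Kaiser's criterion: keep factors with an eigenvalue above 1.
retained = [ev for ev in eigenvalues if ev > 1.0]

# Subject-to-item ratio for the 10:1 rule of thumb.
subjects_per_item = n_respondents / n_items

print([round(x, 2) for x in explained])
print(round(sum(explained), 2))
print(len(retained), round(subjects_per_item, 1))
```

Under this assumption the two eigenvalues correspond to 49% and 33.46% of the variance, 82.46% in total, and the ratio of respondents to items (about 61:1) is well above the 10:1 rule of thumb.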
Table 6 Reliability coefficient of the questionnaire

| Reliability coefficient | Value |
|---|---|
| Cronbach’s alpha | .723 |

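For readers unfamiliar with the coefficient in Table 6, the following is a minimal sketch of how Cronbach's alpha is computed from item scores, using the standard formula based on item variances and the variance of the sum score. The function name `cronbach_alpha` and the example responses are hypothetical and are not the study data; the software used by the authors may differ in details such as the handling of missing values.

```python
from statistics import variance


def cronbach_alpha(items: list[list[float]]) -> float:
    """Cronbach's alpha for k items, each given as a list of respondent scores."""
    k = len(items)
    # Sum score per respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Made-up 5-point Likert responses for 3 items and 5 respondents (not study data).
example_items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
alpha = cronbach_alpha(example_items)
print(round(alpha, 2))
```

Alpha approaches 1 when the items vary together and drops towards 0 when they are unrelated, which is why it is read as a measure of internal consistency.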
The qualitative analysis yielded four themes, all related to candidates’ emotional well-being. The first two themes referred to stress and anxiety before and during the exam. Some candidates felt more anxious because they feared technical problems or running out of time. The third theme was anxiety and stress because of the supervisor app. Candidates mentioned that they felt stressed because the app might flag something as fraud even though they had no intention of committing fraud. Candidates also reported that they felt awkward, observed and distracted because of the app.
Instead of just thinking about a question and the possible answers, I had to constantly remind myself about my behaviour (not to look upwards or left and right). The feeling of constantly being watched was not helpful (Anonymous candidate).
The fourth theme was connected to positive emotions about the supervisor app. Candidates thought that the supervisor app was reassuring and that, in case of need, someone could intervene and help.
The app worked pretty well for me, it was rather reassuring that it would be taken into account, if technical problems arose (Anonymous candidate).
Finally, candidates identified some technical issues before and during the exam. Before the exam, the main problem was a slow internet connection. During the exam, candidates experienced problems when opening multiple tabs at the same time.