This section details the crowdsourcing performance obtained from the field experiments. We compared the quality of the 3D models generated in the three scenarios and investigated crowd behaviours in terms of participation, contribution, user experience, and awareness perception.
5.1 Quality of 3D Reconstruction
We pre-processed the collected photos and reconstructed a 3D model for each group using the RealityCapture software. As observed (Fig. 7), Disorganised groups failed to produce complete 3D models regardless of the objects' complexities. F2F and Virtual groups successfully generated complete 3D models for four artefacts (Fig. 8). For Object E, both Virtual and F2F groups produced 3D models with a missing bottom (Fig. 9).
This can be attributed to Object E being the largest of the five objects. It has a height of 93.1 cm and a width of 75.6 cm, and it was placed in a display case, making the object even taller and more difficult for participants to capture from the top or bottom. Considering its complexity and reconstruction difficulty, we can conclude that the Virtual and F2F groups achieved similar crowdsourcing performance in model completion.
Thus, the alternative hypotheses hold. H1a: Asynchronous virtual collaboration generates better photogrammetric 3D models than conventional open calls; and H1b: Asynchronous virtual collaboration yields photogrammetric 3D models of the same quality as synchronous offline collaboration. This confirms the effectiveness of our asynchronous virtual collaboration in reconstructing photogrammetry-based 3D models.
Table 2
Results of Statistical Tests. Basic statistics are calculated for the three groups in terms of the number of images contributed and time spent. Kruskal-Wallis H tests are computed, and the results confirm statistically significant differences among the three groups in the number of images contributed (H(2) = 50.52, p < .01) and time spent (H(2) = 23.05, p < .01). Post-hoc Mann-Whitney U tests are adopted for pairwise comparisons, and the results verify statistical significance for all pairs except for time spent in the Virtual and Disorganised scenarios, meaning that Virtual (x̄ = 10.0 min) and Disorganised groups (x̄ = 9.5 min) spent a similar amount of time on crowdsourcing tasks.
| Metric | Group | n | Σ | x̄ | x̃ | H-Test (df = 2, N = 216) | Post-hoc Pair (U-Test) | U-value | z-score | p-value | Sig. |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Num. of Images | D | 42 | 278 | 6.6 | 6 | H = 50.52, p < 0.001, Reject H0 | V vs D | 2226.5 | 3.332 | 0.0009 | Y |
| | V | 159 | 1439 | 9.1 | 8 | | F vs V | 13.5 | −6.320 | < 0.001 | Y |
| | F | 15 | 228 | 15.2 | 26 | | F vs D | 0 | −5.799 | < 0.001 | Y |
| Time Spent (mins) | D | 42 | 399 | 9.5 | 8 | H = 23.05, p < 0.001, Reject H0 | V vs D | 2964 | 1.117 | 0.263 | N |
| | V | 159 | 1519 | 10.0 | 10 | | V vs F | 310.5 | −4.727 | < 0.001 | Y |
| | F | 15 | 258 | 17.2 | 16 | | F vs D | 100.5 | −5.700 | < 0.001 | Y |
Note: D = Disorganised, V = Virtual, F = F2F, n = sample size, Σ = total number, x̄=mean, x̃=median, df = degree of freedom, Sig.=Significance.
5.2 Participation and Contribution
Participation and contribution are reflected in the number of participants and the number of images contributed during the crowdsourcing activities, at both the group and individual levels.
The three crowdsourcing approaches differed significantly (Table 2) in the total number of participants (nDisorg. = 42, nVirtual = 159, nF2F = 15) and images collected (Σimg.Disorg. = 278, Σimg.Virtual = 1439, Σimg.F2F = 228). The statistics reveal that the Virtual scenarios successfully motivated more participation over three days, which confirmed our H2a: Compared to conventional open calls, more participants self-select in asynchronous virtual collaboration within the same time period.
As observed, even though Disorganised groups (Σimg.Disorg. = 278) contributed more images than F2F groups in total, they failed to produce complete 3D models. Despite the high-quality 3D reconstructions, the F2F scenarios recruited the fewest volunteers (nF2F = 15) and used the smallest total number of images (Σimg.F2F = 228) compared to the other two settings. This implies that simply accumulating images cannot guarantee the desired outcomes for mass photogrammetry; high-quality contributions from individual participants are required to achieve effective crowdsourcing.
At the individual level, Fig. 10 illustrates that participants in the three scenarios behaved and contributed differently. Since the distribution shapes of the three groups differ (Fig. 11), we adopted the Kruskal-Wallis and post-hoc Mann-Whitney tests to verify the statistical significance (Table 2). The results indicate that the individual contributions of the three groups were statistically different and can be ranked from most to least as: F2F (x̄ = 15.2, x̃ = 26), Virtual (x̄ = 9.1, x̃ = 8) and Disorganised (x̄ = 6.6, x̃ = 6). This verifies H2b: Individual participants contribute more data in asynchronous virtual collaboration than in conventional open calls. We can conclude that our asynchronous virtual collaboration successfully motivated more participation and better contributions for mass photogrammetry than conventional open calls.
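As a minimal sketch of how tests of this kind can be run (the paper does not specify its analysis software, so the use of SciPy and the per-participant counts below are illustrative assumptions, not study data), an omnibus Kruskal-Wallis test can be followed by pairwise Mann-Whitney comparisons:

```python
# Hedged sketch: tests in the style of Table 2, computed with SciPy.
# The per-participant image counts below are hypothetical placeholders.
from scipy.stats import kruskal, mannwhitneyu

disorganised = [6, 5, 8, 7, 6, 9, 4, 8]     # images per participant (hypothetical)
virtual      = [8, 9, 10, 7, 11, 8, 9, 12]
f2f          = [26, 12, 30, 9, 27]

# Omnibus test: do the three distributions differ?
h_stat, p_omnibus = kruskal(disorganised, virtual, f2f)
print(f"Kruskal-Wallis: H = {h_stat:.2f}, p = {p_omnibus:.4f}")

# Post-hoc pairwise comparisons (two-sided Mann-Whitney U tests)
pairs = {"V vs D": (virtual, disorganised),
         "F vs V": (f2f, virtual),
         "F vs D": (f2f, disorganised)}
for name, (a, b) in pairs.items():
    u_stat, p_pair = mannwhitneyu(a, b, alternative="two-sided")
    print(f"{name}: U = {u_stat:.1f}, p = {p_pair:.4f}")
```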
We further investigated the communication that occurred in the crowd-based self-organised collaboration in the F2F and Virtual scenarios. As coordinated (Table 1), F2F groups that gathered offline usually had a quick briefing prior to task execution. During their image acquisition, explicit communication among group members was rarely observed. This can be ascribed to three main reasons: 1) the task is relatively easy with low interdependence, so group members can adjust their behaviours through implicit observation; 2) the ad-hoc teams are formed randomly and temporarily, and participants barely know each other; 3) their image acquisition actions do not align with the conventions of museum visits, during which visitors are expected to be quiet.
We processed the chat logs obtained from the anonymous online chatrooms in Virtual scenarios. The content was encoded with multiple tags:

- Action.Des: Describe actions (e.g., "I took pictures of...").
- Angles: Describe shooting angles (e.g., "top", "bottom", "inside").
- Artefacts: Specific reference to the cultural heritage objects.
- Barriers: Mention barriers during the task execution (e.g., "Reflective glass", "Bar frame").
- Concerns: Raise concerns/issues about crowdsourcing (e.g., "afraid", "concerned").
- Emotions: Express emotional reactions (e.g., emojis; "wow"; "ha-ha").
- History.Info: Share historical information.
- History.Q: Ask questions about history/artefacts.
- Obj.Features: Mention specific features of an artefact (e.g., "paw", "eyes", "camel's hump").
- Photograph: Mention the image acquisition/photo-taking action.
- Suggestion: Give suggestions for platform/activities (e.g., "should", "it's better to").
- Tech.Info: Share information about 3D technology or mass photogrammetry.
- Tech.Q: Ask questions about technology or techniques.
- 3D.Model: Specific reference to photogrammetry-based 3D model.
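As a minimal sketch of how such coded messages can be aggregated (the message texts and tag assignments below are hypothetical and not drawn from the actual chat logs), tag frequencies can be tallied as follows:

```python
# Hedged sketch: tallying manually assigned content tags across coded chat messages.
from collections import Counter

coded_messages = [  # hypothetical examples, each coded with tags from the scheme above
    {"text": "I took pictures of the lion's paw", "tags": ["Action.Des", "Obj.Features", "Photograph"]},
    {"text": "wow, this camel is amazing!",       "tags": ["Emotions", "Artefacts"]},
    {"text": "The reflective glass is a problem", "tags": ["Barriers"]},
]

tag_counts = Counter(tag for msg in coded_messages for tag in msg["tags"])
for tag, count in tag_counts.most_common():
    print(f"{tag}: {count}")
```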
As depicted (Fig. 12), participants in Virtual groups were not intuitively triggered to report their actions. Instead, their desire to share was more likely to be stimulated by emotional reactions to the artefacts. Although the information exchange seemed less effective in terms of task execution, it covered a more comprehensive range of topics than that within F2F groups.
5.3 User Experience
User experience consists of two main aspects: user effort and crowdsourcing experience. User workload can be examined by combining objectively recorded task execution times with subjective feedback, while the crowdsourcing experience is primarily reflected in self-reported app interactions and overall enjoyment. Our questionnaires collected participants' responses using 7-point Likert scale items ranging from 'strongly disagree' to 'strongly agree'. The Likert scale distributions of the three groups were plotted to compare and reflect user experience.
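As a minimal sketch of how such Likert responses can be summarised per group (the scores below are illustrative placeholders, not the study data; the actual distributions are plotted in Fig. 13), each distribution can be tabulated and the share of positive responses (scores above the neutral value of 4) computed:

```python
# Hedged sketch: summarising 7-point Likert responses (1 = strongly disagree, 7 = strongly agree).
from collections import Counter

responses = {  # hypothetical ratings for one questionnaire item
    "Disorganised": [4, 5, 3, 6, 4, 5, 2, 4],
    "Virtual":      [6, 7, 5, 6, 7, 6, 5, 7],
    "F2F":          [5, 4, 6, 7, 3, 5, 6],
}

for group, scores in responses.items():
    dist = Counter(scores)
    positive_share = sum(1 for s in scores if s > 4) / len(scores)
    counts = [dist.get(k, 0) for k in range(1, 8)]   # counts for scores 1..7
    print(f"{group}: {counts}, positive: {positive_share:.0%}")
```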
First, we examined the time users spent performing tasks, following the method used to analyse image contributions (Section 5.2). It can be observed (Fig. 10 and Fig. 11) that the three groups appeared to have varying task execution times. Nevertheless, our statistical tests (Table 2) indicate that there was no significant difference in time spent between the Disorganised (x̄ = 9.5 min, x̃ = 8 min) and Virtual (x̄ = 10 min, x̃ = 10 min) groups. Both groups took much less time to perform tasks than the F2F groups (x̄ = 17.2 min, x̃ = 16 min). Less execution time implies that Virtual groups required less effort on the users' side. Participants also rated their effort in terms of time spent and images contributed, where a lower score indicates a lower workload. As illustrated (Fig. 13), the Likert scale distributions generated by the Virtual groups are skewed towards lower scores for both questions, especially compared to F2F groups. Combining the above observations, we can deduce that our H3a is true: Compared to synchronous offline collaboration, asynchronous virtual collaboration requires less user effort.
We learnt about participants’ experiences from their feelings about the overall campaign and interactions with our platform. A higher score implies a better experience. Figure 13 shows that all three groups responded positively as most scores were higher than 4 (i.e., neutral response). Fluctuations in the distributions generated by F2F groups may be due to the varying complexities of target objects handled by different participants. Objects of larger sizes require users to contribute more images and spend more time interacting with the app. Scores obtained from Virtual groups for both questions produced two stronger and steadier upward trends, implying the overall experience was generally good regardless of object complexities. These observations can validate our H3b: Compared to synchronous offline collaboration, asynchronous virtual collaboration provides a better crowdsourcing experience.
Figure 13 also reveals that participants in Virtual groups tended to give better reviews of our platform (app interaction) than those in Disorganised and F2F groups. Unlike the other two experimental settings (Table 1), Virtual scenarios allowed participants to access the full features of our platform, consequently differentiating their user experience, especially when it comes to app evaluation. Hence, we assessed the user experience of Virtual groups in detail.
Specifically, the Likert plots (Fig. 14) demonstrate that most participants reported a positive crowdsourcing experience in platform interactions (78%) and overall enjoyment (67%). The distributions of user effort ratings, in contrast, are not skewed towards high scores (for these items, a higher score means a higher workload): 65% of participants felt they did not put much effort into image contributions, while the evaluation of time spent was relatively evenly distributed, with roughly the same number of participants considering that they spent a lot of time (47%) or little time (39%) on task execution. This may be due to the varying complexity of the target objects experienced by different participants. We can thus validate the usability of the Open Collaboration functionalities and infer that asynchronous virtual collaboration requires little user effort and promotes a better crowdsourcing experience.
5.4 Awareness Perception
Performance differences can be closely related to awareness perception during the collaboration. Hence, we further explored whether and to what extent our virtual collaboration can support awareness perception, and how it can affect user behaviour. We measured participants’ perceived information using self-reported 7-point Likert scores and adopted Spearman’s rho to determine the relations between ranked variables.
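As a minimal sketch of this rank correlation analysis (the paired ratings below and the use of SciPy are illustrative assumptions, not the study data), Spearman's rho between two ranked variables can be computed as follows:

```python
# Hedged sketch: rank correlation between two sets of 7-point Likert ratings.
from scipy.stats import spearmanr

perceived_importance = [7, 6, 5, 7, 6, 4, 7, 5]   # hypothetical ratings per participant
platform_evaluation  = [7, 6, 5, 6, 7, 4, 6, 5]

rho, p_value = spearmanr(perceived_importance, platform_evaluation)
print(f"Spearman's rho = {rho:.2f}, p = {p_value:.4f}")
```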
Our platform attempted to emphasise the impact of digitalising cultural heritage and the significance of our crowdsourcing campaign (Section 3.1). Here we reflect on our design and investigate the correlations between user experience and their perception of the meaningfulness of the activities.
We focused on the user experience in the Virtual scenarios. The plots in Fig. 14 demonstrate that 86% of participants recognised the importance of our crowdsourcing activities. Participants' awareness of meaningfulness was strongly correlated with their evaluation of our platform (rs = .74, p < .01, n = 51) and moderately related to their enjoyment of the overall experience (rs = .46, p < .01, n = 51). However, there were no statistically significant relationships between perceived importance and self-reported user effort (p-values > .01 for contribution and time spent). The results indicate that our platform successfully conveyed the meaning of our crowdsourcing campaign, which can, to some extent, improve users' favourable view of our platform.
Knowledge of photogrammetric principles and procedures was critical to the success of mass photogrammetry. Since such information was communicated to participants differently in three experimental settings (Table 1), we assessed participants’ understanding of principles and steps within each of the three groups respectively.
Different results were computed (Fig. 15), although the p-values (< .001) for all three groups statistically confirmed a positive correlation between these two factors. F2F groups had the largest proportion of participants who understood the principles (81%) and the steps (81%), with a strong correlation between them (rs = .89, p < .01, n = 15). This can be explained by the way they cooperated: participants in F2F groups discussed strategies prior to fieldwork, which guided their subsequent image acquisition. Disorganised groups (rs = .85, p < .01, n = 28) also obtained a high coefficient even though they had a smaller proportion of participants who responded positively to both questions (principles: 54%, steps: 39%). This might suggest that the participants who understood the principles largely coincided with those who knew the task execution steps. These phenomena highlight the significance of illustrating technical principles, for it can help contributors develop a taskwork mental model.
Virtual groups were generally more aware of mass photogrammetry knowledge (principles: 65%, steps: 69%) than Disorganised groups, but they were less knowledgeable than F2F groups. Interestingly, among the three experimental settings, the Virtual scenario was the only one in which more participants knew the steps than understood the principles, and these two variables were moderately correlated (rs = .69, p < .01, n = 51). We consider the following possible explanations. Since Virtual groups could generate 3D reconstructions of a similar quality to F2F groups, they should have been able to obtain the relevant information during the activities. Many participants in Virtual groups may have felt lost prior to task execution, like those in Disorganised groups, as there was no one to assist them directly on the spot. The virtual chatrooms provided a channel where they could ask questions and receive instant feedback from others. Moreover, rather than spending effort figuring out the complex principles, most participants tended to pick up executable actions they observed. The results suggest that asynchronous virtual collaboration can facilitate the creation and development of a shared mental model within groups.