Advised or paid way to get it right? The contribution of fact-checking tips and monetary incentives to spotting scientific disinformation

Disinformation about science can impose enormous economic and public health burdens. Several types of interventions have been proposed to prevent the proliferation of false information online, where most of the spreading takes place. A recently proposed strategy to help online users recognise false content is to follow the techniques of professional fact checkers, such as looking for information on other websites (lateral reading) and looking beyond the first results suggested by search engines (click restraint). In two preregistered online experiments (N = 5387), we simulated a social media environment and set out two interventions, one in the form of a pop-up meant to advise participants to follow such techniques, the other based on a monetary incentive. In Experiment 1, we compared these interventions to a control condition. In Experiment 2, another condition was added to test the joint impact of the pop-up and the monetary incentive. We measured participants' ability to identify whether the presented information was scientifically valid or invalid. Results revealed that while monetary incentives consistently increased the accuracy of participants' evaluations, fact-checking tips improved accuracy mainly when the source of the information was unknown.

The massive circulation of inaccurate scientific information can have nefarious societal consequences. Successful misconceptions influence the public debate on decisions regarding the effectiveness of a vaccine, the adoption of solutions mitigating climate change, or the cost of a social policy. The sharing of false information is easily fuelled by political or social motivations that disregard the best scientific evidence on the matter. It is indeed tempting to share information on social media without verifying its truthfulness, simply because the mere act of sharing allows us to exhibit our position on a given topic and to justify the validity of such a position. This phenomenon is amplified in crisis situations, when scarce information is accompanied by multiple and contrasting rumours (a so-called infodemic) that can serve different views. People's propensity to accept scientifically dubious information can thus become a crucial problem for both democracy and public welfare.

There are structural challenges to fighting the spread of false information on social media. One key issue is that companies often perceive a trade-off between engaging users and combating viral but fake content, to the point of favouring the former over the latter [1]. Curtailment is made even more difficult when there is a deliberate intent behind the dissemination, what researchers refer to as disinformation. For example, at the peak of the coronavirus infodemic, only 16% of fact-checked disinformation was labelled as such by Facebook's algorithms, partly because content creators were able to simply repost content with minor changes, thus escaping detection [2]. It is therefore essential that, in combination with systematic changes in policy, users themselves are empowered against malicious or false content. User-based resilience needs to be part of the toolkit against disinformation: for instance, among the pillars of infodemic management, Eysenbach [3] lists eHealth literacy, science literacy capacity, and the critical thinking ability to fact-check information. Fighting science-related disinformation is harder than countering other forms of disinformation (e.g. political) because in the former case the lines between expertise and pseudoexpertise are blurred, and incompetent or otherwise biased sources pose as expert sources on topics like epidemiology or climate change.

Research on countering disinformation has developed substantially over the last decade, bringing a wealth of different approaches [4-8]. These include debunking, the systematic correction of false claims after they have been seen or heard [9,10]; pre-bunking, preventive measures applied before exposure to disinformation [5,11]; nudging, interventions affecting users' choices without limiting their freedom of choice [12]; and boosting, the empowering of users by fostering existing competences or instilling new ones [12]. All of the above approaches have proven useful in a social media context, not least through ingenious and innovative adaptations of classical paradigms. Debunking has been extensively studied, with several experiments focusing on the source [13-16] and the timing [17] of fact-checking. Research has also explored whether evaluations of the quality of contents and sources can be delegated to the so-called wisdom of crowds, with encouraging results [18-20].
Studies on pre-bunking have largely focused on the concept of inoculation [5,21], namely exposing users to disinformation strategies in order to ease their recognition in future encounters. Inoculation has demonstrated pronounced and lasting effects when introduced through games [22-25]. Nudging has been tested by showing warning labels for unchecked or false claims [26-29], but also by priming users to pay attention to the accuracy of content they might be willing to share [30-32] (however, see [33] for a critique of this approach). Finally, boosting has been tested by presenting users with lists of news/media literacy tips or guidelines on how to evaluate information online [34-38], producing some remarkable results and some non-significant ones.

A promising example of a media literacy intervention has been carried out by researchers interested in understanding how fact checkers search for information about unknown but institutional-looking sources [39]. Researchers catalogued fact checkers' strategies and distilled a series of questions to evaluate content, a set of skills that was named Civic Online Reasoning [40,41]. Two strategies adopted by fact checkers are especially prominent. One is lateral reading, namely leaving a website and opening new tabs along a horizontal axis in order to use the resources of the Internet to learn more about a site and its claims. The other is click restraint, that is, skipping the first results of a web search to avoid biases created by results-ranking algorithms. These strategies seem particularly fitting when content has origins that are hardly identifiable or that appear legitimate on the surface, a feature that has been associated with content creators spreading scientific disinformation [42]. Detecting scientific disinformation often requires specific expertise to evaluate the content and cross-check sources. Under such conditions, assessing the truthfulness of information becomes tricky.

In the absence of expertise and content knowledge, users can rely on a number of external cues to infer whether information presented as scientific is reliable [43]. Aspects such as immediate payment play an important role in workers' motivation [48].

Monetary incentives have been proven to be a cost-effective tool to modify behaviour in domains such as health and human development [49], where an early boost in motivation often promotes the adoption of cheap preventive behaviours, thereby avoiding costly consequences [50]. From a psychological perspective, the use of incentives builds on the attention-based account of disinformation spread. This account posits that users' attention is captured by novel and engaging content at the expense of accuracy [4,51]. Recent research in this field has found both laboratory and field evidence that the accuracy of content is often overlooked, and that simple cues reminding participants to evaluate the accuracy of content they share have an impact on the proportion of fake/true news shared [30,32,52,53]. Increasing accuracy through incentives is not an entirely novel idea in social media either, as shown by a recent initiative promoted by Twitter [54]. Although these premises indicate that this type of intervention can be very effective, it is not a given that economic incentives will have a positive effect on scientific content evaluation.
In an experimental setting in particular, social media content is subject to higher scrutiny than when users scroll through their news feed [30]. It is therefore possible that additional incentives may not further increase participants' accuracy.

The aim of the present study was to test and compare the effectiveness of Civic Online Reasoning tips and monetary incentives in helping users spot scientific disinformation. To deliver the tips, a pop-up preceded the post, presenting the lateral reading and click restraint strategies. The use of a pop-up ensured that participants processed the content before observing the post, an approach that has also been adopted in previous research [52]. Pop-ups could easily be adapted to a social media setting as regular reminders, with the necessary precautions to avoid a reduction of their salience over time [55,56]. To test the effect of monetary incentives instead, we doubled the participation fee (equivalent to an average of +£8.40/hour) if participants correctly guessed the validity of the post they were evaluating.

Experiment 1
In Experiment 1, we separately tested the efficacy of the pop-up and of monetary incentives, and compared their effects to a control condition with no intervention. To assess whether the effect of the interventions held over the widest possible range of contexts, we used a set of 9 different Facebook posts varying in properties such as the scientific topic, the source reputation, and the source's level of factual reporting. We conducted the experiment on Qualtrics and lab.js [57]. During the experiment, participants observed a Facebook post, selected according to pre-specified criteria (see S3 Methods), and rated its scientific validity. Participants could take as much time as they wanted in giving their rating. Crucially, participants were also explicitly told that they were allowed to leave the study page before evaluating the post. After the rating, participants completed a follow-up questionnaire (see Civic Online Reasoning below).

Experimental conditions. Participants were randomly assigned to one of three experimental conditions: control, incentive, and pop-up. In the control condition, participants completed the task as described above. In the incentive condition, participants had their participation fee doubled if their rating matched that given by the experimenters. Unbeknownst to participants, the correctness of the answer depended only on whether the answer was valid or invalid, and not on the extremity of the answer (e.g. having answered 4 instead of 5), even though we selected unambiguously valid or invalid content. In the pop-up condition, presentation of the post was preceded by a pop-up (Fig 1) presenting a list of Civic Online Reasoning techniques (e.g. lateral reading, click restraint) as tips to verify the information in the post.
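To make the payout rule concrete, the following minimal sketch reproduces it in R. The variable names and the direction of the 6-point scale are our assumptions for illustration, not taken from the study materials.

    # Hypothetical sketch of the incentive rule: only the side of the scale matters,
    # not the extremity of the rating.
    # rating: integer from 1 to 6 (assumed: 1 = clearly invalid, 6 = clearly valid)
    # post_valid: TRUE if the post content is scientifically valid
    bonus_paid <- function(rating, post_valid) {
      guessed_valid <- rating >= 4
      guessed_valid == post_valid
    }
    bonus_paid(4, TRUE)   # TRUE: fee doubled, despite the non-extreme rating
    bonus_paid(2, TRUE)   # FALSE: no bonus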

Stimuli. Each participant observed one out of nine possible Facebook posts (Fig 2; see S1 File for a full list). Posts varied in terms of: (i) scientific validity of the content (i.e., six valid and three invalid posts, either with verified or debunked information; S3 Methods); (ii) topic (i.e., three on climate change, three on the coronavirus pandemic, three on health and nutrition); (iii) factual reporting of the source, based on ratings from mediabiasfactcheck.com (i.e., three high/very high versus six low/very low); (iv) source reputation, as measured in a screening survey (S4 Methods; three categories: trusted (2 posts), distrusted (4), unknown source (3)). Posts were balanced so as to have three posts for each topic: one from a source with high factual reporting displaying valid information, one from a source with low factual reporting displaying valid information, and one from a source with low factual reporting displaying invalid information.

Accuracy. We computed two measures of accuracy: correct guessing and accuracy score. Correct guessing refers to a dichotomous variable that tracks whether the participant gave a 'valid' (vs. 'invalid') rating when the post content was actually scientifically valid (vs. invalid). The accuracy score, instead, is a standardised measure ranging from zero to one, with 0 indicating an incorrect "1" or "6" validity rating, 0.2 indicating an incorrect "2" or "5" rating, 0.4 an incorrect "3" or "4" rating, 0.6 a correct "3" or "4" rating, 0.8 a correct "2" or "5" rating, and 1 a correct "1" or "6" rating. The accuracy score allows us to distinguish validity evaluations that are associated with different behaviours: for instance, not all participants would be willing to share content that they rated as 4 in terms of scientific validity. In addition, the accuracy score is statistically more powerful than correct guessing as it includes more possible responses [58]. We thus considered the accuracy score as our main index.
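As an illustration, both measures can be computed from a rating and the post's true validity as in the sketch below. This is a hypothetical implementation: the function and variable names, and the assumed scale direction, are ours.

    # Hypothetical sketch of the two accuracy measures.
    # rating: 1-6 (assumed: higher = "more valid"); post_valid: TRUE/FALSE
    accuracy_measures <- function(rating, post_valid) {
      guessed_valid <- rating >= 4
      correct <- guessed_valid == post_valid             # correct guessing (0/1)
      extremity <- abs(rating - 3.5) - 0.5               # 0, 1, or 2 steps from the midpoint
      score <- ifelse(correct, 0.6 + 0.2 * extremity,    # 0.6, 0.8, 1.0 when correct
                               0.4 - 0.2 * extremity)    # 0.4, 0.2, 0.0 when incorrect
      list(correct_guess = correct, accuracy_score = score)
    }
    accuracy_measures(6, TRUE)   # correct "6": score 1.0
    accuracy_measures(2, TRUE)   # incorrect "2": score 0.2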

Search behaviour. During the evaluation of the post, we tracked participants' behaviour on the study page. We measured the time spent both inside and outside the page, as well as a series of dummy variables tracking whether participants had clicked on any of the links present (e.g., Facebook page, article page, Wikipedia page). Based on these measurements we were able to estimate participants' response times and search behaviour.

Civic Online Reasoning. After having rated the scientific validity of the post, participants completed a questionnaire investigating the factors that could have influenced their choice. In order to test our hypotheses, we asked participants whether they had engaged in lateral reading and click restraint. Participants were said to have used lateral reading if they reported having searched for information outside the study page (yes/no question), and if they specifically reported searching on a search engine among the other options.

Statistical tests were conducted using base R [59]. We adopted the standard 5% significance level to test against the null hypotheses. All tests were two-tailed unless otherwise specified. Post-hoc tests and multiple comparisons were corrected for multiplicity. To test the effect of our interventions on accuracy, we adopted two tests, one for the accuracy scores and one for correct guessing (original preregistered analyses are presented in S1 Analyses). Since our measure of technique use is based on self-reporting, responses might have been biased by external expectations. We therefore checked whether participants who reported using the techniques also showed consistent search behaviour on the study page.
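For concreteness, one way to set up the two accuracy tests in base R is sketched below. The specification is hypothetical: the data frame and variable names are ours, and the actual preregistered models are those reported in S1 Analyses.

    # Hypothetical sketch of the two tests of intervention effects on accuracy.
    # dat: one row per participant; condition: factor (control / incentive / popup)
    m_score <- lm(accuracy_score ~ condition, data = dat)    # accuracy score (0-1)
    m_guess <- glm(correct_guess ~ condition, data = dat,
                   family = binomial)                        # correct guessing (0/1)
    summary(m_score)
    summary(m_guess)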

Based on these results, we proceeded to test whether the pop-up and incentives had some mediated impact on the accuracy score through technique adoption. To test mediation we used the R package MarginalMediation [60]. Technique adoption was found to mediate a small but significant increase in accuracy scores.

Civic Online Reasoning techniques were originally designed to help evaluate content from seemingly legitimate but unknown websites [39]. We thus analysed whether the effect of the interventions depended on the type of source.
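The structure of such an analysis can be sketched as follows. This is a hypothetical illustration following the usage pattern documented for MarginalMediation (fitted glm objects for the b/c and a paths); the data frame, variable, and factor-level names are our assumptions.

    # Hypothetical sketch of the marginal mediation analysis.
    library(MarginalMediation)
    # b and c' paths: accuracy as a function of the mediator and the intervention
    pathbc <- glm(accuracy_score ~ technique + condition, data = dat)
    # a path: technique adoption as a function of the intervention
    patha <- glm(technique ~ condition, data = dat, family = binomial)
    # indirect effect of the pop-up through technique adoption, with bootstrapped CIs
    mma(pathbc, patha, ind_effects = c("conditionpopup-technique"), boot = 500)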

Perhaps not surprisingly, we observed that, in the pop-up condition, adoption of lateral reading and click restraint was strongly linked with source type (chi-squared test with technique adoption and source category as variables, χ2(2) = 15.407, p < .001): when the source was trusted, only 6.7% of participants used these techniques, whereas the proportion was 20% when the source was unknown. We then tested for differences between the interventions in accuracy scores and correct guessing by source type.
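In base R, this association test amounts to a chi-squared test on a 2 x 3 contingency table, as in the minimal sketch below (variable names are ours).

    # Hypothetical sketch: technique adoption (yes/no) by source category
    # (trusted / distrusted / unknown), giving 2 degrees of freedom.
    tab <- table(dat$technique_adopted, dat$source_category)
    chisq.test(tab)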

By contrast, the presence of the pop-up did not seem to directly affect any indicator of accuracy. In spite of that, participants in the pop-up condition reported more lateral reading and click restraint, as well as more frequent searches outside the study page. In turn, this increment in Civic Online Reasoning techniques (up to +13.5% when the source is unknown) seems to mediate a small but significant increase in accuracy scores (marginal mediation analysis), suggesting an indirect effect of the pop-up. An effect of the pop-up is possibly seen in posts produced by unknown sources, where correct guessing (but not accuracy scores) is slightly higher in the pop-up condition than in control (S4 Analyses).

One potential takeaway from these findings is that some initial biases might affect the rate at which participants look for information outside the content provided (e.g. trust in the source).

Some titles, subtitles and captions of the posts included references to governmental or academic institutions. To prevent these references from affecting the evaluation of the content, we slightly rephrased some sentences to remove this information. In addition, we also corrected grammatical mistakes in the text that could have given away the reliability of the source.

Technique adoption. We tested whether technique adoption was influenced by either intervention, following a procedure similar to our test for correct guessing (comparison of two logistic regressions with/without interaction). Likelihood-ratio tests again favoured the model without the interaction (see the supplementary Analyses for an exploration of participants' search behaviour).
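A minimal sketch of this model comparison in base R follows; the dummy-coded intervention indicators and variable names are our assumptions.

    # Hypothetical sketch of the likelihood-ratio test between nested logistic models.
    m_additive <- glm(technique ~ popup + incentive, data = dat, family = binomial)
    m_interact <- glm(technique ~ popup * incentive, data = dat, family = binomial)
    anova(m_additive, m_interact, test = "Chisq")  # n.s.: the additive model is favoured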

To test whether participants who adopted Civic Online Reasoning techniques performed better in the task, we ran two tests, one for each accuracy index.

In this experiment, we also tested the interaction between the incentive and the pop-up.

Model comparison showed no interaction between the two interventions, suggesting that the pop-up and monetary incentives contributed separately to the increase in accuracy. We additionally observed that monetary incentives increased participants' time spent on the study page. The effect of incentives is particularly remarkable given the strong benchmark against which it was compared: participants in the control condition were already primed for accuracy [30], and were therefore likely to exert a greater degree of attention than when routinely browsing their news feed.

Our results on incentives are in line with an attention-based account of information processing on social media; that is, increased deliberation is sufficient to decrease belief in false content [4]. Our results add to the literature on attention-based interventions by showing how monetary incentives can additionally modulate motivation and attention and increase performance. The effects of incentives are, however, not always straightforward [63-65]. In fact, under some circumstances incentives decrease rather than increase motivation [66]. One crucial aspect lies in the calibration of incentives, as it has been shown that the effect of incentives on performance is non-monotonic and that incentives that are too small are often counterproductive [66]. Moreover, when explicit incentives seek to modify behaviour in areas such as education, environmental actions, and the formation of healthy habits, a conflict arises between the direct extrinsic effect of incentives and the way these incentives may crowd out intrinsic motivations. Seeking accuracy in judging news is certainly driven by the intrinsic motivations of individuals. In all likelihood, however, these intrinsic motivations do not conflict with monetary incentives: seeking accuracy, unlike deliberately adopting ecological behaviour or going on a diet, is a largely automatic process.

Another concern was that motivation and attention might not have been sufficient for content that is hardly accessible to non-experts. The effectiveness of incentives is then even more remarkable considering that participants were asked to evaluate information based on scientific and technical reports, and thus had to rely on external knowledge and intuition when claims and data were not immediately available.

Compared to previous work on Civic Online Reasoning [39], our study finds correlational and causal evidence supporting the importance of lateral reading and click restraint as predictors of accurate information evaluation, especially (as initially intended) when information about the source is scarce. Notably, this is the first reported evidence of a general-population intervention in a social media context, extending the evidence for its applicability. We note, however, that the connection between our intervention (the pop-up) and technique use is only indirect, as participants were free to ignore the recommendations. Stronger evidence for the efficacy of Civic Online Reasoning techniques could come from within-subject studies that selectively limit the use of the techniques to assess their direct impact on users' behaviour.

Our results also partly support the literature on media and news literacy [34]. Previous successful attempts at using fact-checking tips relied on presenting participants with some of the Facebook guidelines for evaluating information [36,37]. Critically, these tips acted by reducing post engagement (liking, commenting, sharing) and the perceived accuracy of headlines from hyper-partisan and fake-news sources. Given that our results highlight the effectiveness of fact-checking tips when participants are less familiar with the source, we suspect that the use of such tips is inversely associated with the knowledge and reputation of the source: that is, the more the source is well known and widely respected, the less participants will rely on guidelines and recommendations. This interpretation goes against previous studies in the literature claiming that source information has little impact on the accuracy judgement of social media content [67-69]. Although we did not directly test for the presence/absence of source information, we did find that familiarity with and trust in a source largely affected the search style and the evaluation of the content, suggesting that providing this information to participants had a meaningful effect on their validity evaluations. One way to reconcile these apparently antithetical conclusions is by considering the relative capability of participants to assess the plausibility of information: source knowledge can be a viable heuristic when information is harder to evaluate. Indeed, we suspect that in our experiment information about the source was often easier to assess than the plausibility of the content itself. In addition, compared to previous experiments, participants could open the original article of the post to confirm that it had actually been produced by the source and not fabricated, a factor that probably increased reliance on the source.

These considerations and our findings are not sufficient to ascertain whether and under what circumstances reliance on the source is beneficial or detrimental; however, we argue that source information is important in many situations [70,71].

Our study does not come without limitations. Possibly the most critical issue is the limited number of stimuli used across experiments (15), which did not allow us to properly control for the many features that could impact the evaluation of the posts. Even though we cannot exclude confounding variables and biases in the selection of stimuli, we tried as much as possible to follow a standardised procedure with pre-defined criteria in order to exclude stimuli that could be considered problematic. Moreover, even though most of the literature, like the present study, has focused on standardised stimuli reporting content from news sources, we recognise that scientific (dis)information comes in several formats that also depend on the topic, the audience, and the strategy of the creator. We decided to exclude other types of formats (e.g. videos or screenshots) to minimise differences in experience between users; we think, however, that future research should explore in more depth the impact of varying media formats on the spread of disinformation and on possible counteracting interventions. Lastly, the study explored the effectiveness of the interventions when using a computer, as the very concept of lateral reading is based on browsing horizontally through internet tabs on a computer. Although nothing precludes the use of such techniques on other devices such as a mobile phone or tablet, the user interface is often not optimised for searching different contents at the same time, making their use more cumbersome. This is particularly problematic considering that social media are predominantly accessed through mobile devices. A promising direction in the fight against disinformation will be to study the influence of the device and user interface on the ability of users to access high-quality information. Further studies should also investigate how much the ease of accessing information from within a specific app could prompt users to fact-check what they see. For example, many apps allow users to check information on the internet via an internal browser without leaving the app itself.

This study set out to assess the relative effectiveness of monetary incentives and fact-checking tips in helping users recognise the scientific validity of social media content. We found strong evidence that incentivising participants increases the accuracy of their evaluations; we also found evidence that fact-checking tips increase accuracy when the source of the information is unknown. These results suggest a promising role for attention and motivation in the fight against scientific disinformation.