The ring of truth: Irrelevant insights make worldviews seem true

Our basic beliefs about reality can be impossible to prove and yet we can feel a strong intuitive conviction for them, as exemplified by insights that imbue an idea with immediate certainty. Here we presented participants with worldviews such as “people’s core qualities are fixed”, and simultaneously elicited an aha moment. In the first experiment (N = 3,000), which included a direct replication, participants rated worldview beliefs as truer when they solved anagrams and experienced aha moments. A second experiment (N = 1,005) showed that the worldview statement and the aha moment must be perceived simultaneously for the insight misattribution effect to occur. These results demonstrate that artificially induced aha moments can make worldviews seem truer, possibly because humans rely on feelings of insight to appraise an idea’s veracity. Feelings of insight are therefore not epiphenomenal and should be investigated for their effects on decisions, beliefs, and delusions.


Introduction
The philosopher Rob Sips (2019) said that aha moments played an essential role in the deterioration of his psychosis. He describes experiencing an "accelerating stream" (p. 2) of aha moments that revealed the world from so many different perspectives that his worldview simply could not withstand the assault: "This process, in my experiences, was wrecking what I considered to be my 'personal worldview'... [the insights] undermined or 'derealized' how I looked at things before." (p. 3). The idea that aha moments (sudden feelings of pleasure and certainty that accompany a new idea) can arise from a change in perspective goes back almost a century (Duncker, 1945; Maier, 1930) and is the basis of how psychologists elicit insight in the laboratory (Ohlsson, 1984). Like the switching perspectives in the duck-rabbit illusion, aha moments mark a novel discovery in which pre-reflective assumptions change and information is seen in a new light (Schooler & Melcher, 1995). The experience of Rob Sips (2019) exemplifies the dramatic impact of this process on higher-order worldviews. For Sips, the uncontrolled cascade of aha moments marked the deconstruction of his beliefs, leading to an unstable grasp on reality, like a multidimensional duck-rabbit illusion that will not stop shifting.
Sips' (2019) account is dramatic, but it is possible that aha moments are simply epiphenomenal, marking but not causing changes in beliefs, much like the steam-whistle of an engine. Under this view, the aha moment is a feeling that correlates with the discovery of a new perspective or solution but has no impact on decision-making or the selection of new ideas (Klein & Jarosz, 2011). Alternatively, as Sips' experience suggests, the aha moment may be causally potent. For example, aha moments might provide feedback to the conscious agent about whether an idea is likely to be a good one. Under this view, the aha experience convinces the agent that the new perspective is true. Because the processes that precede aha moments can be pre-reflective or implicit (Bowden, 1997; Grant & Spivey, 2003; Hattori et al., 2013; Laukkonen & Tangen, 2017; Laukkonen et al., 2021; Maier, 1931; Salvi et al., 2015; Salvi & Bowden, 2020; Schunn & Dunbar, 1996; Sio & Ormerod, 2009), aha moments might provide information that is not otherwise accessible. Thus, the aha moment could plausibly provide information about whether an idea can be trusted in the absence of access to the processes that produced it, much like hunger or fear can signal something important about the state of one's inner or outer world (Damasio, 1996; Schwarz, 2012).
Consistent with this possibility, preliminary evidence suggests that aha experiences provide important information. For instance, aha moments correspond to more accurate solutions to problems (Danek & Wiley, 2017; Hedne et al., 2016; Laukkonen et al., 2021; Salvi et al., 2016; Threadgold et al., 2018; Webb et al., 2016). The correspondence between the magnitude of the aha feeling and accuracy has also been captured using a measure of grip strength: the harder the participants (unintentionally) squeezed the device during the spontaneous aha moment, the more likely it was to be correct. Although these findings do not demonstrate that aha moments are causally agentive, they suggest that aha moments carry valuable information that could be useful for decision-making. There is also more direct evidence that aha moments can affect decisions. For example, aha moments that occur when solving anagrams lead to a kind of false memory, wherein participants report having already seen the word even if they had not (Dougal & Schooler, 2007). Using a similar paradigm, another study showed that irrelevant aha moments make mundane facts more believable (Laukkonen et al., 2020).
In the experiments that follow we test, replicate, and extend the following hypothesis: that the seemingly trivial aha moment elicited by solving anagrams can increase the perceived veracity of important beliefs that serve as the basis of people's worldviews. Because feelings of insight carry useful information about the quality of the associated ideas, when an aha moment is experienced it should lead to a truer appraisal of the worldview that accompanies it. Metaphorically, the aha moment "lights up" more content than it should, making the temporally coincident but unrelated belief seem true. According to this possibility, Rob Sips' slide into psychosis may not have been driven by accelerating insights about the state of the world. Rather, the aha moments themselves may have contributed to the unraveling of his belief system due to the new perspectives that they happened to reinforce. According to this possibility, the aha experience may act as a heuristic for appraising the quality of ideas appearing in mind, which we have previously termed the Eureka Heuristic.

Experiment 1: Demonstration And Replication
This experiment was approved by the University of California, Santa Barbara, Human Subjects Committee, in accordance with the Declaration of Helsinki.

Design & materials
The experiment had two within-subject variables: 2 (Problem: solved or unsolved) × 2 (Aha Experience: present or absent), and one between-subjects factor (Anagrams: present or absent). The dependent measure was truth judgments on a 12-point scale ranging from 1 (definitely false) to 12 (definitely true).
We created 15 worldview claims, none of which were objectively demonstrable as true or false. Each claim was constructed such that the last word of it was key to its meaning (see procedure for an example). The worldview claims were derived conceptually from 'The Psychology of Worldviews' (Koltko-Rivera, 2004). We also created 15 anagrams derived from the last word of each claim (see Appendix A for stimuli). Keywords were re-organised into anagrams using a random scramble function, then iteratively pilot tested and adjusted manually until they were neither too difficult nor too easy (approximating 50% solution rates).
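The random scrambling step can be sketched as follows. This is a minimal illustration in Python rather than the script used for the study: the function name `scramble_keyword`, the seed, and the rule of rejecting a shuffle that reproduces the keyword are all assumptions, and the subsequent manual adjustment based on pilot solution rates is not modelled.

```python
import random


def scramble_keyword(word: str, rng: random.Random, max_tries: int = 100) -> str:
    """Return a random rearrangement of the letters of `word` that differs from it."""
    letters = list(word)
    if len(set(letters)) < 2:
        # A word made of a single repeated letter cannot be rearranged.
        return word
    for _ in range(max_tries):
        rng.shuffle(letters)
        candidate = "".join(letters)
        if candidate != word:
            return candidate
    # Fallback (hypothetical): rotating by one letter always changes a word
    # that contains at least two distinct letters.
    return word[1:] + word[0]


# Example keyword taken from the claim "Free will is a powerful illusion".
anagram = scramble_keyword("illusion", random.Random(2023))
```

A seeded generator keeps the stimuli reproducible across runs; the difficulty of any particular scramble would still need to be checked empirically, as described above.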

Participants and procedure
This study had two samples of 1,500 participants recruited by Critical Mix to match the demographics of the U.S. We used the first sample of 1,500 to test our hypotheses, and then the second 1,500 to assess the replicability of the findings. We determined that 1,500 participants would provide sufficient power to detect a Cohen's d effect size of .4, based on Laukkonen et al. (2020). All participants were randomly assigned to either the Anagram or No Anagram (control) condition via written instructions. The procedure for the Anagram condition is illustrated in Fig. 1 below. In the No Anagram condition, participants were simply presented with the complete propositions (e.g., "It is useless to pursue justice") and judged how true they were.

Results
In our analyses, we rely primarily on two statistical approaches. For between-subjects comparisons across conditions, we use Welch's t-tests. For within-subjects analyses, we use multilevel regression models, which account for the hierarchical nature of the data with random intercepts and random slopes for participants and trial numbers (Westfall et al., 2014). For the multilevel regression models with one binary fixed effect, we present the Cohen's d of the fixed effect, which we calculate according to Westfall et al. (2014). These models assess the influence of the predictors aha moment (Y/N) and correct anagram solution (Y/N) with truth scores as the DV, as well as the predictor aha moment with correct solutions as the DV. We report statistics in accordance with similar previous work (e.g., Ding et al., 2021). Between-subjects analyses were conducted using the t.test function within the stats package in R (R Core Team, 2013). All within-subjects analyses were conducted using the lmer function within the lme4 (Bates et al., 2007) package in R, and effect sizes were calculated using the r.squaredGLMM function within the MuMIn package (Bartoń, 2019). Our data and analysis scripts are available on the OSF (demonstration data: https://osf.io/2duq5, replication data: https://osf.io/rzg3c, complete analyses: https://osf.io/wycmg).
Demonstration: First 1,500 Participants
After excluding 247 participants from analysis for failing to solve any anagrams (N = 82), solving all anagrams (N = 4), experiencing no aha moments (N = 178), or experiencing aha moments on all trials (N = 23), 1,250 participants were included in the analyses. 443 participants were in the Anagram condition and 807 were in the No Anagram condition (note: we address the differential dropout rate in Experiment 2). On average, participants correctly solved the anagrams 37% of the time (SD = 21%), and the anagrams elicited aha moments 36% of the time (SD = 22%). Using a multilevel regression model with aha moments as a fixed effect and participants as a random effect, we found that anagrams that elicited aha moments were more likely to be solved correctly (23% solved, SD = 18%) than anagrams that did not elicit aha moments (14% solved, SD = 18%), b = .41, t = 16.10, p < .001, d = .72.
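The four exclusion criteria amount to dropping any Anagram participant whose trials show no variation in solving or no variation in aha reports, since such a participant provides no within-participant contrast for that predictor. The sketch below is one way to express the rule in Python; the function name and the per-trial boolean representation are assumptions, not the authors' code. Because a single participant can meet more than one criterion, the per-criterion counts need not sum to the total number excluded.

```python
from typing import Sequence


def should_exclude(solved: Sequence[bool], aha: Sequence[bool]) -> bool:
    """Flag a participant with no variation in solving or in aha reports.

    `solved[i]` and `aha[i]` are hypothetical per-trial flags for trial i.
    """
    no_variation_in_solving = all(solved) or not any(solved)
    no_variation_in_aha = all(aha) or not any(aha)
    return no_variation_in_solving or no_variation_in_aha


# A participant who solved nothing and never reported an aha moment is
# excluded once, even though two criteria apply.
excluded = should_exclude([False] * 15, [False] * 15)
```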
Truth judgments: Anagrams versus no anagrams (between): As predicted, a between-subjects Welch's t-test revealed that participants' average truth scores in the Anagram condition were higher (M = 6.62; SD = 2.09) than participants' average truth scores in the No Anagram condition (M = 5.75; SD = 2.05), t(781) = 7.54, p < .001, d = .46. Overall, the presence of the anagram (including both solved and unsolved anagrams) increased truth judgments regarding the worldview claims.
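For readers unfamiliar with Welch's test, the statistic and its Welch-Satterthwaite degrees of freedom can be computed from group means, standard deviations, and sample sizes, as in the sketch below. The function name and the toy inputs are illustrative assumptions; the study's analyses were run on participant-level data in R, and applying the formula to rounded summary statistics will only approximate values obtained from raw data.

```python
import math


def welch_t(m1: float, s1: float, n1: int, m2: float, s2: float, n2: int):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    v1 = s1 ** 2 / n1  # squared standard error of the first group mean
    v2 = s2 ** 2 / n2  # squared standard error of the second group mean
    t = (m1 - m2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Toy inputs (not data from the study): equal variances and equal group
# sizes, in which case the degrees of freedom reduce to n1 + n2 - 2.
t_stat, dof = welch_t(1.0, 1.0, 10, 0.0, 1.0, 10)
```

Unlike Student's t-test, this form does not pool the two variances, which makes it a reasonable default when group sizes differ, as they do here.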
Truth judgments: Solutions and aha moments (within): We used a multilevel regression model to test our prediction that claims associated with solved anagrams would be rated as more likely to be true than claims associated with unsolved anagrams, including solving as a fixed effect and participants as a random effect. As predicted, solved anagrams resulted in higher truth ratings (M = 7.06, SD = 2.70) than unsolved anagrams (M = 6.45, SD = 2.15), b = .60, t = 4.97, p < .001, d = .16. We also used a multilevel regression model to test our prediction that claims would be rated as more likely to be true if they were accompanied by aha moments while solving the anagram, including aha moments as a fixed effect and participants as a random effect. As predicted, participants provided higher truth ratings on trials where they reported experiencing an aha moment (M = 7.21, SD = 2.73) than on trials where they did not experience an aha moment (M = 6.35, SD = 2.26), b = .89, t = 6.85, p < .001, d = .23. Finally, we examined aha moments among the subset of correctly solved anagrams. Correctly solved anagrams accompanied by aha moments had higher truth ratings (M = 7.28, SD = 2.92) than correctly solved anagrams without aha moments (M = 6.29, SD = 2.90), b = .89, t = 5.13, p < .001, d = .23.
Replication: Second 1,500 Participants
After excluding 261 participants from analysis for failing to solve any anagrams (N = 123), solving all anagrams (N = 1), experiencing no aha moments (N = 155), or experiencing aha moments on all trials (N = 47), 1,239 participants were included in analyses. 434 participants were in the Anagram condition and 805 were in the No Anagram condition. On average, participants correctly solved the anagrams 36% of the time (SD = 21%), and the anagrams elicited aha moments 37% of the time (SD = 23%). Using a multilevel regression model with aha moments as a fixed effect and participants as a random effect, we found that anagrams that elicited aha moments were more likely to be correctly solved (M = 21%, SD = 17%) than anagrams that did not elicit aha moments (M = 15%, SD = 17%), b = .35, t = 15.25, p < .001, d = .62.
Truth judgments: Anagrams versus no anagrams: As predicted and consistent with the first sample, participants' average truth scores in the Anagram condition were higher (M = 6.81; SD = 2.09) than participants' average truth scores in the No Anagram condition (M = 6.22; SD = 1.92), t(822) = 4.84, p < .001, d = .29. As in the first sample, the presence of the anagram (including both solved and unsolved trials) increased truth judgments.
Truth judgments: Solving and aha moments: We used a multilevel regression model to test our prediction that claims associated with solved anagrams would be rated as more likely to be true than claims associated with unsolved anagrams, including solving as a fixed effect and participants as a random effect. As predicted, solved anagrams resulted in higher truth ratings (M = 7.09, SD = 2.70) than unsolved anagrams (M = 6.69, SD = 2.24), b = .49, t = 4.51, p < .001, d = .13. We also used a multilevel regression model to test our prediction that claims would be rated as more likely to be true if they were accompanied by aha moments while solving the anagram, including aha moments as a fixed effect and participants as a random effect. As predicted, participants provided higher truth ratings on trials where they reported experiencing an aha moment (M = 7.34, SD = 2.71) than on trials where they did not experience an aha moment (M = 6.52, SD = 2.23), b = .91, t = 7.58, p < .001, d = .25. Finally, we conducted the aha moment analysis among the subset of correctly solved anagrams. Correctly solved anagrams accompanied by aha moments had higher truth ratings (M = 7.29, SD = 2.96) than correctly solved anagrams without aha moments (M = 6.45, SD = 2.74), b = .82, t = 4.53, p < .001, d = .23.

Confirmation + Replication
In Fig. 2, we illustrate the combined results of the confirmation (N = 1,250) and the replication (N = 1,239). The figure shows each of the key comparisons: Anagram (present vs. absent), Solving (yes vs. no), Aha (present vs. absent), and Aha for solved anagrams (present vs. absent).

Experiment 2: Aha Misattribution With Delay
This experiment was approved by the University of California, Santa Barbara, Human Subjects Committee, in accordance with the Declaration of Helsinki.

Design & materials
In this study we test the assumption that the aha experience needs to be temporally coincident with the claim, by introducing a "delay condition" with a 10-second interval between solving anagrams and being shown claims. Additionally, we address two limitations of the first experiment. First, to reduce differential dropout across conditions, we standardize the difficulty and completion time of each condition. Second, we introduce two new conditions that shift the order of trial components to rule out potential order effects with regard to anagram solving.
The dependent measure was again truth judgments on a 12-point scale ranging from 1 (definitely false) to 12 (definitely true) regarding the 15 claims from Experiment 1 (see Appendix A). To standardize the difficulty, we provided hints when participants attempted to solve the anagrams. Raw data and materials can be found on the OSF: https://osf.io/vym8r.

Participants and procedure
For this study we had a sample of 1,564 participants recruited by Critical Mix to match the demographics of the U.S. Using the pwr package in R, we determined that 99 participants in each of the two key groups of interest would provide sufficient power (.8) to detect a Cohen's d effect size of .4 for the main analysis (Champely, 2020). Thus, factoring in potential dropouts, 1,564 participants randomly assigned to one of four conditions (illustrated in Table 1) should provide more than sufficient power. Participants were provided written instructions and we included hints alongside anagrams to achieve comparable solving rates for all the conditions. A ten-second delay was provided at the end of each trial.
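The per-group sample size can be approximated with the standard normal-approximation formula for a two-sided, two-sample comparison, n = 2((z_{1-α/2} + z_{power}) / d)². The sketch below is an illustration in Python, not the pwr calculation itself: the function name is hypothetical, α = .05 is assumed because the text does not state it, and the exact t-based calculation typically returns a marginally larger value than this approximation.

```python
import math
from statistics import NormalDist


def n_per_group(d: float, power: float = 0.80, alpha: float = 0.05) -> int:
    """Approximate per-group n for a two-sided, two-sample test of effect size d."""
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)  # quantile matching the desired power
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)


required_n = n_per_group(0.4)  # rounds up to 99 under these assumptions
```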

Results
We generally used the same analytic methods as outlined in Experiment 1, except that instead of using Welch's t-tests to examine the between-subjects effects, we used independent samples ANOVAs with truth ratings as the DV and condition as the factor. Standardizing the difficulty and completion time resulted in similar Ns in the four conditions: Anagram Normal = 149, Anagram Delay = 157, Anagram After Truth = 132, and Anagram After Everything = 121. Based on our preregistered exclusion criteria, 329 participants were excluded for failing to solve any anagrams (N = 158), solving all anagrams (N = 23), experiencing no aha moments (N = 174), or experiencing aha moments on all trials (N = 38). As a result, 676 participants were included in analyses (see Table 1 for a breakdown of the various conditions).
The mean solution rate for anagrams presented alongside hints and an unfinished worldview claim was 53% (SD = 24%). When only the hint was provided, a similar solving rate was found, 57% (SD = 24%), suggesting that the hints helped to balance solving rates. We also successfully equated aha moments across the two key conditions: Anagram Normal condition (M = 44%, SD = 26%), Anagram Delay condition (M = 42%, SD = 26%). Using a multilevel regression model with aha moments as a fixed effect and participants as a random effect, we found that anagrams that elicited aha moments (in the Anagram Delay and Anagram Normal conditions) were more likely to be correctly solved (M = 33%, SD = 24%) than those that did not (M = 22%, SD = 24%), b = .38, t = 13.14, p < .001, d = .65.
Truth judgments: Comparison across conditions (between): Our key prediction was that truth judgments would be lower in the Anagram Delay condition than in the Anagram Normal condition. An independent samples ANOVA revealed a main effect of condition, F(3, 672) = 12.52, p < .001, partial η² = .053. Follow-up Tukey comparisons support our key prediction that a delay would remove the aha misattribution effect: the Anagram Normal condition elicited higher truth ratings (M = 6.94; SD = 2.16) than the Anagram Delay condition (M = 5.98; SD = 1.66), t(672) = 4.60, p < .001, d = .50. Moreover, the Anagram Normal condition elicited higher truth ratings than the Anagram After Truth condition (M = 5.97; SD = 1.61), t(672) = 5.04, p < .001, d = .51, and the Anagram After Everything condition (M = 5.90; SD = 1.72), t(672) = 5.39, p < .001, d = .53. There were no other significant effects, indicating that the order of anagram solving was redundant and that the anagram simply needed to occur at the same moment as the worldview was presented.
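In a one-way design, partial eta squared can be recovered from the F statistic and its degrees of freedom through the identity η²p = (F × df_effect) / (F × df_effect + df_error). The short Python sketch below applies this identity to the omnibus test reported above; the function name is an assumption introduced for illustration.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    explained = f_value * df_effect
    return explained / (explained + df_error)


# Omnibus effect of condition as reported: F(3, 672) = 12.52.
effect_size = round(partial_eta_squared(12.52, 3, 672), 3)
```

Under these inputs the result rounds to .053, which agrees with the reported effect size.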
Truth judgments: Solving and aha moments (within): We used a multilevel regression model to test our prediction that participants in the Anagram Normal condition would rate claims associated with solved anagrams as more likely to be true than claims associated with unsolved anagrams, including solving as a fixed effect and participants as a random effect. We also predicted that participants in the Anagram Normal condition would rate claims as more likely to be true if they experienced an aha moment while solving the anagram. We further predicted that both of these effects would be absent or weaker in the Anagram Delay condition.
Finally, we conducted the aha moments analysis among the subset of correctly solved anagrams. Inconsistent with predictions, the interaction between condition and experiencing aha moments was not significant, b = .33, t = 1.00, p = .32. Nonetheless, simple effects analyses revealed that correctly solved anagrams accompanied by aha moments had higher truth ratings than those not accompanied by aha

Discussion
The present study tested whether incidental aha experiences could influence the perceived veracity of different worldviews. In the first experiment, we found that participants rated worldview statements as truer when they had just attempted to solve anagrams corresponding to those statements. We also found that successfully solving the anagram led to higher truth ratings than failing to solve it. And finally, for the subset of correctly solved anagrams, those that elicited aha moments had the highest truth ratings of all. We then directly replicated this effect. In a second experiment, we manipulated the timing of anagram solving and therefore also aha moments. Here we found that aha moments increase perceived truth only when they occur at the same time as when the worldview is presented. In short, it seems that well-timed artificially induced aha experiences can impact people's assessments of central premises about the world, giving them a ring of truth that they would not otherwise enjoy.
The fact that the feeling of insight has an impact on one's judgments is not itself surprising. There is a long list of domains where feelings influence decisions, including jury decision-making (Semmler & Brewer, 2002), risk judgments (Fischhoff et al., 1978), truth and memory judgments (Dougal & Schooler, 2007; Reber & Schwarz, 1999; Schwarz et al., 2007), and gambling and probability judgments (Loewenstein et al., 2001). But why would the aha experience influence truth ratings about something as seemingly unrelated and fundamental as worldview beliefs? Aha experiences are characterized by an immediate sense of confidence and pleasure in the content of an idea or solution (Danek & Wiley, 2017; Webb et al., 2018). This feeling of certainty is warranted: aha moments tend to correspond to accurate solutions (Danek & Wiley, 2017; Hedne et al., 2016; Salvi et al., 2016; Threadgold et al., 2018; Webb et al., 2016). Because aha experiences tend to be a marker of good ideas, it makes sense that humans have learned to draw on this feeling as a source of information about our beliefs (Laukkonen et al., 2020; Laukkonen et al., 2021). Our findings thus favor the hypothesis that aha moments are not simply epiphenomenal, like the steam whistle of an engine, but have a causal influence on decisions about the veracity of new ideas, like the coal that fuels the engine (Dougal & Schooler, 2007; Laukkonen et al., 2020).
It is worthwhile briefly distinguishing our aha misattribution account from fluency or ease of processing effects (Reber & Schwarz, 1999; Topolinski & Reber, 2010). Under the fluency account, when an anagram is solved, regardless of aha moments, there ought to be an increase in fluency. However, within solved anagrams, the presence (versus absence) of aha moments led to higher truth ratings, thus going above and beyond fluency alone. The fluency account would also presumably predict higher truth values when there are no anagrams to solve, since the presence of anagrams ought to lead to a more disfluent experience overall. Yet, we found the opposite in Experiment 1.
A fruitful path for future work is to investigate the effects of aha on decision-making, particularly in applied contexts. Aha moments can be incorrigible (Hedne et al., 2016), difficult to forget (Danek et al., 2013), and can promote inspiration and drive towards action (Danek & Wiley, 2017). It is also well known that humans often fail to introspect about the true causes of their feelings or actions (Brasil-Neto et al., 1992; Carruthers, 2009; Johansson et al., 2005; Nisbett & Wilson, 1977; Wegner, 2004; Wegner & Wheatley, 1999). Thus, the persuasive insights that humans experience in complex domains such as politics, religion, and relationships may all be influenced by the recursive force of aha moments. Aha moments can mark a valuable new discovery, but if this process breaks down or is misinformed, then they may also perpetuate and entrench false beliefs. A potentially disastrous example of this mechanism in action can be seen in the QAnon phenomenon. Here, one or more unknown individuals set up vague clues for the public to identify patterns across the media, political and presidential proceedings, and other current events, in order to confirm a grand conspiracy in which the former president of the United States was acting behind the scenes to stop a pedophilic cannibalistic cabal. The level of support for the movement is hard to measure, but appears remarkably high given the bizarre nature of the claims (Shanahan, 2021). QAnon provides a potential real-life example of our findings in its demonstration that the way the mind constructs 'insights' is fallible, and yet these insights can incite real and sometimes dangerous behaviour.
Moving forward, we encourage a research program on the underlying mechanisms and predictors of false insights, that is, the circumstances and states of mind in which this usually adaptive heuristic may break down. A paradigm has recently been developed to experimentally elicit false insights, which could yield valuable data for understanding the development of delusions in clinical populations. Consider, for example, the case of John Nash, the Nobel Laureate and mathematician. When he was asked why he believed he was being recruited by aliens to save the world, he said, "...the ideas I had about supernatural beings came to me the same way that my mathematical ideas did. So I took them seriously" (Nasar, 2001). Along with the example of Rob Sips (2019) described earlier, such anecdotes may point to the failure of an otherwise adaptive Eureka heuristic. Under ordinary conditions, the feeling that accompanies a new idea reveals that inaccessible processes have yielded a valuable conclusion. This heuristic view of the aha moment also makes sense evolutionarily, as we often must decide quickly whether a new idea is good or bad; a thorough analysis is not always possible when an audience member asks a challenging question or a hungry lion is at one's heels.

Figure 1
Going from left to right, in each Anagram trial participants were presented with an incomplete claim, for example: "Free will is a powerful ____". Below it was an anagram that completed the claim (e.g., "oinliusl"). When the answer to the anagram was submitted in the text entry box, or the visible 15-second timer ran out, participants were advanced to the next page. On that page, participants saw the completed claim: "Free will is a powerful illusion" and were asked to make a truth judgment about the claim.
Participants then reported whether they experienced an aha moment while solving the anagram.

Figure 2
Combined data of the confirmation and the replication, N = 2,489. The plots (including means and standard deviations) illustrate truth ratings for the different conditions and key comparisons, with p-values and effect sizes. Each large black dot represents the mean and black lines represent ±1 standard deviation. Circles represent individual participants with a random horizontal jitter to aid visualization.