Systematic review raises doubts about the effectiveness of framing in climate change communication

Ambitious climate policy requires acceptance by millions of people whose daily lives would be affected in costly ways. How can the mass public be brought on board to prevent a political backlash against costly climate policies? Many scholars regard 'framing' as an effective communication strategy for changing climate beliefs, attitudes, and behaviors. In contrast, skeptics argue that people hold relatively stable opinions and doubt that framing can alter public opinion on salient issues like climate change. We contribute to this debate by conducting a first systematic review of 121 experimental studies on climate and environmental policy framing, published in 46 peer-reviewed journals. We find that the vast majority of these experiments report significant framing effects. However, the robustness of these results cannot easily be checked because few studies make their data publicly available. A survey of framing researchers suggests that when scholars successfully publish non-significant effects, these were typically bundled together with other, significant effects. Re-analysis of studies focusing on framing differences by partisanship (a key driver of climate change attitudes) also shows that these effects are often not robust when accounting for omitted interaction bias. To improve confidence in climate communication research, we propose some best-practice standards, including preregistration of study designs, publication of replication materials, and use of advanced post-design solutions.


Emphasis framing occurs when actors use messages to alter people's preferences by changing the presentation of an issue or an event 1,2. In climate policy, politicians or other stakeholders may emphasize specific subsets of preexisting arguments, such as economic or health benefits of climate change mitigation 3-5, in an attempt to influence public opinion in favor of (or against) climate action.
Is framing an effective climate communication technique to alter public opinion about climate change? Many studies on climate communication suggest that strategic emphasis framing can effectively influence public opinion because it safeguards individuals' identities by appealing to their existing values and prior beliefs 6-11. Framing theory holds that the effectiveness of framing in altering people's attitudes varies according to whether the related information is stored in individuals' memories (i.e., is available), is retrievable (i.e., is accessible), and is evaluated as appropriate (i.e., is applicable) in a given situation 2,12. The framing literature also builds on a bounded rationality model 13 and often assumes that citizens have limited capacity to process information systematically 2,14-16. From this perspective, individuals use frames as simple heuristics to minimize cognitive effort when forming policy attitudes 12,17,18.

Most framing studies on climate communication look at heterogeneous framing effects, i.e., variation of framing effects across population subgroups. According to directional-motivated reasoning models 1, framing political messages around prior beliefs and values can reduce cognitive dissonance 19,20 and increase framing effects. For example, empirical studies have shown that individuals perceive frames tailored to their ideological core beliefs as less threatening. Accordingly, many studies (especially in polarized political contexts such as the United States) assume that frames aligned with citizens' ideologies and party identification are more effective at altering climate policy attitudes 1,21-23.

Empirical evidence for the effect of framing is primarily generated through experiments embedded in survey, field, or lab studies. Typically, study participants are randomly confronted with messages emphasizing subsets of arguments or aspects related to an issue.
The aim is to assess how these different framing treatments alter respondents' climate beliefs, attitudes, and behaviors, particularly across population subgroups. For example, Bernauer and McGrath 5 as well as Bain et al. 4,24 randomly assigned individuals to different messages that either emphasize the risks of failing to combat climate change (control frame) or highlight different co-benefits of climate mitigation, such as economic, community-building, and health benefits (treatment frames), to study if framing climate mitigation policy around co-benefits instead of risks increases public support. While many researchers (see e.g., Bain et al. 4,24) presume framing to be an effective communication technique for altering mass public opinion and behavior concerning climate change 1,4,7,24-26, some scholars (see e.g., Bernauer and McGrath 5) have expressed doubts 5,16,27-32. Skeptics argue that on salient and contested issues, such as climate change, people are likely to hold relatively stable, consciously formed preferences and cannot be easily manipulated through simple framing 5,16,27-29,32. Some also suspect a bias against reporting non-significant effects in the current framing literature 17,32. They criticize the use of established experimental designs and statistical methods that involve risks of producing weak and noisy effects with low external validity, especially when studying heterogeneous framing effects across population subgroups 17,33-35.

We contribute to this debate by offering a first systematic review of existing framing experiments on climate and environmental issues. Given the prominence of discussions about the robustness of partisan framing effects across ideological subgroups, we also re-analyze data from a set of published studies using (compared to most published work to date) more advanced statistical methods.
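The basic analysis behind such randomized framing experiments can be sketched with synthetic data. The frame labels, effect size, and 1-7 support scale below are illustrative assumptions, not values taken from any reviewed study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-arm framing experiment: respondents are randomly shown
# either a risk frame (control) or a co-benefit frame (treatment) and then
# rate climate policy support on a 1-7 scale. All numbers are synthetic.
n = 1000
treated = rng.integers(0, 2, size=n)          # random assignment
true_effect = 0.15                            # assumed small frame effect
support = 4.0 + true_effect * treated + rng.normal(0.0, 1.5, size=n)
support = support.clip(1, 7)

# Difference-in-means estimate of the average treatment effect (ATE),
# with the usual unpooled standard error
y1, y0 = support[treated == 1], support[treated == 0]
ate = y1.mean() - y0.mean()
se = np.sqrt(y1.var(ddof=1) / len(y1) + y0.var(ddof=1) / len(y0))
print(f"ATE = {ate:.3f}, SE = {se:.3f}, z = {ate / se:.2f}")
```

With a standard error near 0.09 at n=1000, effects of this assumed size hover around conventional significance thresholds, which is one reason single-survey framing estimates can be noisy.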
Finally, we provide guidance for pre- and post-design solutions that could help improve climate communication research.

Review of framing experiments on climate and environmental issues

Our review is based on the PRISMA identification standard 36 (see further details in the Methods section). We identified 121 studies published in 46 peer-reviewed journals between 2007 and 2020, all of which use an experimental design to study the effects of different types of framing treatments on individuals' climate and environmental beliefs, attitudes, and behaviors (see Methods and SI for the complete list of studies). While most studies we consider relate specifically to climate change, many also include treatment groups and dependent variables related to other environmental issues, such as air pollution. We decided to include all these studies to increase the scope of our findings. Based on the experimental stimuli used in these 121 studies, we classified them into six climate and environmental framing research categories (see SI-Table 1).

Our primary goal is to review existing framing experiments on climate and environmental issues and assess the robustness of reported results on the effectiveness of framing as a strategy for shaping public opinion. Figure 1 provides an overview of our review's results (see Methods and Supplementary Information, SI-Table 2 for further details). Approximately 92 percent (n=111) of the framing studies we reviewed report significant main framing effects. Only 7 percent (n=9) report non-significant main effects, and 1 percent (n=1) does not report any main effects. Around 20 percent (n=24) of all studies do not report or discuss any heterogeneous treatment effects (e.g., interactions between participants' characteristics, such as party ideology, and framing treatments). Most studies rely on large samples (often n>1000) that aim to be representative of a country's population.

Bundling non-significant and significant effects to achieve publication

One concern that arises in view of such a large proportion of studies finding statistically significant framing effects is that there may be a file-drawer problem, where only significant effects are published 17. To assess how the authors of these published framing experiments experienced the publishing process and dealt with non-significant framing effects they encountered, we implemented an online survey (see SI-Section V). We contacted all 173 authors of the 121 publications via email and received a total of 63 responses (a response rate of 36 percent). We find that around 80 percent (n=50) of the respondents had also identified non-significant effects in their framing experiments. Around 60 percent (n=38) of these authors tried to publish their results, including non-significant effects, in peer-reviewed journals. Only 63 percent (n=24) of these authors were able to publish studies with non-significant effects successfully.
However, according to these authors, in most cases publishing their findings was only possible when non-significant results were bundled together with other, significant effects. The observed gap between the small number of published non-significant framing effects (see Figure 1 above) and the substantially larger number of non-significant framing effects reported by the surveyed authors therefore strongly suggests a publication bias towards significant treatment results.

File-drawer problem and lack of publicly available data
Previous research has also highlighted a potential 'file-drawer problem', i.e., the under-reporting of non-significant results 17. Assessing this problem's existence and magnitude would require public access to the data and a re-analysis of the original study results. However, only 23 percent (n=28) of the 121 articles we reviewed made their data publicly available. In addition, of the 93 reviewed articles whose data was not published, we obtained data for 29 studies by contacting authors via email (i.e., overall, we could not get access to the data of more than 53 percent (n=64) of all reviewed studies). The large number of experiments that report significant framing effects without publishing data thus raises significant barriers for researchers attempting to assess the robustness of published results. For example, extra and often unsuccessful efforts to obtain access to data increase the costs of systematically re-analyzing existing studies, assessing the robustness of their results, and estimating the size of the potential file-drawer problem.

Re-analyzing framing effects to check for omitted interaction bias

For the subset of studies with publicly available data, we compare effects estimated using both classical OLS and the more advanced LASSOplus 30. LASSOplus allows for simultaneous estimation of subgroup effects for all included pretreatment covariates (e.g., age, education, income, and gender) and for regularization of insignificant effects to avoid overfitting (see further details in the Methods section).

While all of the original studies report significant partisan subgroup effects when using OLS (see x-axis of Figure 2), we find that for the vast majority of re-analyzed studies (9 out of 10) partisan subgroup effects are not statistically distinguishable from zero when using LASSOplus (see y-axis of Figure 2). The exception is Schuldt, Konrath and Schwarz 42, where reframing has a significant effect amongst Republicans, even when using the LASSOplus method.

In addition to assessing the robustness of published subgroup effects by partisanship, we also explored other potential subgroup effects (e.g., by age, education, income, gender) that were not the focus of the original studies. However, in this explorative analysis of heterogeneous framing effects, we also do not find support for robust variation in framing effects across different subgroups (see SI-Tables 3-12). Overall, our re-analysis of heterogeneous framing effects with more advanced statistical methods shows that the differences in framing effects detected by the original analyses are not robust and should not be considered causal due to omitted interaction bias.

Pre- and post-design stage solutions for future research on framing

Researchers can use a number of potential solutions to increase the validity and robustness of their experimental framing results. These solutions can be applied both when designing framing experiments and when analyzing the experimental data. In the following, we discuss some pre-design and post-design stage solutions that could improve confidence in climate communication research focused on framing.
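The logic of this robustness check can be illustrated with a small simulation. LASSOplus itself is a Bayesian estimator implemented in R; the sketch below substitutes a plain coordinate-descent lasso as a stand-in, and all variable names, effect sizes, and the data-generating process are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data: the frame has NO true partisan subgroup effect, but
# partisanship correlates with age, and age IS a true moderator. A model
# that omits the treat x age interaction then wrongly attributes effect
# heterogeneity to partisanship (omitted interaction bias).
n = 4000
treat = rng.integers(0, 2, n).astype(float)
age, educ = rng.normal(0, 1, n), rng.normal(0, 1, n)
party = (0.6 * age + 0.4 * educ + rng.normal(0, 1, n) > 0).astype(float)
y = 0.3 * treat + 0.4 * treat * age + rng.normal(0, 1, n)

# Naive OLS with only the treat x party interaction
X = np.column_stack([np.ones(n), treat, party, treat * party])
naive = np.linalg.lstsq(X, y, rcond=None)[0][3]

def lasso(X, y, lam, n_iter=300):
    """Coordinate-descent lasso on standardized columns (stand-in for LASSOplus)."""
    X = (X - X.mean(0)) / X.std(0)
    y = y - y.mean()
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            r = y - X @ beta + X[:, j] * beta[j]                # partial residual
            rho = X[:, j] @ r / len(y)
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0)   # soft-threshold
    return beta

# Regularized model with ALL treat x covariate interactions included
covs = np.column_stack([party, age, educ])
X_full = np.column_stack([treat, covs, treat[:, None] * covs])
beta = lasso(X_full, y, lam=0.05)

print(f"treat x party: naive OLS = {naive:.3f}, lasso = {beta[4]:.3f}")
```

The lasso coefficient is on the standardized scale, so only its shrinkage toward zero, not its raw magnitude, should be compared with the OLS estimate: once all interactions compete for the effect, the spurious partisan term is regularized away while the true age interaction survives.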

Pre-design stage solutions
First, while different types of frames have been subject to empirical evaluation, our review shows that most of these experiments were embedded in surveys at one point in time and in one specific country, mostly the United States.

Post-design stage solutions
Climate communication researchers should also assess the robustness of their framing effects after implementing their experiments, using more advanced statistical methods. First, as shown in our re-analysis of partisan subgroup effects, many published framing effects are significant but may not be robust, running the risk of so-called type-S and type-M errors 51. A type-S error refers to the probability that an estimate's sign is in the wrong direction, i.e., finding a positive effect even though the true effect is negative. A type-M error quantifies the magnitude of an overestimated effect, i.e., how much larger it is than its true value. Using the obtained treatment effect estimates and associated standard errors, researchers can conduct post-design power calculations and calculate the degree to which their inferences are at risk from type-S and type-M errors.

Second, our re-analysis raises doubts about the substantive meaning of the size of published framing effects. Researchers can move beyond null hypothesis testing to test whether the estimated treatment effect is substantively meaningful 52-54. Equivalence tests are a prominent approach for doing so. Originating in biostatistics, but increasingly adopted in the social sciences, "two one-sided tests" (TOSTs) allow researchers to formally test whether the estimated treatment effect is statistically significantly different from a non-meaningful effect specified by the researcher. For example, for a researcher defining a meaningful change in support for an environmental policy as 1%, a treatment effect of 0.5 with a 90% confidence interval of (0.25, 0.75) would constitute a statistically significant "non-meaningful" effect.
While placing a greater burden on the researcher, by explicitly specifying what constitutes a meaningful effect and conducting additional analyses, this approach would increase (skeptical) readers' confidence that the framing effects identified are substantial and worthy of further research.

Third and finally, as demonstrated in our re-analysis of partisan framing effects, researchers can use more advanced statistical methods 34,37,38, typically based on machine learning algorithms, to check their results' sensitivity to model misspecification and potential omitted interaction bias. This approach would increase the robustness and credibility of the obtained findings. As an example, we illustrate in the supplementary materials (SI-Section II) how to employ these post-design solutions when re-analyzing data from a study on the effect of co-benefit framing on environmental policy support 5. This illustration underscores that many published framing experiments are at risk of overestimating effects (type-M error) that are, substantively speaking, of negligible size and not robust when using more advanced statistical methods.
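A minimal version of such an equivalence test can be written directly from normal-distribution quantities. The sketch below uses the illustrative numbers discussed above (estimate 0.5 with a 90% CI of (0.25, 0.75), hence SE ≈ 0.152) and assumes, purely for illustration, equivalence bounds of ±1.

```python
from math import erf, sqrt

def norm_cdf(z: float) -> float:
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def tost_p(estimate: float, se: float, low: float, high: float) -> float:
    """Two one-sided tests (TOST) for equivalence.

    The null hypothesis is that the true effect lies OUTSIDE [low, high];
    a small p-value supports equivalence, i.e., a 'non-meaningful' effect.
    """
    p_upper = norm_cdf((estimate - high) / se)       # H0: effect >= high
    p_lower = 1.0 - norm_cdf((estimate - low) / se)  # H0: effect <= low
    return max(p_upper, p_lower)

# Estimate 0.5 with 90% CI (0.25, 0.75) implies SE = 0.25 / 1.645;
# bounds of +/-1 encode the researcher's definition of a meaningful change.
se = 0.25 / 1.645
p = tost_p(0.5, se, -1.0, 1.0)
print(f"TOST p = {p:.4f}")  # small p: effect significantly within the bounds
```

With tighter bounds (say ±0.4), the same estimate would fail the test, because the upper one-sided null can no longer be rejected; the researcher's choice of bounds therefore does real inferential work and should be stated in advance.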

The findings reported in this paper raise doubts about the effectiveness of framing in climate communication. They point to a potential risk of over-reporting significant results 17. A likely bias against publication of non-significant findings is unfortunate, given the manifold ways that researchers and practitioners could learn from such results in terms of when and why framing does or does not work. However, our results do not suggest that framing per se is ineffective at influencing the public's beliefs, attitudes, and behaviors. Instead, they suggest that framing effects, in the form they are currently studied in climate communication research, are often of smaller magnitude and less robust than assumed. To identify more robust and meaningful framing effects, we need to reconsider the empirical approaches and statistical methods used in climate communication research focused on framing. Exploring effective climate communication strategies requires that practitioners and researchers collaborate in more field-embedded and realistic transdisciplinary projects. Future research needs to embrace the full spectrum of available methods and engage in more cautious, but often more effortful, empirical approaches. In doing so, researchers should follow best-practice standards, the most important of which are preregistration of study designs, publication of replication materials, and advanced post-design solutions to prevent over-reporting of weak effects. Future climate communication research should critically reflect on the limits of framing and employ the outlined best-practice standards in order to provide useful policy recommendations about how to promote ambitious policies to combat climate change.

A systematic review of framing studies
In line with the "Preferred Reporting Items for Systematic Reviews and Meta-Analyses" (PRISMA) 36, we systematically reviewed framing studies in the fields of environmental politics, economics, and psychology in three steps.

First, we conducted a scoping analysis of environment-related framing experiments published in peer-reviewed scientific journals, searching Google Scholar, Web of Science, and personal databases using the following search string: (("emphasis fram*" OR "issue fram*" OR "policy fram*" OR "refram*" OR "fram* experiment" OR "information treatment" OR "communication" OR "message" OR "priming" OR "persuasive information" OR "argument") AND (("survey" AND "experiment") OR ("field" AND "experiment") OR ("lab*" AND "experiment")) AND ("climate change" OR "environment")). In addition, we used forward and backward snowballing to identify relevant framing experiments via the citations and reference lists of the reviewed articles. We limited the scope to studies published in or before 2020; all relevant studies we identified were published between 2007 and 2020.

Second, during our scoping analysis, and in line with the PRISMA standard 36, we identified 121 peer-reviewed articles in 46 social science journals that we classified as framing experimental studies in the fields of environmental politics, economics, and psychology (see PRISMA scheme below). The PRISMA standard aims to make the reporting of systematic reviews transparent and comprises an evidence-based minimum set of reporting items. We only included studies that randomly varied emphasis framing treatments and assessed their effects on individuals' environmental beliefs, attitudes, or behaviours. We therefore included studies that varied the information's connotation, but excluded so-called equivalence framing experiments.
In contrast to emphasis framing, equivalence framing uses different but logically equivalent phrases to label and describe an issue. An example of an equivalence frame would be to state that a person has a 20% risk of dying or an 80% chance of surviving. The rationale for focusing our review on emphasis frames is that equivalence frames are a less prominent strategy in climate communication (research) and policymaking 30. Policymakers typically vary the emphasis on a specific subset of relevant arguments in a policy debate, rather than using logically equivalent phrases, to alter public opinion. Moreover, we excluded studies that did not use a survey-, lab-, or field-experimental design.

We trained three research assistants as coders. In addition, three of the authors also coded articles and double-checked the coding results. In cases of coding-related uncertainty, we asked coders to leave comments. The authors then independently reviewed these comments and came to individual decisions, and subsequently discussed the pending cases to reach a final decision.

We also qualitatively analyzed the sampled articles and inductively created six framing-type groups, as presented in Figure 1 of the paper. These are "Issue/Solution Frames", "Value/Norm/Attribution Frames", "Re-Labeling Frames", "Psychological Distance Frames", "Consensus/Uncertainty Frames", and "Source Cue Frames". The definition of each category and relevant examples are listed in SI-Table 1 in the appendix. The objective of this typology is to identify the central focus of the treatment conditions in each framing experiment. For studies that contained two types of manipulations, we coded 0.5 for each category.

Issue/Solution
The treatment provides different issue interpretations of a given problem or suggests different solutions to a given problem (e.g., experimental manipulation of information suggesting the solution to climate change will have to be based on technological innovation or lifestyle changes)

Value/Norm/Attribution
The treatment emphasizes different values or social norms embodied in actions to address a given problem (e.g., experimental manipulation of information indicating economic benefits vs. moral obligations of reducing carbon emissions)

Re-Labeling
The treatment uses different wordings to describe the same problem (e.g., experimental manipulation of information referring to 'climate change' vs 'global warming')

Psychological Distance
The treatment indicates different temporal or spatial distances from which to consider a given problem (e.g., experimental manipulation of information suggesting climate change is a future vs. current problem, or a global vs. local problem)

Consensus/Uncertainty
The treatment highlights the varying degree of consensus or uncertainty on a given problem (e.g., experimental manipulation of information suggesting scientific consensus vs. lack thereof on climate change)

Source Cue
The treatment highlights different sources of the same message on a given problem (e.g., experimental manipulation of information on the severity of climate change suggested by scientists vs. politicians, or by different political parties)

Figure 1 shows the distribution of the 121 studies across these six categories. The first category of studies investigates the impact of issue and solution frames on study participants' climate and environmental beliefs, attitudes, and behaviors (see SI-Table 1 and SI-Figure 1, 'Issue/Solution'). This is the largest category and comprises 42 percent (n=50) of all reviewed studies. Issue and solution frames often emphasize environmental risks and co-benefits of environmental protection or climate mitigation. For example, some studies 4,24 in this category highlight that emphasizing co-benefits of climate mitigation (such as technological innovation, green jobs, community building, or health improvements) could foster public support for ambitious mitigation policies.

The second category of studies focuses on potential effects of morally loaded frames that emphasize personal values and social norms, and attribute responsibility for environmental problems and solutions (see SI-Table 1 and SI-Figure 1, 'Value/Norm/Attribution'). We grouped values, norms, and attribution together because all these framing types have an explicit moral and normative dimension. This category accounts for 20 percent (n=24.5) of the studies we reviewed. For instance, research in this category finds that moral and normative frames can be more effective at motivating environmentally friendly behavior than economic appeals that focus on individual self-interest 26,56.

The third category of framing experiments accounts for 13 percent (n=15.5) of the studies we reviewed (see SI-Table 1 and SI-Figure 1, 'Psychological Distance').
Such research examines the impact of manipulating the perceived psychological distance to environmental impacts. For example, some studies 57,58 vary the spatial, social, and temporal distance of climate change impacts to assess whether people support ambitious mitigation more when they perceive climate change as a proximate problem.

The fourth-largest category of framing studies accounts for 12 percent (n=15) (see SI-Table 1 and SI-Figure 1, 'Re-Labeling'). This research seeks to re-label specific terms or use visual cues to influence public opinion. For example, some studies 59,60 find that, in the United States, Republicans are more concerned about 'climate change' than about 'global warming'.

The fifth category of frames accounts for 8 percent (n=10) and concentrates on consensus and uncertainty (see SI-Table 1). The sixth category of frames relates to source cue effects (see SI-Table 1).

To illustrate putting our recommendations into practice, we re-analyze a prominent study on the effect of co-benefit framing on environmental policy support 5. Bernauer and McGrath (ID11) conduct a comprehensive study that evaluates the average and heterogeneous effects of various frames on three outcomes. Their ultimate conclusion is that reframing is unlikely to significantly boost public support for climate policy, as the vast majority (135) of these effects are insignificant.

While the overwhelming majority of effects are insignificant, there is a small number where this is not the case. For example, a frame emphasizing how environmental policy can lead to a "good society" is found to cause a statistically significant increase in environmental policy support among Independents (effect = 0.32, std. error = 0.17, p < 0.06).

We thus use this effect as an example for assessing the robustness of framing effects generally, by following the recommendations we have outlined previously.
We do so because we consider this example a hard case for testing the robustness of framing effects. In essence, if we cannot find support for the few positive framing effects reported in Bernauer and McGrath's 5 critical assessment of framing effects, this raises doubts about the effectiveness of framing in climate communication more generally.

First, we assess the potential for type-M and type-S errors based upon this estimated treatment effect. For any given treatment effect estimate, the smaller the standard error, the lower the expected type-M error; conversely, small treatment effects with large standard errors have a high probability of type-S error. In this example, we find that the probability of a type-S error is incredibly small, approximately 0.0004, meaning it is extremely unlikely that the true effect is in the opposite direction, i.e., negative. The type-M error is estimated to be approximately 1.35, meaning that the magnitude of the effect we can uncover is, on average, 35% larger than the true effect.

Second, we assess whether the estimated treatment effect is substantively meaningful by using equivalence tests. As we do not have strong prior beliefs about what constitutes a non-meaningful change in the standardized policy support variable, we base our equivalence regions on standard rules of thumb for interpreting standardized effect sizes, also known as Cohen's d. We construct TOSTs for the commonly used definitions of small (±0.2), medium (±0.5), and large (±0.8) effects.

SI-Figure 2: Equivalence tests for small (0.2), medium (0.5), and large (0.8) standardized effect sizes.

SI-Figure 2 displays the results of these equivalence tests. The tests suggest that the estimated treatment effect is consistent with being interpreted as a small or medium-sized effect: the null hypothesis of non-equivalence cannot be rejected for these bounds.
However, the interpretation of the treatment effect as a large effect can be rejected, as the confidence interval for the estimated effect lies entirely within the defined bounds.

Third, we assess whether the estimated effect is prone to omitted interaction bias by using LASSOPlus 34. To do so, we include all the potentially relevant subgroups previously analyzed by Bernauer and McGrath. Doing so leads to this estimated effect being set to zero, indicating that it is ultimately not a robust treatment effect. No non-zero treatment effects, whether average or subgroup-based, are found in this estimation. This analysis ultimately supports Bernauer and McGrath's contention that simple reframing is unlikely to affect climate policy support.

In summary, following our recommendations provides an important set of steps for assessing the robustness and significance of framing effects. Estimating the type-M and type-S error rates suggests that while the probability of finding the incorrect sign is low, the effect is likely overstated in its magnitude. Equivalence tests suggest that the effect is consistent with the common interpretation of a medium-sized standardized effect, but it is not equivalent to a large effect size. However, this effect is ultimately not robust when accounting for potential omitted interaction bias using a regularized estimator, calling into question its likely replicability and generalizability outside of the survey.
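The type-S and type-M figures for an example like this can be approximated with a short simulation in the spirit of Gelman and Carlin's retrodesign calculations. Treating the point estimate (0.32, SE 0.17) as the hypothetical true effect is our illustrative assumption, so the output will only roughly match the values reported above.

```python
import numpy as np

rng = np.random.default_rng(2)

# Retrodesign-style simulation: draw many hypothetical replication
# estimates around an assumed true effect and ask what happens among
# the ones that reach statistical significance.
true_effect, se, z_crit = 0.32, 0.17, 1.96  # z_crit: two-sided 5% level

draws = true_effect + se * rng.normal(size=1_000_000)
sig = np.abs(draws / se) > z_crit           # "statistically significant" draws

power = sig.mean()
type_s = (draws[sig] < 0).mean()            # wrong sign, given significance
type_m = np.abs(draws[sig]).mean() / true_effect  # exaggeration ratio

print(f"power = {power:.2f}, type-S = {type_s:.4f}, type-M = {type_m:.2f}")
```

Under these assumptions the simulation yields power below 0.5, a type-S probability on the order of 10^-4, and a type-M ratio well above 1, qualitatively reproducing the pattern described in the text: the sign is almost certainly right, but any significant estimate is likely to overstate the true effect.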