Bridging the Gap: An Empirical Approach to Studying the Social Sustainability of Future Energy Systems

Background: Given the multitude of scenarios on the future of our energy systems, multi-criteria assessments are increasingly called for to analyze and anticipate long-term effects of possible pathways with regard to their environmental, economic and social sustainability. While economic and ecological indicators are covered through energy systems modelling and life cycle sustainability assessments, approaches to the social sustainability of future energy systems remain methodologically under-developed. Previous studies have either focused only on the social acceptance of single energy technologies or used expert-based environmental and economic indicators with social implications.

Approach and results: We argue that in order to gather empirical insights on the social sustainability of future energy systems and to integrate it in multi-criteria assessments, citizens’ preferences and values need to be more systematically analyzed while informing their decisions more transparently with full life cycle data. Given the lack of theoretical underpinnings of sustainability and of empirical insights into citizens’ perceptions of sustainability with respect to future energy systems, we further argue that an explorative research design is needed. Therefore, next to six focus groups, we conducted a discrete choice experiment. The method is currently becoming more popular to analyze individuals’ preference structures for energy technologies or investments. As we show in our paper, it can be fruitfully applied to study the values and trade-offs of citizens with regard to sustainability issues. Our combined empirical methods provide two main insights with strong implications for the future development and assessment of energy pathways: While environmental and climate-related effects significantly influenced citizens’ preferences for or against certain energy pathways, total systems and production costs were of far less importance to citizens than the public discourse suggests.

Conclusions: Our findings are contrary to the focus of many scenario studies that seek to optimize pathways according to total systems costs. The role of fairness and distributional justice in transition processes featured as a dominant theme in all focus groups. This adds central dimensions for future multi-criteria assessments that, so far, have been neglected by current energy systems models.

Keywords: social sustainability, transition pathways, energy systems, multi-criteria assessments, discrete choice experiment, focus groups, mixed-method design, distributional justice, costs of energy transitions, sustainability trade-offs

Background
Pathways towards more sustainable energy systems are based on energy scenarios from systems modelling approaches. In recent years, the awareness of the scenarios' enormous diversity in underlying assumptions, methods and scope has resulted in calls for more systematic assessments of scenarios with regard to possible ecological, economic and social effects [1][2][3][4]. Such multidimensional assessments can serve different purposes: They support the comparability of often very complex scenarios and explicate (un)desired effects of transition processes that are not included in the scenario studies. This enables decision-makers to better anticipate sustainability trade-offs within and across different pathways and, thus, to navigate policies and instruments through the complex triangle of secure, affordable and ecologically friendly energy systems [5,6].
In energy systems modelling, the development and assessment of scenarios is guided by ecological footprints, systems costs and technological feasibility [7,8]. Life Cycle Sustainability Assessments (LCSA) have also been applied to assess ecological impacts and life cycle costs for parts of energy systems and individual technologies using diverse sets of indicators backed up by large datasets [9,10]. While integrations of LCSA with energy systems modelling are emerging (see [11] for a review), the social sustainability dimension remains under-developed. From our literature review, we observe that previous multi-criteria assessment studies either limit the concept to the social acceptance of single energy technologies [12][13][14] or use existing environmental and economic indicators with social implications (e.g. job creation, land usage, security of supply, CO2 emissions) to cover the social dimension [15,16]. In the latter cases, indicators are defined and assessed exclusively by experts, who also evaluate the relative importance of the social indicators compared to ecological and economic aspects as part of multi-criteria assessments.
Our paper seeks to expand and bridge the gap between normative, acceptance-based approaches to social sustainability and indicator-based expert assessments. In order to empirically study the social sustainability of future energy systems and to integrate findings into multidimensional assessments of transition pathways, we present an approach that addresses two key challenges in the literature: i) what does social sustainability mean in the context of future energy systems, and ii) what kind of empirical data on social sustainability aspects can be used and made fruitful for multi-criteria decision analysis?
Following a normative-functional concept of sustainability [17], we argue that empirical studies need to reflect the perspectives and preferences of societal actors (citizens, stakeholders, decision-makers) for energy systems and technologies. The concept of sustainability is inherently normative in nature, and yet needs to guide today's decisions about future energy systems [5]. This makes it necessary to take into account what different actors consider as justified and legitimate interventions in nature and society [17]. At the same time, these normative decisions need to be complemented and mediated by systems knowledge that allows for an analysis of the systemic relationship between sustainability objectives and the implications different decisions in the transition process (e.g. for or against certain energy technologies) can have for the functionality of the eco-social system [17].
We addressed these challenges as part of the interdisciplinary research project InNOSys¹, which sought to assess and optimize pathways for the future German energy system by integrating economic, ecological and social sustainability indicators into a multi-criteria decision analysis. As part of the project, existing scenarios for the German energy system in 2050 with GHG emissions reductions of 80-90% and of more than 90% were selected from an extensive literature review and were subsequently remodeled to harmonize underlying assumptions². Sustainability indicators were derived for each scenario using data from the Framework for the Assessment of Environmental Impacts of Transformation Scenarios (FRITS) [2] and PANTA RHEI [18]. These indicators pertained to economic and ecological dimensions but also entailed significant social implications, e.g. job creation, impact on human health, CO2 emissions or systems costs.

¹ For further information, see https://www.innosys-projekt.de/de (last accessed 11/10/2021).
² The harmonization was done using the models MESAP and flexABLE (for more details, see [11]) and included, among others, assumptions on energy demand, energy prices, technology prices, population and GDP.

To explore the social sustainability dimension and to make it fruitful for multi-criteria decision analysis, we adopted a mixed-method approach. We conducted a discrete choice experiment (DCE) in which citizens had to choose among different future energy systems in Germany (sorted into two groups: electricity production, and electricity and heat production). The choices were underpinned by the sustainability indicators and the life-cycle assessment data derived for the research project. This ensured that citizens were able to express their preferences for certain future energy strategies but at the same time were made aware of the possible ecological, economic and social consequences of their choices. Thereby, we acknowledged findings from previous research that citizens often lack the necessary knowledge to discriminately assess different energy technologies [12] and followed recent calls for the usage of complete life-cycle information of technologies for lay people [19]. The DCE allowed us to analyze the relative importance of the different sustainability indicators for citizens' decisions. Such preference structures with regard to the different weights of sustainability indicators are core inputs for multi-criteria assessments and were thus used by the interdisciplinary research team (see [20] for an elaboration of the results and methodology).
Furthermore, we adopted an explorative perspective by conducting six focus groups. Previous studies acknowledge that the difficulty of defining social sustainability in part stems from the lack of theoretical underpinnings of the concept more generally [21]. While we do not pretend to be able to solve this issue, we seek to find out what aspects and dimensions citizens associate with the social sustainability of entire energy scenarios, and thereby go beyond currently used indicators.
As we will show in this paper, our insights also reveal the discrepancy between what (mostly quantitative, indicator-based) assessment models can capture, and what citizens seem to value most when assessing future energy systems. This paper documents how the combination of DCE and focus groups can be made fruitful for studying the social sustainability of future energy systems and discusses the kind of findings it reveals (as inputs for multi-criteria assessments). The following section briefly reviews previous approaches to studying social sustainability of energy systems and technologies and discusses how the methods DCE and focus groups have previously been applied in this context. It is followed by our methodology and data collection processes (section 3) and key results (section 4). We close with a discussion of the relevance of the findings for decision-makers in transitions and energy systems modelers (section 5).

Sustainability of energy systems: concepts and methodologies
The goal of sustainable energy transitions can be defined as establishing a secure, affordable energy system that spares non-renewable energy sources as well as environmental resources and that meets the needs of both present and future generations [21,22]. These objectives reflect the widely accepted understanding of sustainability as a 'three-pillar concept' that equally weights ecological, economic and social considerations in energy policy decisions. The concept of sustainability has been notably shaped by the Brundtland Report, which called for sustainable development processes that satisfy "the needs of the present without compromising the ability of future generations to meet their own needs" [23,24]. In the aftermath of the report, sustainability has been operationalized using multiple frameworks that differ in the dimensions they incorporate and in how these dimensions relate to each other. Renn et al. [17] note that while concepts that prioritize ecological preservation are historically the oldest, three-pillar frameworks remain the most popular, even though there have been proposals to expand the concept to include 'culture' and 'institutional stability' as additional pillars. Other concepts, e.g. the Sustainable Development Goals by the UN [25] or the Integrative Concept of Sustainable Development [26], move away from the pillar logic and emphasize elements like "securing human existence", "maintaining society's productive potential" and "preserving society's options for development and action" [21].
What these conceptualizations have in common are two challenges for empirical research: first, how to measure sustainability, i.e. which indicators to use, and second, how to deal with interdependencies and trade-offs, i.e. how to weigh indicators for sustainability assessments. This is also reflected in previous studies' criticism of the concept: Its normative character, paired with a lack of theoretical underpinnings, means that individuals (politicians, scientists, citizens) tend to disagree about desirable actions and transition pathways [27], and clear guidelines for how to derive sustainability indicators are missing [22]. Already in 2003, Parris & Kates [28] argued that more than 500 approaches existed for measuring sustainability.
The challenges in measuring the social sustainability of future energy systems in part stem from the fact that it is impossible to agree on a universally accepted definition of the concept of social sustainability. Previous approaches to understanding the social sustainability of energy systems can be divided into three kinds of contributions. A first group of empirical studies focuses on the social acceptance of energy technologies as one component of social sustainability [12][13][14]. A national social sustainability survey ('Soziales Nachhaltigkeitsbarometer der Energiewende'), for instance, is published annually and presents empirical data on German citizens' acceptance of political measures, individual energy technologies or the governance of the transition process [29]. A second group of studies uses measurable environmental and economic indicators that have social implications (e.g. job creation, land usage, security of supply, CO2 emissions) and reframes these as 'social sustainability' next to further economic and environmental indicators [15,16]. In Evans et al. [16], for instance, experts determine a set of 'social' indicators and rank energy technologies accordingly using an ordinal scale (e.g. the social impact of wind turbines is measured as 'bird strike', 'noise' and 'visual' using a magnitude range between minor and major). A third group of studies has used empirical data from social LCA databases to assess the social sustainability of energy technologies [30]. For this third group, data on social sustainability performance have been criticized for methodological shortcomings and the resulting lack of reliability and validity [30][31][32]. With regard to the first two groups of contributions, we observe a clear dichotomy: Within the first group, social sustainability is understood as a purely normative function, i.e. what are acceptable technologies or policy measures for the interrogated individuals.
While most of these studies also collect data on individuals' overall attitudes, emotions and perceptions of the technologies [12], they do not investigate the relative importance of different sustainability dimensions (environmental, economic, health-related aspects) for acceptance. In contrast, the second group of studies uses a wider range of indicators to operationalize social sustainability. However, the indicators are defined and assessed exclusively by experts, and in many cases the experts also assess the relative importance of the social sustainability indicators compared to ecological and economic ones as part of multi-criteria assessments.
Despite these shortcomings, sustainability remains widely present as a guiding principle in political communication, public discourse and scientific assessments. As Köhler et al. [27] call for in their

Discrete choice experiments in energy research
Discrete choice experiments (DCE) have emerged as a popular quantitative research method to study individuals' preferences for a variety of sustainability- and energy-related technologies, services and policy measures, including contributions in this journal [33][34][35]. The basic idea of the method is that instead of revealing preferences through empirical observations, researchers have individuals explicitly state their preferences regarding a set of predefined options. Through an experimental setup, researchers predefine options with varying attributes and levels so as to analyze what relative value individuals attach to the different attributes [36]. This way, the method takes into account that in complex decision situations, preferences are not one-dimensional but involve trade-offs between underlying, or 'latent', preferences [37]. The methods have thus proven to be more reliable than direct questions regarding the importance of specific attributes for individual choices [37] and demonstrate a high external validity [38].
DCE and conjoint analysis (CA) have often been used synonymously in social science applications³. While we acknowledge the different traditions and theoretical foundations of both methods, the following review includes applications of CA and DCE in the context of energy and sustainability. As the purpose is to situate our study among the existing approaches to a better understanding of citizens' preferences for energy technologies and sustainability measures, we emphasize the common features of CA and DCE: the setup of a hypothetical decision situation in which individuals have to choose among a discrete number of options with different attribute levels, the use of attributes as explanatory variables in regression models [40,41], and similar data collection procedures. Table 1 presents a non-exhaustive overview of the different perspectives researchers in the areas of energy and sustainability studies have adopted in using stated preference elicitation methods.

³ Both methods have their origins in psychology and have typically been used in marketing to detect consumer preferences. In line with Green and Srinivasan [39], we acknowledge conjoint analysis as an umbrella term that includes a broad spectrum of approaches that "(…) estimate(s) the structure of a consumer's preferences (e.g. part worth's, importance weights, ideal points) given his/her overall evaluations of a set of alternatives that are pre-specified in terms of levels of different attributes." While some authors suggest that discrete choice experiments present a subcategory of conjoint analysis [36], Louviere et al. [37] argue that subsuming DCEs under the header of CA does not do justice to the theoretical underpinnings of DCE: Whereas CA is founded on mathematical principles of 'conjoint measurements', DCE is based on random utility theory that offers (better) insights into the choice behavior of individuals. It assumes that the latent utilities can be summarized by two components, a systematic (explainable) component and a random (unexplainable) component. Systematic components comprise attributes explaining differences in choice alternatives and covariates explaining differences in individuals' choices. Random components comprise all unidentified factors that impact choices. Due to the involvement of a random component, with DCE, researchers can only determine the probability that an individual will prefer one alternative over the other.

Table 1
Specifications of the studies, with variations across the literature listed under each specification

Individuals being interrogated…
• Citizens as consumers/private households [42,43]
• Citizens as potentially affected by an energy technology [44,45]
• Citizens as private investors [46][47][48][49]
• Stakeholders involved in technology installation and operation [50,51]

…concerning the objects of investigation…
Single energy technologies:
• Economic and ecologic effects of wind energy [52], including onshore [45] and offshore [53], siting decisions of geothermal power plants [54], nuclear waste [55], photovoltaics, hydro schemes, biomass, waste combustion, natural gas [44]
Policies, programs, products:
• Climate change mitigation policies for residential energy use [42]
• Private investments in technologies [49], e.g. wind [47], solar thermal [56]
• Electricity products [57], load control management/domestic appliance curtailment contracts [58]
• Smart meters [59], electricity saving products [60], energy pricing programs for demand side management [61]

…testing the value of the attributes…
Environmental aspects:
• Impacts on landscape, wildlife, air pollution [44], and landscape, habitat and fauna in combination with costs of technologies [45]
• Marine species abundance and diversity with artificial reefs, wind farm ownership, aesthetic impacts [53]
Economic/social aspects:
• Employment in local community, price for electricity [44]
• Town location, distance from respondents' home, monetary savings, tax revenue of community [52]
• Willingness to pay for energy efficiency versus CO2 reduction measures [42]
• Return, risk, duration and field of private financial investments [48,49]
Level of information/personal involvement:
• Environmental labelling, disclosure of information about life-cycle [43,62]
• Transparent information on energy sources for electricity products [57], feedback provision on energy-saving programs [60,63]
• Level of engagement in the technology/control over the technological features [59,61]
• Procedural fairness and distributive justice in policy decisions [55]
• Personal convenience: type of curtailment contracts, frequency of curtailment, opt-out, advance notice, compensation [58]

…accounting for differences across individuals due to…
• Age [43,47], gender [59]
• Income [42,44,48]
• Urban vs. rural communities [44]
• Environmental attitudes & behavior [47,58]
• Trust in electricity suppliers [58]

…using methodological variations
• Comparison of contingent rating and choice experiment [45], CA and self-explicated method [64]
• CA to improve communication between LCA analysts and stakeholders [51]
• Combination of CA with field experiment [61]

In [45], three environment-related attributes are highly significant, with the impact of wind farms on flora and fauna being the most important factor for citizens. The cost attribute showed comparatively weak effects. This can also be observed in a study by Bergmann et al. [44] that uses a similar experimental setup for different energy technologies: Effects on landscape and wildlife are valued the highest by the respondents. The study by Klain et al. [53] looks at offshore wind farms and demonstrates that impacts on reef habitat are the most salient factor in citizens' decisions among the indicators tested (the others being capital costs, ownership structure and visual impacts). Like many other studies, e.g. [52], they assessed citizens' willingness to pay (WTP) for measures that improve environmental conservation and found particularly high rates for measures on biodiversity.
Another set of attributes that resonates with our research design concerns the role of transparent information for citizens' assessments. Assefa and Frostell [12] found that citizens lack knowledge to discriminately assess different energy technologies, while other studies [19] have argued for more transparent and complete life-cycle information of technologies for lay people. The study by Krütli et al. [55] investigates the role of perceived fairness in policy decisions on siting nuclear waste. Testing the attributes procedural justice, distributive justice and outcome valence, the authors find that procedural justice, i.e. the involvement of citizens and the transparency of the decision-making process, is highly valued by respondents.
In sum, while the combination of tested attributes in various studies reflects environmental, economic and social dimensions, none of the reviewed studies has explicitly used DCE or CA to understand social sustainability aspects or for multi-criteria assessments in which the empirical results serve as weighted inputs for further quantitative modelling. Similarly, we did not find studies in the literature that use focus groups to empirically complement, validate and contextualize DCE/CA findings. Louviere et al. [37] merely suggest that qualitative research, including focus groups, could be used ex ante to identify possible attributes.

Focus groups in energy research
Focus groups have a long history as a method for gathering data in a variety of disciplines, starting from psychology, economics and market analysis; they are among the best-known explorative and qualitative methods in sociology and its related research fields [65,66]. Focus groups are a form of group discussion to systematically access arguments on, perceptions about and attitudes towards a given topic. Moderation constitutes a core requirement to provide guidance during the discussion and to ensure that discussions do not get off topic and that single participants do not dominate and restrain others from voicing their opinion. This way, the group discussion follows the ideal of Habermas [67] to include as many diverse opinions as participants offer and are willing to share. These characteristics not only make focus groups more time-efficient than interviews; they also reveal the inherent social component of the method: Individuals' statements are always made in the presence and awareness of a group. This dynamic alters the entire communication process, as statements are checked, agreed with or countered immediately by the other participants. It can thus foster the depth of insight into a field, as previously untapped sources of (civil) knowledge are activated [68].
Challenges related to the mitigation of climate change and the resulting need to transform energy production and consumption patterns provoke fundamental changes in individuals' lifestyles and their autonomy. In this context, focus groups have shown great potential for tapping citizens' local knowledge as explorative data sources [69][70][71]. In reviewing the role of focus groups within energy transition research projects, Gailing and Naumann [72] state that focus groups do not only reproduce and surface already existing perspectives of individuals. The methodological setup also creates 'spaces' in which the group discussion creates and produces social realities. According to the authors, this can help change social disparities in that participants gain the power to not only respond but to steer the discourse itself. Following this constructivist perspective on focus groups, these effects potentially transcend the protected spaces of the focus groups in that participants have the possibility to participate in a more direct way and with greater power in decisions on energy transition processes [73].

Methods
In our research, 124 citizens participated in the discrete choice experiment, and 65 of them also participated in one of the six focus groups (Figure 1). Participants for both methods were recruited through a market research company. The recruitment was guided by the following selection criteria: i) participants needed to live in either Stuttgart or Osnabrück⁴, ii) an equal distribution of gender, iii) equal numbers of participants representing three age groups (students/working/retired), and iv) a good mixture of educational backgrounds and professions.

⁴ North Germany is heavily affected by the expansion of wind power generation and regularly produces more renewable energy than it needs, while the industrialized centers in the south still lack renewable capacities. The emerging need for power lines creates diverging patterns of acceptance in the German regions [74]. The cities Osnabrück (in Lower Saxony, northern Germany) and Stuttgart (in Baden-Württemberg, southern Germany) represent these dynamics.

Discrete choice experiment
As a basis for the discrete choices, we decided upon the technological systems, i.e. the different options participants needed to choose between. Model-based energy scenarios are complex 'if…then' statements derived from multiple assumptions and parameters [75] and are difficult for non-modelers to interpret [76]. Therefore, it was not a suitable option to present participants with such complex […] [20]. Data for all indicators included effects regarding the technologies' full life cycle (installation, operation, disposal)⁶. Traditionally, discrete choice experiments fully randomize the levels of attributes across choices. In our case, however, LCA datasets prescribed exactly how each scenario scored with regard to the different sustainability indicators (see [38] for previous applications of non-randomized, predefined vignettes).

⁵ The decision was based on a meta-analysis of existing energy scenarios [11] and the assumed future relevance of different technologies for the overall energy mix as described in these scenarios. For instance, we refrained from integrating coal or nuclear power plants into the design since all major scenario studies assume their phase-out by 2050 (the most common reference point of the scenario studies).
⁶ The first four indicators were derived from the life cycle impact assessment method (ILCD midpoint 2.0), which is integrated in the life cycle inventory database ecoinvent v3.3 [77], while the employment indicators were derived from research studies on job creation [78], electricity production costs were based on [79-83], and security of supply was based on expert assessments.
Participants were presented with two scenarios at a time and asked to choose the one they preferred (paired comparison design). As a basis, the 12 scenarios were paired, forming 30 different choices for 'electricity' and 'electricity and heat', while for each scenario all eight sustainability indicators were shown. Given this large number of choices and their complexity, a full factorial design (each participant receives all choices) was not adopted as we sought to minimize fatigue effects; therefore, we confronted each participant with 12 choices⁷. Given the complexity of the scenarios, we paid special attention to the presentation and visualization of the data. The scenarios entailed verbal descriptions along with small icons of the involved technologies to facilitate quick recognition during the choice situations. Every sustainability indicator was displayed as a colored percentage: The percentages state how much the scenario's performance deviated from the average performance of all considered scenarios.
For the example shown in Figure 2, security of supply for the scenario wind + photovoltaics is 19% below the average performance, whereas for gas + geothermal the security of supply is 40% above average. The colors indicate the scenarios' contributions to sustainability (green: more sustainable, red: less sustainable)⁸. A pre-test showed that a display without any visualization and color was difficult for participants to understand.

⁷ A review of conjoint analyses on environmental issues [36] shows considerable differences among studies regarding the number of choice situations per participant, ranging from 12 to 120.
⁸ While we acknowledge the normative nature of the concept of sustainability, the 'sustainability triangle' nevertheless is publicly discussed, widely accepted and points to a clear direction of what is more and what is less sustainable. For the present case, higher sustainability means higher job effects, lower energy costs for households, higher security of supply, and lower effects on human health, emissions, land and resource usage. From our perspective, these methodological choices are legitimate since the research objective was not to identify what constitutes sustainability per se, but which indicators contribute most to citizens' preferences for future energy systems.
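The display logic described above (percentage deviation from the average of all scenarios, colored by whether the deviation points towards more or less sustainability) can be sketched as follows. This is a minimal illustration, not the implementation used in the study: the function names, the raw scores and the third scenario are hypothetical, and the raw scores were chosen only so that the resulting deviations match the -19% and +40% quoted for Figure 2.

```python
from statistics import mean


def relative_deviation(scores):
    """Return each scenario's percentage deviation from the mean of all scenarios."""
    average = mean(scores.values())
    return {name: 100.0 * (value - average) / average for name, value in scores.items()}


def display_colour(deviation, higher_is_better):
    """Green if the deviation points towards more sustainability, red otherwise."""
    if deviation == 0:
        return "neutral"
    favourable = deviation > 0 if higher_is_better else deviation < 0
    return "green" if favourable else "red"


# Hypothetical raw scores for one indicator (assumption, not data from the study).
security_of_supply = {
    "wind + photovoltaics": 81.0,
    "gas + geothermal": 140.0,
    "hypothetical third scenario": 79.0,
}

deviations = relative_deviation(security_of_supply)
for scenario, deviation in deviations.items():
    # Security of supply: higher values count as more sustainable.
    colour = display_colour(deviation, higher_is_better=True)
    print(f"{scenario}: {deviation:+.0f}% ({colour})")
```

For indicators where lower values are more sustainable (e.g. emissions or household energy costs), the same function would be called with `higher_is_better=False`, so that a below-average value is shown in green.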

Figure 2: Example choice as it appeared in the DCE
To aid understandability, descriptions and a glossary of the energy technologies used in the scenarios and of the sustainability indicators were handed out to participants prior to the experiment. Participants had the opportunity to view the information again via a 'mouse-over' feature during the experiment.
To explain the data visualization and to reduce the risk of misinterpretations, an example was shown to participants at the beginning. Participants completed the experiment online through the survey platform Qualtrics. They were sent a personalized link so that they were able to complete the survey in their own time within a period of four weeks.
Responses were analyzed by determining participants' preferences as the part-worth for each level of the attribute [36]. The classic approach to analyzing discrete-choice data, published by McFadden [84], is to fit a conditional logit regression model (also referred to as the multinomial logit model (MNL) [85]) to the data. This approach has two drawbacks: (1) the underlying assumption of independence of irrelevant alternatives (IIA), which is at least questionable in most settings, and (2) the inability to account for unobserved heterogeneity [85]. Several extensions of the MNL are able to overcome these drawbacks: Latent-Class MNL, Random Parameter/Mixed MNL and Hierarchical Bayesian MNL [85].
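For reference, the conditional logit model and the IIA property mentioned above can be written as follows. The notation is an assumption introduced here for illustration and is not taken from the original text: $x_{nj}$ is the attribute vector of alternative $j$ presented to respondent $n$, $\beta$ is the coefficient vector, and $C_n$ is the choice set.

```latex
P_{nj} = \frac{\exp(\beta' x_{nj})}{\sum_{k \in C_n} \exp(\beta' x_{nk})},
\qquad
\frac{P_{nj}}{P_{ni}} = \exp\bigl(\beta' (x_{nj} - x_{ni})\bigr)
```

The ratio on the right depends only on alternatives $i$ and $j$, which is the IIA restriction. Because $\beta$ carries no respondent index, the model also cannot represent unobserved differences in preferences between respondents.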
We applied a Random Parameter/Mixed MNL because it allowed us to model the nested structure of our data (each participant made several pairwise choices) better than a Latent-Class MNL, and because we were not focused on obtaining individual-level coefficients (an advantage of Hierarchical Bayes MNL).
In contrast to MNL, Mixed MNL estimates a mean effect and a standard deviation of that effect over the sample [85]. Following [41], the probability of a choice, conditional on a respondent's coefficients, is given by the conditional logit formula (McFadden 1974). Scenario attributes were not presented as absolute values, but as percentages depicting their performance relative to all scenarios in the sample. Since all attributes thus had the same range, their coefficients can be interpreted as attribute weights in the choice of a scenario [40]. The analysis was implemented in the statistical program Stata using mixlogit [41].
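The two ideas in this paragraph, attributes expressed relative to the average over all scenarios and coefficients that vary across respondents, can be illustrated with a minimal sketch. It is not the estimation routine used in the study: the function names, the raw values and the assumption of independent normally distributed coefficients are hypothetical, and the code only simulates choice probabilities for given means and standard deviations rather than estimating them.

```python
import math
import random


def relative_performance(values):
    """Percent deviation of each scenario's value from the mean over all scenarios."""
    mean = sum(values) / len(values)
    return [100.0 * (v - mean) / mean for v in values]


def logit_probs(beta, alternatives):
    """Conditional logit probabilities for one choice set and one coefficient vector."""
    utilities = [sum(b * x for b, x in zip(beta, alt)) for alt in alternatives]
    top = max(utilities)  # subtract the maximum for numerical stability
    exp_u = [math.exp(u - top) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]


def mixed_logit_probs(means, sds, alternatives, draws=2000, seed=0):
    """Average the logit probabilities over random draws of the coefficients.

    Assumption: each coefficient is drawn independently from a normal
    distribution; a standard deviation of 0 yields a fixed coefficient.
    """
    rng = random.Random(seed)
    acc = [0.0] * len(alternatives)
    for _ in range(draws):
        beta = [rng.gauss(m, s) for m, s in zip(means, sds)]
        for j, p in enumerate(logit_probs(beta, alternatives)):
            acc[j] += p
    return [a / draws for a in acc]


# Hypothetical raw indicator values for three scenarios.
scores = relative_performance([81.0, 140.0, 79.0])  # [-19.0, 40.0, -21.0]
```

Setting all standard deviations to zero reduces `mixed_logit_probs` to the plain conditional logit, which makes the relationship between the two models explicit.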

Focus groups
To explore citizens' perspectives on the sustainability of future energy systems beyond a set of expert-based indicators, six moderated group discussions were conducted. Our sample (n = 65) consisted of three evenly sized subsamples of seniors, working citizens and students. Overall, both genders were equally represented, although within the subsamples the ratio varied by +/-5% around the targeted 50%. The focus groups were conducted in age-separated subgroups; to account for local differences in Germany, three focus groups were conducted in Osnabrück (Lower Saxony) and three in Stuttgart (Baden-Württemberg) (see Figure 1). Prior to the focus groups, participants were sent an information package containing descriptions of presently discussed energy technologies for the German energy system (the technologies that were also used in the DCE) as well as data on their ecological and economic life-cycle effects. This information served as a starting point for the group discussion. During the focus groups, the moderators guided participants to discuss beyond this information by prompting discussions around the perceived advantages and disadvantages of the technologies as well as their potential consequences for participants' environment and lifestyle. The audio recordings of each focus group were transcribed and analyzed using the software MaxQDA. Given the explorative nature of our research, we followed an inductive coding process. From our data, we derived codes to represent single aspects, e.g. technologies, thoughts or entire arguments that were mentioned by participants. During this process of developing a very fine-grained coding scheme, we synthesized thematic clusters and structured our data.
For example, within discussions about the dangers and personal consequences of climate change, participants focused on very diverse dimensions that centered around topics such as lost chances for future generations, the need to save ecosystems or the causality between human action and climate change. These aspects were hence aggregated into the codes 'generational fairness', 'ecosystem preservation' and 'denial of human-caused climate change'. The interpretation and systematization of qualitative data also depends on the researcher. Thus, the coding process was repeated independently by three trained social scientists to create a complete set of codes. Subsequently, the researchers merged their different code systems in iterative steps into a coherent one. The objective was to establish intercoder reliability and to minimize individual differences in interpretation. As a last step, the codes were revisited and tailored to our research question in terms of what participants associate with the social sustainability of future energy systems.
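The text does not state whether agreement between the coders was quantified. As a hedged illustration of how intercoder reliability could be checked for a pair of coders, the sketch below computes Cohen's kappa, a common chance-corrected agreement measure. The choice of this measure, the function name and the example labels are assumptions and do not describe the authors' procedure.

```python
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Chance-corrected agreement between two coders who labelled the same segments."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must label the same non-empty set of segments")
    n = len(codes_a)
    # Share of segments on which both coders assigned the same code.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Agreement expected by chance, given each coder's code frequencies.
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical code throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for four transcript segments.
coder_1 = ["generational fairness", "generational fairness",
           "ecosystem preservation", "ecosystem preservation"]
coder_2 = ["generational fairness", "generational fairness",
           "ecosystem preservation", "generational fairness"]
kappa = cohens_kappa(coder_1, coder_2)  # 0.5
```

With three coders, the measure could be computed for each pair, or a multi-rater statistic could be used instead.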

Results
Discrete choice experiment 9

The model chosen for the analysis of the DCE is highly significant and has a McFadden pseudo-R2 of 0.49, meaning that almost half of the variance in choices can be explained by the sustainability indicators included in the regression. Most of the indicators selected for the DCE showed a significant effect (p < 0.05) on scenario choice, with the exception of the indicator regarding temporary employment effects. Since this indicator was also highly correlated with permanent employment effects, and the latter had a higher significance as well as a stronger effect on scenario choice, the temporary employment effects were excluded from the model. Of the remaining indicators, four showed significant preference heterogeneity and were, thus, modelled as random effects: climatic effects, health effects, land usage, and resource depletion. Among these, the importance of health effects is influenced the most by unobserved heterogeneity: the standard deviation is even higher than the estimated mean of the coefficient. The importance of health effects seems to vary more between citizens than the importance of any other indicator; a plausible reason for this variation could be that the impact of these effects on an individual also varies very strongly depending on their personal health status. The importance of climatic effects is least influenced by unobserved heterogeneity: in relation to its mean value, it has the smallest standard deviation of these regressors.
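As a brief illustration of the fit statistic reported above, McFadden's pseudo-R2 compares the log-likelihood of the fitted model with that of a null model. The sketch below assumes a null model in which all alternatives of a choice set are equally likely; the function names and the number of choices are hypothetical, and the definition of the null model used in the study may differ.

```python
import math


def null_log_likelihood(n_choices, n_alternatives=2):
    """Log-likelihood when every alternative in each choice set is equally likely."""
    return n_choices * math.log(1.0 / n_alternatives)


def mcfadden_pseudo_r2(ll_model, ll_null):
    """McFadden's pseudo-R2: one minus the ratio of model to null log-likelihood."""
    return 1.0 - ll_model / ll_null


# Hypothetical example: 1000 pairwise choices and a fitted log-likelihood
# that is 51% of the null log-likelihood.
ll_null = null_log_likelihood(1000)
r2 = mcfadden_pseudo_r2(0.51 * ll_null, ll_null)
```

In this hypothetical example the statistic is 0.49 up to floating-point rounding, because the fitted log-likelihood is closer to zero than the null log-likelihood by 49%.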
Since all the regressors were normalized before the estimation, coefficients can be compared directly and represent the importance of a regressor for the choice between different scenarios, i.e. its preference weight. Overall, the climatic effects of the scenarios presented in the DCE had the strongest effect on whether a scenario was preferred over another. Climatic effects are seen as almost twice as important as the second most important regressor, resource depletion. Given the importance that costs and economic effects of energy transition pathways have in media, politics and science, it is an interesting and somewhat surprising finding of our study that both economy-related regressors in the DCE (total system costs and employment effects) were regarded as the least relevant indicators by participants (Table 1).
Since group differences cannot be modelled directly in mixed-logit models [86], we looked for them by analyzing interaction terms. Along with the DCE, information about age, gender, occupation and knowledge about the energy system was collected as part of the experimental study. For each of these variables, interaction terms with each of the regressors were tested, but none of them significantly improved our model.
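One way to write such an interaction specification is sketched below. The notation is an assumption introduced for illustration: $x_{njk}$ is the value of indicator $k$ for scenario $j$ shown to respondent $n$, $z_n$ is a respondent characteristic such as age or gender, $\beta_{nk}$ is the (possibly random) preference weight, and $\gamma_k$ is the interaction coefficient.

```latex
U_{nj} = \sum_{k} \beta_{nk}\, x_{njk} + \sum_{k} \gamma_{k}\, (x_{njk} \cdot z_{n}) + \varepsilon_{nj}
```

Under this sketch, a $\gamma_k$ that differs significantly from zero would indicate that the weight of indicator $k$ depends on the characteristic $z_n$. The finding reported above corresponds to no such term improving the model.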

Focus groups
During the group discussions, participants were asked what they understood to be a social sustainable for this type of argument. Opposed to this, our analysis also showed a cluster around 'lifestyle change': Here, participants focused on the perceived injustice in resource and energy use and pointed to discrepancies between national and international conditions and between present and future generations. Although concerns about a failing security of supply of renewable energy technologies also prevailed in this cluster, the lines of argument led to a different conclusion than in the first cluster: the necessity to reconsider present lifestyles and to reduce personal impacts on the environment through conscious sacrifices in one's own consumption patterns. Such an inner negotiation process that seeks to balance one's own habits and a more sustainable lifestyle constitutes the core of this type of argument: Participants accepted that major changes will be necessary and that these changes will affect all individuals, including themselves. Along with surfacing this inner conflict, participants also proposed different societal guidelines and new approaches towards the use of energy and natural resources.

Note:
The picture shows the identified thematic statements and their counts (in brackets) found in all focus groups. The statements are connected to clusters in which several arguments interacted, meaning that arguments related to different clusters. Square brackets denote the frequencies of exchanges of arguments connected to the discussion clusters in all six focus groups. The figure presents all arguments that occurred at least once (thin / 1pt lines) connected with each other during the focus groups. Thicker lines (up to 7pt) represent more (up to 11) connected speech acts.

Figure 3: Argument network and discussion clusters
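The note to Figure 3 describes line widths between 1pt and 7pt for 1 to 11 connected speech acts. A minimal sketch of how such a network encoding could be derived is given below. The function names, the example data, the way connections are counted and the linear mapping from counts to widths are all assumptions; the original figure may have been constructed differently.

```python
from collections import Counter
from itertools import combinations


def edge_counts(speech_acts):
    """Count how often two arguments occur together in the same speech act."""
    counts = Counter()
    for arguments in speech_acts:
        for a, b in combinations(sorted(set(arguments)), 2):
            counts[(a, b)] += 1
    return counts


def line_width(count, min_count=1, max_count=11, min_pt=1.0, max_pt=7.0):
    """Map a connection count to a line width in points (linear scaling assumed)."""
    clipped = max(min_count, min(count, max_count))
    return min_pt + (clipped - min_count) * (max_pt - min_pt) / (max_count - min_count)


# Hypothetical coded speech acts, each listing the arguments it touches on.
acts = [
    ["distributional justice", "energy affordability"],
    ["energy affordability", "distributional justice", "employment"],
]
widths = {edge: line_width(n) for edge, n in edge_counts(acts).items()}
```

In this sketch, a pair of arguments that is connected once is drawn with a 1pt line and a pair connected 11 times with a 7pt line, matching the range given in the note.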
While these two clusters seem to be opposing in nature, it is important to note that they present different types of arguments that we observed not only across but also within participants: Some participants used both kinds of arguments (lifestyle changes and lifestyle preservation), which underlines the inner conflicts and trade-offs citizens are exposed to when dealing with the concept of sustainability [87]. During the focus groups, aspects of distributional justice were intensively debated and emerged as one of the topics that showed the most variety in answers and had major potential for dispute. Our analysis also showed that distributional justice was interconnected with most of the other arguments. The participants did not limit their considerations to national conditions but also covered global facets of energy justice.

"At the end of the day, it's the whole of humanity that's affected, not individuals who lose their jobs or have health problems, but it's about saving us all." [translation of original 11 ]
The arguments demonstrate that participants were aware of the interconnectedness of international commodity flows. For example, in several of the focus groups, lithium mining processes in South America were criticized by participants as a particularly unjust side-effect of Germany's investment in battery technologies. Such mining processes were not only discussed with respect to ecological damages but also in terms of potential social exploitation and the moral (ir)responsibility of importing countries. Participants also reported a perceived distributive injustice between private companies and civil society. Participants disapproved of benefits being accumulated by or granted to private investors while damages or losses were transferred to the public domain or led to price increases for end users.
Overall, the arguments clustered around the need for lifestyle changes appeared to be more interconnected: Here, arguments strongly built on each other (green). The arguments that tended towards lifestyle preservation were presented mostly without connections to each other (orange). Although the topics of employment and energy affordability were also addressed in all focus groups, they had a different status in the groups in Osnabrück than in those in Stuttgart: Citizens from Osnabrück placed more emphasis on employment, whereas citizens in Stuttgart focused on energy affordability.
However, both strands of arguments were clearly subordinate to the debates on distributive and intergenerational justice and appeared to be given as "But what about…" responses.
Finally, aspects of intergenerational justice emerged as part of the discussion in the focus groups.

Discussion
Our empirical research provides insights into what citizens value most when assessing future energy systems, with respect to available, scientific sustainability indicators and beyond.
When assessing the results of our discrete choice experiment, it should be noted that we could not realize a true random sampling of participants that meets the requirements for representativeness. It cannot be ruled out that participants in our sample cared differently about the future energy system than the overall German population. As a result, these findings have to be interpreted with caution 12 .
Nevertheless, to our knowledge, this has been the first DCE on citizens' preferences with regard to national energy scenarios, as opposed to single energy technologies, and it thus provides valuable insights: All but one of the sustainability indicators that we derived from current sustainability assessment models (see section 3.1.) showed a significant contribution to participants' choices; only temporary employment effects did not contribute significantly to the model. One of the somewhat counter-intuitive results is the relative unimportance of production and total system costs, which ranked second to last in terms of preference weights. Even if the exact values of preference weights may differ in a representative sample, this result raises the question of whether costs that emerge from the transition towards renewables have the same importance for citizens as implied by their omnipresence in science, media and politics. In this regard, two previous DCEs on preferences regarding local energy projects present comparable settings. As in our study, Alvarez-Farizo and Hanley [45] found that environmental attributes were highly significant while the importance of costs of local wind farms ranked significantly lower in citizens' perceptions. In contrast, in a 2006 study [44], costs had the highest level of significance and, in contradiction to our results, employment effects did not significantly influence vignette choice. One reason for this discrepancy might be the declining prices for renewables in the last decades; another reason could be that the environmental and social implications of climate change have become ever clearer in the meantime 13 . One central implication of our findings is that they reinforce growing concerns among social scientists [88] about whether economic incentives have the potential to increase the acceptance and social sustainability of technologies and transition processes in the long term.
Instead, our findings suggest that environmental externalities in particular (including those beyond citizens' immediate environment) will need to be taken up by policy measures.
Another central issue highlighted by the results of the DCE and the focus groups is that citizens balance many different sustainability aspects when choosing their preferred scenarios on the future of energy systems. Diverse contradictions and dilemmas inherent in the concept of sustainability have been taken up from a scholarly perspective and with respect to its policy implications, for instance the weighting of the three sustainability pillars [87] or the possibility of 'green growth' [89]. From the perspective of citizens, however, their inner conflicts and the process of considering diverse trade-offs between sustainability objectives when assessing future energy systems have not been acknowledged or discussed. So far, inner conflicts of individuals with respect to sustainability have mostly been tackled through the lens of 'rebound effects' [90]. The relevance of these aspects is further amplified sector, being a parent, membership in a conservation group, and the amount of last electric bill. However, they found a significant difference regarding preference weights between urban and rural populations. Since our sample consisted mainly of urban citizens, we could not test for these differences. Since we included more vignette attributes than the two studies, it comes as no surprise that our pseudo-R2 is higher.
A central finding of the focus groups has been the dominance of aspects of distributional justice that citizens mentioned in connection with social sustainability. The participants of the focus groups took it more or less as a given that the transition of the energy system will be costly; they worried not about having to pay for the transition but about whether everyone will have to pay a fair share. In particular, the balance between the private and the public sector as well as between the private sector and citizens was perceived as problematic. Many perceived the burdens connected with the change of the energy system as being externalized to civil society. Energy justice within energy transition processes has received growing scholarly attention in recent years. From a conceptual perspective, previous studies have raised aspects such as possible negative effects on fossil-intensive employment or an outsourcing of emissions from one focal country to another [91]. Particularly the latter point has been underscored by our empirical research as a key theme in the eyes of German citizens. Here, our findings are in line with insights from other industrialized countries that argue for the importance of distributional and procedural justice [55,92]. Following newer frameworks that account for distributional and procedural inequalities [93], our research suggests that policies need to move beyond tackling inequalities in transition processes in a selective manner (e.g. by focusing only on effects of single technologies or job effects resulting from the decline of an industry). As our research shows, citizens are concerned about the systemic effects and multiple externalities of transition processes, which need to be reflected and tackled from a systems perspective in the development and implementation of transition pathways.

Conclusions and outlook
Our research has been conducted with the aim of closing the gap in social sustainability insights on energy transition pathways and of informing sustainability assessments and multi-criteria decision analyses (MCDA). Our focus groups allowed for an explorative perspective on what citizens value in terms of the social sustainability of energy systems and thereby moved beyond (often entrenched) conceptual and indicator-based understandings of the concept in the literature. The discrete choice experiment, as a stated preference method, has proven to be a viable option for quantifying and parametrizing citizens' preferences regarding environmental, social and economic implications of future energy systems.
Overall, the combined empirical methods provided two main insights with strong implications for future energy research: (i) While environmental and climate-related effects of future energy systems significantly influenced citizens' preferences for or against certain energy scenarios, total systems and production costs were of far less importance to citizens than the public discourse suggests. (ii) The role of fairness and distributional justice and, thus, the sharing of burdens among members of society in transition processes featured as a dominant theme in all six focus groups.
When coordinating the results of the DCE and the focus groups with energy system models and macroeconomic models as part of an integrative assessment of energy scenarios (the overall objective of the research project), it became clear that current energy system models cannot quantify aspects of distributive justice of the calculated scenarios. The only indicator related to distributive justice that the applied models could calculate was the economic regional disparity [20]. However, participants in our focus groups were not only discussing regional inequalities; for them, intergenerational distributive justice and burden sharing between state, firms, and citizens, as well as between different income and lifestyle groups, were far more important. In order to tackle the energy transition on the policy level, these observed cleavages within society must be addressed to negotiate durable solutions. This field should not be overlooked by future research on citizens' preferences and social sustainability indicators. Future energy scenarios, therefore, face the challenge of adapting to this demand and including statements about the burdens a respective scenario places on different actors and societal groups. This claim is far-reaching since many of the models are by design unable to provide this information and would need substantial modifications.