Narrative Visualizations Best Practices and Evaluation: A Systematic Mapping Study

doi:10.21203/rs.3.rs-1735564/v1

Download PDF

Research Article

Narrative Visualizations Best Practices and Evaluation: A Systematic Mapping Study

https://doi.org/10.21203/rs.3.rs-1735564/v1

This work is licensed under a CC BY 4.0 License

You are reading this latest preprint version

In recent years, there has been a growing interest in integrating data visualizations into narrative stories to effectively convey information and knowledge. By leveraging the best practices established in the literature, narrative visualizations can reduce the cognitive workload associated with chart comprehension. However, since it is critical and challenging to assess the results, several methodologies have been proposed in this regard. In this article, we present a systematic mapping study of ninety-five data storytelling and information visualization studies. Our goal is to collect and summarize current definitions of “data storytelling” reported in the literature, the best practices for designing narrative visualizations, and evaluation criteria and methods to assess them. As main contributions, we derive a working definition of data storytelling, distinguishing among the concepts involved, and provide an overview of design guidelines to assist practitioners and researchers in creating narrative visualizations. In addition, we characterize the main evaluation criteria and methods. Our findings highlight the need for more out-of-the-box, ready-to-use evaluation tools that allow a rapid and iterative assessment of narrative visualizations.

information visualization

data storytelling

narrative visualizations

evaluation

systematic mapping study

Data visualization has become essential to understanding large datasets and communicating findings. As a subfield of visualization research [1], Information Visualization (IV) focuses on the visual representations of abstract data [2][3] to enhance understanding and support and amplify cognition [4]. In particular, storytelling has been used as an effective way of conveying information and knowledge. Stories aid memory and recall by embedding information into characters, settings, relationships, and events [5], and a narrative is what gives shape to a story [6]; hence, the importance of integrating data visualization into narrative stories. In structured contexts, researchers can use these stories to support discussion, decision making, and process analysis [7], among others.

Narrative visualizations that leverage best practices established in the literature can reduce the cognitive workload associated with chart comprehension [8], [9] and prompt positive decision-making [10]. However, the development and dissemination of guidelines for their design have been scarce. As demand for narrative visualizations increases, so does the need for standards to support their creation. By understanding the impact of specific visual encodings on performance, we can assist end-users in making informed, effective decisions.

In recent years, evaluation has emerged as a central and challenging issue in the visualization field [11], [12]. There is a diverse set of qualitative and quantitative methods for evaluating different aspects of data-driven stories [11], some of which include controlled experiments, usability tests and case studies [13]. Nonetheless, these methods only focus on a visualization’s ability to communicate facts. Recent studies by Dimara et al. suggest that people can make irrational decisions even when they understood the data properly [14], and good performance with analytical tasks does not guarantee the same in decision making [15]. Thus, researchers aim to move beyond this evaluation approach to assess the utility of a visualization. As Matzen et al. [16] indicate, it would be valuable to have evaluation tools that can be deployed rapidly and iteratively during the design process to assess visualizations prior to conducting a user study.

Motivated by this scenario, we present a systematic mapping study (SMS) of ninety-five data storytelling and information visualization studies. Our goal is to collect and summarize current definitions of “data storytelling” reported in the literature, best practices for designing narrative visualizations, and evaluation criteria and methods to assess them.

The rest of this paper is organized into five sections as follows. Section 2 summarizes the background and related works. Section 3 describes the methodology for conducting this SMS. Section 4 reports the results and findings. Section 5 discusses our research questions and presents the threats to validity for our study. Finally, Section 6 concludes and outlines future work.

This section presents an overview of prior research relevant to our work, including existing SLRs, and contextualizes the aims of this study.

2.1 Data Storytelling

Over the last two decades, there has been a growing interest in visual storytelling and data visualization to communicate findings. Some studies have reviewed the use of decision support systems in fields such as agriculture [17], where visualization plays a key role in assisting end-users in interpreting the data, or environmental sciences [18], where it is necessary to interact with actors outside of the scientific community and therefore promoting an effective communication is challenging to enable users to have an actionable understanding.

In the field of Information Visualization, various authors have attempted to define storytelling. Segel and Heer [19] introduced the concept of “narrative visualizations” and highlighted that data stories differ from traditional storytelling – not every visualization can be thought of as a visual data story [20]. In [21], Hullman and Diakopoulos build upon this definition and describe the techniques involved in the communication of data stories.

Riche et al. [22] use the term “data-driven story” for stories that are either based on or contain data, visualized to support one or more intended messages, usually including annotations (labels, pointers, text) or narration [20]. The main idea that all these definitions share is the use of visualizations along with narratives to enhance communication and promote insight.

Several researchers have discussed emerging opportunities and challenges in storytelling in the field of visualization. Kosara and Mackinlay [23] give an overview of the topic, highlighting the importance of storytelling for data analysis and presentation, as visualizations are increasingly used for decision making. In [20], the authors take a narrower view of what data storytelling involves, in order to facilitate discussion around it, and present a comprehensive view of the storytelling process. Ojo and Heravi [24] examined 44 cases of award-winning data stories to identify storytelling practices and characterize the tools and techniques to create them.

More recently, El Outa [25] proposed a conceptual model of data narrative for exploratory analysis to support the entire process of building a data story. They also provide structured definitions of the key concepts in data narrative. Finally, in [26], Matei and Hunter review existing definitions of “story” and “storytelling” and present examples of what constitutes a “good” story. They outline a methodology for constructing effective, data-based stories.

2.2 Best Practices for the Design of Narrative Visualizations

Many experts in chart design provide guiding principles to create clear and efficient visualizations with the lowest possible cognitive load [10], [27]–[29]. Based on their studies on graphical perception, Cleveland and McGill [30], [31] suggest a ranking of graphical devices for displaying quantitative data. In [32], Kosslyn describes eight psychological principles for the design of charts that respect the cognitive, perceptual and memory processes of the human brain.

Tufte, a well-known advocate of minimalistic design, argues that visualization is about simplifying complex information and defines “graphical excellence” as displays that communicate with clarity, precision, and efficiency [28]. He proposes the data-ink ratio and suggests that all ink not used to present data should be removed.

Several empirical studies have evaluated the effectiveness of different guidelines, such as: limiting the use of gridlines [33], [34], using color strategically [35]–[39], adding labels and annotations[40], [41], avoiding visual clutter [42], [43], or allowing the viewer to cluster information and recognize patterns [44], [45]. Some authors have investigated the effect of chart embellishments [47] – [50] and emphasis techniques [50], [51], while others focused on a specific type of chart: bars[53] – [57], scatterplots [58] – [61] and treemaps [61], [62].

Another research reports on the emerging challenges and opportunities regarding visual analytics. In [63], the authors reviewed the entries for the Visual Analytics Science and Technology (VAST) Challenge as well as guidelines developed by other researchers and synthesized the results into an initial set for assessing environments and analytics reports. From a product design perspective, Adagha et al. [64] point out that part of the challenge is that current visual analytics tools do not have standardized design criteria and processes, thus raising the question of what the attributes of effective tools are and how to measure their impact.

Despite such efforts, to the best of our knowledge, no study has systematically gathered the best practices involved in the design of narrative visualizations.

2.3 Evaluation

Evaluation is attracting considerable interest in the IV field and some studies characterize the approaches through literature reviews. Isenberg et al. [65] reviewed 581 papers and provided a quantitative and objective report of the types of evaluation practices encountered in the visualization community. They found an increasing trend in the evaluation of user experience and user performance. However, though it has improved over the years, the general level of rigor of evaluations is still low.

Lam et al [66] provide an overview of the different types of evaluation scenarios, categorized into those for understanding data analysis processes and those which evaluate visualizations themselves. They base their categorization on questions and goals, rather than existing methods, encouraging the community to consider the context before choosing an evaluation method.

In [67] the authors summarize types and components of visualizations used in the health industry and methodologies for developing and evaluating patient-facing visualizations. They suggest a need for greater attention to developing these kinds of visualizations, since the evaluation methods had little to no standardization across the articles, thus making it difficult to identify and compare best practices. They identified three key opportunities, namely: more robust data collection and reporting, consistent evaluation approaches, and interpretation enhancement.

Tory and Möller [68] discuss the use of formal laboratory user studies against alternative evaluation techniques in HCI, such as focus groups, field studies and expert reviews, and particularly heuristics evaluations. They argue that such reviews are valuable ways to assess visualizations.

Borgo et al. address specific methodologies for evaluating visualizations. In [69], they reviewed the use of crowdsourcing experiments for empirical evaluation of visualizations reported in 82 papers. They summarized the design, methods, tasks, tools, and measures used in these kinds of studies and found that many papers failed to properly report relevant aspects of the experiments. They present a taxonomy of practices along with a checklist to support researchers in reporting tasks.

As for evaluation criteria, Bertini, Tatu and Keim [70] presented a systematic analysis of quality metrics to support the exploration of high-dimensional data sets and defined a quality metrics pipeline. Shah and Hoeffner [71] review graph comprehension and provide guidelines to design charts that enhance interpretation and discuss unresolved questions related to data presentation and graphical literacy.

Saket et al. [72] review visualization evaluation in terms of user experience and characterize the goals and metrics relevant to storytelling and narrative purposes such as memorability, engagement, and enjoyment. More recently, in [73] the authors argue that data stories must address different challenges depending on the context. They provide a non-exhaustive set of criteria and evaluation methods by which data-driven stories can be assessed.

Based on the review outlined above, the goal of this work is to define “data storytelling” in the context of Information Visualization and collect and analyze guidelines for the creation of narrative visualizations. This effort can help develop a better understanding of the design process and serve as a starting point to derive improvement recommendations.

We also provide an overview of the evaluation criteria to get a clearer view of what is considered an “effective” visualization for narrative purposes, as well as the reported evaluation strategies. Our work is based on Lam et al. visualization scenario [66], as we are interested in characterizing evaluation from a “final product” perspective. We do not consider conventional methods such as user studies and experiments but rather focus on finding methodologies that are consistent with the best practices and criteria found.

In this section, we describe the steps of the SMS process, following the guidelines by Kitchenham and Charters [74] and Petersen [75]. They include the definition of a review protocol to ensure rigor and reproducibility. We determine research questions, data sources and search strategy, inclusion and exclusion criteria, quality assessment, data extraction, and selected studies. Additionally, we followed guidelines for conducting automated searches [76] and effective data extraction [77].

An online version^[1] of this SMS is available for an interactive experience with charts, while supplementary files can be accessed at [78].

3.1 Research questions

As stated in previous sections, the goal of this SMS is to identify, analyze and summarize existing definitions of “data storytelling,” best practices for the design of narrative visualizations and evaluation criteria and methods. To this end, we formulate the following research questions:

RQ1: What are the existing definitions of “data storytelling”? This question seeks to establish what exactly data storytelling is and whether it has a formal and accepted definition for both academics and practitioners.

RQ2: What are the data storytelling best practices reported in the literature and how are they implemented? The goal of this question is to summarize the guidelines reported in the literature to create effective narrative visualizations and to find practical demonstrations of how they are applied.

RQ3: What are the criteria to evaluate narrative visualizations? The various stakeholders involved in the data storytelling process might have different goals according to their perspectives [79]. This research question aims at identifying the criteria by which those goals are met. These criteria can then be used to evaluate narrative visualizations.

RQ4: What are the current strategies to evaluate narrative visualizations? This question identifies the methods by which the criteria defined in the previous question can be assessed and their characterization: what types of charts they apply to, the metrics they use, and the tools to support them.

3.2 Visualizations

This study covers four main types of visualizations, namely: line charts, bar charts, scatter plots, and pie charts, in addition to choropleth maps, area charts, bubble charts, and treemaps. We focus on these visualizations as they are the most frequently occurring types, according to [80] and [81]. Throughout the study, we use the term “chart” instead of “graph” to avoid confusion with the field of graph drawing.

3.3 Search strategy

The search and selection process of the primary studies was performed in three steps to control the number and relevance of the results, namely: automated search, study selection, and snowballing search.

Database search: We conducted a series of database searches on three indexing systems related to the Software Engineering field: ACM Digital Library, IEEE Xplore, and Scopus. We chose these sources as they are considered standard libraries [82]. The search string was divided into two parts: one containing a keyword that describes our main subject and its synonyms, and another one focusing on the different topics of the research questions.

The search was limited to title, abstract and keywords. The specific implementation of the search string for each database is presented in Table 1, while Table 2 presents each term together with its keywords.

Snowballing search: We complemented the database search with forward and backward snowballing. The goal of this step was to expand the set of relevant papers by focusing on papers citing or being cited by previously included studies [83]. Articles collected during this stage were also added to the main list and selected according to the inclusion/exclusion criteria.

3.4 Inclusion and Exclusion Criteria

We defined the following inclusion (I) and exclusion (E) criteria based on the guidelines proposed by [84] to select appropriate studies and filter out unrelated ones. We were interested in primary studies published in any year up until 2021 presenting some contribution on data storytelling, visualization best practices, and evaluation.

I1: The title, abstract and keywords explicitly state that the paper is related to data storytelling and data visualization.
I2: The study is a full paper with empirical evidence.
I3: The paper is peer-reviewed (journal article, conference paper)
I4: The full text of the paper is available.
E1: The paper is not written in English.
E2: The paper’s full text is not accessible.
E3: The paper is a gray publication without peer review.
E4: The paper is explicitly a short paper.
E5: The paper does not cover any of the visualizations described in Section 3.2.

3.5 Study selection

We found a total of 11.818 articles ranging from 1984 to 2021 by applying the search strategy defined in Section 3.3. The search was conducted using title, abstract and indexed keywords (see Table 3).

The procedure for studies selection consisted of five phases, as presented in Fig. 1. In Phase 1, we obtained the studies from electronic databases, downloaded the results, and organized them into a single spreadsheet. Out of the 11.818 search results, 8476 were unique. In Phase 2, we checked the title of each primary study using the I/E criteria. Articles unrelated to our subject were discarded, and 5321 titles were kept for consideration in the next phase. In Phase 3, we checked the abstract of each primary study and excluded 3845 papers. If there was insufficient data, the paper was left for the next phase. In Phase 4, the remaining papers were analyzed using the full text. In this step, we obtained 87 relevant papers and excluded 1389. The selected articles were reviewed for quality assessment in Phase 5, before making a final decision to include them in this SMS. Another 13 papers were included via snowballing, resulting in 95 studies overall.

An identity code was assigned to every individual study. The list of papers with their identity code is available in Table 11 (see Appendix).

3.6 Quality assessment

In addition to the inclusion/exclusion criteria, it is critical to assess the quality of the primary studies [74]. The Cochrane Reviewers’ Handbook [85] suggests that quality relates to the extent to which a study minimizes bias and maximizes internal and external validity. The quality assessment (QA) of the selected studies was achieved by a scoring technique to evaluate their credibility, completeness, and relevance. All papers were assessed against a set of eleven quality criteria. The assessment instrument is presented in Table 3. Questions Q1, Q2, Q4-Q11 were adopted from the literature [74], [86] , while question Q3 is a proposal of the authors.

Each quality assessment question is judged against three possible answers: “Yes” (score = 1), “Partially” (score = 0.5) or “No” (score = 0). The quality score for a particular study is computed by taking the sum of the scores of the answers

3.7 Data extraction

To extract data from the selected primary studies, we used the template shown in Table 5. Collected data includes general information (e.g., title, authors, year of publication, and source) and information related to the research questions. Before the actual data extraction, we performed an extraction pilot with a random set of ten papers to calibrate the instrument, assess the extraction strategies, and avoid possible misunderstandings.

We decided to extract the information exactly as the authors mentioned it. For each paper, we considered abstract, introduction, methodology, results, and conclusion. In some cases, a comprehensive reading of the paper was necessary. Any conflicts were discussed and resolved internally by the authors to reduce bias and ease reproducibility. To measure the level of agreement between researchers we used the Cohen Kappa statistic [87].

To support this task, we used Atlas.ti [88], which is a software for conducting qualitative research that allows highlighting, annotating, and coding data segments of interest. For RQ2, we used an open and axial coding strategy based on grounded theory [89]. First, we read each guideline and assigned it a best practice (BP) id, such that a new BP was created for guidelines that did not resemble previous ones. Then, we used axial coding to compare the best practices to each other and identify categories or themes, by relying on general knowledge and categorizations proposed by other authors. Similarly, for RQ3, we organized the criteria into two levels (main and sub-criteria), grouping similar criteria into a single one.

^[1] https://bit.ly/narrative-visualizations-sms

In this section we describe the quality assessment results and provide an overview of the main characteristics of the ninety-five studies included in this SMS. Then, we present the results corresponding to each research question.

4.1 Quality assessment results

The quality assessment helped us increase the reliability and achieve a coherent synthesis of results [90]. We present the results of the assessment in Table 12 (see Appendix) according to the questions described in Table 4. The results indicate that the overall quality of the studies is high since the quality mean was 90%.

4.2 Overview of selected primary studies

The selected primary studies were published between 1984 and 2021. Fig. 2 presents the number of studies by year of publication. Overall, we found at least one study each year since 2005. An increasing number of publications is observed since 2010, with the majority of them conducted between 2013 and 2021. The highest number of studies was in 2018. This evidences a trend in the topic of information visualization and data storytelling.

4.3 Classification method

The selected studies were classified according to the six categories defined by Wieringa et al. [91]: i) validation research, ii) evaluation research, iii) solution proposal, iv) philosophical papers, v) opinion papers, and vi) experience papers. Table 6 shows the classification by type of research.

Validation research (81%) and solution proposal (16%) were the most adopted research type within the selected papers. Few philosophical papers appeared over the years and none of the selected studies belonged to the evaluation, opinion, and experience categories.

4.4 RQ1: Is there a valid and accepted definition of “data-driven storytelling”?

Over the last years, there has been a growing effort in the visualization community to better understand and characterize data storytelling. The broader use of the term suggests that complementing interactive visualizations with narratives can make the data exploration a more engaging and memorable experience [19], [21], [23], [92], [93].

In 2010, Segel and Heer [S02] introduced the term Narrative Visualizations, to describe “an emerging class of visualizations that attempts to combine narratives with interactive graphics”. These authors draw a distinction between data stories and traditional storytelling: stories are often presented in a linear, controlled progression of events, while data visualization can be organized according to different structures and can also be interactive, promoting new questions and explorations. The authors analyzed 58 examples of narrative visualizations and identified the techniques for telling stories with data, such as common genres (magazine style, annotated chart, partitioned poster, flow chart, comic strip, slide show, and video) and design patterns (the martini glass structure, interactive slideshows, and drill-down stories). According to the authors, narrative visualizations put data at the forefront of storytelling, resulting in visualizations with a storytelling component intended to convey a specific message.

In [S03], Hullman and Diakopoulos further discuss this concept by exploring the rhetorical devices used in narrative visualizations. They suggest that “narrative visualizations typically rely on a combination of persuasive, rhetorical techniques to convey an intended story to users as well as exploratory, dialectic strategies aimed at providing the user with control over the insights gained from interaction”. In their study, the authors examined 51 examples of narrative visualizations to understand how the design and rhetorical choices affect user’s interpretation of a story and highlight the nature of narrative visualizations as “multimedia artifacts that can’t be reduced to visualization alone.”

In 2013, Hullman et al. [S47] define the storytelling focus on visualization research as “the study of narrative visualizations, development of automated data storytelling tools and proliferation of narrative visualizations in news media attest”. They argue that conveying a narrative with visualizations requires context definition, information selection and choosing an order in which to present visualizations; and provide a focused analysis of 42 data stories to understand the forms that structure and sequence take. In line with [S02], they state that in a narrative visualization, “the events of interest are patterns in data sets represented in visualizations”, and the author of the story must decide how to organize the representations into a compelling and understandable sequence.

Dimara et al. suggest in [S60] that the narrative component in visualizations can provide benefits in terms of enhanced motivation, attention, and engagement, and this could translate into improved task comprehension and higher-quality responses.

In [S08], the authors analyze existing open data platforms and suggest the need to leverage storytelling as means to support the discovery of relevant data and the generation of insights, thus improving transparency and trust. They consider data storytelling as “the process of translating data analysis into simple, logical stories that can be understood by non-technical audiences.”

More recently, Liem et al. [S80] empirically tested storytelling techniques to make visualizations compelling, engaging and persuasive. They explored whether certain practices influence people’s attitudes towards anthropomorphized data graphics (the practice of visualizing data about people in a way that helps the audience relate [94]). Although the results were inconclusive, this work shows the potential of storytelling in the field of visualization research.

4.5 RQ2: What are the data storytelling best practices reported in the literature and how are they implemented?

We found a total of 38 best practices that are summarized in Table 7. Across the ninety-five studies, some authors proposed categorizations regarding different aspects of visualizations, such as [21], [95]–[97]. Based on these categorizations and following the strategy described in Section 3.7, we classified the best practices into five categories, as follows:

Cognitive: Cognition is defined as the way our brain processes information. Visual cognition involves deriving meaning from external representations of visual information [98]. This category encompasses the guidelines and recommendations that aid in reducing the cognitive load associated with chart comprehension.

Data: This set of best practices involves aspects related to the data sources behind the visual representation and it is concerned with the information itself, such as transparency, reliability, and accuracy.

Perceptual: Cleveland and McGill define “graphical perception” as the visual decoding of the quantitative and qualitative information encoded on graphs [30]. It is directly related to the user’s ability to interpret information. Perception-based guidelines allow us to display data in a way that the important patterns stand out [99]. This category includes practices related to the human visual system, such as the mapping of visual variables, or the strategic use of color, among others.

Presentation: This category incorporates choices about how the data is mapped to the visual domain, and thus, it is also related to perception. It considers the aesthetics and style of the representation, the layout of the elements, as well as the clarity and effectiveness of the design.

Usability: It is described as the degree to which a product can be used (ISO 9241). This category is concerned with the functionality and understandability of charts, including features such as accessibility, consistency, and interactivity, which allow the users to perform tasks effectively.

Note that a given practice might be part of more than one group, since some aspects such as cognition and perception are intrinsically linked. The “Context” column indicates the scope of the practice in terms of the charts it applies to. The “Reference” column specifies the primary study in which the guideline was found (see Supplementary Materials for visual references).

Fig. 3 presents the best practices grouped according to the categories defined above. It provides a clear visual of the guidelines that fall under more than one category.

Regarding the implementation of best practices, for the vast majority we found several ways to apply them. For BP4, BP26 and BP38, however, we did not find any practical demonstrations. Overall, there were 384 implementations, varying from two to more than fifty per best practice, as shown in Fig. 4.

For instance, Table 8 shows implementations for the “Data” best practices. The “Context” column indicates the scope of the implementation (e.g.: “bar charts”, “line charts” or “general” if it applies to all types of charts), while the “Reference” column specifies the primary study where it was found. In some cases, one implementation could serve as a practical application of more than one practice. For the complete set of guidelines and implementations, see the Supplementary Materials.

4.6 RQ3: What are the criteria to evaluate narrative visualizations?

We identified 47 evaluation criteria and organized them in two levels: 5 main criteria and 42 sub-criteria. Table 9 presents the definition of the main criteria along with their sub-criteria.

Sub-criteria are described in Table 10. In both cases, the “ID” column identifies each criterion (Cand SC were used for criteria and sub-criteria, respectively); the “Definition” column explains the concept as extracted from the primary studies and the “Reference” column specifies the primary study in which the criterion was found (see Supplementary Materials for visual references).

We extracted the criteria exactly as the authors mentioned in the studies, and then grouped those that were similar or expressed related ideas into a single one, such as “appearance” being similar to “aesthetic”, “discovery” being similar to “insight”, or “attention” being similar to “focused attention”, among others. Fig. 5 displays the number of sub-criteria per category, considering that certain criteria overlap across the categories.

4.7 RQ4: What are the current strategies to evaluate the quality of narrative visualizations?

Several methodological approaches have been developed for evaluating visualizations. The most common are empirical studies in which participants perform benchmark tasks and researchers collect measures like completion time, error rate and accuracy. However, as pointed out in [106] and [107], these methods only focus on data “facts” and fail to consider other aspects such as decision-making support, or insight generation. The motivation of this RQ was then to move beyond formal laboratory studies and focus on evaluation models allowing for a more holistic assessment.

The work by Zhu et al. [S28] presents a complexity analysis method that can be used during the design stage prior to conducting user studies. The authors measure the complexity of a visualization in terms of visual integration, number of separable dimensions for each visual unit, the complexity of interpreting the visual attributes, and the efficiency of visual search. These factors indicate the amount of cognitive load for a given design.

Padda et. al. [S31] propose to evaluate a visualization based on its comprehension support, i.e., if the users are able to grasp the underlying intentions of the design. The authors derive a set of criteria and characterize them into three aspects, namely: perception, cognition, and visual interface. In [S67], the authors present a methodology to compare two specific techniques (scatter plot and line chart) and determine which one is best suited for a given problem based on a “consistency score”. They measure consistency between a LOESS fit to the data and the visual representation by the two visualization types. The visualization with the larger consistency score is then suggested to be the right choice.

Another approach for evaluating and guiding the design of visualizations is the use of heuristics, which is a common usability inspection method [108]. Researchers have tested and developed various heuristics that capture issues specifically related to visualization and decision making. Forsell and Johansson [S36] empirically compared six sets of heuristics and synthesized them into a new set of ten heuristics that offer a wide explanatory coverage of common issues in information visualization. Similarly, Dowding and Merrill [S70] developed a heuristic checklist that combines Nielsen’s usability principles along with visualization-specific heuristics, (including those of Forsell and Johansson), to evaluate data visualization dashboards.

The authors in [S71] present an instrument that integrates a set of heuristics from the literature to evaluate visualizations, which also serves as a best practice checklist for their design. Wall et. al. [S73] develop a methodology based on the value framework proposed by Stasko [109] to evaluate visualizations along four components: time, insights, essence, and confidence. The authors derived a set of value-driven heuristics to estimate and quantify the potential value of a visualization, i.e., its ability to provide a proper understanding of the data.

In [S23], Lan et al. conducted crowdsourcing studies to identify affective responses to infographics and derived practical design heuristics. The authors divided them into two categories: Usability and Expressiveness. The Usability heuristic comprises aspects like accessibility, readability, messaging, and credibility. The Expressiveness category includes embodiment, narrative, and uniqueness.

Bai [S14] introduces the concept of “purposeful visualizations” to highlight the importance of context and suggests a taxonomy to support the design and evaluation of such visualizations. The model covers seven assessment areas: visual representation, information presentation, psychology of the observer, information quality, visual impact, overall design style, and overall performance. Each area further breaks down into several sub-categories.

The authors in [S58] propose a method consisting of a usability questionnaire and a fuzzy logic application to evaluate participants' responses. The evaluation criteria are communication, which comprises satisfaction, flexibility, and learning; information, which includes accessibility, accuracy and searchability; and technology, which consists of learnability, efficiency, and error-proneness. Bai et al. [S86] suggest measuring the Visual Intelligence Density (VID) of a visualization to quantify the amount of useful information for decision support it provides and to evaluate its capabilities in terms of supporting decision making. The evaluation criteria used to measure VID is based on Tufte’s work, in addition to their own proposed criteria: comparisons; causality, mechanism, structure, and explanation; multivariate analysis; integration of evidence; documentation; content counts most of all; 1D, 2D, 3D; variety of views; user interaction; animation. Each criterion is quantified in a scale from 0 (very weak capabilities) to 10 (extremely strong capabilities).

To measure visualization literacy (i.e., the ability and skill to read and interpret visually represented data and extract information), Boy et al. [S18] developed a method based on Item Response Theory for line charts, bar charts and scatter plots. The tasks considered to model test items were finding maximum, minimum, variations, intersections, calculating averages, and making comparisons. Each test item involves a stimulus, a task, and a question. Along the same line, Lee et. al. [S59] developed the Visualization Literacy Assessment Test (VLAT), aimed at non-expert users. It consists of twelve data visualization types, namely: line chart, bar chart, stacked bar chart, 100% stacked bar chart, pie chart, histogram, scatter plot, area chart, stacked area chart, bubble chart, choropleth map, and treemap. The final test contains 53 multiple-choice items that cover eight major tasks related to visualization. Each item is composed of a task, a content validity ratio (CVR), an item difficulty index (P), and an item discrimination index (D).

Hung and Parsons [S61] present VisEngage, a self-assessment questionnaire that comprises eleven characteristics of user engagement: aesthetics, captivation, challenge, control, discovery, exploration, creativity, attention, interest, novelty, and autotelism. The questionnaire consists of 22 questions rated on a seven-point Likert scale, and it is intended to be used immediately after interacting with a visualization.

This section discusses the implications derived from our work. We reflect on the main findings of the research questions and address the threats to the validity of the results.

5.1 RQ1: Is there a valid and accepted definition of “data-driven storytelling”?

Even though the concepts of “data visualization” and “storytelling” are tightly associated, when it comes to creating effective visuals, they are not interchangeable, and visualizing information does not necessarily mean creating a narrative visualization. To better understand what data storytelling is, we articulate the differences between information visualization and storytelling below, and then derive a working definition of data storytelling.

Information Visualization

The term data visualization is used to describe the visualization of large data sets. Data visualizations were primarily meant for use in the scientific field and aimed at an efficient reading of data in analytical tasks [98]. Information visualization is a subfield of data visualization [1] that leverages the cognitive capacity of human visual perception, evolved for fast pattern detection and recognition, to communicate underlying relationships and trends in large datasets. The major goal of Information visualization is to amplify cognition [99], and help people perform tasks effectively [3].

The term “chart” or “visualization” describes a graphical representation of data. Charts, particularly within the context of presentation or persuasion, are designed to aid in the memorability of the presented data [100] and reduce cognitive load [99], [101], [102].

Storytelling and Narrative

Storytelling is a central aspect of human communication and cognition: for a long time, people have used stories to convey information, values, and experiences. In the research field, we found fragmented views and inconsistent definitions: some authors draw a distinction between “storytelling” and “narrative,” while others used the terms interchangeably.

Storytelling is defined as the social and cultural activity of sharing stories [103]. A fundamental aspect of storytelling are emotions and the cognitive responses the story evokes in its audience [104]. Gabriel [105] defined stories as “emotionally and symbolically charged narratives” oral or written that “usually have a plot, characters, aim to entertain, persuade or win over.”

According to the Oxford English Dictionary, narrative is described as “an account of a series of events given in order and with the establishing of connections between them.” It combines the narrative contents (story) and the narrative form (discourse) [104]. In simple terms, narration is the telling of a sequence of events to convey a story to an audience. A well-told story conveys great quantities of information in relatively few words in a format that is easily assimilated [106].

Data Storytelling

Kosara and McKinlay [23] define a data story as an ordered sequence of steps consisting of visualizations that can include text and images but are essentially based on data.

Riche et al. [22] refer to data-driven stories as stories that start from a narrative that is either based on or contains data, often portrayed by data visualizations, to clarify, inform and provide context to visually salient differences. In [20], the authors describe it as a sequence of “story pieces” (facts backed up by data), visualized to support one or more intended messages. The visualization can include annotations (labels, pointers, text) or narration to highlight and emphasize the message and to avoid ambiguity. These story pieces are presented in a meaningful order to support the author’s main goal (educate, persuade, convince, for example).

Based on these results, we derive a definition of data storytelling, by refining and incorporating previous ones, as: “The creation of narrative visualizations to convey an intended message, which can include images, text, and annotations to emphasize the message, avoid ambiguity, and facilitate decision making.”

As [107] points out, data storytelling sits at the intersection between data, traditional narrative, and visualization (Fig. 6). This is also consistent with the work of Edmond and Bednarz [120], as they propose “NarVis” (narrative visualizations) that are situated between data visualization and narrative. According to [108], the essence of a narrative visualization is a good storytelling. A story worth telling challenges the reader and is a means of discovery. It drives the audience to ask more questions and pushes them from simply believing to knowing with a degree of confidence [26].

5.2 RQ2: What are the data storytelling best practices reported in the literature and how are they implemented?

The aim of this question was to collect and summarize existing guidelines for creating narrative visualizations, and when possible, to also explain how they are implemented. In general, we observed that there is not a lack of guidelines, but rather they are scattered across the literature and some of them only apply to a certain type of chart.

The most frequently mentioned best practice was BP17, “choose the visualization technique that better supports the expected tasks.” It is also one of the practices with the largest number of implementations (over thirty), as it directly influences the amount of time it takes for a user (or decision maker) to solve a problem, and therefore, its complexity [122]. As discussed in several primary studies, different charts (or design choices within a single chart) perform better than others depending on the task, and designers must consider how they want the display to support a specific task, at potential cost for others [123]. For instance, spotting outliers in a scatterplot would be difficult at low marker opacity but estimating data density could benefit from it [59].

We believe this guideline is intrinsically related to BP5, (“select the appropriate visualization considering the types of data to represent and the advantages and disadvantages of each technique”). This is so because one cannot choose the appropriate visualization without also considering the target tasks, both critical to aid the user in understanding the underlying data and improve decision making. BP5, however, also highlights the importance of tailoring visualizations to their audiences, considering aspects like chart familiarity and learning curves.

The second most referenced practice was BP8 “map information and data dimensions to the most salient visual features”, followed by BP15 “use text, labels and annotations for effective information consumption and decision making”. BP8 reflects on whether the visualization uses comprehensible data encodings, as suggested by [31]. Features include color, size, orientation, and shape, which allow the user to perform the required tasks effectively. BP15 focuses on enhancing the interpretability of the information depicted in the charts while also developing the narrative aspect. Titles and text are key to increase memorability. As pointed out in [41], a good title can make the difference between a visualization that is recalled correctly from one that is not. Labels, in turn, help orient the user, and annotations can be used to highlight interesting patterns.

We found that many practices resemble user interface design guidelines, such as BP3, BP22, BP33, BP36 or BP37. Other practices mainly focus on storytelling issues, such as BP25 and BP30. Moreover, we did not find any practical demonstrations for BP4, and BP26 and BP38 as they are straightforward guidelines and generally do not require illustration.

Overall, these findings indicate that each best practice might be associated with one or more evaluation criteria, as each one serves a purpose (e.g., improving usability, increasing memorability, or enhancing comprehension, among others). Nonetheless, further analysis is necessary to validate these associations.

5.3 RQ3: What are the criteria to evaluate narrative visualizations?

The goal of this question was to investigate the factors involved in quality visualizations, particularly, for narrative and storytelling purposes. This has been a topic of growing interest in the research community over the years. In general, a visualization is considered effective if it helps people extract accurate information [111] without further complexity [44].

Although several studies propose different sets of criteria, we found that there are no unified standards for what constitutes an “effective” visualization. Instead, every author focuses on evaluating a given aspect of a visualization and the traits it encompasses. We collected the most mentioned, well-known criteria and grouped the less common features as sub-criteria

Among the primary criteria, we found Memorability, Comprehension, and Engagement. Several studies emphasize the importance of memorability in visualizations, however, not every author agrees on it [72], arguing that the fact that the audience remembers a visualization does not necessarily mean it is effective. Moreover, there are certain challenges to measure it [48]. We believe memorability is a fundamental aspect to remember information prior to making decisions; particularly when the user has limited time to interact with the visualization (e.g., company meetings, crisis settings [124]).

Comprehension was pointed out by three studies as the primary goal of any visualization. We found the most related items (16) for this criterion. Even though it is highly related to literacy, we did not include it as sub-criteria since it is inherent to the user [125], rather than a visualization trait. Designers can tailor visualizations to support the audience’s various levels of literacy, thus making information as comprehensible as possible.

Engagement was pointed out by three studies as a complex construct involving several factors such as aesthetics, user control, or exploration, for instance. Despite it lacks a clear definition, its main concern is the user’s immersion in a visualization [126]. A visualization being viewed for a certain amount of time and receiving interactions, will be considered as more successful than others. As suggested by Mahyar et al., it is even more important when the target audience are not domain experts [127]. While there have been several efforts towards this direction, there is no unified approach to measure engagement yet.

Among the most frequently mentioned sub-criteria, we can mention the “aesthetics” or style of visualizations. This makes sense, given that it is tied to every major criterion: if a user finds a visualization aesthetically pleasing, the more willing he/she is to use it [128] (perceived usability), spend time on it (engagement) and remember the information (memorability).

Our findings differ slightly from what [73] proposes as criteria to evaluate data-driven stories, as we did not find “dissemination” or “impact” per se in the primary studies. It can be argued, however, that these terms are highly related to engagement, and more studies are necessary to have a deeper understanding of it. Regarding impact, some of the studies mentioned in previous sections have tested the effect of incorporating storytelling in regular visualizations to measure the audience’s reaction [129], or the decision-making capabilities [14], [15].

Overall, as mentioned in [73], all these criteria are subjective constructs, and thus, they depend on the context of application and cannot be measured directly. We argue that the goal of a visualization should be considered when evaluating the criteria.

5.4 RQ4: What are the current strategies to evaluate the quality of narrative visualizations?

The motivation behind this RQ was to find evaluation methodologies being consistent with the best practices and criteria found in previous questions, outside of the traditional laboratory studies.

A variety of approaches have been proposed along that way. Some approaches derive from the Human-Computer Interaction (HCI) field, such as heuristics evaluation (S36, S70, S71, S23). Other models address a specific criterion, such as comprehension (S31) or engagement (S61, S73), while others involve the use of algorithms (S58). There was only one method whose goal was to compare two different techniques and select the most appropriate one (S67).

Among the heuristic evaluation, there were a few authors that suggest the standard technique be supplemented with new details, as it is the case of the value-driven heuristics (S73) to assess the potential utility of a visualization, or [S23] that focuses on the “affective” impact of visualizations. Heuristic evaluation, however, has certain limitations, as it depends on evaluators’ background and domain knowledge and cannot always be applied due to their generality.

We found that many approaches do not explicitly mention the targeted visualization techniques, nor the goal of the visualizations they assess (decision-making support, or persuasion, for instance). Although we found several best practices in RQ2, we observed the evaluation strategies do not consider all of them, rather they focus only on a certain guideline or set of guidelines. For instance, the complexity score method in [S28] evaluates the aspect ratio of a chart, so that the user can perform tasks more efficiently, while S31 takes into account perceptual, cognitive and presentation aspects to assess comprehension.

We believe these methods might be classified into those that assess the visualization itself (S14, S23, S28, S31, S67, S36, S70, S71, S58, S86) and those that evaluate aspects concerning the user, such as literacy (S18, S59) or engagement (S61, S73).

In general, and in line with past research findings [130], the major obstacle to developers and designers of visualizations is the lack of out-of-the-box, ready to use evaluation tools. These methods, however, can serve as a starting point to other evaluation models.

5.5 Implications for research

The results of this SMS yield some opportunities for future research. First, more empirical studies are needed to test the efficacy of certain best practices. For instance, researchers can take a subset of the best practices, and observe the effects of including or excluding them in a decision-making context. We acknowledge that no single visualization can incorporate all thirty-eight best practices, thus a deeper understanding of the design space tradeoffs is needed to identify which of these best practices are necessary to reach a given goal.

Moreover, many of the criteria found in this study are subjective constructs and can be further examined and characterized in terms of their specific features: their formal definition in the context of visualization, or how to measure them appropriately, among other things. The relationship between best practices and evaluation criteria can also be observed.

Evaluation is perhaps the most challenging aspect since it involves the visualization itself and the user’s capabilities to interact with it and extract useful information. One limitation of past research is that they not always stated the goal of the visualizations they assessed, (a critical aspect to interpret results accordingly), or they only focused on a certain type of chart. As we mentioned the previous section, the most prevalent evaluation methodology are laboratory experiments and user studies that assess how well a visualization communicates facts. We hope the results of this study will assist researchers to go beyond this paradigm, to develop more contextualized, specific strategies.

5.6 Implications for practice

This SMS presented 38 visualization design best practices along with several recommendations on how to implement them. Designers and developers can follow these practices during the planning and design of visualizations or use them to compare to the practices they are currently adopting and identify improvement opportunities.

Additionally, by understanding the criteria for effective visualizations, engineers can determine their goals more clearly (i.e.: to make visualizations more memorable, more comprehensible, or more engaging, for instance) and make informed design choices towards that direction.

5.7 Threats to validity

This section discusses the limitations that may impact this study regarding construct, internal, external, and conclusion validity.

Construct Validity: Construct validity is determined by our ability to capture what we intended. During the search, primary studies could have been missed. We mitigated this threat by searching on different libraries that cover the majority of the high-quality publications in SE and complementing the search with forward and backward snowballing sampling [37]. In addition, we performed an updated search re-executing our search query to capture new papers published during the course of this research.

Internal Validity: These threats reflect possible wrong conclusions when causal relations are examined [131]. Researcher bias constitutes a threat to the internal validity. To reduce this threat, we performed the selection process iteratively. For the data extraction phase, we conducted a pilot extraction to validate the data extraction form. We had one researcher extracting the data and another reviewing the extraction. Any conflicts during this phase were discussed and resolved by the authors. To measure the level of agreement between researchers, we used the Cohen Kappa statistic [87].

External Validity: External validity refers to what extent it is possible to generalize the findings. To ensure the widest coverage possible, we included papers published from 1984 to 2021. The excluded papers may affect the generalizability of our results. However, we argue that they do not have a significant impact on our review, as the ones included share similar ideas and recommendations.

Conclusion Validity: Conclusion validity measures the reproducibility of the study. This threat was mitigated by following the protocol proposed by [74], widely used in SE research, to determine research questions, data sources and search strategy, inclusion and exclusion criteria, quality assessment, data extraction, and study selection.

This paper presented a systematic mapping study of 95 studies about information visualization and data storytelling of the past 30+ years. We were interested in collecting existing definitions of “data storytelling” reported in the literature, guidelines for the design of narrative visualizations, as well as evaluation criteria and the methods to assess them.

Our findings revealed that there is no clear, agreed-upon definition of what “data storytelling” encompasses, though several efforts have been being made in that direction. We thus contribute by deriving a working definition of data storytelling, which distinguishes among the involved concepts.

Furthermore, the results of this SMS provide a useful overview of design guidelines for narrative visualizations, which can serve as a starting point to assist practitioners and researchers to create effective communications. Overall, we found 38 best practices regarding data, cognitive, perceptual, presentation and usability aspects, and over 300 recommendations on how to implement them.

Regarding evaluation criteria, we found five major aspects for effective visualizations, namely: comprehension, engagement, information, memorability, and usability, each one comprising their own sub-criteria. As for evaluation methods, although there are useful approaches to assess visualizations, we observed that some problems still remain since not all of the approaches are comprehensive enough in terms of the criteria and best practices they consider, and thus do might not support the real needs of designers and developers.

Based on the results of this SMS, we plan to develop an evaluation model for narrative visualizations that captures the guidelines and criteria discussed, as comprehensively as possible, so as to enable an iterative assessment of visualizations during the design phase.

Competing interest. The authors have no competing interests to declare.

Acknowledgements. This work is supported by National Council on Scientific and Technical Research (CONICET) under a PhD Fellowship (RESOL-2021-154-APN-DIR#CONICET) and the National University of the NorthEast (SCyT - UNNE) under grant 21F005. It is part of the research conducted under the Computer Science Doctorate Program at UNNE, UNaM and UTN.

Authors contribution statement. Andrea Lezcano Airaldi: Conceptualization, Methodology, Data curation, Investigation, Data analysis, Visualization & Writing – original draft. Emanuel Irrazábal: Conceptualization, Methodology, Data analysis, Visualization, Supervision & Writing – review & editing. Jorge Andrés Diaz Pace: Supervision & Writing – review & editing.

F. H. Post, G. M. Nielson, and G. P. Bonneau, “Data Visualization: The State of the Art,” Jan. 2003.
D. Keim, G. Andrienko, J. D. Fekete, C. Görg, J. Kohlhammer, and G. Melançon, “Visual analytics: Definition, process, and challenges,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 4950 LNCS, pp. 154–175, 2008, doi: 10.1007/978-3-540-70956-5_7.
T. Munzner and E. (Graphic artist) Maguire, Visualization analysis & design. A K Peters/CRC Press, 2014. Accessed: Mar. 10, 2022. [Online]. Available: https://www.cs.ubc.ca/~tmm/talks/minicourse14/vad17stat545-4x4.pdf%0Ahttps://books.google.com.co/books?hl=en&lr=&id=dznSBQAAQBAJ&oi=fnd&pg=PP1&dq=Visualization+Analysis+and+Design&ots=HfNtFwMbMq&sig=5Voltt5TZziDkT5sVf-JoQDjP3w%0Ahttps://www.crcpress.com/V
S. K. Card, J. D. Mackinlay, and Ben. Shneiderman, Readings in information visualization : using vision to think. Morgan Kaufmann Publishers, 1999.
N. Henry Riche, C. Hurter, N. Diakopoulos, and S. Carpendale, Data-Driven Storytelling. 2018. doi: 10.1201/9781315281575.
B. Bach et al., “Narrative Design Patterns for Data-Driven Storytelling,” in Data-Driven Storytelling, A K Peters/CRC Press, 2018, pp. 107–133. doi: 10.1201/9781315281575-5.
W. Willett, J. Heer, J. M. Hellerstein, and M. Agrawala, “CommentSpace: Structured support for collaborative visual analysis,” in Conference on Human Factors in Computing Systems - Proceedings, 2011, pp. 3131–3140. doi: 10.1145/1978942.1979407.
D. Dowding, J. A. Merrill, N. Onorato, Y. Barrón, R. J. Rosati, and D. Russell, “The impact of home care nurses’ numeracy and graph literacy on comprehension of visual display information: Implications for dashboard design,” Journal of the American Medical Informatics Association, vol. 25, no. 2, pp. 175–182, Feb. 2018, doi: 10.1093/jamia/ocx042.
M. Gilger, “Addressing information display weaknesses for situational awareness,” 2006. doi: 10.1109/MILCOM.2006.302129.
C. Nussbaumer Knaflic, Storytelling with Data: A Data Visualization Guide for Business Professionals. Wiley, 2015. Accessed: Mar. 17, 2021. [Online]. Available: https://www.amazon.com/-/es/Cole-Nussbaumer-Knaflic/dp/1119002257/ref=sr_1_1?crid=4WE4N3JTKA0L&dchild=1&keywords=storytelling+with+data&qid=1616074837&s=books&sprefix=storytelling%2Caps%2C315&sr=1-1
S. Carpendale, “Evaluating information visualizations,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 4950 LNCS, pp. 19–45, 2008, doi: 10.1007/978-3-540-70956-5_2.
H. Lam, E. Bertini, P. Isenberg, C. Plaisant, S. Carpendale Empirical, and H. Lam Enrico Bertini Petra Isenberg Catherine Plaisant Sheelagh Carpendale, “Empirical Studies in Information Visualization: Seven Scenarios,” Institute of Electrical and Electronics Engineers, vol. 18, no. 9, pp. 1520–1536, 2012, doi: 10.1109/TVCG.2011.279ï.
C. Plaisant, “The challenge of information visualization evaluation,” Proceedings of the Workshop on Advanced Visual Interfaces AVI, pp. 109–116, 2004, doi: 10.1145/989863.989880.
E. Dimara, A. Bezerianos, and P. Dragicevic, “The Attraction Effect in Information Visualization,” IEEE Transactions on Visualization and Computer Graphics, vol. 23, no. 1, pp. 471–480, Jan. 2017, doi: 10.1109/TVCG.2016.2598594.
E. Dimara, A. Bezerianos, and P. Dragicevic, “Narratives in crowdsourced evaluation of visualizations: A double-edged sword?,” Conference on Human Factors in Computing Systems - Proceedings, vol. 2017-May, pp. 5475–5484, May 2017, doi: 10.1145/3025453.3025870.
L. E. Matzen, M. J. Haass, K. M. Divis, Z. Wang, and A. T. Wilson, “Data Visualization Saliency Model: A Tool for Evaluating Abstract Data Visualizations,” IEEE Transactions on Visualization and Computer Graphics, vol. 24, no. 1, pp. 563–573, Jan. 2018, doi: 10.1109/TVCG.2017.2743939.
F. Gutiérrez, N. N. Htun, F. Schlenz, A. Kasimati, and K. Verbert, “A review of visualisations in agricultural decision support systems: An HCI perspective,” Computers and Electronics in Agriculture, vol. 163, no. May, p. 104844, 2019, doi: 10.1016/j.compag.2019.05.053.
S. Grainger, F. Mao, and W. Buytaert, “Environmental data visualisation for non-scientific contexts: Literature review and design framework,” Environmental Modelling and Software, vol. 85, pp. 299–318, 2016, doi: 10.1016/j.envsoft.2016.09.004.
E. Segel and J. Heer, “Narrative visualization: Telling stories with data,” IEEE Transactions on Visualization and Computer Graphics, vol. 16, no. 6, pp. 1139–1148, 2010, doi: 10.1109/TVCG.2010.179.
T.-M. Rhyne, B. Lee, N. H. Riche, P. Isenberg, and S. Carpendale, “More Than Telling a Story: Transforming Data into Visually Shared Stories,” 2015. Accessed: Mar. 09, 2021. [Online]. Available: www.gapminder.org/
J. Hullman and N. Diakopoulos, “Visualization rhetoric: Framing effects in narrative visualization,” IEEE Transactions on Visualization and Computer Graphics, vol. 17, no. 12, pp. 2231–2240, 2011, doi: 10.1109/TVCG.2011.255.
N. H. Riche, C. Hurter, N. Diakopoulos, and S. Carpendale, Data-driven storytelling. A K Peters/CRC Press, 2018.
R. Kosara and J. MacKinlay, “Storytelling: The next step for visualization,” Computer (Long Beach Calif), vol. 46, no. 5, pp. 44–50, 2013, doi: 10.1109/MC.2013.36.
A. Ojo and B. Heravi, “Patterns in Award Winning Data Storytelling,” https://doi.org/10.1080/21670811.2017.1403291, vol. 6, no. 6, pp. 693–718, Jul. 2017, doi: 10.1080/21670811.2017.1403291.
F. el Outa, M. Francia, P. Marcel, V. Peralta, and P. Vassiliadis, “Towards a Conceptual Model for Data Narratives,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 12400 LNCS, pp. 261–270, 2020, doi: 10.1007/978-3-030-62522-1_19.
S. A. Matei and L. Hunter, “Data storytelling is not storytelling with data: A framework for storytelling in science communication and data journalism,” https://doi.org/10.1080/01972243.2021.1951415, vol. 37, no. 5, pp. 312–322, 2021, doi: 10.1080/01972243.2021.1951415.
J. Bertin, “Semiology of graphics,” p. 415, 1983, Accessed: Mar. 07, 2022. [Online]. Available: https://books.google.com/books/about/Semiology_of_Graphics.html?hl=es&id=luZQAAAAMAAJ
E. R. Tufte, The Visual Display of Quantitative Information, 2nd ed. Graphics Press, 2001. Accessed: Mar. 17, 2021. [Online]. Available: https://www.edwardtufte.com/tufte/books_vdqi
S. Evergreen, Effective Data Visualization: The Right Chart for the Right Data, 2nd ed. SAGE Publications, Inc., 2019. doi: 10.3138/cjpe.69480.
W. S. Cleveland and R. McGill, “Graphical perception and graphical methods for analyzing scientific data,” Science (1979), vol. 229, no. 4716, pp. 828–833, 1985, doi: 10.1126/SCIENCE.229.4716.828.
W. S. Cleveland and R. McGill, “Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods,” J Am Stat Assoc, vol. 79, no. 387, p. 531, Sep. 1984, doi: 10.2307/2288400.
S. M. Kosslyn, Graph Design for the Eye and Mind. Oxford University Press, 2006. doi: 10.1093/ACPROF:OSO/9780195311846.001.0001.
L. Bartram and M. C. Stone, “Whisper, don’t scream: Grids and transparency,” IEEE Transactions on Visualization and Computer Graphics, vol. 17, no. 10, pp. 1444–1458, 2011, doi: 10.1109/TVCG.2010.237.
L. Bartram, B. Cheung, and M. Stone, “The effect of colour and transparency on the perception of overlaid grids,” IEEE Transactions on Visualization and Computer Graphics, vol. 17, no. 12, pp. 1942–1948, 2011, doi: 10.1109/TVCG.2011.242.
P. Bera, “How colors in business dashboards affect users’ decision making,” Commun ACM, vol. 59, no. 4, pp. 50–57, Mar. 2016, doi: 10.1145/2818993.
D. A. Szafir, “Modeling Color Difference for Visualization Design,” IEEE Transactions on Visualization and Computer Graphics, vol. 24, no. 1, pp. 392–401, Jan. 2018, doi: 10.1109/TVCG.2017.2744359.
K. B. Schloss, C. C. Gramazio, A. T. Silverman, M. L. Parker, and A. S. Wang, “Mapping Color to Meaning in Colormap Data Visualizations,” IEEE Transactions on Visualization and Computer Graphics, vol. 25, no. 1, pp. 810–819, Jan. 2019, doi: 10.1109/TVCG.2018.2865147.
A. Dasgupta, J. Poco, B. Rogowitz, K. Han, E. Bertini, and C. T. Silva, “The Effect of Color Scales on Climate Scientists’ Objective and Subjective Performance in Spatial Data Analysis Tasks,” IEEE Trans Vis Comput Graph, vol. 26, no. 3, pp. 1577–1591, Mar. 2020, doi: 10.1109/TVCG.2018.2876539.
K. Reda and D. A. Szafir, “Rainbows revisited: Modeling effective colormap design for graphical inference,” IEEE Transactions on Visualization and Computer Graphics, vol. 27, no. 2, pp. 1032–1042, Feb. 2021, doi: 10.1109/TVCG.2020.3030439.
Y. Meng, H. Zhang, M. Liu, and S. Liu, “Clutter-aware label layout,” IEEE Pacific Visualization Symposium, vol. 2015-July, pp. 207–214, Jul. 2015, doi: 10.1109/PACIFICVIS.2015.7156379.
M. A. Borkin et al., “Beyond Memorability: Visualization Recognition and Recall,” IEEE Transactions on Visualization and Computer Graphics, vol. 22, no. 1, pp. 519–528, Jan. 2016, doi: 10.1109/TVCG.2015.2467732.
D. J. Gillan and E. H. Richman, “Minimalism and the syntax of graphs,” Human Factors, vol. 36, no. 4, pp. 619–644, 1994, doi: 10.1177/001872089403600405.
K. Ajani, E. Lee, C. Xiong, C. Nussbaumer Knaflic, W. Kemper, and S. Franconeri, “Declutter and Focus: Empirically Evaluating Design Guidelines for Effective Data Communication,” IEEE Transactions on Visualization and Computer Graphics, 2021, doi: 10.1109/TVCG.2021.3068337.
S. Haroz and D. Whitney, “How capacity limits of attention influence information visualization effectiveness,” IEEE Transactions on Visualization and Computer Graphics, vol. 18, no. 12, pp. 2402–2410, 2012, doi: 10.1109/TVCG.2012.233.
C. C. Gramazio, K. B. Schloss, and D. H. Laidlaw, “The relation between visualization size, grouping, and user performance,” IEEE Transactions on Visualization and Computer Graphics, vol. 20, no. 12, pp. 1953–1962, Dec. 2014, doi: 10.1109/TVCG.2014.2346983.
R. Borgo et al., “An empirical study on using visual embellishments in visualization,” IEEE Transactions on Visualization and Computer Graphics, vol. 18, no. 12, pp. 2759–2768, 2012, doi: 10.1109/TVCG.2012.197.
S. Bateman, R. L. Mandryk, C. Gutwin, A. Genest, D. McDine, and C. Brooks, “Useful junk? The effects of visual embellishment on comprehension and memorability of charts,” Conference on Human Factors in Computing Systems - Proceedings, vol. 4, pp. 2573–2582, 2010, doi: 10.1145/1753326.1753716.
M. A. Borkin et al., “What makes a visualization memorable,” IEEE Transactions on Visualization and Computer Graphics, vol. 19, no. 12, pp. 2306–2315, 2013, doi: 10.1109/TVCG.2013.234.
D. Skau, L. Harrison, and R. Kosara, “An Evaluation of the Impact of Visual Embellishments in Bar Charts,” Computer Graphics Forum, vol. 34, no. 3, pp. 221–230, Jun. 2015, doi: 10.1111/CGF.12634.
A. Mairena, M. Dechant, C. Gutwin, and A. Cockburn, “A Baseline Study of Emphasis Effects in Information Visualization • Graphics Interface,” in Proceedings of Graphics Interface 2020, May 2020, pp. 327–339. Accessed: Mar. 14, 2022. [Online]. Available: https://graphicsinterface.org/proceedings/gi2020/gi2020-33/
A. Mairena, C. Gutwin, and A. Cockburn, “Which emphasis technique to use? Perception of emphasis techniques with varying distractors, backgrounds, and visualization types,” Inf Vis, vol. 21, no. 2, pp. 95–129, Apr. 2022, doi: 10.1177/14738716211045354.
L. Perkhofer, C. Eisl, H. Losbichler, and A. Greil, “Improving Information Perception of Graphical Displays - an Experimental Study on the Display of Column Graphs ,” Jun. 2014. Accessed: Mar. 14, 2022. [Online]. Available: https://www.researchgate.net/publication/295513740_Improving_Information_Perception_of_Graphical_Displays_-_an_Experimental_Study_on_the_Display_of_Column_Graphs
A. Srinivasan, M. Brehmer, B. Lee, and S. M. Drucker, “What’s the difference?: Evaluating variants of multi-series bar charts for visual comparison tasks,” in Conference on Human Factors in Computing Systems - Proceedings, Apr. 2018, vol. 2018-April. doi: 10.1145/3173574.3173878.
J. Diaz, O. Meruvia-Pastor, and P. P. Vazquez, “Improving perception accuracy in bar charts with internal contrast and framing enhancements,” in 22nd International Conference Information Visualisation (IV), Dec. 2018, pp. 159–168. doi: 10.1109/IV.2018.00037.
J. Talbot, V. Setlur, and A. Anand, “Four experiments on the perception of bar charts,” IEEE Transactions on Visualization and Computer Graphics, vol. 20, no. 12, pp. 2152–2160, Dec. 2014, doi: 10.1109/TVCG.2014.2346320.
C. Xiong, V. Setlur, B. Bach, E. Koh, K. Lin, and S. Franconeri, “Visual Arrangements of Bar Charts Influence Comparisons in Viewer Takeaways,” IEEE Transactions on Visualization and Computer Graphics, vol. 28, no. 1, pp. 955–965, Aug. 2021, doi: 10.48550/arxiv.2108.06370.
E. Bertini and G. Santucci, “Improving 2D scatterplots effectiveness through sampling, displacement, and user perception,” Proceedings of the International Conference on Information Visualisation, vol. 2005, pp. 826–834, 2005, doi: 10.1109/IV.2005.62.
M. Fink, J. H. Haunert, J. Spoerhase, and A. Wolff, “Selecting the aspect ratio of a scatter plot based on its delaunay triangulation,” IEEE Transactions on Visualization and Computer Graphics, vol. 19, no. 12, pp. 2326–2335, 2013, doi: 10.1109/TVCG.2013.187.
L. Micallef, G. Palmas, A. Oulasvirta, and T. Weinkauf, “Towards Perceptual Optimization of the Visual Design of Scatterplots,” IEEE Transactions on Visualization and Computer Graphics, vol. 23, no. 6, pp. 1588–1599, Jun. 2017, doi: 10.1109/TVCG.2017.2674978.
J. Li, J. J. van Wijk, and J. B. Martens, “Evaluation of symbol contrast in scatterplots,” IEEE Pacific Visualization Symposium, PacificVis 2009 - Proceedings, pp. 97–104, 2009, doi: 10.1109/PACIFICVIS.2009.4906843.
N. Kong, J. Heer, and M. Agrawala, “Perceptual guidelines for creating rectangular treemaps,” IEEE Transactions on Visualization and Computer Graphics, vol. 16, no. 6, pp. 990–998, 2010, doi: 10.1109/TVCG.2010.186.
D. Turo and B. Johnson, “Improving the visualization of hierarchies with treemaps: Design issues and experimentation,” Proceedings of the 3rd Conference on Visualization, VIS 1992, pp. 124–131, Oct. 1992, doi: 10.1109/VISUAL.1992.235217.
J. Scholtz, “Developing guidelines for assessing visual analytics environments,” Information Visualization, vol. 10, no. 3, pp. 212–231, 2011, doi: 10.1177/1473871611407399.
O. Adagha, R. M. Levy, and S. Carpendale, “Towards a product design assessment of visual analytics in decision support applications: a systematic review,” Journal of Intelligent Manufacturing, vol. 28, no. 7, pp. 1623–1633, 2017, doi: 10.1007/s10845-015-1118-5.
T. Isenberg, P. Isenberg, J. Chen, M. Sedlmair, and T. Moller, “A systematic review on the practice of evaluating visualization,” IEEE Transactions on Visualization and Computer Graphics, vol. 19, no. 12, pp. 2818–2827, 2013, doi: 10.1109/TVCG.2013.126.
H. Lam, E. Bertini, P. Isenberg, C. Plaisant, and S. Carpendale, “Empirical studies in information visualization: Seven scenarios,” IEEE Transactions on Visualization and Computer Graphics, vol. 18, no. 9, pp. 1520–1536, 2012, doi: 10.1109/TVCG.2011.279.
M. R. Turchioe et al., “A Systematic Review of Patient-Facing Visualizations of Personal Health Data,” Applied Clinical Informatics, vol. 10, no. 4, pp. 751–770, 2019, doi: 10.1055/s-0039-1697592.
M. Tory and T. Möller, “Evaluating visualizations: Do expert reviews work?,” IEEE Computer Graphics and Applications, vol. 25, no. 5, pp. 8–11, 2005, doi: 10.1109/MCG.2005.102.
R. Borgo, L. Micallef, B. Bach, F. McGee, and B. Lee, “Information Visualization Evaluation Using Crowdsourcing,” Computer Graphics Forum, vol. 37, no. 3, pp. 573–595, 2018, doi: 10.1111/cgf.13444.
E. Bertini, A. Tatu, and D. Keim, “Quality metrics in high-dimensional data visualization: An overview and systematization,” IEEE Transactions on Visualization and Computer Graphics, vol. 17, no. 12. pp. 2203–2212, 2011. doi: 10.1109/TVCG.2011.229.
P. Shah and J. Hoeffner, “Review of graph comprehension research: Implications for instruction,” Educational Psychology Review, vol. 14, no. 1, pp. 47–69, 2002, doi: 10.1023/A:1013180410169.
B. Saket, A. Endert, and J. Stasko, “Beyond usability and performance: A review of user experience-focused evaluations in Visualization,” ACM International Conference Proceeding Series, vol. 24-October-2016, pp. 133–142, Oct. 2016, doi: 10.1145/2993901.2993903.
F. Amini, M. Brehmer, G. Bolduan, C. Elmer, and B. Wiederkehr, “Evaluating Data-Driven Stories and Storytelling Tools *,” in Data-Driven Storytelling, A K Peters/CRC Press, 2018, pp. 249–286. doi: 10.1201/9781315281575-11.
B. Kitchenham and S. M. Charters, “Guidelines for performing systematic literature reviews in software engineering,” 2007.
K. Petersen, S. Vakkalanka, and L. Kuzniarz, “Guidelines for conducting systematic mapping studies in software engineering: An update,” Information and Software Technology, vol. 64, pp. 1–18, Aug. 2015, doi: 10.1016/J.INFSOF.2015.03.007.
P. Singh, M. Galster, and K. Singh, “How do secondary studies in software engineering report automated searches? a preliminary analysis,” in ACM International Conference Proceeding Series, 2018, vol. Part F1377. doi: 10.1145/3210459.3210474.
V. Garousi and M. Felderer, “Experience-based guidelines for effective and efficient data extraction in systematic reviews in software engineering,” in ACM International Conference Proceeding Series, Jun. 2017, vol. Part F1286, pp. 170–179. doi: 10.1145/3084226.3084238.
A. Lezcano Airaldi, J. A. Diaz-Pace, and E. Irrazábal, “Narrative Visualizations Best Practices and Evaluation: A Systematic Mapping Study (Supplementary Material)”, Accessed: Jun. 05, 2022. [Online]. Available: https://drive.google.com/file/d/1HZEfcIZzyAxidJltWYvSkH1ujdOgyHxx/view?usp=sharing
F. Amini, M. Brehmer, G. Bolduan, C. Elmer, and B. Wiederkehr, “Evaluating Data-Driven Stories and Storytelling Tools *,” in Data-Driven Storytelling, 2018, pp. 249–286. doi: 10.1201/9781315281575-11.
L. Battle, P. Duan, Z. Miranda, D. Mukusheva, R. Chang, and M. Stonebraker, “Beagle: Automated Extraction and Interpretation of Visualizations from the Web,” Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, 2018, doi: 10.1145/3173574.
S. Lee, S. H. Kim, and B. C. Kwon, “VLAT: Development of a Visualization Literacy Assessment Test,” IEEE Transactions on Visualization and Computer Graphics, vol. 23, no. 1, pp. 551–560, Jan. 2017, doi: 10.1109/TVCG.2016.2598920.
M. Kuhrmann, D. M. Fernández, and M. Daneva, “On the pragmatic design of literature studies in software engineering: an experience-based guideline,” Empirical Software Engineering, vol. 22, no. 6, pp. 2852–2891, Dec. 2017, doi: 10.1007/s10664-016-9492-y.
C. Wohlin, “Guidelines for snowballing in systematic literature studies and a replication in software engineering,” in ACM International Conference Proceeding Series, 2014, pp. 1–10. doi: 10.1145/2601248.2601268.
B.A Kitchenham, “Guidelines for performing systematic literature reviews in software engineering,” 2007.
S. Higgins, Julian PT and Green, “Cochrane Handbook for Systematic Reviews of Interventions,” Handbook, vol. 2. p. 649, 2011.
D. Dermeval et al., “Applications of ontologies in requirements engineering: a systematic review of the literature,” Requirements Engineering, vol. 21, no. 4, pp. 405–437, Nov. 2016, doi: 10.1007/S00766-015-0222-6.
J. Pérez, J. Díaz, J. Garcia-Martin, and B. Tabuenca, “Systematic literature reviews in software engineering—enhancement of the study selection process using Cohen’s Kappa statistic,” Journal of Systems and Software, vol. 168, p. 110657, Oct. 2020, doi: 10.1016/j.jss.2020.110657.
“ATLAS.ti: The Qualitative Data Analysis & Research Software.” https://atlasti.com/ (accessed Mar. 08, 2022).
J. Corbin and A. Strauss, Basics of Qualitative Research (3rd ed.): Techniques and Procedures for Developing Grounded Theory. SAGE Publications, Inc., 2008. doi: 10.4135/9781452230153.
D. Dermeval et al., “Applications of ontologies in requirements engineering,” Requirements Engineering, vol. 21, no. 4, pp. 405–437, Nov. 2016, doi: 10.1007/S00766-015-0222-6.
R. Wieringa, N. Maiden, N. Mead, and C. Rolland, “Requirements engineering paper classification and evaluation criteria: A proposal and a discussion,” Requir Eng, vol. 11, no. 1, pp. 102–107, Mar. 2006, doi: 10.1007/S00766-005-0021-6.
J. Hullman, S. Drucker, N. Henry Riche, B. Lee, D. Fisher, and E. Adar, “A deeper understanding of sequence in narrative visualization,” IEEE Transactions on Visualization and Computer Graphics, vol. 19, no. 12, pp. 2406–2415, 2013, doi: 10.1109/TVCG.2013.119.
A. Satyanarayan and J. Heer, “Authoring narrative visualizations with Ellipsis,” Computer Graphics Forum, vol. 33, no. 3, pp. 361–370, 2014, doi: 10.1111/CGF.12392.
J. Boy, A. V. Pandey, J. Emerson, M. Satterthwaite, O. Nov, and E. Bertini, “Showing people behind data: Does anthropomorphizing visualizations elicit more empathy for human rights data?,” Conference on Human Factors in Computing Systems - Proceedings, vol. 2017-May, pp. 5462–5474, May 2017, doi: 10.1145/3025453.3025512.
X. Bai, D. White, and D. Sundaram, “Purposeful visualization,” Proceedings of the Annual Hawaii International Conference on System Sciences, 2011, doi: 10.1109/HICSS.2011.353.
X. Lan, Y. Shi, Y. Zhang, and N. Cao, “Smile or Scowl? Looking at Infographic Design through the Affective Lens,” IEEE Transactions on Visualization and Computer Graphics, vol. 27, no. 6, pp. 2796–2807, Jun. 2021, doi: 10.1109/TVCG.2021.3074582.
H. Padda, S. Mudur, A. Seffah, and Y. Joshi, “Comprehension of visualization systems - Towards quantitative assessment,” Proceedings of the 1st International Conference on Advances in Computer-Human Interaction, ACHI 2008, pp. 83–88, 2008, doi: 10.1109/ACHI.2008.19.
B. Tversky, “Visuospatial Reasoning,” in Cambridge Handbook of Thinking and ReasoningChapter: Visuospatial Reasoning, Cambridge University Press, 2005, pp. 209–240. Accessed: Apr. 07, 2022. [Online]. Available: https://www.researchgate.net/publication/232082377_Visuospatial_Reasoning
C. Ware, Information visualization : perception for design, 4th ed. Morgan Kaufmann, 2020.
Y. H. Hung and P. Parsons, “Assessing user engagement in information visualization,” in Conference on Human Factors in Computing Systems - Proceedings, May 2017, vol. Part F127655, pp. 1708–1717. doi: 10.1145/3027063.3053113.
S. Amri, H. Ltifi, and M. ben Ayed, “Towards an intelligent evaluation method of medical data visualizations,” International Conference on Intelligent Systems Design and Applications, ISDA, vol. 2016-June, pp. 673–678, Jun. 2016, doi: 10.1109/ISDA.2015.7489198.
D. A. Norman, The Design of Everyday Things . Doubleday Business, 1990.
M. W. Matlin, Cognition. Harcourt Brace Publishers, 1994.
H. L. O’Brien and E. G. Toms, “The development and evaluation of a survey to measure user engagement,” Journal of the American Society for Information Science and Technology, vol. 61, no. 1, pp. 50–69, Jan. 2010, doi: 10.1002/ASI.21229.
M. H. Huang, “Designing website attributes to induce experiential encounters,” Computers in Human Behavior, vol. 19, no. 4, pp. 425–442, Jul. 2003, doi: 10.1016/S0747-5632(02)00080-8.
E. Wall et al., “A heuristic approach to value-driven evaluation of visualizations,” IEEE Transactions on Visualization and Computer Graphics, vol. 25, no. 1, pp. 491–500, Jan. 2019, doi: 10.1109/TVCG.2018.2865146.
E. Dimara, A. Bezerianos, and P. Dragicevic, “Conceptual and Methodological Issues in Evaluating Multidimensional Visualizations for Decision Support,” IEEE Transactions on Visualization and Computer Graphics, vol. 24, no. 1, pp. 749–759, Jan. 2018, doi: 10.1109/TVCG.2017.2745138.
J. Nielsen, Usability Inspection Methods. New York: John Wiley & Sons, 1994. Accessed: Mar. 28, 2022. [Online]. Available: https://www.nngroup.com/books/usability-inspection-methods/
J. Stasko, “Value-driven evaluation of visualizations,” ACM International Conference Proceeding Series, vol. 10-November-2015, pp. 46–53, Nov. 2014, doi: 10.1145/2669557.2669579.
P. Isenberg, A. Bezerianos, P. Dragicevic, J.-D. Fekete, and J.-D. A. Fekete, “A Study on Dual-Scale Data Charts,” IEEE Transactions on Visualization and Computer Graphics, vol. 17, no. 12, pp. 2469–2487, Dec. 2011, doi: 10.1109/TVCG.2011.238.
S. K. Card, J. D. Mackinlay, and Ben. Shneiderman, Readings in information visualization : using vision to think. Morgan Kaufmann Publishers, 1999.
J. D. Kelly, “The Data-Ink Ratio and Accuracy of Newspaper Graphs:,” http://dx.doi.org/10.1177/107769908906600315, vol. 66, no. 3, pp. 632–639, Aug. 2016, doi: 10.1177/107769908906600315.
A. Savikhin, R. Maciejewski, and D. S. Ebert, “Applied visual analytics for economic decision-making,” VAST’08 - IEEE Symposium on Visual Analytics Science and Technology, Proceedings, pp. 107–114, 2008, doi: 10.1109/VAST.2008.4677363.
D. Dowding, J. A. Merrill, N. Onorato, Y. Barrón, R. J. Rosati, and D. Russell, “The impact of home care nurses’ numeracy and graph literacy on comprehension of visual display information: Implications for dashboard design,” Journal of the American Medical Informatics Association, vol. 25, no. 2, pp. 175–182, Feb. 2018, doi: 10.1093/jamia/ocx042.
“Storytelling - Wikipedia.” https://en.wikipedia.org/wiki/Storytelling (accessed Mar. 08, 2022).
A. Lugmayr, E. Sutinen, J. Suhonen, C. I. Sedano, H. Hlavacs, and C. S. Montero, “Serious storytelling – a first definition and review,” Multimedia Tools and Applications, vol. 76, no. 14, pp. 15707–15733, Jul. 2017, doi: 10.1007/S11042-016-3865-5.
Y. Gabriel, Storytelling in Organizations: Facts, Fictions, and Fantasies. Oxford University Press, 2000. doi: 10.1093/ACPROF:OSO/9780198290957.001.0001.
N. Gershon and W. Page, “What storytelling can do for information visualization,” Commun ACM, vol. 44, no. 8, pp. 31–37, Aug. 2001, doi: 10.1145/381641.381653.
B. Dykes, Effective data storytelling : how to drive change with data, narrative, and visuals, 1st ed. Wiley, 2019.
C. Edmond and T. Bednarz, “Three trajectories for narrative visualisation,” Visual Informatics, vol. 5, no. 2, pp. 26–40, Jun. 2021, doi: 10.1016/J.VISINF.2021.04.001.
J. Stikeleather, “The Three Elements of Successful Data Visualizations,” Apr. 13, 2013. https://hbr.org/2013/04/the-three-elements-of-successf (accessed Mar. 08, 2022).
J. M. Teets, D. P. Tegarden, and R. S. Russell, “Using cognitive fit theory to evaluate the effectiveness of information visualizations: An example using quality assurance data,” IEEE Transactions on Visualization and Computer Graphics, vol. 16, no. 5, pp. 841–853, 2010, doi: 10.1109/TVCG.2010.21.
D. Albers, M. Correll, and M. Gleicher, “Task-Driven Evaluation of Aggregation in Time Series Visualization,” Proc SIGCHI Conf Hum Factor Comput Syst, vol. 2014, pp. 551–560, 2014, doi: 10.1145/2556288.2557200.
A. Lezcano Airaldi, J. A. Diaz-Pace, and E. Irrazábal, “Data-driven Storytelling to Support Decision Making in Crisis Settings: A Case Study,” JUCS - Journal of Universal Computer Science 27(10): 1046-1068, vol. 27, no. 10, pp. 1046–1068, 2021, doi: 10.3897/JUCS.66714.
S. Lee, S. H. Kim, and B. C. Kwon, “VLAT: Development of a Visualization Literacy Assessment Test,” IEEE Transactions on Visualization and Computer Graphics, vol. 23, no. 1, pp. 551–560, Jan. 2017, doi: 10.1109/TVCG.2016.2598920.
J. Boy, F. Detienne, and J. D. Fekete, “Storytelling in information visualizations: Does it engage users to explore data?,” Conference on Human Factors in Computing Systems - Proceedings, vol. 2015-April, pp. 1449–1458, Apr. 2015, doi: 10.1145/2702123.2702452.
N. Mahyar, S.-H. Kim, and B. C. Kwon, “Towards a Taxonomy for Evaluating User Engagement in Information Visualization,” Oct. 2015. Accessed: Apr. 07, 2022. [Online]. Available: https://kops.uni-konstanz.de/handle/123456789/45094?locale-attribute=en
S. Chi’ and K. Kashirnum, “Apparent Usability vs. Inherent Usability Experimental analysis on the determinants of the apparent usability Masaaki Kurosu,” Conference companion on Human factors in computing systems - CHI ’95, doi: 10.1145/223355.
J. Liem, C. Perm, and J. Wood, “Structure and Empathy in Visual Data Storytelling: Evaluating their Influence on Attitude,” Computer Graphics Forum, vol. 39, no. 3, pp. 277–289, Jun. 2020, doi: 10.1111/CGF.13980.
T. Isenberg, P. Isenberg, J. Chen, M. Sedlmair, and T. Moller, “A systematic review on the practice of evaluating visualization,” IEEE Transactions on Visualization and Computer Graphics, vol. 19, no. 12, pp. 2818–2827, 2013, doi: 10.1109/TVCG.2013.126.
C. Wohlin, P. Runeson, M. Höst, M. C. Ohlsson, B. Regnell, and A. Wesslén, Experimentation in Software Engineering, vol. 9783642290442. Springer-Verlag Berlin Heidelberg, 2012. doi: 10.1007/978-3-642-29044-2.
L. Nowell, R. Schulman, and D. Hix, “Graphical encoding for information visualization: An empirical study,” Proceedings - IEEE Symposium on Information Visualization, INFO VIS, vol. 2002-January, pp. 43–50, 2002, doi: 10.1109/INFVIS.2002.1173146.
J. Boy, L. Eveillard, F. Detienne, and J. D. Fekete, “Suggested Interactivity: Seeking Perceived Affordances for Information Visualization,” IEEE Transactions on Visualization and Computer Graphics, vol. 22, no. 1, pp. 639–648, Jan. 2016, doi: 10.1109/TVCG.2015.2467201.
A. Quispel, A. Maes, and J. Schilperoord, “Graph and chart aesthetics for experts and laymen in design: The role of familiarity and perceived ease of use:,” http://dx.doi.org/10.1177/1473871615606478, vol. 15, no. 3, pp. 238–252, Sep. 2015, doi: 10.1177/1473871615606478.
Y. Tanahashi, N. Leaf, and K. L. Ma, “A Study On Designing Effective Introductory Materials for Information Visualization,” Computer Graphics Forum, vol. 35, no. 7, pp. 117–126, Oct. 2016, doi: 10.1111/CGF.13009.
N. Ó. Brolcháin et al., “Extending open data platforms with storytelling features,” ACM International Conference Proceeding Series, vol. Part F128275, pp. 48–53, Jun. 2017, doi: 10.1145/3085228.3085283.
O. Juarez, C. Hendrickson, and J. H. Garrett, “Evaluating visualizations based on the performed task,” Proceedings of the International Conference on Information Visualisation, vol. 2000-July, pp. 135–142, 2000, doi: 10.1109/IV.2000.859748.
H. Padda, A. Seffah, and S. Mudur, “Investigating the comprehension support for effective visualization tools - A case study,” Proceedings of the 2nd International Conferences on Advances in Computer-Human Interactions, ACHI 2009, pp. 283–288, 2009, doi: 10.1109/ACHI.2009.37.
J. Boy, R. A. Rensink, E. Bertini, and J. D. Fekete, “A principled way of assessing visualization literacy,” IEEE Transactions on Visualization and Computer Graphics, vol. 20, no. 12, pp. 1963–1972, Dec. 2014, doi: 10.1109/TVCG.2014.2346984.
S. Nusrat, M. J. Alam, and S. Kobourov, “Evaluating Cartogram Effectiveness,” IEEE Transactions on Visualization and Computer Graphics, vol. 24, no. 2, pp. 1105–1118, Feb. 2018, doi: 10.1109/TVCG.2016.2642109.
A. Kale, M. Kay, and J. Hullman, “Visual reasoning strategies for effect size judgments and decisions,” IEEE Transactions on Visualization and Computer Graphics, vol. 27, no. 2, pp. 272–282, Feb. 2021, doi: 10.1109/TVCG.2020.3030335.
N. Henry and J. D. Fekete, “Evaluating visual table data understanding,” Proceedings of BELIV’06: BEyond time and errors - novel EvaLuation methods for Information Visualization. A workshop of the AVI 2006 International Working Conference, 2006, doi: 10.1145/1168149.1168154.
H. Lam, R. A. Rensink, and T. Munzner, “Effects of 2D geometric transformations on visual memory,” Proceedings - APGV 2006: Symposium on Applied Perception in Graphics and Visualization, pp. 119–126, 2006, doi: 10.1145/1140491.1140515.
Y. Zhu, X. Suo, and G. S. Owen, “Complexity analysis for information visualization design and evaluation,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 4841 LNCS, no. PART 1, pp. 576–585, 2007, doi: 10.1007/978-3-540-76858-6_56.
N. Cawthon and A. vande Moere, “The effect of aesthetic on the usability of data visualization,” Proceedings of the International Conference on Information Visualisation, pp. 637–645, 2007, doi: 10.1109/IV.2007.147.
R. Van, D. Berg, J. B. T. M. Roerdink, and F. Cornelissen, “Perceptual Dependencies in Information Visualization Assessed by Complex Visual Search,” ACM Transactions on Applied Perception, vol. 4, no. 4, pp. 1–21, 2008, doi: 10.1145/1278760.1278763.
S. Garlandini and S. I. Fabrikant, “Evaluating the effectiveness and efficiency of visual variables for geographic information visualization,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 5756 LNCS, pp. 195–211, 2009, doi: 10.1007/978-3-642-03832-7_12.
C. Forsell and J. Johansson, “An heuristic set for evaluation in Information Visualization,” Proceedings of the Workshop on Advanced Visual Interfaces AVI, pp. 199–206, 2010, doi: 10.1145/1842993.1843029.
W. Javed, B. McDonnel, and N. Elmqvist, “Graphical perception of multiple time series,” IEEE Transactions on Visualization and Computer Graphics, vol. 16, no. 6, pp. 927–934, 2010, doi: 10.1109/TVCG.2010.162.
J. Goldberg and J. Helfman, “Eye tracking for visualization evaluation: Reading values on linear versus radial graphs,” Information Visualization, vol. 10, no. 3, pp. 182–195, Jul. 2011, doi: 10.1177/1473871611406623.
M. Correll, D. Albers, S. Franconeri, and M. Gleicher, “Comparing averages in time series data,” Conference on Human Factors in Computing Systems - Proceedings, pp. 1095–1104, 2012, doi: 10.1145/2207676.2208556.
D. Toker, C. Conati, G. Carenini, and M. Haraty, “Towards adaptive information visualization: On the influence of user characteristics,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 7379 LNCS, pp. 274–285, 2012, doi: 10.1007/978-3-642-31454-4_23.
A. vande Moere, M. Tomitsch, C. Wimmer, B. Christoph, and T. Grechenig, “Evaluating the effect of style in information visualization,” IEEE Transactions on Visualization and Computer Graphics, vol. 18, no. 12, pp. 2739–2748, 2012, doi: 10.1109/TVCG.2012.221.
D. Toker, C. Conati, B. Steichen, and G. Carenini, “Individual user characteristics and information visualization: Connecting the dots through eye tracking,” Conference on Human Factors in Computing Systems - Proceedings, pp. 295–304, 2013, doi: 10.1145/2470654.2470696.
J. Fuchs, F. Fischer, F. Mansmann, E. Bertini, and P. Isenberg, “Evaluation of alternative glyph designs for time series data in a small multiple setting,” Conference on Human Factors in Computing Systems - Proceedings, pp. 3237–3246, 2013, doi: 10.1145/2470654.2466443.
R. Euman and J. Abdelnour-Nocera, “Data Visualisation, User Experience and Context: A Case Study from Fantasy Sport,” in Proceedings, Part III, of the 15th International Conference on Human-Computer Interaction. Users and Contexts of Use - Volume 8006, Jul. 2013, pp. 146–155. Accessed: Mar. 14, 2022. [Online]. Available: https://dl.acm.org/doi/10.5555/2959924.2959942
N. Ferreira, D. Fisher, and A. C. König, “Sample-Oriented Task-Driven Visualizations: Allowing Users to Make Better, More Confident Decisions,” Apr. 2014. Accessed: Mar. 14, 2022. [Online]. Available: https://www.microsoft.com/en-us/research/publication/sample-oriented-task-driven-visualizations-allowing-users-to-make-better-more-confident-decisions/
A. Quispel and A. Maes, “Would you prefer pie or cupcakes? Preferences for data visualization designs of professionals and laypeople in graphic design,” Journal of Visual Languages and Computing, vol. 25, no. 2, pp. 107–116, Apr. 2014, doi: 10.1016/J.JVLC.2013.11.007.
T. Mercun, “Evaluation of information visualization techniques: Analysing user experience with reaction cards,” in BELIV ’14 Fifth Workshop on Beyond Time and Errors: Novel Evaluation Methods for Visualization, Nov. 2014, vol. 10-November-2015, pp. 103–109. doi: 10.1145/2669557.2669565.
M. Correll and M. Gleicher, “Error bars considered harmful: Exploring alternate encodings for mean and error,” IEEE Transactions on Visualization and Computer Graphics, vol. 20, no. 12, pp. 2142–2151, Dec. 2014, doi: 10.1109/TVCG.2014.2346298.
S. McKenna, N. Henry Riche, B. Lee, J. Boy, and M. Meyer, “Visual Narrative Flow: Exploring Factors Shaping Data Visualization Story Reading Experiences,” Computer Graphics Forum, vol. 36, no. 3, pp. 377–387, Jun. 2017, doi: 10.1111/CGF.13195.
T. Lei, N. Ni, Q. Zhu, and S. Zhang, “Aesthetic experimental study on information visualization design under the background of big data,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 10919 LNCS, pp. 218–226, 2018, doi: 10.1007/978-3-319-91803-7_16.
C. Perin, T. Wun, R. Pusch, and S. Carpendale, “Assessing the Graphical Perception of Time and Speed on 2D+Time Trajectories,” IEEE Transactions on Visualization and Computer Graphics, vol. 24, no. 1, pp. 698–708, Jan. 2018, doi: 10.1109/TVCG.2017.2743918.
Y. Wang, F. Han, L. Zhu, O. Deussen, and B. Chen, “Line Graph or Scatter Plot? Automatic Selection of Methods for Visualizing Trends in Time Series,” IEEE Transactions on Visualization and Computer Graphics, vol. 24, no. 2, pp. 1141–1154, Feb. 2018, doi: 10.1109/TVCG.2017.2653106.
B. Saket, A. Srinivasan, E. D. Ragan, and A. Endert, “Evaluating Interactive Graphical Encodings for Data Visualization,” IEEE Transactions on Visualization and Computer Graphics, vol. 24, no. 3, pp. 1316–1330, Mar. 2018, doi: 10.1109/TVCG.2017.2680452.
D. Dowding and J. A. Merrill, “The Development of Heuristics for Evaluation of Dashboard Visualizations,” Appl Clin Inform, vol. 9, no. 3, pp. 511–518, Jul. 2018, doi: 10.1055/S-0038-1666842.
R. Barcellos, J. Viterbo, F. Bernardini, and D. Trevisan, “An instrument for evaluating the quality of data visualizations,” Information Visualisation - Biomedical Visualization, Visualisation on Built and Rural Environments and Geometric Modelling and Imaging, IV 2018, pp. 169–174, Dec. 2018, doi: 10.1109/IV.2018.00038.
B. Ondov, N. Jardine, N. Elmqvist, and S. Franconeri, “Face to face: Evaluating visual comparison,” IEEE Transactions on Visualization and Computer Graphics, vol. 25, no. 1, pp. 861–871, Jan. 2019, doi: 10.1109/TVCG.2018.2864884.
H. K. Kong, Z. Liu, W. Zhu, and K. Karahalios, “Understanding visual cues in visualizations accompanied by audio narrations,” Conference on Human Factors in Computing Systems - Proceedings, May 2019, doi: 10.1145/3290605.3300280.
A. M. B. Rodrigues, G. D. J. Barbosa, H. Lopes, and S. D. J. Barbosa, “Comparing the effectiveness of visualizations of different data distributions,” Proceedings - 32nd Conference on Graphics, Patterns and Images, SIBGRAPI 2019, pp. 84–91, Oct. 2019, doi: 10.1109/SIBGRAPI.2019.00020.
K. McGurgan, E. Fedoroksaya, T. M. Sutton, and A. M. Herbert, “Graph design: The data-ink ratio and expert users,” VISIGRAPP 2021 - Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, vol. 3, pp. 188–194, 2021, doi: 10.5220/0010263801880194.
C. Conati and H. Maclaren, “Exploring the role of individual differences in information visualization,” Proceedings of the Workshop on Advanced Visual Interfaces AVI, pp. 199–206, 2008, doi: 10.1145/1385569.1385602.
X. Bai, D. White, and D. Sundaram, “Visual intelligence density: Definition, measurement, and implementation,” ACM International Conference Proceeding Series, pp. 93–100, 2009, doi: 10.1145/1577782.1577799.
B. Saket, A. Endert, and C. Demiralp, “Task-Based Effectiveness of Basic Visualizations,” IEEE Transactions on Visualization and Computer Graphics, vol. 25, no. 7, pp. 2505–2512, Jul. 2018, doi: 10.1109/TVCG.2018.2829750.
M. Siegrist, “The use or misuse of three-dimensional graphs to represent lower-dimensional data,” http://dx.doi.org/10.1080/014492996120300, vol. 15, no. 2, pp. 96–100, Jan. 2010, doi: 10.1080/014492996120300.
B. Dy, I. Nazim, A. Poorthuis, and S. C. Joyce, “Improving Visualisation Design for Effective Multi-Objective Decision Making,” IEEE Transactions on Visualization and Computer Graphics, 2021, doi: 10.1109/TVCG.2021.3065126.
E. Dimara, H. Zhang, M. Tory, and S. Franconeri, “The Unmet Data Visualization Needs of Decision Makers within Organizations,” IEEE Transactions on Visualization and Computer Graphics, 2021, doi: 10.1109/TVCG.2021.3074023.
I. Yovanovic, J. Goñi, and C. Miranda, “Remote usability assessment of topic visualization interfaces with public participation data: A case study,” eJournal of eDemocracy and Open Government, vol. 13, no. 1, pp. 101–126, 2021, doi: 10.29379/JEDEM.V13I1.640.

Table 1 Syntax of the search string for each digital library

Search engine	Search string
Scopus	TITLE-ABS-KEY (( "data storytelling" OR "data-driven storytelling" OR "data visualization" OR "data story" OR "data-driven story" OR "information visualization" OR "data-driven visualization" ) AND ( practice OR guideline OR guide OR principle OR evaluation OR assessment OR metric OR measurement OR criteria OR goal OR characteristic )
IEEE	("All Metadata":"data* story" OR "All Metadata": "data visualization" OR "All Metadata": "information visualization") AND ("All Metadata":practice OR "All Metadata":guide OR "All Metadata":principle OR "All Metadata":heuristic OR "All Metadata":evaluation OR "All Metadata": assessment OR "All Metadata": metric OR "All Metadata":measurement OR "All Metadata": criteria OR "All Metadata": goal OR "All Metadata":characteristic) NOT ("All Metadata":optical)
ACM	Abstract:("data storytelling" OR "data-driven storytelling" OR "data visualization" OR "data story" OR "data-driven story" OR "information visualization") AND Title:(practice OR guideline OR guide OR principle OR heuristic OR evaluation OR assessment OR metric OR measurement OR criteria OR goal OR characteristic)

Table 2 Main terms and synonyms used to create the search string

Main Term	Keywords
Data storytelling	"data storytelling” OR “data-driven storytelling" OR “data story” OR "data-driven story" OR "data visualization" OR "information visualization”
Best practice	“best practice” OR practice OR guideline OR principle
Criteria	criteria OR goals
Evaluation	evaluation OR assessment

Table 3 Automated search details

Database	Search results
ACM Digital Library	1.635
IEEE Xplore	243
Scopus	9.940
Total	11.818

Table 4 Quality assessment checklist

ID	Question
Q1	Is there a clear statement of the goals of the research?
Q2	Is there sufficient discussion of related work?
Q3	Are the visualizations under study clearly described?
Q4	Is the purpose of the analysis clear?
Q5	Is the investigation process adequately documented?
Q6	Are the statistical methods described?
Q7	Are the study participants or observational units adequately described? For example, SE experience, type (student, practitioner, consultant), nationality, task experience and other relevant variables.
Q8	Are all study questions answered?
Q9	Is there a discussion about the results of the study?
Q10	Are the limitations of this study explicitly discussed?
Q11	Are the lessons learned interesting and relevant for practitioners?

Table 5 Data extraction form

Focus	Item	Description
General Information	Identifier	Reference number given to the article
	Bibliography	Author, year, title
	Source	Journal/Conference
	Aim	Goal of the study
	Type of study	Empirical strategy
RQ1	Data storytelling definition	Definition of data storytelling or related terms
RQ2	Best practice	Recommended practice or guideline
RQ2	Application	Ways to implement the guidelines and best practices
RQ3	Criteria	Evaluation criteria for data visualizations
RQ3	Assessment	Techniques to assess if the evaluation criteria were met
RQ4	Evaluation method	Strategy to evaluate data visualizations
	Type of chart	The visualization technique covered by the evaluation method
	Metrics	Values measured by the evaluation method
	Tools	Software applications, models and algorithms used to support evaluation

Table 6 Research type categorization [91]

Type	Description	# of papers
Validation research	Techniques investigated are novel and have not yet been implemented in practice, e.g.: experiments, work done in the lab.	77
Solution proposal	A solution for a problem is proposed. The solution can be either novel or a signiﬁcant extension of an existing methodology. The potential beneﬁts and the applicability of the solution is shown by a small example or a good line of argumentation.	15
Philosophical papers	These papers sketch a new way of looking at existing things by structuring the ﬁeld in form of a taxonomy or conceptual framework.	3
Evaluation research	A technique is implemented in practice and an evaluation of it is conducted. Implementation of the technique is shown in practice and its consequences are demonstrated.	0
Experience papers	Experience papers explain what and how something has been done in practice. It must be the personal experience of the author.	0
Opinion papers	These papers express the opinion of somebody whether a certain technique is good or bad, or how things should have been done. They do not rely on related work and research methodology.	0

Table 7 Best practices found in the literature

ID	Category	Best practice	Context	Reference
BP1	Cognitive	Simplify complex ideas	General	S03, S08, S16, S91
BP2	Cognitive	Provide contextual information	General	S23
BP3	Cognitive	Use metonymy to map visual signs to implicit meanings.	General	S03, S23
BP4	Cognitive	Limit the number of series or categories displayed	General; Line chart	S37, S42
BP5	Cognitive Data	Select the appropriate visualization considering the types of data to represent and the advantages and disadvantages of each technique.	General	S12, S15, S21, S46, S65, S84, S91, S94
BP6	Cognitive Data	Present uncertainty information alongside point estimates to enable informed judgments and decisions.	General	S24, S54
BP7	Cognitive Data	Use a common baseline to make comparison between series easier	Line chart Bar chart	S37, S50
BP8	Cognitive Data Perceptual	Map information and data dimensions to the most salient visual features	General	S03, S12, S23, S30, S32, S33, S40, S42, S51, S84
BP9	Cognitive Data Perceptual	Avoid obscuring information.	General	S02, S03
BP10	Cognitive Perceptual	Grids and labels should be usefully visible; effective in providing information without being obtrusive.	General	S13, S27, S39, S50, S72
BP11	Cognitive	Provide redundancy to improve comprehension and memorability of the information.	General	S16, S57, S69
BP12	Cognitive Perceptual Presentation	Focus the important data points of a visualization to draw the user's attention to the relevant areas.	General	S02, S03, S30, S47, S76, S78, S81, S90, S95
BP13	Cognitive Perceptual Presentation	Declutter visualizations by removing unnecessary elements such as grid lines, marks, legends, and colors.	General	S10, S13, S19, S39, S81
BP14	Cognitive Perceptual Presentation	Maximize the data-ink ratio by limiting the amount of "ink" that is not used to present data.	General	S13, S34, S82, S83
BP15	Cognitive Presentation	Use text, labels and annotations for effective information consumption and decision making.	General	S02, S19, S23, S25, S49, S50, S57, S60, S92
BP16	Cognitive Presentation Usability	Avoid adding embellishments, non-essential imagery, and chart junk to not distract readers from data.	General	S34, S43, S56
BP17	Cognitive Usability	Choose the visualization technique that better supports the expected tasks.	General	S01, S20, S21, S24, S35, S38, S40, S42, S45, S51, S67, S69, S74, S75, S77, S87, S93, S94
BP18	Cognitive Usability	Guide the user in the consumption of the story.	General	S02, S06, S47, S62
BP19	Cognitive Usability	Tailor designs to the needs of their audience, taking into account individual user characteristics.	General	S41, S42, S52, S85
BP20	Data	Provide credits of the provenance of the data and details of the design to ensure transparency and credibility	General	S03, S08, S23, S80, S91
BP21	Data	Avoid omitting information	General	S03, S24
BP22	Perceptual Cognitive	Use visual imagery and embellishments to help to fix a chart in a viewer’s memory	General	S23, S34, S43, S56, S57, S81
BP23	Perceptual	Use color effectively, preserving important differences in the data	General	S22, S30, S64, S74, S79, S90
BP24	Perceptual	Group similar marks and visual features to facilitate tasks	General	S42, S55
BP25	Perceptual	Incorporate tangible or situated feelings to evoke senses and create experiences	General	S23
BP26	Perceptual Presentation	Avoid using 3D effects	General	S03, S29, S34, S50, S83, S88
BP27	Perceptual Presentation	Manage design parameters with care	General	S12, S17, S39, S67, S72
BP28	Perceptual Presentation Usability	Use comfortable font sizes to improve legibility	General	S23
BP29	Usability	Make information accessible to impaired users	General	S23, S78, S95
BP30	Presentation	Communicate a narrative in a clear way	General	S02, S08, S23, S62
BP31	Presentation	Layout the elements of the charts and the whole story so that it is easier to understand the data.	General	S25, S26, S62, S75
BP32	Presentation	Define the sequence of events and possible story paths	General	S47, S62
BP33	Presentation	Maintain consistency throughout the story	General	S02, S47
BP34	Presentation	Define what role the visualization plays in the story	General	S62
BP35	Usability	Include interaction techniques to allow the user to explore the data.	General	S02, S03, S04, S08, S29, S62, S68
BP36	Usability	Provide feedback to show to readers that their input affects the story and to improve the performance of specific encodings.	General	S62, S68
BP37	Usability	Allow the user control over the story components.	General	S02, S62
BP38	Usability	Avoid lengthy tutorials and explicit description of every design manipulation	General	S03, S06

Table 8 An example of best practices implementations

ID	Description	Implementation	Ref.	Context
BP5	Select the appropriate visualization considering the types of data to represent and the advantages and disadvantages of each technique.	Use bar charts at low density, treemaps at high density.	S12	Treemaps Bar charts
		Use treemaps when comparing non-leaf nodes.	S12	Treemaps
		Use dual-scale with careful design or avoid using it altogether.	S15	Dual scale charts
		Dual-Scale data charts are important when regular charts reach the limits of their display resolution due to data with varying densities or degrees-of-interest.	S15	Dual scale charts
		When the geographic locations and adjacencies are important aspects, and the required map-reading is more detailed, contiguous cartograms might be more suitable.	S21	Maps
		Rectangular cartograms work well if adjacency relations are important and having a simple schematic representation is useful.	S21	Maps
		The choice of cartogram type should also take into account the type of map being shown.	S21	Maps
		Where the treemap’s reliance on color and size to represent information could lead to ambiguity and confusion, the slope graphs benefits of combining this with the angle and order of connecting lines is clear.	S46	Treemaps Slope graphs
		Use a tabular layout in decision support visualization systems.	S65	Tables
		Use scatterplots to provide overviews.	S65	Scatterplots
		A divided bar chart can always be replaced by a grouped bar chart.	S84	Bar charts
		Use dot charts and bar charts as replacements for divided bar charts and pie charts.	S84	General
		Use grouped dot charts and grouped bar charts as replacements for divided bar charts.	S84	General
		Use framed rectangle charts as replacements for statistical maps with shading.	S84	General
BP6	Present uncertainty information alongside point estimates to enable informed judgments and decisions.	Eliminate the bias by only showing the Pareto front (the set of formally incomparable or non-dominated alternatives), hiding all dominated options.	S07	General
		Avoid adding means, as it leads to small biases in magnitude estimation and decision-making from distributional comparisons.	S24	General
		Choose encodings that are visually symmetric and visually continuous (gradient plots and violin plots for example).	S54	General
		Avoid bar charts: They suffer from within-the-bar bias, (where values within the bar are seen as likelier than values outside the bar) and binary interpretation (values are within the margins of error, or they are not). This difficults viewers to confidently make detailed inferences and overestimate effect sizes in comparisons.	S54	General
		For charts with a small number of bars, using boxes instead of top gridlines improves the estimation of the encoded values.	S72	General
		Quantized gradients, used with a gridline on top, or a box framing, consistently produces the least bias.	S72	General
		Use transparency to show uncertainty by making uncertain objects less opaque, overlaying a transparent wash for highlighting, and more generally for reducing screen space limitations by overlaying objects or features.	S39	General
BP7	Use a common baseline to make comparison between series easier	Comparison between time series is made easier with a “common” baseline or one based on the “previous” time series displayed.	S37	General
BP7		Do not use non-zero or broken axes.	S50	Bar charts
BP8	Map information and data dimensions to the most salient visual features	Use luminance to encode secondary values in treemaps.	S12	Treemaps
		When comparing different charts use the same data mappings if possible.	S15	Dual scale charts
		Use common units per display unit when possible.	S15	Dual scale charts
		Make changes in data scale visually clear if they cannot be avoided.	S15	Dual scale charts
		When a break in data scale is necessary do not connect data values across the break.	S15	Dual scale charts
		Use color sequences that gradually increase in luminance for continuous variables.	S22	General
		Use spectral schemes for categorical data.	S22	General
		Color and size are features that can be used independently to represent information.	S30	General
		Orientation is less suitable for representing information that consists of a large range of values because it does not show a clear relationship between contrast and salience.	S30	General
		The visual variable size is the most efficient to detect change under flicker conditions.	S33	General
		Color encodings afford reliable determination of averages.	S40	General
		To improve value comparison, use a linear layout or switch to color encoding for value.	S45	Line charts
		For value encoding, position/length encodings should be preferred to a color encoding.	S45	Line charts
		Triangular shapes may be better than rectangular shapes for color encoding.	S45	Line charts
		Color encodings for higher data densities should be used with caution.	S45	Line charts
		Circular layouts rather than linear ones should be preferred for detecting temporal locations.	S45	Line charts
		Mapping variables (which refer To the selection of what particular properties of the data to display) can help the viewer by explicitly encoding the quantity of interest, but only if the relevant information is known.	S51	General
		Computational variables (which Describe the methods used to compress the signal) can align displayed information with the viewer’s task if the task is known.	S51	General
		Visualization summarization can compensate for the impossibility of showing all data in a visualization at once.	S55	General
		Use large enough mark sizes.	S55	General
		To best encode speed only: encode speed with value. If value cannot be used (e.g., color is already used extensively), convey speed by mapping time to segment length.	S66	General
		If a visualization contains straight paths only and the important information to convey is speed, then any speed encoding can be used. If it contains curved paths, encoding speed with size is discouraged.	S66	General
		To best encode time only: encode time with segment length in the form of time ticks. Encoding time with value is discouraged.	S66	General
		To best encode both time and speed simultaneously: use segment length whenever possible. Encoding speed with value on top of length can improve perceiving time but may slightly interfere with perceiving speed.	S66	General
		If the number of available variables is limited, use segment length alone, as this encoding conveys both time and speed.	S66	General
		If using segment length is not possible or not desirable within the context of a visualization, encode speed with value and time with size.	S66	General
		Encode speed with brightness/color value and time with segment length or encode both time and speed using segment length only.	S66	General
		If using segment length is not desirable, the next best choice is to encode speed with color value and time with size	S66	General
		Position judgments are more accurate than length judgments and angle judgments.	S84	General
		A pie chart can always be replaced by a bar chart, thus replacing angle judgments by position judgment.	S84	General
BP9	Avoid obscuring information and confusing.	Make changes between chart types explicit to avoid confusing the viewer.	S02	General
		Do not introduce “noise” into the representation.	S03	General
		Do not add a gratuitous third dimension.	S03	General
		Do not use non-essential sizing transformations that violate discriminability limits.	S03	General
		Do not make elements too small for judgment.	S03	General
		Do not oversize to the point of overwhelming the presentation.	S03	General
		Do not obscure a value’s true position on an axis.	S03	General
		Map information to the most salient visual judgment types.	S03	General
		Do not imply false cause-and-effect relationships.	S03	General
		Do not use complex design tactics like the double-axis, which experts have noted are difficult to decode even when properly used.	S03	General
		Visual noise is a visual metaphor technique that can also serve to obscure.	S03	General
BP20	Provide credits of the provenance of the data and details of the design to ensure transparency and credibility	Cite and/or link data sources.	S03	General
		Add additional references.	S03	General
		Explain methodological choices.	S03	General
		State relevant facts.	S03	General
		Annotate exceptions and corrections.	S03	General
		Represent uncertainty.	S03	General
		Show error bars.	S03	General
		Describe inferential limits (i.e., Confidence intervals).	S03	General
		Annotate forecast data explicitly labelling the point in a graph where data are extrapolated.	S03	General
		Expressions of doubt regarding potential conclusions.	S03	General
		Provide identification of a visualization’s designer through author bios or personal anecdotes.	S03	General
		Show professionally listed references.	S23	General
		Use data from sources that are valid and clearly collated.	S23	General
		Present information in an impartial way.	S23	General
BP21	Avoid omitting information	Cite data sources.	S03	General
		Define variables unambiguously.	S03	General
		Do not oversimplify complex phenomena by excluding complicating information from the visual representation.	S03	General
		Avoid thresholding values.	S03	General
		Avoid omitting exceptional cases.	S03	General
		Omission can also be transferred via filtering capabilities like search bars that allow a user to select a subset of data.	S03	General
		Avoid knowledge assumptions of the end-user.	S03	General
		Avoid omitting uncertainty information.	S24	General

Table 9 Evaluation criteria and sub criteria

ID	Description	Definition	Reference	Sub criteria
C1	Comprehension	A visualizations’ ability to ease understanding of the underlying information and generate insight[97].	S11, S26, S31	SC01; SC02; SC03; SC04; SC05; SC06; SC07; SC08; SC09; SC10; SC11; SC12; SC13; SC14; SC15; SC16.
C2	Engagement	A combination of factors that allow to keep the user’s attention and interest to view or interact with the visualization [100][73].	S08, S53, S61, S62	SC01; SC17; SC18; SC19; SC20; SC21; SC22; SC23; SC24; SC25; SC26; SC27; SC28.
C3	Information	Relates to the quality and credibility of underlying data [101].	S58	SC29; SC30; SC31; SC32; SC33; SC34.
C4	Memorability	It deals with the retention of information and relevant aspects of the visualization. A memorable visualization “sticks” in the viewers mind [48].	S16, S34, S48, S56, S57	SC01; SC35; SC03; SC25; SC36.
C5	Usability	The degree to which the visualization helps users accomplish goals with minor effort; it comprises the functionality and understandability of charts [96].	S23, S58	SC01; SC17; SC35; SC37; SC38; SC39; SC40; SC41; SC42.

Table 10 Definition of sub-criteria for evaluation

ID	Description	Definition	Reference
SC01	Aesthetics	Visual appeal (attractiveness) of a chart.	S05, S29, S52, S53, S61, S62, S63, S11
SC02	Complexity	The amount of detail in the visual representation.	S10, S16, S28
SC03	Uniqueness	The quality that makes a chart visually distinct from others.	S23, S57, S11
SC04	Identity	The degree of differentiation between data series in a shared space or data points in a visualization.	S37
SC05	Affordance	Aspects of visualizations that suggest how it should be used; a visual clue to the functions and usage of an object [102].	S11
SC06	Clarity	Degree of coherence of a chart.	S11
SC07	Cognitive load	The mental effort of processing information [10].	S50
SC08	Confusion	The degree of difficulty the users experience when processing information.	S19
SC09	Dynamism	Motion or animation methods.	S11
SC10	Legibility	Readability of the design “at a higher scale”.	S11
SC11	Mapping	If the properties of the visual representation most closely match the information being represented [102].	S11
SC12	Perspective	If the visualization allows multiple views (perspectives) of the same information.	S11
SC13	Reachability	Navigation, exploration mechanism of a visualization system.	S11, S61
SC14	Saliency	Indicates how a target stands out from the background.	S28
SC15	Simplicity	The quality of designs that display only relevant and necessary information.	S11
SC16	Space management	Describes whether space is “shared” or “split” between time series.	S37
SC17	Ease of use	Perceived usability of the design.	S05, S53, S62
SC18	Focused attention	The concentration of mental activity [103], [104]	S61, S62
SC19	Novelty	Features of the interface that users find unexpected, surprising, new, and unfamiliar [104], [105].	S61, S63
SC20	Autotelism	The chart is an end in and on itself.	S61
SC21	Challenge	The amount of effort experienced by the participant in performing a task[104].	S61
SC22	Control	How “in charge” users feel over their experience with the technology [104].	S61
SC23	Creativity	The degree to which a visualization promotes the generation of new thoughts or ideas.	S61
SC24	Endurability	The assessment of users’ perception of success with a task, and their willingness to use an application in future or recommend it to others [104].	S62
SC25	Expressiveness	Denotes the design traits that make infographics vivid, eloquent, and story like.	S23
SC26	Felt involvement	The feeling of users of being drawn into and involved in a task and the overall assessment of the experience as “fun” [104].	S62
SC27	Interest	The “feeling that accompanies or causes special attention to an object or class of objects” [104].	S61
SC28	Visual flow	The congruence between the way a reader navigates the story, the visual components of the story, and the type of visual feedback the reader receives; along with the nature of the data and facts that the author wants to communicate.	S62
SC29	Insight	Nontrivial discovery about the data. Addresses the need for users to understand information.	S08, S61, S73
SC30	Concreteness	Indicates the degree of pictorial resemblance of a visual representation.	S16
SC31	Confidence	The degree to which a visualization helps a user trust in his/her understanding of the data set.	S73
SC32	Essence	How a visualization communicates the essence of the data set with respect to overview and context.	S73
SC33	Usefulness	Value, relevance of the visualization.	S53
SC34	Visual Intelligence Density	The amount of useful information that a decision maker obtains by interacting with a visualization.	S14
SC35	Familiarity	The frequency with which some visualization and visualization features are encountered.	S05, S06, S16
SC36	Significance	Indicates the relationship between what is depicted in the visual representation and the function it refers.	S16
SC37	Aspect ratio	The ratio between the width and height of a diagram.	S17
SC38	Communication	Relates to the quality of human-computer interaction. It comprises aspects of user’s satisfaction, flexibility, and learning.	S58
SC39	Efficiency	Perceived productivity (e.g.: time and effort saved)	S53
SC40	Intersections	Intersections caused by a guiding line.	S19
SC41	Time	Captures how a visualization facilitates faster, more efficient understanding of data with respect to both searching and browsing.	S73
SC42	Visual clutter	Unnecessary elements and embellishments that diminish a visualizations’ ability to communicate.	S37, S19

No competing interests reported.

Download PDF

Editorial decision: Major revision
12 Jun, 2023
Reviews received at journal
12 Jun, 2023
Reviewers agreed at journal
29 Mar, 2023
Reviews received at journal
05 Dec, 2022
Reviewers agreed at journal
29 Nov, 2022
Reviewers invited by journal
29 Sep, 2022
Editor assigned by journal
12 Jun, 2022
Submission checks completed at journal
08 Jun, 2022
First submitted to journal
07 Jun, 2022

You are reading this latest preprint version

Narrative Visualizations Best Practices and Evaluation: A Systematic Mapping Study

Status:

Version 1

Abstract

Figures

1 Introduction

2 Background And Related Works

2.1 Data Storytelling

2.2 Best Practices for the Design of Narrative Visualizations

2.3 Evaluation

3 Methodology

3.1 Research questions

3.2 Visualizations

3.3 Search strategy

3.4 Inclusion and Exclusion Criteria

3.5 Study selection

3.6 Quality assessment

3.7 Data extraction

4 Results

4.1 Quality assessment results

4.2 Overview of selected primary studies

4.3 Classification method

4.4 RQ1: Is there a valid and accepted definition of “data-driven storytelling”?

4.5 RQ2: What are the data storytelling best practices reported in the literature and how are they implemented?

4.6 RQ3: What are the criteria to evaluate narrative visualizations?

4.7 RQ4: What are the current strategies to evaluate the quality of narrative visualizations?

5 Discussion

5.1 RQ1: Is there a valid and accepted definition of “data-driven storytelling”?

5.2 RQ2: What are the data storytelling best practices reported in the literature and how are they implemented?

5.3 RQ3: What are the criteria to evaluate narrative visualizations?

5.4 RQ4: What are the current strategies to evaluate the quality of narrative visualizations?

5.5 Implications for research

5.6 Implications for practice

5.7 Threats to validity

6 Conclusions And Future Work

Declarations

References

Tables

Additional Declarations

Supplementary Files

Status:

Version 1