Applying declarative analysis to industrial automotive software product line models

Program analysis of automotive software has several unique challenges, including that the code base is ultra large, comprising over a hundred million lines of code running on a single vehicle; the code is structured as a software product line (SPL) for managing a family of related software products from a common set of artifacts; and the analysis results (despite being numerous and despite being variable) need to be presented to the engineer in a way that is manageable. In previous work, we reported on lifting declarative analyses to apply to a software product line, rather than to an individual product variant. This paper reports on milestone results from applying lifted declarative analyses (behaviour alteration, recursion analysis, simplifiable global variable analysis, and two of their variants) to automotive software product lines from General Motors and assessing the scalability of the analyses and the effectiveness of reporting to engineers conditional analysis results (i.e., results conditioned on SPL program variants). We also reflect on some of the lessons learned throughout this project.

software. Although a system-wide analysis is not yet possible, we have been working to address some of the unique challenges that arise in analysis of automotive software, such as:
- automotive software is multi-threaded and distributed across multiple CPUs;
- the code and execution environments in a single vehicle are heterogeneous with respect to programming languages, language variants, and operating systems;
- the code base is ultra large, comprising over a hundred million lines of code running on a single vehicle;
- the software is structured as a software product line (SPL) for managing a family of related software products that are differentiated by their features; and
- the analysis results (despite being numerous and variable) need to be presented to the engineer in a way that is manageable.
Of these challenges, General Motors has been particularly interested in the last three: that the tools scale to large software systems (at least a million lines of code (LOC), for now), that the analyses accommodate an SPL's program variants, and that the presentation of analysis results not tax the cognitive load of the engineers.
In previous work (Muscedere et al. 2019), we addressed the challenges of distributed software components and heterogeneity by extracting models of the components and their program elements (e.g., functions, variables, function calls, assignments) and linking these together into a model of the software system. To accommodate variability, we extract a model of the software product line for analysis (Shahin et al. 2021b).
A software product line (SPL) supports a family of related products, usually developed together as a common set of mandatory and optional features (Clements and Northrop 2001). A feature is the unit of variation, and products (also called configurations or variants) are derived from the SPL by selecting among and integrating features from the SPL's feature set. Our work focuses on annotative SPLs (and SPL models), in which program elements and functionality that pertain to specific features or feature combinations are annotated with expressions that represent those features.
Analyzing each product of a non-trivial SPL separately is infeasible because the number of potential products grows exponentially with the size of the SPL's feature set, due to the combinatorial nature of SPL features (Liebig et al. 2013). Instead, several researchers have lifted different analyses to be variability-aware (Thüm et al. 2014), such that the analysis applies to a product line (as opposed to an individual product) and leverages the commonalities among an SPL's products. Multiple types of analyses have been lifted, including parsing (Gazzillo and Grimm 2012; Kästner et al. 2011), type checking (Kästner et al. 2012), static analyses (Bodden et al. 2013; Midtgaard et al. 2015), and model checking (Classen et al. 2010), resulting in significantly faster analyses of the SPL compared to the product-based analyses of all the SPL's products. In previous work, we lifted an entire class of analyses by lifting the Datalog (Ceri et al. 1989b) engine to be variability-aware (Shahin et al. 2019; Shahin and Chechik 2020b). As a result, declarative analyses (Bravenboer and Smaragdakis 2009; Benton and Fischer 2007; Dawson et al. 1996; Grech and Smaragdakis 2017) that can be expressed as a set of Datalog rules can be used as-is as input to the variability-aware Datalog engine to analyze an SPL model (Shahin et al. 2021a).
Another open problem in SPL-based analyses is how to present analysis results, given that they vary for different sets of products. Most research on SPL visualization focuses on documenting and viewing variability and configuration choices (Kang et al. 1990; Czarnecki and Pietroszek 2006). Additional works use configuration views to facilitate the inspection of consequences of configuration decisions (Botterweck et al. 2008), or visualizations of variability-analysis results to support variability restructuring and management (Loesch and Ploedereder 2007). In contrast, the goals of our visualization work are to help the engineer understand and explore the results of an SPL analysis from the perspective of different product sets.
Our work focuses on user-defined declarative program analyses (expressed as Datalog rules) that are mostly variants of data-flow and control-flow analyses. In this paper, we leverage our variability-aware Datalog engine to lift five such analyses and apply them to seven automotive controller product lines provided by General Motors. The paper makes the following contributions: (1) We outline the design of a pipeline for variability-aware analysis of product lines implemented in C/C++. (2) We present the results of applying a set of program analyses using our pipeline to a set of automotive software product lines from General Motors. Our evaluation compares the performance of analyzing the whole product line against analyzing a single configuration that includes all features. (3) We describe our interactive visualizer to support the exploration of SPL-analysis results and present the results of a small user study that assesses General Motors engineers' feedback on the visualizer. (4) We discuss the lessons learned throughout the project.
Early results of this work were published in the Practice and Innovation Track of MODELS 2021 (Shahin et al. 2021b). This paper extends the earlier work by investigating more program analyses; only one analysis (behaviour alteration) was covered in Shahin et al. (2021b). This paper reports on applying the extended set of analyses to seven SPL controllers from General Motors. The seventh SPL controller is a new subject system not described in Shahin et al. (2021b) that was provided by General Motors to stress test our pipeline: it is significantly larger, has many more features and feature combinations, and includes a middleware component that leads to an order of magnitude more results than analyses of the other controllers. This paper also extends and details our interactive visualization and filtering of SPL analysis results and reports on a small user study that provides engineers' feedback on the effectiveness of the visualization in reporting and exploring results that vary by product sets.
The rest of the paper is organized as follows. Section 2 provides a background on SPLs, Datalog, and lifted declarative analyses. In Section 3, we present our five declarative analyses of interest. In Section 4, we present our interactive visualizer to support the exploration of SPL analysis results. In Sections 5 and 6, we present our industrial examples and the results of applying our lifted analyses to them, respectively; and in Section 7 we present feedback on our interactive visualizer from a small user study involving General Motors engineers. We discuss lessons learned in Section 8, present related work in Section 9, and conclude in Section 10.

Background
In this section, we briefly define the concepts we build upon in the rest of the paper. In particular, this includes background on software product lines, declarative analyses of relational models, and variability-aware analyses that can be applied to an entire product line.

Software Product Lines
A software product line (SPL) is a family of related software products, developed together from a common set of artifacts (Clements and Northrop 2001). The unit of variability in an SPL is a feature, where each feature can be either present or absent in each of the SPL's products. Because of the combinatorial nature of SPL features, the number of products grows exponentially with the number of features. However, there are typically constraints among features that preclude all possible feature combinations from generating valid products.
In an annotative SPL (Thüm et al. 2014), feature-specific lines of source code are encapsulated in conditional statements that are guarded (annotated) with feature expressions. The SPL's features are represented as compile-time Boolean constants (called feature variables), and feature expressions are Boolean expressions over the feature variables. For example, the SPL in Fig. 1 has two features, FA and FB; their values true or false indicate whether their corresponding features are present or absent, respectively, in a product. Consider the code in Component C1: lines 10-17 are specific to products in which the feature FA is present; line 13 is specific only to products in which features FA and FB are both present; and line 15 is specific only to products in which feature FA is present and feature FB is absent.
A specific value assignment to all of the SPL's feature variables is called a (feature) configuration and denotes a single product in the SPL. The configuration whose feature variables are all true is often referred to as the 150% representation (Beuche et al. 2016) because this configuration generally does not represent a valid product due to constraints among feature selections.
A set of configurations is succinctly expressed as a presence condition (PC), which is a propositional formula over the feature variables. 1 If f is a feature variable and if pc1 and pc2 are PCs that represent arbitrary sets of configurations in an SPL, then the following are also PCs:
f ≡ all configurations in which feature f is present
!pc1 ≡ all configurations in the SPL not belonging to pc1
pc1 ∧ pc2 ≡ the intersection of configurations in both pc1 and pc2
pc1 ∨ pc2 ≡ the union of configurations in either pc1 or pc2
Thus, given the SPL depicted in Fig. 1, the PC {FA} represents two configurations: (1) where FA is present and FB is absent, and (2) where both FA and FB are present. Similarly, the PC {!FA} also represents two configurations.
The primary motivation behind developing a family of products together as an SPL instead of developing each product independently is to maximize reuse of common software artifacts across products, leveraging the potentially high degree of commonality among them. Different techniques of developing SPLs have been proposed and used in practice (Gacek and Anastasopoules 2001; Apel and Kaestner 2009; Schaefer et al. 2010).
A typical software development process includes the use of tools to perform a variety of software analyses for bug-finding, metric generation, and performance assessment. In most cases, such tools can be applied only to one software product at a time rather than to an entire SPL. The naïve approach of generating each and every product and applying an analysis tool to it individually is usually infeasible because of the exponential growth in the number of products as the number of features increases.

Datalog-Based Analysis
We express declarative analyses in Datalog. Datalog (Ceri et al. 1989a) is a logic programming language that supports rule-based inference over relational data. Program analyses written in Datalog are applied to relational facts extracted from programs. A Datalog program is a set of rules, each with a set of premises (the body of the rule) and a conclusion (the head of the rule). For example, line 1 of Fig. 2 is a rule with a single clause in the body (the varWrite clause), and transVarWrite is the head (conclusion) of the rule. Figure 2 shows a Datalog program (simplified for presentation purposes) for detecting symptoms of behaviour alteration, in which a variable assignment in one component affects the behaviour of another component. Lines 1-3 compute the transitive closure of the varWrite relationship, thereby finding all data-flows in which one variable is used in the assignment expression of another variable (including parameter assignments). Lines 5-10 define behaviour alteration as a data-flow that starts with a variable assignment (write) in function f0, which impacts the values of other variables (via transVarWrite), and ends with a variable whose value influences the invocation of some function f1 (varInfFunc). As we are interested only in behaviour alterations that cross component boundaries, we exclude intra-component results (lines 8-10).
Running this analysis on facts extracted from the code in Fig. 1 reports that the function updateX in component C1 may influence whether the function foo in C2 is called: (i) write relationship from function updateX to variable GlobVar (line 15 in C1); (ii) varWrite relationship from the variable GlobVar to itself (line 15 in C1); (iii) varInfFunc relationship, indicating that variable GlobVar affects whether the function foo is invoked (lines 12-13 in C2).
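To make the rule structure concrete, the following Python sketch mimics the inference described above: a naive fixpoint computes the transitive closure of varWrite, and a join over write, transVarWrite, and varInfFunc keeps only cross-component results. It is not the Datalog program of Fig. 2. The fact sets and the component_of mapping are hypothetical encodings of the Fig. 1 example as described in the text.

```python
# Hypothetical facts mirroring the Fig. 1 example as described in the text.
write = {("updateX", "GlobVar")}        # function assigns to variable
var_write = {("GlobVar", "GlobVar")}    # variable is used in an assignment to variable
var_inf_func = {("GlobVar", "foo")}     # variable influences whether function is called
component_of = {"updateX": "C1", "foo": "C2"}  # assumed function-to-component map


def transitive_closure(edges):
    """Naive fixpoint: apply the recursive rule until no new pair is derived."""
    closure = set(edges)
    while True:
        derived = {(a, d) for (a, b) in closure for (c, d) in edges if b == c}
        if derived <= closure:
            return closure
        closure |= derived


def behaviour_alterations(write, var_write, var_inf_func, component_of):
    """Join write, transVarWrite, and varInfFunc; drop intra-component results."""
    trans_var_write = transitive_closure(var_write)
    return {
        (f0, f1)
        for (f0, v) in write
        for (src, dst) in trans_var_write
        if src == v
        for (w, f1) in var_inf_func
        if w == dst and component_of[f0] != component_of[f1]
    }


results = behaviour_alterations(write, var_write, var_inf_func, component_of)
print(sorted(results))
```

A Datalog engine performs the same kind of fixpoint computation, but with far more efficient evaluation strategies than this naive loop.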

Lifted Declarative Analyses
Several software analyses have been re-designed and implemented to enable efficient analysis of the whole SPL at once. Such analyses are called variability-aware analyses, and the process of transforming a single-product analysis into a variability-aware analysis is referred to as variability-aware lifting (Bodden et al. 2013; Salay et al. 2014; Shahin et al. 2019; Shahin and Chechik 2020a). A lifted analysis is expected to preserve the semantics of its single-product counterpart, while tracing each of the results of the analysis to the set of products to which it applies. We use the notation f ↑ to refer to a lifted version of a product-based analysis f .
Instead of re-implementing a given analysis to make it variability-aware, another approach is to lift the language in which the analysis has been implemented. This has the advantage of effectively lifting any and every product-based analysis that can be expressed in the lifted language. For example, Shahin et al. (2019) lifted Datalog analyses by extending the Datalog language with optional presence condition annotations at the fact level, and implementing a variability-aware fact inference algorithm in the Soufflé ↑ (lifted Soufflé) Datalog engine (Shahin and Chechik 2020b). The result is that analyses written in Datalog are naturally lifted when they are processed by Soufflé ↑ .
Consider a variability-aware version of the behaviour alteration analysis presented in the previous section, and consider again the results of running this analysis on the example SPL in Fig. 1. Recall that the analysis reports that the function updateX in component C1 may influence whether the function foo in C2 is called. The initial write relationship from function updateX to variable GlobVar (line 15 in C1) exists only in products that include feature FA and exclude feature FB (i.e., whose PC is FA∧!FB). Similarly, the intermediate varWrite relationship from the variable GlobVar to itself (line 15 in C1) exists only in products with the PC FA∧!FB. The varInfFunc relationship that ends the data-flow (lines 12-13 in C2) exists in all products. The full data-flow path result exists only in products that satisfy the conjunction of the PCs of all the edges in the path: that is, all products that satisfy the PC FA∧!FB. In general, a variability-aware analysis is expected to report its results annotated with the products (or PC) for which each result applies.
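The conjunction of PCs along a path can be illustrated with a small Python sketch. It is not how Soufflé ↑ represents presence conditions internally. As a simplifying assumption, each PC is modelled here as the explicit set of configurations it denotes, so that conjunction becomes set intersection and negation becomes set complement. The helper names (has, neg, lifted_path_pc) are hypothetical.

```python
from itertools import combinations

FEATURES = ("FA", "FB")

# Every configuration is modelled as the set of features present in it.
ALL_CONFIGS = frozenset(
    frozenset(chosen)
    for size in range(len(FEATURES) + 1)
    for chosen in combinations(FEATURES, size)
)
TRUE = ALL_CONFIGS  # a fact with no annotation is present in all products


def has(feature):
    """PC denoting all configurations in which the feature is present."""
    return frozenset(c for c in ALL_CONFIGS if feature in c)


def neg(pc):
    """PC denoting all configurations not belonging to pc."""
    return ALL_CONFIGS - pc


fa_and_not_fb = has("FA") & neg(has("FB"))

# Annotated facts for the Fig. 1 example, as described in the text.
write = {("updateX", "GlobVar"): fa_and_not_fb}
var_write = {("GlobVar", "GlobVar"): fa_and_not_fb}
var_inf_func = {("GlobVar", "foo"): TRUE}


def lifted_path_pc(f0, v, w, f1):
    """PC of a derived path: the conjunction of the PCs of its edges."""
    return write[(f0, v)] & var_write[(v, w)] & var_inf_func[(w, f1)]


path_pc = lifted_path_pc("updateX", "GlobVar", "GlobVar", "foo")
print(path_pc == fa_and_not_fb)  # the path exists exactly in products with FA and without FB
```

Enumerating configurations is exponential in the number of features, so this representation only serves to illustrate the semantics on a two-feature example.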

The Analysis Pipeline
We have implemented an end-to-end pipeline for extracting a product line model from source code, analyzing it, and interactively visualizing the results. The analysis pipeline integrates components used in previous projects (Shahin et al. 2019; Muscedere et al. 2019), together with some adapter components for converting data from one format to another. The overall pipeline design is shown in Fig. 3. The major tools of this pipeline are Rex ↑, Soufflé ↑, and the Interactive Visualization/Filtering tool.
An SPL model is extracted from C/C++ source files using a new variability-aware version of Rex (Muscedere et al. 2019), which extracts syntactic facts about the source files (e.g., variable declarations, variable assignments, function declarations, function calls) and annotates a fact with a presence condition (PC) if the fact relates to code that is present in a subset of products. Facts and their presence conditions are extracted as tuples and then converted to Datalog fact format using a simple script (ta2tsv adapter component).
Soufflé ↑ takes as input facts annotated with presence conditions and infers additional facts based on a set of input Datalog rules that express analyses of interest. Soufflé ↑'s output is presented as an annotated graph, in which each presented result is annotated with a presence condition denoting the set of products for which the result applies. The engineer can then use the Interactive Visualizer to create filters that highlight (with colour) the analysis results that apply to particular product sets of interest. The visualization component is explained in more detail in Section 4.

Variability-Aware Fact Extraction
Our analyses operate on extracted facts about C/C++ source code, rather than operating on the code itself, to improve the scalability of our analyses to large software systems.

Fig. 3
End-to-end fact extraction and analysis pipeline. Source code is provided to Rex ↑, which produces a factbase of program facts that ta2tsv translates into the input expected by the SPL-analyzer Soufflé ↑. An analysis, expressed as Datalog rules, is input to Soufflé ↑ along with these facts, and the analysis results are presented to the user via an interactive visualizer

Specifically, a fact extractor Rex (Muscedere et al. 2019), based on the Clang++ open-source compiler, 2 parses C/C++ source-code files, generates abstract syntax trees (ASTs), and extracts facts of interest from the AST into an in-memory hierarchical graph. Source-code entities such as variable declarations and function declarations are the nodes of the graph; and relations such as variable assignments (in which one variable is used in the assignment expression for another variable), function calls, and containment (of variable declarations within functions, function declarations within files, components comprising files) are the edges of the graph. Additional information about the nodes and edges is recorded as associated attributes. Rex outputs the resulting graph as a collection of facts (called a fact model or factbase) about source-code entities, their relations, and their respective attributes represented as three-tuples (triples).
In order to support analysis of SPL models, we developed a variability-aware version of Rex ↑ that annotates entities and relationships with their presence conditions. A Rex ↑ user can specify, by type and naming convention, which program variables are to be considered feature variables and thus used in presence conditions (e.g., only constant global bool or enum type variables). Variability-aware Rex ↑ keeps track of all conditions over feature variables that hold while walking the AST and uses that information to annotate facts with their PCs as they are extracted. Figure 4 gives an overview of the Rex ↑ extraction process of the component C1 in Fig. 1a. On the left is the input, in this case C++ code; the middle of the figure depicts extracted information as an in-memory hierarchical graph; and on the right is the extracted fact model in tuple format. In this example, Rex ↑ creates fact nodes for the class A, function updateX, and variables x, FA, FB, and GlobVar. 3 Each contain edge corresponds to an entity declaration (e.g., class A contains the declaration of variable x). When one variable appears in an expression that is assigned to another variable (e.g., the use of GlobVar in an assignment to variable x in C1), a varWrite edge is created from the used variable to the assigned variable (e.g., varWrite GlobVar x). The creation of the other edges follows the same pattern. Attributes of entities and relationships are listed at the end of the fact model. The attribute PC records presence conditions: any entity or relationship that is annotated with a PC attribute represents a fact that is conditionally present in the model, depending on the value of the feature variables. Thus, variability-aware Rex ↑ extracts a 150% representation that includes facts for all the SPL's features, where conditional facts are annotated with their products' presence conditions. 
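The idea of tracking the conditions that hold during the AST walk can be sketched in a few lines of Python. This is not Rex ↑, which operates on Clang ASTs. The node types Stmt and IfFeature, the toy program, and the textual PC syntax are all hypothetical and serve only to illustrate the bookkeeping.

```python
from dataclasses import dataclass, field


@dataclass
class Stmt:
    """A leaf node that yields one fact, e.g. ("write", "updateX", "GlobVar")."""
    fact: tuple


@dataclass
class IfFeature:
    """A conditional whose guard is a single feature variable (toy assumption)."""
    cond: str
    then: list = field(default_factory=list)
    orelse: list = field(default_factory=list)


def extract(nodes, conds=()):
    """Walk the tree, annotating each fact with the conjunction of enclosing guards."""
    facts = []
    for node in nodes:
        if isinstance(node, IfFeature):
            facts += extract(node.then, conds + (node.cond,))
            facts += extract(node.orelse, conds + ("!" + node.cond,))
        else:
            pc = " & ".join(conds) if conds else "True"
            facts.append((node.fact, pc))
    return facts


# Hypothetical toy program: one unconditional fact and two feature-guarded facts.
program = [
    Stmt(("contain", "A", "x")),
    IfFeature("FA", then=[
        IfFeature(
            "FB",
            then=[Stmt(("varWrite", "GlobVar", "x"))],
            orelse=[Stmt(("write", "updateX", "GlobVar"))],
        ),
    ]),
]

for fact, pc in extract(program):
    print(fact, pc)
```

Facts outside any feature guard receive the PC True, which corresponds to facts that carry no PC attribute in the extracted fact model.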
Because of the nature of static analysis, the resulting model is an over-approximation of the program's actual set of facts: it may contain some facts that are infeasible (e.g., a function call in a conditional branch that never executes).
The fact model is translated into the input format of the Soufflé ↑ reasoner, using a script ta2tsv that we wrote specifically for that purpose. For example, in a fact model, presence conditions are listed as attributes at the end of the file rather than being co-located with their associated facts. Our ta2tsv script associates each fact with its corresponding presence-condition attribute.
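The co-location step can be sketched as a simple join between facts and their attributes. The Python fragment below is not the ta2tsv script. The in-memory shapes of triples and attributes, the default PC of True, and the layout with the PC as a third tab-separated column are assumptions made for illustration only.

```python
import csv
import io

# Hypothetical fact model: triples first, attributes listed separately.
triples = [
    ("contain", "A", "x"),
    ("varWrite", "GlobVar", "x"),
    ("write", "updateX", "GlobVar"),
]
attributes = {
    ("write", "updateX", "GlobVar"): {"PC": "FA & !FB"},
}


def to_rows(triples, attributes, default_pc="True"):
    """Group facts by relation and attach each fact's PC (or the default)."""
    rows = {}
    for relation, source, target in triples:
        pc = attributes.get((relation, source, target), {}).get("PC", default_pc)
        rows.setdefault(relation, []).append((source, target, pc))
    return rows


def render_tsv(relation_rows):
    """Render the rows of one relation as tab-separated text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(relation_rows)
    return buffer.getvalue()


rows = to_rows(triples, attributes)
print(render_tsv(rows["write"]), end="")
```

Grouping by relation reflects the common Datalog convention of one fact file per relation; whether the real adapter writes one file per relation is not stated in the text.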

Analyses of Interest
In collaboration with General Motors, we identified three analyses of interest (behaviour alteration, recursion, and simplifiable global variable) and applied them to the industrial case study. Each of these analyses was originally applicable to a model of a single product, not a product line. We devised lifted versions of these analyses by expressing each as a set of Datalog rules and "executing" them using the lifted Datalog engine Soufflé ↑ . This way we were able to leverage the flexibility of using Datalog as a query language for expressing analyses. We were also able to leverage all the optimizations in Soufflé ↑ to help ensure that our analyses scale to industrial-size SPLs.

Behaviour Alteration Analysis
In our work, the primary analyses of interest are those that detect possible component interactions, where a component behaves differently in isolation versus when it is combined with other components (Muscedere et al. 2019). Such analyses are of particular interest to General Motors because of the large number of components and component combinations in their products and product lines. An engineering team will know its components well, but will not necessarily know all of the ways in which its components can affect the behaviours of components developed by other teams. One of the most complex types of component interaction is behaviour alteration (Muscedere et al. 2019), a form of data-flow component interaction, in which a change to a variable value made in one component alters the behaviour of another component. The specific instance of behaviour alteration used in this paper is (1) an assignment is made in component C1 to a variable v; (2) whose modified value impacts other variables through variable assignments and impacts other components through parameter passing; until (3) in another component Cn a variable x, whose value has been impacted by the modified value of v, is used in the decision condition of some control structure (i.e., an if, for, while, or switch statement) that (4) guards a function call. Thus, the analysis looks for a data flow from a variable assignment in one component to a control structure in another component, where the control-structure's statement block includes a function call. Figure 1 gives a simple example where the write to GlobVar in line 15 of component C1 could affect whether or not the function bar calls the function foo in component C2.
To analyze the software controllers provided by General Motors, we developed a specialized version of the behaviour alteration analysis, henceforth called the GM variant. This analysis has an additional requirement that a particular middleware component, which handles inter-component communications, cannot be the start-point or the end-point of a behaviour alteration path. This component is expected to communicate with multiple other components, and thus behaviour alteration paths that start or end with this component are uninteresting and would clutter the analysis results.

Recursion Analysis
A second analysis of interest to General Motors involves recursion. Automotive software typically uses little to no recursion in order to guarantee timing requirements. Our work initially focused on two types of recursion:
(1) function recursion
(2) component recursion
The first analysis detects functions that directly or indirectly call themselves via a cycle of function calls within a single component; and the second analysis detects a cycle of function calls involving functions from at least two distinct components. As we discuss in Section 6, results reported by the recursion analysis led GM engineers to ask us to perform a follow-up analysis to help them understand the instances and contexts of the reported occurrences of recursion. This follow-up analysis is discussed in Section 8.4.
The recursion analyses exemplify simple coding standards, like MISRA C 4 standards that are used in the automotive and other safety-critical industries. These kinds of code patterns can be expressed easily as Datalog queries.
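Both recursion checks reduce to reachability queries over the call graph, which the following Python sketch illustrates. It is not the Datalog query used in the study. The call facts and the component_of mapping are hypothetical, and reporting component recursion as the set of cross-component call edges that lie on a cycle is one possible output format among several.

```python
def transitive_closure(edges):
    """Naive fixpoint over a set of (caller, callee) pairs."""
    closure = set(edges)
    while True:
        derived = {(a, d) for (a, b) in closure for (c, d) in edges if b == c}
        if derived <= closure:
            return closure
        closure |= derived


def function_recursion(calls, component_of):
    """Functions on a call cycle that stays within a single component."""
    intra = {(a, b) for (a, b) in calls if component_of[a] == component_of[b]}
    return {a for (a, b) in transitive_closure(intra) if a == b}


def component_recursion(calls, component_of):
    """Cross-component call edges that lie on some call cycle."""
    closure = transitive_closure(calls)
    return {
        (a, b)
        for (a, b) in calls
        if component_of[a] != component_of[b] and (b, a) in closure
    }


# Hypothetical call graph: f and g call each other inside C1,
# and g -> h -> k -> g forms a cycle that spans C1 and C2.
calls = {("f", "g"), ("g", "f"), ("g", "h"), ("h", "k"), ("k", "g")}
component_of = {"f": "C1", "g": "C1", "h": "C2", "k": "C2"}

print(sorted(function_recursion(calls, component_of)))
print(sorted(component_recursion(calls, component_of)))
```

A call cycle involves at least two components exactly when it contains a cross-component edge, which is why checking whether the callee of such an edge can reach its caller is sufficient.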

Simplifiable Global Variable Analysis
The third analysis of interest detects a code pattern that was of particular interest to a General Motors engineer for potential code refactorings. Specifically, a simplifiable global variable is a global variable that is used only to pass data to a single function. If a global variable is simplifiable, then it can likely be refactored as a parameter of the function that reads from it, which is useful because global variables can introduce unnecessary couplings of components and potential logical errors in maintaining their state. Figure 5 illustrates a simplifiable global variable CtrlIdx, where X and Z are the only functions in the program that call function Y.
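One way to operationalize this pattern is sketched below in Python. The precise rule used in the study is not given in this section, so the sketch adopts an assumed reading: a global variable is reported when exactly one function reads it and the functions that write it are exactly the callers of that reader. The fact sets, the second variable Mode, and the function W are hypothetical; the names CtrlIdx, X, Y, and Z follow the description of Fig. 5, with X and Z assumed to write CtrlIdx and Y assumed to read it.

```python
def simplifiable_globals(global_vars, reads, writes, calls):
    """Map each simplifiable global to the single function that reads it.

    Assumed criterion: exactly one reader, and the writers are exactly
    the functions that call the reader.
    """
    result = {}
    for var in global_vars:
        readers = {f for (f, v) in reads if v == var}
        writers = {f for (f, v) in writes if v == var}
        if len(readers) != 1:
            continue
        (reader,) = readers
        callers = {caller for (caller, callee) in calls if callee == reader}
        if writers and writers == callers:
            result[var] = reader
    return result


# Hypothetical facts: CtrlIdx is written by X and Z and read only by Y;
# Mode is read by two functions and therefore is not reported.
global_vars = {"CtrlIdx", "Mode"}
reads = {("Y", "CtrlIdx"), ("Y", "Mode"), ("W", "Mode")}
writes = {("X", "CtrlIdx"), ("Z", "CtrlIdx"), ("X", "Mode")}
calls = {("X", "Y"), ("Z", "Y")}

print(simplifiable_globals(global_vars, reads, writes, calls))
```

Under this reading, every call site of the reader already has the value at hand, so the global could be replaced by a parameter without changing any other function.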

Interactive Visualization
Our pipeline includes an interactive visualizer that supports inspection of the analysis results by visually encoding which facts and analysis results belong to which software products. Because the results of a lifted analysis are inferred paths in the factbase, they can be portrayed as edges in a graphical model representing the analysis results. Although a graphical model can concisely represent the analysis results, the task of understanding how the results apply to specific SPL configurations requires the engineer to read and compare presence conditions on multiple edges. To facilitate this, we developed an interactive visualizer that enables the engineer to apply coloured filters to the results to help identify groups of paths occurring in related software products.
Our visualizer is implemented on top of the Neo4j Browser, 5 the user interface provided by the open-source graph database Neo4j. 6 As a database engine, Neo4j enables the storing and querying of graphical data like the facts and results of our analyses. As such, we import our results into an instance of the Neo4j database to be queried and visualized.
Figure 6 shows a pedagogical example of analysis results comprising functions (f1, f2), variables (v1, v2), their relationships, and their respective presence conditions over feature variables (FA, FB, FC, FD) in a visualization frame. Each visualization frame has a central interactive area in which the graphical results are displayed; the user can use native Neo4j Browser facilities to zoom in and out, and rearrange the layout of the graph. The visualization frame also includes an expandable sidebar that gives an overview of the data being represented and provides options to customize the appearance of elements, including node size, edge thickness, and the information presented on the labels. For each edge in the graph, the visualizer displays both the type of the edge and the edge's presence condition.
The edge type appears in bold whereas the presence condition is located on the opposite side of the edge. Each presence condition labelling an edge indicates the software products for which that relationship applies.
We have enhanced the Neo4j Browser to support the exploration of our analysis results based on user-specified filters (see Fig. 7). In the textbox on the top right, the engineer specifies a filter as a presence condition representing a set of SPL configurations, and the visualizer highlights the subset of results that satisfy the filter's presence condition. Our visualizer employs Logic Solver, 7 a Boolean satisfiability solver, to determine, for each fact and each filter, whether the fact's presence condition satisfies the filter's presence condition. The visualizer automatically assigns a distinct colour to the edges that satisfy the filter, thereby preserving the original results as well as highlighting the filtered results. Multiple filters can be applied to the same analysis results, producing a colour-coded graph visualization that highlights which analysis results pertain to specific configurations. For example, as shown in Fig. 7, the edge between nodes f1 and f2 reports a behaviour alteration that originates in function f2 and manifests in function f1; this behaviour alteration is present only in products that satisfy the presence condition FA∧FB∧FC∧FD. After applying filters, the engineer can inspect the legend at the bottom right corner showing the filters applied and their colours (shown in Fig. 7-B).
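The filtering step can be approximated with a brute-force check in Python. This is not the visualizer's implementation and does not use Logic Solver. As an assumption, an edge is taken to satisfy a filter when the edge's PC and the filter's PC share at least one configuration. The edges, filters, and colour names below are hypothetical.

```python
from itertools import product

FEATURES = ("FA", "FB", "FC", "FD")


def configurations():
    """Enumerate every assignment of truth values to the feature variables."""
    for values in product((False, True), repeat=len(FEATURES)):
        yield dict(zip(FEATURES, values))


def overlaps(pc, filter_pc):
    """Assumed criterion: some configuration satisfies both the PC and the filter."""
    return any(pc(config) and filter_pc(config) for config in configurations())


def highlight(edges, filters):
    """For each edge, list the names of the filters that it satisfies."""
    return {
        edge: [name for name, filter_pc in filters.items() if overlaps(pc, filter_pc)]
        for edge, pc in edges.items()
    }


# Hypothetical results: each edge maps to its PC, written as a predicate.
edges = {
    ("f2", "f1"): lambda c: c["FA"] and c["FB"] and c["FC"] and c["FD"],
    ("f1", "v1"): lambda c: c["FA"] and not c["FB"],
}
# Hypothetical filters: the second adds feature FB to the first.
filters = {
    "yellow": lambda c: c["FA"],
    "blue": lambda c: c["FA"] and c["FB"],
}

for edge, colours in highlight(edges, filters).items():
    print(edge, colours)
```

Enumeration is exponential in the number of features; a satisfiability solver answers the same question without listing every configuration, which matters for product lines with many features.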
Through the application of one or more filters, the engineer can explore and better understand the analysis results. A single filter can be used to determine and visualize which results apply to a particular set of software products of interest. Alternatively, the engineer can compare how two or more sets of software products differ in their analysis results by applying multiple filters, one for each product set. Moreover, the engineer can see the effects of adding or removing a single feature from a product set by applying filters that include or exclude the feature of interest and seeing which results are highlighted by the different filters. Figure 8(a) shows an example of creating a single filter to identify the analysis results that apply to a specific set of software products. Figure 8(b) shows the effects of applying a second filter to the same visualization to assess the impact of adding a feature to the first filter. The edges coloured yellow highlight the analysis results that satisfy only the first filter and the edges coloured blue and yellow highlight the analysis results that satisfy both filters. 8 Visualization of numerous analysis results is always a concern (Von Landesberger et al. 2011), but there are several facilities within the tool chain for searching and scoping the presented results. Firstly, the analysis itself (i.e., the Datalog query) can be used to limit the number of results returned and in some cases can prioritize results (e.g., prioritize shortest or longest path results, or prioritize results that involve the largest number of components). The analysis query can also be refined to focus only on specific components. Secondly, the engineer's web browser's text-search feature can be used to search the graph labels to localize results related to particular components, functions, or variables. 
Thirdly, native Neo4j Browser features can be used to include or exclude node or edge types from the visualization, to more easily focus on results that pertain to nodes and edges of interest. 9 Finally, our extension to Neo4j Browser can be used to highlight the analysis results that pertain to product sets of interest. In Section 7, we report on a small user study in which GM engineers evaluate our interactive visualizer for the tasks of viewing, searching, highlighting, and comparing analysis results for different product sets.
Fig. 7 Three main sections of the analysis result visualizer: (A) the text box allows the user to create filters with feature expressions, (B) the legend box shows the colour and shape of the edges mapped to each filter, and (C) the overview sidebar allows the user to customize visual parameters like the colour, size, and shape of the nodes and edges
Fig. 8 The visualization of a subset of the analysis results and the effects of (a) applying an initial filter (yellow edges) showing the analysis results that apply to a given program configuration and (b) applying a second filter (blue edges) that adds a new feature to the initial (yellow) filter

Industrial SPL Examples
To assess scalability, we applied our analyses of interest to SPL models extracted from seven vehicle controller product lines provided by General Motors, which are abstractly named SPL-A, SPL-B,..., SPL-G to obfuscate sensitive industrial data. Metrics on the sizes of all seven product lines are shown in Table 1. For example, SPL-A has 5431 header (.h) files, with a total of 350,102 lines of code (LOC). It also has 5133 C language source files (.c), totalling 730,947 lines of code. Of particular interest is controller SPL-G, which provided a stress test of our tool chain. SPL-G is significantly larger than the other controllers, has many more features and feature combinations, and includes a middleware component that leads to an order of magnitude more results than analyses of the other controllers. The General Motors controllers encode the inclusion or exclusion of features using what are called configuration parameters. Configuration parameters are represented as global constants of enumerated types (enum) or Boolean type (bool). 10 The values of these parameters are defined at deployment time during vehicle manufacturing (Young et al. 2017).
Such an encoding of variability means that the source code includes all of the code relevant to all features. Thus, each controller code-base is a 150% representation of the controller's SPL, and an individual controller product is configured by setting the values of these configuration parameters.
For some of our analyses (specifically, behaviour alteration analysis and component recursion analysis), the most interesting results are the paths between functions that reside in different components. However, there is no identifiable notion of a component unit in C/C++ source code. Components in the General Motors controllers are made up of collections of source-code files, so we cannot use compilation units as the delimiters of components. Instead, General Motors shared with us a high-level decomposition of their code into components, and we incorporated this information into a controller's factbase as additional facts: we introduced a component entity fact for each distinct component and a contains relationship fact between each component entity and its constituent source files. The additional facts allowed us to adapt our analyses to avoid reporting intra-component results (please see the last line in Fig. 2).
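To make the role of the component facts concrete, the following sketch shows how a contains relation can be used to drop intra-component results. It is only an illustration in Python, not the Datalog rule from Fig. 2; all component, file, and function names are hypothetical.

```python
# Hypothetical "contains" facts: (component, source file)
contains = {("CompA", "a1.c"), ("CompA", "a2.c"), ("CompB", "b1.c")}

# Hypothetical facts recording the file in which each function is defined
defined_in = {"f1": "a1.c", "f2": "a2.c", "g1": "b1.c"}

# Hypothetical analysis results: (source function, target function)
results = [("f1", "f2"), ("f2", "g1"), ("g1", "f1")]

# Invert the contains relation so each file maps to its component
file_to_component = {src_file: comp for comp, src_file in contains}

def component_of(function):
    """Look up the component that owns the file defining the function."""
    return file_to_component[defined_in[function]]

# Keep only results whose endpoints live in different components
inter_component = [
    (src, dst) for src, dst in results
    if component_of(src) != component_of(dst)
]
print(inter_component)
```

Here the f1 to f2 result is dropped because both functions belong to CompA, even though they are defined in different source files.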

Applying Analysis to the Industrial Examples
One of the primary goals of this project was to validate that the variability-aware Datalog analysis approach (Shahin et al. 2019) is scalable to real-world industrial SPLs. We informally define scalability as having a marginal performance overhead compared to analyzing the 150% representation of the SPL, which implicitly means having an exponential speedup compared to product-based analysis of each single product individually.
For each controller SPL, we used Rex↑ to automatically extract both a 150% representation (i.e., a model representing a single product with all features present) and an SPL model, with feature variability represented as presence-condition annotations on facts. We translated the extracted facts into Datalog facts. For each of the analyses, we applied Soufflé (version 1.3.1) to the 150% representation and applied Soufflé ↑ to the factbase annotated with presence conditions. To assess performance, we repeated each analysis experiment five times, removed the minimum and maximum execution times (to marginalize the effect of noise from the execution environment), and report the mean execution times and standard deviations for both the 150% representation and the product line. Tables 2, 3   nearest hundreds). For example, when applying the behaviour alteration analysis (Table 2) to SPL-A, the number of input facts relevant to behaviour alteration is 157303, 698 of which are variational, corresponding to 0.44% of the relevant input facts.
The software of a vehicle has many variation points and thus configuration involves many configuration parameters (Young et al. 2017). In our SPL examples, the code has several hundred configuration parameters (Features in Tables 2-6) in each of the controllers. Because the number of possible products is exponential in the number of configuration parameters, the large number of configuration parameters makes analyzing individual product variants infeasible. As reported in Table 2, the behaviour alteration analysis of the 150% representation of SPL-A using Soufflé takes 3.93 seconds, and the number of inferred facts (150% Outputs) is 128780. Among those, 1191 facts are the end results of the analysis (150% Results). In comparison, applying the same analysis to the variational facts of the same SPL-A product line using Soufflé ↑ takes 6.49 seconds (an overhead of only 64.82%). 
Soufflé ↑ infers 128759 new facts (Outputs), 505 of which are variational (VOutputs). Among the outputs, 1189 are end results of the analysis (Results). The number of distinct presence conditions calculated as part of the analysis is 198.
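The measurement procedure can be sketched as follows. The run times below are hypothetical placeholders rather than measured data, the function names are ours, and the use of the sample standard deviation is an assumption; only the fact counts for SPL-A are taken from the text.

```python
from statistics import mean, stdev

def trimmed_stats(times):
    """Drop the fastest and slowest run, then return (mean, sample std dev)."""
    trimmed = sorted(times)[1:-1]
    return mean(trimmed), stdev(trimmed)

def overhead_percent(base_time, lifted_time):
    """Relative slowdown of the lifted analysis over the 150% analysis."""
    return (lifted_time - base_time) / base_time * 100.0

# Hypothetical execution times (seconds) for five repetitions
base_runs = [4.0, 3.8, 4.2, 9.0, 1.0]     # analysis of the 150% representation
lifted_runs = [6.0, 5.8, 6.2, 2.0, 11.0]  # variability-aware analysis

base_mean, base_sd = trimmed_stats(base_runs)
lifted_mean, lifted_sd = trimmed_stats(lifted_runs)
overhead = overhead_percent(base_mean, lifted_mean)

# Share of variational input facts for SPL-A (numbers from the text)
variational_share = round(698 / 157303 * 100, 2)
print(base_mean, lifted_mean, overhead, variational_share)
```

Removing the two extreme runs keeps a single outlier (such as the 9.0 or 11.0 second runs above) from distorting the reported mean.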
Considering all our analyses, variational analysis time overheads range from 6.74% (simplifiable global variable analysis applied to SPL-A) to 254.36% (recursion checking applied to SPL-G). Recall that the cost of product-based analysis, where each product of an SPL is analyzed separately, grows exponentially with the number of features (Liebig et al. 2013). Yet the execution-time overhead of our variability-aware analyses does not seem to correlate with the number of SPL features. The marginal overheads incurred can be considered very acceptable, at least in cases like our industry examples, where a system has hundreds of features but sparse variability in terms of the percentage of facts annotated with presence conditions. We also note that for a computation-intensive analysis like simplifiable global variable analysis, the overheads are significantly lower than those of the other, lighter-weight analyses, in part because this analysis is so costly when applied to the SPLs' 150% representation models. This indicates that the extra cost of presence-condition manipulation is amortized over the execution time of the analysis.
In addition to execution time, we also measured the number of facts inferred by the analyses, including all intermediate facts generated as part of the computation of final results. There are two reasons behind the discrepancy in the number of facts inferred when analyzing the 150% representation versus applying Soufflé ↑ to variational factbases: (1) variability-aware analysis excludes facts that have unsatisfiable presence conditions, whereas in the analysis of a 150% representation, all inferred facts are deemed to be feasible; and (2) variational aggregator operators (e.g., sum, count) might generate multiple results when applied to a set of facts, whereas a non-variational aggregator always generates a single result.
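The second reason can be illustrated with a small sketch of a variational count. It enumerates configurations explicitly, which a lifted engine is designed to avoid, so it only shows the intended semantics under our assumptions; the facts, the feature name FA, and the function name are hypothetical.

```python
from collections import defaultdict
from itertools import product

def variational_count(facts, features):
    """Group configurations by the number of facts present in each of them."""
    by_count = defaultdict(list)
    for values in product([False, True], repeat=len(features)):
        cfg = dict(zip(features, values))
        present = sum(1 for _, pc in facts if pc(cfg))
        by_count[present].append(cfg)
    return dict(by_count)

# Hypothetical annotated facts: (fact, presence condition)
facts = [
    ("call(f, g)", lambda c: True),     # present in every product
    ("call(f, h)", lambda c: c["FA"]),  # present only when FA is selected
]

# A non-variational count over the 150% representation yields one result
count_150 = len(facts)

# The variational count yields one result per distinct outcome
counts = variational_count(facts, ["FA"])
print(count_150, counts)
```

The 150% representation reports a single count of 2, whereas the variational aggregate reports 1 for products without FA and 2 for products with FA.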
We also measured the total number of unique presence conditions (Distinct PCs) computed during the inference process. 11 To our surprise, the number of unique presence conditions in each controller was usually smaller than (and in one case roughly equal to) the number of configuration parameters in the controller, which is far fewer than the number of possible combinations of features. Thus, although the controller's SPL technically supports an exponential number of configurations (2^N products given N features), the number of variants mentioned in the source code as presence conditions is much smaller. Taking a further look at the presence conditions, we found that many features always appear together in a presence condition. This kind of feature correlation is not uncommon in SPLs (Apel and Beyer 2011).
In summary, with a performance overhead of only 6.74%-254.36% compared to the analysis of the single product with all features present (the 150% representation), our evaluation shows that variability-aware analysis scales to large-scale industrial software product lines with hundreds of features.

GM Engineers' Feedback on Graph Visualization
Graph visualization tools can include a number of features to help engineers cope with a large amount of data (see Section 4). In this section, we report on a semi-formal user survey with General Motors engineers that evaluates the effectiveness of our extension to the Neo4j Browser for inspecting conditional (product-specific) analysis results in a graph visualization. Specifically, we asked for General Motors engineers' feedback on three aspects of our work: (1) the ability to associate variability-aware analysis results with product sets, (2) the preferred way to represent results of variability-aware analyses, and (3) the utility of coloured filters to help engineers explore and focus on subsets of results.

Methodology
We began by delivering an online presentation that showed the output of variability-aware analyses (including presence condition annotations), the graphical and tabular representation of such results, and the expression of filters to highlight subsets of results that pertain to specific product sets. We created a video of this same presentation for those engineers who were unable to attend the presentation. We also provided access to a prototype of our interactive interface so that engineers could experiment with the interactive graph visualization and its coloured filters.
The survey comprised three sections, each asking a set of questions about the participant's preferences using a 5-point Likert scale (ranging from a strong preference to a strong dislike) followed by opportunities to provide free-form answers about the rationale behind their preferences.
-The first section focuses on the participant's interest in seeing analysis results annotated with software variants and contains four questions (including two optional open-ended questions) that assess the degree to which the association of analysis results with product sets is useful in understanding the results.
-The second section asks participants about their preference between graphical and tabular formats. A tabular format that lists results (e.g., program elements or paths that match some pattern of interest), each annotated with a presence-condition attribute, is more conventional. A graphical format presents the same information with graph edges annotated with their respective presence condition. This section of the survey comprises seventeen questions asking about the participants' general preference (e.g., "Overall, which format do you prefer?") and task-specific preferences (e.g., reading, searching, understanding). At the end of the section, the respondents were asked to rank how the presented task-specific scenarios influenced their general preferences.
-The third section of the survey gauges the participants' interest in using coloured filters to highlight analysis results. The coloured filters apply to both tabular and graphical representations, highlighting subsets of results based on user-specified product sets of interest. Similarly to the previous section, the survey asks seventeen questions about the participants' general preference (e.g., "Overall, which visualization (coloured or uncoloured) would be faster and easier to read, understand, find, and report results and their associated configuration expressions?") and task-specific preferences (e.g., "to access the impact of analysis results from adding or removing a feature from a configuration expression"). At the end of the section, participants could also rank the presented task-specific scenarios with respect to their general preferences.
-The survey concludes with two open-ended questions asking participants about their general impression of the presented features of the tool and their suggestions for improvement.
Six senior software engineers from General Motors responded to our survey. The complete set of materials used in this study (e.g., slides, video, survey questions, responses) is publicly available. 12

Answers: Associating Variability-Aware Analyses Results with Product Sets
All of the respondents reported a preference for associating analysis results with a set of products. Most respondents (5 out of 6) considered it useful to know which analysis results belong to a single product. One of the participants explained how the association between results and product sets could help with understanding the results:

"Configurable software operates in many different ways which is fundamentally hard to keep clear in doing software tasks, knowing when code is active/inactive is a key understanding element . . . [products set] knowledge allows you to understand the shared interactions, I cannot change something as unique to variants if shared across variants" (P1).
The observed clear preference for understanding sets of results justifies and guides our efforts for building visualizations of variability-aware analysis results.

Answers: Presentation of Variability-Aware Analysis Results
The respondents expressed clear but diverse preferences for graphical versus tabular results. One participant (P5) prefers the graphical representation for most of the task-specific scenarios presented in the survey. Three participants (P2, P3, P4) prefer the opposite, favouring the tabular format for the same set of tasks. The other two respondents (P1, P6) prefer different representations for different tasks and showed interest in having an interface that provides both formats. For example, P6 said they would like a graphical view of results to read data values and a tabular view to find particular results.
Two participants who favour the tabular format admitted to preferring the graphical format (or not having a preference) if the data could be further queried to focus on a subset of the results. Such comments reflect a concern (shared by most participants) that a graph representation would become difficult to read once it becomes large enough. For the participants who prefer the graphical format, the graph helps direct their focus when inspecting the results: "I really prefer them both together, the graph provides an easier pattern perspective of the whole but can be hard to consume details. I would see myself using the graph to find nodes then the table to dig in on detail" (P1).
"The graphical format allows me to quickly zoom in to the part of the dataflow I want to analyze further, without having to stop and mentally connect the individual call links" (P5).
The received answers provide sufficient evidence for the need to provide support for both tabular and graphical formats. Further investigation of how the two representations could cooperate to deliver the best experience to engineers is left for future work.

Answers: Utility of Coloured Highlighting of Filtered Analysis Results
Most of the respondents indicated that coloured filters are strongly helpful for the overall understanding of analysis results. They said that the use cases that most benefitted from the coloured filters were: (a) identifying results associated with a single product instance, (b) assessing how a change in software configuration impacts the analysis results, and (c) comparing facts associated with different product sets. P5 explained that the coloured filters could be especially helpful in understanding the impact of configuration decisions: " [...]where I really need to do detailed analysis is on product instances within that set (e.g., if I set one value within a product set a specific way, exactly which subset of the original filtered set of possible paths is impacted?). This seems like the most useful case for understanding behaviour, or whether my software design has any gaps in the conditions I have set up, where I expect a common set of dataflows to be highlighted for every one [product] of a product set, but for an individual product instance within that collection there is a difference." Another respondent suggested that presence-condition annotations should include the number of variants in the presence condition's product set, and we intend to implement this suggestion in our tool.
One participant (P2) asserted that coloured filters would not be helpful to them because they are colour-blind: in fact, colour choices could make it hard for them to read and distinguish between results. This particular respondent may have been influenced by the survey's use of specific colours (red, blue, yellow) when showing images of highlighted (filtered) results, whereas in practice, the choice of filter colours is controllable by the user. In any case, the participant's response shows a limitation of this feature for those engineers who experience full colour blindness.
To summarize, the use of coloured filters to highlight subsets of analysis results looks promising but should be confirmed by further study. Respondents' feedback classifies coloured filters as a useful feature but also points out potential improvements and limitations that should be considered for an optimal experience.

Threats to Validity
The main threat to validity of our results is the small number of engineers who participated in our user study. We shared the survey with twelve engineers, and they were invited to share the materials with other engineers who they thought would be interested and appropriate. We received six responses to our survey, all from senior engineers with 20 to 30 years of work experience as software engineers and 15 to 30 years of experience working with configurable software. We believe that the respondents' seniority and expertise lend significant weight to their answers, even if the number of respondents is not enough for study results to be statistically significant. We are currently working on our next user study that aims to reach a broader group of engineers. Our evaluation is decidedly semi-formal and qualitative rather than quantitative. Yet we were able to collect preliminary but consistent evidence that the visualization tooling we are building is effective.

Lessons Learned
In this section, we reflect on some of the lessons learned by conducting these empirical studies.

Variability Annotation
Different techniques have been used to annotate segments of source code with feature expressions, effectively deciding which pieces of code belong to which features. For example, CIDE (Kästner et al. 2009a) is a colour-based tool that highlights segments of code with different colours, each of which represents a feature. The most commonly used annotation mechanism in industrial product lines is the C Pre-Processor (CPP) (Ernst et al. 2002; Liebig et al. 2010). The CPP provides a high degree of flexibility when annotating source code, allowing for lexical rather than syntactic annotation. This means that any sequence of lexemes (tokens), even if the sequence by itself is not syntactically valid, can be assigned a presence condition. As a result, the 150% representation of an SPL annotated with CPP directives is typically not syntactically well-formed, requiring variability-aware parsing (Gazzillo and Grimm 2012; Kästner et al. 2011).
The product lines from General Motors, however, use a different annotation mechanism. C-language constants (following a naming convention) are used within the source code to indicate features. Those constants are assigned values as a part of the product configuration process. Feature-specific code is thus enclosed within C-language conditional statements, relying on the compiler to evaluate the compile-time constants at compile time and to eliminate dead-code corresponding to features not included in the product being built.
This annotation technique has two direct consequences. First, while it is less flexible than CPP directives, it does not require variability-aware parsing because the entire product line is a syntactically well-formed C-program. Secondly, existing analysis tools can be applied to the entire product line, in the same way as regular parsers can be applied to it. The downside is that each result of a given analysis is not labeled with the distinct set of products to which it applies. This draws a clear distinction between analyzing the 150% representation of a product line, in the case where it is well-formed and readable by an analysis tool, and variability-aware analysis, where both inputs and outputs of the analysis need to be appropriately annotated.
An indirect consequence of the annotation technique used by General Motors is the possibility of filtering analysis results through a user-provided feature expression of interest and presenting only the facts whose presence conditions satisfy it. This capability has the potential to improve the readability of the data and to support the user in visually inspecting the analysis results.

Variability Encoding
Soufflé ↑ can only handle binary features; that is, a feature can be either present or absent. However, the SPLs we analyzed in this project also encode sets of mutually exclusive features using C-language enum data types. For example, if Feat0, Feat1, Feat2, and Feat3 form a set of four mutually exclusive features, it is a common C-language idiom to encapsulate them in an enumerated data type, here called FeatSet, whose enumerators are Feat0, Feat1, Feat2, and Feat3.
Enumerated data types in C are integral types, allowing the use of mathematical operators (e.g., addition, bit-wise disjunction) and comparison operators on their values. We came across cases where presence conditions included comparison operators on values of enumerated data types, and we had to abstract those predicates into propositional symbols. For example, if x is a constant of type FeatSet, then the expression x < Feat2 is a logically valid presence condition, but is not acceptable in Soufflé ↑ . We apply a syntactic transformation for these kinds of expressions, turning the above expression into a Boolean variable x_LT_Feat2, where the LT sub-string stands for less-than. We use similar substitutions for other comparison operators. These transformations are applied by a post-processing script that is executed on the facts before they are added to the factbase. The transformations are limited to comparison operators (not arithmetic or bitwise operators), and the expressions to which a feature variable is compared are limited to enum constants.
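A syntactic transformation of this kind could look like the following sketch. It is not the script used in the project: the regular expression, the function name, and every operator abbreviation other than LT are our assumptions, as is the spelling of the resulting variable with underscores.

```python
import re

# Only "LT" comes from the text; the other abbreviations are hypothetical
OPS = {"<=": "LE", ">=": "GE", "==": "EQ", "!=": "NE", "<": "LT", ">": "GT"}

# Enum constants that may appear on the right-hand side of a comparison
ENUM_CONSTANTS = {"Feat0", "Feat1", "Feat2", "Feat3"}

# identifier, comparison operator, identifier (two-character operators first)
PATTERN = re.compile(r"\b([A-Za-z_]\w*)\s*(<=|>=|==|!=|<|>)\s*([A-Za-z_]\w*)\b")

def abstract_comparisons(presence_condition: str) -> str:
    """Replace comparisons against enum constants with Boolean variable names."""
    def rewrite(match):
        lhs, op, rhs = match.groups()
        if rhs not in ENUM_CONSTANTS:
            return match.group(0)  # leave other comparisons untouched
        return f"{lhs}_{OPS[op]}_{rhs}"
    return PATTERN.sub(rewrite, presence_condition)

simple = abstract_comparisons("x < Feat2")
nested = abstract_comparisons("(x < Feat2) && FA")
untouched = abstract_comparisons("x < y")
print(simple, nested, untouched)
```

Restricting the rewrite to known enum constants mirrors the limitation stated above: comparisons against anything else are left as they are rather than silently abstracted.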
The fact that the four features belonging to FeatSet are mutually exclusive can be expressed as a constraint on feature variables in the feature model of the product line: the fragment of the feature model representing this property for FeatSet states that no two of the four features are present together. If a feature from FeatSet is mandatory, we also need to add a disjunction over all four features to the feature model.
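One common way to encode these two constraints is a pairwise exclusion between features plus a disjunction over them; the sketch below checks that encoding by brute force. The encoding is our assumption of how such a feature-model fragment could be written, not a reproduction of the original formulas.

```python
from itertools import combinations, product

FEATS = ["Feat0", "Feat1", "Feat2", "Feat3"]

def mutually_exclusive(cfg):
    """Pairwise exclusion: no two features of the set are selected together."""
    return all(not (cfg[a] and cfg[b]) for a, b in combinations(FEATS, 2))

def at_least_one(cfg):
    """Disjunction over all features, needed when a selection is mandatory."""
    return any(cfg[f] for f in FEATS)

# All 16 assignments of the four Boolean feature variables
configs = [dict(zip(FEATS, values))
           for values in product([False, True], repeat=len(FEATS))]

# Valid configurations when the feature set is optional
optional_ok = [c for c in configs if mutually_exclusive(c)]

# Valid configurations when one feature of the set is mandatory
mandatory_ok = [c for c in optional_ok if at_least_one(c)]
print(len(optional_ok), len(mandatory_ok))
```

With the exclusion constraint alone, the empty selection remains valid alongside the four single-feature selections; adding the disjunction removes it, leaving exactly one valid configuration per enumerator.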

Scalability of Lifted Analysis
In theory, the complexity of software product line analysis is expected to grow with respect to the number of product line features (Liebig et al. 2013). Product variants compose features together, thus the number of product variants typically grows exponentially with the number of features. The idea behind lifting analyses to product lines is to leverage the commonality among different product variants as much as possible to keep the cost of product line analysis reasonable, as opposed to enumerating and analyzing each product variant by itself, which is intractable in most practical cases.
The product lines we analyzed in this study have hundreds of features each, which means that enumerating each product is not an option. The variability-aware overhead reported for Soufflé ↑ in earlier work (Shahin et al. 2019) is marginal, but that was reported for relatively small benchmarks of only tens of features each. Results presented in Shahin et al. (2021b) show that the performance overhead of full product line analysis using Soufflé ↑ is still marginal for industrial product lines, with hundreds of features. Evaluation results in Section 6 reinforce those findings with a more diverse set of analyses and one more (significantly bigger in size) benchmark from General Motors.
Looking further into the results, the performance overhead does not seem to correlate with the size of the code-base, the size of the extracted model (number of facts), or the number of features of the SPL. This can be explained by differences between the subject SPLs with respect to the code patterns directly relevant to the particular analysis applied. In addition, measuring the number of unique presence conditions generated throughout the analysis sheds some light on how tightly some features are coupled in industrial product lines, causing the effective complexity of the analysis to be lower than what might be expected given the number of features.

Utility of Analysis Results in Practice
Automated analyses help engineers identify underlying facts about the program. For example, General Motors engineers were surprised by the results of the recursion analysis and were interested in tracking down the causes and occurrences of some of them. While recursive code is not necessarily prohibited in automotive software, detected instances are worthy of inspection as they can be resource intensive (with respect to memory and time). The followup analyses focused on recursion results detected in SPL-G, in part because we could collaborate with a General Motors engineer familiar with this controller. The followup analyses reported more detailed results including the software components involved, the function names, and code snippets. For SPL-G, these analyses detected three functions that directly call themselves (direct recursion) and four pairs of functions that mutually call each other (indirect recursion). Upon examining these results, the engineer was able to determine that: (1) all instances of direct recursion and two instances of mutual recursion belong to a component dedicated to testing the system code and thus the recursion was deemed not harmful to the well-functioning of the system; and (2) the two other instances of mutual recursion belong to the system code, but their constituent function calls cannot occur together because they are guarded by mutually exclusive conditions on program variables.
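The two recursion patterns reported above can be illustrated on a toy call relation. This sketch is written in Python rather than as the Datalog queries used in the study, it only covers self-calls and pairs of functions that call each other (not longer call cycles), and all function names are hypothetical.

```python
# Hypothetical call facts: (caller, callee)
calls = {
    ("parse", "parse"),    # a function that calls itself
    ("even", "odd"),
    ("odd", "even"),       # two functions that call each other
    ("main", "parse"),
}

# Direct recursion: a function appears as its own callee
direct = sorted(caller for caller, callee in calls if caller == callee)

# Mutual recursion between pairs: both (f, g) and (g, f) are call facts.
# Sorting each pair avoids reporting (f, g) and (g, f) separately.
mutual = sorted({
    tuple(sorted((caller, callee)))
    for caller, callee in calls
    if caller != callee and (callee, caller) in calls
})
print(direct, mutual)
```

A report in this form lists each function or pair once, which matches the granularity of the findings discussed with the engineer (three functions and four pairs for SPL-G).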
Ultimately, our recursion analyses did not identify any problematic instances of recursion (at least in SPL-G). However, the original analysis and its followups were still deemed useful: (1) because our analyses are expressed in a query language, we were able to specialize both the followup analyses and the level of detail reported in the analysis results; and (2) the engineer increased their confidence in the code.

Related Work
Variability-Aware Analysis
Different kinds of source-code analyses have been reimplemented to be variability aware (Thüm et al. 2014). For example, the TypeChef project implements variability-aware parsing (Kästner et al. 2011) and type checkers (Kästner et al. 2012) for Java and C. The SuperC project (Gazzillo and Grimm 2012) is another C language variability-aware parser. With respect to model-based analyses, the Henshin graph-transformation engine (Arendt et al. 2010) was lifted to support product lines of graphs (Salay et al. 2014). These lifted analyses were written from scratch, without reusing any components from their respective product-based analyses. Our approach, on the other hand, lifts an entire class of product-based analyses written as Datalog rules, by lifting the inference engine (and inferring presence conditions together with facts).
SPL Lift (Bodden et al. 2013) extends IFDS (Reps et al. 1995) data-flow analyses to product lines. Model checkers based on Featured Transition Systems (Classen et al. 2013) check temporal properties of transition-system models where transitions can be labeled with presence conditions. Both of these SPL analyses use almost the same single-product analyses on a lifted data representation. At a high level, our approach is similar in the sense that the logic of the original analysis is preserved, and only data is augmented with presence conditions. Still, our approach is unique because we do not touch any of the Datalog rules comprising the analysis logic itself.
In this paper, we use a lifted query language to implement analyses instead of lifting existing analyses. In particular, we use a variability-aware Datalog engine (Shahin and Chechik 2020b) that implicitly lifts analyses written in Datalog (Shahin et al. 2019). This approach has also been recently extended to lift analyses written in more expressive, Turing-complete languages (Shahin and Chechik 2020a).

Variability-Aware Visualization
Our work on visualization differs from previous works in SPL visualization in both the type of information that is visualized and the user's ability to filter and highlight information.
The conventional use of colour in SPL visualization is to distinguish features or variabilities in source code or feature models. Tools like CIDE (Kästner et al. 2009b), FeatureMapper (Heidenreich et al. 2008), fmp2rsm (Czarnecki and Pietroszek 2006), and FeatureVISU (Apel and Beyer 2011) enable colouring of model entities or source-code fragments according to their association with a set of features that the user selects. Visualization tools that employ interactive techniques, such as detail-on-demand and highlighting, are proven to contribute to the engineer's comprehension of a product line and to their productivity in modifying the feature configurations (Asadi et al. 2016).
The visualization presented in Loesch and Ploedereder (2007) similarly supports analysis of the feature configurations of a software product line. The authors use Formal Concept Analysis to identify and remove obsolete variable features to optimize the task of configuring the product line. Their graph visualization explores the spatial distribution of the nodes (representing concepts) and the node sizes, to encode the difference between them and the number of feature variables associated with each node, respectively. The authors use black, white, and gray colours to indicate the number of features attached to each node and identify obsolete features. Our work differs in terms of the goal of the analysis and data presented in the visualization.
Closer to our work are the visualizations provided by VISIT-FC (Botterweck et al. 2008) to support the understanding of possible consequences of the engineer's decisions. The tool provides an interactive view connecting three models (decision, feature, and component models) where the user can select features and visualize the decisions and components related to their selection and the relations between them. The traceability is visualized by explicit links connecting the models' components. These links are highlighted by colouring all non-relevant entities in gray, so that only the information relevant to the engineer remains in colour. Their analyses reflect consequences of decisions about a single configuration, whereas our analyses visualize results for sets of products.
Recently, Strüber et al. (2020) used graph visualization to represent the variability of class diagrams. The authors experimented with three methods of representing variability, exploring colour coding and graphical layout. Our visualizer differs by focusing on program elements and relationships that are more diverse and detailed than class diagrams. Moreover, our visualization focuses on encoding the variability of the models as colour-coded edge groups, not necessarily changing the spatial distribution of nodes.
In summary, previous works use colour or tags to associate portions of an SPL's code or models with distinct features to highlight the SPL's variabilities, or they visualize the consequences of feature selections on the product configuration. In contrast, our visualizer uses colour to highlight subsets of analysis results that belong to sets of products specified by the engineer: the engineer defines one or more product sets of interest, and the visualizer highlights the corresponding analysis results by painting each result with the colours of all the product sets to which the result belongs. Hence, our visualizer supports exploring, filtering, highlighting, and comparing analysis results rather than simply presenting them.
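As an illustration only, the colouring rule described above can be sketched in Python. This is not the visualizer's implementation: the feature names, the product sets, the result names, and the helper functions are all hypothetical, and the sketch assumes that a result "belongs to" a product set when at least one product in the set exhibits the result. It also enumerates all feature selections, which is feasible only for a toy example and not for SPLs with hundreds of features.

```python
from itertools import product as cartesian

# Hypothetical feature names (assumption; a real SPL has hundreds of features).
FEATURES = ["A", "B", "C"]


def all_products(features):
    """Enumerate every feature selection as a frozenset of enabled features."""
    for bits in cartesian([False, True], repeat=len(features)):
        yield frozenset(f for f, on in zip(features, bits) if on)


def colours_for(result_condition, product_sets):
    """Return the colours of all product sets that share a product with the result.

    result_condition: predicate over a product, standing in for the condition
                      under which an analysis result holds (assumption).
    product_sets:     mapping from colour name to a predicate over a product.
    """
    colours = []
    for colour, in_set in product_sets.items():
        if any(result_condition(p) and in_set(p) for p in all_products(FEATURES)):
            colours.append(colour)
    return colours


# Hypothetical product sets of interest, each assigned a colour by the engineer.
PRODUCT_SETS = {
    "red": lambda p: "A" in p,
    "blue": lambda p: "B" in p and "C" not in p,
}

# Hypothetical analysis results, each conditioned on a set of products.
RESULTS = {
    "r1": lambda p: "A" in p and "B" in p,
    "r2": lambda p: "C" in p,
    "r3": lambda p: "A" not in p and "C" in p,
}

# Paint each result with the colours of every product set it belongs to.
PAINTED = {name: colours_for(cond, PRODUCT_SETS) for name, cond in RESULTS.items()}
```

In this toy setting, a result that holds in products of both sets receives both colours, while a result that holds in no product of any selected set receives none and would therefore not be highlighted.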

Conclusion and Future Work
In this paper, we presented an industrial study of applying a declarative source-code analysis to relational models of annotative Software Product Lines (SPLs). We integrated source-code fact extraction and a variability-aware Datalog engine from two prior projects (Shahin et al. 2019; Muscedere et al. 2019), implementing an analysis pipeline. In addition to adapter components between pieces coming from different projects, we enhanced the fact extraction to be variability-aware and added a result-filtering and visualization module for the interactive inspection of results.
We applied the pipeline to five analyses (behaviour alteration, recursion analysis, simplifiable global variable analysis, and two of their variants) of models of seven automotive controller SPLs from General Motors, each with hundreds of product line features. Our results demonstrate the scalability of our variability-aware analysis approach to real-life industrial SPLs. Our interactive visualization module allows users to filter the analysis results for a subset of products, allowing for a finer-grained inspection of results per product or per product set (e.g., enabling comparison of analysis results for different feature selections and change-impact analysis).
With respect to limitations, (1) our analyses need to be declarative and expressible in Datalog. So far, this has not posed a limitation on the types of analyses we have tried to perform.
(2) Our variability-aware analysis incurs a performance overhead of 6.74%-254.36% to analyze the entire SPL, compared to the time to analyze the superset of all the SPL's features. However, we consider this overhead to be negligible, given that the analysis returns results for all of the SPL's products; the runtime of a brute-force approach that applies the same analysis to each product separately would grow exponentially with the number of SPL features.
(3) The use of colour to highlight analysis results may be a limitation for users who have full colour blindness.
(4) The visualization of analysis results suffers when the set of results is very large. A small semi-formal user study with GM engineers provides preliminary evidence that filtering and highlighting can help engineers focus their attention on subsets of results of interest, but these findings, and the impact of other facilities provided by the visualization environment, need to be confirmed with larger studies.
For future work, we plan to integrate our analysis pipeline more tightly to produce a single tool that takes SPL code as input and provides an interactive user interface for inspecting results. We are also in discussions with General Motors to apply the pipeline to other analyses and to more SPLs. In addition, since our pipeline is analysis-agnostic, we are in the process of identifying other analyses that might be of value to General Motors and determining whether they can be implemented in Datalog.

Data Availability
Data sharing is not applicable to this article, as no datasets were generated or analyzed during the current study.

Conflict of Interests
The authors have no relevant financial or non-financial interests to disclose.