Robust Networked Multiagent Optimization: Designing Agents to Repair Their Own Utility Functions

We study settings in which autonomous agents are designed to optimize a given system-level objective. In typical approaches to this problem, each agent is endowed with a decision-making rule that specifies the agent’s choice as a function of relevant information pertaining to the system’s state. The choices of other agents in the system comprise a key component of this information. This paper considers a scenario in which the designed decision-making rules are not implementable in the realized system due to discrepancies between the anticipated and realized information available to the agents. The focus of this paper is to develop methods by which the agents can preserve system-level performance guarantees in these unanticipated scenarios through local and independent redesigns of their own decision-making rules. First, we show a general impossibility result which states that in general settings, there are no local redesign methodologies that can offer any preservation of system-level performance guarantees, even when the affected agents satisfy an inconsequentiality criterion. However, we then show that when system-level objectives are submodular, local redesigns of utility functions do exist which allow nominal performance guarantees to degrade gracefully as information is denied to agents. That is, in these submodular settings, agents can adapt to informational inconsistencies independently without incurring much loss in terms of system-level performance.


Introduction
A multiagent system can be viewed as a collection of decision-making entities each making local decisions in response to locally available information. These decision-making entities could be self-interested, e.g., drivers utilizing a transportation network [15], or could be computer controlled, e.g., unmanned aerial vehicles in a swarm [21]. Regardless of the physical makeup of these decision-making entities, the central job of the system operator is to ensure that the emergent collective behavior is desirable relative to a performance metric of interest.
Game theory has received considerable attention as an overarching framework for the analysis and design of such multiagent systems [1,3,4,24,31]. Clearly, game theory is relevant to the design and control of multiagent systems when the system is comprised of self-interested decision-making entities [7,9,28]. Interestingly, game theory is also relevant for systems where the decision-making entities are computer controlled and a system operator is tasked with deriving local decision-making rules (or control strategies) for the agents in the systems [29]. The connection to game theory is fueled by the fact that these control strategies can often be viewed through the lens of distributed learning, where agents are responding in a predetermined way to some intrinsic utility functions.
The connection between game theory and the design of engineered multiagent systems has fueled a branch of research under the umbrella of mechanism design that focuses on the design of agent utility functions [20,30]. Here, a central objective of the system designer is to construct agent utility functions such that the resulting equilibria have desirable efficiency properties. It is often the case that the system designer can focus exclusively on the design of utility functions without explicitly considering the dynamics that will ultimately drive the collective behavior to such equilibria [8,17]. The reason for this decomposition stems from the fact that if the designed agent utility functions possess a desirable structure, e.g., form a weakly acyclic or potential game, then a system operator can appeal to off-the-shelf distributed learning algorithms that can be employed to drive the collective behavior to an equilibrium of interest, e.g., fictitious play or log-linear learning [1,6,11,18]. Accordingly, the design of distributed control algorithms for engineered multiagent systems can be recast as the problem of designing utility functions for the agents such that the induced games have desirable equilibrium properties. Interestingly, recent results show that this game theoretic approach to distributed control design is without loss for various problems of interest [22]. That is, such game theoretic algorithms can actually match the performance bounds of the best distributed algorithms.
One of the central research themes associated with the problem of utility design in multiagent systems focuses on the well-studied efficiency metric termed the price of anarchy [23]. The price of anarchy is a worst-case measure that seeks to bound the system-level cost associated with an equilibrium of interest, e.g., pure Nash equilibrium, when compared to the optimal system-level cost. With regard to multiagent system design, the price of anarchy serves as the competitive ratio of the distributed algorithm that results from merging a particular utility design with a given dynamic process that is guaranteed to drive the collective behavior to an equilibrium. Accordingly, there have been significant recent advances in deriving utility functions that optimize the price of anarchy [25].
This paper studies implementing such game theoretic approaches for the online coordination of multiagent systems, focusing in particular on the robustness of such approaches to informational deficiencies. Implementing a given distributed learning algorithm for online coordination of multiagent systems typically requires an individual agent to have access to (i) the agent's local utility function and (ii) the behavior of the other agents in the system. While knowledge of the local utility function is not necessarily a problem, the same need not hold true regarding knowledge of the behavior of the other agents in the system [13,26]. These informational deficiencies can be due to numerous issues including agent failures, adversarial interventions, and communication or sensing issues. Accordingly, the focus of this paper is on understanding how to revise these online learning algorithms to accommodate unplanned informational deficiencies while preserving the original equilibrium performance guarantees.
We begin by describing our model to ensure that our contributions are clear.

Model
A multiagent optimization problem has agent set $I = \{1, \ldots, n\}$ where each agent $i \in I$ has a finite action set (or choice set) $A_i$. The quality of each joint action profile $a = (a_1, \ldots, a_n) \in A = A_1 \times \cdots \times A_n$ is expressed by a system-level objective function of the form $W : A \to \mathbb{R}_{\geq 0}$. Further, each agent $i$ is assigned a utility function $U_i : A \to \mathbb{R}_{\geq 0}$ which will guide their decision-making process. We will refer to these utility functions as the nominal utility functions and will denote a multiagent optimization problem of the above form by the tuple $G = (I, A, \{U_i\}_{i \in I}, W)$. We will denote a set of such multiagent optimization problems by $\mathcal{G}$.
The field of utility design focuses on how to construct these agent utility functions $U = (U_1, \ldots, U_n)$ to achieve desirable equilibrium properties. During the design process, it is typically assumed that the system designer has limited knowledge regarding the specific structure of the multiagent optimization problem $(I, A, \cdot, W)$, and a common mode of uncertainty pertains to the structure of the agents' action sets. More formally, each agent $i \in I$ is associated with a set of possible action choices $\bar{A}_i$, and the system-level objective is defined over these possible action sets, i.e., $W : \bar{A}_1 \times \cdots \times \bar{A}_n \to \mathbb{R}$. However, the specific realization of these action sets, $A_i \subseteq \bar{A}_i$ for all $i \in I$, is not available to the system designer. The goal of a system designer is to design a utility function for each agent $i \in I$ of the form $U_i : \bar{A}_1 \times \cdots \times \bar{A}_n \to \mathbb{R}$ that leads to desirable collective behavior regardless of the specific realization of the agents' action sets. Note that a particular choice of utility functions $U = (U_1, \ldots, U_n)$ coupled with a system objective function $W$ induces a family of multiagent optimization problems of the form $\{(I, A, U, W) : A \subseteq \bar{A}\}$, where we denote an admissible joint action set as $A \subseteq \bar{A}$ with the understanding that this means $A_i \subseteq \bar{A}_i$ for each agent $i \in I$. See Fig. 1 for an illustration.
When a utility design induces a learnable game structure (e.g., potential games or weakly acyclic games) and is coupled with a suitable distributed learning rule that guarantees convergence to an equilibrium (e.g., fictitious play or best response dynamics), the result is a distributed algorithm that provides a competitive ratio equal to the price of anarchy, which is defined as
$$\mathrm{PoA} = \inf_{G \in \mathcal{G}} \; \frac{\min_{a \in \mathrm{PNE}(G)} W(a)}{\max_{a \in A} W(a)}, \qquad (1)$$
where $\mathrm{PNE}(G) \subseteq A$ denotes the set of pure Nash equilibria in the game $G$. There are several recent positive results that characterize the utility functions that optimize the price of anarchy for several interesting classes of systems [25], e.g., congestion games.

Fig. 1 This figure highlights the inherent challenges associated with the utility design process for a simplified problem. To that end, suppose that each agent $i \in I = \{1, 2\}$ has three potential actions, which we denote by $\bar{A}_1 = \bar{A}_2 = \{x, y, z\}$. Furthermore, the system-level objective (or welfare) associated with the nine possible joint actions is provided in the left figure. For simplicity, consider the common interest design where each agent is associated with a utility function equal to the global objective, e.g., $U_1(x, y) = U_2(x, y) = W_{xy}$. Once this utility structure is specified, the action set of each agent is realized and the designed utility functions are implemented. In Realization #1, the realized action sets are $A_1 = A_2 = \{x, y\}$ and there is a unique pure Nash equilibrium with the optimal total welfare of 3. However, in Realization #2, the realized action sets are now of the form $A_1 = A_2 = \{x, z\}$ and there are two pure Nash equilibria, $(x, x)$ and $(z, z)$, with a welfare of 1 and 4, respectively. Hence, this particular utility design is unable to guarantee that any Nash equilibrium in any problem instance will have a welfare greater than 25% of the optimal welfare.
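To make the fragility illustrated in Fig. 1 concrete, the short script below enumerates the pure Nash equilibria of the two realizations by brute force. The specific welfare table is not given in the text, so the values used here are illustrative assumptions chosen only to reproduce the equilibrium welfares reported in the caption (3 in Realization #1; 1 and 4 in Realization #2):

from itertools import product

# Hypothetical welfare table for the two-agent example of Fig. 1.  The exact
# values are not given in the text; these are illustrative assumptions.
W = {('x', 'x'): 1, ('x', 'y'): 2, ('y', 'x'): 2, ('y', 'y'): 3,
     ('x', 'z'): 0, ('z', 'x'): 0, ('y', 'z'): 2, ('z', 'y'): 2,
     ('z', 'z'): 4}

def pure_nash(action_sets, utility):
    """Brute-force enumeration of pure Nash equilibria of a common-interest game."""
    equilibria = []
    for profile in product(*action_sets):
        stable = True
        for i, alternatives in enumerate(action_sets):
            for ai in alternatives:
                deviation = profile[:i] + (ai,) + profile[i + 1:]
                if utility(deviation) > utility(profile):
                    stable = False
        if stable:
            equilibria.append(profile)
    return equilibria

for name, realized in [("Realization #1", ['x', 'y']), ("Realization #2", ['x', 'z'])]:
    ne = pure_nash([realized, realized], lambda a: W[a])
    print(name, [(a, W[a]) for a in ne])
# Realization #1 -> [(('y', 'y'), 3)]
# Realization #2 -> [(('x', 'x'), 1), (('z', 'z'), 4)]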
Implementing such an algorithm requires that each individual agent is able to evaluate its utility function, which may not be possible given potential informational deficiencies that may prevent agents from observing the action choices of other agents in the system (for instance, a faulty sensor or jamming by an adversary). We model the information available to the agents using an information graph $N := \{N_1, \ldots, N_{|I|}\}$ where we let $N_i \subseteq I$, $i \in N_i$, denote the set of agents whose actions can be observed by agent $i$. Throughout this paper, we write $|N| := \sum_{i \in I} |N_i|$ to denote the number of edges in $N$. We write $N^c = \{N^c_1, \ldots, N^c_{|I|}\}$ to denote the complement graph of $N$; that is, for each $i \in I$, we write $N^c_i := I \setminus N_i$ to denote the set of agents whose actions cannot be observed by agent $i$. To account for these informational deficiencies, it is imperative that the nominal utility functions $U_i : \bar{A} \to \mathbb{R}_{\geq 0}$ are replaced with modified lower-dimensional utility functions of the form $\tilde{U}_i : \bar{A}_{N_i} \to \mathbb{R}_{\geq 0}$, where $\bar{A}_{N_i} = \prod_{j \in N_i} \bar{A}_j$ denotes the actions of agents visible to $i$. This modification will then allow the agents to follow the prescribed learning rule, albeit with these new modified utility functions that adhere to the realized information structure. See Fig. 2 for an illustration.
This paper focuses on designing local mechanisms for constructing these new lower-dimensional utility functions $\tilde{U}_1, \ldots, \tilde{U}_n$ that maintain some notion of equilibrium quality relative to the nominal utility functions $U$. The process by which agents adapt their utility functions to account for information deficiencies is given by a local adaptation rule $f_i$: a mapping that takes in a nominal utility function $U_i$ and a local information set $N_i$, and outputs a new lower-dimensional utility function of the form $\tilde{U}_i : \bar{A}_{N_i} \to \mathbb{R}_{\geq 0}$. When applying a local adaptation rule, agent $i$ knows $N_i$ (which agents it can observe) and $U_i$ (its own full nominal utility function) but has no knowledge of the system objective $W$, or the utility functions or information sets $N_j$ of other agents. We let $G^f(N)$ denote the locally adapted version of the game $G$ through the adaptation rule $f$ and information graph $N$. We will sometimes omit highlighting the information graph, i.e., express $G^f(N)$ as merely $G^f$, when the dependence is clear.

Summary of Contributions
While there has been extensive work done in the area of utility design, cf. [22], to the best of our knowledge this is the first work to address the design of adaptation rules to accommodate unplanned informational deficiencies. Accordingly, to disentangle these two design elements, we fix the utility design as merely the common interest design $U_i = W$, as highlighted in Figs. 1 and 2, and concentrate purely on the design of local adaptation rules $f = (f_1, \ldots, f_n)$ that gracefully preserve efficiency guarantees of the resulting equilibria. While we do not explicitly consider situations where $U_i \neq W$ in this manuscript, many of the results immediately extend to this setting as well.
Our first set of results is largely negative, stating that in general it is impossible for any local adaptation rule $f$ to preserve any level of optimality associated with the resulting pure Nash equilibria for any degree of informational losses. We state these results here less formally and refer readers to Sect. 2 for the formal statements of Proposition 2 and Theorem 1.

Contribution 1
Let $\mathcal{G}$ be a sufficiently rich family of multiagent optimization problems with the common interest utility design. Then for any information graph $N$ that is missing one directed edge, we have
$$\sup_{f} \; \inf_{G \in \mathcal{G}} \; \frac{\max_{a \in \mathrm{PNE}(G^f(N))} W(a)}{\min_{a \in \mathrm{PNE}(G)} W(a)} = 0. \qquad (2)$$
The first result above highlights the fragility of distributed approaches to informational deficiencies in multiagent optimization problems. Note that the metric introduced in (2) is optimistic, comparing the best equilibrium in the adapted game $G^f$ to the worst equilibrium in the nominal game $G$, and that this optimism bias does not help preserve efficiency guarantees. Furthermore, in Sect. 2 we also demonstrate that this negative result holds even when the affected agents $i, j$ satisfy an inconsequentiality criterion in which the behavior of agent $j$ has minimal impact on the utility of agent $i$. Finally, we mention that the formal statement of this result in Theorem 1 is actually stronger than the statement above, since there we show that there exist pathological multiagent optimization problems on which every local adaptation rule fails to preserve equilibrium performance guarantees.
While the above result paints a negative picture regarding the availability of adaptation rules to account for informational deficiencies, our next set of results is more positive and states that certain structures of multiagent optimization problems do provide agents with opportunities to preserve equilibrium quality despite informational deficiencies. These positive results pertain to the class of multiagent optimization problems with submodular system-level objective functions, i.e., objective functions that exhibit a property of diminishing returns. In this case, we establish a lower bound on equilibrium quality degradation that is parameterized by $|N^c|$, the number of edges missing from the information graph; this bound is depicted as well in Fig. 3.

Contribution 2 Let $\mathcal{G}$ be a sufficiently rich family of submodular multiagent optimization problems with the common interest utility design. Then for any information graph $N$, we have
$$\sup_{f} \; \inf_{G \in \mathcal{G}} \; \frac{\min_{a \in \mathrm{NE}(G^f(N))} W(a)}{\max_{a \in \mathrm{NE}(G)} W(a)} \;\geq\; \frac{1}{2 + \lfloor |N^c|/2 \rfloor}. \qquad (3)$$
Here, we write $\mathrm{NE}(G)$ to denote the set of all Nash equilibria (not merely pure). Note that the metric introduced in (3) is far more pessimistic than its counterpart in (2), comparing the worst equilibrium in the adapted game $G^f$ to the best equilibrium in the nominal game $G$. Nonetheless, this result indicates that important classes of multiagent optimization problems in $\mathcal{G}$ have a degree of robustness to informational deficiencies, since it means that even the worst adapted equilibria cannot be arbitrarily worse than the nominal equilibria, and this worst-case guarantee degrades gracefully as edges are removed from the information graph $N$. To provide some intuition, the pathological games which enable Contribution 1 have highly fragile equilibria due to a lack of alignment between the actions of different agents (i.e., individual local agent best responses can be made to point "away" from system-optimal action profiles). However, submodularity enforces a degree of agent action alignment, and this appears to be part of why positive performance guarantees are possible in this case. Figure 3 depicts the possible bounds provided by (3) as a function of the number of directed edges missing from $N$; note that one consequence of Contribution 2 is that removing a single edge from $N$ cannot degrade worst-case equilibrium guarantees relative to the nominal case. Also, note that the edge-based bound given in (3) is a consequence of a somewhat tighter bound which we show in Theorem 2.
Finally, note that this paper considers two sources of uncertainty: first, the agents' inability to observe other agents' actions; second, the agents' uncertainty over the realized action sets. We mention here that both types of uncertainty are required to obtain the negative results in this paper. Specifically, if agents cannot reliably observe each other's actions but do know the realized action sets with certainty, then each agent could simply compute the system-optimal action profile and play its corresponding action. However, if agents can always observe each other's actions, then none of the pathologies in this paper are possible, even if agents have uncertainty over realized action sets.

Contribution 1: General Multiagent Optimization Problems are Fragile
Our first set of contributions focuses on designing local adaptation rules for general multiagent optimization problems and asks if any adaptation rule can preserve system-level performance guarantees in the presence of informational inconsistencies. To pose the problem rigorously, we consider local adaptation rules that take the following form for some payoff evaluator $f$, for agent $i$ and joint action $a_{N_i} = \{a_j\}_{j \in N_i}$:
$$\tilde{U}_i(a_{N_i}) = f\Big(\big\{ U_i(a_{N_i}, a_{N^c_i}) : a_{N^c_i} \in \bar{A}_{N^c_i} \big\}\Big).$$
That is, each agent is given a function $f(\cdot)$ which aggregates the payoffs which could possibly be obtained and are consistent with the information available to the agent into one single proxy payoff. In the interest of generality, we allow any evaluator $f$ which satisfies the following definition:

Definition 2 For any possible collections of utility values $S \in \mathbb{R}^k$ and $S' \in \mathbb{R}^k$ such that, when both are arranged in ascending order, every element of $S$ is strictly greater than the corresponding element of $S'$, an acceptable evaluator $f$ satisfies $f(S) > f(S')$.
If Agent $i$ uses acceptable evaluator $f$ to compute proxy payoffs for the case when Agent $j$'s action is unobservable, we say that Agent $i$ applies $f$ to Agent $j$. Proposition 1 gives a partial list of evaluators which are acceptable by Definition 2; its proof is included in the Appendix. Note that some evaluators have an appealing intuitive explanation. For instance, $f_{\min}$ assigns proxy utility functions to agent $i$ which implicitly assume that the unobserved agents are selecting actions adversarially, acting to minimize the payoffs of agent $i$. On the other hand, the $f_{\max}$ evaluator assigns proxy utility functions to agent $i$ which implicitly assume that unobserved agents are selecting actions to maximize the payoffs of agent $i$. Furthermore, in the case of common interest games, the $f_{\max}$ evaluator implicitly models unobservable agents as simply playing best response actions.
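As a concrete illustration of how such evaluators generate proxy utilities, the sketch below (an illustrative implementation, not the authors' code) applies an aggregation function to the multiset of payoffs consistent with an agent's observations. For simplicity it assumes the unobserved agents are indexed last, so their candidate actions can simply be appended to the observed partial profile; the toy objective values are invented for the example:

from itertools import product
from statistics import mean

# Candidate acceptable evaluators (cf. Proposition 1): each collapses the set of
# payoffs consistent with agent i's observations into a single proxy payoff.
EVALUATORS = {"f_min": min, "f_max": max, "f_sum": sum, "f_mean": mean}

def adapted_utility(U_i, a_observed, hidden_action_sets, evaluator):
    """Proxy payoff for one observed partial action profile.

    U_i takes a full joint action (observed actions followed by hidden ones);
    hidden_action_sets lists the possible action sets of the unobserved agents.
    """
    payoffs = [U_i(a_observed + hidden) for hidden in product(*hidden_action_sets)]
    return evaluator(payoffs)

# Example: a three-agent common-interest game in which agent 1 cannot see agent 3.
# The objective values below are assumed purely for illustration.
W = lambda a: {('x', 'x', 'x'): 2, ('x', 'x', 'y'): 0}.get(a, 1)
print(adapted_utility(W, ('x', 'x'), [['x', 'y']], EVALUATORS["f_min"]))  # -> 0
print(adapted_utility(W, ('x', 'x'), [['x', 'y']], EVALUATORS["f_max"]))  # -> 2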

Unrestricted Games are Fragile
First, we show that if no restriction is placed on which type of game is under consideration, a single missing edge in the information graph $N$ can easily and catastrophically degrade performance. The following proposition assumes that Agent 2 loses information about the action choice of Agent 1, and thus must compute proxy payoffs for the case when Agent 1's action is unobservable.
Proposition 2 Let $\mathcal{G}$ denote the set of all multiagent optimization problems, and let $N$ be such that Agent 2 cannot observe the action choice of Agent 1. For any acceptable evaluator $f$ that Agent 2 applies to Agent 1, it holds that
$$\inf_{G \in \mathcal{G}} \; \frac{\max_{a \in \mathrm{PNE}(G^f(N))} W(a)}{\min_{a \in \mathrm{PNE}(G)} W(a)} = 0.$$
Here, by showing that the "optimistic" metric obtains a value of 0, we see that in general games, no evaluator can prevent pure Nash equilibria from having arbitrarily low quality when even one edge is missing from $N$. The proof of Proposition 2 is given in the Appendix.

Even Inconsequential Agents Matter
In light of Proposition 2, we now ask whether adding additional structure to the class of optimization problems can confer some resilience. Accordingly, we now study systems in which individual unobservable agents are only weakly important to the system objective. We introduce the following notion of weak interrelation: we say Agent $j$ is "inconsequential" if it can never cause a large change in the system objective value by changing actions. We make this notion precise in Definition 3:

Definition 3 For $\epsilon \geq 0$, Agent $j$ is $\epsilon$-inconsequential if for every action profile $a \in \bar{A}$ and every action $a'_j \in \bar{A}_j$, it holds that $|W(a'_j, a_{-j}) - W(a_j, a_{-j})| \leq \epsilon \cdot \max_{a \in \bar{A}} W(a)$.

Now, let $\mathcal{G}_\epsilon$ denote the class of games such that for each $G \in \mathcal{G}_\epsilon$, we have that Agent 1 is no more than $\epsilon$-inconsequential. Inconsequentiality gives that for each game $G \in \mathcal{G}_\epsilon$, we are assured that even if some other agent(s) cannot observe Agent 1's action, a unilateral deviation by Agent 1 can have only a small impact on the system objective. One might hope that this property could help avoid the severe pathologies of Proposition 2.
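Under this reading of Definition 3, $\epsilon$-inconsequentiality of an agent can be checked by brute force over action profiles. The snippet below is only an executable restatement of that check (exponential in the number of agents), not a practical procedure:

from itertools import product

def is_inconsequential(W, action_sets, j, eps):
    """Return True if agent j is eps-inconsequential: no unilateral deviation by j
    changes the objective by more than eps times the maximum objective value."""
    w_max = max(W(a) for a in product(*action_sets))
    for a in product(*action_sets):
        for aj in action_sets[j]:
            b = a[:j] + (aj,) + a[j + 1:]
            if abs(W(b) - W(a)) > eps * w_max:
                return False
    return True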
Unfortunately, Theorem 1 demonstrates that whenever $\epsilon > 0$, no acceptable evaluator can prevent the loss of a single information edge from causing significant harm to the emergent behavior in the problem, even if that edge is associated with observing an inconsequential agent.
Theorem 1 For any $\epsilon > 0$, let $\mathcal{G}_\epsilon$ be the set of multiagent optimization problems as defined above, and let information graph $N$ satisfy $|N^c| = 1$ and be such that $1 \notin N_2$ (that is, Agent 2 cannot observe the action choices of Agent 1). Then it holds that
$$\inf_{G \in \mathcal{G}_\epsilon} \; \sup_{f} \; \frac{\max_{a \in \mathrm{PNE}(G^f(N))} W(a)}{\min_{a \in \mathrm{PNE}(G)} W(a)} = 0, \qquad (7)$$
where the supremum is taken over all acceptable evaluators $f$. We provide the proof of Theorem 1 in the Appendix. This theorem indicates a fundamental fragility in these systems, even when the class of systems has carefully been selected to perform well. As we detail in the proof of Theorem 1, the statement in (7) is obtained by demonstrating a family of games which all have unique Nash equilibria, and whose adapted games are weakly acyclic with unique Nash equilibria. That is, it is possible to reach the Nash equilibrium in each adapted game from any action profile via a finite sequence of unilateral payoff-improving moves.
Finally, note the order of precedence of the inf and sup operators in (7): This formulation explicitly states that even if f is selected with knowledge of the specific multiagent problem, it cannot preserve any efficiency guarantees.

Contribution 2: Submodular Multiagent Optimization is Resilient
In this section, we consider the well-studied case of submodular multiagent optimization problems, defined as follows: Let $S$ be a finite set of elements and let $W$ be a set-based function of the form $W : 2^S \to \mathbb{R}$. Function $W$ is called submodular if for any sets $Q \subseteq R \subseteq S$ and any element $s \in S$ it holds that
$$W(Q \cup \{s\}) - W(Q) \;\geq\; W(R \cup \{s\}) - W(R),$$
i.e., the marginal benefit of adding the element $s$ to the set $Q$ is no lower than the marginal benefit of adding the element $s$ to the superset $R$. In all our results pertaining to submodular functions, we also assume the functions to be nondecreasing, that is, for sets $Q \subseteq R \subseteq S$, $W(Q) \leq W(R)$. Furthermore, without loss of generality, we assume submodular functions to be normalized, or $W(\emptyset) = 0$. When cast in this paper's multiagent framework, agents' action sets are constrained to be of the form $A_i \subseteq 2^S$; i.e., each agent's pure actions are subsets of some ground set $S$. We also assume that $\emptyset \in A_i$ for all agents $i$; that is, each agent can in essence choose not to participate. Submodular maximization is a well-studied topic in a variety of fields due to its application in many common engineering problems, e.g., information gathering [16], influence in social networks [14], object detection [2], and document summarization [19], among others. Recent work has also focused on identifying centralized algorithms to solve submodular maximization problems in resilient ways [27] or subject to informational constraints [12]. It is well known that many multiagent optimization problems with submodular objective functions have nontrivial worst-case equilibrium quality guarantees. In particular, when $U_i = W$ the price of anarchy of a submodular multiagent optimization problem is known to be at least 1/2 [28].
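The diminishing-returns inequality above can be checked directly for small ground sets. The sketch below is only an executable restatement of the definition (it enumerates all pairs $Q \subseteq R$ and is exponential in $|S|$); the toy coverage instance at the end is an invented example of a nondecreasing, normalized, submodular function:

from itertools import combinations

def powerset(ground):
    return [frozenset(c) for r in range(len(ground) + 1)
            for c in combinations(ground, r)]

def is_submodular(W, ground):
    """Brute-force check of W(Q + s) - W(Q) >= W(R + s) - W(R) for all Q <= R <= S."""
    sets = powerset(ground)
    for Q in sets:
        for R in sets:
            if not Q <= R:
                continue
            for s in ground:
                if W(Q | {s}) - W(Q) < W(R | {s}) - W(R) - 1e-12:
                    return False
    return True

# Toy weighted-coverage objective: each element of the ground set covers some resources.
covers = {'a1': {'r1', 'r2'}, 'a2': {'r2', 'r3'}, 'a3': {'r3'}}
values = {'r1': 0.1, 'r2': 1.0, 'r3': 0.1}

def W_cov(Q):
    covered = set().union(*(covers[x] for x in Q)) if Q else set()
    return sum(values[r] for r in covered)

print(is_submodular(W_cov, set(covers)))  # -> True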
Our first question is this: Do there exist any local adaptation rules for submodular problems which provide efficiency guarantees that are close to the nominal price of anarchy of 1/2? In answer, Theorem 2 indicates that in this setting, there exist payoff evaluators which give efficiency guarantees that gracefully degrade from the nominal 1/2 as the information graph becomes increasingly disconnected. Furthermore, Theorem 2 provides a lower bound on the efficiency degradation which depends on the structure of the information graph.
Theorem 2 applies to all Nash equilibria; accordingly, we write a mixed strategy for player $i$ as $a_i \in \Delta(A_i)$, where $\Delta(A_i)$ denotes the standard probability simplex over $A_i$. For simplicity, we write $\Delta(A) := \prod_{i \in I} \Delta(A_i)$ and $\Delta(A_{-i}) := \prod_{j \neq i} \Delta(A_j)$. If players are selecting mixed strategies, their utility functions are evaluated in expectation over the joint probability distribution of play in the standard way, so that $U_i : \Delta(A) \to \mathbb{R}$; similarly, for any $a \in \Delta(A)$, we define the extended system objective $W(a)$ as the expected value of $W$ under the distribution $a$.
Agent $i$'s best response set for a strategy profile $a_{-i} \in \Delta(A_{-i})$ is the set of strategies maximizing agent $i$'s utility, i.e., $\arg\max_{a_i \in \Delta(A_i)} U_i(a_i, a_{-i})$. We write the set of all Nash equilibria of multiagent optimization problem $G$ as $\mathrm{NE}(G)$.
We say that a graph $N$ is acyclic if it contains no directed cycles. We write $\mathrm{MF}(N)$ to denote the smallest set of agents whose removal (along with associated edges) renders $N$ acyclic. In graph-theoretic terms, $\mathrm{MF}(N)$ is the minimum feedback vertex set of graph $N$.
We define the disconnection factor $\delta(N)$ as the cardinality of the minimum feedback vertex set of the complement graph: $\delta(N) := |\mathrm{MF}(N^c)|$. Note that $\delta(N)$ expresses the "disconnectedness" of the information graph $N$. For instance, if $N$ is a complete graph (all agents can observe all agents), $\delta(N) = 0$; if $N$ is empty (no agent can observe any other agent), then $\delta(N) = n - 1$. Furthermore, $\delta(N)$ is monotone in the following sense: if $N \subseteq N'$ are information graphs over $I$, then $\delta(N) \geq \delta(N')$; adding an edge to a graph can only decrease its disconnection factor. We are now prepared to state our result:

Theorem 2 Let $\mathcal{G}_{\mathrm{SM}}$ be the family of multiagent optimization problems with nondecreasing, normalized, submodular system-level objective functions. For any information graph $N$,
$$\sup_{f} \; \inf_{G \in \mathcal{G}_{\mathrm{SM}}} \; \frac{\min_{a \in \mathrm{NE}(G^f(N))} W(a)}{\max_{a \in \mathrm{NE}(G)} W(a)} \;\geq\; \frac{1}{2 + \delta(N)}. \qquad (9)$$
Furthermore, a local adaptation rule $f$ which achieves this bound is the minimum payoff evaluator $f_{\min}$, item 3 in Proposition 1.
Note that in a common interest game, the denominator in (9) resolves to the system-optimal objective value, and the numerator captures the worst-case degradation possible in the game's equilibria due to information losses. Thus, (9) represents a bound on the degree to which information losses can harm the emergent behavior of the multiagent system.
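Since the guarantee in (9) is driven entirely by the disconnection factor, it is useful to be able to compute $\delta(N)$ for a given information graph. The sketch below does so by brute force over vertex subsets of the complement graph, which is exponential and intended only for small examples; the dictionary-of-sets representation of $N$ is an assumption of this sketch:

from itertools import combinations

def has_cycle(nodes, edges):
    """Directed cycle detection by repeatedly deleting vertices with no outgoing edge."""
    nodes = set(nodes)
    edges = {(u, v) for (u, v) in edges if u in nodes and v in nodes}
    while nodes:
        sinks = {v for v in nodes if not any(u == v for (u, _) in edges)}
        if not sinks:
            return True   # every remaining vertex has an out-edge: a cycle exists
        nodes -= sinks
        edges = {(u, v) for (u, v) in edges if u in nodes and v in nodes}
    return False

def disconnection_factor(N):
    """delta(N): size of a minimum feedback vertex set of the complement graph N^c.
    N maps each agent to the set of agents it can observe (self-loops included)."""
    agents = set(N)
    comp_edges = {(i, j) for i in agents for j in agents if j != i and j not in N[i]}
    for k in range(len(agents) + 1):
        for removed in combinations(agents, k):
            if not has_cycle(agents - set(removed), comp_edges):
                return k

# Empty information graph on three agents (only self-loops): delta = n - 1 = 2, as noted above.
print(disconnection_factor({1: {1}, 2: {2}, 3: {3}}))  # -> 2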
As we proceed to prove Theorem 2, we first highlight the structure of the payoff evaluator which gives (9). First, the lower bound in (9) is obtained using the minimum evaluator (see Proposition 1, item 3). While the structure of the optimal evaluator remains an open question, Theorem 2 demonstrates that choosing the evaluator correctly ensures that the performance guarantee associated with locally adapted utility functions degrades gracefully as a function of the disconnection factor of the graph. The minimum evaluator has an appealing intuitive interpretation when the objective $W$ is submodular and nondecreasing: it is equivalent to assuming that unobservable agents are selecting the $\emptyset$ action. For an information graph with local information sets $N_i$, the minimum evaluator generates a proxy utility function equal to
$$\tilde{U}_i(a_{N_i}) = W(a_{N_i}, \emptyset_{N^c_i}), \qquad (10)$$
where $\emptyset_{N^c_i}$ denotes the joint action in which every unobservable agent selects $\emptyset$. Thus, Theorem 2 provides that the simple policy of "if I cannot see you, I will assume that you are not present" yields nontrivial equilibrium performance guarantees. We now proceed with the proof.

Proofs for Theorem 2
The proof of Theorem 2 proceeds by developing a sequence of inequalities which bound $W(a^{\mathrm{opt}})$ using various properties of submodular problems. To facilitate these arguments, throughout this proof we replace the agents' utility functions with marginal-cost utility functions as follows. For agent $i$, let the marginal-cost utility function be given by
$$\bar{U}_i(a) := W(a_i, a_{-i}) - W(\emptyset_i, a_{-i}), \qquad (11)$$
where $\emptyset_i$ indicates that agent $i$ selects the empty action. Note that an action profile $a^{ne}$ is a Nash equilibrium for the marginal-cost utility functions $\bar{U}$ if and only if it is a Nash equilibrium for the common-interest utility functions considered in this paper; this is because for any $a_i, a'_i \in A_i$, it holds that
$$U_i(a_i, a_{-i}) - U_i(a'_i, a_{-i}) = W(a_i, a_{-i}) - W(a'_i, a_{-i}) = \bar{U}_i(a_i, a_{-i}) - \bar{U}_i(a'_i, a_{-i}).$$
This equivalence holds for the adapted utility functions as well when considering the minimum evaluator. That is, Agent $i$'s adapted marginal-cost utility function is simply given by
$$\tilde{\bar{U}}_i(a_{N_i}) := W(a_{N_i}, \emptyset_{N^c_i}) - W(\emptyset_i, a_{N_i \setminus \{i\}}, \emptyset_{N^c_i}), \qquad (12)$$
which (just as in (10)) is equivalent to assuming that unobservable agents are selecting $\emptyset$. Thus, throughout the proofs for Theorem 2, without loss of generality we define all Nash equilibria with reference to the marginal-cost utility functions $\bar{U}$ defined in (11) and their adapted versions $\tilde{\bar{U}}$ defined in (12). We begin with the following lemma.

Lemma 1 Let $G \in \mathcal{G}_{\mathrm{SM}}$ be a submodular multiagent optimization problem. Suppose that the associated information graph $N$ is such that for some subset of agents $\Lambda \subseteq I$, the subgraph of $N$ associated with agents in $\Lambda$ contains a complete DAG (i.e., an acyclic subgraph in which every pair of agents in $\Lambda$ is joined by a directed edge). Then if agents apply the minimum evaluator (equivalently, they evaluate their nominal utility functions with unobservable agents selecting $\emptyset$), the following holds for any mixed strategy $a \in \Delta(A)$:
$$\sum_{i \in \Lambda} \tilde{\bar{U}}_i(a_{N_i}) \;\leq\; W(a).$$

Proof Without loss of generality, assume that the agents in $\Lambda$ are indexed $1, \ldots, |\Lambda|$ in increasing order of out-degree, i.e., $|N_i| < |N_{i+1}|$. For each agent $i \in \Lambda$, let $N^*_i := \{1, \ldots, i-1\}$ (and define $N^*_1 := \emptyset$); the complete DAG structure ensures that $N^*_i \subseteq N_i \setminus \{i\}$. Then
$$\sum_{i \in \Lambda} \tilde{\bar{U}}_i(a_{N_i}) \;\leq\; \sum_{i \in \Lambda} \Big[ W(a_{N^*_i \cup \{i\}}, \emptyset) - W(a_{N^*_i}, \emptyset) \Big] \;\leq\; W(a).$$
The first inequality follows simply from the submodularity and monotonicity of $W$ and (12), yielding a telescoping sum; the second inequality follows from the monotonicity and normalization of $W$.

Proof of Theorem 2:
Let $G \in \mathcal{G}_{\mathrm{SM}}$ be a submodular maximization problem with nondecreasing and normalized objective function $W$. Let $a^{\mathrm{opt}} \in \Delta(A)$ be an optimal solution to $W$, and let $a^{ne} \in \Delta(A)$ be a Nash equilibrium associated with proxy utility functions computed using the minimum evaluator $f_{\min}$, yielding the utility functions (12). Let $M := \mathrm{MF}(N^c)$ be a minimum feedback vertex set of $N^c$ (so that $\delta(N) = |M|$). Then
$$W(a^{\mathrm{opt}}) \;\leq\; W(a^{ne}) + \sum_{i \in I} \tilde{\bar{U}}_i(a^{ne}_{N_i}) \;\leq\; W(a^{ne}) + |M|\, W(a^{ne}) + \sum_{i \in I \setminus M} \tilde{\bar{U}}_i(a^{ne}_{N_i}) \;\leq\; \big(2 + \delta(N)\big)\, W(a^{ne}).$$
The first inequality is borrowed from [28, Proof of Theorem 3]; it is a consequence of the utility functions computed by (12) and the definition of Nash equilibrium. The second inequality follows from the fact that, by submodularity, monotonicity, and the form of the proxy utility functions from (10), $\tilde{\bar{U}}_i(a) \leq W(a_i) \leq W(a)$. The last inequality follows from Lemma 1; note that the information (sub)graph on $I \setminus M$ must contain a complete DAG, since its associated complement graph $N^c$ restricted to $I \setminus M$ is acyclic. The desired result is obtained by observing that, by definition, $\delta(N) = |M|$.

The Role of Graph Structure in Equilibrium Quality
While the lower bound in Theorem 2 holds for any information graph $N$, we note that the relationship between a graph $N$ and its disconnection factor $\delta(N)$ may not be immediately obvious. Indeed, identifying the minimum feedback vertex set of an arbitrary directed graph is NP-hard [10]. However, it is quite simple to parse for certain types of graphs. For instance, a simple consequence of Theorem 2 is that if the complement graph $N^c$ is acyclic, the nominal efficiency guarantee of 1/2 is preserved:

Corollary 1 Let $\mathcal{G}_{\mathrm{SM}}$ denote the set of multiagent optimization problems with monotone, normalized, submodular objective functions, and let $N_a$ be such that its associated complement graph $N^c_a$ is acyclic. Then the optimal evaluator preserves nominal efficiency guarantees:
$$\sup_{f} \; \inf_{G \in \mathcal{G}_{\mathrm{SM}}} \; \frac{\min_{a \in \mathrm{NE}(G^f(N_a))} W(a)}{\max_{a \in \mathrm{NE}(G)} W(a)} \;=\; \frac{1}{2}. \qquad (16)$$
Proof Note that if $N^c_a$ is acyclic, $\delta(N_a) = 0$ since no vertices are required to be removed to render $N^c_a$ acyclic. Thus, (16) is immediate from (9) and from the known upper bound of 1/2 for this class of problems.
We may also leverage straightforward bounds on the cardinality of minimum feedback vertex sets of directed graphs to reinterpret (9) directly in terms of |N c | (i.e., the number of edges "missing" from N ).

Corollary 2
Let $\mathcal{G}_{\mathrm{SM}}$ denote the set of multiagent optimization problems with monotone, normalized, submodular objective functions. For any information graph $N$,
$$\sup_{f} \; \inf_{G \in \mathcal{G}_{\mathrm{SM}}} \; \frac{\min_{a \in \mathrm{NE}(G^f(N))} W(a)}{\max_{a \in \mathrm{NE}(G)} W(a)} \;\geq\; \frac{1}{2 + \lfloor |N^c|/2 \rfloor}. \qquad (17)$$
Proof First, we must show for any graph $N$ that $2\delta(N) \leq |N^c|$. Note that at least two unique edges must be incident on each vertex in $\mathrm{MF}(N^c)$; otherwise, $\mathrm{MF}(N^c)$ would not be minimal. Thus, $N^c$ must contain at least twice as many edges as there are vertices in $\mathrm{MF}(N^c)$, or
$$|N^c| \;\geq\; 2\,|\mathrm{MF}(N^c)| \;=\; 2\,\delta(N). \qquad (18)$$
Therefore (9), the integrality of $\delta(N)$, and (18) combine to give (17).
For a plot depicting the bound in (17), see Fig. 3.

Empirical Comparison of Adaptation Rules
One example of a multiagent optimization problem that has received significant attention in the literature is known as the weighted set coverage problem. Here, there exists a collection of resources $R$ where each resource $r \in R$ is associated with a value $v_r \geq 0$. The goal of the weighted set coverage problem is to allocate agents over resources to maximize the total value of the covered resources. To that end, we have a collection of agents $I$ where each agent is associated with a given action set $A_i \subseteq \bar{A}_i = 2^R$ that defines the set of permissible coverings. It is important to emphasize that agent $i$ does not choose the action set $A_i$; rather, the action set is chosen exogenously and the agent is tasked with choosing a particular covering choice $a_i \in A_i$. The goal of the weighted set covering problem is to choose an admissible collective allocation $a = (a_1, \ldots, a_n) \in A$ that optimizes a system-level objective function of the form
$$W(a) = \sum_{r \in \cup_{i \in I} a_i} v_r.$$
Several different utility design methodologies have been considered for the maximum weighted set covering problem. The most natural design choice is that of common interest, where each agent $i \in I$ is assigned a utility function of the form $U_i(a) = W(a)$ for any $a \in \bar{A}$. This design choice ensures that the resulting game is an exact potential game and further that the price of anarchy is 0.5, meaning that such a choice ensures that all resulting pure Nash equilibria have performance within 50% of optimal irrespective of the specific weighted set covering problem [28]. Note that this design choice does not ensure that all Nash equilibria are optimal, as suboptimal Nash equilibria can exist as well.
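For concreteness, the weighted set coverage objective and the common interest design can be written in a few lines; the resource values below are placeholders used only for illustration:

# Weighted set coverage: an action a_i is the set of resources agent i covers,
# and W(a) sums the values of all covered resources (each counted once).
values = {'r1': 0.4, 'r2': 0.7, 'r3': 0.2}   # placeholder resource values

def W(profile):
    covered = set().union(*profile) if profile else set()
    return sum(values[r] for r in covered)

# Common interest design: every agent's nominal utility is the system objective itself.
U = [W, W]   # two agents
print(W(({'r1', 'r2'}, {'r2', 'r3'})))  # -> 1.3 (resource r2 is counted only once)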
It is well known that the weighted set cover objective is submodular, nondecreasing, and normalized; thus, weighted set cover problems satisfy the requirements of Theorem 2. Accordingly, we use this class of problems to perform an empirical comparison between the minimum evaluator (which yields the bound in Theorem 2) and the maximum evaluator (see Proposition 1, item 2). At first glance, in the weighted set cover problem, the maximum evaluator seems to offer some hope for providing good performance under informational inconsistencies, since it biases agents toward resources that cannot be covered by other (unobserved) agents; thus, one might expect it to lead agents to avoid redundancies.

Fig. 4 Top: layout of game for illustrative example in text. Two agents (circles) each have access to two resources (boxes); resource values are indicated in the respective box. Bottom left: system objective function with respect to agent selections; note that the objective is maximized when one agent selects b and the other agent selects a or c. Bottom center/right: if neither agent can observe the actions of the other, these are the payoff matrices induced by the maximum and minimum evaluators, respectively. Nash equilibria are shaded gray. Note that for the maximum evaluator, all action profiles (even the highly suboptimal (a, c)) are Nash equilibria; however, the minimum evaluator has only the nearly optimal action profile (b, b) as a Nash equilibrium.
For concreteness, we first provide a simple example of a weighted set cover problem to illustrate the concept in Fig. 4. This example has two agents $I = \{1, 2\}$ and three resources $R = \{a, b, c\}$ with values $v_a = v_c = 0.1$ and $v_b = 1$. Agent 1 can cover either resource $a$ or $b$, and Agent 2 can cover either resource $b$ or $c$; i.e., $A_1 = \{a, b\}$ and $A_2 = \{b, c\}$. In this example, an optimal action profile has an objective value of 1.1, with one agent covering resource $b$ and the other agent covering a different resource. If both agents can observe each other's actions, any optimal action profile is also a Nash equilibrium.
However, consider the scenario in which neither agent can observe the other's action; i.e., $N_1 = \{1\}$ and $N_2 = \{2\}$. In this case, the complement graph of $N$ has a single directed cycle, so $\delta(N) = 1$ and by Theorem 2, the minimum payoff evaluator ensures that all Nash equilibria have objective value at least $1.1/3$. Figure 4 depicts the payoff matrices corresponding to the maximum and minimum evaluators for this example problem; note that in this case the maximum evaluator (bottom center) causes every action profile to be a Nash equilibrium (even the highly suboptimal $(a, c)$ outcome), whereas the minimum evaluator (bottom right) induces only one moderately good equilibrium $(b, b)$. On this problem, note that the maximum evaluator's worst Nash equilibrium is considerably worse than the lower bound ensured by the minimum evaluator.
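The adapted payoffs of this example can be reproduced directly. In the sketch below (illustrative only), each evaluator aggregates over the opponent's possible actions, which we take to be the empty covering together with the three singleton coverings; this choice of candidate set is an assumption of the sketch. Because neither agent observes the other, each agent's proxy payoff depends only on its own action, so a profile is an adapted equilibrium exactly when each component maximizes its owner's proxy payoff.

from itertools import product

values = {'a': 0.1, 'b': 1.0, 'c': 0.1}
W = lambda profile: sum(values[r] for r in set().union(*profile))

A1 = [frozenset('a'), frozenset('b')]                 # realized action sets
A2 = [frozenset('b'), frozenset('c')]
candidates = [frozenset(), frozenset('a'), frozenset('b'), frozenset('c')]  # assumed opponent actions

def adapted_payoffs(evaluator):
    # Proxy payoffs depend only on the agent's own action (the opponent is unobserved).
    U1 = {x: evaluator(W((x, y)) for y in candidates) for x in A1}
    U2 = {y: evaluator(W((x, y)) for x in candidates) for y in A2}
    return U1, U2

def adapted_equilibria(U1, U2):
    return [(x, y) for x, y in product(A1, A2)
            if U1[x] == max(U1.values()) and U2[y] == max(U2.values())]

for name, evaluator in [("f_min", min), ("f_max", max)]:
    U1, U2 = adapted_payoffs(evaluator)
    print(name, sorted(round(W(p), 2) for p in adapted_equilibria(U1, U2)))
# f_min -> [1.0]                  only (b, b) survives
# f_max -> [0.2, 1.0, 1.1, 1.1]   every profile, including (a, c), is an equilibrium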
The results of our empirical study (depicted in Fig. 5) corroborate this finding: the minimum evaluator significantly outperforms the maximum evaluator on random problems as well. To conduct this study, we generated 1200 random weighted set cover problems in the following way: each game was generated with 5 agents, 8 total resources, 3 randomly selected resources available to each agent, and resource values selected uniformly at random from [0, 1]. For each randomly generated game $G$, we computed the best nominal Nash equilibrium $a^{ne} \in \arg\max_{a \in \mathrm{PNE}(G)} W(a)$. We then subjected the game to a sequence of edge removals; at each stage, we removed one uniformly selected edge from $N$, up to a total of 18 removed edges. For each information graph in the resulting sequence of graphs, we applied each of the considered evaluators and then computed the worst Nash equilibrium associated with that evaluator. Let $G^f_k$ denote the game with $k$ edges removed, subjected to the evaluator $f$; then we may write the computed worst-case Nash equilibrium as $a^f_k \in \arg\min_{a \in \mathrm{PNE}(G^f_k)} W(a)$ for each evaluator $f \in \{f_{\min}, f_{\max}\}$. Finally, for game $G$ with $k$ edges removed, we record the ratio $W(a^f_k)/W(a^{ne})$.

Fig. 5 Results of empirical investigation into the effect of local adaptation rule on equilibrium quality. To generate one (gray) trace on each plot, a random weighted set cover game was generated with 5 agents, 8 total resources, 3 randomly selected resources available to each agent, and resource values selected uniformly at random from [0, 1]. A single (gray) trace corresponds to a single random game under a sequence of removed edges; it records the objective value of the worst Nash equilibrium divided by the optimal objective value. The left plot was generated using the minimum evaluator which is used to prove the lower bound in Contribution 2, and the blue trace in each plot reports the average performance of worst-case equilibria for the minimum evaluator. The right plot was generated using the maximum evaluator (see Proposition 1, item 2), and the red trace in each plot reports the average performance of worst-case equilibria for the maximum evaluator. Note that for highly disconnected graphs, the minimum evaluator enjoys a substantial advantage over the maximum evaluator.

Referring to Fig. 5, note that the minimum evaluator significantly outperforms the maximum evaluator, and this advantage is the most pronounced on highly disconnected graphs (large values of $|N^c|$). First, minimum average performance (blue trace) exceeds maximum average performance (red trace) for all values of $|N^c|$. Indeed, minimum average performance is slightly higher for mostly connected graphs than it is for fully connected (complete) graphs. Second, note that for highly disconnected graphs, nearly all equilibria associated with minimum (gray traces on the left plot) outperform maximum's average; conversely, nearly all equilibria associated with maximum (gray traces on the right plot) underperform minimum's average. Finally, consider highly disconnected information graphs in the plots in Fig. 5, with approximately $|N^c| > 14$. Even here, a setting in which almost no agent can observe other agents, the minimum evaluator often induces equilibria which are as good as the best equilibria achievable when $N$ is fully connected (i.e., their degradation reported on the plot is 1).
However, in the same connectivity regime, the maximum evaluator rarely or never induces equilibria of such high quality; practically all of its equilibria are worse than the best nominal equilibria, and some even obtain an objective value of 0.
While this empirical analysis is limited to a single class of simple submodular problems with a relatively low number of agents, it indicates that the problem of selecting a payoff evaluator is nontrivial, and that a misguided selection may lead to unnecessarily poor equilibrium performance.
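A condensed version of this experimental pipeline can be sketched as follows. The sketch is illustrative rather than a reproduction of the study above: it uses smaller parameters (4 agents, 6 resources, 2 resources per agent plus the option of covering nothing, and 6 edge removals) so that the brute-force equilibrium search stays fast, and the restriction to singleton coverings is a simplifying assumption.

import random
from itertools import product

def random_game(n_agents=4, n_resources=6, per_agent=2, rng=random):
    resources = range(n_resources)
    values = {r: rng.random() for r in resources}
    # Each agent may cover one of `per_agent` randomly chosen resources, or nothing.
    action_sets = [[frozenset()] + [frozenset([r]) for r in rng.sample(resources, per_agent)]
                   for _ in range(n_agents)]
    W = lambda a: sum(values[r] for r in set().union(*a))
    return action_sets, W

def proxy(i, profile, N_i, action_sets, W, evaluator):
    """Agent i's adapted payoff: aggregate W over the possible actions of unobserved agents."""
    hidden = [j for j in range(len(profile)) if j not in N_i]
    payoffs = []
    for combo in product(*(action_sets[j] for j in hidden)):
        full = list(profile)
        for j, aj in zip(hidden, combo):
            full[j] = aj
        payoffs.append(W(full))
    return evaluator(payoffs)

def worst_adapted_pne(action_sets, W, N, evaluator):
    """Objective value of the worst pure Nash equilibrium of the adapted game (None if no PNE)."""
    worst = None
    for a in product(*action_sets):
        stable = all(
            proxy(i, a, N[i], action_sets, W, evaluator) >=
            proxy(i, a[:i] + (ai,) + a[i + 1:], N[i], action_sets, W, evaluator)
            for i in range(len(action_sets)) for ai in action_sets[i])
        if stable:
            worst = W(a) if worst is None else min(worst, W(a))
    return worst

def run_trial(n_remove=6, seed=0):
    rng = random.Random(seed)
    action_sets, W = random_game(rng=rng)
    n = len(action_sets)
    opt = max(W(a) for a in product(*action_sets))   # best nominal equilibrium value
    N = {i: set(range(n)) for i in range(n)}
    removable = [(i, j) for i in range(n) for j in range(n) if i != j]
    rng.shuffle(removable)
    history = {"f_min": [], "f_max": []}
    for (i, j) in removable[:n_remove]:
        N[i].discard(j)
        for name, evaluator in [("f_min", min), ("f_max", max)]:
            w = worst_adapted_pne(action_sets, W, N, evaluator)
            history[name].append(None if w is None else round(w / opt, 3))
    return history

print(run_trial())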

Conclusions
This paper represents an initial study on the applicability of local methods for agents to adapt to informational inconsistencies in multiagent optimization, and suggests several open areas for future study. For example, it is unclear precisely what types of optimization problem might be subject to the fragility reported in Theorem 1, and future research could focus on identifying particular problem structures that render a problem susceptible to these issues. Another open question pertains to the optimal local adaptation rule for submodular problems. Theorem 2 shows a lower bound which is associated with the minimum evaluator and our empirical evidence suggests that the maximum evaluator is worse, but it is an open question whether any evaluators exist which can provably outperform the minimum evaluator in any sense.
Data Availability Statement Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.

Appendix: Proofs for Section 2
First we present the proof for Proposition 1 characterizing several acceptable evaluators.

Proof of Proposition 1
To see that each satisfies Definition 2, let $S \in \mathbb{R}^k$ and $S' \in \mathbb{R}^k$ satisfy the assumption of Definition 2. Arrange $S$ and $S'$ in ascending order and denote the $i$-th element of $S$ and $S'$ as $s_i$ and $s'_i$, respectively, so that $\min_{s \in S} s = s_1$ and $\max_{s \in S} s = s_k$. Thus,
$$f_{\mathrm{sum}}(S) = \sum_{i=1}^{k} s_i > \sum_{i=1}^{k} s'_i = f_{\mathrm{sum}}(S').$$
Since $f_{\mathrm{sum}}$ satisfies Definition 2, it must be true that $f_{\mathrm{mean}}$ does as well. To see that $f_{\max}$ and $f_{\min}$ satisfy Definition 2, simply note that $f_{\max}(S) = s_k > s'_k = f_{\max}(S')$ and $f_{\min}(S) = s_1 > s'_1 = f_{\min}(S')$.
Next, we present the proof that general unrestricted multiagent optimization problems are fragile.

Proof of Proposition 2.
Consider the multiagent optimization problem depicted in Fig. 6, where $\epsilon > 0$ is a small positive constant. This problem has 3 agents with two possible actions each: $\bar{A}_1 = \bar{A}_2 = \bar{A}_3 = \{1, 2\}$. Now, suppose that the realized problem has $A_1 = \{1\}$; that is, Agent 1 does not have access to action 2 and thus selects the fixed action $a_1 = 1$ (i.e., selects the left-hand payoff matrix in Fig. 6). Furthermore, suppose that the information graph is such that Agent 2 cannot observe the action choice of Agent 1 (formally: $1 \notin N_2$). Thus, the realized game is restricted simply to the left-hand payoff matrix in Fig. 6, but Agent 2 lacks the ability to observe this. Since Agent 2 cannot observe Agent 1, Agent 2 applies an acceptable evaluator $f$ to Agent 1, so Agent 2's adapted utility function $\tilde{U}_2$ aggregates, for each pair $(a_2, a_3)$, the two payoffs consistent with Agent 1 selecting either of its possible actions. By the definition and established properties of acceptable evaluators, it must hold that $\tilde{U}_2(2, a_3) > \tilde{U}_2(1, a_3)$ for each $a_3 \in \{1, 2\}$. That is, Agent 2's adapted payoffs are such that $a_2 = 2$ becomes a strictly dominant strategy, leaving a unique dominant-strategy Nash equilibrium of $a_2 = 2$ and $a_3 = 2$ with an objective value of only $2\epsilon$. The result is proved by taking the limit as $\epsilon \to 0$.

Fig. 6 Payoff matrix for the pathological game used to prove Proposition 2. As described in the proof, $\bar{A}_1 = \{1, 2\}$, but in the realized problem $A_1 = \{1\}$. That is, Agent 1 lacks access to action 2 (the action which selects the right-hand payoff matrix), so Agent 1 selects the fixed action $a_1 = 1$. Agent 2 cannot observe the action choice of Agent 1, so Agent 2 applies an acceptable evaluator to Agent 1. In the adapted game, whenever Agent 3 selects action $a_3 = k$, Agent 2's unique best response (according to the adapted payoffs) is $a_2 = 2$, as depicted by the underlines in the left-hand payoff matrix.
Finally, we present the proof of Theorem 1, showing that this fragility persists even in problems with a single $\epsilon$-inconsequential agent which cannot be seen by a single other agent.
Proof of Theorem 1. We will construct a multiagent optimization problem $G \in \mathcal{G}_\epsilon$ in which Agent 1 is $\epsilon$-inconsequential. In this problem, Agent 2 is unable to observe the action choices of Agent 1. However, despite Agent 1's inconsequentiality, regardless of which acceptable evaluator is applied by Agent 2, the problem's Nash equilibria are rendered arbitrarily poor. Furthermore, this problem has a unique Nash equilibrium in the adapted case (in which Agent 2 cannot observe Agent 1) which can be reached from any joint action via a finite sequence of payoff-improving unilateral deviations (i.e., the adapted game is weakly acyclic under better replies; see [20]).
Consider the multiagent optimization problem depicted in Figs. 7 and 8, defined for a small positive constant $\delta > 0$. This problem has 3 agents; $\bar{A}_1 = \{1, 2\}$, and $\bar{A}_2 = \bar{A}_3 = \{1, 2, \ldots, K\}$ where $K = 1/\delta - 3$. First, note that Agent 1 is $3\delta$-inconsequential. Since $\max_{a \in \bar{A}} W(a) = 1$, verifying $3\delta$-inconsequentiality reduces to verifying that a unilateral deviation by Agent 1 changes the objective value by no more than $3\delta$ for any action profile; this is easily verified via Fig. 7 by comparing like cells in the left ($a_1 = 1$) and right ($a_1 = 2$) matrices and via Fig. 8 by comparing like cells in the upper ($a_1 = 1$) and lower ($a_1 = 2$) matrices.

Fig. 7 Payoff matrices for the pathological game used to prove Theorem 1. Note that $\bar{A}_1 = \{1, 2\}$, but in the realized problem $A_1 = \{1\}$. That is, Agent 1 lacks access to action 2 (the action which selects the right-hand payoff matrix), so Agent 1 selects the fixed action $a_1 = 1$. Agent 2 cannot observe the action choice of Agent 1, so Agent 2 applies an acceptable evaluator to Agent 1. In the adapted game, whenever Agent 3 selects action $a_3 = k$, Agent 2's best response is $a_2 = k$, as depicted by the underlines in the left-hand payoff matrix.

Fig. 8 Generic modular payoff matrix block for the pathological game used to prove Theorem 1. Note that $\bar{A}_1 = \{1, 2\}$, but in the realized problem $A_1 = \{1\}$. That is, Agent 1 lacks access to action 2 (the action which selects the lower payoff matrix), so Agent 1 selects the fixed action $a_1 = 1$. Agent 2 cannot observe the action choice of Agent 1, so Agent 2 applies an acceptable evaluator to Agent 1. In the adapted game, whenever Agent 3 selects action $a_3 = k$, this leads Agent 2 to best-respond with $a_2 = k$, as depicted by the underlines in the upper payoff matrix.

Now, consider a system realization in which $A_1 = \{1\}$; that is, Agent 1 only has access to a single action (i.e., the action which yields the left matrix in Fig. 7 and the upper matrix in Fig. 8). This effectively reduces the game to a two-player game between Agent 2 and Agent 3 with the common interest utility function depicted on the left in Fig. 7 and in the upper matrix in Fig. 8. Throughout, we will write an action profile in the form $(a_2, a_3)$. As can readily be verified via the payoff matrices, this game has two pure Nash equilibria, $(1, 2)$ and $(2, 1)$, which each have an objective value of $1 - 2\delta$. Note also that no other action profile is a pure Nash equilibrium: in each action profile with $W(a) = 0$, one or both agents can deviate to obtain a nonzero payoff; in action profile $(K, K)$, Agent 2 can deviate to $a_2 = K - 1$ for a payoff improvement; and in every other action profile with nonzero payoffs, Agent 3 can deviate to $a_3 = 1$ (the left-most column) to obtain a payoff improvement. Thus, we have that
$$\min_{a^{ne} \in \mathrm{PNE}(G)} W(a^{ne}) = 1 - 2\delta. \qquad (19)$$

Now, consider an information graph in which Agent 1 and Agent 3 can observe all agents' action choices, but Agent 2 cannot observe Agent 1. That is, $N_1 = N_3 = \{1, 2, 3\}$, but $N_2 = \{2, 3\}$. Since Agent 2 cannot observe the action choice of Agent 1, Agent 2 applies an acceptable evaluator $f$ to Agent 1. Now, suppose Agent 3 selects action $a_3 = k \in \{1, 2, \ldots, K\}$. Agent 2's unique best response (given the adapted utility function) is to select $a_2 = k$.
To see this, first let Agent 3's action satisfy $a_3 = k > 1$. Considering Fig. 8, note that Agent 2's adapted payoffs for $a_2 = k$ and $a_2 = k - 1$ are
$$\tilde{U}_2(k, k) = f\big(\{1 - (k+2)\delta, \; 1 - (k-1)\delta\}\big) \quad \text{and} \quad \tilde{U}_2(k-1, k) = f\big(\{1 - k\delta, \; 1 - (k+3)\delta\}\big).$$
Hence, by the properties of acceptable evaluators, it must hold that $\tilde{U}_2(k, k) > \tilde{U}_2(k - 1, k)$ whenever $\delta > 0$. Second, let Agent 3 select action $a_3 = 1$ (the left-most column in Fig. 7). Here again it can be verified that Agent 2's best response is $a_2 = 1$, using a similar argument to the above.

Fig. 9 Portion of the payoff matrix for the pathological game used to prove Theorem 1; Agent 2 and Agent 3 actions $\{K - 1, K\}$ are depicted to show the pathological adapted Nash equilibrium at $a_2 = K$ and $a_3 = K$. Note that $\bar{A}_1 = \{1, 2\}$, but in the realized problem $A_1 = \{1\}$. That is, Agent 1 lacks access to action 2 (the action which selects the lower payoff matrix), so Agent 1 selects the fixed action $a_1 = 1$. Agent 2 cannot observe the action choice of Agent 1, so Agent 2 applies an acceptable evaluator to Agent 1. In the adapted game, whenever Agent 3 selects action $a_3 = k$, this leads Agent 2 to best-respond with $a_2 = k$, as depicted by the underlines in the upper payoff matrix.

Finally, we show that the action profile $(K, K)$ is a Nash equilibrium of the adapted game. Since $K = 1/\delta - 3$, it holds that $1 - (K - 1)\delta = 4\delta > 0$. By previous arguments, $a_2 = K$ is thus a best response for Agent 2, and any deviation by Agent 3 would yield a payoff of 0. Finally, the objective value of $(K, K)$ is small (obtained from Fig. 9):
$$W(1, K, K) = 1 - (K - 1)\delta = 4\delta.$$
Thus, we have that in the adapted multiagent optimization problem $G^f$,
$$\max_{a \in \mathrm{PNE}(G^f(N))} W(a) = 4\delta. \qquad (22)$$
Combining (19) and (22), we have that for any acceptable evaluator $f$ and any $\delta > 0$, a multiagent problem $G$ exists for which
$$\frac{\max_{a \in \mathrm{PNE}(G^f(N))} W(a)}{\min_{a \in \mathrm{PNE}(G)} W(a)} = \frac{4\delta}{1 - 2\delta}.$$
The proof of Theorem 1 is obtained in the limit by taking $\delta \to 0$.