Dynamic Advertising Games in Duopolies Under One-Step-Ahead Optimal Control

In discrete-time Vidale–Wolfe–Deal duopoly models and variants, two firms compete for market share, in a dynamic game setting, described by a pair of difference equations. This paper studies these dynamic games, using the natural concept of one-step-ahead optimal control, in which each firm optimizes its own performance index at the next step, and only has access to some information about its competitor. Two cases are studied: with and without stipulating target market shares for each firm, under sequential and parallel game playing procedures. It is shown that when target market shares are not specified, for the VWD model, limit cycles of large period can occur when each firm uses linear performance indices, while multiple equilibria may arise when quadratic performance indices are used. Three other proposed models result in games that lead to equilibria and do not have limit cycle behavior. When target market shares are specified, convergence to an equilibrium occurs for all the models proposed in this paper.

The Vidale-Wolfe model of the sales response to advertising is
$$\dot{S}(t) = r A(t)\left(1 - \frac{S(t)}{M}\right) - \lambda S(t), \qquad (1)$$
where $S$ represents the rate of sales at time $t$, $A(t)$ is the advertising expenditure (to be designed), $r$ is referred to as the response constant, $M$ is the market saturation level and $\lambda$ is the exponential sales decay rate. The interpretation is that the increase in the rate of sales, $\dot{S}$, is proportional to the intensity of the advertising effort (or control) $A(t)$ reaching the fraction of potential customers $(1 - S/M)$, less the number of defecting customers $(\lambda S)$. In what follows, the parameter $\lambda$ will be referred to as the defection rate, rather than the sales decay rate, since, in the more general case of market share models involving competition between firms, sales decay will be attributed to the defection of customers from the firm.
In order to rewrite model (1) in a normalized form, i.e., in terms of the fraction of customers $S/M$, which is also called the market share and is denoted $x$, the change of variables $x = S/M$ is made, to yield:
$$\dot{x}(t) = \frac{r}{M} A(t)\,(1 - x(t)) - \lambda x(t).$$
The discrete-time Vidale-Wolfe market share dynamics are the difference-equation counterpart of this model. Despite the simplicity of this model, it has been the subject of intense research, notably by Sethi [18,19], who derived optimal advertising efforts (controls) for this class of models, under different assumptions on the cost functions. The bulk of the literature on the Vidale-Wolfe model does not, however, propose significant or structural modifications to the original Vidale-Wolfe model dating from 1957.
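As an illustration of the normalized model, the following minimal Python sketch simulates a discrete-time Vidale-Wolfe market share under constant advertising effort. It assumes a forward-Euler discretization with unit time step; the function names and the symbol `rho` (standing for $r/M$) are illustrative choices, not notation taken from the paper.

```python
def vidale_wolfe_step(x, u, rho, lam):
    """One unit-step forward-Euler update of the normalized Vidale-Wolfe model.

    Assumed form: x(k+1) = x(k) + rho*u(k)*(1 - x(k)) - lam*x(k).
    """
    return x + rho * u * (1.0 - x) - lam * x


def simulate(x0, u, rho, lam, steps):
    """Market share trajectory under a constant advertising effort u."""
    shares = [x0]
    for _ in range(steps):
        shares.append(vidale_wolfe_step(shares[-1], u, rho, lam))
    return shares


def steady_state(u, rho, lam):
    """Fixed point of the update above: rho*u*(1 - x) = lam*x."""
    return rho * u / (rho * u + lam)
```

Under this assumed discretization, a smaller defection rate `lam` or a larger effort `u` raises the steady-state market share, which matches the interpretation given for model (1).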
Deal [6] generalized the Vidale-Wolfe model to a duopoly: two firms competing for market share through advertising. Advertising effort is determined by each firm optimizing its performance index. In other words, Deal proposed a differential game model, now known as the Vidale-Wolfe-Deal (VWD) model. A discrete-time version of the VWD model, in which the market shares are represented by the variables $x_1$ and $x_2$, is as follows:
$$x_1(k+1) = x_1(k) + b_1\,(1 - x_1(k) - x_2(k))\,u_1(k) - \lambda_1 x_1(k), \qquad (4)$$
$$x_2(k+1) = x_2(k) + b_2\,(1 - x_1(k) - x_2(k))\,u_2(k) - \lambda_2 x_2(k), \qquad (5)$$
where $u_i(k) \ge 0$, for $k = 1, \ldots, K_f$, and $\lambda_i \ge 0$, $i = 1, 2$. The nonnegative parameters $b_i$ are referred to as the market share response parameters.
Interpreting firms as players competing for market share, the two-player decision making problem of allocating advertising expenditure or effort, is referred to as a game. The game is called dynamic if the order in which the decisions are made is important, and is noncooperative if each player involved pursues its own interests which are partly conflicting with the interests of the others [3]. Since a difference (resp. differential) equation in discrete time (resp. in continuous time) describes the evolution of the underlying decision process, the adjective differential is used to qualify the game. Strictly speaking, in the discrete-time case, one should refer to a difference game, but since this terminology is not common in the literature, the term dynamic game will be used in this paper, and some basic definitions pertaining to dynamic games will be recalled in Sect. 2. Furthermore, this paper will consider real-time iterative processes for the computation of noncooperative equilibria of market share dynamic games.
Two quotes from [2] are relevant here: (The Gauss-Seidel and Jacobi procedures should be thought of as) "online adjustment processes (in which) the players need to know only their own cost functions and the most recently computed... policies of the other players, and not the other players' cost functions." (from the abstract: we added the phrase in parentheses and normal font to make the quote comprehensible). In [2, p.18], we find the statement "the class of algorithms studied... is different in nature from a majority of algorithms found in the literature for zero-sum and nonzero-sum games. In most of that work... the objective has been to obtain efficient iterative techniques to solve (off-line) for the equilibrium solution from the necessary conditions obtained for such games. Here, however, our motivation has been guided by online distributed decision making (emphasis ours) where at each iteration the players react to the changes they detect in the policy choices of each other, in a way consistent... with the information available to them at the time of the decision." Both of these quotes are applicable, ipsis litteris, to the present paper.
In the literature, the Vidale-Wolfe-Deal model, introduced in [6], has mainly been studied in the continuous-time case, from a theoretical viewpoint, using the techniques of differential games. It is often difficult to obtain analytical or even computational results using the differential game approach, owing to the complexity of solving the nonlinear partial differential Hamilton-Jacobi-Bellman-Isaacs equations. The main objective in this paper is to show that the one-step-ahead optimal control is applicable, and easily computable, in the discrete-time context, provided that the competition is assumed to occur according to a given procedure: for example, with firms alternating in the computation of their optimal actions. This paper studies the Vidale-Wolfe-Deal model, as well as three more general models, in which advertising effort can affect both customers who are undecided as well as those who patronize the competing brand, using the one-step-ahead optimal control approach.
The sections that follow describe the main ingredients of a discrete-time dynamic game. First, the models used to describe market share dynamics are introduced and discussed briefly, together with the definition of possible performance indices that are relevant. A quick recapitulation of some basic definitions of dynamic games is made, as well as the introduction of two basic procedures of game playing that define alternating and simultaneous modes of play, following Başar [2]. The concluding section gives numerical examples of some sample cases. Since there are many different choices of models, performance indices, game playing procedures and parameters, it is clear that, combinatorially speaking, there is potentially a large number of cases to be investigated. However, the scope of this paper is not to make an exhaustive study, but rather to showcase some of the most illustrative or representative cases.

Contributions of this Paper
This paper makes the following contributions:
• Introduces and studies discrete-time versions of four different models of two-player games which model market share dynamics under advertising effort in a duopoly.
• Defines a dynamic game in which each player adopts a one-step-ahead optimal strategy using either a linear or a quadratic performance index, trading off market share and advertising effort. Indices are of two types: without and with target market shares for each player. Each player's moves follow either a Jacobi (synchronous parallel) or a Gauss-Seidel (sequential) procedure.
• Defines a discrete-time version of the Nagurney-Zhang projection procedure and uses it to ensure invariance of the projected dynamical systems corresponding to the dynamic models that result from application of the one-step-ahead optimal advertising effort by each of the players.
• Formulates the dynamic game models under one-step-ahead optimal control as nonlinear programs, enabling a numerical study of the proposed games under different parameter values.
• Carries out numerical studies that highlight the differences in the outcomes of the dynamic games, depending on the model, its parameters and the play procedure (Jacobi or Gauss-Seidel). This provides useful guidelines on the use of each model, depending on the observed market share behavior.
• Shows that, in the case of specified target market shares, dynamic games based on all four models converge to a globally stable equilibrium (and no limit cycles occur).

Definition of a Discrete-Time Dynamic Game
The general formulation of a dynamic game, based on [2], is given in this section. The discussion will be limited, for simplicity, to two-person deterministic games, but there is no difficulty in generalizing to $N$-person games. A two-player dynamic game, with the players denoted $P_1$ and $P_2$, is an 8-tuple
$$(X_1, X_2, U_1, U_2, D, J_1, J_2, P),$$
where the elements of the 8-tuple are described as follows:
1. Player states, denoted $x_1, x_2$, which describe the quantity of interest to each player, and belong to some set $X_1 \times X_2 \subset \mathbb{R}^2$. In this paper, the states represent market shares, and $X_i$ is defined as $[0, 1] \subset \mathbb{R}$ for $i = 1, 2$. In fact, since the $x_i$ represent fractions of the total market (normalized to unity), the states are further restricted to the triangle defined as
$$T = \{(x_1, x_2) \in \mathbb{R}^2 : x_1 \ge 0,\ x_2 \ge 0,\ x_1 + x_2 \le 1\}.$$
2. Player actions or controls, denoted $u_1, u_2$, which belong to some sets $U_1, U_2$. In this paper, $U_1 = U_2 = [0, 1] \subset \mathbb{R}$.
3. A pair of difference equations, collectively referred to as the dynamics $D$, that determine the evolution of the states of the players, as a function of the player states and actions (controls). These difference equations are written as follows:
$$x_i(k+1) = f_i(x_1(k), x_2(k), u_1(k), u_2(k)), \quad i = 1, 2.$$
4. Performance indices, denoted $J_1, J_2$, where $J_i$ is to be optimized by $P_i$. The indices $J_i$ may depend on $x_i, u_i$, $i = 1, 2$. The specific indices $J_i$ and dynamics $D$ used in this paper are introduced in Sect. 2.1. It should be noted that the dynamics and the performance indices define the information that each player needs about the other in order to be able to calculate its action and, as a result, its next state.
5. A game playing procedure $P$, which determines the order in which each player optimizes its index, given the information it requires about its adversary. Details of the specific procedures are given in Sect. 2.5.
The subsections which follow detail the descriptions of the different elements of the tuple which define, specifically, a market share dynamic game, namely the dynamics, the performance indices and how they are used in the calculation of optimal controls or actions and the game playing procedure.

Models of Market Share Dynamics for Duopolies
This section introduces discrete-time versions of four models of market share dynamics, referred to by the initials of the authors who first proposed the models. In all four models, x i denotes the market share of firm i and u i the corresponding advertising effort, while λ i denotes the market share decay rate of firm i. Other parameters specific to each model are defined as necessary.

The Vidale-Wolfe-Deal (VWD) Model
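Discrete-time Vidale-Wolfe-Deal dynamics for two firms, whose market shares are represented by the variables $x_1$ and $x_2$, are as follows:
$$x_1(k+1) = x_1(k) + b_1\,(1 - x_1(k) - x_2(k))\,u_1(k) - \lambda_1 x_1(k), \qquad (6)$$
$$x_2(k+1) = x_2(k) + b_2\,(1 - x_1(k) - x_2(k))\,u_2(k) - \lambda_2 x_2(k), \qquad (7)$$
where $u_i(k) \ge 0$, for $k = 1, \ldots, K_f$, and $\lambda_i \ge 0$, $i = 1, 2$. The nonnegative parameters $b_i$ are referred to as the market share response parameters.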

The Aliaga-Bhaya-Kaszkurewicz (ABK) Model
An alternative proposal for dynamics of market shares in a duopoly was made in [1] and a slightly generalized discrete-time version of ABK dynamics, adding parameters b i , k i , is as follows.
where u i (k) ≥ 0, for k = 1, . . . , K f and λ i ≥ 0, i = 1, 2. The nonnegative b i s are referred to as market share response parameters, while the nonnegative k i s are the predation parameters.

The Deal-Sethi-Thompson (DST) Model
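Discrete-time Deal-Sethi-Thompson dynamics for two firms whose market shares are represented by the variables $x_1$ and $x_2$ are as follows:
$$x_1(k+1) = x_1(k) + b_1\,(1 - x_1(k) - x_2(k))\,u_1(k) + e_1\,(u_1(k) - u_2(k))\,(x_1(k) + x_2(k)) - \lambda_1 x_1(k), \qquad (10)$$
$$x_2(k+1) = x_2(k) + b_2\,(1 - x_1(k) - x_2(k))\,u_2(k) + e_2\,(u_2(k) - u_1(k))\,(x_1(k) + x_2(k)) - \lambda_2 x_2(k), \qquad (11)$$
where $u_i(k) \ge 0$, for $k = 1, \ldots, K_f$, and $\lambda_i \ge 0$, $i = 1, 2$. The nonnegative parameters $b_i$ are referred to as market share response parameters and the nonnegative parameters $e_i$ as the excess advertising effect parameters.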

The Leitmann-Schmitendorf (LS) Model
Discrete-time Leitmann-Schmitendorf dynamics for two firms whose market shares are represented by the variables x 1 and x 2 are as follows: where u i (k) ≥ 0, for k = 1, . . . , K f and λ i ≥ 0, i = 1, 2. The parameters c i shape the response to advertising effort and are discussed further in Sect. 2.1.5, while k i s are predation parameters.

Brief Comparative Discussion of Duopoly Market Share Models
As pointed out by Deal [6], the time-varying aspects that must be treated in a market share model must include the market share response to advertising expenditures, as well as the possibility of diminishing returns to cumulative advertising expenditures. The four models considered above deal with these issues in different ways. There are several surveys [9,11,13,14] and even book-length expositions [10,15] that propose different classifications of these models to which the reader is referred for greater detail. The discussion below is limited to a brief description of the main features of each model. The VWD model (6)-(7) is the simplest model since it does not have a competitive term and the competitor's advertising effort has no direct effect on a firm's market share. However, the presence of the term b i (1 − x 1 (k) − x 2 (k))u i (k) implies that if the market share of the competitor increases, the size of the unconquered market (1 − x 1 (k) − x 2 (k)) decreases, which, in turn, diminishes the effectiveness of advertising effort for the firm, thus modeling the effect of competition and diminishing returns. This is sometimes referred to as a (market) saturation effect.
The ABK model (8)-(9), introduced in continuous time in [1] and inspired by population biology models, is a variant of the VWD model, which considers three populations: two composed of customers of firms 1 and 2 and a third composed of undecided customers. The new feature of the ABK model is to consider the effects of advertising on the undecided population, and it differs from the VWD model through two changes. The first change implies that the advertising effort of firm 1 applies to both the competitor population as well as the undecided one. The second change implies that the advertising effort of firm 2 has a direct (predatory) effect on the firm 1 population, proportional to the latter's size.
The salient feature of the DST model (10)-(11), which is another variant of the VWD model, is that it contains reaction terms of the form $e_1(u_1 - u_2)(x_1 + x_2)$ and $e_2(u_2 - u_1)(x_1 + x_2)$. To explain these terms, Deal et al. [5] assume that $u_1 > u_2$, so that the term $e_1(u_1 - u_2)x_2$ represents the fraction of the $x_2$ population that switched from firm 2 to firm 1 because of the excess advertising of firm 1. Next, the term $e_1(u_1 - u_2)x_1$ is referred to as an approximation of $e_1(u_1 - u_2)x_1 x_2$. The latter term is typical of Lotka-Volterra-type dynamics, representing the rate of encounters or interaction between individuals belonging to the $x_1$ and $x_2$ populations, and represents the fraction of the $x_2$ population switching from firm 2 to firm 1 as a result of the excess advertising of firm 1. This term has, more recently, also been described as modeling word of mouth (WOM) interaction between populations. Summing the two terms leads to the term $e_1(u_1 - u_2)(x_1 + x_2)$.
The LS model (12)-(13), introduced in continuous time in [16], takes a different approach from the three models discussed above, modeling the diminishing returns of advertising effort explicitly, through the concave function $u_i(k) - c_i u_i(k)^2$. The predatory terms $-k_i x_i u_j$ are identical to those of the ABK model.
Table 1 shows, for each of the four models discussed above, the information each player requires about its adversary at instant $k$, in order to compute its market share at instant $k + 1$, in accordance with the respective model dynamics.
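To make the comparison concrete, the one-step maps of the four models can be sketched in Python as below. The functional forms are assumptions assembled from the terms named in the discussion above (the saturation term $b_i(1 - x_1 - x_2)u_i$, the reaction term $e_i(u_i - u_j)(x_1 + x_2)$, the predatory term $-k_i x_i u_j$, the diminishing-returns term $u_i - c_i u_i^2$ and the decay term $\lambda_i x_i$), combined with a unit-step update; in particular, the ABK advertising term $b_i(1 - x_i)u_i$ is a guess based on the verbal description and may differ from the paper's equations (8)-(9).

```python
def vwd(x, u, b, lam):
    """Assumed VWD step: advertising acts only on the unconquered market."""
    free = 1.0 - x[0] - x[1]
    return tuple(x[i] + b[i] * free * u[i] - lam[i] * x[i] for i in range(2))


def abk(x, u, b, k, lam):
    """Assumed ABK step: own effort reaches everyone outside firm i,
    while the rival's effort preys on firm i's customers."""
    return tuple(
        x[i] + b[i] * (1.0 - x[i]) * u[i] - k[i] * x[i] * u[1 - i] - lam[i] * x[i]
        for i in range(2)
    )


def dst(x, u, b, e, lam):
    """Assumed DST step: VWD plus an excess-advertising reaction term."""
    free = 1.0 - x[0] - x[1]
    total = x[0] + x[1]
    return tuple(
        x[i] + b[i] * free * u[i] + e[i] * (u[i] - u[1 - i]) * total - lam[i] * x[i]
        for i in range(2)
    )


def ls(x, u, c, k, lam):
    """Assumed LS step: diminishing returns u - c*u**2 plus predation."""
    return tuple(
        x[i] + u[i] - c[i] * u[i] ** 2 - k[i] * x[i] * u[1 - i] - lam[i] * x[i]
        for i in range(2)
    )
```

In these sketches, only `vwd` can be evaluated by firm $i$ without knowing the rival's current effort $u_j$; the other three maps need it, which is the kind of information requirement the text attributes to Table 1.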

Performance Indices for Dynamic Games
In the literature on dynamic games (see [3] and references therein), one finds performance indices of the type (14), where the label $H$ is used to denote that the optimization is performed over the entire horizon $H$. The following remarks about (14) are pertinent.
1. It is crucial to observe that the choice of index (14) implies that the optimal controls $u_1(k), u_2(k)$ must be determined simultaneously, for the whole horizon $H$, using some optimal control method. This means that the resulting controls cannot be guaranteed to be causal, in the sense that the determination of controls and states at a given time does not depend on future states and controls. This, apart from being infeasible, also goes against the usual meaning of a competitive game in which each player chooses his present action (advertising effort) as the game unfolds (i.e., in real time), according to his present market share as well as the market share and present and past actions of his adversary.
2. In a general model, the $x_1$ and $x_2$ dynamics may be coupled, in the sense that $f_i$ at instant $k$ may also depend on $x_j(k)$ and $u_j(k)$. This implies that, in a competitive game, if player 1 wishes to optimize an index of the type (14), his optimal sequence of actions will also end up specifying the optimal sequence of actions (and states) of his adversary, player 2, and vice versa, which implies that optimality of the sequences computed by the two adversaries will only hold if they are consistent with each other.
It turns out that these issues can be sidestepped in the particular case of linear dynamics and a quadratic performance index, because the optimal control can be written in feedback form [3,8,12], but for the general case of nonlinear dynamics and performance indices that are not quadratic, both observations above hold true. At an abstract mathematical level, the notion of causality in games is discussed in [21], while the notion of consistency is discussed in [3].
Given these observations, the problems of noncausality and inconsistency can be avoided as follows:
1. Using one-step-ahead optimization, in which, at each instant $k$, each player optimizes his index at the next instant $k + 1$, using only current and past information. Note that this is natural in a discrete-time setting, since choosing the action or control at instant $k$ determines the state at instant $k + 1$. Adopting one-step-ahead optimization ensures causality.
2. Defining a game playing procedure, which defines the order in which each player calculates its optimal control. Following the procedure as the game unfolds guarantees consistency of the controls.
Performance indices of two types will be defined: The first type captures the desire of each player to maximize its market share while minimizing its advertising effort, while the second establishes target market shares for each player, so that the deviation of each player's actual market share from the target is sought to be minimized, while simultaneously keeping its advertising effort as small as possible. From these descriptions, it is clear that these indices apply to all models of competitive duopolies. A further subdivision of the performance indices without targets is made according to whether it is linear or quadratic. In the case that targets are specified, only quadratic performance indices are considered, for simplicity.
The linear one-step-ahead performance index for firm $i$ (with market share $x_i$ at instant $k$) is denoted $J_i(k)$; it involves a nonnegative weight $\alpha_i$ and is to be maximized at each instant $k$, for each $i$. The quadratic one-step-ahead performance index is denoted $J_i^q(k)$; it likewise involves a nonnegative weight $\alpha_i$ and is to be maximized at each instant $k$, for each $i$.
Given target values $x_{i,\mathrm{ref}}$, $i = 1, 2$, a quadratic one-step-ahead performance index at instant $k$, denoted $J_i^r(k)$, is defined for firm $i$ with target market share $x_{i,\mathrm{ref}}$. It involves positive weights $\alpha_i, \beta_i$, and the index $J_i^r(k)$ is to be minimized at each instant $k$, for each $i$.
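The one-step-ahead optimization can be illustrated with a small Python sketch for a single firm. Everything specific here is an assumption made for illustration: the VWD-type next-share map, the linear index $x_i(k+1) - \alpha_i u_i(k)$, the target index $\alpha_i (x_i(k+1) - x_{i,\mathrm{ref}})^2 + \beta_i u_i(k)^2$, and the grid search over $U_i = [0, 1]$ (the paper itself reports using nonlinear programs in JuMP).

```python
def next_share(x_own, x_other, u, b, lam):
    """Assumed VWD-type update of one firm's share for a candidate effort u."""
    return x_own + b * (1.0 - x_own - x_other) * u - lam * x_own


def linear_index(x_own, x_other, u, b, lam, alpha):
    """Assumed linear index (to be maximized): next share minus weighted effort."""
    return next_share(x_own, x_other, u, b, lam) - alpha * u


def target_index(x_own, x_other, u, b, lam, alpha, beta, x_ref):
    """Assumed target index (to be minimized): weighted squared deviation
    from the target share plus weighted squared effort."""
    deviation = next_share(x_own, x_other, u, b, lam) - x_ref
    return alpha * deviation ** 2 + beta * u ** 2


def best_control(index, maximize=True, grid=1001):
    """Grid search over u in [0, 1] for the one-step-ahead optimal control."""
    candidates = [j / (grid - 1) for j in range(grid)]
    return max(candidates, key=index) if maximize else min(candidates, key=index)
```

For example, `best_control(lambda u: linear_index(0.2, 0.2, u, 1.0, 0.1, 0.3))` picks an endpoint of $[0, 1]$, because the assumed linear index is affine in $u$; this on-off structure is one possible reading of why linear indices lead to oscillatory behavior in the numerical examples.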

Game Playing Procedure
In order to define game playing procedures precisely, some notation is introduced. Let $L_1$, the optimal action function (also called the optimal response function) of player 1, be the map $L_1 : X_1 \times X_2 \times U_2 \to U_1$ defined by the equality constrained optimization problem (18), in which $J_1$ is maximized over $u_1$ subject to the dynamics. The argument $u_1$ that maximizes $J_1$ in (18) is denoted $u_1^*(k)$ and is the one-step-ahead optimal control for player 1 at instant $k$, i.e., $u_1^*(k) = L_1(x_1(k), x_2(k), u_2(k))$. By using the dynamics to substitute $x_1(k + 1)$ by the right-hand side of (4), the nonlinearly constrained problem (18) can be written equivalently as an unconstrained nonlinear optimization problem in $u_1$. Note that, in the definition of $L_1$, the quantities $x_1, x_2, u_2$ are assumed fixed and the optimization variable is $u_1$. The map $L_2 : X_1 \times X_2 \times U_1 \to U_2$ can be defined similarly.
Using this notation, two iterative procedures can be described. The first corresponds to alternate or sequential play by each of the players and, following [2], will be referred to as the Gauss-Seidel procedure, since it is reminiscent of the numerical iterative method with the same name. The second corresponds to simultaneous play by both players and is referred to as the Jacobi procedure, once again in analogy with the numerical iterative method.
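The difference between the two update orders can be sketched in Python with stand-in response maps. The maps `response_1` and `response_2` below are hypothetical toy functions chosen only so that the iteration has a known fixed point; they are not the optimal action functions of any model in this paper, and the states are held fixed for simplicity.

```python
def clip01(v):
    """Keep an action inside U = [0, 1]."""
    return min(max(v, 0.0), 1.0)


def response_1(x1, x2, u2):
    """Hypothetical toy stand-in for L_1 (ignores the states)."""
    return clip01(0.5 - 0.25 * u2)


def response_2(x1, x2, u1):
    """Hypothetical toy stand-in for L_2 (ignores the states)."""
    return clip01(0.5 - 0.25 * u1)


def gauss_seidel(L1, L2, x, u2, iterations):
    """Sequential play: player 2 reacts to player 1's freshly computed action."""
    u1 = None
    for _ in range(iterations):
        u1 = L1(x[0], x[1], u2)
        u2 = L2(x[0], x[1], u1)
    return u1, u2


def jacobi(L1, L2, x, u1, u2, iterations):
    """Parallel play: both players react to the previous round's actions."""
    for _ in range(iterations):
        u1, u2 = L1(x[0], x[1], u2), L2(x[0], x[1], u1)
    return u1, u2
```

With these toy maps both procedures approach the same pair of actions, with the sequential update contracting faster per round; for the market share models the two procedures need not agree, as the numerical examples indicate.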
Formally, the Gauss-Seidel procedure is written as the iteration (20)-(21). Emphasizing the role of the controls $u_i$, the iteration (20)-(21) can be written equivalently as (22), where $L_{GS} := L_2 \circ L_1$ is the composition of the optimal action functions and is referred to as the Gauss-Seidel optimal action map.
The Jacobi procedure, in which the players compute their optimal actions simultaneously (i.e., in parallel), rather than sequentially, is written formally as follows.
Once again, emphasizing the role of the controls u i , we can define the Jacobi optimal action map as:

Equilibrium of a Dynamic Game
In this paper, all games $(X_1, X_2, U_1, U_2, D, J_1, J_2, P)$ will be played on a finite horizon $k = 0, 1, 2, \ldots, K_f$, meaning that at the end of the game, the players have defined two sequences of actions. Following Başar [2], a solution of the game is referred to as a globally stable (or, simply, stable) equilibrium solution, with respect to an a priori fixed game playing procedure $P$ if, irrespective of the initial player actions, the state trajectories converge, as $K_f \to \infty$, to some fixed point or limit cycle.
At each step of such a game, each player P i determines his policy by optimizing his individual cost function J i , using the most recently available policies and states of the other players as given. Clearly, any deviation from such a stable equilibrium solution, followed by an optimal adjustment process that respects the given game playing procedure P, would lead back to the same equilibrium.
An equilibrium solution $(u_1^*, u_2^*)$ of the game $(X_1, X_2, U_1, U_2, D, J_1, J_2, P)$, with $P$ as the Gauss-Seidel procedure, is stable if and only if it is the limit point of the sequence $\{u_1(k), u_2(k)\}_{k=1}^{\infty}$ generated by (20)-(21), or, equivalently, by (22), for all initial actions $u_2(1) \in U_2$. Note that this means that an equilibrium can also be thought of as a fixed point of the composite map $L_{GS}$. Observe also that, if $L_{GS} = L_1 \circ L_2$, this just means that the Gauss-Seidel procedure begins with firm 2 instead of firm 1, so that the initial action $u_1(1)$ must be specified, but the resulting equilibrium solution is the same as the one found using the inverted order $L_2 \circ L_1$.
An equilibrium solution of the game using the Jacobi procedure is just a fixed point of the map $L_J$. In this paper, we will only be concerned with the equilibrium solutions or limit cycles produced by the iterative procedures defined in the paper, noting that, as shown in [2], an equilibrium solution is a Nash equilibrium. Finally, using the notation $L_\mu$, where the subscript $\mu$ can be either $GS$ or $J$, it may be the case that $L_\mu$ does not have a fixed point, but that $L_\mu^N$ does, for some positive integer $N$, where the superscript denotes $N$ successive applications of the map $L_\mu$. This means that the dynamic game possesses a limit cycle of period $N$. Thus, a limit cycle may be thought of as a Nash equilibrium of $L_\mu^N$.
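The characterization of a limit cycle as a fixed point of $L_\mu^N$ suggests a simple numerical check, sketched below in Python. The function name, the transient length, the maximum period and the tolerance are all illustrative assumptions; `step` stands for any map on tuples, such as one round of a game playing procedure.

```python
def cycle_period(step, z0, transient=1000, max_period=64, tol=1e-9):
    """Smallest N <= max_period with step^N(z) ~= z after a transient.

    Returns 1 for a fixed point, N > 1 for a period-N limit cycle,
    and None if no period up to max_period is detected.
    """
    z = z0
    for _ in range(transient):  # discard the transient part of the orbit
        z = step(z)
    reference = z
    for n in range(1, max_period + 1):
        z = step(z)
        if all(abs(a - b) <= tol for a, b in zip(z, reference)):
            return n
    return None
```

A returned value of 1 corresponds to an equilibrium of the iteration, while a value such as 3 or 13 would correspond to limit cycles of the kind reported in the numerical examples. Slowly converging orbits may need a longer transient than the default assumed here.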

Complete Description of a Dynamic Game with Projections
An important point that has not been raised until now is the question of invariance of the triangle defined as
$$T = \{(x_1, x_2) \in \mathbb{R}^2 : x_1 \ge 0,\ x_2 \ge 0,\ x_1 + x_2 \le 1\}$$
(the phase space of the market share dynamics), when each player optimizes his performance index independently, following some game procedure. In fact, in order to guarantee that the triangle $T$ is maintained invariant during the dynamic game, the previous descriptions of the Gauss-Seidel and Jacobi procedures must be augmented with a projection step. This is because the computed optimal state, resulting from the application of the computed one-step-ahead optimal controls to a current feasible state, may lie outside the triangle $T$. Whenever this happens, this state should be projected back onto $T$. An appropriate definition of a projection onto $T$ is given in the Appendix.
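One possible projection step is sketched below in Python. It assumes that the projection is the Euclidean (nearest-point) projection onto the triangle with vertices $(0,0)$, $(1,0)$ and $(0,1)$; the definition given in the Appendix may differ, so this is only an illustration of the idea.

```python
def project_onto_triangle(x1, x2):
    """Euclidean projection of (x1, x2) onto {x1 >= 0, x2 >= 0, x1 + x2 <= 1}."""
    # Clipping negative coordinates is already the nearest point
    # whenever the clipped point satisfies x1 + x2 <= 1.
    p1, p2 = max(x1, 0.0), max(x2, 0.0)
    if p1 + p2 <= 1.0:
        return p1, p2
    # Otherwise the nearest point lies on the segment x1 + x2 = 1:
    # project the original point onto that line, then clip to the segment.
    t = min(max((x1 - x2 + 1.0) / 2.0, 0.0), 1.0)
    return t, 1.0 - t
```

Points already inside the triangle are returned unchanged, so applying this step after every round leaves feasible states untouched and only corrects infeasible ones.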
Adding the projection step to the previous description of a dynamic game, we finally arrive at the complete descriptions of the procedures.

The Gauss-Seidel Procedure for a Dynamic Advertising Game
This iterative procedure or algorithm can be written as follows, with the initial action $u_2(0)$ and the horizon $K_f$ given.
Note the appearance of the superscript $t$ in both procedures, to indicate that the calculation of the superscripted "next" states is temporary, because the point $(x_1^t(k + 1), x_2^t(k + 1))$ may not belong to the triangle $T$. This is only ensured in line 7 of both procedures by application of the projection onto $T$. Observe also that the one-step-ahead optimal control is computed in the steps in which the $L_i$ appear on the right-hand sides.

Dynamic Game Based on the Vidale-Wolfe-Deal Dynamics
This section writes out the Gauss-Seidel and Jacobi dynamic games based on the VWD model explicitly.

Gauss-Seidel Procedure for VWD Duopoly Game
The Gauss-Seidel procedure described in general terms in Sect. 2 can be written formally for the Vidale-Wolfe-Deal dynamics as follows: The notation x t i (k + 1) refers to a temporary or pre-projection value of the state x i (k + 1). The actual values x i (k + 1) are computed in the last step, which is the projection.

Jacobi Procedure for VWD Duopoly Game
The Jacobi procedure described in general terms in Sect. 2 can be written formally for the Vidale-Wolfe-Deal dynamics as follows: Similar procedures can be written for the other three dynamic game models.
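A minimal Python sketch of both procedures for a VWD-type game with a linear index is given below. Several ingredients are assumptions rather than the paper's definitions: the VWD update $x_i + b_i(1 - x_1 - x_2)u_i - \lambda_i x_i$, the linear index $x_i(k+1) - \alpha_i u_i(k)$ (whose maximizer over $[0, 1]$ is the on-off rule in `best_effort`), the reading of sequential play as firm 2 observing firm 1's temporary next state, and the Euclidean projection onto the triangle.

```python
def project_onto_triangle(x1, x2):
    """Euclidean projection onto {x1 >= 0, x2 >= 0, x1 + x2 <= 1} (assumed)."""
    p1, p2 = max(x1, 0.0), max(x2, 0.0)
    if p1 + p2 <= 1.0:
        return p1, p2
    t = min(max((x1 - x2 + 1.0) / 2.0, 0.0), 1.0)
    return t, 1.0 - t


def best_effort(x_own, x_other, b, alpha):
    """Maximizer of the assumed linear index, which is affine in u."""
    return 1.0 if b * (1.0 - x_own - x_other) > alpha else 0.0


def vwd_share(x_own, x_other, u, b, lam):
    """Assumed VWD update of one firm's market share."""
    return x_own + b * (1.0 - x_own - x_other) * u - lam * x_own


def play(x0, b, lam, alpha, horizon, sequential=True):
    """Market share path; sequential=True mimics Gauss-Seidel, False Jacobi."""
    x1, x2 = x0
    path = [(x1, x2)]
    for _ in range(horizon):
        u1 = best_effort(x1, x2, b[0], alpha[0])
        x1_t = vwd_share(x1, x2, u1, b[0], lam[0])  # temporary state of firm 1
        seen_x1 = x1_t if sequential else x1        # what firm 2 observes
        u2 = best_effort(x2, seen_x1, b[1], alpha[1])
        x2_t = vwd_share(x2, seen_x1, u2, b[1], lam[1])
        x1, x2 = project_onto_triangle(x1_t, x2_t)  # final projection step
        path.append((x1, x2))
    return path
```

A call such as `play((0.1, 0.2), (1.0, 1.0), (0.1, 0.1), (0.3, 0.3), 200)` returns a trajectory that stays inside the triangle by construction; whether it settles on an equilibrium or a limit cycle depends on the parameters and on the assumptions listed above.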

Numerical Examples of VWD Duopoly Games for Linear Performance Indices
In order to describe the numerical examples, the term parameter gap is introduced, to refer to the difference between corresponding parameters in the dynamics of players 1 and 2. For instance, the term λ-gap refers to the difference between $\lambda_1$ and $\lambda_2$ and is said to be in favor of firm $i$ if $\lambda_i < \lambda_j$, $i \ne j$. This is because, in the case of constant control ($u_i$), a smaller decay factor should lead to a larger equilibrium value of market share. The term α-gap refers to the difference between $\alpha_1$ and $\alpha_2$ and is said to be in favor of firm $i$ if $\alpha_i < \alpha_j$, $j \ne i$, because a smaller positive weight implies that a larger control effort can be applied if necessary, leading, in principle, to a larger market share. Similar comments apply to the parameters $\beta_i, b_i, c_i, k_i$ for the different models. This terminology is useful in describing different combinations of parameter choices compactly and describing the qualitatively different types of behavior associated with these choices. Furthermore, in all phase plane plots, the line $x_1 + x_2 = 1$ with slope $-1$ connecting $(0, 1)$ to $(1, 0)$ is called the saturated market line (and appears as a blue line in the plots).
The line x 1 = x 2 is called the equal market share line and is shown as a dotted green line in the plots.
All the numerical experiments carried out in this paper were implemented in Julia and the algebraic modeling language JuMP (Julia for mathematical programming) [4,7].

Zero α-, λ- and $b_i$-Gaps
With zero gaps, it is to be expected that the VWD duopoly game will lead to a situation in which both firms end up having roughly equal market shares. The simulations below show that this is indeed the case, although, for the parameter values chosen, the end result is a limit cycle rather than a (Nash) equilibrium, for both the Gauss-Seidel (Fig. 1) and Jacobi procedures.
(a) Phase plane plot, showing convergence to the limit cycle, from all initial conditions. (b) Period 3 limit cycle, symmetrically located around the green dotted equal market share line.

α- and λ-Gaps Both in Favor of Firm 1
With both gaps in favor of firm 1, it is to be expected that the VWD duopoly game will lead to a situation in which firm 1 ends up having a larger market share. The simulations below show that this is indeed the case, although, for the parameter values chosen in Figs. 5 and 6, the end result is a limit cycle rather than a (Nash) equilibrium, for both the Gauss-Seidel and Jacobi procedures. Corresponding time and advertising effort plots are shown in Figs. 7 and 8.
(a) Phase plane plot, showing convergence to the limit cycle, from all initial conditions. (b) Period 13 limit cycle, located below the green dotted equal market share line.

α-Gap in Favor of Firm 1 and λ-Gap in Favor of Firm 2
With the α-gap in favor of firm 1, and the λ-gap in favor of firm 2, the relative sizes of the gaps could be traded off, resulting in a situation in which both firms have roughly equal market shares. Such a choice of parameters, resulting in a slight advantage for firm 2, is shown in Figs. 9 and 10 (for the phase plane plot and limit cycles) and Figs. 11 and 12 (for the time and advertising effort plots).
(a) Phase plane plot, showing convergence to the limit cycle, from all initial conditions. (b) Period 14 limit cycle, located around the green dotted equal market share line, mostly above it.
Fig. 13 (a) Phase plane plot for initial conditions above E (the green dotted equal market share line), for the VWD duopoly game, with linear index $J_i$, following the Jacobi procedure, and with α-gap in favor of firm 1 and λ-gap in favor of firm 2. (b) Phase plane plot for initial conditions below E. The equal market share line E is a separatrix, in the sense that trajectories with initial conditions above (resp. below) it remain above (resp. below) it and never cross E.
For this particular choice of parameters, the Jacobi procedure can lead to one of two limit cycles, depending on the location of the initial condition. This can be seen clearly in the phase plane plots in Fig. 13. The plot on the left shows initial conditions originating above the equal market share line E ($x_1 = x_2$), and it can be seen that trajectories emanating from initial conditions above E appear to converge to a limit cycle above it, while the plot on the right shows that those emanating from initial conditions below E converge to a limit cycle below it. This is confirmed in the plots shown in Fig. 10.

Numerical Examples of VWD Duopoly Games for Quadratic Performance Indices
This section presents some results for VWD duopoly games using a quadratic index. Simulations are shown only for the case of zero α- and λ-gaps in Figs. 14, 15, 16 and 17. The new feature, with respect to the case of the linear index, is the appearance of multiple equilibrium points as well as limit cycles. The initial market shares determine the outcome of the game, which is either an equilibrium or a limit cycle. The equilibrium points lie on a straight line (shown as a dashed dark blue line in the figure), which is parallel to the saturated market line $x_1 + x_2 = 1$. The limit cycles are also to be found in a neighborhood of this line. Parameters $b_1$ and $b_2$ are both equal to 1.

Summary of Observations from Numerical Experiments on the VWD Model with No Targets Stipulated
Using the terminology of λ-gap and α-gap, the behavior of the Vidale-Wolfe-Deal dynamics under one-step-ahead optimal control with linear and quadratic performance indices, and using the Gauss-Seidel and Jacobi procedures, can be summarized as follows:

Linear Performance Index
1. With zero gaps, the Gauss-Seidel procedure leads to a stable limit cycle, with the market shares of each firm oscillating between the same upper and lower bounds, around the same mean value, but out of phase by one period. With the Jacobi procedure, the oscillations are in phase, between the same upper and lower bounds and around the same mean value.
2. If both the α- and λ-gaps are in favor of the same firm and are not too large, both the Gauss-Seidel and Jacobi dynamics lead to stable limit cycles, in which the favored firm's market share oscillates around a larger mean value than its competitor's. The gap between the mean values of the market shares is larger for the Gauss-Seidel procedure than for the Jacobi procedure.
3. If the α- and λ-gaps are not aligned, i.e., one gap favors firm i and the other favors firm j, then, as long as the gaps are small and roughly of the same size, both procedures once again lead to stable limit cycles, although there may exist more than one limit cycle in the Jacobi case.
4. Keeping one gap, say the λ-gap, fixed in favor of firm 2 and increasing the α-gap in favor of firm 1, there exists a threshold above which the market share of firm 2 converges to zero, and vice versa. This threshold is larger for the Jacobi procedure than for the Gauss-Seidel procedure.
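The difference between the two playing procedures can be illustrated with a minimal sketch. It assumes a VWD-type update in which each firm retains a fraction (1 − λ i ) of its share and gains α i u i times the untapped market 1 − x 1 − x 2 ; this form, the function names and the numerical values are assumptions made for illustration and are not taken from equations (4), (5). The advertising efforts u 1 , u 2 are passed in as fixed numbers, so the sketch shows only the ordering of the updates, not the one-step-ahead optimization itself.

```python
def vwd_step(x_own, x_rival, u, alpha, lam):
    """Assumed VWD-type update: retained customers plus advertising gains
    drawn from the untapped market 1 - x_own - x_rival."""
    return (1.0 - lam) * x_own + alpha * u * (1.0 - x_own - x_rival)


def jacobi_round(x1, x2, u1, u2, alpha, lam):
    """Parallel play: both firms update using the shares observed at step k."""
    return (vwd_step(x1, x2, u1, alpha[0], lam[0]),
            vwd_step(x2, x1, u2, alpha[1], lam[1]))


def gauss_seidel_round(x1, x2, u1, u2, alpha, lam):
    """Sequential play: firm 1 moves first, then firm 2 uses firm 1's new share."""
    x1_next = vwd_step(x1, x2, u1, alpha[0], lam[0])
    x2_next = vwd_step(x2, x1_next, u2, alpha[1], lam[1])
    return x1_next, x2_next


# Zero alpha- and lambda-gaps; the values themselves are illustrative only.
alpha, lam = (0.5, 0.5), (0.1, 0.1)
print(jacobi_round(0.2, 0.2, 1.0, 1.0, alpha, lam))
print(gauss_seidel_round(0.2, 0.2, 1.0, 1.0, alpha, lam))
```

With identical parameters and identical efforts, the parallel round treats the two firms symmetrically, whereas in the sequential round the second mover finds a smaller untapped market and so gains less in that step.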

Quadratic Performance Index
1. With zero α- and λ-gaps, when both firms use the quadratic performance index J q i , under both the Gauss-Seidel and Jacobi procedures, there are no longer any global equilibrium points or limit cycles. The new phenomenon is that trajectories starting from different initial conditions may converge to different equilibria. However, all these equilibria lie on a straight line parallel to, and below, the saturated market line (x 1 + x 2 = 1).
2. With any mismatch between the α and λ parameters, the trajectories display the winner take all property, where the winner is determined by a relative advantage in the α-gap, the λ-gap, or both.
3. The advertising effort (control) is smoother than in the case of a linear performance index. As a result, the evolution of the market shares is also smoother and does not present large oscillations.

Numerical Experiments on the ABK, DST and LS Models with No Targets Stipulated
This section shows the results of numerical experiments on the dynamic games based on the ABK, DST and LS models when no targets are specified. For brevity, only a few illustrative phase plane plots are shown. In all subsequent phase plane plots, the (thin) arrows, above the saturated market line x 1 + x 2 = 1 and parallel to it, indicate the direction of movement of the trajectories along the saturated market line, toward the equilibrium point, indicated by a black dot.

Summary of Observations from Numerical Experiments on the ABK, DST and LS Models with No Targets Stipulated
All three dynamic games (ABK, DST and LS) differ from the VWD game in presenting unique equilibrium points that are consistent with the advantages conferred by the parameter gaps, [...] follower, the leader still ends up with the full market share and the follower with zero market share (Fig. 21).
Phase plane plot for the DST duopoly game, with no targets and linear index J i , with zero α-, λ- and b i -gaps, but with e i -gaps favoring firm 1, following the Gauss-Seidel procedure. (a) Player 1 is the leader (i.e., makes the first move) and player 2 is the follower. Note the first mover advantage, despite the e i -gap disadvantage. (b) Player 2 is the leader and player 1 is the follower. Note the first mover advantage, despite the e i -gap disadvantage.

Dynamic Games with Target Market Shares Specified
This section shows the results of applying one-step-ahead optimal control in dynamic games for each of the four duopoly models. For brevity, we show only phase plane plots illustrating competition in the case where the target market shares of the firms sum to an unattainable value greater than one. Figures 22, 23, 24 and 25 show that, when targets are specified, with the corresponding quadratic indices J r i , the dynamic game iterative procedure leads to an equilibrium point for all four models. For the ABK, DST and LS games, this equilibrium point is always located on the saturated market line x 1 + x 2 = 1 and its position is determined by the interaction of the dynamics with the modified (discrete-time) Nagurney-Zhang projection, as well as by the parameter values (Figs. 23, 24, 25). For the VWD game, the equilibrium point always lies below the saturated market line (i.e., x 1,eq + x 2,eq < 1) (Fig. 22).
Finally, when the target market shares sum to less than one, for all four models under one-step-ahead optimal control, convergence to the equilibrium (defined by the target market shares) occurs. These examples are not shown here for lack of space.

Conclusions
This paper implements one-step-ahead optimal control of nonlinear dynamic advertising games, viewed as Jacobi or Gauss-Seidel iterative procedures, following the pioneering work of Başar [2] for quadratic games. The steps involving computation of optimal controls are written as equivalent mathematical programs. Whenever necessary, a Nagurney-Zhang projection procedure, modified in this paper in order to apply to discrete-time dynamic games, is applied to ensure invariance of the market share phase space.
Broadly speaking, without specified targets, for the same configuration of parameters (α-, λ- and other parameter gaps) and with linear performance indices, both procedures lead to similar results. For the VWD dynamic game, however, the Gauss-Seidel procedure tends to lead to a unique stable limit cycle of high period, while the Jacobi procedure leads to different limit cycles, depending on the initial conditions, but generally with smaller periods than those obtained with the Gauss-Seidel procedure. The ABK, LS and DST (Jacobi) dynamic games show convergence to a unique equilibrium point, determined by the parameters, and do not display limit cycle behavior. The DST Gauss-Seidel game, interpreted as a leader-follower game in which the player who moves first is the leader, displays winner take all behavior, i.e., the leader is always the winner, ending up with market share equal to 1.
When target market shares are specified in the (quadratic) performance indices of each firm, the result is convergence to an equilibrium, for all the models considered. Furthermore, all four models show similar behavior under one-step-ahead optimal control with target market shares specified, despite the different information requirements shown in Table 1. These observations provide useful guidelines on the use of each model, depending on the observed market share behavior which it is desired to model and eventually control.

Author Contributions
AB and EK wrote the main manuscript text and AB prepared all simulations and figures. Both authors reviewed the manuscript.

Funding
The authors have not disclosed any funding.

Conflicts of interest
The authors declare no competing interests.

Appendix: Projected Dynamical Systems in Discrete Time
Observe that, in any market share dynamics model, the market share fractions x i must be nonnegative and sum to at most one. Let T denote the closed convex set (a triangle in the positive quadrant of the x 1 -x 2 plane) defined as follows:
$$T = \{(x_1, x_2) : x_1 \ge 0,\ x_2 \ge 0,\ x_1 + x_2 \le 1\}.$$
In other words, for the market share models considered in this paper, the complete set of possible states is actually T , also referred to as the market share dynamics phase space, which must remain invariant under any market share dynamics. Let x denote (x 1 , x 2 ), u denote (u 1 , u 2 ) and f (x, u) denote (x 1 (k + 1), x 2 (k + 1)), i.e., the vector of right-hand sides of (4), (5). In order to ensure the invariance of the triangle T , it is necessary to introduce the notion of projected dynamical systems, which were defined for continuous-time systems in [17].
In the discrete-time case, a modification of the projection mechanism is needed and is defined in Fig. 26. The essential difference with respect to the continuous-time case is that there will not be, in general, an exact instant at which a trajectory, which is in the interior of the triangle T , crosses the border of T , creating the need to project onto the border of T . In the discrete-time case, the situation depicted in Fig. 26 (left panel) may arise, namely, that the point A = (x 1 (k), x 2 (k)) is in the interior of T and the next point determined by the dynamics, $B = (x_1^+(k), x_2^+(k)) := (x_1(k+1), x_2(k+1))$, is outside T . In analogy with the continuous-time case [17], it is necessary to define the projected point in such a way that it lies on the frontier of T . In order to do this, referring once again to the left panel of Fig. 26, project the segment CB, where C is the intersection of AB with the line E 1 E 2 , orthogonally onto the line E 1 E 2 , since this is the line "crossed" by the trajectory when it leaves the triangle T . The right panel shows some other cases that might arise and the resulting projections.
Fig. 26 Detailing the modification of the Nagurney-Zhang projection for the discrete-time market share duopoly dynamics model
Let $\Pi_T(x, f(x, u))$ denote the projection with respect to T just defined, noting that $\Pi_T$ takes as arguments two points: the current point, x, and the next point, f (x, u). Given this definition of the projection operator, the projected dynamical system corresponding to the original system (4), (5), i.e., the discrete-time market share dynamics, is defined more precisely as follows:
$$x(k+1) = \Pi_T\big(x(k), f(x(k), u(k))\big). \tag{30}$$
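The projection just described can be sketched in a few lines of code. This is a minimal illustration, not the authors' implementation: the function name is hypothetical, the crossed edge is taken to be the first constraint of T violated along the segment from the current point to the next point, and the final clamping to the edge segment (so that the result stays inside T near the corners) is an assumption standing in for the additional cases shown in the right panel of Fig. 26.

```python
def project_onto_triangle(a, b):
    """Return b if it lies in T; otherwise project b orthogonally onto the
    edge line first crossed by the segment from a (assumed in T) to b."""
    # T is described by g_j(x) <= 0 for j = 0, 1, 2.
    constraints = (
        lambda x: -x[0],               # x1 >= 0
        lambda x: -x[1],               # x2 >= 0
        lambda x: x[0] + x[1] - 1.0,   # x1 + x2 <= 1
    )
    crossed, t_first = None, float("inf")
    for j, g in enumerate(constraints):
        ga, gb = g(a), g(b)
        if gb > 0.0:                   # b violates constraint j
            t = ga / (ga - gb)         # g_j is linear: it vanishes at a + t*(b - a)
            if t < t_first:
                crossed, t_first = j, t
    if crossed is None:
        return b                       # b is already in T, no projection needed

    def clamp(s):
        return min(1.0, max(0.0, s))   # assumption: keep the result on the edge segment

    if crossed == 0:                   # edge x1 = 0
        return (0.0, clamp(b[1]))
    if crossed == 1:                   # edge x2 = 0
        return (clamp(b[0]), 0.0)
    s = clamp((b[0] - b[1] + 1.0) / 2.0)   # orthogonal projection onto x1 + x2 = 1
    return (s, 1.0 - s)


# A is inside T, B overshoots the saturated market line.
print(project_onto_triangle((0.4, 0.4), (0.7, 0.5)))
```

Taking both the current and the next point as arguments mirrors the two-argument form of the projection in (30): the current point is what determines which edge is regarded as crossed.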