Event-triggered-based cooperative game optimal tracking control for modular robot manipulator with constrained input

This paper introduces a cooperative game optimal tracking control method based on an event-triggered mechanism for modular robot manipulator (MRM) systems with constrained input. Using the joint torque feedback technique, the dynamic model of the constrained input subsystem is established and the global state space equation is derived. The control inputs of the n joints of the MRM system with constrained input are taken as the n participants of a cooperative game, and the tracking control problem of the manipulator system is transformed into an optimal control problem based on the cooperative game. Next, a fusion function containing position and velocity errors is defined to construct the performance index function. To improve the control performance and robustness of the manipulator system, the known part of the model information is used to design the controller, the model uncertainty is handled by a neural network (NN) observer, and an optimal compensation control strategy is used to deal with internal disturbances such as sensor measurement error and transmission ripple due to power fluctuations, electromagnetic effects, noise and vibration. Based on the adaptive dynamic programming algorithm and the event-triggered mechanism, the optimal tracking control strategy is obtained by approximately solving the event-triggered Hamilton-Jacobi-Bellman equation with a critic NN. Lyapunov theory is used to prove that the trajectory tracking error of the MRM system with constrained input is uniformly ultimately bounded. Finally, experimental results demonstrate the effectiveness of the proposed control method.


Introduction
Modular robot manipulators (MRMs) [2,3] are composed of modules with standard interfaces. They offer high flexibility and wide adaptability and have attracted great attention from researchers in the field of robotics. Compared with a traditional robot, which has a single configuration and cannot be recombined arbitrarily, an MRM can have modules added or removed to accomplish different tasks, such as deep-sea exploration [4], disaster rescue [5] and extreme environment operations [6]. Constrained input [7] is a common phenomenon in practical engineering systems due to physical limitations. For example, the output torque and speed of a motor have maximum values, and a valve opening cannot be made arbitrarily large. Such constraints are often accompanied by degraded system performance and even loss of stability. Since each module of an MRM system contains a DC motor, research on constrained input MRM systems is of practical significance. For a class of constrained input nonlinear systems, an anti-saturation adaptive backstepping controller based on a wavelet network is designed in [8]. In [9], a time-varying sliding mode control scheme is presented for the control constraint problem of uncertain nonlinear systems. A backstepping controller is designed in [10] by using the hyperbolic tangent function to handle the saturation function. In [11], the input constraint of the controller is treated as a constraint on the objective function to be optimized. However, the methods mentioned above ignore the comprehensive optimization of control performance and energy consumption. To solve this problem, optimal control is introduced into the controller design.
Optimal control theory includes two important branches: dynamic programming [12] and the maximum principle [13]. Because dynamic programming suffers from the curse of dimensionality [14], Werbos et al. [15] first proposed using neural network (NN) adaptive approximation to solve the Bellman equation, which laid the foundation for NN approximators in adaptive dynamic programming (ADP). In the 21st century, the ADP method has been widely recognized and developed in theory. Lewis et al. [16] proved convergence to the solution of the Hamilton-Jacobi-Bellman (HJB) equation for the heuristic dynamic programming (HDP) algorithm based on value iteration. Jiang et al. [17] developed a value iteration algorithm for continuous-time systems with completely unknown dynamics and gave a rigorous convergence analysis of the algorithm. Liu et al. [18] used the policy iteration ADP algorithm to solve the optimal control problem of a nonlinear system and analyzed its convergence. Cheng et al. [19] transformed the adaptive cruise control problem into the optimal tracking control problem of a complex nonlinear system and then designed the adaptive cruise controller by using ADP and experience replay. Wei et al. [20] developed a distributed policy iteration algorithm that updates the control strategy of only one controller at a time, effectively reducing the computational burden of each iteration. Shi et al. [21] presented a data-based controller design method, built on ADP, for the output feedback optimal control problem of continuous-time systems with dynamic uncertainty, which improved the robustness of the system.
With the continuous exploration of researchers, ADP has not only been applied in theory to design controllers for linear/nonlinear systems with input/output constraints [22,23], external disturbances [24] and mismatched interconnections [25], but has also been applied to practical systems such as manipulator systems [26], marine vessel systems [27], power systems [28] and networked control systems [29].
For a common MRM system, if the control input of each joint module is considered as a decision-maker, the MRM system can be regarded as an interactive system with multi-player dynamic decision-making. The differential game, which combines optimal control and game theory, is an ideal tool for multi-player decision problems, and the ADP-oriented differential game in particular has produced numerous research achievements [30,31]. According to the cooperative or competitive relationship among participants, game problems can be mainly divided into zero-sum differential games [32], nonzero-sum differential games [33] and fully cooperative differential games [34]. The participants in a zero-sum game are in complete competition. In the two-player zero-sum game, one player chooses a strategy to maximize the value function while the other chooses a strategy to minimize it, which is similar to the design of an $H_\infty$ optimal controller [35]. Jiang et al. [36] used the ADP algorithm to overcome the difficulty of finding an accurate analytic solution of the Hamilton-Jacobi-Isaacs (HJI) equation for the multi-player zero-sum differential game problem and adopted a single network to reduce the structural complexity. Dong et al. [37] developed a zero-sum differential game based on ADP for the optimal tracking problem of MRM systems with uncertain disturbance and solved for the optimal control strategy by using a single critic NN. In a nonzero-sum differential game, there is both competition and cooperation among players, and each player selects a strategy to minimize its own performance index function. For the nonzero-sum game problem of multi-player unknown nonlinear systems, [38] designed an actor-critic structure to approximate the optimal solution by using a data-based ADP algorithm. In [39], the optimal control of a nonlinear interconnected system was transformed into a nonzero-sum differential game problem, and the optimal solution of the Hamilton-Jacobi (HJ) equation was obtained by using the proposed distributed ADP algorithm. Zhao et al. [40] proposed a reinforcement learning (RL) method to solve the HJ equation for the optimal control problem of nonlinear systems formulated as a nonzero-sum game. Ma et al. [41] transformed the optimal tracking problem into a nonzero-sum game problem and introduced a compensator-critic structure to obtain the optimal solution for an MRM system in an uncertain environment. In a cooperative game, there is only a cooperative relationship among players, and each player chooses a strategy to minimize the value function of the team as a whole. Zhang et al. [42] presented a data-driven ADP algorithm to study the cooperative game problem with constrained input. Li et al. [43] proposed a new adaptive Q-learning algorithm to solve the cooperative linear quadratic dynamic game. Mu et al. [1] transformed the cooperative differential game problem into an optimal control problem by defining a global performance index function and applied the cooperative game to a power system.
It is worth noting that most of the above works update the designed controller with a time-triggered method that uses a fixed sampling period. Owing to their computational efficiency, event-triggered control methods have attracted extensive academic attention in recent years. Given the unnecessary computational burden of the MRM system and the waste of communication resources caused by periodic sampling during stable operation, the event-triggered mechanism needs to be introduced into the design of the MRM system controller. The robust control of nonlinear systems was investigated under an event-triggered mechanism in [44], which not only provides a certain robustness but also saves communication resources between the controller and the controlled plant. Xue et al. [45] converted the constrained tracking control problem into an optimal regulation problem through an integral reinforcement learning algorithm and then applied the event-triggered mechanism to reduce the data transmission load. Mu et al. [46] developed a modified event-triggered optimal control scheme by combining the global dual heuristic dynamic programming (GDHP) algorithm with the event-triggered mechanism to handle the constrained input problem of nonlinear systems. Zhu et al. [47] proposed an event-triggered control method for constrained input continuous-time nonlinear systems and applied it to an overhead crane system. Based on the ADP algorithm and the event-triggered mechanism, [48] and [49] considered the zero-sum and nonzero-sum differential games of continuous-time nonlinear systems, respectively, and solved their optimal control problems. However, the optimal control scheme of the cooperative game based on the event-triggered mechanism has rarely been studied.
Inspired by the above literature, this paper proposes an event-triggered cooperative game optimal tracking control method for MRM systems with constrained input. Different from previous optimal control methods [47,50], this paper focuses on the cooperative game and fully exploits the advantages of game theory in dealing with multi-player dynamic decision-making. Taking the control input of each joint module of the MRM system with constrained input as a decision-maker, the system can be regarded as a multi-player interactive system. Cooperative game theory accounts for the information exchange between the joints of the manipulator system, and the overall control performance and energy consumption of the MRM system are optimized through the cooperation of all joints. In this work, the cooperative game is applied to a real robot system. In addition, the event-triggered control method can greatly reduce the number of controller updates and effectively reduce the communication burden between the controller and the manipulator. The contributions of this paper are summarized as follows:
1. To the authors' knowledge, this is the first time that the cooperative game is introduced into a real robot system such as an MRM. Considering the information exchange among the joints of the MRM system with constrained input, the control inputs of the n joint modules are regarded as n participants in the cooperative game; the tracking control problem of the manipulator system can then be transformed into the optimal control problem of the cooperative game among the n participants.
2. Considering the unnecessary computational burden of the MRM system and the waste of communication resources caused by periodic sampling during stable operation, we introduce the event-triggered mechanism into the cooperative game optimal tracking control method. By solving the event-triggered HJB equation, the cooperative game optimal tracking control strategy with aperiodic sampling is obtained, which effectively reduces the number of updates of the manipulator system controller.
The rest of the article is arranged as follows: In Sect. 2, the dynamic model of the constrained input MRM system is established. The design of optimal tracking controller for cooperative game based on event-triggered mechanism is discussed in Sect. 3. The experimental demonstration and conclusion are in Sects. 4 and 5.

Problem statement and dynamic model
An n-degree-of-freedom (n-DOF) MRM system is composed of n motors and connecting rods in series. Due to physical limitations, the output torque of each motor has a maximum value. This saturation phenomenon degrades the control performance of the system and can even cause loss of stability. Therefore, considering a constrained input MRM system with n-DOF and following the joint torque feedback (JTF) modeling technique for manipulators [51], the dynamic model of the constrained input MRM system can be represented by a set of constrained input subsystems with interconnected dynamic coupling (IDC). The dynamic model of the ith constrained input subsystem is given by (1), where the subscript $i$ denotes the ith joint subsystem, $q_i$, $\dot{q}_i$ and $\ddot{q}_i$ are the ith joint position, velocity and acceleration, $\tau_{si}$ indicates the joint coupling torque, which can be measured by torque sensors, $\tau_i$ denotes the control torque, which is less than or equal to a known positive constant $\beta_i$, $I_{mi}$ refers to the moment of inertia of the rotor about its rotation axis, $\gamma_i$ represents the gear ratio, $f_i(q_i, \dot{q}_i)$ denotes the joint friction torque, $z_i(q, \dot{q}, \ddot{q})$ is the dynamic coupling torque among the subsystems and $d_i(q_i)$ is the internal disturbance torque. The properties of the friction term, the IDC term and the internal disturbance torque term in (1) are analyzed as follows.
(1) Joint friction. The joint friction torque $f_i(q_i, \dot{q}_i)$ can be considered as a function of joint position and velocity and is defined in (2), where $f_{bi}$ is the viscous friction coefficient, $f_{si}$ denotes the static friction-related parameter, $f_{\tau i}$ is a positive parameter related to the Stribeck effect, $f_{ci}$ indicates the Coulomb friction-related parameter, $f_{pi}(q_i, \dot{q}_i)$ is the position dependency of the friction term, and $\mathrm{sgn}(\dot{q}_i)$ stands for the sign function defined in (3). According to the linearization scheme [52], the friction term can be rewritten as (4), where $\hat{f}_{bi}$, $\hat{f}_{ci}$, $\hat{f}_{si}$ and $\hat{f}_{\tau i}$ are estimates of the friction parameters $f_{bi}$, $f_{ci}$, $f_{si}$ and $f_{\tau i}$, respectively, $\tilde{F}_i$ indicates the parameter uncertainty vector of the friction term, and $Y(\dot{q}_i)$ is defined in (5).
Property 1 It can be seen from (2) and (4) that $f_{bi}$, $f_{ci}$, $f_{si}$ and $f_{\tau i}$ and their estimated values are bounded; hence each element of the parameter uncertainty vector $\tilde{F}_i$ of the friction term is also bounded and satisfies $|\tilde{F}_{il}| \le \eta_{Fil}$, where $\eta_{Fil}$ is a given positive constant, the subscript $l$ represents the lth element of the vector $\tilde{F}_i$ and $l = 1, 2, 3, 4$. The approximation error of friction ...
Remark 1 Since $f_{bi}$, $f_{ci}$, $f_{si}$ and $f_{\tau i}$ are all physical parameters of practical significance in the bounded friction torque passively generated by motor rotation, the friction parameters and their estimated values are bounded. Further, we can learn from (2) and (4) that the friction parameter uncertainties $f_{bi} - \hat{f}_{bi}$, $f_{ci} - \hat{f}_{ci}$, $f_{si} - \hat{f}_{si}$ and $f_{\tau i} - \hat{f}_{\tau i}$ are also bounded.

Remark 2
The desired trajectories tracked in experiments and projects are given by users in practical applications. Therefore, it is feasible to assume that there is a usable positive constant for the uncertain parameter. The upper limit value of the uncertainty parameter can be easily determined according to the actual characteristics of the model uncertainty.
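As a rough illustration of the friction description in item (1), the following Python sketch evaluates a friction torque with viscous, Coulomb and Stribeck-type static terms. The functional form used here, f_b·q̇ + (f_c + f_s·exp(−f_τ·q̇²))·sgn(q̇), is an assumption inferred from the parameter descriptions; the position-dependent term is omitted, and the function names and all numerical values are hypothetical.

```python
import math


def sgn(v: float) -> int:
    """Sign function: 1 for v > 0, 0 for v == 0, -1 for v < 0."""
    return (v > 0) - (v < 0)


def joint_friction(qd: float, f_b: float, f_c: float,
                   f_s: float, f_tau: float) -> float:
    """Friction torque for joint velocity qd (assumed model form).

    f_b   : viscous friction coefficient
    f_c   : Coulomb friction-related parameter
    f_s   : static friction-related parameter
    f_tau : positive parameter related to the Stribeck effect
    The position-dependent term f_p(q, qd) is not modeled here.
    """
    stribeck = f_s * math.exp(-f_tau * qd ** 2)
    return f_b * qd + (f_c + stribeck) * sgn(qd)


if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    for qd in (-0.5, -0.01, 0.0, 0.01, 0.5):
        print(qd, joint_friction(qd, f_b=0.1, f_c=0.2, f_s=0.3, f_tau=50.0))
```

Near zero velocity the static term dominates, and at higher speeds the exponential decays so that only the viscous and Coulomb contributions remain.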
(2) Interconnected dynamic coupling. The IDC term can be considered as a complex nonlinear function given by (6), where $z_{mi}$ is the unit vector along the axis of rotation of the ith rotor, and $z_{lk}$ and $z_{lc}$ denote the unit vectors along the axes of rotation of the kth and cth joints.
Redefine (6) to the following form: where We also know that if the ith and where η zi and η zi are known positive constants.
(3) Internal disturbance torque. $d_i(q_i)$ stands for the internal disturbance torque, which is mainly caused by torque sensor measurement deviation arising from power fluctuations, electromagnetic effects, noise and vibration.
(4) State space formulation. According to the dynamic model (1) of the ith constrained input joint subsystem, we define the state variable $x_i = [x_{i1}, x_{i2}]^T = [q_i, \dot{q}_i]^T$. Then, the state space equation of the ith constrained input subsystem can be written as (8), in which one term collects the known and measurable part of the subsystem dynamic model, and the control input refers to the control torque, which is less than or equal to a known positive constant $\beta_i$.
Next, a cooperative game optimal tracking control method is proposed by transforming the tracking control problem of manipulator system into cooperative game problem. On the basis of the above control method, the event-triggered mechanism is applied to change the traditional periodic sampling method into non-periodic sampling, so as to reduce the computational burden while ensuring the control performance.

Problem transformation
Since the saturation function is not differentiable, we resort to the hyperbolic tangent function $\tanh(\cdot)$, which has good mathematical properties such as continuous differentiability and boundedness, to construct a smooth saturation function. Instead of directly employing the hyperbolic tangent function itself, the saturation function is obtained by scaling its independent variable and its amplitude by a certain value. The resulting saturation function satisfies the requirement that the controller gain is bounded and the rate of change is adjustable. Referring to [47,53], the saturation function of this paper is defined in (9), where $\beta > 0$ is the amplitude coefficient factor of the function, and the critical value of the function is affected by the magnitude of $\beta$.
For a constrained input MRM system with n-DOF, the trajectory tracking control problem of the manipulator system can be transformed into an optimal control problem based on a cooperative game when the control inputs of the n joint modules of the manipulator system are regarded as the n participants in the cooperative game. The state space formulation of the augmented subsystem with n constrained participants is given by (10).
In order to achieve the optimal trajectory tracking control task, we define a fusion error function $s(x) = x_2 - x_{2d} + \alpha_s (x_1 - x_{1d})$ that includes the joint position and joint velocity errors. According to the augmented state space description (10), the continuously differentiable infinite horizon global performance index function is defined in (11), where $x_{1d}$, $x_{2d}$ and $\dot{x}_{2d}$ refer to the desired position, velocity and acceleration of the joint, respectively, $Q$ is a symmetric positive definite matrix, and $h_m$ is a function vector such that the value of $h_{im}$ is greater than or equal to $h_i$ at any time. Obviously, the internal disturbance term $h(x)$ is bounded by the internal disturbance upper bound function vector $h_m$. $U_i(u_i)$ is a positive definite nonquadratic function that handles the constrained input, and $\Upsilon_i(\Omega)$ is a set of admissible control strategies.
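The two building blocks described above, a smooth saturation built from tanh and the fusion error s(x), can be sketched in Python as follows. The scaled form β·tanh(v/β) is an assumption (a common choice in the constrained-input literature, matching the description of scaling both the argument and the amplitude); the function names and numbers are illustrative only.

```python
import math


def smooth_sat(v: float, beta: float) -> float:
    """Smooth saturation: approximately v for small |v|, bounded by beta.

    Assumed form beta * tanh(v / beta); beta > 0 is the amplitude factor.
    """
    return beta * math.tanh(v / beta)


def fusion_error(x1: float, x2: float, x1d: float, x2d: float,
                 alpha_s: float) -> float:
    """Fusion error s(x) = (x2 - x2d) + alpha_s * (x1 - x1d).

    x1, x2   : joint position and velocity
    x1d, x2d : desired joint position and velocity
    alpha_s  : weighting of the position error
    """
    return (x2 - x2d) + alpha_s * (x1 - x1d)


if __name__ == "__main__":
    beta = 2.0  # hypothetical torque bound
    for v in (0.01, 1.0, 5.0, 100.0):
        print(v, smooth_sat(v, beta))
    print(fusion_error(x1=1.0, x2=0.5, x1d=0.8, x2d=0.5, alpha_s=2.0))
```

Driving s(x) to zero leaves the first-order error dynamics ė = −α_s·e for the position error e = x1 − x1d, which is why a single fused signal can stand in for both position and velocity tracking.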
Definition 1 (Admissible control) [54]. For all $i = 1, 2, \cdots, n$, a feedback control law $u_i$ is said to be admissible with respect to the performance index function (11) on the compact set $\Omega$, denoted by $u_i \in \Upsilon_i(\Omega)$, if $u_i$ is continuous on $\Omega$ with $u_i(0) = 0$, $u_i$ stabilizes the system (10) on $\Omega$, and the performance index function $J(s, u_1, \cdots, u_n)$ in (11) is finite.

Remark 3
The participants of the cooperative game problem are the control inputs of the joints of the MRM system, and the optimization target is the overall performance and energy consumption, which involves all information of the system. Hence, the performance index function designed on the basis of the cooperative game includes the state of the whole system.

Remark 4
The upper bound of the internal disturbance term $h_i$ is $\eta_{him}$, whose existence ensures the stable operation of the MRM system when it performs feasible tasks. If the internal disturbance does not possess a finite bound, the trajectory tracking error of the manipulator system cannot be guaranteed to be uniformly ultimately bounded (UUB).

Remark 5
The function $h_{im}$ has an upper bound $\eta_{him}$, and the system is stable and completely controllable. A fully controllable MRM system does not have uncontrollable or unstable state variables.
The positive definite nonquadratic function $U_i(u_i)$ is expressed in (12), where $R_i \in \mathbb{R}^{1 \times 1}$ is a positive definite matrix. By integrating $U_i(u_i)$, we obtain the closed-form expression (13).
Remark 6 To satisfy the requirements of a differentiable saturation function, bounded controller gain and adjustable rate of change, the saturation function (9) is introduced. To ensure a differentiable performance index function and the comprehensive optimization of the control performance and energy consumption of the constrained input MRM system, we introduce the positive definite quadratic form of the error function and the positive definite nonquadratic form of the control torque [47,53] into the integrand of the performance index.
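To make the nonquadratic control cost concrete, the sketch below compares a numerical integral with a closed form. It assumes the form that is standard in the constrained-input optimal control literature, U(u) = 2∫₀ᵘ β·tanh⁻¹(v/β)·R dv, whose integral is 2βRu·tanh⁻¹(u/β) + β²R·ln(1 − u²/β²); whether this matches the paper's (12) and (13) exactly cannot be confirmed from the text, and the numerical values are hypothetical.

```python
import math


def U_closed(u: float, beta: float, R: float) -> float:
    """Closed form of the assumed nonquadratic cost, valid for |u| < beta."""
    return (2.0 * beta * R * u * math.atanh(u / beta)
            + beta ** 2 * R * math.log(1.0 - (u / beta) ** 2))


def U_numeric(u: float, beta: float, R: float, steps: int = 10000) -> float:
    """Midpoint-rule approximation of 2 * int_0^u beta * atanh(v/beta) * R dv."""
    h = u / steps
    total = 0.0
    for k in range(steps):
        v = (k + 0.5) * h  # midpoint of the k-th sub-interval
        total += 2.0 * beta * R * math.atanh(v / beta)
    return total * h


if __name__ == "__main__":
    beta, R = 2.0, 0.5  # hypothetical bound and weight
    for u in (-1.5, -0.5, 0.5, 1.5):
        print(u, U_closed(u, beta, R), U_numeric(u, beta, R))
```

The cost is zero at u = 0, positive elsewhere, and grows steeply as |u| approaches β, which is how the input bound is encoded in the performance index.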
On the basis of Definition 1 and the performance index function (11), we obtain (14). The Hamiltonian function (15) can then be defined based on (14). For the cooperative game problem of the constrained input MRM system with n joints, we need to find a set of solutions $\{u_1^*, \cdots, u_i^*, \cdots, u_n^*\}$ that satisfy the Pareto equilibrium conditions (16), where $i, w = 1, 2, \cdots, n$ and $i \ne w$. According to optimal control theory, $u_1^*, \cdots, u_i^*, \cdots, u_n^*$ satisfy the stationary conditions (17). Based on (17), the optimal control law with constrained input is given by (18). Minimizing the positive definite quadratic form of the fused joint position and velocity error ensures the trajectory tracking performance of the manipulator system, and optimizing the control torque ensures the minimum energy consumption of the system. At this point, changing the control strategy of any single participant cannot further reduce the overall cost of the manipulator system, which is called the Pareto equilibrium. By substituting the constrained input optimal control law (18) into the Hamiltonian function (15), the time-triggered HJB equation (19) is obtained.
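The step from the stationary condition to a bounded control law can be checked numerically. The sketch below assumes a scalar joint with input gain g, the nonquadratic cost derivative dU/du = 2βR·tanh⁻¹(u/β), and the standard constrained-input law u* = −β·tanh(g·∇J/(2βR)); these forms and all names are assumptions for illustration and may differ in detail from the paper's (17) and (18).

```python
import math


def constrained_optimal_input(grad_J: float, g: float, R: float,
                              beta: float) -> float:
    """Assumed bounded optimal input: u* = -beta * tanh(g * grad_J / (2 beta R))."""
    return -beta * math.tanh(g * grad_J / (2.0 * beta * R))


def dU_du(u: float, R: float, beta: float) -> float:
    """Derivative of the assumed nonquadratic cost; requires |u| < beta."""
    return 2.0 * beta * R * math.atanh(u / beta)


def stationarity_residual(grad_J: float, g: float, R: float,
                          beta: float) -> float:
    """Residual of d/du [U(u) + grad_J * g * u] evaluated at u*.

    A value close to zero means u* satisfies the stationary condition.
    """
    u_star = constrained_optimal_input(grad_J, g, R, beta)
    return dU_du(u_star, R, beta) + grad_J * g


if __name__ == "__main__":
    # Hypothetical values: cost gradient, input gain, weight, torque bound.
    print(constrained_optimal_input(3.0, 1.5, 0.5, 2.0))
    print(stationarity_residual(3.0, 1.5, 0.5, 2.0))
```

For small gradients the law reduces to the familiar unconstrained form −g·∇J/(2R), while for large gradients it saturates smoothly at ±β.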

NN observer design
In order to improve the robustness and control precision of the system, the known model information is used to design the controller. Redefining $N_i^*(s) = -N_{i1}(s) + N_{i2}^*(s)$, the constrained input control law is given by (20), in which one term is designed for the known part of the model information, $N_{i1}(s)$ compensates the model uncertainty and $N_{i2}^*(s)$ is employed as the optimal compensation for the internal disturbance term $d_i(q_i)$.
By substituting (20) into (19), the modified time-triggered HJB equation (21) is obtained. On the basis of (8), $N_{i1}$ can be designed as in (22). Next, we utilize the NN observer to estimate the model uncertainties in $N_{i1}$, such as the IDC term and the unmeasured friction term. $N_{i1}$ is reformulated in the form (23), where $W_i \in \mathbb{R}^{l_i}$ refers to the ideal weight vector, $\delta_i(x) \in \mathbb{R}^{l_i}$ is the radial basis function vector and $l_i$ is the number of hidden layer neurons. $\varepsilon_i(x)$ denotes the approximation residual of the NN, which is bounded and satisfies $\|\varepsilon_i(x)\| \le \varepsilon_{im}$, where $\varepsilon_{im}$ is a positive constant.
Since the ideal NN weight in (23) is unknown, the estimate (24) is obtained by replacing $W_i$ with its estimated value $\hat{W}_i$. On the basis of the state space equation (8) of the subsystem with constrained input and the estimate (24), the NN observer is designed as (25), where $\hat{x}_i = [\hat{x}_{i1}, \hat{x}_{i2}]^T$ is the observed value of the system state $x_i$ and $k_{io} = \mathrm{diag}(k_{io1}, k_{io2})$ is a diagonal matrix in which each element is a positive constant. The NN observer error $e_{io}$ and its dynamics are given by (26), where $\tilde{W}_i = W_i - \hat{W}_i$ signifies the weight approximation error, which is updated through (27). The compensation for the uncertain part $N_{i1}$ of the model is given by (28), from which the expression (29) for $N_{i1}(s)$ follows. By combining (20) and (29), the time-triggered optimal tracking control law (30) for an n-DOF constrained input MRM system is obtained.
Theorem 1 Consider the ith constrained input subsystem (1) of the MRM with model uncertainty (7), the overall state space equation (10), and the estimation error (26) of the designed NN observer (25). The global observation error of the NN observer is UUB under the weight update law (27).
Proof Take the Lyapunov function (31). Its derivative is given by (32). Therefore, the global observation error $e_{io}$ of the NN observer is UUB with $k_{io1} > \frac{1}{2}$ and $k_{io2} > 1$ if $e_{io2}$ is outside the compact set $\Omega_i = \left\{ e_{io2} : \|e_{io2}\| \le \sqrt{\dfrac{\varepsilon_{im}^2}{2(k_{io2} - 1)}} \right\}$.

Implementation of the critic NN
The optimal control law (18) is computed under periodic sampling, and the HJB equation (19) is derived under the time-triggered framework. A control method with a fixed sampling frequency increases the computational burden of the MRM system and wastes communication resources, which is not conducive to real-time control under limited network bandwidth. Therefore, the event-triggered mechanism is introduced into the controller design. The control input is updated only when the system violates the triggering condition; while the triggering condition is met, the control strategy keeps the value from the previous sampling instant, which effectively reduces the computational burden and saves communication resources. Suppose $\{t_j\}_{j=0}^{+\infty}$ is a monotonically increasing sequence of triggering instants, where $t_j$ satisfies $0 < t_j < t_{j+1}$ and $\lim_{j \to \infty} t_j = \infty$, $j \in \{0, 1, \cdots\}$. In event-triggered control, the system state is sampled at the non-periodic sampling times $t_0, \cdots, t_j, t_{j+1}, \cdots$. To obtain the triggering condition, the difference between the sampled state $s(x_j)$ and the actual state $s(x)$ is defined as the gap function $e_j(t)$ in (33), where the sampled state is held at $s(x_j) = s_j$ for $t \in [t_j, t_{j+1})$ and is reset to $s(x)$ at $t = t_{j+1}$. At the triggering instants, $e_j(t) = 0$. Based on (20), the optimal control law under the event-triggered mechanism is obtained as (34). The event-triggered Hamiltonian function is given by (35), and substituting (34) into (35) yields the event-triggered HJB equation (36).
Assumption 1 For any states $s$ and $s_j$, $N_i^*$ in (19), $N_{i1}$ in (29) and $u_i^*$ in (20) are all Lipschitz continuous [55], and there exist positive constants that satisfy the corresponding Lipschitz inequalities.
Remark 7 According to [56], the interval $T_j$ between two adjacent sampling instants satisfies an inequality involving $L$, where $L$ is the Lipschitz constant. If $L$ is unbounded, the number of triggered events increases without limit.
It can be seen from the experimental results that the number of triggered events is limited, so $L$ must be a bounded positive constant. Therefore, it is reasonable and feasible to assume in Assumption 1 that the controller is Lipschitz continuous with respect to $e_j(t)$.
Since the HJB equation (19) is a partial differential equation whose analytical solution is difficult to obtain, the universal approximation property of NNs is used to approximate the performance index function $J(s)$ and thereby solve the HJB equation. Under the event-triggered mechanism, we utilize an NN to approximate the performance index function $J(s_j)$ at the sampled state, and the optimal control strategy under the event-triggered mechanism is solved approximately. $J(s_j)$ can be reconstructed by an NN in the form (38), where $W_c \in \mathbb{R}^{l}$ refers to the ideal weight vector, $l$ is the number of hidden layer neurons, $\delta_c(s_j)$ is the activation function and $\varepsilon_c$ denotes the approximation residual. The gradient of (38) is given by (39). Based on the ADP framework of the "Problem transformation" section, the optimal compensation policy $N_{i2}^*(s_j)$ under the event-triggered mechanism is obtained as (40). Combining (34) and (39), the event-triggered optimal control law based on the critic NN is given by (41). According to (35) and (40), the Hamiltonian function is transformed into (42), where $e_{cH}$ is the approximation residual of the NN, which satisfies $|e_{cH}| \le e_{cHm}$, and $e_{cHm}$ is a positive constant.
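The event-triggered sampling idea described earlier in this subsection can be illustrated on a toy problem. The following Python sketch simulates a scalar plant ẋ = a·x + u with a held feedback input and a generic relative-threshold trigger |e_j| > σ|x| + ε. The plant, the gain and the trigger rule are all assumptions chosen for illustration; they are not the MRM model or the paper's triggering condition (51).

```python
def simulate_event_triggered(a: float = 1.0, gain: float = 3.0,
                             sigma: float = 0.2, eps: float = 1e-3,
                             dt: float = 1e-3, t_end: float = 5.0,
                             x0: float = 1.0):
    """Euler simulation of x' = a*x + u with u held between events.

    Returns (final state, number of control updates, number of time steps).
    """
    x = x0
    x_sampled = x0              # state stored at the last triggering instant
    u = -gain * x_sampled       # control computed from the sampled state
    events = 1                  # the initial sample counts as one update
    steps = int(t_end / dt)
    for _ in range(steps):
        gap = x_sampled - x     # gap between sampled and current state
        if abs(gap) > sigma * abs(x) + eps:
            # Triggering condition violated: resample and update the input.
            x_sampled = x
            u = -gain * x_sampled
            events += 1
        # Between events the input u is simply held (zero-order hold).
        x += dt * (a * x + u)
    return x, events, steps


if __name__ == "__main__":
    x_final, events, steps = simulate_event_triggered()
    print(f"final state {x_final:.5f}, {events} updates over {steps} steps")
```

A time-triggered controller would recompute the input at every step, whereas here the update count is driven by how fast the gap grows relative to the threshold. The small constant ε keeps the threshold positive near the origin, which is one simple way to avoid arbitrarily fast triggering.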
Since the ideal weight $W_c$ of the critic NN is unknown, the approximate weight $\hat{W}_c$ is employed to replace $W_c$, and $\hat{J}(s_j)$ is obtained as (43), where $\hat{W}_c \in \mathbb{R}^{l}$ is the estimated weight vector. Similarly, the gradient of (43) is given by (44). Similar to (40), the approximate optimal compensation policy $\hat{N}_{i2}(s_j)$ under the event-triggered mechanism is given by (45). The approximate event-triggered optimal control law based on the critic NN is then obtained as (46). Combining (35) and (45), the approximate Hamiltonian function is given by (47). To obtain the approximate weight vector $\hat{W}_c$, the objective function $E_c = \frac{1}{2} e_c^2$ is selected and the weight is tuned by the normalized steepest descent method [57,58]. $\hat{W}_c$ is updated by (48), where $\varphi = \nabla\delta_c(s_j)^T \dot{s}_j$ and $\tilde{W}_c = W_c - \hat{W}_c$ refers to the weight approximation error vector.
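A minimal sketch of normalized steepest descent on E_c = ½·e_c² is given below. It assumes the residual has the linear form e_c = Ŵ_cᵀφ + r, with r standing in for the remaining terms of the approximate Hamiltonian, and uses the commonly seen normalization (1 + φᵀφ)²; the paper's exact update law (48) is not reproduced in the text, so this form, the function name and the numbers are assumptions.

```python
def critic_step(W, phi, r, alpha_c, dt):
    """One Euler step of normalized gradient descent on 0.5 * e_c**2.

    W       : current weight estimate (list of floats)
    phi     : regressor vector (list of floats, same length as W)
    r       : remaining Hamiltonian terms, treated as a known scalar here
    alpha_c : learning rate
    dt      : integration step
    Returns (updated weights, residual e_c before the update).
    """
    e_c = sum(w * p for w, p in zip(W, phi)) + r
    norm = (1.0 + sum(p * p for p in phi)) ** 2
    W_new = [w - dt * alpha_c * p * e_c / norm for w, p in zip(W, phi)]
    return W_new, e_c


if __name__ == "__main__":
    W = [0.0, 0.0, 0.0]
    phi = [1.0, -0.5, 2.0]   # hypothetical regressor, held fixed
    residuals = []
    for _ in range(500):
        W, e_c = critic_step(W, phi, r=1.0, alpha_c=5.0, dt=0.01)
        residuals.append(abs(e_c))
    print(residuals[0], residuals[-1])
```

With a fixed regressor the residual contracts by a constant factor each step. In the actual controller φ changes with the sampled state, so convergence relies on the boundedness arguments of the stability analysis rather than on this simple contraction.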

Theorem 2 Consider the dynamic model (10) of the constrained input MRM system. If the critic NN weights are updated by (48), the weight approximation error vector $\tilde{W}_c$ of the critic NN is UUB.
Proof Take the Lyapunov function as (49). Taking the derivative of (49) with respect to time shows that, when $\alpha_c > \frac{1}{4}$ and $\tilde{W}_c$ lies outside a compact set, the derivative is negative. Hence $\tilde{W}_c$ is bounded, which effectively ensures that the weight parameters of the neural network converge to a bounded range in the learning adjustment stage.

Remark 9
In the critic NN training process, since the optimal control law $u_i^*(s_j)$ usually cannot be obtained, we use the approximate optimal control law $\hat{u}_i(s_j)$ instead [58], and it is also applied to the constrained input MRM system as the control torque given by (46). The structure diagram of the event-triggered cooperative game optimal tracking control method proposed in this article is given in Fig. 1.

Stability proof
For the MRM system with constrained input, we design the approximate optimal control law (46) under the event-triggered mechanism to accomplish the trajectory tracking task. The stable operation of the system is discussed here, and we give the following assumption and theorem to establish the stability of the closed-loop system.

Assumption 3
Let $\nabla\delta_{cm}$, $\nabla\varepsilon_{cm}$ and $\tilde{W}_{cm}$ all be positive constants [58] that upper-bound $\nabla\delta_c$, $\nabla\varepsilon_c$ and $\tilde{W}_c$, respectively.
Theorem 3 Consider an n-DOF MRM system with constrained input whose dynamic model is described by (1), with the model uncertainty given in (4) and (7), the internal disturbance term in (8), and the overall state space equation given by (10). The trajectory tracking error of the closed-loop system is UUB under the proposed event-triggered optimal control law (46) if the triggering condition (51) holds.
Proof The Lyapunov function is chosen as (52), where $J^*(s)$ refers to the optimal performance index function, with $J^*(s) > 0$ for any $s \ne 0$. Therefore, $J^*(s)$ is a positive definite function. In addition, we can see that $\dot{J}^*(s_j) = 0$. Under the event-triggered mechanism, the stability proof of the closed-loop system can be divided into the following two cases.
(b) The event is triggered, that is, $t = t_{j+1}$. The difference form of (52) is given by (61). We know from the proof of case (a) that $V_1(t)$ is non-increasing; therefore, the difference equation (61) can be reduced to a form in which $\kappa(\cdot)$ is a class-$\mathcal{K}$ function [58] and $e_j(t_{j+1}) = s_j(t_{j+1}) - s_j(t_j)$.
Combining case (a) with case (b), we can conclude that the MRM system with constrained input is stable in real time.

Zeno behavior analysis
For an event-triggered MRM system with constrained input, the minimum triggering interval $t_{\min} = \min_{j \in \mathbb{N}} \{t_{j+1} - t_j\}$ may be zero. In that case the system is triggered an infinite number of times in finite time, leading to an infinite number of controller updates; this is called Zeno behavior [60]. This phenomenon contradicts the original intention of applying the event-triggered mechanism to save resources and cannot be realized in an actual system, hence it should be avoided. Next, we prove that the minimum triggering interval $t_{\min}$ under event-triggered control has a nonzero lower bound, which effectively avoids Zeno behavior.

Assumption 4
Assume that the state space Eq. (10) of the MRM system with constrained input is Lipschitz continuous. Then $\dot{x}$ has an upper bound, and therefore $\|\dot{s}\| = \|\dot{x}_2 + \upsilon\| \le \|\dot{x}_2\| + \|\upsilon\| \le \|\dot{x}_2\| + \upsilon_m$ also has an upper bound, where $\upsilon_m$ denotes a positive constant. The upper bound of $\|\dot{s}\|$ can be defined as: where $\phi$, $k$ are positive constants, and $\phi$ is small.

Theorem 4
For an n-DOF MRM system with constrained input (1) whose state space is described by (10), the minimum triggering interval $t_{\min}$ has a positive lower bound if the triggering condition given by (51) is satisfied,

Proof The derivative of the gap function (33) is as follows: Combining (64) and (66), we have By the property of the event-triggered mechanism, $e_j(t_j) = 0$ at the triggering instant, and the solution of (67) is as follows: From (68), the triggering condition (51) of the jth sampling is satisfied From (68) and (69), we conclude that for any $t \in [t_j, t_{j+1})$, $j \in \mathbb{N}$, the minimum triggering interval $t_{\min}$ has a nonzero lower bound and satisfies the inequality given by (65).
The proof is completed.
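The shape of this argument can be illustrated with a short worked derivation. This is only a sketch under stated assumptions: the growth bound $\|\dot{s}\| \le k(\|s\| + \phi)$ and the threshold $e_T > 0$ are hypothetical stand-ins chosen for illustration, and they may differ from the bound in (64) and the triggering condition (51).

```latex
% Assumption (illustrative only): \|\dot{s}\| \le k(\|s\| + \phi), with k > 0 and \phi > 0.
% Gap on [t_j, t_{j+1}): e_j(t) = s(t) - s(t_j), so e_j(t_j) = 0 and \dot{e}_j = \dot{s}.
% e_T > 0 is a hypothetical level that \|e_j\| must reach before the next event fires.
\begin{align*}
\frac{d}{dt}\|e_j(t)\|
  &\le \|\dot{e}_j(t)\| = \|\dot{s}(t)\|
   \le k\bigl(\|s(t)\| + \phi\bigr)
   \le k\bigl(\|s(t_j)\| + \|e_j(t)\| + \phi\bigr), \\
\|e_j(t)\|
  &\le \bigl(\|s(t_j)\| + \phi\bigr)\bigl(e^{k(t - t_j)} - 1\bigr)
   \qquad \text{(comparison lemma with } e_j(t_j) = 0\text{)}, \\
t_{j+1} - t_j
  &\ge \frac{1}{k}\ln\!\left(1 + \frac{e_T}{\|s(t_j)\| + \phi}\right) > 0 .
\end{align*}
```

Under these assumptions the gap starts from zero and can grow at most exponentially, so it needs a strictly positive time to reach any positive threshold. The constant $\phi > 0$ keeps the bound well defined even when $s(t_j) = 0$.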

Experiments
In this part, the feasibility and effectiveness of the proposed method are verified through trajectory tracking experiments on a 2-DOF constrained input MRM experimental platform. The desired trajectory of the two joints is as follows:

Experimental setup
For the NN observer and the critic NN, we choose radial basis function NNs to approximate the model uncertainty and the performance index function, respectively. The NN observer has a 2-3-1 structure, that is, 2 input neurons, 3 hidden layer neurons and 1 output neuron. The initial weight vector of the NN observer is defined as $\hat{W}_{i0} = [\hat{W}_{i1}, \hat{W}_{i2}, \hat{W}_{i3}]^T$, and its initial value is $\hat{W}_{10}$ Table 1.

Figures 3 to 12 show the experimental comparison between the existing ADP-based event-triggered optimal control method [26,50] and the proposed event-triggered-based cooperative game optimal tracking control method for the MRM system with constrained input. In the event-triggered control, the real-time state of the MRM system with constrained input, measured by the sensors under time-triggered sampling, is taken as the system state for the event-triggered mechanism.
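The 2-3-1 radial basis function structure chosen for the NN observer can be sketched as follows. This is a minimal illustration assuming Gaussian basis functions; the centers, width and weights are hypothetical placeholders, not the values used in the experiments, and the two inputs could be, for example, the position and velocity of one joint.

```python
import math


def rbf_forward(x, centers, width, weights):
    """Gaussian RBF network output: sum_k w_k * exp(-||x - c_k||^2 / (2 * width^2))."""
    activations = []
    for center in centers:
        sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
        activations.append(math.exp(-sq_dist / (2.0 * width ** 2)))
    output = sum(w * a for w, a in zip(weights, activations))
    return output, activations


# 2-3-1 structure: 2 inputs, 3 hidden neurons, 1 output.
# All numbers below are hypothetical placeholders for illustration.
centers = [(-1.0, -1.0), (0.0, 0.0), (1.0, 1.0)]
width = 1.0
weights = [0.1, 0.2, 0.3]

y, phi = rbf_forward((0.0, 0.0), centers, width, weights)
```

Only the three output weights would be adapted online in such a structure, which matches the three-element weight vector defined above.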

Experimental results and analysis
(1) Trajectory tracking curves analysis
Figures 3 and 4 show the trajectory tracking curves of the two joints under the existing ADP-based event-triggered control method and the proposed event-triggered-based cooperative game optimal tracking control method, respectively. The actual trajectory follows the desired trajectory well under both control methods. The local magnification views show that the existing control method tracks the desired trajectory at about 0.4 s, whereas the proposed method brings both joints onto the desired trajectory at 0.2 s. Therefore, the proposed control method has better dynamic tracking performance than the existing control method.

Figures 5 and 6 show the trajectory tracking error curves of the two joints under the existing control method and the proposed control method, respectively. These results show that, under both control methods, the manipulator system quickly learns and tracks the desired trajectory at the beginning and keeps the trajectory tracking error within an acceptable range. Once the actual trajectory has caught up with the desired trajectory, the tracking error of the proposed control method stays within $\pm 4 \times 10^{-3}$ rad, which is smaller than that of the existing control method. The reason is that the proposed event-triggered-based cooperative game optimal tracking control method not only compensates the model uncertainties, such as the friction term, the IDC term and the internal disturbance term, but also ensures optimal overall performance by considering the cooperation among the joints. Therefore, the steady-state tracking performance of the proposed method is better than that of the existing control method.
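The two quantities compared above, the time at which tracking is achieved and the steady-state error band, can be extracted from a recorded error trace as sketched below. The function names and the synthetic decaying error signal are assumptions for illustration only; they are not the experimental data.

```python
import math


def settling_time(times, errors, band):
    """Earliest time after which |error| stays within +/- band; None if it never does."""
    settle = None
    for t, e in zip(times, errors):
        if abs(e) <= band:
            if settle is None:
                settle = t  # candidate start of the in-band interval
        else:
            settle = None   # left the band again, discard the candidate
    return settle


def max_abs_error_after(times, errors, t_start):
    """Largest |error| over all samples with time >= t_start."""
    return max(abs(e) for t, e in zip(times, errors) if t >= t_start)


# Synthetic error trace (hypothetical): 0.1 * exp(-20 t) rad sampled every 1 ms for 2 s.
dt = 0.001
times = [i * dt for i in range(2001)]
errors = [0.1 * math.exp(-20.0 * t) for t in times]

band = 4e-3  # the +/- 4e-3 rad band quoted in the text
t_s = settling_time(times, errors, band)
steady_max = max_abs_error_after(times, errors, t_s)
```

Applying the same two functions to the error traces of both controllers gives a like-for-like comparison of dynamic and steady-state tracking performance.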
(2) Control torque curves analysis
... controller and the controlled plant. In addition, Fig. 7 shows that the control torque exhibits obvious chattering during operation, which not only affects the control performance of the system but also reduces the durability of the DC motor inside the manipulator. A smooth control torque curve is shown in Fig. 8. The vibration amplitude and frequency of the control torque under the event-triggered-based cooperative game optimal tracking control method are distinctly smaller, which improves the control precision of the manipulator system and prolongs the operational lifespan of the motor.
(3) Event-triggered curves analysis
Figure 9 illustrates the overall trigger threshold and trigger condition curves. Because of the offset of the desired trajectory at the initial time, the values of the trigger condition and the trigger threshold are both large, but the trigger condition remains within the trigger threshold, which indicates that the event-triggered control method ensures normal operation of the system at the initial instant. In the local magnification, the trigger condition and trigger threshold curves intersect at aperiodic instants. An intersection of the two curves indicates that the system violates the trigger condition and the event is triggered. At this point, the system state is sampled and the control law is updated, and the control law then remains unchanged until the next intersection of the two curves.

Figure 10 compares the number of controller updates under the time-triggered and the event-triggered mechanisms. The number of updates of the event-triggered controller is about half that of the time-triggered controller, which not only decreases the computational cost of the system but also reduces the communication burden between the controller and the controlled plant. In addition, the update count curve of the time-triggered controller is a straight line with constant slope, because the time-triggered controller uses periodic sampling communication. The event-triggered control method uses aperiodic communication, so the slope of its update count curve changes constantly. Where the slope is larger, the system triggers more often to maintain the control performance, and vice versa.

(4) Weight curves analysis
Figure 11 shows the weight curves of the NN observer used to compensate the model uncertainty. Note that the NN observer proposed in this paper is designed under the time-triggered mechanism because it requires real-time position and velocity information of each joint, whereas the compensation control law based on the NN observer is event-triggered. The experimental curves show that the weights of the NN observer converge to a bounded range, which is consistent with Theorem 1 and supports the effectiveness of the NN observer in approximating the model uncertainty.

Figure 12 shows the weight curves of the critic NN under the proposed optimal control method. Like the weight curves of the NN observer, the critic NN weight curves converge to a bounded range, which is consistent with Theorem 2. With the event-triggered mechanism, the weights of the critic NN are updated only when the trigger condition is violated and remain unchanged at all other times. The weight curves and the control torque curves therefore have the same update frequency. Moreover, the proposed cooperative game optimal control scheme based on the event-triggered mechanism uses a single critic NN to reduce the complexity of the system structure, and the estimated critic NN weights make it feasible to solve for the optimal control strategy proposed in this paper.
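The update-count comparison in item (3) can be reproduced in miniature with a toy simulation. This is a sketch under loud assumptions: the scalar plant x' = x + u, the feedback gain, and the relative threshold `sigma` are all hypothetical and are unrelated to the MRM dynamics (10) or the triggering condition (51). It only shows how holding the control between events reduces the number of updates.

```python
def simulate(event_triggered, sigma=0.2, dt=0.001, steps=5000):
    """Euler simulation of the toy plant x' = x + u with held feedback u = -2 * x_sampled.

    Time-triggered: the state is resampled at every step.
    Event-triggered: the state is resampled only when the gap |x - x_sampled|
    exceeds the hypothetical relative threshold sigma * |x|.
    """
    x = 1.0
    x_sampled = x
    u = -2.0 * x_sampled
    updates = 0
    for _ in range(steps):
        gap = abs(x - x_sampled)
        if (not event_triggered) or gap > sigma * abs(x):
            x_sampled = x            # sample the state
            u = -2.0 * x_sampled     # update the held control input
            updates += 1
        x += dt * (x + u)            # plant step with the held input
    return x, updates


x_tt, updates_tt = simulate(event_triggered=False)
x_et, updates_et = simulate(event_triggered=True)
```

In this toy case both runs drive the state toward zero, while the event-triggered run updates the control far less often. The ratio depends entirely on the assumed plant and threshold, so it should not be compared with the ratio reported for Fig. 10.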
The experimental results illustrate that the event-triggered-based cooperative game optimal tracking control method proposed in this paper is superior to the existing control method when control precision, energy consumption and communication burden of the MRM system with constrained input are considered together. Both the theoretical derivation and the experimental results are repeatable.

Conclusion
In this paper, we solve the cooperative game optimal tracking control problem of the MRM system with constrained input based on the event-triggered mechanism. The dynamic model of the constrained input subsystem is established by using the JTF technique, and the overall state space description is obtained by defining the global state variables. Cooperative game theory is introduced into the controller derivation, and the tracking control problem of the MRM system with constrained input is transformed into an optimal control problem based on the cooperative game. Next, the global performance index function is constructed, and the optimal tracking controller based on the cooperative game is proposed by using partially known model information and the ADP framework. Then, the event-triggered mechanism is applied to this controller, and the system state is sampled aperiodically according to the designed triggering condition. The approximate optimal control law is obtained by solving the event-triggered HJB equation with the critic NN. Lyapunov theory is used to prove that the tracking error of the closed-loop constrained input MRM system is UUB. The experimental results show that the proposed event-triggered-based cooperative game optimal tracking control method not only achieves the required control accuracy but also reduces the number of controller updates and alleviates the computational burden, which demonstrates the validity of the proposed control method.