In this paper, the optimal control problem for finite-time missile-target interception systems is posed in a finite-horizon two-player zero-sum (ZS) differential game framework using a periodic event-triggered (PET) scheme. To solve the optimal control problem, a time-varying Hamilton-Jacobi-Isaacs (HJI) equation and a time-dependent cost function are constructed to deal with the finite-horizon constraints, and an event-based periodic adaptive dynamic programming (ADP) algorithm is employed to find the Nash equilibrium solution of the resulting HJI equation. Compared with the traditional continuous event-triggered (ET) scheme, the proposed PET scheme verifies the event-triggered condition only at periodic sampling instants, which reduces the resources consumed by monitoring and excludes Zeno behavior. A single critic neural network (CNN) is used to implement the proposed event-based optimal control algorithm, which not only reduces approximation errors but also simplifies the structure. Further, an additional error term is added to the weight updating law so that the terminal constraint error is also minimized over time. Using the Lyapunov function approach, sufficient conditions are derived under which the ET closed-loop system and the weight estimation error of the CNN are uniformly ultimately bounded (UUB). Finally, a missile-target interception system is introduced to illustrate the efficiency of the presented methods.
In the field of intercept guidance, modern guidance laws have received extensive attention owing to the rapid development of modern control theory and computer technology. Numerous guidance laws have been reported for missile-target interception systems, such as the optimal guidance law (OGL) (He and Lee, 2018), the differential game guidance law (DGL) (Sun et al., 2016), the sliding mode guidance law (SMGL) (Guo et al., 2019; Wang, 2017) and the robust guidance law (NGL) (Li and Ji, 2016; Song and Song, 2019). Optimal control theory was first applied to the study of guidance laws in Kishi and Bettwy (1965), after which plenty of theoretical results on the OGL have been provided. However, the OGL requires real-time estimation of the remaining flight time and full motion information of the target, so the guidance precision inevitably degrades when the estimation error of the remaining time is large. To overcome this drawback, differential games were put forward by Rufus (1954), combining modern optimal control with game theory by means of the dynamic programming method. At present, differential games are widely employed in seeking the optimal strategies of both missiles and targets (Perelman et al., 2011; Shima and Golan, 2007). Note that the studies mentioned above did not consider the optimal control problem for nonlinear systems, especially for nonlinear differential games.
In the terminal guidance interception of a maneuvering target, the target tries to maximize the miss distance while the interceptor seeks the optimal strategy to minimize it. Therefore, the missile-target interception problem can be formulated as an optimal control problem of nonlinear two-player zero-sum (ZS) differential games. However, it is almost impossible to obtain a Nash equilibrium solution of the corresponding Hamilton-Jacobi-Isaacs (HJI) equation analytically for nonlinear differential games. In this circumstance, the adaptive dynamic programming (ADP) method has been presented to approximate the optimal solution of the HJI equation by using neural networks (NNs). Because ADP is effective in solving optimal control problems, plenty of theoretical results have been obtained in the fields of multi-error constraint control, fault-tolerant control, power system stability control and optimal battery energy control (Dai et al., 2019; Liu et al., 2018a; Tang et al., 2014; Wei et al., 2017). In Dierks (2012) and White et al. (1994), the authors adopt an actor-critic structure to approximate the optimal cost function and the optimal control strategy. For uncertain nonlinear systems, an ADP-based method is employed to solve the Hamilton-Jacobi-Bellman (HJB) equation forward-in-time using a three-NN structure. Based on the ADP technique (Bhasin et al., 2013), the robust optimal control problem for affine nonlinear systems is considered in Ding et al. (2014), and systems with full state constraints are discussed in Gao et al. (2018) and Liu et al. (2018b, 2019). Further, many significant results for ZS and nonzero-sum (NZS) differential games have been reported in Vamvoudakis and Lewis (2011), Zhang et al. (2017a), Sun and Liu (2018b) and Qu et al. (2018) via NNs.
However, finite-horizon constraints in ADP design are not discussed in the studies mentioned above, even though finite-horizon convergence has been shown to offer better robustness and anti-disturbance ability, which motivates us to study it further.
The finite-horizon optimal control problem for nonlinear systems is more complicated than its infinite-horizon counterpart. The difficulties result from the time-varying HJI equation and the time-to-go dependent cost function, both of which are time-invariant in the infinite-horizon case. In addition, the finite-horizon cost function has to satisfy a terminal constraint, which is taken as zero in the infinite-horizon case. To address these issues, time-varying activation functions and an additional error term are employed in Zhao et al. (2015) to approximate the solution of the coupled HJB equation while guaranteeing that the terminal constraint error is minimized. In Wang et al. (2011, 2012), the optimal tracking problems for continuous and discrete systems are considered, where an iterative ADP algorithm is used to design the finite-horizon stabilization controller. However, the actor-critic structure employed in Zhao et al. (2015) and Wang et al. (2011, 2012) generates considerable approximation errors. To tackle this issue, the authors in Heydari and Balakrishnan (2013) eliminated the actor NN and used a single critic neural network (CNN) to approximate the time-varying solution of the coupled HJI equation. For uncertain nonlinear systems, Huang (2016) developed a neuro-observer-based finite-horizon optimal policy by using an observer-critic structure. Finite-horizon ZS and NZS differential game problems are considered in Cui et al. (2016) and Sun and Liu (2018a), where the Nash equilibrium of the coupled HJI equations is approximated by a single CNN. To date, most of the existing literature on optimal control adopts a time-driven scheme, with few works on event-triggered schemes.
It is worth noting that communication resources and network bandwidth are limited in actual missile guidance control systems. However, most existing works on finite-horizon optimal control use a time-triggered mechanism, which not only increases the communication burden but also affects the effectiveness of the missile in intercepting the target. To overcome the limitation of the time-triggered scheme, an event-triggered ADP algorithm is proposed in Vamvoudakis (2014). Subsequently, Zhong et al. (2015) and Sahoo et al. (2017) employed adaptive ET control strategies to settle the optimal control issue of continuous nonlinear systems. The optimal control problem is further studied in Zhu et al. (2017) for continuous nonlinear systems with partially unknown dynamics and input constraints. In Wang et al. (2016) and Mu and Wang (2018), the control problem for continuous-time nonlinear systems is converted to an optimal control problem for ZS differential game systems, in which the adaptive evaluation learning method is used to attain the event-based optimal control strategy and the time-based optimal perturbation strategy. So far, however, the ET scheme has rarely been used in ZS differential game systems with finite-horizon constraints, especially against the background of missile-target interception. In addition, how to avoid Zeno behavior under the ET scheme remains an open problem that urgently needs to be solved.
Against the background of intercepting a maneuvering target, this paper makes a first attempt to investigate the optimal control problem for missile-target interception systems with finite-horizon constraints using a periodic event-triggered adaptive dynamic programming (PETADP) method. First, the finite-horizon convergence problem for missile-target interception systems is converted to an optimal control problem of a two-player ZS differential game with a time-dependent cost function. To save communication costs and computing resources, a PETADP scheme is employed to address the optimal control problem for a class of nonlinear systems efficiently. Then, the triggering condition is obtained to ensure that the nonlinear system with finite-horizon constraints is UUB under the event-based optimal controller. Finally, a single CNN is used to approximate the saddle point of the event-based HJI equation online. The main contributions are threefold:
(1) Compared with the existing literature (Sahoo et al., 2017; Vamvoudakis, 2014; Wang et al., 2016; Zhong et al., 2015; Zhu et al., 2017), the PET scheme proposed in this paper verifies the event-triggered condition only at fixed periodic instants, so the minimum inter-event time is greater than or equal to the sampling period. The scheme therefore combines the advantages of traditional ET schemes and periodic sampling: the savings in communication resources and computing costs are preserved while Zeno behavior is avoided.
(2) This article makes a first attempt to deal with finite-horizon convergence for missile-target interception systems via a PET scheme. Different from the infinite-horizon case (Heydari and Balakrishnan, 2013; Wang et al., 2011, 2012), the finite-horizon HJI equation is time-varying, and the finite-horizon cost function must satisfy a terminal constraint that is taken as zero in the infinite-horizon scenario, which makes the problem more complicated. Then, an ET tuning law is proposed for the CNN weight vector, in which the requirement of an initial stabilising control is relaxed.
(3) By means of the Lyapunov function approach, stability criteria under the PET method are derived for the closed-loop nonlinear system and the weight estimation error vector of the CNN.
The remainder of this paper is organized as follows. In Section 2, a brief description of the optimal control problem for finite-horizon ZS differential games is given, and a periodic event-triggered ADP method is provided to improve the utilization efficiency of communication resources. A single CNN is designed in Section 3 to solve the time-varying HJI equation approximately. Then, the event-triggered closed-loop nominal system and the weight error are guaranteed to be UUB by constructing an appropriate tuning law in Section 4. In Section 5, a missile-target interception system model is introduced to validate the effectiveness of the proposed PET optimal control scheme. Finally, concluding remarks are given in Section 6. Some standard notations are used to simplify the description (see Tables 1–2).
Table 1. Some related notations.
| Notation | Meaning | Notation | Meaning |
|---|---|---|---|
| ℝ | the set of real numbers | | the minimum sampling period under the PET scheme |
| ℕ | the set of natural numbers | h | the sampling period under the time-triggered scheme |
| ℝ^n | the n-dimensional Euclidean space | λ_min(·) | the minimal eigenvalue of a matrix |
| ℝ^{n×m} | the set of real n×m matrices | λ_max(·) | the maximal eigenvalue of a matrix |
| | all positive definite matrices | T | the transpose operation |
Problem descriptions
Consider the continuous-time nonlinear two-player ZS differential game as follows
where is the state vector, and are two input vectors for two players, respectively. , and are differentiable nonlinear dynamics with . It is clear that, is an equilibrium point of system (1). Now, Assumption 1 and Assumption 2 are provided for system (1).
Assumption 1: The nonlinear dynamics , and are locally Lipschitz continuous on a compact set , and the system (1) is controllable.
Assumption 2: and are bounded by constants and , that is,
Remark 1: It is well known that the external disturbances and control inputs of many practical nonlinear systems are energy-bounded signals. Therefore, the constraints in Assumptions 1–2 are reasonable, and they are widely used in the existing literature (Fu and Chai, 2017; Sun and Liu, 2018c).
Our aim is to design one controller that minimizes the time-dependent cost function and another controller that maximizes it, where the cost function is given by
Define the utility function , where is a positive definite function and are matrices with appropriate dimensions. By the Cholesky decomposition, we obtain , , where and are lower triangular matrices.
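The Cholesky step can be illustrated numerically. The following Python sketch factors a symmetric positive definite weight matrix into a lower triangular factor and checks the reconstruction. The matrix `R` and the helper names `cholesky_lower` and `times_transpose` are hypothetical and chosen only for illustration; they are not taken from the simulation settings of this paper.

```python
import math


def cholesky_lower(a):
    """Return lower triangular low with a = low * low^T (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                d = a[i][i] - s
                if d <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(d)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def times_transpose(low):
    """Compute low * low^T to verify the factorization."""
    n = len(low)
    return [[sum(low[i][k] * low[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Hypothetical 2x2 input weight matrix (an assumption for illustration).
R = [[4.0, 2.0], [2.0, 3.0]]
L_R = cholesky_lower(R)
R_check = times_transpose(L_R)
```

With such a factor, a quadratic penalty of the form u^T R u equals the squared norm of L_R^T u, which is what makes the factorization convenient in the utility function.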
Remark 2: is introduced to deal with finite-horizon constraints, and is called a terminal cost, denotes the fixed final time. Different from the infinite-horizon scenario (Cavalieri et al., 2013; Jagat and Sinclair, 2017; Parras et al., 2016), the finite-horizon cost function is time-dependent.
Suppose that , the infinitesimal equivalent of (2) is defined by
where , .
Remark 3: Clearly is a time-dependent solution to the time-varying partial differential equation (3). Based on the given control pair and the backward inference method, can be obtained from fixed terminal time and terminal cost function . Note that the term is generated by the time-dependent cost function while does not appear in the infinite-horizon scenario.
Define the Hamiltonian function of system (1), which depends explicitly on time, as
The optimal cost function of system (1) is given by
In accordance with the stationarity conditions, the optimal control strategies can be calculated by
Substituting (6a) and (6b) into (4), the time-varying HJI equation can be written as
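For orientation, a commonly used affine form of this derivation is sketched below. The symbols $f$, $g$, $k$, $u$, $w$, $Q$, $R$, $S$, $\psi$ and $t_f$ are assumptions made for this sketch, and the weighting and notation may differ from those of equations (1)–(7).

```latex
\begin{align*}
\dot{x} &= f(x) + g(x)\,u + k(x)\,w, \\
V(x,t) &= \psi\bigl(x(t_f), t_f\bigr)
  + \int_t^{t_f} \bigl( Q(x) + u^\top R\,u - w^\top S\,w \bigr)\, d\tau, \\
H &= \frac{\partial V}{\partial t} + Q(x) + u^\top R\,u - w^\top S\,w
  + \nabla V^\top \bigl( f + g\,u + k\,w \bigr), \\
u^* &= -\tfrac{1}{2} R^{-1} g^\top \nabla V^*, \qquad
w^* = \tfrac{1}{2} S^{-1} k^\top \nabla V^*, \\
0 &= \frac{\partial V^*}{\partial t} + Q(x) + \nabla V^{*\top} f
  - \tfrac{1}{4} \nabla V^{*\top} g R^{-1} g^\top \nabla V^*
  + \tfrac{1}{4} \nabla V^{*\top} k S^{-1} k^\top \nabla V^*, \\
V^*\bigl(x(t_f), t_f\bigr) &= \psi\bigl(x(t_f), t_f\bigr).
\end{align*}
```

In this form, the two strategies follow from setting $\partial H / \partial u = 0$ and $\partial H / \partial w = 0$, and the partial time derivative is the term that distinguishes the finite-horizon equation from its infinite-horizon counterpart.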
Periodic event-triggered scheme
In this section, the traditional periodic sampling mechanism is considered first, in which the controller is updated only at periodic sampling instants. In general, the sampling period of the system is assumed to be a constant , and the sampling sequence is given by
Among the sampled data, some changes are negligible or have no impact on the performance under study, so there is no need to transmit them. In this sense, the traditional periodic sampling mechanism generates too many redundant signals and thus wastes computing resources. To overcome this obstacle, the event-based periodic sampling scheme is applied in this paper, in which a periodically sampled signal is transmitted only when the defined ET condition is satisfied. For convenience and without loss of generality, the initial time is assumed to be zero; the framework of the periodic event-triggered sampling is then given by
Remark 4: In contrast to the references (Sahoo et al., 2017; Vamvoudakis, 2014; Wang et al., 2016; Zhong et al., 2015; Zhu et al., 2017), the ET condition under the PET scheme proposed in this article only needs to be verified at fixed periodic instants. As shown in Figure 1, the minimum inter-event time is greater than or equal to the sampling period for the time-triggered scheme . Consequently, the computing cost of the detection process is further reduced and Zeno behavior is avoided.
Figure 1. Illustration of periodic event-triggered sampling.
As shown in Figure 1, we first consider the periodic sampling scheme, and then introduce the event-triggered scheme to improve the utilization of limited communication resources. Under the PET scheme, only periodically sampled states that satisfy the defined event-triggered condition are transmitted, and the event-triggered sampled states are denoted as
where denote event-triggered instants. Define the event-based threshold error between and as
where , is the event-triggered condition to be designed, and denotes the threshold coefficient, which affects the number of triggered events.
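The sampling logic described above can be sketched in Python. The sketch checks a triggering rule only at the periodic instants k·h and holds the last transmitted state in between, so consecutive events are at least one sampling period apart. The quadratic rule `gap * gap > sigma * x * x`, the scalar state, and the names `pet_events`, `h` and `sigma` are assumptions for illustration; the condition actually used in this paper is the one derived in Theorem 1.

```python
def pet_events(samples, h, sigma):
    """Return the event times selected by a periodic event-triggered rule.

    samples: states x(0), x(h), x(2h), ... measured at the periodic instants
    h:       sampling period
    sigma:   threshold coefficient in (0, 1); an assumed relative threshold
    """
    event_times = [0.0]      # the initial sample is always transmitted
    held = samples[0]        # last transmitted state, held by the controller
    for k in range(1, len(samples)):
        x = samples[k]
        gap = held - x       # event-based error, evaluated only at t = k*h
        if gap * gap > sigma * x * x:
            held = x         # transmit the sample and update the controller
            event_times.append(k * h)
    return event_times


# Example: a decaying scalar state sampled every 0.1 time units.
states = [0.9 ** k for k in range(30)]
events = pet_events(states, h=0.1, sigma=0.2)
intervals = [b - a for a, b in zip(events, events[1:])]
```

Because the rule is evaluated only on the sampling grid, every inter-event interval is an integer multiple of `h`, which is the mechanism by which Zeno behavior is excluded.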
As the PET scheme is introduced, the state feedback controller is updated only at the event-triggered instants. Note that is a piecewise constant function of the states and is continuous from the right everywhere. Under the PET scheme, system (1) and the state feedback control law (6) are reconstructed as
Together with the Hamiltonian function (4) and the time-triggered disturbance strategy (7), the event-based HJI equation is reconstructed as
To facilitate the analysis, the following assumption is introduced, which is reasonable and easy to satisfy in event-based control systems. The assumption is widely used in the literature (Wang et al., 2016; Vamvoudakis, 2014; Zhang et al., 2017a).
Assumption 3: The optimal controller is Lipschitz continuous with respect to , that is
where is a constant and .
Lemma 1: Consider time-triggered HJI equation (7) and event-triggered HJI equation (10), if Assumption 1 holds, then the following relationship is satisfied
Proof: Based on (7) and (10), one has
According to Assumption 3, the inequality (13) is obtained.
Theorem 1: For the nonlinear two-player ZS differential game (1) with the finite-horizon cost function (2), suppose that Assumptions 1–3 hold. Let the time-triggered disturbance law and the event-triggered control law be given by (6b) and (9), respectively, for all . Then the event-triggered closed-loop system (8) is asymptotically stable if the following event-triggered condition holds
where is the event-triggered threshold and is the designed parameter of the sample frequency.
Proof: The optimal value function can be regarded as a Lyapunov function, since it is a time-varying solution of the time-varying HJI equation. Taking the time derivative of yields
Combining (6a) and (6b) yields
Since , there is a matrix such that . Then, combining with (7), (15) and (14), the time derivative of becomes
Therefore, , as long as the inequality (13) is satisfied.
Finite-horizon neural network implementation using the event-triggered adaptive critic controller
In this section, a single CNN is applied to approximate the time-varying cost function and the terminal constraint as follows
where is the ideal NN weight to be calculated, and represents the number of hidden-layer neurons. is the time-dependent activation function, and is the NN function reconstruction error. As in Sun and Liu (2018b) and Qu et al. (2018), the activation function is bounded.
It follows from (16a) that
where
Because the ideal weight is unknown, the finite-horizon cost function and the terminal constraint are approximated by a critic NN as
where denotes the approximate cost function, and the approximate cost function at the terminal time is denoted by . represents the activation function, where the estimated state can be calculated from the current state and the system dynamics. Further, the partial derivatives of (18a) and (18b) with respect to and are obtained, respectively, as
From (6a), (9), (17a) and (17b), the event-triggered control strategy and the time-based disturbance strategy can be described by
where , .
Combining (16a) with (16b) gives
Here, represents the reconstruction error of CNN approximation. Further, according to (19), it follows
Noticing (18a) and (18b), the approximate event-triggered HJI equation is described by
To handle the finite-horizon constraints in the ZS differential game system, both the time-varying cost function and its terminal constraint should be considered. Therefore, the terminal constraint estimation error is introduced according to (18b) as
where .
Lemma 2: (Yang et al., 2013) Consider system (1) with the finite-horizon cost function (2) and the control strategies (22a) and (22b). Assume that is a continuously differentiable and radially unbounded Lyapunov function that satisfies the following inequality
If there exists a positive definite function , , then the following inequality holds
where .
Combining with the control inputs (22a) and (22b), the derivative of the Lyapunov function for system (1) can be calculated by
In terms of the gradient descent method, we develop the following update law for ,
To obtain that minimizes the Hamiltonian function while satisfying the terminal constraint during the learning process of the CNN, the total squared error is defined as
Generally, the optimal control strategy and the disturbance strategy are not available in practical engineering applications. Therefore, the approximate control strategy (22a) and the approximate disturbance strategy (22b) are applied in the learning process of the neural network. According to the normalized gradient descent algorithm (Liu et al., 2015; Sun and Liu, 2018b,c), the event-triggered weight vector is tuned as follows
where and . is defined as
Defining , and combining (21), (23) and (24), the time derivative of can be written as
Remark 5: In fact, the first term and the second term in (29) are employed to minimize the approximate event-triggered Hamiltonian function and the event-triggered terminal constraint estimation error, respectively.
From the definition (28), it is clear that the third term and the fourth term are introduced so that the system states remain bounded during the learning phase of the neural network. These two terms are zero when the nonlinear system is stable, and they are activated only when an unstable signal is generated.
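As a rough illustration of how the first two terms of such a tuning law act, the Python sketch below performs normalized gradient descent steps on a squared error made of a Hamiltonian residual and a terminal constraint error, both assumed to be linear in the weights. The regressors `phi` and `sigma_T`, the constants `c` and `psi_T`, the step size `alpha` and the function names are hypothetical, and the stabilizing third and fourth terms of (29) are omitted.

```python
def dot(a, b):
    """Inner product of two equally sized lists."""
    return sum(x * y for x, y in zip(a, b))


def total_error(w, phi, c, sigma_T, psi_T):
    """Total squared error E = 0.5*e_H**2 + 0.5*e_T**2."""
    e_h = dot(w, phi) + c            # Hamiltonian residual, assumed linear in w
    e_t = dot(w, sigma_T) - psi_T    # terminal constraint estimation error
    return 0.5 * e_h ** 2 + 0.5 * e_t ** 2


def critic_step(w, phi, c, sigma_T, psi_T, alpha):
    """One normalized gradient descent step on the total squared error."""
    e_h = dot(w, phi) + c
    e_t = dot(w, sigma_T) - psi_T
    n_h = (1.0 + dot(phi, phi)) ** 2          # normalization of the first term
    n_t = (1.0 + dot(sigma_T, sigma_T)) ** 2  # normalization of the second term
    return [wi - alpha * (pi * e_h / n_h + si * e_t / n_t)
            for wi, pi, si in zip(w, phi, sigma_T)]


# Hypothetical regressors and targets, chosen only for illustration.
phi, c = [1.0, 0.5], -1.0
sigma_T, psi_T = [0.2, 1.0], 0.5
w = [0.0, 0.0]
e_start = total_error(w, phi, c, sigma_T, psi_T)
for _ in range(2000):
    w = critic_step(w, phi, c, sigma_T, psi_T, alpha=0.5)
e_end = total_error(w, phi, c, sigma_T, psi_T)
```

In this toy setting the two regressors are linearly independent, so both errors can be driven to zero at the same time; in the actual learning problem the regressors vary with the state and a persistent excitation condition is needed.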
Assumption 4: The control dynamic is Lipschitz continuous on a compact set , and satisfies , where is a constant.
Assumption 5: Let , and be positive constants
(1) The partial derivative of the activation function is Lipschitz continuous on a compact set , and satisfies , where is a constant
(2) ,
(3) ,
(4) .
Theorem 2: For the nonlinear two-player ZS differential game (1) with the finite-horizon cost function (2), suppose that Assumptions 1–5 hold. If the time-triggered disturbance policy and the event-triggered control policy are designed as (22b) and (22a), respectively, and the CNN tuning law is set as (29), then the event-triggered closed-loop system and the weight estimation error are UUB, provided that the following conditions hold
Proof: Choose the following Lyapunov function candidate
where denotes the optimal value function for the continuous nonlinear system (1), and represents the optimal value function for the event-triggered nonlinear system (8).
Case : Events are not triggered, that is, . From (6a), (6b) and (7), the derivative of can be described by
Considering (6a) and (17a), the time-triggered control strategy is derived
From (22a) and (34), and considering the relationship yields
where
According to Assumptions 4 and 5, one has
Further, the time derivative of Lyapunov function becomes
In accordance with Lemma 2, the following inequalities hold
(1) , according to the definition of , the term is negative. For any , there exists a matrix such that . Calculating the time derivative of gives
Defining , the time derivative of can be further calculated as
where
For convenience, we denote
Thus, if the following conditions and are satisfied
(I) ,
(II) .
(2) , according to the definition of , . Hence, the time derivative of becomes
From (20a) and (20b), it follows that
Then, the following inequalities are derived
In terms of Assumption 2, there exists a constant such that , and considering Lemma 2 yields
For convenience, we denote
Hence, the time derivative of is negative if the following conditions are satisfied
(i) ,
(ii) ,
(iii)
Case : Events are triggered, that is, . Based on (33), the difference of the Lyapunov function is obtained as
From Case , for . Note that the system states and the cost function are continuous, so the following inequalities hold
where , is a class-K function and . This means that the Lyapunov function is also decreasing at the triggering instants
As a consequence, from (30), (31) and the event-triggered condition (32), we can derive that the nonlinear two-player ZS game system (1) and the estimation error of weight vector are UUB.
Application in the missile-target interception system
To demonstrate the efficiency of the algorithm presented in Section 4, the missile-target interception system is introduced. The motion differential equations of the missile and the target are, respectively, given by
where denotes the position coordinates of the missile in the inertial coordinate system, represents the missile lateral acceleration, and denotes the missile autopilot time constant.
where denotes the position coordinates of the target in the inertial coordinate system, represents the target lateral acceleration, and denotes the target autopilot time constant.
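A model of this type can be sketched as a planar point mass with constant speed and a first-order autopilot lag. The Python sketch below is an assumed textbook form, not necessarily the exact motion equations used in this paper; the state layout `(x, y, gamma, a)`, the function name `vehicle_step` and all numerical values are hypothetical.

```python
import math


def vehicle_step(state, u, v, tau, dt):
    """Advance a planar point-mass vehicle by one explicit Euler step.

    state: (x, y, gamma, a) = position, flight-path angle, lateral acceleration
    u:     commanded lateral acceleration
    v:     constant speed
    tau:   autopilot time constant (first-order lag from u to a)
    dt:    integration step
    """
    x, y, gamma, a = state
    return (
        x + dt * v * math.cos(gamma),   # inertial x position
        y + dt * v * math.sin(gamma),   # inertial y position
        gamma + dt * a / v,             # lateral acceleration turns the velocity
        a + dt * (u - a) / tau,         # achieved acceleration lags the command
    )


# Hypothetical numbers for illustration only: 1 s of flight with a constant command.
state = (0.0, 0.0, 0.0, 0.0)
for _ in range(100):
    state = vehicle_step(state, u=20.0, v=500.0, tau=0.2, dt=0.01)
```

The same function can be used for the missile and the target with their own speeds, commands and time constants.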
This section considers the planar interception geometry shown in Figure 2, which is adopted from Sun and Liu (2018c), where indicates position coordinates. Accordingly, the relative kinematics of the missile-target interception system are obtained as follows, and some related notations are used to simplify the description (see Table 2)
Figure 2. Engagement geometry.
Table 2. Some related notations.
| Notation | Meaning |
|---|---|
| | the missile velocity |
| | the target velocity |
| | the missile control vector perpendicular to the velocity vector |
| | the target control vector perpendicular to the velocity vector |
| | the missile flight-path angle (FPA) |
| | the target FPA |
| | the line-of-sight (LOS) angle |
| | the LOS angle rate |
| | the relative distance between the missile and the target |
| | the range rate along the LOS |
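Using the quantities of Table 2, a standard polar form of the planar relative kinematics can be sketched in Python. The symbols `r`, `lam`, `v_m`, `v_t`, `gamma_m` and `gamma_t` and the sign convention are assumptions for this sketch and may differ from those of equation (37).

```python
import math


def relative_kinematics(r, lam, v_m, v_t, gamma_m, gamma_t):
    """Return (r_dot, lam_dot) for a planar engagement in polar LOS coordinates.

    r:       relative distance, lam: LOS angle
    v_m/v_t: missile and target speeds
    gamma_m/gamma_t: missile and target flight-path angles
    """
    r_dot = v_t * math.cos(gamma_t - lam) - v_m * math.cos(gamma_m - lam)
    lam_dot = (v_t * math.sin(gamma_t - lam) - v_m * math.sin(gamma_m - lam)) / r
    return r_dot, lam_dot


# Head-on geometry with hypothetical values: the missile flies along the LOS
# and the target flies straight back toward the missile.
r_dot, lam_dot = relative_kinematics(
    r=5000.0, lam=0.0, v_m=600.0, v_t=300.0, gamma_m=0.0, gamma_t=math.pi
)
```

The division by `r` in the LOS rate makes explicit why dynamics derived from these relations become singular as the relative distance approaches zero.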
Taking the time derivative of the LOS angle rate , and bearing (37) in mind, we get
To intercept a maneuvering aircraft, the relative distance is driven to zero by adjusting the missile control vector . Obviously, in the differential equation (38), the drift dynamics go to infinity when converges to zero. Further, equation (38) is by nature a finite-time missile-target interception problem and does not satisfy the local Lipschitz condition.
Therefore, the following mathematical transformations need to be applied to the above dynamic equations, and the necessary definition is introduced first.
Definition 1: (Bardhan and Ghose, 2015) The zero effort miss (ZEM) at time t is the closest distance between the missile and the target that would result if both were uncontrolled from that instant onwards. It can be expressed as
Remark 6: As shown in (39), the missile will successfully capture the target in the terminal guidance phase, if the following conditions are satisfied
: the LOS angle rate goes to zero.
: the range rate along the LOS is negative.
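The geometric meaning of the ZEM and of the two capture conditions can be sketched in Python. The closed form below is the closest approach distance for a constant relative velocity, written in terms of the range, the range rate and the LOS rate; it is offered as an assumed form and may differ in notation from (39). The function names and the tolerance `tol` are hypothetical.

```python
import math


def zero_effort_miss(r, r_dot, lam_dot):
    """Closest approach distance if the relative velocity stays constant."""
    v_perp = r * lam_dot    # relative velocity component normal to the LOS
    return r * abs(v_perp) / math.hypot(r_dot, v_perp)


def captures(r_dot, lam_dot, tol=1e-3):
    """Capture check in the spirit of Remark 6: LOS rate near zero, closing range."""
    return abs(lam_dot) < tol and r_dot < 0.0


# Hypothetical relative state: 3000 m range, closing at 400 m/s, LOS rate 0.01 rad/s.
zem = zero_effort_miss(r=3000.0, r_dot=-400.0, lam_dot=0.01)
```

When the LOS rate is zero the ZEM vanishes, which is consistent with the first capture condition, while a negative range rate ensures that the closest approach lies in the future.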
Define the time-to-go, a new state variable and a new time variable as , and , respectively. Taking the derivative of with respect to , the following new dynamics equation is obtained
where
It is clear that dynamics equation (46) is controlled for all , . Therefore, the guidance law applies to the following domain
Remark 7: The finite-time missile-target interception problem in (38) is converted into an infinite-time version in (40) by mathematical manipulation. In (40), tends to infinity as decreases to zero, which reveals that the guidance problem is converted into a two-player ZS differential game problem.
Remark 8: It is clear that the nonlinear dynamics and are bounded in the kinematics equation (40), which shows that Assumption 2 is easy to satisfy in actual engineering.
Missile-target interception simulation
For simplicity, let the velocities of the missile and the target be constant during the terminal guidance. Assume that the missile and target velocities are and , respectively, and that the time constants of the missile and the target are and . Let the initial FPAs of the missile and the target be and . Assume that the initial position parameters are and , and that the initial LOS angle is . To realize the designed event-triggered differential game algorithm, we choose the time-dependent term , where and . Let , , and let denote the terminal constraint.
For the critic network, a time-dependent activation function with neurons is selected as , and the critic network weight is chosen as , in which the initial value of is . The learning rate and the Lyapunov function defined in Lemma 2 are set as and . The sampling period of the time-triggered scheme is given as . Finally, a small probing signal is added to the control input in the first 10 seconds to ensure persistent excitation.
The PETADP algorithm presented in this paper is applied to the missile-target interception system, and Figures 3–11 are obtained. As shown in Figures 3–4, the LOS angle rate converges to a neighbourhood of zero, and the range rate along the LOS is always negative, that is, . Hence, the missile successfully captures the maneuvering target in the engagement of Figure 2, since both and satisfy the capture criteria in Remark 6. Figures 5 and 6 show the lateral accelerations and . Figure 7(a) and (b) show the relative distance between the missile and the target under the time-triggered scheme and the PET scheme, respectively. The relative distance in Figure 7(b) while in Figure 7(a), which illustrates that the periodic event-triggered ADP technique can achieve more precise control of the missile. Figure 8 shows the engagement trajectories of the missile and the target, and the trajectories of are given in Figure 9. Figure 10 illustrates that the minimum triggering interval under the PET scheme , and therefore Zeno behavior is eliminated. The scheme thus combines the advantages of traditional event-triggered schemes and periodic sampling: the savings in communication resources and computing costs are preserved while Zeno behavior is avoided. In Figure 11, the periodic event-triggered ADP technique needs only 492 samples, while the traditional periodic sampling ADP technique and the continuous event-triggered ADP technique require 750 samples and 551 samples, respectively. By calculation, the number of controller updates is reduced by under the periodic event-triggered ADP technique. Thus, we conclude that the PET controller is highly efficient in decreasing the frequency of data transmission and saving computing costs.
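The relative savings implied by the sample counts reported for Figure 11 can be checked with simple arithmetic. The Python sketch below computes the reduction with respect to both baselines; which baseline the saving stated in the text refers to is not assumed here, and the function name `reduction` is hypothetical.

```python
def reduction(baseline, actual):
    """Fractional reduction in the number of samples relative to a baseline."""
    return (baseline - actual) / baseline


# Sample counts reported for Figure 11.
pet, periodic, continuous_et = 492, 750, 551

vs_periodic = reduction(periodic, pet)            # relative to periodic sampling
vs_continuous_et = reduction(continuous_et, pet)  # relative to continuous ET

print(f"{vs_periodic:.1%} fewer samples than periodic sampling")
print(f"{vs_continuous_et:.1%} fewer samples than continuous event triggering")
```

With the reported counts, the PET scheme uses about 34.4% fewer samples than periodic sampling and about 10.7% fewer than the continuous ET scheme.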
Figure 3. The LOS angular rate.
Figure 4. The range rate along the LOS.
Figure 5. The lateral acceleration of target.
Figure 6. The lateral acceleration of missile.
Figure 7. The relative distance under (a) the time-triggered scheme; (b) the periodic event-triggered scheme.
Figure 8. Engagement trajectories.
Figure 9. The critic weights.
Figure 10. Triggering intervals under (a) the continuous event-triggered scheme; (b) the periodic event-triggered scheme.
Figure 11. Computation results of sampling numbers.
Remark 9: The above simulation results show that the signals of the missile-target interception system diverge at the terminal guidance time. In fact, the simulation curves inevitably diverge at the terminal time of the interception, mainly because the system parameter values change abruptly at the end of guidance. In particular, the guidance law designed in this article is no longer applicable when the relative distance between the missile and the target tends to zero.
The finite-time optimal control of event-triggered two-player ZS differential games has been investigated in this article. A periodic event-triggered ADP algorithm has been established to save communication costs while eliminating the undesired Zeno behavior, in which the control inputs are updated only when the designed ET condition is satisfied. To implement this algorithm, a single CNN with a time-varying activation function has been proposed to approximate the Nash equilibrium solution of the time-dependent HJI equation. Further, the terminal estimation error has been minimized to deal with the finite-horizon constraints, and an ET critic weight update law has been proposed accordingly. By virtue of the Lyapunov function method, stability criteria have been obtained to guarantee that the event-triggered closed-loop system and the weight error of the single CNN are UUB. Finally, a missile-target interception system with finite-time constraints has been introduced to demonstrate the efficiency of the proposed methods. Recently, robust control of differential games has received extensive attention (Sun et al., 2016). In fact, model errors and external disturbances are widespread in modern missile-target interception systems. Therefore, future efforts will focus on designing robust optimal controllers for finite-horizon differential games.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by National Natural Science Foundation of China under Grant 61473147.
ORCID iDs
Dandan Duan
Chunsheng Liu
References
1.
Bardhan R, Ghose D (2015) Nonlinear differential games-based impact-angle-constrained guidance law. Journal of Guidance Control and Dynamics 38(3): 1–19.
2.
Bhasin S, Kamalapurkar R, Johnson M, et al. (2013) A novel actor-critic-identifier architecture for approximate optimal control of uncertain nonlinear systems. Automatica 49(1): 82–92.
3.
Cavalieri KA, Satak N, Hurtado JE (2013) Incomplete information pursuit-evasion games with uncertain relative dynamics. In: AIAA Guidance, Navigation, and Control Conference, National Harbor, MD, USA, 13–17 January 2014, pp. 1–8. Reston, VA, USA: American Institute of Aeronautics and Astronautics.
4.
Cui X, Zhang H, Luo Y, Zu P (2016) Online finite-horizon optimal learning algorithm for nonzero-sum games with partially unknown dynamics and constrained inputs. Neurocomputing 185: 37–44.
5.
Dai J, Liu C, Sun J (2019) Adaptive optimal fault-tolerant control scheme for a class of strict-feedback nonlinear systems. Transactions of the Institute of Measurement and Control 41(4): 1079–1087.
6.
Dierks T (2012) Online optimal control of affine nonlinear discrete-time systems with unknown internal dynamics by using time-based policy update. IEEE Transactions on Neural Networks and Learning Systems 23(7): 1118–1129.
7.
Ding W, Liu D, Li H, Ma H (2014) Neural-network-based robust optimal control design for a class of uncertain nonlinear systems via adaptive dynamic programming. Information Sciences an International Journal 282: 167–179.
8.
Fu Y, Chai TY (2017) Online solution of two-player zero-sum games for continuous-time nonlinear systems with completely unknown dynamics. IEEE Transactions on Neural Networks and Learning Systems 27(12): 2577–2587.
9.
Gao T, Liu YJ, Liu L, Li D (2018) Adaptive neural network-based control for a class of nonlinear pure-feedback systems with time-varying full state constraints. IEEE/CAA Journal of Automatica Sinica 5(5): 923–933.
10.
Guo J, Li Y, Zhou J (2019) An observer-based continuous adaptive sliding mode guidance against chattering for homing missiles. Transactions of the Institute of Measurement and Control 41(12): 3309–3320.
11.
He SM, Lee CH (2018) Optimal proportional-integral guidance with reduced sensitivity to target maneuvers. IEEE Transactions on Aerospace and Electronic Systems 54(5): 2568–2579.
12.
Heydari A, Balakrishnan SN (2013) Finite-horizon control-constrained nonlinear optimal control using single network adaptive critics. IEEE Transactions on Neural Networks and Learning Systems 24(1): 145–157.
13.
Huang Y (2016) Neuro-observer based online finite-horizon optimal control for uncertain non-linear continuous-time systems. IET Control Theory and Applications 11(3): 401–410.
14.
Jagat A, Sinclair AJ (2017) Nonlinear control for spacecraft pursuit-evasion game using the state-dependent Riccati equation method. IEEE Transactions on Aerospace and Electronic Systems 53(6): 3032–3042.
15.
Kishi FH, Bettwy TS (1965) Optimal and suboptimal design of proportional navigation systems. In: Recent Advances in Optimization Techniques. New York: John Wiley.
16.
Li GL, Ji HB (2016) A three-dimensional robust nonlinear terminal guidance law with ISS finite-time convergence. International Journal of Control 89(5): 938–949.
17.
Liu L, Liu YJ, Tong S (2018a) Fuzzy based multi-error constraint control for switched nonlinear systems and its applications. IEEE Transactions on Fuzzy Systems 27(8): 1519–1531.
18.
Liu YJ, Gao Y, Tong SC, Li YM (2015) Fuzzy approximation-based adaptive backstepping optimal control for a class of nonlinear discrete-time systems with dead-zones. IEEE Transactions on Fuzzy Systems 24(1): 16–28.
19.
Liu YJ, Lu S, Tong S, et al. (2018b) Adaptive control-based barrier Lyapunov functions for a class of stochastic nonlinear systems with full state constraints. Automatica 87: 83–93.
20.
Liu YJ, Tong SC, Wang W, Li YM (2009) Observer-based direct adaptive fuzzy control of uncertain nonlinear systems and its applications. International Journal of Control, Automation and Systems 7(4): 681–690.
21.
Liu YJ, Zeng Q, Tong S, et al. (2019) Adaptive neural network control for active suspension systems with time-varying vertical displacement and speed constraints. IEEE Transactions on Industrial Electronics 66(12): 9458–9466.
22.
Mu CX, Wang K (2018) Approximate-optimal control algorithm for constrained zero-sum differential games through event-triggering mechanism. Nonlinear Dynamics 95(4): 2639–2657.
23.
Parras J, Del Val J, Zazo S, Zazo J (2016) A new approach for solving anti-jamming games in stochastic scenarios as pursuit-evasion games. In: 2016 IEEE Statistical Signal Processing Workshop (SSP), Palma, Spain, 26–29 June 2016, pp. 1–5. New York, NY, USA: IEEE.
24.
Perelman A, Shima T, Rusnak I (2011) Cooperative differential games strategies for active aircraft protection from a homing missile. Journal of Guidance Control and Dynamics 34(3): 761–773.
25.
Qu QX, Zhang HG, Rui Y, Yang L (2018) Neural network-based sliding mode control for nonlinear systems with actuator faults and unmatched disturbances. Neurocomputing 275: 2009–2018.
26.
Rufus I (1954) Differential games III: the basic principles of the solution process. Technical Report RM-1411-PR, Rand Corporation, Santa Monica, CA.
27.
Sahoo A, Xu H, He S Jagannathan (2015) Neural network-based event-triggered state feedback control of nonlinear continuous-time systems. IEEE Transactions on Neural Networks and Learning Systems 27(3): 497–509.
28.
Sahoo A, Xu H, Jagannathan S (2017) Approximate optimal control of affine nonlinear continuous-time systems using event-sampled neurodynamic programming. IEEE Transactions on Neural Networks and Learning Systems 28(3): 639–652.
29.
Shima T, Golan OM (2007) Linear quadratic differential games guidance law for dual controlled missiles. IEEE Transactions on Aerospace and Electronic Systems 43(3): 834–842.
30.
Song J, Song S (2019) Robust impact angle constraints guidance law with autopilot lag and acceleration saturation consideration. Transactions of the Institute of Measurement and Control 41(1): 182–192.
31.
Sun J, Liu C (2018a) Finite-horizon differential games for missile-target interception system using adaptive dynamic programming with input constraints. International Journal of Systems Science 49(2): 1–20.
32.
Sun JL, Liu CS (2018b) Decentralised zero-sum differential game for a class of large-scale interconnected systems via adaptive dynamic programming. International Journal of Control 92(12): 2917–2927.
33.
Sun JL, Liu CS (2018c) Finite-horizon differential games for missile-target interception system using adaptive dynamic programming with input constraints. International Journal of Systems Science 49(2): 264–283.
34.
Sun JL, Liu CS, Ye Q (2016) Robust differential game guidance laws design for uncertain intercepter-target engagement via adaptive dynamic programming. International Journal of Control 90(5): 990–1004.
35.
Tang Y, He H, Wen J, Liu J (2014) Power system stability control for a wind farm based on adaptive dynamic programming. IEEE Transactions on Smart Grid 6(1): 166–177.
36.
Vamvoudakis KG (2014) Event-triggered optimal adaptive control algorithm for continuous-time nonlinear systems. IEEE/CAA Journal of Automatica Sinica 1(3): 282–293.
Wang D, Liu DR, Wei QL (2011) Adaptive dynamic programming for finite-horizon optimal tracking control of a class of nonlinear systems. In: Proceedings of the 30th Chinese Control Conference, Yantai, China, 22–24 July 2011, pp. 2450–2455. Piscataway, NJ, USA: IEEE.
39.
Wang D, Liu DR, Wei QL (2012) Finite-horizon neuro-optimal tracking control for a class of discrete-time nonlinear systems using adaptive dynamic programming approach. Neurocomputing 78(1): 14–22.
40.
Wang D, Mu CX, He HB, Liu DR (2017) Event-driven adaptive robust control of nonlinear systems with uncertainties through NDP strategy. IEEE Transactions on Systems, Man, and Cybernetics: Systems 47(7): 1358–1370.
41.
Wang D, Mu CX, Zhang QC, Liu DR (2016) Event-based input-constrained nonlinear state feedback with adaptive critic and neural implementation. Neurocomputing 241: 848–856.
42.
Wang Z (2017) Adaptive smooth second-order sliding mode control method with application to missile guidance. Transactions of the Institute of Measurement and Control 39(6): 848–860.
43.
Wei Q, Liu D, Lewis FL, et al. (2017) Mixed iterative adaptive dynamic programming for optimal battery energy control in smart residential microgrids. IEEE Transactions on Industrial Electronics 64(5): 4110–4120.
44.
White DA, Sofge DA, Reinhold VN (1994) Handbook of intelligent control: neural, fuzzy, and adaptive approaches. IEEE Transactions on Neural Networks 7(5): 851–852.
45.
Yang X, Liu DR, Huang YZ (2013) Neural-network-based online optimal control for uncertain nonlinear continuous-time systems with control constraints. IET Control Theory and Applications 7(17): 2037–2047.
46.
Zhang QC, Zhao DB, Zhu YH (2017a) Data-driven adaptive dynamic programming for continuous-time fully cooperative games with partially constrained inputs. Neurocomputing 238: 377–386.
47.
Zhang QC, Zhao DB, Zhu YH (2017b) Event-triggered control for continuous-time nonlinear system via concurrent learning. IEEE Transactions on Systems, Man, and Cybernetics: Systems 47(7): 1071–1081.
48.
Zhao Q, Xu H, Jagannathan S (2015) Neural network-based finite-horizon optimal control of uncertain affine nonlinear discrete-time systems. IEEE Transactions on Neural Networks and Learning Systems 26(3): 486–499.
49.
Zhong XN, Ni Z, He HB (2015) Event-triggered adaptive dynamic programming for continuous-time nonlinear system using measured input-output data. In: 2015 International Joint Conference on Neural Networks (IJCNN), Killarney, Ireland, 12–17 July 2015, pp. 1–8. New York, NY, USA: IEEE.
50.
Zhu YH, Zhao DB, He HB, Ji JH (2017) Event-triggered optimal control for partially-unknown constrained-input systems via adaptive dynamic programming. IEEE Transactions on Industrial Electronics 64(5): 4101–4109.