Abstract
With the rapid advancement of intelligent and clustered unmanned technologies, autonomous decision-making and confrontation among multiple unmanned aerial vehicles (UAVs) have become a prominent research focus for major military powers worldwide. Multi-UAV confrontation environments are characterized by high-dimensional action spaces, nonlinearity, and stringent real-time decision-making requirements, all of which pose significant challenges to existing decision-making algorithms. This paper therefore addresses the real-time maneuvering decision-making problem for multiple UAVs in 2v2 close-range air combat. First, a multi-UAV confrontation simulation environment based on the agent-environment cyclic (AEC) game model is developed to resolve ambiguous reward allocation and to handle dynamic changes in the number of agents. Second, a multi-agent soft actor-critic deep reinforcement learning method is proposed within a centralized training, distributed execution (CTDE) framework, supplemented by a policy training and optimization approach that incorporates curriculum learning. Furthermore, collaborative rewards are introduced alongside the mainline and process rewards to strengthen tactical coordination among UAVs and to improve the effectiveness of the adversarial strategies. Finally, three-dimensional simulation experiments validate the effectiveness and stability of the proposed method.
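As an illustration of how mainline, process, and collaborative rewards might be combined into a single per-agent signal, the following is a minimal sketch only. The weighted-sum form, the weight values, and the names `RewardWeights` and `total_reward` are assumptions made for illustration; the abstract does not specify how the terms are combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    """Hypothetical weights; the values are placeholders, not from the paper."""
    mainline: float = 1.0        # outcome-level reward (e.g. win/loss of an engagement)
    process: float = 0.1         # dense shaping reward earned during the engagement
    collaborative: float = 0.05  # reward for coordination with the teammate


def total_reward(mainline: float,
                 process: float,
                 collaborative: float,
                 weights: RewardWeights = RewardWeights()) -> float:
    """Combine the three reward terms as a weighted sum (an assumed form)."""
    return (weights.mainline * mainline
            + weights.process * process
            + weights.collaborative * collaborative)


if __name__ == "__main__":
    # One agent's reward for a single step under the placeholder weights.
    print(total_reward(mainline=1.0, process=0.5, collaborative=2.0))
```

Keeping the three terms separate until the final sum makes it straightforward to log each component and to rescale the collaborative term during curriculum stages, should a training setup call for that.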
