Cooperative task allocation of heterogeneous UAVs oriented to mobile targets

Abstract

Addressing the challenges of inefficient dynamic programming and unstable performance in heterogeneous unmanned aerial vehicle (UAV) systems for tracking mobile targets in complex environments, this paper establishes a task allocation optimization model that comprehensively incorporates the impact of UAV capture range, flight distance, and mission time benefits on allocation results. Meanwhile, specific enhancements to the classical Kalman filter are introduced to mitigate allocation oscillations during target pursuit, thereby improving model robustness. To solve this model, we propose a Gray Wolf-Hippopotamus Optimization Algorithm (GWHO) for real-time allocation. In addition, for newly emerging targets, an event-triggered dominant-solution retention strategy is implemented to preserve high-quality solutions from historical allocations and accelerate algorithm convergence. Simulation analyses demonstrate that the improved algorithm achieves a 25.8% faster convergence on benchmark test functions while reducing allocation oscillations by over 60%. Moreover, the model consistently achieves optimal allocation results across diverse scales of UAV swarms and targets, validating its effectiveness and applicability.

Keywords

Moving target, Kalman filter, task allocation, hippopotamus algorithm, maintain dominance solution

Introduction

In recent years, unmanned aerial vehicles (UAVs) have demonstrated increasingly prominent roles in environmental monitoring, surveillance, and target capture applications due to their advantages of low operational costs, high maneuverability, and strong adaptability (Ding et al., 2023). Particularly in military operations, their high cost-effectiveness, enhanced controllability, and adaptable coordination capabilities position them as critical forces in future intelligent battlefields (Peng et al., 2021). However, given the growing complexity of operational environments, an individual UAV faces limitations in coverage range and system fault tolerance when engaging multiple targets, failing to satisfy diverse mission requirements (Wang et al., 2025). Consequently, improving overall mission effectiveness and efficiently coordinating heterogeneous multi-UAV systems have emerged as significant research focuses (Zhang et al., 2025; Zhu et al., 2021).

The task allocation problem for heterogeneous multi-UAV systems can be formalized as a complex combinatorial optimization problem subject to multiple constraints (Chakraa et al., 2025; Edison and Shima, 2011). Current mainstream allocation models primarily include Distributed Network Flow Optimization (DNFO) (Hochbaum et al., 2022), Mixed-Integer Linear Programming (MILP) (Nguyen et al., 2019), and Cooperative Multi-Task Allocation Protocol (CMTAP) (Sun et al., 2022). Building on these frameworks, solutions are typically derived through centralized (Duan et al., 2020; Skaltsis et al., 2021) or distributed algorithms (Du et al., 2025; Otte et al., 2020). Liu et al. (2024) incorporated an elite opposition-based learning strategy into the snake optimization algorithm and, by integrating adaptive threshold adjustment, enhanced the local search performance and convergence speed of the algorithm. Wei et al. (2020) extended single-objective particle swarm optimization (PSO) to address multi-robot collaborative task allocation and proposed an enhanced multi-objective variant. Cheng et al. (2019) introduced allocation sequencing and state-transition modules into genetic algorithms to optimize cooperative strike missions against enemy positions. For local reallocation during area reconnaissance-strike operations, Chen et al. (2021) developed an information consistency algorithm incorporating UAV communication range and time-delay constraints. Separately, Chen et al. (2022) integrated an extended consensus bundle algorithm with dynamic task allocation to generate rapid replanning solutions for emergent targets.

However, existing solutions primarily address static targets, with limited consideration of how targets’ dynamic mobility affects allocation decisions. Miao et al. (2023) adopted a novel population initialization method to eliminate redundant solutions during the search process and further integrated the multimodal multi-objective differential evolution algorithm with simulated annealing. Consequently, this proposed approach provided decision-makers with more effective, equivalent optimal solutions for dynamic task allocation in multi-robot systems. Fei et al. (2024) proposed an air-ground cooperative task allocation method for mobile target search-and-strike missions in complex urban environments. Their digital pheromone-based search model enhances resource utilization efficiency. Guo et al. (2024) developed a hybrid framework that integrates real-time trajectory prediction with distributed game-theoretic decision-making, enabling effective allocation for fast-moving targets. Kang et al. (2022) introduced an optimized PSO algorithm incorporating Local Random Search (LRS) and Variable Neighborhood Search (VNS), in which adaptive learning factors in the update strategies resolve the allocation for simultaneous UAV attacks. Hao et al. (2021) established a dynamic allocation framework fusing distributed election and centralized allocation algorithms to balance solution optimality and efficiency. Based on the weapon-target allocation problem, Zhang et al. (2024) combined heuristic allocation with deep Q-network and graph neural network frameworks to improve interception success rates against aerial targets. To address unknown target trajectories, Yue et al. (2023) proposed a Federated Multi-agent Soft Actor-Critic (FMASAC) scheme that enables cooperative tracking in unknown environments while reducing policy update variance. Chen and Liu (2019) solved cooperative allocation for ground mobile targets by projecting intercept points and implementing online trajectory planning. Song et al. (2022) tackled prolonged computation and frequent target switching during iterative allocation through a capture-marker-based algorithm that achieves uniform saturation distribution.

In summary, existing studies have extensively investigated task allocation optimization for UAV-based mobile target capture. However, three critical challenges persist: (1) How to holistically incorporate heterogeneous UAVs’ performance metrics to rapidly fulfill allocation requirements across diverse missions, selecting optimal UAV combinations for target tracking and capture to maximize overall mission utility; (2) How to accelerate dynamic reallocation for emergent targets after initial assignment to ensure rapid response; (3) How to suppress allocation churning—oscillations caused by UAV matching parameters fluctuating near activation thresholds during large-scale real-time allocation—thereby ensuring model robustness. To address these challenges, this paper proposes a cooperative task allocation framework for heterogeneous multi-UAV systems targeting mobile objectives. Specifically, we establish an allocation model accounting for UAV performance disparities and mission constraints, integrate Kalman filtering to stabilize allocation outputs, and develop a Gray Wolf-Hippopotamus Optimization Algorithm (GWHO) for rapid model optimization and emergent target redistribution.

The main innovations of this paper are as follows:

Incorporating mission-specific requirements, UAV capture ranges, flight distances, and pursuit durations, we establish a heterogeneous multi-UAV systems task allocation model. To address real-time dynamic allocation needs, the GWHO algorithm is developed by introducing a tiered predation mechanism (Mirjalili et al., 2014). This approach applies weighted averaging to partial dominant solutions to constrain convergence tendencies, accelerating algorithmic convergence and enhancing allocation timeliness.

To accelerate the reallocation of emergent targets during pursuit, a dominant-solution retention strategy is proposed to improve redistribution efficiency.

Building upon these allocations, we introduce an improved Kalman filter-based activation buffering mechanism to resolve threshold-proximal oscillations.

The remainder of this paper is structured as follows: Section Modeling of Heterogeneous Multi-UAVs Mission Allocation introduces the task allocation optimization model for heterogeneous multi-UAV systems in pursuing moving targets, as well as a buffering mechanism based on an improved Kalman filtering algorithm to mitigate allocation oscillations. Section GWHO Algorithm proposes the GWHO algorithm to solve the aforementioned model. It introduces an event-triggered elite solution retention strategy to handle dynamic task allocation adjustments for emergent targets. Section Experimental Simulation and Analysis presents simulation results for allocation schemes across different UAV fleet sizes in both two-dimensional (2D) and three-dimensional (3D) spaces, along with performance testing and analysis of the algorithm. The effectiveness of the core innovations is further validated through ablation experiments. Finally, Section Conclusion summarizes the paper and clarifies the limitations of this study as well as potential directions for future research.

Modeling of heterogeneous multi-UAVs mission allocation

Problem description

This paper establishes a framework for collaborative ground target tracking and capture using heterogeneous multi-UAV systems. We define distinct effective capture ranges, flight speeds, and energy consumption rates for each UAV type, and perform task allocation while accounting for dynamic updates to target positions.

The scenario assumes N heterogeneous UAVs executing combat missions against M targets, where the UAV set is denoted as $U = \{U_{1}, U_{2}, \dots, U_{N}\}$ and the target set is denoted as $G = \{G_{1}, G_{2}, \dots, G_{M}\}$. The attribute set of each UAV is $U_{data} = (ID, Position, Velocity, Cons, Range, Hight, Endurance)$. For the ith UAV, U^ID denotes its identification number, U^Position its real-time position, U^Velocity its cruising speed, U^Cons its energy consumption, U^Range its effective capture range, and U^Hight its flight altitude data. Here, U^Hight is a quadruple consisting of the current flight altitude H_height, takeoff speed H_speed, minimum capture altitude H_lowWork, and maximum flight altitude H_highest. U^Endurance represents the endurance mileage of the UAV. We define a decision variable c_ij: c_ij = 1 indicates that UAV i has been assigned to target j to carry out the mission, and c_ij = 0 indicates that UAV i has not been assigned to target j. The attribute set of each target is $G_{data} = (ID, Position, Tnumber, Velocity)$, whose elements denote, respectively, the ID, the real-time position, the number of UAVs required to capture the target, and the speed of target j.

Building on this framework while incorporating practical constraints, we formulate the following assumptions for our paper:

All targets reside within the monitored area, and UAVs can acquire their positional data in real time.

Each UAV executes only one mission at a time, though mission-specific UAV requirements vary.

During tracking-capture operations, UAVs can immediately complete missions when targets enter their effective capture ranges.

The operational area is represented in a 2D Cartesian coordinate system, with UAV-target distances computed using Euclidean distance metrics.

Objective function

When allocating tasks to different targets, we comprehensively consider the task completion benefit and path energy consumption cost, aiming to maximize time benefit, minimize flight distance, and minimize energy consumption. Based on this, we define the sum of the UAV’s flight distances in a 2D environment as the total flight distance and formulate the flight distance function as follows

D_{i} = \sum_{j = 1}^{M} c_{i j} \cdot ∥ U_{i}^{P} - G_{j}^{P} ∥

(1)

In the formula, $U_{i}^{P}$ and $G_{j}^{P}$ are real-time positions of UAV i and target j, respectively. D_i represents the total flight distance of UAV i.

The time-benefit function for a UAV capturing a target is designed as follows

T_{ij} = exp (- \frac{{(t_{ij}^{*} - τ_{j}^{best})}^{2}}{2 σ_{t}^{2}})

(2)

In the formula, $t_{ij}^{*}$ denotes the expected capture time of target j by UAV i, $τ_{j}^{best}$ represents the optimal capture time, and σ_t is the time-sensitive factor. When a task requires only one UAV, the optimal capture time is defined as the moment when the target enters that UAV’s effective capture range. For targets requiring multiple UAVs, the optimal capture time is defined as the time when the target enters the effective capture range of the final UAV.
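As a minimal sketch of equation (2), the following Python snippet evaluates the time benefit for a single UAV-target pair. The function name `time_benefit` and the numerical values are illustrative assumptions and are not taken from the experiments.

```python
import math

def time_benefit(t_expected, tau_best, sigma_t):
    """T_ij from Eq. (2): a Gaussian-shaped benefit that peaks at tau_best."""
    return math.exp(-((t_expected - tau_best) ** 2) / (2.0 * sigma_t ** 2))

# Hypothetical values in seconds: capture exactly at the optimal time,
# and capture one sigma_t later than the optimal time.
print(time_benefit(100.0, 100.0, 20.0))            # 1.0
print(round(time_benefit(120.0, 100.0, 20.0), 4))  # 0.6065
```

The benefit equals 1 when the expected capture time coincides with the optimal capture time and decays symmetrically as the two diverge, with σ_t controlling how quickly it decays.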

Based on the above functions, the objective functions constructed in this paper are as follows

F_{score} = λ_{1} \sum_{i = 1}^{N} D_{i} + λ_{2} \sum_{i = 1}^{N} α_{i} \cdot D_{i} - λ_{3} \sum_{i = 1}^{N} \sum_{j = 1}^{M} T_{ij}

(3)

In the formula, λ₁, λ₂, and λ₃ denote the weight coefficients of the respective components, T_ij represents the time-based benefit metric obtained by UAV i from the detection to the final capture of target j, D_i indicates the flight distance of UAV i, and α_i is the energy consumption per unit distance for UAV i.
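To make the structure of equation (3) concrete, the sketch below evaluates it for a small hypothetical allocation. The function name `f_score`, the positions, and the benefit matrix `T` are assumptions for illustration; the weights are the ones reported in the simulation section of this paper.

```python
import math

def f_score(c, uav_pos, target_pos, alpha, T, lambdas):
    """Evaluate Eq. (3) for a binary allocation matrix c of size N x M."""
    lam1, lam2, lam3 = lambdas
    n, m = len(uav_pos), len(target_pos)
    # Eq. (1): flight distance of each UAV to its assigned target.
    D = [
        sum(c[i][j] * math.dist(uav_pos[i], target_pos[j]) for j in range(m))
        for i in range(n)
    ]
    distance_term = sum(D)
    energy_term = sum(alpha[i] * D[i] for i in range(n))
    # Eq. (3) sums T_ij over all pairs, so this sketch assumes T holds
    # zeros for unassigned pairs.
    time_term = sum(T[i][j] for i in range(n) for j in range(m))
    return lam1 * distance_term + lam2 * energy_term - lam3 * time_term

# Hypothetical example: 2 UAVs and 2 targets, positions in km.
uav_pos = [(0.0, 0.0), (3.0, 0.0)]
target_pos = [(3.0, 4.0), (3.0, 1.0)]
c = [[1, 0], [0, 1]]
alpha = [0.4, 0.1]
T = [[0.9, 0.0], [0.0, 0.8]]
score = f_score(c, uav_pos, target_pos, alpha, T, lambdas=(0.4, 0.3, 0.3))
print(round(score, 2))  # 2.52
```

Because the distance and energy terms enter with positive signs and the time benefit with a negative sign, shorter paths and better-timed captures both lower the value computed here.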

Constraint conditions

Constraint on the number of UAVs

When target j requires coordinated capture by k_j UAVs, it must be ensured that at least k_j UAVs are assigned to capture the target within the designated time window

\sum_{i = 1}^{N} c_{ij} \cdot 1 \geq k_{j}

(4)

In the formula, c_ij represents allocation decision vector between UAV i and target j, while 1 denotes a one-dimensional unit column vector. k_j represents the number of UAVs required to capture the target j.

Voyage constraints

In practical scenarios, UAVs have limited flight ranges due to endurance and payload constraints. Thus, during the task assignment process, it is imperative to ensure that the cumulative flight distance of each UAV does not exceed its maximum design range

\sum_{j = 1}^{M} c_{ij} \cdot ∥ U_{i}^{P} - G_{j}^{P} ∥ \leq U_{i}^{R}

(5)

Decision variable constraints

During mission execution, all targets must be captured. While each target can be captured by multiple UAVs, each UAV is limited to pursuing only one target. In addition, each UAV must adhere to the constraint that it can perform only one task at a time. Consequently, the following constraints must be satisfied

\begin{cases} \sum_{i = 1}^{N} c_{ij} \geq 1 \\ \sum_{j = 1}^{M} c_{ij} \leq 1 \\ c_{ij} \in \{0, 1\} \end{cases}

(6)
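Constraints (4) to (6) can be checked for a candidate allocation as in the following sketch. The function name `is_feasible`, the argument `max_range` (standing in for U_i^R), and the scenario values are hypothetical and serve only to illustrate the checks.

```python
import math

def is_feasible(c, uav_pos, target_pos, k_required, max_range):
    """Check Eqs. (4)-(6) for a candidate allocation matrix c of size N x M."""
    n, m = len(uav_pos), len(target_pos)
    for i in range(n):
        # Eq. (6): binary entries and at most one target per UAV.
        if any(v not in (0, 1) for v in c[i]) or sum(c[i]) > 1:
            return False
        # Eq. (5): assigned flight distance must stay within the range limit.
        d = sum(c[i][j] * math.dist(uav_pos[i], target_pos[j]) for j in range(m))
        if d > max_range[i]:
            return False
    for j in range(m):
        assigned = sum(c[i][j] for i in range(n))
        # Eq. (4) and Eq. (6): at least k_j UAVs, and never fewer than one.
        if assigned < max(1, k_required[j]):
            return False
    return True

# Hypothetical scenario: 3 UAVs, 2 targets, positions and ranges in km.
uav_pos = [(0.0, 0.0), (1.0, 0.0), (9.0, 9.0)]
target_pos = [(0.0, 3.0), (9.0, 8.0)]
k_required = [2, 1]
max_range = [10.0, 10.0, 10.0]
good = [[1, 0], [1, 0], [0, 1]]
bad = [[1, 0], [0, 1], [0, 1]]   # target 0 receives only one UAV
print(is_feasible(good, uav_pos, target_pos, k_required, max_range))  # True
print(is_feasible(bad, uav_pos, target_pos, k_required, max_range))   # False
```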

Anti-collision detection constraints

During the target encirclement mission, a collision risk area is established to mitigate potential collisions between UAVs during both takeoff and cruising. If the altitude difference between any two UAVs falls below the safety threshold risk_Height, and their horizontal distance is simultaneously less than risk_Dis, predefined constraints are activated. In response, the system increases the takeoff speed of the higher-altitude UAV while reducing that of the lower one, continuing until all UAVs have exited the collision risk area. This strategy effectively prevents collisions by dynamically adjusting the vertical profile of the UAV formation in real time

\begin{cases} | U_{i}^{H\_height} - U_{j}^{H\_height} | > risk\_Height \\ \sqrt{{(x_{i} - x_{j})}^{2} + {(y_{i} - y_{j})}^{2}} > risk\_Dis \end{cases}

(7)
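The collision-risk test and the vertical speed adjustment described above can be sketched as follows. The function names, the adjustment step `delta`, and the threshold values are hypothetical; the paper does not specify the magnitude of the speed change.

```python
import math

def in_collision_risk(p, q, risk_height, risk_dis):
    """p and q are (x, y, altitude) tuples.

    A risk is flagged only when both the altitude gap and the horizontal
    distance fall below their thresholds, as described in the text.
    """
    horizontal = math.hypot(p[0] - q[0], p[1] - q[1])
    vertical = abs(p[2] - q[2])
    return vertical < risk_height and horizontal < risk_dis

def adjust_vertical_speeds(p, q, speed_p, speed_q, delta):
    """Speed up the higher UAV and slow down the lower one by delta."""
    if p[2] >= q[2]:
        return speed_p + delta, max(0.0, speed_q - delta)
    return max(0.0, speed_p - delta), speed_q + delta

# Hypothetical positions in metres: 5 m apart horizontally, 2 m vertically.
a = (100.0, 100.0, 42.0)
b = (103.0, 104.0, 40.0)
print(in_collision_risk(a, b, risk_height=5.0, risk_dis=10.0))  # True
print(adjust_vertical_speeds(a, b, 15.0, 15.0, delta=1.0))      # (16.0, 14.0)
```

In a simulation loop, the adjustment would be repeated at each step until `in_collision_risk` returns False for every pair of UAVs.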

In summary, the heterogeneous multi-UAV systems task assignment model for moving targets investigated in this paper can be described as follows

\begin{aligned} \min \; & F_{score} = λ_{1} \sum_{i = 1}^{N} D_{i} + λ_{2} \sum_{i = 1}^{N} α_{i} \cdot D_{i} - λ_{3} \sum_{i = 1}^{N} \sum_{j = 1}^{M} T_{ij} \\ s.t. \; & \sum_{i = 1}^{N} c_{ij} \cdot 1 \geq k_{j} \\ & \sum_{j = 1}^{M} c_{ij} \cdot ∥ U_{i}^{P} - G_{j}^{P} ∥ \leq U_{i}^{R} \\ & \sum_{i = 1}^{N} c_{ij} \geq 1 \\ & \sum_{j = 1}^{M} c_{ij} \leq 1 \\ & c_{ij} \in \{0, 1\} \\ & | U_{i}^{H\_height} - U_{j}^{H\_height} | > risk\_Height \\ & \sqrt{{(x_{i} - x_{j})}^{2} + {(y_{i} - y_{j})}^{2}} > risk\_Dis \end{aligned}

(8)

Improved Kalman filtering buffer mechanism

In the proposed model, a continuous matching matrix C is initialized stochastically, where element $C_{ij} \in [0, 1]$ represents the continuous matching value between UAV i and target j. This matrix serves as input to the GWHO algorithm for iterative optimization. The final allocation matrix c is obtained by discretizing the output matrix C into binary values {0,1} using activation threshold c_Active.

During allocation, each UAV’s comprehensive benefit for capturing a target is computed by sequentially evaluating three metrics: capture time benefit, path benefit, and flight energy consumption benefit, followed by weighted aggregation. In dynamic environments, however, allocation oscillations may occur due to:

Temporal variations in UAV performance parameters and target positions, which generate multiple near-optimal solutions with comparable benefit values.

Intentional randomization in UAV quantity constraints that maintains global exploration capability. Specifically, during iterations, surplus UAVs are randomly pruned or deficit UAVs supplemented to satisfy mission-specific requirements when the available UAV count deviates from requirements.

These factors cause the continuous matching matrix C to oscillate near the activation threshold c_Active during optimization. Consequently, some UAVs frequently switch between active and standby states, generating allocation churn that degrades system robustness and compromises mission effectiveness. To address this, we integrate a Kalman filter-based state estimator as a buffering mechanism into our allocation framework. This approach suppresses oscillations by filtering transient fluctuations through weighted averaging of predicted and measured matching values at consecutive time steps.

Kalman filtering is a probabilistic inference-based algorithm that estimates system states by fusing predictive models with measurement data through two sequential phases: prediction and update. Capitalizing on its capability to deliver optimal state estimates in noisy environments by combining measurement sequences with state-transition models, we apply Kalman filtering to buffer the continuous matching matrix C after each allocation cycle. The modified state estimation and observation equations for the prediction phase are

\begin{cases} {\hat{F}}_{t}^{-} = {\hat{F}}_{t - 1} + w_{t} \\ z_{t} = {\hat{F}}_{t} + r_{t} \end{cases}

(9)

In the formula, F denotes the continuous matching matrix between targets and UAVs, where each element f_ij is a real number between 0 and 1. UAV i is deemed activated for pursuing target j if $f_{ij} > c\_Active$; otherwise, it remains in standby mode. ${\hat{F}}_{t}^{-}$ represents the prior estimate of the continuous matching matrix at time t; ${\hat{F}}_{t - 1}$ represents the posterior estimate of the continuous matching matrix at time t-1; w_t denotes the state estimation error at time t; z_t represents the continuous matching matrix produced by the proposed GWHO algorithm at time t, which is treated as the observation matrix; and r_t represents the observation error at time t.

The calculation formula for the estimated values in the update stage is as follows

\begin{cases} {\hat{F}}_{t} = {\hat{F}}_{t}^{-} + K_{t} (z_{t} - {\hat{F}}_{t}^{-}) \\ K_{t} = \frac{w_{t - 1}}{w_{t - 1} + r_{t}} \\ w_{t} = (1 - K_{t}) \cdot w_{t - 1} \end{cases}

(10)

In the formula, ${\hat{F}}_{t}$ represents the posterior estimation matrix of activation value at time t, K_t represents the Kalman gain.
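The buffering effect of equations (9) and (10) can be illustrated with a scalar sketch applied to one entry of the matching matrix. The function names, the initial value `w0`, the constant observation error `r`, the threshold 0.5, and the observation sequence are all assumptions chosen for illustration. The prior is taken to be the previous posterior, which corresponds to using the zero mean of the noise term in equation (9).

```python
def kalman_buffer(z_seq, w0, r):
    """Scalar form of Eqs. (9)-(10) for one entry of the matching matrix.

    z_seq: observed matching values from successive allocation cycles.
    w0:    initial estimation-error term (hypothetical tuning value).
    r:     observation-error term, held constant here for simplicity.
    """
    f_post, w = z_seq[0], w0           # initialise with the first observation
    filtered = [f_post]
    for z in z_seq[1:]:
        f_prior = f_post               # Eq. (9): prior from previous posterior
        k = w / (w + r)                # Eq. (10): Kalman gain
        f_post = f_prior + k * (z - f_prior)
        w = (1.0 - k) * w              # Eq. (10): error update
        filtered.append(f_post)
    return filtered

def count_switches(values, c_active):
    """Count active/standby transitions after thresholding at c_active."""
    states = [v > c_active for v in values]
    return sum(a != b for a, b in zip(states, states[1:]))

# Hypothetical matching values oscillating around the threshold 0.5.
z = [0.62, 0.48, 0.55, 0.47, 0.58, 0.49]
smoothed = kalman_buffer(z, w0=1.0, r=1.0)
print(count_switches(z, 0.5))         # 5
print(count_switches(smoothed, 0.5))  # 0
```

In this example the raw values cross the threshold at every step, whereas the filtered values stay on one side of it, so the UAV remains in a single state.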

GWHO algorithm

To address the need for rapid model optimization, this section proposes the GWHO algorithm by fusing gray wolf optimizer (GWO) and hippopotamus optimization (HO) algorithm principles. It involves two key innovations: First, the social hierarchy and hunting mechanism inherent in GWO are incorporated into HO’s initialization phase. Second, an event-triggered dynamic strategy is introduced to enable rapid task reallocation through dominant-solution retention.

Content of the GWHO algorithm

The traditional HO algorithm determines the global optimal solution of optimization problem by establishing a three-stage model, including Pond position updating (exploration phase), Predator defense (exploration phase), and Predator evasion (exploitation phase). After initial solution generation, the exploration phase updates hippopotamus positions. The position update equations for females and juveniles are

x_{it}^{Fho} = \begin{cases} x_{it} + h_{1} \cdot (D_{ho} - I_{2} \cdot MG_{t}), & T > 0.6 \\ Ξ, & \text{else} \end{cases}

(11)

Ξ = \begin{cases} x_{it} + h_{2} \cdot (MG_{t} - D_{ho}), & r_{6} > 0.5 \\ lb_{i} + r_{7} \cdot (ub_{i} - lb_{i}), & \text{else} \end{cases}

(12)

h = \begin{cases} I_{2} \times \vec{r_{1}} + (\sim Q_{1}) \\ 2 \times \vec{r_{2}} - 1 \\ \vec{r_{3}} \\ I_{1} \times \vec{r_{4}} + (\sim Q_{2}) \\ r_{5} \end{cases}

(13)

T = exp (- \frac{t}{Γ})

(14)

In the formula, $x_{it}^{Fho}$ represents the real-time position of female hippopotamus i in iteration t, and x_it represents the real-time position of hippopotamus i in iteration t. h₁ and h₂ are randomly selected from the five expressions in equation (13). D_ho denotes the position of the dominant hippopotamus in the current iteration. MG_t signifies the average position of a randomly selected group of hippos in iteration t. ub_i and lb_i represent, respectively, the upper and lower bounds of the position of hippopotamus i. I₁ and I₂ are random integers taking the value 1 or 2. r₁ to r₄ are random vectors with elements between 0 and 1, and r₅ to r₇ are random numbers between 0 and 1. Q₁ and Q₂ are random integers that can only be 0 or 1. t represents the current iteration, and Γ represents the total number of algorithm iterations.
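A minimal sketch of the female and juvenile update in equations (11) to (14) is given below, using plain Python lists for positions. The function names are hypothetical, `(~Q)` is read as the logical negation `1 - Q`, and the scalar r₅ is broadcast to all dimensions; these readings are assumptions of the sketch.

```python
import math
import random

def temperature(t, total_iters):
    """Eq. (14): T = exp(-t / Gamma)."""
    return math.exp(-t / total_iters)

def sample_h(dim, i1, i2, q1, q2, rng):
    """Pick one of the five expressions in Eq. (13) as a length-dim vector."""
    def rand_vec():
        return [rng.random() for _ in range(dim)]
    options = [
        [i2 * v + (1 - q1) for v in rand_vec()],
        [2 * v - 1 for v in rand_vec()],
        rand_vec(),
        [i1 * v + (1 - q2) for v in rand_vec()],
        [rng.random()] * dim,          # scalar r5 broadcast to every dimension
    ]
    return rng.choice(options)

def female_update(x, d_ho, mg, lb, ub, t, total_iters, rng):
    """Eqs. (11)-(12): candidate position for a female or juvenile hippo."""
    dim = len(x)
    i1, i2 = rng.randint(1, 2), rng.randint(1, 2)
    q1, q2 = rng.randint(0, 1), rng.randint(0, 1)
    if temperature(t, total_iters) > 0.6:
        h1 = sample_h(dim, i1, i2, q1, q2, rng)
        return [x[d] + h1[d] * (d_ho[d] - i2 * mg[d]) for d in range(dim)]
    if rng.random() > 0.5:             # r6
        h2 = sample_h(dim, i1, i2, q1, q2, rng)
        return [x[d] + h2[d] * (mg[d] - d_ho[d]) for d in range(dim)]
    r7 = rng.random()
    return [lb[d] + r7 * (ub[d] - lb[d]) for d in range(dim)]

rng = random.Random(0)
candidate = female_update([0.2, 0.8], [0.5, 0.5], [0.4, 0.6],
                          [0.0, 0.0], [1.0, 1.0], t=10, total_iters=100, rng=rng)
print(candidate)
```

Since T decreases from 1 toward exp(-1) over the run, the first branch of equation (11) dominates early iterations and the second branch takes over once T falls below 0.6.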

By incorporating the population’s average position during initial iterations, female hippos conduct extensive global exploration within the solution space, guiding the swarm toward current dominant solutions. However, the algorithm’s random selection neglects individual fitness values. A high proportion of inferior solutions among the selected hippos may misdirect the search and slow convergence. To mitigate this, we integrate the social hierarchy mechanism from the GWO. The enhanced formulation yields

P = \max_{a \in A} F_{score} (a)

(15)

In the formula, $A \subseteq X, | A | = g$ . X is the set containing all hippos, from which g hippos are randomly selected to form set A. P is the position corresponding to the hippopotamus with the highest fitness value in set A

M G_{t}^{G W H O} = \frac{1}{3} \sum_{i = 1}^{3} P_{i}

(16)

In the formula, ${MG}_{t}^{GWHO}$ represents the average position of the hippopotamus that was finally selected in the iteration t.

The enhancement operates as follows:

Perform three independent random selections of hippos from the population.

Compute fitness values and retain the position of the highest-fitness hippo per selection.

The three retained hippos are treated as analogous to the three leading wolves in a gray wolf population. Their positions are averaged and used as the mean position for female hippos during the iterative process. With this operation, the update for female hippos is driven not by the average of randomly selected hippos but by the average of randomly selected dominant hippos, thereby accelerating convergence. The pseudo-code is as follows:

Algorithm Phase 1: Exploration
Input: $D_{ho}, X, Agents, h$
for i = 1 : Agents/2 do
    $I_{1}, I_{2}$ ← randi([1,2],1,2)
    $Q_{1}, Q_{2}$ ← randi([0,1],1,2)
    for k = 1 : 3 do
        R_n ← randperm(Agents,1)          % size of the k-th random subset
        R_g ← randperm(Agents,R_n)        % indices of the selected hippos
        $A = X (R_{g})$
        fit ← fitness(A)
        [fit_wolf,X_sort] ← sort(fit)     % ascending order
        X_wolf(k) ← A(X_sort(1))
    end
    ${MG}_{t}^{GWHO}$ ← mean(X_wolf)
    $X_{it}^{ho}$ ← X_it + h₁(D_ho-I₁X_it)
    T = exp(-t/Max_iterations)
    if T > 0.8
        $X_{it}^{Fho}$ ← X_it+h₂(D_ho-I₂ ${MG}_{t}^{GWHO}$ )
    else
        if rand() > 0.7
            $X_{it}^{Fho}$ ← X_it+h₃*( ${MG}_{t}^{GWHO}$ -D_ho)
        else
            $X_{it}^{Fho}$ ← (rand*(up-low) + low)
        end
    end
    $X_{it}^{Fho}$ = min(max( $X_{it}^{Fho}$ ,low),up)
    if fitness( $X_{it}^{ho}$ ) < fitness( $X_{it}^{Fho}$ )
        fit_best ← fitness( $X_{it}^{ho}$ )
    else
        fit_best ← fitness( $X_{it}^{Fho}$ )
    end
end
Output: X
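The tiered selection at the core of the pseudo-code can be sketched in Python as follows. The function names and the sphere cost function are hypothetical, and the best individual of each subset is taken to be the one with the lowest fitness value, following the ascending sort in the pseudo-code; this reading is an assumption of the sketch.

```python
import random

def gwho_mean_position(population, fitness, rng):
    """Average the best individual of three random subsets, Eqs. (15)-(16)."""
    n = len(population)
    leaders = []
    for _ in range(3):
        size = rng.randint(1, n)                 # random subset size
        subset = rng.sample(range(n), size)      # random subset of indices
        # Assumption: lower fitness value is better.
        best = min(subset, key=lambda idx: fitness(population[idx]))
        leaders.append(population[best])
    dim = len(population[0])
    return [sum(p[d] for p in leaders) / 3.0 for d in range(dim)]

def sphere(x):
    """Hypothetical cost function used only for this illustration."""
    return sum(v * v for v in x)

rng = random.Random(0)
population = [[rng.uniform(-5.0, 5.0) for _ in range(2)] for _ in range(10)]
mg = gwho_mean_position(population, sphere, rng)
print(mg)
```

Compared with averaging an arbitrary random group, each of the three averaged positions has already won a comparison within its own subset, so the resulting mean is biased toward better regions of the search space.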

Event-triggered reallocation strategy

Following the allocation of existing targets, emergent targets necessitate dynamic adjustments to the scheme. Reinitializing the allocation algorithm would prolong solution times and reduce efficiency. To address this, we treat new targets as event triggers. Upon triggering, partial dominant solutions from the current allocation are retained as initial solutions for subsequent iterations. This enables local scheme adjustments and accelerates optimization. Figure 1 illustrates the complete algorithm workflow.

Figure 1.

GWHO algorithm flow chart.

The specific process steps are as follows:

Step 1: Initialize the UAVs and the existing target information, and set the allocation algorithm’s invocation interval Δt. Periodically invoke the allocation algorithm at every Δt interval to update or maintain the assignment results in response to target movement.

Step 2: Set the event trigger Flag. If no new targets emerge: Set Flag to 0, randomly initialize the hippopotamus population, and proceed to the GWHO allocation algorithm. When a new target emerges: Set Flag to 1. During the population initialization phase, initialize a specified proportion of the hippopotamus individuals using partial elite solutions from the previous assignment result; initialize the remaining individuals randomly.

Step 3: Execute the GWHO algorithm. Upon completion of population initialization, perform position update iterations on the hippopotamus population to find the optimal UAV-to-target assignment solution.

Step 4: Apply filtering and buffering to the obtained assignment solution and adopt it as the final allocation output.

Step 5: Determine whether the simulation end time has been reached. If so, terminate the simulation. Otherwise, check whether the next algorithm invocation time has arrived. If so, invoke the allocation algorithm; if not, maintain the current state until the scheduled invocation time.
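The population initialization in Step 2 can be sketched as follows. The function name `init_population`, the retention proportion `keep_ratio`, and the flattened vector encoding are hypothetical, since the paper does not state the proportion used. The sketch also assumes that the retained elite solutions have already been extended to the dimension required after the new target appears.

```python
import random

def init_population(pop_size, dim, flag, elites=None, keep_ratio=0.5, rng=None):
    """Seed part of the population with elite solutions when flag == 1.

    elites: solutions from the previous allocation, best first, each a list
            of `dim` matching values in [0, 1] (assumed already resized).
    """
    rng = rng or random.Random()
    population = []
    if flag == 1 and elites:
        n_keep = min(len(elites), int(keep_ratio * pop_size))
        population = [list(e) for e in elites[:n_keep]]
    # Fill the remaining slots with random matching values in [0, 1).
    while len(population) < pop_size:
        population.append([rng.random() for _ in range(dim)])
    return population

# Hypothetical elite solutions retained from the previous allocation.
elites = [[0.9, 0.1, 0.2, 0.8], [0.8, 0.2, 0.1, 0.9], [0.7, 0.3, 0.3, 0.7]]
pop = init_population(10, 4, flag=1, elites=elites, rng=random.Random(1))
print(len(pop), pop[:3] == elites)  # 10 True
```

With `flag=0` the same function returns a fully random population, which corresponds to the periodic invocation without new targets.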

Experimental simulation and analysis

To evaluate the performance of the improved algorithm proposed in this paper, a simulation environment was established to execute the algorithm, and its performance was compared with that of the traditional hippopotamus algorithm.

Algorithm performance testing

This study employs a set of benchmark test functions to evaluate the performance differences between the proposed GWHO algorithm and the conventional HO algorithm. By comparing the algorithm-computed values with the theoretical optima under varying iteration numbers, we obtained the relative error curves illustrated in Figure 2(a) and (c). These curves not only reveal the convergence behavior of each test function but also provide a detailed comparison of relative errors before and after the algorithm improvement. It is worth noting that all parameters of the baseline HO algorithm were configured according to the standard values recommended in the original reference (Amiri et al., 2024). As shown in Figure 2(a), the test functions include F4: Schwefel’s Problem, F9: Generalized Rastrigin, F10: Ackley, F14: Shekel’s Foxholes, and F18: Goldstein-Price. Figure 2(c) presents the results for F19–F23 from the Hartman and Shekel families. The results clearly show that the relative error curves of the GWHO algorithm decline more rapidly across all test functions compared to those of the HO algorithm. Moreover, at any given iteration count, GWHO consistently achieves lower relative error. In addition, Figure 2(b) and (d) show the number of iterations required for each algorithm to first attain a relative error below 0.01%. The GWHO algorithm reduces the number of necessary iterations by an average of 25.8% compared to the HO algorithm, demonstrating its ability to reach the target precision more rapidly and with improved convergence efficiency.
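The iteration-count metric described above can be computed as in the following sketch. The function names and the error curves are hypothetical and are not the data behind Figure 2. Averaging the per-function relative reductions is one plausible way to obtain a mean reduction and is an assumption here.

```python
def first_iteration_below(errors, tol=1e-4):
    """Return the 1-based iteration at which the relative error first drops
    below tol (0.01% corresponds to 1e-4), or None if it never does."""
    for iteration, err in enumerate(errors, start=1):
        if err < tol:
            return iteration
    return None

def mean_reduction(baseline_iters, improved_iters):
    """Average relative reduction in iteration count across test functions."""
    ratios = [(b - g) / b for b, g in zip(baseline_iters, improved_iters)]
    return sum(ratios) / len(ratios)

# Hypothetical relative-error curves for one test function.
ho_curve = [1e-1, 1e-2, 1e-3, 5e-4, 9e-5]
gwho_curve = [1e-1, 1e-3, 2e-4, 8e-5, 1e-5]
ho_it = first_iteration_below(ho_curve)
gwho_it = first_iteration_below(gwho_curve)
print(ho_it, gwho_it)                     # 5 4
print(mean_reduction([ho_it], [gwho_it])) # 0.2
```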

Figure 2.

Comparison of relative errors of different test functions and the number of iterations under specified precision: HO algorithm and GWHO algorithm.

Experimental simulation analysis

Verification of model rationality

The performance of the GWHO algorithm was assessed in a simulation scenario where 12 heterogeneous UAVs pursued three mobile targets within a 20 km × 20 km × 100 m operational area. The initial parameters of both UAVs and targets are provided in Tables 1 and 2, respectively. In the objective function formulation, minimizing the flight distance is a key factor in achieving the shortest time and lowest energy consumption, since the shortest path generally leads to a higher capture success rate and lower collision risk. To balance the trade-off between mission completion time and energy consumption, the weight coefficients in the objective function were set as λ₁ = 0.4, λ₂ = 0.3, and λ₃ = 0.3, based on empirical insights from prior applications.

Table 1.

UAVs’ initial information.

Information of UAVs
ID	Position	Velocity (m/s)	Cons (kWh/km)	Range (km)	Endurance (km)	H_speed (m/s)	H_lowWork (m)	H_highest (m)
1	[8,2]	9	0.4	1.4	10	17	30	89
2	[12,6]	7	0.1	0.9	9	16	25	80
3	[16,4]	8	0.5	1.5	12	15	28	86
4	[14,14]	8	0.2	1.1	11	18	21	80
5	[11,17]	12	0.5	2.3	13	17	24	83
6	[15,12]	9	0.3	1.2	15	16	30	84
7	[3,6]	5	0.6	1.3	12	15	28	87
8	[10,8]	10	0.5	1.5	11	15	30	83
9	[2,14]	11	0.3	0.6	8	17	27	89
10	[3,8]	10	0.2	1.1	10	14	20	90
11	[9,14]	9	0.1	0.8	13	15	29	89
12	[10,4]	8	0.4	0.6	12	16	30	84

Table 2.

Targets’ initial information.

Information of Targets
ID	Position	Velocity (m/s)	Number of UAVs required
1	[2,18]	7	2
2	[4,3]	8	1
3	[15,8]	6	3

Following simulation initiation, Figure 3(a) displays the initial allocation results derived from target and UAV parameters. Activated UAVs satisfying target-specific capture requirements (denoted by distinct colors) executed pursuit tasks, while gray UAVs remained inactive. As shown in Figure 3(b), Target 1 entered the capture range of its assigned UAV, triggering successful capture. Thereafter, this UAV transitioned to an inactive state awaiting reallocation. Since Target 2 required two UAVs for capture, when it entered the range of only the first UAV (UAV 1), UAV 1 tracked it until Target 2 also entered the range of the second UAV (UAV 7). Target 2 was deemed captured only when both UAVs achieved simultaneous range coverage. Upon the abrupt emergence of Target 4 (initial position: (11,10), speed: 7 m/s, required UAVs: 2), the elite solution retention strategy was activated for rapid reallocation, yielding the results in Figure 3(c). During Target 2’s movement, increasing the distance between UAV 7 and Target 2 reduced the profit metric (e.g. reward minus energy cost). Consequently, the subsequent allocation replaced UAV 7 with a higher-yield UAV (UAV 12) for Target 2 pursuit, as validated in Figure 3(d).

Figure 3.

Initial allocation and target capture status visualization under 12-UAV/3-target scenario.

In the 3D Cartesian coordinate system, the assignment scheme for each UAV is shown in Figure 3(e). After taking off, the UAVs begin pursuing their designated targets. A target is considered successfully captured once it enters a UAV’s effective capture range and the UAV has reached the minimum capture altitude. During the pursuit of target 3, UAV 3 and UAV 6 gradually approach each other and enter a collision risk area, triggering the anti-collision constraints, as visible in the detailed view of Figure 3(e). Because UAV 3 is at a higher altitude than UAV 6, the collision avoidance mechanism increases UAV 3’s takeoff speed while reducing that of UAV 6. This adjustment separates their flight trajectories, enabling them to continue cruising safely at separate altitudes and thereby avoiding a potential collision.

Performance verification of algorithms at different scales

In the experiment above, 12 UAVs were assigned to pursue 3 mobile targets. The allocation strategy was dynamically adjusted to accommodate the emergence of new targets and low-revenue tasks, ensuring the successful completion of the pursuit mission. To validate the effectiveness of the algorithm for UAV-target allocation across different scales, this paper examined two additional scenarios: 20 UAVs pursuing 5 targets and 30 UAVs pursuing 7 targets.

In the scenario involving 20 UAVs pursuing 5 targets, a mission area of 100 km × 100 km × 70 m was defined, with the initial positions of both UAVs and targets being randomly generated. The initial allocation results are depicted in Figure 4(a). During the pursuit phase, upon the emergence of a new target (Target 6), the algorithm dynamically assigned UAV 14 and UAV 18 to pursue it, as marked by the red box in Figure 4(b).

Figure 4.

Task assignment results and filter buffer efficacy assessment under 20-UAV/5-target scenario.

Compared with the experiments above, increasing the number of both UAVs and targets led to a higher frequency of allocation oscillations. Specifically, some UAVs assigned to pursue certain targets exhibited repeated transitions between active and standby states. As illustrated in Figure 4(b), three UAVs are required to pursue Target 1. In the initial allocation, UAVs 5, 19, and 20 were activated. However, during the current simulation phase, the allocation plan was updated, activating UAVs 5, 10, and 19. Furthermore, as shown in Figure 4(c), in subsequent simulation steps, UAV 20 was activated once again and partially traveled its path before being deactivated. Similar oscillatory behavior was observed with UAV 13 assigned to Target 4 and with UAVs 11 and 16 assigned to Target 2.

The final capture results without the filtering buffer mechanism are presented in Figure 4(c), while the results with the mechanism are shown in Figure 4(d). With filtering, certain UAVs are activated and deactivated less often, which reduces the unnecessary flight paths they would otherwise travel. Figure 4(e) displays the allocation outcome in 3D space. The magnified view shows that UAV 8 and UAV 12 rapidly adjust their takeoff speeds after entering the collision risk area, allowing both to exit the area safely.

For the largest scenario, a mission area of 300 km × 300 km × 100 m was defined, with the initial positions of both UAVs and targets randomly generated. The initial allocation results for 30 UAVs pursuing 7 targets are presented in Figure 5(a). During the pursuit phase, when Target 8 was introduced, the algorithm assigned UAV 14 and UAV 30 to pursue it, as marked by the red box in Figure 5(b). Compared with the two smaller scenarios, the number of UAVs experiencing allocation oscillations rises significantly as the number of UAVs and targets increases.

Figure 5.

Task assignment results and filter buffer efficacy assessment under 30-UAV/7-target scenario.

The final capture results without the filtering buffer mechanism are presented in Figure 5(c), while the results with the mechanism implemented are shown in Figure 5(d). The 3D allocation scheme is shown in Figure 5(e). When UAV 18 comes too close to UAV 29, the collision avoidance constraint is triggered, causing UAV 18 to reduce its takeoff speed and enter the cruising phase. After exiting the collision risk area, UAV 18 resumes its normal takeoff speed.

The transition of a UAV from an activated state to a standby state at its original position is defined as one oscillation event. For the pursuit and allocation tasks across different scales, Figure 6(a) illustrates the total number of oscillation events for all UAVs, spanning from the initial allocation to the final capture completion.
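As a rough illustration of this definition, the sketch below counts oscillation events from per-UAV state histories and computes a relative reduction. The state labels, the log layout, and the assumption that a post-capture state is logged separately (here as "done") are hypothetical; the numbers in the example are made up and are not the reported results.

```python
def count_oscillations(states):
    """Count active -> standby transitions in one UAV's state history."""
    return sum(
        1
        for prev, curr in zip(states, states[1:])
        if prev == "active" and curr == "standby"
    )

def total_oscillations(histories):
    """Sum oscillation events over all UAVs, from first allocation to capture."""
    return sum(count_oscillations(h) for h in histories.values())

def reduction_percent(before, after):
    """Relative reduction in oscillation events, in percent."""
    return 100.0 * (before - after) / before

# Hypothetical state logs for two UAVs.
histories = {
    "UAV 20": ["active", "standby", "active", "standby"],
    "UAV 5": ["active", "active", "active", "done"],
}
print(total_oscillations(histories))   # 2
print(reduction_percent(10, 4))        # 60.0
```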

Figure 6.

Comparison of the number of oscillations, total flight time, and remaining endurance percentage before and after filtration.

It is evident that as the number of UAVs increases, the oscillation frequency during the allocation process also rises. Following the introduction of the filtering buffer mechanism, the number of oscillation events decreased by 60.0%, 77.9%, and 78.1% for scenarios involving 12, 20, and 30 UAVs, respectively. Figure 6(b) presents both the total flight time required for all UAVs to complete the pursuit mission and the remaining endurance percentage under different scales. In the filtered allocation results, the total mission flight time decreased by 30.99%, 11.59%, and 6.08%, while the remaining endurance percentage increased by 0.72%, 2.07%, and 3.03% for the respective scenarios.

Analysis of the effect of dominant-solution retention

In the experiments above, when a new moving target emerged during the simulation, the Flag was set to 1, triggering the dominant-solution retention strategy. Specifically, during the initialization phase of the GWHO algorithm, 20% of the hippopotamus population was initialized to the top 20% dominant solutions obtained from the algorithm’s iterations prior to the target’s addition. The remaining 80% were randomly initialized. Compared to randomly initializing the entire hippopotamus population, the convergence behavior of the objective function value over iterations is shown in Figure 7(a).
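The seeded initialization described above can be sketched as follows. The encoding (one target index per UAV, with 0 meaning standby), the assumption that a lower objective value is better, and all function and parameter names are hypothetical; only the 20%/80% split comes from the text.

```python
import random

def init_population(pop_size, n_uavs, n_targets, previous=None,
                    elite_fraction=0.2, rng=random):
    """Build an initial population of allocation vectors.

    Each individual is a list of length n_uavs; entry i is the target index
    assigned to UAV i, with 0 meaning standby (an assumed encoding).
    `previous` is a list of (cost, individual) pairs from the run before the
    new target appeared; lower cost is assumed to be better.
    """
    population = []
    if previous:  # corresponds to Flag == 1: reuse the best earlier solutions
        n_elite = int(elite_fraction * pop_size)
        best = sorted(previous, key=lambda pair: pair[0])[:n_elite]
        population = [list(individual) for _, individual in best]
    while len(population) < pop_size:  # fill the rest at random
        population.append([rng.randint(0, n_targets) for _ in range(n_uavs)])
    return population

# Hypothetical earlier run with 3 targets; a fourth target has just appeared.
earlier = [(cost, [cost % 4] * 12) for cost in range(10)]
pop = init_population(10, n_uavs=12, n_targets=4, previous=earlier,
                      rng=random.Random(0))
print(len(pop), pop[0])
```

Because the retained individuals only reference targets that existed before, they remain valid allocations after the target set grows; the random individuals are the ones that can assign UAVs to the new target from the first generation.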

Figure 7.

Number of iterations required for convergence with dominant-solution retention under different UAV scales.

As shown in Figure 7, while both approaches effectively converge to the same objective function value, the dominant-solution retention strategy consistently achieved convergence within 100 generations. Specifically, for scenarios involving 12, 20, and 30 UAVs, the iteration count required to reach convergence was reduced by 84.0%, 59.9%, and 28.4%, respectively, compared to the strategy without dominant-solution retention. This approach not only reduces system allocation oscillations but also accelerates convergence to the target interval, thereby enabling rapid global target allocation.
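One way to read an iteration count off a convergence curve is sketched below. The convergence criterion (the curve stays within a tolerance of its final value) and the example curves are assumptions made for illustration; they are not the criterion or the data behind Figure 7.

```python
def iterations_to_converge(curve, tol=1e-6):
    """1-based iteration from which the curve stays within `tol`
    of its final value."""
    final = curve[-1]
    last_bad = -1  # index of the last value still outside the tolerance band
    for i, value in enumerate(curve):
        if abs(value - final) > tol:
            last_bad = i
    return last_bad + 2

# Hypothetical objective values per iteration (lower is better).
random_init = [9.0, 7.0, 6.0, 5.0, 5.0, 5.0]
seeded_init = [6.0, 5.0, 5.0, 5.0, 5.0, 5.0]

n_random = iterations_to_converge(random_init)   # 4
n_seeded = iterations_to_converge(seeded_init)   # 2
print(100.0 * (n_random - n_seeded) / n_random)  # 50.0
```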

Analysis of ablation experiment

As discussed in the weight analysis in the section “Verification of model rationality” (under “Experimental simulation analysis”), among the factors influencing the objective function, the flight distance of the UAVs is often directly correlated with both task completion time and overall energy consumption. In this ablation study, we therefore use the total flight distance as a comprehensive performance metric that intuitively reflects the quality of the task allocation scheme. By removing each proposed module in turn and calculating the total flight distance of the optimal allocation scheme across different iteration counts, we evaluate the contribution of each innovation to the task allocation performance.

We take the scenario of 20 UAVs pursuing 5 targets as an example; the results are presented in Figure 8. In the figure, module A denotes the GWHO algorithm, module B the improved Kalman filter, and module C the dominant-solution (elite solution) retention strategy. “None” refers to the baseline, in which all proposed modules are removed. For each iteration count, the total flight distance under the optimal allocation scheme was computed, and each experiment was repeated 30 times to obtain the average value. Figure 8 shows that the configuration incorporating all modules (A + B + C) consistently achieves the best performance, whereas removing all modules results in the worst performance. When module A is removed, performance declines sharply, particularly between iterations 180 and 240, and the number of iterations required for stable convergence increases significantly compared to the full configuration. This suggests that the GWHO algorithm primarily accelerates the convergence process, enabling high-quality solutions to be reached in fewer iterations. Removing module B results in a noticeable increase in total flight distance, indicating that the improved Kalman filter effectively mitigates allocation oscillations during pursuit, thereby preventing unnecessary detours. When module C is removed, the total flight distance increases slightly relative to the complete configuration, although the overall trend remains similar. This indicates that the retention strategy helps maintain efficient allocation plans when new targets emerge, reducing path waste caused by unnecessary re-allocations.
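The ablation protocol described above (full configuration, each module removed in turn, an empty baseline, and 30 repetitions per setting) can be organized as in the following sketch. The callable `run_once` is a hypothetical stand-in for one complete allocation run that returns a total flight distance; the stub used in the example produces made-up values and does not reproduce Figure 8.

```python
from statistics import mean

# A: GWHO algorithm, B: improved Kalman filter, C: solution retention strategy
MODULES = ("A", "B", "C")

def ablation_configs(modules=MODULES):
    """Full setup, each leave-one-out variant, and the empty baseline."""
    full = frozenset(modules)
    configs = [full]
    configs += [full - {m} for m in modules]
    configs.append(frozenset())
    return configs

def average_distance(run_once, config, repeats=30):
    """Average total flight distance over runs with seeds 0..repeats-1."""
    return mean(run_once(config, seed) for seed in range(repeats))

def fake_run(config, seed):
    """Hypothetical stub: pretends each enabled module saves 10 distance units."""
    return 100.0 - 10.0 * len(config)

for config in ablation_configs():
    label = " + ".join(sorted(config)) or "None"
    print(label, average_distance(fake_run, config))
```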

Figure 8.

Analysis of the impact of each innovation module on the total flight distance of the UAVs.

In summary, each individual innovative module contributes to improving the allocation results to varying degrees, and their integration leads to the best overall performance.

Conclusion

This paper addresses the task allocation problem for heterogeneous multi-UAV systems engaged in capturing mobile targets. A comprehensive allocation model and evaluation system were developed, incorporating key factors including UAV capture range, flight path distance, and energy consumption. To mitigate system allocation oscillations arising from performance disparities among heterogeneous UAVs and algorithmic calculation errors during the allocation process, a filtering buffer mechanism based on Kalman filtering was introduced. Furthermore, an enhanced hippopotamus algorithm incorporating the concept of gray wolf social hierarchy was proposed. Specifically, the random position updates of the female hippopotamus population were replaced by updates guided by a subset of dominant females, thereby accelerating the algorithm’s convergence. This approach obtains a conflict-free task allocation scheme and adopts a dominant-solution retention strategy when newly added targets are encountered. Real-time simulation analyses across various scales demonstrate that the proposed algorithm provides a feasible allocation scheme within a short computational time, effectively satisfying the task allocation requirements for multiple UAVs pursuing mobile targets. Nevertheless, this study has certain limitations. For instance, hardware-in-the-loop (HIL) experiments were not conducted for performance evaluation, and comparative analyses with additional classical algorithms were not included. Future research will place greater emphasis on actual UAV deployment scenarios, including path planning in no-fly zones and rapid dynamic task allocation in the presence of multiple newly emerging targets. Moreover, more extensive comparative studies of algorithms will be carried out to further enhance the cooperative optimization model in addressing the task allocation problem for moving targets.

Footnotes

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Ethical considerations

This article does not contain any studies with human or animal participants.

Consent to participate

There are no human participants in this article and informed consent is not required.

Consent for publication

Not applicable

ORCID iD

Hanghui Wu

Data availability statement

This study mainly focuses on theoretical derivation and algorithm simulation, and no new original experimental data were generated.

References

Amiri, Mehrabi Hashjin, Montazeri (2024) Hippopotamus optimization algorithm: A novel nature-inspired optimization algorithm. Scientific Reports 14: 5032.

Chakraa, Leclercq, Guérin (2025) Integrating collision avoidance strategies into multi-robot task allocation for inspection. Transactions of the Institute of Measurement and Control 47(7): 1466–1477.

Chen, Qing (2022) Consensus-based bundle algorithm with local replanning for heterogeneous multi-UAV system in the time-sensitive and dynamic environment. The Journal of Supercomputing 78(2): 1712–1740.

Chen, Yan, Liu (2021) Communication constrained task allocation of heterogeneous UAVs. Acta Aeronautica et Astronautica Sinica 42(8): 525844.

Chen, Liu (2019) Cooperative task assignment for multi-UAV attack mobile targets. In: 2019 Chinese automation congress (CAC), Hangzhou, China, 22–24 November 2019, pp. 2151–2156. New York: IEEE.

Cheng, Xia (2019) Modeling of unmanned aerial vehicles cooperative target assignment with allocation order and its solving of genetic algorithm. Control Theory and Applications 36(7): 1072–1082.

Ding, Zhu, Feng (2023) A multi-objective assignment algorithm for UAVs in multiple task with different workloads. In: 2023 Asia-Pacific conference on image processing, electronics and computers (IPEC), Dalian, China, 14–16 April 2023, pp. 354–359. New York: IEEE.

Liu, Chen (2025) Research on UAV task allocation algorithm based on mixed task grouping. In: 2025 IEEE 8th information technology and mechatronics engineering conference (ITOEC), Chongqing, China, 14–16 March 2025, pp. 446–451. New York: IEEE.

Duan, Liu, Tang (2020) A novel hybrid auction algorithm for multi-UAVs dynamic task assignment. IEEE Access 8: 86207–86222.

Edison, Shima (2011) Integrated task assignment and path optimization for cooperating uninhabited aerial vehicles using genetic algorithms. Computers & Operations Research 38(1): 340–356.

Fei, Bao, Liu (2024) Air-ground cooperative autonomous task allocation method for dynamic target search and strike. Systems Engineering and Electronics 46(7): 2346–2358.

Guo, Wang, Zhang (2024) An intelligent task assignment algorithm for UAVs cluster for fast-moving targets. In: 2024 China automation congress (CAC), Qingdao, China, 1–3 November 2024, pp. 284–289. New York: IEEE.

Hao, Tian (2021) A distributed-centralized dynamic task allocation algorithm for UAVs tracking moving targets. In: 2021 40th Chinese control conference (CCC) (eds Peng, Sun), Shanghai, China, 26–28 July 2021, pp. 3774–3779. New York: IEEE.

Hochbaum, Rao, Sauppe (2022) Network flow methods for the minimum covariate imbalance problem. European Journal of Operational Research 300(3): 827–836.

Kang (2022) A novel PSO approach for cooperative task assignment of multi-UAV attacking moving targets. In: 2022 34th Chinese control and decision conference (CCDC), Hefei, China, 15–17 August 2022, pp. 3670–3675. New York: IEEE.

Liu, Sun, Wan (2024) Improved adaptive snake optimization algorithm with application to multi-UAV path planning. Transactions of the Institute of Measurement and Control 47(8): 1639–1650.

Miao, Huang, Jiang (2023) A novel multimodal multi-objective optimization algorithm for multi-robot task allocation. Transactions of the Institute of Measurement and Control 47(12): 2564–2575.

Mirjalili, Lewis (2014) Grey wolf optimizer. Advances in Engineering Software 69(3): 46–61.

Nguyen, Dambreville, Toumi (2019) Solving the problem of coordination and control of multiple UAVs by using the column generation method. In: Proceedings of 6th world congress on global optimization, Metz, 8–10 July 2019, pp. 1097–1108. Cham: Springer.

Otte, Kuhlman, Sofge (2020) Auctions for multi-robot task allocation in communication limited environments. Autonomous Robots 44(3): 547–584.

Peng, Xue (2021) Review of dynamic task allocation methods for UAV swarms oriented to ground targets. Complex System Modeling and Simulation 1(3): 163–175.

Skaltsis, Shin, Tsourdos, et al. (2021) A survey of task allocation techniques in MAS. In: 2021 international conference on unmanned aircraft systems (ICUAS), Athens, Greece, 15–18 June 2021, pp. 488–497. New York: IEEE.

Song, Dai, Wang (2022) Capture the flag based assignment algorithm for tracking task of multi-UAVs. In: 2022 5th international symposium on autonomous systems (ISAS), Hangzhou, China, 8–10 April 2022, pp. 1–7. New York: IEEE.

Sun, Cai, Guo (2022) Collaborative dynamic task allocation with demand response in cloud-assisted multiedge system for smart grids. IEEE Internet of Things Journal 9(4): 3112–3124.

Wang, Zhao (2025) Collaborative multi-task assignment of heterogeneous UAVs based on hybrid strategies based multi-objective particle swarm. Journal of Zhejiang University: Engineering Science 59(4): 821–831.

Wei, Cai (2020) Particle swarm optimization for cooperative multi-robot task allocation: A multi-objective approach. IEEE Robotics and Automation Letters 5(2): 2530–2537.

Yue, Yan (2023) Improving cooperative multi-target tracking control for UAV swarm using multi-agent reinforcement learning. In: 2023 9th international conference on control, automation and robotics (ICCAR), Beijing, China, 21–23 April 2023, pp. 179–186. New York: IEEE.

Zhang, Koch, Calvin WJG, et al. (2024) Interception of multiple drone targets by heterogeneous chasers using heuristic task allocations with DQN-GNN guidance model. In: 2024 SICE international symposium on control systems (SICE ISCS), Higashi–Hiroshima, Japan, 18–20 March 2024, pp. 7–13. New York: IEEE.

Zhang, Cheng, Hang (2025) Method of multiple UAV cooperative task allocation based on LGMPA algorithm. Modern Electronics Technique 48(4): 109–118.

Zhu, Wang (2021) An adaptive priority allocation for formation UAVs in complex context. IEEE Transactions on Aerospace and Electronic Systems 57(2): 1002–1015.