Abstract
Handoff reduction is considered one of the most exciting challenges in the study of cognitive radio networks. Spectrum handoff occurs between channels when a licensed user needs to access a channel which is currently occupied by an unlicensed user. Once the entry of the authorized user has been detected, the secondary user must move to an idle channel. This process continues until the unlicensed user finishes his transmission. This paper addresses the problem of spectrum mobility in a known radio electric environment, guiding secondary users through routes created with bio-inspired algorithms. The authors formulate a spectrum allocation scheme for multiple secondary users using two bio-inspired algorithms. The simulation results indicate that the Max feeding optimization algorithm proposed offers robustness and low complexity, which makes it a solution that is more in line with the spectrum allocation problem in cognitive radio networks.
Introduction
The demand related to wireless data has showed recent growth which is a trend estimated to continue due to ever-expanding wireless applications. It is expected that the monthly use of mobile data will be multiplied by eight before 2020 in comparison with 2015 [1, 28]. Multiple studies have shown that the spectrum is currently underused by licensed users [21] with an uneven use of bands. While searching for a solution, the study of cognitive radio networks (CRN) has been gaining interest since it would enable the optimization of limited resources within wireless networks. Cognitive radio is defined as a radio system that knows its environment and whose operational parameters can be adjusted dynamically and autonomously [4]. Non-licensed users are those who occupy the spectrum opportunistically. When non-licensed users detect an incoming licensed user, they must immediately leave the spectrum band that they are occupying and head towards an availableone [24].
Research methods have sought to solve the situation of underuse of radioelectric space in CRN, which includes artificial intelligence [7], neural networks [22], support vector machines [23], genetic algorithms [27] and evolutionary algorithms (EA) [16]. EA are particularly inspired by the biologic constitution of animals, plants and humans, regarding the common processes of feeding, survival and reproduction. Several evolutionary techniques have been proposed to solve particular matters in CRN. Contemporary techniques have been used to enhance the secondary users’ tasks of accessing and sharing the spectrum. The following investigations stand out in the areas of spectrum detection [9, 20], decision-making [15, 13] division [8], mobility [6, 19] and handover [26].
This research is based on two bio-inspired algorithms that grant some pointers in tackling the challenge of spectrum decision-making. The goal of this paper is to compare the results obtained with each algorithm and its main contribution is to demonstrate the performance and application of bio-inspired algorithms in cognitive radio networks, very little used for this purpose. The first algorithm to be implemented in this study is the well-known Ant Colony Optimization (ACO) strategy, based on the behavior of ants when establishing paths leading to food sources [2]. The ACO has been adapted to the scenario of searching for routes within a radioelectric spectrum environment. Furthermore, the number of handoffs caused by a licensed user that is transmitting an information packet is reduced. The second algorithm included in this research rests on the energy consumed by insects when looking for food. The distance travelled by the insects should be equivalent to the length of the information packet that an unlicensed user intends to transmit. In fact, the length of said packet is defined as the same for insects and ants, which will ease the development of tests and their subsequent comparison. In the case of insects, the discriminating factor is the number of mandatory hops performed by them in order to avoid interfering with licensed users.
This article is organized as follows. Section 2 describes previous work inspired by the behavior of ants, seeking to solve CRN problems. Section 3 introduces the bio-inspired methods implemented in this research, as well as its mathematical foundation. Section 4 exposes the results of said methods under different simulation scenarios. Finally, Section 5 states the conclusions of this research.
Related work
In [10]. A channel-hopping method is presented for the controlled selection of channels in a heterogeneous, spatial and time-varying spectrum for an Ad-hoc CRN. The Swarm Aided Stay-Jump (SASJ) algorithm is based on an Ants Colony System (ACS) to choose a channel to control. The particular characteristics of channels are displayed in terms of road quality using the mimicry of real pheromones. The results are simulated in an urban cellular radio environment with buildings that hinder direct communication.
The research discussed in [30] involves an optimization algorithm with an improved ACO strategy seeking to increase the use of the spectrum in a CRN. The algorithm quickly converges to an optimal solution due to the reinforced learning given to the accumulated pheromones. The number of available channels and unlicensed users is represented in a matrix that judges whether a node is available or satisfies the interference restrictions. If the node meets the interference restrictions, then the channel is assigned to the secondary user (SU), and a corresponding benefit is granted. Therefore, the SU is symbolized by an ant looking for food (node) who leaves a pheromone trail on the way. The subsequent ants will then choose the path with the highest probability of leading to food (node).
In networks that operate with wireless sensors, some problems can occur related to adaptation during self-organization, tolerance to failure, and scalability for both harnessing and exchanging cooperatively the spectrum. On the one hand, the unlicensed nodes of sensors are used temporarily, while the available licensed channels are occupied. On the other hand, in telecommunication networks, bands of the spectrum are used. Thus, the authors in [5] propose an algorithm inspired by the behavior of ants and bees for the routing process of wireless sensors in CRN. The goal is to properly use the available channels while diminishing communication delays.
Materials and methods
Cognitive radio network
A cognitive radio network (CRN) is a system that can assess the medium, analyze transmission parameters and make decisions in a dynamic time-frequency space. Focused on the allocation of resources, it seeks to improve the use of the electromagnetic radio spectrum [22]. Therefore, a CRN should be ‘smart’ enough to learn from its interaction experience with the RF environment. Hence, the learning process is a crucial component that can be approached from various areas of knowledge such as artificial intelligence, machine learning techniques, evolutionary algorithms and robust control methods [22].
Managing the spectrum with a CRN involves four stages that explain the interaction between primary users (PU) and secondary users (SU), in order to occupy and share the spectrum [11]. The first stage is spectrum detection in which the SUs look foravailable spectrum bands, capture their information and detect gaps within the spectrum. The second stage is spectrum decision-making aiming to assign a channel based on spectrum availability and allocation policies. The third stage is spectrum division which coordinates spatial allocation and prevents users from crashing into parts of the spectrum or overlapping when multiple SUs request access. The fourth stage is spectrum mobility in which SUs head towards free sections of the spectrum when a PU demands access to the spectrum [4].
The implementation of new technologies in CRN should require little computational complexity since the estimation of detection, decision, division, and mobility should be executed in fractions of a second. Furthermore, Evolutionary Algorithms (EA) offer computational simplicity and are considered to be a promising tool to implement solutions within CRN paradigms. This research studies a particular aspect of spectrum mobility which is the reduction of handoff in the radio electric spectrum of Colombia’s capital city, Bogotá.
Spectrum handoff occurs when a primary user enters a channel that is already occupied by a secondary user. The SU must then abandon this channel and move to an available one. This process is repeated until the SU concludes its scheduled transmission. Spectrum handoff has a negative impact on the performance of secondary users resulting in communication delays and link maintenance problems. Thus, reducing handovers in the system should be a priority [14].
Figure 1 shows several frequency channels transmitting information under different power signals taking place at different times. During each transmission, gaps appear in some channels, indicating the end of the current communication until a new one is begun. These time-frequency spaces can be used to access the spectrum dynamically. The DSA (Dynamic Spectrum Access) strategy could be used by SUs who need to establish transmission. Each time that a user jumps, handoff increases and thus the energy used for each jump is also increased. Consequently, in optimization terms, it is sought that the handoff function needs to be minimized and can be expressed with Equation (1).

Dynamic Spectrum Access (DSA).
Bio-inspired algorithms are a set of machine learning techniques that seek to imitate the robustness of the procedures and structures that biological organisms have used for adaptation and evolution learning. We are going to describe in brief, the technical and fundamental operations by which the Bio-inspired algorithms have been motivated. The intention is to show the reader a notion of the algorithms and how they can be implemented in the CRNs problems.
In the current literature there is a wide diversity of bio-inspired algorithms, among which are: Genetics Algorithms, Artificial Bee colony, Ant Colony Optimization, Max feeding optimization, Cuckoo Search and Particle Swarm Optimization. The present investigation decided to work with the Ant Colony Optimization and Max feeding optimization algorithms because they are the ones that have shown the best results for situations similar to those that are had when selecting a spectral opportunity in cognitive radio networks.
Ant colony optimization
Through the use of pheromone trails, the ants establish a mutual communication strategy to find paths leading to sources of food [3]. Therefore, as more ants follow a certain path, the higher is the amount of pheromones deposited on the trail. In other words, the probability of an ant choosing said path is increased by the number of ants that have travelled the path. The collective behavior of ants can be characterized as a positive reinforcement process that rapidly converges as long as there are no restrictions in the environment in terms of exploiting a solution [18]. Each ant is identified as an agent that leaves a signal over the traversed path, thereby influencing the future decisions of incoming agents. This means that the set of ants does not converge to a single solution but instead converges to a subspace of solutions where the best one is chosen. The ACO method is based on stochastic processes and is inspired by the social behavior of ants. It is used to solve complex optimization problems [12] which makes it suitable for CRN. Some features such as parallel computing, self-organization, and positive feedback are inherent to the ant colony algorithm, allowing multi-agent optimization to obtain a global solution as well as reducing complexity and computation times [27].
The first bio-inspired algorithm that is proposed is the ACS (ant colony system), which is based on the behavior of ants while looking for food by following pheromone traces. In this research project, the ACS serves to find routes for the continuous transmission of data. To create the corresponding model, the objective function (fo) is represented as Equation (2).
The first implementation step is the description of the pheromones left by the ants when the path has been satisfyingly travelled. A route is satisfyingly completed when the entire packet of length (l p ) has been transmitted. The pheromone values are updated for all ants once the route is completed.
The update of pheromone Pu(c, t) for channel c in time t is expressed as Equation (3).
For an m number of ants with an evaporation rate
The second step consists on identifying the transition probability p (ij) of the kth ant moving from time t, channel i, to time t + 1, channel j given by Equation (5) [17].
The length is determined by the change of channel. If the destination channel is the same, then the length will be equal to D. If the ant jumps to a different channel, then the length will be equal to 10D, Equation (6).
Finally, a greedy selection of the following chosen channel is proposed based on a pseudo-random variable q. q is compared with (0 < q0 < 1), a constant change, to establish the relationship between exploitation and exploration. As to q0 decreases, the ant chooses more according to the levels of pheromones thus exploitation. As to q0 increases, the ant decides to a random alternative route using more the adventure of exploration. The exploration fact stays the ants outside of local minimums,Equation (7).
Therefore, q0 is associated with the tolerance to risk-prone decisions. Values closer to 1 suggest a higher understanding of the risk. The exploration of possible paths would be appropriate at the beginning of the search and the exploitation of possible solutions for the end [17]. q is a random number between 0 and 1. If the search is redirected, then the probability function is modified for the selection of random numbers.
The second bio-inspired algorithm to treat is called max feeding (MaF) and is based on the amount of energy that an insect employ for its nutrition. Although the algorithm does not have a random component like the conventional evolutionary algorithms, it has an element of adaptation for known scenarios. The main base of the proposed algorithm is the maximum exploitation of the simulation scenario. For this model it was estimated that the objective function (fo) is represented as Equation (8).
where ɛ represents the energy used by the kth ant. The first stage involves the transformation of the scene into pheromone trails within the nest. The nest is composed by all channels for the entire transmission time. The traces of the pheromone will be assigned in proportion to the number of free spaces within the nest. Hence, as the time in which the channel is available grows higher, so will the corresponding pheromone trail spread inside the nest grow longer. This means that the insect is inherently attracted by the trail. The allocation of the pheromone weights is static at first, due to the counting process of the busy and idle channels. Then, a dynamic weight allocation is established in which the pheromones will evaporate as the available timeslots decrease. Figure 2 displays an example of the pheromone evaporation process in the availability matrix of the spectrum.

Pheromone distribution chain for the availability matrix.
Initially, pheromones have high weights according to the available timeslots and said weights decrease until they reach a value of 0. Considerable changes are detected over time in Fig. 2. For the calculation of pheromones, it is crucial to have complete knowledge of the scenario at hand. The algorithm stating the calculation of the pheromone-instilled path matrix is described below.
The next stage involves finding the best path based on the pheromone trails. Thus, an agent is created for such task, which in this case is an insect in charge of locating the most significant pheromone trace in channel m at time t1. The insect will then move to the channel with the largest pheromone trail and will consider the length (l p ) of the transmission packet. The transmitted packet (P t ) and the time elapsed in the channel (t i ) will be subtracted. The insect guided by the pheromones will transmit the packet until almost all traces disappear, and then choose a new channel when a single pheromone is left. This means that the insect will have enough time to move from channel m to channel n, thereby guaranteeing the continuity of the service. The following algorithm describes the method for finding the insect’s path.
The insect will repeat the same procedure until transmission is completed or an equivalent state is reached that requires finding a new transmission route. Afterwards, the availability matrix will be updated with Algorithm 1 , followed by another insect finding a different solution. This process will be repeated k times, by k insects that determine k paths until a discontinuity shows up in the transmission.
Estimation of pheromone trails
Once a single insect fulfills its purpose (i.e. it transmits the predefined data), the nest will be updated, thereby eliminating the path of the kth insect’s quest for food.
Since the insect travels the same distance from start to finish, the only discriminating parameter is the number of jumps made by the insect to avoid interrupting transmission. Every time that the insect jumps, the energy consumption (ɛ) increases, which means that the insect that consumed the least amount of energy will be the one that found the best route.
Route exploration by the insects
The ant colony method is tested in a real scenario. The goal is to find the best configuration for route-searching that minimizes the number of handoffs present in a single transmission. The chosen scenario is the radioelectric spectrum of the city of Bogotá, Colombia. The broadcast has a length of 70 time units equivalent to 23.1 seconds of continuous transmission. In this simulation, 70 sets of channels are available in the Wi-Fi band.
The relation between parameters α and β is the first focus of the study. α and β control the relative importance of pheromones against heuristic information. In fact, a scenario was created where said variables remained in the [0.05, 1] interval with steps of 0.05. The set is equivalent to 20 games derived from the combination of the mentioned variables in order to find the best relation between them and guarantee the reduction of hops. The results of this scenario are shown in Fig. 3.

Tuning of parameters α and β for handoff reduction.
The ACO strategy was tested under 100 iterations, with the following settings: 50 ants, D = 2, ρ= 60 and q0= 0.5, stating a balance between exploration and exploitation of new paths. After simulating 400 different scenarios with the conditions mentioned above, the results revealed a reduction in the number of jumps for α= 0.05 and β= 1. The simulations indicate that the proposed scenario offers a better response depending on the distance between channels instead of relying on the number of deposited pheromones. A local linear regression on the obtained data (marked in pink in Fig. 3) reveals that this trend occurs for values of α close to 0.05 and values of β close to 1 which translates into the lowest average number of jumps in the entire simulation. Hence, there is a tendency in the reduction of jumps when the value of α decreases and the value of β increases.
The second phase in the ACO tuning process is the adjustment of the term q0 that defines the willingness to either explore new routes or exploit previous ones. Thus, a new scenario was conceived by varying q0 from 0.05 to 1 with steps of 0.05 in order to determine the optimal value that significantly reduces the total number of jumps. Tests were carried out for 300 iterations, 50 ants, D = 2, ρ= 60, α= 0.05 and β= 1. Figure 4 summarizes the results of this experiment.

ACO response with uniform distribution of 0.05 < q0< 1.
It is evidenced that handoff reduction is directly proportional to lower values of q0. Figure 4 indicates in blue the average handoff for 300 iterations and encloses in pink the handoff confidence intervals per iteration for each q0. In order to generate a lower number of jumps, there should be a stronger inclination for low-risk tolerance and redirect the algorithm towards the exploitation of routes instead of their exploration.
Finally, a new simulation scenario was created to test the overall operation of the ant colony method. The following settings were defined based on previous experiments: 300 iterations, 50 ants, D = 2, ρ= 60, α= 0.05, β= 1 and q0= 0.05. In Fig. 5, the ants labelled as n° 1, n° 25 and n° 50 are visualized. The transitions of routes from the first iteration, going through the iteration one hundred, two hundred and finally ending in the three hundred iteration will illustrated how the ACO work in the current problem. This experiment allows to follow the path taken by the ants and observe the handoff reduction caused by increasing the number of iterations as shown in Fig. 5. The number of jumps is reduced as the iterations continue. The algorithms was intended to guarantee the quality and continuity of the service without affecting the transmission of the PU.

Transition from the 1st to 300th iteration.
The Max Feeding method has a much simpler definition of the studied set. Since there are no configuration variables, it only requires a testing scenario where it can determine the best routes thereby reducing the number of jumps in each iteration. The corresponding simulations would then consider the same testing scenario of the ACO strategy, which is a broadcast of 70 time units equivalent to 23.1 seconds of continuous transmission and a 70-channel set in the Wi-Fi band. Figure 6 replicates the same conditions of the ant colony optimization for the MaF algorithm.

Results of the MaF method in the ACO environment.
After running the MaF algorithm, 36 different routes were obtained. Figure 6 displays the top 10 solutions. The average handoff in the routes is 10.75, with a minimum of 4 handoffs and a maximum of 21. Figure 7 summarizes the results of this simulation in the form of a histogram. Table 1 shows the comparative evaluation of both tested algorithms.

Summary of the MaF method results in the same environment of the ACO simulation.
Comparative evaluation between the ACO and MaF algorithms
In practice, cognitive radio networks must face heterogeneous challenges such as the current network infrastructures. Therefore, there is a need to include a model to improve spectrum handoff reduction capable of adapting its operation principle to achieve the highest accuracy in the selection of the frequency channel.
In this document, the validation of the proposed model was shown for real spectral occupancy data in experiments carried out in the Wi-Fi frequency band. However, the application of the algorithm can also be extended to other frequency as long as it includes enough of the required statistical information.
After performing a comparative evaluation of the two algorithms exposed in this article, the MaF method delivers the best routes with lower handoffs. The efficiency of Max Feeding is based on the exploitation of a priori knowledge of the environment, guaranteeing shorter execution times and greater robustness in route searching. It can become an ideal tool for systems in which the behavior of the CRN is known beforehand. Although the MaF algorithm is inspired in the energy consumption of insects, it is not an evolutionary algorithm. Therefore, its efficiency is not the same in unknown environments.
In such environments, the ACO method is suggested since proper adaptation is expected. The partial configurable randomness inherent to the ACO delivers an equal efficiency in scenarios with unknown behavior. In cognitive radio networks, robustness and adaptability are crucial factors. In conclusion, a hybrid alternative that combines these two algorithms could show more promising results in CRN for PU routing.
As future work, two guidelines are proposed: the first consists in carrying out an evaluation and validation with the most relevant autonomous learning algorithms in the current literature, for example, the use of vector support machines to perform classification and learning processes by reinforcement to develop the adaptation part. The second is to perform an evaluation and validation with cognitive radio equipment that emulates a cognitive radio network instead of simulations, even with real spectral occupation data.
Competing interest
The authors declare that they do not have competing interests.
Footnotes
Acknowledgments
The authors would like to thank the Center for Research and Scientific Development of Universidad Distrital Francisco José de Caldas and Colciencias, for supporting and funding this research project.
