Communication-aware information gathering with dynamic information flow

Abstract

We are interested in the problem of how to improve estimation in multi-robot information gathering systems by actively controlling the rate of communication between robots. Communication is essential in such systems for decentralized data fusion and decision-making, but wireless networks impose capacity constraints that are frequently overlooked. In order to make efficient use of available capacity, it is necessary to consider a fundamental trade-off between communication cost, computation cost and information value. We introduce a new problem, dynamic information flow, that formalizes this trade-off in terms of decentralized constrained optimization. We propose algorithms that dynamically adjust the data rate of each communication link to maximize an information gain metric subject to constraints on communication and computation resources. The metric is balanced against the communication resources required to transmit data and the computation cost of processing sensor data to form observations. The optimization process selectively routes raw sensor data or processed observation data to zero, one or many robots. Our algorithms therefore allow large systems with many different types of sensors and computational resources to maximize information gain performance while satisfying realistic communication constraints. We also present experimental results with multiple ground robots and multiple sensor types that demonstrate the benefit of dynamic information flow in comparison to simpler bandwidth-limiting methods.

Keywords

Networked robots, field robots, sensor fusion

1. Introduction

Decentralized information gathering is an important task where outdoor robots coordinate to build maps, search for targets, track targets and classify objects. Coordination in outdoor systems relies on wireless communication between robots, but existing communication networks are subject to physical limits that are often idealized, over-simplified or ignored (Ghaffarkhah and Mostofi, 2011). We are interested in developing principled algorithms for decentralized information gathering that are designed to respect these limits and to use available communication resources efficiently.

The decentralized information gathering task is where a team of robots actively cooperates to maximize information about a given phenomenon. This task forms the basis of many applications such as cooperative search (Bourgault et al., 2004; Gan and Sukkarieh, 2011; Hollinger and Singh, 2012), target tracking (Chung et al., 2006; Hollinger et al., 2009; Xu et al., 2013b) and environmental monitoring (Golovin et al., 2010; Smith et al., 2011; Delle Fave et al., 2012). Robot teams are useful for information gathering because they can exploit diverse sensing and motion capabilities, access multiple simultaneous viewpoints and cover large areas more rapidly than single-robot systems. Communication is fundamental to the task because robots must share data for estimation and coordinated decision-making.

Communication is not an infinite resource. However, research in multi-robot systems often makes two invalid assumptions that fail to respect the physical limits of real communication networks. The first such assumption is that simultaneous communication between multiple pairs of robots is independent. In most existing wireless networks, bandwidth resources are shared globally and link capacity decreases rapidly as the number of robots increases (Gupta and Kumar, 2000). The second invalid assumption, sometimes called the r-disc model, is that constant bandwidth is available within a given radius about a robot and that zero bandwidth is available otherwise. Real communication links are far more variable (Malmirchegini and Mostofi, 2012). The implications of failing to consider communication limitations are significant and hence communication in realistic environments is currently the topic of considerable research interest (Sadler et al., 2013).

One possible approach to address the issue of communication limits is to simply increase total network bandwidth by using more powerful and sophisticated radio hardware. However, it is always possible to generate a problem instance that exceeds any given resource limit. Sensors such as 3D laser range-finders generate data at a high rate, typically 1.7 million points per second. High-resolution cameras can produce data at even higher rates. Controlling the communication of such data is essential to real-world application of decentralized information gathering systems.

We believe that a better approach is to develop algorithms that make efficient use of the communication resources at hand. We refer to this approach as communication-aware information gathering. The idea is to choose when and how a given pair of robots should communicate based on the information value of the communication and given resource limits. This idea can be viewed as roughly analogous to network flow, where flow represents the data communicated within the system.

The main challenge is that information may be represented at multiple levels of abstraction ranging from raw sensor data to highly compressed forms such as target state observations. Therefore, we must choose not only how to route data but also in what form. This decision must consider computation costs, since data may be processed at various possible locations within a system with varying resource capacity. A given robot may process its sensor data on-board, transmit this data to a powerful off-board processing station or rely on the computation resource of another robot. Manual design of a communication policy in this context is difficult and can result in poor communication efficiency. For example, down-sampling the rate of sensor data transmission may obey bandwidth constraints but can lead to unnecessary degradation in the performance of state estimation algorithms. The design task increases in difficulty for large heterogeneous systems, and any fixed policy would require redesign after any change to the underlying hardware or task. Therefore, the information flow must be adjusted dynamically and autonomously.

In this paper, we formalize the notion of communication-aware information gathering by introducing a novel problem formulation which we call the dynamic information flow (DIF) problem. Given a graph-based representation of a decentralized information gathering system, the objective is to maximize the information value of communication by minimizing a cost-based metric subject to constraints. The graph representation models an information gathering team as a system where data flows along a typical pipeline comprising sensors, perception algorithms, estimation algorithms and control algorithms. These logical elements are connected by communication links with associated costs, and a system may contain many such elements. For example, a single laser sensor may be connected to many other elements implemented on multiple robot platforms.

The DIF problem structure is designed to model trade-offs between information value, communication cost and computation cost at the system level. The information value of sensor observations is not defined globally but instead is defined relative to the belief state of each estimator element. Link costs are abstract costs that model both communication and computation. For example, a given sensor observation may be of high value to an estimator, but obtaining this information may incur a high cost due to the computational demands of a perception algorithm or due to large communication bandwidth requirements. Formulating the problem in this way provides a mechanism to balance these diverse costs against information value in a principled manner.

We define the problem as a family of optimization problems with two concrete variants, min-cost-DIF and threshold-DIF. A solution to the problem is in the form of a set of multicast flow rates that determine which pairs of robots communicate at any given time. In min-cost-DIF, the objective is to minimize the sum of link costs, assuming the relative scale of these costs is known. In threshold-DIF, the relative scale of costs is not assumed to be known and the objective is to find a solution that satisfies a given cost threshold.

The DIF problem formulation provides several significant benefits. Because link costs are abstract and dynamic, the problem admits any realistic communication link model and is not limited to the r-disc assumption. The threshold-DIF variant can model the global bandwidth constraint imposed by common shared-channel communication systems. Modeling system elements logically as a graph where flow rates are dynamically optimized avoids the need to manually pre-determine the information architecture of the system. This property is particularly useful for heterogeneous systems with many types of robots that have a range of sensing and computational resources. Although previous work has explored the problem of controlling communication between estimation and control elements (Semsar-Kazerooni and Khorasani, 2009; Kassir et al., 2012), this paper represents the first comprehensive effort to address communication between sensing and estimation.

We present algorithms and analysis for both problem variants. Our solution to min-cost-DIF is based on an adaptation of multicast routing. We prove that min-cost-DIF can be transformed such that existing multicast routing algorithms may be applied, and we present one such algorithm. Our solution to threshold-DIF is based on an optimization method known as the alternating direction method of multipliers (ADMM) (Boyd et al., 2011). We derive a decentralized version of this algorithm which we call distributed ADMM (DADMM) and show how it can be applied to solve threshold-DIF. We analyze convergence and running time for all algorithms and validate these results through simulations including up to 28 nodes.

We also present experimental results that illustrate the behavior of our algorithms and compare information gain performance with simple bandwidth-limiting methods. The task we consider is to track a moving target using multiple types of sensors. For the case of min-cost-DIF, the experimental system consists of one mobile robot equipped with a camera and one auxiliary static ground station. We also present simulation results for two mobile ground robots. For threshold-DIF, the experimental system consists of two outdoor mobile robots, with and without an auxiliary static camera. One robot is equipped with a 2D laser sensor and the other is equipped with a 3D laser. To further evaluate the performance of our algorithms, we present results from Monte Carlo simulations that demonstrate statistical significance.

Our results show that the DIF algorithms efficiently use available communication bandwidth to increase information gain. We observe that sensor data is either processed on-board or transmitted and processed at the ground station appropriately. We also observe that information from multiple sensor sources is communicated selectively based on sensor utility, available bandwidth and route overlap.

The paper is organized as follows. Section 2 discusses related work in the general area of communication limits in decentralized information gathering systems. The DIF problem and its two variants are defined in Section 3. We present algorithms, analysis and experimental evaluation for the case of min-cost-DIF in Section 4. We then address threshold-DIF in Sections 5 and 6: Section 5 presents our decentralized form of ADMM and Section 6 presents algorithms, analysis and experimental evaluation based on the results from Section 5. Section 7 discusses the method of estimating sensor observation utility that we use in our implementations and Section 8 concludes the paper.

2. Related work

Existing work has studied aspects of communication-aware information gathering from several different perspectives. One idea is to simply ease resource limits by boosting the capacity of wireless networks. Multi-radio multi-channel networks (Wu et al., 2000; Xing et al., 2007) can significantly increase network capacity by using multiple communication channels in parallel. Our previous work has shown that a single channel may be reused in a neighbor-to-neighbor architecture while avoiding mutual interference (Kuo and Fitch, 2014). Whereas the aim of these approaches is to maximize the throughput available from source to destination, this paper takes a complementary view. Instead of transmitting as much data as possible, we attempt to transmit only the most valuable data possible. Thus, data with little information value does not consume communication resources and available bandwidth is used efficiently.

From the perspective of optimal control, the entire problem (including communication decisions) can be modeled as a decentralized partially observable Markov decision process (Dec-POMDP) (Gmytrasiewicz and Durfee, 2001; Goldman and Zilberstein, 2003; Carlin and Zilberstein, 2008; Williamson et al., 2008). Partially observable Markov decision processes (POMDPs) are a powerful and general approach but are computationally intractable for large problems due to the ‘curse of dimensionality’. We are interested in problems with many robots and sensors, and we focus on computationally efficient solutions to the more specialized DIF problem.

Related problems in efficient information sharing have been studied in the context of sensor networks. In Kulik et al. (2002) the SPIN routing protocol is introduced as a routing mechanism for sensor networks. Sensor nodes send an advertising message that contains metadata about the sensor information available and potential recipients send requests as required. However, the semantics of the metadata are not specified and are considered application-dependent. In Gupta et al. (2006), a sensor scheduling strategy for multiple sensors is proposed with bounds on the estimation error covariance. The strategy determines the order of sensor selection. Our work presents a general framework which allows for the dynamic selection of subsets of sensors in addition to dynamically selecting the processing platform for the transmitted sensor data.

In the context of target tracking, Chen et al. (2010) propose an algorithm for sensor networks that uses minimal communication by only transmitting relative changes. In our previous work (Xu et al., 2013a), robots learn to predict the observation utility of other robots and adjust inter-robot communication accordingly. These approaches offer promising results for the application of target tracking but do not address information flow for nodes with heterogeneous capabilities and arbitrary utility functions. Our approach seeks to solve the trade-off between communication, computation and sensing in a general, yet computationally efficient, manner.

Another related problem is communication-aware motion planning, where robots choose paths that are advantageous with respect to properties of the communication network, such as connectivity maintenance (Hsieh et al., 2008; Mostofi, 2009; Stachura and Frew, 2011; Lindhé and Johansson, 2013; Twigg et al., 2013; Yan and Mostofi, 2013). Connectivity is important to networked robots, but the problem we propose in this paper has a distinctly different objective. We build on the work in connectivity maintenance and assume that the network is connected. Our focus is on using this connected network efficiently by choosing when data should be transmitted and by choosing the best multicast routing. Instead of planning a path for a robot through its workspace, communication-aware information gathering involves planning a path for information through a network.

A more closely related problem is distributed linear quadratic control (Speyer et al., 2008; Molin and Hirche, 2009; Semsar-Kazerooni and Khorasani, 2009; Kassir et al., 2012), where each controller in a team makes control decisions based on knowledge of the team state vector. The full state vector may not be relevant to a given controller, and so the problem is to identify which state elements should be communicated. Our work again is distinct in that we focus on controlling the flow of observations as opposed to controller state information. Observations may be in the form of raw sensor data or in a processed task-dependent form such as target track estimates. We consider how to best route these observations through the network by balancing the computational effort required to process observations with the cost of communication and the value of this information in its various forms at any given time.

A classic problem in network flow optimization is the minimum cost flow problem (Ahuja et al., 1993). The minimum cost flow problem has known efficient decentralized solutions; however, the DIF problem is more closely related to the multicast network routing problem. This problem is equivalent to the Steiner tree problem on directed graphs which is NP-complete (Ramanathan, 1996). In a special case using network coding, multicast routing can be solved in polynomial time and in a decentralized manner (Cui et al., 2007; Xi and Yeh, 2010). Our algorithms exploit this special case. However, in our implementations we use an approximation that approaches the performance provided by network coding in relatively small networks.

3. Problem formulation

In this section, we define the dynamic information flow problem. We introduce the general problem formulation and then define two important variants: one assumes a priori knowledge of link costs and the other bounds the sum of all link costs by a given global maximum.

3.1. Dynamic information flow

Our goal is to maximize information gain by controlling the flow of information within a decentralized information gathering system subject to communication and processing constraints. Data is continuously produced by sensors and consumed by other elements of the system. We first define a graph structure that models these system elements and the data flow between elements. Given such a model, we then define the dynamic information flow problem in general form.

A decentralized information gathering system is a configuration of several elemental components. Sensors are elements that generate sensor data measured by physical sensing devices, such as laser scanners and cameras. Data from such sensors are transformed into observations by applying algorithms such as object detection and classification. Processors are computational elements that perform these processing tasks. Processors may be cascaded if necessary. The observations generated by processors act as input into estimator elements that maintain belief states. For example, an estimator could be an extended Kalman filter (EKF) in the tracking case or an occupancy grid in the mapping case.

Data flows via a communication system from sensors to processors, from processors to other processors and from processors to estimators. The topology of the resulting network is a directed acyclic graph, where information value, communication limits and computation limits induce costs or capacity constraints on the links in the graph. The induced link costs may vary according to the properties of the underlying communication mechanism, which may not be the same for all links. Elements of the system generally are physically distributed among multiple robots or ground stations and therefore communicate using an inter-robot communication system such as a wireless network. It is also possible for multiple elements to reside within a single physical platform and communicate using an intra-robot communication system such as a wired network or in-memory communication.

These system elements can also be viewed in terms of the well-known network flow problem (Ahuja et al., 1993) as follows. The commodity that flows through the network in this case is information in the form of sensor data or processed observations. Sensors correspond to supply nodes, estimators correspond to demand nodes, and processors correspond to intermediate, or transshipment, nodes. Communication links between nodes correspond to arcs or links between nodes of the network.

An example diagram of a decentralized information gathering system is shown in Figure 1(a) with the corresponding system topology represented by the directed acyclic graph shown in Figure 1(b). These diagrams could correspond, for example, to the case of two robots tracking a target using different types of sensors and with access to an off-board processing station. For target detection, each robot either processes its raw sensor data on-board or transmits the data to be processed off-board. Moreover, the robots can choose to share either raw sensor data or processed point observations.

Fig. 1.

(a) An example of a decentralized information gathering system with two robots and one off-board processor. (b) The corresponding network topology in our dynamic information flow formulation.

Formally, a decentralized information gathering team is represented by a directed acyclic graph (DAG) G = {V, E} where V is the set of vertices (or equivalently, nodes) and E is the set of edges or links. In the graph G, for every i, k ∈ V, if (i, k) ∈ E then we say that k is a child node of i and i is a parent node of k. The set $C (i) = \{k \in V : (i, k) \in E\}$ is the set of children of node i. Similarly, the set $P (k) = \{i \in V : (i, k) \in E\}$ is defined as the set of parents of node k. We define $N (i) = P (i) \cup C (i)$ as the neighborhood of node i. A node with no parents is called a head node. A node with no children is called a tail node. We denote the depth of the graph G as κ(G), defined as the number of nodes in the longest path from a head node to a tail node. The set $\bar{C} (i) = \{k \in V : \text{there exists a directed path from } i \text{ to } k\}$ is referred to as the set of successors of node i. The set $\bar{P} (k) = \{i \in V : \text{there exists a directed path from } i \text{ to } k\}$ is referred to as the set of ancestors of node k.
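The graph definitions above can be illustrated with a short sketch. The topology, node names and helper functions below are hypothetical and chosen only for illustration; they are not taken from the systems described in this paper.

```python
from collections import defaultdict
from functools import lru_cache


def build_relations(edges):
    """Return (children, parents): C(i) and P(k) for directed edges (i, k)."""
    children, parents = defaultdict(set), defaultdict(set)
    for i, k in edges:
        children[i].add(k)
        parents[k].add(i)
    return children, parents


def reachable(start, step):
    """Nodes reached from `start` by following `step` one or more times."""
    seen, stack = set(), list(step[start])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(step[node])
    return seen


def depth(nodes, children):
    """kappa(G): number of nodes on the longest directed path."""
    @lru_cache(maxsize=None)
    def longest_from(i):
        return 1 + max((longest_from(k) for k in children[i]), default=0)
    return max(longest_from(i) for i in nodes)


# Hypothetical topology: one sensor, two cascaded processors, one estimator.
edges = [("s1", "p1"), ("s1", "p2"), ("p1", "p2"), ("p1", "e1"), ("p2", "e1")]
nodes = {"s1", "p1", "p2", "e1"}
children, parents = build_relations(edges)

neighborhood_p1 = parents["p1"] | children["p1"]  # N(p1) = P(p1) U C(p1)
successors_s1 = reachable("s1", children)         # successors of s1
ancestors_e1 = reachable("e1", parents)           # ancestors of e1
graph_depth = depth(nodes, children)              # longest path: s1, p1, p2, e1
```

In this sketch the depth is counted in nodes rather than edges, following the definition of κ(G) given above.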

The set of nodes is partitioned into three mutually exclusive subsets: the set of sensors V_s which act as sources, the set of processor nodes V_p which act as intermediate nodes and the set of estimator nodes V_e which act as destination nodes. Links connect nodes in V_s to nodes in V_p , nodes within V_p and nodes in V_p to nodes in V_e .

Sensor data is multicast from each sensor node m ∈ V_s to all connected estimator nodes j ∈ V_e. Sensor m produces data at a fixed rate and this data is consumed by connected estimators at the same rate. To represent this production/consumption rate we introduce the variable $r_{i}^{m} (j)$ at node i for each sensor m and destination j. Variable $r_{i}^{m} (j)$ is called the inward flow and is set to sensor m’s data rate if i = m, to the negative of sensor m’s data rate if i = j, and to 0 otherwise. The average data rate of the flow passing through link (i, k) originating from source m and destined to j is defined as $x_{ik}^{m} (j)$. As an example, Figure 2 shows a graph of an acyclic network with a single source. The inward flow variables are indicated for the source and destination nodes. A possible flow variable configuration is also shown for each link inside square brackets. The left entry is for j₁ and the right entry is for j₂.

Fig. 2.

An example routing configuration for the butterfly network. Numbers within brackets are the flow values for each link. The first value corresponds to destination j₁ and the second corresponds to destination j₂.

A set of flow variables $\{x_{ik}^{m} (j) : j \in V_{e}\}$ will lead to an average total flow of $h_{ik}^{m}$ on link (i, k). The relation between the total flow and the destination-specific flow variables depends on the underlying multicast implementation. With network coding, the total flow is simply the maximum flow over all destinations as defined in (1) (Ahlswede et al., 2000). This relation will be assumed for the current problem formulation. The general validity of this assumption is further discussed in Section 4.3:

h_{ik}^{m} = \max_{j \in V_{e}} x_{ik}^{m} (j)

(1)

Communication load, computation load and sensor observation utility induce a net link cost of $c_{ik}^{m}$ per unit of data flow from source m passing through link (i, k). The link cost is multiplied by the total flow $h_{ik}^{m}$ to obtain the total link cost arising from source m. Summing over all sources, link (i, k) has a total cost of $\sum_{m} c_{ik}^{m} h_{ik}^{m}$ .
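As a small worked illustration of the total flow relation (1) and the resulting per-link cost, the sketch below evaluates both for a single link. The source names, destination names, flow values and costs are hypothetical and serve only to show the arithmetic.

```python
def total_flow(x_link):
    """Relation (1): x_link maps destination j to x_ik^m(j); returns h_ik^m."""
    return max(x_link.values(), default=0.0)


def link_cost(costs, flows):
    """Total cost of one link: sum over sources m of c_ik^m * h_ik^m."""
    return sum(costs[m] * total_flow(flows[m]) for m in flows)


# Hypothetical flows on one link (i, k), indexed by source then destination.
flows = {"m1": {"j1": 2.0, "j2": 1.0}, "m2": {"j1": 0.0, "j2": 3.0}}
# Hypothetical per-unit link costs c_ik^m for each source.
costs = {"m1": 0.5, "m2": 2.0}

h = {m: total_flow(x) for m, x in flows.items()}
cost = link_cost(costs, flows)  # 0.5 * 2.0 + 2.0 * 3.0
```

Under the network coding assumption, the flow from m1 to both destinations occupies only 2.0 units of the link rather than the unicast sum of 3.0.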

Sensor observations induce a reward when reaching an estimator. In order to represent this reward, the information value of sensor m to estimator j is subtracted from the cost of each link incident to j.

Because a sensor observation may have little value for a given estimator, the system requires a mechanism by which a sensor can decide not to send any data to a certain destination. We model this option by adding a virtual, zero-cost link directly from each sensor to all connected estimators.

We now define the general dynamic information flow problem as follows. Given link costs $\{c_{ik}^{m}\}$ and inward flow rates $\{r_{i}^{m} (j)\}$, choose the set of flow variables $\{x_{ik}^{m} (j)\}$ such that the total cost summed over all links in the network is minimized subject to constraints. Link costs and constraints may vary over time.

3.2. Min-cost-DIF

We define the first concrete form of the general problem, min-cost-DIF, according to the constrained optimization (2)–(5). Information value, communication and computation resource demand are represented using link costs. This formulation is appropriate for situations where the relative costs between the items are known a priori:

minimize \sum_{(i, k) \in E, m \in V_{s}} c_{ik}^{m} h_{ik}^{m}

(2)

subject to x_{ik}^{m} (j) \geq 0

(3)

h_{ik}^{m} = \max_{j \in V_{e}} x_{ik}^{m} (j)

(4)

\sum_{l \in P (i)} x_{li}^{m} (j) - \sum_{k \in C (i)} x_{ik}^{m} (j) + r_{i}^{m} (j) = 0

(5)

The first constraint (3) ensures that flow is always non-negative. The second constraint (4) represents the multicast condition and the third constraint (5) ensures that the sum of all inward and outward flow at a node is zero.
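To make the formulation concrete, the following sketch evaluates the objective (2) and checks constraints (3) and (5) for a candidate set of flows from a single source. It is a feasibility checker, not a solver. The network, costs and unit data rate are hypothetical, and the virtual zero-cost links and information value terms described in Section 3.1 are omitted for brevity.

```python
def objective(c, x):
    """Objective (2) for one source: sum of c_ik * h_ik, with h_ik from (4)."""
    return sum(c[e] * max(x[e].values()) for e in x)


def is_feasible(nodes, dests, x, r, tol=1e-9):
    """Check non-negativity (3) and flow conservation (5) for one source."""
    if any(v < -tol for per_dest in x.values() for v in per_dest.values()):
        return False
    for i in nodes:
        for j in dests:
            inflow = sum(f[j] for (l, k), f in x.items() if k == i)
            outflow = sum(f[j] for (l, k), f in x.items() if l == i)
            if abs(inflow - outflow + r.get((i, j), 0.0)) > tol:
                return False
    return True


# Hypothetical example: source m, processors a and b, estimators j1 and j2.
nodes, dests = ["m", "a", "b", "j1", "j2"], ["j1", "j2"]
c = {("m", "a"): 2.0, ("m", "b"): 2.0, ("a", "j1"): 1.0,
     ("a", "j2"): 1.0, ("b", "j2"): 1.0}
# Inward flow: +1 at the source, -1 at each destination, 0 elsewhere.
r = {("m", "j1"): 1.0, ("m", "j2"): 1.0,
     ("j1", "j1"): -1.0, ("j2", "j2"): -1.0}

# Route both destinations through processor a, so link (m, a) is shared.
shared = {("m", "a"): {"j1": 1.0, "j2": 1.0},
          ("m", "b"): {"j1": 0.0, "j2": 0.0},
          ("a", "j1"): {"j1": 1.0, "j2": 0.0},
          ("a", "j2"): {"j1": 0.0, "j2": 1.0},
          ("b", "j2"): {"j1": 0.0, "j2": 0.0}}
# Route j2 through processor b instead, so no link is shared.
split = {("m", "a"): {"j1": 1.0, "j2": 0.0},
         ("m", "b"): {"j1": 0.0, "j2": 1.0},
         ("a", "j1"): {"j1": 1.0, "j2": 0.0},
         ("a", "j2"): {"j1": 0.0, "j2": 0.0},
         ("b", "j2"): {"j1": 0.0, "j2": 1.0}}

shared_cost = objective(c, shared)  # link (m, a) is paid for once
split_cost = objective(c, split)    # links (m, a) and (m, b) are both paid for
```

Both routings satisfy the constraints, but the shared routing is cheaper because the multicast condition (4) charges link (m, a) for the maximum of the two destination flows rather than their sum.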

In min-cost-DIF, link costs may change over time due to changes in communication and processing costs as well as changes in sensor utility. For example, robots may move closer to or further away from each other, resulting in a change in communication costs. Sensor viewpoint may also change, leading to a change in the value of on-board sensor observations.

3.3. Threshold-DIF

We introduce a second problem variant, threshold-DIF, to represent the case where the correct scale between communication costs, computation costs and information value is not known a priori. In this case, communication bandwidth and processing power are viewed as limited resources. The goal of threshold-DIF is thus to maximize information gain subject to communication bandwidth and processing power constraints.

We augment (2)–(5) to include two additional constraints (6) and (7) and define three additional input parameters, $ν_{ik}^{m}$, C_ik and K_s, to represent resource capacity limits. Constraint (6) bounds the weighted sum of flows originating from different sensors to respect a fixed capacity C_ik. The summation over all sensors in V_s is required since a link may carry messages originating from different sensors. Weights $\{ν_{ik}^{m}\}$ are used to scale flow values $h_{ik}^{m}$ on a per-link basis. For example, variations in required communication bandwidth due to link quality can be modeled by assigning appropriate values to $\{ν_{ik}^{m}\}$. Similar to link costs, these variables may change over time:

\sum_{m \in V_{s}} ν_{ik}^{m} h_{ik}^{m} \leq C_{ik}

(6)

\sum_{(i, k) \in S_{s}} \sum_{m} ν_{ik}^{m} h_{ik}^{m} \leq K_{s}

(7)

To motivate constraint (7), consider the decentralized information gathering network diagram in Figure 3 which is a subset of the diagram in Figure 1. If the two robots in this example exclusively use wireless communications to share observations, then all four links that cross the robot boundaries share a common resource (the wireless communication medium). This constraint is indicated in the figure by the dashed link. Moreover, if each robot only uses one computer for all processing requirements then the links from both sensors to each of the object detection modules share another common resource, the on-board processing computer. These constraints are indicated in the figure by dotted links. This class of constraints, which we call inter-link constraints, is represented in (7), where K_s is a fixed upper bound on resource s and $S_{s}$ is the set of links sharing resource s. Again, the flow rates are weighted because inter-link constraints impose bounds on the total flow across different links. These links could either be emanating from different nodes, as shown in the example in Figure 3, or from the same node, when a node sends to many nodes using the same medium.

Fig. 3.

An example of a system with inter-link constraints. Links tagged with the dashed line share a wireless communication medium, while those tagged with a dotted line share a common processing resource.
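The difference between the per-link constraint (6) and the inter-link constraint (7) can be seen in a minimal sketch. All link names, flow values, weights and capacities below are hypothetical; the example assumes two robots whose links in each direction share one wireless medium.

```python
def link_capacity_ok(h, nu, C):
    """Constraint (6): on every link, sum_m nu_ik^m * h_ik^m <= C_ik."""
    return all(sum(nu[e][m] * h[e][m] for m in h[e]) <= C[e] for e in h)


def shared_resource_ok(h, nu, S, K):
    """Constraint (7): for every resource s, weighted flow over S_s <= K_s."""
    return all(
        sum(nu[e][m] * h[e][m] for e in S[s] for m in h[e]) <= K[s]
        for s in S
    )


# Hypothetical total flows h_ik^m, indexed by link then source.
h = {("r1", "r2"): {"laser": 3.0}, ("r2", "r1"): {"camera": 4.0}}
# Hypothetical flow scale factors nu_ik^m.
nu = {("r1", "r2"): {"laser": 1.0}, ("r2", "r1"): {"camera": 1.0}}
C = {("r1", "r2"): 5.0, ("r2", "r1"): 5.0}        # per-link capacities
S = {"wireless": [("r1", "r2"), ("r2", "r1")]}    # links sharing a resource
K = {"wireless": 6.0}                             # shared resource capacity

per_link_ok = link_capacity_ok(h, nu, C)     # 3 <= 5 and 4 <= 5
shared_ok = shared_resource_ok(h, nu, S, K)  # 3 + 4 <= 6 does not hold
```

Each link is individually within its capacity, yet the two links together exceed the shared wireless budget, which is the situation constraint (7) is intended to capture.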

For convenience, the main terms used in the DIF problem formulation are listed in Table 1. We note that i, k, m and j all correspond to nodes in the network graph. We use m to denote a source node, j to denote a destination node, i to denote the incumbent node and k to denote the incident node of an edge.

Table 1.

List of symbols.

$r_{i}^{m} (j)$	Inward flow rate at node i corresponding to source m and destined to j
$x_{ik}^{m} (j)$	Flow rate through link (i, k) originating from source m and destined to j
$h_{ik}^{m}$	Total flow rate through link (i, k) originating from source m
$c_{ik}^{m}$	Cost per unit data flow through link (i, k) from source m
C_ik	Capacity of link (i, k)
K_s	Capacity of resource s
$ν_{ik}^{m}$	Flow scale factor for source m and link (i, k)

3.4. Sensor utility

Sensor utility in the DIF formulation is a measure of the relative importance of sensor data with respect to a specific estimator. The importance of sensor data is evaluated based on the improvement it induces in the estimate. Hence, sensor utility depends both on the sensor and on the current state of the estimator. Sensor utility can be computed exactly using the partially observable Markov decision process (POMDP) formulation of the information gathering problem; however, this is intractable since Dec-POMDPs are nondeterministic exponential time-complete (Bernstein et al., 2002). In Section 4.3, we present a myopic approximation to sensor utility that was used in our experiments. A further discussion of the issue of sensor utility approximation is provided in Section 7.

4. Min-cost-DIF

In this section, we present a message-passing algorithm that solves the min-cost-DIF problem. We introduce a mapping that transforms an instance of min-cost-DIF into an instance of multicast network routing, prove equivalence and show that an algorithm that was originally developed for multicast network routing also finds an optimal solution to min-cost-DIF. We then describe our implementation of this algorithm in the context of min-cost-DIF along with empirical analysis in simulation that evaluates scalability. We also provide experimental demonstrations both in simulation and using a system of one mobile robot and one fixed ground station. Finally, we present results from a Monte Carlo simulation that statistically validates the performance advantage of our method.

4.1. Min-cost-DIF using multicast network routing

An instance of min-cost-DIF can be transformed into an instance of multicast network routing (Cui et al., 2007; Xi and Yeh, 2010) as follows. The flow variable x_ik (j) is replaced with t_i (j)ϕ_ik (j), where t_i (j) is the total flow passing through i and destined to j while ϕ_ik (j) is the routing variable for link (i, k); more specifically, it is the fraction of t_i (j) that is routed to k.

Following this change of variables, the resulting formulation is given by the optimization problem (8)–(12). Constraint (10) states that the sum of the routing variables for each node is equal to one, while (11) and (12) are equivalent to (4) and (5):

minimize \sum_{(i, k) \in E} c_{ik} h_{ik}

(8)

subject to ϕ_{ik} (j) \geq 0

(9)

\sum_{k \in C (i)} ϕ_{ik} (j) = 1

(10)

h_{ik} = max_{j} t_{i} (j) ϕ_{ik} (j)

(11)

t_{i} (j) = r_{i} (j) + \sum_{l \in P (i)} t_{l} (j) ϕ_{li} (j)

(12)
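The change of variables can be illustrated with a minimal Python sketch that evaluates constraints (12) and (11) and the objective (8) for a fixed set of routing variables. The function names, the dictionary layout and the five-node network are hypothetical choices made for illustration; they are not taken from the implementation described in this paper.

```python
from collections import defaultdict


def topological_order(nodes, edges):
    """Kahn's algorithm on a DAG given as a list of (parent, child) edges."""
    indegree = {n: 0 for n in nodes}
    children = defaultdict(list)
    for i, k in edges:
        indegree[k] += 1
        children[i].append(k)
    ready = [n for n in nodes if indegree[n] == 0]
    order = []
    while ready:
        n = ready.pop()
        order.append(n)
        for k in children[n]:
            indegree[k] -= 1
            if indegree[k] == 0:
                ready.append(k)
    return order, children


def evaluate_routing(nodes, edges, cost, phi, r, destinations):
    """Return node flows t, link flows h and the objective (8)."""
    order, children = topological_order(nodes, edges)
    # Constraint (12): start from the injected flow r_i(j) ...
    t = {i: {j: r.get((i, j), 0.0) for j in destinations} for i in nodes}
    # ... and add the share forwarded by each parent, parents first.
    for i in order:
        for k in children[i]:
            for j in destinations:
                t[k][j] += t[i][j] * phi[(i, k)].get(j, 0.0)
    # Constraint (11): a coded link carries the largest per-destination flow.
    h = {(i, k): max(t[i][j] * phi[(i, k)].get(j, 0.0) for j in destinations)
         for (i, k) in edges}
    objective = sum(cost[e] * h[e] for e in edges)
    return t, h, objective


# Hypothetical network: source s, relays a and b, destinations d1 and d2.
nodes = ["s", "a", "b", "d1", "d2"]
edges = [("s", "a"), ("s", "b"), ("a", "d1"), ("a", "d2"), ("b", "d2")]
cost = {("s", "a"): 1.0, ("s", "b"): 2.0, ("a", "d1"): 1.0,
        ("a", "d2"): 1.0, ("b", "d2"): 1.0}
phi = {("s", "a"): {"d1": 1.0, "d2": 0.5}, ("s", "b"): {"d2": 0.5},
       ("a", "d1"): {"d1": 1.0}, ("a", "d2"): {"d2": 1.0},
       ("b", "d2"): {"d2": 1.0}}
r = {("s", "d1"): 1.0, ("s", "d2"): 1.0}
t, h, objective = evaluate_routing(nodes, edges, cost, phi, r, ["d1", "d2"])
print(h[("s", "a")], h[("s", "b")], objective)
```

In this example the link (s, a) carries the larger of its two per-destination flows rather than their sum, which is the multicast saving expressed by (11).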

Given this mapping, existing algorithms for multicast network routing can be applied. Here we summarize one such algorithm, originally presented in Cui et al. (2007). The algorithm is based on message passing and relies on obtaining the marginal cost δ_ik (j) for each link. The marginal cost is the rate at which the total cost increases due to a unit increase in flow along that link and is given by (13):

δ_{ik} (j) = \begin{cases} c_{ik} / n + \sum_{l \in C (k)} ϕ_{kl} (j) δ_{kl} (j) & \text{if } t_{i} (j) ϕ_{ik} (j) \text{ and } n - 1 \text{ other flows on link } (i, k) \text{ are the maximum} \\ \sum_{l \in C (k)} ϕ_{kl} (j) δ_{kl} (j) & \text{otherwise} \end{cases}

(13)
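As a rough illustration of (13), the following Python sketch computes the marginal costs in a single upward pass over a small tree. The function name, the data layout, the tolerance used to detect ties at the maximum and the example costs are all assumptions made for this sketch.

```python
def marginal_costs(reverse_order, children, cost, phi, t, destinations, tol=1e-12):
    """Upward sweep computing delta[(i, k)][j] as in (13)."""
    delta = {}
    for i in reverse_order:              # children are visited before parents
        for k in children[i]:
            flows = {j: t[i][j] * phi[(i, k)].get(j, 0.0) for j in destinations}
            peak = max(flows.values())
            # n: number of per-destination flows that attain the link maximum
            n = sum(1 for f in flows.values() if abs(f - peak) <= tol)
            delta[(i, k)] = {}
            for j in destinations:
                downstream = sum(phi[(k, l)].get(j, 0.0) * delta[(k, l)][j]
                                 for l in children[k])
                local = cost[(i, k)] / n if abs(flows[j] - peak) <= tol else 0.0
                delta[(i, k)][j] = local + downstream
    return delta


# Hypothetical tree: s -> a, then a -> d1 and a -> d2, unit flow to each destination.
children = {"s": ["a"], "a": ["d1", "d2"], "d1": [], "d2": []}
cost = {("s", "a"): 4.0, ("a", "d1"): 1.0, ("a", "d2"): 2.0}
phi = {("s", "a"): {"d1": 1.0, "d2": 1.0},
       ("a", "d1"): {"d1": 1.0}, ("a", "d2"): {"d2": 1.0}}
t = {"s": {"d1": 1.0, "d2": 1.0}, "a": {"d1": 1.0, "d2": 1.0}}
delta = marginal_costs(["d1", "d2", "a", "s"], children, cost, phi, t, ["d1", "d2"])
print(delta[("s", "a")])
```

Both destination flows attain the maximum on link (s, a), so the cost of that link is split between them (n = 2) before the downstream marginal costs are added.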

Min-cost-DIF can be solved for each source m independently and in parallel. The full problem can be decomposed into independent sub-problems, one for each source, since the objective is additive and there are no inter-source constraints. This is evident from the problem formulation (2)–(5). Therefore, for simplicity of notation the subscript m is dropped from all variables in this section.

At the start of the algorithm, the routing variables {ϕ_ik (j)} are initialized arbitrarily such that they obey constraints (9) and (10). The routing variables are then updated repeatedly, so that after iteration t they are set to $ϕ_{ik}^{t + 1} (j) = ϕ_{ik}^{t} (j) + Δ ϕ_{ik} {(j)}^{t}$. The update direction Δϕ_ik (j)^t is defined in (14), where E_j is the set of edges belonging to the subgraph containing the ancestors of destination j, α is the step size and δ_{i,min} (j) = min_k δ_ik (j):

Δ ϕ_{ik} {(j)}^{t} = \begin{cases} 0 & \text{if } (i, k) \notin E_{j} \\ - \min \left\{ ϕ_{ik}^{t} (j), \frac{α (δ_{ik} (j) - δ_{i, min} (j))}{t_{i} (j)} \right\} & \text{if } δ_{ik} (j) \neq δ_{i, min} (j) \\ - \sum_{p : δ_{ip} (j) \neq δ_{i, min} (j)} Δ ϕ_{ip} {(j)}^{t} & \text{if } δ_{ik} (j) = δ_{i, min} (j) \end{cases}

(14)
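The update at a single node can be sketched in Python as follows. This is a minimal illustration rather than the implementation used in our system. It assumes that only links able to reach destination j take part in the update, that any routing fraction removed from costlier links is shared equally when several links tie for the minimum marginal cost, and that a node with no flow leaves its routing variables unchanged. The function name and arguments are hypothetical.

```python
def routing_update(phi_i, delta_i, t_ij, alpha, allowed):
    """One application of update (14) at node i for one destination j.

    phi_i:   dict child -> current routing fraction phi_ik(j)
    delta_i: dict child -> marginal cost delta_ik(j)
    t_ij:    total flow t_i(j) through node i destined to j
    alpha:   step size
    allowed: children k whose link (i, k) can reach destination j
    """
    new_phi = dict(phi_i)
    if t_ij <= 0 or not allowed:
        return new_phi                   # assumption: nothing to reroute
    d_min = min(delta_i[k] for k in allowed)
    best = [k for k in allowed if delta_i[k] == d_min]
    shifted = 0.0
    for k in allowed:
        if delta_i[k] != d_min:
            # Reduce the fraction on costlier links, never below zero.
            step = min(phi_i[k], alpha * (delta_i[k] - d_min) / t_ij)
            new_phi[k] = phi_i[k] - step
            shifted += step
    for k in best:
        # Assumption: ties at the minimum share the shifted fraction equally.
        new_phi[k] += shifted / len(best)
    return new_phi


# Hypothetical node with two children and unit flow.
updated = routing_update({"a": 0.5, "b": 0.5}, {"a": 1.0, "b": 3.0},
                         t_ij=1.0, alpha=0.1, allowed=["a", "b"])
print(updated)
```

Because the fraction removed from the costlier links is added back to the cheapest ones, the updated routing variables still sum to one, as required by constraint (10).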

The algorithm runs synchronously. First, the head nodes send messages with their flow contributions to their children. Once a node receives messages from all of its parents, it passes the message to its own children and so forth. The purpose of this downward sweep is to allow nodes to compute the flow of the current routing configuration. The flow values are necessary to compute the marginal costs required in the upward sweep. The downward sweep is followed by an upward sweep during which the marginal costs are computed according to (13) and the routing variables are updated according to (14). The downward and upward sweeps are decentralized, synchronous and are guaranteed to visit every node. Their sequence is dictated by Algorithm 1. The synchronicity property of Algorithm 1 is proved in Lemma 1.

Algorithm 1. Synchronous message passing on DAGs
For node i:
if i is a head node then
    Perform a downward update and send downward message to children
end if
loop
    if a downward message is received from all parents then
        Perform a downward update and send downward message to children
        if i is a tail node then
            Perform an upward update and send upward message to parents
        end if
    end if
    if an upward message is received from all children then
        Perform an upward update and send upward message to parents
        if i is a head node then
            Perform a downward update and send downward message to children
        end if
    end if
end loop
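The ordering of updates produced by Algorithm 1 can be explored with a small single-process simulation. The sketch below is illustrative only: it replaces the wireless links with a first-in, first-out queue, records the updates instead of performing them and stops after a fixed number of sweeps (the `rounds` argument, which is an assumption of the sketch, since Algorithm 1 itself loops indefinitely).

```python
from collections import deque


def simulate(parents, children, rounds):
    """Simulate Algorithm 1 and return a log of (node, kind, count) updates."""
    down_done = {n: 0 for n in parents}
    up_done = {n: 0 for n in parents}
    down_inbox = {n: 0 for n in parents}   # downward messages received so far
    up_inbox = {n: 0 for n in parents}     # upward messages received so far
    log = []
    queue = deque()

    def downward(n):
        down_done[n] += 1
        log.append((n, "down", down_done[n]))
        for c in children[n]:
            queue.append((c, "down"))
        if not children[n]:                # tail node: turn the sweep around
            upward(n)

    def upward(n):
        up_done[n] += 1
        log.append((n, "up", up_done[n]))
        for p in parents[n]:
            queue.append((p, "up"))
        if not parents[n] and up_done[n] < rounds:   # head node: next sweep
            downward(n)

    for n in parents:
        if not parents[n]:                 # head nodes start the first sweep
            downward(n)
    while queue:
        n, kind = queue.popleft()
        if kind == "down":
            down_inbox[n] += 1
            if down_inbox[n] == len(parents[n]):
                down_inbox[n] = 0
                downward(n)
        else:
            up_inbox[n] += 1
            if up_inbox[n] == len(children[n]):
                up_inbox[n] = 0
                upward(n)
    return log


# Hypothetical diamond-shaped DAG: s -> a, s -> b, a -> d, b -> d.
parents = {"s": [], "a": ["s"], "b": ["s"], "d": ["a", "b"]}
children = {n: [c for c in parents if n in parents[c]] for n in parents}
log = simulate(parents, children, rounds=3)
for entry in log[:8]:
    print(entry)
```

In the resulting log, every node alternates between downward and upward updates, which is the behavior that the synchronicity argument relies on.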

Due to possible changes in link costs, this message passing optimization runs continuously throughout system operation. As the system configuration changes, link costs are updated with new values. To ensure convergence, the interval between updates is set to an adequate time period. Further details on the appropriate length of the interval between updates can be found in Section 4.3.

Lemma 1. In Algorithm 1, after node i has performed its t-th downward update and before forwarding this update:

Node i and all of its successors would have performed exactly t − 1 upward updates;

All of its ancestors would have performed exactly t downward updates.

Proof. Suppose one of node i’s successors has performed t′ > t − 1 upward updates. This means that at least one tail node among the successors has performed t′ downward updates. This is impossible because node i has so far forwarded only t − 1 downward messages. Since node i has performed t downward updates, its ancestors have performed at least t downward updates. Now, suppose one of node i’s ancestors has performed t″ downward updates where t″ > t. Then, the head nodes in the ancestry have performed at least t″ downward updates. This in turn means that they have performed t″ − 1 upward updates, which means the tail nodes have performed t″ − 1 > t − 1 upward updates, which is impossible as just shown. This means that node i has forwarded at least t′ − 1 ≥ t downward updates, which leads to a contradiction. □

Due to observation rewards, negative costs may be assigned to links incident to an estimator j ∈ V_e . These negative costs are handled within the framework of the multicast network routing algorithm by solving an equivalent problem. A large enough constant $\bar{c}$ is added to all {c_ij : j ∈ V_e } to obtain a set of non-negative cost variables {c′_ik} defined in (15).

The equivalence of solving the optimization problem (8)–(12) to solving the problem with cost variable c′_ik instead of c_ik is proved in Theorem 1:

c'_{ik} = \begin{cases} c_{ik} + \bar{c} & \text{if } k \in V_{e} \\ c_{ik} & \text{otherwise} \end{cases}

(15)

Theorem 1. Replacing link cost c_ik in problem (8)–(12) with c′_ik defined in (15) results in a problem equivalent to (8)–(12).

Proof. For every link (i, j) such that j ∈ V_e , constraint (11) turns into the equality relation (16) instead since the only flow that should run along that link is the flow destined to estimator j:

h_{ij} = t_{i} (j) ϕ_{ij} (j), \forall j \in V_{e}, \forall (i, j) \in E

(16)

After adding $\bar{c}$ to the cost of these links to obtain {c′_ij}, a total of $\bar{c} \sum_{(i, j)} h_{ij}$ is added to the problem objective. Since $\bar{c}$ is constant, we now proceed to prove that $\sum_{(i, j)} h_{ij}$ is constant. After substituting h_ij from (16), we obtain (17):

\sum_{(i, j)} h_{ij} = \sum_{j \in V_{e}} \sum_{i \in P (j)} t_{i} (j) ϕ_{ij} (j)

(17)

The right-hand side of the equation is equal to the total flow arriving at a destination node summed over all destinations. By definition, this flow is equal to the source flow multiplied by the number of destinations and hence is constant. Adding a constant to the objective does not change the set of minimizers, so the two problems are equivalent. □

4.2. Analysis

Subject to the choice of step size parameter α, the multicast routing algorithm is guaranteed to converge to the global optimum (Cui et al., 2007). Since a DAG contains no loops by definition, no contingencies are required to avoid routing loops. By Theorem 1, a min-cost-DIF instance with negative costs on links to an estimator can still be solved using the multicast routing algorithm by adding a sufficiently large positive constant to all such links.

The running time of multicast network routing with network coding is not explicitly provided in Xi and Yeh (2010) and Cui et al. (2007) but is implied to be polynomial in the size of the network. In practice, we have observed a polynomial rate of increase as a function of network size, as shown in Section 4.3 below.

4.3. Implementation and scalability

Implementing multicast routing for min-cost-DIF involves three main challenges. First, we need to allow all nodes to find the available sources and destinations in a decentralized manner. Second, we need to ensure that the nodes have a suitable mechanism to compute any changing input parameters. Finally, we must choose a suitable multicast policy to implement the chosen flow rates {x_ik (j)}.

Each node must find the set of sources and destinations to which it is connected in the network. Initially, each node is aware of its direct neighbors only. By performing only one downward sweep and one upward sweep of message passing described in Algorithm 1, each node can obtain the list of sources and destinations to which it is connected. In our implementation, the downward messages contain the set of source identities received so far and the upward messages contain the set of destination identities received so far.

Link costs are continuously recomputed as the team configuration changes over the course of the mission. In our implementation, communication costs are simply set proportional to inter-robot distance. Processing costs are assumed to be constant throughout system operation. Sensor utility, on the other hand, can be more difficult to compute since each estimator must receive observations from a sensor in order to evaluate this utility. We maintain sensor utility values dynamically through an exploration–exploitation model. Each node obeys the chosen flow rate x_ik (j) with probability (1−ε) (exploitation) and switches to another randomly selected flow rate with probability ε (exploration). The value of ε is set to a small positive number less than one. More sophisticated alternatives using machine learning methods are possible, such as in Xu et al. (2013a). Section 7 provides further discussion on this issue.
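The exploration–exploitation rule can be sketched in a few lines of Python. The set of candidate rates from which an exploratory rate is drawn is not specified above, so the discrete list used here, like the function name and the value of ε, is purely an assumption for illustration.

```python
import random


def choose_flow_rate(chosen_rate, candidate_rates, epsilon, rng=random):
    """Follow the optimized rate with probability 1 - epsilon, else explore."""
    if rng.random() < epsilon:
        # Exploration: switch to a different, randomly selected rate.
        alternatives = [x for x in candidate_rates if x != chosen_rate]
        if alternatives:
            return rng.choice(alternatives)
    # Exploitation: obey the flow rate chosen by the optimization.
    return chosen_rate


# Hypothetical usage with five candidate rates and epsilon = 0.1.
rng = random.Random(0)
rates = [0.0, 25.0, 50.0, 75.0, 100.0]
samples = [choose_flow_rate(100.0, rates, 0.1, rng) for _ in range(1000)]
print(sum(s == 100.0 for s in samples) / len(samples))
```

Occasional exploratory transmissions give an estimator fresh observations from a sensor whose flow would otherwise be zero, so that its utility estimate for that sensor does not go stale.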

The chosen flow rates {x_ik (j)} are implemented using a multicast policy that determines how the inward flow of messages at a node is distributed amongst its children. In our problem formulation, we assume that network coding is used. For small-sized networks, network coding can introduce unnecessary complication with little performance advantage in practice (Keshavarz-Haddad and Riedi, 2008). As an alternative, multicast routing can be implemented without network coding by using randomization. The probability of sending a given inward message along a given outward edge is set proportional to the flow variable x_ik (j). In this case, the average total flow $\bar{h}_{ik}$ through the link for a source flow rate of r is given by (18) instead of the network coding relation given in (1):

{\bar{h}}_{ik} = r - \prod_{j} (r - x_{ik} (j))

(18)

For a given set of flow variables, randomization will result in higher total flow through a link. The extra capacity required in comparison to network coding is given by the relation in (19). From the relation, we deduce that the gap is zero if the maximum flow variable over a link is equal to either zero or the source flow and that the gap is less significant when there are fewer destinations. Therefore, for ease of implementation we use randomization to implement the multicast policy. However, the total flow is still approximated by (1) since (18) otherwise leads to a non-convex problem. We found this approximation to be valid in practice:

\begin{aligned} {\bar{h}}_{ik} - h_{ik} & = r - h_{ik} - \prod_{j} (r - x_{ik} (j)) \\ & \leq r - h_{ik} - {(r - \max_{j \in V_{e}} x_{ik} (j))}^{N_{j}} \\ & = r - \max_{j \in V_{e}} x_{ik} (j) - {(r - \max_{j \in V_{e}} x_{ik} (j))}^{N_{j}} \end{aligned}

(19)
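The size of this gap can be checked numerically. The following Python sketch evaluates the randomized flow (18), the network coding flow (the maximum over destinations) and the bound (19) for a few sets of flow variables. It assumes flow rates normalized so that the source rate is r = 1; the function names and sample values are hypothetical.

```python
from math import prod


def coded_flow(x):
    """Link flow under network coding: the maximum over destinations."""
    return max(x)


def randomized_flow(x, r=1.0):
    """Average link flow under randomized forwarding, relation (18)."""
    return r - prod(r - xj for xj in x)


def gap_bound(x, r=1.0):
    """Upper bound (19) on the extra capacity needed by randomization."""
    m = max(x)
    return r - m - (r - m) ** len(x)


# Hypothetical flow variables x_ik(j) on one link, one entry per destination.
for x in ([0.5, 0.5], [0.5, 0.2], [1.0, 0.3], [0.0, 0.0]):
    gap = randomized_flow(x) - coded_flow(x)
    print(x, round(gap, 6), round(gap_bound(x), 6))
```

The gap vanishes when the largest flow variable on the link is either zero or the full source rate, which is why flow decisions that are close to all-or-nothing keep the approximation error small.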

To demonstrate the scalability of the algorithm, we performed an empirical study of the convergence time for a given set of link costs. This study, which gives a convergence time estimate, also gives further insight into the time required between link cost updates.

We evaluated the convergence time in min-cost-DIF through a simulated network using randomized but fixed link costs. Our simulated network includes one sensor and a variable number of processors and estimators where the number of processors is always one more than the number of estimators. Results of the simulation for an increasing number of nodes are shown in Figure 4. The convergence condition is satisfied when the change in the solution variables is below a certain threshold.

Fig. 4.

Convergence time and convergence iterations of the message passing optimization as a function of the number of nodes in a simulated network. The simulated network includes one sensor and an increasing number of processors and estimators where the number of processors is always one more than the number of estimators.

Convergence time depends both on the number of iterations required until convergence and the time expended in each iteration. The number of iterations to convergence is hardware-independent and the results shown in Figure 4 indicate that the number of iterations to convergence is a sub-linear function of the network size. The time required for each iteration involves computing routing updates and transmitting the updated values. The time complexity of the routing updates in (13) and (14) is polynomial in the size of the network. Transmission time, on the other hand, is typically linear in the size of the transmitted message which in turn is proportional to the size of the network.

The convergence times shown do not account for communication delay. To estimate such delay in practice, we observed from the experiments shown later in the paper (Sections 6.3 and 6.4) that the time for one iteration over wireless networks is typically less than 100 ms. This value corresponds to networks concurrently being utilized for transmission of sensor data and processed observations. Based on this estimate and the number of iterations at convergence, we can estimate typical values for convergence time for networks from five to ten nodes to be between 50 and 150 s.

This empirical analysis indicates that the running time of our algorithm is polynomial in the size of the network for typical problem instances of interest. We observed that the number of iterations required until convergence is sub-linear in the number of nodes, and the time complexity of each iteration is polynomial in the number of nodes. In practice, we found iteration times to be dominated by communication delay, resulting in the convergence time indicated above. We envisage that this period can be reduced by employing quality-of-service protocols, but we defer this to future work.

Based on the above analysis, we can now specify nominal values for the interval between link cost updates. The frequency of link cost updates is set such that the multicast routing algorithm has sufficient time to converge. In our implementation, we chose fixed values between 50 and 150 s. However, we note that while the optimization algorithm is iterating, a valid routing is available and information can continuously flow through the network. The frequency of link cost updates determines how reactive the algorithm is to changes in estimated sensor utility, and thus the importance of a higher update frequency would be to handle situations where sensor utility changes rapidly. We leave consideration of this case to future work.

4.4. Two-robot simulation

We evaluated our solution approach to min-cost-DIF in simulation for the case of two mobile robots tracking a moving target. The implementation is decentralized with nodes running in separate processes. The aim of this simulation is to demonstrate the multicast behavior of dynamic information flow and to show the improvement in information gain realized in the min-cost-DIF setting in comparison to a control condition with uniform-rate communication.

The simulation setup is shown in Figure 5. Two mobile robots are assigned to separate workspaces. The first robot has a 360° field-of-view sensor and the second robot has a 180° field-of-view sensor. The sensors are bearing-only sensors, forcing cooperation between the robots. A moving target tracked by the robots moves in a circular pattern within a square region of interest outside the robots’ workspaces.

Fig. 5.

Demonstration setting of the two-robot simulation and experiment. Robots 1 and 2 are shown with their boundaries. Sample target tracks are shown making a square pattern inside the region of interest.

The network diagram of the system is shown in Figure 6. The object detection routine on each robot imposes a processing cost. Due to the multicast property, the processing cost of each processing module is distributed among the receiving estimators. Therefore, if the estimators collectively evaluate an observation utility greater than the processing cost, then the sensor raw data should be processed and sent forward. Virtual links, not shown in the figure, allow for the no-send policy. The cost of the links incident onto the estimators is adjusted throughout the simulation by the robots’ sensor utility evaluations.

Fig. 6.

The min-cost-DIF network diagram for the two-robot simulation. Virtual links are omitted for clarity.

To validate the performance of the algorithm, a control test was also run. In the control test, the sensor rates were reduced to the same average rate used in the dynamic case.

The information in each robot’s estimate over time is shown in Figure 7 and the corresponding time averages are shown as bars in Figure 8. In Figure 7(a), the information in Robot 1’s estimate was higher on average for the dynamic flow case. In contrast to the down-sampled case, the dynamic flow case boosted inter-robot flow when required, preventing long periods of poor information gathering performance. The advantage of the dynamic flow case can be seen more clearly in Figure 8(a). The dynamic case for Robot 2 exhibits some lag in information gathering performance, but it eventually outperformed the down-sampled case as shown in Figure 7(b). This is also confirmed in Figure 8(b).

Fig. 7.

Information value (negative entropy) of the robots’ target estimate over time for the two-robot simulation. The plots are shown for two communication methods, down-sampled and dynamic, with both requiring the same amount of computation on average.

Fig. 8.

Time averages of the plots in Figure 7 shown in bar format. The improvement for the dynamic case over down-sampling is clearly observed.

The policies and estimated sensor utilities for the dynamic flow case are plotted against time in Figure 9 with the corresponding averages shown in Table 2. The plots show consistency between sensor utility and flow. In particular, the oblique parts of the flow curve correspond to time periods where a robot would receive sensor observations freely due to raw data already being processed for the other robot. This free reception of data is a feature of multicast routing.

Fig. 9.

Flow rates and sensor utility for the two-robot simulation. Flow rates are shown in solid lines while the evaluated sensor utility is shown in dashed lines.

Table 2.

Average flow rates and sensor utility for the two-robot simulation.

Link	Flow rate	Sensor utility
Robot 1’s sensor to Robot 1’s estimator	44.82	0.10
Robot 1’s sensor to Robot 2	55.03	1.64
Robot 2’s sensor to Robot 2’s estimator	26.12	0.04
Robot 2’s sensor to Robot 1	45.01	0.97

As noted in Section 4.3, discrete flow decisions minimize the error from our maximum flow approximation of the total flow in each link. As seen in Figure 9, with the exception of transient periods, the flow variables were mainly either 0 or 100.

4.5. Robot and ground station simulation

We also evaluated the case of one mobile robot and one ground station. The main purpose of this simulation is to show the generality of the dynamic information flow approach. The simulation was run using ROS and Gazebo.

The setting includes one Pioneer 2DX robot model with an on-board camera and one off-board processing station. The robot’s on-board processing is computationally expensive; however, it has wireless access to the off-board processing station. The demonstration proceeds as follows. The robot begins near the processing station. It then proceeds to gain information about a moving target.

The network diagram of the demonstration is depicted in Figure 10. Subject to communication cost, which is set proportional to distance, the robot’s decision is expected to vary between on-board and off-board processing. The maximum flow assumption is valid for this scenario since there is only one destination node.

Fig. 10.

The min-cost-DIF diagram of the robot and ground station demonstration scenario.

Figure 11 shows the flow between the robot and the ground station, the communication cost and the robot’s processing cost over time. The processing cost is assumed to be fixed over time as shown in the figure. The communication cost varies based on the distance between the robot and the ground station and the flow is adjusted accordingly. In Figure 11, at around time step 50, communication cost outweighs the cost of on-board processing. This leads to a switch from off-board processing to on-board processing. It should be noted that the communication cost displayed in the plot is only for transmission from the sensor to the processor. When compared with the on-board processing cost, the total communication cost is double the value shown. The results of this simulation show the generality of the dynamic flow formulation. Although the observed behavior is as expected, the aim is to show that this behavior was achieved within the min-cost-DIF framework without modification.
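The switching behavior reduces to a comparison between two costs, which can be sketched as follows. The cost values, the cost-per-metre constant and the assumption that off-board processing itself is free are hypothetical; in the actual system the decision emerges from the min-cost-DIF optimization rather than from an explicit rule such as this one.

```python
def prefer_onboard(distance, cost_per_metre, onboard_cost, offboard_cost=0.0):
    """Return True when on-board processing is the cheaper option."""
    one_way = cost_per_metre * distance   # communication cost as plotted
    # Total communication cost is twice the one-way value.
    return onboard_cost < 2.0 * one_way + offboard_cost


# Hypothetical costs: switch-over occurs at distance onboard / (2 * rate) = 2.5.
for d in (1.0, 2.0, 3.0, 10.0):
    print(d, prefer_onboard(d, cost_per_metre=1.0, onboard_cost=5.0))
```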

Fig. 11.

Flow, communication cost and processing cost of the link from the sensor to the ground station processor for the one robot/one ground station simulation.

4.6. Robot and ground station experiment

We also performed an experiment in hardware for the one robot/one ground station case. We used a Pioneer 3DX robot equipped with an on-board computer with a 1.6 GHz Intel Atom N270 processor. The robot is equipped with a SICK LMS291 2D lidar used exclusively for localization. The robot uses an on-board webcam as a 2D bearing-only sensor to track a moving target. With the exception of the off-board processing at the ground station, all processes were executed on-board the robot, including localization, image processing (when required), estimation, decision-making and the information flow control algorithm.

The change in flow, communication cost and processing cost over time is shown in Figure 12. These results are consistent with the simulation results for the analogous case. We observe that the robot initially chooses to send images to be processed off-board by the ground station. The ground station performs object detection and sends point observations back to the robot. As the robot moved away to track the target, the communication cost increased and thus the robot chose to perform processing on-board. A snapshot of the moment when the robot decided to switch to on-board processing is shown in Figure 13.

Fig. 12.

Experimental results for the one robot/one ground station case.

Fig. 13.

Snapshot from the one robot/one ground station experiment. The image shows the robot having moved away to follow the target. At this distance, the robot prefers to perform processing on-board.

4.7. Monte Carlo simulation

We performed a Monte Carlo simulation for the experimental setting of Section 4.4 comparing the down-sampled and the dynamic communication cases. The aim of the simulation is to analyze the statistical significance of the performance advantage introduced by our solution to min-cost-DIF.

Each method was tested in 20 randomly initialized trials running for 1 min each. The down-sampling rate was chosen to match the average rate resulting from the dynamic case.

The results of the Monte Carlo simulation are shown in Figure 14 in box-plot format. Each trial is treated as one sample. Dynamic communication clearly outperforms down-sampling. A Welch’s t-test for statistical significance resulted in a p-value of less than 0.001.

Fig. 14.

Monte Carlo simulation results comparing down-sampling with dynamic information flow. Each method was tested on the two-robot scenario for 20 trials running for 1 min each. In the results depicted, a trial acts as one sample. The box extents represent the first and third quartiles, while the whiskers represent the extrema. The median is represented by the horizontal line inside the box.
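For reference, the test statistic used in a Welch's t-test can be computed with the Python standard library as sketched below. The two samples are placeholder values chosen only to exercise the code; they are not the trial results from Figure 14. The p-value additionally requires the cumulative distribution function of Student's t, which a statistics package can supply (for example, `scipy.stats.ttest_ind` with `equal_var=False`).

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)   # squared standard error of sample a
    vb = variance(b) / len(b)   # squared standard error of sample b
    t_stat = (mean(a) - mean(b)) / sqrt(va + vb)
    dof = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t_stat, dof


# Placeholder per-trial scores (hypothetical, not experimental data).
down_sampled = [1.0, 2.0, 3.0, 4.0, 5.0]
dynamic = [2.0, 4.0, 6.0, 8.0, 10.0]
t_stat, dof = welch_t(down_sampled, dynamic)
print(t_stat, dof)
```

Unlike Student's t-test, this form does not assume that the two methods have equal variance across trials.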

5. Distributed optimization for threshold-DIF

In this section we introduce a distributed optimization method that will form the basis of our solution to threshold-DIF. The method is a distributed version of ADMM which we call DADMM. The DADMM method solves distributed non-smooth constrained convex optimization problems with a DAG structure. Threshold-DIF is a distributed optimization problem since its objective is a sum of local objectives and it only has neighbor-to-neighbor constraints. It is non-smooth due to the linear objective, and it has the structure of a DAG by definition.

We begin with a brief introduction to ADMM. Then, DADMM is presented along with complexity analysis. We then present the mapping from threshold-DIF to DADMM later in Section 6.

5.1. ADMM

For convenience, we provide a brief summary of ADMM. A detailed description can be found in Boyd et al. (2011).

ADMM solves optimization problems of the form given by (20). The objective is assumed to be a sum of two proper convex functions f_1 and f_2, where the first is a function of the vector z_1 and the second is a function of the vector z_2. We refer to z_1 as the primary vector variable and to z_2 as the secondary vector variable. The augmented Lagrangian of the problem, with multiplier vector y and penalty parameter ρ > 0, is shown in (21):

\begin{aligned} & \text{minimize} && f_{1} (z_{1}) + f_{2} (z_{2}) \\ & \text{subject to} && A_{1} z_{1} + A_{2} z_{2} = b \end{aligned}

(20)

\begin{aligned} L (z_{1}, z_{2}, y) = {} & f_{1} (z_{1}) + f_{2} (z_{2}) + y^{T} (A_{1} z_{1} + A_{2} z_{2} - b) \\ & + (ρ / 2) ‖ A_{1} z_{1} + A_{2} z_{2} - b ‖_{2}^{2} \end{aligned}

(21)

ADMM is summarized in (22). Each iteration involves three updates. The primary update minimizes the Lagrangian (21) with respect to z_1, the secondary update minimizes it with respect to z_2 and the third updates the Lagrangian multiplier y. Convergence is shown in Boyd et al. (2011):

\begin{aligned} z_{1}^{k + 1} & := \operatorname*{argmin}_{z_{1}} L (z_{1}, z_{2}^{k}, y^{k}) \\ z_{2}^{k + 1} & := \operatorname*{argmin}_{z_{2}} L (z_{1}^{k + 1}, z_{2}, y^{k}) \\ y^{k + 1} & := y^{k} + ρ (A_{1} z_{1}^{k + 1} + A_{2} z_{2}^{k + 1} - b) \end{aligned}

(22)
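As a concrete instance of (22), consider the scalar problem with f_1 (z_1) = (z_1 − a)² / 2, f_2 (z_2) = λ|z_2|, A_1 = 1, A_2 = −1 and b = 0, for which both minimizations have closed forms. This toy problem is chosen here purely for illustration and is unrelated to threshold-DIF; the parameter values are arbitrary. A minimal Python sketch of the three updates is:

```python
def soft_threshold(v, kappa):
    """Minimizer of kappa * |z| + (z - v)^2 / 2 over z."""
    if v > kappa:
        return v - kappa
    if v < -kappa:
        return v + kappa
    return 0.0


def admm_scalar(a, lam, rho=1.0, iterations=100):
    """Run the updates (22) on: minimize (z1 - a)^2 / 2 + lam * |z2|, z1 = z2."""
    z1 = z2 = y = 0.0
    for _ in range(iterations):
        # Primary update: set the derivative of L with respect to z1 to zero.
        z1 = (a - y + rho * z2) / (1.0 + rho)
        # Secondary update: proximal step on the absolute value term.
        z2 = soft_threshold(z1 + y / rho, lam / rho)
        # Multiplier update with step size rho on the constraint residual.
        y = y + rho * (z1 - z2)
    return z1, z2, y


z1, z2, y = admm_scalar(a=3.0, lam=1.0)
print(z1, z2, y)
```

The non-smooth term is handled entirely within the secondary update, which is the property that makes ADMM suitable for non-smooth objectives.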

5.2. Problem formulation

The general form of optimization problems that can be solved by DADMM is described as follows. Consider the DAG G = {V, E} defined in Section 3. Attach to each node i ∈ V a vector variable x_i and a proper convex function f_i (x_i ), which is not necessarily smooth. Node i can have constraints with its parents as per (24), where g_i is an affine function. The notation x_U, where U = {i_1, …, i_n} ⊂ V, is defined as the concatenation of all vectors x_i such that i ∈ U, that is, $x_{U} = (x_{i_{1}}, \dots, x_{i_{n}})$. Function g_i is interpreted as a vector-valued function whose dimension $n_{g}^{i}$ is the number of constraints. The goal of DADMM is to solve the optimization problem (23)–(24):

minimize \sum_{i \in V} f_{i} (x_{i})

(23)

subject to g_{i} (x_{i}, x_{P (i)}) = 0, \forall i \in V

(24)

5.3. Inequality constraints

The standard form of ADMM does not include inequality constraints. Therefore, we have only included equality constraints g_i in the problem definition. This is a non-restrictive assumption since by adding extra variables, inequality constraints can be transformed into equality constraints as we will show. Suppose that instead of g_i we have a function ${\bar{g}}_{i}$ that is required to satisfy the inequality constraint (25) instead:

{\bar{g}}_{i} \leq 0

(25)

By adding a slack variable p_i , this inequality constraint becomes an equality constraint plus a non-negative constraint on p_i as shown in (26). The slack variable p_i can be viewed as a variable belonging to a virtual parent node whose objective is an indicator function that is zero when p_i is non-negative and infinity otherwise:

\begin{matrix} g_{i} = {\bar{g}}_{i} + p_{i} = 0 \\ p_{i} \geq 0 \end{matrix}

(26)

In DADMM, p_i can be optimized independently. The solution of the optimization over p_i is given by (27):

p_{i} := \max \{ - {\bar{g}}_{i}, 0 \}

(27)

5.4. DADMM

DADMM consists of a preliminary decentralization step followed by the main optimization process. The decentralization step is only performed once during which the optimization problem is modified, through the addition of variables and constraints, such that it only requires neighbor-to-neighbor communication. In the optimization process, message passing and optimization updates run in a sequence that enforces decentralization while retaining equivalence to centralized ADMM.

The decentralization step modifies constraints (24). For every vector x_i where i ∈ V, a mirror vector ${\bar{x}}_{i}$ is introduced. The vector ${\bar{x}}_{i}$ acts as an interface for all other nodes. Any child node k that has a constraint including x_i replaces x_i with a local copy ${\tilde{x}}_{i}^{k}$ and an equality constraint between ${\tilde{x}}_{i}^{k}$ and ${\bar{x}}_{i}$ is added. Symmetrically, from node i’s perspective $x_{P (i)}$ is replaced with ${\tilde{x}}_{P (i)}$ . Therefore, from node i’s perspective, (24) is replaced with (28) to (30):

g_{i} (x_{i}, {\tilde{x}}_{P (i)}^{i}) = 0

(28)

x_{i} - {\bar{x}}_{i} = 0

(29)

{\tilde{x}}_{P (i)}^{i} - {\bar{x}}_{P (i)} = 0

(30)

The new constraints are assigned the following Lagrangian multiplier vectors. The Lagrangian multiplier λ_i is associated with constraint (28), μ_i is associated with constraint (29) and $η_{P (i)}^{i}$ is associated with constraint (30).

All the above constraints and variables except ${\bar{x}}_{P (i)}$ are attached to node i. This means that ${\bar{x}}_{P (i)}$ is node i’s only dependency on its parent nodes and ${\bar{x}}_{i}$ is the interface variable that is shared with node i’s children. We note that constraint (28) is now an internal constraint. This decentralization has decoupled the parents of node i. The decoupling is evident by noting that (30) is a decoupled set of equality constraints: ${\tilde{x}}_{l}^{i} = {\bar{x}}_{l}, \forall l \in P (i)$ . The decentralization step is shown schematically in Figure 15.

Fig. 15.

Decentralization step of DADMM. In (a), node i is shown with its parents and children in a DAG. It is assumed that node i has constraints that include all of its parents. By transforming the network into that of (b), each of node i’s parents only needs to communicate with node i given that they are not coupled elsewhere.

From the ADMM perspective, the sets of vector variables x_i and ${\tilde{x}}_{P (i)}^{i}$ are mapped to z ₁ in (20) and the sets of variables ${\bar{x}}_{i}$ are mapped to z ₂. The sets of constraints (28) to (30) are collectively mapped to the equality constraint in (20). Define ${\hat{x}}_{i} = (x_{i}, {\tilde{x}}_{P (i)}^{i})$ . Based on the mapping to ADMM, ${\hat{x}}_{i}$ is the primary vector variable while ${\bar{x}}_{i}$ is the secondary vector variable.

The main optimization process consists of message passing and optimization updates defined in a sequential and decentralized manner that is equivalent to the centralized version (22). The process begins with the head nodes and then proceeds to traverse the graph according to Algorithm 1. Algorithm 1 refers to two types of updates and two types of messages, upward and downward. We will now proceed to define what takes place during each update and what each message contains.

At the outset, each node i ∈ V is initialized with $^{1} x_{i}$, $^{1} {\bar{x}}_{i}$, $^{1} {\tilde{x}}_{P (i)}^{i}$, $^{0} λ_{i}$, $^{0} μ_{i}$ and $^{0} η_{P (i)}^{i}$. In the tth downward update, node i updates its Lagrangian multipliers to obtain $^{t} λ_{i}$, $^{t} μ_{i}$ and $^{t} η_{P (i)}^{i}$. It then updates the primary variables to obtain $^{t + 1} x_{i}$ and $^{t + 1} {\tilde{x}}_{P (i)}^{i}$. The node’s downward message contains $^{t} {\bar{x}}_{i}$, which is required by its child nodes to update their primary variables. In the tth upward update, node i updates its secondary variables to obtain $^{t + 1} {\bar{x}}_{i}$. It then sends its upward message containing variables $^{t + 1} {\tilde{x}}_{P (i)}^{i}$ and Lagrangian multipliers $^{t} η_{P (i)}^{i}$.

The decentralized nature of the process is evident from the definition of the updates at the node level. We need to show that the process is in fact equivalent to performing the centralized version of ADMM on the entire system. A proof of this equivalence is provided in Theorem 2 following Lemma 2.

Lemma 2. In Algorithm 1, every downward update t of node i is followed by an upward update t. Moreover, every upward update t of node i is followed by a downward update t + 1.

Proof. From Lemma 1, we know that if node i has just performed the tth downward update then it has performed t − 1 upward updates. Now, suppose that the next update is the (t + 1)th downward update. This is impossible because, according to Lemma 1, node i would then already have performed t upward updates. The second part of the statement can be proved through a similar argument.

Theorem 2. For each node i, after the t-th update of the Lagrangian multipliers in the downward update, the variables owned by the node are equal to the ADMM update t of those variables.

Proof. The proof is by induction. Before the start of the algorithm, the variables of node i are set to $^{1} x_{i}$, $^{1} {\bar{x}}_{i}$, $^{1} {\tilde{x}}_{P (i)}^{i}$, $^{0} λ_{i}$, $^{0} μ_{i}$ and $^{0} η_{P (i)}^{i}$. During the first downward update, node i’s Lagrangian variables are updated according to (31). The node would have received $^{1} {\bar{x}}_{P (i)}$ from its parents’ downward messages. At this stage, all variables belong to the ADMM update at t′ = 1:

\begin{matrix} ^{1} λ_{i} & : =^{0} λ_{i} + g (^{1} x_{i},^{1} {\tilde{x}}_{P (i)}^{i}) \\ ^{1} μ_{i} & : =^{0} μ_{i} + (^{1} x_{i} -^{1} {\bar{x}}_{i}) \\ ^{1} η_{P (i)}^{i} & : =^{0} η_{P (i)}^{i} + (^{1} {\tilde{x}}_{P (i)}^{i} -^{1} {\bar{x}}_{P (i)}) \end{matrix}

(31)

Assume that after node i’s tth Lagrangian update, all variables belong to the ADMM update at t′ = t. We now prove the statement for t′ = t + 1. After the tth Lagrangian variable update, node i directly updates its primary variables according to (32):

\begin{matrix} (^{t + 1} x_{i},^{t + 1} {\tilde{x}}_{P (i)}^{i}) : = \\ \underset{x_{i}, {\tilde{x}}_{P (i)}^{i}}{argmin} L_{i}^{1} (x_{i}, {\tilde{x}}_{P (i)}^{i},^{t} {\bar{x}}_{i},^{t} {\bar{x}}_{P (i)},^{t} λ_{i},^{t} μ_{i},^{t} η_{P (i)}^{i}) \end{matrix}

(32)

According to Lemma 2, the downward update is followed by an upward update. In the upward update, node i would have received $^{t + 1} {\tilde{x}}_{i}^{C (i)}$ from its children as well as the corresponding Lagrangian variables $^{t} η_{i}^{C (i)}$ . After receiving these variables, node i updates the secondary variables according to (33):

^{t + 1} {\bar{x}}_{i} : = \underset{{\bar{x}}_{i}}{argmin} L_{i}^{2} (^{t + 1} x_{i},^{t + 1} {\tilde{x}}_{i}^{C (i)}, {\bar{x}}_{i},^{t} μ_{i},^{t} η_{i}^{C (i)})

(33)

The minimization over ${\bar{x}}_{i}$ can be written explicitly as shown in (34):

\begin{matrix} ^{t + 1} {\bar{x}}_{i} : = \\ \frac{1}{| C (i) | + 1} [^{t + 1} x_{i} +^{t} μ_{i} + \sum_{k \in C (i)} [^{t + 1} {\tilde{x}}_{i}^{k} +^{t} η_{i}^{k}]] \end{matrix}

(34)

Finally, Lemma 2 guarantees that the upward update is followed by downward update t + 1, during which the Lagrangian variables are updated in a manner analogous to (31) to obtain $^{t + 1} λ_{i}$, $^{t + 1} μ_{i}$ and $^{t + 1} η_{P (i)}^{i}$. Hence, the (t + 1)th ADMM update is complete.
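The multiplier step (31) and the explicit averaging step (34) can be illustrated with a short sketch. This is only an illustration under stated assumptions: vectors are plain Python lists, the parents' copies ${\tilde{x}}_{P (i)}^{i}$ and secondary variables ${\bar{x}}_{P (i)}$ are assumed to be concatenated into single lists, and the function names and the constraint callable `g` are hypothetical rather than taken from the authors' implementation.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def add(a: Sequence[float], b: Sequence[float]) -> Vector:
    """Element-wise sum of two equal-length vectors."""
    return [u + v for u, v in zip(a, b)]


def sub(a: Sequence[float], b: Sequence[float]) -> Vector:
    """Element-wise difference of two equal-length vectors."""
    return [u - v for u, v in zip(a, b)]


def multiplier_update(
    lam: Vector, mu: Vector, eta: Vector,
    x: Vector, x_bar: Vector,
    x_tilde_parents: Vector, x_bar_parents: Vector,
    g: Callable[[Vector, Vector], Vector],
) -> Tuple[Vector, Vector, Vector]:
    """Multiplier step of a downward update, following the pattern of (31)."""
    lam_new = add(lam, g(x, x_tilde_parents))                # constraint residual
    mu_new = add(mu, sub(x, x_bar))                          # own copy vs. secondary
    eta_new = add(eta, sub(x_tilde_parents, x_bar_parents))  # parent copy vs. parents' secondary
    return lam_new, mu_new, eta_new


def secondary_update(
    x: Vector, mu: Vector,
    child_copies: List[Vector], child_etas: List[Vector],
) -> Vector:
    """Averaging step (34): one term for the node itself and one per child."""
    total = add(x, mu)
    for x_tilde_k, eta_k in zip(child_copies, child_etas):
        total = add(total, add(x_tilde_k, eta_k))
    return [v / (len(child_copies) + 1) for v in total]
```

With zero multipliers, `secondary_update` reduces to the plain average of the node's own variables and its children's copies of them, which is the consensus behavior the upward update is meant to enforce.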

5.5. Analysis

In this section, we analyze the computational complexity of one iteration of DADMM. Analysis of the full problem depends on its convergence rate, which we consider for the special case of threshold-DIF in Section 6.

First, we define the following terms:

\begin{matrix} n_{g}^{i} & : number of constraints for node i \\ n_{g}^{max} & : = max_{i \in V} n_{g}^{i} \\ n_{i} & : = | x_{i} | = | {\bar{x}}_{i} | \\ n'_{i} & : = | {\tilde{x}}_{P (i)}^{i} | \\ n_{max} & : = max_{i \in V} n_{i} \end{matrix}

(35)

The worst-case complexity of one iteration of DADMM is given by Theorem 3.

Theorem 3. Each iteration of DADMM over a DAG G = {V, E} runs in $O (κ (G) {(n_{max} | V |)}^{2} (n_{g}^{max} + n_{max} | V |))$ time.

Proof. Each ADMM iteration consists of three updates: the primary update, secondary update and the Lagrangian update. Since the algorithm is decentralized and synchronous, its running time is dominated by the time to update a single node multiplied by the depth of G.

We now develop an upper bound on the computation performed by an arbitrary node i. The primary update involves solving $\nabla_{{\hat{x}}_{i}} L = 0$. Because the Lagrangian is quadratic, this equation is linear and has the form $A {\hat{x}}_{i} = d$, where $A \in ℝ^{(n_{i} + n'_{i}) \times (n_{i} + n'_{i})}$ and $d \in ℝ^{n_{i} + n'_{i}}$, since the primary vector ${\hat{x}}_{i}$ has dimension $n_{i} + n'_{i}$. Each element in A potentially contains a term from the objective and a term from each constraint in which the corresponding element in ${\hat{x}}_{i}$ is involved. Each element of ${\hat{x}}_{i}$ is involved in an equality constraint (with the secondary variables) and may appear in the constraints of $g_{i}$. Therefore, the elements of A can be computed in $O ({(n_{i} + n'_{i})}^{2} (2 + n_{g}^{i})) = O ({(n_{i} + n'_{i})}^{2} n_{g}^{i})$ time. Each element in d can include a secondary variable and a Lagrangian multiplier from each constraint. The time complexity of computing the elements of d is $O ((n_{i} + n'_{i}) (1 + n_{g}^{i}))$. Solving for ${\hat{x}}_{i}$, assuming a dense matrix A, takes $O ({(n_{i} + n'_{i})}^{3})$ time. Therefore, the time complexity of the primary update is $O ({(n_{i} + n'_{i})}^{2} (n_{i} + n'_{i} + n_{g}^{i}))$.

The secondary update computes an average over x_i and ${\tilde{x}}_{i}^{C (i)}$ to obtain ${\bar{x}}_{i}$ . Thus, its time complexity is $O (n_{i} (1 + | C (i) |)) = O (n_{i} | C (i) |)$ .

The Lagrangian update involves an evaluation of all constraints given by equations (28), (29) and (30). The evaluation of each element of g_i involves at most n_i + n′_i variables. The evaluation of the other constraints involves two variables each. Therefore, the time complexity of the Lagrangian update is $O (n_{g}^{i} (n_{i} + n'_{i}) + n_{i} + n'_{i}) = O (n_{g}^{i} (n_{i} + n'_{i}))$ .

The total time complexity is the sum of these three updates. Thus, we have $O ({(n_{i} + n'_{i})}^{2} (n_{i} + n'_{i} + n_{g}^{i})) + O (n_{i} | C (i) |) + O (n_{g}^{i} (n_{i} + n'_{i})) = O ({(n_{i} + n'_{i})}^{2} (n_{i} + n'_{i} + n_{g}^{i}) + n_{i} | C (i) |)$ .

We now restate the bound as a function of $n_{max}$, $n_{g}^{max}$ and |V|. Since $n_{max}$ is an upper bound for $n_{i}$, $n_{max} | P (i) |$ is an upper bound on $n'_{i}$. An upper bound for the number of children or the number of parents is simply the number of nodes |V|. Hence, a more conservative upper bound for $n'_{i}$ is $n_{max} | V |$. Therefore, since $n_{g}^{i} \leq n_{g}^{max}$, the overall time complexity for the work performed by node i can be rewritten as $O ({(n_{max} | V |)}^{2} (n_{g}^{max} + n_{max} | V |))$. The total time complexity for one iteration of the algorithm is thus $O (κ (G) {(n_{max} | V |)}^{2} (n_{g}^{max} + n_{max} | V |))$.
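To make the counting argument concrete, the following sketch evaluates the leading-order terms used in the proof. It is not an operation count of any particular implementation: constants are dropped exactly as in the O-notation, and the function and parameter names are hypothetical.

```python
def node_update_cost(n_i: int, n_i_prime: int, n_g_i: int, num_children: int) -> int:
    """Sum of the three per-node terms from the proof, with constants dropped."""
    n_hat = n_i + n_i_prime                    # dimension of the primary vector
    primary = n_hat ** 2 * (n_hat + n_g_i)     # build A and d, then a dense solve
    secondary = n_i * (1 + num_children)       # averaging step
    lagrangian = n_g_i * n_hat + n_hat         # evaluate g_i and the copy constraints
    return primary + secondary + lagrangian


def iteration_bound(depth: int, n_max: int, n_g_max: int, num_nodes: int) -> int:
    """Expression inside the O(.) of Theorem 3, with depth standing in for kappa(G)."""
    n_prime = n_max * num_nodes                # conservative bound on n'_i
    return depth * n_prime ** 2 * (n_g_max + n_prime)
```

For example, `iteration_bound(2, 3, 4, 5)` evaluates the expression for a depth-2 graph with five nodes, at most three variables per node and at most four constraints per node. The cubic growth in $n_{max} | V |$ comes from the dense linear solve in the primary update.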

6. Threshold-DIF

In this section, we show how DADMM can be applied to the threshold-DIF problem. The mapping from threshold-DIF to the general problem formulation of DADMM is presented in detail. A detailed complexity analysis of DADMM in terms of the threshold-DIF problem size is provided. We present several experimental results in simulation, including a 15-node example and a Monte Carlo simulation, and results from an experimental system with two outdoor ground robots and a stationary camera.

6.1. Threshold-DIF using DADMM

The threshold-DIF problem can be solved using DADMM. However, we first must reformulate constraints (4) and (7) such that they are compatible with the DADMM framework.

The maximum function in (4) is replaced by the set of inequalities (39). The maximum function is non-smooth and cannot be optimized in one step. With the set of inequalities in (39), the objective and all equality and inequality constraints of the optimization become linear as required by DADMM.

The set of inequalities in (39) is equivalent to the maximum relation in (4) as long as one of the constraints is active. One constraint will always be active for $h_{ik}^{m}$ if the link cost $c_{ik}^{m}$ is positive. From the problem formulation, we know that the link cost $c_{ij}^{m}$ can only be negative if j ∈ V_e , that is, if the link is incident to an estimator j. For these links, we replace the set of inequalities in (39) with the set of equalities given in (36):

\begin{matrix} x_{ik}^{m} (j) = 0, & if k \neq j \\ x_{ik}^{m} (j) = h_{ik}^{m}, & if k = j \end{matrix}

(36)

The inter-link constraint (7) is replaced by (42), where $S_{is}$ is the set of links emanating from node i involved in the inter-link constraint s. The DADMM format only permits constraints between a node and its parents. Therefore, constraint (42) only applies between links from node i and links from node i’s parents. This condition is not restrictive, since an extra link can be added between non-neighboring nodes with inter-link constraints while retaining the directed acyclic property of the graph. The graph remains acyclic provided that the extra link preserves any existing ordering between the two nodes it connects. Since the network is a connected DAG, for any two nodes i and k, either k is a successor of i ($k \in \bar{C} (i)$), or k is an ancestor of i ($k \in \bar{P} (i)$), or neither holds. If $k \in \bar{C} (i)$, the link should extend from i to k. If $k \in \bar{P} (i)$, the link should extend from k to i. If there is no directed path between the nodes, then either direction retains the acyclic property.
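The rule for orienting an extra link can be sketched with a simple reachability test. The graph representation (a dictionary from each node to its list of children) and the function names are assumptions made for illustration. The sketch also assumes that i and k are distinct nodes.

```python
from typing import Dict, Hashable, List, Tuple

Graph = Dict[Hashable, List[Hashable]]  # node -> list of children


def reachable(graph: Graph, start: Hashable, goal: Hashable) -> bool:
    """Return True if a directed path leads from start to goal."""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, []))
    return False


def extra_link_direction(graph: Graph, i: Hashable, k: Hashable) -> Tuple[Hashable, Hashable]:
    """Orient a new link between i and k so that the graph stays acyclic."""
    if reachable(graph, i, k):   # k is a successor of i
        return (i, k)
    if reachable(graph, k, i):   # k is an ancestor of i
        return (k, i)
    return (i, k)                # no directed path: either direction is acyclic


# Hypothetical four-node DAG: a -> b -> c and d -> c.
example = {"a": ["b"], "b": ["c"], "c": [], "d": ["c"]}
link = extra_link_direction(example, "c", "a")
```

In the example, `a` is an ancestor of `c`, so the link is oriented from `a` to `c` regardless of the order in which the two nodes are passed.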

After transforming the constraints, we obtain the problem (37)–(42) shown below:

minimize \sum_{(i, k) \in E} c_{ik}^{m} h_{ik}^{m}

(37)

subject to x_{ik}^{m} (j) \geq 0

(38)

x_{ik}^{m} (j) \leq h_{ik}^{m}

(39)

\sum_{l \in P (i)} x_{li}^{m} (j) - \sum_{k \in C (i)} x_{ik}^{m} (j) + r_{i}^{m} (j) = 0

(40)

\sum_{m} ν_{ik}^{m} h_{ik}^{m} \leq C_{ik}

(41)

\begin{matrix} \sum_{k \in S_{is}} \sum_{m} ν_{ik}^{m} h_{ik}^{m} + \\ \sum_{l \in P (i)} \sum_{k' \in S_{ls}} \sum_{m} ν_{lk'}^{m} h_{lk'}^{m} \leq K_{s} \end{matrix}

(42)

To solve threshold-DIF, all that is required at this stage is that the threshold-DIF variables and constraints be mapped to the variables of the distributed optimization problem (23)–(24). This can be done as follows. The sets of variables $\{h_{ik}^{m} : m \in V_{s}, k \in C (i)\}$ and $\{x_{ik}^{m} (j) : m \in V_{s}, j \in V_{e}, k \in C (i)\}$ are mapped to $x_{i}$. The objective function $f_{i}$ in (23) is represented by $\sum_{k \in C (i)} \sum_{m} c_{ik}^{m} h_{ik}^{m}$ and the indicator functions resulting from the inequality constraints. The constraint $g_{i}$ in (24) is represented by the equality constraints and the equality versions of the inequality constraints involving node i in (37) to (42).
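As an aid to reading the reformulated problem, the sketch below checks a candidate assignment against constraints (38) to (41). It does not solve the optimization and it omits the inter-link constraint (42). The dictionary-based data layout, the function name and the example values are all hypothetical, and the sign convention for $r_{i}^{m} (j)$ is read directly off (40).

```python
from collections import defaultdict


def check_threshold_dif(h, x, nu, capacity, r, tol=1e-9):
    """Return labels of violated constraints; an empty list means feasible.

    h[(i, k, m)]     link rate of source m on link (i, k)
    x[(i, k, m, j)]  flow of source m destined for estimator j on link (i, k)
    nu[(i, k, m)]    weight of source m on link (i, k)
    capacity[(i, k)] capacity of link (i, k)
    r[(i, m, j)]     injection term of (40) at node i
    """
    violations = []
    for (i, k, m, j), flow in x.items():
        if flow < -tol:                       # (38): flows are non-negative
            violations.append(("38", i, k, m, j))
        if flow > h[(i, k, m)] + tol:         # (39): flow bounded by link rate
            violations.append(("39", i, k, m, j))

    # (40): inflow - outflow + r = 0 at every node, per source and estimator.
    balance = defaultdict(float)
    for key, injected in r.items():
        balance[key] += injected
    for (i, k, m, j), flow in x.items():
        balance[(k, m, j)] += flow            # link (i, k) enters node k
        balance[(i, m, j)] -= flow            # and leaves node i
    for key, value in balance.items():
        if abs(value) > tol:
            violations.append(("40",) + key)

    # (41): weighted sum of link rates within capacity.
    load = defaultdict(float)
    for (i, k, m), rate in h.items():
        load[(i, k)] += nu[(i, k, m)] * rate
    for link, total in load.items():
        if total > capacity[link] + tol:
            violations.append(("41",) + link)
    return violations


# Hypothetical single link 1 -> 2 carrying source "s" to estimator 2.
h = {(1, 2, "s"): 3.0}
x = {(1, 2, "s", 2): 3.0}
nu = {(1, 2, "s"): 1.0}
r = {(1, "s", 2): 3.0, (2, "s", 2): -3.0}
feasible = check_threshold_dif(h, x, nu, {(1, 2): 5.0}, r)
congested = check_threshold_dif(h, x, nu, {(1, 2): 2.0}, r)
```

The first call reports no violations, while the second reports a violation of (41) because the link capacity has been reduced below the assigned rate.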

DADMM runs continuously throughout system operation. As the system configuration changes, link costs and weights are updated with new values. To ensure convergence, the interval between updates is set to an adequate time period. In practice, this interval was found to be of similar length to that of min-cost-DIF in Section 4.3.

6.2. Analysis

In this section, we provide time complexity analysis of DADMM when applied to threshold-DIF. The complexity is expressed in terms of the size of the threshold-DIF input parameters.

Complexity analysis is provided for the entire optimization process including the number of iterations required for convergence. We first determine the complexity of one iteration following directly from Theorem 3. We then find a bound on the number of iterations based on the algorithm’s convergence rate.

The complexity of a DADMM iteration was determined in Section 5.5. The complexity of one iteration in terms of threshold-DIF problem specification can be determined by substituting the appropriate values for $n_{max}$ and $n_{g}^{max}$. To begin, we denote the number of sources, destinations and inter-link constraints in the network as follows:

N_m : number of sources in the network;

N_j : number of destinations in the network;

N_s : number of inter-link constraints.

The maximum number of primary variables $n_{max}$ is proportional to the maximum number of routing variables which, in turn, is proportional to the number of sources multiplied by the number of destinations multiplied by the number of children. The number of children is bounded from above by the number of nodes. Therefore, $n_{max}$ is bounded such that $n_{max} \leq N_{m} N_{j} | V |$.

The maximum number of constraints $n_{g}^{max}$ is bounded by the maximum number of flow consistency constraints (40) and the maximum number of inter-link constraints (42). The number of consistency constraints is proportional to the number of sources multiplied by the number of destinations. Therefore, $n_{g}^{max}$ is bounded such that $n_{g}^{max} \leq N_{m} N_{j} + N_{s}$ .

Substituting the obtained bounds into the result of Theorem 3, the complexity of one iteration of DADMM for threshold-DIF becomes $O (κ (G) {(| V |^{2} N_{m} N_{j})}^{2} (| V |^{2} N_{m} N_{j} + N_{s}))$. In threshold-DIF, the depth of the underlying graph is a function of processor cascading, which is independent of the number of robots. Therefore, the depth is assumed to be constant. Hence, the time complexity of one iteration can be restated as $O ({(| V |^{2} N_{m} N_{j})}^{2} (| V |^{2} N_{m} N_{j} + N_{s}))$.

To determine the complexity of the whole optimization process, the convergence rate is required. A convergence rate in an ergodic sense is established in He and Yuan (2012) with relatively mild assumptions.

The result is restated here after establishing the appropriate notation. Define the primal vector of the kth iteration as $z^{k} = (z_{1}^{k}, z_{2}^{k})$ where $z_{1}^{k}$ and $z_{2}^{k}$ are the ADMM primary and secondary vectors defined in Section 5.1. Define the ergodic average ${\tilde{z}}^{k} = \frac{1}{k + 1} \sum_{k' = 1}^{k + 1} z^{k'}$ and define $z^{*}$ and $y^{*}$ as the optimal primal and dual vectors. Then, if we assume that $z^{0} = 0$ and $y^{0} = 0$, the convergence result is given by (43). The positive constants α and β are independent of the dimension and value of both the primal and dual variables:

L ({\tilde{z}}^{k}, y^{*}) - L (z^{*}, y^{*}) \leq \frac{α ‖ z^{*} ‖^{2} + β ‖ y^{*} ‖^{2}}{(k + 1)}

(43)

From (43), it is clear that in order to obtain a bound on the convergence rate, we need to find an upper bound on the squared norm of the primal and dual optimal vectors. The absolute values of the elements of the primal vector have an upper bound $u_{z}$, which follows from the problem definition in Section 6.1. Hence, with $n_{z}$ denoting the dimension of the primal vector, an upper bound on its squared norm is given by (44):

‖ z^{*} ‖^{2} \leq n_{z} u_{z}^{2}

(44)
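The quantities in (43) and (44) can be sketched in a few lines. The sketch takes the ergodic average to be the arithmetic mean of the stored iterates, which is an assumption about the normalization, and it treats α and β as given inputs since their values are not specified here. All function names are hypothetical.

```python
from typing import List, Sequence


def ergodic_average(iterates: Sequence[Sequence[float]]) -> List[float]:
    """Arithmetic mean of the iterates z^1, ..., z^{k+1}."""
    count = len(iterates)
    dim = len(iterates[0])
    return [sum(z[d] for z in iterates) / count for d in range(dim)]


def ergodic_gap_bound(alpha: float, beta: float,
                      z_star_sq: float, y_star_sq: float, k: int) -> float:
    """Right-hand side of (43) for given squared norms of z* and y*."""
    return (alpha * z_star_sq + beta * y_star_sq) / (k + 1)


def primal_norm_bound(n_z: int, u_z: float) -> float:
    """Right-hand side of (44)."""
    return n_z * u_z ** 2
```

Because the bound decays as 1/(k + 1), halving the target error roughly doubles the number of iterations required.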

We now seek a bound for the squared norm of the dual vector $‖ y^{*} ‖^{2}$. To simplify the analysis, we note that the centralized ADMM version of the threshold-DIF problem has the form of the optimization problem defined in (45), where the indicator function $I_{\geq 0}$ is defined in (46) below. The set $I$ contains the indices of the variables added to convert any inequality constraint into an equality constraint as described in Section 5.2. These variables need to satisfy the inequality constraint $z_{(i)} \geq 0$ and they only appear in one row of the set of equality constraints $Az = b$:

\begin{matrix} minimize & c^{T} z + \frac{ρ}{2} ‖ Az - b ‖^{2} + \sum_{i \in I} I_{\geq 0} (z_{(i)}) \\ subject to & Az = b \\ A \in ℝ^{n_{g} \times n_{z}}, b \in ℝ^{n_{g}} \end{matrix}

(45)

I_{\geq 0} (z_{(i)}) = \begin{cases} 0 & \text{if } z_{(i)} \geq 0 \\ \infty & \text{otherwise} \end{cases}

(46)

An upper bound on ‖y*‖² can be obtained from the following lemma.

Lemma 3. Assume that the equality constraints and the inequality constraints $z_{(i)} \geq 0$ active at $z^{*}$ are all linearly independent. Then, there exists a positive constant γ independent of z, $n_{z}$ and $n_{g}$ such that $‖ y^{*} ‖^{2} \leq γ ‖ c ‖^{2}$.

Proof. At optimality, we have $Az - b = 0$ and zero belongs to the subdifferential of the Lagrangian as shown in (47). The ith element of the vector $b_{I} \in ℝ^{n_{z}}$ is defined in (48), where $\partial I_{\geq 0}$ is the subdifferential of the non-smooth indicator function:

0 \in c + A^{T} y^{*} + b_{I}

(47)

b_{(i)} = \begin{cases} \partial I_{\geq 0} & \text{if } i \in I \text{ and the constraint } z_{(i)} \geq 0 \text{ is active} \\ 0 & \text{otherwise} \end{cases}

(48)

The subdifferential $\partial I_{\geq 0}$ evaluated at 0 is an unbounded set and therefore cannot be used to directly bound $y^{*}$. However, the optimality condition (47) has $n_{z}$ rows while $y^{*}$ has dimension $n_{g}$. Furthermore, the maximum number of active inequality constraints, that is, constraints where $b_{(i)} \neq 0$, is equal to $(n_{z} - n_{g})$, since otherwise $z^{*}$ would be over-determined by the constraints due to the linear independence assumption. Consequently, if all rows in (47) such that $b_{(i)} \neq 0$ are removed, there will remain at least $n_{g}$ rows.

To this end, we need to make sure that the matrix ${\bar{A}}^{T}$ obtained after removing the rows from $A^{T}$ remains full rank. When active, the inequality constraint $z_{(i)} \geq 0$ becomes $z_{(i)} = 0$. If this equality is appended as a row vector to the matrix A, then, by the linear independence assumption, the rank of the augmented matrix becomes $n_{g} + 1$. Through elementary row operations, any non-zero element in the column corresponding to $z_{(i)}$ can be changed to zero with no change in the rank of the matrix. At this stage, the row $z_{(i)} = 0$ can be removed, with the rank of the matrix dropping back to $n_{g}$. Once the row is removed, the column corresponding to $z_{(i)}$ is all zeros and can hence be removed with no change in rank. This proves that ${\bar{A}}^{T}$ has full rank $n_{g}$.

Therefore, the optimality condition (47) can be restated as (49) where $\bar{c}$ is the vector obtained after removing all the corresponding rows from c. The vector b_I becomes a zero vector after removing these rows:

{\bar{A}}^{T} y^{*} = - \bar{c}

(49)

From (49), we obtain (50) where $σ_{min} (\bar{A} {\bar{A}}^{T})$ is the minimum eigenvalue of $\bar{A} {\bar{A}}^{T}$ and is greater than zero since $\bar{A}$ is full rank:

\frac{‖ \bar{c} ‖^{2}}{‖ y^{*} ‖^{2}} = \frac{y^{* T} \bar{A} {\bar{A}}^{T} y^{*}}{‖ y^{*} ‖^{2}} \geq σ_{min} (\bar{A} {\bar{A}}^{T})

(50)

The proof is established by setting $γ = 1 / σ_{min} (\bar{A} {\bar{A}}^{T})$ and noting that $‖ \bar{c} ‖^{2} \leq ‖ c ‖^{2}$ since $\bar{c}$ is obtained by removing elements from c.
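The inequality chain in (49) and (50) can be checked numerically on a small instance. The matrix and dual vector below are hypothetical values chosen only so that the arithmetic is easy to follow: ${\bar{A}}^{T}$ has three rows and two columns, so $\bar{A} {\bar{A}}^{T}$ is a 2 × 2 symmetric matrix whose smallest eigenvalue has a closed form.

```python
import math


def gram(a_bar_t):
    """Entries of Abar @ Abar^T, given Abar^T as a list of two-column rows."""
    g11 = sum(row[0] * row[0] for row in a_bar_t)
    g12 = sum(row[0] * row[1] for row in a_bar_t)
    g22 = sum(row[1] * row[1] for row in a_bar_t)
    return g11, g12, g22


def min_eigenvalue_2x2(g11, g12, g22):
    """Smallest eigenvalue of the symmetric matrix [[g11, g12], [g12, g22]]."""
    mean = (g11 + g22) / 2
    radius = math.sqrt(((g11 - g22) / 2) ** 2 + g12 ** 2)
    return mean - radius


# Hypothetical reduced system: three remaining rows, n_g = 2.
a_bar_t = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
y_star = [1.0, -1.0]

# (49): Abar^T y* = -c_bar, so c_bar is determined by y*.
c_bar = [-(row[0] * y_star[0] + row[1] * y_star[1]) for row in a_bar_t]

sigma_min = min_eigenvalue_2x2(*gram(a_bar_t))
gamma = 1.0 / sigma_min
y_sq = sum(v * v for v in y_star)
c_sq = sum(v * v for v in c_bar)
bound_holds = y_sq <= gamma * c_sq
```

Here the columns of ${\bar{A}}^{T}$ are linearly independent, so `sigma_min` is strictly positive and `bound_holds` evaluates to true, as the lemma requires.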

We assume that all elements in c are upper-bounded by a constant value u_c . This is a reasonable assumption since c represents the link cost vector and only the relative cost is of importance. Consequently, from Lemma 3, we have the upper bound given in (51) for the squared norm of the dual vector:

‖ y^{*} ‖^{2} \leq γ u_{c}^{2} n_{g}

(51)

We can now state the main complexity result given by Theorem 4. The complexity is polynomial as expected since the problem is convex and the number of variables is polynomial in the number of nodes.

Theorem 4. Obtaining an ε-optimal solution for the threshold-DIF problem using DADMM has a computational complexity of $O ({(| V |^{2} N_{m} N_{j})}^{2} (| V |^{2} N_{m} N_{j} + N_{s}) (| V |^{3} N_{m} N_{j} + N_{s}) / ϵ)$ .

Proof. Bounds (44) and (51) mean that the left-hand side of (43) is bounded by a constant weighted sum of n_z and n_g . Therefore, an upper bound on the number of iterations k required to produce an error ε is given as $O ((n_{z} + n_{g}) / ϵ)$ .

Note that the number of primary variables n_z is bounded by $O ({| V |}^{2} N_{m} N_{j})$ multiplied by the number of nodes. Therefore, we have $n_{z} \leq O ({| V |}^{3} N_{m} N_{j})$ . The number of constraints n_g , on the other hand, can be bounded such that $n_{g} \leq O ({| V |}^{3} N_{m} N_{j} + N_{s})$ . Thus, the resulting number of iterations of the optimization process is given by (52):

k \leq O (({| V |}^{3} N_{m} N_{j} + N_{s}) / ϵ)

(52)

The complexity of the whole optimization process is obtained by multiplying the number of iterations by the complexity of each iteration. Thus, for an ε-optimal solution, DADMM for threshold-DIF runs in $O ({(| V |^{2} N_{m} N_{j})}^{2} (| V |^{2} N_{m} N_{j} + N_{s}) (| V |^{3} N_{m} N_{j} + N_{s}) / ϵ)$ time.
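The way the two factors combine can be seen by evaluating the expressions for a given problem size. The sketch below drops all constants, treats the graph depth as constant as assumed in the text, and uses hypothetical function and parameter names.

```python
def threshold_dif_bounds(num_nodes: int, n_sources: int, n_dests: int,
                         n_interlink: int, eps: float):
    """Return (per-iteration term, iteration-count term, product), constants dropped."""
    routing = num_nodes ** 2 * n_sources * n_dests            # bound on n_max * |V|
    per_iteration = routing ** 2 * (routing + n_interlink)    # Theorem 3 after substitution
    iterations = (num_nodes ** 3 * n_sources * n_dests + n_interlink) / eps  # (52)
    return per_iteration, iterations, per_iteration * iterations
```

For instance, `threshold_dif_bounds(2, 1, 1, 0, 0.5)` evaluates the terms for a two-node network with one source, one destination and no inter-link constraints. The product grows polynomially in |V|, which matches the statement of Theorem 4.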

6.3. Two-robot experiment

We implemented our approach and performed an experiment where two mobile robots communicate to track a moving target. The aim of this experiment is to show the information gain advantage of dynamic information flow in the case of limited inter-robot communication bandwidth.

6.3.1. Experimental setup:

The experimental system consists of two modified Segway RMP 400 robots. An image of the robots is shown in Figure 16. The first robot is equipped with a Velodyne 3D Lidar with a 360° field of view. The second robot is equipped with a 2D SICK LMS291 horizontally mounted laser scanner with a 180° field of view. Although these sensors provide range as well as bearing, they were treated as bearing-only sensors to force cooperation between the robots. Each robot is also equipped with a server-class computer with an eight-core processor. For localization, the two robots rely on high-accuracy inertial measurement unit (IMU) and differential global positioning system (DGPS) modules.

Fig. 16.

The Segway RMP 400 robots used in the experiments.

An image of the experimental setting is shown in Figure 17. The two robots were placed in two separate areas with virtually bounded geographical regions to avoid collision. Target tracking was limited to a geographically bounded region of interest and for the purpose of the demonstration, tracking was limited to one target performing circular patterns.

Fig. 17.

The outdoor experimental setup, with robots visible outside the tracking region of interest. The border of the tracking region is designated by solid lines.

The network diagram for this demonstration is shown in Figure 18. It is assumed that maximum communication bandwidth is limited and does not allow both robots to send sensor data at the full rate. We also assume that communication throughput decreases with inter-robot distance. Therefore, the robots are required to share the available bandwidth. The bandwidth sharing constraint is indicated by the dashed lines drawn between the two inter-robot links. Virtual links, not shown in the figure, allow for the no-send decision. It is expected that through dynamic information flow the bandwidth will be shared efficiently with respect to sensor utility. The maximum flow assumption is valid in this scenario since no processing costs are assigned.

Fig. 18.

The threshold-DIF diagram for the two-robot scenario. Virtual links are omitted for clarity.

To validate the performance of dynamic information flow, two control tests were run for the purpose of comparison. The first control test allows unconstrained communication between all nodes and shall be referred to as the unconstrained case. This test mainly acts as a benchmark since it violates bandwidth constraints. The second control test involves a reduced communication rate that obeys bandwidth bounds. This test shall be referred to as the down-sampled case.

6.3.2. Results:

The information value for the robots’ target estimates over time is shown in Figure 19 with the corresponding average bars shown in Figure 20. In both figures, we observe minimal difference in information value across the three communication methods for Robot 1. Because Robot 1 has a 360° field-of-view sensor, the target is always visible and therefore tracking does not depend on observations received from Robot 2. However, we do observe a difference in information value for Robot 2. Figure 20(b) shows that the down-sampled method results in reduced information gathering performance when compared to our method. This effect is also evident in Figure 19(b) where there is a clear decline in information for the down-sampled case at times 20 s and 60 s. This decline occurs because the target drops outside the robot’s sensor field-of-view. Dynamic flow ensured that information was directed from Robot 1 to Robot 2, but down-sampling naively shared the communication medium.

Fig. 19.

The information value (negative entropy) of the robots’ target estimate for the two-robot hardware experiment. Plots shown are for all three communication methods: unconstrained, down-sampled and dynamic. The sudden drops in information are due to target loss.

Fig. 20.

Time averages of the plots in Figure 19 shown in bar format. Data rates are shown superimposed. The dynamic case shows a clear improvement in information gain in comparison to down-sampling for Robot 2.

The advantage of dynamic information flow is further confirmed in Figure 21. The bottom two plots show the approximate sensor observation utilities and data flow between robots over time for the dynamic flow case. The flow from Robot 1 to Robot 2 dominates bandwidth usage when the target is not in Robot 2’s sensor field-of-view. However, at times 40 s and 80 s, the flow was directed from Robot 2 to Robot 1 because the target was closer to Robot 2. The top plot shows the distance between the robots over time. At the average inter-robot distance, the available bandwidth is limited to approximately half of the maximum bandwidth. As expected, the sum of the information flows obeys this reduced capacity. The flow rate and sensor utility averages are shown in Table 3.

Fig. 21.

Inter-robot flow rates and sensor utility over time for the dynamic flow case of the two-robot experiment. In the first plot, the inter-robot distance is shown. In the lower plots, flow rate is shown as solid lines and sensor utility is shown as dotted lines. Flow rate varies with utility and available bandwidth varies with inter-robot distance.

Table 3

Average flow rates and sensor utility for the two-robot experiment.

Link	Flow Rate	Sensor Utility
Robot 1 to Robot 2	44.80	3.42
Robot 2 to Robot 1	10.30	2.10

6.4. Three-sensor-node experiment

We also evaluated our approach in a more complex scenario. This scenario involves the two robots from the previous experiment with the addition of a stationary camera. The aim of this three-sensor-node scenario is to show the generality of our method and to emphasize the multicast behavior of dynamic information flow.

6.4.1. Experimental setup:

In this scenario, the two robots are aided in tracking by a Prosilica GC2450 camera acting as a bearing-only sensor. The camera is positioned outside the tracking region of interest opposite the robots. The camera sends raw images to Robot 2 which processes the images and shares the observations with Robot 1. Hence, wireless communication is required for three links: 1) the link from the camera to Robot 2, 2) the link from Robot 1 to Robot 2 and 3) the link from Robot 2 to Robot 1. In the experiment, the robots remained stationary. This does not affect the results of the demonstration since separation distance is ignored in this scenario. Simulation results with moving robots in a similar setup are shown later in Section 6.5.

The network diagram is shown in Figure 22. The experiment assumes that available bandwidth is sufficient for the two mobile robots to share observations. However, if images are received from the static camera then the wireless network becomes congested. The camera provides accurate tracking when the target is in its proximity and inside its field-of-view. Dynamic flow is expected to allow images to be sent from the camera in such a situation. The extra flow from the camera has to be accompanied by a simultaneous reduction of flow in the other links sharing the medium.

Fig. 22.

The threshold-DIF diagram for the three-sensor-node scenario. Virtual links are omitted for clarity.

All three communication methods were tested. In the unconstrained communication case, images from the camera were sent to Robot 2 for processing and caused congestion in the wireless network. The bandwidth limits on the wireless network effectively reduced the image transfer rate and consequently diminished the advantage of the unconstrained case.

6.4.2. Results:

The information value for each of the robots’ estimates is plotted over time in Figure 24 with the corresponding average bars shown in Figure 25. The figures show that the dynamic case outperforms down-sampling for both robots. They also show better performance for the dynamic case in comparison to the unconstrained case for Robot 1, since the unconstrained case would have failed to produce the desired communication due to infrastructure bandwidth limitations. The results for Robot 2 show similar performance between the dynamic case and the unconstrained case.

Fig. 23.

Inter-robot information flow and sensor utility over time for the dynamic flow case of the three-sensor-node experiment. Flow rates are shown in solid lines; sensor utility is shown in dashed lines. Based on the bandwidth constraints and the camera’s data rate, 50 is the maximum flow available for the data sourced from the camera. At approximately 120 s, communication from the camera interrupts communication between robots due to the increased utility of camera observations.

Fig. 24.

The information value (negative entropy) of the robots’ target estimate for the three-sensor-node experiment. Plots are shown for all three communication methods: unconstrained, down-sampled and dynamic.

Fig. 25.

Time averages of the plots in Figure 24 shown in bar format. Data rates are shown superimposed. The dynamic information flow case has high average information gain performance with low average data rate.

One of the main objectives of this experiment is to highlight the multicast behavior. Multicast behavior can be observed by analyzing the bottom two plots of Figure 23. At times 80 s, 150 s and 190 s, the flow from the camera to Robot 2 retains a high value even though the camera utility for Robot 2 is low during those times. Robot 2 receives these observations without inducing additional cost since observations destined to Robot 1 are processed on-board Robot 2. Robot 1 receives these observations due to their high utility for Robot 1 but these observations must be processed on-board Robot 2 according to the system architecture. Multicast routing ensures that such flow does not get tallied twice. The top two plots confirm that the system obeys the bandwidth limits as the drop in flow takes place concurrently with the rise in flow from the camera. It should be noted that the maximum possible flow from the off-board stationary camera is only 50 units according to bandwidth bounds. The flow rate and sensor utility averages are shown in Table 4.

Table 4

Average flow rates and sensor utility for the three-sensor-node experiment.

Link	Flow Rate	Sensor Utility
Robot 1’s sensor to Robot 2	51.34	2.85
Robot 2’s sensor to Robot 1	24.68	0.50
Camera to Robot 1	26.29	3.40
Camera to Robot 2	24.89	1.88

These results also validate the maximum flow approximation of total flow. The flow in each of the inter-robot links shown in the top two plots in Figure 23 holds data to only one destination. Therefore, there is no error arising from the maximum flow approximation for those links. The link from the camera holds data destined to both robots. At instances when the camera link is at zero or at maximum flow, there is no loss due to the maximum flow approximation. Also, the number of destinations in this case is two and hence the loss calculated from (19) is at most 25% of maximum flow for all other instances.

6.5. Three-sensor-node simulation

We repeated the experiment presented in Section 6.4 in simulation. Here, we allow robots to move in order to improve tracking performance.

6.5.1. Experimental setup:

The experimental setting is similar to that of Section 6.4. Sensor output was only simulated through point observations. Therefore, no raw images were involved and the raw-data communication rate was fictitious. In a similar manner to the hardware case, communication is required for the links between the robots as well as the link from the camera.

The network diagram for this experiment is the same as shown earlier in Figure 22. Through dynamic information flow, the camera is expected to selectively interrupt communication between the two robots in order to send its images based on the benefit of its observations as evaluated by the robots.

The simulation was run for all three communication methods. The unconstrained communication method is naturally expected to produce higher information gain since the bandwidth bounds are not enforced due to the fictitious data rates.

6.5.2. Results:

The bottom two plots shown in Figure 26 highlight an aspect of multicast behavior different to that highlighted by the hardware analogue of this simulation. When only one robot evaluates a higher utility for camera observations, such as at times 120 s, 150 s and 220 s, no significant change in flow is observed. However, when both robots evaluate an improvement in the utility, the increase in the flow from the camera can be clearly seen.

Fig. 26.

Information flow and sensor utility for the three-sensor-node simulation. Flow rates and sensor utility are represented as in Figure 23. The maximum flow rate available for the data sourced from the camera is assumed to be 50. Flow rates vary with sensor utility.

The information value of the robots’ target estimates over time is shown in Figure 27, with the corresponding averages shown as bars in Figure 28. As expected, unconstrained communication results in higher information on average. However, this communication setting violates the bandwidth bounds. Nevertheless, Figure 28 shows that the dynamic case outperforms the down-sampled case for both robots. Figure 27 shows that the dynamic case dominates the down-sampled case at time 70 s, between times 150 s and 200 s and between times 250 s and 300 s. Observations were received from the camera at these times for the dynamic flow case (as can be seen in Figure 26). These times correspond to the configuration where the target enters the camera’s field of view and is closer to the camera than to the mobile robots. The flow rate and sensor utility averages are shown in Table 5.

Fig. 27.

The information value (negative entropy) of the robots’ target estimate for the three-sensor-node simulation. Plots are shown for all three communication methods: unconstrained, down-sampled and dynamic.

Fig. 28.

Time averages of the plots in Figure 27 shown in bar format. The data rates are shown superimposed. The improvement for Robot 2 achieved by dynamic communication over down-sampling using the same communication rates is clearly observed. The unconstrained method violates the bandwidth constraints and is only included as a benchmark.

Table 5.

Average flow rates and Sensor Utility for the three-sensor-node simulation.

Link	Flow Rate	Sensor Utility
Robot 1’s sensor to Robot 2	89.74	3.01
Robot 2’s sensor to Robot 1	81.21	1.14
Camera to Robot 1	5.33	1.50
Camera to Robot 2	6.79	0.94

Similar to the hardware case, we note there is no error arising from the maximum flow assumption in the links between the two robots. The loss occurring in the link from the camera is negligible.

6.6. Multiple-node simulation

We demonstrated our decentralized algorithm on a simulated 15-node threshold-DIF network. The purpose of the simulation is to demonstrate our approach for a problem that is not amenable to manually designed communication protocols.

The simulated network comprises five agents each equipped with a sensor, a processor and an estimator. The network has a fully connected topology such that each sensor is connected to all processors and each processor is connected to all estimators. The agents are spatially distributed evenly in a linear manner. Each sensor is assumed to produce data at a rate of 100 units. A global communication constraint of 500 units was applied. In addition, a per-link capacity constraint of 200 units was applied to each processor–estimator link. Sensor utilities were externally randomized and provided to the estimators.

Figure 29 displays, in chronological order, the routing state of the network at various time instances throughout the simulation. Each column represents one agent equipped with a sensor, processor and estimator. This configuration is a matter of choice rather than a restriction of the algorithm. The links shown in the figure represent those that carried more than 10 units of data during the simulation. The figure demonstrates the shift of flow from one part of the network to another as the sensor utilities change. More importantly, the routing states shown would be difficult to determine based merely on intuition.

Fig. 29.

Routing state of multiple-node simulation at various times depicting active wireless links. Each column represents one agent equipped with a sensor, processor and estimator. Active links are represented by green lines. A link is considered active if it holds more than 10 units of data flow.
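The network and limits described in this section can be written down as a small feasibility check. The sketch below is illustrative only: the node names, the helpers `is_wireless`, `is_feasible` and `active_links`, and the example flow are hypothetical, and it assumes that links between components of the same agent are on-board and therefore do not count toward the global communication constraint.

```python
from itertools import product

NUM_AGENTS = 5
SENSOR_RATE = 100.0       # data rate of each sensor (from the text)
GLOBAL_LIMIT = 500.0      # global communication constraint (from the text)
LINK_CAPACITY = 200.0     # cap on each processor-estimator link (from the text)
ACTIVE_THRESHOLD = 10.0   # a link is drawn as active above this flow (from the text)

sensors = [f"s{i}" for i in range(NUM_AGENTS)]
processors = [f"p{i}" for i in range(NUM_AGENTS)]
estimators = [f"e{i}" for i in range(NUM_AGENTS)]

# Fully connected stages: every sensor to every processor,
# and every processor to every estimator.
links = list(product(sensors, processors)) + list(product(processors, estimators))


def is_wireless(link):
    # Assumption: components sharing an agent index communicate on-board.
    return link[0][1:] != link[1][1:]


def is_feasible(flow):
    """Check a flow assignment (dict: link -> rate) against the stated limits."""
    for (src, _dst), rate in flow.items():
        if rate < 0:
            return False
        if src in sensors and rate > SENSOR_RATE:
            return False
        if src in processors and rate > LINK_CAPACITY:
            return False
    wireless_total = sum(r for link, r in flow.items() if is_wireless(link))
    return wireless_total <= GLOBAL_LIMIT


def active_links(flow):
    """Links that would be drawn in a routing-state plot."""
    return sorted(link for link, r in flow.items() if r > ACTIVE_THRESHOLD)


# Hypothetical routing state, not taken from the simulation.
example_flow = {("s0", "p0"): 100.0, ("p0", "e1"): 150.0,
                ("s2", "p3"): 80.0, ("p3", "e3"): 5.0}
print(is_feasible(example_flow), active_links(example_flow))
```

Even with only five agents there are 50 candidate links, which gives a sense of why routing states are hard to design by hand.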

6.7. Monte Carlo simulation

To validate the performance of dynamic information flow in comparison to down-sampling, we conducted a Monte Carlo simulation for the experimental setting of Section 6.3. The aim of the simulation is to analyze the statistical significance of the performance improvement due to our solution to threshold-DIF.

The results of the Monte Carlo simulation are shown in Figure 30 in box-plot format. For each communication method, we ran 20 randomly initialized trials of 1 min each. Each trial is treated as one sample. Dynamic communication outperforms down-sampling with a p-value of less than 0.03 based on Welch’s t-test.

Fig. 30.
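For readers unfamiliar with Welch’s t-test, the statistic and its degrees of freedom can be computed as in the following sketch. The function name `welch_t` is hypothetical and the sample values are synthetic placeholders, not the trial results behind Figure 30.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each group mean; variance() is the
    # unbiased sample variance, and the two groups are not pooled.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    dof = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, dof


# Synthetic per-trial scores for illustration only.
dynamic = [2.1, 2.4, 1.9, 2.6, 2.2, 2.5]
down_sampled = [1.8, 2.0, 1.7, 2.1, 1.6, 1.9]
t_stat, dof = welch_t(dynamic, down_sampled)
print(t_stat, dof)
```

The p-value then follows from the t distribution with `dof` degrees of freedom. Unlike Student’s t-test, this test does not assume that the two communication methods produce equal variances across trials.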

7. Discussion of sensor utility estimation

An essential component in addressing dynamic information flow is the estimation of sensor utility. In this section, we discuss our implementation of sensor utility estimation, its limitations and possibilities for improvement.

In our implementation, an estimator approximates sensor utility by evaluating the most recent sensor observation received. The value of the observation is computed as the reduction in entropy realized by fusing the observation into the estimator. This approach is advantageous due to the simplicity of implementation. Entropy reduction can be computed efficiently without additional data storage. The disadvantage of this approximation is that it is myopic; it only reflects the instantaneous effect of a sensor observation on the information gathering performance of a single robot. Myopic approximations such as this are commonly used since the long-term value of a sensor observation can be difficult to compute in the general case (Williamson et al., 2009). Alternatively, robots could learn the utility of an observation from other robots, as in Xu et al. (2013a). This approach is suited to homogeneous systems with sufficient capacity for the extra computational overhead necessitated by the learning process.
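As an illustration of the myopic measure, the sketch below computes the entropy reduction for a two-dimensional Gaussian estimate. It assumes a linear-Gaussian observation that measures the state directly, which may differ from the estimators used in our experiments; the helper names and the covariance values are hypothetical.

```python
import math


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inv2(m):
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]


def add2(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def entropy_reduction(prior_cov, obs_noise_cov):
    """Entropy drop (in nats) from fusing one direct observation of the state."""
    prior_info = inv2(prior_cov)
    # Information-form fusion: the posterior information matrix is the prior
    # information plus the information contributed by the observation.
    post_info = add2(prior_info, inv2(obs_noise_cov))
    # Gaussian entropy is 0.5 * ln((2*pi*e)^n * det(P)); the constant cancels
    # in the difference, leaving a ratio of determinants.
    return 0.5 * math.log(det2(post_info) / det2(prior_info))


# Illustrative covariances: the same prior fused with an accurate and a noisy sensor.
prior = [[4.0, 0.0], [0.0, 4.0]]
accurate = [[1.0, 0.0], [0.0, 1.0]]
noisy = [[100.0, 0.0], [0.0, 100.0]]
utility_accurate = entropy_reduction(prior, accurate)
utility_noisy = entropy_reduction(prior, noisy)
print(utility_accurate, utility_noisy)
```

Only the current covariance and the observation noise are needed, which reflects why the measure requires no additional data storage. It also depends solely on the estimator’s present state, which is the myopic limitation discussed above.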

For non-myopic sensor utility estimation, the value of a shared observation should be evaluated according to that observation’s impact on the actions of other robots. Consider the following example where two moving robots track two moving targets. Suppose that the targets are far apart and each robot is only tracking one target. The observation from the first robot still induces an information gain in the second robot’s estimator in this case. However, the observation is of little value to the second robot because this additional information does not affect the second robot’s choice of viewpoint in the immediate future. The second robot can always receive the latest target estimate, which is an accumulation of past observations, from the first robot when needed. Therefore, in a non-myopic approach the value of an observation should not be based purely on information gain but rather on its effect on decision-making. Non-myopic estimation can improve the performance of DIF algorithms but is beyond the scope of this paper.

The submodularity of mutual information has recently been used to provide near-optimal solutions to the sensor selection problem (Golovin and Krause, 2011). However, the submodularity property does not hold for information gathering tasks in dynamic environments (Williams, 2007), which is the case of interest in this paper.

8. Conclusions and future work

We have introduced the dynamic information flow problem and defined two variants, min-cost-DIF and threshold-DIF. We proposed an efficient decentralized solution for each variant that allows heterogeneous decentralized information gathering teams to dynamically decide when and where sensor information should be transmitted. We expect this work to be of value in a broad range of application areas in and beyond robotics. For example, agricultural and environmental monitoring robotics involve problems in coordination and estimation for multiple robots with high-data-rate sensors that would benefit from efficient use of limited communication bandwidth. Our work can also be beneficial to automated personnel and equipment tracking systems in geographically large areas such as ports and mines.

In min-cost-DIF, robots determine flow rates based on information value, communication costs and computation costs. Our solution to min-cost-DIF was adapted from recent results in multicast routing, which we extended to allow for negative link costs that represent sensor utility. This solution was demonstrated in simulation for a system of two mobile robots and was also demonstrated experimentally and in simulation for the case of one mobile robot with access to an off-board processing station. The scalability of our solution was empirically evaluated in a simulation with up to 28 nodes.

In threshold-DIF, flow rates are optimized based on the value of information while obeying local computation limits and global communication limits. Our solution to threshold-DIF is based on a distributed version of ADMM that requires neighbor-to-neighbor communication only. We proved that the convergence time of our solution is polynomial in the size of the network. The benefits of this solution were demonstrated in simulation and in hardware experiments with two mobile robots and with two robots plus a stationary camera.

Our results have shown empirically that judicious adjustment of flow rates can improve information gathering performance. We focused on experimental systems with high-data-rate sensors that can easily overwhelm a standard wireless network. We have begun to explore scalability to larger networks in simulation, but the general issue of scalability and performance in large real-world networks (greater than 10 nodes) is an important area of future work.

Another interesting avenue to pursue is to consider further variants of the general DIF problem. For example, we would like to extend the cases considered in this paper, which address sensor-to-estimator communication, to include estimator-to-controller communication. This extension would introduce a new DIF variant for unified communication-aware information gathering and is of particular importance to applications that involve large amounts of data exchange for both estimation and control. The challenges of communication efficiency for control have traditionally been studied in the context of team decision theory in terms of information structure optimization (Ho, 1980). The difficulty of information structure optimization would complicate its integration with DIF, mainly due to the non-linearity of the control objective as a function of communication decisions. However, if communication costs are readily specified, as assumed in the case of min-cost-DIF, then integration with our previous communication-efficient control method (Kassir et al., 2012), which also takes communication costs as input, becomes possible. This integration would require simplifying assumptions such as the separation between estimation and control.

The performance of our solutions to DIF could be further improved through an accurate non-myopic sensor utility estimate. One possible approach to non-myopic sensor utility estimation is to employ machine learning techniques that learn the sensor utility as a function of the platform state. It is also interesting to consider cases where sensor utility estimates change rapidly, and to explore the sensitivity of DIF algorithms to such high-frequency changes.

Another desirable future advancement would be to extend our current distributed solution to threshold-DIF to obtain an any-time feasible solution. Any-time feasibility might be achieved through the adaptation of feasibility projection methods from optimization theory.

In our current implementation, the identities of the nodes in the network need to be globally known. We envisage that this is not a necessary requirement and can be replaced by an automated discovery process. Nodes can advertise their types and then run a handshaking procedure with appropriate nodes before establishing connections.

Acknowledgements

We would like to thank Professor Stephen Boyd for his helpful advice on ADMM. We would also like to thank James Underwood, Zhe Xu, Suchet Bargoti, Joseph Nguyen, Marcos Castro, Tim Patten and Mark Calleija for their help in conducting the experiments.

Funding

This work is supported in part by the Australian Centre for Field Robotics and the New South Wales State Government.

References

Ahlswede, Cai, et al. (2000) Network information flow. IEEE Transactions on Information Theory 46(4): 1204–1216.

Ahuja, Magnanti and Orlin (1993) Network Flows: Theory, Algorithms, and Applications. Englewood Cliffs, NJ: Prentice Hall.

Bernstein, Givan, Immerman, et al. (2002) The complexity of decentralized control of Markov decision processes. Mathematics of Operations Research 27(4): 819–840.

Bourgault, Furukawa and Durrant-Whyte (2004) Decentralized Bayesian negotiation for cooperative search. In: Proceedings of IEEE/RSJ IROS, Sendai, Japan.

Boyd, Parikh, Chu, et al. (2011) Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning 3(1): 1–122.

Carlin and Zilberstein (2008) Value-based observation compression for DEC-POMDPs. In: Proceedings of AAMAS, Estoril, Portugal.

Chen, Wang, Szymanski, et al. (2010) Dynamic service execution in sensor networks. The Computer Journal 53(5): 513–527.

Chung, Burdick and Murray (2006) A decentralized motion coordination strategy for dynamic target tracking. In: Proceedings of IEEE ICRA, Orlando, FL.

Cui, Xue and Nahrstedt (2007) Optimal distributed multicast routing using network coding. In: Proceedings of IEEE ICC, Glasgow, UK.

Delle Fave, Rogers, et al. (2012) Deploying the max-sum algorithm for coordination and task allocation of unmanned aerial vehicles for live aerial imagery collection. In: Proceedings of IEEE ICRA, St. Paul, MN, USA.

Gan and Sukkarieh (2011) Multi-UAV target search using explicit decentralized gradient-based negotiation. In: Proceedings of IEEE ICRA, Shanghai, China.

Ghaffarkhah and Mostofi (2011) Communication-aware motion planning in mobile networks. IEEE Transactions on Automatic Control 56(10): 2478–2485.

Gmytrasiewicz and Durfee (2001) Rational communication in multi-agent environments. Autonomous Agents and Multi-Agent Systems 4(3): 233–272.

Goldman and Zilberstein (2003) Optimizing information exchange in cooperative multi-agent systems. In: Proceedings of AAMAS, Melbourne, Australia.

Golovin and Krause (2011) Adaptive submodularity: Theory and applications in active learning and stochastic optimization. Journal of Artificial Intelligence Research 42: 427–486.

Golovin, Faulkner and Krause (2010) Online distributed sensor selection. In: Proceedings of ACM/IEEE IPSN, Stockholm, Sweden.

Gupta and Kumar (2000) The capacity of wireless networks. IEEE Transactions on Information Theory 46(2): 388–404.

Gupta, Chung, Hassibi, et al. (2006) On a stochastic sensor selection algorithm with applications in sensor scheduling and sensor coverage. Automatica 42(2): 251–260.

He and Yuan (2012) On the O(1/n) convergence rate of the Douglas-Rachford alternating direction method. SIAM Journal on Numerical Analysis 50(2): 700–709.

Hollinger and Singh (2012) Multirobot coordination with periodic connectivity: Theory and experiments. IEEE Transactions on Robotics 28(4): 967–973.

Hollinger, Singh, Djugash, et al. (2009) Efficient multi-robot search for a moving target. The International Journal of Robotics Research 28(2): 201–219.

Ho (1980) Team decision theory and information structures. Proceedings of the IEEE 68(6): 644–654.

Hsieh, Cowley, Kumar, et al. (2008) Maintaining network connectivity and performance in robot teams. Journal of Field Robotics 25(1–2): 111–131.

Kassir, Fitch and Sukkarieh (2012) Decentralised information gathering with communication costs. In: Proceedings of IEEE ICRA, St. Paul, MN, USA.

Keshavarz-Haddad and Riedi (2008) Bounds on the benefit of network coding: Throughput and energy saving in wireless networks. In: Proceedings of IEEE INFOCOM, Phoenix, AZ, USA.

Kulik, Heinzelman and Balakrishnan (2002) Negotiation-based protocols for disseminating information in wireless sensor networks. Wireless Networks 8(2/3): 169–185.

Kuo and Fitch (2014) Scalable multi-radio communication in modular robots. Robotics and Autonomous Systems 62(7): 1034–1046.

Lindhé and Johansson (2013) Exploiting multipath fading with a mobile robot. The International Journal of Robotics Research 32(12): 1363–1380.

Malmirchegini and Mostofi (2012) On the spatial predictability of communication channels. IEEE Transactions on Wireless Communications 11(3): 964–978.

Molin and Hirche (2009) On LQG joint optimal scheduling and control under communication constraints. In: Proceedings of IEEE CDC/CCC, Shanghai, China.

Mostofi (2009) Decentralized communication-aware motion planning in mobile networks: An information-gain approach. Journal of Intelligent and Robotic Systems 56(1–2): 233–256.

Ramanathan (1996) Multicast tree generation in networks with asymmetric links. IEEE/ACM Transactions on Networking 4(4): 558–568.

Sadler, Rus and Sukhatme (eds) (2013) Special issue on robotic communications and collaboration in complex environments. The International Journal of Robotics Research 32(12): 1361–1514.

Semsar-Kazerooni and Khorasani (2009) Multi-agent team cooperation: A game theory approach. Automatica 45(10): 2205–2213.

Smith, Schwager, Smith, et al. (2011) Persistent ocean monitoring with underwater gliders: Adapting sampling resolution. Journal of Field Robotics 28(5): 714–741.

Speyer, Seok and Michelin (2008) Decentralized control based on the value of information in large vehicle arrays. In: Proceedings of ACC, Seattle, WA, USA.

Stachura and Frew (2011) Cooperative target localization with a communication-aware unmanned aircraft system. Journal of Guidance, Control, and Dynamics 34: 1352–1362.

Twigg, Fink, et al. (2013) Efficient base station connectivity area discovery. The International Journal of Robotics Research 32(12): 1398–1410.

Williams (2007) Information theoretic sensor management. PhD Thesis, Massachusetts Institute of Technology, Cambridge, MA.

Williamson, Gerding and Jennings (2009) Reward shaping for valuing communications during multi-agent coordination. In: Proceedings of AAMAS, Budapest, Hungary.

Williamson, Gerding and Jennings (2008) A principled information valuation for communications during multi-agent coordination. In: Proceedings of AAMAS, workshop on multi-agent sequential decision making in uncertain domains, Estoril, Portugal.

Wu, Lin, Tseng, et al. (2000) A new multi-channel MAC protocol with on-demand channel assignment for multi-hop mobile ad hoc networks. In: Proceedings of I-SPAN, Richardson, TX, USA.

Xing, Cheng, et al. (2007) Superimposed code based channel assignment in multi-radio multi-channel wireless mesh networks. In: Proceedings of ACM MobiCom, Montreal, Canada.

Xi and Yeh (2010) Distributed algorithms for minimum cost multicast with network coding. IEEE/ACM Transactions on Networking 18(2): 379–392.

Xu, Fitch and Sukkarieh (2013a) Decentralised coordination of mobile robots for target tracking with learnt utility models. In: Proceedings of IEEE ICRA, Karlsruhe, Germany.

Xu, Fitch, Underwood, et al. (2013b) Decentralized coordinated tracking with mixed discrete–continuous decisions. Journal of Field Robotics 30(5): 717–740.

Yan and Mostofi (2013) Co-optimization of communication and motion planning of a robotic operation under resource constraints and in fading environments. IEEE Transactions on Wireless Communications 12(4): 1562–1572.