The Use of UAVs in Humanitarian Relief: An Application of POMDP‐Based Methodology for Finding Victims

Abstract

Researchers have proposed the use of unmanned aerial vehicles (UAVs) in humanitarian relief to search for victims in disaster‐affected areas. Because UAVs must search through the entire affected area to find victims, the path‐planning operation becomes equivalent to an area coverage problem. In this study, we propose an innovative method for solving such a problem based on a Partially Observable Markov Decision Process (POMDP), which considers the observations made from UAVs. The formulation of the UAV path planning is based on the idea of assigning higher priorities to the areas that are more likely to have victims. We applied the method to three illustrative cases, considering different types of disasters: a tornado in Brazil, a refugee camp in South Sudan, and a nuclear accident in Fukushima, Japan. The results demonstrated that the POMDP solution achieves full coverage of disaster‐affected areas within a reasonable time span. Through a detailed multivariate sensitivity analysis, we evaluate the traveled distance and the operation duration (which were quite stable), as well as the time required to find groups of victims. The comparisons with a Greedy Algorithm showed that the POMDP finds victims more quickly, which is the priority in humanitarian relief, whereas the Greedy Algorithm focuses on minimizing the traveled distance. We also discuss the ethical, legal, and social acceptance issues that can influence the application of the proposed methodology in practice.

Keywords

humanitarian relief; disaster; UAVs; POMDP; SAR; path planning

Introduction

One of the most significant difficulties facing United Nations (UN) Agencies and Non‐Governmental Organizations (NGOs) when responding to rapid‐onset disasters, such as floods, earthquakes, and hurricanes, is understanding the requirements of the affected population accurately and swiftly (Tatham 2009). Those affected include people requiring immediate assistance during a period of emergency, that is, requiring basic survival needs such as food, water, shelter, sanitation, and immediate medical assistance (Lovell and Le Masson 2015).

Arii (2013) presents the current direct assessment methods as key informant interviews, community mapping, or a transect walk conducted by walking straight across the central part of the affected area while making careful observations (watching, listening, asking questions) and taking notes. However, these methods are time consuming and do not capture data in a systematic manner, with the locations sampled not being geographically representative (too clustered and too few) and the subsequent reports produced too late (Tatham 2009). Thus, to improve the speed of data capture, NGOs often resort to proxy indicators, such as observations from satellites, aircraft, or helicopters. While these can be effective, the use of helicopters for such tasks means that they are not available for cargo carrying and/or ambulance duties, which generally have a higher priority. At least in part, the difficulty of assessing the needs of a community leads to the situation reported by Tatham (2009) in which, in the aftermath of the 2005 Pakistan earthquake, a more remote village was 25% less likely to receive assistance than a less remote one.

As a result, many countries are using Unmanned Aerial Vehicles (UAVs) to support their survey of affected people. UAVs (or drones) are a class of aircraft that can fly without the presence of a pilot onboard. Humanitarian responses have used UAVs since 2001, after the terrorist attack of 9/11. An unprecedented number of small and lightweight UAVs were launched in the Philippines after Typhoon Haiyan, in 2013, and in Haiti, following Hurricane Sandy in 2012 (Meier 2014). In both cases, UAVs captured images to help the humanitarian organizations that were struggling to provide the most efficient aid to the affected regions (Turk 2014). More recently, in 2015, UAVs were used after the earthquake in Nepal – with thermal cameras that can find survivors by detecting body heat, as well as a high‐powered zoom lens that can show faces as well as on‐the‐ground‐detail from up to 1000 feet away (King 2015). UAVs were also used following the iron mining dam break in Mariana, Brazil, where drones were used as support to rescue teams and in monitoring structures. With the use of drones, it was possible to register a three‐meter crack in the wall of a second dam by the fire department team (Simões 2016). UAVs can also be used in slow‐onset disasters, such as the monitoring of deforested rural areas (Panaque‐Gálvez et al. 2014).

According to Keller (2015), world governments spent more than $4 billion on “drone” technology in 2015, and they expect this number to increase to $14 billion in 2024 for the worldwide UAV market. The increased demand for drone technology following the Gulf conflict was augmented substantially by the post‐9/11 conflicts in Afghanistan and Iraq. These conflicts, coupled with the broader Global War on Terror, created an opening for the expanded use of drones on an unprecedented scale.

The UAV's view from above is central for humanitarian responses because these vehicles can capture aerial imagery at a far higher resolution, more quickly and at much lower cost than satellite imagery (Meier 2014). The use of UAVs is increasing in a variety of applications, such as search and rescue operations, which can benefit significantly from the use of UAVs to survey affected areas (often very large) and collect information about the presence of victims and their possible locations (Murtaza et al. 2013). Manned rescue teams can be effectively directed to these locations to maximize the possibility of rescuing trapped victims (Murtaza et al. 2013). Chen and Miller‐Hooks (2012) mention the use of UAVs in urban search and rescue (USAR) to assess damage in areas impacted by natural disasters (size up phase), in which a disaster struck an urban area with numerous sites, such as buildings or other structures suspected of housing people where survivors may be trapped and require extrication and emergency care. Lin and Goodrich (2009) believe that mini‐UAVs have a potential use in wilderness search and rescue (WiSAR) because the UAV onboard video camera provides visual support, enables search and rescue workers to systematically survey large areas of importance in real time and increases the workers' awareness of the environment.

In fact, the use of UAVs in SAR operations is suggested due to the speed with which information is generated and made available to coordination teams, highlighting local evaluation, detection of blocked areas, and identification of secondary disasters (Xu et al. 2014). According to Luis et al. (2012), another advantage of the use of UAVs is that, different from trucks, UAVs do not circulate on land routes that may be destroyed, as occurred in 2010 in the Haiti earthquake.

An important step for the success of a search and rescue mission is the process of path planning, that is, designing the autonomous flight path of UAVs. Yuan and Wang (2009) develop a path‐planning algorithm for emergency logistics management, with the objective of minimizing total travel time along the path. However, in most practical disaster situations, the number of trapped victims is unknown and, as such, the path‐planning operation becomes equivalent to an area coverage problem, because the UAVs must search through the entire affected area to find victims. Moreover, in typical disaster areas, certain locations are more likely to have stranded victims, usually more densely populated areas, such as schools, hospitals, and markets. Hence, if the UAVs are programmed to first visit such locations, then it is likely that the stranded victims will be found quickly. We adopted a priority‐based approach for coverage path planning in UAV networks, in which we assign different priorities to different regions within the target area based on a priori knowledge of the terrain (Murtaza et al. 2013).

Among 63 studies analyzed in our systematic literature search, only 10 studies consider route planning algorithms, and only 3 use the drone imaging approach to help rescue teams find victims: Waharte and Trigoni (2010), Murtaza et al. (2013), and Baker et al. (2016) – where the first two papers are conference ones (non‐peer‐reviewed) and the latter was published in a non‐operations management journal. We also included conference papers in our literature review, rather than restricting the selection to peer‐reviewed journals only, to reduce publication bias and because the theme is emergent and still under development in academia. Relying exclusively on peer‐reviewed literature could omit potentially relevant work.

Both Waharte and Trigoni (2010) and Murtaza et al. (2013) propose a solution based on the Partially Observable Markov Decision Process (POMDP) for the constrained coverage problem because the method fits well with the imperfect information in the context of disasters. Both studies compare the results with the Greedy and the Potential‐Based Algorithms and validate them in a hypothetical environment, but not a real disaster scenario map. However, these studies do not provide a methodology that allows their application in real scenarios, where the type of drone, the flight height, and the area size should be considered. Thus, their papers do not allow the reader to know about the feasibility of their proposed method in terms of the UAV's endurance, time of operation, and even a real area, which determines the time to find groups of victims in practice. Different from the previous cited authors, Baker et al. (2016) apply the Monte Carlo tree search to find victims and consider a real disaster area in their application, the 2010 Haiti earthquake. None of these studies implement a sensitivity analysis on variables, such as the number and size of states and discount factor, nor do they clarify how the UAV works when it is “isolated” in an area to determine which neighborhood has already been visited.

Considering the research gap for the use of UAVs for humanitarian operations management, this study contributes to important humanitarian operations management issues, focusing efforts on the path‐planning problem by proposing a POMDP‐based methodology and discussing ethical, legal, and social acceptance issues that can influence the application of the proposed methodology in practice, as well as have managerial implications. Based on the POMDP algorithm, we propose a step‐by‐step implementation process that academics and practitioners can apply for UAVs to find victims in areas affected by disasters while avoiding “isolating” the UAV in a previously visited cell. Motivated by the hypothetical environment used by Murtaza et al. (2013) and Waharte and Trigoni (2010), this study applies the POMDP method to three real examples. Two of these applications consider SAR operations in sudden‐onset disasters: a tornado, in Brazil, and the nuclear accident in Fukushima, a consequence of an earthquake followed by tsunamis. The third example shows the applicability of the proposed method in slow‐onset disasters: a refugee camp in South Sudan, to assess the needs of the affected population (including water systems, toilets, educational facilities, and health care), to map settlements, to register the displaced, to understand the dynamics of population movements and to assess the environmental damage caused by displacement. This study also implements a sensitivity analysis on the size/number of states, the discount factor, and the allocation of the priorities of each state, and includes additional analysis metrics (e.g., traveled distance, operation duration) to analyze the feasibility of the operation. We conduct interviews with practitioners to align with our assumptions and findings, which make the results more useful to humanitarian operations management (Gupta et al. 2016, Pedraza‐Martinez and Van Wassenhove 2016).

We organized the remainder of this text as follows: Section 2 presents a literature review of the applications of UAVs in humanitarian relief. Section 3 presents the framework of the POMDP technique. Then, we present the method for formulating the path‐planning problem in Section 4. Section 5 presents the three illustrative examples. Section 6 presents the results and discussion of the POMDP application for UAV path planning. We present concluding remarks in Section 7.

Literature Review

This section presents a review of the literature on the use of drones for humanitarian relief SAR operations. According to Thorpe et al. (2005), a literature review helps to engender a collective endeavor, relevance, and openness among studies, which not only prevents expensive and fruitless repetition of effort but also assists in linking future research to the questions and concerns that have been proposed by past research and, finally, improves the methods used to collect and synthesize previous empirical evidence. The methodology adopted to conduct the literature review consisted of four steps, as proposed by Thorpe et al. (2005). We show the summary of the review process in Table 1.

Table 1

Summary of the Review Process and Results (Adapted from Thorpe et al. 2005)

	Stage one	Stage two	Stage three	Stage four
Activity	Search in databases	Exclusion criteria	Keywords and search strings used to exclude further articles	Quality and relevance criteria used to separate into two lists
Key results	Databases (2); Keywords used (9); Searches from databases (119); Snowball search (93); Total: 212	Anonymous author (1); Other language (3); Neither peer‐reviewed nor conference papers (9); Total: 199	–; Total: 199	Not relevant (136); Relevant (63); Total: 63


First, we selected two databases, as suggested by Thomé et al. (2016a): Scopus and Web of Science. According to Mongeon and Paul‐Hus (2016), these are the two main and largest databases, and their use together makes the research wider and reduces the possibility of bias related to journals indexed exclusively in one of the databases. Thomé et al. (2016b) also highlight its extensive coverage of over 22,000 journals from the main publishers of peer‐reviewed papers. Then, we used the following keywords to filter the databases in their topic, title, abstract, or keywords: “UAV OR Drone” and “Humanitarian OR Disaster OR Relief OR Emergency OR Crisis” and “Search and Rescue OR Search & Rescue”, which resulted in 119 documents. There are other synonyms for UAVs, such as RPV (remotely piloted vehicle), ROA (remotely operated aircraft), UVS (unmanned vehicle system), and UAS (unmanned aerial system) (Bendea et al. 2008), but we did not consider them because the papers found with these keywords were not relevant. To complement the database search, we conducted a snowball search in the most relevant papers (Thomé et al. 2016a), which increased the total number of documents to 212.

In the second step, we applied the exclusion criteria and excluded three documents written in another language and one document with no author available. We limited the review to peer‐reviewed papers and conference ones, which excluded nine more documents, so step two ends with 199 documents. The number of conference papers is very representative (see Figure 1). Conference papers account for 70% of the relevant papers and peer‐reviewed papers represent 30% of them.

Figure 1

Number of Peer‐Reviewed Papers and Conference Papers Over the Years

We did not consider the third stage of the review process because there was no need to exclude further articles through keywords and search strings.

Finally, we read the abstracts to confirm the relevance of the documents, and 63 of them were considered relevant for the review and were categorized according to the criteria presented below (see Supplementary Material).

Table 2 shows the papers categorized by origin and speed of disaster according to the classification proposed by Van Wassenhove (2006).

Table 2

Papers Categorized by Origin and Speed of Disaster

	Sudden‐onset	Slow‐onset	ND	Total
Natural	21	0	0	21
Man‐made	1	2	0	3
ND	0	0	39	39
Total	22	2	39	63

Thirty‐nine (62%) of the papers were not defined (ND) in both classes (origin and speed) because their proposals can be applied in the response phase of natural and man‐made disasters, and they also have applications at different disaster speeds. There were three review papers. The natural sudden‐onset disasters account for a third of the papers, where the use of UAVs consists mostly of analyzing the collected images. Only three papers (4.8%) address man‐made disasters, of which two were categorized as slow‐onset, with UAVs used in the refugee crisis in Europe (Mendonca et al. 2016, Tarchi et al. 2017), and one was categorized as sudden‐onset, with the UAV used to monitor and measure a forest fire (Merino et al. 2012).

In Figure 2, it is possible to see that almost half of the papers showed experimental results, which means that UAVs performed experimental flights and were used to validate the methodologies, algorithms, and proposed models (practical). This finding indicates that UAV use in humanitarian logistics can already be viewed as a highly feasible possibility, in addition to being an efficient and effective implementation.

Figure 2

Papers Categorized by Approach

Theoretical papers include mostly route planning algorithms and UAV networks. Only three of the theoretical papers considered a case study in real disaster scenarios to validate their model, as in this study. Salisbury et al. (2016) present a real‐time crowdsourcing platform that tags live footage from aerial vehicles flown during disasters. Kim et al. (2015) demonstrate the use of expert systems and advanced technologies for assessing damage caused by Tropical Storm Iselle. Baker et al. (2016) used data from the Ushahidi project generated from crowd‐sourced information during the 2010 Haiti earthquake. This finding indicates that there is a gap in the literature in terms of applications of the models and methodologies in real disaster areas with the focus of saving lives, which is one of the contributions of this study.

Figure 3 shows the papers categorized by the purpose of the application, where we could classify each paper in more than one category.

Figure 3

Papers Categorized by Purpose of the Application

From Figure 3, we can see that the majority of the papers address UAV networks. Nine of them cover centralized control methods, whereas five of them focus on decentralized control methods. Three of the papers compare these different methods, and five of them did not define the method; 38% of the papers address 3D mapping, mapping of affected areas, and image analysis, which are related topics. This relation is due to the main objective of early impact analysis after a disaster: to define the damages to the infrastructure and/or to facilities, which requires suitable data, such as high‐resolution satellite images. The 3D mapping and the image analysis provide clearer views of the affected areas as input data for early impact analysis in medium and large‐scale maps.

From Figure 3, it is also possible to see that 19% of the relevant papers address the use of sensors in detection operations. The types of sensors can be visual (6), radiofrequency (5), thermal (3), infrared (2), or X‐band radar (1). We could classify each paper with multiple types of sensors.

Table 3 shows the papers categorized by the number of UAVs deployed and the number of targets.

Table 3

Papers Categorized by the Number of UAVs Deployed and the Number of Targets

		Number of targets
		Single	Multiple	ND	Total
Number of UAVs deployed	Single	6	9	19	34
	Multiple	0	12	14	26
	ND	0	0	3	3
	Total	6	21	36	63

Papers addressing image analysis, 3D mapping, and mapping of affected areas, which represent almost half of the relevant papers, usually deployed just one drone to collect images, and do not have a specific target on the images. On the other hand, papers addressing the UAVs' networks, which represent 35% of the relevant papers, deployed multiple UAVs, and they usually had multiple targets or no target at all when used just to test the connection between drones.

As this study addresses drone routing, it is worth mentioning that we found only 10 papers in the literature review addressing route planning algorithms. We show these papers in Table 4.

Table 4

Papers on Route Planning Algorithms

Authors	Type (C = conference paper, P = peer‐reviewed journal paper)	Purpose of reducing time to find victims	Case study in real disaster area	Sensitivity analysis	Algorithm: Ant colony optimization	Dijkstra	Greedy	Hyper‐heuristic framework	Min_Route/Priority_Route	Monte Carlo tree search	Online planning with active sensing	POMDP	Potential	ND
Wang et al. (2017)	C							x
Mondal et al. (2016)	C			x					x
Wu et al. (2016)	C										x	x
Baker et al. (2016)	P	x	x							x
Xiaowei and Xiaoguang (2016)	C													x
Murtaza et al. (2013)	C	x					x					x	x
Almurib et al. (2011)	C					x
Waharte and Trigoni (2010)	C	x					x					x	x
Meng et al. (2009)	C					x
Cheng et al. (2009)	C				x
This study		x	x	x			x					x

Wang et al. (2017) consider five meta‐heuristic algorithms for solving the complex problem of planning the search sequence and search modes of UAVs in search and rescue operations but find that none of them can always obtain satisfactory solutions on a variety of instances. To overcome this obstacle, the researchers integrate these meta‐heuristics into a hyper‐heuristic framework. Mondal et al. (2016) develop two path‐planning algorithms, MIN_ROUTE (calculates shortest distance of shelter points) and ROUTE_PRIORITY (priority of shelter points at any time stamp), for a small‐sized fixed‐wing UAV for data collection and dissemination in a post‐disaster situation. They tested their algorithms on a simulator considering different environment simulators (wind speed and wind direction) and UAV calibration parameters, like battery level and flight duration. Xiaowei and Xiaoguang (2016) study path‐planning and communication‐optimizing problems when a UAV team performs communication relay tasks. Baker et al. (2016) introduce the survivor discovery problem and present their solution, the first example of a continuous factored coordinated Monte Carlo tree search algorithm. Wu et al. (2016) propose the OPAS algorithm—Online Planning with Active Sensing—to capture the uncertainty and partial observability of the domain, arguing that the POMDP resulting model is computationally intractable and cannot be solved by most existing POMDP solvers due to the large state and action spaces. Murtaza et al. (2013) solve the coverage problem while optimizing the time to find victims when the number of victims in the disaster area is unknown, and propose a heuristic based on POMDP. Almurib et al. (2011) create an initial static path planning for quad‐rotors, using the Dijkstra algorithm, and apply a flight mode dynamic path planning with a Virtual Potential Function algorithm. 
Waharte and Trigoni (2010) minimize the time to find the victim and discuss how some fundamental parameters (such as quality of sensory data collected by the UAVs, UAVs' energy limitations, environmental hazards (e.g., winds, trees), and level of information exchange/coordination between UAVs) can affect the search task. The researchers also study the performance of different search algorithms when the time to find the victim is the optimization criterion. Cheng et al. (2009) propose a cooperative path planner for UAVs, where the path of each UAV is represented by a B‐spline curve with a number of control points. The positions of these control points are optimized such that the total coverage of the UAVs is maximized. Meng et al. (2009) propose an algorithm to address the re‐tasking and path re‐planning of UAVs to handle unanticipated events and environmental disturbances.

Only the works of Waharte and Trigoni (2010), Murtaza et al. (2013), and Baker et al. (2016) have the purpose of reducing the time to find victims. Murtaza et al. (2013) focus on applying their POMDP algorithm for multiple UAVs to find victims in a very small target area (10 × 10 m cells). Although the results have been better than the greedy and the potential algorithms, it is a simulated environment, so it does not allow the authors to know how this type of operation would be, that is, if the drones' endurance would be sufficient and if the time of operation would be reasonable according to the size of the affected area. Another gap in their algorithm is that it does not consider the case when the UAV is “isolated” in an area of a neighborhood that has already been visited. Waharte and Trigoni (2010) also test three different algorithms (POMDP, Greedy, and Potential) in a very small target area (10 × 10 grid), and, different from Murtaza et al. (2013), their grid is on a map, but not a map of a real disaster scenario. Baker et al. (2016) apply the Monte Carlo tree search algorithm to create the UAV's path plan, which is empirically evaluated using the 2010 Haiti earthquake, but they do not consider a sensitivity analysis to study the variations in the belief map.

It is important to emphasize that among the algorithms listed in Table 4, the POMDP is the only Markovian Decision Process (MDP), which is a way of modeling processes where the transitions between states are probabilistic; it is possible to observe in which state the process is, and it is possible to interfere in the process by periodically executing actions. As it is partially observable, the actions are based only on the available information, which consists of previous observations and actions.

Table 4 indicates that there is a need for research on the POMDP‐based methodology to help rescue teams to find victims from the UAVs' images of real disaster areas and then apply a sensitivity analysis to show how different areas and priorities can affect the time to find groups of victims, as proposed in this study. We applied the methodology in three real disaster scenarios considering the case when the UAV is “isolated” and analyze some statistics that allow the validation of the operation in terms of UAV endurance and data link range, which are real constraints.

Partially Observable Markov Decision Process

The Markov decision process (MDP) models a controlled stochastic process with perfectly observable states. The MDP represents the situation in which a control agent can be uncertain about the possible outcomes of its actions but is still able to verify the resulting state once the action is complete. That is, there is no uncertainty regarding the state the agent currently is in, although there is uncertainty regarding the location where the agent can be after the next action (Hauskrecht 1997).

Imagine the situation in which the agent cannot observe the process state directly but only indirectly through a set of noisy or imperfect observations. The partial observability feature can be important in many real‐world problems. For example, a robot planning its route or determining what action to take usually operates on noisy sensory information; in medicine, the physician often needs to decide about a treatment based on available findings and symptoms while being uncertain about an underlying disease. In such cases, the perceptual information available need not align with and imply the actual world state with certainty. Therefore, the agent that acts in environments with imperfect state information may face uncertainty from two sources (Hauskrecht 1997):

uncertainty in the action outcome;

uncertainty in the world state due to imperfect (or partial) information.

The main distinction between fully observable MDPs and POMDPs lies in the information one uses to select an action. In the MDP case, actions are selected using process states that are always known with certainty, whereas for the POMDP, actions are based only on the available information, which consists of previous observations and actions.

Note that the observation model as defined makes it possible to condition observations on both actions and process states, which allows one to model investigative actions similarly to other control actions (Hauskrecht 1997).

According to Hauskrecht (1997), the POMDP is defined as a tuple (S, A, T, R, Ω, O, z, γ), where

S is a set of possible states for the stochastic process;

A is a set of actions that can be executed at different decision times;

T : S × A × S → [0, 1] is a function that indicates the probability that the system transitions to state s′, given that it was in state s and action a was executed;

R : S × A → ℝ is a function that indicates the cost (or reward) of making a decision a when the process is in state s;

Ω is a set of observations obtained at each decision time;

O : S × A × Ω → [0, 1] is a function that indicates the probability that an observation o is verified, given a state s and a previously executed action a;

z is the number of time steps the agent must plan. The term is also called the horizon, and can be finite when there is a fixed number of decisions to make, or infinite, when decisions are made repeatedly;

γ is a discount factor used to indicate how rewards earned at different time steps should be weighed. In general, the more lagged a reward is, the smaller its weight will be.
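To make the tuple concrete, the following is a minimal Python sketch of a container for (S, A, T, R, Ω, O, z, γ) with basic consistency checks. The class name `POMDP`, the dictionary layout, and the two‐cell toy problem (cell names, probabilities, rewards) are illustrative assumptions and are not taken from the study.

```python
from dataclasses import dataclass
from math import isclose, inf


@dataclass
class POMDP:
    """Hypothetical container for the tuple (S, A, T, R, Omega, O, z, gamma)."""
    states: list        # S
    actions: list       # A
    T: dict             # T[(s, a)] -> {s_next: P(s_next | s, a)}
    R: dict             # R[(s, a)] -> reward (or cost) of action a in state s
    observations: list  # Omega
    O: dict             # O[(s_next, a)] -> {o: P(o | s_next, a)}
    horizon: float      # z (math.inf for an infinite horizon)
    gamma: float        # discount factor

    def validate(self):
        """Check that every T and O row is a probability distribution."""
        for s in self.states:
            for a in self.actions:
                assert (s, a) in self.R
                assert isclose(sum(self.T[(s, a)].values()), 1.0)
                assert isclose(sum(self.O[(s, a)].values()), 1.0)
        assert 0.0 <= self.gamma <= 1.0
        return True


# Toy problem (assumed values): two cells, the UAV either stays or moves.
states = ["cell_A", "cell_B"]
actions = ["stay", "move"]
observations = ["victim", "none"]
other = {"cell_A": "cell_B", "cell_B": "cell_A"}

T = {(s, "stay"): {s: 1.0} for s in states}
T.update({(s, "move"): {other[s]: 1.0} for s in states})
R = {(s, a): (1.0 if a == "move" else 0.0) for s in states for a in actions}
detect = {"cell_A": 0.8, "cell_B": 0.1}  # assumed detection probabilities
O = {(s, a): {"victim": detect[s], "none": 1.0 - detect[s]}
     for s in states for a in actions}

toy = POMDP(states, actions, T, R, observations, O, horizon=inf, gamma=0.95)
toy.validate()
```

Dictionaries keyed by (state, action) keep the sketch dependency‐free; a dense array representation would be the more usual choice for larger state spaces.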

The POMDP‐solve program proposed by Cassandra (2015) solves problems through the dynamic programming approach, solving one stage at a time but working backward in time. It solves finite horizon problems with or without discounting. It stops solving if the result is within a tolerable range of the infinite horizon, considering a couple of different stopping conditions (which requires a discount factor less than 1.0). Alternatively, it solves a finite horizon problem for some fixed horizon length.

The POMDP is used in search and rescue operations because the POMDP provides a policy for transitioning through states such that the reward gained is maximized upon execution. The reward gain in a POMDP nicely maps onto transitioning to high‐priority areas as quickly as possible (Murtaza et al. 2013). This heuristic can also be used in different humanitarian activities, such as road damage, for example, where the algorithm can “learn” the exit paths and propose a new route for trucks to deliver supplies.
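The link between discounted reward and visiting high‐priority areas early can be illustrated with a small Python sketch. It assumes the conventional geometric weighting, in which a reward collected at step t is multiplied by γ^t, and it ignores travel constraints between cells; the priority values and γ = 0.9 are hypothetical.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over a sequence of per-step rewards."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


# Hypothetical priorities of three areas (higher = more likely to hold victims).
priorities = [1.0, 5.0, 3.0]
gamma = 0.9  # assumed discount factor

# Visiting the areas in descending versus ascending priority order.
high_first = discounted_return(sorted(priorities, reverse=True), gamma)
low_first = discounted_return(sorted(priorities), gamma)

print(f"high-priority first: {high_first:.2f}")
print(f"low-priority first:  {low_first:.2f}")
```

Under these assumptions, the ordering that reaches the high‐priority area first collects the larger discounted reward, which is the behavior a reward‐maximizing policy is expected to favor.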

Information State

Because information state I_t represents all information available to the agent at the decision time that is relevant for the selection of the optimal action, the information state consists of either a complete historical series of actions and observations or its sufficient statistic (Hauskrecht 1997).

Equation 1 describes the information state:

I_{t} = \tau (I_{t - 1}, o_{t}, a_{t - 1})

(1)

where I_t and I_{t-1} denote new and previous information states, o_t is the observation at time t, a_{t-1} is the previous action, and τ is the information state estimator. The process defined over information states is also called the information‐state MDP. In principle, one can always reduce the original POMDP into the information‐state MDP (Hauskrecht 1997).

Belief State

The quantity often used as a sufficient statistic for planning and control in POMDPs is the belief state (or belief vector), b _t(s). The belief state assigns a probability to every process state and reflects the extent to which states are believed to be present. The belief vector b _t(s) represents the probabilities of the process to be in the state s at time t given the information state, as indicated in equation 2:

b_{t} (s) = P (s | I_{t}^{c})

(2)

where I_{t}^{c} is a complete information vector at time t.

POMDPs as Belief States about MDPs

According to Cassandra (1998), an information state, b, is simply a probability distribution over the set of states, S, where b(s) is the probability of occupying state s. We define B = P(S) to be the space of all probability distributions over S. A single information state can capture the relevant aspects of the entire previous history of the process and, more importantly, can be updated after each state transition to incorporate one additional step into the historical dataset.

The information state estimator τ : B × A × Ω → B defines the next belief state, given the previous belief state (b), the previous action (a), and the resulting observation (o). If observations are always caused by the previous action, then, given the current belief state b, the executed action a, and the resulting observation o, the state estimator can calculate the next belief state b′ from b using Bayes' rule. Equation 3 defines b _a(s′), the probability that the new state is s′ given that action a was executed:

b_{a} (s^{'}) = \sum_{s \in S} T (s^{'} | s, a) b (s)

(3)

Equation 4 describes b _a(o), the probability that the next observation is o given that action a was executed:

b_{a} (o) = \sum_{s^{'} \in S} O (o | s^{'}, a) b_{a} (s^{'})

(4)

The new belief state b′ is composed of the probabilities b′(s′), according to equation 5:

b^{'} (s^{'}) = \frac{O (o | s^{'}, a) b_{a} (s^{'})}{b_{a} (o)} = \frac{O (o | s^{'}, a) \sum_{s \in S} T (s^{'} | s, a) b (s)}{\sum_{s^{''} \in S} [O (o | s^{''}, a) \sum_{s \in S} T (s^{''} | s, a) b (s)]}

(5)
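The belief update in equations 3 to 5 can be sketched in a few lines of Python. This is a minimal illustration, not the solver used in the study: the nested-list layout `T[a][s][s2]` for T(s′ | s, a) and `O[a][s2][o]` for O(o | s′, a), the function name `belief_update`, and the two-state example values are all assumptions made for the sketch.

```python
def belief_update(b, a, o, T, O):
    """Return the updated belief b' and the observation probability b_a(o).

    b: list of state probabilities b(s)
    T: T[a][s][s2] = T(s2 | s, a)   (assumed layout)
    O: O[a][s2][o] = O(o | s2, a)   (assumed layout)
    """
    n = len(b)
    # Equation 3: predicted belief over next states s'.
    b_a = [sum(T[a][s][s2] * b[s] for s in range(n)) for s2 in range(n)]
    # Numerator of equation 5 for every s'.
    unnormalized = [O[a][s2][o] * b_a[s2] for s2 in range(n)]
    # Equation 4: probability of receiving observation o.
    p_o = sum(unnormalized)
    if p_o == 0:
        raise ValueError("observation has zero probability under this belief")
    # Equation 5: normalize to obtain b'(s').
    return [u / p_o for u in unnormalized], p_o


# Hypothetical example: state 0 holds victims, state 1 does not; one action (index 0)
# that leaves the state unchanged; observation 0 = YES, observation 1 = NO.
T = [[[1.0, 0.0], [0.0, 1.0]]]
O = [[[0.9, 0.1], [0.2, 0.8]]]
b_new, p_yes = belief_update([0.5, 0.5], a=0, o=0, T=T, O=O)
print(b_new, p_yes)
```

In this example a YES observation shifts the belief toward the state that is more likely to produce it, which is the mechanism the methodology relies on when observations update the priority map.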

In equations 6 and 7, the function T′ indicates the probability that the system passes from a belief state b to another, b′, after executing an action a:

T^{'} (b^{'} | b, a) = P (b^{'} | b, a) = \sum_{o \in Ω} P (b^{'} | b, a, o) P (o | b, a)

(6)

where

P (b^{'} | b, a, o) = \begin{cases} 1, & \text{if } τ (b, a, o) = b^{'} \\ 0, & \text{otherwise} \end{cases}

(7)

The reward function ρ presented in equation 8 is defined for the belief states, and indicates the expected reward of each action, given the probabilities with which the system can be found in each state:

ρ (b, a) = \sum_{s \in S} b (s) R (s, a)

(8)
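Equation 8 is a belief-weighted average of the state rewards, which the following sketch makes concrete. The layout `R[s][a]` for R(s, a), the function name `belief_reward`, and the numbers are assumptions chosen for illustration.

```python
def belief_reward(b, R, a):
    """Equation 8: expected reward of action a under belief b, with R[s][a] = R(s, a)."""
    return sum(b[s] * R[s][a] for s in range(len(b)))


# Hypothetical example: reward 1 in the state with victims (state 0), 0 otherwise.
R = [[1.0], [0.0]]
rho = belief_reward([0.75, 0.25], R, a=0)
print(rho)
```

When the reward is 1 for finding victims and 0 otherwise, ρ(b, a) reduces to the believed probability of finding victims.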

The solution to the MDP of the continuous state space (B, A, T, ρ) is the solution to the POMDP used to build it.

The next section shows the POMDP‐based methodology.

Policies

Given a tuple (S, A, T, R, Ω, O) specifying a POMDP, it is necessary to know what action an agent should execute at each time‐step to earn as much reward as possible over time. Let us define Π to be the set of all policies π (action strategies) that an agent can execute. Roughly speaking, a policy is a strategy that dictates which action a to execute (at each time‐step) based on information previously gathered. The relevant information available to the agent consists of a belief b ₀ about the initial state of the world and the history (sequence) of actions and observations experienced up to the current time‐step t (hist _t = (a ₀, o ₁, a ₁, o ₂, …, a _t‐1, o _t)). Since the agent may not have complete knowledge of the initial state of the world, we use b ₀ to denote a probability distribution over all possible states that corresponds to the agent's belief about the initial state. Hence, a policy π is a mapping from initial beliefs and histories to actions (Poupart 2005).

A policy for a belief POMDP can be viewed as a policy for an information‐state MDP. The POMDP policy definition above is Markovian regarding the information states but not Markovian regarding the POMDP states as originally described.

Value Function

In a Markov Decision Process (MDP), given a state s ∈ S, an action a ∈ A and a policy π for an MDP, one can define the value of the action a in state s, considering the immediate reward of a and the reward after the decision, if the actions taken after a are determined by the π policy. The function that gives this value is denoted by Q. For the total discounted expected reward, Q is defined as:

Q^{π} (s, a) = R (s, a) + γ \sum_{s^{'} \in S} T (s^{'} | s, a) V^{π} (s^{'})

(9)

In general, the smaller the discount factor and the more lagged a reward after the decision, the smaller the weight of that reward will be.
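Equation 9 can be evaluated directly once T, R, and the value V^π of each successor state are known. The sketch below assumes the same hypothetical layouts as before (`T[a][s][s2]`, `R[s][a]`) and a list `V` holding V^π(s′); the example values are invented.

```python
def q_value(s, a, R, T, V, gamma):
    """Equation 9: immediate reward plus discounted expected value of the next state."""
    expected_future = sum(T[a][s][s2] * V[s2] for s2 in range(len(V)))
    return R[s][a] + gamma * expected_future


# Hypothetical two-state example: action 0 always leads to state 1.
T = [[[0.0, 1.0], [0.0, 1.0]]]
R = [[0.0], [1.0]]
V = [0.0, 10.0]
q_high = q_value(0, 0, R, T, V, gamma=0.95)
q_low = q_value(0, 0, R, T, V, gamma=0.70)
print(q_high, q_low)
```

Comparing the two calls shows the role of γ: with the lower discount factor, the same future value contributes less to Q.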

POMDP‐Based Methodology for Finding Victims

This section presents our proposed innovative methodology for modeling the constrained coverage problem, solving it with the POMDP technique, and testing it through scenario generation. Unlike Waharte and Trigoni (2010) and Murtaza et al. (2013), we have created a step‐by‐step methodology that can be replicated in four macro steps: modeling, scenario generation, solving, and analyzing statistics. The method, detailed in Appendix S1, explains how to calculate the area of the states and the indicators.

The first step, modeling, concerns mapping the affected area to be flown and choosing the type of UAV to be used, which is related to the necessary flight height. In this step, the area to be flown can be viewed as a Markov Chain, where each state is a part of the affected area and has a priority, which represents the probability of having victims. These priorities should be assigned by a specialist who knows the area to be flown and can suggest which states are more likely to have victims.

The scenario generation step considers a hypothetical distribution of victims in the affected area to test the solver in terms of time to find victims, distance traveled, operation's duration and coverage percentage. We have considered three scenarios (best, mixed, and worst) that suppose that the victims are in cells with some population density (which can be seen through the urban areas in the map).

The solving step repeats a policy execution until the UAV covers the entire affected area. In the policy execution, the UAV executes the step‐by‐step tasks as follows:

The UAV asks the solver for an action based on its initial belief and start location.

The UAV takes the action and goes to a new cell.

In this new state, the UAV sends a photo of the new cell's area and the rescue team makes its observation that can be either YES (there are victims in the cell) or NO (there are no victims in the cell).

If the observation is YES, we update the belief map by increasing the priority of the neighboring cells by 1, unless a cell is already in the highest priority class or has priority 0. If the observation is NO, the belief map remains the same. Finally, since the current cell has been observed, the UAV marks its priority as 0.

Once again, unlike Murtaza et al. (2013), the methodology proposed in this study includes a step that prevents the UAV from becoming “isolated” once it has already visited all the neighbor cells:

If all the neighbor cells have priority 1, then the UAV goes, in a straight line, to the highest priority state (not traveled), starting with the closest ones.
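The belief map update and the escape step can be sketched as follows. This is one possible reading of the steps above, not the authors' implementation: the update is applied to the eight neighboring cells (consistent with the later discussion of the discount factor), isolation is taken to mean that every neighbor has already been observed (priority 0 in this sketch), and the names `update_belief_map`, `is_isolated`, `escape_target`, and `MAX_PRIORITY` are hypothetical.

```python
MAX_PRIORITY = 3  # assumed highest priority class


def neighbors(cell, rows, cols):
    """The up to eight cells adjacent to `cell` inside a rows x cols grid."""
    r, c = cell
    return [(r + dr, c + dc)
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0) and 0 <= r + dr < rows and 0 <= c + dc < cols]


def update_belief_map(priority, cell, saw_victims):
    """Raise neighbor priorities after a YES observation, then mark the cell as observed."""
    rows, cols = len(priority), len(priority[0])
    if saw_victims:
        for r, c in neighbors(cell, rows, cols):
            # Observed cells (0) and cells already in the top class are left unchanged.
            if 0 < priority[r][c] < MAX_PRIORITY:
                priority[r][c] += 1
    priority[cell[0]][cell[1]] = 0


def is_isolated(priority, cell):
    """True when every neighbor of `cell` has already been observed (assumed reading)."""
    rows, cols = len(priority), len(priority[0])
    return all(priority[r][c] == 0 for r, c in neighbors(cell, rows, cols))


def escape_target(priority, cell):
    """Closest unobserved cell among those with the highest priority, or None."""
    open_cells = [(r, c) for r, row in enumerate(priority)
                  for c, p in enumerate(row) if p > 0]
    if not open_cells:
        return None
    best = max(priority[r][c] for r, c in open_cells)
    top = [rc for rc in open_cells if priority[rc[0]][rc[1]] == best]
    return min(top, key=lambda rc: (rc[0] - cell[0]) ** 2 + (rc[1] - cell[1]) ** 2)


# Hypothetical 3 x 3 priority map with the UAV observing victims in the center cell.
grid = [[1, 2, 1], [1, 3, 1], [1, 1, 1]]
update_belief_map(grid, (1, 1), saw_victims=True)
print(grid, escape_target(grid, (1, 1)))
```

Keeping the escape rule separate from the solver's action choice means the policy itself is unchanged and the straight-line jump is triggered only when no unobserved neighbor remains.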

The analyzing statistics step consists of calculating four indicators: traveled distance, operation duration, coverage percentage, and time to find groups of victims. The traveled distance and operation duration, innovative contributions of this study, analyze the feasibility of the operation in terms of the drone's endurance and data link range (km), whereas Murtaza et al. (2013) have also used coverage percentage and time to find groups of victims.

We calculate the traveled distance according to the actions undertaken. As the state areas are always squares, if a UAV travels north, south, east, or west, the traveled distance in each iteration is x (the side of the square state), but if the UAV travels north‐east, north‐west, south‐east, or south‐west, the traveled distance is x \sqrt{2} (the diagonal of the square state). We calculate the traveled distance after each round, based on the path traveled. Because the traveled distance is then known, we calculate the operation's duration by dividing the traveled distance by the average speed of the drone.
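The distance and duration rules can be written as a short sketch. The action labels, the function names, the 150 m side, and the 15 m/s speed are illustrative assumptions rather than values prescribed by the method.

```python
import math

DIAGONAL_ACTIONS = {"north-east", "north-west", "south-east", "south-west"}


def traveled_distance(actions, side):
    """Total distance for a sequence of moves between square states of the given side."""
    return sum(side * math.sqrt(2) if a in DIAGONAL_ACTIONS else side for a in actions)


def operation_duration(distance, average_speed):
    """Duration in the time unit implied by the speed (seconds for metres and m/s)."""
    return distance / average_speed


# Hypothetical path: two straight moves and one diagonal move over 150 m states.
distance_m = traveled_distance(["north", "north-east", "east"], side=150.0)
duration_s = operation_duration(distance_m, average_speed=15.0)
print(round(distance_m, 2), round(duration_s, 2))
```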

An analysis of coverage percentage and the time required to find groups of victims is proposed in Murtaza et al. (2013) to demonstrate that the POMDP can achieve 100% coverage and can locate victims very quickly. In this study, the states have the same area; therefore, the coverage percentage by iteration would be a linear curve. Hence, we calculate the coverage percentage based on the total traveled distance instead of the total area: in each iteration, we sum the accumulated traveled distance and divide it by the total traveled distance. In equation 10, d _i is the traveled distance in iteration i and D is the total traveled distance, and the coverage percentage in iteration n is:

\text{coverage}\%_{n} = \frac{\sum_{i = 0}^{n} d_{i}}{D} .

(10)
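Equation 10 is a running sum divided by the total, as the following sketch shows. Multiplying by 100 to express the ratio as a percentage is an assumption of the sketch, as are the function name and the step distances.

```python
def coverage_percentage(step_distances):
    """Coverage after each iteration: accumulated distance over total distance, in percent."""
    total = sum(step_distances)
    accumulated = 0.0
    curve = []
    for d in step_distances:
        accumulated += d
        curve.append(100.0 * accumulated / total)
    return curve


# Hypothetical step distances in metres.
curve = coverage_percentage([100.0, 100.0, 200.0])
print(curve)
```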

We calculate the time required to find groups of victims by dividing the distance traveled to find groups of victims G _i by the average speed of the UAV. For example, if a UAV travels 150 m before finding the first group of victims, and the UAV's average speed is 15 m/s, the time required to find the first group of victims is 10 seconds.

We used these statistics to measure the POMDP performance, but they can also be used to compare the POMDP with the greedy algorithm. According to Roughgarden et al. (2013), greedy algorithms are often used to solve optimization problems, in which the objective is to maximize or minimize a quantity subject to a set of constraints. According to Cormen et al. (2001), a greedy algorithm always makes the choice that appears best at a given moment. That is, the algorithm makes a locally optimal choice in the hope that this choice will lead to a globally optimal solution.
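To make the notion of a locally optimal choice concrete, the sketch below implements one simple greedy rule on a priority grid: move to the unvisited neighbor with the highest priority, preferring the shortest step when priorities tie, and stop when no unvisited neighbor remains. This rule is an assumption chosen for illustration; the text does not spell out the exact greedy baseline, and the function name `greedy_path` is hypothetical.

```python
import math


def greedy_path(priority, start):
    """Follow the locally best unvisited neighbor until none is left."""
    rows, cols = len(priority), len(priority[0])
    visited = {start}
    path = [start]
    cell = start
    while True:
        options = []
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                r, c = cell[0] + dr, cell[1] + dc
                inside = 0 <= r < rows and 0 <= c < cols
                if (dr, dc) != (0, 0) and inside and (r, c) not in visited:
                    # Sort key: highest priority first, then shortest step.
                    options.append((-priority[r][c], math.hypot(dr, dc), (r, c)))
        if not options:
            return path  # no unvisited neighbor: the greedy walk ends here
        cell = min(options)[2]
        visited.add(cell)
        path.append(cell)


# Hypothetical strip of three states; the UAV starts in the middle.
path = greedy_path([[1, 1, 3]], start=(0, 1))
print(path)
```

In this small example the walk moves to the high-priority state and then stops, leaving the low-priority state next to the start unvisited, which illustrates how a purely local rule can end with partial coverage.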

Case's Presentation

This section presents the application of the proposed methodology with three numerical examples: (1) a tornado in Xanxerê, Brazil; (2) a nuclear accident in Fukushima; and (3) a refugee camp in South Sudan. These disasters were chosen due to their diversity of characteristics. According to Tatham et al. (2013), the total number of people affected and the population density will impact the logistic response, and the extent of the destruction captures the geographic coverage of the disaster. The number of affected people in Xanxerê was approximately 10,000 (Canes 2015); in South Sudan, it was 2289 (Reach Resource Centre 2015); and in Fukushima, there were up to 100,000 people affected (Verdú 2016). The population density in Xanxerê is approximately 116 persons/km² (IBGE 2010); in Fukushima, it is 464 persons/km² (Knoema 2017); and in South Sudan, it is 28,292 persons/km² (Reach Resource Centre 2015). In terms of the extent of impact, the affected area in Xanxerê was approximately 377.80 km² (IBGE 2010); in South Sudan, it was 0.08 km² (Reach Resource Centre 2015); and in Fukushima, it was up to 1256 km² (Verdú 2016).

In each of these experiments, the use of UAVs has a different purpose. According to Tsunemi et al. (2015), in the Fukushima experiment, UAVs were used in a wide‐area search, which should be performed immediately after a disaster occurs, with the aim of grasping the overall scale of the disaster, clarifying the scale of the damage, and assessing the general conditions in disaster‐stricken areas. In the Xanxerê and South Sudan experiments, we classify the use of UAVs as a narrow‐area search, with the purpose of gathering detailed information on the disaster site identified in the wide‐area search.

The first case refers to a tornado that hit Xanxerê, in Brazil's southern Santa Catarina province on April 21, 2015. Aerial imagery made with the aid of a drone shows the destruction caused by the tornado. Dozens of houses had roofs torn off by the wind, which may have reached 330 km/h, according to INMET (2015). In the city, two people were killed, and approximately 120 people were taken to hospitals, according to the military police. Approximately 2600 houses were affected, according to the balance sheet of the military police, and approximately 1000 people were left homeless. The central electricity company of Santa Catarina reported that 200,000 consumer units were left without electricity in 20 cities in the region after 11 transmission towers fell or were bent (Trezzi 2015).

According to Torres (2015), the city of Xanxerê was divided into eight regions for the search and rescue operation. In each region, there was a team of firefighters and a group of police and social workers. The tasks were multiple: to find victims among the rubble, to help the homeless, and to ensure that, in the face of chaos, the situation did not lead to home looting. The Ministry of National Integration summoned 200 soldiers to help with the work. The population also helped with reconstruction efforts.

The second case represents the Great East Japan Earthquake, which occurred at 2:46 pm on March 11, 2011, with a recorded magnitude of 9.0. It caused tremendous damage to the northern part of Japan, especially in the prefectures of Fukushima, Miyagi, and Iwate. The earthquake and tsunami triggered the worst nuclear accident since Chernobyl. The Fukushima Daiichi nuclear power station, located on the Pacific Ocean coast, was severely damaged by the earthquake and tsunami. The piping facility in the building and the facilities for the external power supply and backup power were destroyed. Early the next morning, leakage of radioactive materials was found in front of the main gate of the nuclear power plant. Steam filled the building from the core meltdown caused by the failure of the cooling system (Mizushima 2012).

According to Mizushima (2012), the relief efforts for the survivors of the earthquake and tsunami were initiated by the Self‐Defense Forces (SDF) and the local fire brigade. The Defense Minister issued the command to send 50,000 SDF members for the disaster relief work immediately after the disaster occurred, and the number was increased to 100,000 by the Prime Minister the next day. Then, the US military began an operation to rescue survivors. Based on the criteria of the US Nuclear Regulatory Commission (NRC), the activities of the US military forces were carried out in the area more than 80 km away from the Fukushima nuclear power plant. Therefore, they concentrated on Iwate Prefecture and Miyagi Prefecture and did not work at Fukushima directly.

Compared to other disaster‐affected areas, there were fewer NGOs and general volunteers who entered Fukushima from outside the prefecture for relief work because of the fear of radiation. It was difficult to secure the safety of staff members. In terms of the local activities and the general volunteer associations, the existing network of those organizations in Fukushima functioned effectively in the initial stage, for example in the distribution of relief supplies and emergency meal services (Mizushima 2012).

According to Jiji (2015), drones were used in Fukushima to do a survey in the interior of the reactor buildings which were badly damaged. Unlike standard remote‐controlled drones, the UAV used in this operation guided itself using lasers to avoid obstacles. According to Europa Press (2014), UAVs were also used in search and rescue operations to view areas difficult to access.

The third case covers a refugee camp in South Sudan. According to the United Nations High Commissioner for Refugees (UNHCR, 2015), since the outbreak of the conflict in South Sudan in December 2013, the delivery of food and other essential items to refugees has been hampered by heavy rains, increasing insecurity, and logistical constraints. Access to displaced people has been restricted, and refugees faced serious protection concerns. At the same time, humanitarian workers were in a state of heightened risk.

According to Reach Resource Centre (2015), the Bor protection of civilian (PoC) site was established in December 2013 following outbreaks of violence that forced people into the United Nations Mission in South Sudan (UNMISS) base for refuge. The PoC was relocated to the new Bor PoC Site in September 2014 and the site population is 2289 people.

The use of drones in slow‐onset disasters, such as refugee camps, is increasing in countries such as Niger, Burkina Faso, and Uganda, to help map huge populations of refugees, to evaluate their needs, such as water, toilets, educational facilities, and health care, and to determine better ways to help them. The drones are also being used to register the refugees and evaluate damage to the environment caused by the displacement, such as people cutting firewood around an area where two‐thirds of the land is affected by desertification. The use of the drones offered invaluable video information on how to provide assistance and ensure sustainable daily living in an area of scarce natural resources and infrastructure. Aerial views and camp mapping can help improve the ability to respond to short‐ and long‐term needs. They can, for example, monitor the evolution of shelter locations and movements within camps, and also document the evolution of the environmental context and the natural resources available in and around the camps. They can also help prevent and mitigate the risks of natural disasters, according to the head of UNHCR's Dori office (UNHCR, 2015).

The use of UAVs in the three disasters mentioned would not invalidate the efforts mentioned above, but would complement the search and rescue and needs assessment operations, as the drones could see from above the areas more likely to have victims. These areas are identified through destroyed houses and broken buildings, for example. The use of drones could direct the work of the disaster responders to improve the efficiency of the operation.

Results and Discussion

In this section, we present the results and the sensitivity analysis and then we discuss the managerial implications.

Results and Sensitivity Analysis

We applied the proposed method in five rounds of simulations (R1, R2, R3, R4, and R5) in each of the three cases and then compared the results with those of the greedy algorithm. We also conducted a multivariate sensitivity analysis to understand how the priority of each state, the discount factor, and the area of the states (and therefore the number of states) can affect the performance of the methodology proposed in this study.

To analyze the influence of the priority of the states, we considered three scenarios in the Xanxerê experiment (Worst, Best and Mixed). The Worst Scenario supposes that the victims are in cells with low population density, that is, the allocation of victims to specific states was improper. The Best one supposes that the victims are only in cells with high population density and the other cells have the least priority, that is, the allocation of victims was predictable. The Mixed Scenario combines the other two scenarios. In the Xanxerê example, the UAV flew above the most affected area, called Esportes, which has 0.84 km², at an average height of 250 m. Additionally, the other states affected cover 150 m². It considered victims in 13 of the 46 states, and the discount factor was 0.95.

In Fukushima, the flight height takes the values 1500 m, 2000 m, and 2500 m, in a 520.22 km² area, and the states are 5.01 km², 9.00 km², and 13.69 km², respectively. These altitudes are under the limit presented by Tsunemi et al. (2015) for the wide‐area search, which is 10,000 feet or 3048 m (3.05 km). The aim is to understand how the area of the states, and therefore the number of states in the system, can influence the results. We considered victims in 35% of the states, and the discount factor was 0.95.

In South Sudan, we vary the discount factor to understand how this variable behaves. The UAV flew with an average height of 150 m, in a 0.08 km² area, with states that are 90 m². It considered victims in 19 of the 36 states, and the discount factors were 0.70, 0.85, and 0.95.

Finally, we did a multivariate sensitivity analysis for Fukushima's case (due to its large area where design choices may lead to more dramatic outcomes) with two discount factors (0.85 and 0.95), two altitudes (1500 m and 2500 m), and for the three scenarios previously mentioned.

In all the cases, the states have priorities from 1 to 3, according to the probability of having affected people. We programmed the solver to have eight actions (north, south, east, west, north‐east, north‐west, south‐east, south‐west), two observations (y – yes, there is a victim, n – no, there is not a victim) and two rewards (1 for finding victims or 0 for not finding victims). After the iterations, the drone completed one path planning (R1), as shown in Figure 4.

Figure 4

UAV Path Planning Mixed Scenario DF = 0.95 [Color figure can be viewed at wileyonlinelibrary.com]

We present the results of the evaluation metrics for the POMDP and the greedy algorithms in Table 5.

Table 5

POMDP × Greedy

	Xanxerê		South Sudan		Fukushima
Metrics:	POMDP	Greedy	POMDP	Greedy	POMDP	Greedy
Coverage percentage (%)	100	52	100	33	100	77
Traveled distance (km)	8.83	9.13	4.30	4.30	383.90	485.95
Operation duration (minute)	9.81	10.14	4.78	4.77	104.70	132.53
Time to find victims (minute)	6	8	3	4	62	95

For the first round of the greedy algorithm, in the Xanxerê example, the coverage percentage reached 52% of the total area, indicating that the UAV traveled through only 24 of the 46 states. Once a UAV misses a cell near the start location to move toward high‐priority areas, it is very difficult for it to return to cover it later. We should also mention that missing 48% of the total area can have a huge impact on saving lives. Xanxerê has 48,000 inhabitants, and its neighborhood Esportes, the most affected area, where the UAV flew, has approximately 2600 inhabitants (IBGE 2010). If the greedy algorithm did not cover almost half of the affected area, 1250 people would not be rescued (if the population was distributed uniformly around the neighborhood).

Forcing the greedy algorithm to travel over the entire area, in the Xanxerê example, the results showed that the average distance traveled under the greedy algorithm was 0.3 km longer than that traveled under the POMDP (for the mixed scenario), the average operation duration was 0.33 minutes longer than that under the POMDP, and the average time to find groups of victims was 2 minutes longer than that under the POMDP. The most significant difference was the time to find groups of victims: the POMDP was 20% faster than the greedy algorithm because the POMDP is biased to save lives, updating its belief at each iteration through observations, whereas the greedy algorithm focuses on minimizing the distance traveled.

In Xanxerê, there are records that the search and rescue operation in the entire affected area continued for more than 3 days (Folha, 2015). According to Xu et al. (2014), the rescue team can benefit from the fast information availability of the UAVs to plan the search operation, considering local evacuation, identification of areas blocked by debris, and detection of secondary disasters. Because the area of Bor PoC is ten times smaller than that of Xanxerê, the difference between the POMDP and the greedy algorithm was not significant in terms of traveled distance or operation duration. The average time to find groups of victims with the greedy algorithm was 1 minute longer than that with the POMDP (with 0.95 discount factor), indicating that the POMDP performed 25% better. The larger the area is, the greater the difference between the performances of these statistics becomes.

In the Fukushima example, in the five rounds of the greedy algorithm, the coverage percentage did not reach 100% of the total area, indicating that the UAV traveled through only a fraction of the total affected area. In the worst case, the greedy algorithm reached only 25% of the total area. As mentioned in the Xanxerê example, once a UAV moves to a cell whose neighbors have already been visited, it becomes isolated, and the greedy algorithm has no mechanism to handle this situation. It should also be noted that missing 75% of the total area can have a huge impact on saving lives. Fukushima's nuclear accident affected more than 100,000 people (Verdú 2016), and 75,000 of them would not be rescued with the greedy algorithm (if the population was distributed uniformly across the affected area).

Forcing the greedy algorithm to travel over the entire area, in the Fukushima example, the results showed that the average distance traveled under the greedy algorithm was 102.05 km longer than that traveled under the POMDP (with 1500 m of altitude), the average operation duration was 27.83 minutes longer than that under the POMDP and the average time to find groups of victims was 33 minutes longer than that under the POMDP. Therefore, we reinforce that the most significant difference was the time to find groups of victims, and the POMDP was 53% faster than the greedy algorithm.
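The relative gain in time to find victims depends on which algorithm is taken as the baseline. The sketch below computes both conventions from the rounded entries of Table 5; because those entries are rounded, the results may differ from percentages derived from unrounded data. The function name `relative_gain` and its `baseline` parameter are hypothetical.

```python
# Time to find victims in minutes, copied from Table 5 as (POMDP, greedy).
TIME_TO_FIND = {
    "Xanxerê": (6, 8),
    "South Sudan": (3, 4),
    "Fukushima": (62, 95),
}


def relative_gain(pomdp, greedy, baseline="greedy"):
    """Minutes saved by the POMDP, as a percentage of the chosen baseline time."""
    saved = greedy - pomdp
    reference = greedy if baseline == "greedy" else pomdp
    return 100.0 * saved / reference


for case, (pomdp, greedy) in TIME_TO_FIND.items():
    print(case,
          round(relative_gain(pomdp, greedy, "greedy"), 1),
          round(relative_gain(pomdp, greedy, "pomdp"), 1))
```

Stating the baseline alongside each percentage avoids ambiguity when the cases are compared with one another.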

Because the area of Fukushima is more than 600 times larger than Xanxerê, the difference between the POMDP and the greedy algorithm is significant in terms of average time to find groups of victims. The bigger the area is and the more states it has, the greater is the opportunity to show the POMDP‐based algorithm's performance because of the granularity of the Markov chain.

In the sensitivity analysis, we can see how the allocation of the priorities behaves over the scenarios in the Xanxerê case, how the discount factor behaves when decreased in the South Sudan case, and how the size and number of states behave with the flight height variations in the Fukushima case.

Table 6 and Figures 5–7 represent the average of the indicators in five rounds of the proposed cases.

Table 6

Traveled Distance (km) and Duration (minute)

	Scenarios	Traveled distance (km)	Duration (minute)
Xanxerê priorities	Worst	10.0	11.1
	Mixed	8.8	9.8
	Best	9.7	10.8
South Sudan discount factor	0.70	4.5	5.00
	0.85	4.5	5.00
	0.95	4.3	4.8
Fukushima flight height	1500 m	383.90	104.70
	2000 m	263.18	71.78
	2500 m	186.68	50.91

Figure 5

Xanxerê's Case

Figure 6

South Sudan's Case

Figure 7

Fukushima's Case

The main impact of setting priorities involves the time to find groups of victims, which decreases as the allocation of priorities improves. If we allocate higher priorities to states that do not contain victims, the algorithm directs the drone to those states, and the time to find victims increases. If we allocate the highest priorities to the states containing the victims, the algorithm directs the drone to those states and finds the victims immediately, which is unlikely to happen in practice. It is therefore important that a person who knows the region helps to prioritize, so that the time needed to find victims is reasonable, that is, neither overly optimistic nor overly pessimistic given the actual conditions of a rescue operation.

The discount factor gives a weight to the reward after the decision. In this study, because the observation in state s, and thus its reward, can influence the priority of the neighbor cells, we recommend using high discount factors so that the reward after the decision is considered (Murtaza et al. 2013). This pattern is demonstrated in the sensitivity analysis presented in Table 6 and Figures 5, 6, and 7. The comparison between the discount factors of 0.70 and 0.85 shows no significant variation in the statistics, but, according to Cassandra (1998), when we increase the discount factor to 0.95, the traveled distance and the operation's duration decrease by approximately 5%. This finding can also be explained by the total discounted expected reward function in equation 9. It is important to mention that, to the best of our knowledge, no study in the literature considers low discount factors, such as 0.70. The purpose of considering these values is to analyze how the proposed statistics vary over the different discount factors.

Table 6 and Figures 5, 6, and 7 show that, for the same area, the larger the number of states and, therefore, the smaller the area of the states, the greater the distance traveled, the time of flight, and the time to find victims. This occurs because the larger the number of states, the greater the granularity of the area traveled, which makes the drone route more detailed.

It is important to note that the traveled distance and the operation's duration do not depend only on the established priorities; these metrics also vary according to the route traveled. For example, in the best case, the drone can go through the states along a more sinuous path and find the victims quickly, whereas, in the worst case, the UAV may have a less winding route and, in turn, take more time to find the victims.

These results can also be examined in the multivariate analysis presented in Table 7 for the Fukushima case, where we can see that the higher the discount factor, the shorter the time to find victims, because observations in neighboring states strengthen the belief about whether a state contains victims. We also noticed that the higher the flight altitude, the shorter the time to find victims, because the drone's route becomes less granular. It is worth mentioning that in this case the states have a larger area; thus, it is important to have a rescue team working with the search team so that the operation is effective and the victims can be not only found but also rescued as quickly as possible. Finally, we see that the time to find victims decreases in the Best and Mixed scenarios because the allocation becomes more realistic, which reinforces the importance of the priorities being defined by someone who knows the region.

Table 7

Multivariate Analysis: Time to Find Groups of Victims (minute) – Fukushima's Case

	Worst			Mixed			Best
	POMDP		Greedy	POMDP		Greedy	POMDP		Greedy
	0.85	0.95	–	0.85	0.95	–	0.85	0.95	–
1500 m	103	104	102	86	62	95	33	30	40
2500 m	45	45	51	39	17	45	27	17	30

Table 7 also presents the comparison between the POMDP and the greedy algorithm. The greedy algorithm takes longer to find victims in the Mixed and Best scenarios, whereas in the Worst scenario the results are very close.

Discussion

To provide more realism in disaster research and to make it acceptable to practitioners (Gupta et al. 2016, Pedraza‐Martinez and Van Wassenhove 2016), we conducted interviews with humanitarian workers, guided by an online questionnaire. We shared the results above with seven (7) interviewees, who gave their input on the subject. They are from different countries, have an average of 13 years of experience, and are engaged in organizations such as civil defense, NGOs, the Army, and academia, which represent different stakeholders in humanitarian relief. The academic professionals who contributed their feedback have experience with drone flights and/or have published on the subject.

The interviewees presented some reasons why only a few drone routing algorithms biased toward saving victims are available. According to them, drones were initially used in wars. In the military context, armed UAVs pose ethical issues not only with respect to their use in armed conflict but also concerning the prevention of war. To prevent dangers to arms control, international humanitarian law, military stability, and society, armed UAVs should be limited (Altmann 2013). The interviewees also presented reasons why drones are useful for improving the efficiency of search and rescue operations: in areas with difficult access, such as those affected by flooding and landslides, drones can monitor the conditions of rivers, forests, and water sources; estimate the number of people injured and identify these people (gender, nationality, age); deliver medical supplies and small devices to remote areas; bring items from a remote ground base; provide surveillance in disaster areas; serve as a point of communication; and reach the point of interest quickly while avoiding sending people to dangerous places to do the same job.

Murtaza et al. (2013) compare the POMDP algorithm with the greedy algorithm. The results in each case, which were measured in terms of percent coverage and time to find groups of victims, showed that the POMDP covered 100% of the affected area, whereas the greedy algorithm did not. The time to find groups of victims was only a few seconds shorter with the POMDP, but the theoretical application area was 225 m². They report no percent improvement in performance because the study simply provides comparative graphics without numerical values. We added two more metrics, the traveled distance and the operation's duration, and shared the performance measures with the interviewees, who rated the indicators as good or very good.
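The four indicators discussed above (percent coverage, traveled distance, operation duration, and time to find groups of victims) can be computed from a logged flight path. The following Python sketch is illustrative only: the grid‐cell path format, the function name `evaluate_path`, the constant‐speed assumption, and the example values are assumptions introduced here, not part of the study's implementation.

```python
from math import hypot


def evaluate_path(path, all_cells, victim_cells, cell_size_m, speed_m_s):
    """Compute the four indicators for one flight path.

    path: ordered list of (row, col) grid cells visited by the UAV
          (hypothetical format; the UAV is assumed to fly at constant speed).
    """
    if not path:
        raise ValueError("path must contain at least one cell")

    distance_m = 0.0
    first_visit_s = {path[0]: 0.0}  # time at which each cell is first reached
    for prev, cur in zip(path, path[1:]):
        # Straight-line distance between consecutive cell centres.
        distance_m += hypot(cur[0] - prev[0], cur[1] - prev[1]) * cell_size_m
        first_visit_s.setdefault(cur, distance_m / speed_m_s)

    covered = set(path) & set(all_cells)
    return {
        "coverage_pct": 100.0 * len(covered) / len(all_cells),
        "distance_m": distance_m,
        "duration_s": distance_m / speed_m_s,
        # None means the cell holding that group was never visited.
        "time_to_find_s": {v: first_visit_s.get(v) for v in victim_cells},
    }


# Hypothetical 2 x 2 area with 100 m cells, flown at 10 m/s.
grid = [(r, c) for r in range(2) for c in range(2)]
result = evaluate_path(
    path=[(0, 0), (0, 1), (1, 1), (1, 0)],
    all_cells=grid,
    victim_cells=[(1, 1)],
    cell_size_m=100.0,
    speed_m_s=10.0,
)
print(result)
```

Applying the same function to the paths produced by two planners would give a like‐for‐like comparison of the kind reported in the tables, under the simplifying assumptions stated above.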

After applying the methodology proposed in this study, the statistics showed 100% coverage in the three case studies, indicating that we successfully implemented the algorithm in a rapid‐onset disaster and in a slow‐onset disaster. In the Mixed Scenario with a 0.95 discount factor, the drone traversed the entire affected area in less than 10 minutes in the Xanxerê example, in less than 5 minutes in the Bor PoC example, and in less than 2 hours in the Fukushima example.

In the Xanxerê and South Sudan cases, the operation's duration is mechanically feasible for micro UAVs, and the data link range is less than 2 km, according to Cai et al. (2014). In the Fukushima case, the UAV operates for less than 2 hours and the range is less than 20 km, which is possible for a miniature UAV (Cai et al. 2014). When we shared these findings with the interviewees, they mentioned the importance of choosing the type of drone in terms of autonomy and type of operation. They recommended using fixed‐wing, rather than rotary‐wing, drones in the Fukushima example because they have longer flight times and cover longer distances.

In fact, Meier (2014) affirms that very small and lightweight UAVs will soon be used in disaster response missions for micro‐transportation applications. Because the POMDP algorithm identifies areas containing victims, it could support a micro‐transport system that delivers emergency materials, such as medicine and supplies, to affected people. Some interviewees mentioned this application in the questionnaire. They agreed with the use of drones in slow‐onset disasters, such as a refugee crisis, and mentioned other applications: facilitating access to a crisis area, capturing images and sending them to control stations for analysis by specialists (who can then identify the needs and provide the population with appropriate help), providing communication, and identifying the number of people involved.

The ethical and legal issues regarding the use of drones appeared as a concern in two of the interviews. These issues vary from country to country, and it is necessary to understand their implications for the proposed methodology. In Brazil, rules regarding the use of drones were created in May 2017 and restrict the distance to people to a minimum of 30 m, which is related to the flight height (ANAC 2017). In Japan, a law was created in December 2015 and requires permission from the Minister of Land, Infrastructure, and Transportation to fly drones over residential areas or areas surrounding an airport, which can delay search and rescue operations if a disaster occurs in these areas (Current Law 2017). Finally, in South Sudan, the drone laws are constantly changing. Flights should meet criteria such as not flying drones over people or crowds and not flying drones near military installations, power plants, or any other area that could cause concern among local authorities (UAV Systems 2016). On the other hand, according to OCHA (2014), the use of UAVs by peacekeepers and other military partners is likely to increase. There are discussions about adding UAV capacity to other UN missions, including those in South Sudan. Humanitarian organizations that take a principled stand to reject military UAV capacities may even be criticized for not taking advantage of “life‐saving technology.” As some humanitarian workers have argued, “soon human rights groups will actually be demanding that drones be included as a staple ingredient in peacekeeping operations. Opting not to use drones could indeed someday be considered a breach of international humanitarian law as a failure to take all measures to protect civilians and document violations.”

In addition, two of the interviewees have witnessed or are aware of a search and rescue operation in which the local population showed some type of negative reaction to the use of drones, such as a fear of bombing. We should consider this type of information when deciding which region to fly over, particularly in regions where the population lives in a state of war. According to Galindo and Batta (2013), the social behavior of these populations differs from that found in developed countries, which raises an additional consideration: how ethical and socioeconomic contexts may affect the response of the population to evacuation warnings. According to Caunhye et al. (2012), human behavior can make an optimal plan hard to implement correctly, so models need to make explicit assumptions about human behavior in post‐disaster environments.

From the social acceptance perspective, it is extremely important to address privacy concerns appropriately. Public concerns about insufficient safeguards to ensure that UAVs are not used to spy on citizens and unduly infringe upon their fundamental privacy need to be thoughtfully addressed before allowing UAVs to fly in the national airspace. The guiding principles for Federal Aviation Administration (FAA) policies include mainly the safety of people in the air and on the ground (Namaduri et al. 2013).

Another challenge for practical applications is access to airspace. According to Namaduri et al. (2013), after Hurricane Katrina, the Joint Terminal Air Controller deployed the Evolution Tactical UAVs (used in the military and government context). The attempts to use these UAVs were restricted by Federal Aviation Administration (FAA) regulations on accessing airspace. The workaround was to attach small UAVs to the bottom of a helicopter. In response to the growing demand for civilian use of UAVs, the FAA has been rigorously pursuing policies for safe and secure use of UAVs in the national airspace.

The interviewees also mentioned that drones are a new technology that needs improvement: there are problems with localization by sight, and low‐cost drones have limited base distance and battery life. Although these issues need to be addressed, two of the interviewees consider the use of drones in search and rescue operations cheaper than the current methods used to measure the number of victims, and three consider it much cheaper. Given that the cost of building and operating UAVs is decreasing while their operational capabilities are increasing, it would seem likely (if not inevitable) that UAVs would perform useful and cost‐effective functions within the overall post‐disaster needs assessment process and, thereby, assist in the mitigation of risk in the response to such disasters (Tatham 2009).

This increasing use of UAVs for humanitarian purposes explains why the United Nations (UN) recently published an official policy brief on the topic. A number of UN groups, such as the Office for the Coordination of Humanitarian Affairs (OCHA), are actively exploring the use of UAVs for disaster response. These organizations have also joined the Humanitarian UAV Network (UAViators 2014) to promote the safe and responsible use of UAVs in humanitarian settings (Meier 2014).

Conclusions

This study contributes to the UAV path‐planning problem by providing a POMDP‐based method for finding victims in disaster‐affected areas. From our literature review, we concluded that there is a need for research on the theoretical use of UAVs that validates its methods and algorithms in real disaster scenarios and applies a sensitivity analysis to its results.

After applying the POMDP‐based methodology and conducting five rounds for each example in a mixed scenario, we observed that the proposed solution achieves 100% coverage while optimizing the time to find victims. We also verified that the number of states is crucial for determining the UAV's traveled distance and operation duration, which should be realistic and mechanically viable. Another important conclusion is that the larger the area and the more states it has, the greater the opportunity for the POMDP to show its efficiency compared with the greedy algorithm, as shown in the Fukushima example. In all three illustrative disasters, regardless of the area size, the greedy algorithm failed to cover the whole affected area in at least one simulation round. This coverage can be crucial for humanitarian operations, as the victims are distributed throughout the affected area and need help equally.

From the sensitivity analysis, the findings of the best/mixed/worst scenarios suggest the need for specialists who know the area well to set the states' priorities, so that the algorithm can first direct the UAV to areas containing victims and can be successfully implemented to save lives as soon as possible. Regarding the discount factor, the sensitivity analysis suggests that this parameter should be high, so that the rewards obtained after each decision are weighted appropriately. Finally, from the variation in the number of states and, therefore, in the size of states, the findings suggest that the time to find groups of victims increases with the number/size of states because of the level of detail in the path planning. Future research may measure the methodology's performance through indicators suggested by the interviewees, such as energy consumption, number of households mapped, location covered, and payload. Practical application can also be pursued in future research because UAVs are now programmable; thus, the proposed algorithm can be implemented in a real UAV. The area to be flown over in practical applications should meet the legal restrictions of the region; thus, the use of private terrain or a military area is recommended.
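The recommendation of a high discount factor can be illustrated with the standard discounted return, in which a reward received t decisions later is weighted by the discount factor raised to the power t. The Python sketch below uses a hypothetical reward stream (no reward for nine decisions, then a reward of 10 for finding a group of victims); the function name and the reward values are assumptions for illustration and do not come from the study's experiments.

```python
def discounted_return(rewards, gamma):
    """Standard discounted return: sum over t of gamma**t * rewards[t]."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


# Hypothetical stream: nothing found for nine decisions, then a group of victims.
rewards = [0.0] * 9 + [10.0]

for gamma in (0.5, 0.75, 0.95):
    # The later reward is scaled by gamma**9 in each case.
    print(f"gamma={gamma}: return={discounted_return(rewards, gamma):.3f}")
```

With a low discount factor, the later reward contributes almost nothing to the return, so a planner has little incentive to head toward distant high‐priority areas. With 0.95, the value used in the Mixed Scenario, the same reward retains more than half of its weight.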

Even though research in this area is growing, some challenges in the use of drones still need to be overcome. The use of UAVs in humanitarian relief can be effective, since it has many applications and is feasible in terms of cost. On the other hand, the military context poses an ethical issue regarding the use of armed UAVs, social acceptance depends on addressing privacy concerns, and access to airspace is restricted and requires regulation. When all these issues are addressed, the use of drones will certainly be much more secure, organized, and useful for humanitarian logistics.

Acknowledgments

The authors acknowledge the support of the National Council for Scientific and Technological Development (CNPq) [311723/2013‐6; 304843/2016‐4]; the Coordination for the Improvement of Higher Education Personnel (CAPES) [88887091739/2014‐01]; and the Foundation for Support of Research in the State of Rio de Janeiro (FAPERJ) [202.806/2015; 203.178/2016]. They are also thankful to the editorial team whose insightful remarks greatly improved the clarity of this manuscript.

ORCID

Adriana Leiras

Fernando Luiz Cyrino Oliveira

References

Almurib, H. A. F.; Nathan, P. T.; Kumar, T. N. 2011. Control and path planning of quadrotor aerial vehicles for search and rescue. Proceedings of the SICE Annual Conference, pp. 700–705.

Altmann. 2013. Arms control for armed uninhabited vehicles: An ethical issue. Ethics Inf. Technol. 15: 137–152.

ANAC. 2017. Regras da ANAC para uso de drones entram em vigor. Available at http://www.anac.gov.br/noticias/2017/regras-da-anac-para-uso-de-drones-entram-em-vigor (accessed date October 15, 2017).

Arii. 2013. Rapid assessment in disasters. Japan Medical Association Journal 56(1): 19–24.

Baker, C. A. B.; Ramchurn; Teacy, W. T. L.; Jennings, N. R. 2016. Planning search and rescue missions for UAV teams. Front. Artif. Intell. Appl. 285: 1777–1782.

Bendea; Boccardo; Dequal; Tonolo, F. G.; Marenchino; Piras. 2008. Low cost UAV for post‐disaster assessment. Int. Arch. Photogrammetry, Remote Sensing Spatial Inf. Sci. 37: 1373–1379.

Cai; Dias; Seneviratne. 2014. A survey of small‐scale unmanned aerial vehicles: recent advances and future development trends. Unmanned Syst. 2: 1–25.

Canes. 2015. Mais de 10 mil pessoas foram atingidas por tornado em Xanxerê. Available at http://agenciabrasil.ebc.com.br/geral/noticia/2015-04/mais-de-10-mil-pessoas-foram-atingidas-por-tornado-em-xanxere (accessed date January 20, 2017).

Cassandra, A. R. 1998. Exact and Approximate Algorithms for Partially Observable Markov Decision Processes. Doctorate Thesis – Department of Computer Science, Brown University.

Cassandra, A. R. 2015. The POMDP Page. Available at pomdp.org/code (accessed date July 27, 2015).

Caunhye, A. M.; Nie; Pokharel. 2012. Optimization models in emergency logistics: A literature review. Soc. Econ. Plann. Sci. 46(1): 4–13.

Chen; Miller‐Hooks. 2012. Optimal team deployment in urban search and rescue. Transp. Res. Part B 46: 984–999.

Cheng, C.‐T.; Fallahi; Leung; Tse. 2009. Cooperative path planner for UAVs using ACO algorithm with Gaussian distribution functions. Circuits and Systems, 2009. ISCAS 2009. IEEE International Symposium, pp. 173–176.

Cormen, T. H.; Leiserson, C. E.; Rivest, R. L.; Stein. 2001. Introduction to algorithms. MIT Press & McGraw‐Hill 2: 317.

Current Law. 2017. The current law on drones in Japan. Available at http://dronelawjapan.com/ (accessed date October 15, 2017).

Europa Press. 2014. Available at http://www.europapress.es/ciencia/laboratorio/noticia-naturaleza-inspira-proxima-generacion-drones-20140523161454.html (accessed date January 11, 2017).

Folha. 2015. Available at http://www1.folha.uol.com.br/cotidiano/2015/04/1620152-avo-salva-quatro-criancas-de-trailer-atingido-por-tornado-em-sc.shtml (accessed date January 11, 2017).

Galindo; Batta. 2013. Review of recent developments in OR/MS research in disaster operations management. Eur. J. Oper. Res. 230(2): 201–211.

Gupta; Starr, M. K.; Farahani, R. Z.; Matinrad. 2016. Disaster management from a POM perspective: mapping a new domain. Prod. Oper. Manag. 25(10): 1611–1637.

Hauskrecht. 1997. Planning and control in stochastic domains with imperfect information. Doctorate Thesis – EECS, Massachusetts Institute of Technology.

IBGE. 2010. Available at http://cidades.ibge.gov.br/xtras/perfil.php?lang=&codmun=421950&search=santa-catarina|xanxere (accessed date January 20, 2017).

INMET. 2015. National Institute of Meteorology. Available at http://www.inmet.gov.br/portal/ (accessed date May 5, 2015).

Jiji. 2015. Drone being developed to fly autonomously inside Fukushima reactor buildings. Available at http://www.japantimes.co.jp/news/2015/06/11/national/science-health/drone-developed-fly-autonomously-inside-fukushima-reactor-buildings/#.WHa-cPkrLIW (accessed date January 11, 2017).

Keller. 2015. Available at http://www.militaryaerospace.com/articles/2014/03/uav-spending-2015.html (accessed date May 24, 2016).

Kim; Pant; Yamashita. 2015. CUPUM 2015 ‐ 14th International Conference on Computers in Urban Planning and Urban Management.

King. 2015. Nepal earthquake relief and the urgent boost from drones. Available at http://www.forbes.com/sites/leoking/2015/04/30/nepal-earthquake-drones-relief-aid/#2994ed4d518b (accessed date January 20, 2017).

Knoema. 2017. Available at https://knoema.com/atlas/Japan/Fukushima/Population-density (accessed date January 20, 2017).

Lin; Goodrich, M. A. 2009. UAV intelligent path planning for wilderness search and rescue. International Conference on Intelligent Robots and Systems, pp. 709–741.

Lovell; Le Masson. 2015. Number of people affected by disasters. Overseas Development Institute. Available at https://www.odi.org/sites/odi.org.uk/files/odi-assets/publications-opinion-files/9475.pdf (accessed date January 20, 2017).

Luis; Dolinskaya, I. S.; Smilowitz, K. R. 2012. Disaster relief routing: Integrating research and practice. Soc. Econ. Plan. Sci. 46(1): 88–97.

Meier. 2014. Humanitarian in the sky: drones for disaster response. Available at http://www.virgin.com/unite/businessinnovation/humanitarian-in-the-sky-drones-for-disaster-response#%2EVChevptyFsY%2Etwitter (accessed date October 10, 2014).

Mendonca; Marques, M. M.; Marques; Lourenço; Pinto; Santana; Coito; Lobo; Barata. 2016. A cooperative multi‐robot team for the surveillance of shipwreck survivors at sea. OCEANS 2016 MTS/IEEE Monterey, OCE 2016, art. no. 7761074.

Meng, B.‐B.; Gao; Wang. 2009. Multi‐mission path re‐planning for multiple unmanned aerial vehicles based on unexpected events. 2009 International Conference on Intelligent Human‐Machine Systems and Cybernetics, IHMSC 2009, art. no. 5336128. pp. 423–426.

Merino; Caballero; Martínez‐De‐Dios, J. R.; Maza; Ollero. 2012. An unmanned aircraft system for automatic forest fire monitoring and measurement. J. Intell. Robot. Syst. 65(1–4): 533–548.

Mizushima. 2012. The Japan‐US “military” response to the earthquake, and the strengthening of the military alliance as a result. Available at http://fukushimaontheglobe.com/the-earthquake-and-the-nuclear-accident/whats-happened/the-japan-us-military-response (accessed date January 11, 2017).

Mondal; Chakraborty; Roy; Saha; Bhattacharya; Saha. 2016. Smart navigation and dynamic path planning of a micro‐jet in a post disaster scenario. Proceedings of the 2nd ACM SIGSPATIAL International Workshop on the Use of GIS in Emergency Management, EM‐GIS 2016, art. no. 14.

Mongeon; Paul‐Hus. 2016. The journal coverage of Web of Science and Scopus: a comparative analysis. Scientometrics 106: 213–228.

Murtaza; Kanhere; Jha. 2013. Priority‐based coverage path planning for Aerial Wireless Sensor Networks. 2013 IEEE Eighth International Conference on Intelligent Sensors, Sensor Networks and Information Processing, pp. 219–224.

Namaduri; Wan; Gomathisankaran. 2013. Mobile ad hoc networks in the sky: State of the art, opportunities, and challenges. 2nd ACM MobiHoc Workshop on Airborne Networks and Communications, ANC 2013. Bangalore, India.

OCHA. 2014. Unmanned Aerial Vehicles in Humanitarian Response. Available at https://docs.unocha.org/sites/dms/Documents/Unmanned%20Aerial%20Vehicles%20in%20Humanitarian%20Response%20OCHA%20July%202014.pdf (accessed date October 15, 2017).

Panaque‐Gálvez; McCall, M. K.; Napoletano, B. M.; Wich, S. A.; Koh, L. P. 2014. Small drones for community‐based forest monitoring: An assessment of their feasibility and potential in tropical areas. Forests 5: 1481–1507.

Pedraza‐Martinez, A. J.; Van Wassenhove, L. N. 2016. Empirically grounded research in humanitarian operations management: the way forward. J. Oper. Manag. 45: 1–10.

Poupart. 2005. Exploiting Structure to Efficiently Solve Large Scale Partially Observable Markov Decision Processes. Doctorate Thesis – University of Toronto.

Reach Resource Centre. 2015. Available at http://www.reachresourcecentre.info (accessed date October 15, 2015).

Roughgarden; Sharp; Wexler. 2013. Guide to Greedy Algorithms. Available at http://web.stanford.edu/class/archive/cs/cs161/cs161.1138/handouts/120%20Guide%20to%20Greedy%20Algorithms.pdf (accessed date December 22, 2015).

Salisbury; Stein; Ramchurn. 2016. CrowdAR: A live video annotation tool for rapid mapping. Procedia Eng. 159: 89–93.

Simões, P. R. 2016. O uso de drones em desastres ambientais. Available at http://blog.droneng.com.br/o-uso-de-drones-em-desastres-ambientais/ (accessed date January 20, 2017).

Tarchi; Guglieri; Vespe; Gioia; Sermi; Kyovtorov. 2017. Search and Rescue: Surveillance support from RPAs radar. 2017 European Navigation Conference, ENC 2017, art. no. 7954216. pp. 256–264.

Tatham. 2009. An investigation into the suitability of the use of unmanned aerial vehicle systems (UAVS) to support the initial needs assessment process in rapid onset humanitarian disasters. Int. J. Risk Assess. Manag. 13(1): 60–78.

Tatham; L'Hermitte; Spens; Kovacs. 2013. Humanitarian logistics: development of an improved disaster classification framework. In ANZAM Operations, Supply Chain and Services Management Symposium.

Thomé, A. M. T.; Scavarda, L. F.; Scavarda, A. J. 2016a. Conducting systematic literature review in operations management. Prod. Plann. Control 27(5): 408–420.

Thomé, A. M. T.; Scavarda; Ceryno, P. S.; Remmen. 2016b. Sustainable new product development: A longitudinal review. Clean Techn. Environ. Policy 18: 2195–2208.

Thorpe; Holt; Macpherson; Pittaway. 2005. Using knowledge within small and medium‐sized firms: A systematic review of the evidence. Int. J. Manag. Rev. 7(4): 257–281.

Torres. 2015. Available at http://brasil.elpais.com/brasil/2015/04/22/politica/1429733424_709936.html (accessed date January 11, 2017).

Trezzi. 2015. Available at http://zh.clicrbs.com.br/rs/noticias/noticia/2015/04/sul-e-sudeste-do-brasil-formam-segundo-maiorcorredor-de-tornados-no-mundo-4744532.html (accessed date May 15, 2015).

Tsunemi; Ishii; Murata. 2015. Imaging solutions for Search & Rescue Operations. NEC Tech. J. 9(1): 90–93.

Turk. 2014. Drones Mapped the Philippines to Improve Typhoon Aid Efforts. Available at http://motherboard.vice.com/read/drones-mapped-the-philippines-to-improve-typhoon-aid-efforts (accessed date January 20, 2017).

UAV Systems. 2016. Flying Drones in South Sudan. Available at http://www.uavsystemsinternational.com/drone-laws-by-country/south-sudan-drone-laws/ (accessed date October 15, 2017).

UAViators. 2014. Available at http://uaviators.org (accessed date October 12, 2014).

UNHCR. 2015. Available at http://www.unhcr.org/ (accessed date October 12, 2015).

Van Wassenhove, L. N. 2006. Humanitarian aid logistics: Supply chain management in high gear. J. Oper. Res. Soc. 57(5): 475–489.

Verdú. 2016. Available at http://brasil.elpais.com/brasil/2016/04/30/eps/1462052785_347240.html (accessed date January 12, 2017).

Waharte; Trigoni. 2010. Supporting search and rescue operations with UAVs. International Conference on Emerging Security Technologies, pp. 142–147.

Wang; Zhang; Zheng. 2017. A hyper‐heuristic method for UAV search planning. 8th International Conference on Swarm Intelligence, ICSI 2017, 10386. pp. 454–464.

Ramchurn, S. D.; Chen. 2016. Coordinating human‐UAV teams in disaster response. IJCAI International Joint Conference on Artificial Intelligence. pp. 524–530.

Xiaowei; Xiaoguang. 2016. Multi‐UAVs cooperative control in communication relay. ICSPCC 2016 ‐ IEEE International Conference on Signal Processing, Communications and Computing, Conference Proceedings, art. no. 7753600.

Yang; Peng; Jiang; Zheng; Gao; Liu; Tian. 2014. Development of an UAS for post‐earthquake disaster surveying and its application in Ms7.0 Lushan Earthquake, Sichuan, China. Comput. Geosci. 68: 22–30.

Yuan; Wang. 2009. Path selection model and algorithm for emergency logistics management. Comput. Ind. Eng. 56(3): 1081–1094.