Abstract
Contact-based decision and planning methods are becoming increasingly important to endow higher levels of autonomy for legged robots. Formal synthesis methods derived from symbolic systems have great potential for reasoning about high-level locomotion decisions and achieving complex maneuvering behaviors with correctness guarantees. This study takes a first step toward formally devising an architecture composed of task planning and control of whole-body dynamic locomotion behaviors in constrained and dynamically changing environments. At the high level, we formulate a two-player temporal logic game between the multi-limb locomotion planner and its dynamic environment to synthesize a winning strategy that delivers symbolic locomotion actions. These locomotion actions satisfy the desired high-level task specifications expressed in a fragment of temporal logic. Those actions are sent to a robust finite transition system that synthesizes a locomotion controller that fulfills state reachability constraints. This controller is further executed via a low-level motion planner that generates feasible locomotion trajectories. We construct a set of dynamic locomotion models for legged robots to serve as a template library for handling diverse environmental events. We devise a replanning strategy that takes into consideration sudden environmental changes or large state disturbances to increase the robustness of the resulting locomotion behaviors. We formally prove the correctness of the layered locomotion framework guaranteeing a robust implementation by the motion planning layer. Simulations of reactive locomotion behaviors in diverse environments indicate that our framework has the potential to serve as a theoretical foundation for intelligent locomotion behaviors.
Keywords
1. Introduction
The goal of this paper is to devise a reactive task and motion planning framework for whole-body dynamic locomotion (WBDL) behaviors in constrained environments. We employ formal methods for synthesis of a symbolic task planner and design of reachability controllers to achieve legged locomotion behaviors that are reactive to the environment. Although widely used in mobile robot motion planning (Fu and Topcu, 2016; Kloetzer and Belta, 2010; Wongpiromsarn et al., 2012) and autonomous driving (Campbell et al., 2010; Xu et al., 2018), formal methods have not been previously used to reason about keyframe states of dynamic locomotion behaviors. To that end, we rely on dynamic locomotion abstractions that reduce the dimensionality of the reasoning process (Zhao et al., 2017). These abstractions allow to sequentially compose locomotion modes by reasoning about the previously mentioned keyframe dynamic locomotion states and achieve advanced reactive behaviors that can respond to dynamic events in the environment as well as to disturbances, a hallmark of intelligent locomotion behaviors. The complex locomotion behaviors studied in this paper could not be achieved by using motion planners alone without a high-level decision-making process. Reasoning about keyframe dynamic locomotion states has several advantages allowing to (1) take advantage of the passive dynamics of legged robots, (2) directly compose behaviors in the phase space of the locomotion process, (3) achieve goal state reachability considering robustness margins, and (4) adjust locomotion behaviors in response to disturbances.
Our technical approach relies on a suite of template-based locomotion modes that span a spectrum of desired whole-body dynamic locomotion behaviors. Sequentially composing these modes via the proposed reactive synthesis enables us to formally combine tasks such as multi-contact locomotion, swinging movements, and hopping motions, as shown in Figure 1. Using simplified models to characterize locomotion dynamics has been widely pursued such as the use of the linear inverted pendulum model (LIPM) (Kajita et al., 2001), the spring-loaded inverted pendulum model (Piovan and Byl, 2015), the brachiation-like pendulum model (Bertram et al., 1999), the multi-contact model (Caron and Kheddar, 2016; Sentis et al., 2010b), and our recently proposed prismatic inverted pendulum model (PIPM) (Zhao and Sentis, 2012), to name a few. Usually, these models are separately considered in their own specific scenarios and lack a framework to seamlessly integrate them. Seminal locomotion results using template models (Alexander, 1984; De and Koditschek, 2015; Full and Koditschek, 1999; Raibert, 1986) and sequential composition of these models (Burridge et al., 1999) championed the advantages of using simplified models to uncover the fundamental locomotion principles related to the fine details of multi-body mechanism and dynamics. The work in Arslan and Saranli (2012) employs sequential composition to achieve reactive and robust planning against both model uncertainty and measurement noise without replanning. Nevertheless, no high-level decision-making algorithms with formal guarantees have been investigated, although the mentality of hierarchical planning and control had been proposed in Full and Koditschek (1999). Maneuvering in a constrained environment via multi-contact whole-body dynamic locomotion. Three locomotion modes are illustrated. The contact decisions are made according to the high-level symbolic task planner.
In this study, we aim at bridging this gap by proposing formal symbolic-level decision-making theories to sequentially compose more challenging—highly dynamic, versatile, non-periodic—locomotion behaviors reactive to dynamic environmental events. In the vein of work addressing rough terrain locomotion (Englsberger et al., 2015; Sreenath et al., 2013; Zhao et al., 2016a, 2016b), we address the variability of the terrains by allowing the robot to respond to sudden environmental events. The behaviors we synthesize are required to satisfy formal task specifications in a provably correct manner, which we guarantee by using formal methods with discrete abstractions of hybrid systems (Alur et al., 2000). To the best of our knowledge, our study is the first attempt to use formal methods applied to phase-space keyframe state during dynamic locomotion behaviors.
The inherent hybrid dynamics of the locomotion process and our use of keyframe dynamic locomotion states facilitate the discrete planning synthesis. Instead of discretizing the robot’s state space, we rely on a discretization of the phase space keyframe states for synthesizing symbolic-level decisions which are further sent to the underlying motion planner. We focus on the integration between the symbolic-level discrete task planner and the continuous motion planner. This top-down planning approach significantly reduces the computational complexity compared to bottom-up approaches (Belta et al. (2017); Liu and Ozay, 2016; Liu et al., 2013; Tabuada, 2009). The correctness of our top-down hierarchy is guaranteed via a correct-by-construction synthesis at the task planner level and a reachability control synthesis at the motion planner level.
The contributions of this paper are as follows. The first one is on devising symbolic reasoning methods that make decisions on keyframe states of the dynamic locomotion process in response to the dynamically changing environment. Our second contribution is on ensuring robust locomotion under bounded disturbances by reasoning about keyframe state reachability. The third contribution is on using game theory to compose complex dynamic locomotion behaviors sequentially. The final contribution is on reasoning about the correctness of the overall planning framework.
This paper is organized as follows. Section 3 introduces various dynamic locomotion models and the problem formulation of switched systems, phase space planning, and temporal logic preliminaries. In Section 4, we present the task specifications for whole-body dynamic locomotion and a reactive planner winning strategy via defining a two-player logic game. Section 5 introduces a robust finite transition system for the hybrid locomotion process to reason about local robustness with respect to the bounded disturbances and proposes robustness margin sets using phase-space Riemmanian metrics. In Section 6, we reason about the one-walking-step robust reachability and the correctness of the overall planning strategy. Simulation results of whole-body dynamic locomotion behaviors over changing environments are shown in Section 7. In Sections 8 and 9, we discuss the results and the conclusions of this paper. The Appendix presents supplementary mathematical formulations, algorithms, and propositions. A preliminary version of this paper was published in a conference proceeding (Zhao et al., 2016a, 2016b). Compared to that proceeding, this paper presents a new study on robust reachability control synthesis, incorporates additional locomotion modes and more diverse task specifications, proposes a replanning strategy, and implements more sophisticated behaviors with a diversity of environmental events.
2. Related work
Formal methods have been widely investigated for mobile navigation (DeCastro et al., 2015; Kress-Gazit et al., 2011; Raman et al., 2015; Sadigh and Kapoor, 2016). The authors in Kloetzer and Belta (2010) proposed an automated computational framework for decentralized communications and control of a team of mobile robots from global task specifications. This work suffers from high computational complexity and does not address reactive response to environmental changes. To alleviate the computational burden, the work in Wongpiromsarn et al. (2012) proposed a receding-horizon based hierarchical framework that reduced the complex synthesis problem to a set of significantly smaller problems with a shorter horizon. An autonomous vehicle navigation process is simulated in the presence of exogenous disturbances. Provable correctness is an important property of temporal logic-based control and planning approaches. The work of Kress-Gazit et al. (2009) allows mobile robots to react to the environment in real time and guarantees the provable correctness of controllers. The approach proposed in Liu et al. (2013) extended controller synthesis with guaranteed-correctness to nonlinear switched systems and designed a reactive mechanism in response to an adversarial environment at runtime. Given a high-level discrete controller encoding reactive task behaviors, the work in DeCastro and Kress-Gazit (2015) designed low-level controllers to guarantee the correctness of a high-level controller. More recently, the work of Duperret and Koditschek (2020) solves a formal discrete leaping navigation problem of legged robots to reach a goal set while in the interim reactively avoiding a set of obstacle states. However, all of the work above is applied to 2D-world mobile robots or a single-leg hopper, which have simple dynamics unlike our focus on underactuated and hybrid legged robots. In particular, task and motion planning for bipedal or multi-limb robots requires to reason about contact-based dynamics which do not present in mobile robots.
2.1. Formal methods for manipulation and locomotion
Formal methods have also gained increasing attention in the mobile manipulation community via task and motion planning (TAMP) methods (Dantam et al., 2018; He et al., 2015; Kaelbling and Lozano-Pérez, 2011; Srivastava et al., 2014) or reactive synthesis methods (Chinchali et al., 2012; He et al., 2017; Sharan, 2014). However, many existing TAMP approaches rely on sampling-based motion planners which ignore the underlying physical dynamics. To fill this gap, the recent work of Toussaint et al. (2018) proposed a logic-geometric program to incorporate manipulation dynamics into the task and motion planning process, where discrete logic rules are used to specify the mode sequence for dynamic manipulation tasks. However, this work lacked a reactive mechanism in response to environmental actions and manipulated objects. More importantly, formal methods are yet to be used to reason about dynamic legged locomotion, or for more complex dynamic tasks for humanoid robots like the ones described in this paper. The authors in Antoniotti and Mishra (1995) determined goals for legged robots by using computational tree logic and synthesized controllers for locomotion. However, their work is restricted to static locomotion tasks which do not allow robots to walk dynamically or jump similarly to humans. An abstraction-based controller was proposed in Ames et al. (2015) for bipedal robots using virtual constraints, but this work focused on controller generation without addressing symbolic task reasoning. Recently, the work of Maniatopoulos et al. (2016) proposed an end-to-end approach to automatically synthesize temporal-logic-based plans on an Atlas humanoid robot. Reaction to low-level failures was formally incorporated by simply terminating the execution. However, the robot behaviors focus on manipulation and grasping tasks, instead of locomotion behaviors. The work of Sreenath et al. (2013) proposed a two-layer hybrid controller for locomotion over varying-slope terrains with imprecise sensing. To account for terrain uncertainties, a high-level controller implements a partially observable Markov decision process to make sequential decisions for controller switching. Once again, this work does not address symbolic task reasoning for dynamic locomotion. In addition, this work is limited to walking on terrains with mild roughness while our focus is locomotion on highly rough terrain and constrained environments.
2.2. Robustness reasoning of formal methods
Robustness to disturbances and reactiveness to changing environments are major challenges in robotic systems. Related work includes Fainekos and Pappas (2009) which studies the robust satisfaction of temporal logic specifications associated with continuous-time signals. Signal temporal logic (STL) (Donzé and Maler, 2010) allows to reason about dense-time, real-valued signals, enabling for the evaluation of the extent to which the specifications are satisfied or violated. This property makes STL especially suitable to quantify robustness (Deshmukh et al., 2015; Farahani et al., 2015; Sadraddini and Belta, 2015). The focus of all the work above is on the robust semantics of temporal logic, while our objective is to design robust locomotion planners where robustness margin sets are quantified as a goal in the reachability analysis under bounded disturbances. The work of Majumdar et al. (2011) studied robust controller synthesis on discrete transition systems against disturbances and proposed a robust metric to ensure that the state deviation from the nominal system is bounded by the magnitude of the disturbance. The work of Topcu et al. (2012), on the other hand, investigated the amount of uncertainty that can be tolerated while the controller still satisfies the given specifications. Both of the two papers above, however, focused on robustness reasoning in a purely discrete model, whereas our proposed method reasons about robustness in a hybrid locomotion system and incorporates the underlying physical dynamics. Recently, the work in Plaku et al. (2010), Bhatia et al. (2010), and He et al. (2015) proposed a multi-layered synergistic framework such that the low-level sampling-based planner communicates with the high-level discrete planner through a middle coordinating layer. This coordinating layer allows the motion planner to ask the task planner for a new high-level plan when a failure occurs at the low level. This synergy between multiple planning layers enhances the robustness of the planning framework. As an alternative, the work of Dantam et al. (2016) incrementally incorporated geometric information from the failure event of the motion planner into the task planner via the so-called incremental constraint updates. The robustness in the two lines of research above is reasoned from a replanning perspective. While our study employs a similar replanning strategy as theirs, our focus is on the formal synthesis of a task planner that can react to sudden event changes in the environment. In this paper, we address the robustness as follows: (1) at the task planning level, we devise a reactive mechanism that chooses appropriate system actions according to environmental actions, and (2) at the motion planning level, we achieve robustness against bounded state disturbances by designing robust keyframe transitions for dynamic locomotion.
2.3. Multi-contact legged locomotion
Multi-contact locomotion planning and control for humanoid robots have gained good traction as legged robots operate within complex environments more frequently in recent years (Bouyarmane and Kheddar, 2011; Chung and Khatib, 2015; Hauser, 2014; Posa et al., 2016; Sentis et al., 2010a). The work in Bretl (2006) studied multi-contact locomotion as a hybrid control problem while the work in Hauser et al. (2005) posed the multi-contact planning problem as a hierarchy that first reasons about contacts, and then interpolated these contacts with trajectories computed from a probabilistic planner. The study in Kudruss et al. (2015) formulated multi-contact centroidal momentum dynamics as an optimal control problem. However, all of the work above focused on either static or quasi-static mobility behaviors. Instead, our planning framework tackles highly dynamic behaviors, that is, non-periodic multi-contact dynamic locomotion over rough and constrained environments. The work in Caron et al. (2015) employed contact wrench cones to geometrically construct dynamic supports in arbitrary virtual planes for multi-contact behaviors. This work did not employ a rich set of locomotion templates due to restrictive assumptions on the center-of-mass behavior. Once again, all the work does not address symbolic reasoning of dynamic locomotion behaviors.
3. Preliminaries and problem formulation
3.1. Dynamic locomotion modes
We design a phase-space motion planner that consists of a palette of locomotion modes. To begin with, we introduce centroidal momentum dynamics in a general form. Dynamics of mechanical systems can be represented by their rate of linear and angular momenta, which are affected by external wrenches (i.e., force/torque) exerted on the system. We characterize this class of dynamical systems via the balance of moments around the system’s centroid.
Given this general model, certain assumptions are commonly imposed to make the problem tractable (Audren et al., 2014). In our case, six locomotion modes are proposed to produce various WBDL behaviors.
Illustration of template-based locomotion behaviors dynamically interacting with complex environments. Inspired from real-world human and animal motions, our study focuses on how to make model abstractions and high-level decisions for complex environments. A fundamental problem is how to use template models to characterize essential locomotion modes and sequentially compose these modes to achieve agile and robust locomotion.
Given the locomotion modes above, we define the set of locomotion modes as
All the locomotion modes above are illustrated in Figure 3. Each mode has closed-form solutions for their phase-space tangent and cotangent manifolds as will be derived in Section 5 and Appendix C. The timing synchronization between the sagittal and lateral dynamics is guaranteed by a Newton-Raphson foot placement searching algorithm (Zhao et al., 2017). Likewise, more complex tasks can be defined in the locomotion mode set Contact planning strategies for locomotion in rough terrains. We discretize the terrain height to decide what locomotion actions to take, and define them as environmental actions to set up a two-player game decision problem. For instance, given a moderately upward or downward terrain, there can be multiple contact actions to deal with it. Events motivated by ordinary accidents in human daily lives, such as a crack on the terrain and the sudden appearance of a human, are treated as emergency events, and incorporated into the allowable environment. Detailed definitions of environment and system actions are provided in Section 4.1.
Our phase-space planning process produces three-dimensional locomotion. However, the planning framework of this study focuses on forward walking using sagittal keyframes. Given high-level sagittal keyframes, the robot’s lateral dynamic behavior is automatically computed by our motion planner. Turning behaviors can be incorporated in our framework by using the method that we introduced in Zhao et al. (2017).
3.2. Switched systems and phase-space planning
Given the continuous locomotion modes above, we formulate the locomotion planning problem as a switched system (Liberzon (2012)). The dynamics of the WBDL process are defined as Logic-based locomotion planner structure. A set of locomotion templates is devised for maneuvering in constrained dynamic environments. Each template is indexed by a locomotion mode signal p. The discrete environment actions are represented by the variable e while control actions are represented by the variable s describing limb contact actions. The discretized dynamic locomotion keyframes are represented by 
Our phase-space planning is a three-dimensional hybrid bipedal locomotion planning framework based on robustly tracking a set of non-periodic keyframe states. This framework focuses on non-periodic gait generation for robust and agile locomotion over various challenging terrains and under external disturbances. The keyframe state in the phase-space is defined as
(
In general, this keyframe state refers to the apex state when the center-of-mass (CoM) velocity reaches the local minimal or maximum velocity in the CoM sagittal axis 2 Given two consecutive keyframe states, the phase-space planner evolves continuously and computes the contact transitions of one walking step as defined below.
(
Our contact-triggered switching strategy is especially suitable for non-periodic locomotion, which is abstracted as a progression map Φ between keyframe states, that is, driving the robot’s center-of-mass from one desired keyframe to the next one via the control input
(
Instead of using generalized coordinates associated with the robot joints, our planning framework chooses to use the robot’s center-of-mass state as the output space. This simplified coordinate choice is often used in the locomotion communities. Alternatives for dimensionality reduction include, for instance, differential flatness (Liu et al., 2012) and partial hybrid zero dynamics (Ames et al., 2015). The switched system dynamics in Equation (5) can be represented by a tuple,
3.3. Finite transition systems and LTL preliminaries
We now define system, environment, and product finite transition systems and describe LTL preliminaries.
(
(
(
Note that, the environment states
(
The word w
γ
is said to satisfy a LTL formula φ, if and only if the execution The interconnected feedback diagram of system and environment finite transition systems 
3.4. Discrete task planner synthesis formulation
Given the preliminaries above, we formulate a discrete task planner synthesis problem and introduce a specific fragment of the temporal logic for the task specifications.
To make the computation tractable, we employ a fragment of LTL formulae with a favorable polynomial complexity, named the Generalized Reactivity (1) (GR (1)) formulae (Bloem et al., 2012). This class of formulae is expressed as, for v ∈ {e, q, s}
The GR(1) formula is an efficient fragment of LTL and reasons over a rich set of states and actions and makes the task planner synthesis process tractable. A motivation of using this automated synthesis is to lay the theoretical foundation of devising a correct-by-construction decision-maker for composing complex locomotion trajectories.
4. Task planning for whole-body dynamic locomotion
In this section, we introduce the temporal logic specifications for locomotion in a possibly adversarial environment. We will specify a two-player game where the environment and keyframe state are the first player while the robot action is the second player. Our task specifications will capture two types of environmental events: (i) varying-height terrains are treated as ordinary events; and (ii) sudden incidents, such as a person appearing on the robot’s path or a crack on the terrain, are treated as adversarial actions, since if the robot does not respond properly they may cause an accident.
The design of LTL specifications relies on human designers who specify locomotion tasks and models of the environment. Our task specifications below are designed according to locomotion heuristics. In general, there are no unique ways to evaluate the efficacy of the LTL design process. For instance, we could decide to add an additional environmental specification to forbid repeating the same environmental actions
4.1. Environment specifications
As previously stated, we treat the environment as a player “acting” against the robot’s locomotion process. We define an environmental action set,
Since only one environmental action can be 1. (Se-1) If the current environmental action is 2. (Se-2) If the current environmental action is 3. (Se-3) If the current environmental action is
To evaluate the effectiveness of our proposed approach handling all the allowable environmental actions, we enforce them to occur infinitely often via the goal proposition:
To ensure the robot makes progress (i.e., continuously moves forward within the constrained environment), we define the following liveness condition:
4.2. Robot specifications
To maneuver in the environment using whole-body dynamic locomotion, we define the following robot actions:
We enforce the robot not to take actions responding to emergency events of the environment • (Srobot-1) Robot actions in response to varying-height terrain • (Srobot-2) If the environmental action • (Srobot-3) If the environmental action • (Srobot-4) If a narrow passage • (Srobot-5) If the environmental action
As for the goal proposition of the robot, we require that all locomotion modes and contact actions will occur infinitely often to verify their correctness.
4.3. Keyframe specifications
Our phase-space motion planner relies on a keyframe state vector Phase-space partition of locomotion manifolds for keyframe design. The left figure shows a grid-based partition while the right figure is a non-Euclidean partition that follows the phase-space locomotion manifolds. The latter partition is consistent with locomotion dynamics and we define it as the ”locomotion-manifold-based partition.” This partition will be used to achieve robust locomotion. We use different granularities for two orthogonal axes.
Given the environmental actions in Section 4.1, the specifications for keyframe states are designed as follows. • (Sq-1) If the next environmental action is • (Sq-2) If the next environmental action is • (Sq-3) If there is a crack on the terrain with a normal-height ceiling, that is, esc-sc, then the next keyframe state is qbrachiation relying on a different set of apex velocities and step lengths than for walking behaviors: • (Sq-4) If there is a crack on the terrain and there is a high ceiling, that is, esc-hc, then the keyframe state is qhop relying on a specific apex velocity, regardless of the current q: • (Sq-5) If a human appears in front of the robot, that is, eha, then the next keyframe state is qstop relying on a specific step length, regardless of the current q: • (Sq-6) If there is a narrow passage, that is, enp, then the next key frame state is qslide relying on a specific apex velocity, regardless of the current q:
The remaining eight scenarios involving different environment and system action combinations are defined in a similar manner omitted here for brevity. The specifications in (Sq-1)-(Sq-6) and all others belong to
From a high-level perspective, the goal of our task planner is to enable the robot to continuously maneuver through constrained environments by repeatedly selecting contact actions among
All the task specifications have been proposed such that
4.4. Synthesis of a high-level reactive task planner
Here we formulate the high-level locomotion planning problem as a game between the robot and its possibly adversarial environment. Given the task specifications defined above, a reactive control protocol is synthesized such that the controlled legged robot behaviors satisfy all the designed specifications whatever admissible uncontrollable environment behaviors are.
A winning strategy for the task planner represented by the pair
( • • • • θ
i
and θ
o
are atomic propositions characterizing initial states of the input and output variables, respectively; • • ϕwin is the winning condition given by an LTL formula.
(
Figure 7 shows an automaton fragment of the WBDL contact planner A fragment of the synthesized automaton for the WBDL contact planner. Non-deterministic transitions are encoded in this automaton. The blue transitions represent a specific execution. For illustration, we index both the environmental action 
Non-deterministic transitions exist in the synthesized automaton as follows: (i) environmental actions are non-deterministic. (ii) given an environmental action, several non-deterministic keyframe states can be chosen. (iii) even when both an environmental action and a keyframe state are given, non-deterministic system contact actions exist for certain transitions. This non-deterministic transitions allow self-transitions. In this case, we can guarantee the robot to make progress (i.e., maneuvering forward) due to the properties of locomotion keyframe states.
The keyframe specifications in this section purely reason about logic-level decisions and have no knowledge of underlying locomotion dynamics. However, the locomotion dynamics, especially those affected by external disturbances or model uncertainties, often result in the desired keyframe transitions being unrealizable. As such, we need to propose keyframe transitions with robustness margins and synthesize a reachability based controller to determine realizable keyframe transitions by the low-level locomotion dynamical system as proposed in the next section.
5. Robust reachability control of hybrid locomotion systems
When we model robot dynamics and estimate physical environments, uncertainty is ubiquitous due to sensor noise, model inaccuracy, external disturbance, sudden environmental changes, contact surface geometry uncertainty, and so on. As a result, commands from the symbolic task planner are potentially not achievable by the low-level motion planner. Additionally, a mismatch between the high-level discrete and low-level continuous planners is usually caused by the abstraction techniques applied on the underlying continuous systems. To handle these difficulties, we define a robust finite transition system and compute its keyframe transitions via synthesizing reachability controllers for every single walking step. In order to use phase-space locomotion manifolds to define robustness margin sets, a phase-space mapping needs to be defined between the Euclidean and Riemmanian spaces to evaluate whether a phase-space state is in the robustness margin or not.
5.1. Phase-space Euclidean-to-Riemmanian mapping
We first consider a specific locomotion process, for example, the prismatic inverted pendulum model (PIPM) (see Section 3.1 for more details) in order to establish a Euclidean-to-Riemmanian mapping in the phase space. Our previous study derives closed-form solutions of phase-space tangent and cotangent manifolds for this process (Zhao et al. (2017)) as follows.
(
The tangent manifold can be used to measure deviations from the nominal locomotion trajectory in the phase-space. We use this manifold to quantify the width of a phase-space robustness margin.
(
This cotangent manifold represents the arc length along the tangent manifold σ in Equation (25). We use this cotangent manifold to quantify the length of a phase-space robustness margin. Detailed derivations of these two closed-form solutions above, that is,
5.2. Robust finite transition system for one walking step
We now focus on a case of the one-walking-step locomotion process as defined in Def. 3. As illustrated in Figure 5(b), the discrete task planner uses a Riemannian discretization of the local state space, which is defined by an abstraction map
To guarantee that the motion planner yields feasible phase-space plans robust to disturbances, such as state measurement errors and disturbances in the dynamics, we introduce ϵ1 and ϵ2 as the bounds of initial and final robustness margins in the one-walking-step transition system. Namely, we not only consider the nominal initial and final keyframe states
(
The robustness margins ϵ1 and ϵ2 in Def. 9 are defined in the Riemannian space. A mapping
Now we describe how to simplify the robustness margin sets based on the closed-form phase-space manifolds defined in Section 5.1.
(
These two pairs of bounds represent Riemannian distances in phase-space, as shown in the upper right miniature subfigure in Figure 8. A locomotion-manifold-based partition is illustrated in Figure 5(b). The proposed robust finite transition system will use this partition to design the robustness margin around keyframe states as shown in Figure 8. A merit of our analysis is that this partition is consistent with the vector field of the locomotion dynamics. Additionally, this partition simplifies mathematical descriptions of robustness margin sets. Phase-space abstraction via locomotion-manifold-based partition. This figure shows a keyframe state transition process with robustness margins. Compared to the conventional grid-based partition in Euclidean phase space of Figure 5(a), this partition complies with locomotion dynamics, further enabling us to define robustness margins based on closed-form locomotion manifolds.
( • • • • • •
The mapping function
A conceptual illustration of this transition computation is shown in Figure 9. Using the robustness margins, we construct the transition set Keyframe state reachability with robustness margins for one walking step. Due to measurement error or external disturbance, the initial state 
The construction of Sequential procedure of designing feasible robust keyframe transitions. Step 1 determines the nominal keyframe state pair from the symbolic task planner. In Step 2, we design discrete state set 
Algorithm 1. One-walking-step robust finite transition subsystem 1: 2: Define 3: Define the discrete state of keyframe 4: Initialize transition set 5: 6: 7: Construct an inter-sampling finite abstraction 8: 9: 10: 11: 12: 13: 14:
Algorithm 1 details the construction of the robust finite transition subsystem above. The high-level task planner specifies the inputs of the algorithm, that is, two pairs of keyframe states, locomotion modes, and contact actions (
The proposed robust finite transition system (RFTS) differs from the abstraction approaches in Liu and Ozay (2014, 2016), Tabuada (2009), and Belta et al. (2017) with respect to the following points: (i) The most salient difference is that our planning approach is a hierarchy consisting of both top-down and bottom-up components. The RFTS is an interface taking the desired command from the high-level symbolic task planner (i.e., the top-down component) and use this command to synthesize a reachability controller in the low-level motion planner (i.e., the bottom-up component). The approach in Liu and Ozay (2014, 2016) is an abstraction of the underlying continuous dynamical system and represents a bottom-up approach. (ii) By using the proposed hierarhical structure, we are able to solve a more challenging problem with whole-body dynamic locomotion in a constrained environment, instead of simple examples such as 2D mobile robot or vehicle. (iii) Our RFTS reasons about the robustness to bounded state disturbances at not only the inter-sampling level, but also the locomotion keyframe level capturing the essential locomotion dynamics. (iv) We incorporate hybrid dynamics into our RFTS design, which is constructed for the one walking step. Overall, our planning framework sequentially composes multiple locomotion modes. (v) Instead of a grid-based partition, we use a locomotion-manifold-based partition to characterize the robustness margins of the keyframe states in the phase-space.
5.3. One-walking-step reachability control synthesis
To determine the transitions satisfying the conditions in Equations (36)–(38) of Def. 11, we employ abstraction-based control synthesis developed for general dynamical systems. The idea of this approach is to automatically and rigorously compute the set of states that can be controlled to realize a given specification and generate feedback controllers for those states. Generally, abstraction-based control synthesis consists of three steps: (1) Construct a finite transition system (also called a finite abstraction) that over-approximates the dynamics of the original continuous system. (2) Design control algorithms based on the finite transition system with respect to the given specification. This step not only verifies whether the given specification is realizable by the low-level robot dynamics, but also synthesizes a controller for the abstraction if realizable. (3) Interpolate the synthesized controller to be executed in the original continuous system.
We consider a one-walking-step locomotion subsystem defined on a local state space determined by two keyframe states.
(
The state spaces
The control actions
( • • •
The abstraction map
Next, we will discuss in detail how to compute the over-approximation
(
Algorithm 2: Reachability control synthesis 1: 2: assign a queue Que ← G, 3: initialize a winning set 4: 5: 6: 7: 8: 9: 10: Que ← 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21: 22: Given a locomotion mode, we denote by δ Given the abstraction defined in Def. 13, we synthesize a reachability controller for the inter-sampling finite abstraction A merit of the proposed hierarchical structure in Figure 12 is to decompose the overall high-dimensional contact-rich planning problem into tractable sub-problems with smaller state dimensions, circumventing prohibitive computational complexity. In particular, the symbolic task planner takes charge of the high-level decisions being reactive to the environment actions. The middle-level robust finite transition system reasons about the robustness of a local phase space region around the nominal keyframe state w.r.t bounded state disturbances. The low-level phase-space planner executes the continuous locomotion dynamics. This hierarchy is analogous to the receding horizon control approach in Wongpiromsarn et al. (2012), where the complex high-dimensional planning problem is decomposed into a set of solvable sub-problems. This strategy facilitates efficient decision making during dynamic interactions with uncertain environments.
In next section, we will prove the robust reachability of the synthesized controller, that is, the reachability goal is realizable for
6. Correctness of the reactive task and motion planner
Correctness guarantees of the WBDL planner play a key role in the successful execution of robust legged locomotion interacting with dynamic environments. The objective of this section is to prove such a correctness. In particular, the correctness of our planning framework is interpreted as successful implementations of the high-level task planner on the low-level motion planner under bounded disturbances. Our locomotion planner is a hierarchy composed of a task planning layer with reactive synthesis and a robust motion planner layer synthesized by a robust finite transition system. A high-level task planner, that is, a WBDL winning strategy
6.1. One-walking-step robust reachability
To guarantee the robust finite transition system to be realizable by its underlying continuous system, we need to prove that the conditions in Equations (36)–(38) of Def. 11 also hold for continuous states. Since the robust finite transition system is based on the keyframes of one walking step, we name the keyframe reachability problem as “one-walking-step robust reachability.” We model the bounded disturbance causing initial state deviations, model uncertainties, and external perturbations during the evolution of the locomotion trajectory. The term “robust reachability” refers to the reachability of the goal robustness margin set centered around the final keyframe from the initial robustness margin set.
(
Suppose that the walking step from ( We construct a control strategy
6.2. Correctness of the hierarchical WBDL planner
Given the one-walking-step robust reachability of Theorem 1, we now prove the correctness of the top-down planning hierarchy, that is, Top-down hierarchy of layered abstractions: 
Assume a low-level locomotion trajectory κ = (
(
By Proposition 1, a winning WBDL strategy
6.3. Replanning strategy and robustness
It is worth noting that the proposed correctness holds under a set of assumptions on allowable environmental and system actions and disturbance boundedness. However, sometimes the real-world disturbance can violate the bounded disturbance assumption and perturb the state to be out of the local reachability region (i.e., the winning set). To handle this situation, we establish a replanning strategy to request a new high-level task planner command. Ideally, the union set of all local winning sets is expected to cover the entire state space of interest. However, the existence of such a winning set union often cannot be guaranteed. Thus, it is difficult to generalize formal correctness of one winning set to that of the union set of all winning sets. What we strive to is to maximize the phase-space coverage by the union set of all winning sets. From a practical implementation viewpoint, synthesizing a large number of winning sets enables our planner to cover a sufficiently large phase space such that it is always likely to find a feasible winning set when large disturbances occur (Figure 12). Hierarchical structure of robust WBDL task and motion planner. This planner has three cascaded planning layers: high-level task planner, middle-level keyframe-based robust finite transition system, and low-level continuous motion planner. A replanning process (see blue lines) is triggered when the state is out of the reachability region (i.e., the winning set) or a sudden environmental change occurs.
In other words, there is no ground truth of “formal correctness” for real robotic systems. Even though we have a provably correct planner and implement it on a real robot in a correct way, the actual planner may not be formally “correct” due to many potential hardware issues. For instance, unmodelled actuator dynamics can easily break the correctness guarantee of task specifications at the high level. Thus, it makes more practical sense to target a formally correct approach that generates a palette of robust controllers, the winning sets of which jointly cover a sufficiently large phase-space of interest (if not a global state space). Our results indicate that a properly designed controller switching mechanism among these locomotion winning sets enables an effective replanning strategy such that a set of contiguous phase-space initial robustness margin sets can be controlled to reach a set of contiguous goal robustness margin sets under bounded disturbances.
Overall, the proposed planning framework reasons about robustness at the following three levels. • The robust finite transition system • If the disturbance is larger than the boundary value modeled in the controller synthesis, the state may be disturbed out of the winning set. In this case, an alternative winning set will be searched in the control library of allowable keyframe transitions determined by • When an environment event changes suddenly, a replanning signal will be sent to the high-level task planner. Note that, this replanning process can only be executed before the next step transition. Otherwise, the contact of the next walking step already occurs. Figure 23 in the Appendix shows a timing sequence of the replanning process. More details of this replanning process are shown in Algorithm 4 in the Appendix.
Algorithm 3. Execution of reactive task and motion planner 1: 2: 3: qnext ← getNextKeyframe (enext, qcurrent) {look up 4: snext ← getNextContactAction (enext, qnext) 5: pnext ← getNextLocomotionMode (enext, qnext) 6: ( 7: ( 8: ( 9: (e, q, p, s)current ← (e, q, p, s)next {update task planner states} 10: 11:
7. Results
We demonstrate WBDL results by sequentially composing the low-level locomotion modes via the symbolic task planner. In particular, we analyze in detail the robustness performance of the reachability control with respect to several key parameters. The Temporal Logic Planning (TuLiP) toolbox, a python-based embedded control software (Wongpiromsarn et al., 2011), is used to synthesize the symbolic task planner. The gr1c 8 tool, involving the CU Decision Diagram Package, is used by TuLiP as the underlying synthesis solver. The synthesized planner is correct by construction, that is, satisfying all the proposed specifications. The LTL synthesis procedure is offline and will take around 20 min to generate an automaton on a MacBook with a 2.9 GHz Intel Core i9 processor and 32 GB of memory. Once the automaton is generated, the task planning process will be executed at run-time. To guarantee the successful implementations of the low-level motion plans by the high-level task planner, we perform a robust reachability analysis of the keyframe states by using the so-called robustly complete control synthesis (ROCS) 9 tool (Li and Liu, 2018), which currently supports abstraction-based control synthesis using both uniform and non-uniform discretizations. ROCS also generates feedback control strategies, which are used to design the control library of our locomotion tasks.
7.1. Case I: Locomotion with stopping and brachiation behaviors
We first demonstrate a locomotion scenario involving environmental actions such as the appearance of a human and the terrain being crack in the scene. The synthesized discrete task planner is represented by a finite state automaton with 27 states and 148 transitions. The two-dimensional keyframe state q is composed of the apex velocity and step length. For either dimension, Sequential composition of the sagittal CoM phase-space trajectories and mode switchings for a 20-step WBDL maneuver. The top four figures illustrate phase-space manifolds of the four locomotion modes. The mode switching is governed by the synthesized high-level contact planner. Among these steps, two terrain crack and one human appearance events are taken into account. A short multi-contact phase is designed between every two consecutive modes for a smooth transition (see the short red trajectory between two green dots). Environment events, system actions and keyframe states of 50 walking steps according to the synthesized automaton. Actions and states are indexed by numbers. Emergency events, that is, human appearance and terrain crack, are highlighted in the shaded regions. In the bottom subfigure, the numbers 0 to 4 on the vertical axis correspond to {0, 0.4, 0.6, 0.8, 1.7} m/s for next step apex velocity and {0.15, 0.5, 0.6, 0.7, 0.6} m for next step step length. Snapshots of the WBDL motions in respond to two environmental emergencies. The snapshots show a sequence of locomotion behaviors including a brachiation motion over the cracked terrain and a stopping motion when a human appears. The figure at the bottom shows the CoM vertical position trajectory (orange thick line), hand and feet trajectories (thin interlaced lines).


7.2. Case II: Locomotion with ground sliding and hopping behaviors
When the robot maneuvers through a narrow passage, an ordinary locomotion mode (e.g., Ground sliding motion when a narrow passage appears in the scene. The robot crouches and slides on the ground through the low-ceiling area with a constant negative acceleration. This ground sliding motion is preceded and succeeded by a multi-contact transition phase as shown in the phase-space subfigure at the lower right corner.
When a crack in the terrain and a high ceiling occur simultaneously, the robot cannot grasp the overhead support any longer. To maneuver forward successfully, the robot has to leap over the unsafe region as shown in Figure 17. Thus, a hopping phase will be executed with no contact with the environment. A constant CoM sagittal velocity shows up in the sagittal phase portrait while a parabola appears in the vertical position trajectory of Figure 17(b). The keyframe state of this hopping motion is chosen to be the center of the horizontal line segment in the sagittal phase-space. The lateral velocity is set to zero to avoid a lateral drift. The state will stay at the red dot in Figure 17(d) during the hopping motion. Since the robot locomotes forward, the lateral phase portrait in Figure 17(d) behaves like a limit cycle but non-periodically due to the rough terrain. The low-level locomotion model corresponds to the mode (e) in Section 3.1. Via introducing specific locomotion modes, our planner is capable of handling emergency-motivated scenarios. Hopping motion when a crack in the terrain and a high ceiling occur simultaneously. In this case, no overhead support is available for grasping so the robot has to jump over the cracked terrain.
7.3. Case III: Locomotion replanning strategy
When the robot is already leaping in the air and detects a sudden change from an ordinary terrain to a cracked terrain, it has to replan its contact action and locomotion mode to accommodate this sudden change on the fly. The robot will execute a replanning process to ask the high-level planner for a new decision, that is, grasp the ceiling support and swing the robot’s body over the cracked region as shown in Figure 18. Otherwise, the robot will fail to locomote. The top subfigure of Figure 18 shows a decision sequence including the replanning process. The three rows represent environment actions, locomotion modes, and keyframe states, respectively. The second column with three dotted blocks is the decision before the replanning process and not executed yet until the next walking step. The third column represents the replanned decision which is executed in response to the sudden environmental action change. This replanning process in the phase-space is illustrated in Figure 23. Let us consider a more challenging terrain scenario (i.e., Replanning in response to a sudden environment event change. Suppose that during the flight phase, the robot finds out that the next terrain is cracked. Accordingly, the robot triggers its replanning process by changing the locomotion mode to the prismatic pendulum model, that is, grasping the overhead support, swinging the body over the second terrain crack region, and then landing on the non-cracked terrain. The top subfigure shows the decision sequence in the symbolic task planner. The bottom subfigure shows the snapshot sequence of the locomotion process.
Besides the above replanning strategy in response to environmental changes, there is another replanning strategy embedded in the reachability control library. When the robot state is perturbed to be out of the reachability region, that is, the winning set currently being executed, the robot cannot reach the robustness margin of the targeted keyframe. Then the robot calls for a new reachability controller in the library that covers the current perturbed state and uses this replanned controller to reach a new keyframe goal. Overall, the replanning process determines which new controller to call in the control library.
7.4. Case IV: Validation of the robust reachability controller
This case study evaluates the performance of the synthesized controller given a keyframe robust reachability goal. Let us first consider the prismatic inverted pendulum model (PIPM). Assume that we have a sagittal CoM state vector
Let us define two nominal keyframe states
As to the underlying continuous locomotion subsystem The additive disturbances to the dynamics are bounded by Dr = (0.05 m; 0.1 m/s) in the subfigure (a). The shaded yellow region represents the winning set. The black trajectories are the five closed-loop trajectories simulated in five trials. The blue trajectory represents a trial suffering a large disturbance, that is, a velocity jump in the phase-space. Since the disturbed state is still in the winning set, the CoM trajectory is guaranteed to reach the final robustness margin set. In subfigure (b), different levels of bounded disturbances are modeled in the computation of the winning sets. Naturally, a larger magnitude of the disturbance results in a smaller winning set.
Parameters of the PIPM-PPM mode transition.

Success rate of the simulations under varying granularities and disturbances. In subfigure (a), the system is subjected to disturbances bounded by D r = (0.15 m; 0.3 m/s). All the 50 simulation trails can reach the goal robustness margin set successfully. In subfigure (b), we run 1000 trials for each case with a specific granularity and a bounded disturbance. The disturbance exerted in the simulation remains the same, that is, D r = (0.1 m, 0.2 m/s).
We evaluate the effect of the discretization granularity and the magnitude of disturbances used in the controller synthesis process as shown in Figure 20(b). Given each controller synthesized using a specific granularity and for a specific disturbance bound, we simulate 1000 trials with the bounded disturbance Dr = (0.1 m, 0.2 m/s). Figure 20(b) shows four sets of simulation results for different granularities ranging from 0.002 to 0.005. For each set of simulations, the success rate increases as the modeled disturbance in the controller synthesis increases, and it reaches 100% when the modeled disturbance matches the actual disturbance D r used in the simulation. This is consistent with our expectation. Let us inspect the figure from another perspective. If we compare the results for different granularities with a specific disturbance set D i (i = 0, 1, 2, 3, 4), the success rate almost remains the same. This is because when constructing the abstraction for the robust reachability analysis, we have taken into consideration the effects of approximation errors caused by different discretization granularities, by using non-deterministic transitions that over-approximate the dynamics of the system. In addition, we observe that the success rates for all the synthesized controllers are greater than 97%, even in the case no disturbance is considered in the controller synthesis. This can again be interpreted by the over-approximation used in the abstraction. Nonetheless, as shown in the simulations, to achieve 100% correctness guarantee, the modeled disturbance has to be larger than (or at least match) the actual disturbance in the simulation. Moreover, under the same disturbance Dr, the nominal phase-space planner with a fixed open-loop control input only achieves a success rate of 29%. This huge discrepancy in success rate clearly shows the advantage of using an abstraction-based feedback controller over an open-loop phase-space planner.
7.5. Case V: Integrated multi-step locomotion via the reachability control library
This case evaluates an integrated multi-step locomotion example with the robust finite transition system
Parameters of constructing the inter-sampling finite abstraction
The computational time for constructing abstractions is around 30 s on average, and 5 s–15 s for synthesizing a reachability controller corresponding to each keyframe pair, depending on the number of states and transitions of the abstraction. Since we synthesize 625 (i.e., 25 × 25) reachability controllers for each walking step, the time to generate them is approximately 90 mins. In our simulation of six consecutive walking steps, all the local reachability control strategies are patched together to cover the overall state space. The time for simulating a single closed-loop walking trajectory is around 2 s. As the results show in Figure 21, we simulate six different trials with different initial conditions, that is, starting from different initial robustness margin sets. Each locomotion trajectory is guaranteed to reach one of the robustness margin sets at the next walking step via using the reachability controller from the control library. In particular, a trial is tested to evaluate the replanning strategy when the state is perturbed out of the winning set (Figures 22 and 23). Integrated phase-space trajectories of multi-walking step simulations under bounded disturbances. The replanning strategy is evaluated with a trial (see the blue trajectory) exerted with a velocity disturbance larger than the modeled disturbance in the MCM mode (around the position x = 2.3 m). In this case, the state is perturbed out of the winning set of the currently used reachability controller. A replanning signal is triggered, and the planner searches within the control library for a new winning set (together with a new reachability controller) that covers the perturbed state. Then the perturbed state will use that new controller to reach a new robustness margin set for continuous locomotion maneuvering. An illustration of the top-down decision sequence of the high-level reactive task planner and middle-level reachability controller synthesis. It illustrates the relationship between the keyframe state, environment action, and system mode. Replanning timing for the next walking step. The high-level task planning for the next walking step is determined at the beginning of one walking step. Then during the remaining time of the current walking step (before switching to the next walking step), a replanning process can be triggered anytime if the state is out of the reachability region (i.e., the winning set) or the environment action change suddenly. This figure uses single-contact prismatic inverted pendulum model for illustration.


8. Discussions and future work
8.1. Low-level uncertainties
This paper proposes a hierarchical approach to the task and motion planning of dynamic locomotion in complex environments. We achieve robustness against a general, bounded disturbance by synthesizing a middle-layer robust reachability controller with robustness margins to accommodate low-level uncertainties. Undoubtedly, a variety of low-level uncertainties can come from time delays, actuation limits, unmodeled dynamics, state estimation, and measurement error from the environment. These uncertainties severely deteriorate the execution success rate of the high-level planner, in particular when the robot performs highly agile motions in complex and unstructured environments. In addition, the abstraction methods can induce approximation errors between the high-level and low-level planners. Although not directly dealing with these low-level uncertainties and abstraction approximation errors, the keyframe-based robustness margin proposed in this paper can be viewed as an abstract representation of these uncertainties in the center-of-mass (CoM) state space. As long as a mapping can be established between these low-level uncertainties and the CoM phase-space deviations from the nominal trajectory, these uncertainties can be handled indirectly by the proposed reachability controller at run-time. Additionally, a replanning strategy is designed to handle large uncertainties that are not explicitly modeled in the reachability controller. In the future, abstraction refinement (Nilsson and Ozay, 2014) will be inspirational for designing a model abstraction with a proper granularity. More importantly, implementing the proposed high-level decision-making algorithms in the dynamic simulation and real hardware (Kim et al., 2016; Luo et al., 2017)), and evaluating the robustness performance against low-level uncertainties will be our main upcoming task.
8.2. System and environment assumption relaxation
To make the proposed hierarchical planning approach applicable to locomotion tasks in more complex and cluttered environments, it is important to relax the assumptions and approximations of the environment and model more realistic scenes. For instance, how to formally design recovery strategies for slippery terrains (i.e., with friction coefficient inaccuracies), large tilting angles, and swing foot obstacle collision is a practically meaningful topic.
Our current planner assumes that all limb contacts switch synchronously. To relax this conservative assumption, we will explore the asynchronous contact switching strategy in the future. This relaxation opens up the opportunity for designing more natural and diverse locomotion contact behaviors. From a more general perspective, contact actions and keyframe states may exhibit probabilistic features. Incorporating probabilistic models, such as Markov decision process (MDP) (Feng et al., 2015; Fu and Topcu, 2014; Platt et al., 2004), into the high-level decisions will be a promising direction. Accordingly, studying probabilistic correctness and completeness will be of our interest.
8.3. Generalization to complex tasks
Generalizing the proposed planning framework to more realistic locomotion tasks is of practical importance, in particular when robots are unleashed into the real world. Some more practical locomotion tasks include walking while carrying a payload, walking alongside human teammates, dynamically interacting with a human during motion (Alonso-Mora et al., 2018), and multi-robot coordination (da Silva et al., 2016). To this end, how to design an automatic method for generating locomotion primitives of diverse tasks becomes important. Also, allocating computing resources efficiently among different planning layers is an essential topic. A mission planner will be needed to operate at a more abstract level to make decisions on task allocation, coordination, and synchronization. A key problem is how to properly design integrated, scalable, and reactive mission and motion planners (da Silva et al., 2016) for legged robots to accomplish collaborative tasks in dynamic and unstructured environments.
At the individual robot level, our motion planner is designed for the three-dimensional case, although the demonstrated locomotion tasks are primarily straight walking. In the future, we will incorporate steering models (Zhao et al., 2017) such that the locomotion behaviors are extendable to complex 3D motions with steering capabilities. An advantage of our planning framework is to use simplified models which allow us to efficiently compose multiple locomotion modes and achieve dynamic and complex locomotion behaviors in constrained environments. The high-level symbolic planner automates this sequential composition process and guarantees the formal correctness of the overall planning framework.
An application of the proposed whole-body dynamic locomotion methodology in the constrained environment is the following: The US Defense Advanced Research Projects Agency (DARPA) created a Subterranean Challenge (DARPA, 2018) aiming at augmenting underground operation capabilities. “The Challenge aims to explore new approaches to rapidly map, navigate, and search underground environments … and propose novel methods for tackling time-critical scenarios through unknown courses in mapping subsurface networks and unpredictable conditions, which are too hazardous for human first responders.” Our proposed hierarchical decision-making approach for whole-body dynamic locomotion in constrained environments raise the importance of decision-making algorithms with formal guarantees for robots as complex as humanoid robots, a research topic of increasing importance as these robots begin to move out of the laboratories and work outdoors.
8.4. Planning horizon
Making planner decisions with a sufficiently long predictive horizon has great potential to enable intelligent and robust behaviors in complex and dynamically changing environments (Egerstedt et al., 2018). Our task planner has a one-walking-step horizon and may sometimes result in myopic locomotion decisions. For instance, if the disturbance is so large that the robot cannot recover within one walking step, our planner will execute a replanning process. However, a natural alternative is to design a recovery strategy over the next multiple walking steps, which is commonly used in the recovery process of human locomotion. The downside of this strategy is the increase in computational complexity. Our planning process substantially relieves this computational burden by using the simplified locomotion models. In addition to this computational consideration, the design of the planning horizon should take into account the versatility of the locomotion behaviors. For instance, if the locomotion process is of high speed, being able to make predictions over a longer horizon will be advantageous. Overall, we should take into account the computational power and behavior versatility when designing the planning horizon. Algorithm 2 is designed in a general form and should be extendable to the multiple-walking-step scenario.
9. Conclusions
This paper employs temporal-logic-based formal methods to synthesize a high-level reactive task planner and designs a middle-level discrete control to achieve the one-walking-step robust reachability process for complex WBDL behaviors in constrained environments. A particular focus has been given to (i) the robustness of the keyframe state reachability with respect to bounded disturbances and (ii) the correctness of the top-down hierarchy from the high-level task planner to the low-level motion planner processes.
A diverse set of locomotion models are devised at the low-level to form a template library in response to various environmental events, including those adversarial ones such as cracked terrain, human appearance, and narrow passage. These adversarial events require specifically-designed locomotion modes to enable desired locomotion behaviors. Deviating from numerous existing studies primarily using a single simplified model for a specific locomotion task, our symbolic task planner focuses on integrating and unifying a variety of simplified models and achieves complex locomotion behaviors via sequential composition of trajectories. A key novelty of this task planner lies in solving the traditional contact planning problem via a two-player game. Contact decisions are determined according to the synthesized switching protocol in response to possibly adversarial environment actions.
As for the reachability control under disturbances, we propose a robust metric of the keyframe state and use it to design a robust finite transition system realized by the underlying reachability synthesis. The proposed task and motion planner is validated through simulations of WBDL maneuvers in constrained environments. The performance of the reachability control is benchmarked via a series of synthesis and execution tests. We expect that this line of work acts as an entry point for the locomotion community to employ formal methods to verify and synthesize planners and controllers in legged and humanoid robots (Hereid et al., 2016; Kuindersma et al., 2016; Ramezani et al., 2014). Evaluating the proposed framework on dynamic bipedal robots is one of our high-priority future works.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was partially supported by the NSF Grant [#1924 978, #1724 360, #1652 113], Office of Naval Research (ONR) Grant [grant #N000141512507], and partially supported by NSERC of Canada, the Canada Research Chairs program, and an Ontario MRIS Early Researcher Award.
