Professional Documents
Culture Documents
2010 14 HybridControl Shape-Accelerated
2010 14 HybridControl Shape-Accelerated
Abstract— This paper presents a hybrid control strategy controller does not have any knowledge of the workspace
for navigation of shape-accelerated underactuated balancing constraints and obstacles in the environment. Though it is
systems with dynamic constraints. It extends the concept of se- possible to make dynamic, underactuated balancing systems
quential composition to perform discrete state-based switching
between asymptotically convergent control policies to produce navigate environments using these decoupled procedures,
a globally asymptotically convergent feedback policy. The they are often sub-optimal and result in jerky motions,
individual control policies consists of an external trajectory where the controller is fighting with the dynamics of the
planner, a shape trajectory planner, an external trajectory system to move it around. Moreover, when disturbed, these
tracking controller and a balancing controller. The paper also procedures often either result in collision with the obstacles
presents an integrated planning and control procedure, wherein
standard graph-search algorithms are used to plan for the or drive the system unstable. In order to achieve robust,
sequence of control policies that will help the system achieve smooth and collision-free motions, an integrated planning
a navigation goal. Simulation results of the 3D ballbot system and control procedure is necessary, where both the planner
navigating an environment with static obstacles to reach the and the controller understand the system dynamics and also
goal position are also presented. understand each other’s details.
I. INTRODUCTION
Underactuated mechanical systems are systems with fewer A. Related Work
control inputs than the degrees of freedom [1]. In robotics,
In the last decade, there has been a large body of work
balancing (dynamically stable) mobile robots form a special
on hybrid control techniques that will avoid decoupling
class of underactuated systems. They include wheeled robots
between planners and controllers. Sequential composition,
like Segway [2], ballbots [3] and legged robots. Balancing
introduced in [6], is a controller composition technique that
robots will play a vital role in realizing the dream of
connects a palette of controllers and automatically switches
placing robot workers in human environments by virtue of
between them to generate a globally convergent feedback
their small footprints and high centers of gravity. Among
policy. This technique was successfully applied to a variety
wheeled balancing systems, ballbots have the advantage
of systems [7], [8], [9], [10]. In [11], sequential composition
of omnidirectional motion that make them more suitable
was extended to produce an integrated planning and control
for operation in constrained spaces. These omnidirectional
procedure to achieve global navigation objectives for convex-
balancing systems, referred to as shape-accelerated under-
bodied wheeled mobile robots navigating amongst static
actuated balancing systems [4], are of interest here. The
obstacles.
interesting and troubling factor in control and planning
for such underactuated systems is the constraint on their
dynamics by virtue of underactuation. These constraints B. Contributions of the Paper
are second-order nonholonomic [5] constraints, i.e., non-
integrable acceleration/dynamic constraints, which restrict This paper presents a hybrid control framework for navi-
the family of trajectories the configurations can follow. gation of shape-accelerated underactuated balancing systems.
Underactuated balancing systems that are destabilized by Sequential composition [6] is used to discretely switch be-
gravitational forces have to maintain balance, which makes tween individual, asymptotically convergent control policies
it difficult to track desired configuration trajectories. to produce a globally, asymptotically convergent feedback
Traditionally, motion planning and control for mobile control policy that will achieve the overall navigation goal.
robots have been decoupled. Robot motion planning proce- The individual control policies are a combination of local
dures, generally, account for obstacles in the environment planners and controllers, as will be described in Sec. IV.
and workspace constraints but do not account for the system The local planner plans shape trajectories that account for the
dynamics and the constraints on them. They also do not dynamic constraints of the system in order to effectively track
have any knowledge of the details of the controller that is desired external configuration trajectories [12], [4]. Graph-
used to achieve these motion plans. On the other hand, the search algorithms like A∗ are used as a high-level planner
to plan for the sequence of control policies that will help
U. Nagarajan, G. Kantor and R. Hollis are with The Robotics the system achieve the navigation goal. This paper primarily
Institute, Carnegie Mellon University, Pittsburgh, PA 15213,
USA. umashankar@cmu.edu, kantor@ri.cmu.edu, focuses on the ballbot [3] and simulation results on a 3D
rhollis@cs.cmu.edu model are presented in Sec. V.
3567
A. Optimal Shape Trajectory Planner underactuated systems [4]. For a desired constant acceler-
In underactuated balancing systems, we are often inter- ation trajectory, Kqx = K̂qx ensures optimality, but for any
ested in tracking desired trajectories for the external variables general q̈dx (t), Kqx = K̂qx provides a good initial guess for
without losing balance. Shape-accelerated underactuated bal- the optimization process. It is to be noted that the optimality
ancing systems in Sec. II-A have constraints on the accel- here is in tracking error and not in time or path length.
eration of these external variables w.r.t. the shape variables’ In design of the optimal shape trajectory planner described
position, velocity and acceleration as given below: above, the objective has been approximate tracking of q̈dx (t),
but in reality, it would be desirable to track some qdx (t).
q̈x = −Msx (qs )−1 (Mss (qs )q̈s + hs (qs , q̇s )) Under the current procedure, this is possible only if the
= Γ(qs , q̇s , q̈s ). (9) initial conditions for the external variables are met. In
order to approximately track a desired external configuration
It is to be noted that Eq. 9 holds only if Msx (qs )−1
exists, trajectory qdx (t) using the optimal shape trajectory planner
which it does in the neighborhood of the origin, a property described above, the follow conditions must hold: (i) qdx (t)
of shape-accelerated underactuated balancing systems [4]. must be of at least of class C2 , i.e., q̇dx (t) and q̈dx (t) exist
Using a = (qs , q̇s , q̈s ) and b = q̈x , Eq. 8 can be written as and are continuous; (ii) initial conditions for the external
Θ(a, b) = 0. Taking the Jacobian w.r.t. b at (a, b) = (0, 0) variables are met, i.e., qxp (0) = qdx (0) and q̇xp (0) = q̇dx (0). It
yields is to be noted that qdx (t) is preferred to be of class C4 so that
∂ Θ(a, b)
|(a,b)=(0,0) = Msx (qs )|qs =0 (10) the first four derivatives exist and are continuous.
∂b
From properties of shape-accelerated underactuated balanc- B. Balancing and Trajectory Tracking Control
ing systems [4], the Jacobian in Eq. 10 exists and is invert- The shape trajectory planning procedure, described in
ible. Hence, by the implicit function theorem, there exists a Sec. III-A, assumes that there exists a balancing controller,
map Γ : a → b such that Θ(a, Γ(a)) = 0, which can be seen which has good shape trajectory tracking performance. Sim-
from Eq. 9. Again from the implicit function theorem, the ilar to [14], [12], this work uses a linear PID controller
map Γ is not invertible since the Jacobian ∂ Θ(a, b)/∂ a at (Eq. 14) as the balancing controller.
(a, b) = (0, 0) exists but is not invertible.
In order to track a non-constant, time-varying q̈dx (t), τ (t) = β p (qs (t) − qds (t)) + βi (qs (t) − qds (t))
there is no function that maps q̈dx (t) to (qds (t), q̇ds (t), q̈ds (t))
such that the dynamic constraints in Eq. 8 are satisfied. +βd (q̇s (t) − q̇ds (t)), (14)
So, it is desirable to plan shape configuration trajectories
where, β p , βi , βd are the proportional, integral and derivative
(qsp (t), q̇sp (t), q̈sp (t)), which when tracked will result in ap-
gains respectively.
proximate tracking of q̈dx (t). Here, a linear map Kqx : q̈dx → qsp
is proposed such that
qs
d Shape-Accelerated qx
... .... Balancing t
Kqx = argmin Γ(K q̈dx (t), K q dx (t), K q dx (t)) − q̈dx (t)22 . (11) + Controller
Underactuated
qs
K Balancing Systems
-
Here, the ... planned ....shape trajectories (qsp (t), q̇sp (t), q̈sp (t)) =
(K q̈dx (t), K q dx (t), K q dx (t)).
c -
The shape trajectory planning procedure is now an opti- qs
Tracking + d
qx
mization problem with the objective of finding the elements + Controller
of Kqx such that the L2 -norm of the error in tracking q̈dx (t) +
p
qs Optimal Shape
is minimized. It is to be noted that the parameter space is
Trajectory Planner
m2 -dimensional and any optimization algorithm that solves
nonlinear least-squares problem can be used.
Fig. 2. Control Architecture.
A good initial guess for Kqx is obtained from the dynamic
constraint given by Eq. 9 with (q̇s , q̈s ) = (0, 0). In this case,
Eq. 9 reduces to The combination of the balancing controller and the op-
timal shape trajectory planner provides good approximate
q̈x = −Msx (qs )−1 Gs (qs ) (12) tracking of the desired external configuration trajectories
under idealized conditions. However, an external trajectory
in the neighborhood of the origin. Jacobian linearization of
tracking controller is required to achieve tracking in more
Eq. 12 w.r.t. qs at qs = 0 gives
realistic conditions such as different initial conditions, un-
∂ (Msx (qs )−1 Gs (qs )) modeled dynamics and disturbances. In this work, a linear
q̈x = − |qs =0 qs
∂ qs PD controller (Eq. 15) is used as the external trajectory
= K̂qs qs , (13) tracking controller.
and K̂qx = K̂q−1 . The inverse, K̂q−1 , exists in the neighborhood qds (t) = qsp (t) + qcs (t),
s s
of the origin due to the properties of shape-accelerated qcs (t) = γ p (qx (t) − qdx (t)) + γd (q̇x (t) − q̇dx (t)), (15)
3568
where, γ p , γd are the proportional and derivative gains the shape and external variables. Any changes in shape
respectively. configurations cause changes in the external configurations
The resulting control architecture (Fig. 2) has been shown and in order to track any desired external configuration
to work well on the experimental ballbot setup [14]. The trajectory, the shape configurations have to follow a particular
balancing controller tracks the desired shape trajectories, trajectory that is stable. The balancing controller (Sec. III-
qds (t), which are a sum of planned, qsp (t) and compensation, B), which makes this tracking possible, is assumed to have
qcs (t), shape trajectories. The planned shape trajectories are a large enough domain of attraction in the shape state space.
given by the optimal shape trajectory planner, whereas the
compensation shape trajectories are provided by the tracking
controller, which tries to compensate for the deviation of
external trajectories from the desired ones.
3569
Bezier curves are used to plan external trajectories for mo- with discrete state-based switching between them. To initiate
tion inside the ice-cream hypercone domains. The parametric the switching behavior, it is necessary to determine whether
Bezier curve (x(t), y(t)) can be written as: the state trajectory has entered a domain or not. Simple
n n geometric domain shapes make it easier to determine whether
x(t) = ∑ xi bi,n (t), y(t) = ∑ yi bi,n (t), (16) the external state lies inside the geometric shape by use of
i=0 i=0 analytical equations.
where, (xi , yi ) are the n + 1 control points and the Bernstein The Bezier curves and the corresponding shape trajectory
polynomial bi,n (t) is given by: plans are generated online based on the entering external
state values. These trajectories are tracked until the state
n
bi,n (t) = t i (1 − t)(n−i) (0 ≤ t ≤ 1). (17) trajectory enters the next policy domain along the path
i towards its overall goal. This switching behavior continues
The n + 1 control points are chosen such that the boundary until the state trajectory reaches the final policy domain that
conditions are satisfied, i.e., the initial condition and the contains the overall goal.
desired final goal configuration. When the boundary condi-
tions include conditions on the derivatives, the n + 1 control V. S IMULATION R ESULTS
points and, in turn, the Bezier curve is a function of the
time duration of the curve. The optimal time duration that In this section, we present simulation results of the hybrid
minimizes the summed area under the curve of the position, control procedure, described in Sec. IV, on the 3D ballbot
velocity, acceleration and jerk trajectories is used here. A model. Here, the navigation goal is to move from (0 m, 0
variety of other objective functions can also be optimized to m) to (10 m, 10 m) around two obstacles in the environment
obtain the time duration. shown in Fig. 4(a).
Shape trajectories are planned using the optimal shape
trajectory planner described in Sec. III-A for the Bezier
curves. As described in Sec. III-B, the external trajectory 12
1
tracking controller, in combination with the shape trajectory 10 GOAL 2 13
28
planner, is used to provide desired shape trajectories that the 26 27
3 14
Y Linear Position (m)
8 25 12
balancing controller will track. 17 18 19 4 20 15
24 11
6 16 5 21 16
D. Prepares Graph 23 10
15 6 22 17
Given an environment and a collection of policy domains 4
22 9 7 23 18
14
distributed in the 4D space, a prepares graph can be gener- 2 21 8 8 24 19
13
ated, where each node corresponds to a policy domain and 2 3 20
4 5 6 7 9 25
each directed link represents the prepares relationship. The 0 1
START 10 26
task of navigating from a given start state to a desired overall −2 11 12 27
goal state can be performed by the described hybrid control −2 0 2 4 6 8 10 12
scheme provided the following conditions are satisfied: (i) X Linear Position (m) 28
there is at least one domain that contains the start state; (a) (b)
(ii) there is at least one domain whose goal configuration
matches the goal state; and (iii) there is a path between these Fig. 4. (a) Environment with start and goal configurations and also the
available policy domains; (b) Prepares Graph.
two domains in the prepares graph.
Optimal graph search algorithms like A∗ can be used to
obtain a sequence of control policies that are optimal w.r.t. For example, suppose that the ice-cream hypercone pol-
some cost function. A variety of heuristic functions can icy domains are placed in the environment as shown in
be used to even optimize for time or length of the path. Fig. 4(a). Note that the policy domains are placed in a
Existing dynamic replanning graph search algorithms like D∗ 4-dimensional subset of the state space and are projected
[16] can be used for automatically replanning the sequence onto the 2-dimensional workspace for ease of visualization.
of control policies when the system is disturbed from its The resulting prepares graph is shown in Fig. 4(b). In the
current path. An integrated planning and control procedure case presented, none of the policy domains collide with the
has been developed, where the high-level planner is planning obstacles. The policy domains that intersect the obstacles are
a sequence of control policies and dynamically updating the removed before generating the prepares graph. Moreover, the
sequence based on the system’s current state and overall collision check is done with the outer ice-cream hypercone
desired goal. domains D (Φi ) for each control policy Φi (Fig. 5).
A∗ , with Euclidean distance between the goal configu-
E. Switching Control rations of domains as the distance metric, was used to
Given a path i.e., a sequence of control policies to reach determine the optimal path as shown in Fig. 5. The resulting
the overall goal, a hybrid control strategy is used that sequen- linear position and body angle trajectories are shown in
tially composes asymptotically convergent control policies Figs. 6 and 7 respectively.
3570
12 3
D(Φi ) φx
D (Φi ) 2
GOAL φy
Fig. 5. Navigation among static obstacles using Hybrid Control Scheme. VIII. ACKNOWLEDGMENTS
This work was supported in part by NSF grants IIS-
12 0308067 and IIS-0535183.
10 R EFERENCES
Linear Position (m)
3571