Neural-Learning-Based Telerobot Control With Guaranteed Performance
Abstract—In this paper, a neural network (NN) enhanced telerobot control system is designed and tested on a Baxter robot. Guaranteed performance of the telerobot control system is achieved at both the kinematic and dynamic levels. At the kinematic level, automatic collision avoidance is achieved by a control design that exploits the joint space redundancy, so that the human operator can concentrate solely on the motion of the robot's end-effector without concern about possible collisions. A posture restoration scheme based on a simulated parallel system is also integrated to enable the manipulator to restore its natural posture in the absence of obstacles. At the dynamic level, adaptive control using radial basis function NNs is developed to compensate for the effects caused by internal and external uncertainties, e.g., an unknown payload. Both the steady state and the transient performance are guaranteed to satisfy a prescribed performance requirement. Comparative experiments have been performed to test the effectiveness and to demonstrate the guaranteed performance of the proposed methods.

Index Terms—Collision avoidance, guaranteed performance, neural networks (NNs), telerobot control.

Manuscript received April 19, 2016; accepted May 22, 2016. Date of publication June 21, 2016; date of current version September 14, 2017. This work was supported in part by the Engineering and Physical Sciences Research Council under Grant EP/L026856/2 and Grant EP/J004561/1, in part by the National Natural Science Foundation of China under Grant 61422310 and Grant 61473038, and in part by the Beijing Natural Science Foundation under Grant 4162066. This paper was recommended by Associate Editor P. X. Liu. (Corresponding author: Chenguang Yang.)

C. Yang is with the Zienkiewicz Centre for Computational Engineering, Swansea University, Swansea SA1 8EN, U.K. (e-mail: cyang@theiet.org).
X. Wang and H. Ma are with the School of Automation and the State Key Laboratory of Intelligent Control and Decision of Complex System, Beijing Institute of Technology, Beijing 100081, China.
L. Cheng is with the State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China.
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TCYB.2016.2573837
This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/

I. INTRODUCTION

IN THE last few decades, teleoperated robots, also known as telerobots, have been widely applied to human-unfriendly tasks such as handling radioactive material and searching in dangerous environments. In comparison to fully automatic robot manipulators used to perform routine tasks in static and structured environments, telerobots can work in dynamic and unstructured environments to perform more diverse tasks. A stereoscopic image processing-based teleoperated system is proposed in [1], where stereoscopic images are displayed to the operator to provide assistance in manipulative tasks. In [2], an environment to synthesize Internet-based teleoperation systems is studied, and the identified delay parameters of an Internet segment are exploited to design a stable controller. In most conventional teleoperation systems, the environmental information is supposed to be fed back to the operators directly, and they thus have to take care of every single interaction with the environment around the manipulator. The operator can manipulate a telerobot in this manner easily when the environment is simple, but in a dynamic and uncertain environment, this approach could place an extremely heavy burden on the operator.

Typically, a telerobot system is subject to two types of uncertainties that affect the control performance. One is concerned with the external environment around the robot, e.g., potential obstacles. The other is concerned with internal uncertainties such as unknown dynamics and a varying payload. For the external uncertainties, partial feedback of the telerobot's environment information to the operator may limit the application range of a teleoperation system, while comprehensive feedback may distract the operator from focusing on the task. A visual sensing-based teleoperation method is designed in [5], in which the operator is able to control the motion of each single joint of the robot manipulator by transferring his/her hand-arm motion captured by the vision system in real time. In such a joint-to-joint teleoperation system, human operators take full control of the manipulator, including each degree of freedom (DOF). The workload of the operator is thus imaginably high. Therefore, the shared control strategy has attracted much attention [3], [4]. In the shared control framework, a telerobot is partially automatically controlled to assist the neuromotor control of the human operator.

To decrease the workload of the human operator, we consider employing the shared control framework and embedding an automatic collision avoidance mechanism into the teleoperation system, to enable the telerobot to interact safely with a dynamic environment. In this manner, the operator is able to focus on manipulation of the end-effector of the telerobot, which can avoid collisions automatically with little influence on the end-effector, by using the redundancy mechanism. In previous studies on manipulator collision avoidance, the redundancy of the manipulator is commonly used. Redundant manipulators are studied in [6] and [7], in which the secondary goal is described by a homogeneous solution in the joint space. The solution is decomposed into a particular and a homogeneous component, such that the multiple goals can be
YANG et al.: NEURAL-LEARNING-BASED TELEROBOT CONTROL WITH GUARANTEED PERFORMANCE 3149
Fig. 5. Illustrations of the collision points pcr and pco and of the Baxter arm's joints.

Fig. 6. Illustration of obstacle detection. The red 3-D model consisting of spheres is a simple 3-D model of the robot. The green points in the point cloud denote the obstacle. The blue and red points represent the collision points for the left and the right arms, respectively. The black line indicates the distance measured between the robot manipulator and the obstacle.

TABLE I
NOMENCLATURE FOR DESIGN AT KINEMATIC LEVEL (i = 1, 2, 3)

robot and the latter one on the obstacle, if of the shortest distance d between the robot manipulator and the obstacle.

As this paper does not focus on the obstacle detection method, the detection result is directly provided in Fig. 6, where the red 3-D model is the 3-D model of the robot, the green points in the point cloud denote the obstacle, the blue and the red points represent the collision points on the left and the right arms, respectively, and the black line segments show the distance between the robot manipulator and the obstacle.

III. CONTROL STRATEGY AT KINEMATIC LEVEL

The control to be designed at the kinematic level aims to generate a reference trajectory θ̇d in the joint space, such that the manipulator can avoid potential collisions, as shown in the left dashed box in Fig. 1. More specifically, the control goal is to make the telerobot manipulator's end-effector accurately follow the reference trajectory in the Cartesian space commanded by the operator, while simultaneously avoiding the obstacle automatically. For the convenience of the readers, the notations used in this section are presented in Table I.

A. Collision Avoiding

Without loss of generality, in this paper we only focus on the cases of collision avoidance that can be achieved using the kinematic redundancy mechanism, i.e., the joint motion in the null space of the Jacobian Je. Let us now consider one arm of the

where Jei, i = 1, 2, . . . , n, is the ith column of Je.

The first goal of control design at the kinematic level is to find joint velocities for the manipulator's end-effector to follow the desired position xd commanded by the operator in the Cartesian space. In order to achieve that, a closed-loop kinematic control law is designed in the Cartesian space as below:

ẋe = ẋd + Ke ex (5)

where ex = xd − xe is the position error for the end-effector in the Cartesian space, defined as the difference between the desired position and the actual position. The positive definite Ke is a gain matrix to be specified by the designer.

The second goal is to avoid any potential collision with minimal effect on the first goal. This can be achieved by exploiting the kinematic redundancy mechanism of the manipulator in the joint space. If the DOF number of a manipulator is larger than the dimension of the end-effector velocity, then the manipulator is regarded as kinematically redundant. The inverse kinematics of a kinematically redundant manipulator is not well defined because there are an infinite number of solutions. Using the pseudo-inverse of the Jacobian matrix J† defined in Table I, we give a general inverse kinematics solution as

θ̇ = J† ẋ + (I − J† J) z (6)

where z is a vector to be used for the collision avoidance design [6].

When the collision points are close to the manipulator arm, the manipulator can simply move in the direction opposite to the obstacle's velocity. The desired avoiding velocity ẋo must satisfy the kinematic constraint described by

ẋo = Jo θ̇ (7)

where the Jacobian matrix of the collision point, Jo, is defined in Table I. To reduce computational complexity, Jo can be simply chosen in the manner

Jo = [Je1, . . . , Jel, 0, . . . , 0]

in which the first l columns are taken from Je and the remaining n − l columns are simply zero vectors, where l ≤ n is the number of joints that are above the potential collision point pcr.

As mentioned above, the collision avoiding velocity can be designed to make the collision point pcr move in the direction opposite to the obstacle velocity's direction, as graphically illustrated in Fig. 7. It is reasonable to assume that the collision point pcr should move faster when the obstacle is closer. Thus, we design the collision avoiding velocity ẋo as

     ⎧ 0,           d ≥ do
ẋo = ⎨ γ(d) vmax,   dc < d < do     (8)
     ⎩ vmax,        d ≤ dc
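To make the roles of (5), (6), and (8) concrete, the following is a minimal numpy sketch, not the authors' implementation: `redundant_ik_step` combines the closed-loop Cartesian law (5) with the null-space solution (6), and `avoiding_speed_scale` is one possible choice of the scaling γ(d) in (8) (the linear interpolation between dc and do is an assumption; the paper only requires the avoiding speed to grow as the obstacle approaches).

```python
import numpy as np

def avoiding_speed_scale(d, d_c, d_o):
    """Piecewise scaling for the avoiding velocity in (8): zero beyond d_o,
    full speed inside d_c, and (here, assumed linear) interpolation between."""
    if d >= d_o:
        return 0.0
    if d <= d_c:
        return 1.0
    return (d_o - d) / (d_o - d_c)

def redundant_ik_step(J_e, x_dot_d, e_x, K_e, z):
    """General inverse-kinematics solution (6) driven by the closed-loop
    Cartesian law (5):
        theta_dot = J_e^+ (x_dot_d + K_e e_x) + (I - J_e^+ J_e) z.
    The second term lies in the null space of J_e, so the secondary
    (avoidance) task z does not disturb the end-effector motion."""
    J_pinv = np.linalg.pinv(J_e)
    n = J_e.shape[1]
    null_proj = np.eye(n) - J_pinv @ J_e   # projector onto the null space
    return J_pinv @ (x_dot_d + K_e @ e_x) + null_proj @ z
```

For any full-row-rank Je, multiplying the returned joint velocity by Je recovers ẋd + Ke ex regardless of z, which is exactly the decoupling property that the collision avoidance design relies on.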
3152 IEEE TRANSACTIONS ON CYBERNETICS, VOL. 47, NO. 10, OCTOBER 2017
position of three joints, namely S1, E1, and W1, the Cartesian positions of which are denoted as xr1, xr2, and xr3, respectively. In the restoring process, each of these three joints on the real manipulator is supposed to move to coincide with the same joint of the parallel system. The aforementioned dimension reduction method is applied here as well, to avoid the kinematics nonsolvable problem and to achieve position control of all the joints simultaneously. The manipulator needs to satisfy (4) and (12) when there is no obstacle around

[ẋr1ᵀ, ẋr2ᵀ, ẋr3ᵀ] [ẋr1ᵀ, ẋr2ᵀ, ẋr3ᵀ]ᵀ = [ẋr1ᵀ, ẋr2ᵀ, ẋr3ᵀ] [Jr1ᵀ, Jr2ᵀ, Jr3ᵀ]ᵀ θ̇ (12)

where the Jacobian matrices [Jr1ᵀ, Jr2ᵀ, Jr3ᵀ]ᵀ and the joint velocities [ẋr1ᵀ, ẋr2ᵀ, ẋr3ᵀ]ᵀ are defined in Table I. Define ẋr = [ẋr1ᵀ, ẋr2ᵀ, ẋr3ᵀ]ᵀ and Jr = [Jr1ᵀ, Jr2ᵀ, Jr3ᵀ]ᵀ. Then (12) can be rewritten as

ẋrᵀ ẋr = ẋrᵀ Jr θ̇. (13)

The restoring velocity ẋr is designed based on the feedback position errors, as below:

ẋr = Kr er (14)

where er = [er1ᵀ, er2ᵀ, er3ᵀ]ᵀ is the vector of position errors between the simulated artificial manipulator and the actual manipulator, and the positive definite gain matrix Kr is to be specified by the designer.

D. Control Design at Kinematic Level

Let us combine the general inverse kinematics solution (6) with the dimension reduction equations (10) and (12); then we have the following equations:

ẋoᵀ Jo Je† ẋe + ẋoᵀ Jo (I − Je† Je) zo = ẋoᵀ ẋo (15)
ẋrᵀ Jr Je† ẋe + ẋrᵀ Jr (I − Je† Je) zr = ẋrᵀ ẋr. (16)

The solutions of zo and zr can be derived from (15) and (16), as given in the following equations:

zo = (ẋoᵀ Jo (I − Je† Je))† (ẋoᵀ ẋo − ẋoᵀ Jo Je† ẋe) (17)
zr = (ẋrᵀ Jr (I − Je† Je))† (ẋrᵀ ẋr − ẋrᵀ Jr Je† ẋe). (18)

In order to switch smoothly between the obstacle avoidance and the restoration, we employ a weighted sum of (17) and (18), and integrate it into the design of the desired joint velocities as

θ̇d = Je† ẋe + (I − Je† Je)[α zo + (1 − α) zr] (19)

where the weight factor α is chosen in [0, 1], depending on the distance between the obstacle and the manipulator, as

    ⎧ 0,                    d ≥ do
α = ⎨ (do − d)/(do − dr),   dr < d < do     (20)
    ⎩ 1,                    d ≤ dr

where dr is the distance threshold at which the manipulator starts to restore its original pose.

The control strategy at the kinematic level can be obtained by substituting (5) into (19), as

θ̇d = Je† (ẋd + Ke ex) + (I − Je† Je)[α zo + (1 − α) zr]. (21)

Lemma 1: Consider the desired velocity θ̇d defined in (21). If the joint velocity θ̇ completely follows θ̇d, then the end-effector's position error ex will asymptotically converge to zero.

Proof: See the Appendix.

IV. CONTROL STRATEGY AT DYNAMICS LEVEL

The control strategy at the dynamics level aims to make the manipulator follow the joint space trajectory generated at the kinematics level, as shown in Fig. 1. The radial basis function NN (RBFNN) is used to compensate for the unknown dynamics, especially that caused by the unknown payload, to guarantee the steady state performance of the controller. An error transformation method is employed in the design to guarantee the transient performance.

A. Radial Basis Function NN

The effectiveness of the linear-in-parameter RBFNN has been extensively tested by a large number of researchers, and it has been theoretically proved that an RBFNN is able to approximate any continuous function φ(θ): Rm → R arbitrarily closely on a compact set Ωθ ⊂ Rm as [27], [28]

φ(θ) = Wᵀ Z(θ) + εφ, ∀θ ∈ Ωθ (22)

where W = [ω1, ω2, . . . , ωl]ᵀ ∈ Rl is the ideal NN weight vector of constant elements, θ ∈ Ωθ ⊂ Rm is the input vector, εφ is the bounded approximation error, and Z(θ) = [z1(θ), z2(θ), . . . , zl(θ)]ᵀ ∈ Rl is the basis function vector with zi(θ) chosen as Gaussian functions as below [29]

zi(θ) = exp( −(θ − μi)ᵀ(θ − μi) / ηi² ), i = 1, 2, . . . , l (23)

where the vector μi = [μi1, μi2, . . . , μim]ᵀ ∈ Rm represents the node centers and ηi the variance. The value of the ideal weight vector W minimizes the approximation error εz for all θ ∈ Ωθ in the following manner:

W := arg min_{W ∈ Rl} sup |φ(θ) − Wᵀ Z(θ)|, θ ∈ Ωθ. (24)

If the number of NN nodes l is sufficiently large and the node centers μi are appropriately chosen, the approximation error |εz| can be made arbitrarily small.

B. Control Design at Dynamic Level

1) Error Transformation and Joint Position Control Loop: Let us define the joint angle tracking error eθ = θ − θd, and then employ the following error transformation functions [12], [30]:

eθi(t) = ρ(t) Ri( Pi( eθi(t)/ρ(t) ) ), i = 1, 2, . . . , n (25)
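The Gaussian basis (23) and the least-squares view of the ideal weight in (24) can be illustrated with a small numpy sketch. This is not the controller's online learning law; it only shows, under assumed node centers, widths, and an arbitrary example target function, that a linear-in-parameter RBF network fits a smooth function closely once the nodes cover the compact set.

```python
import numpy as np

def rbf_basis(theta, centers, eta):
    """Gaussian basis (23): z_i(theta) = exp(-||theta - mu_i||^2 / eta_i^2).
    centers is an (l, m) array of node centers mu_i; eta is an (l,) array."""
    diff = theta[None, :] - centers            # (l, m) differences
    return np.exp(-np.sum(diff**2, axis=1) / eta**2)

# Batch least-squares fit of W in phi(theta) ~ W^T Z(theta), a stand-in
# for the ideal weight of (24). Centers, widths, and the target sin(2t)
# are illustrative assumptions, not values from the paper.
rng = np.random.default_rng(1)
centers = rng.uniform(-1.0, 1.0, size=(50, 1))   # l = 50 nodes, m = 1 input
eta = 0.5 * np.ones(50)
samples = np.linspace(-1.0, 1.0, 200)
Z = np.stack([rbf_basis(np.array([t]), centers, eta) for t in samples])
phi = np.sin(2 * samples)                        # continuous target on a compact set
W, *_ = np.linalg.lstsq(Z, phi, rcond=None)
err = np.max(np.abs(Z @ W - phi))                # worst-case fit error
```

Evaluating `rbf_basis` at one of the centers returns 1 for that node, and the fit error `err` shrinks as more nodes are added, mirroring the remark that |εz| can be made arbitrarily small for sufficiently large l.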
where

        ⎧ (e^x − δ)/(1 + e^x),     if eθi(0) ≥ 0
Ri(x) = ⎨                                            (26)
        ⎩ (δ e^x − 1)/(1 + e^x),   if eθi(0) < 0

and Pi(·), defined below, is the inverse function of Ri(·)

        ⎧ ln((x + δ)/(1 − x)),   eθi(0) ≥ 0
Pi(x) = ⎨                                            (27)
        ⎩ ln((x + 1)/(δ − x)),   eθi(0) < 0

with ρ(t) denoting the tracking performance requirement, which is defined as

ρ(t) = (ρ0 − ρ∞) e^{−pt} + ρ∞ (28)

where the parameters δ, ρ0, ρ∞ < ρ0, and p used above are positive constants to be specified by the designer, depending on the requirement of the tracking performance.

The joint position control loop aims to generate the desired joint angular velocities and to guarantee the transient performance of the manipulator. Define ηi(t) as

ηi(t) = Pi( eθi(t)/ρ(t) ). (29)

Then the joint position control loop is designed as follows:

vdi(t) = −k1 ρ(t) ηi(t) + θ̇di(t) + (ρ̇(t)/ρ(t)) eθi(t). (30)

Lemma 2 [12], [30]: If ηi(t) is bounded, the tracking performance can be tailored by the performance requirement function ρ(t).

Proof: From the definition of Ri(·), we have

−δ < Ri(·) < 1, if eθi(0) ≥ 0
−1 < Ri(·) < δ, if eθi(0) < 0. (31)

If ηi(t) is bounded, then from the definition of ηi(t), we have

−δρ(t) < eθi(t) < ρ(t), if eθi(0) > 0
−ρ(t) < eθi(t) < δρ(t), if eθi(0) < 0. (32)

Thus, ρ(t) can be seen as the bounding envelope of the error eθi(t). For the transient performance, the maximal amplitude of overshoot is bounded by δρ0, and the amplitude of the maximal tracking error in the stable phase is bounded by max{ρ∞, δρ∞}.

The 5% settling time, i.e., the time required for the tracking error to reach and stay within ±5% of the steady-state bound, is bounded by (max{1, δ}/p) ln((ρ0 − ρ∞)/(0.05ρ∞)). Therefore, ρ(t) actually regulates both the transient and the steady state performance.

Consider a Lyapunov function V1 = (1/2) ηᵀ(t) η(t), which will be used later for stability analysis. Its derivative is given by

V̇1 = −k1 ηᵀ(t) Ṗ(η(t)) η(t) + ηᵀ(t) Ṗ(η(t)) ev(t) / ρ(t) (33)

where

Ṗ(η(t)) = diag( Ṗ1(R1(η1(t))), . . . , Ṗn(Rn(ηn(t))) )
vd = [vd1, vd2, . . . , vdn]ᵀ
ev = θ̇ − vd. (34)

2) Neural-Learning and Joint Velocity Control Loop: The dynamics of the robot manipulator can be described as

M(θ) θ̈ + C(θ, θ̇) θ̇ + G′(θ) + τext = τ (35)

where M(θ) is the manipulator inertia matrix, C(θ, θ̇) is the Coriolis matrix of the manipulator, G′(θ) is the gravity term, and τext denotes the external torque caused by the payload. For convenience, let us define G(θ) = G′(θ) + τext.

The velocity control loop is used to achieve the desired joint angular velocities vd by applying the control torque τ. Design the control torque input as follows:

τ = −k2 ev + M̂ v̇d + Ĉ vd + Ĝ + f̂ − Ṗ(η(t)) η(t) / ρ(t) (36)

where Ĝ(θ), M̂(θ), Ĉ(θ, θ̇), and f̂ are the estimates of G(θ), M(θ), C(θ, θ̇), and f, respectively, and f, defined later in (41), is a function of θ, θ̇, vd, and v̇d.

The closed-loop dynamics can then be formulated as

M ėv + C ev + k2 ev + Ṗ(η(t)) η(t)/ρ(t) − f̂ = −(M − M̂) v̇d − (C − Ĉ) vd − (G − Ĝ). (37)

Applying the NN approximation technique, we have

M(θ) = WMᵀ ZM(θ) + εM(θ)
C(θ, θ̇) = WCᵀ ZC(θ, θ̇) + εC(θ, θ̇)
G(θ) = WGᵀ ZG(θ) + εG(θ)
f = Wfᵀ Zf(θ, θ̇, vd, v̇d) + εf(θ, θ̇, vd, v̇d) (38)

where WM ∈ R^{nl×n}, WC ∈ R^{2nl×n}, WG ∈ R^{nl×n}, and Wf are the ideal NN weight matrices defined as

WM = [WMi,j], WC = [WCi,j]
WG = diag(WGi), Wf = diag(Wfi) (39)

where WMi,j ∈ Rl, WCi,j ∈ R2l, WGi ∈ Rl, and Wfi ∈ Rl, i = 1, 2, . . . , n, j = 1, 2, . . . , n, as weight vectors defined in (24), are used to approximate the elements Mi,j(θ) ∈ R, Ci,j(θ, θ̇) ∈ R, and Gi(θ) ∈ R, respectively.

The matrices of radial basis functions, namely, ZM(θ), ZC(θ, θ̇), ZG(θ), and Zf(θ, θ̇, vd, v̇d), are designed as follows:

ZM(θ) = diag(Zθ, . . . , Zθ) ∈ R^{nl×n}
ZC(θ, θ̇) = diag([Zθᵀ, Zθ̇ᵀ]ᵀ, . . . , [Zθᵀ, Zθ̇ᵀ]ᵀ) ∈ R^{2nl×n}
ZG(θ) = [Zθᵀ, . . . , Zθᵀ]ᵀ ∈ R^{nl}
Zf(θ, θ̇, vd, v̇d) = [Z̄ᵀ, . . . , Z̄ᵀ]ᵀ ∈ R^{4nl} (40)
where Zθ = [z1(θ), z2(θ), . . . , zl(θ)]ᵀ ∈ Rl, Zθ̇ = [z1(θ̇), z2(θ̇), . . . , zl(θ̇)]ᵀ ∈ Rl, Zvd = [z1(vd), z2(vd), . . . , zl(vd)]ᵀ ∈ Rl, Zv̇d = [z1(v̇d), z2(v̇d), . . . , zl(v̇d)]ᵀ ∈ Rl, and Z̄ = [Zθᵀ, Zθ̇ᵀ, Zvdᵀ, Zv̇dᵀ]ᵀ ∈ R4l, with zi, i = 1, 2, . . . , l, defined in (23). The NN-based estimates of G(θ), M(θ), C(θ, θ̇), and

f = εM v̇d + εC vd + εG (41)

can be written as follows:

M̂(θ) = ŴMᵀ ZM(θ)
Ĉ(θ, θ̇) = ŴCᵀ ZC(θ, θ̇)
Ĝ(θ) = ŴGᵀ ZG(θ)
f̂ = Ŵfᵀ Zf(θ, θ̇, vd, v̇d). (42)

By substituting (42) into (37), we have

M ėv + C ev + k2 ev + Ṗ(η(t)) η(t)/ρ(t) = −W̃Mᵀ ZM v̇d − W̃Cᵀ ZC vd − W̃Gᵀ ZG − W̃fᵀ Zf − εf (43)

where W̃(·) = W(·) − Ŵ(·).

Let us consider a second Lyapunov function as

V2 = (1/2) evᵀ M ev + (1/2) tr(W̃Mᵀ QM W̃M) + (1/2) tr(W̃Cᵀ QC W̃C + W̃Gᵀ QG W̃G + W̃fᵀ Qf W̃f) (44)

where QM, QC, QG, and Qf are positive definite weight matrices to be specified. Noting that W̃̇(·) = −Ŵ̇(·), we see that the derivative of V2 can be written as

V̇2 = −evᵀ k2 ev − evᵀ Ṗ(η(t)) η(t)/ρ(t) − evᵀ εf
− tr( W̃Mᵀ (ZM v̇d evᵀ + QM Ŵ̇M) )
− tr( W̃Cᵀ (ZC vd evᵀ + QC Ŵ̇C) )
− tr( W̃Gᵀ (ZG evᵀ + QG Ŵ̇G) )
− tr( W̃fᵀ (Zf evᵀ + Qf Ŵ̇f) ). (45)

The neural learning law is thus designed as follows:

Ŵ̇M = −QM⁻¹ (ZM v̇d evᵀ + σM ŴM)
Ŵ̇C = −QC⁻¹ (ZC vd evᵀ + σC ŴC)
Ŵ̇G = −QG⁻¹ (ZG evᵀ + σG ŴG)
Ŵ̇f = −Qf⁻¹ (Zf evᵀ + σf Ŵf) (46)

where σM, σC, σG, and σf are positive parameters to be specified by the designer.

In the following analysis, the boundedness of η(t) is established, such that both the transient and steady state performance of the robot manipulator can be guaranteed. Now, let us consider an overall Lyapunov function as

V = V1 + V2. (47)

The derivative of V is

V̇ = −k1 ηᵀ(t) Ṗ(η(t)) η(t) − evᵀ k2 ev − evᵀ εf
+ tr(σM W̃Mᵀ ŴM) + tr(σC W̃Cᵀ ŴC)
+ tr(σG W̃Gᵀ ŴG) + tr(σf W̃fᵀ Ŵf). (48)

According to the definition of Ṗ(η(t)), we have ηᵀ(t) Ṗ(η(t)) η(t) ≥ (2/(1 + δ)) ‖η(t)‖². Consider the following inequality obtained by Young's inequality:

tr( W̃(·)ᵀ Ŵ(·) ) ≤ −(1/2) ‖W̃(·)‖F² + (1/2) ‖W(·)‖F². (49)

Then (48) can be further derived as

V̇ ≤ −(2k1/(1 + δ)) ‖η(t)‖² − (k2 − 1/2) ‖ev‖² − (σM/2) ‖W̃M‖F² − (σC/2) ‖W̃C‖F² − (σG/2) ‖W̃G‖F² − (σf/2) ‖W̃f‖F² + Θ (50)

where Θ = (σM/2)‖WM‖F² + (σC/2)‖WC‖F² + (σG/2)‖WG‖F² + (σf/2)‖Wf‖F² + (1/2) ε̄f², with ε̄f the upper limit of εf over Ω.

Obviously, if W̃M, W̃C, W̃G, W̃f, η(t), and ev satisfy the following inequality:

(σM/2)‖W̃M‖F² + (σC/2)‖W̃C‖F² + (σG/2)‖W̃G‖F² + (σf/2)‖W̃f‖F² + (2k1/(1 + δ))‖η(t)‖² + ((2k2 − 1)/2)‖ev‖² ≥ Θ (51)

then we have V̇ ≤ 0.

Using LaSalle's theorem and Lemma 1, it is easy to establish the stability and convergence results in the following Theorem 1.

Theorem 1: Consider the closed-loop dynamic system consisting of the telerobot open loop dynamics (35) and the torque control input defined in (36) with the neural learning law specified in (46). Semi-global uniform boundedness of all the closed-loop signals is guaranteed given bounded θd and θ̇d. In particular, the error signals η(t) and ev will converge to an invariant set Ωs ⊆ Ω defined as

Ωs = { (η(t), ev, W̃M, W̃C, W̃G, W̃f) | (σM/2)‖W̃M‖F² + (σC/2)‖W̃C‖F² + (σG/2)‖W̃G‖F² + (σf/2)‖W̃f‖F² + (2k1/(1 + δ))‖η(t)‖² + ((2k2 − 1)/2)‖ev‖² ≤ Θ }. (52)
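The σ-modification structure of the learning law (46) can be sketched with a forward-Euler step. The snippet below is an illustrative stand-in, written for the gravity term only and with assumed dimensions, not the paper's implementation: the outer product ZG evᵀ drives adaptation, while the σ-leakage term pulls the estimate toward zero, which is what keeps the weights bounded in the Lyapunov analysis.

```python
import numpy as np

def weight_update_step(W_hat, Z_vec, e_v, Q_inv, sigma, dt):
    """One Euler step of the sigma-modified learning law (46), gravity term:
        W_hat_G_dot = -Q_G^{-1} (Z_G e_v^T + sigma_G * W_hat_G).
    Z_vec is the stacked basis output Z_G and e_v the velocity-loop
    tracking error; sigma > 0 provides the bounding leakage."""
    W_dot = -Q_inv @ (np.outer(Z_vec, e_v) + sigma * W_hat)
    return W_hat + dt * W_dot
```

For a frozen basis output and error, the discrete iteration converges to the equilibrium Ŵ = −ZG evᵀ/σ, and with zero error the leakage alone drives the estimate to zero, illustrating why the σ terms prevent weight drift.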
Fig. 10. Reference joint angles θd and the actual joint angles θ without the neural-learning. The dashed and solid lines indicate reference and actual joint angles, respectively. The lines of different colors indicate different joints.

Fig. 11. Joint angle errors eθ without the neural-learning. The lines of different colors indicate different joints.

Fig. 12. Compensation torque TNN generated by the neural learning. The lines of different colors indicate different joints.

Fig. 13. Reference joint angles θd and the actual joint angles θ with the neural-learning. The dashed and solid lines indicate reference and actual joint angles, respectively. The lines of different colors indicate different joints.

Fig. 14. Joint angle errors eθ with the neural-learning. The lines of different colors indicate different joints.

V. EXPERIMENT STUDIES

A. Test of Neural-Learning Performance

The first group of experiments mainly tests the compensation of the effects caused by the unknown dynamics and the uncertain payload. As shown in Fig. 9, the Baxter right arm's end-effector is controlled to move along a fixed trajectory specified by

        ⎡ 0.6 + 0.1 sin(2πt/2.5)  ⎤
xd(t) = ⎢ −0.4 + 0.3 cos(2πt/2.5) ⎥. (53)
        ⎣ 0.2                     ⎦

A payload with a weight of 1.3 kg is held by the gripper. In order to achieve a high precision of the approximation of the 7-DOF robot dynamics, we choose three nodes for each input dimension, and employ in total l = 3⁷ NN nodes for the neural networks M̂(θ) = ŴMᵀ ZM(θ) and Ĝ(θ) = ŴGᵀ ZG(θ), 2l NN nodes for the neural network Ĉ(θ, θ̇) = ŴCᵀ ZC(θ, θ̇), and 4l NN nodes for the neural network f̂ = Ŵfᵀ Zf(θ, θ̇, vd, v̇d). The NN weight matrices are initialized as ŴM(0) = 0 ∈ R^{nl×n}, ŴC(0) = 0 ∈ R^{2nl×n}, ŴG(0) = 0 ∈ R^{nl×n}, and Ŵf(0) = 0 ∈ R^{4nl×n}, where l = 2187 and n = 7.

Two comparative experiments are performed to verify the performance of the proposed neural-learning-based controller. In the first experiment, the manipulator with the payload is controlled by the controller without neural learning, while in the second experiment, the proposed neural learning is enabled.

For the first experiment, the reference joint angles θd and the actual joint angles θ are shown in Fig. 10; the joint angle errors eθ are shown in Fig. 11. We see that the joint angle errors are relatively high due to the heavy payload.

The experiment results with the proposed neural learning are shown in Figs. 12–14, where Fig. 12 shows the compensation torque TNN = M̂v̇d + Ĉvd + Ĝ + f̂. When the payload is attached to the end-effector, its gravity effect is more obvious than the dynamic uncertainties of the robot manipulator. Therefore, we particularly show the NN weights ŴG in Fig. 15. Fig. 13 shows the reference and actual joint angles, and Fig. 14 shows the joint angle errors. We see that at the beginning, the compensation torque is zero and the joint angle error is as large as the results shown in Fig. 11, while later the weight matrices show a trend of convergence. Variations with the same periodicity as the reference trajectory can be found in the compensation torque TNN. Along with the increment of the
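The fixed reference trajectory (53) used in this test is straightforward to reproduce; the following numpy sketch is only a restatement of that equation, not code from the experiments.

```python
import numpy as np

def desired_trajectory(t):
    """Cartesian reference (53): an elliptical path (radii 0.1 m and 0.3 m)
    of period 2.5 s in the x-y plane, at a fixed height z = 0.2 m."""
    return np.array([
        0.6 + 0.1 * np.sin(2 * np.pi * t / 2.5),
        -0.4 + 0.3 * np.cos(2 * np.pi * t / 2.5),
        0.2,
    ])
```

Because the path repeats every 2.5 s, any residual model error shows up as the periodic variation visible in the compensation torque TNN.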
Fig. 15. Norm of each column vector of the NN weight ŴG, corresponding to each row of Ĝ.

Fig. 16. Experiment results of the tracking performance test. (a) Joint angle without the error transformation method. (b) Error without the error transformation method. (c) Joint angle with the error transformation method. (d) Error with the error transformation method.

D. Test of Restoration Function

In the last experiment, the restoration function is tested. For ease of testing, the operator holds the joystick statically such that the manipulator's end-effector is maintained at a fixed position, while the obstacle moves in a dynamic manner. The test results are shown in Fig. 20. We see in the figure that when the obstacle moves close to the manipulator, its elbow moves down in order to avoid a possible collision. It is noted that there is nearly no change in the end-effector's position during the collision avoidance. When the obstacle moves away, the manipulator automatically restores its previous posture.

VI. CONCLUSION

In this paper, a telerobot control method with guaranteed performance at both the kinematic and dynamic levels is developed. At the kinematic level, a dimension reduction method is developed to overcome the kinematics nonsolvable problem while the manipulator is avoiding an obstacle, and to achieve a more efficient use of the redundancy. A simulated parallel system of the manipulator is designed to achieve restoration back to the natural pose in the absence of obstacles. The human

ACKNOWLEDGMENT

The authors would like to thank Professor Angelo Cangelosi of Plymouth University for the technical support provided.

REFERENCES

[1] M. Ferre, R. Aracil, and M. Navas, "Stereoscopic video images for telerobotic applications," J. Robot. Syst., vol. 22, no. 3, pp. 131–146, Mar. 2005.
[2] R. Oboe and P. Fiorini, "A design and control environment for Internet-based telerobotics," Int. J. Robot. Res., vol. 17, no. 4, pp. 433–449, 1998.
[3] H. Boessenkool, D. A. Abbink, C. J. M. Heemskerk, and F. C. T. van der Helm, "Haptic shared control improves tele-operated task performance towards performance in direct control," in Proc. IEEE World Haptics Conf. (WHC), Istanbul, Turkey, 2011, pp. 433–438.
[4] H. Boessenkool, D. A. Abbink, C. J. M. Heemskerk, F. C. T. van der Helm, and J. G. W. Wildenbeest, "A task-specific analysis of the benefit of haptic shared control during telemanipulation," IEEE Trans. Haptics, vol. 6, no. 1, pp. 2–12, Mar. 2013.
[5] J. Kofman, X. Wu, T. J. Luu, and S. Verma, "Teleoperation of a robot manipulator using a vision-based human-robot interface," IEEE Trans. Ind. Electron., vol. 52, no. 5, pp. 1206–1219, Oct. 2005.
[6] A. A. Maciejewski and C. A. Klein, "Obstacle avoidance for kinematically redundant manipulators in dynamically varying environments," Int. J. Robot. Res., vol. 4, no. 3, pp. 109–117, 1985.
[7] L. Zlajpah and B. Nemec, "Kinematic control algorithms for on-line obstacle avoidance for redundant manipulators," in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst., vol. 2, Lausanne, Switzerland, 2002, pp. 1898–1903.
[8] Z. Peng, G. Wen, A. Rahmani, and Y. Yu, “Distributed consensus-based Chenguang Yang (S’07–M’10–SM’16) received
formation control for multiple nonholonomic mobile robots with a speci- the B.Eng. degree in measurement and control
fied reference trajectory,” Int. J. Syst. Sci., vol. 46, no. 8, pp. 1447–1457, from Northwestern Polytechnical University, Xi’an,
2015. China, in 2005, and the Ph.D. degree in control
[9] G. Wen, Z. Peng, A. Rahmani, and Y. Yu, “Distributed leader-following engineering from the National University of
consensus for second-order multi-agent systems with nonlinear inherent Singapore, Singapore, in 2010.
dynamics,” Int. J. Syst. Sci., vol. 45, no. 9, pp. 1892–1901, 2014. He was a Post-Doctoral Fellow with Imperial
[10] Z. Peng, S. Yang, G. Wen, A. Rahmani, and Y. Yu, “Adaptive distributed College London, London, U.K. He is a Senior
formation control for multiple nonholonomic wheeled mobile robots,” Lecturer with the Zienkiewicz Centre for
Neurocomputing, vol. 173, pp. 1485–1494, Jan. 2016. Computational Engineering, Swansea University,
[11] Z. Peng, G. Wen, A. Rahmani, and Y. Yu, “Leader–follower formation Swansea, U.K. His current research interests include
control of nonholonomic mobile robots based on a bioinspired neurody- robotics, automation, and computational intelligence.
namic based approach,” Robot. Auton. Syst., vol. 61, no. 9, pp. 988–996, He was a recipient of the 2011 IEEE T RANSACTIONS ON ROBOTICS Best
2013. Paper Award.
[12] L. Cheng, L.-G. Hou, M. Tan, and W. J. Zhang, “Tracking control of
a closed-chain five-bar robot with two degrees of freedom by integra-
tion of an approximation-based approach and mechanical design,” IEEE
Trans. Syst., Man, Cybern. B, Cybern., vol. 42, no. 5, pp. 1470–1479,
Oct. 2012.
[13] A. Smith et al., “Novel hybrid adaptive controller for manipulation in Xinyu Wang received the bachelor’s degree from
complex perturbation environments,” PLoS One, vol. 10, no. 6, 2015, the Beijing Institute of Technology, Beijing, China,
Art. no. e0129281. in 2011, where he is currently pursuing the Ph.D.
[14] B. S. Park, J. B. Park, and Y. H. Choi, “Adaptive formation control degree with the School of Automation.
of electrically driven nonholonomic mobile robots with limited infor- He was a visiting student in U.K. from 2014 to
mation,” IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 41, no. 4, 2015. His current research interests include robotics
pp. 1061–1075, Aug. 2011. and autonomous vehicle.
[15] Z. G. Hou, L. Cheng, and M. Tan, “Decentralized robust adaptive control
for the multiagent system consensus problem using neural networks,”
IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 39, no. 3, pp. 636–647,
Jun. 2009.
[16] L. Cheng, Z.-G. Hou, and M. Tan, “Adaptive neural network track-
ing control for manipulators with uncertain kinematics, dynamics and
actuator model,” Automatica, vol. 45, no. 10, pp. 2312–2318, 2009.
[17] J.-Q. Huang and F. L. Lewis, “Neural-network predictive control for
Long Cheng (SM'14) received the B.S. (Hons.) degree in control engineering from Nankai University, Tianjin, China, in 2004, and the Ph.D. (Hons.) degree in control theory and control engineering from the Institute of Automation, Chinese Academy of Sciences, Beijing, China, in 2009.
In 2010, he was a Post-Doctoral Research Fellow with the Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, SK, Canada, for seven months. From 2010 to 2011, he was a Post-Doctoral Research Fellow with the Mechanical and Industrial Engineering Department, Northeastern University, Boston, MA, USA. From 2013 to 2014, he was a Visiting Scholar with the Electrical and Computer Engineering Department, University of California at Riverside, Riverside, CA, USA. He is currently a Professor with the Laboratory of Complex Systems and Intelligent Science, Institute of Automation, Chinese Academy of Sciences. He has published over 50 technical papers in peer-refereed journals and prestigious conference proceedings. His current research interests include intelligent control of smart materials, coordination of multiagent systems, and neural networks and their applications to robotics.
Dr. Cheng is an Editorial Board Member of Neural Processing Letters, Neurocomputing, and the International Journal of Systems Science.
Hongbin Ma (M'12) received the bachelor's degree from Zhengzhou University, Zhengzhou, China, in 2001, and the Ph.D. degree from the Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China, in 2006.
From 2006 to 2009, he served as a Research Scientist with the Temasek Laboratories, National University of Singapore, Singapore. He has been a Professor with the School of Automation, Beijing Institute of Technology, Beijing, since 2009. His current research interests include adaptation, learning, and recognition, especially applications in robots, self-driving cars, and unmanned aerial vehicles (UAVs).
Prof. Ma was a recipient of the 13th Huo Ying Dong Young Teacher Award for Higher Education in 2012.