6-Aprendizaje Iterativo

Iterative Learning Controller Design for
Multivariable Systems
Manuel Olivares† , Pedro Albertos‡ , Antonio Sala‡
mos@elo.utfsm.cl pedro@aii.upv.es asala@isa.upv.es
†
Dept. of Electronics, Universidad Técnica F. Santa Marı́a Valparaı́so, Chile
‡
Dept. of Syst. Eng. and Control, Universidad Politécnica de Valencia, Spain
XV IFAC World Congress b’02, Barcelona, Spain, July 25th, 2002 – p. 1/31
Outline
Introduction
ILC General Scheme
Paper Objectives
Background
ILC Analysis and Design, for MIMO Systems with
Uniform Delay
Examples
Conclusions
Introduction 1/3
The general ILC setting, (Arimoto et al., 1984), is shown

uk yk
Plant
fH
u k+1
Learning
yd
fL
A linear learning algorithm considers fL as a linear

operator, for example
n
X
uk+1 (t) = αj uk−j (t) + Γek (t + m)
j=0
ek (t + m) = yd (t + m) − yk (t + m)
Introduction 2/3
where ek (t) represents the output error vector at time t in

the iteration k, αj are the past control factors and Γp×p is
the learning gain matrix:
how are the αj factors chosen?
how is the Γp×p matrix designed?
does the algorithm converge?
how many iterations does it take to converge?
Introduction 3/3
This paper focuses on a new approach to establish

convergence conditions of the linear discrete ILC
algorithm, for linear MIMO plants
the convergence velocity of the algorithm
what is the steady state input u sequence, for a given
yd desired trajectory
a design methodology and its limitations
Other results exist, using some other approaches (λ-norm,
L2 -norm, 2D systems, etc.)
Background
System Representation 1/3
To represent the plant fH , a discrete state space model of

a MIMO system with input delaya of m sampling periods of
time, is considered
x(t + 1) = Ax(t) + Bu(t)

x ∈ Rn , u ∈ R v , y ∈ R p
y(t) = Cx(t) + Du(t)
with matrix transfer function

y(z) = G(z)u(z)
G(z) = C(Iz − A)−1 B + D
a
All input channels with the same delay, and v = p is assumed
The pulse response matrix H(i) allows for the equivalent

representation
t
X
y(t) = H(t − i)u(i), t≥0
i=0
H(0) = D, H(i) = CAi−1 B, i>0
Note that
H(i) = 0, 0<i<m
and most systems usually have D = 0
With that, the convolution formula can be split as
y(t + m) = H(m)u(t) + η(t)

t−1
X (1)
η(t) = H(t + m − i)u(i)
i=0
Considering just a finite time horizon 0 ≤ t ≤ N , the block

vectors U, Y ∈ Rp(N −m+1) can be defined, as in
(Moore, 1993)
T T T
T
U = u (0) u (1) · · · u (N − m)
T T T
T
Y = y (m) y (m + 1) · · · y (N )
Inverse System 1/2
The system can be represented in the matrix form

Y = HU , where H has a triangular Toeplitz structure.
 
H(m) 0 ··· 0
 
H(m + 1) H(m) ··· 0
 

H=
 .. .. ..



 . . . 

H(N ) H(N − 1) ··· H(m)
Thus, the inverse system in finite time, is
U = H −1 Y = M Y
The inverse exists if rank[H(m)] = rank[CAm−1 B] = p
Inverse System 2/2
Then, the ideal input block vector U ∗ for perfect tracking of

a desired output trajectory block vector Yd , is simply
calculated as
U ∗ = H −1 Yd = M Yd
As the matrix M is also triangular Toeplitz, with diagonal
elements M (m) = H −1 (m), the input vector u∗ (t) can be
obtained by deconvolutionb
t
X
u∗ (t) = M (m + i)yd (m + t − i), 0≤t≤N −m (2)
i=0
b
Non causal
Why ILC?
Why ILC and not direct inversion?

The expression (2) would yield the ideal input u∗ in a
“one-shot” way, but it is not practical, as matrix M has
(N − m + 1) × p2 no nil elements
Full plant knowledge is needed for direct inversion
The ILC can manage plant changes, as the input
vector sequence u is refreshed iteratively, with the
actual output error (at least those of the previous
iteration)
ILC Analysis and Design, for MIMO Systems
with Uniform Delay
First order ILC-P Analysis
From (1), and taking α0 = 1, αj = 0 ∀j > 0, the iterative

learning algorithm can be viewed as a set of N − m + 1
disturbed subsystems, in the iteration axis, one for each
t ∈ [0, N − m].
η k (t)
y d(t+m) e k(t+m) +
u k(t) + yk(t+m)
+ +
Γ z -1 I H(m)
- +
The stability analysis of this set of closed loops, yields the

following theorem
Main Theorem 1/2
Theorem 1 (Learning Algorithm Convergence) The first

order ILC-P algorithm
uk+1 (t) = uk (t) + Γ(yd (t + m) − yk (t + m)) (3)
i. Converges if and only if
F = I − H(m)Γ is Hurwitz
ii. the maximum steady state tracking error is

lim max |yd (i) − yk (i)| = 0
k→∞ m≤i≤N
Main Theorem 2/2
iii. the input sequence U ∗ converges to
U ∗ = lim Uk = H −1 Yd
k→∞
iv. and the maximum convergence speed is achieved for

m−1
−1
Γ = CA B = H −1 (m)
Proof 1/2
Proof: The learning law (3) may be rewritten using (2)

t
X t
X
M (m + i)yk+1 (m + t − i) = M (m + i)yk (m + t − i) + Γek (t + m)
i=0 i=0
then, subtracting from both sides the ideal input u∗

t
X t
X
M (m + i)ek+1 (m + t − i) = M (m + i)ek (m + t − i) − Γek (t + m)
i=0 i=0
finally, the Z-transform is applied, obtaining a relationship

among the temporal tracking errors
t
X
[(z − 1)I · M (m) + Γ]ez (t + m) = −(z − 1)I M (m + i)ez (t + m − i)
i=1
Proof 2/2
Initial condition terms have been removed, as they do not

affect stability
Part i is proven from the characteristic equation
|eig[M −1 (m)(M (m) − Γ)]| < 1
The final value theorem, proves ii

Part iii is consequence of ii, as the inverse solution is
unique
For Γ = H −1 (m), the poles of the characteristic
equation are at the origin, proving part iv
Design Considerations 1/2
H(m) This is the only term needed to design Γ. It can be

obtained from an open loop experiment
m This parameter is needed to form the learning algorithm.
It can be also determined experimentally
σ(H(m)) Small singular values of H(m) could yield strong
control actions. In this case, m̄ > m should be used
−1
Γ = (CAm−1 B) = H −1 (m) This can be a bad option in a
noisy environment or if there is a model mismatch
A, B, C The system can be unstable, non minimum phase
U ∗ Bounded input is obtained for minimum phase system.
For unstable systems and disturbance rejection a
feedback controller in a 2DOF control scheme is
required XV IFAC World Congress b’02, Barcelona, Spain, July 25th, 2002 – p. 19/31
Design Considerations 2/2
Plant Delay Overestimation In this case, first outputs can not

be corrected
In this case, very strong control
Plant Delay Underestimation
actions would be produced
Pole placement technique can be
Learning Gain Selection
used to design Γ, assigning stable eigenvalues to
matrix F
Γ = H −1 (m)(I − F ) (4)
GeneralizationIn linear multivariable systems, the

superposition principle applies. Thus single input
channel learning can be combined to follow any
combination of individual learnt output trajectories
Examples
Stable Plant 1/2
Two independent finite trajectories are tracked, with a

minimum time ILC-P giving U ∗ block vector input for the
plant G1 (s)
2 0.5
 

 4s2 + 1 3s + 1   2.1961 −0.9514
G1 (s) = 
 −(s + 1) ,
 1 Γ =
2 3.2619 1.6434
s2 + 3s + 1 2s2 + 2s + 3
The system is sampled each T = 1sec; m = 1 and N = 20
Stable Plant 2/2
6 5
x 10 x 10
3 16 20
2
0 15
14
2.5 −2 10
u1
u2
12 −4 5
2 −6 0
10
−8 −5
||yd1−y1||max
||yd2−y2||max
0 5 10 15 20 0 5 10 15 20
1.5 8
6
1 10
1
4 8
0.5
0.5 6
0
y1
y2
2
4
−0.5
2
0 0 −1
0
−1.5
10 20 30 40 50 10 20 30 40 50 0 5 10 15 20 0 5 10 15 20
k iteration k iteration t [s] t [s]
Maximum error vs. k and tracking result, at k = 50. From

iteration k = 19 onwards, max |yd (i) − yk (i)| < 10−9 , on
m≤i≤N
both channels
Unstable Plant 1/2
The same two independent finite trajectories are tracked,

with a minimum time ILC-P giving U ∗ block vector input for
the plant G2 (s)
1 −2
 

 2s2 + 2s + 3 2s + 1   0.4868 3.9539
G2 (s) =  4 1 ,
 2 Γ =
 −1.1695 0.8219
(2s + 1)(4s − 1) 4s2 + 3s + 1
This time, a 2DOF is used, combining ILC with a discrete

LQR state feedback controller. The system is sampled
each T = 1sec; m = 1 and N = 20
Unstable Plant 2/2
5 6
x 10 x 10
6 1.5
4
5
1
2
0.5
0
5
0
u1
u2
−2
4
−4 −0.5
4 −6 −1
−8 −1.5
3
||yd1−y1||max
||yd2−y2||max
0 5 10 15 20 0 5 10 15 20
3
2
1.5
2 10
1
8
0.5
1
1 6
0
y1
y2
4
−0.5
−1 2
0 0
−1.5 0
10 20 30 40 50 10 20 30 40 50 0 5 10 15 20 0 5 10 15 20
k iteration k iteration t [s] t [s]
Maximum error vs. k and tracking result, at k = 50. From

iteration k = 19 onwards, max |yd (i) − yk (i)| < 10−5 , on
m≤i≤N
both channels
Convergence Rate
Two different Γ are adopted, using (4), for G1 (s)

−0.1
0.1 0 e 0
F1 = F2 =
0 0.3 0 e−10
5 5
x 10 x 10
7 200
20 12
180
18 6
160 10
16
5 140
14
8
120
12 4
||yd1−y1||max
||yd2−y2||max
||yd1−y1||max
||yd2−y2||max
100
10 6
3
80
8
60 4
6 2
4 40
1 2
2 20
0 0 0 0
10 20 30 40 50 10 20 30 40 50 100 200 300 400 500 100 200 300 400 500
k iteration k iteration k iteration k iteration
Conclusions 1/2
A full multivariable open loop controller is lead by ILC

which is able to decouple strong coupled MIMO
systems iteratively
Conclusions 1/2

systems iteratively
Hidden oscillations may occur. They can be avoided by
selecting a well suited sampling time
Conclusions 1/2

systems iteratively
Big transient errors could be avoided with an adaptive
or a progressive selection of convergent learning gains
Γ, as k increases
Conclusions 1/2

systems iteratively
Big transient errors could be avoided with an adaptive
or a progressive selection of convergent learning gains
Γ, as k increases
A feedback - feedforward 2DOF scheme can cope with
system uncertainties and disturbances (unstable
systems)
Conclusions 2/2
Zero tracking error in all sampled points is assured in

steady state
Conclusions 2/2

steady state
A linearized model of a non linear system allows a Γ
design, if a constant reference is used
Conclusions 2/2

steady state
The analysis methodology could be applied to a first
order ILC P, PI, PD,for linear MIMO systems
Conclusions 2/2

steady state
The analysis methodology could be applied to a first
order ILC P, PI, PD,for linear MIMO systems
Other than the pole assignment technique, to tune the
learning parameter could be applied
Current Work
Integrate ILC with NN’s and FS’s in 2DOF schemes
Current Work

Use of ILC to tune NN’s and FS’s parameters
Current Work

Use of ILC to tune NN’s and FS’s parameters
Synthesis of linear ILC algorithms for MIMO systems
with non uniform delay
Acknowledgments
The authors would like to acknowledge the support of

The Agencia Española de Cooperación Internacional
AECI
Project 23.01.11 and Dirección General de
Investigación y Postgrado DGIP-USM
Project GV00-100-14
References
Arimoto, S., S. Kawamura and F. Miyazaki (1984).
Bettering operations of robots by learning. J. of
Robotic Systems 1, 123–140.
Moore, K.L. (1993). Iterative Learning Control for
Deterministic Systems. Springer-Verlag.

6-Aprendizaje Iterativo

Uploaded by

Document Information

Original Description:

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

6-Aprendizaje Iterativo

Uploaded by

Copyright:

Available Formats

Iterative Learning Controller Design for

The general ILC setting, (Arimoto et al., 1984), is shown

A linear learning algorithm considers fL as a linear

where ek (t) represents the output error vector at time t in

This paper focuses on a new approach to establish

To represent the plant fH , a discrete state space model of

x(t + 1) = Ax(t) + Bu(t)

with matrix transfer function

The pulse response matrix H(i) allows for the equivalent

H(0) = D, H(i) = CAi−1 B, i>0

and most systems usually have D = 0

With that, the convolution formula can be split as

y(t + m) = H(m)u(t) + η(t)

Considering just a finite time horizon 0 ≤ t ≤ N , the block

The system can be represented in the matrix form

Thus, the inverse system in finite time, is

Then, the ideal input block vector U ∗ for perfect tracking of

Why ILC and not direct inversion?

From (1), and taking α0 = 1, αj = 0 ∀j > 0, the iterative

The stability analysis of this set of closed loops, yields the

Theorem 1 (Learning Algorithm Convergence) The first

uk+1 (t) = uk (t) + Γ(yd (t + m) − yk (t + m)) (3)

i. Converges if and only if

ii. the maximum steady state tracking error is

iii. the input sequence U ∗ converges to

iv. and the maximum convergence speed is achieved for

Proof: The learning law (3) may be rewritten using (2)

then, subtracting from both sides the ideal input u∗

finally, the Z-transform is applied, obtaining a relationship

Initial condition terms have been removed, as they do not

Part i is proven from the characteristic equation

|eig[M −1 (m)(M (m) − Γ)]| < 1

The final value theorem, proves ii

H(m) This is the only term needed to design Γ. It can be

Plant Delay Overestimation In this case, first outputs can not

GeneralizationIn linear multivariable systems, the

Two independent finite trajectories are tracked, with a

Maximum error vs. k and tracking result, at k = 50. From

The same two independent finite trajectories are tracked,

This time, a 2DOF is used, combining ILC with a discrete

Maximum error vs. k and tracking result, at k = 50. From

Two different Γ are adopted, using (4), for G1 (s)

A full multivariable open loop controller is lead by ILC

A full multivariable open loop controller is lead by ILC

A full multivariable open loop controller is lead by ILC

A full multivariable open loop controller is lead by ILC

Zero tracking error in all sampled points is assured in

Zero tracking error in all sampled points is assured in

Zero tracking error in all sampled points is assured in

Zero tracking error in all sampled points is assured in

Integrate ILC with NN’s and FS’s in 2DOF schemes

Integrate ILC with NN’s and FS’s in 2DOF schemes

Integrate ILC with NN’s and FS’s in 2DOF schemes

The authors would like to acknowledge the support of

You might also like