Application of GA To Design LQR Controller For An Inverted Pendulum System


Proceedings of the 2008 IEEE International Conference on Robotics and Biomimetics
Bangkok, Thailand, February 21 - 26, 2009

Chaiporn Wongsathan and Chanapoom Sirima
Department of Electrical Engineering, North-Chiang Mai University
Hangdong, Chiang Mai 50230, Thailand
chaiporn@northcm.ac.th, chaiporn089@gmail.com, chanapoom@northcm.ac.th

Abstract - This paper applies a Genetic Algorithm (GA) to design the weighting matrices of a Linear Quadratic Regulator (LQR) for an Inverted Pendulum System (IPS). The feedback gain settings of the system are obtained by minimizing the performance index, using GA to optimize the weight matrices of the LQR. A mechanism that divides the search space, together with dynamic crossover and mutation rates, is used to help reach the globally optimal result. Time-domain performance is compared between LQR-GA and the experiential-LQR [6] with respect to the pendulum's angle and the cart's position. The simulation results show that the proposed method gives a better response than the experiential-LQR [6]. All computations are carried out using MATLAB®.

Index Terms - Genetic Algorithm (GA), Linear Quadratic Regulator (LQR) and Inverted Pendulum System (IPS).

I. INTRODUCTION

The IPS is used by control engineers to verify modern control theory because of its marginally stable characteristics. This system was chosen as a test model because it is well known; however, it also has its own difficulties, since it is highly non-linear and open-loop unstable. As a result, the pendulum falls over quickly whenever the system is simulated, because standard linear techniques fail to model the non-linear dynamics of the system. This also makes identification and control more challenging. The conventional controller used to overcome this problem is LQR, an optimal control method with a quadratic performance index whose terms generally have a specific physical meaning [2]. In addition, LQR has a simple mathematical treatment and can achieve closed-loop optimal control with linear state feedback or output feedback. Nevertheless, it suffers from the difficulty of specifying an accurate mathematical model of the process [3]. The selection of the weight matrices in LQR is very important and directly affects the control performance. In general, the weight matrices are set from the experience of engineers, who need to be familiar with the controlled system [4]. References [5]-[6] applied LQR to design the control law of the IPS. The weight matrices were first set by experience and then adjusted by simulation until satisfactory output responses were obtained. In this process, if the designer knows little about the system, the optimal weight matrices cannot be obtained, and so the control performance cannot be optimal either. GA is one method that helps the designer with this task.

The GA is a highly efficient and robust search algorithm based on evolution in nature. GA has been widely used to evolve good solutions to different problems [7]. It can be applied directly to various problems, without the need to transform them into mathematical formulations. Rapid convergence, low computational burden and not being caught in local minima are the notable features of this method.

The purpose of this paper is to add GA to LQR in order to optimize the weight matrices of LQR. LQR-GA is expected to overcome the shortcoming of the experiential-LQR [6]. The paper is organized as follows. In Section II, the dynamic model of the IPS and the linearization of the system are presented. In Section III, the concept of the LQR optimal control method is briefly reviewed. The idea of GA and the step-by-step application of GA to LQR are described in Section IV. In Section V, the convergence of the GA method and a comparison between the optimization results of the proposed method and those of [6] are presented. Finally, conclusions are summarized in Section VI.

II. DYNAMIC MODEL OF THE INVERTED PENDULUM SYSTEM

The IPS is a classic control problem that is used in universities around the world. It is a suitable process for testing prototype controllers because of its high nonlinearity and lack of stability. The system consists of an inverted pole hinged on a cart which is free to move in the x-direction, as shown in Fig. 1.

Fig. 1 (a) Inverted Pendulum System, (b) Free body diagram of the system

In order to obtain the dynamic model of the system, the following assumptions and design requirements have been made. The assumptions are:
(1) The system starts in a state of equilibrium, meaning that the initial conditions are assumed to be zero.
(2) The pendulum does not move more than a few degrees away from the vertical, so that the linearized model remains valid.
(3) A step input is applied.

The design requirements are:

978-1-4244-2679-9/08/$25.00 ©2008 IEEE


(1) The settling time $t_s$ for $x$ and $\theta$ is less than 5 seconds ($t_s \le 5$ s).
(2) The overshoot of $\theta$ is less than 20 degrees (OS $\le$ 22.5%).
(3) The rise time for $x$ is less than 1 second ($t_r \le 1$ s).
(4) The steady-state error is within 2% ($e_{ss} \le 2\%$).

The physical properties of the system are fixed as follows:

a) $M$ - mass of cart, 0.5 kg; $m$ - mass of pendulum, 0.2 kg
b) $b$ - friction of cart, 0.1 N m$^{-1}$ s$^{-1}$
c) $l$ - length to pendulum centre of mass, 0.3 m
d) $I$ - inertia of the pendulum, 0.006 kg m$^2$
e) $g$ - gravitational acceleration, 9.81 m s$^{-2}$
f) $F$ - force applied to cart
g) $x$, $\dot{x}$, $\ddot{x}$ - cart position coordinate, cart velocity and cart acceleration, respectively
h) $\theta$, $\dot{\theta}$, $\ddot{\theta}$ - pendulum angle from the vertical, pendulum angular velocity and angular acceleration, respectively
i) $N$ - sum of the forces of the cart
j) $P$ - sum of the forces of the pendulum in the horizontal direction

Summing the forces in the free body diagram of Fig. 1(b) in the horizontal and vertical directions, we get the following equations of motion:

$$ (M + m)\ddot{x} + b\dot{x} + ml\ddot{\theta}\cos\theta - ml\dot{\theta}^{2}\sin\theta = F \tag{1} $$

$$ (I + ml^{2})\ddot{\theta} + mgl\sin\theta = -ml\ddot{x}\cos\theta \tag{2} $$

The dynamic equations (1) and (2) are linearized about $\theta = \pi$. Assume that $\theta = \pi + \phi$, where $\phi$ represents a small angle from the vertical upward direction. After linearization, (3) and (4) are obtained:

$$ (I + ml^{2})\ddot{\phi} - mgl\phi = ml\ddot{x} \tag{3} $$

$$ (M + m)\ddot{x} + b\dot{x} - ml\ddot{\phi} = u \tag{4} $$

where $u$ represents the input.

After manipulating (3) and (4) and substituting the parameter values of the cart and pendulum, the linearized system equations can be represented in state-space form:

$$ \begin{bmatrix} \dot{x}(t) \\ \ddot{x}(t) \\ \dot{\phi}(t) \\ \ddot{\phi}(t) \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & -0.1818 & 2.6727 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & -0.4545 & 31.1818 & 0 \end{bmatrix} \begin{bmatrix} x(t) \\ \dot{x}(t) \\ \phi(t) \\ \dot{\phi}(t) \end{bmatrix} + \begin{bmatrix} 0 \\ 1.8182 \\ 0 \\ 4.5455 \end{bmatrix} u(t) \tag{5} $$

$$ y(t) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} x(t) \\ \dot{x}(t) \\ \phi(t) \\ \dot{\phi}(t) \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \end{bmatrix} u(t) \tag{6} $$

The controllability matrix of the system (5)-(6), $Co = [B \;\; AB \;\; A^{2}B \;\; A^{3}B]$, has full rank (rank = 4). Thus, the linearized model is controllable, and (5) and (6) can be used to design the LQR controller.

III. LQR OPTIMAL CONTROL AND LQR DESIGN

LQR is a method in modern control theory that uses the state-space approach to analyze such a system. In this section, the basic design process for LQR is outlined. From the state equation of the controlled system (5)-(6) and the initial condition, the performance index is given by:

$$ J = \frac{1}{2}\int_{0}^{\infty} \left[ x^{T}Qx + u^{T}Ru \right] dt \tag{7} $$

where $Q$ and $R$ are the weight matrices. $Q$ is required to be a positive definite or positive semi-definite symmetric matrix, and $R$ is required to be a positive definite symmetric matrix. Here $Q$ and $R$ have dimensions $4 \times 4$ and $1 \times 1$, respectively. Since they are symmetric, there are 10 distinct elements in $Q$ and 1 in $R$, for a total of 11 distinct elements to be selected, and they must also satisfy the definiteness conditions. One practical method is to choose $Q$ and $R$ to be diagonal matrices ($Q = \mathrm{diag}(q_1, q_2, q_3, q_4)$, $R = r$), so that only 5 elements need to be decided. The value of each element in $Q$ and $R$ is related to its contribution to the cost function $J$.

Since the system described by (5) and (6) is completely controllable, a control law that minimizes (7) exists; this control method is called LQR. Using (7), the corresponding Riccati matrix equation can be obtained:

$$ PA + A^{T}P - (PB + N)R^{-1}(PB + N)^{T} + Q = 0 \tag{8} $$

where $N$ is the state-input cross-weighting matrix, which is zero for the index (7). By solving (8), the matrix $P$ can be obtained, and if it is positive definite, the system will be stable. The optimal feedback vector $K$ and the optimal control variable $u(t)$ are then:

$$ K = R^{-1}(B^{T}P + N^{T}) \tag{9} $$

$$ u(t) = -Kx(t) \tag{10} $$

For the LQR control problem, the weight matrices $Q$ and $R$ in the performance index have a great influence on the control performance. The selection of $Q$ and $R$ is only weakly connected to the performance specifications, so a certain amount of trial and error with an interactive computer simulation is required before a satisfactory design results. To address this problem, GA is one method that can be used to optimize the weight matrices.

IV. GA AND IMPLEMENTATION OF GA FOR LQR CONTROLLER DESIGN

The GA is a stochastic global search and optimization technique based on the principles of genetics and natural selection, developed by John Holland (1975). A GA allows a population composed of many individuals to evolve under specified selection rules to a state that minimizes the cost function. The basic structure of the GA consists of coding, selection, crossover (mating) and mutation. The design process of the GA method is as follows:

Step 1 Coding: Binary coding is adopted, and the number of bits ($N_{bit}$) depends on the desired accuracy. Suppose the range of parameter $x$ is $[L_x, U_x]$ and the required precision is $\varepsilon$; then the number of bits is obtained as follows:

$$ B_x = \log_{2}\left[ (U_x - L_x)/\varepsilon \right] \tag{11} $$

For the parameters of the weight matrices $Q$ and $R$ ($q_i$ and $r_i$), the number of bits in the whole chromosome is obtained as:

$$ N_{bit} = \sum_{i=1}^{n_1} B_{q_i} + \sum_{i=1}^{n_2} B_{r_i} \tag{12} $$

where $n_1$ and $n_2$ are the numbers of parameters included in $Q$ and $R$, respectively.

Step 2 Divide searching space: The process begins by dividing the bounded parametric search space into a finite number of pieces, to avoid the solution being trapped in local minima.

Step 3 Initialization of population: An initial population of $M$ individuals is randomly generated. Each individual is a vector of genes of a certain length, as shown in Fig. 2. Each gene is coded in binary with a certain number of bits ($N_{bit}$).

Fig. 2 Parameters represented by genes in chromosome.

Step 4 Normalize the solution: The binary value of each gene is normalized within the range $[q_{i,\min}, q_{i,\max}]$ or $[r_{\min}, r_{\max}]$ by a linear mapping function:

$$ gene(i) = q_{i,\min} + \frac{(q_{i,\max} - q_{i,\min}) \times y(i)}{2^{N_{bit}} - 1} \tag{13} $$

$$ gene(i) = r_{\min} + \frac{(r_{\max} - r_{\min}) \times y(i)}{2^{N_{bit}} - 1} \tag{14} $$

where $y(i)$ is the value of the binary code of each gene and $N_{bit}$ is the number of bits per gene.

Step 5 Fitness computation: The performance of each solution in the current population is computed by using the fitness function:

$$ F_{GA} = \frac{M}{f_{ev} + 1} \tag{15} $$

where $f_{ev}$ is the cost function, evaluated from the Root Mean Square (RMS) of the pendulum angle from the vertical, given by:

$$ f_{ev} = w_1 \sum_{t=0}^{T/2} (\phi(t))^{2} + w_2 \sum_{t=T/2}^{T} (\phi(t))^{2} \tag{16} $$

Thus, chromosomes with a higher cost $f_{ev}$ have a lower fitness value.

Step 6 Reproduction: Reproduction options determine how the GA creates children for the next generation from the parents. Two types of reproduction are used in this paper:

Crossover: This enables the algorithm to extract the best genes from different individuals by selecting genes from a pair of individuals in the current generation and recombining them into potentially superior children for the next generation. num_c points are randomly selected for crossover, and dynamic crossover is used, so that the crossover rate ($P_c$) of each generation ($Gen$) can be expressed as:

$$ P_c = \exp(-Gen / Max\_Gen) \tag{17} $$

Mutation: This function makes small random changes in the individuals, providing genetic diversity and thereby increasing the likelihood that the algorithm will generate individuals with a better fitness value. For each member of the population, a random number in the range (0, 1) is generated. If the random number is below a pre-specified mutation threshold, the gene is allowed to mutate. In this paper, num_m points are randomly selected for mutation, and dynamic mutation is used, so that the mutation rate ($P_m$) of each generation ($Gen$) can be expressed as:

$$ P_m = \exp(0.05 \times Gen / Max\_Gen) - 1 \tag{18} $$

Step 7 Replacement: The new generation from Step 6 replaces the current population.

Step 8 Repeat steps 3-7 until achieved: Steps 3-6 are repeated in the new generation until convergence is achieved. The algorithm stops if it meets any one of three stopping criteria.

Step 9 Assign top ranking of children to the pool: After the GA convergence criterion is achieved, the children with the top-ranking fitness values are assigned to the decision pool.

Step 10 Repeat steps 2-9 to meet the final subspace.

V. RESULTS

Following the step-by-step design of the previous section, this section presents the convergence of the GA method, the design parameters for LQR-GA, and the comparison between LQR-GA and the experiential-LQR [6].

The design parameters for LQR-GA are selected as follows: the search domains for $q_1$, $q_2$, $q_3$, $q_4$ and $r$ are all $[1, 10^{9}]$, $\varepsilon = 1 \times 10^{4}$, $M = 100$, $N_{bit} = 20$, and $Max\_Gen = 10$. After running the GA, the optimal weights are obtained as:

$$ r = 0.0364 \times 10^{6}, \qquad q = [3.2482,\ 0.0376,\ 0.4734,\ 0.0462] \times 10^{7} $$

and the feedback gain matrix is $K = [-29.8724,\ -20.2218,\ 71.5782,\ 14.4192]$. The convergence curve of the GA method with dynamic crossover and mutation is shown in Fig. 3.
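As a quick consistency check on the reported gain, the closed-loop matrix $A - BK$ implied by (5) and (10) can be tested for stability. The sketch below is not part of the paper's MATLAB computations. It is a plain-Python illustration that assumes the rounded matrix entries printed in (5), builds the characteristic polynomial with the Faddeev-LeVerrier recursion, and applies the Routh-Hurwitz conditions for a quartic. All function names are illustrative.

```python
# Linearized model (5) and the LQR-GA gain reported in Section V.
A = [
    [0.0, 1.0, 0.0, 0.0],
    [0.0, -0.1818, 2.6727, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, -0.4545, 31.1818, 0.0],
]
B = [0.0, 1.8182, 0.0, 4.5455]
K_GA = [-29.8724, -20.2218, 71.5782, 14.4192]


def closed_loop(a, b, k):
    """Return A - B*K for a single-input system with u = -K x, as in (10)."""
    n = len(a)
    return [[a[i][j] - b[i] * k[j] for j in range(n)] for i in range(n)]


def char_poly(m):
    """Monic characteristic polynomial coefficients, highest power first.

    Uses the Faddeev-LeVerrier recursion (an illustrative choice, not the
    authors' method).
    """
    n = len(m)
    coeffs = [1.0]
    mk = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(1, n + 1):
        # amk = m * mk
        amk = [[sum(m[i][p] * mk[p][j] for p in range(n)) for j in range(n)]
               for i in range(n)]
        c = -sum(amk[i][i] for i in range(n)) / k
        coeffs.append(c)
        mk = [[amk[i][j] + (c if i == j else 0.0) for j in range(n)]
              for i in range(n)]
    return coeffs


def is_hurwitz_quartic(coeffs):
    """Routh-Hurwitz test for s^4 + a3 s^3 + a2 s^2 + a1 s + a0."""
    _, a3, a2, a1, a0 = coeffs
    return (min(a3, a2, a1, a0) > 0
            and a3 * a2 > a1
            and a3 * a2 * a1 > a1 ** 2 + a3 ** 2 * a0)


if __name__ == "__main__":
    coeffs = char_poly(closed_loop(A, B, K_GA))
    print("characteristic polynomial:", [round(c, 3) for c in coeffs])
    print("closed loop stable:", is_hurwitz_quartic(coeffs))
```

With the values above, the test should report a stable closed loop, while the open-loop matrix $A$ alone fails it, which is consistent with the open-loop instability noted in Section I.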

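The coding and scheduling formulas of Section IV, namely (11), (13)-(14), (15), (17) and (18), can be sketched in a few lines of Python. This is a minimal illustration, not the authors' MATLAB implementation. The function names are hypothetical, and rounding the bit count of (11) up to an integer is an assumption, since (11) does not state how a fractional value is handled.

```python
import math


def bits_for_range(lower, upper, eps):
    """Bits needed for a parameter in [lower, upper] at precision eps, per (11).

    Rounding up with ceil is an assumption; (11) gives only the logarithm.
    """
    return math.ceil(math.log2((upper - lower) / eps))


def decode_gene(y, lower, upper, n_bit):
    """Linear mapping (13)-(14) of an integer gene value y in [0, 2**n_bit - 1]."""
    return lower + (upper - lower) * y / (2 ** n_bit - 1)


def crossover_rate(gen, max_gen):
    """Dynamic crossover rate P_c of (17)."""
    return math.exp(-gen / max_gen)


def mutation_rate(gen, max_gen):
    """Dynamic mutation rate P_m of (18)."""
    return math.exp(0.05 * gen / max_gen) - 1.0


def fitness(m, f_ev):
    """Fitness F_GA of (15): a larger cost f_ev gives a smaller fitness."""
    return m / (f_ev + 1.0)


if __name__ == "__main__":
    max_gen = 10
    for gen in range(max_gen + 1):
        print(gen, round(crossover_rate(gen, max_gen), 4),
              round(mutation_rate(gen, max_gen), 4))
    # Decode the smallest and largest 20-bit gene over the domain [1, 1e9].
    print(decode_gene(0, 1.0, 1e9, 20), decode_gene(2 ** 20 - 1, 1.0, 1e9, 20))
```

Under these schedules the crossover rate decays from 1 at generation 0 to $e^{-1} \approx 0.37$ at $Max\_Gen$, while the mutation rate grows from 0 to $e^{0.05} - 1 \approx 0.051$, so the search moves from recombination-heavy exploration toward slightly more mutation late in the run.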
Fig. 3 Convergence curve of GA

The output responses of the pendulum's angle and the cart's position using the LQR designed by GA, compared with the experiential-LQR [6], are shown in Fig. 4 and Fig. 5, respectively. The responses in the figures show that the systems are successfully stabilized, as required by the design criteria stated in Section II.

Fig. 4 Step response of the pendulum's angle with LQR controller designed parameter by GA.

Fig. 5 Step response of the cart's position with LQR controller designed parameter by GA.

The results of the LQR-GA method and those of the experiential-LQR [6] method are compared in Table 1 and Table 2 for the pendulum's angle and the cart's position, respectively.

TABLE 1
Results by LQR-GA method and experiential-LQR [6] method for pendulum's angle

Object                     LQR-GA                                    LQR [6]
Feedback gain, K           [-29.8724, -20.2218, 71.5782, 14.4192]    [-70.7107, -37.8345, 105.5298, 20.9238]
Settling time, ts          1.83 s                                    3.34 s
Maximum overshoot range    [11.24°, -24.54°]                         [19.57°, -46.47°]
Steady state error, ess    0                                         0

TABLE 2
Results by LQR-GA method and experiential-LQR [6] method for cart's position

Object                     LQR-GA     LQR [6]
Rising time, tr            0.58 s     0.41 s
Settling time, ts          1.16 s     2.04 s
Percent overshoot, OS      0.94 %     0 %
Steady state error, ess    0          0

The tables show that the LQR-GA method performs better than the LQR [6] method. From Table 1, LQR-GA has the smaller settling time ($t_s$) of 1.83 seconds, compared with 3.34 seconds for LQR [6]. In addition, the LQR-GA method has the narrower maximum overshoot range, while the LQR [6] method exceeds the maximum range of 20 degrees. Finally, both LQR-GA and the experiential-LQR [6] have a steady state error of 0. From Table 2, the experiential-LQR [6] has the faster rising time of 0.41 seconds, while LQR-GA has a rising time of 0.58 seconds. However, LQR [6] has the larger settling time of 2.04 seconds, compared with 1.16 seconds for LQR-GA. For the percent overshoot, LQR [6] has 0% overshoot, while LQR-GA has a small overshoot of 0.94%. From the settling times, we can say that the LQR-GA controller is able to respond faster than the experiential-LQR [6].

VI. CONCLUSION

In this paper, GA was introduced to LQR, and a new optimal control method (LQR-GA) was successfully designed. The control results of the proposed method were compared with those of the experiential-LQR [6] for controlling the pendulum's angle and the cart's position of the linearized system. Simulation results show that the LQR-GA controller has better performance than the experiential-LQR [6] in controlling the IPS. As seen in the results, LQR-GA reduces the demands on the designer, and the design process is done automatically, which clearly improves design efficiency. In the future, the design idea can be extended to other control methods, such as a fuzzy logic controller or a neuro-fuzzy controller for the non-linear model of the IPS, which needs further study.

REFERENCES

[1] C. Mellon, University of Michigan, www.engine.umich.edu/group/ctm/examples
[2] P.O.M. Scokaert and J.B. Rawlings, "Constrained linear quadratic regulation," IEEE Trans. Automatic Control, 43(8):1163-1169, 1998.
[3] C. Wei, J. Fang and L.L. Kam Kin, Fuzzy logic controller for an Inverted Pendulum System, Faculty of Science and Technology, University of Macau.
[4] Y.B. Shtessel, "Principle of proportional damages in a multiple criteria LQR problem," IEEE Trans. Automatic Control, 41(3):461-464, 1996.
[5] R.J. Pulles, "Controller design for ADAMS models using Matlab/SIMULINK interaction," Technishe Universiteit Eindhoven, 2003.
[6] A.N.K. Nasir, M.J. Rahmat and M.A. Ahmad, "Performance comparison between LQR and fuzzy logic controller for an inverted pendulum system," University Malaysia Pahang.
[7] E. Dilettoso and N. Salerno, "A self-adaptive niching genetic algorithm for multimodal optimization of electromagnetic devices," IEEE Trans. Magnetics, 42(4):1203-1206, 2006.
