
Inverted pendulum with obstacles

1 Constants
∆t = 0.1 : Time step
g = 9.81 : Gravity
L = 3.0 : Half-length of the track
l = 0.5 : Half-length of the pole
mc = 1.0 : Mass of the cart
mp = 0.1 : Mass of the pole
µc = 0.0005 : Friction coefficient of the cart
µp = 0.000002 : Friction coefficient of the pole
ẍmax = 15.0 : Maximum acceleration applied to the cart
ẋmax = 15.0 : Maximum velocity supported by the system
θ̇max = 10.0 : Maximum angular velocity supported by the system

2 Obstacles
There are 8 obstacles on the track, each with a radius of 0.05:
(−3.0, −0.35)
(−2.5, 0.35)
(−2.0, −0.35)
(−1.25, 0.35)
(−0.5, −0.35)
(0.5, 0.35)
(1.75, −0.35)
(3.0, 0.35)
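
The obstacle list above can be turned into a simple geometric test. The sketch below is only one possible reading: it assumes that each pair is (horizontal position, height relative to the pole's pivot), that the pivot sits at (x, 0), that θ = 0 means the pole points straight up, and that the pole is a segment of length 2 ∗ l. None of these conventions is stated in the document, and the function names are hypothetical. It checks a single configuration only, not the whole movement between two time steps.

```python
import math

OBSTACLE_RADIUS = 0.05
OBSTACLES = [
    (-3.0, -0.35), (-2.5, 0.35), (-2.0, -0.35), (-1.25, 0.35),
    (-0.5, -0.35), (0.5, 0.35), (1.75, -0.35), (3.0, 0.35),
]
POLE_LENGTH = 2 * 0.5  # twice the half-length l


def pole_tip(x, theta):
    # ASSUMPTION: pivot at (x, 0), theta = 0 is upright, positive theta
    # tilts the pole toward positive x.
    return x + POLE_LENGTH * math.sin(theta), POLE_LENGTH * math.cos(theta)


def point_segment_distance(px, py, ax, ay, bx, by):
    """Distance from point P to the segment AB."""
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter of P onto AB, clamped to stay on the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def pole_hits_obstacle(x, theta):
    """True if the pole segment comes within one radius of any obstacle centre."""
    tip_x, tip_y = pole_tip(x, theta)
    return any(
        point_segment_distance(ox, oy, x, 0.0, tip_x, tip_y) <= OBSTACLE_RADIUS
        for ox, oy in OBSTACLES
    )
```

Under these assumptions the hanging pole at x = −2.5 is clear of every obstacle, while an upright pole at the same cart position would intersect the obstacle at (−2.5, 0.35).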

3 State variables
x : Position of the cart
ẋ : Velocity of the cart
θ : Angular position of the pole in [0, 2 ∗ π]
θ̇ : Angular velocity of the pole

4 Actions
a1 = −ẍmax
a2 = ẍmax

5 Dynamics
a11 = (4 ∗ l) /3
a22 = − (mc + mp )
a12 = − cos (θ)
a21 = l ∗ mp ∗ cos (θ)
b1 = g ∗ sin (θ) − µp ∗ θ̇ / (l ∗ mp )
b2 = l ∗ mp ∗ θ̇² ∗ sin (θ) − aj + µc ∗ sign (ẋ)

θ̈ = (b2 ∗ a12 − a22 ∗ b1) / (a12 ∗ a21 − a11 ∗ a22)

ẍ = (b1 − a11 ∗ θ̈) / a12

[ẋ, θ̇] = [ẋ, θ̇] + [∆t ∗ ẍ, ∆t ∗ θ̈]
[x, θ] = [x, θ] + [∆t ∗ ẋ, ∆t ∗ θ̇]
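
A minimal Python sketch of one simulation step, reading section 5 as the linear system a11 ∗ θ̈ + a12 ∗ ẍ = b1, a21 ∗ θ̈ + a22 ∗ ẍ = b2. Three choices here are assumptions rather than part of the specification: ẍ is recovered from the second row, which is algebraically equivalent to dividing by a12 but avoids a division by cos (θ) near zero; positions are advanced with the velocities from before the update (explicit Euler); and θ is wrapped back into [0, 2 ∗ π) with a modulo. The limits ẋmax and θ̇max are not applied, since the document does not say how they are enforced. All function names are hypothetical.

```python
import math

# Constants copied from section 1.
DT = 0.1            # ∆t, time step
G = 9.81            # g, gravity
POLE_HALF = 0.5     # l, half-length of the pole
M_CART = 1.0        # mc
M_POLE = 0.1        # mp
MU_CART = 0.0005    # µc
MU_POLE = 0.000002  # µp
ACC_MAX = 15.0      # ẍmax, magnitude of the two actions


def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def accelerations(x_dot, theta, theta_dot, action):
    """Solve the 2x2 system of section 5 and return (x_acc, theta_acc)."""
    a11 = 4.0 * POLE_HALF / 3.0
    a12 = -math.cos(theta)
    a21 = POLE_HALF * M_POLE * math.cos(theta)
    a22 = -(M_CART + M_POLE)
    b1 = G * math.sin(theta) - MU_POLE * theta_dot / (POLE_HALF * M_POLE)
    b2 = (POLE_HALF * M_POLE * theta_dot ** 2 * math.sin(theta)
          - action + MU_CART * sign(x_dot))
    theta_acc = (b2 * a12 - a22 * b1) / (a12 * a21 - a11 * a22)
    # ASSUMPTION: use the second row (a22 is never zero) instead of
    # (b1 - a11 * theta_acc) / a12, which is singular when cos(theta) = 0.
    x_acc = (b2 - a21 * theta_acc) / a22
    return x_acc, theta_acc


def step(state, action):
    """Advance (x, x_dot, theta, theta_dot) by one time step."""
    x, x_dot, theta, theta_dot = state
    x_acc, theta_acc = accelerations(x_dot, theta, theta_dot, action)
    # ASSUMPTION: explicit Euler, positions use the pre-update velocities.
    new_x = x + DT * x_dot
    new_theta = (theta + DT * theta_dot) % (2.0 * math.pi)
    new_x_dot = x_dot + DT * x_acc
    new_theta_dot = theta_dot + DT * theta_acc
    return new_x, new_x_dot, new_theta, new_theta_dot
```

For example, `step((-2.5, 0.0, math.pi, 0.0), ACC_MAX)` applies action a2 to the initial state of section 7.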

6 Reward
r(xt, at) = 0 if the resulting state has |x| > L or if the pole touched one of
the obstacles during its movement. Otherwise, r(xt, at) is computed from the
pole angle θ of the next state:
(1 + cos (θ)) /2
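
The reward rule above can be sketched as a small Python function. The name `reward` and the `hit_obstacle` flag are hypothetical: the flag is assumed to come from whatever collision test the simulator uses, which this section does not define.

```python
import math

TRACK_HALF_LENGTH = 3.0  # L from section 1


def reward(next_x, next_theta, hit_obstacle):
    """Reward computed from the next state, as described in section 6."""
    # Zero reward when the cart leaves the track or the pole touched an obstacle.
    if abs(next_x) > TRACK_HALF_LENGTH or hit_obstacle:
        return 0.0
    # Otherwise the reward grows from 0 (θ = π) to 1 (θ = 0).
    return (1.0 + math.cos(next_theta)) / 2.0
```

With this rule the initial state of section 7 (θ = π) yields a reward of 0, and the reward reaches its maximum of 1 at θ = 0.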

7 Initial state
x = −2.5
ẋ = 0.0
θ = π
θ̇ = 0.0
