UAV Path Planning Based On Receding Horizon Control With Adaptive Strategy
Zhe Zhang1 , Jun Wang1,2 , Jianxun Li1 , Xing Wang2
1. Department of Automation, Shanghai Jiao Tong University, Shanghai, 200240
E-mail: lijx@sjtu.edu.cn, zhangzhebobo@sjtu.edu.cn
2. Luoyang Electronic Equipment Test Center of China, Luoyang, 471000
E-mail: sjzdhwj@sjtu.edu.cn
Abstract: Recently, researchers have shown great interest in the Unmanned Aerial Vehicle (UAV) path planning problem, driven by the development of artificial intelligence algorithms. However, the convergence of such heuristic methods cannot be proved. Therefore, the UAV path planning problem has been modeled as a linear optimal control problem to ensure convergence. But the computing time increases exponentially as the scale of the problem grows. Hence, Receding Horizon Control (RHC) is introduced to the problem to guarantee efficiency. Nevertheless, how to choose a proper receding horizon span that balances efficiency and accuracy becomes a new problem. In this paper, an adaptive strategy is proposed to choose a proper parameter that meets the real-time requirement while limiting fuel consumption. The strategy shows better performance in two simulation scenes, which indicates the effectiveness of the algorithm.
Key Words: UAV, Path Planning, Receding Horizon Control, Adaptive Strategy, Optimal Control
978-1-5090-4657-7/17/$31.00 ©2017 IEEE    843
Authorized licensed use limited to: National Centre of Scientific Research "Demokritos" - Greek Atomic Energy Commission. Downloaded on March 17, 2023 at 09:03:01 UTC from IEEE Xplore. Restrictions apply.
MILP model of the UAV path planning. The traditional RHC is introduced in Section 3. Section 4 is the main contribution of the paper, which proposes an adaptive strategy to choose the receding horizon time span. Simulation results, which verify the effectiveness of the algorithm, can be found in Section 5, and Section 6 concludes this paper.

2 SYSTEM MODEL

A UAV path planning problem is an optimal control problem and can be described in discrete form as follows:

$$\min_{x,u} \; J = \sum_{k=0}^{N-1} L(x_k, u_k, k) + F(x_N, u_N) \quad (1)$$
$$\text{s.t.}\quad x_{k+1} = f(x_k, u_k, k) \quad (2)$$
$$x_k \in \mathcal{X} \quad (3)$$
$$u_k \in \mathcal{U} \quad (4)$$

where Eq. (1) is the cost function of the problem, $x_k$ is the state variable at the $k$-th step and $u_k$ is the control variable at the $k$-th step. $L(x_k, u_k, k)$ is the stage cost of $x_k$, $u_k$ and $k$; $N$ is the last step of the simulation, so $F(x_N, u_N)$ is the cost of the final state $x_N$ and $u_N$. Eq. (2) is the state equation of the problem and $f(x_k, u_k, k)$ is a function of $x_k$, $u_k$ and $k$. In Eqs. (3)(4), $\mathcal{X}$, $\mathcal{U}$ are the feasible regions of the state variable $x$ and the control variable $u$, respectively.

Receding Horizon Control needs to consider the states and inputs of future steps with respect to the current step, so the reference prediction form is introduced: $x_{k+i|k}$ means the current step is $k$, the future step is $k+i$ and the step span is $i$. For example, Eq. (2) can be re-expressed as

$$x_{k+i+1|k} = f(x_{k+i|k}, u_{k+i|k}, k) \quad (5)$$

Variables in the rest of the paper will be expressed in the reference prediction form to distinguish between the current step and future steps.

2.1 Cost Function

Since our problem depends on time only implicitly, the functions $L$, $F$ in Eq. (1) can be expressed as $L(x_{k+i|k}, u_{k+i|k}, x_f)$, $F(x_{k+N|k}, x_f)$, where $x_f$ is the state variable of the destination. Considering distance and fuel consumption, the specific form of the cost function (1) is

$$\min_{x,u} \; J = a \sum_{i=1}^{N-1} \left\| x_{k+i|k} - x_f \right\| + b \sum_{i=0}^{N-1} \left\| u_{k+i|k} \right\| + c \left\| x_{k+N|k} - x_f \right\| \quad (6)$$

where $a$, $b$, $c$ are nonnegative weighting factors.

2.2 State Equation

This paper works in the Mixed Integer Linear Programming (MILP) framework, so the state equation must be linear. Therefore, the state equation is

$$x_{k+i+1|k} = A x_{k+i|k} + B u_{k+i|k} \quad (7)$$

where

$$A = \begin{bmatrix} I_3 & \Delta t \cdot I_3 \\ O_3 & I_3 \end{bmatrix}, \quad B = \begin{bmatrix} \tfrac{1}{2} (\Delta t)^2 \cdot I_3 \\ \Delta t \cdot I_3 \end{bmatrix} \quad (8)$$

and $\Delta t$ is the sample time. Eq. (7) is the standard form of the Euler method.

2.3 Feasible Region

The path planning model can be used for obstacle avoidance, collision avoidance, plume avoidance for vehicles, plume avoidance for obstacles, trajectory optimization, fleet assignment, and so on. The feasible regions $\mathcal{X}$, $\mathcal{U}$ of the above applications are the same as in Richards's paper [9].

2.4 Problem Modeling

From the above subsections, the optimal control model of our problem is denoted $\mathcal{P}(k, N)$, where $k$ is the current step and $N$ is the receding horizon span:

$$\mathcal{P}(k, N): \quad \min_{x,u} \; J = a \sum_{i=1}^{N-1} \left\| x_{k+i|k} - x_f \right\| + b \sum_{i=0}^{N-1} \left\| u_{k+i|k} \right\| + c \left\| x_{k+N|k} - x_f \right\| \quad (6)$$
$$\text{s.t.}\quad x_{k+i+1|k} = A x_{k+i|k} + B u_{k+i|k} \quad (7)$$
$$x_{k+i|k} \in \mathcal{X}, \quad i = 1, \ldots, N \quad (9)$$
$$u_{k+i|k} \in \mathcal{U}, \quad i = 0, \ldots, N-1 \quad (10)$$
$$x_{k+i|k} \notin \mathcal{O}, \quad i = 1, \ldots, N \quad (11)$$
$$x_{k|k} = x_{k|ini}, \quad u_{k|k} = u_{k|ini} \quad (12)$$

where $\mathcal{O}$ is the region of obstacles defined in [9] and $x_{k|ini}$, $u_{k|ini}$ are the initial state of the $k$-th step.

The optimal control model $\mathcal{P}(k, N)$ has now been set up; the Receding Horizon Control algorithm is introduced in the next section.

3 RECEDING HORIZON CONTROL

Receding Horizon Control works as follows: at each discrete sampling step, using the current system state as the initial state, solve a finite-horizon open-loop optimal control problem online to obtain an optimal control sequence, apply only the first control signal of the sequence to the system, and repeat the process at the next sampling step. Using this strategy, the UAV path planning algorithm based on Receding Horizon Control is given as Algorithm 1.

To use Algorithm 1, the receding horizon span $N$ must be given in advance. How to choose a proper $N$ is a problem in itself, and it is a tradeoff between efficiency and accuracy. A bigger $N$ means a smaller cost value but a heavier computing burden. Choosing a small $N$ can meet the requirements of real-time application but may become trapped in a local optimum. So we propose an adaptive strategy to choose a proper $N$ at each step, which is discussed in detail in the next section.

4 ADAPTIVE STRATEGY

In this section, we propose an adaptive strategy to choose the receding horizon span $N$. We regard $M$ as the final step number, i.e., the step at which the UAV just arrives at the destination. The strategy is suitable for a UAV flying in a city environment. The process of the strategy is as follows:

(1) Calculate the distance between the start point $x_0$ and the final point $x_f$: $dis = \| x_f - x_0 \|_2$, where $\| \cdot \|_2$ is the Euclidean norm.
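The state equation (7) and the matrices of Eq. (8) can be checked with a minimal numerical sketch, assuming the state stacks 3-D position over 3-D velocity and $\Delta t = 1$ s (the sampling time used in the simulations):

```python
import numpy as np

dt = 1.0  # sampling time Delta-t; 1 s as in the simulation scenes

I3 = np.eye(3)
O3 = np.zeros((3, 3))

# Eq. (8): Euler-discretized double integrator
A = np.block([[I3, dt * I3],
              [O3, I3]])
B = np.vstack([0.5 * dt ** 2 * I3,
               dt * I3])

# One step of Eq. (7): x_{k+1} = A x_k + B u_k
x = np.zeros(6)                    # at rest at the origin: [position; velocity]
u = np.array([1.0, 0.0, 0.0])      # unit acceleration along the first axis
x = A @ x + B @ u                  # -> position 0.5*dt^2, velocity dt on that axis
```

A state is a 6-vector (position stacked over velocity), so $A$ is $6 \times 6$ and $B$ is $6 \times 3$, matching the block structure of Eq. (8).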
Algorithm 1: UAV Path Planning Algorithm Based on Receding Horizon Control
1 Initialization:
  Input: initial position $x_0$, final position $x_f$, receding horizon span $N$
2 Set $x_{0|ini} = x_0^* = x_0$, $u_{0|ini} = u_0^* = 0$
3 for $k = 0$ until meeting the terminal condition do
4   Obtain the optimal sequences $\{x_{k+i|k}^*\}_{i=1}^{N}$, $\{u_{k+i|k}^*\}_{i=1}^{N}$ by solving $\mathcal{P}(k, N)$
5   if $x_{k+N|k}^* = x_f$ then
6     Set $x_{k+i}^* = x_{k+i|k}^*$, $u_{k+i}^* = u_{k+i|k}^*$, $i = 1, \ldots, N$
7     break
8   end
9   Update: $x_{k+1|ini} = x_{k+1|k}^*$, $u_{k+1|ini} = u_{k+1|k}^*$, $x_{k+1}^* = x_{k+1|k}^*$, $u_{k+1}^* = u_{k+1|k}^*$, $k = k + 1$
10 end
  Output: $\{x_i^*\}_{i=0}^{k+N}$, $\{u_i^*\}_{i=0}^{k+N}$

(2) Ignoring the influence of obstacles and other factors, calculate the minimum final step number $M_{min}$ on the basis of the maximum velocity $V_{max}$ and the maximum acceleration $A_{max}$.

(3) According to the map size and the number of obstacles, set $N = M_{min} + N_{ref}$; in the actual flight test, the value of $N_{ref}$ can be chosen as $\tfrac{1}{3} M^*$ (rounded up if not an integer).

(6) If the requirements on efficiency and accuracy are high, the horizon can no longer be a fixed number; an adaptive $N$ is required to meet the requirements. The UAV path planning algorithm based on Receding Horizon Control with the adaptive strategy for choosing $N$ is given as Algorithm 2.

Algorithm 2: UAV Path Planning Algorithm Based on Receding Horizon Control with Adaptive Strategy
1 Initialization:
  Input: initial position $x_0$, final position $x_f$
2 Set $x_{0|ini} = x_0^* = x_0$, $u_{0|ini} = u_0^* = 0$, receding horizon span $N = \omega_{ini} \cdot M^*$
3 for $k = 0$ until meeting the terminal condition do
4   Obtain the optimal sequences $\{x_{k+i|k}^*\}_{i=1}^{N}$, $\{u_{k+i|k}^*\}_{i=1}^{N}$ by solving $\mathcal{P}(k, N)$
5   if $x_{k+N|k}^* = x_f$ then
6     Set $x_{k+i}^* = x_{k+i|k}^*$, $u_{k+i}^* = u_{k+i|k}^*$, $i = 1, \ldots, N$
7     break
8   end
9   Update: current covered distance $dis(k) = \| x_{k|k} - x_{k+N|k} \|_2$
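The solve–apply–repeat loop of Algorithms 1 and 2 can be sketched as follows. Note that `solve_P` is a hypothetical stand-in for the MILP solve of $\mathcal{P}(k, N)$ (it simply steers straight toward the goal), so the sketch shows only the receding-horizon control flow, not the optimization itself:

```python
import numpy as np

x_f = np.array([10.0, 0.0, 0.0])  # destination (illustrative value)

def solve_P(x0, N):
    """Placeholder for solving P(k, N): return N predicted states and controls."""
    xs, us = [], []
    x = x0.copy()
    for _ in range(N):
        step = x_f - x
        d = np.linalg.norm(step)
        u = step / d if d > 1.0 else step  # capped unit step toward the goal
        x = x + u                          # trivial "dynamics" for the sketch
        us.append(u)
        xs.append(x.copy())
    return xs, us

def rhc(x0, N, max_steps=100):
    """Receding horizon loop: apply only the first control, then re-solve."""
    x, path = x0.copy(), [x0.copy()]
    for _ in range(max_steps):
        xs, us = solve_P(x, N)
        if np.allclose(xs[-1], x_f):       # horizon reaches the goal: commit it
            path.extend(xs)
            break
        x = x + us[0]                      # apply the first control only
        path.append(x.copy())
    return path

path = rhc(np.zeros(3), N=3)
```

Each iteration discards all but the first control of the solved sequence, exactly the mechanism that lets a short horizon keep per-step computation low at the price of possible local optima.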
5.1 Scene One

There are six cubic obstacles simulating buildings in the first scene, and the map size is $75 \times 120 \times 40$. Using the strategy of Section 4, $N = 13 + 20 = 33$ is chosen the first time, which gives $M^* = 19$. Some initial parameters are listed here: sampling time $dt = 1$ s, $x_0 = (65, -60, 0)$, $x_f = (98, 50, 10)$, $v_0 = v_f = 0$ m/s, $a_0 = a_f = 0$ m/s², $a = c = 1$, $b = 0.01$, $\omega_{dis} = 0.6$, $\omega_{obs} = 0.4$, $\omega_N = 0.88$.

There are six trajectories in Figure 1, shown from two perspectives for clarity. The comparison between the different values of $N$ and the adaptive strategy is illustrated in Figure 2.

Table 2: Simulation results for a UAV flying in the city environment with $O_n = 6$

N                  Total time   Ave time   M    Opt value
5                  56.26        1.62       34   1499.49
8                  40.45        2.31       22   987.23
10                 29.2         3.48       19   882.28
14                 16.04        5.58       19   865.57
Adaptive strategy  15.95        1.97       19   843.55
19                 8.55         8.55       19   843.55

Figure 1: Different perspectives for UAV trajectories with different $N$ in the city environment with $O_n = 6$. (a) UAV trajectories front view; (b) UAV trajectories top view.

Figure 2: UAV trajectories comparison in the city environment with $O_n = 6$. (a) UAV trajectories $N = 5$ vs. $N = 8$; (b) UAV trajectories adaptive strategy vs. $N = 8$; (c) UAV trajectories adaptive strategy vs. $N = 10$; (d) UAV trajectories adaptive strategy vs. $N = 19$.

Figure 3: Optimal value and average time for the different horizon spans of Table 2.

The simulation result for one UAV flying in the city environment with $O_n = 6$ can be found in Table 2 and is also illustrated in Figure 3. A smaller $N$ needs less computing time per step but causes more fuel consumption, and a larger $N$ the opposite, which verifies the tradeoff described in Section 4. $N = 5$ costs the least at each step but has the maximum optimal value; it is the smallest usable receding horizon span in the first scene, because below 5 the UAV becomes trapped and cannot reach the destination. The adaptive strategy uses a little more time than $N = 5$, but its fuel consumption is much lower. The optimal value of the adaptive strategy should be a little larger than that of $N = 19$, but is the same in the first scene because of the map size. In fact, there are some differences between the trajectories of the adaptive strategy and $N = 19$, as can be seen in Figure 2(d). The simulation results indicate that the adaptive strategy has the better performance.

5.2 Scene Two

There are eleven cubic obstacles simulating buildings in the second scene, and the map size is $200 \times 300 \times 100$. Using the strategy of Section 4, $N = 27 + 100 = 127$ is chosen the first time, which gives $M^* = 42$. Some initial parameters are: $x_f = (200, 300, 50)$, $v_0 = v_f = 0$ m/s, $a_0 = a_f = 0$ m/s², $a = c = 1$, $b = 0.01$, $\omega_{dis} = 0.6$, $\omega_{obs} = 0.4$, $\omega_N = 0.88$.

There are seven trajectories in Figure 4, shown from two perspectives for clarity. The comparison between the different values of $N$ and the adaptive strategy is illustrated in Figure 5. The simulation result for one UAV flying in the city environment with $O_n = 11$ can be found in Table 3 and is also illustrated in Figure 6.

Figure 4: Different perspectives for UAV trajectories with different $N$ in the city environment with $O_n = 11$. (a) UAV trajectories front view; (b) UAV trajectories left view.

The data analysis of scene two is the same as for scene one and is not listed here because of the length of the paper. $N = 42$ gives the optimal solution and $N = 8$ is the smallest receding horizon span. The adaptive strategy performs better than the other fixed values of $N$, which again indicates that our algorithm is practical and effective.
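The horizon initialization used in both scenes ($N = M_{min} + N_{ref}$, steps (1)–(3) of Section 4) can be sketched as below. The maximum-velocity bound `v_max = 9.0` m/s is a hypothetical value, chosen only so that the sketch reproduces Scene One's $M_{min} = 13$; the paper's step (2) also uses $A_{max}$, which this simplified sketch ignores:

```python
import math

def init_horizon(x0, x_f, v_max, dt, n_ref):
    """Steps (1)-(3) of the adaptive strategy (simplified: velocity bound only)."""
    dis = math.dist(x0, x_f)               # step (1): straight-line distance
    m_min = math.ceil(dis / (v_max * dt))  # step (2): obstacle-free minimum steps
    return m_min + n_ref                   # step (3): N = M_min + N_ref

# Scene One numbers: x_0 = (65, -60, 0), x_f = (98, 50, 10), dt = 1 s, N_ref = 20.
# With the hypothetical v_max = 9 m/s, M_min = 13 and N = 13 + 20 = 33.
N = init_horizon((65, -60, 0), (98, 50, 10), v_max=9.0, dt=1.0, n_ref=20)
```

Rounding $M_{min}$ up keeps the horizon conservative: the obstacle-free flight time is never underestimated, so the initial $N$ can only err on the longer side.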
Figure 5: UAV trajectories comparison in the city environment with $O_n = 11$. (a) UAV trajectories $N = 8$ vs. $N = 10$; (b) UAV trajectories adaptive strategy vs. $N = 15$; (c) UAV trajectories adaptive strategy vs. $N = 21$; (d) UAV trajectories adaptive strategy vs. $N = 42$.

Table 3: Simulation results for a UAV flying in the city environment with $O_n = 11$

N                  Total time   Ave time   M    Opt value
8                  138.4        2.76       58   6127.46
10                 208.65       4.06       55   5851.48
15                 270.125      6.76       49   5691.56
21                 248.29       10.67      45   5605.55
30                 165.67       19.68      42   5601.2
Adaptive strategy  141.39       3.81       42   5601.21
42                 32.94        32.94      42   5536.79

Figure 6: Optimal value and average time for the different horizon spans of Table 3.

6 CONCLUSION

In this paper, the model of the UAV path planning problem was introduced; it is a MILP and can be solved with guaranteed convergence. The RHC algorithm was introduced to reduce the scale of the problem, but a fixed horizon span cannot meet the requirements of efficiency and accuracy at the same time. The proposed Algorithm 2 gives an adaptive strategy to choose a proper receding horizon time span, whether fixed or varied. The simulation results prove that the proposed strategy is effective and satisfies the requirements of real-time application and fuel consumption.

REFERENCES

[1] F. Nex and F. Remondino, "UAV for 3D mapping applications: a review," Applied Geomatics, vol. 6, no. 1, pp. 1–15, 2014.
[2] J. L. Vian and J. R. Moore, "Trajectory optimization with risk minimization for military aircraft," AIAA J. Guid. Control Dyn., vol. 12, no. 3, pp. 311–317, 1989.
[3] I. Kaminer, A. Pascoal, E. Hallberg, and C. Silvestre, "Trajectory tracking for autonomous vehicles: an integrated approach to guidance and control," AIAA J. Guid. Control Dyn., vol. 21, no. 1, pp. 29–38, 1998.
[4] M. Kothari and I. Postlethwaite, "A probabilistically robust path planning algorithm for UAVs using rapidly-exploring random trees," J. Intell. Robot. Syst., vol. 71, no. 2, pp. 231–253, 2013.
[5] H. Yang and Y. Zhao, "Trajectory planning for autonomous aerospace vehicles amid known obstacles and conflicts," AIAA J. Guid. Control Dyn., vol. 27, no. 6, pp. 997–1008, 2004.
[6] G. Winter, J. Periaux, M. Galan, and P. Cuesta, Genetic Algorithms in Engineering and Computer Science. John Wiley & Sons, Inc., 1996.
[7] J. Kennedy and R. Eberhart, "Particle swarm optimization," in Proc. IEEE International Conference on Neural Networks, vol. 4, pp. 1942–1948, IEEE, 1995.
[8] D. Karaboga and B. Basturk, "A powerful and efficient algorithm for numerical function optimization: artificial bee colony (ABC) algorithm," J. Glob. Optim., vol. 39, no. 3, pp. 459–471, 2007.
[9] A. Richards, T. Schouwenaars, J. P. How, and E. Feron, "Spacecraft trajectory planning with avoidance constraints using mixed-integer linear programming," AIAA J. Guid. Control Dyn., vol. 25, no. 4, pp. 755–764, 2002.
[10] A. Richards and J. P. How, "Model predictive control of vehicle maneuvers with guaranteed completion time and robust feasibility," in Proc. American Control Conference, vol. 5, pp. 4034–4040, IEEE, 2003.
[11] M. Grant and S. Boyd, "CVX: Matlab software for disciplined convex programming, version 2.1," http://cvxr.com/cvx, 2014.