
Accepted Manuscript

Deep learning based trajectory optimization for UAV aerial refueling docking under bow
wave

Yiheng Liu, Honglun Wang, Zikang Su, Jiaxuan Fan

PII: S1270-9638(18)30823-X
DOI: https://doi.org/10.1016/j.ast.2018.07.024
Reference: AESCTE 4675

To appear in: Aerospace Science and Technology

Received date: 20 April 2018


Revised date: 29 June 2018
Accepted date: 15 July 2018

Please cite this article in press as: Y. Liu et al., Deep learning based trajectory optimization for UAV aerial refueling docking under bow
wave, Aerosp. Sci. Technol. (2018), https://doi.org/10.1016/j.ast.2018.07.024

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing
this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is
published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all
legal disclaimers that apply to the journal pertain.
Deep learning based trajectory optimization for UAV
aerial refueling docking under bow wave
Yiheng Liu a,b,c, Honglun Wang a,c,*, Zikang Su a,b,c, Jiaxuan Fan a,c
a School of Automation Science and Electrical Engineering, Beihang University, 100191, Beijing, China
b Shenyuan Honors College of Beihang University, 100191, Beijing, China
c The Science and Technology on Aircraft Control Laboratory, Beihang University, 100191, Beijing, China

Abstract: In the autonomous aerial refueling (AAR) docking process, the bow wave generated by the receiver has a strong effect on the drogue, which greatly affects the docking success rate. This paper therefore proposes a deep learning based trajectory optimization method that aims to decrease the bow wave effect on the drogue. The method has three main parts. First, a precise bow wave model based on deep learning is presented to estimate the bow wave effect on the drogue. Second, because the drogue moves dynamically, a simple and practical drogue motion prediction model under multiple disturbances is developed to predict the drogue position at the next time step. Third, considering the strict attitude constraints of the AAR docking process, a novel reference observer is designed to estimate the receiver attitude from the optimized trajectory under wind perturbations. With these components, the proposed method not only greatly diminishes the bow wave effect on the drogue but also satisfies the attitude constraints of the receiver. Finally, simulations demonstrate the effectiveness of the proposed method.
Keywords: Autonomous aerial refueling (AAR); Receiver trajectory optimization; Deep learning; Bow
wave; Drogue motion prediction.
1. Introduction
To greatly extend the endurance and range of UAVs, the autonomous aerial refueling (AAR) technique is becoming more and more important [1,2,36,37]. In general, there are two refueling methods [1,2]: the flying boom method and the probe-and-drogue method. This paper focuses on probe-and-drogue refueling (PDR).
In PDR, the receiver aircraft is required to track the drogue quickly and precisely [3,5-7]. However, the drogue moves rapidly because of the hose pull, tanker vortex, atmospheric turbulence and bow wave [4-7], as shown in Fig. 1. Most of the literature considers only the tanker vortex and atmospheric turbulence [3-10], and few studies discuss the bow wave effect that the receiver exerts on the drogue during the docking process [11-16]. In fact, the bow wave greatly affects the drogue, especially when the nose of the receiver is close to it [13,14]. In this case, the drogue is pushed away, which may lead to a failed docking or even threaten the safety of the aircraft. Under a strong bow wave, it is difficult to raise the docking success rate substantially by improving

* Corresponding author. Tel.: +86-10-82317546.


E-mail address: wang_hl_12@126.com
Fig. 1. The configuration of the hose-drogue aerial refueling system.
the performance of the receiver trajectory tracking controller alone [3-10,19]. To relieve this problem, this paper proposes a trajectory optimization method that aims to diminish the bow wave effect on the drogue. Several significant issues must be considered in the docking trajectory optimization process:
1) The bow wave needs to be modeled accurately, because the precision of the bow wave model directly determines the results of the trajectory optimization.
2) The oscillating drogue motion under multiple flow perturbations and the hose pull is difficult to predict, yet this prediction is required for on-line trajectory optimization.
3) Trajectory optimization is hard to accomplish in the AAR docking process because the receiver, which is subject to multiple wind perturbations, must satisfy strict attitude constraints.
For the first problem, most existing studies related to the bow wave focus either on qualitative static results obtained from experiments or on lookup tables based on computational fluid dynamics (CFD) analysis [11]. These modeling methods cannot be used directly in the docking controller design because of the real-time performance requirement [14]. In [11,13-16], the bow wave model is given in the form of a complicated mathematical expression. For a different aircraft, the functional form may need to be derived again, which is difficult. In [12], a fitting and interpolation method is used to model the bow wave based on CFD data, which is easy to realize but has unsatisfactory precision. For the second problem, ref. [6] uses an exact nonlinear hose-drogue assembly (HDA) model [17] to predict the drogue position and obtains good results. However, this model requires 96 state variables, so it cannot satisfy the real-time performance requirement. For the third problem, few studies discuss trajectory optimization in the AAR docking process. Although some works [3,10] use a trajectory generator to produce a smooth docking trajectory, the bow wave is not considered in the design process.
Recently, deep learning has attracted much research interest because of its remarkable performance in many areas [24-35]. By introducing nonlinearity through activation functions, a deep neural network with many hidden layers is able to extract different features from diverse perspectives, which gives it a powerful nonlinear expressive capacity [32-34]. Motivated by these advantages, deep learning can be used to address the significant problems presented above.
For the first problem, a deep learning based bow wave modeling method with higher precision than [12] is proposed. This method is also easier to apply than [14] because the training process is carried out by computer. The second problem can be treated as a typical time series prediction problem in deep learning, for which many methods exist. For instance, recurrent neural networks (RNN) are a mature tool for time series prediction [30,31], and some novel structures have also been used for this purpose [24,25]. However, considering the real-time performance requirement of the AAR docking process, these structures are too complicated to be used. In this paper, the time series prediction problem is converted into a classification problem. In this way, the demands for higher prediction precision and a simpler network structure can be satisfied simultaneously.
For the third problem, an on-line trajectory optimization method based on a novel reference observer (ROB) is employed. A ROB that estimates the reference states and inputs of the receiver from the trajectory has been proposed in [3]; however, it ignores the wind perturbations. That is, the receiver attitude obtained from that ROB is not exact because of the tanker vortex. In this paper, a novel ROB that takes the effect of the tanker vortex on the receiver into account is designed to estimate the exact receiver attitude from the trajectory. Then, by limiting the estimated attitude, the optimized trajectory can satisfy the strict attitude constraints in the AAR docking process.
Based on the above analyses, this paper presents a trajectory optimization method that significantly diminishes the bow wave effect on the drogue. The optimizer uses the bow wave effect on the drogue obtained from the precise bow wave model and the drogue position at the next time step obtained from the drogue motion prediction model to generate an optimized docking trajectory. Then, the novel ROB estimates the exact receiver attitude from the optimized docking trajectory and ensures that the attitude constraints of the receiver are met. The main contributions of this paper can be summarized as follows:
1) The receiver docking trajectory is optimized with the bow wave effect during the docking process taken into account. In this way, the moving range of the drogue can be reduced and the docking success rate can be enhanced.
2) The bow wave effect and the drogue motion prediction are modeled using deep learning. The bow wave modeling method is easy to realize and has higher precision. Further, in the drogue motion prediction method, the typical time series prediction problem is converted into a classification problem.
3) Based on the bow wave model, the drogue motion prediction model and the novel ROB, which is designed to ensure the attitude constraints of the receiver under external wind disturbances, a trajectory optimization method aiming to decrease the bow wave effect on the drogue is proposed.
The paper is organized as follows. Sec. 2 presents the problem formulation, including the frames, the 6-DOF linear receiver model used in this paper, and the bow wave model and drogue motion prediction model based on deep learning. Sec. 3 describes the design of the novel ROB and the detailed overall design procedure of the trajectory optimizer for the AAR docking process. Simulations, comparisons and analyses are shown in Sec. 4. Concluding remarks are given in Sec. 5.
2. Problem formulation
Because the drogue is strongly affected by the bow wave during the docking process [13,14], it is important to decrease the bow wave effect on the drogue in order to enhance the docking success rate. In this paper, the receiver docking trajectory is optimized to decrease the bow wave effect on the drogue. However, some significant problems need to be solved first. The bow wave needs to be modeled, because the bow wave effect on the drogue must be known in the on-line trajectory optimization process. The motion of the drogue, which is subject to multiple disturbances, must be predicted precisely to improve the trajectory optimization precision.
In this section, the four frames and the 6-DOF linear receiver model used in this paper are described in detail. The bow wave model and the drogue motion prediction model based on deep learning are presented as well.
2.1 Frames establishment
As shown in Fig. 1, the following four frames are used in this paper: the inertial frame, the tanker frame, the receiver body frame and the receiver nose frame.
1) Inertial frame ($O_i - X_iY_iZ_i$): the earth curvature is ignored and the earth-surface frame is taken as the inertial frame. For convenience, the axis $O_iX_i$ points along the projection of the tanker refueling velocity $v$ onto the $X_iO_iY_i$ plane.
2) Tanker frame ($O_t - X_tY_tZ_t$): the origin is fixed to the connection point between the tanker and the hose. The axis $O_tX_t$ points in the same direction as $v$.
3) Receiver body frame ($O_b - X_bY_bZ_b$): the origin is fixed to the mass center of the receiver. The axis $O_bX_b$ is parallel to the longitudinal axis of the receiver and points toward the nose of the receiver.
4) Receiver nose frame ($O_n - X_nY_nZ_n$): the origin is fixed to the nose of the receiver. The axis $O_nX_n$ is parallel to $O_bX_b$.

2.2 Receiver model


The receiver is modeled as a linear time-invariant state-space perturbation model:
$$\dot{x}(t) = Ax(t) + Bu(t), \qquad y(t) = Cx(t) \tag{1}$$
where $x(t)$ denotes the state vector of the receiver at time $t$; $A \in \mathbb{R}^{12\times 12}$, $B \in \mathbb{R}^{12\times 4}$ and $C \in \mathbb{R}^{3\times 12}$ are the state, control and output matrices, respectively; $u(t) \in \mathbb{R}^{4\times 1}$ is the control vector; and $y(t)$ are the outputs that need to be optimized.
For simplicity and compactness, the $t$ notation has been dropped.
$$x = [\Delta V \;\; \Delta\beta \;\; \Delta\alpha \;\; \Delta p \;\; \Delta q \;\; \Delta r \;\; \Delta\psi \;\; \Delta\theta \;\; \Delta\varphi \;\; \Delta X \;\; \Delta Y \;\; \Delta Z]^T \tag{2}$$
where $\Delta(\cdot)$ are the perturbations relative to the tanker, which is assumed to be in steady level flight. Here, $\Delta V$, $\Delta\beta$, $\Delta\alpha$ are the airspeed, sideslip angle and angle of attack perturbations; $\Delta p$, $\Delta q$, $\Delta r$ are the perturbations of the angular velocities relative to the tanker; $\Delta\varphi$, $\Delta\theta$, $\Delta\psi$ are the perturbations of the Euler attitude angles relative to the tanker; and $\Delta X$, $\Delta Y$, $\Delta Z$ are the perturbations of the positions in the tanker frame.
$$u = [\Delta\delta_a \;\; \Delta\delta_e \;\; \Delta\delta_r \;\; \Delta\xi]^T \tag{3}$$
where the control variables $\Delta\delta_a$ (aileron), $\Delta\delta_e$ (elevator), $\Delta\delta_r$ (rudder) and $\Delta\xi$ (throttle setting) are perturbations of the control effectors from their trim values.


2.3 Deep learning model
2.3.1 Bow wave model
The light drogue is subjected to wind perturbations easily [4-7]. During the docking process, there
are mainly three types of wind perturbations which are bow wave, atmospheric turbulence and tanker
vortex [13-15]. When the receiver nose gets close to the drogue, the bow wave increases rapidly and
plays a significant role among these wind perturbations [13,14]. In order to get the bow wave effect on
the drogue in the on-line trajectory optimization process, a precise and simple bow wave modeling
method is necessary. Therefore, a novel deep learning based bow wave modeling method is described in
detail.
Considering the actual flight situation of the receiver in the AAR docking process, the following assumptions are made before designing the bow wave model.
Assumption 1. The receiver maintains a constant relative velocity with respect to the drogue ($\Delta V \equiv \Delta V_0$) during the docking process.
Assumption 2. The perturbations of the Euler attitude angles ($\Delta\psi$, $\Delta\theta$, $\Delta\varphi$) can be ignored during the docking process.
In general, $\Delta V$ is constant and $\Delta\varphi$, $\Delta\theta$, $\Delta\psi$ are constrained to a small range in the AAR docking process, which means that Assumption 1 and Assumption 2 are reasonable [14].
Based on the assumptions above, the bow wave effect on the drogue is mainly related to the relative position ($\Delta X_n$, $\Delta Y_n$, $\Delta Z_n$) between the drogue and the nose of the receiver. Thus, the inputs of the bow wave model are $\Delta X_n$, $\Delta Y_n$ and $\Delta Z_n$. The outputs of the bow wave model are the estimates of the bow wave ($\hat{w}_x^b$, $\hat{w}_y^b$ and $\hat{w}_z^b$).

Note that the bow wave model is used in the on-line trajectory optimization of the AAR docking process, so both high precision and real-time performance are required. A simple and small fully connected neural network is therefore suitable for modeling the bow wave. Increasing the number of hidden layers and the number of nodes per hidden layer improves the precision but also raises the calculation cost. To balance these two aspects, a small structure with 5 hidden layers and 50 nodes in every hidden layer is designed for the bow wave model, as shown in Fig. 2.

Fig. 2. The bow wave model neural network structure.


The forward propagation can be written as
$$f_1^b = [\Delta X_n \;\; \Delta Y_n \;\; \Delta Z_n] \tag{4}$$
$$f_7^b = [\hat{w}_x^b \;\; \hat{w}_y^b \;\; \hat{w}_z^b] \tag{5}$$
where $f_1^b$ and $f_7^b$ denote the input and output layers of the bow wave model.
$$f_e(a) = \begin{cases} a & (a \ge 0) \\ e^{a} - 1 & (a < 0) \end{cases} \tag{6}$$
where $f_e(a)$ denotes the exponential linear unit (ELU) activation function. In this model, ELU is used to introduce the nonlinearity.
$$f_n^b = f_e\left(f_{n-1}^b W_{n-1}^b + B_{n-1}^b\right) \quad (2 \le n \le 7) \tag{7}$$
where $f_n^b$ denotes the $n$th layer; $W_{n-1}^b$ denotes the weight matrix between $f_{n-1}^b$ and $f_n^b$; and $B_{n-1}^b$ denotes the bias vector of $f_n^b$. The estimate of the bow wave is obtained using Eqs. (4)-(7).
$$L^b = (\hat{w}_x^b - w_x^b)^2 + (\hat{w}_y^b - w_y^b)^2 + (\hat{w}_z^b - w_z^b)^2 + c\,\|W\|^2 \tag{8}$$
where $L^b$ denotes the training loss function, which is the squared error of the bow wave estimate, and $c$ denotes the parameter controlling the level of the L2 weight regularization, which is used to prevent overfitting.
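The forward pass and loss of Eqs. (4)-(8) can be illustrated with a minimal NumPy sketch. The layer sizes follow the 5 hidden layers of 50 nodes stated above; the random weights, the regularization value `c` and the sample input are placeholders assumed for illustration only, not trained values from the paper.

```python
import numpy as np

def elu(a):
    """ELU of Eq. (6): a for a >= 0, exp(a) - 1 otherwise."""
    return np.where(a >= 0, a, np.expm1(np.minimum(a, 0.0)))

def init_params(rng, sizes):
    """Random placeholder weights and zero biases (assumed, not trained)."""
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, rel_pos):
    """Eq. (7): f_n = ELU(f_{n-1} W_{n-1} + B_{n-1}) for every layer."""
    f = np.asarray(rel_pos, dtype=float)
    for W, B in params:
        f = elu(f @ W + B)
    return f

def loss(params, rel_pos, w_true, c=1e-4):
    """Eq. (8): squared estimation error plus L2 weight penalty."""
    w_hat = forward(params, rel_pos)
    l2 = sum(np.sum(W ** 2) for W, _ in params)
    return float(np.sum((w_hat - np.asarray(w_true)) ** 2) + c * l2)

# Input layer (3) -> 5 hidden layers of 50 nodes -> output layer (3)
sizes = [3, 50, 50, 50, 50, 50, 3]
params = init_params(np.random.default_rng(0), sizes)
w_hat = forward(params, [-1.0, 0.2, -0.8])   # hypothetical relative position
```

The sketch applies ELU to the output layer as well, because Eq. (7) is stated for all layers up to $n = 7$.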
2.3.2 Drogue motion prediction model
Because the light drogue moves dynamically during the docking process, a precise drogue motion prediction under multiple perturbations is necessary for the on-line trajectory optimization. The drogue is subject to multiple perturbations, including the hose pull, tanker vortex, atmospheric turbulence and bow wave [4-7]. Moreover, the hose pull and the random atmospheric turbulence cannot be modeled, which means that the drogue motion needs to be predicted from incomplete inputs. Thus, if the deep learning model is trained on the incomplete inputs to fit the exact drogue motion directly, the prediction error will not be satisfactory.
To address these problems, a novel structure based on classification is proposed to predict the drogue motion. This structure does not fit the drogue motion exactly. Instead, the variations of the drogue position are classified into a series of discrete points, and the proposed model is trained to select the discrete point nearest to the real value of the drogue position. A classification example is shown in Fig. 3. Assuming that the drogue moves within −0.03 m to 0.03 m in 0.1 s and the classification precision is set to 0.01 m, there are 7 discrete points, as shown in Fig. 3. The proposed method then selects one of these 7 points as the predicted drogue motion. In this way, by setting the classification precision properly, the influence of the unknown disturbances can be reduced, because the proposed method does not need to fit the drogue movement caused by disturbances that are unavailable as inputs. Note that classifying the drogue position introduces a rounding error that is determined by the classification precision. As the classification precision becomes finer, the rounding error becomes smaller, but the benefit of suppressing the unknown disturbances also decreases. Thus, the classification precision needs to be selected carefully according to the testing results.
Fig. 3. The schematic diagram of the classification method.
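The example above (a range of −0.03 m to 0.03 m with a precision of 0.01 m) can be reproduced with a short sketch. The function names and the clipping of out-of-range displacements are assumptions made for illustration; the paper does not specify how the training labels are generated.

```python
import numpy as np

def class_offsets(max_move, precision):
    """Discrete displacement candidates from -max_move to +max_move."""
    n_side = int(round(max_move / precision))
    return np.arange(-n_side, n_side + 1) * precision

def to_class(displacement, max_move, precision):
    """Index of the candidate nearest to a measured displacement."""
    offsets = class_offsets(max_move, precision)
    clipped = np.clip(displacement, -max_move, max_move)  # assumed handling
    return int(np.argmin(np.abs(offsets - clipped)))

offsets = class_offsets(0.03, 0.01)   # 7 points: -0.03, -0.02, ..., 0.03
label = to_class(0.012, 0.03, 0.01)   # nearest point is 0.01 m
```

With nearest-point labeling, the rounding error is at most half of the classification precision, which is the trade-off discussed in the paragraph above.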
The inputs of the proposed prediction structure for the $Y_t$ axis of the drogue motion are prepared as follows.
$$f_1^d = \begin{bmatrix} Y^d & \dfrac{dY^d}{dt} & \dfrac{d^2Y^d}{dt^2} & \cdots & \dfrac{d^9Y^d}{dt^9} \\ w^d & \dfrac{dw^d}{dt} & \dfrac{d^2w^d}{dt^2} & \cdots & \dfrac{d^9w^d}{dt^9} \end{bmatrix} \tag{9}$$
where $f_1^d$ denotes the input layer of the proposed prediction structure; $Y^d$ denotes the real-time $Y_t$ axis drogue position in the tanker frame; and $w^d$ denotes the real-time wind perturbation, which combines the tanker vortex and the bow wave. The derivatives are approximated by finite differences because the sampling time is short. By preparing the inputs in this way, the historical data can be used and the features can be extracted more easily. The first row of $f_1^d$ reflects the motion features of the moving drogue: the position information of the last 0.1 s is used to predict the drogue motion over the next 0.1 s. The second row of $f_1^d$ reflects the variation tendency of the wind perturbations combined from the tanker vortex and bow wave, which are the main factors affecting the drogue.
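One possible way to assemble the 2 × 10 input of Eq. (9) from the last 10 samples is repeated backward differencing, sketched below. The use of exactly 10 samples and the helper names are assumptions; the paper only states that the derivatives are approximated by a difference method.

```python
import numpy as np

def derivative_row(samples, dt):
    """[s, ds/dt, ..., d^9 s/dt^9] at the latest instant, estimated by
    repeated backward differences of the 10 most recent samples."""
    diff = np.asarray(samples, dtype=float)
    row = []
    for k in range(len(diff)):
        row.append(diff[-1] / dt ** k)   # k-th difference scaled by dt^k
        diff = np.diff(diff)
    return np.array(row)

def build_features(y_hist, w_hist, dt):
    """Stack the drogue-position row and the wind-perturbation row."""
    return np.vstack([derivative_row(y_hist, dt),
                      derivative_row(w_hist, dt)])

# Synthetic check signals (not drogue data): a quadratic and a ramp
k = np.arange(10.0)
features = build_features(k ** 2, 3.0 * k, dt=1.0)
```

Dividing by $dt^k$ amplifies measurement noise as the order grows, so in practice some scaling or filtering of the high-order columns may be needed; that choice is outside what the source describes.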
The proposed prediction structure is shown in Fig. 4. The training and testing datasets are obtained from the HDA model. The fully connected layers are used to learn the relation between the input features and the outputs. The softmax layer converts the outputs into the probability of every category.
$$f_n^d = f_e\left(f_{n-1}^d W_{n-1}^d + B_{n-1}^d\right) \quad (2 \le n \le 7) \tag{10}$$
where $f_n^d$ denotes the $n$th layer, which is fully connected; $W_{n-1}^d$ denotes the weight matrix between $f_{n-1}^d$ and $f_n^d$; and $B_{n-1}^d$ denotes the bias vector of $f_n^d$.
$$f_s(o_i^d) = \frac{e^{o_i^d}}{\sum_{j=1}^{n} e^{o_j^d}}, \qquad f_8^d = [f_s(o_1^d) \;\cdots\; f_s(o_n^d)] \tag{11}$$
where $f_s(o_i^d)$ denotes the softmax function; $o_i^d$ denotes the $i$th output of $f_7^d$; and $f_8^d$ denotes the softmax layer. The output number $n$ is determined by the classification precision $\Delta T$ and the moving range of the drogue $|\Delta Y_m^d|$.
Fig. 4. The proposed drogue motion prediction network structure.
$$n = |\Delta Y_m^d| / \Delta T \tag{12}$$
$$Y_p^d = Y_r^d + \arg\max(f_8^d)\,\Delta T \tag{13}$$
where $Y_p^d$ denotes the predicted position of the drogue; $Y_r^d$ denotes the real-time position of the drogue; and $\arg\max(f_8^d)$ returns the subscript of the maximum value of $f_8^d$.
The variation of the drogue position along the $X_t$ axis is ignored because the drogue can be considered stable along the $X_t$ axis, and $Z_p^d$ can be obtained using the same method as $Y_p^d$. Thus, the drogue motion prediction is $D_p = [X_r^d \;\; Y_p^d \;\; Z_p^d]$.

The prediction results of drogue motion could be obtained using Eqs. (9)-(13). If the classification
precision and the accuracy of this structure could be ensured, the prediction error could be reduced greatly.
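The softmax layer of Eq. (11) and the decoding step of Eq. (13) can be sketched as follows. The `index_offset` argument is a hypothetical addition that lets a class index map to a negative displacement; with its default of 0 the function reproduces Eq. (13) as written.

```python
import numpy as np

def softmax(o):
    """Eq. (11), shifted by the maximum for numerical stability."""
    z = np.asarray(o, dtype=float)
    e = np.exp(z - np.max(z))
    return e / e.sum()

def predict_position(y_now, logits, precision, index_offset=0):
    """Eq. (13): Y_p = Y_r + argmax(f_8) * dT.
    index_offset is an assumed shift of the class index (0 = as written)."""
    probs = softmax(logits)
    return y_now + (int(np.argmax(probs)) + index_offset) * precision

probs = softmax([1.0, 2.0, 3.0])
y_pred = predict_position(1.0, [0.0, 0.0, 5.0, 0.0], 0.01)
y_centered = predict_position(1.0, [0, 0, 0, 0, 9, 0, 0], 0.01, index_offset=-3)
```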
3. Trajectory optimization
Based on the deep learning models introduced above, a novel trajectory optimization method is
proposed to decrease the bow wave effect on the drogue in the AAR docking process. This method is
illustrated at length in this section.
3.1. Novel reference observer
In this paper, a linear 6-DOF model of the receiver is used. The bow wave effect on the drogue and the requirement of precise docking are both directly related to the receiver position. Compared with searching the control inputs, searching the receiver position directly is much simpler and reduces the calculation cost of the trajectory optimization process. However, the receiver attitude cannot be obtained from the position directly.
The traditional ROB [3] estimates the reference states and inputs of the receiver from the reference outputs, but it does not take the wind perturbations on the receiver into account. That is to say, its estimates are imprecise in the presence of wind perturbations. To solve this problem, a novel ROB is used that considers the tanker vortex, which is the main wind perturbation on the receiver in the docking process.
The state-space equation containing the wind perturbation can be written as
$$\dot{x}^*(t) = Ax^*(t) + Bu^*(t) + Gw^t(t), \qquad y^*(t) = Cx^*(t) \tag{14}$$
where $y^*(t)$ denotes the reference output; $x^*(t)$ denotes the reference state vector; $u^*(t)$ denotes the reference control vector; $w^t(t) \in \mathbb{R}^{3\times 1}$ denotes the wind perturbation vector; and $G \in \mathbb{R}^{12\times 3}$ denotes the coefficient matrix of $w^t(t)$.

For simplicity and compactness, the $t$ notation has been dropped. A new augmented state vector is defined as
$$X^* = [x^* \;\; w^t \;\; u^*]^T \tag{15}$$
The dynamics of $X^*$ can be written as
$$\dot{X}^* = \begin{bmatrix} \dot{x}^* \\ \dot{w}^t \\ \dot{u}^* \end{bmatrix} = \begin{bmatrix} A & G & B \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} x^* \\ w^t \\ u^* \end{bmatrix} + \begin{bmatrix} 0 \\ I \\ 0 \end{bmatrix} \dot{w}^t + \begin{bmatrix} 0 \\ 0 \\ I \end{bmatrix} \dot{u}^*, \qquad y_w^* = \begin{bmatrix} C & 0 & 0 \\ 0 & I & 0 \end{bmatrix} X^* \tag{16}$$
where 0 represents null matrices and $I$ represents identity matrices of appropriate dimensions. The state vector of the ROB is defined as
$$\hat{X} = [\hat{x} \;\; \hat{w}^t \;\; \hat{u}]^T \tag{17}$$

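The dynamics of the ROB are defined as
$$\dot{\hat{X}} = \begin{bmatrix} \dot{\hat{x}} \\ \dot{\hat{w}}^t \\ \dot{\hat{u}} \end{bmatrix} = \begin{bmatrix} A & G & B \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} \hat{x} \\ \hat{w}^t \\ \hat{u} \end{bmatrix} + L \left\{ y_w^* - \begin{bmatrix} C & 0 & 0 \\ 0 & I & 0 \end{bmatrix} \begin{bmatrix} \hat{x} \\ \hat{w}^t \\ \hat{u} \end{bmatrix} \right\} \tag{18}$$
Thus, the state-space equations for the desired reference outputs and the observer are
$$\dot{X}^* = A_a X^* + B_w \dot{w}^t + B_a \dot{u}^*, \qquad y_w^* = C_a X^* \tag{19}$$
$$\dot{\hat{X}} = A_a \hat{X} + L C_a (X^* - \hat{X}), \qquad \hat{y}_w = C_a \hat{X} \tag{20}$$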

where
$$A_a = \begin{bmatrix} A & G & B \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad B_w = \begin{bmatrix} 0 \\ I \\ 0 \end{bmatrix}, \quad B_a = \begin{bmatrix} 0 \\ 0 \\ I \end{bmatrix}, \quad C_a = \begin{bmatrix} C & 0 & 0 \\ 0 & I & 0 \end{bmatrix} \tag{21}$$
The error $e$ between the desired and observer states is defined as
$$e = X^* - \hat{X} \tag{22}$$
Differentiating Eq. (22) with respect to time and substituting Eqs. (19) and (20) gives
$$\dot{e} = (A_a - LC_a)e + B_w \dot{w}^t + B_a \dot{u}^* \tag{23}$$
where $u^*$ and $w^t$ are assumed to vary slowly, so $\dot{u}^*$ and $\dot{w}^t$ can be regarded as 0. The gain $L$ can then be selected to place the poles of $A_a - LC_a$ properly; $L$ can be calculated using the LQR method [3]. Unfolding Eq. (18) with $L$ partitioned into the row blocks $L_1$, $L_2$ and $L_3$, the detailed dynamics of the ROB can be written as
$$\begin{cases} \Delta y_w = y_w^* - \hat{y}_w \\ \dot{\hat{x}} = A\hat{x} + G\hat{w}^t + B\hat{u} + L_1 \Delta y_w \\ \dot{\hat{w}}^t = L_2 \Delta y_w \\ \dot{\hat{u}} = L_3 \Delta y_w \end{cases} \tag{24}$$
where $\Delta y_w$ denotes the output error between $\hat{y}_w$ and $y_w^*$. The ROB is designed using Eq. (24). In this way, the exact state variables $\hat{x}$ and control inputs $\hat{u}$ under the effect of the tanker vortex on the receiver can be obtained from the reference output $y_w^*$.
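To illustrate how the observer of Eqs. (20) and (24) recovers a reference input from a reference output, the following sketch uses a scalar toy plant. All numerical values are assumptions chosen for readability, not the receiver model of the paper, and the gain `L` is picked by hand so that the poles of $A_a - LC_a$ lie at −2, −2 and −5, whereas the paper computes $L$ with LQR.

```python
import numpy as np

# Toy scalar plant (assumed values): x_dot = a*x + g*w + b*u
a, g, b = -1.0, 0.5, 1.0
A_a = np.array([[a,   g,   b],
                [0.0, 0.0, 0.0],
                [0.0, 0.0, 0.0]])          # augmented matrix of Eq. (21)
C_a = np.array([[1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0]])          # outputs: state and wind
L = np.array([[3.0, g],
              [0.0, 5.0],
              [4.0, 0.0]])                 # hand-picked gain (not LQR)

def run_observer(y_ref, w_t, t_end=10.0, dt=1e-3):
    """Forward-Euler integration of Eq. (20) for constant references."""
    X_hat = np.zeros(3)                    # [x_hat, w_hat, u_hat]
    y_w = np.array([y_ref, w_t])
    for _ in range(int(t_end / dt)):
        dy = y_w - C_a @ X_hat             # output error of Eq. (24)
        X_hat = X_hat + dt * (A_a @ X_hat + L @ dy)
    return X_hat

poles = np.linalg.eigvals(A_a - L @ C_a)
x_hat, w_hat, u_hat = run_observer(y_ref=2.0, w_t=1.0)
```

At steady state the output error vanishes, so $\hat{u}$ settles at the input that holds the reference under the wind, here $-(a \cdot 2 + g \cdot 1)/b = 1.5$.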

3.2. Overall structure


The schematic diagram of the proposed trajectory optimization method is shown in Fig. 5. The bow wave model and the drogue motion prediction model based on deep learning designed above provide the foundation for the trajectory optimization. The HDA model, which considers the bow wave, tanker vortex and atmospheric turbulence, outputs a drogue motion close to the real phenomenon. The drogue motion prediction model uses historical drogue position data obtained from the HDA model, together with historical bow wave $w^b$ and tanker vortex $w^t$ data obtained from their exact models, to predict the drogue motion $D_p$ precisely. The novel ROB is used to obtain the receiver attitude under the tanker vortex from the optimized trajectory $T_o$. The trajectory optimizer uses $D_p$, $w^b$ and $\hat{X}$ to generate the optimized trajectory $T_o$.

Fig. 5. The schematic diagram of the trajectory optimization.


3.3 Detailed process
The trajectory optimizer aims to generate a docking trajectory that decreases the bow wave effect on the drogue while guaranteeing successful docking. Because of the complexity of the optimization objective and the special requirements of the docking process, a novel trajectory optimization process is proposed.
For on-line trajectory optimization, the usual method is the moving horizon approach, which predicts the movement of the drogue over the horizon length and executes the corresponding control strategy for one step [20-23]. If the horizon length is too short, the optimization may fall into a local optimum. However, as the horizon length increases, the calculation cost rises rapidly and the real-time performance requirement cannot be satisfied [20-23,38].
Because a precise bow wave model is available from the deep learning method, an off-line globally optimal target relative trajectory $T_r$ between the receiver and the drogue is designed by considering the bow wave effect on the drogue $w^b$ and the docking error $e_d$. If the receiver could track $T_r$, $w^b$ would be decreased greatly while $e_d$ is satisfied. However, $T_r$ may not be smooth. If the drogue movement trajectory $T_d$ is also considered, the receiver may be unable to track the real target trajectory $T_t = T_r + T_d$ because of the strict attitude constraints in the docking process. To ensure that the optimized trajectory $T_o$ can be tracked, the receiver attitude estimated by the novel ROB from $T_o$ under the tanker vortex is considered in the optimization process. The distance $\Delta P = T_t - T_o$, which is the main factor deciding the bow wave, is also taken into account. The optimizer searches the space of possible receiver positions and selects the best one. In this way, $T_o$ not only satisfies the receiver attitude constraints but also diminishes $w^b$.
Remark 1. According to the bow wave data obtained from CFD, the bow wave generally does not change much within a small range, which means that the shorter $\Delta P$ is, the better the optimization result is.
According to Assumption 1, the search space in the docking process can be divided into several search planes using an appropriate step $\Delta X_t$, as shown in Fig. 6.

Fig. 6. The schematic diagram of trajectory optimization process.


STEP 1. The target relative trajectory $T_r$ between the receiver and the drogue is designed from the bow wave model by searching for the relative positions that minimize the bow wave effect on the drogue $w^b$.
STEP 2. The target trajectory $T_t$ is obtained using the drogue motion prediction model and the target relative trajectory $T_r$ designed in STEP 1.
STEP 3. After moving forward by $\Delta X_t$ to the next search plane, all positions in the search plane are traversed. The novel ROB is used to estimate the receiver attitude at each position, and the bow wave model is used to estimate $w^b$ at each position.
STEP 4. The attitude cost $J_a$ is calculated using the estimated receiver attitude. The bow wave cost $J_b$ is calculated using $\Delta P$ and $w^b$. Then, the optimized trajectory $T_o$ is obtained by choosing the position with the minimum total cost $J$ in the search plane.
STEP 5. STEP 2 to STEP 4 are repeated until the relative forward distance between the drogue and the probe of the receiver equals zero.
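The plane-by-plane search of STEP 2 to STEP 5 can be sketched as a greedy loop over search planes. Everything below is a simplified stand-in: `toy_target` replaces the drogue prediction and target trajectory, and `toy_cost` replaces the total cost $J$, with a per-plane step limit standing in for the attitude constraint checked by the ROB. None of these helpers come from the paper.

```python
import numpy as np

def optimize_trajectory(x_planes, candidates_yz, start_yz, target_at, total_cost):
    """For each search plane, traverse all candidate positions and keep
    the one with the minimum total cost (STEP 3 and STEP 4)."""
    trajectory = []
    prev = np.asarray(start_yz, dtype=float)
    for x in x_planes:                                  # STEP 5 loop
        target = np.asarray(target_at(x), dtype=float)  # STEP 2
        costs = [total_cost(c, target, prev) for c in candidates_yz]
        best = candidates_yz[int(np.argmin(costs))]
        trajectory.append((x, best[0], best[1]))
        prev = best
    return np.array(trajectory)

def toy_target(x):
    """Hypothetical fixed target (y, z) on every plane."""
    return (0.5, -0.2)

def toy_cost(pos, target, prev, max_step=0.3):
    """Distance to target; infinite if the move exceeds max_step
    (a stand-in for violating the attitude constraints)."""
    if np.max(np.abs(pos - prev)) > max_step + 1e-9:
        return np.inf
    return float(np.linalg.norm(pos - target))

axis = np.linspace(-1.0, 1.0, 21)
grid = np.array([(y, z) for y in axis for z in axis])
planes = np.linspace(5.0, 0.0, 6)       # forward distance shrinking to zero
traj = optimize_trajectory(planes, grid, (0.0, 0.0), toy_target, toy_cost)
```

Because each plane is solved independently, the cost of one step grows only with the number of candidates in a plane, which is the motivation given above for avoiding a long moving horizon.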
3.4 Cost function
In this paper, the off-line bow wave cost $J_o$ is considered in the off-line optimization of the target relative trajectory $T_r$. The bow wave cost $J_b$ and the attitude cost $J_a$ are considered in the on-line optimization of the optimized trajectory $T_o$.
$$J_o(w^b) = k_1 \frac{\sqrt{(w_y^b)^2 + (w_z^b)^2}}{w_m^b} + k_2 \frac{\sqrt{(e_y^d)^2 + (e_z^d)^2}}{\Delta X_m - \Delta X + \varepsilon} \tag{25}$$
where $k_1$, $k_2$ denote the gain parameters of $J_o$; $\Delta X_m$ denotes the maximum forward distance between the probe and the drogue; $\Delta X$ denotes the forward distance of the receiver in the tanker frame; and $\varepsilon$ denotes a small positive constant that prevents the denominator from becoming zero.
The total cost $J$ is composed of the bow wave cost $J_b$ and the attitude cost $J_a$.
$$J = J_b + J_a \tag{26}$$
The bow wave cost $J_b$ describes the bow wave effect on the drogue. In general, the bow wave is decided by $\Delta P$ (see Remark 1). However, to avoid abrupt changes, the bow wave itself is still included in the bow wave cost $J_b$.
$$J_b(\Delta P, w^b) = k_3 \frac{\sqrt{\Delta Y_t^2 + \Delta Z_t^2}}{P_m} + k_4 \frac{\sqrt{(w_y^b)^2 + (w_z^b)^2}}{w_m^b} \tag{27}$$
where $k_3$, $k_4$ denote the gain parameters of $J_b$; $\Delta Y_t$, $\Delta Z_t$ denote the components of $\Delta P$ along $Y_i$ and $Z_i$, respectively; and $P_m$, $w_m^b$ denote the maxima of $\Delta P$ and $w^b$. In this way, the cost is normalized, which is convenient for tuning the parameters.


The attitude cost $J_a$ describes the attitude of the receiver when tracking $y_o$. Considering the strict attitude constraints of the receiver in the docking process, the angular velocities and the Euler angles should be limited to a proper range.
$$J_a(S) = \begin{cases} 0 & -S_m \le S \le S_m \\ +\infty & \text{else} \end{cases} \tag{28}$$
$$S_m = [\Delta p_m \;\; \Delta q_m \;\; \Delta r_m \;\; \Delta\varphi_m \;\; \Delta\theta_m \;\; \Delta\psi_m] \tag{29}$$
where $\Delta(\cdot)_m$ denote the maximum allowable angular velocities and Euler angles of the receiver, and $S_m$ denotes the boundary conditions of the receiver attitude.
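The on-line costs of Eqs. (26)-(28) can be written compactly as below. The sketch assumes that each term of Eq. (27) is the Euclidean norm of its lateral and vertical components divided by the corresponding maximum, and it uses unit gains as placeholders; the gain values used in the paper are not given in this section.

```python
import numpy as np

def bow_wave_cost(dP_yz, w_b_yz, P_m, w_m, k3=1.0, k4=1.0):
    """Eq. (27): normalized distance term plus normalized bow wave term.
    Euclidean norms and unit gains are assumptions of this sketch."""
    return k3 * np.hypot(*dP_yz) / P_m + k4 * np.hypot(*w_b_yz) / w_m

def attitude_cost(S, S_m):
    """Eq. (28): zero inside the attitude bounds, +inf otherwise."""
    S, S_m = np.asarray(S, dtype=float), np.asarray(S_m, dtype=float)
    return 0.0 if np.all(np.abs(S) <= S_m) else np.inf

def total_cost(dP_yz, w_b_yz, S, S_m, P_m, w_m, k3=1.0, k4=1.0):
    """Eq. (26): J = J_b + J_a."""
    return (bow_wave_cost(dP_yz, w_b_yz, P_m, w_m, k3, k4)
            + attitude_cost(S, S_m))

# Hypothetical numbers for illustration only
S_m = [0.2] * 6                       # bounds on [dp, dq, dr, dphi, dtheta, dpsi]
J_ok = total_cost((3.0, 4.0), (0.0, 0.0), [0.1] * 6, S_m, P_m=10.0, w_m=1.0)
J_bad = total_cost((3.0, 4.0), (0.0, 0.0), [0.3] * 6, S_m, P_m=10.0, w_m=1.0)
```

The infinite attitude cost acts as a hard constraint: any candidate position whose estimated attitude leaves the bounds can never be selected as the minimum.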


4. Simulations and comparisons
To verify the validity of the proposed trajectory optimization method in the AAR docking process under bow wave, extensive simulations have been carried out. The specific linear model of the receiver is given in [9] and the equivalent model for the receiver is given in [18]. The proposed bow wave model is compared with the fitting and interpolation method proposed in [12]. The proposed drogue motion prediction model using the classification method is compared with the conventional long short-term memory (LSTM) network. The proposed trajectory optimization method, which combines the bow wave model and the drogue motion prediction model, is compared with the normal trajectory generator [3] that is commonly used in the AAR docking process.
For all numerical simulations presented below, the initial flight conditions of the receiver are provided in Table 1.
Table 1
Initial flight conditions of the receiver
Parameter                         Value
Altitude                          7010 m
Atmospheric density               1.177 kg/m^3
Enthalpy                          323608.9 J/kg
Airspeed                          200 m/s
Successful docking requirement    $R_e = \sqrt{e_y^2 + e_z^2} \le 0.3$ m

4.1 Bow wave


The bow wave CFD data are shown in Fig. 7. In this figure, the initial airspeed of the receiver is 210 m/s. Without the influence of the receiver, the air would move at 210 m/s in the opposite direction relative to the receiver, i.e., at 0 m/s relative to the earth. However, the air is rebounded by the receiver nose, which reduces its speed relative to the receiver. The bow wave then forms, colored in yellow and green in Fig. 7. In the bow wave areas the speed of the air is below 210 m/s, which means that the air moves at more than 0 m/s relative to the earth, in the same direction as the receiver. The bow wave therefore affects the movement of the drogue. The bow wave speed varies strongly with position relative to the receiver nose, which demonstrates the necessity of trajectory optimization in the AAR docking process.

Fig. 7. The bow wave CFD data.


For the bow wave model, several main planes along the Z_n axis (from −0.8 m to −1.5 m) are used for comparison, because the bow wave varies slowly along the Z_n axis. The standard error e_y^s is used to describe how well the bow wave model fits the data.

$$ e_y^s = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( w_y^b - \hat{w}_y^b \right)_i^2} \quad (30) $$
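As a minimal sketch of how the error measure of Eq. (30) could be evaluated, the following Python function assumes the root-mean-square form (the radical is not legible in every rendering of the equation, so this is an assumption). The function name and the sample values are hypothetical.

```python
import math


def standard_error(w_true, w_pred):
    """Root-mean-square error between CFD values and model predictions.

    Assumes the root-mean-square reading of Eq. (30).
    """
    if len(w_true) != len(w_pred) or not w_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(w_true)
    squared = sum((t - p) ** 2 for t, p in zip(w_true, w_pred))
    return math.sqrt(squared / n)


# Hypothetical bow wave samples (m/s) on one Z_n plane.
cfd_values = [4.2, 3.1, -1.5, 0.8]
model_values = [4.0, 3.3, -1.4, 0.9]
error = standard_error(cfd_values, model_values)
```

The same function applies unchanged to the w_z^b component.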

Fig. 8 and Table 2 compare the standard error of the deep learning model with that of the fitting and interpolation model [12]. The proposed deep learning model reduces the standard error by nearly half on average (between roughly 20% and 70%, depending on the plane and the component). The smaller standard error demonstrates that the deep learning model fits the bow wave with higher precision, which provides the foundation for improving the precision of trajectory optimization.
The fitting results of the bow wave deep learning model are shown in Fig. 9. For simplicity, only the planes at −0.8 m and −1.5 m along the Z_n axis are shown. In these planes, which are close to the nose of the receiver, the bow wave varies very rapidly with relative position. Nevertheless, the proposed bow wave model, which has a strong representational capacity, fits the test data obtained from CFD very well.

[Figure 8: two panels showing the standard errors e_y^s (left) and e_z^s (right) against z_n (m) for the planes −0.8 to −1.5; legend: Deep learning model, Fitting and interpolation model.]

Fig. 8. The fitting standard error of the bow wave in Z_n main planes.

[Figure 9: plots of w_y^b (m/s) and w_z^b (m/s) over x_n (m) and y_n (m), comparing CFD data with the deep learning model. (a) −1.5 m of Z_n axis; (b) −0.8 m of Z_n axis.]

Fig. 9. The fitting results of the proposed bow wave model.


Table 2a
The standard error of w_y^b in main Z_n planes.

Model                       -0.8m    -0.9m    -1.0m    -1.1m    -1.2m    -1.3m    -1.4m    -1.5m
Fitting and interpolation   0.6485   0.5513   0.3639   0.2862   0.2175   0.2754   0.2179   0.1240
Deep learning               0.5082   0.3230   0.2224   0.1466   0.1217   0.1150   0.1094   0.0923

Table 2b
The standard error of w_z^b in main Z_n planes.

Model                       -0.8m    -0.9m    -1.0m    -1.1m    -1.2m    -1.3m    -1.4m    -1.5m
Fitting and interpolation   1.1255   0.9075   0.6460   0.4829   0.3913   0.3962   0.3554   0.2282
Deep learning               0.7395   0.4092   0.2739   0.1746   0.1415   0.1215   0.1190   0.1146
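
The relative improvement in each plane can be read directly from Tables 2a and 2b. The short Python sketch below uses the tabulated values as given; the helper name `relative_reduction` is introduced here only for illustration.

```python
# Standard errors from Tables 2a and 2b, planes -0.8 m to -1.5 m.
fit_y = [0.6485, 0.5513, 0.3639, 0.2862, 0.2175, 0.2754, 0.2179, 0.1240]
dl_y = [0.5082, 0.3230, 0.2224, 0.1466, 0.1217, 0.1150, 0.1094, 0.0923]
fit_z = [1.1255, 0.9075, 0.6460, 0.4829, 0.3913, 0.3962, 0.3554, 0.2282]
dl_z = [0.7395, 0.4092, 0.2739, 0.1746, 0.1415, 0.1215, 0.1190, 0.1146]


def relative_reduction(baseline, improved):
    """Fractional error reduction of `improved` with respect to `baseline`."""
    return [(b - i) / b for b, i in zip(baseline, improved)]


reduction_y = relative_reduction(fit_y, dl_y)
reduction_z = relative_reduction(fit_z, dl_z)

mean_y = sum(reduction_y) / len(reduction_y)
mean_z = sum(reduction_z) / len(reduction_z)

for plane, ry, rz in zip(range(8), reduction_y, reduction_z):
    z_n = -0.8 - 0.1 * plane
    print(f"z_n = {z_n:.1f} m: w_y {ry:.0%}, w_z {rz:.0%}")
```

The reduction is smallest in the planes at −0.8 m and −1.5 m for w_y^b and generally larger for w_z^b than for w_y^b.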

4.2 Drogue motion prediction


Table 3 describes the structures of the LSTM and of the proposed classification method in detail. The two structures are constructed using the same parameters, such as total layers, nodes, activation function, training steps, batch size and optimizer, so that the comparison is fair.
The prediction errors of the two structures are compared in Fig. 10. The proposed method greatly reduces the max error; the values are given in Table 4. The LSTM is trained to fit the exact position of the drogue directly, which causes fluctuations in the prediction error and increases the max error. The max error of the proposed method is limited to 0.005 m, because the proposed method divides the moving range of the drogue into a series of discrete points according to the classification precision, which is set to 0.01 m. After sufficient training, the proposed method reaches 100 percent classification accuracy, so that only the rounding error remains, which is at most half of the classification precision, i.e., 0.005 m.
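The origin of the 0.005 m bound can be sketched in a few lines of Python. The grid layout below (uniform points spaced by the classification precision, starting at a hypothetical lower bound of the moving range) is an assumption made for illustration and is not the authors' implementation.

```python
def position_to_class(position, lower_bound, precision=0.01):
    """Index of the grid point nearest to a continuous position (m)."""
    return round((position - lower_bound) / precision)


def class_to_position(index, lower_bound, precision=0.01):
    """Grid-point position (m) represented by a class index."""
    return lower_bound + index * precision


def max_rounding_error(positions, lower_bound, precision=0.01):
    """Largest gap between each position and its nearest grid point."""
    return max(
        abs(p - class_to_position(
            position_to_class(p, lower_bound, precision),
            lower_bound, precision))
        for p in positions
    )


# Hypothetical drogue positions sampled every 1 mm over a 1 m range.
samples = [-0.5 + 0.001 * i for i in range(1001)]
worst = max_rounding_error(samples, lower_bound=-0.5)
```

With perfect classification the predicted class is always the nearest grid point, so the remaining error cannot exceed half the grid spacing, which matches the 0.0050 m entries of Table 4.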
The prediction results of the proposed method are shown in Fig. 11. The drogue moves rapidly under multiple perturbations; the total wind perturbation on the drogue, including the bow wave, the tanker vortex and atmospheric turbulence, is given in Fig. 12. Nevertheless, as shown in Fig. 11, the proposed method predicts the drogue motion in 0.1 s very precisely under these wind perturbations.
Table 3
The detailed description of LSTM and proposed method.
Model                    LSTM                Proposed method
Input nodes              2                   20
Output nodes             1                   |Δy_m^d| / ΔT
LSTM layers/nodes        3/20                0
FC layers/nodes          2/20                6/20
Activation function      ELU                 ELU
Learning rate            0.0005              0.0010
Batch size               200                 200
Training steps           1000000             1000000
Preventing overfitting   L2 regularization   L2 regularization
Optimizer                AdaDelta            AdaDelta

Table 4
The max error of drogue motion prediction
Model             y          z
Proposed method   0.0050 m   0.0050 m
LSTM              0.0180 m   0.0126 m


[Figure 10: two panels showing the prediction errors in y (left) and z (right), in m, against the test sample index n (0 to 1400); legend: Proposed, LSTM.]

Fig. 10. The drogue motion prediction error in the test data.

[Figure 11: predicted and desired drogue positions, z_t (m) against y_t (m); legend: Prediction, Desired.]

Fig. 11. The prediction results of the drogue motion in 0.1s.

[Figure 12: wind perturbation components w (m/s) along X, Y and Z against t (s), 0 to 15 s.]

Fig. 12. The total wind perturbations on the drogue.
4.3 Trajectory optimization
The attitude of the receiver obtained from T_o using the novel ROB is given in Fig. 13. It can be seen that the attitude constraints of the receiver are satisfied. The novel ROB designed in this paper accounts for the effect of the tanker vortex on the receiver, shown in Fig. 14, when estimating the receiver attitude. Thus, the estimates of the receiver attitude are close to the true values, which supports the reliability of the simulation results.

[Figure 13: six panels against t (s), 0 to 15 s: Δp (deg/s), Δq (deg/s), Δr (deg), Δψ (deg), Δθ (deg), ΔΦ (deg); legend: ROB, Limitation.]

Fig. 13. The attitudes estimated by the novel ROB.

[Figure 14: tanker vortex components w_v (m/s) along X, Y and Z against t (s), 0 to 15 s.]

Fig. 14. The tanker vortex on the receiver.
As shown in Fig. 15, with the proposed trajectory optimization method the max variations of the drogue position are nearly 0.4 m smaller along the Y_t axis and 0.9 m smaller along the Z_t axis than with the traditional trajectory generator [3]. More specifically, the moving ranges of the drogue in the AAR docking process are shown in Fig. 16. The moving range of the drogue under the proposed trajectory optimizer is clearly smaller than under the traditional trajectory generator, because the traditional trajectory generator does not consider the bow wave effect on the drogue when generating the reference trajectory. As a result, the drogue is pushed away by the strong bow wave generated by the receiver nose.

[Figure 15: two panels showing Δy_d (m) (left) and Δz_d (m) (right) against t (s), 0 to 15 s; legend: Trajectory optimizer, Trajectory generator.]

Fig. 15. The drogue motion in the AAR docking process.

[Figure 16: Δz_d (m) against Δy_d (m) for the trajectory optimizer and the trajectory generator, with annotated radii R1 = 1.9 and R2 = 2.8.]

Fig. 16. The moving ranges of the drogue in the AAR docking process.

[Figure 17: X, Y and Z components (m) against t (s), 0 to 15 s. (a) Optimized trajectory; (b) Relative trajectory.]

Fig. 17. Trajectories in the AAR docking process.
Fig. 17 shows the final optimized docking trajectory of the receiver and the relative trajectory between the drogue and the probe. The smooth optimized docking trajectory satisfies the attitude constraints of the receiver. The relative trajectory satisfies the successful docking requirement, i.e., the final relative distance between the probe and the drogue is R_e ≤ 0.3 m.
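The docking requirement of Table 1 can be checked with a short Python sketch. It assumes that R_e is the Euclidean norm of the final lateral and vertical errors e_y and e_z; the function name and the sample errors are hypothetical.

```python
import math


def docking_succeeded(e_y, e_z, radius=0.3):
    """True if the final probe-drogue offset R_e is within `radius` (m)."""
    return math.hypot(e_y, e_z) <= radius


# Hypothetical final errors (m) at the end of a docking run.
inside = docking_succeeded(0.10, 0.20)
outside = docking_succeeded(0.30, 0.30)
```

Each component may be below 0.3 m while the combined distance still exceeds the requirement, which is why the check is made on R_e rather than on e_y and e_z separately.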
All the simulation results shown above indicate that the proposed trajectory optimization method based on deep learning markedly decreases the bow wave effect on the drogue. In general, the smaller the moving range of the drogue, the higher the docking success rate.
5. Conclusion

In this paper, a novel trajectory optimization method aiming to diminish the bow wave effect on the drogue in the AAR docking process is proposed. The bow wave is modeled simply and accurately using the deep learning method, so that the model can be used to estimate the bow wave generated by the receiver in real time. The drogue motion prediction model, which precisely predicts the drogue motion under multiple wind and other perturbations, transforms the typical time series prediction problem into a classification problem. With this method, the max prediction error decreases markedly. Based on the deep learning models, the proposed trajectory optimization method optimizes the docking trajectory of the receiver and decreases the bow wave effect on the drogue. In the optimization process, the novel ROB, which considers the wind perturbations, is used to estimate the receiver attitude from the optimized trajectory, so that the strict attitude constraints of the receiver are ensured. Extensive simulations and comparisons verify the superiority and feasibility of the proposed trajectory optimization method.
Acknowledgment
This research has been funded in part by the National Natural Science Foundation of China under
Grants 61673042 and 61175084.
References
[1] P R Thomas, U Bhandari, S Bullock, et al. Advances in air to air refueling, Progress in Aerospace
Sciences. 71 (2014) 14-35.
[2] J Nalepka, J Hinchman. Automated aerial refueling: extending the effectiveness of UAVs, AIAA
Modeling and Simulation Technologies Conference and Exhibit. 2005, pp. 6005-6012.
[3] M D Tandale, R Bowers, J Valasek. Trajectory tracking controller for vision-based probe and drogue
autonomous aerial refueling, Journal of Guidance, Control, and Dynamics. 29 (4) (2006) 846-857.
[4] Z Su, H Wang, N Li. Anti-disturbance rapid vibration suppression of the flexible aerial refueling hose,
Mechanical Systems and Signal Processing. 104 (2018) 87-105.
[5] Z Su, H Wang, N Li, et al. Exact docking flight controller for autonomous aerial refueling with
back-stepping based high order sliding mode, Mechanical Systems and Signal Processing. 101 (2018)
338-360.
[6] Z Su, H Wang, P Yao, et al. Back-stepping based anti-disturbance flight controller with preview
methodology for autonomous aerial refueling, Aerospace Science and Technology. 61 (2017) 95-108.
[7] Z Su, H Wang, et al. Probe motion compound control for autonomous aerial refueling docking,
Aerospace Science and Technology. 72 (2018) 1-13.
[8] J Wang, V Patel, C Cao, et al. Novel L1 adaptive control methodology for aerial refueling with
guaranteed transient performance, Journal of Guidance, Control, and Dynamics. 31 (1) (2008) 182-193.
[9] A Dogan, S Sato, W Blake. Flight control and simulation for aerial refueling, AIAA Guidance,
Navigation, and Control Conference and Exhibit. 2005, pp. 6264-6278.
[10] A Dogan, S Venkataramanan. Nonlinear control for reconfiguration of unmanned-aerial-vehicle
formation, Journal of Guidance, Control, and Dynamics. 28 (4) (2005) 667-678.
[11] U Bhandari, P R Thomas, T S Richardson. Bow wave effect in probe and drogue aerial refueling,
AIAA Guidance, Navigation, and Control (GNC) Conference. 2013, pp. 4695-4715.
[12] Z Zhong, D Li, H Wang, et al. Modeling and analysis of receiver aircraft bow wave in UAAR based
on fitting and interpolation, Electronics Optics and Control. 02 (2018) 69-73.
[13] X Dai, Z Wei, Q Quan. Modeling and simulation of bow wave effect in probe and drogue aerial
refueling, Chinese Journal of Aeronautics. 29 (2) (2016) 448-461.
[14] Z Wei, X Dai, Q Quan, et al. Drogue dynamic model under bow wave in probe-and-drogue refueling,
IEEE Transactions on Aerospace and Electronic Systems. 52 (4) (2016) 1728-1742.
[15] A Dogan, W Blake, C Haag. Bow wave effect in aerial refueling: Computational analysis and
modeling, Journal of Aircraft. 50 (6) (2013) 1856-1868.
[16] A Dogan, W Blake. Modeling of bow wave effect in aerial refueling, AIAA Atmospheric Flight
Mechanics Conference. 2010, pp. 7926-7942.
[17] H Wang, X Dong, J Xue, et al. Dynamic modeling of a hose-drogue aerial refueling system and
integral sliding mode backstepping control for the hose whipping phenomenon, Chinese Journal of
Aeronautics. 27 (4) (2014) 930-946.
[18] A Barfield, J Hinchman. An equivalent model for UAV automated aerial refueling research, AIAA
Modeling and Simulation Technologies Conference and Exhibit. 2005, pp. 6006-6012.
[19] E Kim. Control and simulation of relative motion for aerial refueling in racetrack maneuver, The
University of Texas at Arlington, 2007.
[20] J Wu, H Wang, N Li, et al. Distributed trajectory optimization for multiple solar-powered UAVs
target tracking in urban environment by Adaptive Grasshopper Optimization Algorithm, Aerospace
Science and Technology. 70 (2017) 497-510.
[21] J Zhao, S Zhou, R Zhou. Distributed time-constrained guidance using nonlinear model predictive
control, Nonlinear Dynamics. 84 (3) (2016) 1399-1416.
[22] P Yao, H Wang, Z Su. Real-time path planning of unmanned aerial vehicle for target tracking and
obstacle avoidance in complex dynamic environment, Aerospace Science and Technology. 47 (2015)
269-279.
[23] P Yao, H Wang, H Ji. Multi-UAVs tracking target in urban environment by model predictive control
and Improved Grey Wolf Optimizer, Aerospace Science and Technology. 55 (2016) 131-143.
[24] M Qin, Z Li, Z Du. Red tide time series forecasting by combining ARIMA and deep belief network,
Knowledge-Based Systems. 125 (2017) 39-52.
[25] X Sun, T Li, Q Li, et al. Deep belief echo-state network and its application to time series prediction,
Knowledge-Based Systems. 130 (2017) 17-29.
[26] W Yu, F Zhuang, Q He, et al. Learning deep representations via extreme learning machines,
Neurocomputing. 149 (2015) 308-315.
[27] C Szegedy, W Liu, Y Jia, et al. Going deeper with convolutions, CVPR. 2015.
[28] A Krizhevsky, I Sutskever, G E Hinton. Imagenet classification with deep convolutional neural
networks, Advances in Neural Information Processing Systems. 2012, pp. 1097-1105.
[29] X Qiu, L Zhang, Y Ren, et al. Ensemble deep learning for regression and time series forecasting,
Computational Intelligence in Ensemble Learning (CIEL). 2014 IEEE Symposium on. IEEE, 2014,
pp. 1-6.
[30] H Sak, A Senior, F Beaufays. Long short-term memory recurrent neural network architectures for
large scale acoustic modeling, Fifteenth Annual Conference of the International Speech
Communication Association. 2014.
[31] T G Barbounis, J B Theocharis, M C Alexiadis, et al. Long-term wind speed and power forecasting
using local recurrent neural network models, IEEE Transactions on Energy Conversion. 21 (1) (2006)
273-284.
[32] Y LeCun, Y Bengio, G Hinton. Deep learning, Nature. 521 (7553) (2015) 436.
[33] J Schmidhuber. Deep learning in neural networks: An overview, Neural Networks. 61 (2015) 85-117.
[34] M Längkvist, L Karlsson, A Loutfi. A review of unsupervised feature learning and deep learning for
time-series modeling, Pattern Recognition Letters. 42 (2014) 11-24.
[35] F J Ordóñez, D Roggen. Deep convolutional and LSTM recurrent neural networks for multimodal
wearable activity recognition, Sensors. 16 (1) (2016) 115.
[36] Y Yin, X Wang, D Xu, et al. Robust visual detection-learning-tracking framework for
autonomous aerial refueling of UAVs, IEEE Transactions on Instrumentation and Measurement.
65 (3) (2016) 510-521.
[37] W Wu, X Wang, D Xu, et al. Position and orientation measurement for autonomous aerial refueling
based on monocular vision, International Journal of Robotics and Automation, 32 (1) (2017) 13-21.
[38] Y Huang, H Wang, P Yao. Energy-optimal path planning for Solar-powered UAV with tracking
moving ground target, Aerospace Science and Technology. 53 (2016) 241-251.
