
Nanjing University of Aeronautics and Astronautics

The Graduate School


College of Civil Aviation

PREDICTION AND ANALYSIS OF


FLIGHT DELAY IN LARGE AIRPORT

A Thesis in

Transportation Engineering

By
Neima Osman Aden
Advised by
Professor Tian Wen

Submitted in Partial Fulfillment

Of the Requirements

For the Degree of

Master of Engineering

June 2022
DECLARATION

Student number: SL1807006

I declare that the master's degree thesis submitted is my research work and results
obtained under the guidance of my supervisor. Except for the places specially marked
and acknowledged in the article, the paper does not contain research results that have
been published or written by others, nor does it contain materials used to obtain degrees
or certificates from Nanjing University of Aeronautics and Astronautics or other
educational institutions.
I authorize Nanjing University of Aeronautics and Astronautics to compile all or part of
the content of the dissertation into the relevant database for retrieval, and to save and
compile the dissertation by copying methods such as photocopying, reduction or
scanning.

SIGNATURE_________________
DATE_________________
Nanjing University of Aeronautics and Astronautics Master Degree Thesis

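ABSTRACT (translated from the Chinese)

The main purpose of this thesis is to study the analysis and prediction of flight delays at large
airports. A descriptive research design with inferential statistics was adopted (together with a
quantitative research strategy). Secondary data were collected from journals, reports, articles, and
official websites, analyzed with descriptive statistics, and presented as frequencies and percentages.
We describe a predictive modeling effort that uses machine learning techniques and statistical models
to identify delays in advance. The data set was cleaned and imputed, and techniques such as multiple
linear regression were applied. By identifying the key parameters responsible for flight delays, we try
to offer a response to the delay losses incurred by the airline industry. Delays not only cost airlines
a great deal every year; airport authorities and their operations are adversely affected as well.
Inferential statistics were also used to test the hypotheses (Pearson's correlation) and to determine
the relationships between variables (multiple linear regression analysis). Finally, Cronbach's Alpha
and the Kaiser-Meyer-Olkin Measure of Sampling Adequacy (KMO) with Bartlett's Test were used to
measure the reliability and validity of the study, respectively. The study concludes that weather
conditions cause flight delays through a few factors such as heavy precipitation, thick cloud, and fog.
The study also recommends that airports adopt more advanced modern technology to facilitate service
evaluation, strengthen process coordination, and keep all activities running smoothly, so as to reduce
and eliminate flight delays at large airports, ensure that the resulting system meets travelers' needs,
and improve the availability of important data efficiently and effectively. Policy makers are interested
in improving certain standards and regulations that govern airport activities. Regarding delays caused
by passengers, management should set restrictions that prevent passengers from delaying flights in any
way.

Keywords: flight delay, prediction model, weather conditions, combination forecasting model, entropy
weight method.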

Prediction and Analysis of flight delay in large airports

ABSTRACT

The primary aim of this thesis was to study the analysis and prediction of flight delay at a large
airport. A descriptive research design with inferential statistics was adopted, following a
quantitative research strategy. Secondary data were gathered from journals, reports, articles, and
official websites, analyzed with descriptive statistics, and presented as frequencies and percentages.
We describe a predictive modeling engine that uses machine learning techniques and statistical models
to identify delays ahead of time. The data set is cleaned and imputed, and techniques such as multiple
linear regression are applied. By identifying the key parameters responsible for flight delays, we
attempt to offer a remedy for the delay losses incurred by the airline industry. Delays not only cost
airlines a great deal every year; airport authorities and their operations are adversely affected as
well. Inferential statistics were used to test the hypotheses (Pearson's correlation) and to determine
the relationship between variables (multiple linear regression analysis). Finally, Cronbach's Alpha was
used to measure the reliability of the study, and the Kaiser-Meyer-Olkin Measure of Sampling Adequacy
(KMO) and Bartlett's Test were used to measure its validity. The study concluded that weather
conditions can delay flights through a few factors such as heavy precipitation, thick cloud, and fog.
Furthermore, the study suggested that airports should adopt more advanced modern technology to
facilitate service evaluation, improve process coordination, and keep all activities running smoothly,
so as to reduce and eliminate flight delays at large airports, ensure that the resulting system meets
the needs of travelers, and improve the availability of important data efficiently and effectively.
Policy makers are interested in improving the standards and regulations that govern airport
activities. Regarding delays caused by passengers, management should set restrictions that prevent
passengers from delaying a flight in any way.

KEYWORDS: Flight delay, prediction model, weather condition, combination forecasting model,
entropy weight method.


Table of Contents

摘要 (Chinese abstract)
ABSTRACT
Table of Contents
List of tables
List of figures
List of Abbreviations
CHAPTER 1 INTRODUCTION
1.1 Introduction
1.1.1 Research background
1.1.2 Research necessity
1.2 Research Status
1.2.1 External research situation
1.2.2 Research in China
1.3 Research objectives and methods
1.3.1 The research target
1.3.2 Methodology
1.4 Delay prediction method
1.5 Structure of paper
CHAPTER 2 RESEARCH FOUNDATION
2.1 Definition of large airport
2.1.1 Concept of flight Delay scenario
2.1.2 Characteristics of Delay in a Large Airport
2.1.3 Airport performance
2.2 Definition of flight delay
2.3 Delay prediction and analysis method
2.3.1 Delay factor analysis method
2.3.2 Delay prediction method
2.4 Chapter summary
CHAPTER 3 INFLUENCING FACTORS ANALYSIS
3.1 Data acquisition and preprocessing
3.1.1 Airport flight operation data selection and processing
3.1.2 Airport meteorological data processing
3.1.3 Data cleaning
3.2 Analysis on influencing factors of airport delay
3.2.1 Qualitative Analysis
3.2.2 Quantitative Analysis
3.3 Screening of key influencing factors
3.4 Example analysis
3.5 Summary of this chapter
CHAPTER 4 PREDICTION METHOD RESEARCH
4.1 Delay prediction model based on Multiple Regression Analysis
4.2 Delay prediction model based on Grey Prediction
4.3 Delay prediction model based on Trend Extrapolation
4.4 Based on Quadratic Polynomial
4.5 Combined forecasting model based on Entropy Weight Method
4.6 Flight delay prediction process of large airports
4.7 Chapter summary
CHAPTER 5 EXAMPLE ANALYSIS OF FLIGHT DELAY IN DJIBOUTI AIRPORT
5.1 Prediction results
5.2 Result analysis
CHAPTER 6 CONCLUSION OF THE STUDY
6.1 Conclusion
6.2 Limitations of the Studies
LIST OF REFERENCES
ACKNOWLEDGEMENT
Paper Published


List of tables

Table 2.1: Extreme Weather Condition List
Table 2.2: Standard Weather Briefing Components
Table 4.1: Gender of Respondents
Table 4.2: Age proportion of Respondents
Table 4.3: Frequency travel of an employee
Table 4.4: Reliability statistics
Table 4.5: KMO and Bartlett's Test
Table 4.6: Total variance explained
Table 4.7: Correlations
Table 4.8: Model summary
Table 4.9: ANOVA
Table 4.10: Coefficients
Table 4.11: Summary of Assumptions

List of figures

Figure 1.1: Road Map
Figure 2.1: A typical operation of a commercial flight
Figure 2.2: Airport/Air Traffic flow management delay
Figure 2.3: Percentage of delay causes in broad categories
Figure 2.4: The operation cost of airlines
Figure 3.1: Research Design
Figure 4.1: Gender proportion
Figure 4.2: Age proportion


List of abbreviations

FDL     Flight Delay
BTS     Bureau of Transportation Statistics
AAT     Actual arrival time
ML      Machine Learning
ASQP    Airline service quality performance
MLR     Multiple Linear Regression
PR      Precision, Recall
SAT     Schedule arrival time
AUC     Area Under Curve
WC      Weather condition
WITI    Weather Impacted Traffic Index
TP      Technical problem
NAS     National Aviation System
SAF     Scarcity of aviation fuel
MVA     Multivariate analysis
PD      Passenger Delay
ANN     Artificial neural network
ASOS    Automated surface observing systems
SAQ     Self Administered questionnaire
AWOS    Automated weather observing systems
FAA     Federal Aviation Administration


CHAPTER 1 INTRODUCTION

1.1 Introduction
1.1.1 Research background
China's civil aviation transport capacity, overall strength, and international status have improved
significantly. However, the rapid growth of China's civil aviation industry has not prevented a
shortage of airspace resources. The airspace structure is complex, flight conflicts are serious, and
the lack of coordinated management between air traffic control and the airlines creates conflicts
over flight paths. Traffic management methods and strategies remain relatively coarse. As a result,
operational efficiency is reduced, and congestion and delays have become the norm.
Although several causes, such as security checks, technical repairs, and connecting flights, produce
significant delays, weather conditions remain an important factor. Most of the time flights operate
on schedule despite bad weather, because modern aircraft are designed to withstand many conditions,
even storms. Even a lightning strike does not normally damage an airplane, and strikes happen more
often than we think. Bad weather nevertheless still explains many delays. Meteorological delays can
occur at any time, but they arise only under exceptional conditions when the risks are very high.
Moreover, what matters is not only the weather around the departure airport but also the conditions
en route and at the arrival airport. If a storm is forecast at the destination, the flight will almost
certainly be delayed. Although several authors have addressed the issue, few methods are available
to help airline operational staff reduce the impact of these delays and control the resulting costs.
The complexity of the issue lies in the interdependence of the operational activities and
stakeholders involved in each flight. For example, a delay on one flight can cause cascading delays
on other flights, which in turn disrupt the departure of other aircraft, reduce the efficiency of
operators, and generate additional costs. As a result, flight plans must be changed, and so must the
commitments made to customers.
Flight delay prediction is one of a series of new features for travelers that Google launched
earlier this year; it provides guidance on the likelihood of flight delays in the Google Flights app.
By combining flight history with machine learning, Google can predict whether a flight will be
delayed, with a stated confidence level of 85%.
Airlines have rather limited options for managing a delayed flight. On the one hand, they may
decide to speed up the late aircraft in order to recover part or all of the delay. On the other hand,
they may decide not to react and therefore bear the full cost. In the most extreme cases, flights may
also be diverted or cancelled. Ideally, the airline would re-establish the original flight plan as soon
as possible, to minimize the consequences of the delay and return to the planned schedule.

1.1.2 Research necessity


Given the practical need to address major flight delay problems, in-depth research on predictive
analysis, theoretical models, and operational mechanisms needs to be developed further. With the
introduction of machine learning, flight prediction technology that takes the real factors shaping a
flight plan as its starting point, and that fully integrates real operational problems, becomes an
important way to adapt to the actual operation of air traffic control and to promote coordinated
decision making between air traffic control and the airlines. Research on this topic makes it
possible to know in advance the risk of uncertainty and the possible error of a poorly performing
prediction, and to analyze and evaluate the estimates generated by machine learning. It also
addresses the question of which time slots an airline prefers to use. Choosing a plan that
strengthens the airline's independent choice, and automatically generating flight predictions, thus
allows the problem of late flights to be controlled more precisely.

1.2 Research Status


1.2.1 External research situation
With the rapid growth of air traffic, increasing flight delays in the United States (US) have
become a serious and widespread problem. According to the Bureau of Transportation Statistics
(BTS), nearly one in four airline flights arrived at its destination more than 15 minutes late. The
annual total cost of air transportation delays is reported to exceed $30 billion, which poses a great
challenge to the development of the Next Generation Air Transportation System. This motivates the
need for accurate and robust prediction of flight delays, especially for individual flights. Beyond
the delays caused directly by limited airspace capacity, 33% of late arrivals were caused by an
aircraft arriving later than planned and therefore departing late on its next flight, which is known
as delay propagation.
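
The propagation mechanism described in the paragraph above (an aircraft arrives late and therefore leaves late on its next flight) can be illustrated with a minimal Python sketch. It is not a model taken from the cited literature: the `Leg` structure, the turnaround buffer that absorbs part of the inherited delay, and all numeric values are assumptions introduced only for illustration. The 15-minute lateness threshold is the one quoted from the BTS statistic.

```python
from dataclasses import dataclass

LATE_THRESHOLD_MIN = 15  # a flight counts as late when it arrives more than 15 minutes late


@dataclass
class Leg:
    flight: str
    primary_delay_min: float      # delay originating on this leg (hypothetical value)
    turnaround_buffer_min: float  # schedule slack before this leg departs (hypothetical value)


def propagate(legs):
    """Return the arrival delay of each leg flown in sequence by one aircraft.

    Assumed simplification: the delay inherited from the previous leg is reduced
    by the turnaround buffer, and the flight time itself is flown as scheduled.
    """
    arrival_delays = []
    inherited = 0.0
    for leg in legs:
        carried = max(0.0, inherited - leg.turnaround_buffer_min)
        arrival = carried + leg.primary_delay_min
        arrival_delays.append(arrival)
        inherited = arrival
    return arrival_delays


def late_share(arrival_delays):
    """Fraction of flights arriving more than LATE_THRESHOLD_MIN minutes late."""
    late = sum(1 for d in arrival_delays if d > LATE_THRESHOLD_MIN)
    return late / len(arrival_delays)


# One aircraft, four legs; only the first leg suffers a primary delay.
rotation = [
    Leg("A1", primary_delay_min=40, turnaround_buffer_min=0),
    Leg("A2", primary_delay_min=0, turnaround_buffer_min=10),
    Leg("A3", primary_delay_min=0, turnaround_buffer_min=20),
    Leg("A4", primary_delay_min=0, turnaround_buffer_min=15),
]
delays = propagate(rotation)
print(delays)              # [40.0, 30.0, 10.0, 0.0]
print(late_share(delays))  # 0.5
```

In this toy rotation a single 40-minute primary delay makes two of the four flights late, even though three of the legs have no delay cause of their own.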
Mueller et al. (2002) (4) presented the characterization and distribution of delay using a
traditional statistical approach. They also investigated several causal factors of delay, such as
traffic volume, aircraft type, aircraft maintenance, airline operations, weather conditions, changes
of procedure en route, capacity constraints, customer service issues, and late aircraft or crew
arrival. The results show that weather contributed to 69% of the delays. Different methods and
factors lead to different results: Kwan and Hansen (10) found that airport congestion contributed to
roughly 32% of the average delay, using a series of econometric models built to identify the key
causal factors of flight delays, including airport congestion, total traffic, and en route weather.
Beyond quantifying the effect of these factors on flight delay, more studies focus on developing
models to determine the probability of aircraft delay. Wesonga et al. (11) proposed and evaluated a
parametric approach that incorporates the apparently significant meteorological and aviation
parameters to predict the probability of aircraft delay. In 2010, Klein et al. (5) integrated
convective weather forecasts, terminal airport weather forecasts, and scheduled flight data to predict
daily airport delay time based on a metric called the Weather Impacted Traffic Index (WITI).
In the airline industry, disruptions caused by mechanical problems or by the absence of crew can
occur and change the schedule of operations as originally planned. Clausen et al. (2001) (18) define
a disruption as a difference between the current state of operations and the planned schedule that
is large enough to require a change in planned operations. In the airline industry, these are
commonly referred to as irregular operations. According to Clausen et al. (2001) (18), when a
disruption occurs during operations, several airlines solve the problem sequentially in the following
order: aircraft, crew, ground operations, and passengers. In the same vein, Ball et al. (2007) (24)
state that the flight schedule is the basis of three other types of schedules: aircraft schedules,
crew schedules, and passenger routing. Generally speaking, in the literature, schedule recovery is
addressed one aspect at a time: the restoration of the flight schedule, the crews (crew recovery),
and the passenger itineraries.
Regarding the restoration of aircraft schedules, Teodorovic et al. (1984) (27) studied the situation
of an aircraft being taken out of service and sought to minimize passenger delays by swapping and
delaying aircraft. Jarrah et al. (1993) (61) developed a model based on the shortest path algorithm
that resolves aircraft shortages by delaying flights, swapping aircraft, and using spare aircraft
until the problem is resolved. The model developed by Yan et al. (1996) (28), represented as a
network, handles a disruption caused by an aircraft breakdown. Three strategies are considered
during resolution: flight cancellations, use of spare aircraft, and flight delays. Argüello et al.
(1997) (62) propose solutions, formulated as a quadratic programming model, to reconstruct aircraft
routes in response to ground delays. Rosenberger et al. (2003) (26) developed a model to create a new
flight schedule and reassign aircraft while minimizing the costs of reassignment and cancellation.
Anderson presents a network model that uses cancellation, delay, and aircraft swapping to resolve
aircraft disruptions.
Regarding the recovery of passenger itineraries, the articles found in the literature generally do
not deal with passenger recovery alone, without also acting on flight schedules. Kohl et al. (2004)
(31) point out that the objectives of disruption management are to honor the promise made to the
customer; to minimize the additional real costs of crews, compensation, hotels and accommodation for
passengers and crews, and tickets on other airlines; and to return to the planned schedule as soon as
possible. Stojkovic et al. (2002) (19) consider the Day of Operations Scheduling (DAYOPS) problem of
determining real-time changes to airline schedules when disruptions occur. The goal is to minimize
customer inconvenience and costs for the airline. Their contribution was to model the DAYOPS problem
and solve it optimally in real time when minor disruptions occur. More precisely, their model
preserves the flight schedule and the crew itineraries; only arrival and departure times can be
modified, together with flight durations, ground service, maintenance planning, and passenger
connections. Castro and Oliveira (2007) (17) developed a distributed agent-based system that
represents the various roles in the Airline Operations Control Center. It considers several
operational bases, and for each type of operational problem there are several specialized software
agents that implement heuristic approaches, operations research models, and artificial intelligence
algorithms.
These specialized agents compete to find the best solution to each problem. While this method
provides acceptable solutions, it does not integrate the impact on aircraft, crews, and passengers
into one problem; each problem is solved individually. Abdelghany (2008) (22) presents a decision
support tool for restoring airline schedules during irregular operations called DStar (Decision
Support Tool for Airlines schedule Recovery). The optimization model seeks the optimal plan for
swaps between crews and aircraft. The main contributions of this tool are that it integrates
decisions for multiple resources, such as pilots and flight attendants, with different schedule
constraints; it provides new flight plans; and it recovers flights with resource problems based on
their departure time. It provides a recovery plan almost in real time.
Beatty et al. (1998) (23) propose the concept of the delay multiplier, which depends on the length
of the initial delay and the time of day at which it occurs. The delay multiplier can be regarded as
the value which, when multiplied by the initial delay, estimates potential downstream delays.
Because of its complexity, the effect of passenger operations, freight, and gate availability was
not considered.
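
The delay multiplier idea can be written as a one-line product. The sketch below is only an illustration of that arithmetic: the lookup table `MULTIPLIERS`, the 60-minute and noon cut-offs, and every numeric value are hypothetical placeholders and are not taken from Beatty et al. The only property borrowed from the text is that the multiplier depends on the length of the initial delay and on the time of day.

```python
# Hypothetical multiplier table keyed by (delay band, period of day).
# The values are placeholders for illustration, not published figures.
MULTIPLIERS = {
    ("short", "morning"): 1.5,
    ("short", "evening"): 1.25,
    ("long", "morning"): 2.5,
    ("long", "evening"): 1.5,
}


def delay_band(initial_delay_min):
    """Classify the initial delay; the 60-minute cut-off is an assumption."""
    return "long" if initial_delay_min >= 60 else "short"


def day_period(hour_of_day):
    """Classify the time of day; the noon cut-off is an assumption."""
    return "morning" if hour_of_day < 12 else "evening"


def estimate_with_multiplier(initial_delay_min, hour_of_day):
    """Multiply the initial delay by the multiplier selected for its band and period."""
    key = (delay_band(initial_delay_min), day_period(hour_of_day))
    return MULTIPLIERS[key] * initial_delay_min


print(estimate_with_multiplier(30, 8))   # 45.0
print(estimate_with_multiplier(90, 18))  # 135.0
```

A table lookup is used here only because the text gives the two inputs without a functional form; a fitted function of delay length and time of day would serve equally well.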
Apart from analyzing delay by building mathematical models, machine learning has also drawn a lot
of attention. Dai and Liou [8] try to develop a model to predict departure delays and thus estimate
departure times. Several approaches are tested, including linear regression, nonlinear regression,
and Artificial Neural Networks (ANN). According to Laskey and collaborators (2005), a Bayesian
network model trained with a linear model and cancellations can predict delay propagation among
three major airports. Based on random forest classification and regression, Rebollo et al. [7]
captured the macro delay patterns and dependencies of airports within the air traffic network.
However, to predict departure delay, they considered taxi-in/out and wheels-on/off times. These
features are not available in practice, especially when predicting future individual flights.
Moreover, most previous machine learning work has a feature selection problem. In January 2019, Jun
Chen and Meng Li [9] combined different approaches for chained delay prediction using machine
learning. They introduced a machine learning based air traffic delay prediction model that combines
multi-label random forest classification with an approximated delay propagation model. To improve
prediction performance, they presented an optimal feature selection process and showed that it
performs better than directly using all the features of the available data sets. Flight delay and
late-arriving aircraft delay are shown to be the most important features for delay prediction.
In reality, the various factors that affect flight delay are often independent. Some of the
indicators are related to departure delay, such as aircraft type, flight schedule, and flight
departure sequence. Others may relate to external triggers, such as weather and airport capacity.
Vigneau described the propagation of successive flight delays as the dependence between upstream
arrival delays and downstream departure delays (33), which are caused by factors unrelated to air
traffic control (ATC). Mueller and Gano analyzed aircraft arrival and departure delay
characteristics at a NASA research center (4). In 2013, Mott presented a method for modeling air
carrier departure delays that uses the correlation of a priori demand data to significantly reduce
prediction error (16). The results demonstrated that the accuracy of the predicted delay time can be
improved. In Massachusetts in 2014, Rebollo and Hamsa presented a characterization and prediction of
air traffic delays (35). Gareth James and three co-authors published An Introduction to Statistical
Learning with Applications in R (40) with Springer in 2013. Sruti Oza, Somya Sharma, and others (68)
developed the “Flight Delay Prediction System Using Weighted Multiple Linear Regression” in 2015. In
2016, Ariyawansa and Aponso (34) reviewed state-of-the-art data mining and machine learning
techniques for intelligent airport systems. In 2017, Alice Sternberg, Jorge Soares, and others (67)
presented a thorough literature review of approaches used to build flight delay prediction models
from a data science perspective. They summarize the initiatives used to address the flight delay
prediction problem according to scope, data, and computational methods, giving particular attention
to the increased use of machine learning methods. The same year, Anish M. Kalliguddi and Aera K.
Leboulluec (38) published “Predictive Modeling of Aircraft Flight Delay” in the Universal Journal of
Management.

1.2.2 Research in China


In contrast with European and American countries, the on-time arrival rate of flights in China
decreased from 83.19% in 2007 to 71.67% in 2017. The annual cost of flight delays in China was
estimated at more than $7.4 billion. Such high economic costs call for analysis of the causal factors
of delay and for delay-reduction strategies. There is a relatively large body of research on flight
delay; here we note some interesting studies carried out in China. Y. J. Lui (47) established in 2008
a model of flight delay and delay propagation based on a Bayesian network (BN). The influences of
arrival delay and flight cancellation on departure delay are analyzed under various states. For
predicting the arrival delay of individual flights, Choi et al. (39) proposed in 2016 a machine
learning based model focused on weather-induced delay prediction for individual flights. By
combining weather data and air traffic data, this model improves binary classification accuracy for
a specific origin-destination pair. However, the model provides only a binary classification of
delay. Young Jin Kim and his colleagues (43) examine the effectiveness of deep learning models in air
traffic delay prediction tasks. By combining multiple models based on the deep learning paradigm,
they built an accurate and robust prediction model that enables a detailed analysis of the patterns
in air traffic delays. In particular, Recurrent Neural Networks (RNN) have shown great accuracy in
modeling sequential data. Day-to-day sequences of the departure and arrival delays of an individual
airport were modeled with the Long Short-Term Memory RNN architecture. The accuracy of the RNN was
shown to improve with deeper architectures. Moreover, the results suggest that, apart from the
well-known weather and flow control factors, delay-reduction strategies also need to pay more
attention to reducing the impact of delay at the previous airport.
At the theoretical level, the analysis and prediction of flight delay is the main research
object. Flight delay prediction methods include methods based on linear system theory, intelligent
model prediction methods based on knowledge discovery, methods based on nonlinear system theory, and
combination-based prediction methods. In 2008, Zonglei (20) developed a new method for alarming
large-scale flight delays based on machine learning. In 2005, Liu Yumei et al. (69) proposed the
construction principle and main framework of an aircraft traffic prediction model based on the
principle of least squares estimation. In 2010, Cui Deguang et al. (70) improved the artificial neural
network: to address its lack of generalization performance, the regression prediction
method was combined with artificial neural network prediction. In 2011, Tian Wen et al. (71)
analyzed the influence of aircraft flight time on airspace sector traffic demand forecasts from the
perspective of uncertainty, and gave corresponding information on the randomness of aircraft
entering and leaving sectors and flying within a sector. In 2014, Qin and Yu (36) made a statistical
analysis of the periodicity of the flight delay rate of airports in the United States (US). In 2015,
authors discussed the structure and composition of the flight path model, and initially constructed the
aircraft track prediction simulation and experimental system.

Xu et al. (50) found that flight delays were the main cause of delays of more than one
hour at destination airports, and that the propagated delay was related to the previous delay.
However, departure delay could be absorbed by the scheduled turnaround time. When
the cascading delay exceeds 30 minutes, over 80% of flights may reduce their actual turnaround
time. To find a suitable allocation of flight delays, in 2017, Liu et al. (15) introduced an
optimized ground delay program (GDP) strategy, in which operational efficiency, airline and flight
equity, and ATC risks were considered. The simulation study showed that the proposed allocation
reduced the total delay time, the unnecessary ground delay, and the number of flights with
unnecessary ground delay. In 2018, Weiwei and colleagues (13) published an article on a comparative
analysis of the propagation effect of flight delays, to capture the interdependence among sequences
of flight delays due to airline operations at airports, weather, and air traffic control
conditions. In 2016, Sun Choi et al. (39) predicted weather-induced airline delays using machine
learning algorithms. From the research status in China and abroad, it can be seen that most flight
delay analyses and predictions are essentially based on default times at which an aircraft is
assumed to enter and exit the airspace or airport. Addressing this practical problem, and based on
the operational data analysis currently available, this study proposes to introduce a prediction
method that corrects the flight delay assumed to follow a normal distribution in previous studies,
so as to extract the time scale more accurately.

1.3 Research objectives and methods


1.3.1 The research target
The background of this study suggests that flight delay plays a major role in airport operations and
therefore needs to be predicted and managed. Delay has a negative impact on productivity, and
many airlines are affected by this phenomenon. Delays are generally caused by technical
problems, passenger delay, security, the National Aviation System (NAS), or extreme weather. NAS
delays are delays and cancellations attributable to the national aviation system, which covers a
broad set of conditions (for example, non-extreme weather conditions, airport operations, heavy
traffic volume, and air traffic control). Extreme weather refers to significant meteorological
conditions (actual or forecast) that, in the judgment of the carrier, delay or prevent the operation
of a flight (for example, a tornado, blizzard, or hurricane). Their consequences at the airport are
severe and sometimes catastrophic. The ability to manage flight delay in the aviation industry would
be improved by an understanding of what delay is, how it should be measured, and how it affects
airport or airline performance. This would allow the studied aspects of delay to be analyzed and
predicted in terms of experience, performance, individual characteristics, etc.
With the continued growth in air traffic demand and the continued development of the civil
aviation industry around the world, airspace congestion is becoming increasingly serious. Based on
the practical problems of airport congestion and flight delay in busy areas, this paper extracts
and analyzes the operational characteristics of delays affecting the current aviation system, and
studies flight delay prediction models. In bad weather, route congestion becomes more severe, which
makes flow control and delay management very important.
The general aim of this study is to investigate:
• Analysis of delay (influencing factors of delay, data analysis methods…),

• Prediction method research,

• A delay prediction model based on an accurate method.

1.3.2 Methodology
Delays are generally classified into two types, namely arrival and departure delays.
A flight with less than 15 minutes of delay relative to the arrival and departure times printed on
the ticket is considered on schedule. The causes of airport delays include extreme weather, airport
congestion, technical problems, and air carrier and passenger connection issues. Accordingly, flight
delay prediction is a complex practical and scientific challenge. The analysis and design of
complicated, large-scale systems with many variables require new methods that can identify,
classify, and analyze voluminous data.
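As a minimal sketch of the 15-minute on-schedule rule described above, the following Python fragment classifies a flight from its scheduled and actual times. The function names and the sample times are illustrative assumptions and are not taken from the airport data.

```python
from datetime import datetime

ON_TIME_THRESHOLD_MIN = 15  # threshold stated in the text above


def delay_minutes(scheduled: datetime, actual: datetime) -> float:
    """Signed difference between actual and scheduled time, in minutes."""
    return (actual - scheduled).total_seconds() / 60.0


def is_delayed(scheduled: datetime, actual: datetime,
               threshold: float = ON_TIME_THRESHOLD_MIN) -> bool:
    """A flight less than `threshold` minutes late counts as on schedule."""
    return delay_minutes(scheduled, actual) >= threshold


# Hypothetical departure scheduled at 10:00 (assumed example values)
sched = datetime(2019, 3, 1, 10, 0)
print(is_delayed(sched, datetime(2019, 3, 1, 10, 20)))  # 20 minutes late
print(is_delayed(sched, datetime(2019, 3, 1, 10, 10)))  # 10 minutes late
```

The same two functions can be applied to arrival times, since the rule is stated for both arrival and departure.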
Machine learning is the field that studies the development of algorithms that can learn from
data and make predictions based on it. Studies of flight systems increasingly use
machine learning methods, which motivated me to carry out my final-year project with
this approach, using regression algorithms for prediction.
The choice of research design is justified by the nature of the problem under study, since
the data collection was done at a single point in time and this design is commonly adopted for
academic work with time constraints. The design helps examine the factors influencing delay and
the method for predicting flight delay. The primary data were collected by questionnaire, together
with relevant official data from Djibouti Ambouli International Airport. This paper mainly involves
the analysis of delay influencing factors and the method of delay prediction.
1.4 Delay Prediction method
In terms of optimization, in addition to improvements to the original algorithms, some gains are
achieved by combining methods. The perspectives of delay prediction are also diverse,
including individual flights, airports, airlines, airspace networks, etc. Prediction
methods can be roughly divided into classification prediction and regression prediction.
For the prediction of delay in this study, several regression prediction methods are used: delay
prediction models based on multiple linear regression, grey prediction, trend extrapolation, and
quadratic polynomial fitting, and a combined forecasting model based on the entropy weight
method. These models are used for prediction and their effects are compared.
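To illustrate how an entropy weight method can combine several single forecasts, the sketch below uses one common formulation as an assumption; it is not necessarily the exact procedure applied later in this thesis. Each model's absolute relative errors are normalized into proportions, their entropy is computed, and models whose errors vary less across periods receive larger weights. The function names and the error values are hypothetical.

```python
import math


def entropy_weights(rel_errors):
    """Weights for m models from rel_errors[i][t] (model i, period t).

    Assumed formulation: errors are non-negative absolute relative errors.
    A model whose errors are spread evenly over the periods has high
    entropy, low variation, and therefore a larger weight.
    """
    m, n = len(rel_errors), len(rel_errors[0])
    if m < 2 or n < 2:
        raise ValueError("need at least two models and two periods")
    variation = []
    for errs in rel_errors:
        total = sum(errs)
        if total == 0:  # a model with no error shows no variation
            variation.append(0.0)
            continue
        props = [e / total for e in errs]
        entropy = -sum(p * math.log(p) for p in props if p > 0) / math.log(n)
        variation.append(max(0.0, 1.0 - entropy))
    total_variation = sum(variation)
    if total_variation < 1e-12:  # all models equally regular: equal weights
        return [1.0 / m] * m
    return [(1.0 - d / total_variation) / (m - 1) for d in variation]


def combine(forecasts, weights):
    """Weighted sum of the single-model forecasts, period by period."""
    n = len(forecasts[0])
    return [sum(w * f[t] for w, f in zip(weights, forecasts)) for t in range(n)]


# Hypothetical absolute relative errors of three models over four periods
errors = [[0.10, 0.12, 0.11, 0.09],
          [0.05, 0.30, 0.02, 0.40],
          [0.20, 0.10, 0.30, 0.10]]
weights = entropy_weights(errors)
print([round(w, 3) for w in weights])
```

By construction the weights sum to one, so the combined forecast stays on the same scale as the single forecasts.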

1.5 Structure of paper


The present study consists of six chapters. The structure of the paper is as follows:

Chart 1: Road map (Source, Author 2022)

Chapter one is the introduction. It presents the research background, identifies
the research objectives, and reviews the current research status at home and abroad in two
aspects: delay influencing factors and delay prediction technology. The main research contents and
key problems to be solved are set out, and the structure of the paper is explained.
The second chapter covers the definition and characteristics of flight delay in relation to the
traffic volume of large airports. Drawing on previous research, its main content is to present the
delay prediction and analysis methods.


Chapter three analyzes the influencing factors of airport delay. The flight data and
meteorological information of Djibouti Ambouli Airport in 2016 were processed, and the inbound
and outbound flights in the original data were split and counted. Based on the "Normal Statistical
Measures for Civil Aviation Flights", the departure flights were analyzed. Finally, the principal
component analysis method is used to find the key factors affecting airport delay, which lays
the groundwork for the subsequent experiments.
The fourth chapter is the research on delay prediction models based on different methods. The
forecast target is clarified. Regression prediction algorithms are selected to construct
airport departure delay prediction models, including delay prediction models based on multiple
regression, grey prediction, trend extrapolation, and quadratic polynomial fitting, and a combined
forecasting model based on the entropy weight method. It is verified that the improved prediction
has a better prediction effect, and the combined prediction method is then used to output the final
prediction results.
The fifth chapter is an example analysis. Addressing the common situation of airport delays over
short periods, and based on the actual operation data and weather data of Djibouti Ambouli Airport,
the airport delays in various time periods and under severe weather disturbances are analyzed.
Combined with the airport delay prediction models proposed in the previous chapter, an
example of short-term airport delay prediction under the influence of typical weather is verified,
and the prediction effects of the models are compared and explained.
Chapter six summarizes the research results on airport departure delay prediction, identifies the
shortcomings of this paper, and looks forward to further research.


CHAPTER 2 RESEARCH FOUNDATION

Delay has always been, and still is, a problem that airlines must manage to avoid airport congestion.
It is therefore important to ask whether flight delay has been researched, what is causing it, and
whether it will continue to rise. Attention goes to questions such as why delay still
occurs despite all the progress that has been made, and what causes flight delay.

2.1 Definition of large airports


Here, the large airport mentioned in the title is defined from the perspective of
flight takeoff and landing volume and throughput, which can be described from the
perspective of the United States, Europe and China, and then turn to the actual situation
of Djibouti airport, define it as a large airport, and then point out the reasons (the takeoff
and landing volume and throughput of Djibouti airport may not be the same as that of
the world's large airports, but explain the importance of defining it as a large airport
from the perspective of the actual operation of your country.)

2.1.1 Concept of flight Delay scenario


Commercial aviation is a complex distributed transportation system. It deals with
valuable resources, demand fluctuations, and a sophisticated origin-destination matrix that need
orchestration to provide smooth and safe operations. Moreover, individual passengers follow their
itineraries while airlines plan various schedules for aircraft, pilots and flight attendants.
Figure 2.1 represents a typical operation of a commercial flight. Stages can take place at terminal
boundaries, airports, runways, and airspace, and are susceptible to different kinds of delays. Some
examples include mechanical problems, weather conditions, ground delays, air traffic control,
runway queues and capacity constraints.
The scheme in Figure 2.1 is repeated several times throughout the day for each flight in the
system. Pilots, flight attendants and aircraft may have different schedules because of legal
rests, duties, and maintenance plans for airplanes. Therefore, any disruption in the system can
affect the subsequent flights of the same airline. In addition, disturbances may cause congestion in
airspace or at other airports, creating queues and delaying some flights of other carriers.
Consequently, the prediction of flight delays is an essential subject for airlines, airports, Air
Navigation Service Providers (ANSP), and network managers such as the FAA (Federal Aviation
Administration) and Eurocontrol.


Figure 2.1: A Typical operation of a commercial flight

According to U.S. Sabre's statistical analysis of the number of transfer passengers and
connection rates at airports worldwide [2], the top 30 airports in the world in 2015 are shown in
Figure 1. The top 10 hub airports were Atlanta /ATL, Dubai /DXB, Dallas /DFW, Frankfurt /FRA,
Chicago O'Hare /ORD, Charlotte /CLT, Istanbul /IST, Paris De Gaulle /CDG, Amsterdam /AMS, and London
Heathrow /LHR, of which 5 airports had a connection rate of more than 50%, and the figures at ATL
and CLT were even greater than 60%. In stark contrast, no airports in mainland China
are on the list (only Hong Kong /HKG ranks 14th). In 2015, Shanghai Pudong International Airport
had the highest connection rate (though only 10%) among airports in mainland China, while
the figure for Beijing Capital International Airport was only 8%. In 2017, the connection rate of
Pudong Airport exceeded 12%, and those of other major domestic airports were about 10%.
[2] Hub Airports by Connecting Passengers. (2015) http://www.airliners.net/forum.


Figure 1. Statistics of connection rates of the world's top 30 hub airports (2010/2015).
In terms of geographic location, the connection rates of airports near China (Singapore Changi
Airport, Incheon International Airport, Narita International Airport) were about 20%, all
higher than those of the 10 major international hub airports (Beijing, Shanghai, Guangzhou, Chengdu,
Kunming, Shenzhen, Chongqing, Xi'an, Urumqi and Harbin) that China strives to build.

2.1.2 Characteristics of Delay in a Large Airport


Air transport has recently become an essential and vital part of world business and
social life. Air transportation has certain characteristics that give it natural advantages over
other competing transport modes. The ability of air transport to overcome natural obstacles of land
and sea, as well as its comparatively high speed, makes it a favored mode for the rapid movement of
certain goods and people. Delays in air transportation are pervasive and costly. Delays occur in
airline operations as a result of, among other things, adverse weather conditions. This study
considers aircraft delays due to bad weather and to technical and logistical problems, such as
refueling deficiencies, bird strikes, shortage of aviation fuel, and passenger delay due to aircraft
schedule delay. The aim of this study is to examine the delays experienced by airline
operators at the airport, to identify the causes of delays at the airport, and then to offer
solutions and recommendations on how to minimize delay in the aviation industry. However, it is
appropriate to trace the history of air transport in the world and identify the developmental
changes that have occurred over the years in the industry.
Airport delay can be defined as the difference between the time it would take an aircraft or
passenger to be served without interference from other aircraft or passengers and the actual time
it takes the aircraft or passenger to be served. Delay is defined in many ways
depending on the context. Scheduled departure and arrival delay is how late a flight
departs or arrives compared with the airline's schedule. Flights can incur delays while airborne or
on the ground, for instance as aircraft taxi between the runway and the gate. A late arrival of one
flight may cause a late departure of the next flight in the aircraft's schedule of arrivals and
departures.
Delay at airports is a worldwide issue. In the United States of America, Federal Aviation
Administration reports on O'Hare International Airport, Chicago, showed that in 2000 O'Hare
was ranked the third most delayed airport in the country. Overall, slightly
more than six percent of all flights were delayed (by more than 15 minutes). The current capacity
benchmark at O'Hare is 200-202 flights per hour in good weather. Capacity decreases to 157-160
flights (or fewer) per hour in adverse weather conditions, which may include poor visibility,
unfavorable winds or heavy precipitation. On good weather days, scheduled traffic is at or above the
capacity benchmark for 3½ hours of the day and about 2% of the flights are delayed significantly.
According to the Airport Capacity Benchmark Report 2001, the FAA (2001) ranked O'Hare as the
most delayed airport in the U.S. for calendar year 2002. The impact of flight delays is exacerbated
during peak traffic periods and during periods of poor weather and/or wet runway conditions.
These delay periods affect the airport's ability to provide a consistent level of air service to the
traveling public and other airport users. As aviation demand increases over time, flight delays will
continue to worsen, thereby further eroding the airport's operational reliability.

There are out, off, on and in times in aircraft operations. Out time refers to the time of pushback
(specifically, when the parking brake is released). Off time refers to the takeoff time at which
weight is no longer borne on the landing gear. On time corresponds to the touchdown time, and in
time is the moment the parking brake is applied at the gate. These times are recorded and
reported by the individual airlines, and their definitions are used for delay and travel time
calculations.
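The out, off, on and in times above can be turned into phase durations as in the following minimal sketch. The function name, the dictionary keys and the sample timestamps are illustrative assumptions rather than a format used by any airline.

```python
from datetime import datetime


def phase_minutes(out_t, off_t, on_t, in_t):
    """Durations in minutes derived from the four OOOI timestamps."""
    if not out_t <= off_t <= on_t <= in_t:
        raise ValueError("expected out <= off <= on <= in")

    def minutes(start, end):
        return (end - start).total_seconds() / 60.0

    return {
        "taxi_out": minutes(out_t, off_t),  # parking brake release to takeoff
        "airborne": minutes(off_t, on_t),   # takeoff to touchdown
        "taxi_in": minutes(on_t, in_t),     # touchdown to parking brake set
        "block": minutes(out_t, in_t),      # gate to gate
    }


# Hypothetical flight on a single day (assumed example values)
day = (2019, 3, 1)
times = phase_minutes(datetime(*day, 10, 5), datetime(*day, 10, 20),
                      datetime(*day, 11, 50), datetime(*day, 11, 58))
print(times)
```

Comparing each duration with its scheduled value gives the delay incurred in that phase of the flight.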
The FAA (74) classifies delays into gate delay, taxi-out delay, en route (in-flight) delay,
terminal delay and taxi-in delay. This is also confirmed by Bjorn Syren (2002) (73). Each
category of delay arises when the aircraft requires more time than was scheduled.
The Bureau of Transportation Statistics (BTS) (2002) defined a delayed flight as one in
which the aircraft fails to release its parking brake less than 15 minutes after the scheduled
departure time. Surface movement inefficiencies, according to the Department of Transportation
(DOT), are not the only reason for delays on the ground. Ground delay programs, en route capacity
constraints, aircraft maintenance problems, ground services (fuel, baggage and catering), customer
service issues, late aircraft crew arrival, and poor weather conditions elsewhere all
contribute to surface delays.
Flight delay is divided into reportable, non-reportable and international delays. Delays
recorded in the Air Traffic Operations Network (OPSNET) database are only those of 15 minutes or
more, in other words reportable delays. Non-reportable delays are those caused by pilot-initiated
en route deviations around adverse weather; delays caused by mechanical or other aircraft
operator/company problems; and delays for taxi time controlled by non-FAA entities.
International delays are caused by initiatives imposed by facilities outside the United States.

2.1.3 Airport performance


The performance of an airport is mainly related to the number of aircraft movements
handled (airport capacity). In this context, the term capacity generally refers to the ability
of a given transportation facility to accommodate a traffic volume (movements) in a given time
period (on an hourly, daily, or yearly basis). If the air traffic demand approaches or exceeds
the given airport capacity, the congestion of the provided infrastructure increases, which results
in delays and cancellations. This demand-capacity imbalance is a key cause of late operations and
affects different parts of the entire airport system on both the air side (runways, taxiways,
aprons) and the land side (passenger handling). Results of a data analysis from Djibouti airport
show that over 45% of the variability in daily punctuality is related to local weather
effects.
The definition of delay can vary according to the stakeholder, so many terms and
definitions have been established, such as acceptable delay, network delay, on-time performance,
conservative delays, delays per flight gate-to-gate, arrival delays, departure delays, surface
maneuvering delays, and passenger delay minutes.

Airport issues accounted for 37.9% of all Air Traffic Flow Management (ATFM) delays in
September 2017, mostly due to airport weather (17.9%), aerodrome capacity (13.6%) and
Air Traffic Control (ATC) capacity (3%) (Figure 2.2, EUROCONTROL 2017). Delays in the airport air
side area can be mitigated by optimization of demand management, improvement of existing
runway capacity, utilization enhancement, and physical expansion of air side runway infrastructure.


Figure 2.2: Airport/Air Traffic Flow Management delay

The current traffic level, based only on the commercial airlines that provide regular service
to Djibouti and excluding activities related to the presence of foreign military forces, is
approximately, per year: 5,000 to 5,500 aircraft movements, 90,000 to 95,000 local passengers, and
7,000 to 8,000 t of freight. Djibouti is connected by scheduled flights to Europe (Paris, London),
the Middle East (Dubai, Jeddah), and the countries of the Horn of Africa and East Africa (Addis
Ababa, Dire Dawa, Asmara, Hargeisa, Berbera, Mogadishu, Bosaso, Nairobi, Moroni…). Service
frequencies: to the main airports of Somalia; 5 to Yemen; 2 to Eritrea; 2 to the Middle East
(Jeddah, Dubai).
The role sought for Djibouti, as a regional intermodal hub (a transit and transshipment platform
for hinterland traffic) and a regional center for trade, cannot be conceived without a very
significant development of connections, all airlines combined, allowing one to two daily rotations
between Djibouti and the main cities and economic centers of the region.
The conditions for organizing a freight transit, transshipment and redistribution platform in
Djibouti should be discussed. Starting from the current situation, with a notable shortage of
connections and route frequencies, and given the weakness of its primary traffic, Djibouti airport
can only become an active hub that meets the needs of a regional trade center very gradually, as
activity develops, in the free zones in particular, and therefore as demand grows.

2.2 Definition of flight delay


Summarize the definitions of flight delay in the United States FAA, European Eurocontrol, ICAO,
IATA, China, etc. (you are only a simple description now. You should be detailed and have specific


regulations. Go to the official website to find them). Then, combined with the actual situation of
Djibouti, give the official description of flight delay in Djibouti, or explain the description of flight
delay in Djibouti airport from your own point of view. (For the specific writing method, please refer
to section 3.1.1 of your graduate thesis.)

2.3 Delay prediction and analysis method


2.3.1 Delay factors analysis method
Summarize the current mainstream analysis methods of influencing factors of
delay prediction, including principal component analysis, correlation analysis and other
methods. Note that the analysis methods used in this paper must appear. (this is only a
description of the basic mathematical principles and formulas of mature analysis
methods.)

2.3.2 Delay prediction method


Summarize the current mainstream delay prediction methods, not simply list, but
describe in detail such as grey prediction, multiple regression, trend extrapolation,
combined prediction based on entropy weight method and so on. (this is also a
description of the basic mathematical principles and formulas of mature prediction
methods.)

2.4 Chapter summary


To sum up this theoretical chapter, the core topics discussed were delay, the causes of delay
and the management of delay, including subtopics such as symptoms of flight delay,
factors of delay, airport factors causing delay, and types of delay. The chapter started by
presenting the concept of flight delay and the definitions given by different scholars.
The aim of the research is to understand the causes of delays and build an accurate predictive
analysis model.

Management plays an important role in evaluating and managing the delay level at the airport and
should use different methods to minimize the effect of delay, making sure that employees get proper
guidance and consultation when needed, and supporting the development of a prediction system that
combines all the measurements and techniques already used for delay.


CHAPTER 3 INFLUENCING FACTORS ANALYSIS

Based on the actual operation of a large airport, this paper focuses on the problem of airport
departure delays and carries out research on related classification and risk prediction. First of all, it is
necessary to carry out the analysis of influencing factors. Based on the historical data of the airport,
combined with the research results of domestic and foreign scholars and the actual operation of the
airport, this chapter will conduct a qualitative and quantitative analysis of the influencing factors that
lead to the delay of airport flights by collecting, sorting and cleaning flight operation data and
meteorological data. Then, the principal component analysis method is used to reduce the
dimensionality of many influencing factors, and finally the key factors affecting the delay are
screened out.
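The dimensionality reduction step can be illustrated with a minimal, standard-library sketch of how the first principal component is obtained: the factors are standardized, their correlation matrix is formed, and its leading eigenvector is found by power iteration. The factor columns and values are hypothetical and are not the airport data; a full analysis would extract further components, typically with a numerical library.

```python
import math


def standardize(rows):
    """Column-wise z-scores; assumes no column is constant."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n)
            for j in range(k)]
    return [[(r[j] - means[j]) / stds[j] for j in range(k)] for r in rows]


def correlation_matrix(z):
    """Correlation matrix of already standardized data."""
    n, k = len(z), len(z[0])
    return [[sum(r[a] * r[b] for r in z) / n for b in range(k)]
            for a in range(k)]


def first_component(corr, iterations=200):
    """Largest eigenvalue and unit eigenvector by power iteration."""
    k = len(corr)
    v = [1.0 / (j + 1) for j in range(k)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(k)) for a in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * corr[a][b] * v[b]
                     for a in range(k) for b in range(k))
    return eigenvalue, v


# Hypothetical hourly records: [visibility km, wind speed m/s, flights]
data = [[10, 3, 12], [8, 5, 15], [4, 9, 20], [2, 12, 22], [9, 4, 14]]
corr = correlation_matrix(standardize(data))
eigenvalue, loadings = first_component(corr)
print(round(eigenvalue / len(corr), 3))  # share of variance explained
```

Because the trace of a correlation matrix equals the number of factors, dividing the eigenvalue by that number gives the share of total variance carried by the component; factors with large absolute loadings are the candidates for key influencing factors.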
3.1 Data acquisition and preprocessing
3.1.1 Airport flight operation data selection and processing
The research data in this paper come from the statistics department of the Djibouti international
airport administration. The data are the flight data of Djibouti Ambouli International Airport in
2019, comprising a total of 138,316 inbound and outbound flight records. Each record corresponds
to one flight and includes: flight number, execution date, departure/landing airport, aircraft
type, planned arrival/departure time, delay time of the preceding flight, actual arrival/departure
time, etc.
The collected raw flight data are processed as follows:
(1) Calculate and count the actual departure delay time of each flight as a feature marker of the
subsequent delay prediction model. The calculation rules are defined as follows in accordance with
the "Measures for Normal Statistics of Civil Aviation Flights":
According to the "Measures for Regular Statistics of Civil Aviation Flights" implemented by the
Civil Aviation Administration in 2008, any one of the following situations is an abnormal flight.
class.
15 minutes after the departure time not announced in the schedule time (Beijing, Pudong,
Guangzhou and overseas airports 30 minutes, Hong Kong
Bridge, Shenzhen Airport 25 minutes, Chengdu, Kunming Airport 20 minutes) within the normal
departure, or not announced in the schedule
Landed within 10 minutes before or after the arrival time;

18
Nanjing University of Aeronautics and Astronautics Master Degree Thesis

Flights with abnormal circumstances such as return, diversion and alternate landing;
Airlines change their planned flights on their own without the approval of the CAAC or the
competent authorities of the regional administrations.
Among them, the definition of delay time is: delay time = actual flight departure time - (flight
timetable announcement time + regulations
taxi time).
According to the Civil Aviation Administration's 2012 "Regulations on Regular Statistics of Civil
Aviation Flights", any one of the following situations is an abnormal flight departing from Hong
Kong. -Failure to take off within the airport ground taxi time specified after the planned closing
time, and no return, alternate landing, etc.
-abnormal situation;
-Landing later than 10 minutes after the planned opening time; flights cancelled on the same day;

-Flights for which the airline has changed its flight plan on its own without approval

And the definition of flight delay time is given: flight delay time is equal to the time when the actual
departure time is later than the sum of the planned closing time and the airport ground taxiing time.
The calculation formula is: flight delay time = actual departure time - (planned door closing time +
airport ground taxi time).

According to the Civil Aviation Administration's 2016 "Regulations on Regular Statistics of Civil
Aviation Flights" (Draft for Comment), any one of the following situations makes a departing flight
abnormal:

- The flight does not depart within 15 minutes (inclusive) after the scheduled departure time;
- The airline changes the planned flight before departure without approval.

The corresponding flight delay time is defined as the length of time, in minutes, by which the
actual arrival time exceeds the planned arrival time plus 15 minutes (inclusive):
flight delay time = actual flight arrival time - (planned arrival time + 15 minutes).


Therefore, in this paper, the flight departure delay time is defined as: departure delay time =
actual departure time - planned departure time - 15 (unit: minutes). When the result is negative or
0, the flight is considered not delayed;
(2) The arrival and departure times of all flights are marked in units of hours. The arrival and
departure times in the original data cover almost every time of the day, and too many distinct
values can easily lead to overfitting of the classifier. In actual operation, the flight flow at the
airport varies greatly over the day; for example, the number of flights in the afternoon is much
higher than in the early morning. Meteorological reports are likewise observed, recorded and
transmitted every half hour or hour. We therefore ignore the influence of the date, and divide and
mark the original data by planned departure time in hourly units;
(3) Add the field "number of flight delays": judge whether each flight is delayed using the delay
time calculation in (1), and count the total number of delayed flights per unit hour;
(4) Separate the inbound and outbound flights in the original data, count the number of inbound
flights, the number of outbound flights and the total number of flights per unit hour, and add the
fields "number of departure flights", "number of inbound flights" and "total number of flights";
(5) When processing the flight operation data, take the actual operating situation into account and
use the existing departure date to add some prior information. Since the number of passengers
travelling on weekends and national statutory holidays increases, the fields "week" and
"holiday/working day" are added, where TRUE represents a holiday and FALSE represents a working day;
(6) Integrate the processed flight data with the meteorological data of the corresponding period.
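The delay definition in (1) and the hourly counting in (2) to (4) can be sketched in a few lines of Python. This is a minimal illustration, not the processing code used in the thesis: the function names, the record layout (pairs of planned and actual departure times) and the sample flights are assumptions made for the example, and negative results are floored at 0 to reflect the rule that such flights count as not delayed.

```python
from collections import defaultdict
from datetime import datetime

DELAY_THRESHOLD_MIN = 15  # tolerance used in the delay definition of item (1)


def departure_delay_minutes(planned, actual):
    """Delay = actual departure - planned departure - 15 minutes, floored at 0."""
    raw = (actual - planned).total_seconds() / 60 - DELAY_THRESHOLD_MIN
    return max(raw, 0.0)


def hourly_summary(flights):
    """Group departures by planned departure hour (date ignored) and count delays."""
    summary = defaultdict(lambda: {"departures": 0, "delayed": 0})
    for planned, actual in flights:
        bucket = summary[planned.hour]
        bucket["departures"] += 1
        if departure_delay_minutes(planned, actual) > 0:
            bucket["delayed"] += 1
    return dict(summary)


# Hypothetical records: (planned departure, actual departure)
sample = [
    (datetime(2019, 3, 1, 8, 5), datetime(2019, 3, 1, 8, 15)),    # 10 min late: not delayed
    (datetime(2019, 3, 1, 8, 40), datetime(2019, 3, 1, 9, 30)),   # 50 min late: 35 min delay
    (datetime(2019, 3, 2, 14, 0), datetime(2019, 3, 2, 14, 15)),  # exactly 15 min: not delayed
]
result = hourly_summary(sample)
```

Because the hour key ignores the date, flights from different days that share a planned departure hour fall into the same bucket, which mirrors the hourly marking described in (2).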
3.1.2 Airport Meteorological Data Processing
The airport meteorological data in this thesis come from the METAR reports of Djibouti Ambouli
International Airport. The original content of a report includes airport surface wind direction and
wind speed, visibility, present weather type, cloud type and cloud base height, temperature and dew
point, corrected sea level pressure, supplementary information, etc. The collected data therefore
need to be transformed according to the research content:
(1) According to the interpretation of weather types in the METAR code table and professional
knowledge of civil aviation, the weather types are graded by their degree of influence. 0 means
CAVOK: the visibility is greater than or equal to 10 km, there are no clouds below 1500 m, and there
is no precipitation or cumulonimbus; 1 means weather with slight influence, including -RA, -RABR,
-SHRA, -SHRABR, -DZBR, HZ, BR, MIFG and DZFG; 2 means weather with moderate influence, including RA,
SHRA, SHRABR, SQ, +RA, +SHRA and +RABR; 3 means weather with severe influence, including -TSRA,
-TSRABR, TS, TSRA, TSRABR, TSRASQ, VCTS, +TSRA and +TSRABR;
(2) The visibility and cloud height data are processed with reference to the release standards of
Djibouti Ambouli International Airport, namely a minimum visibility of 800 m and a minimum cloud
base height of 60 m. Visibility retains the specific value given in the report. Combining visibility
with weather phenomena comprehensively reflects the impact of airport weather conditions on flight
delays; for example, when the weather is fine (CAVOK), the visibility is 10 km or more. In contrast,
cloud type and cloud height have less impact. In the original data, very few records have a cloud
height below the release standard, and there are many missing values that are difficult to complete.
Using too much completed data easily introduces noise, so the cloud height data are deleted
directly;
(3) Wind direction is often reported as variable rather than as a precise angle, and it changes at
the same location over short periods of time. Its effect also depends on the runway orientation and
the take-off direction, so the delay it causes is difficult to quantify. The effects of wind on
take-off and departure, such as wind shear and crosswind, mostly accompany severe weather
phenomena, so the weather type already captures the influence of wind on delay to a certain extent.
Therefore, the data on wind direction and wind speed are not considered or converted separately;
(4) Air temperature, dew point, and corrected sea level pressure are directly extracted from the
specific values in the report without conversion.
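The grading of weather types in item (1) amounts to a lookup table. The sketch below encodes the grades exactly as listed there; the helper name `weather_grade` and the choice to return `None` for codes that the list does not mention are assumptions made for illustration.

```python
# Severity grades copied from item (1); the helper itself is illustrative.
GRADED_CODES = {
    1: "-RA -RABR -SHRA -SHRABR -DZBR HZ BR MIFG DZFG",
    2: "RA SHRA SHRABR SQ +RA +SHRA +RABR",
    3: "-TSRA -TSRABR TS TSRA TSRABR TSRASQ VCTS +TSRA +TSRABR",
}

# Flatten to {code: grade} for constant-time lookup.
SEVERITY_BY_CODE = {
    code: grade for grade, codes in GRADED_CODES.items() for code in codes.split()
}


def weather_grade(code):
    """Return 0 for CAVOK, 1 to 3 for the listed phenomena, None for unlisted codes."""
    if code == "CAVOK":
        return 0
    return SEVERITY_BY_CODE.get(code)
```

Returning `None` rather than a default grade keeps unlisted codes visible, so they can be reviewed instead of being silently treated as fine weather.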

3.1.3 Data cleaning


The collected flight operation data and meteorological data contain many missing and abnormal
values, while the prediction needs to be based on a large amount of data. If a large number of
samples are deleted or the data are inaccurate, the machine learning model loses a lot of useful
information and the performance of the trained model suffers. Therefore, the data need to be
cleaned. The main process is shown in Figure 2.1.


Figure 2.1 Data cleaning process

The first step is to convert the data into statistics per unit hour. According to the division
standard for unit time periods in Section 3.1.1, the records indexed by flight number are
aggregated into airport-level records indexed by hourly period. The flight number, execution date,
aircraft type and delay time of the preceding flight are therefore no longer considered. In
addition, since arrival delay also involves delays at the upstream airport and along the air route,
it is difficult to quantify, so this thesis focuses on predicting departure delay and no longer
takes into account data such as the take-off/landing airport, planned/actual arrival time and
actual departure time.
The second step is to fill the missing values in the data. There are many ways to do so, including
filling with statistics such as the mean, median or mode; filling with special values outside the
normal range, such as -999 or 0; filling by sub-category; interpolation; and filling with predicted
values. This thesis uses GBDT (gradient boosting decision tree) to fill in missing data, because it
handles various types of data flexibly, including continuous and discrete values, and can use the
values of other feature attributes to predict and fill the missing values of a feature attribute.
The third step is to remove the extremely abnormal noise data.
In the data completion step, the GBDT algorithm is trained on the records without missing values and
then used to predict the missing part. The working principle of GBDT is shown in the following
figure:

Figure 2.2 How GBDT works


The specific working principle is as follows. First, a decision tree is fitted starting from an
initial prediction, and the residual of this round (that is, the difference between the actual value
and the predicted value) is obtained. The residual is then used as the prediction target of the next
decision tree, which in turn produces a new residual. This loop is repeated until the residual of
the last round is extremely small or 0. The predictions of all the decision trees are then summed to
give the final prediction result.
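The residual-fitting loop described above can be made concrete with a deliberately small sketch. It is not the GBDT implementation used in the thesis: it assumes squared-error loss, depth-one regression trees (stumps), the mean as the initial value and a learning rate that scales each tree's contribution, and the temperature, pressure and dew point values are hypothetical.

```python
def fit_stump(X, residuals):
    """Find the single feature/threshold split that minimises squared error."""
    best = None
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            if not left or not right:
                continue
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            error = (sum((r - left_mean) ** 2 for r in left)
                     + sum((r - right_mean) ** 2 for r in right))
            if best is None or error < best[0]:
                best = (error, j, threshold, left_mean, right_mean)
    return best[1:] if best else None


def predict_stump(stump, row):
    j, threshold, left_mean, right_mean = stump
    return left_mean if row[j] <= threshold else right_mean


def fit_gbdt(X, y, n_rounds=50, learning_rate=0.5, tol=1e-6):
    """Each round fits a stump to the current residuals, as in the text."""
    init = sum(y) / len(y)           # initial value: mean of the targets
    predictions = [init] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [t - p for t, p in zip(y, predictions)]
        if max(abs(r) for r in residuals) < tol:  # residual is extremely small: stop
            break
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, row)
                       for p, row in zip(predictions, X)]
    return init, learning_rate, stumps


def predict_gbdt(model, row):
    """Final prediction: initial value plus the scaled sum of all trees."""
    init, learning_rate, stumps = model
    return init + learning_rate * sum(predict_stump(s, row) for s in stumps)


# Hypothetical complete records: [temperature (deg C), pressure (hPa)] -> dew point (deg C)
known_rows = [[28.0, 1010.0], [30.0, 1008.0], [33.0, 1006.0],
              [35.0, 1005.0], [38.0, 1003.0], [40.0, 1002.0]]
known_dew_points = [22.0, 23.0, 24.5, 25.0, 26.0, 27.0]

model = fit_gbdt(known_rows, known_dew_points)
filled = predict_gbdt(model, [34.0, 1005.5])  # fill a record whose dew point is missing
```

The same pattern applies to filling missing values: the model is trained on complete records and then queried with the features of a record whose target attribute is missing.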
Example description

The collected data are integrated by hour, giving a total of 8760 records for the whole year. Among
them, temperature, dew point, air pressure, cloud height and wind direction have many missing
values: 327 periods are missing temperature, 634 are missing dew point, 326 are missing air
pressure, 1359 are missing cloud height and 2077 are missing wind direction. Cloud height and wind
direction have the most missing periods and, because they correlate only weakly with the other
attributes, their values are difficult to predict. According to existing research results, cloud
height and wind direction have little direct impact on delays, and their adverse effects on airport
flight delays usually accompany bad weather. These two feature attributes are therefore deleted
without completion, and the temperature, dew point and air pressure data are completed. Some periods
with missing dew point are selected to illustrate the completion, as shown in Table 2.1.

Dew point is taken as the class label attribute and the other 11 attributes (planned departure
period, number of delays, total number of airport flights per hour, number of flights arriving per
hour, number of flights departing per hour, week, holiday, visibility, weather type, temperature and
air pressure) as the feature attributes. The GBDT model is trained iteratively, the 11 feature
attributes of each period to be predicted are then input, and the prediction result, that is, the
dew point, is obtained.
Using the GBDT algorithm, 1241 records with missing values in the above feature attributes were
filled. In addition, 651 outlier records were removed, leaving 8109 valid records.
The fields finally retained in the data are: scheduled departure time, delay time, total airport
flights per hour, number of flights arriving per hour, number of flights departing per hour, week,
holiday/working day, visibility, weather type, temperature, dew point and air pressure.
EXAMPLE OF THE LARGE DATA TABLE

3.2 Analysis on Influencing Factors of airport delay


3.2.1 Qualitative analysis
3.2.2 Quantitative analysis
3.3 Screening of key influencing factors



3.4 Example analysis


In this study, exploratory factor analysis is carried out in SPSS using the principal component
extraction method with varimax rotation.
The observation-to-variable ratio, the Kaiser-Meyer-Olkin (KMO) statistic and Bartlett's test of
sphericity are used to evaluate the adequacy of the sample. The KMO test indicates whether the
partial correlations among the variables are small, and thus how well suited the data are for factor
analysis; it measures sampling adequacy for each variable in the model. According to previous
research, the sample should contain more than 100 observations. In the present study, the sample
size is 330, which provides an acceptable basis for calculating the correlations between variables.
A KMO value of at least 0.5 is recommended, which indicates that factor analysis is appropriate for
analysing the attributes. The KMO result in this analysis is:

Table 4.5: KMO and Bartlett's Test

Kaiser-Meyer-Olkin Measure of Sampling Adequacy              .729
Bartlett's Test of Sphericity    Approx. Chi-Square       285.785
                                 df                             6
                                 Sig.                        .000

A small KMO value indicates that the correlations between pairs of variables cannot be explained by
the other variables, in which case factor analysis is unsuitable. The value of 0.729 obtained here
(Table 4.5) is above the recommended minimum of 0.5 and is sufficient to conduct factor analysis.
Bartlett's test of sphericity is highly significant (reported Sig. = .000, that is, p < 0.001), so
the data are suitable for further analysis.
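As a consistency check on Table 4.5, the degrees of freedom of Bartlett's test depend only on the number of variables p. The thesis does not spell out the test statistic, so the form below is the standard textbook one and is given as an assumption, with n the number of observations and R the correlation matrix of the variables:

```latex
\chi^2 = -\left(n - 1 - \frac{2p + 5}{6}\right)\ln\lvert R\rvert ,
\qquad
df = \frac{p(p-1)}{2} = \frac{4 \times 3}{2} = 6 \quad (p = 4).
```

The reported df = 6 therefore corresponds to four analysed variables, which agrees with the four components listed in Table 4.6.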

Table 4.6 Total Variance Explained

             Initial Eigenvalues                       Extraction Sums of Squared Loadings
Component    Total    % of Variance    Cumulative %    Total    % of Variance    Cumulative %
1            2.202    55.049            55.049         2.202    55.049           55.049
2             .841    21.016            76.065
3             .545    13.629            89.695
4             .412    10.305           100.000

One factor has been extracted and identified (Table 4.6). This single factor explains 55.049% of the
total variance of the 4 variables.
It is generally recommended that the extracted factors account for at least 60% of the variance; the
present solution falls slightly below this threshold, which should be kept in mind when interpreting
the results. In this study, varimax rotation with Kaiser normalization was used. Varimax generally
performs well in combination with Kaiser normalization (temporarily equalizing the communalities
while rotating), so it is advisable always to use the two together (and normalization is recommended
with other rotation methods, too).
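The percentages in Table 4.6 follow directly from the eigenvalues: each component's share is its eigenvalue divided by the sum of all eigenvalues. The sketch below recomputes them from the rounded eigenvalues printed in the table, so the results match the table only to about two decimals. The retention rule shown (eigenvalue greater than 1, the Kaiser criterion) is an assumption about how the single factor was selected, since the thesis does not state the rule.

```python
# Initial eigenvalues as printed in Table 4.6 (rounded to three decimals).
eigenvalues = [2.202, 0.841, 0.545, 0.412]


def variance_explained(values):
    """Return (percent of variance, cumulative percent) for each component."""
    total = sum(values)  # for a correlation matrix this equals the number of variables
    percents = [100 * v / total for v in values]
    cumulative = []
    running = 0.0
    for p in percents:
        running += p
        cumulative.append(running)
    return percents, cumulative


def kaiser_retained(values):
    """Kaiser criterion (assumed here): keep components whose eigenvalue exceeds 1."""
    return [i + 1 for i, v in enumerate(values) if v > 1.0]


percents, cumulative = variance_explained(eigenvalues)
retained = kaiser_retained(eigenvalues)
```

Under this rule only component 1 is retained, which is consistent with the single extracted factor reported in the table.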

3.5 Summary of this chapter


This chapter covers the basic preparations, namely data collection and processing, and explores the
key factors that affect short-term airport delays. First, one year of flight operation data from
Djibouti Ambouli International Airport is selected; the original flight data are processed, divided
and aggregated by unit hour, and the relevant influencing-factor fields are added. The weather
information is analysed and converted using basic civil aviation knowledge and the actual situation
of the airport. Then all the collected data are cleaned and completed, and abnormal data and data
that cannot be completed are deleted. In addition, based on the actual operating status of the
airport and on the existing domestic and international research literature, a qualitative analysis
of the factors influencing flight delays is carried out.
Combined with quantitative analysis of the data, flight delays at the airport are then explored
under the influence of single factors and of multiple factors acting together. Finally, principal
component analysis is used to screen the influencing factors, and an example analysis with actual
data identifies the key feature variables that affect flight delays at the airport, which supports
the subsequent use of machine learning algorithms for delay risk prediction.


Chapter 4: Prediction Method Research


4.1 Delay prediction model based on multiple regression analysis

The research design sets out what data are required, what methods are used to collect and analyse
the data, and how these answer the research question. It is the framework that specifies the types
of information to be collected, the sources of data and the data collection procedure. A good design
ensures that the information gathered is consistent with the study objectives and that the data are
collected by an accurate and economical procedure. This study employed a cross-sectional research
design.
This choice is justified by the nature of the problem under study: the data were collected at a
single point in time, and this design is commonly used for academic work under time constraints.
This type of design helps to examine the factors of flight delays.
The aim of coding was to reduce the data to simple categories and themes that allow comparison and
testing of the critical questions of the study. “Using quantitative method tools increases overall
confidence in the findings of the study” and helps address complex social problems that may not be
answered successfully by a single approach. Triangulation of instruments in research is essential
because it enhances validity and reliability. The integration of a quantitative approach was also
necessary for solving the research problems. SPSS version 24 was used to conduct the quantitative
data analysis.
Data were collected from primary and secondary sources as required by the research. The targeted
sample size was 300; in total, 330 responses were collected. A self-administered questionnaire was
used to collect the primary data after a pilot test. The data were described using frequency
distributions and percentages. Statistical analyses such as reliability analysis, factor analysis
and correlation analysis were conducted to discover how the variables were related and whether the
findings were reliable. To address the research objective of this study, multiple regression
analysis was conducted.
The variables were further tested using Multiple Linear Regression (MLR), combining all independent
variables in a single equation to establish their contribution to the dependent variable (flight
delay).


The model was expressed as:

γ = α + β1X1 + β2X2 + β3X3 + β4X4 + ε (4.1)

where:
γ = flight delay of the airline (dependent variable)
α = y-intercept (constant term)
X1 = Weather Condition (WC)
X2 = Flight Delay (FD)
X3 = Air Carrier (AC)
X4 = National Aviation System (NAS)
β1 = slope coefficient for explanatory variable X1 (weather)
β2 = slope coefficient for explanatory variable X2 (delay)
β3 = slope coefficient for explanatory variable X3 (air carrier)
β4 = slope coefficient for explanatory variable X4 (NAS)
ε = residual error term

γ = α + β1(Weather condition) + β2(Technical problem) + β3(Scarcity of aviation fuel) + β4(Passenger delay) + ε
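To show how the coefficients α and β1 to β4 of model (4.1) are estimated, the following sketch fits an ordinary least squares regression by solving the normal equations. The thesis performs this step in SPSS; the code is only an illustration of the same estimator, and the function names, the synthetic predictor scores and the "true" coefficients used to generate the response are all assumptions.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def fit_mlr(X, y):
    """Ordinary least squares: returns [alpha, beta1, ..., betak]."""
    design = [[1.0] + list(row) for row in X]  # leading 1 for the intercept
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic scores for X1..X4 (hypothetical, not the survey data).
X = [[1, 2, 3, 4], [2, 1, 4, 3], [3, 5, 1, 2],
     [4, 3, 5, 1], [5, 4, 2, 5], [2, 2, 2, 3]]
TRUE_COEFFICIENTS = [1.0, 0.3, 0.2, -0.1, 0.5]  # alpha, beta1..beta4 (made up)
y = [TRUE_COEFFICIENTS[0] + sum(b * v for b, v in zip(TRUE_COEFFICIENTS[1:], row))
     for row in X]  # noise-free response, so the fit should recover the coefficients

coefficients = fit_mlr(X, y)
```

With real questionnaire data the response contains noise, so the estimates would differ from any underlying values and would be reported together with standard errors and p-values, as SPSS does.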
The collected data were analysed statistically using SPSS Statistics version 24. The Statistical
Package for the Social Sciences (SPSS) is widely used in academic research. The data were collected
and analysed with respect to the demographic characteristics of the respondents, such as age and
gender, and their situations.
The data analysis followed a systematic procedure:
- First, the data collected through the questionnaire were coded in an Excel sheet.
- Second, the data were imported into SPSS to analyse the variables selected for the research.
- Third, to test the hypotheses, multiple regression was applied to gain a better understanding of
the relationships between the variables.
The collected data were quantitative in nature and were analysed using multiple regression models
in SPSS version 24.


4.2 Delay prediction model based on grey prediction


4.3 Delay prediction model based on trend extrapolation
4.4 Delay prediction model based on quadratic polynomial
4.5 Combined forecasting model based on entropy weight method
4.6 Flight Delay prediction process of large airports
The models in Sections 4.1 to 4.4 each produce a forecast, and the combined forecasting method of
Section 4.5 is then used to output the final prediction result.
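The combination step can be illustrated with one commonly described variant of the entropy weight method for combined forecasting. The thesis does not specify its formulation, so everything below is an assumption: each model's relative errors are normalised into shares, the entropy of the shares is computed, and a model whose errors vary more across periods receives a smaller weight. The actual values and the three forecasts are hypothetical.

```python
import math


def relative_errors(actual, forecast):
    """Absolute relative error of one model for each period."""
    return [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]


def entropy_weights(errors_by_model):
    """Entropy-based weights; needs at least 2 models and 2 periods."""
    m = len(errors_by_model)
    n = len(errors_by_model[0])
    variation = []
    for errors in errors_by_model:
        total = sum(errors)
        if total == 0:
            variation.append(0.0)  # a model with no error shows no variation
            continue
        shares = [e / total for e in errors]
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n)
        variation.append(1.0 - entropy)  # degree of variation of the error sequence
    total_variation = sum(variation)
    if total_variation == 0:
        return [1.0 / m] * m  # no model stands out: fall back to equal weights
    return [(1.0 - d / total_variation) / (m - 1) for d in variation]


def combine(forecasts, weights):
    """Weighted sum of the single-model forecasts for each period."""
    periods = len(forecasts[0])
    return [sum(w * f[t] for w, f in zip(weights, forecasts)) for t in range(periods)]


# Hypothetical delay counts and three single-model forecasts.
actual = [100.0, 120.0, 90.0, 110.0]
forecasts = [
    [105.0, 126.0, 94.5, 115.5],   # errors of 5% in every period
    [102.0, 124.8, 95.4, 118.8],   # errors of 2%, 4%, 6%, 8%
    [101.0, 121.2, 90.9, 150.7],   # errors of 1%, 1%, 1%, 37%
]
errors = [relative_errors(actual, f) for f in forecasts]
weights = entropy_weights(errors)
combined = combine(forecasts, weights)
```

In this variant the weights sum to 1, so the combined forecast is a weighted average of the single-model forecasts. Other formulations of the entropy weight method exist and would give different weights.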
4.7 Summary of this chapter


Chapter 5: Example Analysis of flight delay in Djibouti Airport


5.1 Prediction results

5.2 Result analysis




Chapter 6: Conclusion of the Study

6.1 Conclusion
The main aim of the study is to analyse the factors causing flight delay in the airport sector.
A descriptive research design and inferential statistics were adopted (a quantitative research
design technique was also adopted). Secondary data were collected from journals, reports, articles
and official websites, analysed through descriptive statistics and presented in the form of
frequencies and percentages. A literature review was conducted to analyse previous academic research
on the subject.
Inferential statistics were also used to test the hypotheses (Pearson’s correlation) and to
determine the relationship between variables (Multiple Linear Regression analysis). Finally,
Cronbach’s Alpha was adopted to measure the reliability of the study, and the Kaiser-Meyer-Olkin
Measure of Sampling Adequacy (KMO) and Bartlett’s Test to measure its validity. The study covered
330 respondents, and a factor analysis along with multiple regression was run.
The model developed was statistically reliable in explaining the relationship between weather
condition and flight delay. The result was significant, with B1 = 0.309 and a reported p-value of
0.000 (that is, p < 0.001). Consequently, weather is an important component and a strong predictor
of flight delay, and it can be recognized that weather has a significant impact on delay in the
case of Djibouti airport. Applying the hypothesis test procedure to the air carrier variable, its
p-value according to the multiple regression result in Table 5.8 was not significant (0.251).
Consequently, air carrier is not a statistically significant predictor of flight delay in this
model.
NAS is an important component and the strongest predictor of flight delay. It can be recognized
that NAS has a positive and significant impact on flight delay, with a reported significance value
of 0.000 (p < 0.001).
It is important to note that each coefficient is influenced by the other variables in a regression
model, because predictor variables are nearly always associated with one another. In some cases two
or more variables may explain the same variation in γ. Therefore, each coefficient does not explain
the total effect on γ of its corresponding variable.
Predicting flight delays is an interesting research topic that has received much attention in recent
years. Most studies have tried to develop and extend their models in order to increase the precision
and accuracy of flight delay prediction. Since on-time performance is very important, flight delay
prediction models must have high precision and accuracy.


6.2 Implication of Findings


6.2.1 Academicians and other Researchers
The study sheds light on the study area and opens doors for other researchers to investigate
similar topics in other environments, or in the same environment using different methodologies.
Furthermore, the study can be replicated using the same methodologies, objectives and hypotheses to
find out whether its findings are valid and reliable. The study is also important because it
provides insights into the theories it draws on, which other researchers may find helpful in
related studies.

6.2.2 Policy makers


Policy making is a problem in the aviation sector, specifically at large airports. With the aid of
this research, policy makers can review policies related to the prediction of flight delay at large
airports in order to meet the needs of the various organizations involved, mainly in raising their
performance.

6.2.3 Researcher
The study was important to the researcher in gaining knowledge of the relationship between the
independent and dependent variables of the study, and of the influence of the coefficients of the
determinants on the dependent variable. Furthermore, the findings broadened the researcher’s
knowledge of airports and airport management in relation to the specific objectives (weather
condition, technical problem, passenger delay and scarcity of aviation fuel). The findings add to
the researcher’s knowledge of the role of flight delay prediction at a large airport. The completion
of this study fulfils a requirement for the degree of Master of Engineering.

6.3 Limitations of the Study


Prior to conducting the study, the researcher expected to encounter the following obstacles. Some
respondents would withhold vital information that they considered confidential, so the researcher
would have to rely on basic information. Some respondents would be too busy to participate fully in
the study. Finally, the time apportioned by the University to complete the study was insufficient
and was expected to hinder the collection of sufficient data.
In actual field practice, the obstacles encountered included methodological problems regarding data
collection techniques and the use of closed-ended questionnaires as the main instrument for
acquiring information. On some occasions, some subjects were indeed very busy, confirming the
researcher’s fears. Some participants withheld information they considered confidential, forcing the
researcher to rely on basic information and preventing the collection of secondary data. Although
the researcher prepared a budget of TZS 690,000/= at the proposal stage of the study, no support was
obtained from the initial to the final stage of the study.


LIST OF REFERENCES

[1] Ball, M., Barnhart, C., Dresner, M., Hansen, M., Neels, K., Odoni, A., Peterson, E.,
Sherry, L., Trani, A. A., and Zou, B., “Total delay impact study: a comprehensive
assessment of the costs and impacts of flight delay in the United States, NEXTOR,”
2010.
[2] Chen, J. and Sun, D., “Stochastic Ground-Delay-Program Planning in a Metroplex,”
Journal of Guidance, Control, and Dynamics, Vol. 41, No. 1, 2017, pp. 231–239
[3] Ferguson, J., Kara, A. Q., Hoffman, K., and Sherry, L., “Estimating domestic US airline
cost of delay based on European model,” Transportation Research Part C: Emerging
Technologies, Vol. 33, 2013, pp. 311–323
[4] Mueller, E. R. and Chatterji, G. B., “Analysis of aircraft arrival and departure delay
characteristics,” AIAA aircraft technology, integration and operations (ATIO)
conference, 2002.
[5] Klein, A., “Airport delay prediction using weather-impacted traffic index (WITI)
model,” Digital Avionics Systems Conference (DASC), 2010 IEEE/AIAA 29th, IEEE,
2010, pp. 2–B.
[6] Xu, N., Donohue, G., Laskey, K. B., and Chen, C.-H., “Estimation of delay propagation
in the national aviation system using Bayesian networks,” 6th USA/Europe Air Traffic
Management Research and Development Seminar, 2005.
[7] Rebollo, J. J. and Balakrishnan, H., “Characterization and prediction of air traffic
delays,” Transportation research part C: Emerging technologies, Vol. 44, 2014, pp. 231–
241.
[8] D.M. Dai and J. S. Liou, “Delay prediction models for departure flights,”
Transportation Research Record, 2006.
[9] Chained Predictions of Flight Delay Using Machine Learning, Jun Chen, Meng Li.
January 2019.
[10] Kwan and M. Hansen, “US flight delay in the 2000s: an econometric analysis,” in
Proceedings of the Transportation Research Board 90th Annual Meeting, vol. 11-4283,
2011.
[11]R. Wesonga, F. Nabugoomu, and P. Jehopio, “Parameterized framework for the analysis
of probabilities of aircraft delay at an airport,” Journal of Air Transport Management,
vol. 23, pp. 1–4, 2012.


[12] Tan Zhou, Qiang Gao, Xin Chen and Zongwei Xun, Flight Delay Prediction Based on
Characteristics of Aviation Network. MATEC Web of Conferences 259, 0 0 (2019).
[13] Weiwei Wu, Cheng-Lung Wu, Tao Feng, Haoyu Zhang, and Shuping Qiu.
Comparative Analysis on Propagation Effects of Flight Delays: A Case Study of China
Airlines. Journal of Advanced Transportation Volume 2018, Article ID 5236798, 10
pages.
[14] Yiming Xing, Xiaojuan Ban, Xu Liu and Qing Shen, Large-Scale Traffic Congestion
Prediction Based on the Symmetric Extreme Learning Machine Cluster Fast Learning
Method, Published: 28 May 2019.
[15] J. Liu, K. Li, M. Yin, X. Zhu, and K. Han, “Optimizing key parameters of ground
delay program with uncertain airport capacity,” Journal of Advanced Transportation,
vol. 2017, Article ID 7494213, 2017.
[16] J. H. Mott, “The use of demand correlation in the modeling of air carrier departure
delays as first-order autoregressive random processes,” Journal of Advanced
Transportation, vol. 47, no. 5, pp. 498–511, 2013.
[17] Castro, A. and E. Oliveira (2007). A distributed multi-agent system to solve airline
operations problems. Proceedings of the ninth international conference on enterprise
information systems,vol, Madeira, Portugal.
[18] Clausen, J., J. Hansen, et al. (2001). Disruption Management – Operations Research
between Planning and Execution. OR/MS Today, 28(5), 40-43.
[19] Stojkovic, G., F. Soumis, et al. (2002). An optimization model for a real-time flight
scheduling problem. Transportation Research Part, 36A, 779-788.
[20] L. Zonglei, W. Jiandong, and Z. Guansheng. A New Method to Alarm Large Scale of
Flights Delay Based on Machine Learning. In 2008 International Symposium on
Knowledge Acquisition and Modeling, pages 589–592, Dec. 2008.
[21] Abdelghany, K. F., S. S. Shah, et al. (2004). A model for projecting flight delays
during irregular operation conditions. Journal of Air Transport Management, 10, 385-
394.
[22] Abdelghany, K. F., A. F. Abdelghany, et al. (2008). An integrated decision support tool
for airlines schedule recovery during irregular operations. European Journal of
Operational Research, 185(2), 825-848.
[23] Beatty, R., R. Hsu, et al. (1998). Preliminary evaluation of flight delay propagation
through an airline schedule. Proceedings of the Second USA/Europe Air Traffic
Management R&D Seminar. Orlando.
[24] Ball, M., C. Barnhart, et al. (2007). Air transportation: Irregular operations and
control. Handbook in Operations Research and Management Science, 14, 1-73.
[25] Babic, O., M. Kalic, et al. (2011). The airline schedule optimization model: validation
and sensitivity analysis. Procedia - Social and Behavioral Sciences, 20, 1029-1040.
[26] Rosenberger, J. M., E. L. Johnson, et al. (2003). Rerouting aircraft for airline
recovery. Transportation Science, 37(4), 408-421.
[27] Teodorovic, D. and S. Guberinic (1984). Optimal dispatching strategy on an airline
network after a schedule perturbation. European Journal of Operational Research, 15(2),
178-183.
[28] Yan, S. and D.-H. Yang (1996). A decision support framework for handling schedule
perturbation. Transportation Research Part B: Methodological, 30(6), 405-419.
[29] Yu, G., M. Arguello, et al. (2003). A new era for crew recovery at Continental Airlines.
Interfaces, 33, 5-22.
[30] Cook, A. and G. Tanner (2009). The challenge of managing airline delay costs.
Conference on Air Traffic Management (ATM) Economics.10. University of Belgrade.
[31] Kohl, N., A. Larsen, et al. (2004). Airline disruption management: perspectives,
experiences and outlook. Journal of Air Transport Management, 13, 149–162.
[32] https://www.transtats.bts.gov/
[33] W. Vigneau, “Flight Delay Propagation: Synthesis of the Study,” Eurocontrol
Experimental Centre, EEC Note No. 18/03, p. 33, October 2003.
[34] C. Ariyawansa and A. Aponso. Review on state of art data mining and machine
learning techniques for intelligent Airport systems. In Proceedings of 2016
International Conference on Information Management, ICIM 2016, pages 134–138,
2016.
[35] Juan Jose Rebollo and Hamsa Balakrishnan (2014). ‘Characterization and Prediction
of Air Traffic Delays’, Massachusetts Institute of Technology.
[36] Q. L. Qin and H. Yu (2014), A statistical analysis on the periodicity of flight delay rate
of the airports in the US, Advances in Transportation Studies: An International Journal,
2014 Special Issue, Vol. 3.
[37] Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani (2013), An
Introduction to Statistical Learning with Applications in R, Springer.
[38] Anish M. Kalliguddi and Aera K. Leboulluec. Predictive Modeling of Aircraft Flight
Delay. Universal Journal of Management, 5(10): 485-491, 2017. DOI:
10.13189/ujm.2017.051003.
[39] Sun Choi, Young Jin Kim, Simon Briceno, Dimitri Mavris. “Prediction of
weather-induced airline delays based on machine learning algorithms”, 35th Digital
Avionics Systems Conference (DASC), 2016.
[40] Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani (2013), An
Introduction to Statistical Learning with Applications in R, Springer.
[41] Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5-32.
[42] Breiman, L.; Friedman, J. H.; Olshen, R. A.; Stone, C. J. Classification and
Regression Trees; Chapman & Hall/CRC: Boca Raton, 1984.
[43] Kim, A.; Hansen, M. Deconstructing delay: A non-parametric approach to analyzing
delay changes in single server queuing systems. Transp. Res. Part B 2013, 58, 119–133.
[44] Abdelghany, K.F.; Shah, S.S.; Raina, S.; Abdelghany, A.F. A model for projecting
flight delays during irregular operation conditions. J. Air Transp. Manag. 2004, 10,
385–394.
[45] Allan, S.S.; Beesley, J.A.; Evans, J.E.; Gaddy, S.G. Analysis of delay causality at
Newark International Airport. In Proceedings of the 4th USA/Europe Air Traffic
Management R&D Seminar, Santa Fe, NM,
[46] Balakrishna, P.; Ganesan, R.; Sherry, L. Accuracy of reinforcement learning
algorithms for predicting aircraft taxi-out times: A case-study of Tampa Bay departures.
Transp. Res. Part C 2010, 18, 950–962.
[47] Liu, Y.L.; Liu, Y.; Hansen, M.; Pozdnukhov, A.; Zhang, D.Q. Using machine learning
to analyze air traffic management actions: Ground delay program case study. Transp.
Res. Part E 2019, 131, 80–95.
[48] Zhang, Y.; Hansen, M. Real-time intermodal substitution: Strategy for airline recovery
from schedule perturbation and for mitigation of airport congestion. Transp. Res. Rec.
2008, 2052, 90–99.
[49] Xu, N.; Donohue, G.; Laskey, K.B.; Chen, C.H. Estimation of delay propagation in
aviation system using Bayesian Network. In Proceedings of the 6th USA/Europe ATM
Seminar, Baltimore, MD, USA, 27–30 June 2005.
[50] Xu, N.; Sherry, L.; Laskey, K.B. Multi-factor model for predicting delays at US
airports. Transp. Res. Rec. 2008, 2052, 62–71.
[51] Pyrgiotis, N.; Malone, K.M.; Odoni, A. Modelling delay propagation within an airport
network. Transp. Res. Part C 2013, 27, 60–75.
[52] Fleurquin, P.; Ramasco, J.J.; Eguiluz, V.M. Systemic delay propagation in the US
airport network. Sci. Rep. 2013, 3, 1159.
[53] Guleria, Y.; Cai, Q.; Alam, S.; Li, L.S. A multi-agent approach for reactionary delay
prediction of flights. IEEE Access 2019, 7, 181565–181579.
[54] Khanmohammadi, S.; Tutun, S.; Kucuk, Y. A new multilevel input layer artificial
neural network for predicting flight delays at JFK airport. Procedia Comput. Sci. 2016,
95, 237–244.
[55] Belcastro, L.; Marozzo, F.; Talia, D.; Trunfio, P. Using scalable data mining for
predicting flight delays. ACM Trans. Intell. Syst. Technol. 2016, 8.
[56] Yu, B.; Guo, Z.; Asian, S.; Wang, H.Z.; Chen, G. Flight delay prediction for
commercial air transport: A deep learning approach. Transp. Res. Part E 2019, 125,
203–221.
[57] Wong, J.T.; Tsai, S.C. A survival model for flight delay propagation. J. Air Transp.
Manag. 2012, 23, 5–11.
[58] Geurts, P.; Ernst, D.; Wehenkel, L. Extremely randomized trees. Mach. Learn. 2006,
63, 3–42.
[59] Geurts, P.; Louppe, G. Learning to rank with extremely randomized trees. In
Proceedings of the Learning to Rank Challenge. PMLR 2011, 14, 49–61.
[60] Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.Y.
LightGBM: A Highly Efficient Gradient Boosting Decision Tree. Available online:
http://papers.nips.cc/paper/6907-lightgbm-a-highly-efficient-gradient-boosting-decision
(accessed on 26 March 2020)
[61] Ahmad I. Z. Jarrah, Gang Yu, Nirup Krishnamurthy, Ananda Rakshit. A Decision
Support Framework for Airline Flight Cancellations and Delays. Transportation
Science, Volume 27, No. 3, 1 Aug 1993.
[62] Michael F. Argüello, Jonathan F. Bard and Gang Yu. A GRASP for Aircraft Routing in
Response to Groundings and Delays. Journal of Combinatorial Optimization,
Volume 1, pages 211–228 (1997).
[63] Lu Hao, Mark Hansen, Yu Zhang, Joseph Post. New York: Two ways of estimating
the delay impact of New York airport. Transportation Research Part E: Logistics and
Transportation Review, Volume 70, October 2014, pages 245-260.
[64] Somchai Pathomsiri, Ali Haghani, Martin Dresner, Robert J. Windle. Impact of
undesirable outputs on the productivity of US airports. Transportation Research Part E:
Logistics and Transportation Review, Volume 44, Issue 2, March 2008, pages 235-259.
[65] Aisling J. Reynolds-Feighan, Kenneth J. Button. An assessment of the capacity and
congestion levels at European airports. Journal of Air Transport Management, Volume
5, Issue 3, July 1999, pages 113-134.
[66] Leonardo Carvalho, Alice Sternberg, Leandro Maia Gonçalves, Ana Beatriz Cruz,
Jorge A. Soares, Diego Brandão, Diego Carvalho, Eduardo Ogasawara. On the
relevance of data science for flight delay research: a systematic review. Dec 2020,
pages 499-528.
[67] Alice Sternberg, Jorge Soares, Diego Carvalho, Eduardo Ogasawara. A Review on
Flight Delay Prediction. Computational Engineering, Finance, and Science, March
2017.
[68] Sruti Oza, Somya Sharma. Predicting flight delay using Machine Learning.
International Journal of Engineering and Computer Science, Volume 4, 2015.
[69] The Relationship between Principal Component Analysis and Factor Analysis and
SPSS Software. Discuss with Comrade Liu Yumei, Lu Wendai, etc. 2005.
[70] MA Zhengping, CUI Deguang. Optimizing airport flight delays. Department of
Automation, Tsinghua University, Beijing 100084, China.
[71] Wen Tian, Xiaoxu Dai, and Minghua Hu. Systemic Congestion Propagation in the
Airspace Network. National Key Laboratory of Air Traffic Flow Management, Nanjing
University of Aeronautics and Astronautics, China.
[72] Wen Tian, Huiqing Xu, Yixing Guo, Bin Hu, and Yi Yao. Probabilistic En Route
Sector Traffic Demand Prediction Based upon Statistical Analysis of Error Distribution
Characteristics. Journal of Advanced Transportation, volume 2018. Nanjing University
of Aeronautics and Astronautics, Nanjing 211106, China.
[73] G. N. Okeudo and E. A. Ejem. Arrival and Departure Delay Characteristics in
Nigerian Airlines. Journal of Research in National Development, Volume 7, No. 1,
June 2009.
[74] Federal Aviation Administration, United States Department of Transportation.


ACKNOWLEDGEMENT

First, I would like to express my heartfelt gratitude to the Almighty God. I would like to
express my sincere appreciation to all the people who contributed in different ways to making
this research dissertation possible, for their support, suggestions, and hard work. I also thank
my supervisor for responding selflessly to my endless questions and requests for more
information on my research topic. I also thank those who took the time to provide the secondary
data; their full cooperation enabled me to accomplish this research.
I would also like to thank all those who made this an unforgettable experience for me.
Above all, I would like to express my sincere gratitude to my supervisor and classmates for their
continuous support from the proposal through to the final dissertation, and for their patience,
motivation, enthusiasm, and immense knowledge.
Many thanks go to my University and its staff in particular, for their efforts in broadening my
academic horizons. Moreover, I dedicate this work to my beloved family, for their support,
encouragement, and overall contributions throughout my thesis work.


Paper Published

Neima Osman Aden (2020). Air transportation and socioeconomic development: the case of
Djibouti. RSVT 2020: 2nd International Conference on Robotics Systems and Vehicle Technology,
pages 89-94. https://doi.org/10.1145/3450292.3450307
