Airline On-Time Performance

Executive summary
Introduction
This comparative study of 1991 and 2001 aims to identify the characteristics that
define on-time flights in each of the two years and how those characteristics changed. The United States
Department of Transportation (n.d.) presents a criterion that allows a 15-minute leeway before and after the
scheduled time for both arrival and departure. Flights whose departure falls within this 15-minute window
are classified as "on time". Through data analysis and modeling, this
investigation seeks to uncover temporal patterns, identify shifting influences, and highlight similarities and
differences in the determinants of on-time performance.

Furthermore, the report critically examines a hypothetical scenario where a larger airport intends to
gather comprehensive traveler data for categorization and use predictive analytics to identify
potential late departure causes. Drawing from Solove's ethical framework, this analysis assesses the
ethical ramifications concerning privacy, data security, and individual autonomy inherent in such an
extensive data collection and utilization scheme.

Guided by these analyses and ethical considerations, the report concludes by presenting actionable
recommendations. These suggestions aim to not only enhance on-time performance strategies for
airlines but also advocate for ethical and responsible practices in the realm of data collection and
utilization within the aviation industry.

Methodology
Explain the data sources utilized for analyzing on-time flight characteristics in 1991
and 2001.
The data sets consist of 29 features that describe the arrival and
departure details of all commercial flights within the USA in 1991 and 2001. The 1991 data set has
5,076,925 records, while the 2001 data set contains 5,967,780. The features include time-related
fields for actual and scheduled flight times, airline code, flight number, actual and
scheduled elapsed time, arrival and departure delay, origin and destination, travel distance,
cancellation code and related cancellation reasons, data about diverted flights, tail number,
airtime, taxi in, and taxi out.

Prepare the data


To lighten the data sets and simplify the analysis, I deleted unwanted
variables and created new ones. Exploring the data sets, I found several features that contain no
values: reason for cancellation, carrier delay, weather delay, NAS (National Airspace
System) delay, security delay, and late aircraft delay. In addition, the 1991 data set has no records for
tail number, airtime, taxi in, or taxi out. I therefore dropped these columns and proceeded to
remove missing values from the remaining features. The features with an
acceptable number of missing values in both data sets are departure time, arrival time, actual
elapsed time, arrival delay, and departure delay. Duplicate records were also removed from both data
sets.
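
The cleaning steps above can be sketched with pandas as follows. This is a minimal illustration, not the code used for the report: the column names (`UniqueCarrier`, `DepDelay`, `TailNum`, and so on) and the toy rows are assumptions, and `prepare` is a hypothetical helper.

```python
import pandas as pd

# Toy stand-in for one year of data; names and values are assumed for illustration.
flights = pd.DataFrame({
    "Month": [1, 1, 2, 2, 2],
    "UniqueCarrier": ["US", "US", "WN", "WN", "AA"],
    "DepDelay": [5.0, 5.0, None, 20.0, -2.0],
    "ArrDelay": [3.0, 3.0, 10.0, 25.0, -5.0],
    "TailNum": [None, None, None, None, None],       # column with no values at all
    "WeatherDelay": [None, None, None, None, None],  # column with no values at all
})

def prepare(df: pd.DataFrame) -> pd.DataFrame:
    """Drop all-empty columns, then rows with missing values, then duplicate rows."""
    cleaned = df.dropna(axis="columns", how="all")     # features that hold no values
    cleaned = cleaned.dropna(axis="index", how="any")  # records with a missing value
    cleaned = cleaned.drop_duplicates()                # repeated records
    return cleaned.reset_index(drop=True)

clean = prepare(flights)
print(clean)
```

On the toy frame, the two empty columns, the one record with a missing departure delay, and the one repeated record are removed.
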
After handling missing and duplicate values in the data sets, I selected features by examining their
correlations. I excluded the features with extremely low influence on on-time flight
performance: year, flight number, cancellation, diverted, taxi in,
taxi out, and tail number.
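
One way to screen features by correlation is sketched below. The report does not state which correlation measure or target column was used, so Pearson correlation (the pandas default) and a binary `on_time` column are assumptions here, as are the column names and the toy values.

```python
import pandas as pd

# Toy numeric features plus an assumed binary target (1 = on time, 0 = delayed).
flights = pd.DataFrame({
    "DepDelay": [0.0, 5.0, 20.0, 30.0, 45.0, 2.0],
    "Distance": [500.0, 300.0, 500.0, 300.0, 500.0, 300.0],
    "on_time":  [1, 1, 0, 0, 0, 1],
})

# Absolute Pearson correlation of every feature with the target, strongest first.
corr_with_target = (
    flights.corr()["on_time"]
    .drop("on_time")
    .abs()
    .sort_values(ascending=False)
)
print(corr_with_target)
```

Features at the bottom of such a ranking would be the candidates for exclusion.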

Furthermore, I created a new variable called “delayed”, which indicates whether a flight departed
more than 15 minutes after its scheduled departure time. The feature is of Boolean type.
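
A minimal sketch of how such a flag could be derived, assuming the departure delay is stored in minutes in a column named `DepDelay` (the column name and the sample values are assumptions):

```python
import pandas as pd

DELAY_THRESHOLD_MIN = 15  # the 15-minute leeway described in the report

# Departure delays in minutes; negative values are early departures.
flights = pd.DataFrame({"DepDelay": [-3.0, 0.0, 15.0, 16.0, 42.0]})

# True only when the flight left more than 15 minutes after schedule.
flights["delayed"] = flights["DepDelay"] > DELAY_THRESHOLD_MIN
print(flights)
```

A flight that departs exactly 15 minutes late is not flagged, because the condition is "more than 15 minutes".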

To examine the effect of the airline on flight performance, we first observe the number of flights per
airline in both data sets. US had the highest number of flights in 1991, with
more than 800,000, but ranked only 5th in 2001. Meanwhile, although WN ranked
7th in 1991 with approximately 370,000 flights, it took 1st place in 2001. In terms of
arrival punctuality, AA, AS, HP, and UA experienced a significant increase in average arrival delay
(in minutes) in 2001 compared to 1991. On the other hand, CO, TW, and US show a slight
decrease, while the others remained relatively stable. Regarding average departure delay,
AA, AS, UA, and HP showed a notable increase, while the others did not fluctuate much.
The choice of carrier can therefore be considered a factor that influences on-time flight
performance.
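
The per-carrier comparison can be sketched as a grouped summary for each year. The frames below hold made-up values purely for illustration (they are not the report's figures), and the column names and the helper `carrier_summary` are assumptions.

```python
import pandas as pd

def carrier_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Flight count and mean arrival/departure delay per carrier, busiest first."""
    return (
        df.groupby("UniqueCarrier")
        .agg(
            flights=("DepDelay", "size"),
            mean_arr_delay=("ArrDelay", "mean"),
            mean_dep_delay=("DepDelay", "mean"),
        )
        .sort_values("flights", ascending=False)
    )

# Illustrative toy rows only.
y1991 = pd.DataFrame({
    "UniqueCarrier": ["US", "US", "US", "WN", "AA"],
    "ArrDelay": [4, 6, 2, 1, 3],
    "DepDelay": [5, 5, 2, 0, 4],
})
y2001 = pd.DataFrame({
    "UniqueCarrier": ["WN", "WN", "WN", "US", "AA"],
    "ArrDelay": [2, 4, 0, 3, 9],
    "DepDelay": [1, 3, 2, 4, 8],
})

s1991 = carrier_summary(y1991)
s2001 = carrier_summary(y2001)

# Change in mean arrival delay per carrier; pandas aligns the two years by carrier code.
change = s2001["mean_arr_delay"] - s1991["mean_arr_delay"]
print(change)
```

A positive entry in `change` means the carrier's average arrival delay grew between the two years.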

Based on the United States Department of Transportation (n.d.) criterion, a flight is classified as
on time if its arrival delay is non-positive (on time or early) and its departure delay is within the
15-minute window; all other flights are considered delayed.
Comparing the correlations with on-time status in both years, departure
delay has the strongest association with flight punctuality. A flight that departs more than 15
minutes late is most likely to arrive late at its destination and is thus considered delayed.
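
The classification rule can be written as a small predicate. This sketch reads "within the 15-minute window" as an absolute departure deviation of at most 15 minutes, following the "before and after" wording of the introduction; that reading, the function name, and the parameter names are assumptions.

```python
def is_on_time(arr_delay_min: float, dep_delay_min: float, window_min: float = 15.0) -> bool:
    """Return True when the arrival is on time or early AND the departure
    deviates from schedule by at most `window_min` minutes."""
    return arr_delay_min <= 0 and abs(dep_delay_min) <= window_min

# (arrival delay, departure delay) pairs in minutes
examples = [(-5, 10), (0, 15), (1, 0), (-2, 16)]
labels = [is_on_time(arr, dep) for arr, dep in examples]
print(labels)
```

If early departures should always count as on time, replacing `abs(dep_delay_min)` with `dep_delay_min` gives the one-sided version.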

Observing delayed flights by month can also yield meaningful insights, for example into airport
congestion. The monthly distribution of the number of flights in both data sets shows that
flights are spread evenly across the months of each year. The histogram shows that the lowest
number of delayed flights in 1991 occurred in February, while the lowest point in 2001
was in September. The number of delayed flights peaked in July
and August in both 1991 and 2001. Hence, the month of travel could be a factor that influences flight
punctuality.
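
A monthly breakdown of this kind could be computed as sketched below, assuming a `Month` column (1 to 12) alongside the Boolean `delayed` flag; the rows are toy values, not the report's data.

```python
import pandas as pd

flights = pd.DataFrame({
    "Month":   [1, 1, 2, 7, 7, 7, 8],
    "delayed": [True, False, False, True, True, False, True],
})

by_month = flights.groupby("Month")["delayed"]
monthly_delayed = by_month.sum()   # number of delayed flights per month
monthly_share = by_month.mean()    # fraction of that month's flights that were delayed

peak_month = monthly_delayed.idxmax()
lowest_month = monthly_delayed.idxmin()
print(monthly_delayed, peak_month, lowest_month)
```

Reporting the share next to the raw count helps separate months that are simply busier from months in which a larger fraction of flights is delayed.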

Comparison of On-Time Flight Characteristics (1991 vs. 2001):

- Present findings on what characterized on-time flights in each year.


- Highlight any significant changes or similarities between the two periods.

Additional Analytics Questions and Answers:


Example Question:

"Can we predict the likelihood of a flight delay based on historical data and weather conditions?"

Answer:

Describe the predictive model used, its accuracy, and insights gained regarding factors influencing
flight delays.
Ethical Analysis Based on Solove (2006)
Solove’s framework on privacy concerns revolves around the notion of a "taxonomy of privacy"
where he identifies several distinct problems that relate to privacy invasion. Applying Solove's
framework to the scenario of a larger airport's extensive data collection and traveler categorization
yields several ethical considerations:

The airport's plan involves aggregating extensive data from travelers' smartphones, flight plans,
passport control, security control, and commercial activities. This aggregation creates a
comprehensive profile of individuals, raising concerns about the potential for excessive surveillance
and intrusion into personal lives.

The use of facial recognition technology in shops and restaurants raises issues of identification
without consent. This technology might be perceived as invasive, especially if travelers are not
explicitly aware or have not consented to this level of data collection and tracking.

The categorization of travelers as "likely to cause late departure of flight" might lead to profiling and
discrimination. This could result in the exclusion of individuals from certain airport services or
additional scrutiny solely based on predictive analytics, without transparent criteria or the ability for
individuals to contest their categorization.

There's a risk that the collected data might be used beyond its original purpose. Even though the
current plan aims for reminders and staff access to improve on-time performance, there's potential
for misuse or sharing of data with third parties for unrelated purposes.

Collecting vast amounts of traveler data also poses significant security risks. Storing sensitive
information increases the likelihood of data breaches, potentially exposing personal details to
malicious entities.

Recommendations
Ensure transparency in data collection practices, informing travelers about the extent, purpose, and
implications of data collection. Obtaining explicit consent is crucial.

Collect only essential data required for improving on-time performance and limit its use strictly to
that purpose. Minimize the collection of sensitive information that isn’t necessary for flight
operations.

Establish an independent ethical oversight committee to regularly review data collection,
categorization methods, and usage to ensure alignment with ethical principles and privacy laws.

Implement robust security measures and privacy-enhancing technologies to safeguard collected
data. This includes encryption, anonymization where possible, and strict access controls.

Conduct periodic reviews of data collection practices, categorization algorithms, and adherence to
ethical guidelines. Hold responsible parties accountable for any misuse or breaches of collected data.

Provide ethical guidelines for the airport's data collection and use, ensuring respect for travelers'
privacy and rights.

Conclusion
Summarize key findings, implications for the airline industry, and ethical considerations.

Reiterate the importance of balancing business intelligence goals with ethical responsibilities.
References
Include all sources and literature referenced in the report.