Assignment Report

1.0 Define Goal
1.1 Brief background of RPM
Revenue Passenger Miles (RPM) serves as a crucial indicator for evaluating the air
travel industry in the United States, specifically for flights that are scheduled for
passengers. In aviation, RPM gives insight into the total distance travelled by
fare-paying passengers. This metric is derived by multiplying the number of
passengers who have purchased tickets by the miles covered during the flight. For
example, a plane journey covering 300 miles with 100 ticket-holding passengers
would generate 30,000 RPM. Prior to the COVID-19 pandemic, the industry saw a
surge in revenue, which predominantly comes from passenger flights, constituting
over 90% of the total earnings. A case in point is American Airlines, a major U.S.
carrier offering both domestic and international flights. The airline's primary source of
revenue is passenger operations. Between 2016 and 2019, its RPM rose steadily
from 223.5 billion miles in 2016 to 241.3 billion miles in 2019,
an increase of about 8% over this four-year span (Fedorik, 2021). However, the
industry took a significant hit in 2020 due to the pandemic, severely affecting its RPM
performance.
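The RPM definition and the growth figure quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are our own, and the inputs are the worked example and the American Airlines figures given in the text.

```python
def revenue_passenger_miles(paying_passengers: int, miles_flown: float) -> float:
    """RPM for one flight: fare-paying passengers times distance flown."""
    return paying_passengers * miles_flown


def percent_growth(start: float, end: float) -> float:
    """Percentage change from start to end."""
    return (end - start) / start * 100.0


# Worked example from the text: 100 passengers on a 300-mile flight.
example_rpm = revenue_passenger_miles(100, 300)

# American Airlines RPM in billions of miles, 2016 and 2019 (figures from the text).
aa_growth = percent_growth(223.5, 241.3)

print(example_rpm)          # 30000
print(round(aa_growth, 1))  # 8.0
```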
1.2 How COVID-19 has impacted the airline industry in the United States

The passenger airline industry is highly susceptible to a range of external
influences, such as safety concerns, public health issues, and broader economic
events (Varma, 2021). These external shocks can cause either a temporary dip or a
long-lasting change in the industry's growth trajectory, often influenced by how well
the sector recovers from economic downturns (Fedorik, 2021). Given the industry's
history of narrow profit margins, many airlines lack robust business strategies to
weather unforeseen crises. In light of the economic and social factors affecting the
sector, the COVID-19 pandemic has served as a wake-up call for airlines to move
from a high-volume, low-profit model to more sustainable business practices (Song,
Choi, & Han, 2020). Airlines that have built their business on the principles of
cost-effectiveness and affordable ticket pricing have shown resilience in mitigating
the adverse impacts of the pandemic. To compensate for operational expenses and
the lack of revenue-generating opportunities, many airlines have turned to
streamlined operations as a competitive edge during periods of dwindling demand.
1.3 Goal of the forecasting problem

The objective of this predictive analysis is to evaluate the impact of the COVID-19
pandemic on the U.S. airline sector by quantifying the anticipated shortfall in
Revenue Passenger Miles (RPM) due to the crisis. Using time series analysis, we
project RPM for the pandemic period from pre-pandemic data alone. Comparing
these projections with the actual RPM figures gives an estimate of the traffic lost to
the pandemic and, by extension, of its financial implications for the industry.
2.0 Explore and Visualise Series
Figure 1 illustrates the time series plot of US revenue passenger miles from January
2000 to April 2023. The graph shows how Revenue Passenger Miles (RPM) have
changed over time and highlights several crucial events. A noticeable drop began in
2002, coinciding with events that also affected the wider economy. During the Great
Recession in 2008 and 2009, there was a further considerable decline in RPM.
Widespread loan defaults and a decline in U.S. housing values were two features of
this economic crisis. After recovering from these setbacks, RPM shows a steady
upward trend, peaking in 2019 at 101,794,185. The COVID-19 pandemic, however,
abruptly ended this rise in 2020. RPM fell sharply because of the extensive travel
restrictions brought on by the worldwide health crisis, including an entry ban to the
United States. The graph captures these significant structural breaks and shows how
external factors have affected the performance of the airline industry over time.
The ggseasonplot of Revenue Passenger Miles (RPM) reveals several significant
patterns and anomalies. RPM normally peaks in the summer, most notably in June
and July, before declining in the latter part of the year, more so in December. This
shows that travel follows a seasonal pattern, perhaps affected by events like
holidays and the weather. However, this pattern is disturbed in years with major
national or international crises. For example, the seasonal pattern departed from
expectations in 2002, during the Great Recession of 2008–2009, and during the
COVID-19 pandemic in 2020. In spite of these disturbances, RPM generally moves
upward, peaking in 2019. Particularly noticeable is the sharp decline in 2020, which
highlights how severely the COVID-19 pandemic has affected air travel. Overall, the
plot provides a detailed look at how RPM has changed over time as a result of
seasonal and extraordinary events.
The graph clearly demonstrates an upward trajectory in Revenue Passenger Miles
(RPM) on a monthly basis across multiple years, emphasising a growth in air travel.
However, this growth was significantly interrupted in 2020 due to the global
COVID-19 pandemic, which led to stringent travel restrictions, particularly for foreign
visitors to the United States. Additionally, the plot reveals that July consistently sees
the highest RPM, likely due to the peak travel season during the summer. June and
August also register high RPM, making the entire summer period a busy travel
season. The favourable weather conditions during these months attract a large
number of travellers to the U.S. Lastly, December also experiences a noticeable
uptick in RPM, probably due to the Christmas holiday, which is widely celebrated in
the U.S. Overall, the plot offers valuable insights into both the seasonal patterns and
the impact of external events on air travel.
3.0 Pre-Process Data
As the graph above shows, the period from 2010 to 2023 is a sensible choice for
analysis. The plot shows an overall upward trend in Revenue Passenger Miles
(RPM), reflecting the industry's adaptability and expansion under relatively stable
economic conditions following the Great Recession of 2008–2009. The COVID-19
pandemic is significant because it shows how the airline sector responds to a major
worldwide shock; the steep fall in RPM in 2020 is a crucial data point for analysing
industry weaknesses and resilience. This period is also long enough to capture the
strong seasonality seen in RPM, especially the summer peaks. Including the earlier
structural breaks of 2002 and 2008–2009 would make the model parameters
unstable, which decreases the reliability and validity of the model. Therefore, the
“full dataset” runs from January 2010 to April 2023. Concentrating on this more
recent period is likely to produce more pertinent and useful insights for
understanding current trends and creating forecasts for the near future.
4.0 Partition Series
The "full dataset" will be divided into "Pre-Covid" and "Covid" portions.
The "Pre-Covid" period can be identified from the COVID-19 timeline for the United
States. On February 26, 2020, the first COVID-19 case in the US was officially
confirmed (Song, Choi, & Han, 2020). The first few cases of COVID-19 in the United
States were traced to individuals who had returned from Wuhan and had direct
contact with COVID-19-positive persons in the heavily infected area (Fedorik, 2021).
4.1 Precovid dataset [Appendix 1]

4.2 Covid dataset [Appendix 2]
4.3 Training and Test dataset [Appendix 3]
The dataset designated as "Pre-Covid" spans from January 2010 to February 2020,
while the "Covid" dataset covers the period from March 2020 to April 2022. Prior to
splitting the "Pre-Covid" dataset, the `length()` function is used to confirm that it
contains 122 observations. The "Pre-Covid" data is then partitioned into training and
test sets, allocating roughly 80% of the data for training and the remaining 20% for
testing. Specifically, the training set extends from January 2010 through December
2017, and the test set runs from January 2018 to February 2020. The training and
test sets contain 96 and 26 observations, respectively, as shown in Appendix 4.
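The observation counts can be verified by enumerating the months in each window. This is a minimal sketch in Python rather than the R workflow used for the report; the helper `month_range` is a name introduced here for illustration, and the December 2017 cut-off is the one implied by the 96 and 26 observation counts.

```python
from datetime import date


def month_range(start: date, end: date) -> list[date]:
    """Return the first day of every month from start to end, inclusive."""
    months = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        months.append(date(year, month, 1))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months


# Pre-Covid window stated in the report: January 2010 to February 2020.
pre_covid = month_range(date(2010, 1, 1), date(2020, 2, 1))

# Assumed cut-off: training data ends in December 2017.
train = [m for m in pre_covid if m <= date(2017, 12, 1)]
test = [m for m in pre_covid if m > date(2017, 12, 1)]

print(len(pre_covid), len(train), len(test))   # 122 96 26
print(round(len(train) / len(pre_covid), 3))   # 0.787
```

The split is therefore close to, but not exactly, 80/20, because the training set is cut at a year boundary.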
5.0 Apply Forecasting Methods/Models

5.1 One Simple Forecasting Method: Seasonal Naive
The time series plot above displays seasonality and trend over the training set
period (January 2010 to December 2017). Of the simple methods considered,
seasonal naive is the only one that accounts for seasonality, because it takes the last
observed value from the same season.
The Seasonal Naive forecast closely follows the observed data, capturing the
seasonality quite well. However, it doesn't account for any trend, and there are still
areas where it deviates from the actual values.
The mean method produces a flat line, assuming that future values will equal the
historical mean. This is clearly not a good model for the data, as it captures neither
the seasonality nor the trend.
The drift method adds a drift component to the naive forecast, aiming to capture
some of the trend. However, it also fails to capture the seasonality in the data and
deviates further from the actual values than the Seasonal Naive method.
Based solely on visual inspection, the Seasonal Naive method seems to perform the
best among the three, as it captures the seasonal patterns in the data more closely
than the other two methods. However, it's important to note that visual inspection is
just one part of model evaluation and statistical measures should also be considered
for a comprehensive evaluation.
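To make the difference between the three benchmark methods concrete, the sketch below implements each one in plain Python. It is not the code used for the report (which relied on R); the function names and the toy series are assumptions introduced for illustration.

```python
def seasonal_naive(history, horizon, period=12):
    """Repeat the last observed value from the same season."""
    last_cycle = history[-period:]
    return [last_cycle[i % period] for i in range(horizon)]


def mean_forecast(history, horizon):
    """Forecast every future point as the historical mean (a flat line)."""
    avg = sum(history) / len(history)
    return [avg] * horizon


def drift_forecast(history, horizon):
    """Extend the straight line through the first and last observations."""
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * (h + 1) for h in range(horizon)]


# Hypothetical toy series with a seasonal period of 4 (not the RPM data).
toy = [10, 14, 12, 8, 11, 15, 13, 9]

print(seasonal_naive(toy, 4, period=4))  # [11, 15, 13, 9]
print(mean_forecast(toy, 2))             # [11.5, 11.5]
print(drift_forecast(toy, 2))
```

Only the seasonal naive forecast reproduces the within-cycle shape; the mean and drift forecasts are straight lines, which is why they miss the summer peaks.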
Ljung-Box test on residuals from the Seasonal Naive method
Q*        df    p-value      Model df    Total lags used
129.79    19    < 2.2e-16    0           19

The residuals from the Seasonal Naive forecasting model show some noticeable
patterns. Specifically, from 2014 to 2016, most residuals are greater than zero, while
from 2011 to 2014, some are less than zero. This suggests that the model's residuals
are not mean-reverting, even though they appear to be normally distributed when
viewed as a histogram.
The Autocorrelation Function (ACF) plot indicates that the residuals still contain
some unexplained information. In particular, the autocorrelations at the first few lags
are statistically significant, which suggests that the Seasonal Naive method has not
accounted for the trend component in the data. The significant lags may also point to
seasonal elements that the model has not captured. The remaining lags do not show
significant correlation.
This interpretation is further supported by the Ljung-Box test, which yields a p-value
much smaller than the 0.05 significance level. We therefore reject the null
hypothesis that the residuals are uncorrelated up to lag 19, confirming that the
residuals are not simply white noise. In summary, Seasonal Naive forecasting
performs relatively well compared with the other simple forecasting techniques.
However, a more advanced model that takes into account both error and trend
components could potentially offer more accurate forecasts.
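The Ljung-Box statistic used here is Q* = n(n+2) Σ r_k² / (n − k), summed over lags k = 1 to h, where r_k is the residual autocorrelation at lag k. The sketch below computes it from scratch and compares the Q* values reported in this report against the 5% chi-square critical value for 19 degrees of freedom (approximately 30.144). The function names are our own, and the comparison uses the reported statistics rather than the residuals themselves.

```python
def autocorrelation(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom


def ljung_box_q(x, max_lag):
    """Ljung-Box statistic Q* = n(n+2) * sum_k r_k^2 / (n - k)."""
    n = len(x)
    return n * (n + 2) * sum(
        autocorrelation(x, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )


# 5% critical value of the chi-square distribution with 19 degrees of freedom.
CHI2_CRIT_19 = 30.144

# Q* values reported in this report, all with df = 19.
reported = {"Seasonal Naive": 129.79, "ETS(M,A,M)": 21.72, "ETS(M,Ad,M)": 25.475}
for name, q in reported.items():
    verdict = "reject white noise" if q > CHI2_CRIT_19 else "fail to reject"
    print(f"{name}: Q* = {q} -> {verdict}")
```

Only the Seasonal Naive statistic exceeds the critical value, which matches the p-values reported in the tables.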
5.2 Two ETS Exponential Smoothing Models

To study time series data effectively, it is crucial to understand how the underlying
trend interacts with seasonal patterns, which can usually be either multiplicative or
additive in nature. The chart reveals some important attributes:
● Multiplicative error: The time series displays variations of different magnitudes
over its duration. These fluctuations are likely the result of multiplicative
errors. The observed multiplicative seasonality in the data further corroborates
the likelihood of such errors being present.
● Multiplicative seasonality: The time series graph clearly exhibits seasonal
patterns, with regular peaks and troughs. However, the size of these seasonal
swings is not constant over time; it grows and shrinks with the level of the
series. As a result, the seasonality in the data is better described as
multiplicative rather than additive.
● Additive damped or additive trend: The trend component reveals a generally
upward trajectory, though it appears to plateau towards the end. This makes it
unclear whether the trend is damped or simply linear. Additionally, the
consistent growth in Revenue Passenger Miles (RPM) may not be
guaranteed, given that some passengers might travel to the U.S. and not
make a return trip in the short to medium term.
The two ETS exponential smoothing models identified from the time series plot
are:
1. ETS(M,A,M): Multiplicative Error, Additive Trend and Multiplicative Seasonality
2. ETS(M,Ad,M): Multiplicative Error, Additive Damped Trend and Multiplicative
Seasonality
First model: ETS(M, A, M)

The ETS function returns parameter estimates as shown below:
Smoothing parameters
alpha     beta      gamma
0.19      0.0127    0.0023
Initial states
l = 65374563.1833
b = 142154.498
s = 0.9737  0.9145  0.9889  0.951   1.1251  1.1674
    1.113   1.042   0.9819  1.0237  0.8181  0.9007
The table above shows the parameter estimates and initial states returned from the
ETS() function. It can be observed that:
● Alpha (α = 0.19): The level smoothing parameter determines how quickly the
model responds to changes in the level (or mean) of the series. A value close
to 0 means the model reacts slowly to level changes and relies more on
historical data. A value close to 1 means the model reacts swiftly to level
changes and depends more heavily on recent observations. The value here,
roughly 0.19, points to a moderate rate of adaptation to changes in the
series' level.
● Beta (β = 0.0127): This parameter regulates how quickly the model reacts to
shifts in the series' trend. A value near 0 indicates that the model reacts
slowly to changes in the trend, whereas a value near 1 indicates rapid
adaptation. The value of approximately 0.0127 indicates that the model adapts
to changes in the trend somewhat slowly.
● Gamma (γ = 0.0023): This parameter determines how quickly the model
reacts to variations in seasonality. The model reacts slowly to seasonal
changes if the value is close to 0, whereas a value close to 1 indicates
rapid adaptation. The value of about 0.0023 indicates that the model adapts
to variations in seasonality rather slowly.
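The role of the three smoothing parameters is easiest to see in a single update step. The sketch below applies the standard ETS(M,A,M) state equations once: the relative (multiplicative) error of the one-step forecast is scaled by α, β and γ to update the level, trend and seasonal index. The smoothing parameters are the estimates reported above; the starting states and the observation are hypothetical values chosen to give a 10% forecast error, and the function name is our own.

```python
def ets_mam_step(level, trend, season, y, alpha, beta, gamma):
    """One ETS(M,A,M) update; `season` is the seasonal index from one cycle ago."""
    one_step_forecast = (level + trend) * season
    rel_error = (y - one_step_forecast) / one_step_forecast  # multiplicative error
    new_level = (level + trend) * (1 + alpha * rel_error)
    new_trend = trend + beta * (level + trend) * rel_error
    new_season = season * (1 + gamma * rel_error)
    return one_step_forecast, new_level, new_trend, new_season


# Smoothing parameters reported above; the states and observation are hypothetical.
alpha, beta, gamma = 0.19, 0.0127, 0.0023
forecast, level, trend, season = ets_mam_step(
    level=100.0, trend=2.0, season=1.10, y=123.42,
    alpha=alpha, beta=beta, gamma=gamma,
)
print(forecast, level, trend, season)
```

With a 10% surprise, the level absorbs about 1.9% of it, while the trend and seasonal index barely move, which is what the small β and γ estimates imply.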
According to the chart, the forecast predicts a rising trend while also showing
seasonal patterns. This is expected because of the additive trend component. As a
result, the series is projected to continue rising. The prediction interval appears
reasonable and aligns well with past patterns observed in the training data.
The ETS(M,A,M) model generates fitted values that closely align with the observed
data in the training set. It successfully captures the non-linear trends in the data,
although it doesn't fully account for some of the more extreme spikes, particularly
those observed in 2010 and 2015.
Residual Diagnostic Checks:
While the ETS(M,A,M) model's residuals display some inconsistencies, particularly in
2010 and 2012, they generally tend to revert to the mean. The distribution of these
residuals, as seen in the histogram, is mostly normal. However, slightly large
autocorrelations at lags 8 and 24 could indicate some seasonality or other pattern in
the residuals that the model has not fully captured.
Ljung-Box test on residuals from ETS(M,A,M)
Q*       df    p-value    Model df    Total lags used
21.72    19    0.2984     0           19
The Ljung-Box test results suggest that the ETS(M,A,M) model is a good fit for the
data. With a test statistic (Q*) of 21.72 and 19 degrees of freedom, the p-value of
0.2984 is well above the conventional 0.05 threshold for statistical significance. We
therefore fail to reject the null hypothesis that the residuals are white noise, which
suggests that the model has captured most of the underlying structure in the time
series. The p-value above 0.05 also indicates that the slightly large autocorrelations
at lags 8 and 24 are not jointly significant, so they may not materially affect the
model's performance. The test uses 19 lags with 0 model degrees of freedom,
meaning no adjustment was made for the number of estimated parameters. Overall,
these statistics provide confidence in the model's ability to produce reliable forecasts.
Second model: ETS(M, Ad, M)
Estimate ETS(M, Ad, M) Model
Smoothing parameters
alpha     beta      gamma     phi
0.3529    0.0243    0.0001    0.9765
Initial states
l = 65374562.8052
b = 162995.0916
s = 0.9718  0.9142  0.9903  0.9507  1.1237  1.1694
    1.1098  1.0423  0.9826  1.0254  0.8195  0.9003
The table above shows the parameter estimates and initial states returned from the
ETS() function. It can be observed that:
● Alpha (α = 0.3529): This is the smoothing parameter for the level, indicating
how quickly the model adapts to fluctuations in the data. A higher value
(closer to 1) indicates a responsive model that quickly adjusts to recent
changes. Conversely, a lower value (closer to 0) implies a model that gives
more weight to historical data. The value here is 0.3529, which suggests a
moderate level of responsiveness to changes in the level of the series.
● Beta (β = 0.0243): This parameter determines how much the model reacts to
changes in the trend. A value close to 1 means that the model is very
sensitive to trend changes, while a value close to 0 means it is less sensitive.
The estimated value of 0.0243 suggests that the model responds only weakly
to changes in the trend, which might be because the trend itself is fairly
stable.
● Gamma (γ = 0.0001): This parameter is for the seasonality component. A
value close to 1 indicates high responsiveness to seasonal changes, while a
value close to 0 suggests the opposite. The value is very close to zero,
indicating that the model sees the seasonal pattern as very stable and thus
doesn't need to update its seasonal factors much.
● Phi (φ = 0.9765): This damping parameter is present only because the trend
(Ad) is damped. It controls how quickly the trend flattens out. Less damping
is indicated by values closer to 1, and more damping by values closer to 0.
The estimate of 0.9765 indicates that the trend is only slightly damped.
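The effect of φ can be illustrated with the trend part of the forecast alone. With a damped additive trend, the h-step-ahead trend contribution is (φ + φ² + ... + φ^h) × b instead of h × b. The sketch below uses the initial level, trend and φ reported above purely as example inputs and ignores seasonality; the function name is an assumption of ours.

```python
def damped_trend_path(level, trend, phi, horizon):
    """Trend-only forecasts l + (phi + phi^2 + ... + phi^h) * b for h = 1..horizon."""
    forecasts, damp_sum = [], 0.0
    for h in range(1, horizon + 1):
        damp_sum += phi ** h
        forecasts.append(level + damp_sum * trend)
    return forecasts


# Initial level, trend and phi as reported for ETS(M,Ad,M); seasonality is ignored.
level, trend, phi = 65374562.8052, 162995.0916, 0.9765

damped = damped_trend_path(level, trend, phi, 24)
linear = damped_trend_path(level, trend, 1.0, 24)   # phi = 1 gives the undamped trend

# As h grows, the damped trend contribution approaches b * phi / (1 - phi).
long_run_level = level + trend * phi / (1 - phi)

print(damped[-1] < linear[-1])       # True
print(damped[-1] < long_run_level)   # True
```

Even with φ close to 1, the damped path falls below the linear one within two years, which is why the ETS(M,Ad,M) forecasts look more conservative.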
ETS(M,Ad,M) and ETS(M,A,M) have similar prediction intervals. Because the trend is
additive damped, the forecast continues the upward trend, but it flattens in the
long term.
The ETS(M,Ad,M) model effectively replicates the training set's data, closely
following its non-linear patterns while omitting an unusually high peak observed in
2010 and 2015. This model appears to be a good fit for the training data, capturing
most of its variations similarly to the ETS(M,A,M) model.
Ljung-Box test on residuals from ETS(M,Ad,M)
Q*        df    p-value    Model df    Total lags used
25.475    19    0.1455     0           19
The Q* statistic has a value of 25.475 with 19 degrees of freedom (df). The p-value
of 0.1455 is larger than the conventional significance level of 0.05. As a result, the
null hypothesis cannot be rejected, supporting the notion that the residuals are
independent and that the model captures the underlying data structure. The "Model
df" (model degrees of freedom) is 0, which indicates that no parameters were
subtracted in the calculation. The test used 19 lags in total.
Overall, the statistics indicate that the ETS(M, Ad, M) model adequately describes
the data because the residuals appear to be white noise, that is, independently and
identically distributed.
5.3 One ETS Model Selected by R
Setting a model component to the letter "Z" tells R to select that component
automatically. With the model set to "ZZZ", R evaluates all error, trend and seasonal
components and chooses the combination that minimises the information criterion.
Similarly, when `damped` is set to `NULL`, R decides whether to use a damped or
undamped trend, based on which option minimises the information criterion.
Smoothing parameters
alpha     beta      gamma
0.3529    0.0243    0.0001

The ETS(M,A,M) model selected by the R function consists of Multiplicative Error,
Additive Trend, and Multiplicative Seasonality components. The choice of this model
by R is consistent with the evident patterns in the training set data. The additive
trend component in the model captures the incremental changes over time, while the
multiplicative seasonality accounts for fluctuations that aren't constant but
proportionate to the level of the series. The multiplicative error term allows the model
to accommodate varying levels of volatility in the data. The selection of this model by
R is hardly surprising, considering the training set data displayed clear trends and
seasonality, which the ETS(M,A,M) model is well-suited to capture. AIC, AICc, and
BIC balance goodness of fit against model complexity, and their relatively low values
for this model compared with the alternatives suggest that it fits the historical data
well. This makes it a reasonable tool for capturing the underlying patterns in
the data and for making future forecasts.
The Model/Method with Best Goodness Of Fit

Model             Measure                        Value
Seasonal Naive    Residual standard deviation    2409183.0407
ETS(M,A,M)        AIC                            3105.644
ETS(M,Ad,M)       AIC                            3105.661
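As seen from the table above, ETS(M,A,M) has the lowest AIC, although only
marginally (3105.644 against 3105.661 for ETS(M,Ad,M)), so it offers the best fit to
the training data among the ETS models. Seasonal Naive has no AIC, so its residual
standard deviation is reported instead and is not directly comparable. This result is
expected, as there is evidence of this method producing robust multi-step forecasts.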
Forecast plot of three forecasting models and method
The forecasts from the three models appear to mirror the trend and seasonality of
the actual data, with minor differences. The ETS(MAM) and ETS(MAdM) models
appear to capture trend and seasonality more correctly than the Seasonal Naive
model. However, the ETS(MAdM) model, which includes a damped trend, looks to
be a little more conservative in its future estimates.
The ETS(M, A, M) model provides a balanced projection, maintaining the rising trend
and seasonality seen in pre-Covid data. Based on the current trend, this model
neither significantly underpredicts nor overpredicts until January 2020. The ETS(M,
Ad, M) model begins strongly but eventually plateaus, indicating a more conservative
estimate. Because of its damped trend component, this model may underpredict
RPM if the real-world growth rate remains robust. Finally, the Seasonal Naive model
merely copies the seasonal pattern from the previous year, completely ignoring the
increasing tendency. As a result, it is very likely to underestimate RPM, especially if
the actual RPM continues on its upward trend.
6.0 Evaluate & Compare Forecasting Performance (Appendix 5)
Several observations can be made based on the forecast accuracy metrics provided
to evaluate the performance of the forecasting models on the test set. The Root
Mean Square Error (RMSE) is an important indication of model fit, with lower values
indicating a better model. With an RMSE of 2,062,296.7, the ETS(MAM) model
outperforms the others, followed by the ETS(MAdM) model with an RMSE of
2,939,174.1. With an RMSE of 6,348,579, the Seasonal Naive model lags
significantly. Similarly, in terms of Mean Absolute Error (MAE), the ETS(MAM) model
has the lowest MAE, indicating superior forecast accuracy. The ETS(MAM) model
also has a lower Theil's U statistic, indicating that its projections are more accurate.
The Seasonal Naive model, on the other hand, has the greatest RMSE, MAE, and
Theil's U, indicating that it is the least accurate of the three. When compared to the
ETS(MAdM) and Seasonal Naive models, the ETS(MAM) model appears to produce
the most accurate out-of-sample forecasts.
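For reference, the two main accuracy measures used above are computed as follows. This is a minimal sketch with hypothetical numbers, not the test-set values behind Appendix 5.

```python
import math


def rmse(actual, forecast):
    """Root mean square error: penalises large misses more heavily."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mae(actual, forecast):
    """Mean absolute error: average size of the miss."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical test-set values (not the RPM data), chosen to be easy to check by hand.
actual = [100.0, 110.0, 120.0, 130.0]
forecast = [98.0, 112.0, 117.0, 134.0]

print(mae(actual, forecast))             # 2.75
print(round(rmse(actual, forecast), 3))  # 2.872
```

RMSE is never smaller than MAE, and the gap between them widens when a few forecasts miss badly.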
Time Series Cross Validation

Time series cross-validation is applied with a 12-step-ahead forecast horizon.
Method/Model      Cross-Validation RMSE
Seasonal Naive    4018102
ETS(M,A,M)        1468178.6
ETS(M,Ad,M)       1706924.1
The RMSE (Root Mean Square Error) figures provide key insights in evaluating the
forecasting performance of the three alternative models—Seasonal Naive,
ETS(M,A,M), and ETS(M,Ad,M)—using time series cross-validation with a prediction
horizon of h=12. The Seasonal Naive model, in particular, had the greatest RMSE of
4,018,102, suggesting the worst fit between the anticipated values and the actual
test data. The ETS(M,A,M) model, on the other hand, performed the best, with the
lowest RMSE of 1,468,178.6. The ETS(M,Ad,M) model, which included damping,
had an RMSE of 1,706,924.1, putting it in the middle of the forecast accuracy range.
The ETS(M,A,M) model appears to offer the most accurate forecasts for this specific
dataset and forecast horizon based on these RMSE values. Thus, if one were to
choose a model only on the basis of minimising the RMSE, the ETS(M,A,M) model
would be the best option.
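The idea behind time series cross-validation can be sketched as a rolling forecast origin: the model is refitted on every prefix of the series and scored on the observation 12 steps ahead. The code below is a simplified Python illustration using the seasonal naive method and a synthetic series; it is not the R routine that produced the table above, and the function names, the `min_train` argument and the toy data are assumptions.

```python
import math


def seasonal_naive(history, horizon, period=12):
    """Repeat the last observed value from the same season."""
    last_cycle = history[-period:]
    return [last_cycle[i % period] for i in range(horizon)]


def rolling_origin_rmse(series, forecaster, horizon, min_train):
    """RMSE of h-step-ahead forecasts over every admissible forecast origin."""
    squared_errors = []
    for origin in range(min_train, len(series) - horizon + 1):
        history = series[:origin]
        predicted = forecaster(history, horizon)[-1]   # the h-step-ahead forecast
        actual = series[origin + horizon - 1]
        squared_errors.append((actual - predicted) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Synthetic monthly series: a fixed seasonal pattern plus a trend of 1 per month.
pattern = [0, 2, 5, 3, 8, 9, 12, 11, 6, 4, 1, 7]
trending = [t + pattern[t % 12] for t in range(48)]

print(rolling_origin_rmse(trending, seasonal_naive, horizon=12, min_train=24))  # 12.0
```

On this synthetic series the seasonal naive method is always exactly one year of trend behind, which mirrors why it has the largest cross-validation RMSE on a trending series such as RPM.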
According to the findings of the evaluation, the ETS(M,A,M) model is the most
promising for making future projections. It has the lowest RMSE in both the test-set
assessment and time series cross-validation, indicating that it provides the most
accurate point forecasts for this dataset and for a 12-step-ahead horizon. The
general adaptability of the ETS(M,A,M) model also contributes to its selection. ETS
models are effective in capturing various time series components, such as
seasonality and trend, which appear to be present in this series. They are often
robust across different data patterns and are reasonably simple to interpret and
explain, making them an appropriate choice for both technical and non-technical
stakeholders.
7.0: Implement Forecasts
Because the evaluation in stage 6 selected ETS(M,A,M) as the champion model, we
refit it to the full Pre-Covid dataset. The updated parameter estimates are listed
below.
Smoothing parameters
alpha    beta      gamma     phi
0.352    0.0264    0.0001    0.9777
Initial states
l = 65364573.4781
b = 143164.8325
s = 0.9733  0.9149  0.9893  0.9472  1.1211  1.1668
    1.1089  1.0447  0.9853  1.0293  0.8209  0.8982
The alpha parameter, which controls level smoothing, is 0.352, so the model
adjusts moderately to changes in the level of the series. The trend smoothing
parameter, beta, is 0.0264. This low value shows that the model reacts to trend
changes somewhat slowly. The seasonal smoothing parameter, gamma, is practically
zero at 0.0001. Because it is so small, the seasonal pattern can be assumed to have
remained essentially constant over time. Finally, the damping factor for the trend
component, phi, is 0.9777. A value this close to 1 means that the trend will
likely continue for a while before flattening off.
The graph depicts Revenue Passenger Miles (RPM) forecasted using the
ETS(M,A,M) model from February 2020 to July 2021, spanning the COVID-19
impact period. While the model appears to capture seasonal patterns and the overall
trend, there is a considerable difference between actual and anticipated RPM,
especially in the later months. This disparity could be attributed to the pandemic's
exceptional market conditions, which are difficult to foresee precisely. The widening
prediction intervals indicate increasing uncertainty in the forecasts, emphasising the
importance of exercising caution when depending entirely on the model for
decision-making.
The projected data, indicated by the red line in the plot, generally follows the trend
and seasonality observed in the actual data. It is crucial to note, however, that the
projection overestimates the RPM actually recorded in the "Covid" period. The
forecast intervals are not explicitly depicted in the graphic, but if they were, they
would indicate a range of probable future values around the forecast line,
emphasising the inherent uncertainty in any forecast. Based on the current plot, the
forecast appears to be reasonable in terms of capturing the long-term trend and
seasonal changes, while the model's overestimation during the "Covid" period may
indicate the need for revisions or the inclusion of new factors such as pandemic
indicators.
Despite being automatically chosen by R's `ets()` function and performing well on
pre-Covid data, the ETS(M, A, M) model does not produce reliable projections for the
Covid-19 period. The significantly higher error metrics (ME, RMSE, MAE, and MAPE)
obtained when the forecasts are compared with the Covid-period data show this to
be the case. The model consistently overestimates Revenue Passenger Miles (RPM),
because it cannot take into account the disruption that the pandemic caused to
travel habits. The high values of Theil's U and MASE also point to a poor fit. This
highlights the limits of forecasting unusual events like a worldwide pandemic from
historical data and suggests the need for an alternative modelling strategy that
can account for such exceptional situations.
8.0 Quantifying the Forecasted Loss in Revenue Passenger Miles
Summary
In this analysis, we aimed to understand the trends and patterns in the United States'
Revenue Passenger Miles (RPM) from January 2010 to April 2023. We were
particularly interested in the COVID-19 pandemic's impact on air travel. We
started by dividing the data into a 'Pre-Covid' set and a 'Covid' set, and then split
the 'Pre-Covid' set into training and test sets. Several forecasting methods were
fitted to the 'Pre-Covid' training set, including the Mean Method, Seasonal Naive, the
Drift Method, and two ETS models. Based on multiple accuracy criteria, ETS(M,A,M)
was chosen as the best-performing model. We prepared projections for the "Covid"
period using this model, compared them to actual RPM, and scored forecast
accuracy. According to our calculations, the expected loss in RPM since the start of
the COVID-19 pandemic is roughly 1.035 billion miles. Taking into account the
forecast intervals and the uncertainty inherent in any predictive model, the loss
ranges between 0.984 and 1.086 billion miles (Appendix 6). These numbers highlight
the devastating impact of the pandemic on the airline industry and underline the
critical need for comprehensive solutions.
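The loss calculation amounts to summing the monthly gap between forecast and actual RPM over the Covid period, once for the point forecast and once for each bound of the forecast interval. The sketch below shows the arithmetic with hypothetical numbers in arbitrary units; it does not reproduce the figures in Appendix 6, and the function name is our own. Summing the monthly interval bounds gives a rough envelope rather than a formal interval for the total.

```python
def cumulative_shortfall(forecast, actual):
    """Sum of (forecast - actual) over the evaluation months."""
    return sum(f - a for f, a in zip(forecast, actual))


# Hypothetical monthly values in arbitrary units (not the figures from Appendix 6).
point_forecast = [100.0, 105.0, 110.0]
lower_bound = [95.0, 98.0, 101.0]
upper_bound = [105.0, 112.0, 119.0]
actual = [60.0, 40.0, 55.0]

print(cumulative_shortfall(point_forecast, actual))  # 160.0
print(cumulative_shortfall(lower_bound, actual))     # 139.0
print(cumulative_shortfall(upper_bound, actual))     # 181.0
```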
Policy Recommendations
Given the large drop in Revenue Passenger Miles (RPM) during the COVID-19
pandemic, airlines and related businesses must take a multi-pronged approach to
alleviate the effects and plan for the future. First and foremost, disaster planning is
critical. Companies should have a well-thought-out plan in place that allows them to
respond quickly to changes in travel demand. This planning should be supplemented
with adaptable business models, such as variable pricing strategies or dynamic route
alterations, that can respond quickly to changing market conditions. Given the
predicted loss in RPM and its knock-on effect on the entire economy, the
government's role becomes critical. Financial assistance in the form of bailouts, tax
breaks, or subsidies could be lifelines for the aviation industry. Along with
cost-cutting efforts, airlines must invest in strong health and safety measures.
Strengthening these safeguards is not only a legal necessity, but also a tactic for
re-establishing consumer confidence and, by extension, RPM. Beyond immediate
problems, the sector may want to consider diversifying its revenue streams in order
to become less reliant on passenger travel. Diversification, whether through cargo
services or partnerships with other transportation sectors, could provide a buffer
against future shocks. Finally, data should be central to all of these measures.
Continuous monitoring of RPM and other key performance indicators will allow the
industry to adjust projections and pivot strategy as needed. By incorporating these
recommendations, the aviation industry will be better prepared to deal with not only
the current pandemic, but also future emergencies (Organisation for Economic
Co-operation and Development, 2022).
References:
Fedorik, M. (2021). Impacts of various critical situations on the aviation industry and
parallel with the COVID-19 pandemic.
https://dspace.cuni.cz/handle/20.500.11956/126565
Varma, T. M. (2021). Responsible leadership and reputation management during a
crisis: The cases of Delta and United Airlines.
https://link.springer.com/article/10.1007/s10551-020-04554-w
Song, K. H., Choi, S., & Han, I. H. (2020). Competitiveness Evaluation Methodology
for Aviation Industry Sustainability Using Network DEA.
https://www.mdpi.com/2071-1050/12/24/10323
Organisation for Economic Co-operation and Development. (2022). COVID-19 and
the aviation industry: Impact and policy responses.
https://www.oecd.org/coronavirus/policy-responses/covid-19-and-the-aviation-industry-impact-and-policy-responses-26d521c1/
Disclaimer:
23 pages of report - 10 pages of graph = 13 pages (within the limit)
Appendix:
1) Precovid dataset
2) Covid dataset
3) Training and test dataset
4) Length
5) Accuracy
6) Calculation for stage 8:
