Project Forecasting
Contents
INTRODUCTION
EXOGENOUS VARIABLES
TIME SERIES OF DATA
DATA IMPUTATION
DATA SMOOTHING
SAMPLE MEAN AND ACF FUNCTION
DATA TRANSFORMATIONS
Sarath Chandra Tumuluri
Introduction
1) This file contains hourly heat consumption data for Wallace Library, in BTU
(British thermal unit, a traditional unit of heat equal to about 1055 joules).
2) Source of data: Wallace Library heat consumption, provided by Rochester Institute of
Technology.
3) Exogenous variables: outside and inside temperatures, winter ventilation provided by
mechanical and other systems, infiltration resulting from building construction and
usage, and the heat required to raise the temperature of materials that are frequently
brought into the heated space from outdoors.
The data are cyclic or seasonal and non-stationary (no natural mean over time; a trend is
exhibited), with no atypical events.
The graph shows that heat consumption is higher from January to April, which is
understandable, as this is the winter season and outside temperatures drop to their lowest.
Comparing the data for January 2015 with January 2016, to see whether the same month
behaves alike across years, shows that more heat was consumed in 2016 than in 2015. This
can be related to the exogenous variable of outside temperature, which might have been
lower in 2016 and so resulted in more heat consumption per hour (BTU/hr).
Next, the data are subset to the month of January only, to see whether any trend or
seasonality is present. The graph shows that heat consumption is high at the end of
January 2015.
A single week of January 2015 is then taken to explore whether any trend or seasonality is shown.
We can see a sudden jump in heat consumption starting on the 3rd day of the week, that is,
within the first few days of this week of January 2015.
The data provided were checked and are evenly spaced, with no observations missing in
between. To show dates on the X-axis, the as.POSIXct() function is used and the axis
sequence is split into 3-month intervals, as can be seen in the graph above.
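The check and the axis setup described above can be sketched in base R as follows. This is a minimal illustration, not the code used for the report: the timestamps, the timestamp format, the UTC time zone, and the plotting range are all assumptions.

```r
# Hypothetical hourly timestamps standing in for the data's time column.
stamps <- c("2015-01-01 00:00", "2015-01-01 01:00",
            "2015-01-01 02:00", "2015-01-01 03:00")
time <- as.POSIXct(stamps, format = "%Y-%m-%d %H:%M", tz = "UTC")

# Evenly spaced data has a single distinct gap between consecutive timestamps.
gaps <- as.numeric(diff(time), units = "hours")
evenly_spaced <- length(unique(gaps)) == 1

# Axis tick positions every 3 months over an assumed plotting range.
ticks <- seq(as.POSIXct("2015-01-01", tz = "UTC"),
             as.POSIXct("2016-01-01", tz = "UTC"), by = "3 months")
tick_labels <- format(ticks, "%Y-%m")
```

After a plot has been drawn with its default X-axis suppressed, positions such as `ticks` can be passed to `axis.POSIXct()` to label the axis.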
To compare data imputation techniques, observations were intentionally removed from the
series and several imputation methods were applied to them, to see how each method performs
on this particular heat consumption data set. The imputation methods used are Kalman,
interpolation, and moving average.
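A minimal base-R sketch of this remove-and-impute check is given below, covering only interpolation and a moving-average fill (Kalman imputation is not shown). The toy values, the held-out positions, the window size `k`, and the helper names `impute_ma` and `pct_error` are hypothetical and are not taken from the report.

```r
# Toy hourly series standing in for the heat-consumption data (hypothetical values).
heat <- c(120, 125, 131, 140, 152, 160, 158, 149, 141, 133, 126, 121)

# Deliberately remove two known observations.
held_out <- c(4, 9)
truth <- heat[held_out]
heat_na <- heat
heat_na[held_out] <- NA

# Linear interpolation between the nearest observed neighbours.
idx <- seq_along(heat_na)
obs <- !is.na(heat_na)
interp <- approx(idx[obs], heat_na[obs], xout = idx)$y

# Moving-average fill: mean of the observed values within k steps on each side.
impute_ma <- function(x, k = 2) {
  out <- x
  for (i in which(is.na(x))) {
    neighbours <- x[max(1, i - k):min(length(x), i + k)]
    out[i] <- mean(neighbours, na.rm = TRUE)
  }
  out
}
ma <- impute_ma(heat_na)

# Absolute percentage error of each method on the held-out points.
pct_error <- function(est) 100 * abs(est - truth) / truth
errors <- data.frame(index = held_out,
                     interpolation = pct_error(interp[held_out]),
                     moving_average = pct_error(ma[held_out]))
print(errors)
```

Comparing the error columns on the held-out points is one simple way to rank the methods for a given series.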
Kalman Technique:
Interpolation Technique:
Moving Average:
Data smoothing:
The data for the 2nd week of January 2015 were smoothed using a rolling median and a
rolling mean, and the two smoothed series are compared in the graph.
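The difference between the two smoothers can be sketched on a short toy series with one spike. The values and the window width of 3 are assumptions for illustration, and base R's `runmed()` and `stats::filter()` are used here in place of whichever functions produced the report's graph.

```r
# Toy series with a single spike at position 4 (hypothetical values).
x <- c(10, 11, 12, 50, 13, 14, 15)

# Rolling median over a window of 3; the two end points are kept as they are.
roll_median <- as.numeric(runmed(x, k = 3, endrule = "keep"))

# Centred rolling mean over a window of 3; the ends have no full window and are NA.
roll_mean <- as.numeric(stats::filter(x, rep(1 / 3, 3), sides = 2))

# One row per observation for side-by-side comparison.
comparison <- data.frame(x, roll_median, roll_mean)
print(comparison)
```

In this toy case the rolling median drops the spike entirely, while the rolling mean spreads it over the three windows that contain it.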
Exogenous variables
1) Aircraft movement, that is, the number of planes moving in and out of the airport. The
number of passengers carried increases with aircraft movement, but not vice versa,
which makes it a significant exogenous variable for forecasting the number of
passengers carried by the airport in the coming years.
2) Extended airport terminal (hotels, retail, parking)
As terminals are extended, more business takes place near the airport, which
increases the number of passengers carried by the airport each month.
3) Location of the airport
If the area around the airport is developing and generating more business worldwide,
this will affect the number of passengers carried, which makes it an
exogenous variable.
The data are cyclic or seasonal and non-stationary (no natural mean over time; a trend is
exhibited), with no atypical events.
The number of passengers carried generally increases in June and July and declines at the
end of the year.
The increase can be attributed to a special festival that takes place in Bulgaria between
June and July.
Data Imputation
To understand the accuracy of the three imputation methods (mean imputation,
interpolation imputation, and Kalman imputation), two observations were removed and
then imputed with each of the methods.
Interpolation Imputation
10% error recorded
Mean Imputation
Kalman Imputation
Data Smoothing
Moving Average (N=5) for 2006
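R code used:
library(zoo)

# Parse the "%Y-%m" labels in Time as dates; as.Date() needs a day, so "-01" is appended.
strdates <- as.Date(paste0(Passenngerdata$Time, "-01"), "%Y-%m-%d")

# Trailing moving average over 5 periods; fill = NA keeps one value per row.
movingaverage5periods <- rollmean(Passenngerdata$Units, k = 5, fill = NA, align = "right")
View(movingaverage5periods)

# Pair each date with its smoothed value.
Passengerdatess <- data.frame(time = strdates, movingaverage5periods)
View(Passengerdatess)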
A simple moving average of span 5 assigns a weight of 1/5 to each of the five most recent
observations. The smoothed series exhibits less variability and is easier to interpret and
analyse for trend, but it failed to remove the potential outliers.
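As a reference for the weighting described above, a simple moving average of span N can be written as follows. The symbols M_t, y_t, and sigma^2 are notation introduced here rather than taken from the report, and the variance result assumes uncorrelated observations with constant variance, which real passenger data will only approximate.

```latex
M_t = \frac{1}{N} \sum_{i=0}^{N-1} y_{t-i},
\qquad
\operatorname{Var}(M_t) = \frac{\sigma^2}{N}
\quad \text{(uncorrelated } y_t \text{ with common variance } \sigma^2 \text{)}
```

With N = 5, each of the five most recent observations gets weight 1/5, and under that assumption the smoothed series has one fifth of the original variance. A mean weights an outlier like any other observation, so the outlier is spread over N smoothed values rather than removed.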
Data Transformations
Using Differencing to remove Trend and Seasonality:
X(t)=y(t)-y(t-1)
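The formula above is a first difference, which mainly removes trend; removing a seasonal pattern usually calls for an additional difference at the seasonal lag. A minimal base-R sketch follows, in which the toy monthly series and the seasonal lag of 12 are assumptions rather than values from the report.

```r
# Hypothetical monthly series: linear trend plus a 12-month seasonal pattern.
t <- 1:36
y <- 100 + 2 * t + 10 * sin(2 * pi * t / 12)

# First difference X(t) = y(t) - y(t-1): removes the linear trend.
x <- diff(y)

# Seasonal difference of the result at an assumed lag of 12:
# removes the repeating yearly pattern that the first difference leaves behind.
x_seasonal <- diff(x, lag = 12)

# Each differencing step shortens the series by its lag.
c(length(y), length(x), length(x_seasonal))
```

On this noise-free toy series the doubly differenced values are zero up to rounding error, which shows that both components have been removed.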