
Welcome back! In the last task, we saw that time series data is basically
observations of an entity over time.
Fault rates in production lines and infection counts
during a pandemic
are both examples of time series data.
A time series can be split into three different
components: trend, seasonality and cyclicity.
Most real-world data exhibits some of these components,
if not all of them.
Let us have a look at the data set we are going to use
in this project.
We will be using the EU stock markets data here.
You can execute this line by pressing control and enter
together or you can use the run button from here.
Now you can click here to have a look at the data in the data
viewer. As you can see, the data set records
four major European stock indices.
Now execute this line to fetch the data description.
The description tells you that the values in each cell are
in fact the closing prices for the respective indices.
There are 1860 observations, recorded over the time span
of 1991 to 1998.
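
A minimal sketch of these steps in R, assuming the data set in question is `EuStockMarkets` from base R's `datasets` package (the name is not spelled out in the narration, but the description of four indices and 1860 closing prices matches it). The viewer and help calls are wrapped in `interactive()` so the snippet also runs as a script.

```r
# Assumption: the lesson uses the built-in EuStockMarkets data set.
data("EuStockMarkets")

dim(EuStockMarkets)        # number of observations and number of indices
colnames(EuStockMarkets)   # names of the stock indices
head(EuStockMarkets)       # first few rows of closing prices

# The data viewer and the description page need an interactive session.
if (interactive()) {
  View(EuStockMarkets)     # opens the data viewer
  help("EuStockMarkets")   # shows the data description
}
```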
For time series analysis
our focus should be on a single entity.
Therefore we can pick only one of the four indices from here.
For illustration purposes,
I will use the first index.
The next thing we do is tell the program to fetch the data
in time series format. Here,
we have created the time series
using only the first column, and then
we have plotted it.
Note that the plot extracts the time information
from the metadata.
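
This step might look like the following sketch, again assuming the built-in `EuStockMarkets` data set. The variable name `first_index` is a hypothetical choice for illustration. Selecting a single column keeps the time attributes, which is why the plot can label its time axis without extra input.

```r
# Assumption: the lesson uses the built-in EuStockMarkets data set.
data("EuStockMarkets")

# Keep only the first column; the result is still a time series object.
first_index <- EuStockMarkets[, 1]

# The time information travels with the object as metadata.
start(first_index)
end(first_index)
frequency(first_index)

# The time axis is drawn from that metadata.
plot(first_index,
     main = colnames(EuStockMarkets)[1],
     ylab = "Closing price")
```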
You can also populate the time information manually like this.
Here, ts
is the time series function.
You need to pass this information to the function.
This is your data vector. The start and end denote
the time of the first and last observations respectively.
Frequency is the number of observations per unit of time, for example per year.
A frequency value of 12 denotes monthly data, one denotes yearly
data, four denotes quarterly data, and so on.
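
As a small sketch of populating the time information manually, the example below builds a monthly and a quarterly series with `ts()`. The numbers and the variable names are made up purely for illustration and do not come from the lesson's data.

```r
# Hypothetical monthly values covering two years (24 observations).
monthly_values <- c(12, 15, 14, 18, 21, 25, 27, 26, 22, 19, 15, 13,
                    13, 16, 15, 19, 23, 26, 29, 27, 23, 20, 16, 14)

# start and end are given as c(year, period within the year).
monthly_ts <- ts(monthly_values,
                 start = c(2020, 1),
                 end = c(2021, 12),
                 frequency = 12)
print(monthly_ts)

# Hypothetical quarterly values; end is inferred from the data length.
quarterly_ts <- ts(c(5, 7, 6, 9, 6, 8, 7, 10),
                   start = c(2020, 1),
                   frequency = 4)
print(quarterly_ts)
```

If `end` is omitted, `ts()` works it out from `start`, `frequency` and the length of the data vector.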
Now, going back to the time series components.
The trend is the overall ascending or descending tendency
observed in the data over a long time horizon,
as you can see in the plot of our data. Seasonality denotes
a particular behavior observed repeatedly over a known, fixed period
of time. For example, higher sales in the festive seasons.
Cyclicity is a distinct pattern of increases and decreases
in the data that is not tied to a fixed period.
Business cycles are an example. There are repetitive
patterns over the long-term horizon, but they do not occur
at fixed intervals or last for a fixed time.
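
To make trend and seasonality concrete, here is a rough sketch that builds a synthetic monthly series as the sum of an upward trend, a repeating 12-month pattern and random noise. Everything in it (the values, the additive form and the variable names) is an assumption for illustration, and it does not model cyclicity.

```r
set.seed(1)                      # make the random noise reproducible

n_years <- 4
month_index <- 1:(12 * n_years)

# Trend: a steady upward drift over the whole horizon.
trend_part <- 0.5 * month_index

# Seasonality: the same 12-month pattern repeated every year.
seasonal_part <- rep(10 * sin(2 * pi * (1:12) / 12), times = n_years)

# Randomness: what is left once trend and seasonality are accounted for.
noise_part <- rnorm(length(month_index))

synthetic <- ts(trend_part + seasonal_part + noise_part,
                start = c(2020, 1),
                frequency = 12)
plot(synthetic, main = "Synthetic series: trend + seasonality + noise")
```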
Now our next step is to conduct an exploratory analysis
of the data to understand the content better.
We have already set our data in time series format.
Now we extract the time series components by decomposing
the data.
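
One way to sketch this decomposition in R is with base R's `decompose()`, which performs a classical decomposition and is additive by default. Whether the lesson uses this exact function is an assumption, as are the data set and the variable names.

```r
# Assumption: the lesson uses the built-in EuStockMarkets data set.
data("EuStockMarkets")
first_index <- EuStockMarkets[, 1]

# Classical decomposition into trend, seasonal and random parts.
components <- decompose(first_index)

# The pieces are stored as separate series in the returned list.
names(components)

# Plotting the result stacks the observed data, trend,
# seasonal pattern and random component in four panels.
plot(components)
```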
You can see four plots stacked up here.
The top row shows the original data, the same as the previous
plot. The second shows the increasing trend from 1991 to
1998. The third is the repetitive seasonal pattern. And the
fourth shows the error, or randomness, in the data. Now we select
which method to use to further analyze this information.
There are multiple ways to conduct a time series analysis.
These three are very popular and robust time series
analysis techniques. In this project, we will cover ARIMA in
detail. To employ ARIMA, our data must meet these two
assumptions: stationarity and univariate data. I will
elaborate on these in detail in the coming sections. Once our
data meets the core assumptions, we find the best fit model
and then use it to forecast future values. I will stop here
for now and continue with the tests for core assumptions in
the next task. See you there.
