We have seen that time series data is essentially a record of an entity observed over time. Fault rates on production lines and infection counts during a pandemic are both examples of time series data. A time series can be split into three different components: trend, seasonality and cyclicity. Any real-world data set will exhibit some of these components, if not all of them.

Let us have a look at the data set we are going to use in this project. We will be using the EU stock markets data here. You can execute this line by pressing Control and Enter together, or you can use the Run button from here. Now you can click here to have a look at the data in the data viewer. As you can see, the data set records four major European stock indices. Now execute this line to fetch the data description. The description tells you that the values in each cell are the closing prices of the respective indices. There are 1860 observations, recorded between 1991 and 1998.

For time series analysis, our focus should be on a single entity, so we pick only one of the four indices. For illustration purposes I will use the first index. The next thing we do is tell the program to fetch the data in time series format. Here, we have created the time series with only the information from the first column, and then we have plotted it. Note that the plot extracts the time information from the metadata.

You can also populate the time information manually like this. Here, ts is the time series function. You need to pass this information to the function. This is your data vector. The start and end denote the times of the first and last observations respectively. Frequency is the number of observations per unit of time: a frequency of 12 denotes monthly data, 1 yearly data, 4 quarterly data, and so on.

Now, going back to the time series components. The trend is the overall ascending or descending tendency observed in the data over a long time horizon.
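
The on-screen commands are not part of this transcript, so the following is only a minimal sketch of what the steps above might look like in R. It assumes the data is the built-in EuStockMarkets data set, whose first column is the DAX index; the variable names dax, dax_manual and monthly are illustrative.

```r
# EuStockMarkets ships with base R (package 'datasets').
data("EuStockMarkets")
# ?EuStockMarkets          # opens the data set description in the help pane

# Keep only the first column (DAX) so that we work with a single series.
dax <- EuStockMarkets[, 1]

# The time axis comes from the metadata stored in the ts object.
plot(dax, main = "DAX closing prices", ylab = "Index value")

# The same series with the time information supplied manually.
# start = c(1991, 130) means the 130th observation slot of 1991;
# frequency = 260 means 260 observations (business days) per year.
dax_manual <- ts(as.numeric(dax), start = c(1991, 130), frequency = 260)

# A made-up monthly series, only to illustrate frequency = 12.
monthly <- ts(1:24, start = c(2020, 1), frequency = 12)
```

If end is left out, as here, ts works it out from start, frequency and the length of the data vector.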
You can see this in the plot of our data. Seasonality denotes a particular behavior that repeats over a known, fixed period, for example higher sales in the festive seasons. Cyclicity is a distinct pattern of rises and falls in the data that is not tied to a fixed period, for example business cycles: the patterns repeat over the long term, but they do not occur at fixed intervals or last for a fixed time.

Now our next step is to conduct an exploratory analysis of the data to understand its content better. We have already set our data in time series format. Now we extract the time series components by decomposing the data. You can see four plots stacked up here. The top row shows the original data, the same as the previous plot. The second shows the increasing trend from 1991 to 1998. The third is the repetitive seasonal pattern. And the fourth shows the error, or randomness, in the data.

Now we select which method to use to analyze this information further. There are multiple ways to conduct a time series analysis. These three are very popular and robust time series analysis techniques. In this project, we will cover ARIMA in detail. To employ ARIMA, our data must meet these two assumptions: stationarity and univariate data. I will elaborate on these in the coming sections. Once our data meets the core assumptions, we find the best-fit model and then use it to forecast future values. I will stop here for now and continue with the tests for the core assumptions in the next task. See you there.
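
The decomposition step described above can be sketched as follows. The transcript does not name the function used on screen, so the call to decompose from base R is an assumption (stl would be another option), and the names dax and parts are illustrative.

```r
# First column (DAX) of the built-in EuStockMarkets data set.
dax <- EuStockMarkets[, 1]

# Classical additive decomposition: observed = trend + seasonal + random.
parts <- decompose(dax)

# Four stacked panels: observed, trend, seasonal, random.
plot(parts)

# The components can also be inspected individually.
head(parts$seasonal)
summary(parts$random)
```

The trend is estimated with a moving average, so the first and last few values of the trend and random components are NA.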