Written Report - 890
Course Project
Student name: Bains, Prabhraj Kaur
TIME SERIES
A time series is a dataset whose observations form an ordered sequence, with an explicit or implicit attribute (or attributes) indicating a temporal value; put simply, it is a series of data points ordered in time. In a time series, time is usually the independent variable, and the goal is typically to make a forecast for the future. A time series metric is a piece of data tracked at regular increments of time. For instance, a metric could be how much inventory a store sold from one day to the next. Time series data is everywhere, since time is a constituent of everything that is observable. As our world becomes increasingly instrumented, sensors and systems constantly emit streams of time series data. Examples of time series analysis are discussed in the sections below.
FOURIER ANALYSIS
The Fourier analysis, or harmonic analysis, of a time series is a decomposition of the series into a sum of sinusoidal components (the coefficients of which are the discrete Fourier transform of the series). The term is also used in a wider sense to describe any data analysis procedure that describes or measures the fluctuations in a time series by comparing them with sinusoids. Formally, the Fourier Transform is a method that decomposes a function of space or time into functions of frequency. The Fourier Transform is a great tool for extracting the different seasonality patterns from a single time series variable. For an hourly temperature dataset, for example, the Fourier Transform can detect the presence of day/night variations and summer/winter variations, and it will tell you that those two seasonalities (frequencies) are present in your data.
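As a concrete illustration of this idea (not part of the original report), the sketch below builds a hypothetical hourly temperature series with a built-in 24-hour cycle and uses NumPy's FFT to recover it; the series length, amplitudes, and noise level are all illustrative assumptions.

```python
import numpy as np

# Hypothetical hourly temperature series: 60 days of hourly samples with a
# daily (24-hour) cycle plus noise. All parameters here are assumptions.
rng = np.random.default_rng(0)
hours = np.arange(60 * 24)
temp = 15 + 5 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 0.5, hours.size)

# Real FFT of the de-meaned series; frequencies are in cycles per hour.
spectrum = np.abs(np.fft.rfft(temp - temp.mean()))
freqs = np.fft.rfftfreq(temp.size, d=1.0)  # sample spacing = 1 hour

peak_freq = freqs[np.argmax(spectrum)]
print(peak_freq)       # dominant frequency in cycles per hour
print(1 / peak_freq)   # corresponding period: 24.0 hours for this series
```

The peak of the spectrum lands at the frequency 1/24 cycles per hour, i.e. the daily cycle that was built into the data; a longer series with a yearly component would show a second peak at the annual frequency.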
To calculate the fast Fourier graph, we first calculate the moving average of the series: each point is replaced by the mean of the values in a sliding window around it. After calculating the moving average, we calculate delta f, which is 1 / (total number of months), and use delta f to generate the frequencies as its successive multiples. The next step is to apply Fourier analysis to the moving-averaged data. We then calculate the absolute values of the Fourier coefficients in a separate column. Finally, we plot the absolute values against the frequencies, setting the x-axis interval from 0 to 0.5. These steps are followed for the moving averages of Temperature, Birth, and Death.
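The steps above can be sketched in Python as follows. The data, the 3-month window, and the annual cycle built into the synthetic series are illustrative assumptions; the original report's spreadsheet data is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)
n_months = 256
t = np.arange(n_months)
# Synthetic monthly series with an annual (period-12) cycle plus noise.
data = 20 + 3 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 1, n_months)

# Step 1: moving average (window length is an assumption; 3 months here).
window = 3
ma = np.convolve(data, np.ones(window) / window, mode="valid")

# Step 2: delta f = 1 / (total number of points), frequencies as multiples.
delta_f = 1 / ma.size
freqs = np.arange(ma.size) * delta_f

# Steps 3-4: Fourier analysis of the smoothed series, then absolute values.
fft_abs = np.abs(np.fft.fft(ma - ma.mean()))

# Step 5: restrict to the x-axis interval 0 to 0.5 (the half-spectrum).
mask = freqs <= 0.5
peak = freqs[mask][np.argmax(fft_abs[mask])]
print(peak)  # close to 1/12, i.e. about 0.083 cycles per month
```

Only the interval 0 to 0.5 is plotted because for real-valued data the spectrum above frequency 0.5 mirrors the half below it.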
[Figure: FFT(Temp), absolute FFT values (roughly 18000 to 23000) plotted against frequency from 0 to 0.5]
Insights from Graph
The graph reaches its highest peak at the point (0.0820313, 22789.0324) within the x-axis interval 0 to 0.5.
[Figure: FFT(Birth), absolute FFT values (roughly 0 to 60) plotted against frequency from 0 to 0.5]
Insights from Graph
The highest peak of the graph is at the point (0.0820313, 54.8825175) within the x-axis interval 0 to 0.5.
[Figure: FFT(Death), absolute FFT values (roughly 0 to 70) plotted against frequency from 0 to 0.5]
Insights from Graph
The highest peak of the graph is at the point (0.0820313, 59.03373405) within the x-axis interval 0 to 0.5.
Interesting Insight: every graph has its highest peak at the x-coordinate 0.0820313. Assuming the series are monthly, this frequency corresponds to a period of 1/0.0820313, roughly 12.2 months, so the dominant pattern shared by all three series is consistent with an annual seasonal cycle.
MULTIPLE REGRESSION
Multiple regression is a powerful statistical technique used to analyze the relationship between
a dependent variable and multiple independent variables. It expands upon the concept of
simple linear regression, which focuses on the relationship between a dependent variable and a
single independent variable.
In multiple regression, the goal is to understand how a set of independent variables, when
considered together, influences the dependent variable. It enables researchers to explore
complex relationships and determine the relative contributions of each independent variable in
explaining the variability in the dependent variable.
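The idea can be sketched with ordinary least squares in NumPy. The synthetic x1 and x2 below are illustrative assumptions (the report's actual predictor data is not reproduced); the true coefficients are chosen to echo the fitted values in the output that follows.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 256
# Hypothetical predictors; ranges are assumptions for illustration only.
x1 = rng.uniform(0, 50, n)
x2 = rng.uniform(0, 1000, n)
# Response generated from an assumed true model plus noise.
y = 1040.5 + 23.44 * x1 + 1.61 * x2 + rng.normal(0, 326, n)

# Design matrix with a leading column of ones for the intercept.
X = np.column_stack([np.ones(n), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# R-squared: the share of variance in y explained by the fitted model.
resid = y - X @ coef
r2 = 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))
print(coef)  # [intercept, coefficient of x1, coefficient of x2]
print(r2)
```

The recovered coefficients land close to the assumed true values, and R-squared reports how much of the variability in the dependent variable the two predictors jointly explain, which is exactly what the regression output below summarizes for the report's data.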
Regression Statistics
Multiple R 0.727765196
R Square 0.52964218
Adjusted R Square 0.525923937
Standard Error 326.1778955
Observations 256
ANOVA
             df    SS            MS            F            Significance F
Regression   2     30309848.81   15154924.41   142.444184   3.64912E-42
Residual     253   26917180.94   106392.0195
Total        255   57227029.75

             Coefficients   Standard Error   t Stat        P-value       Lower 95%     Upper 95%
Intercept    1040.548682    161.0396158      6.461445378   5.28885E-10   723.3997092   1357.697654
𝑥1           23.44406306    2.118239371      11.06771188   1.76675E-23   19.27243463   27.61569149
𝑥2           1.610667203    0.105095589      15.32573554   6.02872E-38   1.403693548   1.817640859
For 𝑥1, the p-value is 1.76675E-23, which is much smaller than the significance level of 0.05. Therefore, we can reject the null hypothesis for 𝑥1 and conclude that there is a statistically significant relationship between 𝑥1 and the dependent variable.
For 𝑥2, the p-value is 6.02872E-38, which is also far smaller than 0.05. Hence, we can reject the null hypothesis for 𝑥2 and conclude that there is a statistically significant relationship between 𝑥2 and the dependent variable.
In both cases, the p-values are much smaller than the significance level, providing strong evidence against the null hypotheses. Based on this output, we can reject the null hypothesis for both 𝑥1 and 𝑥2, indicating that both have significant relationships with the dependent variable.