
Lesson 3_Unit 1:

Summary Statistics
Learning Outcomes

■ The learner will be able to estimate the data.


■ The learner will be able to analyze the data
Data Sets and Data Analysis
A Scenario –
A company's sales are on a growing trend. It wants to forecast its sales for the next five
years, so that it can see the sustained impact of COVID-19 on the business.
Identify the data set in the above statement that needs to be worked upon.
■ It is SALES, which is quantitative data and the ONLY VARIABLE
that needs to be worked upon. Estimation and analysis of a single variable is called
UNIVARIATE analysis.
■ Now suppose, as stated in the above scenario, that sales are to be
forecast and that they are determined by several other factors. Understanding
which factors impact sales is Bivariate or Multivariate Analysis.
■ Please see the link below for an instance of how forecasts are done.
https://www.business-standard.com/article/companies/covid-19-impact-yamaha-expects-india-sales-to-be-10-year-low-in-2020-120092000591_1.html
Univariate Analysis
■ For a single data set or a single time series, the most common descriptive statistics are the
mean, the standard deviation, and the variance.
Consider the stock prices given below:
Date Open Price High Price Low Price Close Price
29-January-2021 39.5 39.8 37.4 37.7
28-January-2021 39 39.5 38.4 38.9
27-January-2021 41.1 41.1 39.25 39.4
25-January-2021 40.75 41.95 39.7 40.8
22-January-2021 42.95 43 40.05 40.95
21-January-2021 44.6 44.75 42.1 42.4
20-January-2021 45.1 45.2 43.95 44.05
19-January-2021 43.35 44.1 41.8 43.75
18-January-2021 45.15 45.15 41.85 42.35
15-January-2021 44.5 45.7 43.9 44.45
14-January-2021 44.85 44.9 43.8 44.25
13-January-2021 44.25 44.75 43.35 44.4
12-January-2021 43.3 45.15 43.2 43.45
11-January-2021 44.9 44.9 42.8 43.3
8-January-2021 45.75 45.75 43.65 44.3
7-January-2021 43.2 45.6 43 44.95
6-January-2021 42.15 43.3 41.75 42.4
5-January-2021 42.85 42.95 41.85 42.15
4-January-2021 40.25 43.3 40.15 42.85
1-January-2021 40.9 40.9 39.7 39.9
Source: BSE India, Stock Prices of Tata
Mean

■ Using $S_i$ to denote the close price on day i and $\bar{S}$ to denote the mean of the stock prices, the mean is
calculated as

$$\bar{S} = \frac{1}{n}\sum_{i=1}^{n} S_i$$

■ Thus, the average close price of the stock of Tata Steel is (S1 + S2 + S3 + S4 + S5 + … + S20) / 20
■ Mean = 42.3 ₹
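The arithmetic can be checked with a short Python sketch. The variable name `close_prices` is an assumption made for illustration; the values are the Close Price column of the table, listed from 29-January-2021 down to 1-January-2021.

```python
# Close prices of Tata Steel, copied from the Close Price column of the table.
close_prices = [
    37.7, 38.9, 39.4, 40.8, 40.95, 42.4, 44.05, 43.75, 42.35, 44.45,
    44.25, 44.4, 43.45, 43.3, 44.3, 44.95, 42.4, 42.15, 42.85, 39.9,
]

n = len(close_prices)                # number of observations
mean_price = sum(close_prices) / n   # (S1 + S2 + ... + Sn) / n

print(f"n = {n}, mean close price = {mean_price:.1f}")
```

The sum of the 20 close prices is 846.7, so the mean is 846.7 / 20 = 42.335, which the slides round to 42.3.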
Median

■ The mean should not be confused with the median.
■ Median is the middle observation.
■ Both the mean and median are designed to provide a numerical measure of the center of the data.
■ So, Median = 42.6 ₹
■ It is valuable to measure the spread of the data, i.e. we want a numerical measure
indicating if the data are tightly bunched together or spread across a wide range.
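A minimal sketch of how the median of the close prices can be found by sorting. The names `close_prices` and `median_price` are illustrative assumptions. With an even number of observations, the median is taken as the average of the two middle values.

```python
import statistics

# Close prices of Tata Steel, copied from the Close Price column of the table.
close_prices = [
    37.7, 38.9, 39.4, 40.8, 40.95, 42.4, 44.05, 43.75, 42.35, 44.45,
    44.25, 44.4, 43.45, 43.3, 44.3, 44.95, 42.4, 42.15, 42.85, 39.9,
]

ordered = sorted(close_prices)
n = len(ordered)

if n % 2 == 1:
    # Odd count: the single middle observation.
    median_price = ordered[n // 2]
else:
    # Even count: average of the two middle observations.
    median_price = (ordered[n // 2 - 1] + ordered[n // 2]) / 2

print(f"median close price = {median_price:.3f}")
# The standard library gives the same answer.
print(statistics.median(close_prices))
```

Here the two middle values are 42.4 and 42.85, so the median is 42.625, reported on the slide as 42.6.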
Mean and Median

■ For symmetric distributions, mean = median


■ For skewed distributions, mean is drawn in direction of
longer tail, relative to median
■ Mean valid for interval scales, median for interval or ordinal
scales
■ Mean sensitive to “outliers” (median often preferred for
highly skewed distributions)
■ When distribution symmetric or mildly skewed or discrete
with few values, mean preferred because uses numerical
values of observations
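The sensitivity of the mean to outliers can be illustrated with a small sketch. The two price lists below are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
import statistics

# Hypothetical symmetric data: mean and median coincide.
prices = [40, 41, 42, 43, 44]
# Same data with the last value replaced by an outlier.
prices_with_outlier = [40, 41, 42, 43, 144]

mean_before = statistics.mean(prices)
median_before = statistics.median(prices)
mean_after = statistics.mean(prices_with_outlier)
median_after = statistics.median(prices_with_outlier)

print("without outlier: mean =", mean_before, "median =", median_before)
print("with outlier:    mean =", mean_after, "median =", median_after)
```

The single outlier pulls the mean from 42 to 62, in the direction of the long tail, while the median stays at 42.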
Deviations

■ To develop a measure of spread, we can calculate how far each day's stock price
is from the mean stock price.
■ Each $S_i$ is subtracted from the mean $\bar{S}$ to give the ith deviation from the
mean.
■ Deviation from mean will be ($\bar{S} - S_i$). Some texts use ($S_i - \bar{S}$) instead, which only flips the sign.
■ The sum of the deviations will always be equal to zero, because the positive and negative deviations cancel out.
■ Hence, either squared deviations or (occasionally) absolute deviations are
taken.
Deviations

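■ The mean of the absolute deviations is denoted by MAD:

$$\mathrm{MAD} = \frac{1}{n}\sum_{i=1}^{n} \left| \bar{S} - S_i \right|$$

■ The mean of squared deviations is denoted by MSD:

$$\mathrm{MSD} = \frac{1}{n}\sum_{i=1}^{n} \left( \bar{S} - S_i \right)^2$$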
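A small sketch of why raw deviations cannot be averaged directly, and how MAD and MSD get around this. The three values are hypothetical, and deviations are taken as (mean minus observation), the sign convention used in these slides.

```python
# Hypothetical observations, chosen so the arithmetic is exact.
values = [2, 4, 9]

n = len(values)
mean = sum(values) / n                      # 5.0

deviations = [mean - v for v in values]     # [3.0, 1.0, -4.0]
sum_of_deviations = sum(deviations)         # positives and negatives cancel

mad = sum(abs(d) for d in deviations) / n   # mean absolute deviation
msd = sum(d ** 2 for d in deviations) / n   # mean squared deviation

print("deviations:", deviations)
print("sum of deviations:", sum_of_deviations)
print("MAD:", mad, "MSD:", msd)
```

The deviations 3, 1 and -4 sum to zero, while the absolute deviations sum to 8 (MAD = 8/3) and the squared deviations sum to 26 (MSD = 26/3).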


Variance

■ It is defined as the sum of squared deviations divided by one less than the total
number of observations:

$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left( \bar{S} - S_i \right)^2$$

■ Variance in the stock prices = 4.3 ₹²


■ Note that MSD divides by the 20 observations (n), whereas Variance divides by (n-1), i.e. 19.
■ Here, n-1 is known as the degrees of freedom.
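The difference between dividing by n (MSD) and by n-1 (variance) can be sketched on the close prices. The variable names are illustrative assumptions. `statistics.pvariance` and `statistics.variance` from the Python standard library divide by n and n-1 respectively, so they serve as a cross-check.

```python
import statistics

# Close prices of Tata Steel, copied from the Close Price column of the table.
close_prices = [
    37.7, 38.9, 39.4, 40.8, 40.95, 42.4, 44.05, 43.75, 42.35, 44.45,
    44.25, 44.4, 43.45, 43.3, 44.3, 44.95, 42.4, 42.15, 42.85, 39.9,
]

n = len(close_prices)
mean_price = sum(close_prices) / n

# Sum of squared deviations from the mean.
sum_sq = sum((mean_price - s) ** 2 for s in close_prices)

msd = sum_sq / n             # divides by n = 20
variance = sum_sq / (n - 1)  # divides by n - 1 = 19 degrees of freedom
std_dev = variance ** 0.5    # back in the units of the data

print(f"sum of squared deviations = {sum_sq:.4f}")
print(f"MSD = {msd:.3f}, variance = {variance:.3f}, S.D. = {std_dev:.3f}")
```

The sum of squared deviations is 82.6305, which gives an MSD of about 4.13, a variance of about 4.35 and a standard deviation of about 2.085.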
Standard Deviation

■ The standard deviation is the square root of the variance. It indicates the typical deviation of the
observations from the mean, in the same units as the data.
Standard Deviation

1. Larger S.D. = greater amounts of variation around the mean.

For example, two distributions with the same mean:

Left: mean Y = 25, S.D. = 3 (values shown: 19, 25, 31)
Right: mean Y = 25, S.D. = 6 (values shown: 13, 25, 37)
2. S.D. = 0 only when all values are the same (only when you have a constant and not a
“variable”)
3. If you were to “rescale” a variable, the S.D. would change by the same factor: if we
changed units above so the mean equaled 250, the S.D. on the left would be 30, and on
the right, 60
4. Like the mean, the S.D. will be inflated by an outlier case value.
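Properties 1 to 3 can be checked with a short sketch. The data sets `narrow`, `wide` and `constant` are hypothetical and are not the data behind the slide's figures. They were chosen to share a mean of 25 and to have simple standard deviations.

```python
import statistics

narrow = [24, 25, 26]     # hypothetical: mean 25, little variation
wide = [20, 25, 30]       # hypothetical: mean 25, more variation
constant = [25, 25, 25]   # no variation at all

sd_narrow = statistics.stdev(narrow)      # sample S.D. (divides by n - 1)
sd_wide = statistics.stdev(wide)
sd_constant = statistics.stdev(constant)

# Rescale by a factor of 10, so the mean becomes 250.
sd_narrow_scaled = statistics.stdev([10 * v for v in narrow])
sd_wide_scaled = statistics.stdev([10 * v for v in wide])

print("S.D. narrow:", sd_narrow, "wide:", sd_wide, "constant:", sd_constant)
print("after rescaling by 10:", sd_narrow_scaled, sd_wide_scaled)
```

The wider data set has the larger S.D. (5 against 1), the constant has S.D. 0, and multiplying every value by 10 multiplies each S.D. by 10 as well.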
Review

■ Mean = 42.3 ₹
■ Median = 42.6 ₹
■ Mean Absolute Deviation = (Column 5 sum) / 20 = 33.1 / 20 = 1.65 ₹
■ MSD = (Column 6 sum) / 20 = 82.6 / 20 = 4.13 ₹²
■ Variance (S²) = (Column 6 sum) / 19 = 82.6 / 19 = 4.35 ₹²
■ Standard Deviation = Sqrt(S²) = 2.09 ₹

Date             Close Price  Mean S  Deviation  Absolute Deviation  Squared Deviation
(1)              (2)          (3)     (4)        (5)                 (6)
29-January-2021  37.7         42.34    4.6       4.6                 21.5
28-January-2021  38.9         42.34    3.4       3.4                 11.8
27-January-2021  39.4         42.34    2.9       2.9                  8.6
25-January-2021  40.8         42.34    1.5       1.5                  2.4
22-January-2021  40.95        42.34    1.4       1.4                  1.9
21-January-2021  42.4         42.34   -0.1       0.1                  0.0
20-January-2021  44.05        42.34   -1.7       1.7                  2.9
19-January-2021  43.75        42.34   -1.4       1.4                  2.0
18-January-2021  42.35        42.34    0.0       0.0                  0.0
15-January-2021  44.45        42.34   -2.1       2.1                  4.5
14-January-2021  44.25        42.34   -1.9       1.9                  3.7
13-January-2021  44.4         42.34   -2.1       2.1                  4.3
12-January-2021  43.45        42.34   -1.1       1.1                  1.2
11-January-2021  43.3         42.34   -1.0       1.0                  0.9
8-January-2021   44.3         42.34   -2.0       2.0                  3.9
7-January-2021   44.95        42.34   -2.6       2.6                  6.8
6-January-2021   42.4         42.34   -0.1       0.1                  0.0
5-January-2021   42.15        42.34    0.2       0.2                  0.0
4-January-2021   42.85        42.34   -0.5       0.5                  0.3
1-January-2021   39.9         42.34    2.4       2.4                  5.9
Total                                  0.0       33.1                82.6
Practical Application of S.D.

■ For many data sets the following useful rules of thumb hold:
– Approximately two-thirds of the observations lie within 1 S.D. of the mean, and
– Approximately 95% of the observations lie within 2 S.D.
■ For instance, the standard deviation can be used to quantify risk, as indicated in the calculation of the Beta for a stock.

When using standard deviation to measure risk in the stock market, the underlying
assumption is that the majority of price activity follows the pattern of a normal distribution.
In a normal distribution, individual values fall within one standard deviation of the mean,
above or below, 68% of the time. Values are within two standard deviations 95% of the time.
For example, in a stock with a mean price of $45 and a standard deviation of $5, it can
be assumed with 95% certainty the next closing price remains between $35 and $55. However,
price plummets or spikes outside of this range 5% of the time. A stock with high
volatility generally has a high standard deviation, while the deviation of a
stable blue-chip stock is usually fairly low.

Source: Investopedia
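The rules of thumb can be sketched in Python: first the two-S.D. band for the quoted $45 and $5 example, then a count of how many Tata Steel close prices fall within 1 and 2 S.D. of their own mean. The helper name `two_sd_band` is a hypothetical name introduced here, and 20 observations are too few to say whether the prices are normally distributed.

```python
import statistics

def two_sd_band(mean, sd):
    """Return the (low, high) range lying within two S.D. of the mean."""
    return mean - 2 * sd, mean + 2 * sd

low, high = two_sd_band(45, 5)   # the mean and S.D. from the quoted example
print(f"two-S.D. band: {low} to {high}")

# Close prices of Tata Steel, copied from the Close Price column of the table.
close_prices = [
    37.7, 38.9, 39.4, 40.8, 40.95, 42.4, 44.05, 43.75, 42.35, 44.45,
    44.25, 44.4, 43.45, 43.3, 44.3, 44.95, 42.4, 42.15, 42.85, 39.9,
]

mean_price = statistics.mean(close_prices)
sd = statistics.stdev(close_prices)

within_1 = sum(1 for s in close_prices if abs(s - mean_price) <= sd)
within_2 = sum(1 for s in close_prices if abs(s - mean_price) <= 2 * sd)

print(f"within 1 S.D.: {within_1} of {len(close_prices)}")
print(f"within 2 S.D.: {within_2} of {len(close_prices)}")
```

For this sample, 14 of the 20 close prices (70%) lie within 1 S.D. of the mean and 19 of 20 (95%) lie within 2 S.D., which is close to the rules of thumb.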
Bi-Variate and Multivariate Analysis

A Case -
Anuj is a retail trader who wishes to invest in
the stocks of Titanium Ltd. As he is not a naïve investor,
he understands how the value of the share price is
determined. Therefore, he can decide by looking at the
trend of the prices, or he can look for the factors which
cause the stock prices to change. What do you think
he will do to arrive at a decision?
Contd..

■ Factors that can affect stock prices:


– Fundamental Factors
– Technical Factors
– Demand and supply forces
Bivariate Analysis
■ It shows the relationship between two variables.
■ There can be a positive or negative
relationship between the variables.
Bivariate Analysis: Co-Variance

■ It indicates how two variables “co-vary”.


■ It can be defined as:

$$\mathrm{Cov}_{XY} = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})$$
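A minimal sketch of the sample covariance, dividing the sum of cross-products of deviations by n-1, the same divisor these slides use for the variance. The function name `covariance` and the small data sets are hypothetical.

```python
def covariance(x, y):
    """Sample covariance: sum of cross-products of deviations over n - 1."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross_products = [(xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)]
    return sum(cross_products) / (n - 1)

# Hypothetical data: y_up rises with x, y_down falls as x rises.
x = [1, 2, 3, 4]
y_up = [2, 4, 6, 8]
y_down = [8, 6, 4, 2]

cov_up = covariance(x, y_up)
cov_down = covariance(x, y_down)

print("covariance when the variables move together:", cov_up)
print("covariance when they move in opposite directions:", cov_down)
```

The sign of the result carries the direction of the relationship: positive when the two variables move together, negative when they move in opposite directions.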
Practical application of Co-Variance

■ It measures the directional relationship between two asset returns.


■ For example, if the variable is stock returns, co-variance can indicate how two
stocks might perform relative to each other in the future.
■ The investor can identify which stocks complement each other and which can
substitute for each other.
■ Co-variance supports portfolio management by helping to reduce the risk and
increase the overall return of the portfolio.
Practical application of Co-Variance

■ Step 1: Collect the Stock Returns of the two stocks

Date             Close Price of Tata Steel  SR_TS   Close Price of Birla Corp  SR_Bco
(1)              (2)                        (3)     (4)                        (5)
29-January-2021  37.7                       n/a     718.3                      n/a
28-January-2021  38.9                        0.032  711.3                      -0.010
27-January-2021  39.4                        0.013  716.5                       0.007
25-January-2021  40.8                        0.036  720.85                      0.006
22-January-2021  40.95                       0.004  720.1                      -0.001
21-January-2021  42.4                        0.035  721.85                      0.002
20-January-2021  44.05                       0.039  716.75                     -0.007
19-January-2021  43.75                      -0.007  716.35                     -0.001
18-January-2021  42.35                      -0.032  707.1                      -0.013
15-January-2021  44.45                       0.050  722.9                       0.022
14-January-2021  44.25                      -0.004  723.35                      0.001
13-January-2021  44.4                        0.003  732.85                      0.013
12-January-2021  43.45                      -0.021  741.95                      0.012
11-January-2021  43.3                       -0.003  744.65                      0.004
8-January-2021   44.3                        0.023  754.5                       0.013
7-January-2021   44.95                       0.015  743.35                     -0.015
6-January-2021   42.4                       -0.057  743.7                       0.000
5-January-2021   42.15                      -0.006  731.75                     -0.016
4-January-2021   42.85                       0.017  736.3                       0.006
1-January-2021   39.9                       -0.069  728                        -0.011
Co-variance in the stock returns

■ Step 2: Mean and Deviations from mean are calculated (Column 6-9).
■ Step 3: Column 8 * Column 9
■ Step 4: Covariance = (Sum of Column 10) / (n-1) = -351.33 / 19 = -18.49
■ The first row has no return, so its Column 8 and Column 9 entries are deviations of the price levels, and their product (-351.33) dominates the sum. Over the 19 rows that do have returns, the Column 10 sum is about 0.00208, which gives a covariance of 0.00208 / 18, or about 0.000116.

Date             Close TS  SR_TS   Close Bco  SR_Bco  Mean of TS  Mean of Bco  Deviations TS  Deviations Bco  8*9
(1)              (2)       (3)     (4)        (5)     (6)         (7)          (8)            (9)             (10)
29-January-2021  37.7      n/a     718.3      n/a     0.003       727.6        -37.69653       9.32           -351.331655
28-January-2021  38.9       0.032  711.3      -0.010  0.0035      0.001        -0.0283597      0.01           -0.000297933
27-January-2021  39.4       0.013  716.5       0.007  0.003470    0.00076026   -0.009383      -0.01            6.14613E-05
25-January-2021  40.8       0.036  720.85      0.006  0.003470    0.00076026   -0.0320625     -0.01            0.000170281
22-January-2021  40.95      0.004  720.1      -0.001  0.003470    0.00076026   -0.000206       0.00           -3.70901E-07
21-January-2021  42.4       0.035  721.85      0.002  0.003470    0.00076026   -0.0319385      0.00            5.33361E-05
20-January-2021  44.05      0.039  716.75     -0.007  0.003470    0.00076026   -0.0354446      0.01           -0.00027737
19-January-2021  43.75     -0.007  716.35     -0.001  0.003470    0.00076026    0.01028094     0.00            1.35537E-05
18-January-2021  42.35     -0.032  707.1      -0.013  0.003470    0.00076026    0.03547049     0.01            0.000484986
15-January-2021  44.45      0.050  722.9       0.022  0.003470    0.00076026   -0.0461163     -0.02            0.000995398
14-January-2021  44.25     -0.004  723.35      0.001  0.003470    0.00076026    0.00796993     0.00            1.09799E-06
13-January-2021  44.4       0.003  732.85      0.013  0.003470    0.00076026    8.0664E-05    -0.01           -9.98056E-07
12-January-2021  43.45     -0.021  741.95      0.012  0.003470    0.00076026    0.02486689    -0.01           -0.000289874
11-January-2021  43.3      -0.003  744.65      0.004  0.003470    0.00076026    0.00692274     0.00           -1.99292E-05
8-January-2021   44.3       0.023  754.5       0.013  0.003470    0.00076026   -0.0196242     -0.01            0.000244663
7-January-2021   44.95      0.015  743.35     -0.015  0.003470    0.00076026   -0.0112022      0.02           -0.000174063
6-January-2021   42.4      -0.057  743.7       0.000  0.003470    0.00076026    0.06020019     0.00            1.7423E-05
5-January-2021   42.15     -0.006  731.75     -0.016  0.003470    0.00076026    0.00936672     0.02            0.000157628
4-January-2021   42.85      0.017  736.3       0.006  0.003470    0.00076026   -0.0131369     -0.01            7.16972E-05
1-January-2021   39.9      -0.069  728        -0.011  0.003470    0.00076026    0.0723153      0.01            0.000870158
Total                                                                                                          -351.3295738
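The Column 10 total is dominated by the first row, where no return exists and price-level deviations were multiplied instead (-37.69653 × 9.32 ≈ -351.33). As a hedged cross-check, the sketch below sums only the Column 10 products of the 19 rows that have a return in both series and divides by n-1 = 18. The variable names are assumptions; the values are copied from Column 10 of the table.

```python
# Column 10 products (deviation of TS return * deviation of Bco return)
# for the 19 rows that have a return, 28-January-2021 down to 1-January-2021.
products = [
    -0.000297933, 6.14613e-05, 0.000170281, -3.70901e-07, 5.33361e-05,
    -0.00027737, 1.35537e-05, 0.000484986, 0.000995398, 1.09799e-06,
    -9.98056e-07, -0.000289874, -1.99292e-05, 0.000244663, -0.000174063,
    1.7423e-05, 0.000157628, 7.16972e-05, 0.000870158,
]

n_returns = len(products)                    # 19 paired returns
total = sum(products)                        # sum of cross-products
cov_returns = total / (n_returns - 1)        # sample covariance of the returns

print(f"paired returns: {n_returns}")
print(f"sum of products: {total:.6f}")
print(f"covariance of returns: {cov_returns:.6f}")
```

On these figures the sum of products is about 0.00208 and the covariance of the daily returns is about 0.000116: small and positive, so the two return series tended to move in the same direction over this period.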
Tasks

■ 1. Find out the Covariance in the stock prices of the two companies and interpret the same.
■ 2. Find out the covariances in Stock Returns of ONGC and Tata Steel and interpret the same.

Date             Close Price of Tata Steel  Close Price ONGC
(1)              (2)                        (3)
29-January-2021  37.7                       88.4
28-January-2021  38.9                       90.7
27-January-2021  39.4                       89.65
25-January-2021  40.8                       91.3
22-January-2021  40.95                      92.8
21-January-2021  42.4                       94.9
20-January-2021  44.05                      98.85
19-January-2021  43.75                      98.1
18-January-2021  42.35                      96.7
15-January-2021  44.45                      101.35
14-January-2021  44.25                      105
13-January-2021  44.4                       105.2
12-January-2021  43.45                      103.45
11-January-2021  43.3                       102.55
8-January-2021   44.3                       100.65
7-January-2021   44.95                      97.9
6-January-2021   42.4                       96.95
5-January-2021   42.15                      94.95
4-January-2021   42.85                      96.95
1-January-2021   39.9                       93.2
Correlation

■ It is difficult to interpret the units of Co-variance.


■ Hence, it is useful to compute the Correlation Coefficient.
■ The correlation coefficient, denoted by r, is a special covariance measure
that takes care of the scale problem.
■ If the covariance (Cov xy) is divided by the two standard deviations (Sx and
Sy), then the units in the numerator and the denominator cancel out, leaving
a dimensionless number, which is the correlation coefficient between X and Y.
■ This is written as:

$$r_{XY} = \frac{\mathrm{Cov}_{XY}}{S_X S_Y}$$
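The scale problem, and how dividing by the two standard deviations removes it, can be sketched as follows. The function names and the data are hypothetical. Covariance and standard deviations all use the n-1 divisor.

```python
import math

def covariance(x, y):
    """Sample covariance with the n - 1 divisor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def correlation(x, y):
    """r = Cov(x, y) / (Sx * Sy); the standard deviation is sqrt(Cov(v, v))."""
    s_x = math.sqrt(covariance(x, x))
    s_y = math.sqrt(covariance(y, y))
    return covariance(x, y) / (s_x * s_y)

# Hypothetical data.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
y_rescaled = [100 * v for v in y]   # same variable in different units

cov_xy = covariance(x, y)
cov_xy_rescaled = covariance(x, y_rescaled)
r_xy = correlation(x, y)
r_xy_rescaled = correlation(x, y_rescaled)

print("covariance:", cov_xy, "after rescaling:", cov_xy_rescaled)
print("correlation:", r_xy, "after rescaling:", r_xy_rescaled)
```

Changing the units of y multiplies the covariance by 100 (from 2 to 200), while r stays at 0.8, because the units cancel between numerator and denominator.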
Correlation

■ The effect of scaling (dividing COVxy by Sx and Sy) is to restrict the range of rXY to the interval -1 to +1.
■ For the data in the previous case,
■ S_TS = SQRT(∑Col.8 / (n-1)) = SQRT(82.6305 / 19) = 2.085
■ S_BCo = SQRT(∑Col.9 / (n-1)) = SQRT(3163.087 / 19) = 12.90
■ r = COV / (S_TS × S_BCo) = -18.49 / (2.085 × 12.90) = -0.6872
■ Note that Col. 8 and Col. 9 are squared deviations of the closing prices, while -18.49 comes from the table of returns and is dominated by its first (price-level) row. A consistent r takes the covariance and both standard deviations from the same series, either prices or returns.

Columns 3 to 7 (returns, means and deviations) are the same as in the covariance table and are not repeated here.

Date             Close Price of Tata Steel  Close Price of Birla Corp  Squared Deviations TS  Squared Deviations Bco
(1)              (2)                        (4)                        (8)                    (9)
29-January-2021  37.7                       718.3                      21.483225              86.8624
28-January-2021  38.9                       711.3                      11.799225              266.3424
27-January-2021  39.4                       716.5                      8.614225               123.6544
25-January-2021  40.8                       720.85                     2.356225               45.8329
22-January-2021  40.95                      720.1                      1.918225               56.5504
21-January-2021  42.4                       721.85                     0.004225               33.2929
20-January-2021  44.05                      716.75                     2.941225               118.1569
19-January-2021  43.75                      716.35                     2.002225               127.0129
18-January-2021  42.35                      707.1                      0.000225               421.0704
15-January-2021  44.45                      722.9                      4.473225               22.2784
14-January-2021  44.25                      723.35                     3.667225               18.2329
13-January-2021  44.4                       732.85                     4.264225               27.3529
12-January-2021  43.45                      741.95                     1.243225               205.3489
11-January-2021  43.3                       744.65                     0.931225               290.0209
8-January-2021   44.3                       754.5                      3.861225               722.5344
7-January-2021   44.95                      743.35                     6.838225               247.4329
6-January-2021   42.4                       743.7                      0.004225               258.5664
5-January-2021   42.15                      731.75                     0.034225               17.0569
4-January-2021   42.85                      736.3                      0.265225               75.3424
1-January-2021   39.9                       728                        5.929225               0.1444
Total                                                                  82.6305                3163.087
Task

■ Find out the r in Stock Returns of ONGC and Tata Steel and interpret the same, using the closing prices of Tata Steel and ONGC listed under Tasks.
Auto Covariance and
Auto-Correlation
■ The covariance and correlation coefficient are summary statistics that
measure the extent of linear relationship between two variables.
■ They can be used to identify explanatory relationships.
■ Autocorrelation and Autocovariance are the measures that serve the same
purpose for a single series.
■ For Example, if we compare Yt (observation at time t) with Yt-1, then we see
how consecutive observations are related.
■ Yt-1 is a lagged observation by one period. So, there can be Yt-2, Yt-3, Yt-4…
Auto Covariance and
Auto-Correlation
■ Auto-Covariance is written as :
■ Auto-correlation is written as :
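The formulas on this slide did not survive, so the sketch below assumes one commonly used definition rather than the slide's own: the lag-k autocorrelation is the sum of (Yt minus mean)(Yt-k minus mean) over t = k+1 to n, divided by the sum of squared deviations of the whole series. The function name `autocorrelation` and the short series are hypothetical.

```python
def autocorrelation(series, lag):
    """Lag-k autocorrelation under the assumed definition:
    sum_{t=k+1..n} (Y_t - mean)(Y_{t-k} - mean) / sum_{t=1..n} (Y_t - mean)^2
    """
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must be between 0 and len(series) - 1")
    mean = sum(series) / n
    denominator = sum((y - mean) ** 2 for y in series)
    numerator = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return numerator / denominator

# Hypothetical series with a steady upward trend.
y = [1, 2, 3, 4, 5]

r0 = autocorrelation(y, 0)   # a series is perfectly correlated with itself
r1 = autocorrelation(y, 1)   # Yt compared with Yt-1
r2 = autocorrelation(y, 2)   # Yt compared with Yt-2

print("r0 =", r0, "r1 =", r1, "r2 =", r2)
```

Evaluating the function for lags 1, 2, 3 and so on gives the values that are plotted against the lag in an ACF chart.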
Graphical Summary of Auto-Covariance and Auto-Correlation
[Figure: ACF, Monthly Returns of Tata Steel]
Sum Up

■ To understand the behavior of a single variable, univariate analysis is used with
important measures like Mean, Median, MAD, MSD, Variance and Standard
Deviation
■ Two vital statistics for bivariate data are Covariance and Correlation
■ Much is learned about single time series by examining the autocorrelations
of the series itself, lagged one period, two periods and so on.
■ The ACF plays a very important role in series forecasting.
DHA 3:

a. Bring examples of univariate analysis and bivariate analysis.


b. Make the graph of autocorrelations calculated.
c. What are the other measures of autocorrelation?
Test Your understanding

■ Data collected at a point in time is called:


a. Cross Sectional data b. Time Series data c. Pooled Data d. Panel Data
■ Data collected for a variable over a period of time is called:
a. Cross Sectional data b. Time Series data c. Pooled Data d. Panel Data
■ To study the performance of various states across India for the time period 1990 to 2010, the data will
be
a. Cross Sectional data b. Time Series data c. Pooled Data d. Panel Data
■ Correlation Analysis is
a. Univariate b. Bivariate c. Multivariate d. b and c
■ When the values of two variables move in the same direction, correlation is said to be
...................
a. Linear b. Non-Linear c. Positive d. Negative
Thank You
