BS Classwork


Business Stats

Statistics:
It is the science of collecting, collating, analysing and interpreting data.

In BS, our focus will be primarily on solving problems faced by Business Managers in various domains, e.g. HR, Fin, Marketing, Ops, Supply Chain, Production etc.
So the data analysis should help them to take more powerful decisions.

It should be studied in three parts:

D   Descriptive Stats     Central tendency (Mean, Median etc), Dispersion (Range, quartile deviation, Variance and Std deviation), Skewness
P   Probabilistic Stats   General theories of probability, Probability distributions: Discrete and Continuous Distns
I   Inferential Stats     Sampling Theory, Central limit theorem, Estimation, Hypothesis Testing, Regression and ANOVA
                          Regression: Bivariate and Multivariate

Marketing research, Research methodology
Portfolio management and other Finance courses like Derivative
Operations and Supply Chain

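Ex 1
Simple data: Calculate Mean and standard deviation
Sl     Age (x)   Deviation (xi - x-bar)   Sq of Deviation (xi - x-bar)^2
1      20        -5.50                    30.25
2      23        -2.50                    6.25
3      24        -1.50                    2.25
4      21        -4.50                    20.25
5      25        -0.50                    0.25
6      22        -3.50                    12.25
7      24        -1.50                    2.25
8      45        19.50                    380.25
SUM    204       0.00                     454.00 (SSD)
Mean   25.50                              56.75 (MSD or Variance)
SD     7.53

Pop quiz
Qs 1   If a student aged 31 joins this group then, the
       Mean   will go up
       SD     will go down
Qs 2   If a student aged 17 joins this group then, the
       Mean   will go down
       SD     will go up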
Mean (SAM) (x-bar) = SUM(xi) / N
Advantage of calculating MEAN (SAM): It helps us to have an idea about the CENTRAL TENDENCY of the data
Basically it gives us a central representative value
It ignores the individuality and the diversity
Sum of Deviations about MEAN is always ZERO
Simple Variance (Measure of Dispersion) = [SUM(xi - x-bar)^2] / N
Standard Deviation (σ), the Square Root of Variance, is the most important measure of dispersion; it matches the dimension of the mean
Please note that variance will have sq units of the unit of mean

Measures of dispersion help us to quantify the volatility inherently present in the data. The measures of Dispersion can broadly be classified into TWO categories: DEVIATION FAMILY and RANGE FAMILY
a. Deviation Family: Variance, Std Dev, Mean Abs Deviation, Mean Dev about Median
b. Range Family: Range (HV - LV), Inter-quartile Range, Quartile deviation

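Mean                                 25.50    Years
Variance                             56.75    Years^2
Standard Deviation (σ)               7.53     Years
Coefficient of Variation = SD/Mean   29.54%

This coeff measures RELATIVE VOLATILITY, it allows us to compare volatility
Lower value of this coeff indicates higher consistency (lower volatility), in a data set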
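
The figures for the eight ages can be reproduced step by step. The sketch below is only an illustration in Python (the sheet itself uses spreadsheet formulas); the variable names are our own, and the variance divides by N, as in the formula used in these notes.

```python
import math

# Age (x) column from Ex 1
ages = [20, 23, 24, 21, 25, 22, 24, 45]

n = len(ages)
mean = sum(ages) / n                      # x-bar = SUM(xi) / N
deviations = [x - mean for x in ages]     # xi - x-bar
ssd = sum(d ** 2 for d in deviations)     # sum of squared deviations
variance = ssd / n                        # divide by N, as in these notes
sd = math.sqrt(variance)                  # same unit as the mean (years)
cv = sd / mean                            # relative volatility, unit-free

print(f"Mean                = {mean:.2f}")
print(f"Sum of deviations   = {sum(deviations):.2f}")
print(f"SSD                 = {ssd:.2f}")
print(f"Variance            = {variance:.2f}")
print(f"SD                  = {sd:.2f}")
print(f"Coeff of variation  = {cv:.2%}")
```

Dividing the SD by the mean removes the unit (years), which is what makes the coefficient comparable across data sets measured in different units.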

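So we can conclude that most of the people in this group are within a range of 25.5 years ± 7.53 years
Only one data point is not within the range, and we can call it an OUTLIER

If all the data is pretty close to the mean, then SD will be low, so if we recalculate the SD ignoring the Outlier, SD comes down

Now if someone (9th person) joins the group with an age of 35, will Mean and SD or both go up or go down?
Answer without any calculation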
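
Once an answer has been reasoned out, it can be checked numerically. The sketch below is an illustration only: the helper names are hypothetical, and it uses the population SD (divide by N) as in these notes. It is run here on newcomers aged 31 and 17. The underlying rule is that the mean moves towards the newcomer, while the SD rises only if the newcomer lies farther from the old mean than roughly one SD (exactly, SD times the square root of (N+1)/N).

```python
import math

def mean_and_sd(values):
    """Return the mean and the population SD (divide by N) of a list."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / n
    return mean, math.sqrt(variance)

def direction(old, new):
    """Describe how a statistic moved."""
    if new > old:
        return "up"
    if new < old:
        return "down"
    return "unchanged"

def effect_of_newcomer(values, new_value):
    """Compare mean and SD before and after one value joins the group."""
    old_mean, old_sd = mean_and_sd(values)
    new_mean, new_sd = mean_and_sd(values + [new_value])
    return direction(old_mean, new_mean), direction(old_sd, new_sd)

ages = [20, 23, 24, 21, 25, 22, 24, 45]   # the eight ages from Ex 1
for newcomer in (31, 17):
    mean_move, sd_move = effect_of_newcomer(ages, newcomer)
    print(f"Age {newcomer}: Mean goes {mean_move}, SD goes {sd_move}")
```

Calling `effect_of_newcomer(ages, 35)` checks the answer to the question about the 9th person.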

Task 2
                     Grade 1   Grade 2
Employees            2262      1754      Combined Mean   4.7929
Mean salary LPA      3.95      5.88      Combined SD     0.7611
Std Deviation        0.61      0.92
Coeff of Variation   15.4%     15.6%
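
The combined figures for the two grades (Grade 1: 2262 employees, mean 3.95, SD 0.61; Grade 2: 1754 employees, mean 5.88, SD 0.92) can be worked out from the group summaries alone. The sketch below is an illustration with our own variable names, and it assumes the grade SDs are population SDs (divide by N). It reports two SD figures: the pooled within-grade SD, which reproduces the sheet's 0.7611, and the SD of all employees taken together, which also adds the spread between the two grade means and comes out larger.

```python
import math

# Group summaries taken from the task (salary in LPA)
groups = [
    {"n": 2262, "mean": 3.95, "sd": 0.61},   # Grade 1
    {"n": 1754, "mean": 5.88, "sd": 0.92},   # Grade 2
]

total_n = sum(g["n"] for g in groups)

# Combined mean = employee-weighted average of the grade means
combined_mean = sum(g["n"] * g["mean"] for g in groups) / total_n

# Within-grade part: weighted average of the grade variances
within_var = sum(g["n"] * g["sd"] ** 2 for g in groups) / total_n

# Between-grade part: spread of the grade means around the combined mean
between_var = sum(
    g["n"] * (g["mean"] - combined_mean) ** 2 for g in groups
) / total_n

pooled_within_sd = math.sqrt(within_var)
combined_sd = math.sqrt(within_var + between_var)

print(f"Combined mean             = {combined_mean:.4f}")
print(f"Pooled within-grade SD    = {pooled_within_sd:.4f}")
print(f"SD of the combined group  = {combined_sd:.4f}")
```

Which of the two SD figures is wanted depends on the question being asked: typical volatility inside a grade, or volatility across the whole workforce.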

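Example 2
To compute Mean and SD from WEIGHTED DATA
Weighted data is a kind of data set, where individual data points carry corresponding weight / priority
Sl    Equity Shares   Investment (Rs. Lakhs)   Return (%)   w*x      w*(xi - x-bar)^2
                      w                        x
1     A               25                       12.60%       3.15     0.067700
2     B               20                       15.80%       3.16     0.008031
3     C               40                       20.00%       8.00     0.019292
4     D               30                       18.40%       5.52     0.001066
5     E               15                       22.10%       3.315    0.027685
SUM                   130                                   23.145   0.123775

Q1 Average Expected return in the portfolio
Q2 Corresponding SD
WAM: x-bar = SUM(w*x) / SUM(w) = 17.80%
Weighted Variance = SUM[w*(x - MEAN)^2] / SUM(w) = 0.000952
Weighted Standard Deviation = 3.09%

Note that the weightage factor can be natural or artificially assigned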
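
The weighted formulas can be tried on the portfolio data (investments of 25, 20, 40, 30 and 15 lakhs earning 12.6%, 15.8%, 20.0%, 18.4% and 22.1%). This is a minimal sketch with our own variable names; returns are entered as fractions, so 0.126 stands for 12.60%.

```python
import math

weights = [25, 20, 40, 30, 15]                  # investment, Rs. lakhs (w)
returns = [0.126, 0.158, 0.200, 0.184, 0.221]   # return on shares A to E (x)

total_w = sum(weights)

# WAM: x-bar = SUM(w*x) / SUM(w)
wam = sum(w * x for w, x in zip(weights, returns)) / total_w

# Weighted Variance = SUM[w*(x - MEAN)^2] / SUM(w)
weighted_var = sum(
    w * (x - wam) ** 2 for w, x in zip(weights, returns)
) / total_w

weighted_sd = math.sqrt(weighted_var)

print(f"Weighted mean return = {wam:.2%}")
print(f"Weighted variance    = {weighted_var:.6f}")
print(f"Weighted SD          = {weighted_sd:.2%}")
```

Replacing the two lists with any other weights and values (for example respondents per category and average number of children) gives the weighted mean and SD for that data set.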

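Task 2
TFR Socio-Economic Survey:
Sl    Category        No of respondents   Avg no of children   w*x      Weighted sq of deviation
                      w                   x                             w*(x - MEAN)^2
1     Metro           5400                1.78                 9612     2621.39
2     Tier I          7800                2.07                 16146    1290.40
3     Tier II + III   12100               2.29                 27709    421.94
4     Sub-urban       15300               2.54                 38862    61.23
5     Rural           20400               2.88                 58752    3317.46
SUM                   61000                                    151081   7712.42
Q1 Average no of children in the entire sample, considering all the categories together
Q2 Corresponding SD
Mean                 2.477
Variance             0.126
SD                   0.356
Coeff of variation   14.4%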

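Example 3
Categorise raw data into Frequency Distn Table
508   537   562   583   563   504
530   553   542   536   554   567
579   503   584   538   544   505
577   563   532   502   540   534
529   543   516   584   532   579
527   564   509   580   531   580
540   549   531   558   573   552
549   526   527   514   590   514
535   587   510   591   596   540
522   522   531   529   546   565
554   504   582   541   573   546
565   572   562   539   592   559
504   558   593   549   565   572
599   539   507   520   514   531
579   508   512   573   600   562
518   555   513   564   505   552
553   553   534   578   514   530
514   532   543   530   501   562
553   519   552   556   564   538
503   520   586   538   573   527

Max   600
Min   501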

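Class Boundaries   Frequency   Cumulative Frequency   Relative Frequency or Probability   Class representative
LCB    UCB         f           F                      P(x)                                x
500    510         14          14                     0.1167                              505
510    520         12          26                     0.1000                              515
520    530         11          37                     0.0917                              525
530    540         20          57                     0.1667                              535
540    550         10          67                     0.0833                              545
550    560         14          81                     0.1167                              555
560    570         13          94                     0.1083                              565
570    580         13          107                    0.1083                              575
580    590         7           114                    0.0583                              585
590    600         6           120                    0.0500                              595
SUM                120                                1.0000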

Advantages:
a) It helps us to visualise the density of data points in various parts of the data range
b) It helps us to construct a Probability Distribution, which in turn helps us to build a Predictive Model, using STOCHASTIC SIMULATION
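
The categorisation of the 120 raw values into the ten classes from 500 to 600 can be sketched as follows. This is an illustration in Python rather than the spreadsheet method used in class. One assumption is labelled in the code: a value equal to a class boundary is counted in the lower class (so 510 falls in 500-510). That convention is inferred because it reproduces the class frequencies in the sheet, and it matches how Excel's FREQUENCY function treats its bins.

```python
from bisect import bisect_left
from itertools import accumulate

# The 120 raw observations of Example 3
raw = [
    508, 537, 562, 583, 563, 504, 530, 553, 542, 536, 554, 567,
    579, 503, 584, 538, 544, 505, 577, 563, 532, 502, 540, 534,
    529, 543, 516, 584, 532, 579, 527, 564, 509, 580, 531, 580,
    540, 549, 531, 558, 573, 552, 549, 526, 527, 514, 590, 514,
    535, 587, 510, 591, 596, 540, 522, 522, 531, 529, 546, 565,
    554, 504, 582, 541, 573, 546, 565, 572, 562, 539, 592, 559,
    504, 558, 593, 549, 565, 572, 599, 539, 507, 520, 514, 531,
    579, 508, 512, 573, 600, 562, 518, 555, 513, 564, 505, 552,
    553, 553, 534, 578, 514, 530, 514, 532, 543, 530, 501, 562,
    553, 519, 552, 556, 564, 538, 503, 520, 586, 538, 573, 527,
]

lower_bounds = list(range(500, 600, 10))   # LCB: 500, 510, ..., 590
upper_bounds = list(range(510, 601, 10))   # UCB: 510, 520, ..., 600

# Assumed convention: LCB < x <= UCB, so bisect_left on the upper bounds
# gives the class index (all values here lie between 501 and 600).
freq = [0] * len(upper_bounds)
for value in raw:
    freq[bisect_left(upper_bounds, value)] += 1

cumulative = list(accumulate(freq))        # F
n = len(raw)
relative = [f / n for f in freq]           # P(x)

print("LCB  UCB    f    F    P(x)")
for lcb, ucb, f, big_f, p in zip(lower_bounds, upper_bounds,
                                 freq, cumulative, relative):
    print(f"{lcb}  {ucb}  {f:3d}  {big_f:3d}  {p:.4f}")
```

The relative frequencies sum to 1, which is why the table can double as an empirical probability distribution.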

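To compute Mean and SD from Categorised data (Freq Distn Table)

Class Boundaries   Frequency   Class representative
LCB    UCB         f           x                      f*x      f*(x - MEAN)^2
500    510         14          505                    7070     22680.875
510    520         12          515                    6180     10980.75
520    530         11          525                    5775     4510.6875
530    540         20          535                    10700    2101.25
540    550         10          545                    5450     0.625
550    560         14          555                    7770     1330.875
560    570         13          565                    7345     5070.8125
570    580         13          575                    7475     11505.8125
580    590         7           585                    4095     11060.4375
590    600         6           595                    3570     14850.375
SUM                120                                65430    84092.5

                     From the FD Table   From the Raw data
Mean                 545.250             545.542
Variance             700.771             694.865
SD                   26.472              26.360
Coeff of variation   4.86%               4.83%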
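
The mean and SD from the frequency distribution follow the weighted formulas, with the class frequency as the weight and the class representative (mid-point) as the value. A minimal sketch, using our own variable names and the frequencies 14, 12, 11, 20, 10, 14, 13, 13, 7, 6 for the classes 500-510 to 590-600:

```python
import math

class_marks = [505, 515, 525, 535, 545, 555, 565, 575, 585, 595]   # x
frequencies = [14, 12, 11, 20, 10, 14, 13, 13, 7, 6]               # f

n = sum(frequencies)

# Mean = SUM(f*x) / SUM(f)
mean = sum(f * x for f, x in zip(frequencies, class_marks)) / n

# Variance = SUM[f*(x - MEAN)^2] / SUM(f)
variance = sum(
    f * (x - mean) ** 2 for f, x in zip(frequencies, class_marks)
) / n

sd = math.sqrt(variance)
cv = sd / mean

print(f"N                  = {n}")
print(f"Mean               = {mean:.3f}")
print(f"Variance           = {variance:.3f}")
print(f"SD                 = {sd:.3f}")
print(f"Coeff of variation = {cv:.2%}")
```

The small gap between these figures and the ones computed from the raw data arises because every value in a class is treated as if it sat at the class mid-point.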
Name Mobile Number Email Id
DHRUV YAGNIK 9619281803 yagnik.dhruv24@sibm.edu.in
ARAVIND B 7736737678 aravind.b24@sibm.edu.in





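Column1 (Age data from Ex 1)

Mean                 25.5
Standard Error       2.847
Median               23.5
Mode                 24
Standard Deviation   8.053
Sample Variance      64.86
Kurtosis             7.009
Skewness             2.586
Range                25
Minimum              20
Maximum              45
Sum                  204
Count                8

Note that the Standard Deviation (8.053) and Sample Variance (64.86) in this summary divide the SSD of 454 by N - 1 = 7, whereas the SD of 7.53 and Variance of 56.75 in Ex 1 divide by N = 8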
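
The difference between the two standard deviations for the same eight ages (7.53 in Ex 1 and 8.053 in this summary) comes from the divisor. The sketch below uses Python's standard `statistics` module purely as an illustration: `pstdev` divides by N and `stdev` divides by N - 1. The Standard Error line assumes the usual definition, sample SD divided by the square root of the count.

```python
import math
import statistics

ages = [20, 23, 24, 21, 25, 22, 24, 45]   # Age (x) column from Ex 1

population_sd = statistics.pstdev(ages)       # divides by N
sample_variance = statistics.variance(ages)   # divides by N - 1
sample_sd = statistics.stdev(ages)            # square root of the above
standard_error = sample_sd / math.sqrt(len(ages))

print(f"Population SD   = {population_sd:.3f}")
print(f"Sample variance = {sample_variance:.2f}")
print(f"Sample SD       = {sample_sd:.3f}")
print(f"Standard error  = {standard_error:.3f}")
print(f"Median          = {statistics.median(ages)}")
print(f"Mode            = {statistics.mode(ages)}")
```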