CH 03
by Ken Black
Chapter 3
Descriptive
Statistics
Copyright 2011 John Wiley & Sons, Inc.
Statistics and Analytics
Statistics
The study of the collection, organization, analysis, interpretation, and presentation of data
Analytics
The discovery and communication of meaningful patterns in data.
Especially valuable in areas rich with recorded information
Analytics relies on the simultaneous application of statistics, computer programming and
operations research to quantify performance.
Solutions
[Figure: positions of the quartiles Q1, Q2, and Q3]
Interquartile Range - range of values between the first and third quartiles
Range of the “middle half”; middle 50%
Useful when researchers are interested in the middle 50%, and not the extremes
Example: For the cars in service data, the IQR is 204,000 – 9,000 = 195,000
For example: a salary IQR of 10 LPA (lakh per annum) means the middle 50% of salaries span a range of 10 LPA.
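A minimal sketch of the IQR calculation in Python. The quartile values are the ones quoted in the cars-in-service example; the function name `interquartile_range` is illustrative and not from the slides.

```python
def interquartile_range(q1, q3):
    """Return the width of the middle 50% of the data: Q3 - Q1."""
    return q3 - q1


# Quartiles quoted in the cars-in-service example.
iqr = interquartile_range(q1=9_000, q3=204_000)
print(iqr)  # 195000
```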
X       X - μ    |X - μ|
5        -8        8
9        -4        4
16        3        3
17        4        4
18        5        5

$MAD = \frac{\sum |X - \mu|}{n} = \frac{24}{5} = 4.8$
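The mean absolute deviation for the five values in the table can be checked with a short Python sketch. The function name `mean_absolute_deviation` is an illustrative choice.

```python
def mean_absolute_deviation(values):
    """Average of the absolute deviations from the mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)


data = [5, 9, 16, 17, 18]  # the X column of the table
mad = mean_absolute_deviation(data)
print(mad)  # 4.8
```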
$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{663,866}{3} = 221,289$
Sample Standard Deviation
$s = \sqrt{s^2} = \sqrt{221,289} = 470.4$
Solution
The researcher computes the mean absolute
deviation, the variance, and the standard deviation
for these data in the following manner.
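X         X - Xbar    |X - Xbar|    (X - Xbar)^2
55          -41          41           1,681
100           4           4              16
125          29          29             841
140          44          44           1,936
60          -36          36           1,296
SUM: 480      0         154           5,770

$MAD = \frac{\sum |X - \bar{X}|}{n} = \frac{154}{5} = 30.8$
$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{5,770}{4} = 1,442.5 \approx 1,443$
$s = \sqrt{1,443} \approx 38$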
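The same three measures can be reproduced with a short Python sketch. The helper name `describe_spread` is hypothetical; the variance uses the sample divisor n - 1, as in the solution.

```python
import math


def describe_spread(values):
    """Return (MAD, sample variance, sample standard deviation)."""
    n = len(values)
    mean = sum(values) / n
    deviations = [x - mean for x in values]
    mad = sum(abs(d) for d in deviations) / n
    sample_variance = sum(d * d for d in deviations) / (n - 1)
    return mad, sample_variance, math.sqrt(sample_variance)


mad, s2, s = describe_spread([55, 100, 125, 140, 60])
print(mad, s2, round(s, 2))  # 30.8 1442.5 37.98
```

The unrounded variance is 1,442.5, which the slide rounds to 1,443 before taking the square root.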
$C.V. = \frac{\sigma}{\mu}(100)$
Example:
An SD of 10 on a mean of 20 gives C.V. = 50%
An SD of 10 on a mean of 1,000 gives C.V. = 1%
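A small sketch of the comparison in Python, assuming the usual definition of the coefficient of variation as the standard deviation expressed as a percentage of the mean. The function name is illustrative.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the mean."""
    return 100 * std_dev / mean


print(coefficient_of_variation(10, 20))    # 50.0
print(coefficient_of_variation(10, 1000))  # 1.0
```

The same standard deviation of 10 is large relative to a mean of 20 and small relative to a mean of 1,000.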
$\mu_{grouped} = \frac{\sum fM}{\sum f} = \frac{2150}{50} = 43.0$
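A sketch of the grouped mean in Python, using the class midpoints and frequencies from the chapter's class-interval table (20-under 30 through 70-under 80). The function name `grouped_mean` is an assumption for illustration.

```python
midpoints = [25, 35, 45, 55, 65, 75]  # class midpoints M
frequencies = [6, 18, 11, 11, 3, 1]   # class frequencies f


def grouped_mean(midpoints, frequencies):
    """Weighted mean of class midpoints: sum(f * M) / sum(f)."""
    total = sum(f * m for f, m in zip(frequencies, midpoints))
    return total / sum(frequencies)


print(grouped_mean(midpoints, frequencies))  # 43.0
```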
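Class Interval    Frequency    Cumulative Frequency
20-under 30           6                6
30-under 40          18               24
40-under 50          11               35
50-under 60          11               46
60-under 70           3               49
70-under 80           1               50
N = 50

$Md = L + \frac{\frac{N}{2} - cf_p}{f_{med}}(W) = 40 + \frac{\frac{50}{2} - 24}{11}(10) = 40.909$

Steps:
1. Compute the cumulative frequencies.
2. Find N/2.
3. Find the median class: the first class whose cumulative frequency reaches N/2.
4. L: lower limit of the median class
5. cf_p: cumulative frequency up to, but not including, the median class
6. W: class width
7. f_med: frequency of the median class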
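The steps above can be sketched in Python as follows. The function and parameter names are hypothetical, and the sketch assumes equal-width classes listed in ascending order.

```python
def grouped_median(lower_limits, frequencies, width):
    """Median of grouped data: L + ((N/2 - cf_p) / f_med) * W."""
    n = sum(frequencies)
    if n == 0:
        raise ValueError("frequencies must contain at least one observation")
    cumulative = 0  # cf_p: cumulative frequency before the current class
    for lower, freq in zip(lower_limits, frequencies):
        if cumulative + freq >= n / 2:  # this is the median class
            return lower + (n / 2 - cumulative) / freq * width
        cumulative += freq
    raise ValueError("median class not found")


lower_limits = [20, 30, 40, 50, 60, 70]
frequencies = [6, 18, 11, 11, 3, 1]
median = grouped_median(lower_limits, frequencies, width=10)
print(round(median, 3))  # 40.909
```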
Mode of Grouped Data
Variance and Standard Deviation of Grouped Data
$S^2 = \frac{\sum f(M - \bar{X})^2}{n - 1}$
$S = \sqrt{S^2}$
Can we determine the standard deviation, considering this as a population?
Class Interval     f     M      fM     M - μ    (M - μ)^2    f(M - μ)^2
20-under 30        6    25     150      -18        324          1944
30-under 40       18    35     630       -8         64          1152
40-under 50       11    45     495        2          4            44
50-under 60       11    55     605       12        144          1584
60-under 70        3    65     195       22        484          1452
70-under 80        1    75      75       32       1024          1024
Total             50          2150                              7200

$\sigma^2 = \frac{\sum f(M - \mu)^2}{N} = \frac{7200}{50} = 144$
$\sigma = \sqrt{\sigma^2} = \sqrt{144} = 12$
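A sketch of the grouped variance and standard deviation in Python, using the midpoints and frequencies from the table. The function name and the `sample` flag are illustrative assumptions: `sample=False` divides by N (population), `sample=True` divides by n - 1.

```python
import math


def grouped_variance(midpoints, frequencies, sample=False):
    """Variance of grouped data from class midpoints and frequencies."""
    n = sum(frequencies)
    mean = sum(f * m for f, m in zip(frequencies, midpoints)) / n
    squared = sum(f * (m - mean) ** 2 for f, m in zip(frequencies, midpoints))
    return squared / (n - 1 if sample else n)


midpoints = [25, 35, 45, 55, 65, 75]
frequencies = [6, 18, 11, 11, 3, 1]

population_variance = grouped_variance(midpoints, frequencies)
print(population_variance, math.sqrt(population_variance))  # 144.0 12.0
print(round(grouped_variance(midpoints, frequencies, sample=True), 2))  # 146.94
```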
[Figure: two distribution curves plotted as y against x, with x from 0 to 20 and y from 0.00 to 0.10]
Coefficient of Skewness
$S_k = \frac{3(\mu - M_d)}{\sigma}$
If Sk < 0, the distribution is negatively skewed (skewed to
the left).
If Sk = 0, the distribution is symmetric (not skewed). If Sk is
close to 0, it is almost symmetric.
If Sk > 0, the distribution is positively skewed (skewed to
the right).
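A sketch of the coefficient of skewness in Python, assuming the Pearsonian form Sk = 3(mean - median) / standard deviation. As an illustration only, it reuses the grouped-data results from earlier in the chapter (mean 43.0, median 40.909, standard deviation 12); the slides do not compute skewness for those data, and the function names are hypothetical.

```python
def pearson_skewness(mean, median, std_dev):
    """Coefficient of skewness: 3 * (mean - median) / std_dev."""
    return 3 * (mean - median) / std_dev


def describe_skew(sk):
    """Translate the sign of Sk into the wording used above."""
    if sk < 0:
        return "negatively skewed"
    if sk > 0:
        return "positively skewed"
    return "symmetric"


sk = pearson_skewness(mean=43.0, median=40.909, std_dev=12.0)
print(round(sk, 3), describe_skew(sk))  # 0.523 positively skewed
```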