IME602Slides 01
• Provides students with both the basic and the advanced theoretical background in
Probability & Statistics.
• Equips learners with the skills needed to apply different techniques, through
practical applications and solved examples.
• Helps participants build their expertise in using data, statistical theory and
statistical tools in a myriad of applications such as engineering, management
science, social science and the basic sciences.
11) Loeve, Michel, Probability Theory, Affiliated East West Press Pvt. Ltd., 1963.
14) Lehmann, E. L. and Romano, J. P., Testing Statistical Hypotheses, Springer Verlag
Publishers, 2005, ISBN (10): 0387988645.
15) Draper, N. R. and Smith, H., Applied Regression Analysis, Wiley & Sons, 1981,
ISBN (10): 0471170828.
16) Walpole, R. E., Myers, R. H., Myers, S. L. and Ye, K., Probability and Statistics
for Engineers and Scientists, Pearson Education, 2007, ISBN (10): 81-317-1552-3.
1) Quizzes: 20%
2) Assignments: 20%
3) Mid-term examination: 25%
4) Final examination: 35%
(Part # 01)
R.N.Sengupta, DoMS., IIT Kanpur, INDIA
Non frequency data
Time series or Historical data
[Figure: BSE(30) Close plotted against Date, 3-Jan-94 to 31-Mar-94; vertical axis from 0 to 5000]
Non frequency data
Spatial series data
It may be that the values of one or more
variables are given for different individuals
in a group for the same period of time. But
instead of considering the group as such
we may be more interested in studying the
way the values of the variable(s) change
from individual to individual in that group.
[Figure: World Population; Population (0 to 6000000000) against Year (1950 to 2000), drawn once with Year on the horizontal axis and once with Year on the vertical axis]
[Figure: Number against the classes 10 to 15, 15 to 20, 20 to 25, 25 to 30, 30 to 35, 35 to 40, 40 to 45 and 45 to 50]
[Figure: Height (in cms) and Weight (in kgs) by Individual (Ram, Shyam, Rahim, Praveen, Saikat, Govind, Alan), drawn once with Individual on the horizontal axis and once on the vertical axis]
[Figure: Hours (0.0 to 8.0) by Day, Monday to Friday]
[Figure: chart by Month, January to June, on a 0% to 100% scale]
[Figure: box-and-whisker plot marking the Whisker, LQ, Median and UQ]
2, 6, 3, 4, 4, 5, 3, 6, 4, 4, 5, 3, 2, 3, 6, 5, 4, 4, 4,
3, 2, 4, 5, 6, 7, 4, 4, 5, 3, 3.
Value  Tally        Frequency  Cumulative frequency  Cumulative frequency
                               (less-than type)      (more-than type)
2      |||           3          3                    30
3      |||| ||       7         10                    27
4      |||| ||||    10         20                    20
5      ||||          5         25                    10
6      ||||          4         29                     5
7      |             1         30                     1
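The frequency and cumulative-frequency columns of the table can be checked with a short script. This is a minimal sketch: the variable names are illustrative, and it assumes the last two columns are the less-than and more-than cumulative frequencies.

```python
from collections import Counter
from itertools import accumulate

# The 30 observations listed above the tally table.
data = [2, 6, 3, 4, 4, 5, 3, 6, 4, 4, 5, 3, 2, 3, 6, 5, 4, 4, 4,
        3, 2, 4, 5, 6, 7, 4, 4, 5, 3, 3]

counts = Counter(data)
values = sorted(counts)                 # distinct values: 2, 3, ..., 7
freq = [counts[v] for v in values]      # frequency of each value
cum_less = list(accumulate(freq))       # observations <= value
n = len(data)
# Observations >= value: everything not strictly below the value.
cum_more = [n - c + f for c, f in zip(cum_less, freq)]

print("value  freq  cum(<=)  cum(>=)")
for v, f, cl, cm in zip(values, freq, cum_less, cum_more):
    print(f"{v:>5} {f:>5} {cl:>8} {cm:>8}")
```

The printed rows reproduce the three numeric columns of the tally table.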
[Figure: Cumulative frequency (0 to 35) against the values 2 to 7]
[Figure: Cumulative frequency (0 to 40) against the classes 145.95-152.95, 152.95-159.95, 159.95-166.95, 166.95-173.95, 173.95-180.95 and 180.95-187.95]
$HM = \frac{n}{\frac{1}{X_1} + \cdots + \frac{1}{X_n}}$
Skewness = $\gamma_1 = \frac{\mu_3}{\sigma^3} = \frac{\mu_3}{\mu_2^{3/2}}$
Kurtosis = $\gamma_2 = \beta_2 - 3 = \frac{\mu_4}{\sigma^4} - 3$
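The harmonic mean and the moment-based shape measures can be computed directly from raw data. The sketch below makes two assumptions. It uses the usual definitions: skewness is the third central moment divided by the 3/2 power of the second, and kurtosis is the fourth central moment divided by the square of the second, minus 3. It also reuses the 30 observations from the tally example purely as illustrative input.

```python
# Observations reused from the tally example, for illustration only.
data = [2, 6, 3, 4, 4, 5, 3, 6, 4, 4, 5, 3, 2, 3, 6, 5, 4, 4, 4,
        3, 2, 4, 5, 6, 7, 4, 4, 5, 3, 3]
n = len(data)


def central_moment(xs, k):
    """k-th central moment with divisor n (population form)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** k for x in xs) / len(xs)


arithmetic_mean = sum(data) / n
harmonic_mean = n / sum(1 / x for x in data)   # HM = n / (1/X1 + ... + 1/Xn)

mu2 = central_moment(data, 2)                  # variance
mu3 = central_moment(data, 3)
mu4 = central_moment(data, 4)

skewness = mu3 / mu2 ** 1.5                    # assumed definition: mu3 / sigma^3
kurtosis = mu4 / mu2 ** 2 - 3                  # assumed definition: mu4 / sigma^4 - 3

print(f"AM = {arithmetic_mean:.4f}, HM = {harmonic_mean:.4f}")
print(f"skewness = {skewness:.4f}, kurtosis = {kurtosis:.4f}")
```

For positive data the harmonic mean never exceeds the arithmetic mean, which gives a quick sanity check on the output.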
[Figure: Frequency (0 to 14) against Class, for the classes 145.95-152.95 to 180.95-187.95]
[Figure: Cumulative frequency (0 to 40) against the same classes]
A is Red
[Figure: Venn diagrams of the events A and B]
$B - A = B \cap A^{C}$
$P(\omega_i) = p_i, \qquad P(A) = \sum_{\omega_i \in A} p_i$
Where:
$P(\omega_i) = p_i$ = Probability of occurrence of the sample point $\omega_i$
$P(A)$ = Probability of occurrence of the event $A$
$P(\Omega) = \sum_{i} p_i = 1$
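As an illustration of summing sample-point probabilities over an event, the sketch below assumes a fair six-sided die. The die, the event A (an even face) and the helper name `prob` are all hypothetical and do not come from the slides.

```python
from fractions import Fraction

# Assumed sample space: faces of a fair die, each with probability 1/6.
p = {face: Fraction(1, 6) for face in range(1, 7)}


def prob(event):
    """P(event) = sum of the probabilities of the sample points in it."""
    return sum(p[w] for w in event)


A = {2, 4, 6}                 # hypothetical event: an even face shows
print(prob(A))                # 1/2
print(prob(set(p)))           # 1, the whole sample space
```

Exact fractions are used so that the total over the sample space is exactly 1, with no rounding.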
[Figure: mapping from the Domain (the points 1 to 6) to the Co-domain/Range (0 to 1, with 1/6 marked)]
$F(x) = P(X \le x) = \sum_{x_i \le x} f(x_i)$
$F(x) = P(X \le x) = \int_{-\infty}^{x} f(x)\,dx = \int_{-\infty}^{x} dF(x)$
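The two forms of the distribution function can be sketched side by side: a sum for a discrete variable and an integral for a continuous one. Both examples are assumptions made for illustration: a fair die for the discrete case and the density f(x) = 2x on [0, 1] for the continuous case. The integral is approximated with a midpoint rule.

```python
from fractions import Fraction

# Discrete case: assumed pmf of a fair die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}


def cdf_discrete(x):
    """F(x) = sum of f(x_i) over all x_i <= x."""
    return sum(prob for xi, prob in pmf.items() if xi <= x)


def cdf_continuous(pdf, lower, x, steps=10_000):
    """Midpoint-rule approximation of the integral of pdf from lower to x."""
    if x <= lower:
        return 0.0
    h = (x - lower) / steps
    return h * sum(pdf(lower + (i + 0.5) * h) for i in range(steps))


def pdf(t):
    """Assumed density: f(t) = 2t on [0, 1], zero elsewhere."""
    return 2 * t if 0 <= t <= 1 else 0.0


print(cdf_discrete(3.5))               # 1/2
print(cdf_continuous(pdf, 0.0, 0.5))   # approximately 0.25, since F(x) = x^2
```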
$P(A \mid B) = \frac{1/6}{3/6} = \frac{1}{3}, \qquad P(B \mid A) = \frac{1/6}{1/6} = 1$
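The conditional probabilities can be reproduced from the definition P(A | B) = P(A ∩ B) / P(B). The events in this sketch are hypothetical: on a fair die, A = {2} and B = {2, 4, 6} are chosen only because they give P(A ∩ B) = 1/6, P(A) = 1/6 and P(B) = 3/6. The slides may have used different events.

```python
from fractions import Fraction

# Assumed sample space: a fair die.
p = {face: Fraction(1, 6) for face in range(1, 7)}


def prob(event):
    return sum(p[w] for w in event)


def conditional(a, b):
    """P(a | b) = P(a and b) / P(b), defined only when P(b) > 0."""
    return prob(a & b) / prob(b)


A = {2}          # hypothetical event
B = {2, 4, 6}    # hypothetical event

print(conditional(A, B))   # 1/3
print(conditional(B, A))   # 1
```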
[Figure: event A drawn over a partition of the sample space; the labels B6 to B11 survive]
[Figures: plots of f(x) against x for several distributions, whose names did not survive]
[Figure: f(z) against z, for z from -10 to 10]
[Figure: Normal F(x) against X]
[Figure: f(x) against x, for x from -10 to 10, with the points a and b marked]
[Figure: density against Z, for Z from -5 to 5, with z = 1.56 marked]

 z     .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
0.8  0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9  0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0  0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1  0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2  0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3  0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4  0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5  0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441
1.6  0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7  0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8  0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9  0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767
2.0  0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
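$F(a \le Z \le b) = \int_{a}^{b} \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\, dz$
which cannot be integrated in closed form. The Taylor expansion of the integrand
speeds up the calculation and gives
$F(Z \le z) = \frac{1}{2} + \frac{1}{\sqrt{2\pi}} \sum_{k=0}^{\infty} \frac{(-1)^k z^{2k+1}}{(2k+1)\, 2^k\, k!}$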
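The series can be evaluated term by term and compared against the error function in Python's standard library. This is a minimal sketch: the function names and the cut-off of 60 terms are assumptions made for illustration, and the series converges more slowly as |z| grows.

```python
import math


def phi_taylor(z, terms=60):
    """Standard normal CDF from the truncated Taylor series (terms is an assumed cut-off)."""
    total = 0.0
    for k in range(terms):
        total += (-1) ** k * z ** (2 * k + 1) / ((2 * k + 1) * 2 ** k * math.factorial(k))
    return 0.5 + total / math.sqrt(2 * math.pi)


def phi_erf(z):
    """Reference value using the identity Phi(z) = (1 + erf(z / sqrt(2))) / 2."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


z = 1.56
print(f"series: {phi_taylor(z):.6f}")
print(f"erf   : {phi_erf(z):.6f}")
print(f"area between 0 and z: {phi_taylor(z) - 0.5:.4f}")

# Probability of an interval a <= Z <= b as a difference of two CDF values.
a, b = -1.0, 1.0
print(f"P({a} <= Z <= {b}) = {phi_taylor(b) - phi_taylor(a):.4f}")
```

The area between 0 and z = 1.56 agrees to four decimals with the 1.5 row of the normal table under the .06 column.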
[Figure: f(x) against x, for x from 0 to 20]
[Figure: F(X) against X, for X from 1.83 to 5.16]
$P(X > 200) = 1 - \int_{x=0}^{200} \frac{1}{100} \exp\left(-\frac{x}{100}\right) dx$
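The integral above is the distribution function of an exponential variable with mean 100, so the tail probability also has the closed form exp(-200/100). The sketch below checks the two against each other. The midpoint rule and the step count are arbitrary illustrative choices.

```python
import math

mean_life = 100.0   # scale parameter read from the integrand
t = 200.0           # threshold in P(X > t)


def pdf(x):
    """Exponential density with mean 100."""
    return math.exp(-x / mean_life) / mean_life


# Midpoint-rule approximation of the integral of pdf from 0 to t.
steps = 100_000
h = t / steps
integral = h * sum(pdf((i + 0.5) * h) for i in range(steps))

p_numeric = 1 - integral
p_closed = math.exp(-t / mean_life)   # exp(-2)

print(f"numeric    : {p_numeric:.6f}")
print(f"closed form: {p_closed:.6f}")
```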
[Figure: Number of failures (0 to 60) against a horizontal axis from 0.0 to 10.0]
[Figure: Probability P[X=x]=f(x) (0.0000 to 0.0900) against Reading number, 1 to 20]
[Figure: Cumulative Probability (0.0000 to 1.0000) against Hours of working, 0.0 to 10.0]
Columns: Ln(No2), ln(Cars), Temp(2 m), Wind Speed, Temp(25)-Tem(2), Wind direction, Hour of day, Day #
[Figures: Ln(No2) Concentration, ln(Cars), Temp (2 m), Wind Speed, Temp(25)-Tem(2) and Wind direction, each plotted against Data # (1 to about 500)]
[Figures: two plots with Ln(No2) on the horizontal axis; the other axis titles did not survive]
[Figure: Frequency (0 to 180) by Data Range, for classes of width 0.5 from 1 to 1.5 up to 6 to 6.5]
[Figure: Frequency (0 to 100) by Data Range, for classes of width 0.25 from 1 to 1.25 up to 6 to 6.25]
[Figure: Ln(No2) on a 0 to 0.8 scale against Ln(No2) Range]
[Figures: Frequency, Relative Frequency and a further 0 to 0.8 scale against # of Arrivals, 0 to 9]
[Figures: Frequency, a 0 to 0.12 scale and Cumulative Relative Frequency against # of Arrivals, 4 to 12]
Example # 041 (SAT)
[Figure: COST and RATIO by State; vertical axis "Cost and Ratio", 0 to 30]
Example # 041 (SAT) (contd…)
[Figure: Histogram of Cost, Ratio and Salary by State; scale from 0 to 60]
Average value is given by $E(X) = \frac{1}{n} \sum_{i=1}^{n} X_i$
Variance is given by $V(X) = \frac{1}{n} \sum_{i=1}^{n} \{X_i - E(X)\}^2$
Covariance is given by
$Cov(X, Y) = E[\{X - E(X)\}\{Y - E(Y)\}] = \rho_{X,Y} \sqrt{V(X)\, V(Y)}$
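The three summary measures translate directly into code. The sketch below uses the divisor n, as in the formulas, and a small set of hypothetical paired observations. The data and function names are illustrative only and are not taken from the SAT example.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def variance(xs):
    """Mean squared deviation from the mean (divisor n)."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def covariance(xs, ys):
    """Mean product of paired deviations (divisor n)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def correlation(xs, ys):
    """rho = Cov(X, Y) / sqrt(V(X) V(Y))."""
    return covariance(xs, ys) / math.sqrt(variance(xs) * variance(ys))


# Hypothetical paired observations.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

print(f"E(X) = {mean(x)}, V(X) = {variance(x)}")
print(f"Cov(X, Y) = {covariance(x, y):.4f}, rho = {correlation(x, y):.4f}")
```

Multiplying the printed correlation by the square root of V(X) V(Y) recovers the covariance, which is the second equality in the covariance formula.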
[Figure: VALUE, Value (USD) from 0 to 20, against Date]
[Figure: Frequency (0 to 80) against Range]
[Figure: Relative Frequency (0 to 0.3) for the ranges 11 to 12 up to 20 to 21]
[Figure: Cumulative Relative Frequency (0 to 1) against Range Number, 0 to 10]
$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma_X\, x} \exp\left\{-\frac{1}{2\sigma_X^2} (\ln x - \mu_X)^2\right\}, \qquad 0 < x < \infty$
[Figure: f(x) (0 to 0.012) against x, for x from 0.5 to 38]
2) $P(a \le X) = 1 - \Phi\left[\frac{a - np}{\sqrt{npq}}\right]$
3) $P(X \le b) = \Phi\left[\frac{b - np}{\sqrt{npq}}\right]$
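To see how close these approximations are, the sketch below compares them with exact binomial sums. It assumes the bracketed function is the standard normal distribution function. The values n = 100, p = 0.5, a = 55 and b = 45 are hypothetical. No continuity correction is applied, matching the formulas as written.

```python
import math


def phi(z):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def binom_pmf(n, p, k):
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


# Hypothetical parameters, chosen only for illustration.
n, p = 100, 0.5
q = 1 - p
a, b = 55, 45
sd = math.sqrt(n * p * q)

approx_upper = 1 - phi((a - n * p) / sd)                            # P(a <= X)
exact_upper = sum(binom_pmf(n, p, k) for k in range(a, n + 1))

approx_lower = phi((b - n * p) / sd)                                # P(X <= b)
exact_lower = sum(binom_pmf(n, p, k) for k in range(0, b + 1))

print(f"P(X >= {a}): approx {approx_upper:.4f}, exact {exact_upper:.4f}")
print(f"P(X <= {b}): approx {approx_lower:.4f}, exact {exact_lower:.4f}")
```

The gap between the approximate and exact values is mostly due to the missing continuity correction. Replacing a with a - 0.5 and b with b + 0.5 typically narrows it.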