
Revised 13th Edition According to

the Latest Syllabus of MAKAUT


and other Equivalent Courses

ENGINEERING
MATHEMATICS
Volume-IIA for 2nd Semester
(For CSE & IT)
B. K. PAL, M.Sc., Ph.D.
(P. N. Banerjee Gold Medalist)
Head of the Department of Mathematics (Retd.)
Maulana Azad College, Kolkata
Formerly, Reader in Mathematics
Kalyani Government Engineering College, Nadia, West Bengal.

K. DAS, M.Sc., B.Ed., Ph.D.
Associate Professor and Head of the Department of Mathematics,
Krishnanagar Government College, Nadia, West Bengal
Formerly,
Associate Professor, A.B.N. Seal College, Coochbehar, West Bengal
Assistant Professor and Head of the Department of Mathematics
Kalyani Government Engineering College, Nadia, West Bengal.

U. N. DHUR & SONS PRIVATE LIMITED


KOLKATA - 700 073
© Reserved by the Authors

Publication, Distribution and Promotion Rights reserved by the Publisher


All rights reserved. No part of the text in general, figures, diagrams,
page-layout and cover design in particular, may be reproduced or
transmitted in any form or by any means - electronic, mechanical,
photocopying, recording or by any information storage and retrieval system
- without the prior written permission of the Publisher.

ISBN 978-93-80673-16-5

First Edition 2004
Second Edition 2005
Third Edition 2006
Fourth Edition 2007
Reprint 2008
Fifth Edition 2009
Sixth Edition 2010
Seventh Edition 2010
Reprint 2012
Eighth Edition 2013
Ninth Edition 2014
Reprint 2015
Tenth Edition 2015
Eleventh Edition 2017
Twelfth Edition 2018
Thirteenth Edition 2019

Price: Rupees Three hundred fifty only

Published by Dr. Purnendu Dhar, M.Sc. (Chem.), Ph.D. on behalf of
U. N. DHUR & SONS PRIVATE LIMITED
2A, Bhawani Dutta Lane, Kolkata - 700 073
Phone: (033) 2241-9573 / 2241-1734 / 2257-1209
Mobile: 9432889588 / 9433017104 / 9830169816
Printed by: Nabaloke Press
15/2, Nerode Behari Mullick Road, Kolkata - 700 006
Preface to the Revised 13th edition

This edition is thoroughly revised according to the latest syllabus of
MAKAUT, following the Modules of the present syllabus for
II A Semester [ FOR CSE & IT ]
We do hope that this book will be most helpful for the students to cover
the specific syllabus in time and to secure high marks in the
examination.
We express our heartiest thanks to all the Professors for recommending
our books to the students.
All suggestions for improvement will be accepted with thanks.
We are thankful to our publisher for bringing out this edition in time.
Kolkata. B.K. Pal
January, 2019 K. Das
Preface
Engineering Mathematics is one of the most essential subjects of the
B.Tech. course offered by the Engineering or Technological Institutes
affiliated to the West Bengal University of Technology, i.e. W.B.U.T.
This text book is written according to the syllabus of W.B.U.T.,
covering the entire course in four separate volumes as per the
four semesters.
The present volume (Vol II) has been designed for the students
of B.Tech first year (2nd semester, all branches) under W.B.U.Tech.
Sincere attempts have been made to present the topics of the
prescribed syllabus of the University in a simple and lucid form,
to create interest in the subject among the students and to help them
secure very good marks in the examination.
Special emphasis has been given to various types of problems,
along with worked out model solutions, which are particularly
relevant to engineering students.
We are sure that this book will be very helpful to the students of the
present generation.
Any positive suggestion for the improvement of the book shall
be acknowledged with thanks.
We take this opportunity to express our gratitude to Dr. Parijat
De, Principal, Kalyani Government Engineering College, for
encouraging us to complete this difficult task.
We are thankful to Dr. Purnendu Dhar, M.Sc. (Chem.), Ph.D.,
Director, M/s. U. N. Dhur & Sons Pvt. Ltd., for publishing the
book in a very short time.
Our thanks are also due to our family members for their
constant encouragement and enthusiasm from time to time.

Kalyani Govt. Engg. College Bidyut Kumar Pal
Nadia, West Bengal. Kalidas Das
Dated: February, 2004.
SYLLABUS
Maulana Abul Kalam Azad University of Technology, West Bengal
(Formerly West Bengal University of Technology)
1st Year Curriculum Structure for B.Tech courses in Engineering & Technology
(Applicable from the academic session 2018-2019)
Course Code: BS-M201 Category: Basic Science Course

Course Title: Mathematics - II A Semester: Second (CSE &IT)


L- T-P: 3-1-0 Credit: 4
Pre-Requisites: High School Mathematics and BS-M101

Description of Topic    Lecture Hours

Module No. 1.

Basic Probability:
Probability spaces, conditional probability, independence;
Discrete random variables, Independent random variables, the Multinomial
distribution, Poisson approximation to the Binomial distribution, infinite
sequences of Bernoulli trials, sums of independent random variables; Expectation
of Discrete Random Variables, Moments, Variance of a sum, Correlation
coefficient, Chebyshev's Inequality. 11L
Module No.2
Continuous Probability Distributions:
Continuous random variables and their properties, Distribution functions and
densities, Normal, Exponential and Gamma densities 4 L
Module No.3
Bivariate Distributions:
Bivariate distributions and their properties, distribution of sums and quotients,

Conditional densities, Bayes' rule. 5L



Module No. 4
Basic Statistics:
Measures of Central tendency, Moments, Skewness and Kurtosis, Probability
distributions: Binomial, Poisson and Normal and evaluation of statistical
parameters for these three distributions, Correlation and regression - Rank
correlation. 8L
Module No. 5
Applied Statistics:
Curve fitting by the method of least squares - fitting of straight lines, second
degree parabolas and more general curves. Test of significance: Large sample
test for single proportion, difference of proportions, single mean, difference of
means, and difference of standard deviations. 8L
Module No. 6
Small samples:
Test for single mean, difference of means and correlation coefficients, test for
ratio of variances - Chi-square test for goodness of fit and independence of
attributes. 4L
CONTENTS
MODULE-I

1 BASIC PROBABILITY THEORY
1.1. Introduction 1
1.2. Introductory Terminologies 1
1.3. Classical Definition of Probability. 3
1.4. Theorems on Probability. 4
1.5. Axiomatic definition of probability. 7
1.6. Conditional Probability. 10
1.7. Independent Events 11
1.8. Bayes' Theorem 12
1.9. Illustrative Examples. 12
Exercises 26
Short Answer Questions 26
Answers 30
II Long Answer Questions 31
Answers 37
III Multiple Choice Questions 37


Answers 45

2 THE BERNOULLI TRIAL

2.1. Joint Independent Experiments 46


2.2 Bernoulli Trial (Finite & Infinite) 46
2.3. Illustrative Examples. 50
Exercise 2 60
Answers 61

3 DISCRETE RANDOM VARIABLE AND ITS EXPECTATION

3.1. Random Variable 62
3.2. Probability Mass Function and Discrete Distribution 63
3.3. Distribution Function or Cumulative Distribution Function
(For Continuous & Discrete) 64
3.4. Expectation or Mean of a Discrete Random Variable 67
3.5. Variance and Standard Deviation of a Random Variable 69
3.6. Moments of a Random Variable 70
3.7. Illustrative Examples. 72
Exercises 82
Short Answer Questions 82
Answers 83
II Long Answer Questions 84
Answers 87
III Multiple Choice Questions 88
Answers 93

4 SPECIAL TYPE OF DISCRETE DISTRIBUTION

4.1. Introduction 94
4.2. Binomial Distribution 95
4.4. Poisson Approximation to Binomial Distribution. 112
Exercises 117
Short Answer Questions 117
Answers 119
II Long Answer Questions 119
Answers 124
III Multiple Choice Questions 125
Answers 127

5 DISCRETE JOINT DISTRIBUTION & CHEBYSHEV'S INEQUALITY

5.1. Introduction. 128
5.2 Joint Distribution of two random Variables 128
5.3. Independent Random Variables 130
5.4. The Multinomial Distribution 134
5.5. Joint Distribution Function 136
5.6. Sum of independent Random Variables. 137
5.7. Bivariate Expectation 139
5.8. Theorems on Expectation 141
5.9. Covariance of two Variables 142
5.10. Correlation Coefficient between two Variables 143
5.11. Properties of Correlation Coefficients 148
5.12. Variance of Sum of two Variables. 149


5.13. Illustrative Examples. 151
Exercise 5 183
Answers 190

MODULE-II

6 CONTINUOUS PROBABILITY DISTRIBUTION


6.1. Introduction. 195
6.2. Probability Density Function (or density function) 195
6.3. Expectation or Mean of a Continuous Random Variable 197
6.4. Variance and S.D. 198
6.5. Illustrative Examples. 200
Exercises 213
Short Answer Questions 213
Answers 215
II Long Answer Questions 216
Answers 219
III Multiple Choice Questions 221
Answers 226

7 SPECIAL TYPE OF CONTINUOUS DISTRIBUTION

7.1. Introduction. 227


7.2. Normal Distribution. 227
7.3. Binomial Approximation to Normal Distribution. 232
7.4. Illustrative Examples. 234
7.5. Exponential Distribution. 243
7.6. Gamma Distribution. 247
7.7. Illustrative Examples. 251
Exercise 253
Short Answer Questions 253
Answers 254
II Long Answer Questions 256
Answers 260
III Multiple Choice Questions 260
Answers 264
MODULE-III

8 BIVARIATE DISTRIBUTIONS (CONTINUOUS VARIATES)

8.1. Introduction. 265


8.2 Bivariate Distribution for continuous Variables. 265
8.3. Bivariate Probability Density function. 266
8.4. Properties of continuous Bivariate distribution. 266
8.5. Marginal Distribution 267
8.6. Conditional Density & Conditional Distribution 268
8.7. Independent Random Variables 270
8.8. Distribution of Sums and Quotients of random variables 271
8.9. Illustrative Examples 276
Exercise 8 289
Answers 292

MODULE-IV

9 BASIC STATISTICS

9.1. Statistics and its Related Terms 295


9.2 Frequency Distribution. 296
9.3. Mean. 298
9.4. Median. 301
9.5. Mode. 305
9.6. Variance and Standard Deviation. 306
9.7. Significance of measure of central tendency and standard
deviation. 310
9.8. Moments. 310
9.9. Central moments and Raw moments 311
9.10. Relations between central moment and any moment 312
9.11. Skewness and Kurtosis 313
9.12. Significance of Skewness 315
9.13. Other Formula For finding Skewness 317
9.14. Significance of Kurtosis 317
9.15. Illustrative Examples. 318
Exercises 9 332
Answers 343
MULTIPLE CHOICE QUESTIONS 344
Answers 351
10 CORRELATION, REGRESSION, RANK CORRELATION

10.1 Introduction 352


10.2 Bivariate Data 352
10.3. Scatter Diagram 353
10.4. Correlation and its Different Types 356
10.5. Correlation Coefficient 356
10.6. Regression 370
10.7. Regression Line. 371
10.8 Properties of Regression Line and Coefficients. 374
10.9. Illustrative Examples 376
10.10. Rank Correlation 383
10.11. Rank Correlation when there is Tie Rank in any series 384
10.12. Illustrative Examples 386
Exercise 10 391
Short Answers Questions 391
Answers 393
II Long Answers Questions 394
Answers 401
III Multiple Choice Questions 402
Answers 411

MODULE-V

11 CURVE FITTING BY METHOD OF LEAST SQUARE

11.1. Introduction 412
11.2. Least Square Regression Curve 413
11.3. Normal Equations 414
11.4. Finding best fitted straight lines 415
11.5. Finding best fitted straight line when (X, Y) assumes n pairs of
data $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ 415
11.6. Finding best fitted Second Degree Parabola when (X, Y)
assumes n pairs of data $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ 421
11.7. Finding best fitted Exponential Curve when (X, Y) assumes
n pairs of data $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ 423
11.8. Finding best fitted Geometric Curve when (X, Y) assumes n pairs
of data $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ 424
Exercise 11 432
Answers 436

12.1. Population and Sample 440


12.2. Random Sampling 441
12.3. Sample Mean & Sample Variance 442
12.4. Sample Proportion and Population Proportion. 443
12.5. Sampling Distribution 443
12.6. Sampling Distribution of Sample Mean 446
12.7. Sampling Distribution of Sample Variance 449
12.8. Illustrative Examples. 451
Exercises 12 462
Answers 464
Multiple Choice Questions 465
Answers 472

13.1. Introduction 473


13.2. Statistical Hypothesis 474
13.3. Test Statistic 475
13.4. Critical Region, Region of Acceptance and Level of
Significance. 475
13.5. Type I Error and Type II Error 477
13.6. Best Critical Region 479
Exercises 13 482
Answers 482
Multiple Choice Questions 483
Answers 487
14.1. Introduction 488


14.2. Test for Single Mean 488
14.3. Test of Single Proportion 495
14.4. Test of Single Standard Deviation 499
14.5. Difference of Mean (or Test of equality of means) 501
14.6. Illustrative Examples 502
14.7. Test for Difference of Proportions 503
14.8. Illustrative Examples 504
14.9. Test for difference of standard deviations 508
14.10. Illustrative Examples 508
Exercise 14 509
Short Answers Questions 509
Answers 511
II Long Answers Questions 512
Answers 518

MODULE-VI
15 SMALL SAMPLE TEST OF SIGNIFICANCE

15.1. Introduction 519


15.2. Test for single Means 519
15.3. Test for Difference of Mean (Test of equality of means) 525
15.4. Test for a specified Correlation Coefficient 530
15.5. Test For Difference of Correlation Coefficient 532
15.6. Test for Ratio of Variances (or, equality of two s.d) 534
15.7. chi-square test for Goodness of Fit 537
15.8. Illustrative Examples 538
15.9. Test for Independence of Attributes 546
15.10. Illustrative Examples 549
Exercise 15 551
Answers 562
STATISTICAL TABLES 564
MODULE-I

1 BASIC PROBABILITY THEORY

1.1. Introduction: In Science and Technology we are often concerned
with phenomena whose future behaviour is not predictable in a
deterministic fashion. In such cases we have to depend on 'Chance'. The
theory of probability deals with this 'Chance'; in fact, 'Probability'
is nothing but a numerical measure of it. Once this measure is known,
the future can be estimated. In this chapter we deal with this measure,
'Probability'. Familiarity with Set Theory and Combinatorics is the only
prerequisite for this chapter.
1.2. Introductory Terminologies:

Random Experiment: An experiment or observation which may be
repeated a large number of times under very nearly identical conditions,
and for which the outcome of any particular observation is unpredictable
although all possible outcomes can be described prior to its performance,
is known as a Random Experiment.
For example, tossing a coin is a random experiment: the possible
outcomes are 'head' and 'tail', but the outcome of a particular toss
cannot be predicted.

Sample Points / Event Points. The outcomes of a random experiment
are called sample points or event points.
For example, the sample points in the experiment 'tossing a coin' are
Head and Tail, in symbols $H$ and $T$.
Sample Space / Event Space. The set of all sample points, i.e. the set of
all possible outcomes of a Random Experiment, is called the sample space.
It is denoted by $S$.
For example, if we throw two coins once, then
$S = \{HH, HT, TH, TT\}$; if we roll a die once, we
have $S = \{1, 2, 3, 4, 5, 6\}$ as the event space.
Event. Any subset of the sample space $S$ of a random experiment is called
an Event.
For example, in the experiment of 'throwing two coins',
$A = \{TH, HT\}$ is an event because $A \subset S$.
Certain Event. Since every set is a subset of itself, the sample
space $S$ is a subset of itself and is therefore an event. This event is
called the certain event.
Impossible Event. An event that contains no sample points is called an
impossible event. It is denoted by $\phi$. For example, in the experiment
'throwing a die' the event 'Face 7' $= \phi$.
Complementary Event. For any event $A$, there is an event containing
all the sample points in the sample space which are not in $A$. This event
is called the complementary event of $A$ and is denoted by $A'$ or $\bar{A}$ or
$A^c$. Obviously $A'$ = 'Not $A$'.

For example, if $A = \{TH, HT\}$ where $S = \{HH, TT, TH, HT\}$, then
$\bar{A} = \{HH, TT\}$.
Note. $\bar{S} = \phi$ ; $\bar{\phi} = S$ ; $\overline{(\bar{A})} = A$
Simultaneous Occurrence of two Events.
Let $A_1$ and $A_2$ be two events. Then the set $A_1 \cap A_2$ represents the
simultaneous occurrence of the two events $A_1$ and $A_2$. This event is also
denoted by $A_1 A_2$.
For example, in the experiment 'rolling a die' let $A_1$ = 'Even face',
$A_2$ = 'Multiple of three'. Then $A_1 \cap A_2 = \{6\}$ is the event whose
occurrence shows the simultaneous occurrence of $A_1$ and $A_2$.
At least one of Two Events.
Let $A_1$ and $A_2$ be two events. Then the set $A_1 \cup A_2$ represents 'at
least one of $A_1$ and $A_2$'. This event is also denoted by $A_1 + A_2$.
For example, in the experiment 'rolling a die' let $A_1$ = 'Even face'
$= \{2, 4, 6\}$, $A_2$ = 'Multiple of three' $= \{3, 6\}$. Then $A_1 \cup A_2 = \{2, 4, 6, 3\}$
is the event whose occurrence shows the occurrence of at least one of
'Even face' and 'Multiple of three'.
Disjoint or Mutually Exclusive (m.e.) Events. If two events $A_1$, $A_2$
have no common sample points, i.e. if $A_1 \cap A_2 = \phi$, they are called
Mutually Exclusive Events.
For example, in the two-coin experiment above, if $A_1 = \{HH, TT\}$ and
$A_2 = \{HT, TH\}$, then $A_1 \cap A_2 = \phi$. So $A_1$ and $A_2$ are mutually exclusive
events. Two m.e. events cannot occur simultaneously.
Pairwise Disjoint Events. Let $A_1, A_2, \ldots, A_n$ be $n$ events.
The events $A_i$ $(i = 1, 2, \ldots, n)$ are said to be pairwise disjoint if no two of them
have any common event points, i.e. if $A_i \cap A_j = \phi$ for $i \ne j$ and $i, j = 1, 2, \ldots, n$.
Exhaustive Events. Two or more events are said to be exhaustive if
at least one of them necessarily occurs, or in other words the events
$A_1, A_2, \ldots$ are exhaustive if $A_1 \cup A_2 \cup A_3 \cup \cdots = S$.
For example, in the experiment of throwing two coins once, the events
$A_1 = \{HH\}$, $A_2 = \{TT\}$ and $A_3 = \{HT, TH\}$ are exhaustive.
Elementary or Simple Event. An event containing exactly one sample
point is called an Elementary Event.

For example, $A_1 = \{2\}$, $A_2 = \{5\}$, $A_3 = \{3\}$ etc. are simple events of the
experiment of rolling a die.
Composite Event. An event which can be decomposed into simple
events, i.e. which can be expressed as the union of two or more simple events,
is called a Composite Event.

For example, in the experiment of 'rolling a die', $A_1 = \{2, 3, 4\}$, $A_2 = \{1, 5\}$
etc. are composite events.
Equally Likely Sample Points. The sample points of a sample space
are said to be equally likely if none of them can be expected in preference
to any other.
1.3. Classical Definition of Probability.
Let us suppose that a random experiment $E$ is such that its sample
space $S$ contains a finite number $n(S)$ of sample points, all of which are
equally likely. Then the probability of an event $A$ which contains $n(A)$
sample points is defined by

$P(A) = \frac{n(A)}{n(S)}$ ... (1)

Illustration. A perfect die is rolled once and it is observed whether an odd
number appears. If $A$ denotes this event, then $A = \{1, 3, 5\}$ and the sample
space is $S = \{1, 2, 3, 4, 5, 6\}$.

$n(A) = 3$ and $n(S) = 6$

$\therefore P(A) = \frac{n(A)}{n(S)} = \frac{3}{6} = \frac{1}{2}$
This tells us that the chance of an 'odd face' occurring is one in two throws.
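The counting in this illustration can be reproduced in a few lines of Python. This is only a minimal sketch of formula (1); the helper name `classical_probability` is an assumption made for illustration and does not come from the text.

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """Return n(A)/n(S) for a finite sample space of equally likely points."""
    sample_space = set(sample_space)
    event = set(event) & sample_space  # only points of S can belong to A
    return Fraction(len(event), len(sample_space))

S = {1, 2, 3, 4, 5, 6}              # one roll of a perfect die
A = {x for x in S if x % 2 == 1}    # the event 'odd face'
p_odd = classical_probability(A, S)
print(p_odd)  # 1/2
```

Exact fractions are used instead of floating point so that the result matches the hand computation digit for digit.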
Criticism of Classical Definition.
(i) To define probability, we first presume that the sample points
are equally likely, which means equally probable, i.e., the probability of each
sample point is the same. So this definition is circular.
(ii) This definition does not provide any criterion for deciding whether
the possible outcomes of an experiment are equally likely.
(iii) In many experiments the number of possible outcomes is infinite, and
so this definition is not suitable in those cases.
(iv) This definition can be used only in very simple and unimportant
cases like games of chance.
(v) In some complicated problems, counting the possible
outcomes and the favourable cases is difficult, for example the sex of a newly
born child and the throw of an untrue coin.
1.4. Theorems on Probability.
Some important properties of Probability are presented as the following
theorems. Proofs are given considering the Frequency definition.
Theorem 1. $0 \le P(A) \le 1$ for any event $A$.
Proof: Let a random experiment $E$ be repeated $n$ times under identical
conditions and let $A$ be an event which occurs $n(A)$ times. Then we have
$0 \le n(A) \le n$ or, $0 \le \frac{n(A)}{n} \le 1$
$\therefore 0 \le P(A) \le 1$.
Theorem 2. $P(S) = 1$ and $P(\phi) = 0$, where $S$ is the certain event and $\phi$ is
the impossible event.
Proof: Now $P(S) = \frac{n(S)}{n} = \frac{n}{n} = 1$

$P(\phi) = \frac{n(\phi)}{n} = \frac{0}{n} = 0$.
Theorem 3. If $A_1$ and $A_2$ be two mutually exclusive events, then
$P(A_1 \cup A_2) = P(A_1) + P(A_2)$.
Proof: Let a random experiment $E$ be repeated $n$ times under identical
conditions and let $A_1$, $A_2$ be two events which occur $n(A_1)$ and $n(A_2)$
times. Since $A_1$ and $A_2$ are mutually exclusive,
$n(A_1 \cup A_2) = n(A_1) + n(A_2)$
$\therefore \frac{n(A_1 \cup A_2)}{n} = \frac{n(A_1)}{n} + \frac{n(A_2)}{n}$
$\therefore P(A_1 \cup A_2) = P(A_1) + P(A_2)$
Theorem 4. If the events $A_1, A_2, \ldots, A_n$ are mutually exclusive then
$P(A_1 \cup A_2 \cup \cdots \cup A_n) = P(A_1) + P(A_2) + \cdots + P(A_n)$

i.e., $P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i=1}^{n} P(A_i)$

Proof: Left as an exercise.
(Proceeding as in Theorem 3 and using induction, the theorem can be proved.)
Theorem 5. (Total Probability Theorem). For any two events $A_1$
and $A_2$ (may not be mutually exclusive),
$P(A_1 \cup A_2) = P(A_1) + P(A_2) - P(A_1 \cap A_2)$ [W.B.U.Tech 2007]
Proof: $P(A_1 \cup A_2) = P(A_1 \cup (\bar{A}_1 \cap A_2))$, where $\bar{A}_1$ is the complement of $A_1$
$= P(A_1) + P(\bar{A}_1 \cap A_2)$, since $A_1$ and $\bar{A}_1 \cap A_2$ are m.e.
Again, $A_2 = (A_1 \cap A_2) \cup (\bar{A}_1 \cap A_2)$
$\therefore P(A_2) = P((A_1 \cap A_2) \cup (\bar{A}_1 \cap A_2))$
$= P(A_1 \cap A_2) + P(\bar{A}_1 \cap A_2)$, since $A_1 \cap A_2$ and $\bar{A}_1 \cap A_2$ are m.e.
or, $P(\bar{A}_1 \cap A_2) = P(A_2) - P(A_1 \cap A_2)$

Therefore, from above, $P(A_1 \cup A_2) = P(A_1) + P(A_2) - P(A_1 \cap A_2)$

Theorem 6. If $A_1, A_2, A_3$ are any three events (not necessarily m.e.) then
$P(A_1 \cup A_2 \cup A_3) = P(A_1) + P(A_2) + P(A_3) - P(A_1 \cap A_2)$
$- P(A_2 \cap A_3) - P(A_3 \cap A_1) + P(A_1 \cap A_2 \cap A_3)$
Proof: Left as an exercise.
Theorem 7. If $A_1, A_2, A_3, \ldots, A_n$ are any $n$ events then

$P(A_1 \cup A_2 \cup A_3 \cup \cdots \cup A_n) = \sum_{i} P(A_i) - \sum_{i<j} P(A_i \cap A_j)$

$+ \sum_{i<j<k} P(A_i \cap A_j \cap A_k) - \cdots + (-1)^{n-1} P(A_1 \cap A_2 \cap \cdots \cap A_n)$

Proof: Omitted.
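The general addition theorem for $n$ events, written with probabilities of intersections and alternating signs, can be checked numerically. The following is a sketch under stated assumptions: the sample space (the integers 1 to 12, equally likely) and the three events are hypothetical choices made only for this check, and the names `prob` and `union_by_inclusion_exclusion` are illustrative.

```python
from fractions import Fraction
from itertools import combinations

S = set(range(1, 13))  # hypothetical sample space of 12 equally likely points
events = [
    {n for n in S if n % 2 == 0},  # even numbers
    {n for n in S if n % 3 == 0},  # multiples of three
    {n for n in S if n > 8},       # numbers greater than 8
]

def prob(event):
    return Fraction(len(event), len(S))

def union_by_inclusion_exclusion(events):
    """Sum over k of (-1)^(k-1) times the sum of P(intersection of k events)."""
    total = Fraction(0)
    for k in range(1, len(events) + 1):
        sign = (-1) ** (k - 1)
        for group in combinations(events, k):
            total += sign * prob(set.intersection(*group))
    return total

direct = prob(set.union(*events))
via_formula = union_by_inclusion_exclusion(events)
print(direct, via_formula)  # 3/4 3/4
```

With two events the loop reduces to Theorem 5, and with three events to Theorem 6.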
Theorem 8. For any event $A$, $P(\bar{A}) = 1 - P(A)$, where $\bar{A}$ is the
complementary event of $A$.
Proof: We have that $A$ and $\bar{A}$ are mutually exclusive events and
$A \cup \bar{A} = S$, the certain event.

$\therefore P(A \cup \bar{A}) = P(S)$
or, $P(A) + P(\bar{A}) = 1$, since $A$ and $\bar{A}$ are mutually exclusive.

$\therefore P(\bar{A}) = 1 - P(A)$.
Theorem 9. If $A_1, A_2, \ldots, A_n$ are mutually exclusive and exhaustive
events, then $\sum_{i=1}^{n} P(A_i) = 1$.

Proof: Since the events are exhaustive, $\bigcup_{i=1}^{n} A_i = S$, the certain event.

Hence by Theorem 4 we have

$P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i=1}^{n} P(A_i)$

i.e., $\sum_{i=1}^{n} P(A_i) = P(S)$ $\therefore \sum_{i=1}^{n} P(A_i) = 1$, by Theorem 2.

Theorem 10. For any two events $A_1$ and $A_2$ where $A_1 \subset A_2$,

(i) $P(A_1) \le P(A_2)$ (ii) $P(A_2 - A_1) = P(A_2) - P(A_1)$

Proof:
(i) $A_1 \subset A_2 \Rightarrow n(A_1) \le n(A_2) \Rightarrow \frac{n(A_1)}{n} \le \frac{n(A_2)}{n} \Rightarrow P(A_1) \le P(A_2)$
1.5. Axiomatic definition of probability. [W.B.U.Tech 2005]

Let $E$ be a random experiment and $S$ be its sample space; let $\Sigma$ be the
class of all events (i.e., subsets of $S$). Let $P$ be a function from $\Sigma$ to the
set of all real numbers satisfying the following axioms:
Axiom I. $P(A) \ge 0$, for every event $A$ in $\Sigma$.
Axiom II. $P(S) = 1$.
Axiom III. If $A_1, A_2, \ldots$ be a finite or infinite sequence of pairwise
mutually exclusive events, then
$P(A_1 \cup A_2 \cup \cdots) = P(A_1) + P(A_2) + \cdots$
Then for any event $A$ the real number $P(A)$ is called its probability.
The function $P: \Sigma \to R$ is called the Probability Function.

The collection $\{S, \Sigma, P\}$ is called a Probability Space of the experiment $E$.

Illustration. Let a biased coin be tossed.

The event space is $S = \{H, T\}$.

It has four events: $\{H\}$, $\{T\}$, $\phi$ and $S$ itself.

Here $\Sigma = \{\{H\}, \{T\}, \phi, S\}$.

We define $P: \Sigma \to R$ such that $P(\{H\}) = 1/3$, $P(\{T\}) = 2/3$, $P(\phi) = 0$,
$P(\{H, T\}) = 1$.

Then we see that $P(A) \ge 0$ for every event $A$, i.e., Axiom I is satisfied.

$P(S) = P(\{H, T\}) = 1$, i.e., Axiom II is satisfied. Consider the two m.e.
events $\{H\}$ and $\{T\}$.

Then $\{H\} \cup \{T\} = \{H, T\}$
$\therefore P(\{H\} \cup \{T\}) = P(\{H, T\}) = 1 = \frac{1}{3} + \frac{2}{3} = P(\{H\}) + P(\{T\})$,
i.e., Axiom III is satisfied.
Thus this function $P$ represents a probability. In particular, we can say that the
probability of Head is $\frac{1}{3}$.
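The three axioms can be verified mechanically for the biased coin. The sketch below is one possible way to do it, assuming the probability function is built by adding the point masses 1/3 and 2/3; the names `point_mass`, `P` and `events` are illustrative and not part of the text.

```python
from fractions import Fraction
from itertools import combinations

S = frozenset({"H", "T"})
point_mass = {"H": Fraction(1, 3), "T": Fraction(2, 3)}  # the biased coin

def P(event):
    """Probability of an event as the sum of its point masses."""
    return sum((point_mass[x] for x in event), Fraction(0))

# The class of all events: every subset of S (here 4 of them).
events = [frozenset(c) for r in range(len(S) + 1)
          for c in combinations(sorted(S), r)]

axiom_1 = all(P(A) >= 0 for A in events)
axiom_2 = P(S) == 1
# Axiom III is checked for every pair of mutually exclusive events.
axiom_3 = all(P(A | B) == P(A) + P(B)
              for A in events for B in events if not (A & B))
print(axiom_1, axiom_2, axiom_3)  # True True True
```

On a finite sample space, checking Axiom III for pairs is enough, because longer finite unions follow by induction.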
All the Theorems which were deduced for the Frequency definition
of probability can be deduced for the Axiomatic definition also. Some of them
are shown below.
Some Deductions from the Axioms:

(i) For any $A \subset S$, $P(A) \le 1$
Let $\bar{A} = S - A$. Then $A$, $\bar{A}$ are mutually exclusive and $A \cup \bar{A} = S$.
$\therefore P(A \cup \bar{A}) = P(S)$
or, $P(A) + P(\bar{A}) = 1$, by Axioms II and III
or, $P(A) = 1 - P(\bar{A})$ ... (1)

Again by Axiom I, $P(\bar{A}) \ge 0$

Hence, $P(A) = 1 - P(\bar{A}) \le 1$
(ii) $P(\phi) = 0$

Since $\bar{\phi} = S$, from (1)

$P(\phi) = 1 - P(\bar{\phi}) = 1 - P(S) = 1 - 1$, by Axiom II

$= 0$
(iii) For any two events $A_1, A_2 \subset S$ with $A_1 \subset A_2$,

we have $P(A_1) \le P(A_2)$

and $P(A_2 - A_1) = P(A_2) - P(A_1)$
Since $A_1 \subseteq A_2$, $A_2 = A_1 \cup (A_2 - A_1)$.
Also, since $A_1$ and $A_2 - A_1$ are mutually
exclusive, by Axiom III we have
$P(A_2) = P(A_1) + P(A_2 - A_1)$
$\therefore P(A_2 - A_1) = P(A_2) - P(A_1)$
By Axiom I, $P(A_2 - A_1) \ge 0$ $\therefore P(A_2) - P(A_1) \ge 0$
i.e., $P(A_2) \ge P(A_1)$.
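(iv) (Boole's inequality)
For $n$ events $A_1, A_2, \ldots, A_n$,

$P\left(\bigcup_{i=1}^{n} A_i\right) \le \sum_{i=1}^{n} P(A_i)$

Proof: For $n = 1$, the theorem is obviously true. We have

$P(A_1 \cup A_2) = P(A_1) + P(A_2) - P(A_1 \cap A_2)$

$\therefore P(A_1 \cup A_2) \le P(A_1) + P(A_2)$ [$\because P(A_1 \cap A_2) \ge 0$] ... (1)

Thus the theorem is true for $n = 2$.

Let the theorem hold for $n = m$:

$P\left(\bigcup_{i=1}^{m} A_i\right) \le \sum_{i=1}^{m} P(A_i)$ ... (2)

Now $P\left(\left(\bigcup_{i=1}^{m} A_i\right) \cup A_{m+1}\right) = P\left(\bigcup_{i=1}^{m} A_i\right) + P(A_{m+1}) - P\left(\left(\bigcup_{i=1}^{m} A_i\right) \cap A_{m+1}\right)$

or, $P\left(\bigcup_{i=1}^{m+1} A_i\right) \le P\left(\bigcup_{i=1}^{m} A_i\right) + P(A_{m+1})$, since $P\left(\left(\bigcup_{i=1}^{m} A_i\right) \cap A_{m+1}\right) \ge 0$ by Axiom I,

$\le \sum_{i=1}^{m} P(A_i) + P(A_{m+1})$, by (2)

$\therefore P\left(\bigcup_{i=1}^{m+1} A_i\right) \le \sum_{i=1}^{m+1} P(A_i)$.
Thus the theorem holds for $n = m + 1$ whenever it holds for $n = m$.
But the theorem holds for $n = 1, 2$. Hence by induction, the theorem is
true for every positive integral value of $n$.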
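(v) For any events $A_1, A_2, \ldots, A_n$,

$P(A_1 \cap A_2 \cap \cdots \cap A_n) \ge 1 - \sum_{i=1}^{n} P(\bar{A}_i)$

Proof: By Boole's inequality applied to the complementary events we have

$P(\bar{A}_1 + \bar{A}_2 + \cdots + \bar{A}_n) \le P(\bar{A}_1) + P(\bar{A}_2) + \cdots + P(\bar{A}_n)$

or, $P(\overline{A_1 \cap A_2 \cap \cdots \cap A_n}) \le \sum_{i=1}^{n} P(\bar{A}_i)$, by De Morgan's law.

or, $1 - P(A_1 \cap A_2 \cap \cdots \cap A_n) \le \sum_{i=1}^{n} P(\bar{A}_i)$

[$\because$ for any event $A$, $P(\bar{A}) = 1 - P(A)$]

$\therefore P(A_1 \cap A_2 \cap \cdots \cap A_n) \ge 1 - \sum_{i=1}^{n} P(\bar{A}_i)$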
1.6. Conditional Probability.
We consider two events $A$ and $B$ connected with a random experiment
$E$. Suppose that in $n$ repetitions of the experiment $E$ the event $A$ has
occurred $n(A)$ times and $B$ has occurred simultaneously with $A$
$n(A \cap B)$ times. Then the ratio $\frac{n(A \cap B)}{n(A)}$ is called the conditional
probability of $B$ on the hypothesis that $A$ has already occurred and is
denoted by $P(B/A)$.
$P(B/A) = \frac{n(A \cap B)}{n(A)} = \frac{n(A \cap B)/n}{n(A)/n} = \frac{P(A \cap B)}{P(A)}$, provided $P(A) \ne 0$
Similarly, the conditional probability of $A$ on the hypothesis that $B$ has
already occurred is $P(A/B) = \frac{P(A \cap B)}{P(B)}$, provided $P(B) \ne 0$.

Illustration: Let one card be drawn from a full pack. $A$ = 'spade',
$B$ = 'king'.
$\therefore$ Probability of King supposing Spade occurs $= P(B/A) = \frac{n(A \cap B)}{n(A)} = \frac{1}{13}$
Multiplication Rule of Probability.
Thus we have
$P(A \cap B) = P(A)\,P(B/A) = P(B)\,P(A/B)$ ... (1)
if $P(A) \ne 0$, $P(B) \ne 0$
Theorem. (Generalization of the Multiplication Rule)
For $n$ events $A_1, A_2, \ldots, A_n$,

$P(A_1 \cap A_2 \cap \cdots \cap A_n) = P(A_1)\,P(A_2/A_1)\,P(A_3/(A_1 \cap A_2)) \cdots P\left(A_n \Big/ \bigcap_{i=1}^{n-1} A_i\right)$

provided $P\left(\bigcap_{i=1}^{n-1} A_i\right) \ne 0$.
Proof: Left as an exercise.
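The conditional probability in the card illustration and the multiplication rule (1) can be reproduced by enumerating a pack of 52 cards. This is a minimal sketch; the representation of a card as a (rank, suit) pair and the helper names `prob` and `cond` are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spade", "heart", "diamond", "club"]
deck = set(product(ranks, suits))          # 52 equally likely cards

A = {c for c in deck if c[1] == "spade"}   # the event 'spade'
B = {c for c in deck if c[0] == "K"}       # the event 'king'

def prob(event):
    return Fraction(len(event), len(deck))

def cond(b, a):
    """P(B/A) = P(A n B) / P(A), defined only when P(A) != 0."""
    if not a:
        raise ValueError("conditioning event has probability zero")
    return prob(a & b) / prob(a)

p_king_given_spade = cond(B, A)
print(p_king_given_spade)                          # 1/13
print(prob(A & B) == prob(A) * cond(B, A))         # True
print(prob(A & B) == prob(B) * cond(A, B))         # True
```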
1.7. Independent Events.

If for two events $A$ and $B$, $P(A/B) = P(A)$ [i.e., the chance of
occurrence of the event $A$ is not affected by the occurrence of the event
$B$], the event $A$ is said to be independent of the event $B$.
The following theorem is the most important characterisation of the
independence of two events.
Theorem. Two events $A$ and $B$ are stochastically independent or
statistically independent or independent if and only if
$P(A \cap B) = P(A) \cdot P(B)$
Proof: $P(A/B) = P(A) \Leftrightarrow \frac{P(A \cap B)}{P(B)} = P(A) \Leftrightarrow P(A \cap B) = P(A)\,P(B)$
Hence the theorem.
Note: When $P(A \cap B) \ne P(A) \cdot P(B)$, the events $A$ and $B$ are said to
be dependent.
Mutually Independent Events. The $n$ events $A_1, A_2, \ldots, A_n$ are said
to be mutually independent if the following conditions are satisfied:
$P(A_i \cap A_j) = P(A_i)\,P(A_j)$ for all $i \ne j$ ; $i, j = 1, 2, \ldots, n$
$P(A_i \cap A_j \cap A_k) = P(A_i)\,P(A_j)\,P(A_k)$ for all $i \ne j \ne k$
and so on, up to
$P(A_1 \cap A_2 \cap \cdots \cap A_n) = P(A_1)\,P(A_2) \cdots P(A_n)$

Pairwise Independent Events: The $n$ events $A_1, A_2, \ldots, A_n$ are said
to be pairwise independent if

$P(A_i \cap A_j) = P(A_i)\,P(A_j)$ for all $i \ne j$ ; $i, j = 1, 2, \ldots, n$


Illustration. Let one card be drawn from a full pack. $A$ = 'spade',
$B$ = 'king'. Then $P(B/A) = \frac{n(B \cap A)}{n(A)} = \frac{1}{13}$.

Again $P(B) = \frac{n(B)}{n(S)} = \frac{4}{52} = \frac{1}{13}$.

Thus $P(B/A) = P(B)$. So in this experiment 'king' and 'spade' are
independent.
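Pairwise independence is weaker than mutual independence. The sketch below uses a standard example that is not taken from the text: two fair coins are tossed, with $A$ = 'first coin shows head', $B$ = 'second coin shows head' and $C$ = 'both coins show the same face'. The helper name `product_rule_holds` is an illustrative assumption.

```python
from fractions import Fraction
from itertools import combinations

S = {"HH", "HT", "TH", "TT"}   # two fair coins, equally likely outcomes
A = {"HH", "HT"}               # first coin shows head
B = {"HH", "TH"}               # second coin shows head
C = {"HH", "TT"}               # both coins show the same face

def prob(event):
    return Fraction(len(event), len(S))

def product_rule_holds(events):
    """True if P(intersection) equals the product of the probabilities."""
    joint = set(S).intersection(*events)
    expected = Fraction(1)
    for e in events:
        expected *= prob(e)
    return prob(joint) == expected

pairwise = all(product_rule_holds(pair) for pair in combinations([A, B, C], 2))
mutual = pairwise and product_rule_holds((A, B, C))
print(pairwise, mutual)  # True False
```

Every pair satisfies the product rule, but $P(A \cap B \cap C) = 1/4$ while $P(A)P(B)P(C) = 1/8$, so the three events are not mutually independent.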
1.8. Bayes' Theorem. If $A_1, A_2, \ldots, A_n$ be a given set of $n$ pairwise
mutually exclusive and exhaustive events, then for any event $A$ where
$P(A) \ne 0$,

(i) $P(A) = P(A_1)P(A/A_1) + P(A_2)P(A/A_2) + \cdots + P(A_n)P(A/A_n) = \sum_{i=1}^{n} P(A_i)\,P(A/A_i)$

(ii) $P(A_i/A) = \frac{P(A_i)\,P(A/A_i)}{P(A)}$, for $i = 1, 2, \ldots, n$ [W.B.U.Tech. 2008]

Proof: Beyond the scope of the book.

Illustration. There are three urns. The first urn contains 3 red and 4 black balls;
the second urn contains 6 black and 2 red balls; the third urn contains 3 black
balls and 5 red balls. One urn is chosen at random and then a ball is drawn from
that urn. Let $A_1$ = '1st urn is chosen', $A_2$ = '2nd urn is chosen', $A_3$ = '3rd urn
is chosen'; $A$ = 'the ball is red'.
According to Bayes' theorem

$P(A) = P(A_1)P(A/A_1) + P(A_2)P(A/A_2) + P(A_3)P(A/A_3)$

$= \frac{1}{3} \cdot \frac{3}{7} + \frac{1}{3} \cdot \frac{2}{8} + \frac{1}{3} \cdot \frac{5}{8} = \frac{1}{7} + \frac{1}{12} + \frac{5}{24} = \frac{73}{168}$
On the other hand, the probability that the 3rd urn was chosen,
supposing that the ball is red,

$= P(A_3/A) = \frac{P(A_3)\,P(A/A_3)}{P(A)} = \frac{\frac{1}{3} \cdot \frac{5}{8}}{\frac{73}{168}} = \frac{5}{24} \times \frac{168}{73} = \frac{35}{73}$.
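The urn computation can be reproduced with exact fractions. The sketch below assumes, as the worked figures do, that each urn is chosen with probability 1/3; the variable names are illustrative.

```python
from fractions import Fraction

# Contents of the three urns as given in the illustration.
urns = [
    {"red": 3, "black": 4},
    {"red": 2, "black": 6},
    {"red": 5, "black": 3},
]
prior = Fraction(1, len(urns))  # assumption: each urn is equally likely

# P(A/A_i): probability of drawing a red ball from urn i.
likelihoods = [Fraction(u["red"], u["red"] + u["black"]) for u in urns]

# Part (i) of Bayes' theorem: P(A) = sum of P(A_i) P(A/A_i).
p_red = sum(prior * l for l in likelihoods)

# Part (ii): P(A_i/A) = P(A_i) P(A/A_i) / P(A).
posterior = [prior * l / p_red for l in likelihoods]

print(p_red)          # 73/168
print(posterior[2])   # 35/73
```

The three posterior probabilities add up to 1, which is a convenient check on the arithmetic.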
1.9. Illustrative Examples.
Example 1. What is the chance that a leap year selected at random will
contain 53 Wednesdays? [W.B.U.Tech 2002]
A leap year contains 366 days, that is, 52 full weeks and two extra days.
The two extra days will be either (i) Sunday, Monday or (ii) Monday,
Tuesday or (iii) Tuesday, Wednesday or (iv) Wednesday, Thursday or (v)
Thursday, Friday or (vi) Friday, Saturday or (vii) Saturday, Sunday.
So a leap year will contain 53 Wednesdays if one of the two extra
days is a Wednesday. Therefore, out of the above seven cases, two are
favourable.
Hence the required probability is $\frac{2}{7}$.
Example 2. The integers $x$ and $y$ are chosen at random with
replacement from the nine natural numbers $1, 2, \ldots, 8, 9$. Find
the probability that $(x^2 - y^2)$ is divisible by 2.
$x^2 - y^2 = (x - y)(x + y)$ will be divisible by 2 iff $x, y$ are either both
even or both odd. Now two even numbers can be chosen from $\{1, 2, \ldots, 9\}$
with replacement in $4 \times 4$ ways. Similarly, two odd numbers can be selected
in $5 \times 5$ ways. So the total number of favourable cases is $16 + 25 = 41$.
Again, two integers $x, y$ can be chosen at random with replacement
from $\{1, 2, \ldots, 9\}$ in $9 \times 9 = 81$ ways.
Hence the required probability is $\frac{41}{81}$.
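The count of favourable cases in Example 2 is small enough to confirm by listing all 81 ordered pairs. The following is a brute-force sketch; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

numbers = range(1, 10)                      # the natural numbers 1, 2, ..., 9
pairs = list(product(numbers, repeat=2))    # ordered pairs, with replacement

# Keep the pairs for which x^2 - y^2 is divisible by 2.
favourable = [(x, y) for x, y in pairs if (x * x - y * y) % 2 == 0]

p = Fraction(len(favourable), len(pairs))
print(len(favourable), len(pairs), p)  # 41 81 41/81
```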
Example 3. Given $P(A) = \frac{1}{2}$, $P(B) = \frac{1}{3}$, $P(AB) = \frac{1}{4}$ ($AB$ means $A \cap B$),
(a) find the values of the following probabilities:
$P(\bar{A})$, $P(A \cup B)$, $P(A/B)$, $P(\bar{A}B)$, $P(\bar{A}\bar{B})$, $P(\bar{A} \cup B)$

(b) State whether the events $A$ and $B$ are
(i) mutually exclusive (ii) exhaustive (iii) equally likely (iv) independent.

(a) $P(\bar{A}) = 1 - P(A) = 1 - \frac{1}{2} = \frac{1}{2}$

$P(A \cup B) = P(A) + P(B) - P(AB) = \frac{1}{2} + \frac{1}{3} - \frac{1}{4} = \frac{7}{12}$

$P(A/B) = \frac{P(AB)}{P(B)} = \frac{1/4}{1/3} = \frac{3}{4}$

$P(\bar{A}B) = P\{(S - A)B\} = P(SB - AB)$
$= P(B - AB) = P(B) - P(AB) = \frac{1}{3} - \frac{1}{4} = \frac{1}{12}$

$P(\bar{A}\bar{B}) = 1 - P(A \cup B) = 1 - \frac{7}{12} = \frac{5}{12}$

$P(\bar{A} \cup B) = P(\bar{A}) + P(B) - P(\bar{A}B) = \frac{1}{2} + \frac{1}{3} - \frac{1}{12} = \frac{3}{4}$.

(b) (i) No; because $P(AB) = \frac{1}{4} \ne 0$, i.e., $A \cap B \ne \phi$

(ii) No; because $P(A \cup B) = \frac{7}{12} \ne 1$, i.e., $A \cup B \ne S$

(iii) No; because $P(A) = \frac{1}{2}$, $P(B) = \frac{1}{3}$, i.e., $P(A) \ne P(B)$

(iv) No; because $P(AB) \ne P(A) \cdot P(B)$; here
$P(AB) = \frac{1}{4}$ but $P(A) \cdot P(B) = \frac{1}{6}$.
Example 4. In an examination 30% of the students failed in Physics, 25% in Mathematics and 12% in both Physics and Mathematics. A student is selected at random. Find the probability that
(i) the student has failed in Physics, if it is known that he has failed in Mathematics,
(ii) the student has failed in at least one of the two subjects,
(iii) the student has passed in at least one of the two subjects,
(iv) the student has passed in Mathematics if he failed in Physics.
Let A and B denote the events "a student failed in Physics" and "a student failed in Mathematics" respectively. Then
$P(A) = 0.30$, $P(B) = 0.25$, $P(A \cap B) = 0.12$.
Now (i) the probability that a student has failed in Physics if it is known that he has failed in Mathematics is
$P(A/B) = \frac{P(A \cap B)}{P(B)} = \frac{0.12}{0.25} = 0.48$.
(ii) The probability that a student has failed in at least one of the subjects is
$P(A \cup B) = P(A) + P(B) - P(A \cap B) = 0.30 + 0.25 - 0.12 = 0.43$.
(iii) $\bar A$ = the student passed in Physics, $\bar B$ = the student passed in Mathematics. Then the probability that the student has passed in at least one of the subjects is
$P(\bar A \cup \bar B) = P(\overline{A \cap B}) = 1 - P(A \cap B)$ [by De Morgan's law] $= 1 - 0.12 = 0.88$
and (iv) the probability that the student has passed in Mathematics if he failed in Physics is
$P(\bar B/A) = 1 - P(B/A) = 1 - \frac{P(AB)}{P(A)} = 1 - \frac{0.12}{0.30} = 0.60$.
Example 5. Two urns contain respectively 2 red, 5 black, 7 green and 1 red, 4 black, 9 green balls. One ball is drawn from each urn. Find the probability that both the balls are of the same colour.
Let $A_1$, $A_2$, $A_3$ be the events that both drawn balls are red, black and green respectively.
Then the required event is $A_1 \cup A_2 \cup A_3$, where $A_1$, $A_2$, $A_3$ are pairwise exclusive.
∴ $P(A_1 \cup A_2 \cup A_3) = P(A_1) + P(A_2) + P(A_3)$
Now, $P(A_1) = \frac2{14} \times \frac1{14} = \frac2{196}$
$P(A_2) = \frac5{14} \times \frac4{14} = \frac{20}{196}$
$P(A_3) = \frac7{14} \times \frac9{14} = \frac{63}{196}$
So the required probability is $\frac2{196} + \frac{20}{196} + \frac{63}{196} = \frac{85}{196}$.

Example 6. Two cards are drawn from a well-shuffled pack. Find the probability that at least one of them is a diamond.
Let A be the event that at least one of the drawn cards is a diamond. Then $\bar A$ is the event that neither of the drawn cards is a diamond.
∴ $P(\bar A) = \frac{^{39}C_2}{^{52}C_2} = \frac{19}{34}$
∴ $P(A) = 1 - P(\bar A) = 1 - \frac{19}{34} = \frac{15}{34}$
So, the required probability is $\frac{15}{34}$.

Example 7. A card is drawn at random from an ordinary deck of 52 playing cards. Find the probability that it is (i) an ace (ii) a heart (iii) a nine or a club (iv) neither a spade nor a ten.
Let H, D, C and S be the events that the drawn card is a heart, a diamond, a club and a spade respectively. Also let us use the numbers 1, 2, 3, ..., 10 for ace, two, three, ..., ten respectively.
Then (i) $P(\text{an ace}) = P(1) = \frac{^4C_1}{^{52}C_1} = \frac4{52} = \frac1{13}$
(ii) $P(\text{a heart}) = P(H) = \frac{^{13}C_1}{^{52}C_1} = \frac{13}{52} = \frac14$
(iii) $P(\text{a nine or a club}) = P(9 \cup C) = P(9) + P(C) - P(9 \cap C) = \frac{^4C_1}{^{52}C_1} + \frac{^{13}C_1}{^{52}C_1} - \frac1{52} = \frac4{52} + \frac{13}{52} - \frac1{52} = \frac4{13}$
(iv) $P(\text{neither a spade nor a ten}) = P(\bar S \cap \overline{10}) = P(\overline{S \cup 10}) = 1 - P(S \cup 10)$
$= 1 - \{P(S) + P(10) - P(S \cap 10)\} = 1 - \left(\frac{13}{52} + \frac4{52} - \frac1{52}\right) = 1 - \frac4{13} = \frac9{13}$.
Example 8. Two dice are thrown n times in succession. What is the probability of obtaining a double six at least once? Hence find the minimum number of throws so that the probability of obtaining a double six at least once is greater than $\frac12$.
Let A be the event that there is a double six at least once in n throws of two dice in succession. Then $A'$ is the event that there is no double six in n throws of two dice. So
$P(A') = \left(\frac{35}{36}\right)^n$ ∴ $P(A) = 1 - P(A') = 1 - \left(\frac{35}{36}\right)^n$.
So the required probability is $1 - \left(\frac{35}{36}\right)^n$.
Again, when $P(A) > \frac12$, then $1 - \left(\frac{35}{36}\right)^n > \frac12$
or, $\left(\frac{35}{36}\right)^n < \frac12$ or, $n \log\left(\frac{35}{36}\right) < -\log 2$
or, $n > \frac{\log 2}{\log 36 - \log 35} \approx 24.6$ ∴ $n \geq 25$.
Hence the minimum number of throws is 25.
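The threshold can also be found by direct search with exact arithmetic. This is a small sketch; the function name `p_double_six` is an illustrative choice, and the search looks for the first n at which the probability exceeds 1/2.

```python
from fractions import Fraction

def p_double_six(n):
    """Probability of at least one double six in n throws of two dice."""
    return 1 - Fraction(35, 36) ** n

n = 1
while p_double_six(n) <= Fraction(1, 2):   # stop at the first n with P(A) > 1/2
    n += 1
print(n, float(p_double_six(n - 1)), float(p_double_six(n)))
```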
Example 9. Show that the probability of occurrence of only one of the events A and B is $P(A) + P(B) - 2P(AB)$. [W.B.U.Tech 2006]
Let C be the event of occurrence of only one of the events A and B. Then
$C = (A \cup B) - (A \cap B) = (A - B) \cup (B - A)$
∴ $P(C) = P(A - B) + P(B - A)$ ... (1)
[∵ $A - B$ and $B - A$ are disjoint]
Now $A = (A - B) + AB$
∴ $P(A) = P(A - B) + P(AB)$
∴ $P(A - B) = P(A) - P(AB)$ ... (2)
Again $B = (B - A) + AB$
∴ $P(B) = P(B - A) + P(AB)$
∴ $P(B - A) = P(B) - P(AB)$ ... (3)
In virtue of (1), (2) and (3) we get
$P(C) = P(A) - P(AB) + P(B) - P(AB) = P(A) + P(B) - 2P(AB)$.
Example 10. If A and B are independent events, then show that the following pairs are independent:
(i) $\bar A$ and $\bar B$ [W.B.U.Tech 2002, 2005, 2007]
(ii) $A$ and $\bar B$ (iii) $\bar A$ and $B$.
Since A and B are independent, so
$P(A \cap B) = P(A)P(B)$ ... (1)
(i) Now $P(\bar A \cap \bar B) = P(\overline{A \cup B})$, by De Morgan's law
$= 1 - P(A \cup B) = 1 - [P(A) + P(B) - P(A \cap B)]$
$= 1 - P(A) - P(B) + P(A \cap B) = 1 - P(A) - P(B) + P(A) \cdot P(B)$, by (1)
$= (1 - P(A))(1 - P(B)) = P(\bar A)P(\bar B)$.
∴ $\bar A$ and $\bar B$ are independent.
(ii) Again $A \cap B$ and $A \cap \bar B$ are mutually exclusive and
$(A \cap B) \cup (A \cap \bar B) = A$
∴ $P(A) = P(A \cap B) + P(A \cap \bar B)$
∴ $P(A \cap \bar B) = P(A) - P(A \cap B) = P(A) - P(A) \cdot P(B)$, by (1)
$= P(A)(1 - P(B)) = P(A)P(\bar B)$
∴ $A$ and $\bar B$ are independent.
(iii) Taking $B = (A \cap B) \cup (\bar A \cap B)$, we can prove the result as in (ii).


Example 11. If A, B, C be mutually independent events, then prove that
(i) $A$ and $B + C$ are independent [W.B.U.Tech 2003]
(ii) $\bar A$, $\bar B$, $\bar C$ are mutually independent.
(i) $P\{A(B + C)\} = P(AB + AC) = P(AB) + P(AC) - P(ABC)$
$= P(A)P(B) + P(A)P(C) - P(A)P(BC)$
[∵ A, B, C are mutually independent]
$= P(A)[P(B) + P(C) - P(BC)] = P(A)P(B + C)$
Hence $A$ and $B + C$ are independent.
(ii) Now $\bar A$, $\bar B$ are independent as A, B are independent (by the previous example). Similarly $\bar B$, $\bar C$ and $\bar C$, $\bar A$ are independent. Also, since $A$ and $B + C$ are independent, so $\bar A$ and $\overline{B + C}$, i.e., $\bar A$ and $\bar B \bar C$, are independent
[∵ $\overline{B + C} = \bar B \bar C$]
∴ $P\{\bar A(\bar B \bar C)\} = P(\bar A)P(\bar B \bar C) = P(\bar A)P(\bar B)P(\bar C)$
Hence $\bar A$, $\bar B$, $\bar C$ are mutually independent.
Example 12. The face cards are removed from a pack of 52 cards. Then 4 cards are drawn one by one from the remaining 40 cards. What is the probability that the 4 cards belong to different suits and different denominations?
4 cards can be drawn out of 40 cards one by one in 40 × 39 × 38 × 37 ways. So the total number of possible outcomes is 40 × 39 × 38 × 37.
Now the four suits can appear in the four draws in 4! orders, and for each such order the denominations can be chosen, all different, in $^{10}C_1 \times {}^9C_1 \times {}^8C_1 \times {}^7C_1 = 10 \times 9 \times 8 \times 7$ ways.
So the total number of favourable cases is 4! × 10 × 9 × 8 × 7.
Hence the required prob. $= \frac{4! \times 10 \times 9 \times 8 \times 7}{40 \times 39 \times 38 \times 37} = \frac{504}{9139}$.
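A counting argument of this kind can be checked by brute force, since only 91390 unordered hands exist. The sketch below is illustrative: it encodes the 40 cards as hypothetical (suit, denomination) pairs and counts unordered hands, which gives the same probability as counting ordered draws.

```python
from fractions import Fraction
from itertools import combinations

# 40 cards: 4 suits, denominations 1 (ace) to 10; the face cards are removed.
deck = [(suit, rank) for suit in range(4) for rank in range(1, 11)]
hands = list(combinations(deck, 4))        # all unordered 4-card hands

def all_different(hand):
    """True when the four cards show 4 distinct suits and 4 distinct denominations."""
    suits = {s for s, _ in hand}
    ranks = {r for _, r in hand}
    return len(suits) == 4 and len(ranks) == 4

good = sum(1 for h in hands if all_different(h))
prob = Fraction(good, len(hands))
print(good, len(hands), prob)
```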
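Example 13. A pack of 2n cards, n of which are red and the other n black, is divided at random into two equal parts and a card is drawn from each. Find the probability that the cards drawn are of the same colour.
Let the 2n cards be divided in such a way that the first part contains k red cards and (n - k) black cards, where k = 0, 1, 2, ..., n. Then the 2nd part contains (n - k) red cards and k black cards.
Then the probability that both the drawn cards are black is $\frac{n-k}{n} \cdot \frac{k}{n}$ and that both are red is $\frac{k}{n} \cdot \frac{n-k}{n}$.
So, for a given k, the probability that both the drawn cards are of the same colour is
$\frac{n-k}{n} \cdot \frac{k}{n} + \frac{k}{n} \cdot \frac{n-k}{n} = \frac{2(n-k)k}{n^2}$.
The first part contains exactly k red cards with probability $\frac{^nC_k \cdot {}^nC_{n-k}}{^{2n}C_n}$, so by the theorem of total probability
the required probability $= \sum_{k=0}^{n} \frac{^nC_k \cdot {}^nC_{n-k}}{^{2n}C_n} \cdot \frac{2(n-k)k}{n^2} = \frac{n-1}{2n-1}$.
The same value follows directly: by symmetry the two drawn cards form a random pair out of the 2n cards, and such a pair is of one colour with probability $\frac{2 \cdot {}^nC_2}{^{2n}C_2} = \frac{n-1}{2n-1}$.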
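A quick exact check is possible under the assumption that every division of the pack into two parts of n cards is equally likely, so that the number k of red cards in the first part is hypergeometric. The function name `same_colour` below is hypothetical; the sketch weights the conditional probability 2k(n - k)/n² by the probability of each k and compares the result with (n - 1)/(2n - 1).

```python
from fractions import Fraction
from math import comb

def same_colour(n):
    """P(both drawn cards have the same colour) for a pack of n red and n black cards."""
    total = Fraction(0)
    for k in range(n + 1):                 # k = red cards in the first part
        weight = Fraction(comb(n, k) * comb(n, n - k), comb(2 * n, n))
        total += weight * Fraction(2 * k * (n - k), n * n)
    return total

for n in range(1, 8):
    print(n, same_colour(n), Fraction(n - 1, 2 * n - 1))
```

Since the result is a weighted average of conditional probabilities, it always lies between 0 and 1.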
Example 14. An urn contains a white and b black balls, from which k balls are drawn one by one and laid aside without noticing their colours. Then one more ball is drawn. Find the probability that it is white.
Here the total number of cases of drawing (k + 1) balls from (a + b) balls is $(a + b)(a + b - 1) \cdots (a + b - k)$.
Since the last drawn ball shall be white, we choose one white ball from the a white balls in a ways. Then k balls can be drawn from the remaining (a + b - 1) balls in $(a + b - 1)(a + b - 2) \cdots (a + b - k)$ ways.
So the total number of favourable cases, in which the last drawn ball is white in (k + 1) draws, is
$(a + b - 1)(a + b - 2) \cdots (a + b - k) \cdot a$.
Hence the required probability
$= \frac{(a + b - 1)(a + b - 2) \cdots (a + b - k) \cdot a}{(a + b)(a + b - 1) \cdots (a + b - k)} = \frac{a}{a + b}$.
Example 15. 15 new students are to be evenly distributed among 3 classes. Suppose that there are 3 whiz-kids among the fifteen. What is the probability that (i) each class gets one whiz-kid (ii) one class gets them all?
15 students can be evenly distributed among 3 classes in $^{15}C_5 \times {}^{10}C_5 \times {}^5C_5$ ways.
So the total number of distributions is $\frac{15!}{(5!)^3}$.
(i) We can allot one whiz-kid to each of the three classes in 3! ways. Then the other 12 students can be evenly distributed among the 3 classes in
$^{12}C_4 \times {}^8C_4 \times {}^4C_4$ ways $= \frac{12!}{(4!)^3}$ ways.
So the probability that each class gets one whiz-kid is
$\frac{3! \cdot \frac{12!}{(4!)^3}}{\frac{15!}{(5!)^3}} = \frac{25}{91}$.
(ii) We can allot all the three whiz-kids to one class in 3 ways and the remaining 12 students in $\frac{12!}{(5!)^2 \, 2!}$ ways.
So the prob. that one class gets all 3 whiz-kids is
$\frac{3 \times \frac{12!}{(5!)^2 \, 2!}}{\frac{15!}{(5!)^3}} = \frac{6}{91}$.
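The factorial expressions above are easy to evaluate exactly. A minimal sketch with illustrative variable names:

```python
from fractions import Fraction
from math import factorial as f

total = Fraction(f(15), f(5) ** 3)                  # even splits of 15 students into 3 classes
one_each = Fraction(f(3) * f(12), f(4) ** 3)        # one whiz-kid per class
all_in_one = Fraction(3 * f(12), f(2) * f(5) ** 2)  # all three whiz-kids in one class

p_one_each = one_each / total
p_all_in_one = all_in_one / total
print(p_one_each, p_all_in_one)
```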

Example 16. An urn contains 4 white and 6 black balls. Two balls are successively drawn from the urn without replacement of the first ball. If the first ball is seen to be white, what is the probability that the 2nd ball is also white?
Let $A_1$ be the event that the first drawn ball is white and $A_2$ be the event that the second drawn ball is white. Then $A_1 \cap A_2$ is the event that both drawn balls are white.
∴ $P(A_1 \cap A_2) = \frac4{10} \times \frac39 = \frac2{15}$
Also $P(A_1) = \frac4{10} = \frac25$.
Then by the definition of conditional probability,
$P(A_2/A_1) = \frac{P(A_1 \cap A_2)}{P(A_1)} = \frac{2/15}{2/5} = \frac13$.
Hence the required probability $= \frac13$.
Example 17. There are two identical urns containing respectively 4 white, 3 red balls and 3 white, 7 red balls. An urn is chosen at random and a ball is drawn from it. Find the probability that the ball is white. If the ball drawn is white, what is the probability that it is from the first urn?
Let $A_1$, $A_2$ be the events that the ball is drawn from the first and second urn respectively. Clearly the events $A_1$, $A_2$ are mutually exclusive and exhaustive.
∴ $P(A_1) = P(A_2) = \frac12$
Also let A be the event that the drawn ball is white. Then we have
$P(A/A_1) = \frac47$, $P(A/A_2) = \frac3{10}$
Now $P(A) = P(A_1) \cdot P(A/A_1) + P(A_2) \cdot P(A/A_2) = \frac12 \cdot \frac47 + \frac12 \cdot \frac3{10} = \frac{61}{140}$.
So the probability that the drawn ball is white is $\frac{61}{140}$.
Now by Bayes' theorem we have
$P(A_1/A) = \frac{P(A_1)P(A/A_1)}{P(A)} = \frac{\frac12 \cdot \frac47}{\frac{61}{140}} = \frac{40}{61}$.
So the prob. that the white ball is drawn from the first urn is $\frac{40}{61}$.
Example 18. Two urns contain respectively 5 white, 7 black balls and 4 white, 2 black balls. One of the urns is selected by the toss of a fair coin and then 2 balls are drawn without replacement from the selected urn. If both balls drawn are white, what is the probability that the first urn is selected?
Let $A_1$, $A_2$ be the events that the balls are drawn from the first and second urn respectively. Then, as the urn is chosen by coin-tossing, we have
$P(A_1) = P(A_2) = \frac12$.
Now let A be the event that the two drawn balls are white. Then
$P(A/A_1) = \frac{^5C_2}{^{12}C_2} = \frac{\frac{5 \cdot 4}{2 \cdot 1}}{\frac{12 \cdot 11}{2 \cdot 1}} = \frac5{33}$
$P(A/A_2) = \frac{^4C_2}{^6C_2} = \frac{\frac{4 \cdot 3}{2 \cdot 1}}{\frac{6 \cdot 5}{2 \cdot 1}} = \frac25$
∴ By the theorem of total probability, $P(A) = P(A_1)P(A/A_1) + P(A_2)P(A/A_2) = \frac12 \cdot \frac5{33} + \frac12 \cdot \frac25 = \frac{91}{330}$.
Again by Bayes' theorem, the required probability is
$P(A_1/A) = \frac{P(A_1)P(A/A_1)}{P(A)} = \frac{\frac12 \cdot \frac5{33}}{\frac{91}{330}} = \frac{25}{91}$.
Example 19. A speaks the truth 3 out of 4 times and B 7 times out of 10. They agree in their statement that from a bag containing 6 balls of different colours a white ball has been drawn. Find the probability that the statement is true.
Let $A_1$ and $A_2$ be the events that the joint statement of A and B is true and false respectively.
Then $P(A_1) = \frac16$, $P(A_2) = \frac56$.
Now let X be the event that both A and B agree in their statement. Then
$P(X/A_1) = \frac34 \times \frac7{10} = \frac{21}{40}$,
$P(X/A_2) = \left(\frac14 \times \frac15\right) \times \left(\frac3{10} \times \frac15\right) = \frac3{1000}$,
since, when the statement is false, each speaker must lie and also name white out of the 5 wrong colours.
∴ $P(X) = P(A_1) \cdot P(X/A_1) + P(A_2) \cdot P(X/A_2) = \frac16 \times \frac{21}{40} + \frac56 \times \frac3{1000} = \frac7{80} + \frac1{400} = \frac9{100}$
Hence the probability of the statement being true is
$P(A_1/X) = \frac{P(A_1)P(X/A_1)}{P(X)}$ (by Bayes' theorem) $= \frac{\frac16 \times \frac{21}{40}}{\frac9{100}} = \frac{35}{36}$.
Example 20. Assuming that each child is as likely to be a boy as it is to be a girl, what is the conditional probability that in a family of two children both are boys, given that (i) the older child is a boy (ii) at least one of the children is a boy?
Let $A_1$ and $A_2$ be the events that the older child is a boy and the younger child is a boy respectively.
Then $P(A_1) = P(A_2) = \frac12$.
Also $A_1 \cup A_2$ = at least one of the children is a boy
and $A_1 \cap A_2$ = both children are boys. Since $A_1$, $A_2$ are independent,
∴ $P(A_1 \cap A_2) = P(A_1)P(A_2) = \frac12 \cdot \frac12 = \frac14$
$P(A_1 \cup A_2) = P(A_1) + P(A_2) - P(A_1 \cap A_2) = \frac12 + \frac12 - \frac14 = \frac34$.
(i) Thus the probability that both children are boys, given that the older is a boy, is
$P\{(A_1 \cap A_2)/A_1\} = \frac{P(A_1 \cap A_2)}{P(A_1)} = \frac{1/4}{1/2} = \frac12$.
(ii) The probability that both the children are boys, given that at least one of them is a boy, is
$P\{(A_1 \cap A_2)/(A_1 \cup A_2)\} = \frac{P[(A_1 \cap A_2) \cap (A_1 \cup A_2)]}{P(A_1 \cup A_2)} = \frac{P(A_1 \cap A_2)}{P(A_1 \cup A_2)} = \frac{1/4}{3/4} = \frac13$.
Example 21. In a bolt factory, machines A, B and C manufacture respectively 25%, 35% and 40% of the total. Of their output 5%, 4% and 2% are defective bolts. A bolt is drawn at random from the product and is found to be defective. What are the probabilities that it was manufactured by machines A, B and C? [W.B.U.Tech 2003]
Let $X_1$, $X_2$ and $X_3$ be the events that a bolt is manufactured by A, B and C respectively and X be the event that a bolt is defective. Then
$P(X_1) = \frac{25}{100} = \frac14$, $P(X_2) = \frac{35}{100} = \frac7{20}$, $P(X_3) = \frac{40}{100} = \frac25$
∴ $P(X/X_1) = \frac5{100} = \frac1{20}$, $P(X/X_2) = \frac4{100} = \frac1{25}$, $P(X/X_3) = \frac2{100} = \frac1{50}$
∴ $P(X) = P(X_1)P(X/X_1) + P(X_2)P(X/X_2) + P(X_3)P(X/X_3)$
$= \frac14 \times \frac1{20} + \frac7{20} \times \frac1{25} + \frac25 \times \frac1{50} = \frac1{80} + \frac7{500} + \frac1{125} = \frac{69}{2000}$
Then by Bayes' theorem, we have
$P(X_1/X) = \frac{P(X_1)P(X/X_1)}{P(X)} = \frac{\frac14 \cdot \frac1{20}}{\frac{69}{2000}} = \frac{25}{69}$
Similarly $P(X_2/X) = \frac{28}{69}$, $P(X_3/X) = \frac{16}{69}$.
So, the required probabilities that the defective bolt was manufactured by machines A, B, C are $\frac{25}{69}$, $\frac{28}{69}$, $\frac{16}{69}$ respectively.

Example 22. An urn contains 10 white and 3 black balls, while another urn contains 3 white and 5 black balls. Two balls are drawn from the first urn and put into the second urn and then a ball is drawn from the latter. What is the probability that it is a white ball?
Let A, B, C be the events that the two balls drawn from the first urn are both white, both black, and one white and one black respectively. Then
$P(A) = \frac{^{10}C_2}{^{13}C_2} = \frac{10 \times 9}{13 \times 12} = \frac{15}{26}$
$P(B) = \frac{^3C_2}{^{13}C_2} = \frac{3 \times 2}{13 \times 12} = \frac1{26}$
$P(C) = \frac{^{10}C_1 \times {}^3C_1}{^{13}C_2} = \frac5{13}$
When two balls are transferred into the 2nd urn, it will contain either 5 white, 5 black balls, or 3 white, 7 black balls, or 4 white and 6 black balls according to the events A, B, C respectively.
Let W denote the event of drawing a white ball from the second urn.
Then $P(W/A) = \frac5{10} = \frac12$, $P(W/B) = \frac3{10}$, $P(W/C) = \frac4{10} = \frac25$.
So the required probability is
$P(W) = P(A) \cdot P(W/A) + P(B) \cdot P(W/B) + P(C) \cdot P(W/C)$
$= \frac{15}{26} \cdot \frac12 + \frac1{26} \cdot \frac3{10} + \frac5{13} \cdot \frac25 = \frac{75 + 3 + 40}{260} = \frac{59}{130}$.
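The three cases can be folded into one loop over the number of white balls transferred. This is a sketch under the ball counts stated in the example; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

p_white = Fraction(0)
for w in range(3):                          # w = white balls among the two transferred
    # probability of moving w white and 2 - w black balls from the first urn
    p_transfer = Fraction(comb(10, w) * comb(3, 2 - w), comb(13, 2))
    # the second urn then holds 3 + w white balls out of 10
    p_white += p_transfer * Fraction(3 + w, 10)
print(p_white)
```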
Example 23. A student has to answer a multiple choice question with 5 alternatives. What is the probability that the student knew the answer, given that he answered it correctly?
Let $B_1$ and $B_2$ be the events that the student knew the right answer and guessed the answer respectively. Also let A be the event that he gets the right answer. Again let p be the probability that he knew the correct answer, so that $P(B_1) = p$ and $P(B_2) = 1 - p$.
Also $P(A/B_1) = 1$, $P(A/B_2) = \frac15$.
∴ By Bayes' theorem,
$P(B_1/A) = \frac{P(B_1)P(A/B_1)}{P(B_1)P(A/B_1) + P(B_2)P(A/B_2)} = \frac{p \cdot 1}{p \cdot 1 + (1 - p)\frac15} = \frac{5p}{4p + 1}$.
EXERCISES

[I] SHORT ANSWER QUESTIONS

1. Twenty balls in an urn are numbered 1 through 20. A blindfolded contestant draws five balls from the urn, with the order of the draw recorded. What is the probability that the number 3 ball was selected?
[Hints: The required probability $= \frac{^{19}C_4}{^{20}C_5} = \frac14$.]

2. Prove that for any two events A, B with $B \subset A$, $P(A \cap \bar B) = P(A) - P(B)$.
[Hints: As $B \subset A$, so $B$ and $A \cap \bar B$ are disjoint and $B \cup (A \cap \bar B) = A$.
∴ $P(B) + P(A \cap \bar B) = P(A)$.]

3. If $p_1, p_2, \ldots, p_n$ are the probabilities that n certain independent events happen, then find the probability that at least one of these events must happen.
[Hints: The probabilities of their non-happening are $1 - p_1, 1 - p_2, \ldots, 1 - p_n$. So the probability of all of these failing is $(1 - p_1)(1 - p_2) \cdots (1 - p_n)$. Hence the chance that at least one of these events happens is $1 - (1 - p_1)(1 - p_2) \cdots (1 - p_n)$.]

4. For two events $A_1$, $A_2$ let $P(A_1) = 0.4$, $P(A_2) = p$ and $P(A_1 \cup A_2) = 0.6$.
(i) Find p so that $A_1$ and $A_2$ are independent events.
(ii) For what value of p are the events $A_1$ and $A_2$ mutually exclusive?
[Hints: $P(A_1 \cap A_2) = P(A_1) + P(A_2) - P(A_1 \cup A_2) = p - 0.2$.
(i) $P(A_1 \cap A_2) = P(A_1) \cdot P(A_2) \Rightarrow p \times 0.4 = p - 0.2 \Rightarrow p \times 0.6 = 0.2 \Rightarrow p = \frac13$.
(ii) $P(A_1 \cap A_2) = 0 \Rightarrow p = 0.2$.]


5. Two urns contain 4 white, 6 blue and 4 white, 5 blue balls respectively. One of the urns is selected at random and a ball is drawn from it. Find the probability that the drawn ball is white.
[Hints: $P(W) = P(A_1) \cdot P(W/A_1) + P(A_2) \cdot P(W/A_2) = \frac12 \cdot \frac4{10} + \frac12 \cdot \frac49 = \frac{19}{45}$]

6. If P(A) = ~,P(A+B)=%, find P(AB).

[Hints: P(AB) = I-P(A + B) = 1-% = ~]

7. If $P(A/B) = 1$, then prove that $P(ABC) = P(BC)$.
[Hints: $P(A/B) = 1 \Rightarrow P(AB) = P(B)$
∴ $P(ABC) = P(AB) \cdot P(C/AB) = P(B) \cdot P(C/B) = P(BC)$]

8. If A and B are events with $P(A) = \frac38$, $P(B) = \frac58$ and $P(A \cup B) = \frac34$, find $P(A/B)$, $P(B/A)$. Are A and B independent?

9. For any two events $A_1$, $A_2$ prove that $P(A_1) + P(A_2) \leq 1 + P(A_1 A_2)$.

10. If A, B are two events with $P(A) = 0.4$, $P(B) = 0.3$ and $P(A \cap B) = 0.2$, find $P(A^c \cap B)$. [W.B.U.Tech 2006 M 302]
[Hints: $P(A \cup B) = 0.4 + 0.3 - 0.2 = 0.5$
∴ $P(A^c \cap B) = P(A \cup B) - P(A) = 0.5 - 0.4 = 0.1$]
11. If two perfect coins are tossed simultaneously find the probability
of getting at least one head.
12. Out of 120 tickets numbered consecutively from 1 to 120, one
ticket is drawn at random. Find the probability of getting a multiple of 5.
13. If A, B, C are equally likely, mutually exclusive and exhaustive then
find P(A).
14. Find the probability of getting at least one 'Five' from 3 throws of
a perfect die.
15. Two events A and B are such that P(A) = 0.4, P(A uB) = 0.7 and
P(B) = x . For what values of x are A and B (i) mutually exclusive (ii)
independent ?
16. Find the probability that there would be exactly two tails if four unbiased coins are tossed.
17. Three balls are drawn at random from a box containing 6 red and 4 black balls. What is the probability that two balls are red and one is black?
18. Box A contains 3 black balls and 3 red, box B contains 6 black balls and 4 red. If a ball is randomly selected from each box, find the probability that the balls will be of the same colour.
19. 5 cards are drawn from a pack of 52 well-shuffled cards. Find the probability that 4 are aces and 1 is a king.
20. Five persons X, Y, Z, W, S speak at a seminar-lecture. What is the probability that X speaks immediately before Y?
21. The nine digits 1, 2, 3, 4, 5, 6, 7, 8, 9 are arranged in random order to form a nine-digit number. Find the probability that 1, 2 and 3 appear as neighbours in the order mentioned.
22. Four dice are rolled. What is the chance of obtaining a sum of 18?
23. What is the probability of an odd sum when two dice are thrown?
[Hints: The required probability $= \frac{18}{6^2} = \frac12$]

24. There are four persons in a room. Find the probability that (i) all
of them have different birthdays (ii) at least 2 of them have the same
birthday (iii) exactly 2 of them have the same birthday. (1 year = 365
days)
25. There are 20 people. Find the probability that among the 12 months in the year there are 4 months containing exactly 2 birthdays and 4 containing exactly 3 birthdays.
26. If $P(A \cap B) = \frac13$, $P(B) = \frac35$ and $P(A) = \frac12$, find $P(A \cup B)$, $P(A^c \cap B^c)$, $P(A^c \cup B^c)$ and $P(A \cap B^c)$.
27. What is the chance that a non-leap year should have 53 Sundays?
[Hint: Same as Ex. 1]
28. X and Y stand in a line at random with 10 other people. What is the prob. that there are 3 people between X and Y?
29. Prove that the probability of obtaining a six at least once in 4 throws with a die is slightly greater than $\frac12$.
[Hints: The required probability $= 1 - \left(\frac56\right)^4 \approx 0.52 > \frac12$]
30. What is the probability that a bridge hand will contain at least one ace?
[Hints: $1 - \frac{^{48}C_{13}}{^{52}C_{13}} \approx 0.696$]
31. Prove that for any two events A, B
(i) $P(A + \bar B) = 1 - P(B) + P(AB)$
(ii) $P(A\bar B) = P(A) - P(AB)$
(iii) $P(A \cap B) \leq P(A) + P(B)$
(iv) $P(A \cup B) \geq P(A)$
(vi) $P(A/B) \leq P(A)/P(B)$
(vii) $P(\bar A/B) = 1 - P(A/B)$
(viii) $P(A)/P(B) = P(A/B)/P(B/A)$.

32.Ifthe two events A and B are independent and P(B)=~,P(A)=~'

fmd P(AuB), Ff4:1B) and P(A nB).


33. Prove that if A and B are mutually exclusive events and $P(A) \neq 0$, $P(B) \neq 0$, then A and B are not independent.

34. Given $P(A \cup B) = \frac23$, $P(A \cap \bar B) = \frac13$, find $P(B)$ and $P(A)$ if $P(A/B) = \frac16$.

35. If $P(A) > P(B)$, show that $P(A/B) > P(B/A)$.


36. For any three events A, B, C, prove that
(i) $P((A + B)/C) = P(A/C) + P(B/C) - P(AB/C)$
(ii) $P(AB/C) + P(A\bar B/C) = P(A/C)$
[Hints: (i) $P(AC + BC) = P(AC) + P(BC) - P(ABC)$
∴ $\frac{P(AC + BC)}{P(C)} = \frac{P(AC)}{P(C)} + \frac{P(BC)}{P(C)} - \frac{P(ABC)}{P(C)}$
∴ $P[(A + B)/C] = P(A/C) + P(B/C) - P(AB/C)$
(ii) $P(AB/C) + P(A\bar B/C) = \frac{P(ABC)}{P(C)} + \frac{P(A\bar BC)}{P(C)} = \frac{P(ABC) + P(A\bar BC)}{P(C)} = \frac{P(AC)}{P(C)} = P(A/C)$.]
ANSWERS

1. 1/4  6. 1/6  8. 2/5, 2/3, No  11. 3/4  12. 1/5
13. 1/3  14. $1 - \left(\frac56\right)^3$  15. (i) 0.3, (ii) 0.5  16. 3/8  17. 1/2
18. 1/2  19. $4/{}^{52}C_5$  20. 1/5  21. 1/72  22. 5/81  23. 1/2
24. (i) 0.984, (ii) 0.016, (iii) $\frac{6 \times 364 \times 363}{(365)^3}$  25. $1.0604 \times 10^{-3}$
26. 23/30, 7/30, 2/3, 1/6  27. 1/7  28. 4/33  30. 0.696
32. 13/15, 1/3, 1/5  34. 1/3, 7/18

[II] LONG ANSWER QUESTIONS

1. For a certain binary communication channel, the probability that a transmitted '0' is received as a '0' is 0.95 and the probability that a transmitted '1' is received as a '1' is 0.90. If the probability that a '0' is transmitted is 0.4, find the probability that
(i) a '1' is received,
(ii) a '1' was transmitted given that a '1' was received.
[Hints: $A_1$ = the event of transmitting '1', $\bar A_1$ = the event of transmitting '0', $A_2$ = the event of receiving '1', $\bar A_2$ = the event of receiving '0'.
∴ (i) $P(A_2) = P(A_1) \cdot P(A_2/A_1) + P(\bar A_1)P(A_2/\bar A_1) = 0.6 \times 0.9 + 0.4 \times 0.05 = 0.56$.
(ii) $P(A_1/A_2) = \frac{P(A_1)P(A_2/A_1)}{P(A_2)} = \frac{0.6 \times 0.9}{0.56} = \frac{27}{28}$]
2. Two fair dice had two of their sides painted red, two painted
black,one painted yellow and the other painted white. When this pair of
dice are rolled, what is the probability that both land on the same colour ?
3. From an urn containing a white and b black balls, balls are successively drawn without replacement until only those of the same colour are left. Prove that the probability that the balls left are white is $\frac{a}{a + b}$.

4. A bag contains 5 white and 4 black balls. If 3 balls are drawn at random, what are the probabilities of the following:
(i) 2 of them are white
(ii) at most one of them is white
(iii) at least two are white. [W.B.U.Tech 2007]
[Hints: (i) $n(A) = {}^5C_2 \times {}^4C_1$ (ii) $n(A) = {}^5C_1 \times {}^4C_2 + {}^4C_3$ (iii) $n(A) = {}^5C_2 \times {}^4C_1 + {}^5C_3$]
5. The probability that a contractor will get a plumbing contract is 2/3, and the probability that he will not get an electric contract is 5/9. If the probability of getting at least one contract is 4/5, what is the probability that he will get both the contracts?
6. From the numbers 1, 2, 3, 4, ..., 999, 1000 one number is drawn at random. Find the probability that the number is a multiple of (i) 12 and 18 (ii) 12 or 18.
7. A box contains 5 defective and 10 non-defective lamps. 8 lamps are drawn at random in succession without replacement. What is the probability that the 8th lamp is the 5th defective? [W.B.U.Tech 2005, 2007, 2008]
[Hints: $\frac{^5C_4 \times {}^{10}C_3 \times 7!}{15 \times 14 \times 13 \times 12 \times 11 \times 10 \times 9 \times 8}$]
8. Two sets of candidates are competing for the position of the Board
of Directors of a company. The probabilities that the first and second sets
will win are 0.6 and 0.4 respectively. If the first set wins, the probability
of introducing a new product is 0.8, and the corresponding probability if
the second set wins is 0.3. What is the probability that the new product
will be introduced?
9. An urn contains 2 white and 2 black balls. Balls are drawn
successively at random without replacement. What is the probability that
a black ball appears (i) for the first time in the 3rd drawing (ii) for the
2nd time in the 4th drawing?
10. A packet of 10 electronic components is known to include 3 defectives. If 4 components are randomly chosen and tested, what is the probability of finding among them not more than one defective?

11. A bag contains 8 red balls and 5 white balls. Two successive draws
of 3 balls are made without replacement. Find the probability that the first
drawing will give 3 white balls and the second 3 red balls.
12. In a bridge game, North and South have 9 spades between them.
Find the probability that either East or West has no spades. (There are only
13 spades in a pack of 52 cards and each player has 13 cards. The players
are designated by the positions they occupy, viz. North, South, East, West.)
13. In a hand of bridge, what is the probability that you have 5 spades
and your partner has the remaining 8.
14. A box contains 7 white and 5 black balls. If 3 balls are drawn
simultaneously at random, what is the probability that they are not all of
the same colour ? Calculate the probability of the same event for the case
where the balls are drawn in succession with replacement between
drawings.
15. An urn contains 3 white and 5 black balls. One ball is drawn and, its colour unnoted, kept aside, and then another ball is drawn. What is the prob. that it is (i) black (ii) white?
[Hints: Let $W_1$, $B_1$ be the events that the 1st drawn ball is white and black. Also let $W_2$, $B_2$ be the events that the 2nd drawn ball is white and black.
$P(W_1) = \frac38$, $P(B_1) = \frac58$, $P(W_2/W_1) = \frac27$, $P(W_2/B_1) = \frac37$
∴ $P(W_2) = P(W_1) \cdot P(W_2/W_1) + P(B_1) \cdot P(W_2/B_1) = \frac38$
∴ $P(B_2) = \frac58$.]
16. Each of two cabinets has 3 drawers. Cabinet I contains a gold coin in each drawer, and cabinet II contains a gold coin in one of its drawers and a silver coin in each of the others. A cabinet is randomly selected, one of its drawers is opened, and a gold coin is found. Find the probability that there is a gold coin in each of the other drawers.
17. There are three identical boxes, each provided with two drawers. In the first, each drawer contains a gold coin; in the third, each drawer contains a silver coin; and in the second, one drawer contains a gold and the other a silver coin. A box is selected at random and one of the drawers is opened. If a gold coin is found, what is the prob. that the box chosen is the second one?
[Hints: $P(A_1) = P(A_2) = P(A_3) = \frac13$
$P(A/A_1) = \frac22 = 1$, $P(A/A_2) = \frac12$, $P(A/A_3) = 0$
∴ $P(A) = \frac13 \cdot 1 + \frac13 \cdot \frac12 + \frac13 \cdot 0 = \frac12$ ∴ $P(A_2/A) = \frac13$]
18. There are three identical urns containing white and black balls. The
first urn contains 3 white, 4 black balls, the 2nd urn contains 4 white 5 black
balls and the 3rd urn contains 2 white and 3 black balls. An urn is chosen at
random and a ball is drawn from it. If the drawn ball is white, what is the
prob. that the 2nd urn is chosen? [W.B.U.Tech 2007]
19. Urn A contains 2 white, 4 red balls; Urn B contains 1 white and 1 red ball. A ball is randomly chosen from Urn A and put into Urn B, and a ball is then randomly selected from Urn B. Find (i) the probability that the ball selected from Urn B is white (ii) the conditional probability that the transferred ball was white, given that a white ball is selected from Urn B.
20. A box contains 3 types of disposable lights. The probability that
type I light gives over 100 hours of use is 0.7, with the corresponding
probabilities for type II and type III lights being 0.4 and 0.3, respectively.
Let 20% of the lights in the box are type I, 30% are type II, and 50% are
type III.
(i) Find the probability that a randomly chosen light gives more than
100 hours of use ?
(ii)Given the light lasted over 100 hours, what is the conditional
probability that it was a type II light?
21. Two cards are drawn successively from a pack without replacing the first. Find the prob. that the 2nd card is also a spade, if the first card is a spade.
[Hints: Let A, B be the events that the 1st and 2nd drawn card is a spade.
∴ $P(AB) = \frac{13 \times 12}{52 \times 51}$, $P(A) = \frac{13}{52}$, $P(B/A) = \frac{P(AB)}{P(A)} = \frac{12}{51} = \frac4{17}$.]
22. In a certain class 25% of the students failed in Mathematics, 15% failed in Chemistry and 10% failed in both Mathematics and Chemistry. A student is selected at random.
(i) If he failed in Mathematics, what is the prob. that he failed in Chemistry?
(ii) If he failed in Chemistry, what is the prob. that he failed in Mathematics?
(iii) What is the prob. that he failed both in Mathematics and Chemistry? [W.B.U.Tech 2002]
(iv) What is the prob. that he failed in Mathematics or Chemistry?
23. Let 5% of men and 0.25% of women be colourblind. A colourblind person is chosen at random. Find the probability that the person is male. Assume that there are equal numbers of males and females. What if the population consisted of twice as many males as females?
24. Sections A, B and C have 50, 75 and 100 workers respectively, and 50, 60 and 70 percent of these are men. Resignations are equally likely among all workers, irrespective of sex. One worker resigns, and this is a man. Find the probability that the person works in section C.
25. Three machines X, Y, Z produce respectively 60%, 30% and 10% of the total number of items of a factory. Of this output 2%, 3% and 4% are defective. An item is selected at random and is found defective. Find the prob. that the item was produced by machine Z.
26. A box contains 10 pairs of shoes. If 8 shoes are randomly selected, what is the probability that there is (i) no complete pair (ii) exactly one complete pair? [W.B.U.Tech 2005]
[Hint: (i) $\frac{^{10}C_8 \times 2^8}{^{20}C_8}$, (ii) $\frac{^{10}C_1 \times {}^9C_6 \times 2^6}{^{20}C_8}$]
27. The probability of recording a temperature of 70° F in Delhi is 0.3 and that in Mumbai is 0.4. The probability of recording the maximum of the temperatures in Delhi and Mumbai as 70° F is 0.2. Find the probability that the minimum of the two city temperatures is 70° F.
28. Urn I contains 7 black and 5 white balls. Urn II contains 12 black and 3 white balls. We toss a fair coin. If head comes then a ball from Urn I is drawn, whereas if tail comes a ball from Urn II is drawn. Suppose that a white ball is drawn. What is the probability that the coin shows tails?

29. Three machine men A, B and C produce a special kind of electronic toy which, with respective probabilities 0.02, 0.03 and 0.05, fails to be established in the market. In the factory where they work, A produces 50% of all toys, B 30% and C 20%. What proportion of "non-establishment" is caused by A?
30. An engineering system consisting of 4 components is such that the system functions if and only if at least 2 components work. Suppose that all components work independently of each other. If the i-th component works with probability $p_i$ (i = 1, 2, 3, 4), compute the probability that the system functions.
[Hint: Let $A_i$ be the event that the i-th component works
∴ $P(A_i) = p_i$.
Now "the system does not function"
$= (\bar A_1 \cap \bar A_2 \cap \bar A_3 \cap \bar A_4) \cup [A_1 \cap \bar A_2 \cap \bar A_3 \cap \bar A_4] + [\bar A_1 \cap A_2 \cap \bar A_3 \cap \bar A_4] + [\bar A_1 \cap \bar A_2 \cap A_3 \cap \bar A_4] + [\bar A_1 \cap \bar A_2 \cap \bar A_3 \cap A_4]$
∴ Probability that the system does not function
$= P(\bar A_1 \cap \bar A_2 \cap \bar A_3 \cap \bar A_4) + \{P(A_1 \cap \bar A_2 \cap \bar A_3 \cap \bar A_4) + P(\bar A_1 \cap A_2 \cap \bar A_3 \cap \bar A_4) + P(\bar A_1 \cap \bar A_2 \cap A_3 \cap \bar A_4) + P(\bar A_1 \cap \bar A_2 \cap \bar A_3 \cap A_4)\}$
$= (1 - p_1)(1 - p_2)(1 - p_3)(1 - p_4) + \{p_1(1 - p_2)(1 - p_3)(1 - p_4) + (1 - p_1)p_2(1 - p_3)(1 - p_4) + (1 - p_1)(1 - p_2)p_3(1 - p_4) + (1 - p_1)(1 - p_2)(1 - p_3)p_4\}$
Next, probability that the system functions
= 1 - probability that the system does not function.]
31.The probability that a construction job will be finished in time is
17/20; the probability that there will be no strike is 3/4 ; and the probability
that the construction job will be finished in time, assuming that there will
be no strike, is 14/15. Find the probability that (i) the construction job
will be finished in time and there will be no strike. (ii) there will be strike
or the job will not be finished in time.

ANSWERS

1. (i) 0.56 (ii) 27/28   2. 5/18   4. (i) 10/21 (ii) 17/42 (iii) 25/42
5. 14/45   6. 0.027, 0.111   7. 5/429   8. 0.6   9. 1/6, 1/2
10. 2/3   11. 7/429   12. 11/115   13. 26084x10~   14. 35/44, 35/48
15. (i) 5/8 (ii) 3/8   16. 3/4   17. 1/3   18. 140/401   19. (i) 4/9 (ii) 1/2
20. (i) 0.41 (ii) 12/41   21. 4/17   22. (i) 2/5 (ii) 2/3 (iii) 1/10 (iv) 3/10
23. 20/21; 40/41   24. 1/2   25. 4/25   26. 0.09145; 0.4268
27. 0.5   28. 12/37   29. 34.48%
30. $\sum p_1p_2 - 2\sum p_1p_2p_3 + 3p_1p_2p_3p_4$
31. 7/10, 3/10

[III] MULTIPLE CHOICE QUESTIONS

1. The probability that the sum 8 appears in a single toss of a pair of fair
dice is
(a) 31/36 (b) 5/36
(c) 5/6 (d) 1/6.
2. Of 6 girls in a class, 3 have blue eyes. If 2 of the girls are chosen at
random, the probability that both have blue eyes is
(a) 1/5 (b) 4/5
(c) 1/10 (d) none.
3. If A and B are two independent events then

(a) A,B are exhaustive events


(b) $\bar A, \bar B$ are independent events.
(c) A, B are mutually exclusive events.
(d) None.
4. A die is tossed. If the number is odd, then the probability that it is prime is
(a) 1/2 (b) 2/3
(c) 1/3 (d) none.
5. If A and B be events with P(A) = 1/3, P(B) = 1/4 and
P(A ∪ B) = 1/2, then P(B/A) =
(a) 3/4 (b) 4/3
(c) 1/4 (d) 1/3.

6. IfAandBbeeventswith P(A)=%,P(B)=~ and P(AVB)=%


then P(AnB) =
(a) 1/6 (b) 5/6
(c) 1/3 (d) none.

7. If X, Yare two events with P(X) = ~,P(Y) = ~, and


6 3
P(X r, Y) = ~, then P(X n Y) =

(a) 5/12 (b) 7/12
(c) 1/6 (d) none.
[Hints: $P(X \cap \bar Y) = P(X) - P(X \cap Y)$]
8. Two events A and B are said to be independent if
(a) $P(A \cup B) = P(A)\cdot P(B)$ (b) $P(A/B) = P(A)$
(c) $P(A \cap B) \ne P(A)\cdot P(B)$ (d) $P(A/B) = P(B)$.
9. If $A_1$ and $A_2$ be two mutually exclusive, i.e. disjoint, events, then
$P(A_1 \cup A_2)$ is equal to [W.B.U.Tech 2006, M 302]
(a) $P(A_1) + P(A_2) - P(A_1 \cap A_2)$
(b) $P(A_1)\cdot P(A_2)$
(c) $P(A_1) + P(A_2)$
(d) $P(A_1) - P(A_2)$.

10. Let A,B be events with, P(AuB) = ~,P(AnB) =~ and


8 4
P(A)=%. Then P(B) =
(a) 3/8 (b) 3/4
(c) 1/2 (d) none.
11. A coin is tossed successively three times. The probability of getting
exactly one head is
(a) 1/8 (b) 1/2
(c) 3/8 (d) none.
12. If P(A) = 0.2, P(B) = 0.4, P(A ∪ B) = 0.6, then the events A, B
are
(a) independent (b) mutually exhaustive
(c) mutually exclusive (d) none.

13. A and B are events with P(A) = .!.,P(B) = ~ and P(AB) = ~.


3 5 15
Are A and B independent?
(a) Yes
(b) No.
14. If $P(A_1) + P(A_2) + P(A_3) = 1$, then the events $A_1, A_2, A_3$ are
(a) pairwise independent
(b) mutually exclusive
(c) mutually exhaustive
(d) mutually exclusive and exhaustive.
15. If two events A and B are independent, then
(a) P(AuB)=P(A)P(B) (b) P(AUB)=P(A)+P(B)

(c) p( A n B) = p( A )p( B) (d) p( A U B) = p( A) + p( B) .


16. If P(A) = k,P(B) = ~, then P(AB) =?
(a) 2/15 (b) 1/5
(c) 2/5 (d) 4/15.
17. When two events A, B are not mutually exclusive, then
(a) $P(A+B) \le P(A)+P(B)$ (b) $P(AB) = P(A)+P(B)$
(c) $P(A+B) = P(A)+P(B)$ (d) $P(A+B) > P(A)+P(B)$.
18. If P(B) = P(C) = 1, then both the events B, C are
(a) mutually exhaustive (b) independent
(c) certain (d) mutually exclusive.
19. For any two events $A_1, A_2$ where $A_1 \subset A_2$, we have
(a) $P(A_2 - A_1) = P(A_2) - P(A_1)$
(b) $P(A_2 + A_1) = P(A_2) + P(A_1)$

(c) P(A2 -Al)~P(A2)-P(Al)


(d)P(A2 +A1) ~ P(A2)+P(A1). [W.B.U.Tech.2008]
20. If $P(A+B) = \frac{2}{7}$, then $P(\bar A \bar B)$ = ?
(a) 1/7 (b) 2/7
(c) 5/7 (d) none.
[Hints: $P(\bar A \bar B) = 1 - P(A+B)$]
21. The chance that a leap year selected at random will contain 53
Wednesdays is [W.B.U.Tech. 2006, 2007]
(a) 2/7 (b) 0
(c) 1 (d) 5/6.

22. If P(A) = ~,P(B) == ~,P(AB) = : ' then the value of P(A uB)
is [W.B.U.Tech 2007]
(a) 6/7 (b) 3/7
(c) 1 (d) 7/12.

23. A coin is tossed. Event {H},{T} are [WB.U.Tech 2007]


(a) mutually exclusive (b) independent events
(c) dependent (d) both (a) and (c).
24. The condition for independence of two events A and B is

(a) $P(A \cap B) = P(A)P(B)$ (b) $P(A+B) = P(A)\cdot P(B)$
(c) $P(A-B) = P(A)\cdot P(B)$ (d) $P(A \cap B) = P(A)P(B/A)$
[W.B.U.Tech. 2006]
25. The probability of obtaining an even number in the throw of a fair die is
(a) 1/2 (b) 1/3
(c) 1/4 (d) none of these.
26. 3 balls are drawn from a bag containing 6 red and 5 white balls.
The probability of getting the balls all red is
(a) 6/11 (b) 3/7
(c) 1/2 (d) none of these.
27. In a single cast with two dice the chance that one die turns up 3 and
the other 4 is
(a) 1/12 (b) 1/18
(c) 1/36 (d) 1/3.
28. A bag contains 5 brown and 4 white balls. A man pulls out two balls
from the bag. The probability that they are of the same colour is
(a) 1/5 (b) 3/7
(c) 4/9 (d) none of these.
29. Three of the six vertices of a regular hexagon are chosen at random.
The probability that the triangle with these three vertices is equilateral is
(a) 1/6 (b) 1/3
(c) 1/7 (d) 1/10.
30. An unbiased coin is tossed repeatedly. If 'tail' appears on the first four
tosses, then the probability of 'head' appearing on the fifth toss is
(a) 1/2 (b) 1/4
(c) 1/3 (d) 2/9.

31. Seven white balls and three black balls are randomly placed in a
row. The probability that no two black balls are placed side by side is
3 7
(a) ~ (b) 15

2 1
(c) -
9 (d) "4.
32. Five boys and three girls are seated at random in a row. The probability
that no boy sits between two girls is
(a) 3/28 (b) 1/28
(c) 3/25 (d) 4/19.
33. Three different numbers are selected at random from the set
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}. The probability that the product of two of the
numbers is equal to the third is
(a) 3/5 (b) 3/24
(c) 1/38 (d) 1/40.
34. The probabilities of a student getting first class, second class and
third class at an examination are 1/10, 2/5 and 1/5 respectively. The probability
that he fails is
(a) 1/10 (b) 4/9
(c) 3/10 (d) 7/11.
35. The probability that Pradyut passes is 0.9 and Subrata passes is
0.8. The probability that at least one of them passes is

(a) 0.98 (b) 0.97 (c) 0.9 (d) none of these.

36. A fair die is thrown. The probability that either an odd number or a
number greater than 4 will turn up is
(a) 2/5 (b) 3/7
(c) 2/7 (d) 2/3.
37. Three bombs are dropped to destroy a bridge. The probabilities of
hitting the bridge by these bombs are 0.5, 0.2 and 0.1 respectively. The
probability that the bridge is hit is
(a) 0.36 (b) 0.64
(c) 0.80 (d) 0.90.
38. The probability of an event A occurring is 0.5, and of B occurring is
0.3. If A and B are mutually exclusive events, then the probability of neither
A nor B occurring is
(a) 0.2 (b) 0.1
(c) 0.8 (d) none of these.
then
39. If P(B)=~,P(A(\B(\C)=~,p(AnB(\C)=~

P(BnC) =
1
3
(a)- (b) 14
5
1 3
(d) ;;.
(c) 12
40. The probability that at least one of the events A and B occurs is 0.6.
If A and B occur simultaneously with probability 0.2, then $P(\bar A) + P(\bar B)$ =
(a) 1.5 (b) 2.4
(c) 1.2 (d) 0.3.
41. Two numbers are chosen from the set {1, 2, 3, 4, 5, 6} one after
another without replacement. The probability that the smaller value of the
two is less than 4 is
(a) 4/5 (b) 1/5
(c) 3/5 (d) 7/11.
ANSWERS

1.b 2.a 3.b 4.b 5.c 6.a 7.b 8.b 9.c
10.b 11.c 12.c 13.a 14.d 15.c 16.c 17.a
18.c 19.a 20.c 21.a 22.d 23.d 24.a
25.a 26.d 27.b 28.c 29.d 30.a 31.b 32.a
33.d 34.c 35.a 36.d 37.b 38.a 39.c
40.c 41.a
2.1. Joint Independent Experiments
Let $E_1$ and $E_2$ be two random experiments which are performed
successively in such a manner that the chances of occurrence of the outcomes
of $E_2$ are not affected by those of $E_1$ and vice versa. Then we say that
$(E_1, E_2)$ is a joint independent experiment. This concept can be extended to
define joint independent experiments $(E_1, E_2, E_3)$, $(E_1, E_2, E_3, E_4)$ and
so on.
Theorem. If $A_1, A_2, A_3, \ldots$ are any events connected with the
experiments $E_1, E_2, E_3, \ldots$ respectively, then in a joint independent experiment
$(E_1, E_2, E_3, \ldots)$,
$P(A_1, A_2, A_3, \ldots) = P(A_1)P(A_2)P(A_3)\cdots$
Proof. Beyond the scope of the book.
Illustration. Mr A can hit a target 4 times in 5 shots, Mr B 2 times in
4 shots and Mr C thrice in 7 shots. They fire at a target successively. Find the
probability that Mr A and Mr B will hit but Mr C will fail to hit the target.
If $E_1, E_2, E_3$ be the experiments that A, B and C fire at the target, then here
$(E_1, E_2, E_3)$ is a joint independent experiment.
Let $A_0$ = 'A hits the target', $B_0$ = 'B hits' and $C_0$ = 'C hits'.
Then $P(A_0) = \frac{4}{5}$, $P(B_0) = \frac{2}{4}$ and $P(C_0) = \frac{3}{7}$.
Required probability is
$P(A_0, B_0, \bar C_0) = P(A_0)P(B_0)P(\bar C_0) = \frac{4}{5}\cdot\frac{2}{4}\cdot\left(1 - \frac{3}{7}\right) = \frac{8}{35}$
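The product rule used in this illustration can be checked with exact rational arithmetic. The following Python sketch is only an illustration under the independence assumption of the theorem; the helper name `joint_probability` is introduced here and is not from the text.

```python
from fractions import Fraction


def joint_probability(*probs):
    """Multiply the probabilities of events taken from independent experiments."""
    result = Fraction(1)
    for p in probs:
        result *= p
    return result


# Single-shot hit probabilities of Mr A, Mr B and Mr C.
p_a = Fraction(4, 5)
p_b = Fraction(2, 4)
p_c = Fraction(3, 7)

# A hits, B hits, C fails to hit: the third factor is the complement 1 - P(C0).
answer = joint_probability(p_a, p_b, 1 - p_c)
print(answer)
```

Using `Fraction` keeps every intermediate value exact, so the result can be compared directly with the hand computation.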
2.2. Bernoulli Trial (Finite & Infinite)
Let E be an experiment having only two possible outcomes, 'success
(s)' and 'failure (f)'. That is, its event space is S = {s, f}.
Then a sequence of n independent repetitions of the experiment,
$E^n = (E, E, \ldots, E)$ (n-tuple), is called a finite Bernoulli trial if the probability
of s remains the same throughout the trials.
THE BERNOULLI TRIAL 47

An infinite sequence of independent trials of E, i.e., the joint independent
experiment $E^\infty = (E, E, E, \ldots \text{ upto } \infty)$, is called an infinite sequence of
Bernoulli trials if the probability of success remains the same throughout
the trials.
It is usual to denote the probability of success by p and that of failure by q.
Clearly p + q = 1.
Example. Let E be the experiment of drawing a ball from an urn
containing 5 white, 7 black, 3 blue and 8 red balls.
In each draw, i.e., in a trial, let
success (s) = 'the ball is white or blue' and
failure (f) = 'the ball is black or red'.
Now suppose that after drawing a ball from the urn, its colour is noted and it is
replaced. Next a ball is again drawn. Since the drawn ball is replaced,
E is being repeated independently. If we repeat this 9 times then we get a
finite Bernoulli trial with n = 9. If this trial is repeated indefinitely then
we face an infinite sequence of Bernoulli trials.
A possible outcome of the finite Bernoulli trial $E^9$ is
(s, f, s, f, f, f, f, f, s)
and a possible outcome of the infinite sequence of Bernoulli trials $E^\infty$ is
(s, s, f, f, s, s, f, f, ... upto ∞).
Now in a single trial of E, $P(s) = \frac{8}{23}$ and $P(f) = \frac{15}{23}$.
The corresponding probabilities are
$P(s,f,s,f,f,f,f,f,s) = \frac{8}{23}\cdot\frac{15}{23}\cdot\frac{8}{23}\cdot\frac{15}{23}\cdot\frac{15}{23}\cdot\frac{15}{23}\cdot\frac{15}{23}\cdot\frac{15}{23}\cdot\frac{8}{23} = \frac{8^3 \cdot 15^6}{23^9}$
and
$P(s,s,f,f,s,s,f,f,\ldots \text{ upto } \infty) = \frac{8}{23}\cdot\frac{8}{23}\cdot\frac{15}{23}\cdot\frac{15}{23}\cdot\frac{8}{23}\cdot\frac{8}{23}\cdot\frac{15}{23}\cdot\frac{15}{23}\cdots$ upto ∞
(provided the value of this infinite product exists).
The following theorem on a finite number of trials has an important role
in the theory of probability and statistics.
Binomial Law. In a finite Bernoulli trial $E^n$, if $A_i$ = 'i successes'
(consequently n − i failures), then its probability is
$P(A_i) = {}^nC_i\, p^i q^{n-i}$, where p, q (= 1 − p) are the probabilities of 'success'
and 'failure' respectively in a single trial of E.
Proof. An event point in the event $A_i$ is of the form
(s, s, s, ..., s, f, f, ..., f) (i number of s and n − i number of f) (1)
and so $P(s,s,\ldots,s,f,f,\ldots,f) = p^i q^{n-i}$, ∵ P(s) = p and P(f) = q.
Now in the n-tuple, s may be placed in i of the n positions in ${}^nC_i$
ways. Once i positions are occupied by s, the remaining positions
will be occupied by f in one way. Therefore the total number of n-tuples
like (1) is ${}^nC_i \times 1 = {}^nC_i$.
∴ $P(A_i) = {}^nC_i\, p^i q^{n-i}$
Note. The probability of all 's' in $E^n$ is $P(A_n) = p^n$ and the probability of all 'f'
in $E^n$ is $P(A_0) = q^n$.
Remark. In a subsequent chapter we shall see that this Binomial Law has
an important role in the development of a special type of discrete
distribution called the Binomial Distribution. In this book, in fact, most of the
problems which could be discussed under this law will be presented in
that section on 'Binomial Distribution'.
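As a numerical companion to the Binomial Law, here is a minimal Python sketch that evaluates $P(A_i) = {}^nC_i\, p^i q^{n-i}$ exactly. The function name `binomial_probability` is an assumption made for this illustration; `math.comb` supplies ${}^nC_i$.

```python
from fractions import Fraction
from math import comb


def binomial_probability(n, i, p):
    """P(A_i): exactly i successes in n Bernoulli trials, success probability p."""
    q = 1 - p
    return comb(n, i) * p**i * q**(n - i)


# Five tosses of a fair coin: probabilities of 0, 1, ..., 5 heads.
p = Fraction(1, 2)
probs = [binomial_probability(5, i, p) for i in range(6)]
total = sum(probs)  # the events A_0, ..., A_5 exhaust all outcomes

print(probs[3], total)
```

The sum over all i equals 1 because the events $A_0, A_1, \ldots, A_n$ are mutually exclusive and exhaustive.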
Example 1. Let E be the experiment of drawing a ball from an urn
containing 5 white, 7 black, 3 blue and 8 red balls. In a draw let s = 'the
ball is white or blue'.
∴ $p = P(s) = \frac{8}{23}$. So $q = 1 - \frac{8}{23} = \frac{15}{23}$.
If this experiment E is repeated 9 times,
probability of four successes (among these 9 trials)
$= P(A_4) = {}^9C_4 \left(\frac{8}{23}\right)^4 \left(\frac{15}{23}\right)^{9-4} = 126 \times \frac{8^4 \times 15^5}{23^9}$
Example 2. A and B toss a die alternately and the first to obtain a 'six'
wins the toss. If A starts the game, we shall find the probability of his
winning.
Let E be the experiment of tossing the die.
We suppose success (s) = 'six', failure (f) = 'non-six'. ∴ $p = \frac{1}{6}$, $q = \frac{5}{6}$
and S = {s, f}, which is the certain event; P(S) = 1.
We consider the infinite sequence of Bernoulli trials of the experiment E.
A will win if at least one of the following events occurs:
$A_1$ = (s, S, S, S, S, ... upto ∞)
$A_2$ = (f, f, s, S, S, S, ... upto ∞)
$A_3$ = (f, f, f, f, s, S, S, ... upto ∞)
$A_4$ = (f, f, f, f, f, f, s, S, S, ... upto ∞)
... and so on.
Since P(s) = p, P(f) = q and P(S) = 1,
$P(A_1) = p \times 1 \times 1 \times \cdots = p$
$P(A_2) = q \cdot q \cdot p \times 1 \times 1 \cdots = pq^2$
$P(A_3) = q \cdot q \cdot q \cdot q \cdot p \cdot 1 \cdot 1 \cdots = pq^4$
$P(A_4) = q \cdot q \cdot q \cdot q \cdot q \cdot q \cdot p \cdot 1 \cdots = pq^6$
... and so on.
∴ Probability that A wins
$= P(A_1 \cup A_2 \cup A_3 \cup \cdots)$
$= P(A_1) + P(A_2) + P(A_3) + \cdots$
$= p + pq^2 + pq^4 + pq^6 + \cdots$
$= p(1 + q^2 + q^4 + q^6 + \cdots)$
$= p \cdot \dfrac{1}{1 - q^2}$, ∵ $q^2$ is the common ratio of the G.P. and $q^2 < 1$
$= \dfrac{1}{6} \cdot \dfrac{1}{1 - \left(\frac{5}{6}\right)^2} = \dfrac{1}{6} \cdot \dfrac{36}{11} = \dfrac{6}{11}$
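The geometric series in this example can be checked numerically. The Python sketch below compares the closed form $p/(1-q^2)$ with a truncated sum of the series; the function name `first_player_wins` and the `terms` parameter are assumptions made for this illustration only.

```python
from fractions import Fraction


def first_player_wins(p, terms=None):
    """Probability that the player who starts is the first to succeed.

    With terms=None the closed form p / (1 - q**2) is returned;
    otherwise the first `terms` terms of p + p*q**2 + p*q**4 + ... are added.
    """
    q = 1 - p
    if terms is None:
        return p / (1 - q**2)
    return sum(p * q**(2 * k) for k in range(terms))


p_six = Fraction(1, 6)
exact = first_player_wins(p_six)
partial = first_player_wins(p_six, terms=50)

print(exact, float(partial))
```

The partial sums approach the closed form from below, since every omitted term is positive.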

2.3. Illustrative Examples.


Example 1. A problem of mathematics is given to three students A, B, C
whose chances of solving it are $\frac{1}{2}, \frac{1}{3}, \frac{1}{4}$ respectively. Find the probability
that the problem (i) is not solved (ii) is solved.
Solution. Let $E_1, E_2, E_3$ be the experiments that the problem is given to
A, B and C respectively.
Let A = 'A solves the problem', B = 'B solves' and C = 'C solves'.
Now, $P(A) = \frac{1}{2}$, $P(B) = \frac{1}{3}$ and $P(C) = \frac{1}{4}$.
∴ $P(\bar A) = 1 - \frac{1}{2} = \frac{1}{2}$, $P(\bar B) = 1 - \frac{1}{3} = \frac{2}{3}$ and $P(\bar C) = 1 - \frac{1}{4} = \frac{3}{4}$.
We consider the joint independent experiment $(E_1, E_2, E_3)$.
(i) Here the event 'the problem is not solved' = $(\bar A, \bar B, \bar C)$.
∴ Probability that the problem is not solved
$= P(\bar A, \bar B, \bar C) = P(\bar A)P(\bar B)P(\bar C) = \frac{1}{2}\cdot\frac{2}{3}\cdot\frac{3}{4} = \frac{1}{4}$
(ii) The event 'the problem is solved'
= 'at least one of the three students solves'
= complement of $(\bar A, \bar B, \bar C)$.
∴ Probability that the problem is solved
$= 1 - P(\bar A, \bar B, \bar C) = 1 - \frac{1}{4} = \frac{3}{4}$

Example 2. If 20 dates are named at random, what is the probability that
3 of them will be Sundays?
Solution. Let E be the experiment of naming a day against a date. Then
this is a Bernoulli trial $E^{20}$. In a trial let 's' = 'the named day is Sunday'.
Then 'f' = 'the day is not Sunday'.
∴ in a single trial $p = P(s) = \frac{1}{7}$ and $q = P(f) = \frac{6}{7}$.
Now the event '3 of them will be Sundays' = $A_3$.
By the Binomial law, $P(A_3) = {}^{20}C_3\, p^3 q^{20-3}$
or, the required probability $= {}^{20}C_3 \left(\frac{1}{7}\right)^3 \left(\frac{6}{7}\right)^{17} = 1140 \times \frac{6^{17}}{7^{20}}$
Example 3. In five throws with a coin find the probability of
(i) 3 heads (ii) at least 3 heads (iii) at most 3 heads.
Solution.
Let E be the experiment of throwing a coin. 's' = Head, 'f' = Tail.
∴ $p = \frac{1}{2}$, $q = 1 - \frac{1}{2} = \frac{1}{2}$.
We consider the finite Bernoulli trial with n = 5, $E^5$.
(i) By the Binomial law, probability of 3 heads
$= P(A_3) = {}^5C_3\, p^3 q^{5-3} = {}^5C_3 \left(\frac{1}{2}\right)^3 \left(\frac{1}{2}\right)^2 = \frac{5}{16}$
(ii) Probability of at least 3 heads
$= P(A_3 \cup A_4 \cup A_5)$
$= 1 - P(A_0 \cup A_1 \cup A_2) = 1 - \{{}^5C_0\, p^0 q^{5-0} + {}^5C_1\, p^1 q^{5-1} + {}^5C_2\, p^2 q^{5-2}\}$
$= 1 - \left(\frac{1}{32} + \frac{5}{32} + \frac{10}{32}\right) = \frac{1}{2}$
(iii) Probability of at most 3 heads
$= P(A_0 \cup A_1 \cup A_2 \cup A_3) = P(A_0 \cup A_1 \cup A_2) + P(A_3)$
$= \frac{1}{2} + \frac{5}{16}$, using (i) and (ii)
$= \frac{13}{16}$
Example 4. A can hit a target 4 times in 5 shots; B 3 times in 4 shots; C
twice in 3 shots. They fire at a target. What is the probability that at least
two shots hit?
Solution: If $E_1, E_2$ and $E_3$ be the experiments that A, B and C fire at the
target respectively, then we can treat this as a joint independent experiment
$(E_1, E_2, E_3)$.
Let $A_1, A_2, A_3$ be the events that A, B, C hit the target respectively.
Then
$P(A_1) = \frac{4}{5}$ ∴ $P(\bar A_1) = 1 - \frac{4}{5} = \frac{1}{5}$
$P(A_2) = \frac{3}{4}$ ∴ $P(\bar A_2) = 1 - \frac{3}{4} = \frac{1}{4}$
$P(A_3) = \frac{2}{3}$ ∴ $P(\bar A_3) = 1 - \frac{2}{3} = \frac{1}{3}$
Now the event 'at least two shots hit'
$= (A_1, A_2, A_3) \cup (A_1, A_2, \bar A_3) \cup (A_1, \bar A_2, A_3) \cup (\bar A_1, A_2, A_3)$
Now $P(A_1, A_2, A_3) = P(A_1)P(A_2)P(A_3) = \frac{4}{5}\cdot\frac{3}{4}\cdot\frac{2}{3} = \frac{2}{5}$
and $P(A_1, A_2, \bar A_3) = P(A_1)P(A_2)P(\bar A_3) = \frac{4}{5}\cdot\frac{3}{4}\cdot\frac{1}{3} = \frac{1}{5}$
and $P(A_1, \bar A_2, A_3) = P(A_1)P(\bar A_2)P(A_3) = \frac{4}{5}\cdot\frac{1}{4}\cdot\frac{2}{3} = \frac{2}{15}$
and $P(\bar A_1, A_2, A_3) = P(\bar A_1)P(A_2)P(A_3) = \frac{1}{5}\cdot\frac{3}{4}\cdot\frac{2}{3} = \frac{1}{10}$
So the required probability
$= P(A_1, A_2, A_3) + P(A_1, A_2, \bar A_3) + P(A_1, \bar A_2, A_3) + P(\bar A_1, A_2, A_3)$
$= \frac{2}{5} + \frac{1}{5} + \frac{2}{15} + \frac{1}{10} = \frac{5}{6}$
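The case-by-case enumeration in this example can be mechanised. The Python sketch below lists every hit/miss pattern of the independent shooters and adds the probabilities of the patterns with at least k hits; the function name `prob_at_least` is introduced here purely for illustration.

```python
from fractions import Fraction
from itertools import product


def prob_at_least(k, probs):
    """P(at least k of the independent events occur), by full enumeration."""
    total = Fraction(0)
    for outcome in product([True, False], repeat=len(probs)):
        if sum(outcome) >= k:
            weight = Fraction(1)
            for hit, p in zip(outcome, probs):
                # Independent events: multiply p for a hit, 1 - p for a miss.
                weight *= p if hit else 1 - p
            total += weight
    return total


hit_probs = [Fraction(4, 5), Fraction(3, 4), Fraction(2, 3)]
result = prob_at_least(2, hit_probs)
print(result)
```

Enumeration visits $2^n$ patterns, which is fine for three shooters but grows quickly with n.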
Example 5. A and B throw alternately with a pair of dice. A wins if he
throws 8 before B throws 5 and B wins if he throws 5 before A throws
8. Find the probability that A wins. [WBUT 2006]
Solution. Let E be the experiment of throwing the pair of dice.
In a single trial let X, Y be the events that A throws 8 and B throws 5
with a pair of dice. Then
X = {(3,5), (5,3), (2,6), (6,2), (4,4)} ∴ n(X) = 5
Y = {(2,3), (3,2), (4,1), (1,4)} ∴ n(Y) = 4
∴ $P(X) = \frac{5}{36}$, $P(Y) = \frac{4}{36} = \frac{1}{9}$, ∵ n(S) = 6 × 6 = 36
∴ $P(\bar X) = 1 - \frac{5}{36} = \frac{31}{36}$
∴ $P(\bar Y) = 1 - \frac{1}{9} = \frac{8}{9}$
We think of this as an infinite sequence of trials $E^\infty$. Here the
event 'A wins'
$= \{X, S, S, \ldots\} \cup \{\bar X, \bar Y, X, S, S, \ldots\} \cup \{\bar X, \bar Y, \bar X, \bar Y, X, S, S, \ldots\} \cup \cdots$
(S is the certain event)
So, the probability that A wins the game
$= P(X) + P(\bar X\, \bar Y\, X) + P(\bar X\, \bar Y\, \bar X\, \bar Y\, X) + \cdots \infty$ [∵ P(S) = 1]
$= P(X) + P(\bar X)P(\bar Y)P(X) + \cdots$ [since the successive trials are independent]
$= \frac{5}{36} + \frac{31}{36}\cdot\frac{8}{9}\cdot\frac{5}{36} + \left(\frac{31}{36}\right)^2\left(\frac{8}{9}\right)^2\cdot\frac{5}{36} + \cdots$
$= \dfrac{\frac{5}{36}}{1 - \frac{8}{9}\cdot\frac{31}{36}} = \dfrac{45}{76}$ [∵ this is an infinite G.P. series with c.r. $\frac{8}{9}\cdot\frac{31}{36} < 1$]
Example 6. A missile hits a target with probability 0.3. How many missiles
should be fired so that there is at least an 80% probability of hitting the
target?
Solution. Let E be the experiment of firing a missile.
Let E be repeated independently n times.
Let 's' = 'a missile hits the target'.
∴ p = P(s) = 0.3. So q = P(f) = 1 − 0.3 = 0.7.
In the n Bernoulli trials, probability of failing to hit the target in every
trial = $q^n$.
Probability of hitting at least once
= 1 − probability of failing in every trial = $1 - q^n = 1 - (0.7)^n$.
We have to find n such that $1 - (0.7)^n > 0.8$
or, $(0.7)^n < 1 - 0.8 = 0.2$
or, $\log (0.7)^n < \log (0.2)$
or, $n \log (0.7) < \log (0.2)$
or, $n > \dfrac{\log (0.2)}{\log (0.7)}$ [∵ log 0.7 is negative]
or, n > 4.512
∴ at least 5 missiles should be fired.
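The inequality $1 - q^n > 0.8$ can also be solved by direct search, which avoids any doubt about the direction of the inequality when dividing by a negative logarithm. The Python sketch below is an illustration only; the function name `min_trials` is an assumption.

```python
import math


def min_trials(p, target):
    """Smallest n for which 1 - (1 - p)**n exceeds target."""
    n = 1
    while 1 - (1 - p) ** n <= target:
        n += 1
    return n


# Bound obtained from the logarithmic form n > log(0.2) / log(0.7).
bound = math.log(0.2) / math.log(0.7)
n_needed = min_trials(0.3, 0.8)
print(round(bound, 3), n_needed)
```

The search and the logarithmic bound agree: the bound lies between 4 and 5, and the first integer above it is the answer.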
Example 7. The probability that a teacher will give a surprise test during
any class is $\frac{1}{5}$. If a student is absent on two days, what is the probability
that he will miss at least one test?
Solution. Let E be the experiment of observing whether the teacher gives a
surprise test on a day.
Let 's' = 'the teacher gives a test',
'f' = 'the teacher does not give a test'.
$p = P(s) = \frac{1}{5}$ and $q = 1 - \frac{1}{5} = \frac{4}{5}$.
Here we consider this as a Bernoulli trial with n = 2. The event 'he will miss
at least one test' = 'the teacher gives a test on at least one of the two days'
= (s, s) ∪ (s, f) ∪ (f, s)
∴ Required probability = P{(s, s) ∪ (s, f) ∪ (f, s)}
= P(s, s) + P(s, f) + P(f, s)
$= p \cdot p + p \cdot q + q \cdot p = p^2 + 2pq$
$= \left(\frac{1}{5}\right)^2 + 2\cdot\frac{1}{5}\cdot\frac{4}{5} = \frac{9}{25}$
Example 8. A factory produces blades among which 20% are defective.
If 5 blades are drawn at random from a day's production, find the
probability that there will be (i) exactly 2 defectives (ii) not less than 2
defectives.
Solution. Let E be the experiment of drawing a blade from the factory's
huge production.
's' = 'the drawn blade is defective'.
Then $p = P(s) = \frac{20}{100} = \frac{1}{5}$.
We think of this as a Bernoulli trial with n = 5.
(i) Now 'exactly 2 defectives'
= '2 successes among 5 trials' = $A_2$
∴ by the Binomial law, required probability
$= P(A_2) = {}^5C_2\, p^2 q^{5-2} = 10 \cdot \left(\frac{1}{5}\right)^2 \left(1 - \frac{1}{5}\right)^3$
$= 10 \cdot \frac{1}{5^2} \cdot \frac{4^3}{5^3} = \frac{640}{3125} = \frac{128}{625}$
(ii) 'not less than 2 defectives' = complement of $A_0 \cup A_1$.
Now, $P(A_0) = {}^5C_0\, p^0 q^{5-0} = \left(\frac{4}{5}\right)^5$
and $P(A_1) = {}^5C_1\, p^1 q^{5-1} = 5 \times \frac{1}{5} \times \left(\frac{4}{5}\right)^4 = \left(\frac{4}{5}\right)^4$
Required probability $= 1 - P(A_0 \cup A_1)$
$= 1 - \{P(A_0) + P(A_1)\}$
$= 1 - \left[\left(\frac{4}{5}\right)^5 + \left(\frac{4}{5}\right)^4\right] = 1 - \left(\frac{1024}{3125} + \frac{1280}{3125}\right) = \frac{821}{3125}$

Example 9. Three groups of children contain respectively 3 girls and 1
boy, 2 girls and 2 boys, 1 girl and 3 boys. One child is selected at random
from each group. Show that the chance that the three selected children
consist of 1 girl and 2 boys is $\frac{13}{32}$. [WBUT 2005, 2007]
Solution. Let $E_1, E_2, E_3$ be the experiments of selecting a child from the
first, second and third group.
Let $G_1, G_2$ and $G_3$ be the events of selecting girls and $B_1, B_2$ and $B_3$
be the events of selecting boys from the three groups respectively.
Then $P(G_1) = \frac{3}{4}$, $P(G_2) = \frac{2}{4}$, $P(G_3) = \frac{1}{4}$,
$P(B_1) = \frac{1}{4}$, $P(B_2) = \frac{2}{4}$, $P(B_3) = \frac{3}{4}$.
We consider the joint independent experiment $(E_1, E_2, E_3)$.
The event '1 girl 2 boys'
$= \{(G_1, B_2, B_3), (B_1, G_2, B_3), (B_1, B_2, G_3)\}$
∴ the required probability
$= P(G_1, B_2, B_3) + P(B_1, G_2, B_3) + P(B_1, B_2, G_3)$
$= P(G_1)P(B_2)P(B_3) + P(B_1)P(G_2)P(B_3) + P(B_1)P(B_2)P(G_3)$
$= \frac{3}{4}\cdot\frac{2}{4}\cdot\frac{3}{4} + \frac{1}{4}\cdot\frac{2}{4}\cdot\frac{3}{4} + \frac{1}{4}\cdot\frac{2}{4}\cdot\frac{1}{4} = \frac{13}{32}$.
Example 10. In an infinite sequence of Bernoulli trials with probability of
success $\frac{1}{3}$, find the probability that 7 failures will precede the first success.
Solution. If s = success, f = failure, then the certain event is
S = {s, f} in a single trial.
$P(s) = \frac{1}{3}$, $P(f) = 1 - \frac{1}{3} = \frac{2}{3}$ and P(S) = 1.
Now the required event = {f, f, f, f, f, f, f, s, S, S, S, ... upto ∞}
∴ Required probability
$= \frac{2}{3}\cdot\frac{2}{3}\cdot\frac{2}{3}\cdot\frac{2}{3}\cdot\frac{2}{3}\cdot\frac{2}{3}\cdot\frac{2}{3}\cdot\frac{1}{3}\cdot 1 \cdot 1 \cdot 1 \cdots$
$= \left(\frac{2}{3}\right)^7 \cdot \frac{1}{3} = \frac{2^7}{3^8}$
Example 11. Three persons A, B and C toss a coin in succession and the
first to obtain a head wins the game. Find the probability of winning of
A, B and C respectively.
Solution. Let E be the experiment of tossing a coin. In a single trial let
's' = 'Head'.
∴ $p = P(s) = \frac{1}{2}$ in a trial. Then $q = P(f) = \frac{1}{2}$.
Let S = {s, f}, which is the certain event in a single trial.
Now the problem is concerned with the infinite sequence of Bernoulli
trials $E^\infty$. In this sequence of trials, by an outcome, say
(f, f, s, s, f, ...), we mean A throws f, B throws f, C
throws s, A throws s, B throws f, ... and so on.
Then the event 'A wins'
= (s, S, S, S, ... ∞) ∪ (f, f, f, s, S, S, ... ∞)
∪ (f, f, f, f, f, f, s, S, S, ... ∞) ∪ ... upto ∞
∴ Probability that A will win
= P(s, S, S, S, ...) + P(f, f, f, s, S, S, ...)
+ P(f, f, f, f, f, f, s, S, S, ...) + ... upto ∞
$= (p \cdot 1 \cdot 1 \cdots) + (q \cdot q \cdot q \cdot p \cdot 1 \cdot 1 \cdots) + (q \cdot q \cdot q \cdot q \cdot q \cdot q \cdot p \cdot 1 \cdot 1 \cdots) + \cdots$
$= p + q^3 p + q^6 p + q^9 p + \cdots$ upto ∞
$= p(1 + q^3 + q^6 + q^9 + \cdots \infty)$
$= p \cdot \dfrac{1}{1 - q^3}$, ∵ $q^3$ is the c.r. of the G.P. and $q^3 < 1$
$= \dfrac{1}{2} \cdot \dfrac{1}{1 - \left(\frac{1}{2}\right)^3} = \dfrac{1}{2} \cdot \dfrac{1}{1 - \frac{1}{8}} = \dfrac{1}{2} \cdot \dfrac{8}{7} = \dfrac{4}{7}$
In this sequence of trials,
the event 'B wins'
= {f, s, S, S, ... ∞} ∪ {f, f, f, f, s, S, S, ...}
∪ {f, f, f, f, f, f, f, s, S, S, ...} ∪ ... upto ∞
∴ Probability that B will win
$= (q \cdot p \cdot 1 \cdot 1 \cdots) + (q \cdot q \cdot q \cdot q \cdot p \cdot 1 \cdot 1 \cdots) + (q \cdot q \cdot q \cdot q \cdot q \cdot q \cdot q \cdot p \cdot 1 \cdot 1 \cdots) + \cdots \infty$
$= qp + q^4 p + q^7 p + \cdots$ upto ∞
$= qp(1 + q^3 + q^6 + \cdots \infty)$
$= qp \cdot \dfrac{1}{1 - q^3} = \dfrac{1}{2} \cdot \dfrac{1}{2} \cdot \dfrac{1}{1 - \frac{1}{8}} = \dfrac{1}{4} \cdot \dfrac{8}{7} = \dfrac{2}{7}$
Similarly the event 'C wins'
= {f, f, s, S, S, ... ∞} ∪ {f, f, f, f, f, s, S, S, ... ∞}
∪ {f, f, f, f, f, f, f, f, s, S, S, ...} ∪ ... upto ∞
∴ Probability that C wins
$= q^2 p + q^5 p + q^8 p + \cdots$ upto ∞
$= q^2 p(1 + q^3 + q^6 + \cdots \text{ upto } \infty) = q^2 p \cdot \dfrac{1}{1 - q^3} = \dfrac{1}{2} \cdot \dfrac{1}{2} \cdot \dfrac{1}{2} \cdot \dfrac{1}{1 - \frac{1}{8}}$
$= \dfrac{1}{8} \cdot \dfrac{8}{7} = \dfrac{1}{7}$
Example 12. A player repeatedly throws a coin and scores one point for
a Head and two points for a Tail. He stops throwing whenever he scores
a total of 4 points. Find the probability of scoring 4.
Solution. In fact this is a finite Bernoulli trial with success
H = Head and failure T = Tail.
Now P(H) = 1/2, P(T) = 1/2 in a single trial.
Then the event 'scoring 4' = {H,H,H,H} ∪ {H,H,T} ∪ {H,T,H}
∪ {T,H,H} ∪ {T,T}
Now, $P\{H,H,H,H\} = P(H)P(H)P(H)P(H) = \left(\frac{1}{2}\right)^4 = \frac{1}{16}$
$P\{H,H,T\} = P(H)P(H)P(T) = \left(\frac{1}{2}\right)^2 \cdot \frac{1}{2} = \frac{1}{8}$
$P\{H,T,H\} = \frac{1}{8}$, $P\{T,H,H\} = \frac{1}{8}$ and $P\{T,T\} = \frac{1}{4}$
∴ the required probability $= \frac{1}{16} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{4} = \frac{11}{16}$
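Listing the sequences by hand becomes tedious for larger totals. As an alternative to the enumeration used in the solution (this recursion is a substitute technique, not the method of the text), note that the running total can reach n only by a Head from n − 1 or a Tail from n − 2. The function name `prob_score_exactly` is an assumption made for this sketch.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def prob_score_exactly(n):
    """Probability that the running total (H = 1 point, T = 2 points) hits n exactly."""
    if n < 0:
        return Fraction(0)
    if n == 0:
        return Fraction(1)  # the total is 0 before any throw
    half = Fraction(1, 2)
    # Last throw is a Head from total n - 1, or a Tail from total n - 2.
    return half * prob_score_exactly(n - 1) + half * prob_score_exactly(n - 2)


print(prob_score_exactly(4), prob_score_exactly(5))
```

For n = 4 the recursion reproduces the value found above by listing the five favourable sequences.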

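Example 13. A, B and C play a game and the chances of their winning in an
attempt are $\frac{2}{3}, \frac{1}{2}$ and $\frac{1}{4}$ respectively. A has the first chance, followed by
B and then by C. This cycle is repeated till one of them wins the game.
Find the chance of A winning the game.
Solution. In a trial let $A_0$ = 'A wins', $B_0$ = 'B wins' and $C_0$ = 'C wins'.
Then $P(A_0) = \frac{2}{3}$, $P(B_0) = \frac{1}{2}$, $P(C_0) = \frac{1}{4}$.
∴ $P(\bar A_0) = 1 - \frac{2}{3} = \frac{1}{3}$, $P(\bar B_0) = \frac{1}{2}$, $P(\bar C_0) = 1 - \frac{1}{4} = \frac{3}{4}$.
S = certain event in each trial.
This is an infinite sequence of trials.
The event 'A wins the game'
$= \{A_0, S, S, \ldots\} \cup \{\bar A_0, \bar B_0, \bar C_0, A_0, S, S, \ldots\}$
$\quad \cup\, \{\bar A_0, \bar B_0, \bar C_0, \bar A_0, \bar B_0, \bar C_0, A_0, S, S, \ldots\} \cup \cdots$ upto ∞
∴ Probability that A wins the game
$= \{P(A_0)P(S)P(S)\cdots\} + \{P(\bar A_0)P(\bar B_0)P(\bar C_0)P(A_0)P(S)\cdots\}$
$\quad + \{P(\bar A_0)P(\bar B_0)P(\bar C_0)P(\bar A_0)P(\bar B_0)P(\bar C_0)P(A_0)P(S)\cdots\} + \cdots$ upto ∞
$= \frac{2}{3} + \frac{1}{3}\cdot\frac{1}{2}\cdot\frac{3}{4}\cdot\frac{2}{3} + \frac{1}{3}\cdot\frac{1}{2}\cdot\frac{3}{4}\cdot\frac{1}{3}\cdot\frac{1}{2}\cdot\frac{3}{4}\cdot\frac{2}{3} + \cdots$ upto ∞
$= \frac{2}{3} + \frac{1}{8}\cdot\frac{2}{3} + \left(\frac{1}{8}\right)^2\cdot\frac{2}{3} + \left(\frac{1}{8}\right)^3\cdot\frac{2}{3} + \cdots$ upto ∞
$= \dfrac{\frac{2}{3}}{1 - \frac{1}{8}} = \dfrac{16}{21}$, summing the infinite G.P. with common ratio $\frac{1}{8}$.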

Exercise 2
1. Three cards are successively drawn from a full pack, the card
drawn being replaced every time. Find the probability that first card is
spade, second card is heart or diamond and the third card is queen.
2. Four cards are drawn successively from a pack of 52 cards with
replacement. Find the probability that all the four cards are of the same
suit.

[Hint: $\left(\frac{13}{52}\cdot\frac{13}{52}\cdot\frac{13}{52}\cdot\frac{13}{52}\right) \times 4$]
3. The probability that Ashok can solve a problem in Business Statistics
is $\frac{4}{5}$, that Amal can solve it is $\frac{2}{3}$, and that Abdul can solve it is $\frac{3}{7}$. If
all of them try independently, find the probability that the problem is solved.
4. A candidate is selected for 3 posts. For the first post there are 3
candidates, for the second post there are 4 and for the third there are 2.
What is the chance of getting at least one post?
[Hint: This is the complement of 'getting none of the three'.
P(getting none of the three) $= \left(1 - \frac{1}{3}\right)\left(1 - \frac{1}{4}\right)\left(1 - \frac{1}{2}\right) = \frac{1}{4}$]
5. Find the probability of obtaining a multiple of three twice in a throw
with 6 dice.
[Hint: ${}^6C_2 \left(\frac{2}{6}\right)^2 \left(1 - \frac{2}{6}\right)^{6-2}$]
6. The probability of hitting a target is $\frac{1}{5}$. If 10 shots are fired, find
the probability of at least two hits. Find also the minimum number of shots
to be fired in order that the probability of hitting the target at least once

exceeds ~ . [Hint. For the first part, 1- (0.8)10 - 2 x (0.8)9 ]


7. One shot is fired from each of the three guns. Let A, B, C denote
events that the target is hit by the first, second and the third gun
respectively. Assuming that A, B, C are mutually independent events and
that peA) = 0.5, PCB) = 0.6, P(C) = 0.8, find the probability that at least
one hit is registered.

8. There are three men aged 60, 65 and 70 years. The probability to
live 5 years more is 0.8 for a 60 year old, 0.6 for a 65 year old, and 0.3
for a 70 year old person. Find the probability that at least two of the three
persons will remain alive 5 years hence.
[Hint. (0.8)(0.6)(1- 0.3) + (0.8)(1- 0.6)(0.3)
+(1- 0.8)(0.6)(0.3) + (0.8)(0.6)(0.3) ]
9. A and B toss a coin alternately and the first to obtain a head wins
the toss. If A starts the game, find the probability of his winning.
[Hint: the required probability $= \sum_{n=0}^{\infty} \left(\frac{1}{2}\right)^{2n} \cdot \frac{1}{2} = \frac{2}{3}$]

10. A player repeatedly throws a coin and scores one point for a Head
and two points for a Tail. He stops throwing whenever he scores a total
of 5 points. Find the probability of scoring 5.
11. In an infinite sequence of Bernoulli trials with probability of success
$\frac{1}{4}$, find the probability that 6 failures will precede the first success.
12. Two persons A and B throw a die alternately, A starting,
till one of them gets a multiple of 3 and wins the game. Find
their respective probabilities of winning.
[Hint: $\frac{1}{3} + \left(\frac{2}{3}\right)^2 \cdot \frac{1}{3} + \left(\frac{2}{3}\right)^4 \cdot \frac{1}{3} + \cdots \infty$]
13. A man alternately tosses a coin and throws a die, beginning with
the coin. Find the probability that he will get a head before he gets '5 or
6' on the die.
[Hint: $\frac{1}{2} + \frac{1}{2} \times \frac{2}{3} \times \frac{1}{2} + \frac{1}{2} \times \frac{2}{3} \times \frac{1}{2} \times \frac{2}{3} \times \frac{1}{2} + \cdots \infty$]
Answers
1. 1/104   2. 1/64   3. 101/105   4. 3/4   5. 80/243
6. 0.624; 4   7. 0.96   8. 0.612   9. 2/3   10. 21/32
11. $3^6/4^7$   12. 3/5, 2/5   13. 3/4

~3
~
I~ _I_S_C_RE
D __T_E_RA ND__O_M_V__MU
__A_B_L
ANn ITS EXPECTATION
__
E

3.1. Random Variable.


Definition. Let S be a given sample space. Then a real-valued function
X defined on S is called a random or stochastic variable (r.v.) or
sometimes a variate. Thus for every point s of a sample space S, we have
a unique real value of X, i.e. X(s).
The range of the function X, i.e. the set of all values assumed by X, is
called the spectrum of the random variable.
Discrete Random Variable. A random variable (r.v.) X is said to be
discrete if the spectrum of X is finite or countably infinite, i.e. an infinite
sequence of distinct values.
Continuous Random Variable. A random variable X is said to be
continuous if it can assume every value in an interval.
Event Described by Random Variable. The set of all sample points s
for which $X(s) \in A$, a given set of real numbers, is an event and is
denoted by $(X \in A)$. In particular, the event (X = a) is the set of all
sample points corresponding to which X takes the value a.
Let [a, b] be a given closed interval. Then the set of all sample points
s for which $a \le X(s) \le b$ is an event and is denoted by $(a \le X \le b)$.
Similarly the events $(a < X \le b)$, $(a \le X < b)$ and $(a < X < b)$ are defined.
Also the event $(-\infty < X \le x)$ is abbreviated as $(X \le x)$ where x is a real
number. Further the events $(-\infty < X < \infty)$ and $(-\infty < X < -\infty)$ denote
respectively the certain event S and the impossible event $\phi$.
Thus we see an event can be described by a random variable.
Illustration. (i) Let us consider the random experiment of tossing two
(unbiased) coins. Then the sample space S contains 4 sample points,
i.e., S = {HH, HT, TH, TT}.
Let the random variable X be such that X(an outcome) = "the number
of heads". Then X is a function over S defined by
X(HH) = 2, X(HT) = X(TH) = 1, X(TT) = 0.
Thus the spectrum of X is {0, 1, 2}, which is a finite set. Hence X is
a discrete random variable here. Here the event (X = 1) = {TH, HT} =
'one head', the event $(-1 < X \le 0)$ = {TT} = 'two tails'.
(ii) Let the random variable X denote the weights (in kg) of a group
of individuals. Then X can assume every value in an interval, say (30, 100),
supposing there is no individual having weight less than 30 or greater
than 100. Hence X is a continuous random variable. Here the event
$(42 < X \le 50)$ = the group of individuals whose weights lie between 42
and 50, including 50; the event (X = 70) = the group of individuals
whose weight is 70 kg.
3.2. Probability Mass Function and Discrete Distribution
Let X be a discrete random variable (r.v.) which assumes the values
$x_0, x_1, x_2, \ldots, x_n, \ldots$ Let $P(X = x_i) = f(x_i) = f_i$. So, the value of $f_i$
depends on $x_i$, i.e. on i. This function $f_i$ is called the probability mass function
(p.m.f.) of the random variable X. A particular value of $f_i$ is called a
probability mass.
The set of ordered pairs $(x_i, f_i)$ is called the discrete probability
distribution of the random variable X.
A discrete distribution is presented in the following way:
X     : x0   x1   x2   ...
f_i   : f0   f1   f2   ...
Illustration. For the random experiment of tossing two coins as given
in Illustration (i) of Art. 3.1, we see X assumes the values 0, 1 and 2.
Moreover,
$P(X = 0) = \frac{1}{4}$, $P(X = 1) = \frac{1}{2}$, $P(X = 2) = \frac{1}{4}$.
So, the distribution of the number of heads is given by
X     : 0     1     2
f_i   : 1/4   1/2   1/4
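The passage from a sample space to a discrete distribution can be traced step by step in code. The Python sketch below builds the sample space of two coin tosses, applies the random variable X = 'number of heads', and collects the probability masses; all names (`sample_space`, `X`, `pmf`) are chosen here for illustration and assume equally likely sample points.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space of tossing two unbiased coins: HH, HT, TH, TT.
sample_space = ["".join(t) for t in product("HT", repeat=2)]


def X(outcome):
    """Random variable: number of heads in the outcome."""
    return outcome.count("H")


# Each sample point is equally likely, so f_i = (number of points with X = x_i) / |S|.
counts = Counter(X(s) for s in sample_space)
pmf = {x: Fraction(c, len(sample_space)) for x, c in sorted(counts.items())}

# The event (X = 1) is the set of sample points mapped to 1.
event_one_head = [s for s in sample_space if X(s) == 1]
print(pmf, event_one_head)
```

The keys of `pmf` form the spectrum of X, and the values are the probability masses.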
Fundamental Properties of pmf.
If
X     : x0   x1   x2   x3   ...
f_i   : f0   f1   f2   f3   ...
is a discrete distribution of X, then the pmf has the following two properties:
(i) $f_i \ge 0$ (ii) $\sum_i f_i = 1$.
Proof: (i) $f_i = P(X = x_i) \ge 0$,
since a probability is always ≥ 0.
(ii) $\sum_i f_i = \sum_i P(X = x_i) = P(S) = 1$, as $S = \{x_0, x_1, x_2, \ldots\}$ = event
space.
3.3. Distribution Function or Cumulative Distribution Function.
(For Continuous & Discrete)
The distribution function (d.f.) of a random variable X (discrete or continuous) is given by
$F(x) = P(-\infty < X \le x)$, $-\infty < x < \infty$.
Thus, for a discrete random variable, if $x_i \le x < x_{i+1}$, then
$F(x) = P(X = x_0) + P(X = x_1) + \dots + P(X = x_i) = \sum_{\alpha=0}^{i} f_\alpha$.
Illustration. In the discrete distribution
X     : 0    1    2
$f_i$ : 1/4  1/2  1/4
the d.f. is
F(x) = 0,                       x < 0
     = 1/4,                     0 ≤ x < 1
     = 1/4 + 1/2 = 3/4,         1 ≤ x < 2
     = 1/4 + 1/2 + 1/4 = 1,     x ≥ 2.
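The step-by-step summation used in this illustration can be sketched in code. This is only an illustrative helper (the name `distribution_function` is an assumption, not notation from the text): it adds up all masses $f_i$ with $x_i \le x$.

```python
from fractions import Fraction


def distribution_function(pmf):
    """Return F with F(x) = P(X <= x) for a discrete pmf given as {x_i: f_i}."""
    def F(x):
        # add the probability masses sitting at or to the left of x
        return sum((f for xi, f in pmf.items() if xi <= x), Fraction(0))
    return F


pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
F = distribution_function(pmf)
for x in (-1, 0, 0.5, 1, 1.9, 2, 10):
    print(x, F(x))
```

The printed values stay constant between the points 0, 1, 2 and change only at those points, which is the step behaviour described above.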
Properties of Distribution Function.

(i) The distribution function F(x) is a monotonic non-decreasing function.

(ii) $F(-\infty) = 0$ and $F(\infty) = 1$, and hence $0 \le F(x) \le 1$.

(iii) F(x) is continuous on the right at all points and has a jump discontinuity on the left at x = a, the height of the jump being equal to P(X = a), i.e.
$\lim_{x \to a+} F(x) = F(a)$ and $F(a) - \lim_{x \to a-} F(x) = P(X = a)$.
(iv) Suppose a and b are any real numbers such that a < b. Then
$P(a < X \le b) = F(b) - F(a)$,
$P(a < X < b) = F(b) - F(a) - P(X = b)$
and $P(a \le X < b) = F(b) - F(a) - P(X = b) + P(X = a)$.
Proof: Left as exercise.
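Property (iv) can be checked on a concrete case. The sketch below assumes a fair die and the arbitrary choice a = 2, b = 5; it compares each formula with the value obtained by adding the masses directly.

```python
from fractions import Fraction

pmf = {face: Fraction(1, 6) for face in range(1, 7)}  # fair die


def F(x):
    """Distribution function F(x) = P(X <= x)."""
    return sum((f for xi, f in pmf.items() if xi <= x), Fraction(0))


def P(x):
    """Probability mass at the single point x (zero off the spectrum)."""
    return pmf.get(x, Fraction(0))


a, b = 2, 5
open_closed = F(b) - F(a)                # P(a < X <= b)
open_open = F(b) - F(a) - P(b)           # P(a < X < b)
closed_open = F(b) - F(a) - P(b) + P(a)  # P(a <= X < b)
print(open_closed, open_open, closed_open)
```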
Illustration. (i) Let X be a random variable denoting the number of points appearing in a toss of a die. The distribution of X is

X     : 1    2    3    4    5    6
$f_i$ : 1/6  1/6  1/6  1/6  1/6  1/6

Now, if x < 1, $F(x) = P(X \le x) = 0$.
If $1 \le x < 2$, $F(x) = P(X \le x) = f_1 = \frac{1}{6}$.
If $2 \le x < 3$, $F(x) = P(X \le x) = f_1 + f_2 = \frac{1}{6} + \frac{1}{6} = \frac{2}{6}$,
and so on.

Thus the distribution function F(x) is given by:

F(x) = 0,    -∞ < x < 1
     = 1/6,  1 ≤ x < 2
     = 2/6,  2 ≤ x < 3
     = 3/6,  3 ≤ x < 4
     = 4/6,  4 ≤ x < 5
     = 5/6,  5 ≤ x < 6
     = 1,    6 ≤ x < ∞

The graph of the distribution function F(x) is as follows:

[Figure: graph of F(x) against x, a step function rising from 0 to 1 through the levels 1/6, 2/6, 3/6, 4/6, 5/6 with steps at x = 1, 2, ..., 6.]

From the graph it is clear that F(x) is a step function, non-decreasing, continuous on the right at x = 1, 2, 3, ..., 6, and has a jump discontinuity on the left at 1, 2, 3, ..., 6, the height of each jump being 1/6.

Also, $F(-\infty) = 0$, $F(\infty) = 1$.


(ii) Let three balls be drawn at random from a bag containing 5 white and 3 black balls; X denotes the number of white balls drawn.

Then X can assume the values 0, 1, 2, 3.

Here, $f_0 = P(X = 0)$ = probability of no white ball $= \binom{3}{3} / \binom{8}{3} = \frac{1}{56}$,
$f_1 = P(X = 1)$ = probability of one white ball $= \binom{5}{1}\binom{3}{2} / \binom{8}{3} = \frac{15}{56}$.
Similarly, $f_2 = \binom{5}{2}\binom{3}{1} / \binom{8}{3} = \frac{15}{28}$ and $f_3 = \binom{5}{3} / \binom{8}{3} = \frac{5}{28}$.

Then the distribution of X is

X     : 0     1      2      3
$f_i$ : 1/56  15/56  15/28  5/28

From this we get the probability of an event like 'at most two white balls'
$= P(X \le 2) = P(X = 0) + P(X = 1) + P(X = 2) = f_0 + f_1 + f_2 = 1 - f_3 = 1 - \frac{5}{28} = \frac{23}{28}$.
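The counting in illustration (ii) can be reproduced with binomial coefficients. This is a small sketch using only the numbers stated in the illustration (5 white, 3 black, 3 drawn); the constant names are chosen here for readability.

```python
from fractions import Fraction
from math import comb

WHITE, BLACK, DRAWN = 5, 3, 3
total = comb(WHITE + BLACK, DRAWN)  # number of equally likely selections

# x white balls and DRAWN - x black balls
pmf = {
    x: Fraction(comb(WHITE, x) * comb(BLACK, DRAWN - x), total)
    for x in range(DRAWN + 1)
}
at_most_two = sum(pmf[x] for x in (0, 1, 2))
print(pmf, at_most_two)
```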
3.4. Expectation or Mean of a Discrete Random Variable
Let X be a discrete random variable whose distribution is
X     : $x_0$  $x_1$  $x_2$  ...  $x_n$  ...
$f_i$ : $f_0$  $f_1$  $f_2$  ...  $f_n$  ...
Then the mean or expectation or expected value of X, denoted by E(X) or m(X) or simply m, is defined as

$E(X) = x_0 f_0 + x_1 f_1 + x_2 f_2 + \dots = \sum_i x_i f_i$, provided the series is absolutely convergent if the above sum is an infinite series.
Similarly, the mean of a function $\psi(X)$ of the discrete random variable X, denoted by $E\{\psi(X)\}$, is defined as

$E\{\psi(X)\} = \sum_i \psi(x_i) f_i$ for a discrete distribution.

Illustration. (i) Suppose a die is rolled. Let X be the number of points on the die. Then its values are 1, 2, 3, 4, 5, 6.

∴ $P(X = i) = \frac{1}{6}$ for i = 1, 2, 3, 4, 5, 6.

So the distribution of X is
X     : 1    2    3    4    5    6
$f_i$ : 1/6  1/6  1/6  1/6  1/6  1/6

Therefore its expectation is

$E(X) = 1\cdot\frac{1}{6} + 2\cdot\frac{1}{6} + 3\cdot\frac{1}{6} + \dots + 6\cdot\frac{1}{6} = \frac{7}{2}$ [W.B.U.T. 2013]
and

Properties of Expectation
(i) E(a) = a, where a is a constant.

(ii) E(aX + b) = aE(X) + b, a and b being constants.

(iii) $E(a\psi(X) + b) = aE(\psi(X)) + b$.


Proof: Left as an exercise.
Remark. The mean has an important physical significance. In fact this
represents the centre of mass of the probability distribution.

Illustration. A number is chosen at random from the set {1, 2, ..., 100}. We are to find the expected value of the chosen number X and the expectation of $3X^3 + 2$.
Let X = the chosen number. So the distribution of X is
X     : 1      2      3      ...  100
$f_i$ : 1/100  1/100  1/100  ...  1/100

So,
$E(X) = \frac{1}{100}\cdot 1 + \frac{1}{100}\cdot 2 + \dots + \frac{1}{100}\cdot 100 = \frac{1}{100}(1 + 2 + 3 + \dots + 100) = \frac{1}{100}\cdot\frac{100(100+1)}{2} = \frac{101}{2}$.

The expectation of the function $3X^3 + 2$ is

$E(3X^3 + 2) = 3E(X^3) + 2$.
Now, $E(X^3) = 1^3\times\frac{1}{100} + 2^3\times\frac{1}{100} + \dots + 100^3\times\frac{1}{100} = \frac{1}{100}(1^3 + 2^3 + \dots + 100^3) = \frac{1}{100}\left\{\frac{100(100+1)}{2}\right\}^2 = \frac{1}{100}(50\times 101)^2$.

∴ $E(3X^3 + 2) = \frac{3}{100}(50\times 101)^2 + 2 = 765077$.
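The two expectations just computed follow directly from the definition $E\{\psi(X)\} = \sum_i \psi(x_i) f_i$. A minimal sketch, with a helper `expectation` introduced here only for illustration:

```python
from fractions import Fraction


def expectation(pmf, psi=lambda x: x):
    """E{psi(X)} = sum of psi(x_i) * f_i over the spectrum."""
    return sum(psi(x) * f for x, f in pmf.items())


pmf = {x: Fraction(1, 100) for x in range(1, 101)}  # number chosen from 1..100
mean = expectation(pmf)
value = expectation(pmf, lambda x: 3 * x**3 + 2)
print(mean, value)
```

The second result agrees with $3E(X^3) + 2$, which is property (iii) of expectation applied with $\psi(X) = X^3$.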
3.5. Variance and Standard Deviation of a Random Variable
The variance of a r.v. X, denoted by Var(X), is defined as
$\mathrm{Var}(X) = E\{(X - m)^2\}$, where m = E(X).
The positive square root of Var(X) is called the standard deviation (s.d.) of X and is denoted by $\sigma(X)$ or $\sigma_X$ or simply $\sigma$.

Thus $\sigma = +\sqrt{\mathrm{Var}(X)}$.
Remarks: (i) The variance describes how widely the probability masses are spread about the mean, i.e. it gives an inverse measure of the concentration of the probability masses about the mean; such a quantity is called a measure of dispersion.
(ii) As Var(X) = 0 only when X - m = 0, i.e. X = m, in that case the whole mass is concentrated at the mean.
Theorem.
(i) $\mathrm{Var}(X) = E(X^2) - m^2 = E(X^2) - \{E(X)\}^2$

(ii) $\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$

(iii) Var(k) = 0, where k is a constant.
(iv) $\mathrm{Var}(X) = E\{X(X-1)\} - m(m-1)$, where m is the mean of X.

Proof: (i) $\mathrm{Var}(X) = E\{(X-m)^2\} = E(X^2 - 2mX + m^2)$
$= E(X^2) - E(2mX) + E(m^2)$
$= E(X^2) - 2mE(X) + m^2 = E(X^2) - 2m\cdot m + m^2 = E(X^2) - m^2$.

(ii) and (iii) are left as exercises.

(iv) $(X-m)^2 = X(X-1) - 2mX + X + m^2$
∴ $E\{(X-m)^2\} = E\{X(X-1)\} - 2mE(X) + E(X) + E(m^2)$
$= E\{X(X-1)\} - 2m\cdot m + m + m^2$
$= E\{X(X-1)\} - m(m-1)$.
Note: In fact the results (i) and (iv) of the above theorem are used to evaluate variance and standard deviation.
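The four results of the theorem can be verified numerically. The sketch below assumes a fair die as the test distribution and the arbitrary constants a = 3, b = 5; none of these choices come from the theorem itself.

```python
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}  # fair die, used only as a test case


def E(psi):
    """Expectation of psi(X) under the pmf above."""
    return sum(psi(x) * f for x, f in pmf.items())


m = E(lambda x: x)
var_def = E(lambda x: (x - m) ** 2)             # definition
var_i = E(lambda x: x * x) - m**2               # result (i)
var_iv = E(lambda x: x * (x - 1)) - m * (m - 1)  # result (iv)

a, b = 3, 5
mean_ab = E(lambda x: a * x + b)
var_ab = E(lambda x: (a * x + b - mean_ab) ** 2)  # Var(aX + b) from the definition
print(var_def, var_i, var_iv, var_ab)
```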
3.6. Moments of a Random Variable
The r-th moment of a r.v. X about A, denoted by $\mu_r'$, is defined as

$\mu_r' = E\{(X - A)^r\}$, where A is a real number.

∴ The r-th moment about zero is

$\mu_r' = E(X^r)$.
The r-th moment about the mean of X is
$\mu_r = E\{(X - \bar{X})^r\}$, which is known as the r-th central moment of X, where $\bar{X}$ is the mean of X.
Example. Let X be a discrete random variable whose distribution is

X     : 3    6    7    11
$f_i$ : 0.5  0.1  0.2  0.2

Here $\bar{X} = 3\times 0.5 + 6\times 0.1 + 7\times 0.2 + 11\times 0.2 = 5.7$.
∴ here the 3rd central moment of X is

$\mu_3 = E\{(X - 5.7)^3\}$
$= (3 - 5.7)^3\times 0.5 + (6 - 5.7)^3\times 0.1 + (7 - 5.7)^3\times 0.2 + (11 - 5.7)^3\times 0.2$
$= -9.8415 + 0.0027 + 0.4394 + 29.7754$
$= 20.376$.
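The arithmetic of this example can be repeated exactly. This is a short sketch with a generic helper `moment` (a name assumed here) that evaluates $E\{(X - A)^r\}$ for any A.

```python
from fractions import Fraction

pmf = {
    3: Fraction("0.5"),
    6: Fraction("0.1"),
    7: Fraction("0.2"),
    11: Fraction("0.2"),
}


def moment(pmf, r, about=0):
    """r-th moment about the point `about`: E{(X - about)^r}."""
    return sum((x - about) ** r * f for x, f in pmf.items())


mean = moment(pmf, 1)             # first moment about zero
mu3 = moment(pmf, 3, about=mean)  # third central moment
print(float(mean), float(mu3))
```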

Theorems:
(1) The 0-th moment about any number is 1.
(2) The 1st moment about 0 is the mean of the variable.
(3) The 1st central moment is always zero.
(4) The 2nd central moment is the variance.
Proof. Let X be the random variable.

(1) $\mu_0' = E\{(X - A)^0\} = E(1) = 1$

(2) $\mu_1' = E(X - 0) = E(X) = \bar{X}$

(3) $\mu_1 = E(X - \bar{X}) = E(X) - E(\bar{X}) = \bar{X} - \bar{X} = 0$

(4) $\mu_2 = E\{(X - \bar{X})^2\} = \mathrm{Var}(X)$, by definition.

The following theorems give the relations among central moments and moments about any number.
Theorems. For any random variable X,

(1) $\mu_2 = \mu_2' - \mu_1'^2$

(2) $\mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2\mu_1'^3$

(3) $\mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'\mu_1'^2 - 3\mu_1'^4$

Proof. (1) RHS $= \mu_2' - \mu_1'^2 = E\{(X - A)^2\} - \{E(X - A)\}^2$
$= E(X^2 - 2AX + A^2) - \{E(X) - E(A)\}^2$
$= E(X^2) - 2AE(X) + A^2 - (\bar{X} - A)^2$ [∵ A is constant]
$= E(X^2) - 2A\bar{X} + A^2 - \bar{X}^2 - A^2 + 2A\bar{X}$
$= E(X^2) - \bar{X}^2 = E(X^2) - \{E(X)\}^2$
$= \mathrm{Var}(X) = \mu_2$ (using the previous theorem).

(2) Beyond the scope of this book.

(3) Beyond the scope of this book.
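Although the proofs of (2) and (3) are omitted, the three relations can at least be checked numerically. The sketch below reuses the distribution of the example in art. 3.6 and takes A = 2, an arbitrary choice made here for illustration.

```python
from fractions import Fraction

pmf = {
    3: Fraction("0.5"),
    6: Fraction("0.1"),
    7: Fraction("0.2"),
    11: Fraction("0.2"),
}


def moment(pmf, r, about=0):
    """r-th moment about the point `about`: E{(X - about)^r}."""
    return sum((x - about) ** r * f for x, f in pmf.items())


A = 2  # arbitrary reference point
m1, m2, m3, m4 = (moment(pmf, r, about=A) for r in (1, 2, 3, 4))

mean = moment(pmf, 1)
c2, c3, c4 = (moment(pmf, r, about=mean) for r in (2, 3, 4))

print(c2 == m2 - m1**2)
print(c3 == m3 - 3 * m2 * m1 + 2 * m1**3)
print(c4 == m4 - 4 * m3 * m1 + 6 * m2 * m1**2 - 3 * m1**4)
```

A numerical check of this kind does not replace a proof, but it guards against sign or coefficient slips when the relations are used in problems.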

3.7. Illustrative Examples.


Example 1. Find the probability distribution (or probability function or p.m.f.) of the number of heads when a fair coin is tossed repeatedly until the first tail appears.
The sample space corresponding to the random experiment of tossing a fair coin is S = {T, HT, HHT, HHHT, ...}.
Let the random variable X denote "the number of heads in the experiment until the first tail appears".

Then the spectrum of X is {0, 1, 2, 3, ...}.

Now, $P(X = 0) = P(T) = \frac{1}{2}$,
$P(X = 1) = P(HT) = P(H)\cdot P(T) = \frac{1}{2}\cdot\frac{1}{2} = \frac{1}{2^2}$ [∵ the trials are independent],
$P(X = 2) = P(HHT) = P(H)P(H)P(T) = \frac{1}{2^3}$, and so on.

Hence the probability distribution of X is

X     : 0    1        2        ...
$f_i$ : 1/2  $1/2^2$  $1/2^3$  ...
Example 2. A random variable X has the following probability mass function [W.B.U.Tech 2007]
X               : 0  1  2   3   4   5      6       7
P(X = x) = f(x) : 0  k  2k  2k  3k  $k^2$  $2k^2$  $7k^2 + k$
(i) Determine the constant k.
(ii) Evaluate $P(X < 6)$, $P(X \ge 6)$, $P(3 < X \le 6)$ and $P(3 < X / X \le 6)$.

(iii) Find the minimum value of x so that $P(X \le x) > \frac{1}{2}$.

(iv) Obtain the distribution function F(x). [W.B.U.Tech 2004]

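(i) Since f(x) is a p.m.f., $\sum_x f(x) = 1$.

∴ $\sum_{x=0}^{7} f(x) = 1 \Rightarrow 0 + k + 2k + 2k + 3k + k^2 + 2k^2 + 7k^2 + k = 1$
$\Rightarrow 10k^2 + 9k - 1 = 0 \Rightarrow (10k - 1)(k + 1) = 0 \Rightarrow k = -1, \frac{1}{10}$
∴ $k = \frac{1}{10}$ [∵ $f(x) \ge 0$ for all x = 0, 1, 2, ..., 7 and so $k \ne -1$]

(ii) $P(X < 6) = 1 - P(X \ge 6) = 1 - \{P(X = 6) + P(X = 7)\}$
$= 1 - \left\{2\left(\frac{1}{10}\right)^2 + 7\left(\frac{1}{10}\right)^2 + \frac{1}{10}\right\} = \frac{81}{100}$

∴ $P(X \ge 6) = 1 - P(X < 6) = 1 - \frac{81}{100} = \frac{19}{100}$
$P(3 < X \le 6) = P(X = 4) + P(X = 5) + P(X = 6) = \frac{33}{100}$

$P(3 < X / X \le 6) = \frac{P\{(3 < X) \cap (X \le 6)\}}{P(X \le 6)} = \frac{P(3 < X \le 6)}{P(X \le 6)} = \frac{33/100}{83/100} = \frac{33}{83}$

(iii) Now $P(X \le 1) = \frac{1}{10} < \frac{1}{2}$,
$P(X \le 2) = P(X = 0) + P(X = 1) + P(X = 2) = \frac{1}{10} + 2\cdot\frac{1}{10} = \frac{3}{10} < \frac{1}{2}$,
$P(X \le 3) = P(X = 1) + P(X = 2) + P(X = 3) = \frac{1}{10} + 2\cdot\frac{1}{10} + 2\cdot\frac{1}{10} = \frac{1}{2}$ (as P(X = 0) = 0),
$P(X \le 4) = P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4) = \frac{8}{10} = \frac{4}{5} > \frac{1}{2}$.

Thus the minimum value of x so that $P(X \le x) > \frac{1}{2}$ is 4.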
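(iv) The distribution function F(x) is given below:
F(x) = 0,                                                    -∞ < x < 1
     = $\frac{1}{10}$,                                       1 ≤ x < 2
     = $\frac{1}{10} + 2\cdot\frac{1}{10} = \frac{3}{10}$,   2 ≤ x < 3
     = $\frac{3}{10} + 2\cdot\frac{1}{10} = \frac{1}{2}$,    3 ≤ x < 4
     = $\frac{1}{2} + 3\cdot\frac{1}{10} = \frac{4}{5}$,     4 ≤ x < 5
     = $\frac{4}{5} + \left(\frac{1}{10}\right)^2 = \frac{81}{100}$,   5 ≤ x < 6
     = $\frac{81}{100} + 2\left(\frac{1}{10}\right)^2 = \frac{83}{100}$,   6 ≤ x < 7
     = $\frac{83}{100} + 7\left(\frac{1}{10}\right)^2 + \frac{1}{10} = 1$,   7 ≤ x < ∞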
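The whole of Example 2 can be re-checked in a few lines once k is known. This is only a verification sketch; the helper names `pmf_for` and `prob` are introduced here and are not part of the example.

```python
from fractions import Fraction


def pmf_for(k):
    """pmf of Example 2 on the spectrum 0..7 for a given constant k."""
    values = [0, k, 2 * k, 2 * k, 3 * k, k**2, 2 * k**2, 7 * k**2 + k]
    return dict(enumerate(values))


k = Fraction(1, 10)  # admissible root of 10k^2 + 9k - 1 = 0
pmf = pmf_for(k)


def prob(event):
    """Probability of the set of spectrum points x for which event(x) is true."""
    return sum(f for x, f in pmf.items() if event(x))


p_lt6 = prob(lambda x: x < 6)
p_ge6 = prob(lambda x: x >= 6)
p_mid = prob(lambda x: 3 < x <= 6)
p_cond = p_mid / prob(lambda x: x <= 6)  # P(3 < X / X <= 6)
x_min = min(x for x in pmf if prob(lambda y: y <= x) > Fraction(1, 2))
print(p_lt6, p_ge6, p_mid, p_cond, x_min)
```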

Example 3. Let
F(x) = 0,    -∞ < x < 0
     = 1/5,  0 ≤ x < 1
     = 3/5,  1 ≤ x < 3
     = 1,    3 ≤ x < ∞
Show that F(x) is a possible distribution function. Determine the spectrum and the probability mass of the distribution. [W.B.U.Tech 2003]
Hence find the mean and standard deviation of X.
Clearly F(x) is a monotonic non-decreasing function and is continuous on the right at all points.

Also, $F(-\infty) = 0$ and $F(\infty) = 1$. Hence F(x) is a possible distribution function.
Again, F(x) is a step function and the step points are 0, 1, 3. So the spectrum is {0, 1, 3}.

Now, $P(X = 0) = F(0) - \lim_{x \to 0-} F(x) = \frac{1}{5} - 0 = \frac{1}{5}$,
$P(X = 1) = F(1) - \lim_{x \to 1-} F(x) = \frac{3}{5} - \frac{1}{5} = \frac{2}{5}$,
$P(X = 3) = F(3) - \lim_{x \to 3-} F(x) = 1 - \frac{3}{5} = \frac{2}{5}$.
So the required probability mass of the distribution is
X     : 0    1    3
$f_i$ : 1/5  2/5  2/5
The mean $\bar{X} = E(X) = 0\times\frac{1}{5} + 1\times\frac{2}{5} + 3\times\frac{2}{5} = \frac{8}{5}$.
$E(X^2) = 0^2\times\frac{1}{5} + 1^2\times\frac{2}{5} + 3^2\times\frac{2}{5} = 4$

∴ $\mathrm{Var}(X) = E(X^2) - \{E(X)\}^2 = 4 - \left(\frac{8}{5}\right)^2 = \frac{36}{25}$

∴ the standard deviation, $\sigma_X = \sqrt{\frac{36}{25}} = \frac{6}{5}$.
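The rule $P(X = a) = F(a) - \lim_{x \to a-} F(x)$ used in Example 3 amounts to taking differences of successive step heights. A minimal sketch, assuming the step function is supplied as a list of (step point, value of F from that point onward):

```python
from fractions import Fraction
from math import sqrt

# (step point, value taken by F from that point up to the next step)
steps = [(0, Fraction(1, 5)), (1, Fraction(3, 5)), (3, Fraction(1))]

pmf = {}
previous = Fraction(0)  # F(-infinity) = 0
for point, value in steps:
    pmf[point] = value - previous  # height of the jump at this point
    previous = value

mean = sum(x * f for x, f in pmf.items())
variance = sum(x * x * f for x, f in pmf.items()) - mean**2
sd = sqrt(variance)
print(pmf, mean, variance, sd)
```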

Example 4. The distribution function F(x) of a variate X is defined as follows:
F(x) = A,  -∞ < x < -1
     = B,  -1 ≤ x < 0
     = C,  0 ≤ x < 2
     = D,  2 ≤ x < ∞
where A, B, C, D are constants. Determine the values of A, B, C, D, given that $P(X = 0) = \frac{1}{6}$ and $P(X > 1) = \frac{2}{3}$. [W.B.U.Tech 2004]

We have $F(-\infty) = 0$
∴ $\lim_{x \to -\infty} F(x) = 0$, i.e. $\lim_{x \to -\infty} (A) = 0$ ∴ A = 0.

Again $F(\infty) = 1$ ∴ $\lim_{x \to \infty} F(x) = 1$, i.e. $\lim_{x \to \infty} (D) = 1$ ∴ D = 1.

Now, $\frac{1}{6} = P(X = 0) = F(0) - \lim_{x \to 0-} F(x)$ [∵ $P(X = a) = F(a) - \lim_{x \to a-} F(x)$]
$= C - \lim_{x \to 0-} F(x) = C - B$

∴ $C - B = \frac{1}{6}$ ... (1)

Again $P(-\infty < X < \infty) = P(-\infty < X \le 1) + P(1 < X < \infty)$
∴ $1 = P(-\infty < X \le 1) + P(X > 1) = P(-\infty < X \le 1) + \frac{2}{3}$
∴ $P(-\infty < X \le 1) = 1 - \frac{2}{3}$
i.e. $F(1) = \frac{1}{3}$ [∵ $F(x) = P(-\infty < X \le x)$]
∴ $C = \frac{1}{3}$
∴ From (1), $B = C - \frac{1}{6} = \frac{1}{3} - \frac{1}{6} = \frac{1}{6}$
∴ A = 0, $B = \frac{1}{6}$, $C = \frac{1}{3}$, D = 1.

Example 5. A special unbiased die with n + 1 faces is rolled. Its faces are marked by the numbers $0, \frac{1}{n}, \frac{2}{n}, \dots, \frac{n-1}{n}, \frac{n}{n}$. If X denotes the number shown, then find (i) the expectation of X, (ii) the standard deviation of X, (iii) $E\left(X - \frac{1}{2}\right)^3$.

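Note that the probability of each face is $\frac{1}{n+1}$.
So, the distribution of X is
X     : 0        1/n      2/n      ...  (n-1)/n  n/n
$f_i$ : 1/(n+1)  1/(n+1)  1/(n+1)  ...  1/(n+1)  1/(n+1)
Obviously X assumes the values $x_i = \frac{i}{n}$, i = 0, 1, ..., n.

(i) $E(X) = 0\times\frac{1}{n+1} + \frac{1}{n}\times\frac{1}{n+1} + \frac{2}{n}\times\frac{1}{n+1} + \dots + \frac{n-1}{n}\times\frac{1}{n+1} + \frac{n}{n}\times\frac{1}{n+1}$
$= \sum_{i=0}^{n}\frac{i}{n}\cdot\frac{1}{n+1} = \frac{1}{n(n+1)}\sum_{i=0}^{n} i = \frac{1}{n(n+1)}\cdot\frac{n(n+1)}{2} = \frac{1}{2}$

(ii) $E(X^2) = \sum_{i=0}^{n} x_i^2 f_i = \sum_{i=0}^{n}\left(\frac{i}{n}\right)^2\times\frac{1}{n+1} = \frac{1}{n^2(n+1)}\sum_{i=0}^{n} i^2 = \frac{1}{n^2(n+1)}\cdot\frac{n(n+1)(2n+1)}{6} = \frac{2n+1}{6n}$

∴ $\mathrm{Var}(X) = E(X^2) - \{E(X)\}^2 = \frac{2n+1}{6n} - \frac{1}{4} = \frac{n+2}{12n}$

∴ $\sigma_X = \sqrt{\frac{n+2}{12n}}$

(iii) Now, $E(X^3) = \frac{1}{n^3(n+1)}(1^3 + 2^3 + \dots + n^3) = \frac{1}{n^3(n+1)}\left\{\frac{n(n+1)}{2}\right\}^2 = \frac{n+1}{4n}$
and $E\left(X - \frac{1}{2}\right)^3 = E(X^3) - \frac{3}{2}E(X^2) + \frac{3}{4}E(X) - \frac{1}{8} = \frac{n+1}{4n} - \frac{2n+1}{4n} + \frac{3}{8} - \frac{1}{8}$.

Hence $E\left(X - \frac{1}{2}\right)^3 = 0$.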
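The three results of Example 5 hold for every n, which can be spot-checked for a few values. The sketch below is only a check under the stated distribution; the function name `moments_of_die` is an assumption made here.

```python
from fractions import Fraction


def moments_of_die(n):
    """Mean, variance and E(X - 1/2)^3 for the die with faces 0, 1/n, ..., n/n."""
    pmf = {Fraction(i, n): Fraction(1, n + 1) for i in range(n + 1)}
    mean = sum(x * f for x, f in pmf.items())
    var = sum((x - mean) ** 2 * f for x, f in pmf.items())
    third = sum((x - Fraction(1, 2)) ** 3 * f for x, f in pmf.items())
    return mean, var, third


for n in (1, 2, 5, 10):
    mean, var, third = moments_of_die(n)
    print(n, mean, var, third)
```

The third moment about 1/2 vanishes because the faces are placed symmetrically about 1/2 with equal probabilities.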

Example 6. If t is a positive real number and X is a discrete r.v. assuming the values 1, 2, 3, ... with p.m.f. $P(X = i) = e^{-t}(1 - e^{-t})^{i-1}$, find the mean and E(3X + 2).

The p.m.f. is $f_i = e^{-t}(1 - e^{-t})^{i-1}$.

∴ the mean, $E(X) = \sum_{i=1}^{\infty} i f_i = \sum_{i=1}^{\infty} i e^{-t}(1 - e^{-t})^{i-1}$
$= e^{-t}\sum_{i=1}^{\infty} i z^{i-1}$, where $z = 1 - e^{-t}$
$= e^{-t}(1 + 2z + 3z^2 + \dots \text{ up to } \infty) = e^{-t}(1 - z)^{-2}$ [∵ 0 < z < 1]
$= e^{-t}\{1 - (1 - e^{-t})\}^{-2} = e^{t}$.

Now $E(3X + 2) = 3E(X) + 2 = 3e^{t} + 2$.


Example 7. If a person gains or loses an amount equal to the number
appearing when a balanced die is rolled once according to whether the
number is even or odd, how much money can he expect from the game
in the long run ?
Let X = amount of gain.
Then X may assume the values -1,2, - 3,4, - 5 and 6.

Now, $P(X = -1)$ = Prob. of 'face 1' $= \frac{1}{6}$,
$P(X = 2)$ = Prob. of 'face 2' $= \frac{1}{6}$,
$P(X = -3)$ = Prob. of 'face 3' $= \frac{1}{6}$,
and so on.
Therefore the distribution of X is
X     : -1   2    -3   4    -5   6
$f_i$ : 1/6  1/6  1/6  1/6  1/6  1/6

∴ The amount of expected money is
$E(X) = -1\times\frac{1}{6} + 2\times\frac{1}{6} + (-3)\times\frac{1}{6} + 4\times\frac{1}{6} + (-5)\times\frac{1}{6} + 6\times\frac{1}{6}$
$= \frac{1}{6}(-1 + 2 - 3 + 4 - 5 + 6) = \frac{1}{6}\times 3 = \frac{1}{2}$.
Example 8. If a person gets Rs. (2x + 5) where x denotes the number
appearing when a balanced die is rolled once, then how much money can
he expect in the long run per game?
By the problem, x = number appearing on the die,
∴ the distribution of x is
x     : 1    2    3    4    5    6
$f_i$ : 1/6  1/6  1/6  1/6  1/6  1/6

Now, $E(x) = 1\cdot\frac{1}{6} + 2\cdot\frac{1}{6} + 3\cdot\frac{1}{6} + 4\cdot\frac{1}{6} + 5\cdot\frac{1}{6} + 6\cdot\frac{1}{6} = \frac{21}{6} = \frac{7}{2}$.
Expected amount of money $= E(2x + 5) = E(2x) + E(5) = 2E(x) + 5 = 2\times\frac{7}{2} + 5 = 12$.
Example 9. Find the second, third and fourth central moments of X where P(X = 1) = a, P(X = 0) = 1 - a.
Solution. Obviously the distribution of X is
X     : 0      1
$f_i$ : 1 - a  a

The moments about 0 are

$\mu_1' = E(X - 0) = E(X) = 0\times(1 - a) + 1\times a = a$
$\mu_2' = E\{(X - 0)^2\} = E(X^2) = 0^2\cdot(1 - a) + 1^2\cdot a = a$
$\mu_3' = E\{(X - 0)^3\} = E(X^3) = 0^3\cdot(1 - a) + 1^3\cdot a = a$
$\mu_4' = E\{(X - 0)^4\} = E(X^4) = 0^4\cdot(1 - a) + 1^4\cdot a = a$

From the relations among moments and central moments we get the required central moments as follows:

$\mu_2 = \mu_2' - \mu_1'^2 = a - a^2$

$\mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2\mu_1'^3 = a - 3\cdot a\cdot a + 2a^3 = 2a^3 - 3a^2 + a$

$\mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'\mu_1'^2 - 3\mu_1'^4 = a - 4\cdot a\cdot a + 6\cdot a\cdot a^2 - 3a^4 = a - 4a^2 + 6a^3 - 3a^4$

Example 10. Find the expectation of the number of heads preceding the first tail in an infinite sequence of tosses of the same coin.
Solution. Let X = number of heads preceding the first tail.
In a single throw the event space is S = {H, T}.
Then X may take the values 0, 1, 2, 3, ... up to ∞.

Now, (X = 0) = {T, S, S, ...}
∴ $P(X = 0) = P(T)\,P(S)\,P(S)\cdots = \frac{1}{2}\cdot 1\cdot 1\cdots = \frac{1}{2}$
(X = 1) = {H, T, S, S, ...}
∴ $P(X = 1) = P(H)P(T)\,P(S)\,P(S)\cdots = \frac{1}{2}\cdot\frac{1}{2}\cdot 1\cdot 1\cdots = \left(\frac{1}{2}\right)^2$
Similarly $P(X = 2) = P(H)P(H)P(T)P(S)\cdots = \left(\frac{1}{2}\right)^3$.
Thus $P(X = 3) = \left(\frac{1}{2}\right)^4$, $P(X = 4) = \left(\frac{1}{2}\right)^5$, and so on.

∴ the required expectation is
$E(X) = 0\cdot\frac{1}{2} + 1\cdot\left(\frac{1}{2}\right)^2 + 2\cdot\left(\frac{1}{2}\right)^3 + 3\cdot\left(\frac{1}{2}\right)^4 + \dots$ up to ∞
$= p^2 + 2p^3 + 3p^4 + \dots$ (putting $p = \frac{1}{2}$)
$= p^2(1 + 2p + 3p^2 + \dots) = p^2(1 - p)^{-2} = \frac{1}{4}\left(1 - \frac{1}{2}\right)^{-2} = \frac{1}{4}\times 2^2 = 1$.
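The infinite series of Example 10 can be approximated by its partial sums. This is a sketch assuming a fair coin; `truncated_expectation` is a name chosen here, and the sum is cut off after a finite number of terms, so it only approaches the exact value 1.

```python
from fractions import Fraction


def truncated_expectation(terms, p_head=Fraction(1, 2)):
    """Partial sum of E(X) using P(X = i) = p_head**i * (1 - p_head)."""
    # i heads followed by the first tail
    return sum(i * p_head**i * (1 - p_head) for i in range(terms))


for terms in (5, 10, 20, 60):
    print(terms, float(truncated_expectation(terms)))
```

For the fair coin the shortfall after N terms is $(N + 1)/2^N$, so the partial sums approach 1 very quickly.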

Example 11. The first three moments of X about 3 are 2, 10 and 30 respectively. Obtain the first three moments about zero. Hence find the variance of X.
Solution. By the problem, E(X - 3) = 2
or, E(X) - E(3) = 2 or, E(X) - 3 = 2 or, E(X) = 5.
∴ the 1st moment about zero is 5.
Again, by the problem, $E\{(X - 3)^2\} = 10$
or, $E(X^2 - 6X + 9) = 10$
or, $E(X^2) - 6E(X) + E(9) = 10$
or, $E(X^2) - 6\times 5 + 9 = 10$ or, $E(X^2) = 31$.
∴ the 2nd moment about zero is 31.
Again, by the problem, $E\{(X - 3)^3\} = 30$
or, $E(X^3 - 3X^2\cdot 3 + 3\cdot X\cdot 3^2 - 3^3) = 30$
or, $E(X^3 - 9X^2 + 27X - 27) = 30$
or, $E(X^3) - 9E(X^2) + 27E(X) - 27 = 30$
or, $E(X^3) - 9\times 31 + 27\times 5 = 57$
∴ $E(X^3) = 57 + 279 - 135 = 201$.
∴ the 3rd moment about zero is 201.

Now, variance of X $= E(X^2) - \{E(X)\}^2 = 31 - 5^2 = 6$.
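The conversion carried out in Example 11 is a binomial expansion of $X^r = \{(X - A) + A\}^r$ followed by taking expectations. A minimal sketch of that idea, with the function name `moments_about_zero` assumed here for illustration:

```python
from math import comb


def moments_about_zero(moments_about_a, a):
    """Convert [E(X-a), E((X-a)^2), ...] into [E(X), E(X^2), ...]."""
    shifted = [1] + list(moments_about_a)  # E((X-a)^0) = 1
    result = []
    for r in range(1, len(shifted)):
        # E(X^r) = sum over k of C(r, k) * a^(r-k) * E((X-a)^k)
        result.append(
            sum(comb(r, k) * a ** (r - k) * shifted[k] for k in range(r + 1))
        )
    return result


raw = moments_about_zero([2, 10, 30], 3)
variance = raw[1] - raw[0] ** 2
print(raw, variance)
```

The same function with a negative shift converts moments about zero back into moments about another point, since only the value of `a` changes.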

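Example 12. Find the third central moment of the following discrete distribution:
X     : 0        1/n      2/n      3/n      ...  (n-1)/n  n/n
$f_i$ : 1/(n+1)  1/(n+1)  1/(n+1)  1/(n+1)  ...  1/(n+1)  1/(n+1)
Solution. The mean,
$\bar{X} = E(X) = 0\cdot\frac{1}{n+1} + \frac{1}{n}\cdot\frac{1}{n+1} + \frac{2}{n}\cdot\frac{1}{n+1} + \dots + \frac{n}{n}\cdot\frac{1}{n+1}$
$= \sum_{i=0}^{n}\frac{i}{n}\cdot\frac{1}{n+1} = \frac{1}{n(n+1)}\sum_{i=0}^{n} i = \frac{1}{n(n+1)}(1 + 2 + 3 + \dots + n) = \frac{1}{n(n+1)}\cdot\frac{n(n+1)}{2} = \frac{1}{2}$

Now the third central moment,
$\mu_3 = E\left\{\left(X - \frac{1}{2}\right)^3\right\} = E(X^3) - \frac{3}{2}E(X^2) + \frac{3}{4}E(X) - \frac{1}{8}$
$= \frac{1}{n^3(n+1)}\left\{\frac{n(n+1)}{2}\right\}^2 - \frac{3}{2n^2(n+1)}\cdot\frac{n(n+1)(2n+1)}{6} + \frac{3}{4n(n+1)}\cdot\frac{n(n+1)}{2} - \frac{1}{8}$
$= 0$ (detailed evaluation is not shown).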


EXERCISES
[I] SHORT ANSWER QUESTIONS
1. Find the mean and variance of the following distribution:
$x_i$ : -1   0    1    2    3
$f_i$ : 0.3  0.1  0.1  0.3  0.2
[Hints: mean = (-1) × 0.3 + 0 × 0.1 + 1 × 0.1 + 2 × 0.3 + 3 × 0.2 = 1.0
Var $= (-1 - 1)^2\times 0.3 + (0 - 1)^2\times 0.1 + (1 - 1)^2\times 0.1 + (2 - 1)^2\times 0.3 + (3 - 1)^2\times 0.2 = 2.4$]

2. For what value of A will the function
f(x) = Ax, x = 1, 2, 3, ..., n
be a probability mass function of a random variable?
[Hints: Use the result $\sum f_i = 1$, which gives $A(1 + 2 + \dots + n) = 1 \Rightarrow A\cdot\frac{n(n+1)}{2} = 1$.]

3. If the random variable X has mean m and S.D. σ, show that
$E\left(\frac{X - m}{\sigma}\right) = 0$.

4. Find the probability distribution of X, the number of 'sixes' in two tosses of a die.
5. Two cards are drawn successively without replacement from a well-shuffled deck of 52 cards. Find the probability distribution of the number of aces.
6. Five balls are drawn from a box containing 4 black and 6 white balls. Find the probability distribution of the number of black balls drawn without replacement.
7. A random variable X has probability function
$f(x) = \frac{1}{2^x}$, for x = 1, 2, 3, ... up to ∞.
Find the mean.

8. An urn contains 7 white and 3 red balls. Two balls are drawn together at random from the urn. Find the mathematical expectation of the number of white balls drawn.
ANSWERS

1. 1.0, 2.4
2. $\frac{2}{n(n+1)}$
4. X     : 0      1      2
   $f_i$ : 25/36  10/36  1/36

5. X     : 0        1       2
   $f_i$ : 188/221  32/221  1/221

6. X     : 0     1     2      3     4
   Prob. : 1/42  5/21  10/21  5/21  1/42

7. 2    8. 1.4
[II] LONG ANSWER QUESTIONS

1. (a) The random variable X has the probability density function

f(x) = k,   if x = 0
     = 2k,  if x = 1
     = 3k,  if x = 2
     = 0,   elsewhere
Determine (i) the value of k, (ii) P(X < 2),
(iii) the smallest value of k for which $P(X \le k) > \frac{1}{2}$.
[W.B. U. Tech 200t(

(b) A random variable X has the following prob. distribution


X = Xi 0 1 2 3 4 5 6 7 8
t, k 3k 5k 7k 9k 11k 13k 15k 17k
(i) Determine the value of k

(ii) Find $P(X < 3)$, $P(X \ge 3)$, $P(2 \le X < 5)$.

(iii) What is the smallest value of x for which $P(X \le x) > 0.5$?
2. A fair coin is tossed 3 times independently. Let X be the r.v. whose value for any outcome is the number of heads obtained. Find the probability distribution of X and also its distribution function.
3. From a lot containing 12 items, 4 of which are defective, 5 are chosen at random. If X be the number of defectives found in the sample, write down
(i) the probability distribution of X
(ii) $P(X \le 1)$
(iii) $P(1 < X < 3)$
4. A random variable X has the following probability function:

$X = x_i$ : -2    -1  0     1   2     3
$f_i$     : 0.15  m   0.25  2m  0.35  m

(i) Find the value of m. (ii) Obtain the distribution function F(x).
5. A random variable X has the following probability function:

x    : -2   -1  0    1   2    3
P(x) : 0.1  k   0.2  2k  0.3  3k
(i) Find k.
(ii) Evaluate $P(X < 2)$, $P(X \le 2)$, $P(-2 < X < 2)$. [W.B.U.Tech 2006]
(iii) Determine the distribution function F(x) of X.
[W.B.U.Tech 2005]
6. The spectrum of the random variable X consists of the points 1, 2, ..., n and P(X = i) is proportional to $\frac{1}{i(i+1)}$. Determine the distribution function of X.

Compute $P(3 < X \le n)$ and $P(X > 5)$.

[Hints: $\sum_{i=1}^{n}\frac{k}{i(i+1)} = 1$ or, $k\sum_{i=1}^{n}\left(\frac{1}{i} - \frac{1}{i+1}\right) = 1$
or, $\frac{kn}{n+1} = 1$ or, $k = \frac{n+1}{n}$

∴ $F(x) = \sum_{r=1}^{i} P(X = r) = \sum_{r=1}^{i}\frac{k}{r(r+1)} = k\left(1 - \frac{1}{i+1}\right) = \frac{(n+1)i}{n(i+1)}$, $i \le x < i+1$

$P(3 < X \le n) = \sum_{i=4}^{n} f_i = k\sum_{i=4}^{n}\left(\frac{1}{i} - \frac{1}{i+1}\right) = \frac{n+1}{n}\left(\frac{1}{4} - \frac{1}{n+1}\right) = \frac{n-3}{4n}$

$P(X > 5) = \sum_{i=6}^{n} f_i = \frac{n+1}{n}\left(\frac{1}{6} - \frac{1}{n+1}\right) = \frac{n-5}{6n}$]

7. Find K for which the following function f(x) will be a pmf of a discrete random variable X: f(x) = K(1 + x), x = 2, 3, ..., n. Find $P(X > 2)$ and $P(X \ge 2 / X \le 2)$. Also find the mean of the distribution.
[$\frac{2}{(n-1)(n+4)}$, $\frac{(n-2)(n+5)}{(n-1)(n+4)}$, 1, $\frac{2(n^2 + 4n + 6)}{3(n+4)}$]
8. Find the expectation, variance and standard deviation of each of the following distributions:
(i)  $x_i$    : -5   -4   1    2
     $f(x_i)$ : 1/4  1/8  1/2  1/8
(ii) $x_i$    : 8    12   16   20   24
     $f(x_i)$ : 1/8  1/6  3/8  1/4  1/12
9. A bag contains 5 white and 7 black balls. Find the expectation of a
man who is allowed to draw two balls from the bag and who is to receive
one rupee for each black ball and two rupees for each white ball drawn.
10. A and B play for a prize of Rs. 99. The prize is to be won by a
player who first throws a '3' with one die. A first throws, and if he fails
B throws, and if he fails A again throws, and so on. Find their respective

expectation.
11. Two players A and B agree to play a game under the condition that A will get Rs. 3 from B if he wins and will pay B Rs. 3 if he loses. The probability of A's winning is p. Find the mean and variance of A's gain. [Hints: X = A's gain = 3, -3; P(X = 3) = p, P(X = -3) = 1 - p]
12. Evaluate the expectation of the number of successes preceding the first failure in an infinite sequence of trials with probability of failure p, where each trial gives two possible outcomes: success or failure.
[Hints: X = 0, 1, 2, ... up to ∞ and $P(X = i) = (1 - p)^i p$.]
13. The first two moments of a discrete r.v. about 5 are 2 and 20 respectively. Find the mean and variance of the r.v.
14. The first two moments of a random variable about 4 are -1.5 and 2.7. Find the first two moments about zero. Also find the mean and standard deviation of the r.v.
15. The first four moments of a discrete random variable about 1 are 2.6, 10.2, 43.4 and 192.6 respectively. Find the first four moments of X about 4.
16. The first, second and third moments of a random variable about
2 are 1, 16 and -40 respectively. Find the mean, variance and the third
central moment.
Answers
1. (a) (i) $k = \frac{1}{6}$ (ii) $\frac{1}{2}$  (b) (i) $k = \frac{1}{81}$ (ii) $\frac{1}{9}, \frac{8}{9}, \frac{7}{27}$ (iii) 6

2. X     : 0    1    2    3
   $f_i$ : 1/8  3/8  3/8  1/8
F(x) = 0,    x < 0
     = 1/8,  0 ≤ x < 1
     = 1/2,  1 ≤ x < 2
     = 7/8,  2 ≤ x < 3
     = 1,    x ≥ 3
3. (i) X     : 0      1      2      3      4
       $f_i$ : $f_0$  $f_1$  $f_2$  $f_3$  $f_4$
   where $f_i = \binom{4}{i}\binom{8}{5-i} / \binom{12}{5}$
(ii) $P(X \le 1) = \left\{\binom{8}{5} + \binom{4}{1}\binom{8}{4}\right\} / \binom{12}{5}$ (iii) $\binom{4}{2}\binom{8}{3} / \binom{12}{5}$
4. (i) $m = \frac{1}{16}$, (ii) F(x) = 0,      x < -2
                             = 3/20,   -2 ≤ x < -1
                             = 17/80,  -1 ≤ x < 0
                             = 37/80,  0 ≤ x < 1
                             = 47/80,  1 ≤ x < 2
                             = 75/80,  2 ≤ x < 3
                             = 1,      x ≥ 3
5. (i) $K = \frac{1}{15}$ (ii) 0.5, 0.8, 0.4

(iii) F(x) = 0,      x < -2
           = 1/10,   -2 ≤ x < -1
           = 1/6,    -1 ≤ x < 0
           = 11/30,  0 ≤ x < 1
           = 1/2,    1 ≤ x < 2
           = 4/5,    2 ≤ x < 3
           = 1,      3 ≤ x.
7. $\frac{2}{(n-1)(n+4)}$, $\frac{(n-2)(n+5)}{(n-1)(n+4)}$, 1, $\frac{2(n^2 + 4n + 6)}{3(n+4)}$
8. (i) -1, 8.25, 2.9, (ii) 16, 20, $2\sqrt{5}$  9. Rs. 2.83  10. Rs. 54, 45
11. $3(2p - 1)$, $36p(1 - p)$  12. $\frac{1-p}{p}$  13. 7, 16
14. 2.5, 6.7; 2.5, 0.671  15. -0.4, 3.6, -5.2, 22.8  16. 3, 15, -86

[III] MULTIPLE CHOICE QUESTIONS

1. The probability $P(a < X \le b)$ is defined by (where F(x) is the distribution function of the random variable X)
(a) F(b) - F(a)  (b) F(b) + F(a)
(c) F(a) - F(b)  (d) F(a)F(b). [W.B.U.Tech 2006]

2. The distribution function F(x) of a random variable X is given by (where $-\infty < x < \infty$)
(a) $P(-\infty < X < x)$  (b) $P(-\infty < X \le x)$
(c) $P(-\infty < X < \infty)$  (d) none.

3. A random variable X has the following p.m.f.:

X     : 1    2    3
$f_i$ : 1/2  1/3  k
Then the value of k is
(a) 1  (b) 0  (c) $\frac{5}{6}$  (d) $\frac{1}{6}$
4. The variance of a random variable X is
(a) $E(X^2)$  (b) $[E(X)]^2$
(c) $E(X^2) - [E(X)]^2$  (d) $E(X^2) - E(X)$
[W.B.U.Tech. 2006]
5. The expectation of the following distribution:
$x_i$ : 0    1    2    3
$f_i$ : 1/4  1/8  1/2  1/8
is
(a) $\frac{1}{2}$  (b) $\frac{5}{2}$  (c) $\frac{3}{2}$  (d) 1.
6. If the mean of a distribution is 5, then the value of E(2X - 9) is
(a) 10  (b) 1  (c) -4  (d) -9.
7. When the variance of a random variable is $\frac{2}{3}$, then Var(3X + 5) =
(a) 8  (b) 2  (c) 6  (d) 11
8. The following distribution is a p.m.f. of a random variable:

X     : -1   0    1    2
$f_i$ : 0.3  0.4  0.6  0.1

The statement is
(a) True  (b) False.

9. The distribution of a random variable X is given by
$P(X = -1) = \frac{1}{8}$, $P(X = 0) = \frac{3}{4}$, $P(X = 1) = \frac{1}{8}$. Then the S.D. of X is
(a) $\frac{1}{2}$  (b) $-\frac{1}{2}$  (c) 0  (d) 1
10. If X assumes the values 3, 5, 6, 9 and P(X = 3) = 0.24, P(X = 5) = 0.1, P(X = 6) = 0.12 and P(X = 9) = 0.55, then this makes a distribution.
(a) Yes  (b) No.

11. If $P(X = -1) = \frac{1}{4}$, $P(X = 0) = \frac{1}{2}$, $P(X = 1) = \frac{1}{4}$ and
$P(Y = -2) = \frac{1}{3}$, $P(Y = 10) = \frac{1}{3}$, $P(Y = 4) = \frac{1}{3}$,
then E(X + Y) =
(a) 10  (b) 4  (c) 6  (d) $\frac{5}{6}$
12. If $P(X = -1) = \frac{1}{4}$, $P(X = 0) = \frac{1}{2}$, $P(X = 1) = \frac{1}{4}$ and
$P(Y = -2) = \frac{1}{3}$, $P(Y = 10) = \frac{1}{3}$, $P(Y = 4) = \frac{1}{3}$, and if X, Y are independent, then E(XY) =
(a) 10  (b) 4  (c) 0  (d) 1
13. Four coins are tossed. The expectation of the number of heads is
(a) 1  (b) 2  (c) 3  (d) 4.

14. A discrete distribution of a random variable is
X     : 1     2     3     4     5
$f_i$ : 1/15  2/15  3/15  4/15  5/15
Then $P(X > 2 / X \le 4)$ =
(a) $\frac{7}{5}$  (b) $\frac{5}{7}$  (c) $\frac{1}{5}$  (d) $\frac{7}{10}$

15. The value of c for which the function
f(x) = cx, for x = 0, 1, 2, 3, 4, 5
     = 0, otherwise
becomes a pmf is
(a) $\frac{7}{15}$  (b) $\frac{1}{15}$  (c) $\frac{2}{15}$  (d) $\frac{4}{15}$

16. For the discrete distribution
X     : 1     2     3     4     5
$f_i$ : 1/15  2/15  3/15  4/15  5/15
$P(X < a) = \frac{1}{5}$. Then a =
(a) 1  (b) 2  (c) 3  (d) 4

17. For the pmf f(x) defined by $f(0) = \frac{1}{7}$, $f(1) = \frac{2}{7}$, $f(2) = \frac{1}{7}$, $f(3) = \frac{2}{7}$, $f(4) = \frac{1}{7}$, the value of $P(X \ge K)$ is $\frac{6}{7}$. Then K =
(a) 1  (b) 2  (c) 3  (d) 4

18. For the pmf f(x) defined by $f(0) = \frac{1}{8}$, $f(1) = \frac{3}{8}$, $f(2) = \frac{3}{8}$, $f(3) = \frac{1}{8}$, the value of the distribution function at 2 is
(a) $\frac{2}{5}$  (b) $\frac{3}{7}$  (c) $\frac{1}{7}$  (d) $\frac{7}{8}$
19. The variance of the discrete distribution
x    : 0    1    2
p(x) : 1/4  1/2  1/4
is
(a) 0.6  (b) 0.7  (c) 0.2  (d) 0.5

20. Throwing two unbiased coins simultaneously, Diku bets Buku that he will receive Rs 4 from Buku if he gets 2 heads and he will give Rs 4 to Buku otherwise. If P(x) be the pmf of Diku's gain, then P(Diku's Rs 4 loss) =
(a) $\frac{1}{4}$  (b) $\frac{4}{7}$  (c) $\frac{3}{4}$  (d) $\frac{1}{5}$
21. For a random variable X, $E\{(X - 1)^2\} = 10$ and $E\{(X - 2)^2\} = 6$. Then E(X) =
(a) $\frac{9}{2}$  (b) 6  (c) $\frac{5}{2}$  (d) $\frac{7}{2}$

22. For a random variable X, $E\{(X - 2)^2\} = 6$, $E\{(X - 1)^2\} = 10$. Then $\sigma_X$ =
(a) $\sqrt{3}$  (b) $\sqrt{2}$  (c) 0  (d) 2

23. Let X be a discrete random variable assuming values 1, 2, 3, 4, ... and suppose that E(X) exists. Then $E(X) = \sum_{n=1}^{\infty} P(X \ge n)$.
(a) True  (b) False.

24. The probability distribution of the number of TV sets sold in a day by a salesman is
X    : 0    1    2    3    4    5
P(X) : 0.1  0.2  0.3  0.2  0.1  0.1

The average number of TV sets sold in a day is
(a) 2  (b) 2.3  (c) 2.6  (d) 3.1.
25. If X is a random variable which assumes values 1, 2, 3, 4, ... with respective probability given by $P(X = K) = q^{K-1}p$, p + q = 1,
then the mean of X is
~"

1 P
(d) -
(b) p+1 q'
ANSWERS
1.a  2.b  3.d  4.c  5.c  6.b  7.c
8.b  9.a  10.b  11.b  12.c  13.b  14.d
15.b  16.c  17.a  18.d  19.d  20.c  21.d
22.b  23.a  24.b  25.c

4 SPECIAL TYPE OF DISCRETE DISTRIBUTION

4.1. Introduction:
A discrete random variable is said to have a special type of distribution if its pmf takes a special form.
Two such types of distribution are discussed in this chapter. The definition of each distribution is given first, and then their properties and the situations in which they fit are illustrated.
4.2. Binomial Distribution.
A discrete random variable X is said to have a binomial distribution with parameters p (0 < p < 1) and n (a positive integer) if its distribution is given by
X     : 0      1      2      ...  n
$f_i$ : $f_0$  $f_1$  $f_2$  ...  $f_n$
where $f_i = \binom{n}{i} p^i (1 - p)^{n-i}$, i = 0, 1, 2, ..., n.
For instance, one probability mass is
$f_1 = \binom{n}{1} p^1 (1 - p)^{n-1} = np(1 - p)^{n-1}$, etc.

Note: (1) The pmf $f_i$ satisfies the two fundamental properties $f_i \ge 0$ and $\sum_{i=0}^{n} f_i = 1$, which can be easily verified.

(2) When the random variable X has a binomial distribution with parameters n, p we write X ~ b(n, p) and we say X is a binomial variate.
(3) The significance of the parameters n and p will be given in a subsequent theorem.

Cases where Binomial Distribution fits.


Let A be an event of a random experiment E. We call "the event A" = 'success' and "the event not A" = 'failure'. Let p = probability of 'success' in a single trial of E. Let E be repeated, independently, n times. Let X = number of successes in n trials. Then X may assume the values 0, 1, 2, ..., n. For example, the event (X = 3) means "3 successes in n trials".

It is shown in the 'Binomial Law' that $P(X = i) = \binom{n}{i} p^i (1 - p)^{n-i}$.

Thus the distribution of X becomes
X     : 0      1      2      3      ...  n
$f_i$ : $f_0$  $f_1$  $f_2$  $f_3$  ...  $f_n$

where $f_i = P(X = i) = \binom{n}{i} p^i (1 - p)^{n-i}$, which is the pmf of the Binomial distribution.
Thus the 'number of successes' in n independent trials is a Binomial variate with parameters p = probability of success in a single trial and n = number of trials.
Illustration. The efficiency of a fighter-plane is such that the probability of a bomb hitting a target is 2/5. The fighter is assigned to completely destroy a camp of the enemy side. The plane carries 6 bombs, i.e., 6 bombs can be aimed at the camp. Here throwing a bomb is the experiment. It can be repeated 6 times; 'a bomb hits the camp' = success and X = number of successes in 6 trials. Then X has the Binomial distribution
X     : 0      1      2      ...  6
$f_i$ : $f_0$  $f_1$  $f_2$  ...  $f_6$

where $f_i = \binom{6}{i}\left(\frac{2}{5}\right)^i\left(1 - \frac{2}{5}\right)^{6-i} = \binom{6}{i}\left(\frac{2}{5}\right)^i\left(\frac{3}{5}\right)^{6-i}$.
If it is known that at least four direct hits are necessary to destroy the camp, then the probability of complete destruction of the camp
$= P(X \ge 4) = P(X = 4) + P(X = 5) + P(X = 6) = f_4 + f_5 + f_6$
$= \binom{6}{4}\left(\frac{2}{5}\right)^4\left(\frac{3}{5}\right)^2 + \binom{6}{5}\left(\frac{2}{5}\right)^5\left(\frac{3}{5}\right) + \binom{6}{6}\left(\frac{2}{5}\right)^6 = \frac{112}{625}$.
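The tail probability in this illustration is a sum of three binomial masses. A minimal sketch in exact arithmetic, with the helper `binomial_pmf` written here for illustration from the definition of art. 4.2:

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p):
    """{i: C(n, i) p^i (1 - p)^(n - i)} for i = 0, 1, ..., n."""
    return {i: comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)}


pmf = binomial_pmf(6, Fraction(2, 5))
p_destroy = sum(pmf[i] for i in (4, 5, 6))  # at least four direct hits
print(p_destroy, float(p_destroy))
```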

Theorem. If X has a Binomial distribution with parameters n and p, then
(i) its mean is np, (ii) its variance is npq, where q = 1 - p.
[W.B.U.Tech 2005]
Proof. Here X : 0 1 2 ... n
and its pmf is $f_i = P(X = i) = \binom{n}{i} p^i (1 - p)^{n-i}$.

(i) Mean $= E(X) = \sum_{i=0}^{n} i f_i = \sum_{i=0}^{n} i\binom{n}{i} p^i (1 - p)^{n-i}$
$= np\sum_{i=1}^{n}\binom{n-1}{i-1} p^{i-1}(1 - p)^{n-i}$
$= np\sum_{r=0}^{n-1}\binom{n-1}{r} p^r (1 - p)^{n-1-r}$, replacing i - 1 by r
$= np(p + 1 - p)^{n-1} = np$.
(ii) Now
$E\{X(X - 1)\} = \sum_{i=0}^{n} i(i - 1)\binom{n}{i} p^i (1 - p)^{n-i}$
$= \sum_{i=2}^{n} i(i - 1)\frac{n(n-1)}{i(i-1)}\binom{n-2}{i-2} p^i (1 - p)^{n-i}$
$= n(n - 1)p^2\sum_{i=2}^{n}\binom{n-2}{i-2} p^{i-2}(1 - p)^{n-i}$
$= n(n - 1)p^2\sum_{r=0}^{n-2}\binom{n-2}{r} p^r (1 - p)^{n-2-r}$, replacing i - 2 by r
$= n(n - 1)p^2(p + 1 - p)^{n-2} = n(n - 1)p^2$.

∴ $\mathrm{Var}(X) = E\{X(X - 1)\} - m(m - 1)$
$= n(n - 1)p^2 - np(np - 1) = np(1 - p) = npq$, where q = 1 - p.

∴ standard deviation $\sigma = \sqrt{npq}$.



Illustration. An unbiased die is tossed four times. Let 'multiple of three' be success; otherwise it is failure. Here p = probability of success in a single trial $= \frac{2}{6} = \frac{1}{3}$. Let X = number of 'multiples of three' appearing among these four trials. Then, as we discussed before, X has a Binomial distribution with parameters n = 4 and $p = \frac{1}{3}$. The expected number of 'multiples of 3' = mean of X $= 4\times\frac{1}{3} = \frac{4}{3}$. The standard deviation of X $= \sqrt{npq} = \sqrt{4\times\frac{1}{3}\times\frac{2}{3}} = \frac{2\sqrt{2}}{3}$.
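The mean np and variance npq can be confirmed by brute-force summation over the pmf. The sketch below uses n = 4 and p = 1/3 as in the illustration above; the helper names are assumptions made here, not notation from the theorem.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p):
    """{i: C(n, i) p^i (1 - p)^(n - i)} for i = 0, 1, ..., n."""
    return {i: comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)}


def mean_and_variance(pmf):
    """Mean and variance computed directly from their definitions."""
    mean = sum(i * f for i, f in pmf.items())
    var = sum((i - mean) ** 2 * f for i, f in pmf.items())
    return mean, var


n, p = 4, Fraction(1, 3)
mean, var = mean_and_variance(binomial_pmf(n, p))
print(mean, var)  # compare with n*p and n*p*(1 - p)
```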

Moments of Binomial Variate

In the following theorems we find the moments of X having a binomial distribution.
Theorem 1. If X has a Binomial distribution with parameters n and p, then its moments about zero are
(i) $\mu_1' = np$
(ii) $\mu_2' = n(n - 1)p^2 + np$
(iii) $\mu_3' = n(n - 1)(n - 2)p^3 + 3n(n - 1)p^2 + np$
(iv) $\mu_4' = n(n - 1)(n - 2)(n - 3)p^4 + 6n(n - 1)(n - 2)p^3 + 7n(n - 1)p^2 + np$
Proof. (i) $\mu_1' = E(X - 0) = E(X) = np$
(ii) The second order moment about zero,

$\mu_2' = E\{(X - 0)^2\} = E(X^2) = \sum_{i=0}^{n} i^2\binom{n}{i} p^i (1 - p)^{n-i}$
$= \sum_{i=0}^{n}\{i(i - 1) + i\}\binom{n}{i} p^i (1 - p)^{n-i}$
$= \sum_{i=0}^{n} i(i - 1)\binom{n}{i} p^i (1 - p)^{n-i} + \sum_{i=0}^{n} i\binom{n}{i} p^i (1 - p)^{n-i}$
$= \sum_{i=2}^{n} i(i - 1)\frac{n(n-1)}{i(i-1)}\binom{n-2}{i-2} p^i (1 - p)^{n-i} + \sum_{i=0}^{n} i\binom{n}{i} p^i (1 - p)^{n-i}$
$= n(n - 1)p^2\sum_{i=2}^{n}\binom{n-2}{i-2} p^{i-2}(1 - p)^{n-i} + \sum_{i=1}^{n} i\binom{n}{i} p^i (1 - p)^{n-i}$
$= n(n - 1)p^2\sum_{j=0}^{n-2}\binom{n-2}{j} p^j (1 - p)^{n-2-j} + E(X)$ [putting j = i - 2]
$= n(n - 1)p^2\{(1 - p) + p\}^{n-2} + np$, using the previous theorem's result
$= n(n - 1)p^2 + np$
(iii) The third order moment about zero,

$\mu_3' = E(X^3) = \sum_{i=0}^{n} i^3\binom{n}{i} p^i (1 - p)^{n-i}$
$= \sum_{i=0}^{n}\{i(i - 1)(i - 2) + 3i(i - 1) + i\}\binom{n}{i} p^i (1 - p)^{n-i}$
$= n(n - 1)(n - 2)p^3\sum_{i=3}^{n}\binom{n-3}{i-3} p^{i-3}(1 - p)^{n-i} + 3n(n - 1)p^2\sum_{i=2}^{n}\binom{n-2}{i-2} p^{i-2}(1 - p)^{n-i} + E(X)$
$= n(n - 1)(n - 2)p^3\sum_{j=0}^{n-3}\binom{n-3}{j} p^j (1 - p)^{n-3-j} + 3n(n - 1)p^2\sum_{k=0}^{n-2}\binom{n-2}{k} p^k (1 - p)^{n-2-k} + np$
[putting j = i - 3, k = i - 2]
$= n(n - 1)(n - 2)p^3\{(1 - p) + p\}^{n-3} + 3n(n - 1)p^2\{(1 - p) + p\}^{n-2} + np$
$= n(n - 1)(n - 2)p^3 + 3n(n - 1)p^2 + np$

(iv) Beyond the scope of the book.
Theorem 2. If X has binomial distribution with parameters n and p then its central moments are
(i) $\mu_2 = np(1-p)$

(ii) $\mu_3 = np(1-p)(1-2p)$

(iii) $\mu_4 = np(1-p)\{1 + 3p(1-p)(n-2)\}$

Proof. (i) The second central moment,
$\mu_2 = E\{(X - \bar{X})^2\} = \operatorname{Var}(X) = np(1-p)$

(ii) Using the relation among central moments and moments about any number we get the third central moment,
\[
\begin{aligned}
\mu_3 &= \mu_3' - 3\mu_2'\mu_1' + 2\mu_1'^3 \\
&= n(n-1)(n-2)p^3 + 3n(n-1)p^2 + np - 3\{n(n-1)p^2 + np\}np + 2(np)^3 \\
&= np(2p^2 - 3p + 1) = np(1-p)(1-2p)
\end{aligned}
\]
(iii) Left to the readers as exercise.
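The moment formulas of Theorem 1 and Theorem 2 can be cross-checked by summing directly over the binomial pmf. The following is a minimal sketch for one arbitrarily chosen pair (n = 10, p = 0.3); the helper names `binom_pmf`, `raw_moment` and `central_moment` are assumptions made for this illustration.

```python
from math import comb

def binom_pmf(n, p, i):
    """P(X = i) for a binomial variate with parameters n and p."""
    return comb(n, i) * p**i * (1 - p) ** (n - i)

def raw_moment(n, p, k):
    """k-th moment about zero, computed directly from the pmf."""
    return sum(i**k * binom_pmf(n, p, i) for i in range(n + 1))

def central_moment(n, p, k):
    """k-th central moment (about the mean np), computed from the pmf."""
    m = n * p
    return sum((i - m) ** k * binom_pmf(n, p, i) for i in range(n + 1))

n, p = 10, 0.3  # arbitrary test values, not taken from the text

# Closed forms stated in Theorem 1 (moments about zero)
raw_formulas = {
    1: n * p,
    2: n * (n - 1) * p**2 + n * p,
    3: n * (n - 1) * (n - 2) * p**3 + 3 * n * (n - 1) * p**2 + n * p,
    4: (n * (n - 1) * (n - 2) * (n - 3) * p**4
        + 6 * n * (n - 1) * (n - 2) * p**3
        + 7 * n * (n - 1) * p**2 + n * p),
}
# Closed forms stated in Theorem 2 (central moments)
central_formulas = {
    2: n * p * (1 - p),
    3: n * p * (1 - p) * (1 - 2 * p),
    4: n * p * (1 - p) * (1 + 3 * p * (1 - p) * (n - 2)),
}

for k, value in raw_formulas.items():
    print("raw", k, raw_moment(n, p, k), value)
for k, value in central_formulas.items():
    print("central", k, central_moment(n, p, k), value)
```

Each printed pair should agree up to floating-point rounding; this is a numerical check for one parameter choice, not a proof.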
Illustrative Examples
Example 1. The mean and s.d. of a binomial distribution are respectively 4 and $\sqrt{\frac{8}{3}}$. Find the values of n and p. Hence evaluate P(X = 0). [W.B.U.Tech 2006]
We know the mean and s.d. of a binomial variate are respectively $np$ and $\sqrt{np(1-p)}$.
$\therefore np = 4$ and $np(1-p) = \frac{8}{3}$
$\therefore 4(1-p) = \frac{8}{3} \Rightarrow p = \frac{1}{3}$
$\therefore n = 4 \times 3 = 12$
$P(X = 0) = f_0 = {}^{12}C_0 \left(\frac{1}{3}\right)^0 \left(1 - \frac{1}{3}\right)^{12-0} = \left(\frac{2}{3}\right)^{12}$.

Example 2. Comment on the statement "a binomial variate has mean 4 and s.d. 3".

Here, $np = 4$ and $\sqrt{np(1-p)} = 3$, i.e. $np(1-p) = 9$.

$\therefore 4(1-p) = 9 \ \therefore 1-p = \frac{9}{4} \ \therefore p = -\frac{5}{4}$, which is not possible since $0 < p < 1$. So the statement is false.
Example 3. If the mean of a binomial distribution is 3 and the variance is $\frac{3}{2}$, find the probability of obtaining at most 3 successes. [W.B.U.Tech 2007]

Let X be the r.v. corresponding to the number of successes. Then the pmf of X is

$f_i = P(X = i) = {}^{n}C_i\, p^i (1-p)^{n-i}, \quad i = 0, 1, 2, \ldots, n$

$\therefore$ mean $= np = 3$, variance $= np(1-p) = \frac{3}{2}$

or, $\frac{np(1-p)}{np} = \frac{3}{2} \times \frac{1}{3} \ \therefore 1-p = \frac{1}{2}$, so $p = \frac{1}{2}$ and so $n = 6$.

Now probability of at most 3 successes $= P(X \le 3)$
$= P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)$

$= {}^{6}C_0 \left(\frac{1}{2}\right)^0 \left(\frac{1}{2}\right)^6 + {}^{6}C_1 \left(\frac{1}{2}\right)^1 \left(\frac{1}{2}\right)^5 + {}^{6}C_2 \left(\frac{1}{2}\right)^2 \left(\frac{1}{2}\right)^4 + {}^{6}C_3 \left(\frac{1}{2}\right)^3 \left(\frac{1}{2}\right)^3 = \frac{21}{32}$.
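The sum in Example 3 can be evaluated exactly with rational arithmetic. This is a small sketch using Python's `fractions` module; the function name `binom_cdf` is an assumption introduced for the illustration.

```python
from fractions import Fraction
from math import comb

def binom_cdf(n, p, k):
    """P(X <= k) for a binomial variate with parameters n and p (exact if p is a Fraction)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# Parameters found in Example 3: n = 6, p = 1/2
prob_at_most_3 = binom_cdf(6, Fraction(1, 2), 3)
print(prob_at_most_3)
```

Using `Fraction` keeps the result as an exact ratio, so it can be compared directly with the fraction obtained by hand.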

Example 4. The probability that a pen manufactured by a company will be defective is $\frac{1}{10}$. If 12 such pens are manufactured, find the prob. that (i) exactly two will be defective (ii) none will be defective (iii) at least two will be defective.

Let 'defective pen' = success.

$\therefore$ Prob. of success in a single trial $p = \frac{1}{10}$.

The experiment is repeated 12 times.

Let the random variable X = number of defective pens. Then X is a binomial variate with parameters n = 12 and $p = \frac{1}{10}$.

$\therefore$ Its pmf is $f_i = P(X = i) = \binom{12}{i} \left(\frac{1}{10}\right)^i \left(1 - \frac{1}{10}\right)^{12-i}$

$\therefore$ (i) The required probability $= P(X = 2) = \binom{12}{2} \left(\frac{1}{10}\right)^2 \left(1 - \frac{1}{10}\right)^{12-2} = 66 \times \frac{9^{10}}{10^{12}} = 0.2301$

(ii) Required probability $= P(X = 0) = \binom{12}{0} \left(\frac{1}{10}\right)^0 \left(1 - \frac{1}{10}\right)^{12-0} = \left(\frac{9}{10}\right)^{12} = 0.2824$

(iii) $P(X = 1) = \binom{12}{1} \left(\frac{1}{10}\right)^1 \left(1 - \frac{1}{10}\right)^{12-1} = 12 \times \frac{9^{11}}{10^{12}} = 0.3766$

$\therefore P(X \ge 2) = 1 - P(X < 2)$

$= 1 - [P(X = 0) + P(X = 1)] = 0.3410$.
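The three probabilities of Example 4 can be recomputed directly from the binomial pmf. The sketch below is only a numerical check; the variable names are chosen here for illustration.

```python
from math import comb

def binom_pmf(n, p, i):
    """P(X = i) for a binomial variate with parameters n and p."""
    return comb(n, i) * p**i * (1 - p) ** (n - i)

n, p = 12, 0.1  # 12 pens, each defective with probability 1/10

p_exactly_two = binom_pmf(n, p, 2)
p_none = binom_pmf(n, p, 0)
# 'At least two' via the complement of {0 defective, 1 defective}
p_at_least_two = 1 - (binom_pmf(n, p, 0) + binom_pmf(n, p, 1))

print(round(p_exactly_two, 4), round(p_none, 4), round(p_at_least_two, 4))
```

Working through the complement for part (iii) avoids summing eleven separate terms.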

Example 5. A defective die is thrown ten times independently. The probability that an even number will appear 5 times is twice the probability that an even number will appear 4 times. What is the probability that an odd face appears in each of the ten throws? Find the third central moment of the distribution.
Let "even face" = "success" and X = number of even faces among 10 trials.

$\therefore$ X has Binomial distribution with parameters n = 10 and p. So the pmf is given by $P(X = i) = f_i = {}^{10}C_i\, p^i (1-p)^{10-i}$.

By the problem, $P(X = 5) = 2P(X = 4)$ or, $f_5 = 2 \times f_4$

or, ${}^{10}C_5 \times p^5 (1-p)^{10-5} = 2 \times {}^{10}C_4\, p^4 (1-p)^{10-4}$

or, ${}^{10}C_5\, p = 2 \times {}^{10}C_4 (1-p)$

or, $252p = 2 \times 210(1-p)$ or, $3p = 5(1-p)$ or, $p = \frac{5}{8}$

Required probability
$= P(X = 0) = f_0 = {}^{10}C_0\, p^0 (1-p)^{10-0} = \left(\frac{3}{8}\right)^{10}$

The third central moment,

$\mu_3 = np(1-p)(1-2p) = 10 \times \frac{5}{8} \left(1 - \frac{5}{8}\right)\left(1 - \frac{10}{8}\right) = -\frac{75}{128}$

Example 6. The overall percentage of failures in a certain examination is 40. What is the probability that out of a group of 6 candidates at least 4 passed the examination?
Now the overall percentage of success in the examination is 60.
Consider the experiment that one student is drawn and seen whether he has passed. Probability that he is a passed student is $p = \frac{60}{100} = 0.6$.

Let the experiment be repeated 6 times. Let X = number of passed students in 6 such trials. So X is a Binomial variate with parameters n = 6, p = 0.6.
So the distribution of X is
X : 0 1 2 3 4 5 6
$P(X = i) = f_i$ : $f_0$ $f_1$ $f_2$ $f_3$ $f_4$ $f_5$ $f_6$
where $f_i = {}^{6}C_i\, p^i (1-p)^{6-i} = {}^{6}C_i (0.6)^i (1-0.6)^{6-i} = {}^{6}C_i (0.6)^i (0.4)^{6-i}$
Required probability $= P(X \ge 4) = f_4 + f_5 + f_6$

$= {}^{6}C_4 (0.6)^4 (0.4)^{6-4} + {}^{6}C_5 (0.6)^5 (0.4)^{6-5} + {}^{6}C_6 (0.6)^6 (0.4)^{6-6} = 0.54432$.


Example 7. A family has 6 children. Find the probability that there are (i) 3 boys and 3 girls (ii) fewer boys than girls.
Probability of any particular child being a boy is $\frac{1}{2}$.
Let 'boy' = success. The experiment of noticing whether the child is a boy or a girl is repeated 6 times. X = number of boys, i.e. number of successes.
$\therefore X \sim b(n, p)$ where $n = 6$, $p = \frac{1}{2}$.
So the pmf of X is

$f_i = P(X = i) = {}^{6}C_i \left(\frac{1}{2}\right)^i \left(\frac{1}{2}\right)^{6-i} = {}^{6}C_i \left(\frac{1}{2}\right)^6$

(i) Now probability of "3 boys and 3 girls"

$= P(X = 3) = {}^{6}C_3 \left(\frac{1}{2}\right)^6 = \frac{5}{16}$

(ii) Probability of "fewer boys than girls"
$= P(X \le 2) = P(X = 0) + P(X = 1) + P(X = 2)$

$= {}^{6}C_0 \left(\frac{1}{2}\right)^6 + {}^{6}C_1 \left(\frac{1}{2}\right)^6 + {}^{6}C_2 \left(\frac{1}{2}\right)^6 = \frac{22}{64} = \frac{11}{32}$.

Example 8. A die is tossed thrice. A success is "getting 1 or 6" on a toss. Find the mean and variance of the number of successes.
Let X denote the number of successes. Clearly X can take the values 0, 1, 2 or 3 and X follows binomial distribution with n = 3,
p = probability of success $= \frac{2}{6} = \frac{1}{3}$
and $q = 1 - \frac{1}{3} = \frac{2}{3}$.
$\therefore$ Mean $= E(X) = np = 3 \times \frac{1}{3} = 1$
and variance $= npq = 3 \times \frac{1}{3} \times \frac{2}{3} = \frac{2}{3}$.
Example 9. Find the probability distribution of the number of boys in a family with 3 children, assuming equal probabilities for boys and girls. Graph the distribution. Also find the distribution function F(x) for the random variable X.
Let E be the experiment of picking a child in the family.
The event 'boy' = success.
We have p = probability of boy $= \frac{1}{2}$ by hypothesis. E is repeated 3 times.
Let X denote the number of boys.
Then X can assume the values 0, 1, 2, 3. X has the Binomial distribution

X : 0 1 2 3
$f_i$ : $f_0$ $f_1$ $f_2$ $f_3$

where $f_i = \binom{3}{i} \left(\frac{1}{2}\right)^i \left(\frac{1}{2}\right)^{3-i} = \binom{3}{i} \left(\frac{1}{2}\right)^3 = \binom{3}{i} \frac{1}{8}$

Therefore, the probability distribution of boys is

X : 0 1 2 3
$f_i$ : $\frac{1}{8}$ $\frac{3}{8}$ $\frac{3}{8}$ $\frac{1}{8}$

The graph of the distribution is given below :


[Figure: plot of f(x) against x; vertical axis marked at 1/8, 2/8, 3/8.]
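Using the above table we obtain the distribution function F(x) as

$F(x) = 0, \quad -\infty < x < 0$
$= \frac{1}{8}, \quad 0 \le x < 1$
$= \frac{1}{8} + \frac{3}{8}, \quad 1 \le x < 2$
$= \frac{1}{2} + \frac{3}{8}, \quad 2 \le x < 3$
$= \frac{7}{8} + \frac{1}{8}, \quad 3 \le x < \infty$

i.e., $F(x) = 0, \quad -\infty < x < 0$
$= \frac{1}{8}, \quad 0 \le x < 1$
$= \frac{1}{2}, \quad 1 \le x < 2$
$= \frac{7}{8}, \quad 2 \le x < 3$
$= 1, \quad 3 \le x < \infty$.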
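The step-function nature of F(x) in Example 9 can be illustrated with a short routine that evaluates the distribution function at any real x. This is a sketch under the assumption that F(x) is accumulated from the binomial pmf; the name `binom_cdf_at` is hypothetical.

```python
from fractions import Fraction
from math import comb, floor

def binom_cdf_at(x, n, p):
    """F(x) = P(X <= x) for a binomial variate, evaluated at any real x."""
    if x < 0:
        return Fraction(0)          # no probability mass below 0
    k = min(floor(x), n)            # largest attainable value not exceeding x
    return sum(
        (comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1)),
        Fraction(0),
    )

half = Fraction(1, 2)
for x in (-0.5, 0, 0.5, 1, 2.9, 3, 10):
    print(x, binom_cdf_at(x, 3, half))
```

F(x) stays constant between consecutive integers and jumps by $f_i$ at x = i, which is what the piecewise definition expresses.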
Example 10. Suppose that half the population of a town are consumers of rice. 100 investigators are appointed to find out its truth. Each investigator interviews 10 individuals. How many investigators do you expect to report that three or less of the people interviewed are consumers of rice?

Consider the experiment "One investigator is drawn and seen whether he reports that three or less of the people interviewed are consumers of rice", i.e. whether it is a success. Let probability of success = p.
Now the investigator draws one individual and sees whether he is a consumer of rice. Let q be the probability of this event, so $q = \frac{1}{2}$. The investigator repeats this experiment 10 times.
Let Y = number of such individuals among ten. So Y has binomial distribution with parameters n = 10, $q = \frac{1}{2}$.
Therefore
\[
\begin{aligned}
p &= P(Y \le 3) = P(Y = 0) + P(Y = 1) + P(Y = 2) + P(Y = 3) \\
&= {}^{10}C_0 \left(\tfrac{1}{2}\right)^0 \left(1 - \tfrac{1}{2}\right)^{10-0} + {}^{10}C_1 \left(\tfrac{1}{2}\right)^1 \left(1 - \tfrac{1}{2}\right)^{10-1} \\
&\quad + {}^{10}C_2 \left(\tfrac{1}{2}\right)^2 \left(1 - \tfrac{1}{2}\right)^{10-2} + {}^{10}C_3 \left(\tfrac{1}{2}\right)^3 \left(1 - \tfrac{1}{2}\right)^{10-3} \\
&= \left(\tfrac{1}{2}\right)^{10} \{1 + 10 + 45 + 120\} = 176 \times \left(\tfrac{1}{2}\right)^{10}
\end{aligned}
\]
Now let X = number of investigators, among the 100, who report so. X has binomial distribution with parameters n = 100, $p = 176 \times \left(\frac{1}{2}\right)^{10}$. Therefore required expectation

$= E(X) = 100 \times 176 \times \left(\frac{1}{2}\right)^{10} = \frac{17600}{2^{10}} = \frac{17600}{1024} \approx 17$

$\therefore$ 17 investigators are expected to report so.

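Example 11. If X be binomially distributed with $E(X) = 2$ and $\operatorname{var}(X) = \frac{4}{3}$, find the distribution of X.
We have $E(X) = 2$, $\operatorname{var}(X) = \frac{4}{3}$.

$\therefore np = 2$ ... (1)
$np(1-p) = \frac{4}{3}$ ... (2)
Solving (1), (2), we get
$p = \frac{1}{3}$, $n = 6$, $q = 1 - p = 1 - \frac{1}{3} = \frac{2}{3}$.

$f_0 = P(X = 0) = {}^{6}C_0 \left(\frac{1}{3}\right)^0 \left(\frac{2}{3}\right)^6 = \frac{64}{729}$

$f_1 = P(X = 1) = {}^{6}C_1 \left(\frac{1}{3}\right)^1 \left(\frac{2}{3}\right)^5 = \frac{64}{243}$

$f_2 = P(X = 2) = {}^{6}C_2 \left(\frac{1}{3}\right)^2 \left(\frac{2}{3}\right)^4 = \frac{80}{243}$.

Similarly $f_3 = P(X = 3) = \frac{160}{729}$, $f_4 = P(X = 4) = \frac{20}{243}$,

$f_5 = P(X = 5) = \frac{4}{243}$, $f_6 = P(X = 6) = \frac{1}{729}$.

Thus the required distribution of X is
X : 0 1 2 3 4 5 6
$f_i$ : $\frac{64}{729}$ $\frac{64}{243}$ $\frac{80}{243}$ $\frac{160}{729}$ $\frac{20}{243}$ $\frac{4}{243}$ $\frac{1}{729}$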
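Examples 1, 2 and 11 all recover n and p from a given mean and variance, using $1 - p = \text{variance}/\text{mean}$ and $n = \text{mean}/p$. The sketch below wraps that step in a helper (the name `binomial_from_moments` is an assumption made here) and rejects inputs, such as those of Example 2, that cannot belong to a binomial variate.

```python
from fractions import Fraction
from math import comb

def binomial_from_moments(mean, variance):
    """Recover (n, p) of a binomial variate from its mean np and variance np(1-p)."""
    mean, variance = Fraction(mean), Fraction(variance)
    p = 1 - variance / mean          # since variance / mean = 1 - p
    if not 0 < p < 1:
        raise ValueError("no binomial variate has this mean and variance")
    n = mean / p
    if n.denominator != 1:
        raise ValueError("n = mean / p is not an integer")
    return int(n), p

# Example 11: E(X) = 2, var(X) = 4/3
n, p = binomial_from_moments(2, Fraction(4, 3))
pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
print(n, p, pmf)
```

Because the arithmetic is done with `Fraction`, the probabilities print in lowest terms and their sum can be checked to equal 1 exactly.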
Example 12. The probability of a man hitting a target is $\frac{1}{3}$.
(a) If he fires 5 times, what is the probability of his hitting the target at least twice?
(b) How many times must he fire so that the probability of his hitting the target at least once is more than 90%?

Here p = probability of hitting $= \frac{1}{3}$

$\therefore$ q = probability of no hit $= \frac{2}{3}$.
(a) Let X be the number of hits. Here n = 5.
$\therefore P(X \ge 2) = 1 - P(X < 2)$

$= 1 - P(X = 0) - P(X = 1)$

$= 1 - {}^{5}C_0 \left(\frac{1}{3}\right)^0 \left(\frac{2}{3}\right)^5 - {}^{5}C_1 \left(\frac{1}{3}\right)^1 \left(\frac{2}{3}\right)^4$

$= \frac{131}{243}$.
(b) Let n be the smallest number of fires so that the probability of hitting the target at least once is more than 90%.
By the condition

$P(X \ge 1) > \frac{90}{100}$ or, $1 - P(X = 0) > 0.9$ or, $1 - \left(\frac{2}{3}\right)^n > 0.9$

or, $n \log\frac{2}{3} < \log 0.1$ or, $n > \frac{\log 0.1}{\log\frac{2}{3}}$

$\therefore n > 5.679$
$\therefore n = 6$.
Thus he must fire 6 times.
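Part (b) asks for the smallest n with $1 - q^n$ exceeding a target probability. A minimal sketch of two equivalent ways to find it follows; the function names `min_trials` and `min_trials_log` are assumptions introduced for illustration.

```python
from math import floor, log

def min_trials(p, target):
    """Smallest n such that P(at least one success in n trials) > target."""
    n = 1
    while 1 - (1 - p) ** n <= target:
        n += 1
    return n

def min_trials_log(p, target):
    """Same quantity from the inequality n > log(1 - target) / log(1 - p)."""
    bound = log(1 - target) / log(1 - p)
    return floor(bound) + 1  # strict inequality: next integer above the bound

print(min_trials(1 / 3, 0.9), min_trials_log(1 / 3, 0.9))
```

The direct loop avoids any concern about the direction of the inequality when dividing by the negative quantity $\log(1-p)$.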

4.3. Poisson Distribution.

A discrete random variable X is said to have a Poisson distribution with parameter $\mu\ (> 0)$ if its distribution is given by

X : 0 1 2 3 ...
$f_i$ : $f_0$ $f_1$ $f_2$ $f_3$ ...

where the pmf, $f_i = P(X = i) = \dfrac{e^{-\mu}\mu^i}{i!}$,

e.g. one probability mass, $f_2 = \dfrac{e^{-\mu}\mu^2}{2!} = \dfrac{e^{-\mu}\mu^2}{2}$ etc. [W.B.U.T. 2013]

Note : (1) Since the parameter $\mu > 0$, $f_i = \dfrac{e^{-\mu}\mu^i}{i!} \ge 0$. Also $\sum_{i=0}^{\infty} f_i = e^{-\mu} \sum_{i=0}^{\infty} \dfrac{\mu^i}{i!} = e^{-\mu} e^{\mu} = 1$.
Thus the pmf $f_i$ satisfies the two fundamental properties of a pmf.


(2) When the random variable X has a Poisson distribution with parameter $\mu\ (> 0)$, we write $X \sim P(\mu)$ and we say X is a Poisson variate.
(3) Actually the Poisson distribution is a limiting case of the Binomial distribution when n is very large and p is very small so that $\mu = np$ is of finite magnitude.
(4) The significance of the parameter $\mu$ is given in the next theorem.
Cases where Poisson Distribution fits.
Let us consider a sequence of changes. If the random variable X(t) denotes the number of changes during the interval (0, t), then X(t) assumes the values 0, 1, 2, 3, ...

It can be shown that $P(X(t) = i) = f_i = e^{-\lambda t} \dfrac{(\lambda t)^i}{i!}, \quad i = 0, 1, 2, \ldots$

where $\lambda$ = number of changes per unit time.
Thus the distribution of X(t) becomes
X(t) : 0 1 2 3 ...
$f_i$ : $f_0$ $f_1$ $f_2$ $f_3$ ...
where $f_i = e^{-\lambda t} \dfrac{(\lambda t)^i}{i!}$, which is the pmf of a Poisson distribution.

Thus 'changes in an interval' is a Poisson variate with parameter $\mu = \lambda t$ = average changes in the interval.
Note. (1) The interval (0, t) may not be a time-interval. See the following illustration.
(2) We use the notation X(t) in place of X because it depends on t.

Illustration. (i) Let a huge metal sheet be produced by a machine in a factory. Defects are noticed in the sheet. The machine is such that the average number of defects per unit area is 3. A piece of 10 unit area of the sheet is purchased by a company.

Let X = number of defects in this piece. Then X may assume values 0, 1, 2, 3, ... up to $\infty$. Here 'change' means 'defect' and the interval (0, t) stands for '10 unit area of the sheet' or (0, 10).

Then X (or, X(10)) has the Poisson distribution

X : 0 1 2 3 ...
$f_i$ : $f_0$ $f_1$ $f_2$ $f_3$ ...

where $f_i = e^{-\mu} \dfrac{\mu^i}{i!}$ with the parameter $\mu = 3 \times 10 = 30$ = average number of defects per 10 unit area of the sheet.

For example, $f_0 = P(X = 0) = e^{-30} \dfrac{30^0}{0!} = e^{-30}$,
i.e., probability of no defects in the piece $= e^{-30}$.
In other words $100e^{-30}\%$ of such pieces will be free from defects.
(ii) The number of deaths in a state in one year is a Poisson variate.
(iii) The number of radioactive atoms decaying in time t follows the Poisson distribution with parameter $\mu = t \times$ average number of decayed radioactive atoms per unit time.
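Theorem. If X is a Poisson variate with parameter $\mu$ then
(i) Mean of X is $\mu$
(ii) Variance of X is $\mu$ [W.B.U.T. 2006, 2007, 2012, 2013]
Proof. The values assumed by X are 0, 1, 2, ..., with probability
$P(X = i) = f_i = \dfrac{e^{-\mu}\mu^i}{i!}$.
\[
\begin{aligned}
\text{(i) Mean} = E(X) &= \sum_{i=0}^{\infty} i f_i = \sum_{i=0}^{\infty} i\, \frac{e^{-\mu}\mu^i}{i!} = e^{-\mu} \sum_{i=1}^{\infty} \frac{\mu^i}{(i-1)!} \\
&= e^{-\mu} \left( \frac{\mu}{0!} + \frac{\mu^2}{1!} + \frac{\mu^3}{2!} + \frac{\mu^4}{3!} + \cdots \text{ up to } \infty \right) \\
&= \mu e^{-\mu} \left( 1 + \mu + \frac{\mu^2}{2!} + \frac{\mu^3}{3!} + \cdots \text{ up to } \infty \right) = \mu e^{-\mu} \cdot e^{\mu} = \mu
\end{aligned}
\]
\[
\begin{aligned}
\text{(ii) Now, } E\{X(X-1)\} &= \sum_{i=0}^{\infty} i(i-1)\, e^{-\mu} \frac{\mu^i}{i!} = e^{-\mu}\mu^2 \sum_{i=2}^{\infty} \frac{\mu^{i-2}}{(i-2)!} \\
&= e^{-\mu}\mu^2 \left( 1 + \frac{\mu}{1!} + \frac{\mu^2}{2!} + \frac{\mu^3}{3!} + \cdots \text{ up to } \infty \right) = e^{-\mu}\mu^2 e^{\mu} = \mu^2
\end{aligned}
\]
So, by an earlier result (see the Theorem of Art 2.6), the variance,
$\operatorname{Var}(X) = E\{X(X-1)\} - m(m-1)$, where m = mean
$= \mu^2 - \mu(\mu - 1) = \mu$
Note. In light of the above theorem we considered the parameter $\mu = \lambda t$ in our previous illustration.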
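The statement that both the mean and the variance of a Poisson variate equal $\mu$ can be checked numerically by truncating the infinite sums. The sketch below assumes $\mu = 3$ and keeps 100 terms, both arbitrary choices; `poisson_pmf_list` is a hypothetical helper name.

```python
from math import exp

def poisson_pmf_list(mu, terms):
    """First `terms` Poisson probabilities f_0, f_1, ..., via f_{i+1} = f_i * mu / (i + 1)."""
    probs = [exp(-mu)]                       # f_0 = e^{-mu}
    for i in range(terms - 1):
        probs.append(probs[-1] * mu / (i + 1))
    return probs

mu = 3.0                                     # arbitrary parameter for the check
probs = poisson_pmf_list(mu, 100)            # tail beyond 100 terms is negligible here

total = sum(probs)
mean = sum(i * f for i, f in enumerate(probs))
factorial_moment = sum(i * (i - 1) * f for i, f in enumerate(probs))  # E{X(X-1)}
variance = factorial_moment - mean * (mean - 1)

print(total, mean, variance)
```

The variance is computed through $E\{X(X-1)\} - m(m-1)$, the same route as in the proof above.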
Moments of Poisson Variate
In the following two theorems we find the moments about zero and the central moments of a random variable X having Poisson distribution.
Theorem 1. If X has Poisson distribution with parameter $\mu$ then its moments about zero are

(i) $\mu_1' = \mu$

(ii) $\mu_2' = \mu^2 + \mu$

(iii) $\mu_3' = \mu^3 + 3\mu^2 + \mu$

(iv) $\mu_4' = \mu^4 + 6\mu^3 + 7\mu^2 + \mu$
Proof. (i) The first moment about zero,
$\mu_1' = E\big((X - 0)^1\big) = E(X) = \mu$

(ii) The second moment about zero,
\[
\begin{aligned}
\mu_2' &= E\big((X-0)^2\big) = E(X^2) = \sum_{i=0}^{\infty} i^2\, \frac{e^{-\mu}\mu^i}{i!} = \sum_{i=0}^{\infty} \{i(i-1) + i\}\, \frac{e^{-\mu}\mu^i}{i!} \\
&= \sum_{i=2}^{\infty} \frac{e^{-\mu}\mu^i}{(i-2)!} + \sum_{i=1}^{\infty} \frac{e^{-\mu}\mu^i}{(i-1)!} \\
&= e^{-\mu}\mu^2 \sum_{i=2}^{\infty} \frac{\mu^{i-2}}{(i-2)!} + e^{-\mu}\mu \sum_{i=1}^{\infty} \frac{\mu^{i-1}}{(i-1)!} \\
&= e^{-\mu}\mu^2 \left( 1 + \frac{\mu}{1!} + \frac{\mu^2}{2!} + \frac{\mu^3}{3!} + \cdots \right) + e^{-\mu}\mu \left( 1 + \frac{\mu}{1!} + \frac{\mu^2}{2!} + \frac{\mu^3}{3!} + \cdots \right) \\
&= e^{-\mu}\mu^2 \cdot e^{\mu} + e^{-\mu}\mu \cdot e^{\mu} = \mu^2 + \mu
\end{aligned}
\]
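(iii) The third moment about zero,
\[
\begin{aligned}
\mu_3' &= E\big((X-0)^3\big) = E(X^3) = \sum_{i=0}^{\infty} i^3\, \frac{e^{-\mu}\mu^i}{i!} \\
&= \sum_{i=0}^{\infty} \{i(i-1)(i-2) + 3i(i-1) + i\}\, \frac{e^{-\mu}\mu^i}{i!} \\
&= e^{-\mu}\mu^3 \sum_{i=3}^{\infty} \frac{\mu^{i-3}}{(i-3)!} + 3e^{-\mu}\mu^2 \sum_{i=2}^{\infty} \frac{\mu^{i-2}}{(i-2)!} + \sum_{i=0}^{\infty} i\, \frac{e^{-\mu}\mu^i}{i!} \\
&= e^{-\mu}\mu^3 \left( 1 + \mu + \frac{\mu^2}{2!} + \cdots \right) + 3e^{-\mu}\mu^2 \left( 1 + \mu + \frac{\mu^2}{2!} + \cdots \right) + E(X) \\
&= e^{-\mu}\mu^3 e^{\mu} + 3e^{-\mu}\mu^2 e^{\mu} + \mu \\
&= \mu^3 + 3\mu^2 + \mu
\end{aligned}
\]
(iv) Kept beyond the scope of this book.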
Theorem 2. If X has Poisson distribution with parameter $\mu$ then its central moments are

(i) $\mu_1 = 0$ (ii) $\mu_2 = \mu$

(iii) $\mu_3 = \mu$ (iv) $\mu_4 = 3\mu^2 + \mu$
Proof. (i) First central moment, $\mu_1 = E(X - \bar{X}) = E(X) - \bar{X} = \mu - \mu = 0$
(ii) Second central moment,
$\mu_2 = E\big((X - \bar{X})^2\big) = E(X^2) - 2\bar{X}E(X) + \bar{X}^2$
$= \mu_2' - 2\mu\mu_1' + \mu^2 = \mu^2 + \mu - 2\mu \cdot \mu + \mu^2 = \mu$
(iii) The third central moment,

$\mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2\mu_1'^3$ (using the relation among central moments and moments about any number)
$= \mu^3 + 3\mu^2 + \mu - 3(\mu^2 + \mu)\mu + 2\mu^3 = \mu$

(iv) The fourth central moment,

$\mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'\mu_1'^2 - 3\mu_1'^4$
(using the relation among central moments and moments about any number)

$= \mu^4 + 6\mu^3 + 7\mu^2 + \mu - 4(\mu^3 + 3\mu^2 + \mu)\mu + 6(\mu^2 + \mu)\mu^2 - 3\mu^4 = 3\mu^2 + \mu$
4.4. Poisson Approximation to Binomial Distribution.
The range of applications of the Poisson distribution becomes wider as it is used as an approximation for a Binomial distribution. In case of a Binomial distribution, when n becomes large and p is small enough so that np is a moderate fixed value, the Binomial variate becomes approximately equal to a Poisson variate. This is given by the following theorem.

Theorem. Let the random variable X follow Binomial distribution with

pmf, $f_i = \binom{n}{i} p^i (1-p)^{n-i}, \quad i = 0, 1, 2, \ldots, n$

where n and p are parameters. If $n \to \infty$ and $p \to 0$ such that $np = \mu$, a fixed quantity, then

$\lim_{n \to \infty} \binom{n}{i} p^i (1-p)^{n-i} = \dfrac{e^{-\mu}\mu^i}{i!}$ for a fixed i, i.e. $f_i$ of Binomial distribution $\approx$ $f_i$ of Poisson distribution.

Proof. Beyond the scope of the book.


Illustration. Let a box contain 200 fuses. Experience tells that 2% of such fuses are defective. Let us consider the experiment of drawing a fuse and testing whether this is defective or not.
Let X = number of defective fuses in the box.

$\therefore$ X has Binomial distribution with parameters n = 200, $p = \frac{2}{100} = 0.02$.

The pmf of X is

$f_i = \binom{n}{i} p^i (1-p)^{n-i} = \binom{200}{i} (0.02)^i (1 - 0.02)^{200-i}$.

Here we see n is large and p is small, so we can write

$\binom{200}{i} (0.02)^i (1 - 0.02)^{200-i} \approx \dfrac{e^{-200 \times 0.02} (200 \times 0.02)^i}{i!} = \dfrac{e^{-4}\, 4^i}{i!}$

Now $P(X \le 3) = f_0 + f_1 + f_2 + f_3 \approx \dfrac{e^{-4}\, 4^0}{0!} + \dfrac{e^{-4}\, 4^1}{1!} + \dfrac{e^{-4}\, 4^2}{2!} + \dfrac{e^{-4}\, 4^3}{3!}$.
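To see how close the Poisson approximation is in the fuse illustration (n = 200, p = 0.02), the exact binomial value of $P(X \le 3)$ can be computed alongside the approximate one. This is a minimal sketch; the helper names are assumptions introduced here.

```python
from math import comb, exp, factorial

def binom_pmf(n, p, i):
    """Exact binomial probability P(X = i)."""
    return comb(n, i) * p**i * (1 - p) ** (n - i)

def poisson_pmf(mu, i):
    """Poisson probability e^{-mu} mu^i / i!."""
    return exp(-mu) * mu**i / factorial(i)

n, p = 200, 0.02
mu = n * p                                   # Poisson parameter np = 4

exact = sum(binom_pmf(n, p, i) for i in range(4))     # binomial P(X <= 3)
approx = sum(poisson_pmf(mu, i) for i in range(4))    # Poisson P(X <= 3)

print(exact, approx, abs(exact - approx))
```

For these parameters the two values differ only in the third decimal place, which is why the approximation is adequate for this kind of problem.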

4.5. Illustrative Examples.


Example 1. For a Poisson variate, if $P(X = 2) = P(X = 1)$, find $P(X = 1 \text{ or } 0)$. Find also the mean of X. Find the 3rd moment of X about 0.
Let $\mu$ be the parameter of the Poisson variate.

$\therefore P(X = i) = f_i = \dfrac{e^{-\mu}\mu^i}{i!}$
Now, $f_2 = f_1$ or, $\dfrac{e^{-\mu}\mu^2}{2!} = \dfrac{e^{-\mu}\mu^1}{1!} \ \therefore \mu = 2$
$\therefore P(X = 1 \text{ or } 0) = f_1 + f_0 = e^{-\mu}(1 + \mu) = e^{-2}(1 + 2) = 3e^{-2}$

Mean of X $= \mu = 2$
Now, 3rd moment about 0 $= \mu_3'$

$= \mu^3 + 3\mu^2 + \mu = 8 + 12 + 2 = 22$

Example 2. A car-hire firm has two cars which it hires out day by day. The number of demands for a car on each day is distributed as a Poisson distribution with average number of demands per day 1.5. Calculate the proportion of days on which neither car is used and the proportion of days on which some demand is refused ($e^{-1.5} = 0.2231$).
[W.B.U.Tech 2003, 2006, 2007]
Let X be the random variable denoting the number of demands for a car on any day. Then X is Poisson distributed with parameter $\mu = 1.5$. So its pmf is $P(X = i) = f_i = \dfrac{e^{-\mu}\mu^i}{i!}$ where $\mu = 1.5$.
$\therefore$ Proportion of days on which neither car is used
= Prob. of there being no demand for the car
$= P(X = 0) = \dfrac{\mu^0 e^{-\mu}}{0!} = e^{-1.5} = 0.2231$.
Proportion of days on which some demand is refused
= Prob. for the number of demands to be more than two

$= P(X > 2) = 1 - P(X \le 2)$

$= 1 - \{P(X = 0) + P(X = 1) + P(X = 2)\}$

$= 1 - \left\{ e^{-\mu} + \dfrac{\mu e^{-\mu}}{1!} + \dfrac{\mu^2 e^{-\mu}}{2!} \right\} = 1 - e^{-1.5} \left( 1 + 1.5 + \dfrac{(1.5)^2}{2} \right) = 0.19126$.
Example 3. A radioactive source emits on the average 2.5 particles per second. Calculate the prob. that 2 or more particles will be emitted in an interval of 4 seconds.
Here $\lambda$ = number of changes (here, particles emitted) per unit time on an average = 2.5.
Let X be the random variable denoting the number of particles emitted in the given interval. Then X is Poisson distributed with parameter $\mu$ = average number of particles in 4 seconds $= 2.5 \times 4 = 10$.
So the p.m.f., $f_i = P(X = i) = \mu^i e^{-\mu} / i! = 10^i e^{-10} / i!$

So the required prob. $= P(X \ge 2) = 1 - P(X < 2)$

$= 1 - \{P(X = 0) + P(X = 1)\} = 1 - e^{-10}(1 + 10) = 1 - 11e^{-10}$.
Example 4. In a certain factory turning out razor blades, there is a small chance, 1/500, for any blade to be defective. The blades are in packets of 10. Use the Poisson distribution to calculate the approximate number of packets containing (i) no defective (ii) one defective (iii) two defective blades respectively in one consignment of 10,000 packets. (Given $e^{-0.02} = 0.9802$.)
[W.B.U.Tech 2004]

On an average there is 1 defective blade per 500 blades. So the average number of defective blades in a packet of 10 is $10 \times \frac{1}{500} = \frac{1}{50} = 0.02$.
Let X = number of defective blades in a packet. X follows Poisson distribution with parameter $\mu = 0.02$.
So the pmf is $P(X = i) = f_i = \dfrac{e^{-\mu}\mu^i}{i!}$.

(i) Now probability that one packet contains no defective blade

$= P(X = 0) = f_0 = \dfrac{e^{-\mu}\mu^0}{0!} = e^{-\mu} = e^{-0.02} = 0.9802$

$\therefore$ Number of packets in the consignment containing no defective blades $= 0.9802 \times 10{,}000 = 9802$
(ii) Probability that one packet contains one defective blade
$= P(X = 1) = f_1 = \dfrac{e^{-\mu}\mu}{1!} = e^{-0.02} \times 0.02$
$= 0.9802 \times 0.02 = 0.019604$
$\therefore$ Number of packets in the consignment containing one defective blade
$= 0.019604 \times 10{,}000 = 196.04 \approx 196$
(iii) $P(X = 2) = \dfrac{e^{-\mu}\mu^2}{2!} = e^{-0.02} \times \dfrac{(0.02)^2}{2}$
$= 0.9802 \times 0.0002 = 0.00019604$
$\therefore$ Required number $= 0.00019604 \times 10{,}000 = 1.9604 \approx 2$.
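The expected packet counts in the razor-blade example follow from multiplying each Poisson probability by the consignment size. A minimal sketch, with the hypothetical helper `expected_packets`, is given below.

```python
from math import exp, factorial

def expected_packets(mu, k, total_packets):
    """Expected number of packets with exactly k defective blades."""
    return total_packets * exp(-mu) * mu**k / factorial(k)

mu = 10 * (1 / 500)          # average defectives per packet of 10 blades
counts = [round(expected_packets(mu, k, 10_000)) for k in range(3)]
print(counts)
```

Rounding is applied only at the end, since the number of packets must be reported as a whole number.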
Example 5. If a random variable has a Poisson distribution such that $P(1) = P(2)$, find (i) mean of the distribution (ii) standard deviation (iii) $P(X = 4)$. [W.B.U.Tech 2007]
Let X be a Poisson variate. Then the p.m.f. of X is
$f_i = P(X = i) = e^{-\mu} \dfrac{\mu^i}{i!}, \quad i = 0, 1, 2, \ldots$
As $P(1) = P(2)$, so $e^{-\mu}\mu = e^{-\mu} \dfrac{\mu^2}{2!} \ \therefore \mu = 2$

(i) So the mean of the distribution is 2.

(ii) Now $\operatorname{Var}(X) = \mu = 2 \ \therefore$ standard deviation $= \sqrt{2}$.
(iii) $P(X = 4) = e^{-2} \dfrac{2^4}{4!} = \dfrac{2}{3} e^{-2}$.
Example 6. Find the probability that at most 5 defective fuses will be found in a box of 200 fuses if experience shows that 2 percent of such fuses are defective.
Let X denote the number of defective fuses in the box.
Then clearly X has a binomial distribution with parameters
$n = 200$, $p = \frac{2}{100} = \frac{1}{50}$

$\therefore \mu = np = 200 \times \frac{1}{50} = 4$

Using an approximation by the Poisson distribution, we have

$P(X \le 5) = \sum_{i=0}^{5} \dfrac{e^{-4}\, 4^i}{i!} = e^{-4} \left( 1 + \dfrac{4}{1!} + \dfrac{4^2}{2!} + \cdots + \dfrac{4^5}{5!} \right) = 0.785$

Example 7. Six coins are tossed 6400 times. Using the Poisson distribution find the approximate probability of getting six heads 8 times.
Let X denote the number of times six heads appear in the tosses of six coins. Then X is a binomial variate with parameters n = 6400, $p = \left(\frac{1}{2}\right)^6 = \frac{1}{64}$. Here n is large and p is small, but $\mu = np = 6400 \times \frac{1}{64} = 100$.

So using the Poisson approximation, we have,
$P(X = 8) = {}^{n}C_8\, p^8 (1-p)^{n-8} \approx e^{-100} \dfrac{(100)^8}{8!}$
Example 8. 2% of the items made by a machine are defective. Find the probability that 3 or more items are defective in a sample of 100 items.
(Given $e^{-1} = 0.368$, $e^{-2} = 0.135$, $e^{-3} = 0.0498$)
Consider the experiment: one item is drawn and found whether it is defective (success). Let this experiment be repeated 100 times.
X = number of defective items. Then X follows Binomial distribution with parameters n = 100, $p = 2\% = \frac{2}{100} = 0.02$.
$\therefore$ the pmf, $f_i = {}^{n}C_i\, p^i (1-p)^{n-i}$.

Now since n is large and p is small and $np = 100 \times 0.02 = 2$, the Binomial pmf is approximately equal to the Poisson pmf with parameter np.
$\therefore f_i \approx \dfrac{e^{-2} \cdot 2^i}{i!}$

Required probability $= P(X \ge 3) = 1 - P(X < 3)$

$= 1 - \{f_0 + f_1 + f_2\} = 1 - e^{-2}\left(1 + 2 + \dfrac{2^2}{2!}\right) = 1 - 0.135 \times 5 = 0.325$.


EXERCISES
[I] SHORT ANSWER QUESTIONS

1. If for a Poisson variate X, $E(X^2) = 20$, find E(X) and var(X).

[Hints: $E(X^2) = \mu^2 + \mu$
$\therefore \mu^2 + \mu = 20$
$\therefore \mu = 4$
$\therefore E(X) = \mu = 4$
and $\operatorname{var}(X) = \mu = 4$]
2. For a Binomial distribution, the mean is 3 and the variance is 2. Find the values of n and p. Hence find the probability that X (the variable value) is 5.
3. A discrete random variable X has the mean 6 and variance 2. Assuming the distribution is binomial, find the probability that $5 \le X \le 7$.
[W.B.U.Tech 2002]
4. If 20% of the articles produced by a machine are defective, determine the probability that out of 4 articles chosen at random less than 2 articles will be defective. [W.B.U.Tech 2002]
5. Show that the Binomial distribution is symmetric when $p = \frac{1}{2}$.
6. If the probability of a defective bolt is $\frac{1}{10}$, find (i) mean (ii) variance for the distribution of defective bolts in a total of 400.
7. If the sum of mean and variance of a Binomial distribution is 4.8 for five trials, find the distribution.
8. In a shooting competition, the probability of a man hitting a target is $\frac{1}{5}$. If he fires 5 times, what is the probability of hitting the target at least twice?
9. Let the probability of a patient recovering from a certain disease be 0.75. Find the distribution of the number of recoveries among 4 patients. Hence find mean and s.d.

10. Prove that the probability of obtaining double six at least once in 24 throws with two dice is slightly less than $\frac{1}{2}$.
11. Find the minimum number of times a die has to be thrown such that the prob. of 'no six' is less than $\frac{1}{2}$.
[Hints: The prob. of 'no six' in n throws of a die $= \left(\frac{5}{6}\right)^n$
$\therefore \left(\frac{5}{6}\right)^n < \frac{1}{2} \ \therefore n > \dfrac{-\log 2}{\log 5 - \log 6} = 3.80$]
12. A and B play a game in which A's chance of winning is $\frac{2}{3}$. In a series of 8 games, what is the chance that A will win at least 6 games?
13. Find the probability that at most 5 defective pens will be found in a box of 200 pens if experience shows that 2% of such pens are defective.
14. Five balls are drawn with replacement from a box containing 5 white and 4 black balls. Find the probability of getting 3 white balls.
[Hint. X = number of white balls, $X \sim b(n, p)$ with $n = 5$, $p = \frac{5}{9}$; $P(X = 3) = ?$]
15. Two dice are thrown n times in succession. What is the probability of obtaining double six at least once?
16. If probability of success is 0.1, how many trials are necessary in order that the probability of at least one success is $> \frac{1}{2}$?
17. The mean of the Poisson distribution is $\mu$. Then what is its standard deviation? [W.B.U.Tech 2004]
18. If a r.v. follows Poisson distribution such that $P(1) = P(2)$, find (i) mean of the distribution (ii) P(4). [W.B.U.Tech 2003, 2004]

19. If X is a Poisson variate such that $P(X = 1) = 0.2$ and $P(X = 2) = 0.2$, find $P(X = 0)$. [W.B.U.Tech 2002]

20. Show that the Poisson distribution is a probability distribution.

21. Show that for a Poisson variate standard deviation $= \sqrt{\text{mean}}$.

ANSWERS

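1. 4, 4  2. 9, $\frac{1}{3}$, 224/2187  4. 0.8192  6. (i) 40, (ii) 36

7. $B\left(5, \frac{4}{5}\right)$  8. $\frac{821}{3125}$

9. X : 0 1 2 3 4
$f_i$ : $\frac{1}{256}$ $\frac{3}{64}$ $\frac{27}{128}$ $\frac{27}{64}$ $\frac{81}{256}$
Mean = 3, s.d. $= \frac{\sqrt{3}}{2}$

11. 4  12. 1024/2187  13. 0.785  14. 10/21  15. $1 - \left(\frac{35}{36}\right)^n$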

16. 7  17. $\sqrt{\mu}$  18. (i) 2  19. $e^{-2}$

[II] LONG ANSWER QUESTIONS

1. Assume that 50% of all engineering students are good in mathematics. Determine the probabilities that among 18 engineering students (a) exactly 10 (b) at least 10 (c) at most 8 (d) at least 2 and at most 9 are good in maths.
2. On an average 20 red blood cells are found in a fixed volume of blood for a normal person. Determine the probability that the blood sample of a normal person will contain less than 15 red cells.
[Hints: Here $\mu = 20 \ \therefore P(X < 15) = \sum_{i=0}^{14} \dfrac{e^{-20}\, 20^i}{i!} = 0.105$]

3. The probability that a screw manufactured by a machine is defective is $\frac{1}{50}$. A lot of 6 screws is taken at random. Find the probability that (i) there are exactly 2 defective screws in the lot (ii) there is no defective screw and (iii) there are at most 2 defective screws.
4. The probability of a missile hitting a target is $\frac{1}{4}$.
(i) If 7 such missiles are sent, what is the probability of hitting the target at least twice?
(ii) How many missiles must be sent so that the probability of hitting the target at least once is greater than $\frac{2}{3}$?

5. In a bombing attack there is a 50% chance that a bomb strikes the target. Two direct hits are required to destroy a bridge completely. How many bombs must be dropped to give a 99% chance or better of completely destroying the target?
6. (a) Amal has probability $\frac{2}{3}$ of winning a game. If he plays 4 such games, find the probability that Amal wins (i) exactly 2 games (ii) at least 1 game (iii) more than half of the games.
(b) A and B play a game in which their chances of winning are in the ratio 3 : 2. Find A's chances of winning at least three games out of the five games played. [W.B.U.Tech 2004]

7. The probability is 0.02 that a bicycle produced by a factory is defective. A consignment of 10000 bicycles is sent to a country. Find the expected number of defective bicycles and the standard deviation.
8. It is seen that a cricket player becomes out within 10 runs in 3 out of 10 innings. If he plays 4 innings, what is the probability that he will become (i) out twice (ii) out at least once within 10 runs?
[W.B.U.Tech 2007]
[Hints : $p = \frac{3}{10}$ (the probability of success in a trial).

Let X = number of times of becoming out within 10 runs when he plays 4 innings
= number of successes in 4 trials.
X follows Binomial distribution with $p = \frac{3}{10}$, $n = 4$.

$\therefore f_i = P(X = i) = {}^{4}C_i \left(\frac{3}{10}\right)^i \left(1 - \frac{3}{10}\right)^{4-i} = {}^{4}C_i \left(\frac{3}{10}\right)^i \left(\frac{7}{10}\right)^{4-i} = {}^{4}C_i\, \dfrac{3^i \times 7^{4-i}}{10^4}$

$\therefore$ (i) $P(X = 2) = {}^{4}C_2 \cdot \dfrac{3^2 \times 7^{4-2}}{10^4} = 0.2646$
(ii) $P(X \ge 1) = 1 - P(X = 0) = 1 - f_0 = 1 - {}^{4}C_0\, \dfrac{3^0 \times 7^{4-0}}{10^4}$
$= 1 - \dfrac{7^4}{10^4} = 1 - \dfrac{2401}{10000} = 0.7599$]
9. Show that the necessary and sufficient conditions for two given numbers a and b to be respectively the mean and variance of some binomial distribution are that $a > b > 0$ and $\dfrac{a^2}{a - b}$ is an integer. Show further that when these conditions are satisfied the binomial distribution is uniquely determined.
10. Determine the expected number of boys in a family with 8 children, assuming the sex distribution to be equally probable. What is the probability that the expected number of boys does occur?

11. In sampling a large number of parts manufactured by a machine, the mean number of defectives in a sample of 20 is 2. Out of 1000 such samples, how many would be expected to contain at least 3 defective parts?

12. Let X be a binomially distributed random variable with parameters n and p. For what value of p is var(X) maximum, if you assume that n is fixed?
13. X is a Poisson variate with parameter 3. Find the probability that X assumes the values (i) 3, 2, 1, 0 (ii) less than 3 (iii) at least 2.

[Given $e^{-3} = 0.0498$]

14. A bank receives on an average 2.5 customers per hour. Find the probability that in a certain hour the bank receives (i) no customer (ii) exactly 4 customers. Assume that the number of customers received in an hour is Poisson distributed. ($e^{-2.5} = 0.0821$)
15. For a Poisson variate with mean 2, find the probabilities

$P(X = 1),\ P(X \le 1),\ P(X < 1),\ P(X > 1),\ P(1 \le X \le 3)$
[Given $e^{-2} = 0.1353$]

16. (i) The average number of defective spots per sheet of a metal sheet factory is 2. Assuming Poisson distribution, what is the probability that a particular sheet is free from spots? If the factory supplies 1000 such sheets, how many sheets contain more than 2 spots?

(ii) Suppose that the number of telephone calls an operator receives from 11.00 a.m. to 11.05 a.m. follows a Poisson distribution with m = 3. (i) Find the probability that the operator will receive no calls in that time interval tomorrow. (ii) Find the probability that in the next 3 days the operator will receive a total of 1 call in that time interval. (e = 2.718)
17. If 5% of the electric bulbs manufactured by a company are defective, use Poisson distribution to find the probability that in a sample of 100 bulbs (i) none is defective (ii) 5 bulbs will be defective (given $e^{-5} = 0.007$).

18. Find the probability that at most 5 defective bolts will be found in a box of 200 bolts, if it is known that 2% of such bolts are expected to be defective. (You may take the distribution to be Poisson.) Given $e^{-4} = 0.0183$.

19. The manufacturer of a telephone instrument knows that 3% of his product is defective. He sells the instruments in cartons of 100 and guarantees that not more than 3 in any carton will be defective. Find the probability that a carton fails to meet the guarantee. (Given $e^{3} = 20.1$)
20. In a certain factory blades are manufactured in packets of 10. There is a 0.2% probability for any blade to be defective. Using Poisson distribution calculate approximately the number of packets containing two defective blades in a consignment of 20,000 packets (given that $e^{-0.02} = 0.9802$).

21. Suppose 220 misprints are distributed at random throughout a book of 200 pages. Find the probability that a given page contains (i) no misprints (ii) one misprint (iii) 2 or more misprints.

22. A source of liquid contains bacteria with the average number of bacteria per c.c. equal to 3. Ten 1 c.c. test tubes are filled with the liquid. Assuming the Poisson distribution is applicable, calculate the probability (i) that all 10 test tubes show growth, that is, contain at least 1 bacterium each (ii) that exactly 7 test tubes show growth. [Given $e^{-3} = 0.04975$]
23. A large number of observation on a given solution, which contained
bacteria, were made taking samples of 1 c.c each noting down the number
of bacteria present in each sample. Assuming the poisson distribution and
given that 10% samples contained no bacteria, find the average number
of bacteria per c.c. [loge 10 = 2.3026 ].
24. 100 litres of water are supposed to be polluted with $10^6$ bacteria.
Find the probability that a sample of 1 c.c of the same water is free from
bacteria.

j . 25. An office switch board receives phone calls at the rate of 2 in every
5 minutes on the average. What is the probability of getting exactly 4 calls
in 15 minutes ?
26. In turning out certain toys in a manufacturing process in a factory,
the average number of defectives is 10%. What is the probability of getting
exactly 3 defectives in a sample of 10 toys chosen at random, by using
Poisson approximation to the Binomial distribution (Take e = 2.72 )
(Hint: the method is adopted in an lllustrative Ex. page 89;
27. (i) An aeroplane runs with the help of its 1000 components. Failure
of a component is independent of the others. The chance of failure of a component
is 0.1%. The plane takes off from one airport and reaches its destination
safely if none of its components fails. Find the probability of
reaching safely. (Hint. Use the method of approximation of a Binomial variate by
Poisson.)
(ii) If 5% of the books bound at a certain bindery have defective
bindings, find the probability that 2 of 100 books bound by this bindery
will have defective bindings using (a) the binomial distribution (b) the
Poisson approximation [$e^{-5} = 0.007$].
(iii) Suppose 1% of the items made by a machine are defective. Find
by the Poisson approximation to the Binomial the probability that 3 or more items are
defective in a sample of 100 items. (Given $e^{-1} = 0.368$)
(iv) In a lottery with 10,000 tickets there are 100 prizes. A man buys
100 tickets. Apply the Poisson approximation to the binomial distribution to find
the approximate probability of his winning at least one prize ($e^{-1} = 0.368$).

ANSWERS
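1. (a) 0.167 (b) 0.4073 (c) 0.4073 (d) 0.5920   2. 0.105

3. (i) ${}^{6}C_{2}\left(\frac{1}{50}\right)^{2}\left(\frac{49}{50}\right)^{4}$ (ii) $\left(\frac{49}{50}\right)^{6}$ (iii) $\left(\frac{49}{50}\right)^{6} + {}^{6}C_{1}\left(\frac{1}{50}\right)\left(\frac{49}{50}\right)^{5}$

4. (i) 4547/8192 (ii) 4   5. 11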

6. (a) (i) 8/27 (ii) 80/81 (iii) 16/27 (b) 0.68256

7. 200, 14 8. (i) 0.2646 (ii) 0.7599


10. 4, 0.27   11. 323   12. 1/2
13. (i) .2241, .2241, .1494, .0498 (ii) .4233 (iii) .8008

14. (i) 0.0821 (ii) 0.1336

15. 0.2706, 0.4059, 0.1353, 0.5941, 0.7216

16. (i) e-2; 1000(1- 5e-2) (ii) e-3, ge-9

17. (i) 0.007 (ii) 0.182   18. 0.78   19. 0.35 ($= 1 - 13e^{-3}$)   20. 4

21. (i) 0.333 (ii) 0.366 (iii) 0.301

22. (i) 0.60003 (ii) 0.01036   23. $\log_e 10 = 2.3026$

24. $e^{-10}$   25. 0.1339   26. 0.061

27. (i) $e^{-1}$ (ii) (a) 0.081 (b) 0.084 (iii) 0.080 (iv) 0.632
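
Answers 19 and 25 can be cross-checked numerically. The following is a minimal sketch, assuming Python's `math.exp` in place of the rounded values of e given in the problems; the helper name `poisson_pmf` is illustrative and not part of the text.

```python
from math import exp, factorial

def poisson_pmf(mu, k):
    """P(X = k) for a Poisson variate with mean mu."""
    return exp(-mu) * mu**k / factorial(k)

# Problem 19: mean 100 * 0.03 = 3; the guarantee fails when more than 3 are defective.
p_fail = 1 - sum(poisson_pmf(3, k) for k in range(4))

# Problem 25: 2 calls per 5 minutes gives mean 6 per 15 minutes; exactly 4 calls.
p_four_calls = poisson_pmf(6, 4)

print(round(p_fail, 3), round(p_four_calls, 4))
```

Small differences from the printed answers in other problems can arise because those answers use the rounded constants supplied with each problem.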

[III] MULTIPLE CHOICE QUESTIONS

1. The mean of the Binomial distribution B(n, p) (where n, p are the number
of trials and the probability of success) is
(a) n/p (b) 0 (c) np (d) 1
[W.B.U.Tech 2007]
2. If X has a Binomial distribution with parameters n and p, then its
variance is
(a) np (b) $np^{2}$ (c) np(1 - p) (d) np(1 + p)
3. The mean and variance of a Binomial distribution with parameters
6, 2/3 are
(a) 6, 4/3 (b) 4, 4/3 (c) 0, 2/3 (d) 9, 4/3.
4. The mean of the Binomial distribution B(10, 2/5) is
(a) 4 (b) 6 (c) 5 (d) 0.
[W.B.U.Tech 2007]
5. The statement "a Binomial variate has mean 7 and S.D. 5" is
(a) False (b) True.
6. If X has a binomial distribution with parameters 4 and 1/3, then
P(X = 1) = ?
(a) 4/3 (b) 16/27 (c) 32/81 (d) 16/9.
7. The mean and standard deviation of a Binomial distribution are
respectively 4 and $\sqrt{8/3}$. The values of n and p are (where n and p are the
parameters of the distribution)
[W.B.U.Tech 2006]

(a) 11 ~ (b) 12,~


'4 7
(c) 12.!. (d) 11,±
'3 3

8. If X has a Poisson distribution with parameter $\mu$, then the mean of X is
(a) $\mu$ (b) $\mu^{2}$ (c) $\mu(\mu - 1)$ (d) 1.
9. The distribution for which mean and variance are equal is
(a) Poisson (b) Normal
(c) Binomial (d) Exponential.
[W.B.U.Tech 2006]
10. If a random variable X has a Poisson distribution with mean 0.4,
then $P(X \le 1)$ is
(a) $e^{-0.4}$ (b) $1.4e^{-0.4}$ (c) $1.48e^{-0.4}$ (d) none.

11. A random variable X has a Poisson distribution such that P(1) = P(2).
Then the S.D. of X is
(a) 0 (b) 2 (c) $\sqrt{2}$ (d) -2. [W.B.U.Tech 2007]


12. The Poisson distribution is a limiting case of the Binomial distribution
when
(a) n is very large and p is very small
(b) n is very small and p is very large
(c) n, p both are very small
(d) n, p both are very large.
13. The mean and variance of a binomial distribution are 4 and 3
respectively. Then the parameters of the distribution are
(a) 16, 1/4 (b) 8, 1/2 (c) 12, 1/3 (d) none.
14. If X is a Poisson variate such that P(X = 1) = 0.2 and
P(X = 2) = 0.2, find P(X = 0). [W.B.U.Tech 2002]

ANSWERS

1.c 2.c 3.b 4.a 5.a 6.c 7.c

8.a 9.a 10.b 11.c 12.a 13.a 14.d

5. DISCRETE JOINT DISTRIBUTION & CHEBYSHEV'S INEQUALITY

5.1. Introduction.
Let a discrete random variable X assume 4 possible values
$x_1, x_2, x_3$ and $x_4$. Corresponding to each of these values let another random
variable Y assume 3 possible values $y_1, y_2$ and $y_3$.
Then the pair of variables (X, Y) assumes the 4 x 3 = 12 pairs of values
shown below.

    X \ Y    $y_1$           $y_2$           $y_3$
    $x_1$    $(x_1, y_1)$    $(x_1, y_2)$    $(x_1, y_3)$
    $x_2$    $(x_2, y_1)$    $(x_2, y_2)$    $(x_2, y_3)$
    $x_3$    $(x_3, y_1)$    $(x_3, y_2)$    $(x_3, y_3)$
    $x_4$    $(x_4, y_1)$    $(x_4, y_2)$    $(x_4, y_3)$

By $(X = x_i, Y = y_j)$ we mean that the events $(X = x_i)$ and $(Y = y_j)$
occur simultaneously, and an event like $(X = x_i)$ is the set $\{(x_i, y_1), (x_i, y_2), (x_i, y_3)\}$.
In this chapter we are interested in probability statements about these event
points $(x_i, y_j)$. The table showing the probabilities with which the pair (X, Y)
assumes these pairs $(x_i, y_j)$ is known as the joint distribution of the bivariate
(X, Y).
5.2 Joint Distribution of two random Variables:
Let X be a random variable assuming the values

    X : $x_1, x_2, x_3, \dots, x_m$

and let Y be a random variable assuming the following values corresponding
to each $x_i$:

    Y : $y_1, y_2, y_3, \dots, y_n$

Then the pair of variables, or the bivariate (X, Y), assumes the values
shown in the following table:

    X \ Y    $y_1$           $y_2$           ...    $y_n$
    $x_1$    $(x_1, y_1)$    $(x_1, y_2)$    ...    $(x_1, y_n)$
    $x_2$    $(x_2, y_1)$    $(x_2, y_2)$    ...    $(x_2, y_n)$
    ...
    $x_m$    $(x_m, y_1)$    $(x_m, y_2)$    ...    $(x_m, y_n)$

There are mn values $(x_i, y_j)$ in the above table. These
values are known as Bivariate Data.
Let $p_{ij}$ = probability that (X, Y) assumes the pair $(x_i, y_j)$
$= P\{(x_i, y_j)\} = P(X = x_i, Y = y_j)$.
All the possible values of $p_{ij}$ are shown in the following table:

    X \ Y    $y_1$        $y_2$        $y_3$        ...    $y_n$        Total
    $x_1$    $p_{11}$     $p_{12}$     $p_{13}$     ...    $p_{1n}$     $p_{X_1}$
    $x_2$    $p_{21}$     $p_{22}$     $p_{23}$     ...    $p_{2n}$     $p_{X_2}$
    $x_3$    $p_{31}$     $p_{32}$     $p_{33}$     ...    $p_{3n}$     $p_{X_3}$
    ...
    $x_m$    $p_{m1}$     $p_{m2}$     $p_{m3}$     ...    $p_{mn}$     $p_{X_m}$
    Total    $p_{Y_1}$    $p_{Y_2}$    $p_{Y_3}$    ...    $p_{Y_n}$    1
where the row-wise and column-wise totals are
$$p_{X_i} = p_{i1} + p_{i2} + \dots + p_{in} = \sum_{j=1}^{n} p_{ij}$$
and
$$p_{Y_j} = p_{1j} + p_{2j} + \dots + p_{mj} = \sum_{i=1}^{m} p_{ij}.$$
The grand total is
$$\sum_{i=1}^{m} p_{X_i} = \sum_{j=1}^{n} p_{Y_j} = \sum_{j=1}^{n}\sum_{i=1}^{m} p_{ij} = 1.$$
The above two tables represent the joint distribution of the bivariate
(X, Y).
The row-wise totals $p_{X_i}$ and the column-wise totals $p_{Y_j}$ are called the
Marginal Probability Masses of X and Y respectively.
$p_{ij} = P(X = x_i, Y = y_j)$ are known as the Joint Probability Masses of the
bivariate (X, Y).
Obviously, for the individual variables X and Y,
$$P(X = x_i) = p_{X_i} = p_{i1} + p_{i2} + \dots + p_{in}, \quad i = 1, 2, \dots, m$$
and
$$P(Y = y_j) = p_{Y_j} = p_{1j} + p_{2j} + \dots + p_{mj}, \quad j = 1, 2, \dots, n.$$
∴ X will have the probability distribution

    X         : $x_1$        $x_2$        $x_3$        ...    $x_m$        Total
    $p_{X_i}$ : $p_{X_1}$    $p_{X_2}$    $p_{X_3}$    ...    $p_{X_m}$    1

This distribution is called the Marginal Distribution of X in the above
joint distribution.
Similarly the Marginal Distribution of Y in the joint distribution of
(X, Y) is

    Y         : $y_1$        $y_2$        $y_3$        ...    $y_n$        Total
    $p_{Y_j}$ : $p_{Y_1}$    $p_{Y_2}$    $p_{Y_3}$    ...    $p_{Y_n}$    1
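
The row and column totals that define the marginal distributions can be sketched in a few lines of Python. This is only an illustration: the 2 x 3 table below is an assumed example (its entries match the probability table used in an Example of Section 5.3), and the name `marginals` is hypothetical.

```python
from fractions import Fraction as F

# Assumed joint table: rows are the values of X, columns are the values of Y.
x_values = [-1, 4]
y_values = [3, 5, 8]
joint = [
    [F(10, 100), F(25, 100), F(5, 100)],
    [F(30, 100), F(15, 100), F(15, 100)],
]

def marginals(table):
    """Return (row totals, column totals) of a joint probability table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return row_totals, col_totals

p_x, p_y = marginals(joint)
print(p_x)  # marginal probability masses of X
print(p_y)  # marginal probability masses of Y
```

Exact fractions are used so that the grand total comes out as exactly 1, as the definition requires.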


5.3. Independent Random Variables
Let (X, Y) be a pair of random variables having a joint distribution as
discussed above.
If $p_{ij} = p_{X_i} p_{Y_j}$, i.e., $P(X = x_i, Y = y_j) = P(X = x_i)P(Y = y_j)$ holds for
all values of i ($1 \le i \le m$) and j ($1 \le j \le n$), then X and Y are called
independent random variables.
In the Example later in this section, where X takes the values -1, 4 and Y takes
the values 3, 5, 8, the random variables X and Y are not independent
because P(X = -1, Y = 8) = 0.05
but P(X = -1)P(Y = 8) = 0.40 x 0.20 = 0.08.

Following is an example of a joint distribution of (X, Y) where X and Y
are independent.
Example. Let (X, Y) be a bivariate having the following joint distribution:

    X \ Y    -3      5       7       Total
    9        0.09    0.15    0.06    0.30
    0.2      0.21    0.35    0.14    0.70
    Total    0.30    0.50    0.20    1

Here every entry in the main body of the table is equal to the
product of the corresponding entries in the last column and the last row, e.g.
0.35 = 0.70 x 0.50, 0.06 = 0.30 x 0.20 etc.,
that is, P(X = 0.2, Y = 5) = P(X = 0.2) P(Y = 5),
P(X = 9, Y = 7) = P(X = 9) P(Y = 7) etc.

So, X and Y are independent r.v.
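
The product test for independence can be checked cell by cell. The sketch below is illustrative only; the function name `is_independent` is an assumption, and the two tables are the independent table above and, for contrast, a table whose entries do not factor.

```python
from fractions import Fraction as F

def is_independent(table):
    """True when every cell equals (row total) x (column total)."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    return all(
        table[i][j] == rows[i] * cols[j]
        for i in range(len(rows))
        for j in range(len(cols))
    )

# Table of the Example above.
independent_table = [
    [F(9, 100), F(15, 100), F(6, 100)],
    [F(21, 100), F(35, 100), F(14, 100)],
]
# A table that fails the test: 0.10 differs from 0.40 x 0.40.
dependent_table = [
    [F(10, 100), F(25, 100), F(5, 100)],
    [F(30, 100), F(15, 100), F(15, 100)],
]

print(is_independent(independent_table), is_independent(dependent_table))
```

A single failing cell is enough to rule out independence, whereas every cell must pass to confirm it.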
Theorem. If X and Y are independent random variables and A, B are
two events, then $P\{X \in A, Y \in B\} = P(X \in A)P(Y \in B)$, and vice versa.

Proof.
$$P\{X \in A, Y \in B\} = \sum_{x_i \in A}\sum_{y_j \in B} p_{ij} = \sum_{x_i \in A}\sum_{y_j \in B} p_{X_i} p_{Y_j} \quad (\because X, Y \text{ are independent})$$
$$= \sum_{x_i \in A} p_{X_i} \sum_{y_j \in B} p_{Y_j} = P(X \in A)\,P(Y \in B).$$
The proof of the converse part is left to the reader.
Example. Let X and Y be two discrete random variables such that their pair
(X, Y) assumes the following bivariate data:

    X \ Y    3          5          8
    -1       (-1, 3)    (-1, 5)    (-1, 8)
    4        (4, 3)     (4, 5)     (4, 8)

Let the corresponding probabilities be

    X \ Y    3       5       8       Total
    -1       0.10    0.25    0.05    0.40
    4        0.30    0.15    0.15    0.60
    Total    0.40    0.40    0.20    1

Then according to the above distribution,
P(X = -1, Y = 8) = P{(-1, 8)} = 0.05 etc.
and P(X = -1) = 0.10 + 0.25 + 0.05 = 0.40 etc.
The marginal distributions of X and Y are respectively

    X         : -1      4
    $p_{X_i}$ : 0.40    0.60

and

    Y         : 3       5       8
    $p_{Y_j}$ : 0.40    0.40    0.20

These two random variables are not independent as we see $0.10 \ne 0.40 \times 0.40$
etc.
Example. An urn contains three red, two white and five blue balls. Three
balls are drawn from the urn. X and Y denote the number of red and white
balls in a draw. Find the joint distribution of (X, Y). Hence find
$P(X \le 2, Y \ge 1)$.
Find the marginal distribution of Y and hence find the probability of
drawing more than 1 white ball. Are X and Y independent r.v.?
Solution. X may take the values 0, 1, 2, 3 and Y may take the values 0,
1, 2.
The values of the bivariate (X, Y) are

    X \ Y    0         1         2
    0        (0, 0)    (0, 1)    (0, 2)
    1        (1, 0)    (1, 1)    (1, 2)
    2        (2, 0)    (2, 1)    (2, 2)
    3        (3, 0)    (3, 1)    (3, 2)

The corresponding probabilities are

P{(0,0)} = P(X = 0, Y = 0) = Probability of "no red, no white"
= Probability of "3 blue" $= \frac{{}^{5}C_{3}}{{}^{10}C_{3}} = \frac{10}{120} = \frac{1}{12}$

P{(0,1)} = P(X = 0, Y = 1) = Probability of "no red, 1 white"
= Probability of "1 white, 2 blue" $= \frac{{}^{2}C_{1} \times {}^{5}C_{2}}{{}^{10}C_{3}} = \frac{20}{120} = \frac{1}{6}$

P{(0,2)} = P(X = 0, Y = 2) = Probability of "2 white, 1 blue" $= \frac{{}^{2}C_{2} \times {}^{5}C_{1}}{{}^{10}C_{3}} = \frac{5}{120} = \frac{1}{24}$

P{(1,0)} = P(X = 1, Y = 0) = Probability of "1 red, no white"
= Probability of "1 red, 2 blue" $= \frac{{}^{3}C_{1} \times {}^{5}C_{2}}{{}^{10}C_{3}} = \frac{30}{120} = \frac{1}{4}$

P(1,1) = Probability of "1 red, 1 white, 1 blue" $= \frac{{}^{3}C_{1} \times {}^{2}C_{1} \times {}^{5}C_{1}}{120} = \frac{30}{120} = \frac{1}{4}$

P(1,2) = Probability of "1 red, 2 white, no blue" $= \frac{{}^{3}C_{1} \times {}^{2}C_{2}}{120} = \frac{3}{120} = \frac{1}{40}$

P(2,0) = Probability of "2 red, no white, 1 blue" $= \frac{{}^{3}C_{2} \times {}^{5}C_{1}}{120} = \frac{15}{120} = \frac{1}{8}$

P(2,1) = Probability of "2 red, 1 white, no blue" $= \frac{{}^{3}C_{2} \times {}^{2}C_{1}}{120} = \frac{6}{120} = \frac{1}{20}$

P{(2,2)} = Probability of "2 red, 2 white" $= P(\phi) = 0$
(∵ only 3 balls are drawn)

P(3,0) = Probability of "3 red" $= \frac{{}^{3}C_{3}}{120} = \frac{1}{120}$

P(3,1) = Probability of "3 red, 1 white" $= P(\phi) = 0$

P(3,2) = Probability of "3 red, 2 white" $= P(\phi) = 0$

∴ the joint distribution of (X, Y) is given by

    X \ Y    0        1       2       Total
    0        1/12     1/6     1/24    7/24
    1        1/4      1/4     1/40    21/40
    2        1/8      1/20    0       7/40
    3        1/120    0       0       1/120
    Total    7/15     7/15    1/15    1

In fact the above two tables present the joint distribution.
Now, $P(X \le 2, Y \ge 1)$
= P(2,1) + P(2,2) + P(1,1) + P(1,2) + P(0,1) + P(0,2)
$= \frac{1}{20} + 0 + \frac{1}{4} + \frac{1}{40} + \frac{1}{6} + \frac{1}{24} = \frac{8}{15}$
The marginal distribution of Y is given by

    Y         : 0       1       2
    $p_{Y_j}$ : 7/15    7/15    1/15

Probability of "more than 1 white ball"
$= P(Y > 1) = P(Y = 2) = \frac{1}{15}$
From the above table we see $\frac{1}{12} \ne \frac{7}{24} \times \frac{7}{15}$
∴ X, Y are not independent.
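
The whole joint table of the urn example can be regenerated from the counting rule used above. The following sketch assumes Python's `math.comb` for the combinations; the names `RED`, `WHITE`, `BLUE`, `DRAWN` and `p` are illustrative.

```python
from fractions import Fraction
from math import comb

RED, WHITE, BLUE, DRAWN = 3, 2, 5, 3
TOTAL = comb(RED + WHITE + BLUE, DRAWN)  # 10C3 = 120 equally likely draws

def p(x, y):
    """P(X = x, Y = y): x red and y white balls among the 3 drawn."""
    blue = DRAWN - x - y
    if blue < 0:
        return Fraction(0)  # impossible: more than 3 balls would be needed
    return Fraction(comb(RED, x) * comb(WHITE, y) * comb(BLUE, blue), TOTAL)

joint = {(x, y): p(x, y) for x in range(4) for y in range(3)}
p_x = {x: sum(joint[x, y] for y in range(3)) for x in range(4)}
p_y = {y: sum(joint[x, y] for x in range(4)) for y in range(3)}

prob_x_le2_y_ge1 = sum(v for (x, y), v in joint.items() if x <= 2 and y >= 1)
prob_more_than_one_white = p_y[2]
independent = all(joint[x, y] == p_x[x] * p_y[y] for (x, y) in joint)

print(prob_x_le2_y_ge1, prob_more_than_one_white, independent)
```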
5.4. The Multinomial Distribution
We can define a joint probability distribution for n random
variables in exactly the same manner as we did for 2 random variables.
One such joint distribution is the multinomial distribution.
It is defined below.

Let E be an experiment having r possible outcomes $s_1, s_2, \dots, s_r$
with respective probabilities $p_1, p_2, \dots, p_r$, where $p_1 + p_2 + \dots + p_r = 1$. That is,
its event space is $S = \{s_1, s_2, \dots, s_r\}$. A sequence of n independent
trials of E, i.e., the joint independent experiment $E^{n} = (E, E, \dots, E)$, is
considered, where the probabilities of $s_1, s_2, \dots$ remain the same throughout
the trials.
Let $X_1, X_2, \dots, X_r$ be the variables defined by
$X_1$ = number of times $s_1$ occurred in $E^{n}$
$X_2$ = number of times $s_2$ occurred in $E^{n}$
...
$X_r$ = number of times $s_r$ occurred in $E^{n}$.

Then $X_1, X_2, \dots, X_r$ assume the non-negative integral values
$n_1, n_2, \dots, n_r$, where $n_1 + n_2 + \dots + n_r = n$.
It can be shown that the joint probability is
$$P(X_1 = n_1, X_2 = n_2, \dots, X_r = n_r) = \frac{n!}{n_1!\, n_2! \cdots n_r!}\; p_1^{n_1} p_2^{n_2} \cdots p_r^{n_r}.$$
The joint distribution of $(X_1, X_2, \dots, X_r)$ given by this joint probability
is called the Multinomial Distribution.
Note that when r = 2 the multinomial distribution reduces to the binomial
distribution where $s_1$ = success, $s_2$ = failure.
Example. Suppose that an unbiased die is rolled 9 times. Here E is the
experiment of rolling the die. So we are considering 9 independent trials
of E. E has 6 possible outcomes '1', '2', ..., '6' with respective
probabilities $p_1 = \frac{1}{6}, p_2 = \frac{1}{6}, \dots, p_6 = \frac{1}{6}$.
Let $X_1$ = number of times 'face 1' occurred in 9 trials,
$X_2$ = number of times 'face 2' occurred in 9 trials,
...
$X_6$ = number of times 'face 6' occurred in 9 trials.

Then the possible values of each $X_i$ are 0, 1, ..., 9.
The multinomial distribution of the 6-variate $(X_1, X_2, X_3, \dots, X_6)$
is given by
$$P(X_1 = n_1, X_2 = n_2, \dots, X_6 = n_6) = \frac{9!}{n_1!\, n_2! \cdots n_6!}\left(\frac{1}{6}\right)^{n_1}\left(\frac{1}{6}\right)^{n_2}\cdots\left(\frac{1}{6}\right)^{n_6}$$
$$= \frac{9!}{n_1!\, n_2! \cdots n_6!}\cdot\frac{1}{6^{\,n_1 + n_2 + \dots + n_6}} = \frac{9!}{n_1!\, n_2! \cdots n_6!}\cdot\frac{1}{6^{9}} \quad \left[\text{where } \textstyle\sum n_i = 9\right]$$
where every $n_i$ is a non-negative integer and $n_1 + n_2 + \dots + n_6 = 9$.

For instance, the probability that 'face 1' appears three times, faces
2 and 3 twice each, faces 4 and 5 once each, and face 6 not at all
$$= P(X_1 = 3, X_2 = 2, X_3 = 2, X_4 = 1, X_5 = 1, X_6 = 0) = \frac{9!}{3!\,2!\,2!\,1!\,1!\,0!}\cdot\frac{1}{6^{9}} = \frac{9!}{3!\,2!\,2!}\cdot\frac{1}{6^{9}}.$$
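
The multinomial probability in this example can be evaluated exactly. The sketch below is a hedged illustration of the formula of this section; the function name `multinomial_pmf` is an assumption and not a library routine.

```python
from fractions import Fraction
from math import factorial

def multinomial_pmf(counts, probs):
    """n!/(n1! n2! ... nr!) * p1^n1 * ... * pr^nr, with n = sum of counts."""
    coeff = factorial(sum(counts))
    for c in counts:
        coeff //= factorial(c)  # each partial quotient is an exact integer
    prob = Fraction(coeff)
    for c, p in zip(counts, probs):
        prob *= Fraction(p) ** c
    return prob

# Face 1 three times, faces 2 and 3 twice each, faces 4 and 5 once each, face 6 never.
die = [Fraction(1, 6)] * 6
answer = multinomial_pmf([3, 2, 2, 1, 1, 0], die)
print(answer, float(answer))
```

With r = 2 the same function reproduces a binomial probability, in line with the note that the multinomial distribution reduces to the binomial one.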

5.5. Joint Distribution Function


Let (X, Y) be a bivariate having joint probability masses $p_{ij}$, $1 \le i \le m$;
$1 \le j \le n$.
Then the two-variable function F(x, y) defined by
$F(x, y) = P(-\infty < X \le x, -\infty < Y \le y)$ is called the joint distribution
function or the joint cumulative probability distribution of (X, Y).
The function $F_X(x) = P(-\infty < X \le x, -\infty < Y < \infty)$ is called the marginal
distribution function of X.
Similarly the marginal distribution function of Y is defined as
$F_Y(y) = P(-\infty < X < \infty, -\infty < Y \le y)$.
Example. In the urn example of Section 5.3 we find
$F(1, 2) = P(-\infty < X \le 1, -\infty < Y \le 2)$
= {P(1,0) + P(1,1) + P(1,2)} + {P(0,0) + P(0,1) + P(0,2)}
$= P(X = 1) + P(X = 0) = \frac{21}{40} + \frac{7}{24} = \frac{49}{60}$

Theorem. For any two independent random variables X and Y the joint
distribution function F(x, y) is $F(x, y) = F_X(x) F_Y(y)$.
Proof. The proof follows from the previous theorem.
5.6. Sum of independent Random Variables.
In this section we discuss the behaviour of the sum X + Y where X
and Y are independent random variables having well-known distributions.
Theorem 1. If X and Y are two independent random variables having
binomial distributions with parameters (m, p) and (n, p) respectively, then
their sum X + Y has a binomial distribution with parameters (m + n, p).
Proof. By hypothesis the distribution of X is

    X     : 0       1       2       ...    m
    $f_i$ : $f_0$   $f_1$   $f_2$   ...    $f_m$

where $f_i = P(X = i) = {}^{m}C_{i}\, p^{i} q^{m-i}$, and the distribution of Y is

    Y     : 0       1       2       ...    n
    $g_i$ : $g_0$   $g_1$   $g_2$   ...    $g_n$

where $g_i = P(Y = i) = {}^{n}C_{i}\, p^{i} q^{n-i}$.
Let Z = X + Y. Therefore Z assumes the values 0, 1, 2, ..., m + n.

The event (Z = i) = (X + Y = i)
$= (X = 0, Y = i) \cup (X = 1, Y = i - 1) \cup \dots \cup (X = i, Y = 0)$
∴ P(Z = i) = P(X = 0, Y = i) + P(X = 1, Y = i - 1) + ... + P(X = i, Y = 0)
= P(X = 0)P(Y = i) + P(X = 1)P(Y = i - 1) + ... + P(X = i)P(Y = 0)
∵ X and Y are independent
$$= \sum_{r=0}^{i} f_r\, g_{i-r} = \sum_{r=0}^{i} {}^{m}C_{r}\, p^{r} q^{m-r}\; {}^{n}C_{i-r}\, p^{i-r} q^{n-(i-r)}$$
$$= \sum_{r=0}^{i} p^{i} q^{m+n-i}\; {}^{m}C_{r}\; {}^{n}C_{i-r} = p^{i} q^{m+n-i} \sum_{r=0}^{i} {}^{m}C_{r}\; {}^{n}C_{i-r} \qquad (1)$$
Using the binomial theorem we have
$$(1 + x)^{m} = {}^{m}C_{0} + {}^{m}C_{1}x + {}^{m}C_{2}x^{2} + \dots + {}^{m}C_{i}x^{i} + \dots + {}^{m}C_{m}x^{m}$$
and
$$(1 + x)^{n} = {}^{n}C_{0} + {}^{n}C_{1}x + {}^{n}C_{2}x^{2} + \dots + {}^{n}C_{i}x^{i} + \dots + {}^{n}C_{n}x^{n}.$$
Multiplying these two we have
$$(1 + x)^{m+n} = \left({}^{m}C_{0} + {}^{m}C_{1}x + \dots + {}^{m}C_{i}x^{i} + \dots + {}^{m}C_{m}x^{m}\right) \times \left({}^{n}C_{0} + {}^{n}C_{1}x + \dots + {}^{n}C_{i}x^{i} + \dots + {}^{n}C_{n}x^{n}\right) \qquad (2)$$
On the RHS of (2) the coefficient of $x^{i}$ is
$$\sum_{r=0}^{i} {}^{m}C_{r}\; {}^{n}C_{i-r}.$$
On the LHS of (2) the (i + 1)th term is ${}^{m+n}C_{i}\, x^{i}$. Equating the coefficients
of $x^{i}$ on both sides of (2) we get
$$\sum_{r=0}^{i} {}^{m}C_{r}\; {}^{n}C_{i-r} = {}^{m+n}C_{i}.$$
From (1) we get $P(Z = i) = {}^{m+n}C_{i}\, p^{i} q^{m+n-i}$
∴ Z, i.e., X + Y, has a binomial distribution with parameters m + n and p.
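
Theorem 1 can be verified exactly for particular parameters by carrying out the sum in (1) directly. This is a minimal sketch under assumed values m = 3, n = 4, p = 1/3; the helper names `binomial_pmf` and `convolve` are illustrative.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, p):
    """List of P(X = k), k = 0..n, for a binomial (n, p) variate."""
    q = 1 - p
    return [comb(n, k) * p**k * q**(n - k) for k in range(n + 1)]

def convolve(f, g):
    """P(Z = i) = sum over r of f[r] * g[i - r] for independent X and Y."""
    h = [Fraction(0)] * (len(f) + len(g) - 1)
    for r, fr in enumerate(f):
        for s, gs in enumerate(g):
            h[r + s] += fr * gs
    return h

p = Fraction(1, 3)
m, n = 3, 4
sum_pmf = convolve(binomial_pmf(m, p), binomial_pmf(n, p))
print(sum_pmf == binomial_pmf(m + n, p))
```

The check relies on both variables sharing the same p; the theorem makes no claim when the success probabilities differ.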
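Theorem 2. If X and Y are two independent random variables having
Poisson distributions with parameters $\mu_1$ and $\mu_2$ respectively, then their
sum X + Y has a Poisson distribution with parameter $\mu_1 + \mu_2$.
Proof. The values of X and Y are respectively
X : 0, 1, 2, 3, 4, ...
Y : 0, 1, 2, 3, 4, ...
Let U = X + Y.
∴ U : 0, 1, 2, 3, ...
Now, for any integer k > 0,
$$P(U = k) = P(X + Y = k) = \sum_{i+j=k} P(X = i, Y = j)$$
$$= \sum_{i+j=k} P(X = i)\,P(Y = j) \quad (\because X \text{ and } Y \text{ are independent})$$
$$= \sum_{i+j=k} \frac{e^{-\mu_1}\mu_1^{\,i}}{i!}\cdot\frac{e^{-\mu_2}\mu_2^{\,j}}{j!} = e^{-(\mu_1+\mu_2)} \sum_{i+j=k} \frac{\mu_1^{\,i}\,\mu_2^{\,j}}{i!\,j!}$$
$$= \frac{e^{-(\mu_1+\mu_2)}}{k!} \sum_{i=0}^{k} \frac{k!}{i!\,(k-i)!}\,\mu_1^{\,i}\,\mu_2^{\,k-i} = \frac{e^{-(\mu_1+\mu_2)}\,(\mu_1+\mu_2)^{k}}{k!} \quad \text{(by the binomial theorem).}$$
Also,
$$P(U = 0) = P(X = 0, Y = 0) = P(X = 0)\,P(Y = 0) = e^{-\mu_1}\cdot e^{-\mu_2} = e^{-(\mu_1+\mu_2)} = \frac{e^{-(\mu_1+\mu_2)}}{0!}\,(\mu_2 + \mu_1)^{0}.$$
Thus for any integer $k \ge 0$,
$$P(U = k) = \frac{e^{-(\mu_1+\mu_2)}\,(\mu_1+\mu_2)^{k}}{k!}$$
or,
$$P(X + Y = k) = \frac{e^{-(\mu_1+\mu_2)}\,(\mu_1+\mu_2)^{k}}{k!}.$$
∴ X + Y has a Poisson distribution with parameter $\mu_1 + \mu_2$.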
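
A numerical check of Theorem 2 follows the same pattern as the proof: sum P(X = i)P(Y = k - i) over i and compare with the Poisson mass of mean $\mu_1 + \mu_2$. The sketch below assumes the illustrative values $\mu_1 = 1.5$ and $\mu_2 = 2.5$ and compares in floating point, so agreement is only up to rounding.

```python
from math import exp, factorial, isclose

def poisson_pmf(mu, k):
    """P(X = k) for a Poisson variate with mean mu."""
    return exp(-mu) * mu**k / factorial(k)

def pmf_of_sum(mu1, mu2, k):
    """P(X + Y = k) for independent Poisson X and Y, summed term by term."""
    return sum(poisson_pmf(mu1, i) * poisson_pmf(mu2, k - i) for i in range(k + 1))

mu1, mu2 = 1.5, 2.5
agree = all(
    isclose(pmf_of_sum(mu1, mu2, k), poisson_pmf(mu1 + mu2, k), rel_tol=1e-9)
    for k in range(10)
)
print(agree)
```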
5.7. Bivariate Expectation
Let (X, Y) be a bivariate having joint probability mass
$p_{ij} = P(X = x_i, Y = y_j)$ and marginal probability masses $p_{X_i}$ and $p_{Y_j}$.

Then the expectation of the

(i) two-variable function $\phi(X, Y)$ is
$$E(\phi(X, Y)) = \sum_{i}\sum_{j} p_{ij}\,\phi(x_i, y_j)$$

(ii) one-variable function $\phi(X)$ is
$$E(\phi(X)) = \sum_{i} \phi(x_i)\,p_{X_i}$$

(iii) one-variable function $\phi(Y)$ is
$$E(\phi(Y)) = \sum_{j} \phi(y_j)\,p_{Y_j}$$

If the series concerned are infinite series, then the expectations exist
provided the infinite series are absolutely convergent.
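Example. For the joint distribution of (X, Y),

    X \ Y    -1      2       3       Total
    2        0.09    0.15    0.06    0.30
    5        0.21    0.35    0.14    0.70
    Total    0.30    0.50    0.20    1

find $E(X^{2}Y)$, $E(X^{2})$ and E(Y + 2).

Solution. For convenience we re-write the given table with notation.

    X \ Y        $y_1 = -1$          $y_2 = 2$           $y_3 = 3$           Total
    $x_1 = 2$    $p_{11} = 0.09$     $p_{12} = 0.15$     $p_{13} = 0.06$     $p_{X_1} = 0.30$
    $x_2 = 5$    $p_{21} = 0.21$     $p_{22} = 0.35$     $p_{23} = 0.14$     $p_{X_2} = 0.70$
    Total        $p_{Y_1} = 0.30$    $p_{Y_2} = 0.50$    $p_{Y_3} = 0.20$    1

$$E(X^{2}Y) = \sum_{i=1}^{2}\sum_{j=1}^{3} x_i^{2}\, y_j\, p_{ij} = \sum_{i=1}^{2}\left(x_i^{2} y_1 p_{i1} + x_i^{2} y_2 p_{i2} + x_i^{2} y_3 p_{i3}\right)$$
$$= \left(x_1^{2} y_1 p_{11} + x_1^{2} y_2 p_{12} + x_1^{2} y_3 p_{13}\right) + \left(x_2^{2} y_1 p_{21} + x_2^{2} y_2 p_{22} + x_2^{2} y_3 p_{23}\right)$$
$$= x_1^{2}\left(y_1 p_{11} + y_2 p_{12} + y_3 p_{13}\right) + x_2^{2}\left(y_1 p_{21} + y_2 p_{22} + y_3 p_{23}\right)$$
$$= 2^{2}(-1 \times 0.09 + 2 \times 0.15 + 3 \times 0.06) + 5^{2}(-1 \times 0.21 + 2 \times 0.35 + 3 \times 0.14)$$
$$= 1.56 + 22.75 = 24.31$$

$$E(X^{2}) = \sum_{i=1}^{2} x_i^{2}\, p_{X_i} = x_1^{2} p_{X_1} + x_2^{2} p_{X_2} = 2^{2} \times 0.30 + 5^{2} \times 0.70 = 18.7$$

$$E(Y + 2) = \sum_{j=1}^{3} (y_j + 2)\, p_{Y_j} = (y_1 + 2)p_{Y_1} + (y_2 + 2)p_{Y_2} + (y_3 + 2)p_{Y_3}$$
$$= (-1 + 2) \times 0.30 + (2 + 2) \times 0.50 + (3 + 2) \times 0.20 = 3.3$$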
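
The three expectations of this example are all instances of the double sum in (i) of Section 5.7. A minimal sketch, with the hypothetical helper `expectation` taking the function $\phi$ as an argument:

```python
from fractions import Fraction as F

x_values = [2, 5]
y_values = [-1, 2, 3]
joint = [
    [F(9, 100), F(15, 100), F(6, 100)],
    [F(21, 100), F(35, 100), F(14, 100)],
]

def expectation(phi):
    """E(phi(X, Y)) = sum over i, j of phi(x_i, y_j) * p_ij."""
    return sum(
        phi(x, y) * joint[i][j]
        for i, x in enumerate(x_values)
        for j, y in enumerate(y_values)
    )

e_x2y = expectation(lambda x, y: x * x * y)
e_x2 = expectation(lambda x, y: x * x)
e_y_plus_2 = expectation(lambda x, y: y + 2)
print(float(e_x2y), float(e_x2), float(e_y_plus_2))
```

A function of X alone is handled by the same double sum, because summing $p_{ij}$ over j gives the marginal mass of $x_i$.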



5.8. Theorems on Expectation


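Let (X, Y) be a bivariate with joint probability $P(X = x_i, Y = y_j) = p_{ij}$.
We get the following results.
Theorem 1. For a joint distribution of (X, Y), $E(X \pm Y) = E(X) \pm E(Y)$.
Proof.
$$E(X \pm Y) = \sum_{i=1}^{m}\sum_{j=1}^{n} (x_i \pm y_j)\,p_{ij} = \sum_{i=1}^{m}\sum_{j=1}^{n} (x_i p_{ij} \pm y_j p_{ij})$$
$$= \sum_{i=1}^{m}\sum_{j=1}^{n} x_i p_{ij} \pm \sum_{i=1}^{m}\sum_{j=1}^{n} y_j p_{ij} = \sum_{i=1}^{m} x_i \sum_{j=1}^{n} p_{ij} \pm \sum_{j=1}^{n}\sum_{i=1}^{m} y_j p_{ij}$$
$$= \sum_{i=1}^{m} x_i\, p_{X_i} \pm \sum_{j=1}^{n} y_j \sum_{i=1}^{m} p_{ij} \quad \text{(by definition of } p_{X_i})$$
$$= \sum_{i=1}^{m} x_i\, p_{X_i} \pm \sum_{j=1}^{n} y_j\, p_{Y_j} \quad \text{(by definition of } p_{Y_j})$$
$$= E(X) \pm E(Y)$$
Note. This theorem can be extended to more than two variables, e.g.
E(X + Y + Z + W) = E(X) + E(Y) + E(Z) + E(W)
Theorem 2. For a joint distribution of (X, Y),
E(XY) = E(X)E(Y) provided X and Y are independent random
variables.
Proof.
$$E(XY) = \sum_{i=1}^{m}\sum_{j=1}^{n} x_i y_j\, p_{ij} = \sum_{i=1}^{m}\sum_{j=1}^{n} x_i y_j\, p_{X_i} p_{Y_j} \quad (\because X \text{ and } Y \text{ are independent})$$
$$= \sum_{i=1}^{m} x_i\, p_{X_i} \sum_{j=1}^{n} y_j\, p_{Y_j} = E(X)\,E(Y)$$
Note. The converse of this theorem is not true. See a subsequent Example.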

Theorem 3. $E(k\,\phi(X, Y)) = k\,E(\phi(X, Y))$
Proof. Left to the readers.

Theorem 4. For a joint distribution of (X, Y), E(k) = k where k is a
constant.
Proof. Let $\phi(X, Y) = k$.
Then
$$E(\phi(X, Y)) = \sum_{i}\sum_{j} \phi(x_i, y_j)\, p_{ij} = \sum_{i}\sum_{j} k\, p_{ij} = k \sum_{i}\sum_{j} p_{ij}$$
$$= k \times 1 \quad \text{(by the property of the joint probability mass } p_{ij})$$
$$= k.$$
5.9. Covariance of two Variables
If X and Y are two random variables, then the covariance of X and Y is
defined as
$$\mathrm{Cov}(X, Y) = E\{(X - \bar{X})(Y - \bar{Y})\}$$
where $\bar{X}$ and $\bar{Y}$ are the means of X and Y respectively. In practice we
use the following result on covariance.
Theorem 1. Cov(X, Y) = E(XY) - E(X)E(Y)

Proof.
$$\mathrm{Cov}(X, Y) = E\{(X - \bar{X})(Y - \bar{Y})\} = E(XY - X\bar{Y} - \bar{X}Y + \bar{X}\bar{Y})$$
$$= E(XY) - E(X\bar{Y}) - E(\bar{X}Y) + E(\bar{X}\bar{Y})$$
$$= E(XY) - \bar{Y}E(X) - \bar{X}E(Y) + \bar{X}\bar{Y} \quad (\because \bar{X} \text{ and } \bar{Y} \text{ are constants})$$
$$= E(XY) - \bar{Y}\bar{X} - \bar{X}\bar{Y} + \bar{X}\bar{Y} \quad [\because \bar{X} = E(X),\ \bar{Y} = E(Y)]$$
$$= E(XY) - \bar{X}\bar{Y} = E(XY) - E(X)E(Y)$$
Theorem 2. If X and Y are independent r.v., then their covariance is 0.

Proof. Using the previous theorem,
Cov(X, Y) = E(XY) - E(X)E(Y)
= E(X)E(Y) - E(X)E(Y)   (∵ X, Y are independent)
= 0
5.10. Correlation Coefficient between two Variables
The correlation coefficient between two random variables X and Y is
defined as
$$\rho_{XY} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X\, \sigma_Y}$$
where $\sigma_X$ and $\sigma_Y$ are the standard deviations of X and Y respectively.
The degree of association or the strength of relationship between X
and Y is measured by the correlation coefficient.
If the probabilities with which X assumes its values are not affected by those
of Y, then we shall get $\rho_{XY} = 0$.
Before we go for further study on correlation we discuss the following
examples.
Example. Find the covariance and correlation coefficient between the
variables X and Y when the joint distribution is given by

    X \ Y    0      1      2
    0        0.1    0.4    0.1
    1        0.2    0.2    0

Solution. For convenience we re-write the table using full notation.

    X \ Y        $y_1 = 0$          $y_2 = 1$          $y_3 = 2$          Total
    $x_1 = 0$    $p_{11} = 0.1$     $p_{12} = 0.4$     $p_{13} = 0.1$     $p_{X_1} = 0.6$
    $x_2 = 1$    $p_{21} = 0.2$     $p_{22} = 0.2$     $p_{23} = 0$       $p_{X_2} = 0.4$
    Total        $p_{Y_1} = 0.3$    $p_{Y_2} = 0.6$    $p_{Y_3} = 0.1$    1

Now,
$$E(XY) = \sum_{i=1}^{2}\sum_{j=1}^{3} x_i y_j\, p_{ij} = \sum_{i=1}^{2}\left(x_i y_1 p_{i1} + x_i y_2 p_{i2} + x_i y_3 p_{i3}\right)$$
$$= y_1\sum_{i=1}^{2} x_i p_{i1} + y_2\sum_{i=1}^{2} x_i p_{i2} + y_3\sum_{i=1}^{2} x_i p_{i3}$$
$$= 0 \times \sum_{i=1}^{2} x_i p_{i1} + 1 \times (x_1 p_{12} + x_2 p_{22}) + 2\,(x_1 p_{13} + x_2 p_{23})$$
$$= (0 \times 0.4 + 1 \times 0.2) + 2\,(0 \times 0.1 + 1 \times 0) = 0.2$$

$$E(X) = \sum_{i=1}^{2} x_i\, p_{X_i} = x_1 p_{X_1} + x_2 p_{X_2} = 0 \times 0.6 + 1 \times 0.4 = 0.4$$

$$E(Y) = \sum_{j=1}^{3} y_j\, p_{Y_j} = y_1 p_{Y_1} + y_2 p_{Y_2} + y_3 p_{Y_3} = 0 \times 0.3 + 1 \times 0.6 + 2 \times 0.1 = 0.6 + 0.2 = 0.8$$

∴ Cov(X, Y) = E(XY) - E(X)E(Y) = 0.2 - 0.4 x 0.8 = -0.12

Now,
$$E(X^{2}) = \sum_{i=1}^{2} x_i^{2}\, p_{X_i} = x_1^{2} p_{X_1} + x_2^{2} p_{X_2} = 0^{2} \times 0.6 + 1^{2} \times 0.4 = 0.4$$
$$E(Y^{2}) = \sum_{j=1}^{3} y_j^{2}\, p_{Y_j} = y_1^{2} p_{Y_1} + y_2^{2} p_{Y_2} + y_3^{2} p_{Y_3} = 0^{2} \times 0.3 + 1^{2} \times 0.6 + 2^{2} \times 0.1 = 1$$
∴ the standard deviation of X is
$$\sigma_X = \sqrt{E(X^{2}) - \{E(X)\}^{2}} = \sqrt{0.4 - (0.4)^{2}} = 0.490$$
and
$$\sigma_Y = \sqrt{E(Y^{2}) - \{E(Y)\}^{2}} = \sqrt{1 - (0.8)^{2}} = 0.6$$
∴ the correlation coefficient is
$$\rho_{XY} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X\, \sigma_Y} = \frac{-0.12}{0.490 \times 0.6} = \frac{-0.12}{0.294} = -0.408$$
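
The covariance and correlation coefficient of this example can be recomputed from the joint table. The sketch is illustrative; `expect` is an assumed helper name, and exact fractions are used up to the final square root.

```python
from fractions import Fraction as F
from math import sqrt

# Joint table of the example: keys are (x, y), values are p_ij.
joint = {
    (0, 0): F(1, 10), (0, 1): F(4, 10), (0, 2): F(1, 10),
    (1, 0): F(2, 10), (1, 1): F(2, 10), (1, 2): F(0),
}

def expect(g):
    """E(g(X, Y)) as a double sum over the joint table."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

e_x, e_y = expect(lambda x, y: x), expect(lambda x, y: y)
cov = expect(lambda x, y: x * y) - e_x * e_y
var_x = expect(lambda x, y: x * x) - e_x**2
var_y = expect(lambda x, y: y * y) - e_y**2
rho = float(cov) / sqrt(var_x * var_y)

print(cov, round(rho, 3))
```

The negative sign reflects that larger values of X go with smaller values of Y in this table.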

5.11. Properties of Correlation Coefficients


The following interesting and useful properties are obtained for the correlation
coefficient between X and Y. The bivariate distribution of (X, Y) is given
by the joint probability mass $p_{ij} = P(X = x_i, Y = y_j)$.
Property 1. $\rho_{XY} = \rho_{YX}$
Proof. Obvious.
Property 2. $\rho_{XY}$ has no unit; it is a pure number.
Proof. Since Cov(X, Y) and $\sigma_X \sigma_Y$ have the same unit,
∴ the ratio $\dfrac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$ is a pure number.
Property 3. If (U, V) is another bivariate such that U = aX + b and
V = cY + d, then $\rho_{UV} = \dfrac{ac}{|a||c|}\,\rho_{XY}$, where a, b, c, d are constants.

Proof. Let (X, Y) assume the values $(x_i, y_j)$. So if (U, V) assumes
the values $(u_i, v_j)$ then
$u_i = a x_i + b$ and $v_j = c y_j + d$.

Moreover $P(U = u_i, V = v_j) = P(X = x_i, Y = y_j)$
∴ the joint probability masses of (U, V) would also be $p_{ij}$.
$$\mathrm{Var}(U) = E\{(U - \bar{U})^{2}\} = E\{(aX + b - \overline{aX + b})^{2}\}$$
$$= E\{(aX + b - a\bar{X} - b)^{2}\} \quad \text{(using the property of the mean)}$$
$$= E\{a^{2}(X - \bar{X})^{2}\} = a^{2} E\{(X - \bar{X})^{2}\} = a^{2}\,\mathrm{Var}(X)$$
∴ $\sigma_U = |a|\,\sigma_X$. Similarly $\sigma_V = |c|\,\sigma_Y$.
Again
$$\mathrm{Cov}(U, V) = E\{(U - \bar{U})(V - \bar{V})\} = E\{(aX + b - \overline{aX + b})(cY + d - \overline{cY + d})\}$$
$$= E\{(aX + b - a\bar{X} - b)(cY + d - c\bar{Y} - d)\}$$
$$= E\{ac\,(X - \bar{X})(Y - \bar{Y})\} = ac\,E\{(X - \bar{X})(Y - \bar{Y})\} = ac\,\mathrm{Cov}(X, Y)$$
Then
$$\rho_{UV} = \frac{\mathrm{Cov}(U, V)}{\sigma_U\, \sigma_V} = \frac{ac\,\mathrm{Cov}(X, Y)}{|a|\,\sigma_X\,|c|\,\sigma_Y} = \frac{ac}{|a||c|}\,\rho_{XY}$$
Property 4. If U = aX + b, V = cY + d then
(i) $\rho_{UV} = \rho_{XY}$ if a and c are of the same sign
(ii) $\rho_{UV} = -\rho_{XY}$ if a and c are of opposite signs.
Proof. By the previous property $\rho_{UV} = \dfrac{ac}{|a||c|}\,\rho_{XY} = \dfrac{ac}{|ac|}\,\rho_{XY}$
(i) Let a and c be of the same sign. Then ac > 0
∴ |ac| = ac. Then $\rho_{UV} = \dfrac{ac}{ac}\,\rho_{XY} = \rho_{XY}$.
(ii) Let a and c be of opposite signs. Then ac < 0
∴ |ac| = -ac. Then $\rho_{UV} = \dfrac{ac}{-ac}\,\rho_{XY} = -\rho_{XY}$.
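Property 5. $-1 \le \rho_{XY} \le 1$
Proof. We construct two variables U and V by the rule
$$U = \frac{X - \bar{X}}{\sigma_X} \quad \text{and} \quad V = \frac{Y - \bar{Y}}{\sigma_Y}$$
Then $U = \dfrac{1}{\sigma_X}X - \dfrac{\bar{X}}{\sigma_X}$ and $V = \dfrac{1}{\sigma_Y}Y - \dfrac{\bar{Y}}{\sigma_Y}$, so that
$\bar{U} = \dfrac{\bar{X}}{\sigma_X} - \dfrac{\bar{X}}{\sigma_X} = 0$ and similarly $\bar{V} = 0$.
$$\therefore\ \mathrm{Cov}(U, V) = E\{(U - \bar{U})(V - \bar{V})\} = E\left\{\left(\frac{X - \bar{X}}{\sigma_X}\right)\left(\frac{Y - \bar{Y}}{\sigma_Y}\right)\right\}$$
$$= \frac{1}{\sigma_X \sigma_Y}\,E\{(X - \bar{X})(Y - \bar{Y})\} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} = \rho_{XY}$$
∴ $\mathrm{Cov}(U, V) = \rho_{XY}$   (1)

Again
$$E(U^{2}) = E\left\{\left(\frac{X - \bar{X}}{\sigma_X}\right)^{2}\right\} = \frac{1}{\sigma_X^{2}}\,E\{(X - \bar{X})^{2}\} = \frac{1}{\sigma_X^{2}}\,\sigma_X^{2} = 1$$
Similarly $E(V^{2}) = 1$.

Now, $(U + V)^{2} \ge 0$
or, $U^{2} + 2UV + V^{2} \ge 0$
∴ $E(U^{2} + 2UV + V^{2}) \ge 0$
or, $E(U^{2}) + 2E(UV) + E(V^{2}) \ge 0$
or, $1 + 2E(UV) + 1 \ge 0$
or, $2E(UV) \ge -2$
or, $E(UV) \ge -1$   (2)

Now, since $\bar{U} = 0$ and $\bar{V} = 0$,
$\mathrm{Cov}(U, V) = E\{(U - \bar{U})(V - \bar{V})\} = E(UV)$   (3)

∴ from (2) and (3), $\mathrm{Cov}(U, V) \ge -1$
or, $\rho_{XY} \ge -1$ using (1).

Again $(U - V)^{2} \ge 0$ or, $U^{2} - 2UV + V^{2} \ge 0$
or, $E(U^{2} - 2UV + V^{2}) \ge 0$
or, $E(U^{2}) - 2E(UV) + E(V^{2}) \ge 0$
or, $1 - 2E(UV) + 1 \ge 0$
or, $E(UV) \le \dfrac{2}{2} = 1$
or, $\mathrm{Cov}(U, V) \le 1$ using (3)
or, $\rho_{XY} \le 1$ using (1)

Thus $-1 \le \rho_{XY} \le 1$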

Property 6. If the r.v. X and Y are independent then $\rho_{XY} = 0$,
i.e., X and Y are uncorrelated.
Proof. Since X and Y are independent, E(XY) = E(X)E(Y)
∴ Cov(X, Y) = E(XY) - E(X)E(Y) = E(X)E(Y) - E(X)E(Y) = 0
∴ $\rho_{XY} = \dfrac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} = \dfrac{0}{\sigma_X \sigma_Y} = 0$.
The converse of this property is not true:
if $\rho_{XY} = 0$, i.e., if X and Y are uncorrelated, then X and Y may not be
independent.
This is shown by the following example. Let the bivariate (X, Y) have
the following joint distribution:

    X \ Y        4      1      0      1      4      $p_{X_i}$
    -2           1/5    0      0      0      0      1/5
    -1           0      1/5    0      0      0      1/5
    0            0      0      1/5    0      0      1/5
    1            0      0      0      1/5    0      1/5
    2            0      0      0      0      1/5    1/5
    $p_{Y_j}$    1/5    1/5    1/5    1/5    1/5    1

Here
$$E(XY) = \sum_{i=1}^{5}\sum_{j=1}^{5} x_i y_j\, p_{ij} = \sum_{i=1}^{5} x_i\left(y_1 p_{i1} + y_2 p_{i2} + \dots + y_5 p_{i5}\right)$$
$$= x_1\left(y_1 \times \tfrac{1}{5} + y_2 \times 0 + \dots + y_5 \times 0\right) + \dots + x_5\left(y_1 \times 0 + y_2 \times 0 + \dots + y_5 \times \tfrac{1}{5}\right)$$
$$= \frac{1}{5}\left(x_1 y_1 + x_2 y_2 + \dots + x_5 y_5\right)$$
$$= \frac{1}{5}\left(-2 \times 4 + (-1) \times 1 + 0 \times 0 + 1 \times 1 + 2 \times 4\right) = \frac{1}{5}(-8 - 1 + 1 + 8) = 0$$
$$E(X) = \sum_{i=1}^{5} x_i\, p_{X_i} = -2 \times \frac{1}{5} + (-1) \times \frac{1}{5} + 0 \times \frac{1}{5} + 1 \times \frac{1}{5} + 2 \times \frac{1}{5} = 0$$
$$E(Y) = \sum_{j=1}^{5} y_j\, p_{Y_j} = 4 \times \frac{1}{5} + 1 \times \frac{1}{5} + 0 \times \frac{1}{5} + 1 \times \frac{1}{5} + 4 \times \frac{1}{5} = 2$$
Thus Cov(X, Y) = E(XY) - E(X)E(Y) = 0 - 0 x 2 = 0

∴ $\rho_{XY} = \dfrac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} = \dfrac{0}{\sigma_X \sigma_Y} = 0$

But we see $p_{X_1} \times p_{Y_1} = \dfrac{1}{5} \times \dfrac{1}{5} = \dfrac{1}{25} \ne p_{11}$
i.e., X and Y are not independent.

Note. In the above example we see E(XY) = E(X)E(Y) but X and Y
are not independent.
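
The counterexample can be reproduced directly, since in the table Y always equals $X^{2}$. The sketch below is an illustration under that reading; it merges the repeated Y-values 1 and 4 into single values, so the marginal mass of Y = 4 is 2/5 rather than 1/5, and the conclusion is unchanged.

```python
from fractions import Fraction as F

# X is uniform on {-2, -1, 0, 1, 2} and Y = X^2, as in the table above.
x_values = [-2, -1, 0, 1, 2]
joint = {(x, x * x): F(1, 5) for x in x_values}

def expect(g):
    """E(g(X, Y)) over the joint table."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

cov = expect(lambda x, y: x * y) - expect(lambda x, y: x) * expect(lambda x, y: y)

p_x_minus2 = sum(p for (x, y), p in joint.items() if x == -2)
p_y_4 = sum(p for (x, y), p in joint.items() if y == 4)
factorises = joint[(-2, 4)] == p_x_minus2 * p_y_4

print(cov, factorises)
```

Zero covariance only rules out a linear relationship; here Y is completely determined by X, yet the covariance vanishes by symmetry.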
5.12. Variance of Sum of two Variables.
In this section we shall see that the variance of the sum X + Y or of the
difference X - Y depends not only on the individual variances of X and Y
but also on their covariance.
For a bivariate (X, Y) having joint probability mass $p_{ij}$ we get the
following theorems.
Theorem 1. Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y)
Proof. Let Z = X + Y
Then E(Z) = E(X + Y) = E(X) + E(Y)
$E(Z^{2}) = E((X + Y)^{2}) = E(X^{2} + 2XY + Y^{2}) = E(X^{2}) + 2E(XY) + E(Y^{2})$

∴ $\mathrm{Var}(Z) = E(Z^{2}) - \{E(Z)\}^{2}$
$= E(X^{2}) + 2E(XY) + E(Y^{2}) - \{E(X) + E(Y)\}^{2}$
$= \left[E(X^{2}) - \{E(X)\}^{2}\right] + \left[E(Y^{2}) - \{E(Y)\}^{2}\right] + 2E(XY) - 2E(X)E(Y)$
$= \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\{E(XY) - E(X)E(Y)\}$

or, Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y)
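
The formula of Theorem 1, and the corresponding formula for X - Y, can be checked on a small table. This sketch reuses the joint table of the example in Section 5.10 as assumed input and computes the variance of the sum and of the difference directly from their definitions; `expect` and `var` are hypothetical helper names.

```python
from fractions import Fraction as F

joint = {
    (0, 0): F(1, 10), (0, 1): F(4, 10), (0, 2): F(1, 10),
    (1, 0): F(2, 10), (1, 1): F(2, 10), (1, 2): F(0),
}

def expect(g):
    """E(g(X, Y)) over the joint table."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

def var(g):
    """Var(g(X, Y)) = E(g^2) - (E(g))^2."""
    return expect(lambda x, y: g(x, y) ** 2) - expect(g) ** 2

var_x = var(lambda x, y: x)
var_y = var(lambda x, y: y)
cov = expect(lambda x, y: x * y) - expect(lambda x, y: x) * expect(lambda x, y: y)

var_sum = var(lambda x, y: x + y)    # computed directly from Z = X + Y
var_diff = var(lambda x, y: x - y)   # computed directly from Z = X - Y
print(var_sum == var_x + var_y + 2 * cov, var_diff == var_x + var_y - 2 * cov)
```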


Theorem 2. Var(X - Y) = Var(X) + Var(Y) - 2Cov(X, Y)

Proof. Let Z = X - Y
∴ E(Z) = E(X - Y) = E(X) - E(Y)
$E(Z^{2}) = E((X - Y)^{2}) = E(X^{2} - 2XY + Y^{2}) = E(X^{2}) - 2E(XY) + E(Y^{2})$

∴ $\mathrm{Var}(Z) = E(Z^{2}) - \{E(Z)\}^{2}$
$= E(X^{2}) - 2E(XY) + E(Y^{2}) - \{E(X) - E(Y)\}^{2}$
$= E(X^{2}) - 2E(XY) + E(Y^{2}) - \left\{(E(X))^{2} + (E(Y))^{2} - 2E(X)E(Y)\right\}$
$= E(X^{2}) - \{E(X)\}^{2} + E(Y^{2}) - \{E(Y)\}^{2} - 2\{E(XY) - E(X)E(Y)\}$

or, Var(X - Y) = Var(X) + Var(Y) - 2Cov(X, Y)


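Theorem 3. For any two constants a and b,
$\mathrm{Var}(aX + bY) = a^{2}\mathrm{Var}(X) + b^{2}\mathrm{Var}(Y) + 2ab\,\mathrm{Cov}(X, Y)$
Proof. This is a mere generalization of the previous two theorems. We
leave the proof to the reader as an exercise.
Corollary. If X and Y are independent then
$\mathrm{Var}(aX + bY) = a^{2}\mathrm{Var}(X) + b^{2}\mathrm{Var}(Y)$

Theorem 4. If $X_1, X_2, \dots, X_n$ are n random variables then
$$\mathrm{Var}(X_1 + X_2 + \dots + X_n) = \sum_{i=1}^{n} \mathrm{Var}(X_i) + 2\sum_{i<j}\sum \mathrm{Cov}(X_i, X_j)$$
More generally,
$$\mathrm{Var}(a_1 X_1 + a_2 X_2 + \dots + a_n X_n) = \sum_{i=1}^{n} a_i^{2}\,\mathrm{Var}(X_i) + 2\sum_{i<j}\sum a_i a_j\,\mathrm{Cov}(X_i, X_j)$$
Corollary. If they are pairwise independent then
$\mathrm{Cov}(X_i, X_j) = 0\ \ \forall\ i \ne j$. Then
$$\mathrm{Var}(X_1 + X_2 + \dots + X_n) = \sum \mathrm{Var}(X_i)$$
Theorem 5. If the random variables $X_1, X_2, \dots, X_n$ are independent and
each has the same variance $\sigma^{2}$, then $\mathrm{Var}(\bar{X}) = \dfrac{\sigma^{2}}{n}$, where $\bar{X}$ is the mean of the $X_i$'s.
Proof. $\bar{X} = \dfrac{1}{n}(X_1 + X_2 + \dots + X_n)$
$$\therefore\ \mathrm{Var}(\bar{X}) = \mathrm{Var}\left\{\frac{1}{n}(X_1 + X_2 + \dots + X_n)\right\} = \frac{1}{n^{2}}\,\mathrm{Var}(X_1 + X_2 + \dots + X_n)$$
$$= \frac{1}{n^{2}}\left(\sigma^{2} + \sigma^{2} + \dots + \sigma^{2}\right) = \frac{1}{n^{2}}\cdot n\sigma^{2} = \frac{\sigma^{2}}{n}$$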

5.13. Illustrative Examples.


Example 1. In a joint distribution of (X, Y) the marginal distributions of X
and Y are respectively

    X         : 5      7
    $p_{X_i}$ : 1/2    1/2

and

    Y         : 3      6
    $p_{Y_j}$ : 1/3    2/3

If $\mathrm{Cov}(X, Y) = -\dfrac{1}{2}$,

(i) Construct the joint distribution

(ii) Are X and Y independent?

(iii) Find P(X = 5, Y = 6) and P(X > Y)

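Solution. (i) Let the joint distribution be

    X \ Y        3           6           $p_{X_i}$
    5            $p_{11}$    $p_{12}$    1/2
    7            $p_{21}$    $p_{22}$    1/2
    $p_{Y_j}$    1/3         2/3

Let $p_{11} = x$. ∴ $p_{12} = \dfrac{1}{2} - x$,
$p_{21} = \dfrac{1}{3} - x$ and $p_{22} = \dfrac{2}{3} - p_{12} = \dfrac{2}{3} - \left(\dfrac{1}{2} - x\right) = \dfrac{1}{6} + x$
Now, $E(X) = 5 \times \dfrac{1}{2} + 7 \times \dfrac{1}{2} = 6$
$E(Y) = 3 \times \dfrac{1}{3} + 6 \times \dfrac{2}{3} = 5$
$E(XY) = 5 \times 3 \times p_{11} + 5 \times 6 \times p_{12} + 7 \times 3 \times p_{21} + 7 \times 6 \times p_{22}$
$= 15x + 30\left(\dfrac{1}{2} - x\right) + 21\left(\dfrac{1}{3} - x\right) + 42\left(\dfrac{1}{6} + x\right) = 6x + 29$

Now, Cov(X, Y) = E(XY) - E(X)E(Y)
or, $-\dfrac{1}{2} = 6x + 29 - 6 \times 5$ or, $x = \dfrac{1}{12}$
∴ $p_{11} = \dfrac{1}{12}$, $p_{12} = \dfrac{1}{2} - \dfrac{1}{12} = \dfrac{5}{12}$, $p_{21} = \dfrac{1}{3} - \dfrac{1}{12} = \dfrac{1}{4}$, $p_{22} = \dfrac{1}{6} + \dfrac{1}{12} = \dfrac{1}{4}$
∴ the joint distribution of (X, Y) is

    X \ Y                3       6       Total ($p_{X_i}$)
    5                    1/12    5/12    1/2
    7                    1/4     1/4     1/2
    Total ($p_{Y_j}$)    1/3     2/3     1

(ii) We see $p_{11} = \dfrac{1}{12}$ but $p_{X_1} \times p_{Y_1} = \dfrac{1}{2} \times \dfrac{1}{3} = \dfrac{1}{6}$
∴ $p_{11} \ne p_{X_1} \cdot p_{Y_1}$
∴ X and Y are not independent.

(iii) $P(X = 5, Y = 6) = P\{(5, 6)\} = \dfrac{5}{12}$

$P(X > Y) = P\{(5, 3), (7, 3), (7, 6)\} = \dfrac{1}{12} + \dfrac{1}{4} + \dfrac{1}{4} = \dfrac{7}{12}$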
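
The construction in Example 1 amounts to solving one linear equation in the unknown $p_{11}$. A minimal sketch of that idea, in which `table` and `e_xy` are hypothetical helpers and E(XY) is treated as a linear function of $p_{11}$:

```python
from fractions import Fraction as F

xs, ys = [5, 7], [3, 6]
p_x, p_y = [F(1, 2), F(1, 2)], [F(1, 3), F(2, 3)]
cov = F(-1, 2)

def table(t):
    """Joint table with p11 = t, completed from the given marginals."""
    p21 = p_y[0] - t
    return [[t, p_x[0] - t], [p21, p_x[1] - p21]]

def e_xy(tbl):
    return sum(xs[i] * ys[j] * tbl[i][j] for i in range(2) for j in range(2))

e_x = sum(x * p for x, p in zip(xs, p_x))
e_y = sum(y * p for y, p in zip(ys, p_y))
target = cov + e_x * e_y  # required value of E(XY)

# E(XY) is linear in t, so two evaluations determine the solution.
e0, e1 = e_xy(table(F(0))), e_xy(table(F(1)))
t = (target - e0) / (e1 - e0)
joint = table(t)

p_x_gt_y = sum(joint[i][j] for i in range(2) for j in range(2) if xs[i] > ys[j])
print(joint, p_x_gt_y)
```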
Example 2. Following is the joint distribution of the pair of variables
(X, Y):

    X \ Y    0      1      2
    1        0.3    0.2    0.1
    2        0.1    0.0    0.3

(i) Find the marginal distributions of X and Y. Hence find P(X = 2)

(ii) Are the two random variables independent?

(iii) Find the correlation coefficient between X and Y.
Solution. We re-write the table

    X \ Y        0      1      2      $p_{X_i}$
    1            0.3    0.2    0.1    0.6
    2            0.1    0.0    0.3    0.4
    $p_{Y_j}$    0.4    0.2    0.4

(i) The marginal distribution of X is

    X         : 1      2
    $p_{X_i}$ : 0.6    0.4

So P(X = 2) = 0.4
The marginal distribution of Y is

    Y         : 0      1      2
    $p_{Y_j}$ : 0.4    0.2    0.4

(ii) We see $p_{11} = 0.3$ but $p_{X_1} \times p_{Y_1} = 0.6 \times 0.4 = 0.24$
∴ $p_{11} \ne p_{X_1} \cdot p_{Y_1}$
So X and Y are not independent.
(iii) Now, from the marginal distributions we get

$E(X) = 1 \times 0.6 + 2 \times 0.4 = 1.4$
$E(X^{2}) = 1^{2} \times 0.6 + 2^{2} \times 0.4 = 2.2$
$E(Y) = 0 \times 0.4 + 1 \times 0.2 + 2 \times 0.4 = 1$
$E(Y^{2}) = 0^{2} \times 0.4 + 1^{2} \times 0.2 + 2^{2} \times 0.4 = 1.8$

∴ $\mathrm{Var}(X) = E(X^{2}) - \{E(X)\}^{2} = 2.2 - (1.4)^{2} = 0.24$
$\mathrm{Var}(Y) = E(Y^{2}) - \{E(Y)\}^{2} = 1.8 - 1^{2} = 0.8$

To find $E(XY) = \sum\sum p_{ij}\, x_i\, y_j$ we construct the following table of
$p_{ij} \cdot x_i \cdot y_j$:

    X \ Y    0                     1                     2                     Total
    1        0.3 x 1 x 0 = 0       0.2 x 1 x 1 = 0.2     0.1 x 1 x 2 = 0.2     0.4
    2        0.1 x 2 x 0 = 0       0.0 x 2 x 1 = 0       0.3 x 2 x 2 = 1.2     1.2

∴ E(XY) = 1.6

∴ Cov(X, Y) = E(XY) - E(X)E(Y) = 1.6 - (1.4) x (1) = 0.2
∴ the correlation coefficient is
$$\rho_{XY} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X\, \sigma_Y} = \frac{0.2}{\sqrt{0.24} \times \sqrt{0.8}} = 0.456$$

Example 3. The bivariate (X, Y) is such that it take the values (i,j)
where i = 0,1,2,3;j = 1,2,3,4. The joint probabilities P(X = i,Y = j)
= k(3i + 4j). Find
(i) the value of k
(ii) the marginal distribution of X and Y.
(iii) P(X ~ 2, y s 3)
(iv) P(Y = 2/ X,,; 3)
(v) Are X and Y independent?

Solution: (i) Let Pij = p(X = i,Y = j) :. Pij = k(3i + 4j)

3 4 3 4
Now, 2:2:Pr =1 or, 2:2:k(3i+4j) =1
IJ • 0 . 1
i=O)=1 1= J=

3 4 3 4 4
or, k2:2:(3i+4j)=1 or, k2:(2:3i+ 2:4j)=1
i=O)=1 i=O )=1 )=1

3
or, k2:(3ix4+4(1+2+3+4»=1
i=O
3
or, k2:(12i+40)=1
i=O

or, k ( 12 t t i+ 40) =1
or, k(12(1+2+3)+4x40)=1
1
or, 232k =1 :. k =-
232
(ii) X takes the values 0, 1, 2, 3.

Now, $f_X(X = 0) = \sum_{j=1}^{4} p_{0j} = \sum_{j=1}^{4} k(3 \times 0 + 4j) = 4k\sum_{j=1}^{4} j = 4k(1+2+3+4) = 40k = \frac{40}{232} = \frac{5}{29}$

$f_X(X = 1) = \sum_{j=1}^{4} p_{1j} = \sum_{j=1}^{4} k(3 \times 1 + 4j) = k(4 \times 3 + 4 \times (1+2+3+4)) = 52k = \frac{52}{232} = \frac{13}{58}$

Similarly we find $f_X(X = 2) = \frac{8}{29}$, $f_X(X = 3) = \frac{19}{58}$

∴ the marginal distribution of X is

X    :   0       1        2       3
f_X  :   5/29    13/58    8/29    19/58

Similarly the marginal distribution of Y is (the detailed evaluation is not shown)

Y    :   1         2         3         4
f_Y  :   17/116    25/116    33/116    41/116
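(iii) $P(X \ge 2, Y \le 3) = \sum_{i \ge 2}\sum_{j \le 3} p_{ij} = \sum_{i=2}^{3}\sum_{j=1}^{3} k(3i + 4j)$

$= k\sum_{i=2}^{3}\left(3 \times 3i + 4\sum_{j=1}^{3} j\right) = k\sum_{i=2}^{3}(9i + 24)$

$= k\left(9\sum_{i=2}^{3} i + 24 \times 2\right) = k(9(2+3) + 48) = 93k = \frac{93}{232}$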

(iv) $P(Y = 2 / X = 3) = \frac{P(Y = 2, X = 3)}{P(X = 3)} = \frac{p_{32}}{P(X = 3)} = \frac{k(3 \times 3 + 4 \times 2)}{19/58}$

$= \frac{17k}{19/58} = \frac{17/232}{19/58} = \frac{17}{76}$
(v) We see $p_{12} = \frac{1}{232}(3 + 8) = \frac{11}{232}$

but $p_{X_1} \times p_{Y_2} = \frac{13}{58} \times \frac{25}{116} = \frac{325}{6728}$
∴ X and Y are not independent.
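The steps of Example 3 can be replayed with exact rational arithmetic. This is a sketch only; the variable names (`p`, `marg_x`, `marg_y` and so on) are illustrative and do not come from the text.

```python
from fractions import Fraction

xs, ys = range(0, 4), range(1, 5)          # i = 0..3, j = 1..4

# (i) k is fixed by the requirement that all probabilities add up to 1.
total = sum(3 * i + 4 * j for i in xs for j in ys)
k = Fraction(1, total)

# Joint probabilities p[i, j] = k(3i + 4j).
p = {(i, j): k * (3 * i + 4 * j) for i in xs for j in ys}

# (ii) Marginals: sum the joint table along the other variable.
marg_x = {i: sum(p[i, j] for j in ys) for i in xs}
marg_y = {j: sum(p[i, j] for i in xs) for j in ys}

# (iii) P(X >= 2, Y <= 3) and (iv) P(Y = 2 / X = 3).
prob_iii = sum(p[i, j] for i in xs for j in ys if i >= 2 and j <= 3)
cond_iv = p[3, 2] / marg_x[3]

# (v) Independence requires p_ij = f_X(i) * f_Y(j) in every cell.
independent = all(p[i, j] == marg_x[i] * marg_y[j] for i in xs for j in ys)

print(k, prob_iii, cond_iv, independent)
```

Using `Fraction` avoids rounding, so the equality test in part (v) is exact rather than approximate.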

Example 4. The joint probability distribution of the random variables X and Y is

X \ Y     0       1       2       Total

2         0·05    0·10    0·25    0·40

4         0·15    0·05    0·15    0·35

6         0·10    0·10    0·05    0·25

Total     0·30    0·25    0·45

Find $P(X + Y \ge 4)$. Are X and Y independent?

Solution. If (X, Y) assumes the value $(x_i, y_j)$ then X + Y assumes $x_i + y_j$

∴ X + Y assumes values from 2 to 8
Now, (X + Y = 4) = {(2,2), (4,0)} ∴ P(X + Y = 4) = 0·25 + 0·15 = 0·4
(X + Y = 5) = {(4,1)} ∴ P(X + Y = 5) = 0·05
(X + Y = 6) = {(4,2), (6,0)} ∴ P(X + Y = 6) = 0·15 + 0·10 = 0·25
(X + Y = 7) = {(6,1)} ∴ P(X + Y = 7) = 0·10

(X + Y = 8) = {(6,2)} ∴ P(X + Y = 8) = 0·05

Thus $P(X + Y \ge 4) = 0·4 + 0·05 + 0·25 + 0·10 + 0·05 = 0·85$
From the table we see $0·05 \ne 0·4 \times 0·3$, i.e. $P(X = 2, Y = 0) \ne P(X = 2)P(Y = 0)$. ∴ X, Y are not independent.
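The grouping of cells by the value of X + Y can be sketched in a few lines. The table is entered as exact fractions; the names `joint` and `dist_sum` are illustrative only.

```python
from collections import defaultdict
from fractions import Fraction as F

# Joint table of Example 4: joint[(x, y)] = P(X = x, Y = y).
joint = {
    (2, 0): F(5, 100),  (2, 1): F(10, 100), (2, 2): F(25, 100),
    (4, 0): F(15, 100), (4, 1): F(5, 100),  (4, 2): F(15, 100),
    (6, 0): F(10, 100), (6, 1): F(10, 100), (6, 2): F(5, 100),
}

# Collect the probability of every cell under the value of x + y.
dist_sum = defaultdict(F)
for (x, y), p in joint.items():
    dist_sum[x + y] += p

prob_ge_4 = sum(p for s, p in dist_sum.items() if s >= 4)
print(dict(dist_sum), prob_ge_4)
```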
Example 5. Two scanners are needed for an experiment. Of the five
available, two have electronic defects, one has a defect in memory and two
are in good working condition. Two units are selected at random. Let $X_1$ =
number of units with electronic defects, $X_2$ = number with a defect in
memory.
(i) Find the joint distribution of $X_1, X_2$.
(ii) Find the probability of no or one defect among the selected two.

(iii) Find the marginal distribution of $X_1$.

Solution. $X_1$ takes the values 0, 1, 2 and $X_2$ takes the values 0, 1.
For i = 0, 1, 2 and j = 0, 1,

$p_{ij} = P(X_1 = i, X_2 = j)$ = probability of i electronically defective units and j memory-defective units among 2 drawn from 5

$= \frac{{}^{2}C_i \cdot {}^{1}C_j \cdot {}^{2}C_{2-(i+j)}}{{}^{5}C_2} = \frac{{}^{2}C_i \cdot {}^{2}C_{2-i-j}}{{}^{5}C_2}$
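(i) These form the following joint distribution

X_1 \ X_2     0                1

0             p_00 = 1/10      p_01 = 2/10

1             p_10 = 4/10      p_11 = 2/10

2             p_20 = 1/10      p_21 = 0

(ii) Probability of no or one defect

$= p_{00} + p_{01} + p_{10} = \frac{{}^{2}C_0\,{}^{2}C_2}{{}^{5}C_2} + \frac{{}^{2}C_0\,{}^{2}C_1}{{}^{5}C_2} + \frac{{}^{2}C_1\,{}^{2}C_1}{{}^{5}C_2} = \frac{1}{10} + \frac{2}{10} + \frac{4}{10} = \frac{7}{10}$

(iii) $f_{X_1}(0) = p_{00} + p_{01} = \frac{1}{10} + \frac{2}{10} = \frac{3}{10}$

$f_{X_1}(1) = p_{10} + p_{11} = \frac{4}{10} + \frac{{}^{2}C_1\,{}^{2}C_0}{{}^{5}C_2} = \frac{4}{10} + \frac{2}{10} = \frac{3}{5}$

$f_{X_1}(2) = p_{20} + p_{21} = \frac{1}{10}$

∴ the marginal distribution of $X_1$ is

X_1      :   0       1      2

f_{X_1}  :   3/10    3/5    1/10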

Example 6. Two discrete random variables X and Y are jointly distributed
with $P(X = 2) = P(Y = 3) = P(Y = 4) = \frac{1}{3}$; $P(X = 1, Y = 2) = \frac{1}{4}$;
$P(X = 2, Y = 3) = \frac{1}{12}$.
Find (i) the correlation coefficient between X and Y
(ii) the probability distribution of X + Y
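Solution. (i) The joint distribution of (X, Y) is

X \ Y     2       3       4       p_Xi

1         1/4     1/4     1/6     2/3

2         1/12    1/12    1/6     1/3

p_Yj      1/3     1/3     1/3     1

$P(X = 1, Y = 3) = \frac{1}{3} - \frac{1}{12} = \frac{1}{4}$

$P_X(X = 1) = 1 - \frac{1}{3} = \frac{2}{3}$

∴ $P(X = 1, Y = 4) = \frac{2}{3} - \left(\frac{1}{4} + \frac{1}{4}\right) = \frac{1}{6}$

$P(X = 2, Y = 4) = \frac{1}{3} - \frac{1}{6} = \frac{1}{6}$

$P(X = 2, Y = 2) = \frac{1}{3} - \left(\frac{1}{12} + \frac{1}{6}\right) = \frac{1}{12}$

$P_Y(Y = 2) = \frac{1}{4} + \frac{1}{12} = \frac{1}{3}$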
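Now, $E(X) = 1 \times \frac{2}{3} + 2 \times \frac{1}{3} = \frac{4}{3}$

$E(Y) = 2 \times \frac{1}{3} + 3 \times \frac{1}{3} + 4 \times \frac{1}{3} = 3$

$E(X^2) = 1^2 \times \frac{2}{3} + 2^2 \times \frac{1}{3} = 2$

$E(Y^2) = 2^2 \times \frac{1}{3} + 3^2 \times \frac{1}{3} + 4^2 \times \frac{1}{3} = \frac{29}{3}$

∴ $\mathrm{Var}(X) = E(X^2) - \{E(X)\}^2 = 2 - \left(\frac{4}{3}\right)^2 = \frac{2}{9}$. Similarly $\mathrm{Var}(Y) = \frac{29}{3} - 3^2 = \frac{2}{3}$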


To find $E(XY) = \sum\sum p_{ij} x_i y_j$ we construct the following table of
$(p_{ij} \cdot x_i \cdot y_j)$ :

x_i \ y_j    2                    3                    4                   Total

1            1/4 × 1 × 2 = 1/2    1/4 × 1 × 3 = 3/4    1/6 × 1 × 4 = 2/3   23/12

2            1/12 × 2 × 2 = 1/3   1/12 × 2 × 3 = 1/2   1/6 × 2 × 4 = 4/3   13/6

                                                                  E(XY) = 49/12

$\mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y) = \frac{49}{12} - \frac{4}{3} \times 3 = \frac{1}{12}$
∴ the correlation coefficient,

$\rho_{xy} = \frac{\mathrm{Cov}(X,Y)}{\sigma_x \sigma_y} = \frac{1/12}{\sqrt{2/9}\,\sqrt{2/3}} = \frac{1}{12} \cdot \frac{3\sqrt{3}}{2} = \frac{\sqrt{3}}{8}$
(ii) When X = 1, Y = 2, X + Y = 3; when X = 1, Y = 3, X + Y = 4, and so on.
Thus X + Y takes the values 3, 4, 5, 6.

$P(X + Y = 3) = P\{(1,2)\} = \frac{1}{4}$

$P(X + Y = 4) = P\{(1,3), (2,2)\} = \frac{1}{4} + \frac{1}{12} = \frac{1}{3}$

$P(X + Y = 5) = P\{(1,4), (2,3)\} = \frac{1}{6} + \frac{1}{12} = \frac{1}{4}$

$P(X + Y = 6) = P\{(2,4)\} = \frac{1}{6}$

So the probability distribution of X + Y is

X + Y  :   3      4      5      6
Prob.  :   1/4    1/3    1/4    1/6
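Example 7. If X and Y have the same standard deviation then prove that
X + Y and X - Y are uncorrelated.
Solution. Let U = X + Y and V = X - Y

∴ $\bar{U} = \bar{X} + \bar{Y}$ and $\bar{V} = \bar{X} - \bar{Y}$

Now, $\mathrm{Cov}(U, V) = E\{(U - \bar{U})(V - \bar{V})\}$

$= E[\{(X + Y) - (\bar{X} + \bar{Y})\}\{(X - Y) - (\bar{X} - \bar{Y})\}]$
$= E[\{(X - \bar{X}) + (Y - \bar{Y})\}\{(X - \bar{X}) - (Y - \bar{Y})\}]$
$= E\{(X - \bar{X})^2 - (Y - \bar{Y})^2\}$
$= E((X - \bar{X})^2) - E((Y - \bar{Y})^2)$
$= \mathrm{Var}(X) - \mathrm{Var}(Y)$
$= 0$ ∵ s.d. of X = s.d. of Y

∴ $\rho_{uv} = \frac{\mathrm{Cov}(U, V)}{\sigma_u \sigma_v} = 0$
∴ X + Y and X - Y are uncorrelated.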
Example 8. An unbiased die is thrown 7 times. Find the probability that
face 1 turns up twice, face 2 once, face 4 twice and face 6 twice.
Solution. Let E be the experiment of throwing the die. Its possible outcomes
are face 1, face 2, ..., face 6 with probabilities $\frac{1}{6}, \frac{1}{6}, \ldots, \frac{1}{6}$.
It is conducted 7 times, i.e. the compound experiment E7 is performed.
Let $X_1$ = No. of times face 1 turns up,
$X_2$ = No. of times face 2 turns up,
...
$X_6$ = No. of times face 6 turns up.

Required probability
$= P(X_1 = 2, X_2 = 1, X_3 = 0, X_4 = 2, X_5 = 0, X_6 = 2)$

$= \frac{7!}{2!\,1!\,0!\,2!\,0!\,2!}\left(\frac{1}{6}\right)^2\left(\frac{1}{6}\right)^1\left(\frac{1}{6}\right)^0\left(\frac{1}{6}\right)^2\left(\frac{1}{6}\right)^0\left(\frac{1}{6}\right)^2$
according to the multinomial distribution

$= \frac{7!}{8} \cdot \frac{1}{6^7} = \frac{630}{279936} = \frac{35}{15552}$
Example 9. An urn contains 5 white, 2 blue and 3 red balls. Six balls are
drawn with replacement. Find, using the multinomial distribution, the
probability that there are four white and two red balls.
Solution. Let E be the experiment of drawing a ball from the urn. This
has three possible outcomes: white ball (w), blue ball (b) and red ball
(r) with respective probabilities
$p_1 = \frac{5}{10} = \frac{1}{2},\ p_2 = \frac{2}{10} = \frac{1}{5},\ p_3 = \frac{3}{10}$
Since six balls are drawn with replacement we think of E as repeated 6
times.
Let $X_1$ = number of w, $X_2$ = number of b and $X_3$ = number of r in
6 trials.
By the multinomial distribution, required probability

$= P(X_1 = 4, X_2 = 0, X_3 = 2) = \frac{6!}{4!\,0!\,2!}\, p_1^4\, p_2^0\, p_3^2$

$= \frac{720}{48}\left(\frac{1}{2}\right)^4\left(\frac{1}{5}\right)^0\left(\frac{3}{10}\right)^2 = 15 \cdot \frac{1}{16} \cdot \frac{9}{100} = \frac{135}{1600} = \frac{27}{320}$
Example 10. The probabilities that the light bulb of a certain kind of
projector will last fewer than 40 hours of continuous use, anywhere from
40 to 80 hours of continuous use, or more than 80 hours of continuous
use are 0·30,0·50 and 0·20 respectively. Eight such bulbs are drawn
one by one. Find the probability that among these eight bulbs 2 will last
fewer than 40 hours, 5 will last anywhere from 40 to 80 hours, and 1
will last more than 80 hours.
Solution. Let $X_1$ = No. of bulbs lasting fewer than 40 hours,
$X_2$ = No. of bulbs lasting from 40 to 80 hours,
$X_3$ = No. of bulbs lasting more than 80 hours.
In the multinomial distribution, $p_1 = 0·30$, $p_2 = 0·50$ and $p_3 = 0·20$.

∴ Required probability $= P(X_1 = 2, X_2 = 5, X_3 = 1)$

$= \frac{8!}{2!\,5!\,1!}(0·30)^2 (0·50)^5 (0·20)^1 = 0·0945$
Example 11. A pair of unbiased coins are tossed six times. Find the
probability of getting 2 tails twice, 1 head and 1 tail 3 times, and 2 heads
once.
Solution. Let E be the experiment of tossing the pair of coins. This has three
possible outcomes :
$s_1$ = 2 tails; $s_2$ = 1 head, 1 tail; $s_3$ = 2 heads
with respective probabilities $p_1 = \frac{1}{2} \times \frac{1}{2} = \frac{1}{4}$, $p_2 = \frac{1}{2} \times \frac{1}{2} + \frac{1}{2} \times \frac{1}{2} = \frac{1}{2}$,
$p_3 = \frac{1}{2} \times \frac{1}{2} = \frac{1}{4}$. Note that $p_1 + p_2 + p_3 = 1$.
Let $X_1$ = No. of occurrences of $s_1$ among n = 6 trials,
$X_2$ = No. of occurrences of $s_2$ among n = 6 trials,
$X_3$ = No. of occurrences of $s_3$ among n = 6 trials.
From the multinomial distribution,
Required probability $= P(X_1 = 2, X_2 = 3, X_3 = 1)$

$= \frac{6!}{2!\,3!\,1!}\, p_1^2\, p_2^3\, p_3^1 = \frac{720}{12}\left(\frac{1}{4}\right)^2\left(\frac{1}{2}\right)^3\left(\frac{1}{4}\right)^1 = \frac{60}{512} = \frac{15}{128}$
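The multinomial probabilities of Examples 8 to 11 all follow one pattern, which can be sketched as a small function. The name `multinomial_pmf` is an illustrative choice, not something defined in the text; exact fractions are used so the results can be compared with the worked answers directly.

```python
from fractions import Fraction
from math import factorial

def multinomial_pmf(counts, probs):
    """P(X_1 = c_1, ..., X_k = c_k) = n!/(c_1! ... c_k!) * p_1^c_1 ... p_k^c_k."""
    coeff = factorial(sum(counts))
    for c in counts:
        coeff //= factorial(c)          # each partial quotient is an integer
    prob = Fraction(coeff)
    for c, p in zip(counts, probs):
        prob *= Fraction(p) ** c
    return prob

ex8 = multinomial_pmf([2, 1, 0, 2, 0, 2], [Fraction(1, 6)] * 6)
ex9 = multinomial_pmf([4, 0, 2], [Fraction(1, 2), Fraction(1, 5), Fraction(3, 10)])
ex10 = multinomial_pmf([2, 5, 1], [Fraction(3, 10), Fraction(1, 2), Fraction(1, 5)])
ex11 = multinomial_pmf([2, 3, 1], [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)])

print(ex8, ex9, float(ex10), ex11)
```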
Example 12. $X_1, X_2, X_3$ are three discrete random variables which are
pairwise uncorrelated. Each has the same standard deviation. Find the
correlation coefficient between $X_1 + X_2$ and $X_2 + X_3$.

Solution. Let $U = X_1 + X_2$, $V = X_2 + X_3$

and let $\sigma_{X_1} = \sigma_{X_2} = \sigma_{X_3} = \sigma$
$\mathrm{Cov}(U, V) = E(UV) - E(U)E(V)$
$= E\{(X_1 + X_2)(X_2 + X_3)\} - E(X_1 + X_2)E(X_2 + X_3)$

$= E(X_1X_2 + X_1X_3 + X_2^2 + X_2X_3) - \{E(X_1) + E(X_2)\}\{E(X_2) + E(X_3)\}$

$= E(X_1X_2) + E(X_1X_3) + E(X_2^2) + E(X_2X_3) - E(X_1)E(X_2) - E(X_1)E(X_3) - \{E(X_2)\}^2 - E(X_2)E(X_3)$

$= \{E(X_1X_2) - E(X_1)E(X_2)\} + \{E(X_2X_3) - E(X_2)E(X_3)\} + \{E(X_1X_3) - E(X_1)E(X_3)\} + E(X_2^2) - \{E(X_2)\}^2$

$= \mathrm{Cov}(X_1, X_2) + \mathrm{Cov}(X_2, X_3) + \mathrm{Cov}(X_1, X_3) + \mathrm{Var}(X_2)$

$= \sigma^2$ ∵ $X_1, X_2$; $X_2, X_3$ and $X_1, X_3$ are pairwise uncorrelated.
$\mathrm{Var}(U) = \mathrm{Var}(X_1 + X_2) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) + 2\,\mathrm{Cov}(X_1, X_2)$
$= \sigma^2 + \sigma^2 + 2 \times 0 = 2\sigma^2$ ∴ $\sigma_U = \sqrt{2}\,\sigma$
Similarly, $\mathrm{Var}(V) = 2\sigma^2$ ∴ $\sigma_V = \sqrt{2}\,\sigma$

∴ $\rho_{uv} = \frac{\mathrm{Cov}(U, V)}{\sigma_U \sigma_V} = \frac{\sigma^2}{\sqrt{2}\,\sigma \cdot \sqrt{2}\,\sigma} = \frac{1}{2}$
Example 13. Find the expectation of the total number of points in a bridge
hand of 13 cards where the points are assigned as 2 for a spade, 3 for a
club, 4 for a heart and 6 for a diamond.
Solution. Let $X_1$ = points scored by the first card received.
Then
X_1    :   2      3      4      6
Prob.  :   1/4    1/4    1/4    1/4

∴ $E(X_1) = 2 \times \frac{1}{4} + 3 \times \frac{1}{4} + 4 \times \frac{1}{4} + 6 \times \frac{1}{4} = \frac{15}{4}$
If $X_2, X_3, \ldots, X_{13}$ be the points scored by the second, third, ...,
thirteenth card respectively then as above we get

$E(X_2) = \frac{15}{4},\ E(X_3) = \frac{15}{4},\ \ldots,\ E(X_{13}) = \frac{15}{4}$
∴ the expectation of the total number of points
$= E(X_1 + X_2 + X_3 + \cdots + X_{13}) = E(X_1) + E(X_2) + \cdots + E(X_{13})$

$= 13 \times \frac{15}{4} = \frac{195}{4}$
Example 14. If X and Y are two independent random variables having
standard deviations $\sigma_1$ and $\sigma_2$, prove that the correlation coefficient of X
and X + Y is $\frac{\sigma_1}{\sqrt{\sigma_1^2 + \sigma_2^2}}$.

Solution. Let U = X + Y
Now, $\bar{U} = E(U) = E(X + Y) = E(X) + E(Y) = \bar{X} + \bar{Y}$
$\mathrm{Cov}(X, U) = E\{(X - \bar{X})(U - \bar{U})\}$
$= E\{(X - \bar{X})(X + Y - \bar{X} - \bar{Y})\}$
$= E[(X - \bar{X})\{(X - \bar{X}) + (Y - \bar{Y})\}]$
$= E\{(X - \bar{X})^2 + (X - \bar{X})(Y - \bar{Y})\}$
$= E((X - \bar{X})^2) + E((X - \bar{X})(Y - \bar{Y}))$
$= \mathrm{Var}(X) + \mathrm{Cov}(X, Y)$

$= \mathrm{Var}(X)$ ∵ Cov(X, Y) is 0 as X, Y are independent.

Now, $\mathrm{Var}(U) = \mathrm{Var}(X + Y)$

$= \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X, Y)$
$= \sigma_1^2 + \sigma_2^2 + 0$ ∵ X and Y are independent

$= \sigma_1^2 + \sigma_2^2$

∴ $\rho_{x,u} = \frac{\mathrm{Cov}(X, U)}{\sigma_X \sigma_U} = \frac{\sigma_1^2}{\sigma_1\sqrt{\sigma_1^2 + \sigma_2^2}}$ or, $\rho_{x,x+y} = \frac{\sigma_1}{\sqrt{\sigma_1^2 + \sigma_2^2}}$
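Example 15. Two discrete random variables X and Y are connected by
the relation 2X + 3Y + 4 = 0. Prove that the correlation coefficient between
X and Y is -1.
Solution. Given $2X + 3Y + 4 = 0$ (1)

or, $E(2X + 3Y + 4) = E(0)$

or, $2E(X) + 3E(Y) + E(4) = 0$
or, $2E(X) + 3E(Y) + 4 = 0$

or, $2\bar{X} + 3\bar{Y} + 4 = 0$ (2)

Now, $\mathrm{Var}(X) = E\{(X - \bar{X})^2\} = E\left\{-\frac{3Y + 4}{2} + \frac{3\bar{Y} + 4}{2}\right\}^2$ using (1), (2)

$= \frac{9}{4}E(Y - \bar{Y})^2 = \frac{9}{4}\mathrm{Var}(Y)$

∴ $\sigma_x = \frac{3}{2}\sigma_y$ or, $\sigma_y = \frac{2}{3}\sigma_x$

Now, $\mathrm{Cov}(X, Y) = E\{(X - \bar{X})(Y - \bar{Y})\}$

$= E\left\{(X - \bar{X})\left(-\frac{2X + 4}{3} + \frac{2\bar{X} + 4}{3}\right)\right\}$ using (1), (2)

$= -\frac{2}{3}E\{(X - \bar{X})(X - \bar{X})\} = -\frac{2}{3}E\{(X - \bar{X})^2\}$

$= -\frac{2}{3}\sigma_x^2$

∴ $\rho_{xy} = \frac{\mathrm{Cov}(X, Y)}{\sigma_x \sigma_y} = \frac{-\frac{2}{3}\sigma_x^2}{\sigma_x \cdot \frac{2}{3}\sigma_x} = -1$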

Example 16. Player A tosses 5 coins and player B tosses 8 coins. If the
coins are unbiased, find the probability of obtaining a total of 6 heads by
the two players.
Solution. Let X = number of heads among 5 trials by A,
Y = number of heads among 8 trials by B.
∴ X + Y = total number of heads obtained by A and B.

Now X has a binomial distribution with parameters m = 5, $p = \frac{1}{2}$

and Y has a binomial distribution with parameters n = 8, $p = \frac{1}{2}$.
Since X and Y are independent and have the same p,
X + Y has a binomial distribution with parameters

m + n = 5 + 8 = 13 and $p = \frac{1}{2}$
Required probability = P(X + Y = 6)

$= {}^{m+n}C_6\, p^6 q^{m+n-6} = {}^{13}C_6\left(\frac{1}{2}\right)^6\left(\frac{1}{2}\right)^{13-6} = {}^{13}C_6 \cdot \frac{1}{2^{13}} = \frac{1716}{8192} = \frac{429}{2048}$
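The claim that the total number of heads is binomial with 13 trials can be checked by summing over the ways A and B can share the 6 heads. The sketch below assumes the two players toss independently; `binom_pmf` is an illustrative helper name.

```python
from fractions import Fraction
from math import comb

def binom_pmf(n, p):
    """List of P(X = r) for r = 0..n when X is binomial(n, p)."""
    return [comb(n, r) * p ** r * (1 - p) ** (n - r) for r in range(n + 1)]

half = Fraction(1, 2)
pmf_a = binom_pmf(5, half)    # heads obtained by A
pmf_b = binom_pmf(8, half)    # heads obtained by B

# A gets r heads (0 <= r <= 5) and B gets the remaining 6 - r heads.
by_convolution = sum(pmf_a[r] * pmf_b[6 - r] for r in range(0, 6))

# Direct evaluation from the binomial(13, 1/2) distribution.
direct = binom_pmf(13, half)[6]

print(by_convolution, direct)
```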

Example 17. An urn contains four white, four red and four black balls.
The balls of each colour are marked 0, 1, 2, 3 respectively. A ball is drawn
from the urn at random. A random variable X assumes the value 0, 1, 2
according as the ball is white, red and black respectively. Y denotes the
number marked on the ball. Find E(X), E(Y). Hence find the covariance
of the variates.
Solution. X takes the values 0, 1, 2 and Y takes the values 0, 1, 2, 3.
Now, $p_{00} = P(X = 0, Y = 0)$ = probability that the ball is white with marking 0 $= \frac{1}{12}$

Similarly $p_{01} = p_{02} = p_{03} = \frac{1}{12}$. In general $p_{ij} = \frac{1}{12}\ \forall\, i, j$

Thus the joint distribution is

X \ Y     0       1       2       3       Total

0         1/12    1/12    1/12    1/12    1/3

1         1/12    1/12    1/12    1/12    1/3

2         1/12    1/12    1/12    1/12    1/3

Total     1/4     1/4     1/4     1/4

Marginal distribution of X is
X     :   0      1      2
f_xi  :   1/3    1/3    1/3

∴ $E(X) = 0 \times \frac{1}{3} + 1 \times \frac{1}{3} + 2 \times \frac{1}{3} = 3 \times \frac{1}{3} = 1$
That of Y is
Y     :   0      1      2      3
f_yj  :   1/4    1/4    1/4    1/4

∴ $E(Y) = 0 \times \frac{1}{4} + 1 \times \frac{1}{4} + 2 \times \frac{1}{4} + 3 \times \frac{1}{4} = \frac{6}{4} = \frac{3}{2}$
From the above table we see $p_{ij} = f_{x_i} \times f_{y_j}\ \forall\, i, j$
∴ X and Y are independent.
∴ Cov(X, Y) = 0.
Example 18. Two balls are drawn one by one without replacement from
a box containing three balls, numbered 1, 2, 3. Let
X = number on the first ball drawn
Y = number which is larger of the two.
(i) Find the joint distribution of (X, Y).
(ii) Cov (X, Y)

(iii) Correlation coefficient of X and Y.

(iv) Variance of X + Y, 2X -5Y.


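Solution. X takes the values 1, 2, 3 and Y takes the values 2, 3.
So (X, Y) assumes the values (i, j) where i = 1, 2, 3 and j = 2, 3. Below, P(a, b)
denotes the probability that the 1st ball is of No. a and the 2nd ball is of No. b.

$p_{12} = P(X = 1, Y = 2) = P(1, 2) = \frac{{}^{1}C_1 \times {}^{1}C_1}{{}^{3}C_1 \times {}^{2}C_1} = \frac{1}{6}$

$p_{13} = P(X = 1, Y = 3) = P(1, 3) = \frac{{}^{1}C_1 \times {}^{1}C_1}{{}^{3}C_1 \times {}^{2}C_1} = \frac{1}{6}$

$p_{22} = P(X = 2, Y = 2) = P(2, 1) = \frac{{}^{1}C_1 \times {}^{1}C_1}{{}^{3}C_1 \times {}^{2}C_1} = \frac{1}{6}$

$p_{23} = P(X = 2, Y = 3) = P(2, 3) = \frac{{}^{1}C_1 \times {}^{1}C_1}{{}^{3}C_1 \times {}^{2}C_1} = \frac{1}{6}$

$p_{32} = P(X = 3, Y = 2) = P(\phi)$ ∵ if the 1st ball is of No. 3, the larger of the two cannot be 2

$= 0$

$p_{33} = P(X = 3, Y = 3) = P(3, 1) + P(3, 2) = \frac{1}{6} + \frac{1}{6} = \frac{1}{3}$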

(i) ∴ the joint distribution of (X, Y) is

X \ Y           2      3      Total (f_xi)

1               1/6    1/6    1/3

2               1/6    1/6    1/3

3               0      1/3    1/3

Total (f_yj)    1/3    2/3    1
(ii) To find $E(XY) = \sum\sum p_{ij}\, i \cdot j$ we construct the following table of
$(p_{ij} \cdot i \cdot j)$ :

i \ j    2                              3                              Total

1        p_12 · 1 · 2 = 1/6 · 1 · 2 = 1/3    p_13 · 1 · 3 = 1/6 · 1 · 3 = 1/2    5/6

2        p_22 · 2 · 2 = 1/6 · 2 · 2 = 2/3    p_23 · 2 · 3 = 1/6 · 2 · 3 = 1      5/3

3        p_32 · 3 · 2 = 0 · 3 · 2 = 0        p_33 · 3 · 3 = 1/3 · 3 · 3 = 3      3

                                                                  E(XY) = 11/2

$E(X) = 1 \times \frac{1}{3} + 2 \times \frac{1}{3} + 3 \times \frac{1}{3} = 2$

$E(Y) = 2 \times \frac{1}{3} + 3 \times \frac{2}{3} = \frac{8}{3}$

∴ $\mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y) = \frac{11}{2} - 2 \times \frac{8}{3} = \frac{1}{6}$

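(iii) $E(X^2) = 1^2 \times \frac{1}{3} + 2^2 \times \frac{1}{3} + 3^2 \times \frac{1}{3} = \frac{14}{3}$

$E(Y^2) = 2^2 \times \frac{1}{3} + 3^2 \times \frac{2}{3} = \frac{22}{3}$

∴ $\mathrm{Var}(X) = E(X^2) - \{E(X)\}^2 = \frac{14}{3} - 2^2 = \frac{2}{3}$

$\mathrm{Var}(Y) = E(Y^2) - \{E(Y)\}^2 = \frac{22}{3} - \left(\frac{8}{3}\right)^2 = \frac{2}{9}$

∴ $\rho_{xy} = \frac{\mathrm{Cov}(X, Y)}{\sigma_x \sigma_y} = \frac{1/6}{\sqrt{2/3}\,\sqrt{2/9}} = \frac{1/6}{2/(3\sqrt{3})} = \frac{1}{6} \times \frac{3\sqrt{3}}{2} = \frac{\sqrt{3}}{4}$

(iv) $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X, Y)$

$= \frac{2}{3} + \frac{2}{9} + 2 \times \frac{1}{6} = \frac{11}{9}$

$\mathrm{Var}(2X - 5Y) = 2^2\,\mathrm{Var}(X) + (-5)^2\,\mathrm{Var}(Y) + 2 \cdot 2 \cdot (-5)\,\mathrm{Cov}(X, Y)$

$= 4 \times \frac{2}{3} + 25 \times \frac{2}{9} - 20 \times \frac{1}{6} = \frac{44}{9}$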
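Example 18 is small enough to verify by listing all six equally likely ordered draws. The sketch below does so with exact fractions; the helper `expect` and the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import permutations

# All ordered draws (first, second) without replacement from balls 1, 2, 3.
outcomes = list(permutations([1, 2, 3], 2))
weight = Fraction(1, len(outcomes))          # each draw has probability 1/6

def expect(g):
    """E[g(X, Y)] with X = first number drawn, Y = larger of the two."""
    return sum(weight * g(a, max(a, b)) for a, b in outcomes)

mean_x, mean_y = expect(lambda x, y: x), expect(lambda x, y: y)
var_x = expect(lambda x, y: x * x) - mean_x ** 2
var_y = expect(lambda x, y: y * y) - mean_y ** 2
cov_xy = expect(lambda x, y: x * y) - mean_x * mean_y

var_sum = var_x + var_y + 2 * cov_xy                 # Var(X + Y)
var_comb = 4 * var_x + 25 * var_y - 20 * cov_xy      # Var(2X - 5Y)

print(cov_xy, var_x, var_y, var_sum, var_comb)
```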
5.14. Chebyshev's Inequality
In this section we derive an important theorem which enables
us to obtain bounds on probabilities when the variance of a random
variable is known. From this result we see that a small variance means
that there is little chance of large deviations from the mean. The result,
known as Chebyshev's inequality, is an exceedingly useful tool. In this
theorem we suppose that the variance of the random variable exists.
Theorem. Let X be a random variable having mean $\bar{X}$. Then for
arbitrary $\epsilon > 0$,
$P(|X - \bar{X}| \ge \epsilon) \le \frac{1}{\epsilon^2}\mathrm{Var}(X)$
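Proof. Let the probability distribution of X be

X    :   x_1   x_2   x_3   ...
f_i  :   f_1   f_2   f_3   ...

Let $Y = X - \bar{X}$. Then the distribution of Y will be

Y    :   y_1   y_2   y_3   ...
f_i  :   f_1   f_2   f_3   ...
where $y_i = x_i - \bar{X}$ for all i.

Now, $P(|X - \bar{X}| \ge \epsilon) = P(|Y| \ge \epsilon) = \sum_{|y_i| \ge \epsilon} P(Y = y_i)$

$= \sum_{|y_i| \ge \epsilon} f_i = \frac{1}{\epsilon^2}\sum_{|y_i| \ge \epsilon} \epsilon^2 f_i$

$\le \frac{1}{\epsilon^2}\sum_{|y_i| \ge \epsilon} y_i^2 f_i$ ∵ $y_i^2 \ge \epsilon^2$ for every term of the sum

$\le \frac{1}{\epsilon^2}\sum_{i} y_i^2 f_i$ ∵ the added terms are non-negative

$= \frac{1}{\epsilon^2}E\{(X - \bar{X})^2\} = \frac{1}{\epsilon^2}\mathrm{Var}(X)$

Thus $P(|X - \bar{X}| \ge \epsilon) \le \frac{1}{\epsilon^2}\mathrm{Var}(X)$.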
Illustration.
Let X be the number of chips produced in a factory during a week. Let its
mean and variance be 50 and 25 respectively. Find the probability that this
week's production will be between 40 and 60.
Now, 40 < X < 60

⇔ 40 - 50 < X - 50 < 60 - 50

⇔ -10 < X - 50 < 10

or, 40 < X < 60 ⇔ |X - 50| < 10 (1)

By Chebyshev's inequality,
$P\{|X - 50| \ge 10\} \le \frac{1}{10^2}\mathrm{Var}(X) = \frac{1}{100} \times 25 = \frac{1}{4}$

or, $1 - P\{|X - 50| < 10\} \le \frac{1}{4}$
or, $P\{|X - 50| < 10\} \ge 1 - \frac{1}{4} = \frac{3}{4}$
or, $P(40 < X < 60) \ge \frac{3}{4}$ using (1)
∴ the required probability is at least $\frac{3}{4}$.
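Chebyshev's inequality gives a bound that holds for every distribution with the stated mean and variance, so the true probability is usually larger. The sketch below compares the bound of the illustration with one concrete case. The binomial(100, 1/2) model is purely an assumption made here for illustration (it happens to have mean 50 and variance 25); the text does not specify the distribution of X, and `chebyshev_upper` is an illustrative name.

```python
from fractions import Fraction
from math import comb

def chebyshev_upper(variance, eps):
    """Chebyshev bound: P(|X - mean| >= eps) <= variance / eps**2."""
    return Fraction(variance) / Fraction(eps) ** 2

# Bound from the illustration: mean 50, variance 25, eps = 10.
lower_bound = 1 - chebyshev_upper(25, 10)

# Hypothetical model (not in the text): X ~ binomial(100, 1/2),
# which has mean 100/2 = 50 and variance 100/4 = 25.
pmf = [Fraction(comb(100, r), 2 ** 100) for r in range(101)]
exact = sum(pmf[r] for r in range(41, 60))     # P(40 < X < 60)

print(lower_bound, float(exact))
```

For this particular model the exact probability lies well above 3/4, which is consistent with the inequality being a guaranteed but often loose bound.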
From the above theorem another result, known as the one-sided
Chebyshev's inequality, can be deduced. This is stated in the following
theorem.
Theorem. If X is a random variable with mean 0 then for arbitrary $\epsilon > 0$

$P(X \ge \epsilon) \le \frac{\mathrm{Var}(X)}{\mathrm{Var}(X) + \epsilon^2}$

Proof. The proof is beyond the scope of this book.
Corollary. If the mean of X is 0 then for arbitrary $\epsilon > 0$

$P(X \le -\epsilon) \le \frac{\mathrm{Var}(X)}{\mathrm{Var}(X) + \epsilon^2}$

Proof. Put Z = -X. Then $\bar{Z} = -\bar{X} = 0$ and

$\mathrm{Var}(Z) = (-1)^2\,\mathrm{Var}(X) = \mathrm{Var}(X)$.

∴ $P(X \le -\epsilon) = P(-Z \le -\epsilon)$

$= P(Z \ge \epsilon) \le \frac{\mathrm{Var}(Z)}{\mathrm{Var}(Z) + \epsilon^2} = \frac{\mathrm{Var}(X)}{\mathrm{Var}(X) + \epsilon^2}$
Illustration.
The number of chips produced in a factory during a month is a random
variable X with mean 100 and standard deviation 20. Find the maximum
chance that this month's production will be at least 120.

We see $\bar{X} = 100$. So put $Z = X - \bar{X} = X - 100$. Then $\bar{Z} = \bar{X} - 100 = 0$

and $\mathrm{Var}(Z) = \mathrm{Var}(X - 100) = 1^2\,\mathrm{Var}(X)$ [by a theorem given in the
chapter on variance]
$= \mathrm{Var}(X) = 20^2 = 400$

Again the event $(X \ge 120) \equiv (Z \ge 120 - 100) \equiv (Z \ge 20)$

Since the mean of Z is 0, by the one-sided Chebyshev's inequality,

$P(X \ge 120) = P(Z \ge 20) \le \frac{\mathrm{Var}(Z)}{\mathrm{Var}(Z) + 20^2} = \frac{400}{400 + 400} = \frac{1}{2}$

Thus the probability that this month's production will be at least 120 is
at most $\frac{1}{2}$.
∴ there is at most a 50% chance that this month's production will be
at least 120.
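It is instructive to see why the one-sided form is used here. With the numbers of this illustration the two-sided bound on $P(|X - 100| \ge 20)$ equals 1 and so says nothing, while the one-sided bound gives 1/2. A minimal sketch follows; the function names are illustrative, not from the text.

```python
from fractions import Fraction

def two_sided_bound(variance, eps):
    """P(|X - mean| >= eps) <= variance / eps^2, capped at 1."""
    return min(Fraction(1), Fraction(variance) / Fraction(eps) ** 2)

def one_sided_bound(variance, eps):
    """P(X - mean >= eps) <= variance / (variance + eps^2)."""
    variance, eps = Fraction(variance), Fraction(eps)
    return variance / (variance + eps ** 2)

variance, eps = 20 ** 2, 120 - 100       # s.d. 20, deviation of 20 above the mean

print(two_sided_bound(variance, eps))    # uninformative in this case
print(one_sided_bound(variance, eps))
```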

Illustrative Examples.
Example 1. The distribution of a random variable X is given by
$P(X = -1) = \frac{1}{8},\ P(X = 0) = \frac{3}{4},\ P(X = 1) = \frac{1}{8}$.

Verify Chebyshev's inequality for the distribution.

Solution. Here $m = E(X) = (-1) \cdot \frac{1}{8} + 0 \cdot \frac{3}{4} + 1 \cdot \frac{1}{8} = 0$

$E(X^2) = (-1)^2 \cdot \frac{1}{8} + 0^2 \cdot \frac{3}{4} + (1)^2 \cdot \frac{1}{8} = \frac{1}{4}$

$\sigma^2 = \mathrm{Var}(X) = E(X^2) - \{E(X)\}^2 = \frac{1}{4} - 0 = \frac{1}{4}$.

Thus for any $\epsilon > 0$, Chebyshev's inequality becomes

$P(|X - 0| \ge \epsilon) \le \frac{1}{4\epsilon^2}$
Now we consider the following cases:
Case (i) Let $0 < \epsilon \le 1$

Then $P(|X| \ge \epsilon) = P(X \le -\epsilon) + P(X \ge \epsilon) = P(X = -1) + P(X = 1)$

(∵ -1 is the only value of X which is $\le -\epsilon$ and
1 is the only value of X which is $\ge \epsilon$)
$= \frac{1}{8} + \frac{1}{8} = \frac{1}{4} \le \frac{1}{4\epsilon^2}$, as $0 < \epsilon \le 1$

Case (ii) Let $\epsilon > 1$

Then $P(|X| \ge \epsilon) = P(X \le -\epsilon) + P(X \ge \epsilon) = 0 + 0 = 0 < \frac{1}{4\epsilon^2}$,
as $|X| \ge \epsilon$ is an impossible event.
Thus in both cases Chebyshev's inequality is verified.
Example 2. Suppose that it is known that the number of cars manufactured
in a car factory during a month is a random variable with average 50.
If the standard deviation of a month's production is 5 then find the
probability that this month's production will be between 40 and 60.
Solution. Let X = number of cars produced in a month. Now the required
probability
$= P(40 \le X \le 60) = P(40 - 50 \le X - 50 \le 60 - 50)$

$= P(-10 \le X - 50 \le 10) = P(|X - 50| \le 10)$

$= 1 - P(|X - 50| > 10) \ge 1 - P(|X - 50| \ge 10) \ge 1 - \frac{\sigma^2}{10^2} = 1 - \frac{5^2}{100} = 1 - \frac{1}{4} = \frac{3}{4}$
(by Chebyshev's inequality)
∴ The probability that this month's production will be between 40 and
60 is at least 0·75.
Example 3. For a random variable X, if Var(X) = 0 then prove that
$P(X = \mu) = 1$ where $\mu$ is the mean of X.
Solution. Using Chebyshev's inequality, we have, for any integer $n \ge 1$,

$P\left\{|X - \mu| \ge \frac{1}{n}\right\} \le \frac{1}{(1/n)^2}\mathrm{Var}(X)$

or, $P\left\{|X - \mu| \ge \frac{1}{n}\right\} \le n^2 \times 0$

or, $P\left\{|X - \mu| \ge \frac{1}{n}\right\} \le 0$

or, $P\left\{|X - \mu| \ge \frac{1}{n}\right\} = 0$ ∵ a probability cannot be less than 0

or, $\lim_{n \to \infty} P\left\{|X - \mu| \ge \frac{1}{n}\right\} = \lim_{n \to \infty} 0 = 0$

or, $P\left\{\lim_{n \to \infty}\left\{|X - \mu| \ge \frac{1}{n}\right\}\right\} = 0$ ∵ probability is a continuous function

or, $P(X \ne \mu) = 0$ [∵ the event $\lim_{n \to \infty}\left\{|X - \mu| \ge \frac{1}{n}\right\}$ is the event $X - \mu \ne 0$]

or, $1 - P(X = \mu) = 0$ ∴ $P(X = \mu) = 1$


Example 4. A group of 200 persons consisting of 100 men and 100 women
is randomly divided into 100 pairs of 2 each. Find the maximum chance
that at most 30 of these pairs will consist of a man and a woman.
Solution. Let $m_1, m_2, m_3, \ldots, m_{100}$ be the men and
$w_1, w_2, w_3, \ldots, w_{100}$ be the women.

Let $X_1$ be the variable such that

$X_1 = 1$ if $m_1$ is paired with some $w_j$

$= 0$ if $m_1$ is paired with some $m_j$
Similarly consider $X_2$ such that
$X_2 = 1$ if $m_2$ is paired with some $w_j$
$= 0$ if $m_2$ is paired with some $m_j$.

In this way construct the random variables

$X_3, X_4, \ldots, X_{100}$. Obviously they are not independent,
and if $X_1 + X_2 + X_3 + \cdots + X_{100} = k$ then we understand that k
man-woman pairs are formed.
Thus if $X = X_1 + X_2 + X_3 + \cdots + X_{100}$
then the possible values of X are 0, 1, 2, 3, ..., 100.
Now the man $m_1$ can be paired with any person (man or woman) in
199 ways. The man $m_1$ can be paired with one of the 100 women in 100 ways.
∴ Probability that the man $m_1$ is paired with a woman is $\frac{100}{199}$.

Thus $P(X_1 = 1) = \frac{100}{199}$ ∴ $P(X_1 = 0) = 1 - \frac{100}{199} = \frac{99}{199}$
∴ the distribution of $X_1$ is

X_1    :   0         1
Prob.  :   99/199    100/199

The distribution is similar for all of $X_2, X_3, \ldots, X_{100}$, i.e. in general the
distribution of $X_i$ is
X_i    :   0         1
Prob.  :   99/199    100/199
The product $X_iX_j$ (i ≠ j) may take the two values 0 and 1, and
$P(X_iX_j = 1) = P(X_i = 1, X_j = 1)$
$= P\{(X_i = 1) \cap (X_j = 1)\}$
$= P(X_i = 1) \cdot \frac{P\{(X_i = 1) \cap (X_j = 1)\}}{P(X_i = 1)}$
$= P(X_i = 1)\,P(X_j = 1 / X_i = 1)$

Now, $P(X_i = 1) = \frac{100}{199}$
Now if $m_i$ is paired with some $w_j$, $m_j$ can be paired with any of the remaining
197 persons, of which 99 are women. So, $P(X_j = 1 / X_i = 1) = \frac{99}{197}$

∴ $E(X_iX_j) = \frac{100}{199} \times \frac{99}{197}$

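Now, $E(X_i) = 0 \times \frac{99}{199} + 1 \times \frac{100}{199} = \frac{100}{199}$
∴ $E(X) = E(X_1) + E(X_2) + \cdots + E(X_{100})$

$= \frac{100}{199} + \frac{100}{199} + \cdots + \frac{100}{199} = 100 \times \frac{100}{199} = 50·25$

Now, $\mathrm{Var}(X) = \mathrm{Var}(X_1 + X_2 + \cdots + X_{100})$

$= \sum_{i=1}^{100}\mathrm{Var}(X_i) + 2\sum\sum_{i<j}\mathrm{Cov}(X_i, X_j)$

Now, $E(X_i^2) = 0^2 \times \frac{99}{199} + 1^2 \times \frac{100}{199} = \frac{100}{199}$,

∴ $\mathrm{Var}(X_i) = E(X_i^2) - \{E(X_i)\}^2 = \frac{100}{199} - \left(\frac{100}{199}\right)^2 = \frac{100}{199}\left(1 - \frac{100}{199}\right) = \frac{100}{199} \times \frac{99}{199}$

If i < j, $\mathrm{Cov}(X_i, X_j) = E(X_iX_j) - E(X_i)E(X_j)$
$= \frac{100}{199} \times \frac{99}{197} - \frac{100}{199} \times \frac{100}{199}$
$= \frac{100}{199}\left(\frac{99}{197} - \frac{100}{199}\right)$
$= \frac{100}{199} \times \frac{1}{39203}$

Thus $\mathrm{Var}(X) = \sum_{i=1}^{100}\frac{100}{199} \times \frac{99}{199} + 2\sum\sum_{i<j}\frac{100}{199} \times \frac{1}{39203}$

$= 100 \times \frac{100}{199} \times \frac{99}{199} + 2 \times {}^{100}C_2 \times \frac{100}{199} \times \frac{1}{39203}$
$= 25·126$ (approx.)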

Put $Z = X - 50·25$ ∴ $\bar{Z} = 0$, $\mathrm{Var}(Z) = \mathrm{Var}(X)$

$X = 30 \Rightarrow Z = 30 - 50·25 = -20·25$
Now, $X \le 30 \Leftrightarrow Z \le -20·25$.
∴ by the one-sided Chebyshev's inequality (in fact by the corollary)
$P(X \le 30) = P(Z \le -20·25)$
$\le \frac{\mathrm{Var}(Z)}{\mathrm{Var}(Z) + (20·25)^2} = \frac{\mathrm{Var}(X)}{\mathrm{Var}(X) + (20·25)^2}$
$= \frac{25·126}{25·126 + (20·25)^2} = 0·058$ (approx.)
∴ there is a maximum chance of 0·058, i.e. 5·8%, of the required event.
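The variance computation of Example 4 involves many fractions, so an exact cross-check is useful. The sketch below follows the same steps with `Fraction`; the variable names are illustrative. It uses the unrounded mean 10000/199 instead of 50·25, so the final bound differs from the worked value only beyond the third decimal place.

```python
from fractions import Fraction
from math import comb

n = 100                                   # number of men (and of women)

p1 = Fraction(n, 2 * n - 1)               # P(X_i = 1) = 100/199
p11 = p1 * Fraction(n - 1, 2 * n - 3)     # P(X_i = 1, X_j = 1) = 100/199 * 99/197

var_i = p1 * (1 - p1)                     # Var(X_i)
cov_ij = p11 - p1 ** 2                    # Cov(X_i, X_j) for i != j

mean = n * p1                             # E(X)
var = n * var_i + 2 * comb(n, 2) * cov_ij # Var(X)

eps = mean - 30                           # distance of 30 below the mean
bound = var / (var + eps ** 2)            # one-sided Chebyshev bound on P(X <= 30)

print(float(mean), float(var), float(bound))
```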
Example 5. The average number of bikes sold weekly at a certain
showroom is 16 with s.d. $\frac{1}{8}$. Find the maximum probability that next week's sale
will exceed 18.
Solution. Let X = number of bikes sold weekly.
By the problem $\bar{X} = 16$. Let $Z = X - 16$. Then $\bar{Z} = 0$.
$\mathrm{Var}(Z) = \mathrm{Var}(X) = \left(\frac{1}{8}\right)^2 = \frac{1}{64}$.

Now, $X = 19 \Rightarrow Z = 19 - 16 = 3$

Using the one-sided Chebyshev's inequality we have
$P(X > 18) = P(X \ge 19) = P(Z \ge 3)$
$\le \frac{\mathrm{Var}(Z)}{\mathrm{Var}(Z) + 3^2} = \frac{1/64}{1/64 + 9} = 0·00173$
∴ the maximum probability is 0·00173
∴ there is very little chance that next week's sale will exceed 18.
Example 6. If $\bar{X} = 75$, $\bar{Y} = 75$, $\sigma_x = \sqrt{10}$, $\sigma_y = \sqrt{12}$ and $\mathrm{Cov}(X, Y) = -3$,
find the maximum probability of the event $(|X - Y| > 15)$.
Solution. Let Z = X - Y ∴ $E(Z) = E(X) - E(Y) = 75 - 75 = 0$

$\mathrm{Var}(X - Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) - 2\,\mathrm{Cov}(X, Y)$

or, $\mathrm{Var}(Z) = 10 + 12 - 2 \times (-3) = 28$

By Chebyshev's inequality
$P(|Z - \bar{Z}| \ge 15) \le \frac{1}{15^2}\mathrm{Var}(Z) = \frac{28}{225}$

or, $P(|X - Y - 0| \ge 15) \le \frac{28}{225}$

or, $P(|X - Y| \ge 15) \le \frac{28}{225}$

or, $P(|X - Y| > 15) + P(|X - Y| = 15) \le \frac{28}{225}$

or, $P(|X - Y| > 15) \le \frac{28}{225} - P(|X - Y| = 15) \le \frac{28}{225}$ (∵ $P(|X - Y| = 15) \ge 0$)

∴ the maximum probability of the event $(|X - Y| > 15)$ is $\frac{28}{225}$.

Example 7. Show by Chebyshev's inequality that in 2000 throws with a
coin the probability that the number of heads lies between 900 and 1100 is
at least $\frac{19}{20}$. [W.B.U. Tech, 2002, 2003, 2007]

Solution. Let X denote the number of heads. Then clearly X is a binomial
b(n, p) variate with n = 2000, $p = \frac{1}{2}$.

∴ $\mu = E(X) = np = 1000$

and $\sigma^2 = \mathrm{Var}(X) = np(1 - p) = 2000 \times \frac{1}{2} \times \frac{1}{2} = 500$.
Now $P(900 < X < 1100) = P(-100 < X - 1000 < 100)$

$= P(|X - 1000| < 100) = 1 - P(|X - 1000| \ge 100)$ (1)

Now, by Chebyshev's inequality,

$P(|X - 1000| \ge 100) \le \frac{\sigma^2}{(100)^2} = \frac{500}{(100)^2} = \frac{1}{20}$

∴ from (1) $P(900 < X < 1100) \ge 1 - \frac{1}{20} = \frac{19}{20}$
So the probability that the number of heads lies between 900 and 1100
is at least $\frac{19}{20}$.

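Example 8. The p.m.f. of a random variable X is $f(X = i) = 2^{-i}$, i = 1, 2, ....

Show that Chebyshev's inequality gives $P(|X - 2| \ge 2) \le \frac{1}{2}$. Also find the
actual probability.

Solution. Here $m = E(X) = \sum_{i=1}^{\infty} i\,2^{-i} = 1 \cdot \frac{1}{2} + 2 \cdot \frac{1}{2^2} + 3 \cdot \frac{1}{2^3} + \cdots$

$E(X^2) = \sum_{i=1}^{\infty} i^2\,2^{-i} = 1^2 \cdot \frac{1}{2} + 2^2 \cdot \frac{1}{2^2} + 3^2 \cdot \frac{1}{2^3} + \cdots$

Now $(1 - x)^{-1} = 1 + x + x^2 + x^3 + \cdots$ for |x| < 1.

Differentiating both sides w.r.t. x we get

$(1 - x)^{-2} = 1 + 2x + 3x^2 + \cdots$ (1)

∴ $x(1 - x)^{-2} = x + 2x^2 + 3x^3 + \cdots$

Again differentiating both sides of the last relation w.r.t. x we get

$(1 - x)^{-2} + 2x(1 - x)^{-3} = 1 + 2^2 \cdot x + 3^2 \cdot x^2 + \cdots$

∴ $\frac{1 + x}{(1 - x)^3} = 1 + 2^2 x + 3^2 x^2 + \cdots$ (2)

Putting $x = \frac{1}{2}$ in (1) we get

$\left(1 - \frac{1}{2}\right)^{-2} = 1 + 2 \cdot \frac{1}{2} + 3 \cdot \frac{1}{2^2} + \cdots$

or, $4 \cdot \frac{1}{2} = \frac{1}{2} + 2 \cdot \frac{1}{2^2} + 3 \cdot \frac{1}{2^3} + \cdots$

∴ $\frac{1}{2} + 2 \cdot \frac{1}{2^2} + 3 \cdot \frac{1}{2^3} + \cdots = 2$ ∴ $m = E(X) = 2$

Again putting $x = \frac{1}{2}$ in (2), we get

$12 = 1 + 2^2 \cdot \frac{1}{2} + 3^2 \cdot \frac{1}{2^2} + \cdots$

∴ $\frac{1}{2} + 2^2 \cdot \frac{1}{2^2} + 3^2 \cdot \frac{1}{2^3} + \cdots = 6$ ∴ $E(X^2) = 6$

$\sigma^2 = \mathrm{Var}(X) = E(X^2) - \{E(X)\}^2 = 6 - 4 = 2$

So for $\epsilon = 2$, Chebyshev's inequality

$P(|X - m| \ge \epsilon) \le \frac{\sigma^2}{\epsilon^2}$

becomes $P(|X - 2| \ge 2) \le \frac{2}{2^2} = \frac{1}{2}$

Again $P(|X - 2| \ge 2) = 1 - P(|X - 2| < 2)$

$= 1 - P(0 < X < 4) = 1 - (P(X = 1) + P(X = 2) + P(X = 3))$

$= 1 - \left(\frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3}\right) = \frac{1}{8}$.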
Example 9. If $\bar{X} = 75$, $\bar{Y} = 75$ and Var(X) = 10, Var(Y) = 12,

Cov(X, Y) = -3, find the maximum probability of the event (Y > X + 15).

Solution. Let Z = X - Y ∴ $\bar{Z} = \bar{X} - \bar{Y} = 0$
$\mathrm{Var}(Z) = \mathrm{Var}(X - Y) = 10 + 12 - 2 \times (-3) = 28$.

Now, $Y > X + 15 \Leftrightarrow X - Y < -15 \Leftrightarrow Z < -15$.

Applying the one-sided Chebyshev's inequality (the corollary) we have

$P(Z \le -15) \le \frac{\mathrm{Var}(Z)}{\mathrm{Var}(Z) + 15^2} = \frac{28}{28 + 225} = \frac{28}{253}$

or, $P(X - Y \le -15) \le \frac{28}{253}$

or, $P(Y \ge X + 15) \le \frac{28}{253}$

or, $P(Y > X + 15) + P(Y = X + 15) \le \frac{28}{253}$

or, $P(Y > X + 15) \le \frac{28}{253} - P(Y = X + 15) \le \frac{28}{253}$

∴ the required maximum probability is $\frac{28}{253}$.

Example 10. From the past experience, a professor knows that the test
score of a student taking her final examination is a random variable with
mean 75 and variance 25. What can be said about the probability that a
student will score between 65 and 85 ?
Solution. Let X = score of the student
∴ $\bar{X} = 75$, Var(X) = 25.
Let Z = X - 75. So X = 65 ⇒ Z = -10 and X = 85 ⇒ Z = 10
∴ $P(65 < X < 85) = P(-10 < Z < 10) = P(|Z| < 10)$ (1)
Now, $\bar{Z} = \bar{X} - 75 = 75 - 75 = 0$ and Var(Z) = Var(X) = 25.
By Chebyshev's inequality we have

$P(|Z - 0| \ge 10) \le \frac{1}{10^2}\mathrm{Var}(Z)$

or, $P(|Z| \ge 10) \le \frac{1}{100} \times 25 = \frac{1}{4}$
or, $1 - P(|Z| < 10) \le \frac{1}{4}$ or, $P(|Z| < 10) \ge 1 - \frac{1}{4} = \frac{3}{4}$

or, $P(65 < X < 85) \ge \frac{3}{4}$ using (1)
∴ The probability of the event is at least $\frac{3}{4}$.

Example 11. If $X_1, X_2, \ldots, X_{20}$ be independent Poisson variables with
mean 1, estimate $P\left(\sum_{i=1}^{20} X_i > 15\right)$.

Solution. Let $X = \sum_{i=1}^{20} X_i$. Since the $X_i$ are mutually independent, X is a Poisson
variate with mean 1 + 1 + ... + 1 = 20. So, Var(X) = 20 also.

Put Z = X - 20 ∴ $\bar{Z} = 0$, Var(Z) = Var(X) = 20.

Now, $P(X > 15) = P(Z + 20 > 15) = P(Z > -5)$ (1)
By the one-sided Chebyshev's inequality (the corollary)

$P(Z \le -5) \le \frac{\mathrm{Var}(Z)}{\mathrm{Var}(Z) + 5^2} = \frac{20}{20 + 25} = \frac{4}{9}$

or, $1 - P(Z > -5) \le \frac{4}{9}$ or, $P(Z > -5) \ge 1 - \frac{4}{9} = \frac{5}{9}$

or, $P(X > 15) \ge \frac{5}{9}$ using (1)

∴ $P\left(\sum_{i=1}^{20} X_i > 15\right) \ge \frac{5}{9}$.
Example 12. If 10 unbiased dice are tossed, find the approximate probability
that the sum obtained is between 30 and 40, inclusive.
Solution. Let $X_i$ = value shown on the i-th die for i = 1, 2, 3, ..., 10.
Let $X = X_1 + X_2 + X_3 + \cdots + X_{10}$.

Distribution of $X_i$ is

X_i    :   1      2      3      4      5      6
Prob.  :   1/6    1/6    1/6    1/6    1/6    1/6

∴ $E(X_i) = \frac{1}{6}(1 + 2 + 3 + 4 + 5 + 6) = \frac{21}{6} = \frac{7}{2}$
$\mathrm{Var}(X_i) = E(X_i^2) - \{E(X_i)\}^2$

$= \frac{1}{6}(1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2) - \left(\frac{7}{2}\right)^2 = \frac{35}{12}$

∴ $E(X) = E(X_1 + X_2 + \cdots + X_{10}) = 10 \times \frac{7}{2} = 35$

$\mathrm{Var}(X) = \mathrm{Var}(X_1 + X_2 + \cdots + X_{10}) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) + \cdots + \mathrm{Var}(X_{10})$

$= 10 \times \frac{35}{12} = \frac{175}{6}$ since $X_i, X_j$ are independent
Now, $(30 \le X \le 40) = (29 < X < 41)$
⇔ 29 - 35 < X - 35 < 41 - 35
⇔ -6 < X - 35 < 6
⇔ |X - 35| < 6
Now by Chebyshev's inequality

$P\{|X - 35| \ge 6\} \le \frac{1}{6^2}\mathrm{Var}(X) = \frac{1}{36} \times \frac{175}{6} = \frac{175}{216} = 0·81$

or, $1 - P\{|X - 35| < 6\} \le 0·81$

or, $P\{|X - 35| < 6\} \ge 1 - 0·81 = 0·19$

∴ the probability that the sum is between 30 and 40 is $\ge 0·19$
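Because the distribution of the sum of 10 dice is finite, the exact probability can be computed by repeated convolution and set against the Chebyshev lower bound. The following is a sketch with illustrative names; it shows that the bound 41/216 (about 0·19) holds but is far from tight.

```python
from fractions import Fraction

die = {face: Fraction(1, 6) for face in range(1, 7)}

# Build the distribution of the sum one die at a time (convolution).
dist = {0: Fraction(1)}
for _ in range(10):
    nxt = {}
    for s, p in dist.items():
        for face, q in die.items():
            nxt[s + face] = nxt.get(s + face, 0) + p * q
    dist = nxt

exact = sum(p for s, p in dist.items() if 30 <= s <= 40)

var_sum = 10 * Fraction(35, 12)           # Var(X) = 175/6
bound = 1 - var_sum / 6 ** 2              # Chebyshev lower bound with eps = 6

print(float(exact), bound, float(bound))
```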

Exercise 5
1. The following is the joint distribution of X and Y.

X \ Y     1       2       3

0         1/10    3/10    1/10

2         1/5     1/10    1/5

(i) Is it a valid distribution?

(ii) Find P(X = 2), P(Y = 3), P(X = 2/Y = 3)

(iii) $\bar{X}$ and $\bar{Y}$
(iv) $\sigma_x, \sigma_y$ and $\rho_{xy}$

(v) Are X, Y independent?

2. An unbiased coin is tossed three times. Let X = number of heads
and Y = |number of heads - number of tails|. Find the joint distribution of
(X, Y). Find the marginal distributions of X and Y. Are X and Y independent?

3. A pair of unbiased dice is thrown. Let X = number of sixes and Y =
number of fives that turn up. Find the joint distribution of the bivariate (X, Y)
and the marginal distributions of X and Y. Hence obtain $P(X + Y \ge 2)$.

4. Two balls are drawn one by one with replacement from an urn
containing 2 white, 2 black and 4 red balls. Let $X_i$ be the random variable
such that $X_i = 1$, if the ball on the i-th draw is white
$= 0$, if the ball on the i-th draw is non-white.
Find the joint distribution of $(X_1, X_2)$. Deduce the marginal distributions
of $X_1$ and $X_2$.

5. The probability of a female birth in a family of three children is $\frac{1}{2}$.

X = number of female children in the first two births
Y = number of female children in the last two births
Find the joint distribution of the two dimensional random variable
(X, Y). Evaluate P(Y = 2/X = 1). Are X and Y independent?
6. An urn contains four white balls numbered 0, 1, 2, 3; three red balls
numbered 0, 1, 2 and two black balls numbered 0, 1. One ball is drawn at
random from the urn. X denotes the values 0, 1 and 2 respectively for
white, red and black balls. Y denotes the number marked on the ball. Find
the joint distribution of (X, Y). Hence obtain the marginal distributions of X
and Y. Find whether X and Y are independent.

7. An urn contains 2 white, 3 black and 2 red balls. Three balls are
drawn one by one without replacement. Let X denote the number of white
balls and Y denote the number of black balls. Obtain the joint distribution of
the two variables X and Y.

8. If the two independent random variables X and Y have Poisson
distributions with means $\mu_1$ and $\mu_2$ respectively, find the joint distribution of
(X, Y).

9. An urn contains three balls marked with numbers 0, 1, 2 and having
white, red and black colours respectively. One ball is drawn at random. Let
X take the values 1, 2 and 3 according as the colour of the ball is white, red
and black. Prove that the correlation coefficient between X and the number
marked on the ball is 1.

10. If u = 2x + 3, v = -5y + 1 prove that $\rho_{uv} = -\rho_{xy}$



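11. If u = 2X + 3Y, v = 4X + 9Y then prove that

$\rho_{uv} = 0$ if $8\sigma_x^2 + 30\rho_{xy}\sigma_x\sigma_y + 27\sigma_y^2 = 0$

12. If $\bar{X} = \bar{Y} = 0$ then prove that

$\mathrm{Var}(Y - X) = \mathrm{Var}(X) + \mathrm{Var}(Y) - 2\rho_{xy}\sigma_x\sigma_y$

and $\rho_{x,y-x} = \frac{\rho_{xy}\sigma_y - \sigma_x}{\sqrt{\mathrm{Var}(X) + \mathrm{Var}(Y) - 2\rho_{xy}\sigma_x\sigma_y}}$

13. For a bivariate (X, Y) prove that

Cov(2X - 5, 3Y + 7) = 6 Cov(X, Y).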
14. X and Y are two independent random variables and Z = XY.
X can take the values 10 and 20 with the probability

$P(X = 10) = \frac{1}{3}$. Y can take the values 5, 6, 7 with the probabilities
$P(Y = 5) = \frac{1}{4},\ P(Y = 6) = \frac{1}{2}$. Find the expectation of Z.
15. The joint distribution of (X, Y) is given by $P(X = 0, Y = 0) = \frac{1}{4}$,
$P(X = 1, Y = 0) = \frac{1}{4},\ P(X = -1, Y = 0) = \frac{1}{4},\ P(X = 0, Y = 1) = \frac{1}{4}$. Show that
X and Y are uncorrelated but X and Y are not independent.
16. Let X assume the values ±1, ±2 each with probability $\frac{1}{4}$.
Let $Y = X^2$. Construct the joint distribution of (X, Y). Show $\rho_{xy} = 0$.
Are they independent?

Y \ X     1      -1     2      -2

1         1/4    1/4    0      0

4         0      0      1/4    1/4
17. A box contains 5 white, 2 blue and 3 red balls. If 6 balls are drawn one by one with replacement, find, using the multinomial distribution, the probability that there are (i) 2 white, 3 blue and 1 red balls (ii) 2 balls of each colour.
18. Find the probability that in 8 throws of a die the numbers 1, 3, 5 turn up 2, 3, 3 times respectively.

19. A die is thrown 10 times in succession. Find the probability of the occurrence of six 4 times, five twice and all other faces once each.

[Hint: $\dfrac{10!}{4!\,2!\,(1!)^4}\left(\dfrac{1}{6}\right)^{10}$, by the multinomial distribution.]

20. A die is rolled 8 times. What is the probability that the faces 1, 3, 5 turn up 2, 3, 3 times respectively?
21. A box contains 10 balls of which 5 are white, 3 red and 2 black. A
ball is drawn and replaced 3 times. Find the probability that the balls are of
different colours.
22. A pack of 8 cards contains 3 aces, 2 kings, 2 queens and 1 jack. The pack is shuffled 5 times and the top card is exposed each time. Find the probability that the exposed cards are ace twice, king once, queen once and jack once.

[Hint: $\dfrac{5!}{2!\,1!\,1!\,1!}\left(\dfrac{3}{8}\right)^2\left(\dfrac{2}{8}\right)^1\left(\dfrac{2}{8}\right)^1\left(\dfrac{1}{8}\right)^1$]
23. A box A contains six balls numbered 1, 2, 3, 4, 5, 6 and another box
B contains six balls numbered 1, 2, 3, 0, 0, 0. Two balls are drawn at
random, one from each box. Let X = No. of the ball drawn from box A
and Y = No. of the ball drawn from box B. Find E(XY).
[Hint: Obviously X and Y are independent. $E(Y) = 1\cdot\frac16 + 2\cdot\frac16 + 3\cdot\frac16$.]
24. Three balls are placed into three cells at random. Let N denote the number of occupied cells and $X_i$ the number of balls in the i-th cell (for i = 1, 2, 3). Find the joint distribution of $(X_1, N)$ and of $(X_1, X_2)$. Hence find $\mathrm{Cov}(N, X_1)$ and $\mathrm{Cov}(X_1, X_2)$.
25. Two unbiased coins are tossed.
X = 1 if the first coin shows head
  = 0 if the first coin shows tail.
Y = 1 if the second coin shows head
  = 0 if the second coin shows tail.
(i) Find the joint distribution of (X, Y).
(ii) Find the marginal distributions of X and Y.
(iii) Are X and Y independent? (iv) Find E(2X + 3Y).

26. Find the expectation of the sum of points on 10 unbiased dice.

[Hint: Let $X_i$ = number of points shown by the i-th die. Then
$E(X_i) = 1\times\frac16 + 2\times\frac16 + 3\times\frac16 + 4\times\frac16 + 5\times\frac16 + 6\times\frac16$.
Required expectation $= E\left(\sum X_i\right)$.]

27. The bivariate (X, Y) takes the values (i, j) where i = 1, 2, 3; j = 1, 2, 3, with
P(X = i, Y = j) = kij where k is a constant. Find
(i) the value of k (ii) P(l ~ X s 3,Y ~ 2)

(iii) P(X ~ 2) (iv) P(Y s 2) (v) P(X = 2)


(vi) Are X and Y independent.
28. Two discrete random variables X and Y take the values 1, 2 and 3, with
P(X = i, Y = j) = 1/10, (i, j) ≠ (2, 2)
                = 1/5,  (i, j) = (2, 2).
Find the expectation of (i) X (ii) Y (iii) X + Y
and (iv) Cov(X, Y) (v) Var(X + Y).

29. In a joint distribution of the two variables X, Y, if $\sigma_x, \sigma_y, \sigma_{x-y}$ are the standard deviations of X, Y and X - Y respectively, then prove that the correlation coefficient
$\rho_{xy} = \dfrac{\sigma_x^2 + \sigma_y^2 - \sigma_{x-y}^2}{2\sigma_x\sigma_y}$.
30. The joint distribution of (X, Y) is given by

         0    1
    0    a    c
    1    b    d

where a, b, c, d are non-negative reals. Prove that X and Y are uncorrelated if and only if ad = bc.
31. X and Y are two independent random variables and U = X + Y, V = X - Y. Prove that $\rho_{uv} = \rho_{xu}^2 - \rho_{yu}^2$.

32. If U = 2X + 3Y, V = 3X - 2Y where $\bar{X} = \bar{Y} = 0$ and $\rho_{uv} = 0$, then prove that

(i) $\sigma_u^2\sigma_v^2 = 169\,\sigma_x^2\sigma_y^2\,(1 - \rho_{xy}^2)$

(ii) $\sigma_u^2 + \sigma_v^2 = 13(\sigma_x^2 + \sigma_y^2)$


33. Two dice are thrown. X denotes the number on the first die and Y
be the greater of the two numbers on the dice. Find the correlation coefficient
of X and Y.
[Hint: See a similar Ex. in Illustrative Examples.]

34. If $\bar{X} = \mu$ and $\mathrm{Var}(X) = \sigma^2$, then for arbitrary $\varepsilon > 0$ prove that
$P(X > \mu + \varepsilon) \le \dfrac{\sigma^2}{\sigma^2 + \varepsilon^2}$.
35. The average number of automobiles sold in a week in a showroom
is 16 with variance 9. Find the least probability that next week's sales are
between 10 and 22 inclusively.

36. The average number of automobiles sold in a week in a showroom is 16 with variance 9. Find the maximum probability that next week's sales will exceed 18. [Hint: $P(X \ge 19) \le \frac12$.]
37. If $\bar{X} = \bar{Y} = 75$ and $\sigma_x = \sqrt{10}$, $\sigma_y = \sqrt{12}$, Cov(X, Y) = -3, what is the maximum probability of the event (X > Y + 15)?

38. Suppose that the number of cars produced daily at a factory of Maruti is a random variable with mean 20 and standard deviation 3. The factory of Honda has the corresponding data 18 and 6. The production of these two factories does not depend on one another. Find the maximum probability of the event that Honda will produce more than Maruti on a particular day.
[Hint: P(Y > X) = P(Y - X > 0) = P(Y - X ≥ 1) = P(Z ≥ 1), where Z = Y - X.]
39. The average daily production of electric bulbs is 20 with standard deviation $\sqrt{20}$. Using Chebyshev's inequality, prove that on a particular day the probability that the production of bulbs is less than 40 is at least 19/20.

40. From the past experience a professor knows that the test score of a
student taking her final examination is a random variable with mean 75 and
variance 25. Find the maximum chance that a particular student will score
more than 85.
41. Show that the probability that the number of heads in 2000 throws
with a fair coin lies between 900 and 1100 is 0.999. Compare this value
with the value given by Chebycheff's inequality.
42. A symmetric die is thrown 360 times. Determine the lower bound
for the probability of getting 50 to 70 ones using Chebycheff's inequality.
43. The average number of cars sold weekly at a certain dealership is 16 and the s.d. is 3. Find a lower bound to the probability that next week's sales are between 10 and 22 inclusively.
44. Does there exist a variate X for which
$P(\mu - 2\sigma \le X \le \mu + 2\sigma) = 0.6$?
[Hint: the given probability contradicts Chebycheff's inequality.]
45. If X is the number of points on a die, prove that Chebycheff's inequality gives $P(|X - \bar{X}| > 2.5) < 0.47$, whereas the actual probability is 0.
46. If X is a r.v. such that E(X) = 3 and $E(X^2) = 13$, then using Chebycheff's inequality show that $P(-2 < X < 8) \ge \frac{21}{25}$.
47. Use Chebycheff's inequality to show that for n ≥ 36, the probability that in n throws of a fair die the number of sixes lies between $\frac{n}{6} - \sqrt{n}$ and $\frac{n}{6} + \sqrt{n}$ is at least $\frac{31}{36}$. [W.B.U.T. 2008]
48. What is the probability that rolling of 80 dice gives a sum exceeding 300? [Hint: Mean of each $X_i$ = 3.5, $\sigma = \sqrt{35/12}$.]
49. A professor knows that the test score of a student taking her final
examination is a random variable with mean 75 and variance 25.
(i) Find the probability that a student will score between 65 and 85.
(ii) Find the number of students taking the examination to ensure, with probability at least 0.9, that the class average would be within 5 of 75.

50. Use Chebycheff's inequality to find how many times a true coin must be tossed in order that the probability will be at least 0.90 that X, the proportion of the number of heads, will be between 0.4 and 0.6.
[Hint: If n is the number of tosses, $P(|X - np| \le n \times 0.1) \ge 1 - \dfrac{1}{4n(0.1)^2}$.
By the problem, $1 - \dfrac{1}{4n(0.1)^2} = 0.9 \Rightarrow n = 250$.]
51. For the discrete distribution X = i,
$f_i = P(X = i) = \dfrac{1}{2^i}$, i = 1, 2, 3, ..., prove that Chebycheff's inequality gives $P(|X - 2| > 2) < \frac12$ while the actual probability is $\frac{1}{16}$.
52. A random variable X takes the values -1, 1,3, 5 with associated

probabilities ~ ' ~ ' ~ and T . Find by direct computation p(IX - 31 ~ 1) .


Find an upper bound for the probability by applying Chebycheffs inequality.
53. Two good dice whose faces are numbered 1 to 6 are thrown. If X
is the sum of the numbers shown up, using Chebycheff's inequalityto
5
prove that P {IX - 71 ~ 3} :5: ~. Also show that the actual probability is -6 .
54
Answers
1 3 2
1. (i) Vaild (ii) 2' 10'3 (iii) 1,2 (v) not independent

2.
x a 0
3

1/8
t, :0 1
1 3
- - - -
2 3
3 1
not indep.
8 8 8 8
1 3/8 0
2 3/8 0 t.: 1 3

3 0 1/8 3/4 1/4

3. X 0 2 X: 0 2
1
25
- 10 1
-_. -
o 4/9 2/9 1/36 fx
36 36 36 ' 9
2/9 1/18 0 Y 0 1 2

2 1/36 0 0 fy: 25/36 5/18 1/36


X2
° XI
°
3 1
0

3 1
o 9/16 3/16 IXI -
4
-
4
IX2 : -
4
-
4

3/16 1/16

5. 0 2
Y
1/8 1/8
°
1 1/8 1/4
°
1/8 -
1
not indep.
4
2 0 1/8 1/8

6. (X,Y) assumes the value (i,i) where i = 0,1,2 and i = 0,1,2,3.


1
The joint probabilities, Pi} ="9 for (i,i) except (1, 3), (2, 2), (2, 3).

PI3 = P22 = P23 =


Marginal distribution
°. is given by

2 1 1 2 1
- and IYj
9 3 3 9 9
X and Yare not independent.
7. (X,Y) assumes the value (i,i) where i=0,1,2;i=0,1,2,3. The
joint probabilities are

Pi} =P(X=i,Y=i)=e Ci.3 CJ.2 C3-i-J)FC3• when l~i+ j s:


= ° otherwise
8. (X, Y) assumes the values (i, j), i, j = 0, 1, 2, ..., with
$p_{ij} = P(X = i, Y = j) = e^{-(\mu_1 + \mu_2)}\,\dfrac{\mu_1^{\,i}\,\mu_2^{\,j}}{i!\,j!}$

14. 100   16. not independent   17. (i) 9/250 (ii) 81/1000

18. $\dfrac{8!}{72 \cdot 6^8}$   19. $\dfrac{10!}{48 \times 6^{10}}$   20. $\dfrac{8!}{72 \times 6^8}$   21. 9/50   22. 135/2048   23. 7/2

24. v. 0 1 2 3 X2
~l 0 1 2 3

1 2/27 0 0 1/27 0 1/27 3/27 3/27 1/27


.
2 6/27 6/27 6/27 0 1 3/27 6/27 3/27 0

3 0 6/27 0 0 2 3/27 3/27 0 0

3 1/ 27 0 0 0
Cov(N,Xl) = 0; Cov(XpX2) = -1/3
25. (i) Y 0 1 (ii) X :0 Y :0
1 1 1 1
o 1/4 1/4 PXi: 2"2 PYi: 2" 2

1 1/4 1/4

(iii) not indep. (iv) 1i


26.35.

27. (i) f3'6 (ii) % (Hi) li (vi) X, Yare independent.

(27
28. (i) 2 (ii) 2 (ill) 4 (iv) 0 (v) 1.2.33. Vn
3 28 45
35. "4 37. 253 38. 54
15 3
40'17 43. "4 44. no /

3
48. ·0951 49. (i) "4 (ii) 10 50.250

Multiple Choice Questions

1. If X and Y are independent then

(a) $\rho_{xy} = 1$ (b) Cov(X, Y) = 0 (c) Y = aX + b (d) none of these


2. If $\rho_{xy} = 0$ then X and Y are independent

(a) True (b) False


3. If X and Y are two independent random variables then

(a) E(XY) = E(X)/E(Y) (b) E(X + Y) = E(X)

(c) E(XY) = E(X)E(Y) (d) none of these

4. If Var(X + Y) = Var(X) + Var(Y) then

(c) $\rho_{xy} = 0$ (d) $\rho_{xy} = 1$

5. If $\rho_{xy} = 0$ then Var(X + Y) =

(a) Var(X) Var(Y) (b) Var(X) - Var(Y)

(c) 2Var(X) Var(Y) (d) Var(X) + Var(Y)

6. If X and Y are uncorrelated then X + Y and X - Y are uncorrelated

(a) True (b) False

7. If X is a (7, 1/4) binomial variate and Y a (4, 1/4) binomial variate then X + Y is an (n, 1/4) binomial variate where n = ____

(a) 11 (b) 10 (c) 7 (d) 9

8. If $X_1, X_2$ are independent with standard deviations $\sigma_1, \sigma_2$ respectively,

(c) 0

9. If $X_1, X_2$ are independent with $\mathrm{Var}(X_1) = \sigma_1^2$, $\mathrm{Var}(X_2) = \sigma_2^2$ then $\mathrm{Var}(X_1 + X_2) =$

(a) $\sigma_1^2 - \sigma_2^2$ (b) $\sigma_1^2 + \sigma_2^2$ (c) 0 (d) $2(\sigma_1^2 + \sigma_2^2)$

10. If X has variance 9, Y has variance 5 and X, Y are independent, then Var(2X + Y - 5) =

(a) 30 (b) 40 (c) 41 (d) 31

11. If X has mean 4 and Y has mean -2, then the mean of 2X + Y - 5 is

(a) 1 (b) 0 (c) -1 (d) -2
12. According to Chebycheff's inequality, for a random variable with mean 1 and s.d. 0.03, $P(|X - 1| \ge 0.2) \le$

(a) 0.225 (b) 0.0225 (c) 0.002 (d) none of these

13. If X is a (2000, 1/2) binomial variate, $P(|X - 1000| \ge 100) \le$

(a) 1/20 (b) 19/20 (c) 9/20 (d) none of these

14. For a random variable with mean m and s.d. $\sigma$, $P(|X - m| \ge 2\sigma) \le$

(a) 1/4 (b) 1/2 (c) 1/3 (d) 1/5

15. For a random variable with mean 0 and s.d. $\sigma$, $P(-4\sigma < X < 4\sigma) \ge$

(a) 1/16 (b) 1/4 (c) 15/16 (d) 13/16
16. For a random variable with mean m and variance 2, $P(|X - m| \le 2) \ge$
(a) 0.9 (b) 0.3 (c) 0.2 (d) 0.5
17. If $X_1, X_2, \ldots, X_n$ are n independent variates having the same distribution with common mean 4 and variance 0.01, then $\bar{X} = \dfrac{1}{n}\sum_{i=1}^{n} X_i$ has a distribution with approximate variance

(a) 0.0025 (b) 0.025 (c) 0.25 (d) none of these


Answers
1. b 2. b 3. c 4. c 5. d 6. b
7. a 8. b 9. b 10. c 11. a 12. b
13. a 14. a 15. c 16. d 17. a
MODULE-2
6. CONTINUOUS PROBABILITY DISTRIBUTION

6.1. Introduction.
In a previous chapter we defined a continuous random variable and its distribution function, and discussed the useful properties of the distribution function. The probability distribution of a continuous random variable is in fact specified by its Probability Density Function, which is discussed in this chapter. The concepts of Expectation, Mean and Variance of a continuous random variable all follow from this density function.
6.2. Probability Density Function (or density function) and Continuous Distribution.
Let X be a continuous random variable with distribution function $F(x) = P(-\infty < X \le x)$. Then a function f(x) is said to be the probability density function (pdf) of X if f(x) is integrable on the interval [a, x] for all a and if

$F(x) = \int_{-\infty}^{x} f(t)\,dt$

This holds for every real x.

Fundamental Properties of pdf.

If f(x) is a pdf of a random variable X, then

(i) $f(x) \ge 0$  (ii) $\int_{-\infty}^{\infty} f(x)\,dx = 1$

Proof: (i) From the definition, $F'(x) = f(x)$. Since F(x) is increasing, $F'(x) \ge 0$.
∴ $f(x) \ge 0$.

(ii) Since $F(\infty) = 1$, we have $\int_{-\infty}^{\infty} f(t)\,dt = F(\infty) = 1$.

Properties of pdf.
(i) As F(x) is a continuous function, we must have
$P(X = a) = F(a) - \lim_{x \to a^-} F(x) = 0$.
(ii) For a continuous distribution,
$P(a \le X \le b) = P(a < X < b) = P(a < X \le b) = P(a \le X < b) = \int_a^b f(t)\,dt = F(b) - F(a)$,
where f is the probability density function.

(iii) If f(x) is continuous, then from the definition of pdf we must have $f(x) = F'(x)$.
(iv) In differential notation we have
$P(x < X \le x + dx) = F(x + dx) - F(x) = dF(x) = F'(x)\,dx = f(x)\,dx$,
which is known as the probability differential of X.
Density Curve. The curve given by y = f(x), where f(x) is a pdf, is called the probability density curve; it gives the graphical representation of the corresponding continuous distribution.

Illustration. Consider a function f(x) defined as

$f(x) = \dfrac{2}{x^3}$, $1 \le x < \infty$
$= 0$, elsewhere.

As $f(x) \ge 0$ everywhere and

$\int_{-\infty}^{\infty} f(x)\,dx = \int_1^{\infty} \dfrac{2}{x^3}\,dx = \lim_{P \to \infty} \int_1^{P} \dfrac{2}{x^3}\,dx = \lim_{P \to \infty}\left(1 - \dfrac{1}{P^2}\right) = 1 - 0 = 1$,

this f(x) is a probability density function of some random variable.

Now $F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_{-\infty}^{x} 0\,dt = 0$ when $-\infty < x < 1$

and $F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_1^{x} \dfrac{2}{t^3}\,dt = 1 - \dfrac{1}{x^2}$ when $1 \le x < \infty$.

So the distribution function of the above pdf is

$F(x) = 0$, $-\infty < x < 1$
$= 1 - \dfrac{1}{x^2}$, $1 \le x < \infty$
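The illustration above can be checked numerically. The following is only a rough sketch: the helper names `pdf`, `cdf` and `integrate`, and the choice of a midpoint rule with a finite upper limit of 1000, are assumptions made for this check and are not part of the text.

```python
def pdf(x):
    """Density from the illustration: 2/x^3 on [1, inf), 0 elsewhere."""
    return 2.0 / x**3 if x >= 1.0 else 0.0

def cdf(x):
    """Closed form obtained in the illustration: 1 - 1/x^2 for x >= 1, else 0."""
    return 1.0 - 1.0 / x**2 if x >= 1.0 else 0.0

def integrate(f, a, b, n=200_000):
    """Composite midpoint rule on [a, b] with n equal sub-intervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# The area over [1, P] equals 1 - 1/P^2, so it approaches 1 as P grows.
area_to_1000 = integrate(pdf, 1.0, 1000.0)
# F(3) by direct integration of the density, to compare with the closed form.
f3_numeric = integrate(pdf, 1.0, 3.0)
print(area_to_1000, f3_numeric, cdf(3.0))
```

The truncated area should be close to 1, and the numerically integrated value of F(3) should agree with 1 - 1/9 up to the discretisation error.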
6.3. Expectation or Mean of a Continuous Random Variable
For a continuous random variable X with probability density function f(x), the mean or expectation of X is defined as

$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$,

provided the infinite integral converges absolutely.
Similarly, the mean of a function $\psi(X)$ of the random variable X, denoted by $E\{\psi(X)\}$, is defined as

$E\{\psi(X)\} = \int_{-\infty}^{\infty} \psi(x) f(x)\,dx$, for a continuous distribution.

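Illustration. Let the p.d.f. of a continuous random variable X be

$f(x) = \dfrac12$ in $-1 < x < 1$
$= 0$, elsewhere.
Then the mean or expectation of X is
$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \int_{-1}^{1} x \cdot \dfrac12\,dx = 0$

and also, $E(2X^3) = \int_{-\infty}^{\infty} 2x^3 f(x)\,dx = \int_{-1}^{1} 2x^3 \cdot \dfrac12\,dx = \left[\dfrac{x^4}{4}\right]_{-1}^{1} = 0$.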
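As a rough numerical cross-check of the definition of $E\{\psi(X)\}$, the sketch below approximates the two integrals of this illustration. The function names `expectation` and `uniform_density` and the midpoint rule are assumptions introduced here for illustration only.

```python
def expectation(g, density, a, b, n=100_000):
    """Approximate E[g(X)] = integral of g(x) * density(x) over [a, b] (midpoint rule)."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        total += g(x) * density(x)
    return h * total

def uniform_density(x):
    """f(x) = 1/2 on (-1, 1) and 0 elsewhere, as in the illustration."""
    return 0.5 if -1.0 < x < 1.0 else 0.0

mean_x = expectation(lambda x: x, uniform_density, -1.0, 1.0)
mean_2x3 = expectation(lambda x: 2 * x**3, uniform_density, -1.0, 1.0)
print(mean_x, mean_2x3)  # both should be numerically indistinguishable from 0
```

Both integrands are odd functions on a symmetric interval, which is why the two expectations vanish.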

The following properties of expectation hold for a continuous random variable just as they do for discrete random variables.
Properties of Expectation
(i) E(a) = a, where a is a constant.
(ii) E(aX) = aE(X), a being a constant.
(iii) E(X ± Y) = E(X) ± E(Y), where X, Y are two r.v.
(iv) E(XY) = E(X)E(Y) if the two r.v. X and Y are independent.
Proof: Left as an exercise.

Illustration.
Suppose X and Y are two continuous random variables whose distributions are given by the density functions

f(x) = 2x, 0 < x ≤ 1
     = 4 - 2x, 1 < x ≤ 2
     = 0, elsewhere

and $\phi(y) = \dfrac14$, -2 ≤ y ≤ 2
     = 0, elsewhere.

Find E(2X - Y + 3).

Now, $E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \int_{-\infty}^{0} x \cdot 0\,dx + \int_0^1 x \cdot 2x\,dx + \int_1^2 x(4 - 2x)\,dx + \int_2^{\infty} x \cdot 0\,dx$

$= 2\int_0^1 x^2\,dx + 4\int_1^2 x\,dx - 2\int_1^2 x^2\,dx = 2$

and $E(Y) = \int_{-\infty}^{\infty} y\,\phi(y)\,dy = \int_{-\infty}^{-2} y \cdot 0\,dy + \int_{-2}^{2} y \cdot \dfrac14\,dy + \int_2^{\infty} y \cdot 0\,dy = \dfrac14\left[\dfrac{y^2}{2}\right]_{-2}^{2} = \dfrac18(4 - 4) = 0$.

Then E(2X - Y + 3) = E(2X) - E(Y) + E(3) = 2E(X) - 0 + 3 = 2 × 2 + 3 = 7.
We define the variance and standard deviation of a continuous random variable in the same way as for a discrete random variable in an earlier chapter.

6.4. Variance and S.D.

The variance of a r.v. X, denoted by Var(X), is defined as
$\mathrm{Var}(X) = E\{(X - m)^2\}$, where m = E(X).
The positive square root of Var(X) is called the standard deviation of X and is denoted by $\sigma(X)$ or $\sigma_X$ or simply $\sigma$. Thus $\sigma = +\sqrt{\mathrm{Var}(X)}$.
Remarks: (i) The variance describes how widely the probability masses are spread about the mean, i.e. it gives an inverse measure of the concentration of the probability masses about the mean; it is therefore called a measure of dispersion.
(ii) As Var(X) = 0 only when X - m = 0, i.e. X = m, in that case the whole mass is concentrated at the mean.
Theorem.
(i) $\mathrm{Var}(X) = E(X^2) - m^2 = E(X^2) - \{E(X)\}^2$
(ii) $\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$
(iii) Var(k) = 0, where k is a constant.
(iv) $\mathrm{Var}(X) = E\{X(X - 1)\} - m(m - 1)$, where m is the mean of X.
Proof: (i) $\mathrm{Var}(X) = E\{(X - m)^2\} = E(X^2 - 2mX + m^2)$
$= E(X^2) - E(2mX) + E(m^2)$
$= E(X^2) - 2mE(X) + m^2 = E(X^2) - 2m \cdot m + m^2 = E(X^2) - m^2$
(ii) and (iii) are left as exercises.

(iv) $(X - m)^2 = X(X - 1) - 2mX + X + m^2$

∴ $E\{(X - m)^2\} = E\{X(X - 1)\} - 2mE(X) + E(X) + E(m^2)$
$= E\{X(X - 1)\} - 2m \cdot m + m + m^2$
$= E\{X(X - 1)\} - m(m - 1)$.
Note: In practice, results (i) and (iv) of the above theorem are used to evaluate the variance and standard deviation.
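Parts (ii) and (iii), which the text leaves as exercises, follow directly from the definition of variance. One possible derivation, using only the linearity of expectation with m = E(X), is sketched here.

```latex
\begin{aligned}
E(aX + b) &= aE(X) + b = am + b,\\
\operatorname{Var}(aX + b) &= E\{(aX + b - am - b)^2\}
  = E\{a^2 (X - m)^2\} = a^2\,E\{(X - m)^2\} = a^2 \operatorname{Var}(X),\\
\operatorname{Var}(k) &= E\{(k - E(k))^2\} = E\{(k - k)^2\} = 0 .
\end{aligned}
```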
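Illustration. Consider the following distribution of a random variable X:
$f(x) = \dfrac12 x$, $0 \le x \le 2$
$= 0$, elsewhere.

$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \int_0^2 \dfrac12 x^2\,dx = \left[\dfrac{x^3}{6}\right]_0^2 = \dfrac43$.

Now, $E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx = \int_0^2 x^2 \cdot \dfrac12 x\,dx = \left[\dfrac{x^4}{8}\right]_0^2 = 2$

∴ $\mathrm{Var}(X) = E(X^2) - \{E(X)\}^2 = 2 - \dfrac{16}{9} = \dfrac29$.

So, s.d. $= \sqrt{\dfrac29} = \dfrac{\sqrt2}{3}$.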
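Because the density here is a polynomial, the moments can be reproduced in exact rational arithmetic. The sketch below is only an illustration; the helper `integrate_poly` and its coefficient-list convention are assumptions made for this check.

```python
from fractions import Fraction

def integrate_poly(coeffs, a, b):
    """Exact integral over [a, b] of sum(coeffs[k] * x**k); coeffs start at degree 0."""
    a, b = Fraction(a), Fraction(b)
    return sum(Fraction(c) * (b**(k + 1) - a**(k + 1)) / (k + 1)
               for k, c in enumerate(coeffs))

half = Fraction(1, 2)
total = integrate_poly([0, half], 0, 2)         # integral of f(x) = x/2
mean = integrate_poly([0, 0, half], 0, 2)       # integral of x * f(x)
second = integrate_poly([0, 0, 0, half], 0, 2)  # integral of x^2 * f(x)
variance = second - mean**2
print(total, mean, second, variance)  # 1 4/3 2 2/9
```

The first value confirms that the density integrates to 1; the remaining three are E(X), E(X²) and Var(X).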


6.5. Illustrative Examples.
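Example 1. The probability density function of a random variable X is f(x) = k(x - 1)(2 - x) for 1 ≤ x ≤ 2. Determine (i) the value of the constant k; (ii) the distribution function F(x); (iii) $P\left(\frac54 \le X \le \frac32\right)$.
[W.B.U.Tech 2007]

(i) Since f(x) is a p.d.f., we have $\int_{-\infty}^{\infty} f(x)\,dx = 1$,

i.e., $k\int_1^2 (x - 1)(2 - x)\,dx = 1$, i.e., $k\int_1^2 (3x - x^2 - 2)\,dx = 1$,

i.e., $k\left[\dfrac{3x^2}{2} - \dfrac{x^3}{3} - 2x\right]_1^2 = 1$, i.e., $\dfrac{k}{6} = 1$ ∴ k = 6.

∴ The complete p.d.f. of X is

f(x) = 6(x - 1)(2 - x), for 1 ≤ x ≤ 2
     = 0, elsewhere.
(ii) The distribution function F(x) is defined by $F(x) = \int_{-\infty}^{x} f(t)\,dt$. ∴ F(x) = 0, for x < 1.

Also $F(x) = \int_{-\infty}^{1} f(t)\,dt + \int_1^{x} f(t)\,dt$, for 1 ≤ x ≤ 2

$= \int_{-\infty}^{1} 0\,dt + \int_1^{x} 6(t - 1)(2 - t)\,dt = 6\left[\dfrac{3t^2}{2} - \dfrac{t^3}{3} - 2t\right]_1^{x} = 5 - 12x + 9x^2 - 2x^3$.

Again $F(x) = \int_{-\infty}^{1} 0\,dt + \int_1^2 6(t - 1)(2 - t)\,dt + \int_2^{x} 0\,dt$, for x > 2

$= 0 + 6\left[\dfrac{3t^2}{2} - \dfrac{t^3}{3} - 2t\right]_1^2 + 0 = 1$.

Therefore the distribution function is given by:

F(x) = 0, for x < 1
     = $5 - 12x + 9x^2 - 2x^3$, for 1 ≤ x ≤ 2
     = 1, for x > 2
(iii) Using the properties of the distribution function,

$P\left(\frac54 \le X \le \frac32\right) = F\left(\frac32\right) - F\left(\frac54\right)$
$= \left\{5 - 12 \cdot \frac32 + 9\left(\frac32\right)^2 - 2\left(\frac32\right)^3\right\} - \left\{5 - 12 \cdot \frac54 + 9\left(\frac54\right)^2 - 2\left(\frac54\right)^3\right\}$
$= \dfrac12 - \dfrac{5}{32} = \dfrac{11}{32}$
Alternatively (using the p.d.f.):

$P\left(\frac54 \le X \le \frac32\right) = \int_{5/4}^{3/2} f(x)\,dx = \int_{5/4}^{3/2} 6(x - 1)(2 - x)\,dx = \dfrac{11}{32}$.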
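The distribution function found in Example 1 is a cubic polynomial, so the probability in part (iii) can be confirmed exactly. This is a minimal sketch; the function name `F` simply mirrors the notation of the example.

```python
from fractions import Fraction

def F(x):
    """Distribution function of Example 1, evaluated in exact rational arithmetic."""
    x = Fraction(x)
    if x < 1:
        return Fraction(0)
    if x > 2:
        return Fraction(1)
    return 5 - 12 * x + 9 * x**2 - 2 * x**3

prob = F(Fraction(3, 2)) - F(Fraction(5, 4))
print(F(1), F(2), prob)  # 0 1 11/32
```

The values F(1) = 0 and F(2) = 1 also confirm that the three pieces of F join continuously.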

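Example 2. A random variable X has the density function

$f(x) = \dfrac{a}{x^2 + 1}$, $-\infty < x < \infty$.

Find (i) a (ii) the probability that $X^2$ lies between $\frac13$ and 1 (iii) the distribution function of X.

(i) As f(x) is a p.d.f., we must have $\int_{-\infty}^{\infty} f(x)\,dx = 1$,

i.e., $\int_{-\infty}^{\infty} \dfrac{a}{1 + x^2}\,dx = 1$, i.e., $a\left[\tan^{-1} x\right]_{-\infty}^{\infty} = 1$, i.e., $a\left(\dfrac{\pi}{2} + \dfrac{\pi}{2}\right) = 1$ ∴ $a = \dfrac{1}{\pi}$.

(ii) As $\dfrac13 \le X^2 \le 1 \Leftrightarrow \left(-1 \le X \le -\dfrac{1}{\sqrt3}\right) \cup \left(\dfrac{1}{\sqrt3} \le X \le 1\right)$,

$P\left(\dfrac13 \le X^2 \le 1\right) = P\left(-1 \le X \le -\dfrac{1}{\sqrt3}\right) + P\left(\dfrac{1}{\sqrt3} \le X \le 1\right)$

$= \dfrac{1}{\pi}\int_{-1}^{-1/\sqrt3} \dfrac{dx}{1 + x^2} + \dfrac{1}{\pi}\int_{1/\sqrt3}^{1} \dfrac{dx}{1 + x^2}$

$= \dfrac{1}{\pi}\left[\tan^{-1} x\right]_{-1}^{-1/\sqrt3} + \dfrac{1}{\pi}\left[\tan^{-1} x\right]_{1/\sqrt3}^{1}$

$= \dfrac{1}{\pi}\left(-\dfrac{\pi}{6} + \dfrac{\pi}{4}\right) + \dfrac{1}{\pi}\left(\dfrac{\pi}{4} - \dfrac{\pi}{6}\right) = \dfrac16$.
(iii) The distribution function of X is given by

$F(x) = \dfrac{1}{\pi}\int_{-\infty}^{x} \dfrac{dt}{1 + t^2} = \dfrac{1}{\pi}\left[\tan^{-1} t\right]_{-\infty}^{x} = \dfrac{1}{\pi}\left(\tan^{-1} x + \dfrac{\pi}{2}\right)$.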
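The closed-form distribution function of Example 2 makes part (ii) easy to confirm with the standard library. The names `A` and `cdf` below are chosen for this sketch only.

```python
import math

A = 1 / math.pi  # normalising constant a found in part (i)

def cdf(x):
    """F(x) = (1/pi) * (arctan x + pi/2) for the density a / (1 + x^2)."""
    return A * (math.atan(x) + math.pi / 2)

r = 1 / math.sqrt(3)
# 1/3 <= X^2 <= 1 splits into the two intervals [-1, -1/sqrt(3)] and [1/sqrt(3), 1].
prob = (cdf(-r) - cdf(-1.0)) + (cdf(1.0) - cdf(r))
print(prob)  # close to 1/6
```

By symmetry of the density about 0 the two intervals contribute equally, 1/12 each.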

Example 3. A random variable X is exponentially distributed with p.d.f.

$f(x) = \dfrac{1}{40}e^{-x/40}$, x > 0.

Find (i) P(X ≤ 20) (ii) P(32 ≤ X ≤ 48) (iii) P(X ≥ 25)

(i) $P(X \le 20) = \int_0^{20} \dfrac{1}{40}e^{-x/40}\,dx = \left[-e^{-x/40}\right]_0^{20} = 1 - e^{-1/2}$

(ii) $P(32 \le X \le 48) = \int_{32}^{48} \dfrac{1}{40}e^{-x/40}\,dx = \left[-e^{-x/40}\right]_{32}^{48} = e^{-4/5} - e^{-6/5}$

(iii) $P(X \ge 25) = 1 - P(X < 25) = 1 - \int_0^{25} \dfrac{1}{40}e^{-x/40}\,dx = e^{-5/8}$
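The three probabilities of Example 3 can also be estimated by simulation, which gives an independent check on the integrals. This is a hedged sketch: the sample size, the seed and the helper `fraction` are arbitrary choices made here, and the estimates only agree with the exact values up to sampling error.

```python
import math
import random

random.seed(12345)  # fixed seed so that the run is reproducible
N = 200_000
# random.expovariate takes the rate, i.e. 1/mean = 1/40 for this density.
samples = [random.expovariate(1 / 40) for _ in range(N)]

def fraction(condition):
    """Proportion of simulated values satisfying the given condition."""
    return sum(1 for x in samples if condition(x)) / N

est_i = fraction(lambda x: x <= 20)
est_ii = fraction(lambda x: 32 <= x <= 48)
est_iii = fraction(lambda x: x >= 25)

exact_i = 1 - math.exp(-1 / 2)
exact_ii = math.exp(-4 / 5) - math.exp(-6 / 5)
exact_iii = math.exp(-5 / 8)
print(est_i, exact_i)
print(est_ii, exact_ii)
print(est_iii, exact_iii)
```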

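Example 4. If the random variable X has p.d.f. $f(x) = \frac14$, -2 ≤ x ≤ 2, find (i) P(X < 1) (ii) $P\left(|X - 1| \ge \frac12\right)$.
The p.d.f. of X is given by
$f(x) = \dfrac14$ in -2 ≤ x ≤ 2
$= 0$ elsewhere.

(i) $P(X < 1) = \int_{-\infty}^{1} f(x)\,dx = \int_{-\infty}^{-2} f(x)\,dx + \int_{-2}^{1} f(x)\,dx$

$= \int_{-\infty}^{-2} 0\,dx + \int_{-2}^{1} \dfrac14\,dx = \dfrac14(1 + 2) = \dfrac34$

(ii) Now $|X - 1| \ge \dfrac12 \Leftrightarrow \left(X - 1 \ge \dfrac12\right) \cup \left(X - 1 \le -\dfrac12\right)$

$\Leftrightarrow \left(\dfrac32 \le X < \infty\right) \cup \left(-\infty < X \le \dfrac12\right)$

∴ $P\left(|X - 1| \ge \dfrac12\right) = \int_{3/2}^{\infty} f(x)\,dx + \int_{-\infty}^{1/2} f(x)\,dx$

$= \int_{3/2}^{2} \dfrac14\,dx + 0 + 0 + \int_{-2}^{1/2} \dfrac14\,dx$

$= \dfrac14\left(2 - \dfrac32\right) + \dfrac14\left(\dfrac12 + 2\right) = \dfrac34$.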
Example 5. Consider the p.d.f. $f(x) = ae^{-b|x|}$, where x is a random variable whose allowable range of values is from x = -∞ to ∞. Find (i) the cumulative distribution function (ii) the relation between a and b (iii) P(1 ≤ X ≤ 2).
(i) The cumulative distribution function is given by

$F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_{-\infty}^{x} ae^{bt}\,dt = \dfrac{a}{b}e^{bx}$, for $-\infty < x < 0$

$F(x) = \int_{-\infty}^{0} f(t)\,dt + \int_0^{x} f(t)\,dt$, for $0 \le x < \infty$

$= \int_{-\infty}^{0} ae^{bt}\,dt + \int_0^{x} ae^{-bt}\,dt = \dfrac{a}{b}\left[e^{bt}\right]_{-\infty}^{0} + \dfrac{a}{-b}\left[e^{-bt}\right]_0^{x} = \dfrac{a}{b}\left(2 - e^{-bx}\right)$

(ii) Since $F(\infty) = 1$, so $\dfrac{a}{b}(2 - 0) = 1$,

i.e., $\dfrac{2a}{b} = 1$ ∴ b = 2a.
(iii) Now P(1 ≤ X ≤ 2) = F(2) - F(1)
$= \dfrac{a}{b}\left(2 - e^{-2b} - 2 + e^{-b}\right) = \dfrac{e^{-b}}{2}\left(1 - e^{-b}\right)$.
Example 6. Show that a function which is |x| in (-1, 1) and zero elsewhere is a possible p.d.f. and find the corresponding distribution function. [W.B.U.T. 2013, 2006]
Let us denote the given function by f(x). Then
f(x) = |x| in -1 < x < 1
     = 0, elsewhere
i.e., f(x) = -x, -1 < x ≤ 0
     = x, 0 < x < 1
     = 0, elsewhere.
Clearly f(x) ≥ 0 everywhere.

Also, $\int_{-\infty}^{\infty} f(x)\,dx = \int_{-\infty}^{-1} f(x)\,dx + \int_{-1}^{0} f(x)\,dx + \int_0^1 f(x)\,dx + \int_1^{\infty} f(x)\,dx$

$= 0 + \int_{-1}^{0} (-x)\,dx + \int_0^1 x\,dx + 0 = \dfrac12 + \dfrac12 = 1$.

Hence f(x) is a possible probability density function.
Now the distribution function F(x) is given by

$F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_{-\infty}^{x} 0\,dt = 0$, for $-\infty < x \le -1$

$F(x) = \int_{-\infty}^{-1} f(t)\,dt + \int_{-1}^{x} f(t)\,dt = \int_{-\infty}^{-1} 0\,dt + \int_{-1}^{x} (-t)\,dt = \dfrac12\left(1 - x^2\right)$, for $-1 < x \le 0$

$F(x) = \int_{-\infty}^{-1} f(t)\,dt + \int_{-1}^{0} f(t)\,dt + \int_0^{x} f(t)\,dt = 0 + \int_{-1}^{0} (-t)\,dt + \int_0^{x} t\,dt = \dfrac12\left(1 + x^2\right)$, for $0 < x \le 1$

$F(x) = \int_{-\infty}^{-1} f(t)\,dt + \int_{-1}^{0} f(t)\,dt + \int_0^1 f(t)\,dt + \int_1^{x} f(t)\,dt = 0 + \int_{-1}^{0} (-t)\,dt + \int_0^1 t\,dt + 0 = \dfrac12 + \dfrac12 = 1$, for $1 < x < \infty$.
So the distribution function is given by

F(x) = 0, $-\infty < x \le -1$
     = $\dfrac12(1 - x^2)$, $-1 < x \le 0$
     = $\dfrac12(1 + x^2)$, $0 < x \le 1$
     = 1, $1 < x < \infty$
Example 7. X is a continuous random variable having p.d.f.

$f(x) = \dfrac{4x}{5}$, 0 < x ≤ 1
$= \dfrac25(3 - x)$, 1 < x ≤ 2
$= 0$, elsewhere.
Find the mean value of X.

Mean of X $= E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$

$= \int_{-\infty}^{0} x \cdot 0\,dx + \int_0^1 x \cdot \dfrac{4x}{5}\,dx + \int_1^2 x \cdot \dfrac25(3 - x)\,dx + \int_2^{\infty} x \cdot 0\,dx$

$= \dfrac45\int_0^1 x^2\,dx + \dfrac25\int_1^2 x(3 - x)\,dx = \dfrac{17}{15}$

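Example 8. Find the mean and variance of a continuous random variable having p.d.f.
f(x) = 1 - |1 - x|, 0 < x < 2
     = 0, elsewhere.
Find also $E(X^3)$.
The mean
$= E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \int_{-\infty}^{0} x \cdot 0\,dx + \int_0^2 x\{1 - |1 - x|\}\,dx + \int_2^{\infty} x \cdot 0\,dx$

$= \int_0^2 x\{1 - |1 - x|\}\,dx = \int_0^1 x\{1 - (1 - x)\}\,dx + \int_1^2 x\{1 + (1 - x)\}\,dx$

$= \int_0^1 x^2\,dx + \int_1^2 (2x - x^2)\,dx = \dfrac13 + \dfrac23 = 1$

$E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx = \int_0^2 x^2\{1 - |1 - x|\}\,dx$

$= \int_0^1 x^2\{1 - (1 - x)\}\,dx + \int_1^2 x^2\{1 + (1 - x)\}\,dx = \dfrac14 + \dfrac{11}{12} = \dfrac76$

∴ $\mathrm{Var}(X) = E(X^2) - \{E(X)\}^2 = \dfrac76 - 1 = \dfrac16$

$E(X^3) = \int_{-\infty}^{\infty} x^3 f(x)\,dx = \int_{-\infty}^{0} x^3 \cdot 0\,dx + \int_0^2 x^3\{1 - |1 - x|\}\,dx + \int_2^{\infty} x^3 \cdot 0\,dx$

$= \int_0^2 x^3\{1 - |1 - x|\}\,dx = \int_0^1 x^4\,dx + \int_1^2 (2x^3 - x^4)\,dx = \dfrac15 + \dfrac{13}{10} = \dfrac32$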

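Example 9. A variable X has the density function

$f(x) = \dfrac{x}{2}$, 0 ≤ x ≤ 1
$= \dfrac12$, 1 < x ≤ 2
$= \dfrac12(3 - x)$, 2 < x ≤ 3
Find the mean and variance of X. [W.B.U.Tech 2002]

Mean $= E(X) = \int_0^3 x f(x)\,dx = \int_0^1 x \cdot \dfrac{x}{2}\,dx + \int_1^2 x \cdot \dfrac12\,dx + \int_2^3 x \cdot \dfrac12(3 - x)\,dx$

$= \dfrac16(1 - 0) + \dfrac14(4 - 1) + \dfrac12\left(\dfrac{27}{2} - 9 - 6 + \dfrac83\right) = \dfrac32$

$E(X^2) = \int_0^3 x^2 f(x)\,dx = \int_0^1 x^2 \cdot \dfrac{x}{2}\,dx + \int_1^2 x^2 \cdot \dfrac12\,dx + \int_2^3 x^2 \cdot \dfrac12(3 - x)\,dx = \dfrac83$

∴ $\mathrm{Var}(X) = E(X^2) - \{E(X)\}^2 = \dfrac83 - \left(\dfrac32\right)^2 = \dfrac{5}{12}$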

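Example 10. The demand for a new product of a company is assumed to be a random variable with p.d.f.

$f(x) = \dfrac{x}{\lambda^2}\,e^{-x^2/(2\lambda^2)}$, x ≥ 0
$= 0$, x < 0
Find the mean and variance of this random variable and also find the probability that it will exceed $\lambda$.

Mean $= E(X) = \int_0^{\infty} x \cdot \dfrac{x}{\lambda^2}\,e^{-x^2/(2\lambda^2)}\,dx$

$= \lambda\sqrt2\int_0^{\infty} u^{\frac32 - 1}e^{-u}\,du$, by putting $\dfrac{x^2}{2\lambda^2} = u$

$= \lambda\sqrt2\,\Gamma\left(\dfrac32\right) = \lambda\sqrt2 \cdot \dfrac12\,\Gamma\left(\dfrac12\right) = \lambda\sqrt2 \cdot \dfrac12\sqrt{\pi} = \lambda\sqrt{\dfrac{\pi}{2}}$

(using the property $\Gamma(n + 1) = n\Gamma(n)$)

Now, $E(X^2) = \int_0^{\infty} x^2 \cdot \dfrac{x}{\lambda^2}\,e^{-x^2/(2\lambda^2)}\,dx$

$= 2\lambda^2\int_0^{\infty} u^{2 - 1}e^{-u}\,du$, by putting $\dfrac{x^2}{2\lambda^2} = u$

$= 2\lambda^2\,\Gamma(2) = 2\lambda^2$  [∵ $\Gamma(2) = 1!$]

∴ $\mathrm{Var}(X) = E(X^2) - \{E(X)\}^2 = 2\lambda^2 - \lambda^2\dfrac{\pi}{2} = \dfrac{(4 - \pi)\lambda^2}{2}$

Now, $P(X > \lambda) = 1 - P(X \le \lambda) = 1 - \int_0^{\lambda} \dfrac{x}{\lambda^2}\,e^{-x^2/(2\lambda^2)}\,dx$

$= 1 - \int_0^{1/2} e^{-u}\,du$, by putting $\dfrac{x^2}{2\lambda^2} = u$

$= 1 - \left(1 - e^{-1/2}\right) = e^{-1/2} = \dfrac{1}{\sqrt{e}}$.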
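Example 11. A random variable X has probability density

$f(x) = e^{-x}$, x ≥ 0
$= 0$, otherwise.

Find (a) E(X) (b) $E(X^2)$ (c) $E\{(X - 1)^2\}$ (d) $E\left(e^{2X/3}\right)$

(a) $E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \int_0^{\infty} xe^{-x}\,dx = \Gamma(2) = 1$

[∵ the Gamma function $\Gamma(n) = \int_0^{\infty} e^{-x}x^{n-1}\,dx$]

(b) $E(X^2) = \int_0^{\infty} x^2 e^{-x}\,dx = \Gamma(3) = 2! = 2$

(c) $E\{(X - 1)^2\} = \int_0^{\infty} (x - 1)^2 e^{-x}\,dx$

$= \int_0^{\infty} x^2 e^{-x}\,dx - 2\int_0^{\infty} xe^{-x}\,dx + \int_0^{\infty} e^{-x}\,dx = \Gamma(3) - 2\Gamma(2) + \Gamma(1)$

$= 2! - 2 \cdot 1 + 1 = 1$

(d) $E\left(e^{2X/3}\right) = \int_0^{\infty} e^{2x/3}e^{-x}\,dx = \lim_{U \to \infty} \int_0^{U} e^{-x/3}\,dx = \lim_{U \to \infty}\left(-3e^{-U/3} + 3\right) = 0 + 3 = 3$.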
Example 12. The distribution function of a random variable X is
$F(x) = cx^3$, 0 ≤ x < 3
$= 1$, x ≥ 3
$= 0$, x < 0
If P(X = 3) = 0, find (i) the constant c (ii) the density function
(iii) P(X > 1) (iv) P(1 < X ≤ 2) (v) P(3X + 2 < 8)
(i) The density function f(x) is given by $f(x) = F'(x)$
∴ $f(x) = 3cx^2$, 0 ≤ x < 3
$= 0$, elsewhere.

Also we have $\int_{-\infty}^{\infty} f(x)\,dx = 1$, i.e., $\int_0^3 3cx^2\,dx = 1$,

i.e., 27c = 1 ∴ $c = \dfrac{1}{27}$.

(ii) The required density function is given by

$f(x) = \dfrac{x^2}{9}$, 0 ≤ x < 3
$= 0$, elsewhere.

(iii) $P(X > 1) = \int_1^{\infty} f(x)\,dx = \dfrac19\int_1^3 x^2\,dx = \dfrac{26}{27}$

(iv) $P(1 < X \le 2) = \int_1^2 f(x)\,dx = \dfrac19\int_1^2 x^2\,dx = \dfrac{7}{27}$

(v) $P(3X + 2 < 8) = P(3X < 6) = P(X < 2)$

$= \int_{-\infty}^{2} f(x)\,dx = \int_{-\infty}^{0} 0\,dx + \int_0^2 \dfrac{x^2}{9}\,dx = \dfrac{8}{27}$.
Example 13. Verify that the following is a distribution function:

$F(x) = 0$, x < -a
$= \dfrac12\left(\dfrac{x}{a} + 1\right)$, -a ≤ x ≤ a
$= 1$, x > a
In order that F(x) may be a distribution function we should have
(i) $F(-\infty) = 0$ (ii) $F(\infty) = 1$

(iii) $\dfrac{dF(x)}{dx} = f(x) \ge 0$, -a ≤ x ≤ a (iv) $\int_{-\infty}^{\infty} f(x)\,dx = 1$.

As F(x) = 0 for x < -a and F(x) = 1 for x > a, the conditions (i) and (ii) are satisfied.
Now $\dfrac{dF(x)}{dx} = \dfrac{1}{2a} \ge 0$ for -a ≤ x ≤ a.

Again $\int_{-\infty}^{\infty} f(x)\,dx = \int_{-\infty}^{-a} 0\,dx + \int_{-a}^{a} \dfrac{1}{2a}\,dx + \int_a^{\infty} 0\,dx = \left[\dfrac{x}{2a}\right]_{-a}^{a} = 1$

Hence F(x) is a distribution function.

Example 14. A continuous cumulative distribution function F(x) is defined as follows:
$F(x) = 0$, x ≤ 1
$= \dfrac{1}{16}(x - 1)^4$, 1 < x ≤ 3
$= 1$, x > 3
Find the probability density function f(x). Also find the mean of X.
The p.d.f. f(x) is given by $F'(x) = f(x)$,

i.e., $f(x) = \dfrac14(x - 1)^3$, 1 ≤ x ≤ 3
$= 0$, elsewhere.

∴ mean of X $= \int_1^3 x \cdot \dfrac14(x - 1)^3\,dx = \dfrac14\int_1^3 (x^4 - 3x^3 + 3x^2 - x)\,dx$

$= \dfrac14\left[\dfrac{x^5}{5} - \dfrac{3x^4}{4} + x^3 - \dfrac{x^2}{2}\right]_1^3 = 2.6$.
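The final integral of Example 14 is easy to mis-evaluate by hand, so an exact check is sketched here. The helper name `antiderivative` is an assumption made for this illustration.

```python
from fractions import Fraction

def antiderivative(x):
    """Antiderivative of x^4 - 3x^3 + 3x^2 - x, i.e. of x * (x - 1)^3."""
    x = Fraction(x)
    return x**5 / 5 - 3 * x**4 / 4 + x**3 - x**2 / 2

mean = Fraction(1, 4) * (antiderivative(3) - antiderivative(1))
print(mean, float(mean))  # 13/5 2.6
```

The bracket evaluates to 52/5 between the limits 1 and 3, and one quarter of that is 13/5.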

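Example 15. Find the value of the constant k such that

f(x) = kx(1 - x), 0 < x ≤ 1
     = 0, elsewhere

is a possible density function and compute $P\left(X > \frac12\right)$. Also find E(X).

Since f(x) is a p.d.f., $\int_{-\infty}^{\infty} f(x)\,dx = 1$,

i.e., $k\int_0^1 x(1 - x)\,dx = 1$,

i.e., $k\left[\dfrac{x^2}{2} - \dfrac{x^3}{3}\right]_0^1 = 1$,

i.e., $k\left(\dfrac12 - \dfrac13\right) = 1$

∴ k = 6.
∴ f(x) = 6x(1 - x), 0 < x ≤ 1
     = 0, elsewhere.

∴ $P\left(X > \dfrac12\right) = 1 - P\left(X \le \dfrac12\right) = 1 - \int_0^{1/2} f(x)\,dx = 1 - 6\int_0^{1/2} x(1 - x)\,dx$

$= 1 - 6\left[\dfrac{x^2}{2} - \dfrac{x^3}{3}\right]_0^{1/2} = 1 - \dfrac12 = \dfrac12$.

Now, $E(X) = 6\int_0^1 x^2(1 - x)\,dx = 6\left[\dfrac{x^3}{3} - \dfrac{x^4}{4}\right]_0^1 = 6\left(\dfrac13 - \dfrac14\right) = \dfrac12$.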
Exercise
[I] Short Answer Questions
1. Find the mean and S.D. of the following distribution:
$f(x) = \dfrac14 e^{-x/4}$, x > 0
$= 0$, elsewhere.

[Hint: mean $= \dfrac14\int_0^{\infty} xe^{-x/4}\,dx = 4$]

2. Let
f(x) = x, 0 < x ≤ 1
     = 2 - x, 1 ≤ x ≤ 2
     = 0, elsewhere.

Is this function a p.d.f. of a random variable X?

[Hint: Yes, as f(x) ≥ 0 and $\int_{-\infty}^{\infty} f(x)\,dx = 1$]

3. Is the function f(x) defined as

f(x) = 0, x < 2
     = $\dfrac{1}{18}(3 + 2x)$, 2 ≤ x ≤ 4
     = 0, x > 4
a density function?

4. A random variable Xhas the p.d.f

f(x) = Axe-A2x2 ,x >0


Find A

[Hints: Use the result J f(x)dx


00

= 1].
5. Verify that the function
$f(x) = \dfrac{1}{\lambda}e^{-x/\lambda}$, 0 < x < ∞
$= 0$, elsewhere
is a possible probability density function. Find P(2 < X < 6).
6. (a) If a random variable X has the density function
$f(x) = \dfrac16$, -3 < x < 3
$= 0$, elsewhere,
obtain (i) P(X < 1), (ii) P(|X| > 1), (iii) P(2X + 3 > 5).
(b) A random variable X has the following pdf

f(x) = k, -2 < x < 2
     = 0, otherwise.
(i) Determine the constant k. (ii) What is the value of P(|X| > 1)?
[W.B.U.Tech 2004]
(c) Obtain E(X), Var(X) of X whose pdf is given by
f(x) = 2x, 0 ≤ x ≤ 1
     = 0, otherwise.
Hence find Var(3 - 5X).
7. Is the following a p.d.f.?

f(x) = 2x, 0 < x ≤ 1
     = 4 - 2x, 1 < x ≤ 2
     = 0, elsewhere.
8. If the random variable X has the p.d.f.
$f(x) = \dfrac14$, -2 ≤ x ≤ 2
$= 0$, elsewhere,
obtain P(2X + 3 > 5).

9. Show that the function f(x) given by

f(x) = k(x - 9)(10 - x), 9 ≤ x ≤ 10
     = 0, elsewhere,
is a pdf for a suitable value of the constant k. Find the mean of the distribution.
10. Find the mean and s.d. of the continuous r.v. having p.d.f.
$f(x) = \dfrac12 - kx$, 0 < x < 4
$= 0$, elsewhere.
11. Show that the function $f(x) = \dfrac{x^n}{n!}e^{-x}$, x > 0, n being a positive integer, is a pdf. Find also its mean and variance.

12. The probability density of a continuous distribution is given by $f(x) = \dfrac34 x(2 - x)$, 0 < x < 2. Compute the mean and variance.
[W.B.U.Tech 2004, 2007, 2008]
13. Let $f(x) = ke^{-ax}(1 - e^{-ax})$, x ≥ 0. Find k such that f(x) is a density function. Find also the corresponding distribution function.
[W.B.U.Tech 2004]
Answers

1. 4, 4   2. Yes   3. Yes

6. (a) (i) 2/3 (ii) 2/3 (iii) 1/3 (c) 2/3, 1/18, 25/18   7. Not a p.d.f.

8. 1/4   9. k = 6, mean = 9.5   10. 4/3, $\sqrt8/3$

11. n + 1, n + 1   12. 1, 1/5   13. 2a, $1 - 2e^{-ax} + e^{-2ax}$

[II] Long Answer Questions

1. If $f(x) = kxe^{-x}$, 0 ≤ x < ∞, is a continuous distribution, find k. Also evaluate P(2 ≤ X ≤ 3) and P(X ≤ 1).

2. A continuous random variable X has probability density function f(x) as follows:
f(x) = kx, 0 ≤ x < 5
     = k(10 - x), 5 ≤ x < 10
     = 0, otherwise.
Determine k, P(X < 6) and E(X).
3. The lifetime (in hours) of a certain machine is a continuous random variable X having range 0 < x < ∞ and p.d.f.
$f(x) = xe^{-kx}$, 0 < x < ∞
$= 0$, otherwise.
(a) Determine the constant k. (b) Compute P(20 < X < 30).
(c) Determine the distribution function.
4. A continuous r.v. has the p.d.f.

$f(x) = \dfrac12$, 4 ≤ x ≤ 6
$= 0$, elsewhere.
Find (i) P(4 ≤ X ≤ 5) (ii) P(X < 3.2) (iii) P(X < 4.2)
(iv) P(3 ≤ X ≤ 4.5) (v) P(X > 5.5) (vi) the c.d.f. (cumulative distribution function, i.e. distribution function) (vii) k such that P(X < k) = 0.7
5. The p.d.f. of a random variable X is
$f(x) = cx^2$, 0 ≤ x ≤ 1
$= 0$, otherwise.

Find (i) c (ii) $P\left(0 \le X \le \frac12\right)$ (iii) P(4X > 3) (iv) P(1 < 4X < 3)

(v) the c.d.f.

6. $f(x) = \dfrac12 - kx$, 0 ≤ x ≤ 4, is the p.d.f. of a r.v. X. Determine (i) k (ii) the probability that X assumes a value between 2 and 3 (iii) the c.d.f. (iv) P(X ≤ 1) (v) P(X ≥ 2.5) (vi) P(|X - 2| < 0.5)
7. The length of life of a tyre manufactured by a company follows a continuous distribution with density function
$f(x) = \dfrac{k}{x^3}$, 1000 ≤ x ≤ 1500.
Find k and the probability that a tyre selected at random would function for at least 1200 hours.
8. The p.d.f. of a r.v. is f(x) = k(x - 1)(2 - x), 1 ≤ x ≤ 2. Find (i) k (ii) the c.d.f. (iii) the probability that X is less than $\frac54$ (iv) the probability that 2X is greater than 3 (v) P(5 < 4X < 6).

9. Show that
f(x) = x, 0 ≤ x < 1
     = k - x, 1 ≤ x ≤ 2
     = 0, elsewhere
is a pdf of a random variable X for a suitable value of k, which you are to determine. Then find the distribution function of the random variable X. Calculate the probability that the random variable lies between $\frac12$ and $\frac32$. [W.B.U.Tech 2005, 2006]
10. The pdf of a continuous r.v. is given by

f(x) = -cx + 3c, 2 ≤ x ≤ 3
     = c, 1 < x < 2
     = cx, 0 ≤ x ≤ 1
     = 0, elsewhere.
Find c and the distribution function.
11. The radius of a circle has a distribution given by the pdf
f(x) = 1, 1 < x < 2
     = 0, elsewhere.
Find the mean and variance of the area of the circle.

12. A continuous distribution of a variate X is defined by

$f(x) = \dfrac{x}{2}$, 0 < x ≤ 1
$= \dfrac12$, 1 ≤ x ≤ 2
$= \dfrac12(3 - x)$, 2 ≤ x ≤ 3
(i) Verify that the area under the curve is unity.
(ii) Find the mean and variance of X.
13. If the pdf of a r.v. X is given by
$f(x) = ce^{-(x^2 + 2x + 3)}$, $-\infty < x < \infty$,
find the value of the constant c and the expectation and variance of the distribution.
14. (a) The demand for a new product of a company is assumed to be a random variable with p.d.f.
$f(x) = xe^{-x^2/2}$, x ≥ 0
$= 0$, x < 0
Find the mean and variance of this random variable and find the probability that it will exceed 1.
(b) The length of life of a tyre manufactured by a company follows a continuous distribution given by the density function
$f(x) = \dfrac{k}{x^3}$, 1000 ≤ x ≤ 1500
$= 0$, elsewhere.
Find k and find the probability that a randomly selected tyre would function for at least 1200 hours. [W.B.U.Tech 2005]
15. A continuous distribution is given by
$f(x) = \dfrac{1}{x\sqrt{2\pi}}\,e^{-(\log x)^2/2}$, x > 0
$= 0$, x < 0
Find the mean and standard deviation of the distribution.

oo 2 J;.
[Assume
o
f e-X dx = 2]
16. The p.d.f. of a continuous r.v. is given by
$f(x) = K e^{-b(x-a)}$, $a \le x < \infty$
$= 0$, elsewhere
where K, a, b are constants. Prove that $K = b = \frac{1}{\sigma}$ and $a = m - \sigma$, where m is the mean and $\sigma$ is the s.d. [W.B.U.Tech 2003]


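17. The p.d.f. of a continuous r.v. X is given by
$$f(x) = \begin{cases} \dfrac{2(\beta + x)}{\beta(\alpha + \beta)}, & -\beta \le x < 0 \\[2mm] \dfrac{2(\alpha - x)}{\alpha(\alpha + \beta)}, & 0 \le x \le \alpha \end{cases}$$
Find the expectation and variance of X, given $\alpha > \beta > 0$.
[W.B.U.Tech 2003]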
18. The p.d.f. of a continuous random variable is given by
$f(x) = kx + \frac{1}{2}$, $-1 < x < 1$
$= 0$, elsewhere
where k is a constant. Find k for which the variance of X is maximum.
[Hint: $\mathrm{Var}(X) = \frac{4}{9}\left(\frac{3}{4} - k^2\right)$]
9 4

Answers
3e-4 e-2 1 17 5
1.1'--3-'-- 2. 25 '25'
e e

3. (a) k = 1, (b) 21 _ 31 , (c) l-(l+x)e-X


e20 e30

4. (i) 0.5 (ii) 0 (iii) 0.1 (iv) 0.25



x-4
(v) 0.25 (vi) -2- (vii) 5.4

1 3
S. (i) 3, (ii) 8' (iii) 37/64, (iv) 13/32, (v) x

.. 3 ... x(S - x) . 7 9
1
1 (vi) -
6. (i)8 (ii) 16 (iii) 16 (iv) 16 (v) 64 4

7. $k = 36 \times 10^5$; $\frac{9}{20}$
8. (i) 6 (ii) $5 + 9x^2 - 12x - 2x^3$ (iii) $\frac{5}{32}$ (iv) $\frac{1}{2}$ (v) $\frac{11}{32}$

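9. $k = 2$,
$$F(x) = \begin{cases} 0, & -\infty < x < 0 \\ \dfrac{x^2}{2}, & 0 \le x < 1 \\ 2x - \dfrac{x^2}{2} - 1, & 1 \le x \le 2 \\ 1, & x > 2 \end{cases}$$
10. $c = \frac{1}{2}$,
$$F(x) = \begin{cases} 0, & x < 0 \\ \dfrac{x^2}{4}, & 0 \le x \le 1 \\ \dfrac{2x - 1}{4}, & 1 < x \le 2 \\ \dfrac{6x - x^2 - 5}{4}, & 2 < x \le 3 \\ 1, & x > 3 \end{cases}$$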

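11. $\frac{7\pi}{3}$, $\frac{34\pi^2}{45}$

12. (ii) 1.5, 0.42    13. $c = \frac{e^2}{\sqrt{\pi}}$, $-1$, $\frac{1}{2}$

14. (a) $\sqrt{\frac{\pi}{2}}$, $\frac{4 - \pi}{2}$, $\frac{1}{\sqrt{e}}$ (b) 0.45

15. $\sqrt{e}$, $\sqrt{e^2 - e}$

17. $\frac{1}{3}(\alpha - \beta)$, $\frac{1}{18}(\alpha^2 + \alpha\beta + \beta^2)$

18. 0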

(III] Multiple Choice Questions


1. If f(x) is a p.d.f. of a random variable X, then
(a) $\int_{0}^{\infty} f(x)\,dx = 1$ (b) $\int_{-\infty}^{\infty} f(x)\,dx = 1$
(c) $\int f(x)\,dx = 1$ (d) none.

2. The probability $P(a \le X \le b)$ is defined by (where F(x) is the distribution function of the random variable X)
(a) $F(b) - F(a)$ (b) $F(b) + F(a)$
(c) $F(a) - F(b)$ (d) $F(a)F(b)$. [W.B.U.Tech 2006]

3. The distribution function F(x) of a random variable X is given by (where $-\infty < x < \infty$)
(a) $P(-\infty < X < x)$ (b) $P(-\infty < X \le x)$
(c) $P(-\infty < X < \infty)$ (d) none.


4. The p.d.f. of a random variable X is
$f(x) = k(2x - 1)$, $0 \le x \le 2$.
The value of the constant k is
(a) 1 (b) $\frac{1}{2}$
(c) $\frac{1}{3}$ (d) $\frac{2}{3}$.
5. A random variable X has the following p.d.f.
$f(x) = e^{-x}$, $x \ge 0$
$= 0$, otherwise.
Then the value of E(X) is
(a) 0 (b) 1
(c) 2 (d) $-1$ [W.B.U.Tech 2007]

6. The distribution function of a random variable X is given by:
$F(x) = 0$, for $x < 2$
$= 3 - x^2$, for $2 \le x \le 5$
$= 1$, for $x > 5$.
Then $P(3 < X \le 4) = ?$
(a) $-7$ (b) 7
(c) $-13$ (d) 6
7. A random variable X has the following p.d.f.:
$f(x) = k$, $-2 < x < 2$
$= 0$, otherwise.
Then the value of the constant k is
(a) $\frac{1}{8}$ (b) $\frac{1}{2}$
(c) $\frac{1}{4}$ (d) $\frac{1}{12}$ [W.B.U.Tech 2007]

8. The distribution function F(x) of a variate X is defined as follows:
$F(x) = 0$, $-\infty < x < 1$
$= \frac{2}{3}$, $1 \le x \le 5$
$= A$, $5 \le x < \infty$.
Then the constant A is equal to
(a) 1 (b) $-1$
(c) $\frac{2}{3}$ (d) $\frac{1}{3}$.
9. Let X be a continuous random variable with distribution
$$f(x) = \begin{cases} \frac{1}{8}, & 0 \le x \le 8 \\ 0, & \text{elsewhere} \end{cases}$$
Then $P(X \le 6) =$
(a) $\frac{1}{4}$ (b) $\frac{3}{4}$
(c) $\frac{7}{8}$ (d) $\frac{1}{2}$.
10. The following distribution is a p.d.f. of a random variable X:
$f(x) = |x|$, $-1 \le x \le 1$
$= 0$, elsewhere.
The statement is
(a) True (b) False.

11. If $f(x) = 1 - \frac{1}{2}x$, $0 \le x \le 2$ is a p.d.f. of a random variable X, then $P(|X - 1| < 2) =$
(a) $-1$ (b) 1
(c) 0 (d) $\frac{1}{3}$.
12. The distribution function of a random variable X is
$F(x) = 0$, $x < 0$
$= x^2 + 3$, $0 \le x < 1$
$= 1$, $x \ge 1$.
Then the corresponding p.d.f. of X is
(a) $\frac{x^3}{3} + 3x$, $0 \le x < 1$; 0, elsewhere.
(b) $2x$, $0 \le x < 1$; 0, elsewhere.
(c) $2x$, $0 < x \le 1$; 0, elsewhere.
(d) $2x$, $-\infty < x < \infty$.

13. Which one of the following is not true for a distribution function F(x)?
(a) $F(\infty) = 1$
(b) $\lim_{x \to -\infty} F(x) = 0$
(c) F(x) is a strictly increasing function.
(d) $0 \le F(x) \le 1$.
14. If X is a continuous random variable and F(x) is its distribution function, then find which one of the following is not true:
(a) $P(X = 2) = 0$
(b) $P(2 \le X \le 3) = F(3) - F(2)$
(c) $P(2 < X \le 3) = F(3) - F(2)$
(d) $\frac{d}{dx}\{\text{its p.d.f.}\} = F(x)$.


15. If X is a discrete random variable then
(a) $E(|X|) \le |E(X)|$ (b) $E(|X|) \ge |E(X)|$
(c) $E(|X|) = |E(X)|$ (d) none of these.

16. For two random variables X and Y, $E(XY) = E(X)E(Y)$ holds if X and Y are
(a) independent (b) uncorrelated
(c) continuous variates (d) none of these.
17. If X is a random variable then the standard deviation of $3X + 10$ is
(a) $3\sigma_X$
(c) $3\sigma_X + 10$

18. If X is a random variable and its s.d. $\sigma_X = 0$, then X is
(a) discrete (b) continuous
(c) constant (d) 0.
19. If the mean of X is 3 and Var (X)=E(X2)-E(X)-K, then K =

(a) 6 (b) 5
(c) 36 (d)-6

20. If $f(x) = \frac{1}{2}x$, $0 \le x \le 2$ and $f(x) = 0$ elsewhere, then the expectation of $X^2$ is
(a) 4 (b) 2
(c) 6 (d) 1
21. The p.d.f. of a random variable is
$f(x) = \frac{3}{4}x(2 - x)$, $0 \le x \le 2$
$= 0$, otherwise.
Then F(1) (where F(x) is the distribution function) is
(a) $\frac{1}{3}$ (b) $\frac{1}{4}$
(c) $\frac{1}{2}$ (d) 1
22. If X is a random variable which assumes negative values only and if $E(|X|) = 2$, then $E(X) =$


(a) 2 (b) 0
(c) -2 (d) none of these.

23. If X and Y are both negative random variables, mutually independent, $E(XY) = 6$ and $E(|X|) = 2$, then $E(Y) =$
(a) $-1$ (b) $-2$
(c) $-3$ (d) $-4$

24. If $P(X = i) = p$ and $P(Y = j) = q$ $(i, j = 1, 2)$, and X and Y are two mutually independent random variables, then $E(XY) =$
(a) $\frac{1}{4}$ (b) $\frac{7}{4}$
(c) $\frac{9}{4}$ (d) none of these.
25. A random variable X has the p.d.f.
$f(x) = \frac{1}{4}$, $-2 < x < 2$
$= 0$, otherwise.
Then $P(|X| > 1) =$
(a) $\frac{1}{2}$ (b) $\frac{1}{4}$
(c) 3 (d) none of these.

26. The p.d.f. of a continuous random variable X is
$f(x) = 6x(1 - x)$, $0 \le x \le 1$
$= 0$, otherwise.
Then the mean of X is
(a) $\frac{1}{3}$ (b) $\frac{1}{2}$
(c) $\frac{1}{7}$ (d) $\frac{1}{9}$.
Answers
1.b 2.a 3.b 4.b 5.b 6.a 7.c 8.a 9.b

10.a 11.b 12.b 13.c 14.d 15.b 16.a 17.a 18.c

19.a 20.b 21.c 22.c 23.c 24.c 25.a 26.b


7
SPECIAL TYPE OF CONTINUOUS DISTRIBUTION

7.1. Introduction.
In an earlier chapter we introduced some special types of discrete distribution. In this chapter we are similarly going to introduce some special types of continuous distribution. Three such types of distribution will be discussed in this chapter.
7.2. Normal Distribution.
A continuous random variable X is said to have a normal distribution if its probability density function is given by
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty$$
where $\mu$ and $\sigma > 0$ are the two parameters of the distribution.


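Note. (1) In this case we say X is a normal variate with parameters $\mu$ and $\sigma$, and its p.d.f. f(x) is called the Normal Probability Density.
(2) Clearly $f(x) \ge 0$ for all x. Moreover,
$$\int_{-\infty}^{\infty} f(x)\,dx = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-z^2}\,dz, \quad \text{by putting } z = \frac{x-\mu}{\sigma\sqrt{2}}$$
$$= \frac{2}{\sqrt{\pi}} \int_{0}^{\infty} e^{-z^2}\,dz = \frac{1}{\sqrt{\pi}} \int_{0}^{\infty} e^{-u}\, u^{-\frac{1}{2}}\,du, \quad \text{by putting } z^2 = u$$
$$= \frac{1}{\sqrt{\pi}}\,\Gamma\!\left(\tfrac{1}{2}\right) = \frac{1}{\sqrt{\pi}} \cdot \sqrt{\pi} = 1.$$
So the two necessary conditions for the probability density function are satisfied.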
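The normalisation shown above can also be checked numerically. The following is a minimal sketch, not part of the original text: it integrates the normal density by the composite Simpson rule over a wide finite interval. The helper names `normal_pdf` and `simpson` and the parameter values are illustrative assumptions.

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal probability density with parameters mu and sigma > 0."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n is forced to be even."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

# Illustrative parameter values (assumed, any mu and sigma > 0 would do).
mu, sigma = 300.0, 10.0
# The tails beyond 10 standard deviations are negligible, so a finite range suffices.
area = simpson(lambda x: normal_pdf(x, mu, sigma), mu - 10 * sigma, mu + 10 * sigma)
print(area)  # a value very close to 1
```

The finite range replaces the infinite limits of the integral, which is an approximation in its own right.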
(3) The significance of the parameters $\mu$ and $\sigma$ is given in the next theorem.
(4) The distribution function F(x) is given by
$$F(x) = \int_{-\infty}^{x} f(u)\,du = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{(u-\mu)^2}{2\sigma^2}}\,du.$$
Normal Density Curve


The graph of the p.d.f. of a normal variate is called the normal density curve. The curve is shown in the following figure for different values of $\sigma$.
[Figure: normal density curves f(x) against x for different values of $\sigma$]
The normal curve is bell shaped and symmetric about the ordinate $x = \mu$. For small values of $\sigma$ the curve has a sharp peak, and as $\sigma$ increases the normal curve tends to be flatter.
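Theorem 1. If X has normal distribution with parameters $\mu$ and $\sigma$ then
(i) the mean of X is $\mu$ (ii) the s.d. of X is $\sigma$.
Proof. Here the p.d.f. of X is given by
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty$$
(i) Mean
$$E(X) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} x\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx$$
$$= \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} (x-\mu)\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx + \frac{\mu}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx$$
$$= \frac{\sigma\sqrt{2}}{\sqrt{\pi}} \int_{-\infty}^{\infty} u\, e^{-u^2}\,du + \frac{\mu}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-u^2}\,du, \quad \text{by putting } u = \frac{x-\mu}{\sigma\sqrt{2}}$$
$$= 0 + \frac{2\mu}{\sqrt{\pi}} \int_{0}^{\infty} e^{-u^2}\,du \quad [\because \text{the integrand of the 1st integral is odd and that of the 2nd integral is even}]$$
$$= \frac{\mu}{\sqrt{\pi}} \int_{0}^{\infty} z^{\frac{1}{2}-1} e^{-z}\,dz, \quad \text{by putting } u^2 = z$$
$$= \frac{\mu}{\sqrt{\pi}}\,\Gamma\!\left(\tfrac{1}{2}\right) = \frac{\mu}{\sqrt{\pi}} \cdot \sqrt{\pi} = \mu.$$
(ii) Now,
$$\mathrm{Var}(X) = E\{(X-\mu)^2\} = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} (x-\mu)^2 e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx$$
$$= \frac{2\sigma^2}{\sqrt{\pi}} \int_{-\infty}^{\infty} z^2 e^{-z^2}\,dz, \quad \text{by putting } \frac{x-\mu}{\sigma\sqrt{2}} = z$$
$$= \frac{4\sigma^2}{\sqrt{\pi}} \int_{0}^{\infty} z^2 e^{-z^2}\,dz = \frac{2\sigma^2}{\sqrt{\pi}} \int_{0}^{\infty} t^{\frac{3}{2}-1} e^{-t}\,dt, \quad \text{by putting } z^2 = t$$
$$= \frac{2\sigma^2}{\sqrt{\pi}}\,\Gamma\!\left(\tfrac{3}{2}\right) = \frac{2\sigma^2}{\sqrt{\pi}} \cdot \frac{1}{2}\,\Gamma\!\left(\tfrac{1}{2}\right) \quad [\because \Gamma(n+1) = n\Gamma(n)]$$
$$= \frac{\sigma^2}{\sqrt{\pi}} \cdot \sqrt{\pi} = \sigma^2.$$
$\therefore$ the s.d. of X is $\sigma$.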


A Case where Normal Distribution Fits.
Suppose a thermal plant supplies electric power to a certain city. Let X = amount of electric power (in MW) supplied by the plant in a day. Clearly X varies from day to day. It can be assumed that X is a continuous variate. X may then be taken to have normal distribution with parameters $\mu$ = average power supply per day and $\sigma$ = standard deviation of all the values assumed by X.

Standard Normal Distribution.


The normal distribution with mean 0 and standard deviation 1 is called the standard normal distribution. A random variable X having the standard normal distribution is called a standard normal variate, written $X \sim N(0, 1)$.
Thus the p.d.f. of the standard normal distribution is given by
$$\phi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}}, \quad -\infty < x < \infty$$
and the corresponding distribution function is given by
$$\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{u^2}{2}}\,du.$$

Standard Normal Curve.


The graph of the p.d.f. of a standard normal distribution is called the standard normal curve. This is shown in the following figure.
[Figure: standard normal curve $\phi(x)$ against x, centred at 0]
It is bell shaped and symmetric about the y-axis. It has its maximum value at $x = 0$. Since $\int_{-\infty}^{\infty} \phi(x)\,dx = 1$, the area under this curve is 1. The x-axis is its asymptote.
Standard Normal Distribution Curve.
The graph of the distribution function $\Phi(x)$ is shown in the following figure.
[Figure: graph of $\Phi(x)$ against x]

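Theorem 2. (An Important Result)
If the continuous random variable X has normal distribution with parameters $\mu$ and $\sigma$, then $Z = \frac{X - \mu}{\sigma}$ has standard normal distribution.
Proof. That Z has a normal distribution is shown in the next chapter. Here we only verify its mean and variance. The mean of Z is
$$E(Z) = E\left(\frac{X - \mu}{\sigma}\right) = \frac{1}{\sigma}\{E(X) - E(\mu)\} = \frac{1}{\sigma}\{\mu - \mu\} = 0$$
and
$$\mathrm{Var}(Z) = \mathrm{Var}\left(\frac{X - \mu}{\sigma}\right) = \frac{1}{\sigma^2}\,\mathrm{Var}(X - \mu) = \frac{1}{\sigma^2}\,\mathrm{Var}(X) = \frac{1}{\sigma^2} \cdot \sigma^2 = 1,$$
from the properties of variance (subtracting the constant $\mu$ does not change the variance).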
Tabulation of the Standard Normal Distribution.
Let the random variable X have standard normal distribution. Then its distribution function is
$$\Phi(x) = P(-\infty < X \le x) = \int_{-\infty}^{x} \phi(z)\,dz = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{z^2}{2}}\,dz$$
$$\therefore P(X \le a) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{a} e^{-\frac{z^2}{2}}\,dz \quad \text{and} \quad P(a \le X \le b) = \frac{1}{\sqrt{2\pi}} \int_{a}^{b} e^{-\frac{z^2}{2}}\,dz.$$
These integrals cannot be evaluated by using the fundamental theorem of integral calculus, because the integrand has no antiderivative in terms of elementary functions. However, a method of numerical integration can be used to evaluate them.

The values of these probabilities for different values of a have been tabulated in Table 1 at the end of the book. From this we can find $\Phi(x)$ for different values of x.
Since $P(a \le X \le b) = \Phi(b) - \Phi(a)$, if X has distribution N(0, 1) then from the tabulated values of $\Phi$ we can evaluate the probabilities associated with X.
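As a complement to the printed table, the distribution function of the standard normal variate can be computed from the error function, using the identity $\Phi(x) = \frac{1}{2}\left[1 + \mathrm{erf}\left(x/\sqrt{2}\right)\right]$. The sketch below is an addition to the text; the function names `std_normal_cdf` and `prob_between` are illustrative.

```python
import math

def std_normal_cdf(x):
    """Phi(x) = P(Z <= x) for a standard normal variate Z, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def prob_between(a, b):
    """P(a <= Z <= b) = Phi(b) - Phi(a)."""
    return std_normal_cdf(b) - std_normal_cdf(a)

# Print a few values in the style of a statistical table, rounded to 4 places.
for x in (0.0, 0.5, 1.0, 2.0):
    print(x, round(std_normal_cdf(x), 4))
```

Values obtained this way may differ from a four-figure table in the last digit because of rounding.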

An Illustrative Example
Suppose a thermal plant supplies electric power to a certain city. Let X = amount of electric power (in MW) supplied by the plant in a day. Clearly X varies from day to day.
It can be assumed that X has normal distribution with parameters $\mu$ = average power supply per day and $\sigma$ = standard deviation of all the values assumed by X. Now if it is known that $\mu = 300$ and $\sigma = 10$, then by the previous theorem $Z = \frac{X - 300}{10}$ is a standard normal variate.
Suppose we are asked to find the proportion of days on which the power supplied will lie between 280 and 310 MW.
Then we find
$$P(280 < X < 310) = P\left(\frac{280 - 300}{10} < \frac{X - 300}{10} < \frac{310 - 300}{10}\right) = P(-2 < Z < 1)$$
$$= \int_{-2}^{1} \phi(z)\,dz \quad [\text{where } \phi \text{ is the p.d.f. of the standard normal distribution}]$$
= area enclosed by the standard normal curve, the x-axis and the ordinates $z = -2$ and $z = 1$
= 0.4772 + 0.3413 (obtained from the tabulated values)
= 0.8185.
So the probability of having a power supply between 280 and 310 MW in a day is 0.8185. Thus about 81.85% of days will receive a power supply between 280 and 310 MW.
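The same probability can be cross-checked without a table. The following minimal sketch, which is not part of the original text, uses `NormalDist` from Python's standard `statistics` module with the parameters 300 and 10 of the example; the variable names are illustrative.

```python
from statistics import NormalDist

# Daily power supply modelled as normal with mean 300 and s.d. 10 (values from the example).
power = NormalDist(mu=300, sigma=10)

# P(280 < X < 310) = F(310) - F(280)
p = power.cdf(310) - power.cdf(280)
print(round(p, 4))
```

The result agrees with the table-based value up to rounding in the fourth decimal place.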
7.3. Normal Approximation to Binomial Distribution.
Theorem. Let the random variable X follow the Binomial distribution with p.m.f.
$$f_i = \binom{n}{i} p^i (1 - p)^{n-i}, \quad i = 0, 1, 2, \dots, n$$
where n and p are parameters.
If $n \to \infty$ and p is not very small, then the distribution of the r.v.
$$Z = \frac{X - np}{\sqrt{np(1-p)}}$$
approaches the standard normal distribution.
Proof. Beyond the scope.
Note. (1) In light of the above theorem, X is approximately a normal variate with mean np and s.d. $\sqrt{npq}$, where $q = 1 - p$.
(2) The variable $Z = \frac{X - np}{\sqrt{np(1-p)}}$ is called the standardized Binomial variate.
Illustration. Let an unbiased coin be tossed 12 times, and let X = number of heads obtained. Then X has binomial distribution with parameters $n = 12$ and $p = \frac{1}{2} = 0.5$. So the probability masses are
$f_0 = {}^{12}C_0\,(0.5)^0 (0.5)^{12-0} = 0.000244$
$f_1 = {}^{12}C_1\,(0.5)^1 (0.5)^{12-1} = 0.00293$
$f_2 = {}^{12}C_2\,(0.5)^2 (0.5)^{12-2} = 0.01611$
and so on.
In the following diagram the heights of the rectangles are these probability masses.
[Figure: histogram of the probability masses $f_i$ for i = 0, 1, ..., 12 (vertical scale 0 to 0.20), with the fitted normal curve]
We regard n as large. So X is approximately a normal variate with parameters $\mu = np = 12 \times 0.5 = 6$ and
$$\sigma = \sqrt{np(1-p)} = \sqrt{12 \times 0.5 \times 0.5} = 1.73,$$
i.e., $Z = \frac{X - 6}{1.73}$ is approximately a standard normal variate. In the above figure the curve is the normal curve with parameters (6, 1.73). It approximately fits the heights of the rectangles.

(i) Now, suppose we seek $P(4 \le X \le 7)$. To find this from the normal distribution we must find $P(3.5 \le X \le 7.5)$, as indicated in the above diagram. When $X = 3.5$, $Z = \frac{3.5 - 6}{1.73} \approx -1.44$; when $X = 7.5$, $Z = \frac{7.5 - 6}{1.73} = 0.87$.
Then
$$P(4 \le X \le 7) \approx P(-1.44 \le Z \le 0.87) = \int_{-1.44}^{0.87} \phi(t)\,dt = \int_{-1.44}^{0} \phi(t)\,dt + \int_{0}^{0.87} \phi(t)\,dt$$
= 0.4251 + 0.3078 (obtained from the statistical Table 1)
= 0.7329.
(ii) Now, suppose we seek $P(X = 8)$.
Using the approximating normal distribution we find $P(7.5 \le X \le 8.5) = P(0.87 \le Z \le 1.44)$, which is obtained from the statistical table.
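To see how good the approximation is for n = 12, the exact binomial probability can be compared with the normal value obtained with the half-unit (continuity) correction used above. This is a minimal sketch added for illustration; the function names are hypothetical and `NormalDist` comes from Python's standard `statistics` module.

```python
import math
from statistics import NormalDist

def binom_pmf(n, p, i):
    """Probability mass f_i of the binomial distribution with parameters n and p."""
    return math.comb(n, i) * p**i * (1 - p)**(n - i)

def binom_range_exact(n, p, lo, hi):
    """Exact P(lo <= X <= hi) obtained by summing the probability masses."""
    return sum(binom_pmf(n, p, i) for i in range(lo, hi + 1))

def binom_range_normal(n, p, lo, hi):
    """Normal approximation to P(lo <= X <= hi) with the half-unit correction."""
    approx = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))
    return approx.cdf(hi + 0.5) - approx.cdf(lo - 0.5)

exact = binom_range_exact(12, 0.5, 4, 7)     # equals 3003/4096
approx = binom_range_normal(12, 0.5, 4, 7)
print(round(exact, 4), round(approx, 4))
```

The two values differ only in the third decimal place, even though n = 12 is not really large.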


7.4. Illustrative Examples.
Example 1. If X is normally distributed with zero mean and unit variance, find the expectation of $X^2$. [W.B.U.Tech 2002, 2007]
By the problem $E(X) = 0$, $\mathrm{Var}(X) = 1$, or $E(X^2) - \{E(X)\}^2 = 1$,
or $E(X^2) - 0^2 = 1$. $\therefore E(X^2) = 1$.

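Example 2. If X is normally distributed with mean 3 and s.d. 2, find c such that $P(X > c) = 2P(X \le c)$.
Given $\int_{-\infty}^{-0.43} \phi(t)\,dt = \frac{1}{3}$. [W.B.U.Tech 2007]

Let Z be the standard normal variate. Then $Z = \frac{X - 3}{2}$.
$$\therefore P(X > c) = P\left(\frac{X - 3}{2} > \frac{c - 3}{2}\right) = P\left(Z > \frac{c - 3}{2}\right) = 1 - P\left(Z \le \frac{c - 3}{2}\right) = 1 - \int_{-\infty}^{\frac{c-3}{2}} \phi(t)\,dt$$
Also
$$P(X \le c) = P\left(\frac{X - 3}{2} \le \frac{c - 3}{2}\right) = P\left(Z \le \frac{c - 3}{2}\right) = \int_{-\infty}^{\frac{c-3}{2}} \phi(t)\,dt$$
As $P(X > c) = 2P(X \le c)$, we have
$$1 - \int_{-\infty}^{\frac{c-3}{2}} \phi(t)\,dt = 2\int_{-\infty}^{\frac{c-3}{2}} \phi(t)\,dt \quad \text{or,} \quad \int_{-\infty}^{\frac{c-3}{2}} \phi(t)\,dt = \frac{1}{3}$$
or, $\frac{c - 3}{2} = -0.43$ (from the given data; the value is negative because the probability $\frac{1}{3}$ is less than $\frac{1}{2}$)

$\therefore c = 2.14$.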
Example 3. The length of bolts produced by a machine is normally distributed with mean 4 and s.d. 0.5. A bolt is defective if its length does not lie in the interval (3.8, 4.3). Find the percentage of defective bolts produced by the machine.
$$\left[\text{Given } \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{0.6} e^{-\frac{t^2}{2}}\,dt = 0.7257, \quad \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{0.4} e^{-\frac{t^2}{2}}\,dt = 0.6554\right]$$
[W.B.U.Tech 2004]

Let X = length of a bolt. By the problem X has normal distribution with mean $\mu = 4$ and s.d. $\sigma = 0.5$. First we shall find the probability $P(3.8 < X < 4.3)$.
Now, $Z = \frac{X - 4}{0.5}$ has standard normal distribution. When $X = 3.8$, $Z = \frac{3.8 - 4}{0.5} = -0.4$; when $X = 4.3$, $Z = \frac{4.3 - 4}{0.5} = 0.6$.
$\therefore P(3.8 < X < 4.3) = P(-0.4 < Z < 0.6)$
= area under the standard normal curve enclosed between the two ordinates $Z = -0.4$ and $Z = 0.6$
$$= \int_{-0.4}^{0.6} \phi(t)\,dt = \int_{-\infty}^{0.6} \phi(t)\,dt - \int_{-\infty}^{-0.4} \phi(t)\,dt = 0.7257 - \int_{0.4}^{\infty} \phi(t)\,dt \quad [\because \text{the } \phi \text{ curve is symmetric}]$$
$$= 0.7257 - \left(1 - \int_{-\infty}^{0.4} \phi(t)\,dt\right) = 0.7257 - (1 - 0.6554) = 0.3811$$
$\therefore$ Probability that the length of the bolt lies between 3.8 and 4.3 is 0.3811.
$\therefore$ Probability that the length of the bolt does not lie between 3.8 and 4.3 = 1 - 0.3811 = 0.6189.
$\therefore$ Probability that the bolt is defective = 0.6189.
$\therefore$ Percentage of defective bolts produced $= 0.6189 \times 100 = 61.89 \approx 62$.
Example 4. If the weekly wages of 10,000 workers in a factory follow normal distribution with mean and s.d. Rs. 70 and Rs. 5 respectively, find the expected number of workers whose weekly wages are (i) between Rs. 66 and Rs. 72, (ii) less than Rs. 66 and (iii) more than Rs. 72. [W.B.U.Tech 2006]
[Given that $\frac{1}{\sqrt{2\pi}} \int_{0}^{z} e^{-t^2/2}\,dt = 0.1554$ and 0.2881 according as $z = 0.4$ and $z = 0.8$]

Let X = wage of a worker. X has normal distribution with $\mu = 70$, $\sigma = 5$.
$\therefore Z = \frac{X - 70}{5}$ has standard normal distribution.
(i) When $X = 66$, $Z = \frac{66 - 70}{5} = -0.8$; when $X = 72$, $Z = \frac{72 - 70}{5} = 0.4$.
$\therefore P(66 < X < 72) = P(-0.8 < Z < 0.4)$
= area under the standard normal curve enclosed between the two ordinates $Z = -0.8$ and $Z = 0.4$
$$= \int_{-0.8}^{0.4} \phi(t)\,dt = \int_{-0.8}^{0} \phi(t)\,dt + \int_{0}^{0.4} \phi(t)\,dt = \int_{0}^{0.8} \phi(t)\,dt + \int_{0}^{0.4} \phi(t)\,dt,$$
since the $\phi$ curve is symmetric about the y-axis,
= 0.2881 + 0.1554 = 0.4435.
$\therefore$ Probability that the wage of a worker lies between Rs. 66 and Rs. 72 is 0.4435.
$\therefore$ the expected number of workers whose wages lie between Rs. 66 and Rs. 72 $= 0.4435 \times 10{,}000 = 4435$.
(ii) When $X = 66$, $Z = -0.8$.
So, $P(X < 66) = P(Z < -0.8)$
= area under the standard normal curve on the left side of the ordinate $Z = -0.8$
$$= \int_{-\infty}^{-0.8} \phi(t)\,dt = \int_{0.8}^{\infty} \phi(t)\,dt \ \text{(by symmetry)} = 0.5 - \int_{0}^{0.8} \phi(t)\,dt = 0.5 - 0.2881 = 0.2119$$
$\therefore$ the expected number of workers $= 0.2119 \times 10{,}000 = 2119$.
(iii) When $X = 72$, $Z = 0.4$.
$$\therefore P(X > 72) = P(Z > 0.4) = \int_{0.4}^{\infty} \phi(t)\,dt = 0.5 - \int_{0}^{0.4} \phi(t)\,dt = 0.5 - 0.1554 = 0.3446$$
$\therefore$ the required number $= 0.3446 \times 10{,}000 = 3446$.
Example 5. The mean of a normal distribution is 50 and 5% of the values are greater than 60. Find the standard deviation of the distribution. (Given that the area under the standard normal curve between $z = 0$ and $z = 1.64$ is 0.45.)
Let X be the normal variate and let its s.d. be $\sigma$. $\therefore Z = \frac{X - 50}{\sigma}$ is a standard normal variate. By the problem
$$P(X > 60) = \frac{5}{100} = 0.05.$$
When $X = 60$, $Z = \frac{60 - 50}{\sigma} = \frac{10}{\sigma}$.
$$\therefore \text{from above, } P\left(Z > \frac{10}{\sigma}\right) = 0.05 \qquad (1)$$
From the supplied data we have $P(0 < Z < 1.64) = 0.45$
or, $P(Z > 1.64) = 0.5 - 0.45 = 0.05 \qquad (2)$
Comparing (1) and (2) we get $\frac{10}{\sigma} = 1.64$
or, $\sigma = \frac{10}{1.64} = 6.097 \approx 6.1$. $\therefore$ the s.d. is 6.1.
Example 6. In a normal distribution, 31% of the items are under 45 and 8% are above 64. Find the mean and standard deviation. [Given $P(0 < Z < 1.405) = 0.42$, $P(-0.496 < Z < 0) = 0.19$]
[W.B.U.Tech 2003]

Let X be the normal variate with mean $\mu$ and s.d. $\sigma$.
So, $Z = \frac{X - \mu}{\sigma}$ is a standard normal variate.
Now, by the problem, $P(X < 45) = \frac{31}{100}$ and $P(X > 64) = \frac{8}{100}$.
When $X = 45$, $Z = \frac{45 - \mu}{\sigma}$; when $X = 64$, $Z = \frac{64 - \mu}{\sigma}$.
So from above $P\left(Z < \frac{45 - \mu}{\sigma}\right) = 0.31$ and $P\left(Z > \frac{64 - \mu}{\sigma}\right) = 0.08$.
Since $0.31 < 0.5$, $\frac{45 - \mu}{\sigma}$ lies on the negative side.
From above
$$P\left(\frac{45 - \mu}{\sigma} < Z < 0\right) = 0.5 - 0.31 = 0.19$$
Comparing this with the second given data we have $\frac{45 - \mu}{\sigma} = -0.496$
or, $45 - \mu = -0.496\,\sigma$
or, $\mu - 0.496\,\sigma = 45 \quad \dots (1)$
Since $0.08 < 0.5$, $\frac{64 - \mu}{\sigma}$ lies on the positive side.
$$\therefore P\left(0 < Z < \frac{64 - \mu}{\sigma}\right) = 0.5 - 0.08 = 0.42$$
Comparing this with the first given data we get $\frac{64 - \mu}{\sigma} = 1.405$
or, $64 - \mu = 1.405\,\sigma$
or, $\mu + 1.405\,\sigma = 64 \quad \dots (2)$
Solving (1) and (2), $\sigma = 9.995$ and $\mu = 49.958$.
$\therefore$ Mean = 49.958 and s.d. = 9.995.
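The last step, solving the two linear equations for the mean and the s.d., can be sketched in code. This is an illustrative addition; the helper name `solve_mean_sd` is hypothetical, and the z-values are the ones supplied in the problem.

```python
def solve_mean_sd(x1, z1, x2, z2):
    """Solve x1 = mu + z1*sigma and x2 = mu + z2*sigma for (mu, sigma)."""
    sigma = (x2 - x1) / (z2 - z1)   # subtract the two equations to eliminate mu
    mu = x1 - z1 * sigma            # back-substitute into the first equation
    return mu, sigma

# 45 corresponds to z = -0.496 and 64 to z = 1.405 (given data of the example).
mu, sigma = solve_mean_sd(45, -0.496, 64, 1.405)
print(round(mu, 3), round(sigma, 3))
```

Small differences in the third decimal place of the mean can arise from when the intermediate value of the s.d. is rounded.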
Example 7. If X is normally distributed with mean 12 and s.d. 4, find the probability of (i) $X \ge 20$, (ii) $0 \le X \le 12$ and (iii) also find a such that $P(X > a) = 0.24$. [Use table]
Let Z be the standard normal variate.
Then $Z = \frac{X - \mu}{\sigma} = \frac{X - 12}{4}$ $\therefore X = 4Z + 12$
$\therefore$ (i) $P(X \ge 20) = P(4Z + 12 \ge 20) = P(Z \ge 2) = 1 - P(Z < 2) = 1 - \Phi(2) = 1 - 0.9772 = 0.0228$.
(ii) $P(0 \le X \le 12) = P(0 \le 4Z + 12 \le 12) = P(-3 \le Z \le 0) = P(0 \le Z \le 3)$ (due to symmetry)
$= \Phi(3) - \Phi(0) = 0.9986 - 0.5 = 0.4986$ (from table)
(iii) $P(X > a) = 0.24$ or, $P(4Z + 12 > a) = 0.24$
or, $P\left(Z > \frac{a - 12}{4}\right) = 0.24$ or, $1 - P\left(Z \le \frac{a - 12}{4}\right) = 0.24$
or, $1 - \Phi\left(\frac{a - 12}{4}\right) = 0.24$, where $\Phi$ is the c.d.f. of the standard normal variate
or, $\Phi\left(\frac{a - 12}{4}\right) = 0.76$
or, $\frac{a - 12}{4} = 0.71$, from Table 1
$\therefore a = 14.84$
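The three parts of this example can be cross-checked numerically. The sketch below is an addition to the text and assumes Python's standard `statistics.NormalDist`, whose `cdf` and `inv_cdf` methods play the role of the table and of reading the table in reverse; the variable names are illustrative.

```python
from statistics import NormalDist

x_dist = NormalDist(mu=12, sigma=4)

p_at_least_20 = 1 - x_dist.cdf(20)             # (i)  P(X >= 20)
p_0_to_12 = x_dist.cdf(12) - x_dist.cdf(0)     # (ii) P(0 <= X <= 12)
a = x_dist.inv_cdf(1 - 0.24)                   # (iii) a with P(X > a) = 0.24

print(round(p_at_least_20, 4), round(p_0_to_12, 4), round(a, 2))
```

The computed values are close to the table-based ones; the small difference in part (iii) comes from reading 0.71 from a two-decimal table entry instead of the more precise quantile.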
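Example 8. If a random variable X follows a normal distribution such that $P(9.6 \le X \le 13.8) = 0.7008$ and $P(X \ge 9.6) = 0.8159$, where
$$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{0.9} e^{-t^2/2}\,dt = 0.8159, \quad \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{1.2} e^{-t^2/2}\,dt = 0.8849,$$
find the mean and variance of X.

Let $E(X) = \mu$, $\mathrm{Var}(X) = \sigma^2$. Now, if $\Phi$ is the c.d.f. of the standard normal variate,
$$\Phi(x) = P\left(\frac{X - \mu}{\sigma} \le x\right) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{u^2}{2}}\,du$$
$\therefore \Phi(0.9) = 0.8159$, $\Phi(1.2) = 0.8849$
$\therefore \Phi(-0.9) = 1 - \Phi(0.9) = 0.1841$
Since $P(X \ge 9.6) = 0.8159 = \Phi(0.9) = P\left(\frac{X - \mu}{\sigma} \ge -0.9\right)$, we have $\frac{9.6 - \mu}{\sigma} = -0.9$.
Now,
$$P\left(-0.9 \le \frac{X - \mu}{\sigma} \le 1.2\right) = \Phi(1.2) - \Phi(-0.9) = 0.8849 - 0.1841 = 0.7008$$
$$= P(9.6 \le X \le 13.8) = P\left(\frac{9.6 - \mu}{\sigma} \le \frac{X - \mu}{\sigma} \le \frac{13.8 - \mu}{\sigma}\right)$$
$$\therefore \frac{13.8 - \mu}{\sigma} = 1.2, \quad \frac{9.6 - \mu}{\sigma} = -0.9.$$
Solving we get $\mu = 11.4$, $\sigma = 2$.
$\therefore$ mean = 11.4, Var(X) = 4.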
Example 9. A fair coin is tossed 400 times. Using the normal approximation to the binomial distribution, find the probability of obtaining (i) exactly 200 heads, (ii) between 190 and 210 heads, both inclusive. Given that the area under the standard normal curve between $Z = 0$ and $Z = 0.05$ is 0.0199 and between $Z = 0$ and $Z = 1.05$ is 0.3531. [W.B.U.Tech 2007]
Let the random variable X denote the number of heads in 400 tosses. Then clearly X has a binomial distribution with parameters $n = 400$, $p = \frac{1}{2}$. Since n is large we suppose X is approximately a normal variate with parameters $\mu = np = 400 \times \frac{1}{2} = 200$
and $\sigma = \sqrt{np(1-p)} = \sqrt{400 \times \frac{1}{2} \times \frac{1}{2}} = 10$,
i.e., $Z = \frac{X - 200}{10}$ is approximately a standard normal variate.

(i) Now, $X = 199.5 \Rightarrow Z = \frac{199.5 - 200}{10} = -0.05$ and $X = 200.5 \Rightarrow Z = 0.05$.
Then using the normal approximation, we have the required probability,
$P(X = 200) \approx P(199.5 \le X \le 200.5) = P(-0.05 \le Z \le 0.05)$
$= 2P(0 \le Z \le 0.05) = 2 \times 0.0199 = 0.0398$
(ii) Now, $X = 189.5 \Rightarrow Z = -1.05$ and $X = 210.5 \Rightarrow Z = 1.05$.
Then using the normal approximation, we have the required probability,
$P(190 \le X \le 210)$
$\approx P(189.5 \le X \le 210.5)$ (in terms of the normal approximation)
$= P(-1.05 \le Z \le 1.05) = 2P(0 \le Z \le 1.05) = 2 \times 0.3531 = 0.7062$.
Example 10. Among 10,000 random digits, find the probability that the digit 3 appears at most 950 times.
Let X denote the number of times the digit 3 appears. Then X is a b(n, p) variate where $n = 10{,}000$, $p = \frac{1}{10}$.
$\therefore np = 1000$, $np(1 - p) = 10000 \times \frac{1}{10} \times \frac{9}{10} = 900$
$\because$ n is large, we find $P(X \le 950)$ by approximating with the normal distribution. X is approximately a $\left(np, \sqrt{np(1-p)}\right) = (1000, 30)$ normal variate,
i.e., $Z = \frac{X - 1000}{30}$ is approximately a standard normal variate. Since a discrete variate is approximated by a continuous variate, we find $P(X \le 950.5)$.
Now $X = 950.5 \Rightarrow Z = -1.65$
$\therefore$ the required probability $\approx P(X \le 950.5) = P(Z \le -1.65)$
$= 0.5 - P(-1.65 \le Z < 0) = 0.5 - P(0 \le Z \le 1.65)$
$$= 0.5 - \int_{0}^{1.65} \phi(t)\,dt = 0.5 - 0.4505 \ \text{(from statistical table)}$$
= 0.0495.
7.5. Exponential Distribution.
Two-parameter exponential distribution. A random variable X is said to have a two-parameter exponential distribution if its probability density function is given by
$$f(x) = \begin{cases} \dfrac{1}{b}\, e^{-\frac{x-a}{b}}, & x \ge a \\ 0, & \text{elsewhere} \end{cases}$$
where a, b (b > 0) are the two parameters of the distribution. f(x) is called the Exponential Probability Density of X.
Note. (1) Clearly $f(x) \ge 0$ for all x and
$$\int_{-\infty}^{\infty} f(x)\,dx = \int_{a}^{\infty} \frac{1}{b}\, e^{-\frac{x-a}{b}}\,dx = 1.$$
So the two fundamental properties of a p.d.f. are satisfied.
(2) If X has an exponential distribution with parameters a and b we write $X \sim E[a, b]$.
(3) The density curve is shown in the following figure.
[Figure: exponential density curve f(x), starting at height $\frac{1}{b}$ at x = a and decreasing towards the x-axis]
(4) The distribution function F(x) is given by
$$F(x) = \begin{cases} 0, & -\infty < x < a \\ 1 - e^{-\frac{x-a}{b}}, & x \ge a \end{cases}$$
(5) The graph of the distribution function F(x) is given in the following figure.
[Figure: graph of F(x), zero up to x = a and then rising towards the line F = 1]

One Parameter Exponential Distribution

We say that X has one-parameter exponential distribution if its probability density is
$$f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0 \\ 0, & \text{elsewhere} \end{cases}$$
where $\lambda > 0$ is the only parameter.
Note. (1) By the term 'X has exponential distribution' we mean X has one-parameter exponential distribution.
(2) Clearly the p.d.f. f(x) satisfies the two fundamental properties of a p.d.f.
(3) If X has exponential distribution with parameter $\lambda$ we write $X \sim E(0, \lambda)$.
(4) This distribution is obtained from the previous one by putting $a = 0$, $b = \frac{1}{\lambda}$.
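Theorem. If a continuous random variable X has exponential distribution with parameter $\lambda$ then
(i) the mean is $\frac{1}{\lambda}$ (ii) the variance is $\frac{1}{\lambda^2}$. [W.B.U.T. 2013]
Proof. The p.d.f. of this distribution is
$f(x) = \lambda e^{-\lambda x}$, $x \ge 0$
$= 0$, elsewhere
(i) $\therefore$ mean,
$$m = E(X) = \int_{0}^{\infty} x \lambda e^{-\lambda x}\,dx = \frac{1}{\lambda} \int_{0}^{\infty} u e^{-u}\,du, \quad \text{by putting } \lambda x = u$$
$$= \frac{1}{\lambda}\,\Gamma(2) \quad \left[\because \Gamma(n) = \int_{0}^{\infty} e^{-x} x^{n-1}\,dx \text{ from the definition of the Gamma function}\right]$$
$$= \frac{1}{\lambda} \quad [\because \Gamma(n+1) = n! \text{ when n is a positive integer}]$$
(ii) Now,
$$E(X^2) = \int_{0}^{\infty} x^2 \lambda e^{-\lambda x}\,dx = \frac{1}{\lambda^2} \int_{0}^{\infty} u^2 e^{-u}\,du, \quad \text{by putting } \lambda x = u$$
$$= \frac{1}{\lambda^2}\,\Gamma(3) = \frac{2}{\lambda^2}$$
$$\therefore \mathrm{Var}(X) = E(X^2) - m^2 \ [\text{where m is the mean}] = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}.$$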
Illustrative Example.
Example 1. Suppose that during the rainy season on a tropical island the length of a shower has an exponential distribution with average length $\frac{1}{2}$ minute. What is the probability that a shower will last more than three minutes? If a shower has already lasted for 2 minutes, what is the probability that it will last for at least one more minute?
Let X = length of a shower in minutes.
By the problem X has exponential distribution with parameter $\lambda$ where $\frac{1}{\lambda} = \frac{1}{2}$, i.e., $\lambda = 2$.
$\therefore$ its p.d.f. is $f(x) = 2e^{-2x}$, $x > 0$.
Now, the probability that a shower lasts more than three minutes
$$= P(X > 3) = \int_{3}^{\infty} 2e^{-2x}\,dx = 2 \lim_{X \to \infty} \int_{3}^{X} e^{-2x}\,dx = 2 \lim_{X \to \infty} \left[\frac{e^{-2x}}{-2}\right]_{3}^{X} = -\lim_{X \to \infty} \left\{\frac{1}{e^{2X}} - \frac{1}{e^{6}}\right\} = \frac{1}{e^{6}}$$
The probability that a shower lasts more than two minutes
$$= P(X > 2) = \int_{2}^{\infty} 2e^{-2x}\,dx = \frac{1}{e^{4}}.$$
Now the required probability $= P(X \ge 3 \mid X \ge 2)$
$$= \frac{P((X \ge 3) \cap (X \ge 2))}{P(X \ge 2)} = \frac{P(X \ge 3)}{P(X \ge 2)} = \frac{1/e^{6}}{1/e^{4}} = \frac{1}{e^{2}}.$$
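The conditional probability found above equals $P(X > 1)$, which illustrates the lack-of-memory property of the exponential distribution. The following minimal sketch is an addition to the text; the helper name `exp_survival` is hypothetical, and it relies on $P(X > x) = e^{-\lambda x}$ for $x \ge 0$, which follows from the distribution function.

```python
import math

def exp_survival(x, lam):
    """P(X > x) = exp(-lam * x) for an exponential variate with parameter lam, x >= 0."""
    return math.exp(-lam * x)

lam = 2.0  # parameter of the shower-length example

p_more_than_3 = exp_survival(3, lam)                       # equals e^(-6)
p_cond = exp_survival(3, lam) / exp_survival(2, lam)       # P(X >= 3 | X >= 2)
p_one_more = exp_survival(1, lam)                          # P(X > 1), equals e^(-2)

print(p_more_than_3, p_cond, p_one_more)
```

The last two printed values coincide: having already lasted 2 minutes does not change the chance of lasting one more minute.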
Example 2. Let the length of a phone call (in minutes) be exponentially distributed with mean 10. If someone arrives immediately ahead of you at a telephone booth, find the probability that you will have to wait for
(i) more than 10 minutes
(ii) between 10 and 20 minutes.
Solution. Since $\frac{1}{\lambda} = 10$, $\therefore \lambda = \frac{1}{10}$ is the parameter.
Let X = length of the call made by the person in the booth.
(i) The required probability $= P(X > 10)$
$$= \int_{10}^{\infty} f(x)\,dx \quad \text{where } f(x) = \frac{1}{10}\, e^{-\frac{x}{10}}$$
$$= \frac{1}{10} \int_{10}^{\infty} e^{-\frac{x}{10}}\,dx = \frac{1}{10} \lim_{X \to \infty} \int_{10}^{X} e^{-\frac{x}{10}}\,dx = \frac{1}{10} \cdot \frac{10}{e} = \frac{1}{e}$$
(ii) The required probability $= P(10 < X < 20)$
$$= \int_{10}^{20} \frac{1}{10}\, e^{-\frac{x}{10}}\,dx = \frac{1}{e} - \frac{1}{e^{2}}$$

Example 3. Let the number of kilometers that a car can run before its battery wears out be exponentially distributed with a mean of 10,000 km. If a person desires to take a 5000 km trip, what is the probability that he will be able to complete the trip without having to replace the car battery?
Solution. Let X = the remaining lifetime (in thousand kilometers) of the battery. Therefore X has exponential distribution with parameter $\lambda = \frac{1}{10}$.
The required probability $= P(X > 5)$
$$= \int_{5}^{\infty} f(x)\,dx \quad \text{where } f(x) = \frac{1}{10}\, e^{-\frac{x}{10}}$$
$$= \frac{1}{10} \int_{5}^{\infty} e^{-\frac{x}{10}}\,dx = \frac{1}{10} \lim_{X \to \infty} \int_{5}^{X} e^{-\frac{x}{10}}\,dx = e^{-\frac{1}{2}}$$
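Example 4. If X has exponential distribution with mean $\frac{1}{\lambda}$, prove that $E(X^n) = \frac{n!}{\lambda^n}$ where n is a positive integer.
Solution.
$$E(X^n) = \int_{0}^{\infty} x^n f(x)\,dx \quad \text{where } f(x) = \lambda e^{-\lambda x}$$
$$= \int_{0}^{\infty} x^n \lambda e^{-\lambda x}\,dx = \lambda \int_{0}^{\infty} x^n e^{-\lambda x}\,dx$$
$$= \frac{1}{\lambda^n} \int_{0}^{\infty} t^n e^{-t}\,dt, \quad \text{putting } \lambda x = t, \text{ i.e., } dx = \frac{1}{\lambda}\,dt$$
$$= \frac{1}{\lambda^n} \int_{0}^{\infty} t^{(n+1)-1} e^{-t}\,dt = \frac{1}{\lambda^n}\,\Gamma(n+1) = \frac{n!}{\lambda^n}.$$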
7.6. Gamma Distribution.
A random variable X is said to have a Gamma distribution with parameters $l, \lambda$ ($l > 0$, $\lambda > 0$) if its probability density function is given by
$$f(x) = \begin{cases} \dfrac{\lambda e^{-\lambda x} (\lambda x)^{l-1}}{\Gamma(l)}, & x \ge 0 \\ 0, & \text{elsewhere} \end{cases}$$
where $\Gamma(l) = \int_{0}^{\infty} e^{-t}\, t^{l-1}\,dt$ is the Gamma function (introduced in Vol. I).
f(x) is called the Gamma Density of X; X is called a Gamma Variate with parameters $(l, \lambda)$.

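Note. (1) Clearly $f(x) \ge 0$ for all x and
$$\int_{-\infty}^{\infty} f(x)\,dx = \int_{0}^{\infty} \frac{\lambda e^{-\lambda x} (\lambda x)^{l-1}}{\Gamma(l)}\,dx = \frac{\lambda^{l}}{\Gamma(l)} \int_{0}^{\infty} e^{-t}\, \frac{t^{l-1}}{\lambda^{l-1}} \cdot \frac{1}{\lambda}\,dt, \quad \text{putting } \lambda x = t,\ \lambda\,dx = dt$$
$$= \frac{1}{\Gamma(l)} \int_{0}^{\infty} e^{-t}\, t^{l-1}\,dt = \frac{1}{\Gamma(l)} \cdot \Gamma(l) = 1.$$
So the two fundamental properties of a p.d.f. are satisfied.

(2) The gamma distribution with $\lambda = \frac{1}{2}$ and $l = \frac{n}{2}$, where n is a positive integer, is given by the probability density
$$f(x) = \begin{cases} \dfrac{e^{-\frac{x}{2}} \left(\frac{x}{2}\right)^{\frac{n}{2}-1}}{2\,\Gamma\!\left(\frac{n}{2}\right)}, & x \ge 0 \\ 0, & \text{elsewhere} \end{cases}$$
This f(x) is known as the Chi-square ($\chi^2$) density function. The distribution given by this p.d.f. is known as the $\chi^2$-distribution with n degrees of freedom.

(3) The gamma distribution with $l = 1$ is given by the probability density
$$f(x) = \frac{\lambda e^{-\lambda x} (\lambda x)^{0}}{\Gamma(1)}, \ x \ge 0; \qquad = 0, \ \text{elsewhere}$$
that is, $f(x) = \lambda e^{-\lambda x}$, $x \ge 0$
$= 0$, elsewhere
which is nothing but the density function of the one-parameter exponential distribution.

Thus the Gamma distribution is a general form of several distributions like the exponential distribution and the $\chi^2$ (Chi-square) distribution.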

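Theorem. If X has Gamma distribution with parameters $l$ and $\lambda$ then
(i) the mean of X is $\frac{l}{\lambda}$
(ii) the standard deviation of X is $\frac{\sqrt{l}}{\lambda}$.
Proof. Here the p.d.f. of X is
$$f(x) = \begin{cases} \dfrac{\lambda e^{-\lambda x} (\lambda x)^{l-1}}{\Gamma(l)}, & x \ge 0 \\ 0, & \text{elsewhere} \end{cases}$$
(i) $\therefore$ Mean
$$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \int_{0}^{\infty} x\, \frac{\lambda e^{-\lambda x} (\lambda x)^{l-1}}{\Gamma(l)}\,dx = \frac{1}{\Gamma(l)} \int_{0}^{\infty} e^{-\lambda x} (\lambda x)^{l}\,dx$$
$$= \frac{1}{\Gamma(l)} \int_{0}^{\infty} e^{-z} z^{l}\, \frac{1}{\lambda}\,dz \quad [\text{putting } \lambda x = z \ \therefore \lambda\,dx = dz]$$
$$= \frac{1}{\lambda \Gamma(l)} \int_{0}^{\infty} e^{-z} z^{l}\,dz = \frac{1}{\lambda \Gamma(l)} \cdot \Gamma(l+1) = \frac{1}{\lambda \Gamma(l)}\, l\,\Gamma(l) = \frac{l}{\lambda}$$
(ii) Now,
$$E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx = \frac{1}{\lambda \Gamma(l)} \int_{0}^{\infty} e^{-z} z^{l+1}\, \frac{dz}{\lambda} \quad [\text{putting } \lambda x = z \ \therefore \lambda\,dx = dz]$$
$$= \frac{1}{\lambda^2 \Gamma(l)} \int_{0}^{\infty} e^{-z} z^{l+1}\,dz = \frac{1}{\lambda^2 \Gamma(l)}\,\Gamma(l+2)$$
$$= \frac{1}{\lambda^2 \Gamma(l)}\,(l+1)\,l\,\Gamma(l) \quad [\text{by a property of the gamma function}]$$
$$= \frac{l(l+1)}{\lambda^2}$$
Now,
$$\mathrm{Var}(X) = E(X^2) - \{E(X)\}^2 = \frac{l(l+1)}{\lambda^2} - \left(\frac{l}{\lambda}\right)^2 = \frac{l(l+1)}{\lambda^2} - \frac{l^2}{\lambda^2} = \frac{l}{\lambda^2}$$
$\therefore$ Standard deviation of X $= \frac{\sqrt{l}}{\lambda}$.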
Gamma Density Curve
The graph of the p.d.f. of a gamma variate is shown in the following figure for different values of $l$ and $\lambda$.
[Figure: gamma density curves f(x) against x for different values of $l$ and $\lambda$, including $l = 1$, $\lambda = 1$]
Cases where Gamma distribution fits
Let events occur randomly in time and let X = the time at which the nth event occurs. Then it can be shown that the variable X has gamma distribution with parameters $l = n$ and $\lambda$, where $\lambda$ is the average number of events occurring in a unit time interval.
That is, the p.d.f. of X will be
$$f(x) = \frac{\lambda e^{-\lambda x} (\lambda x)^{n-1}}{(n-1)!}, \quad x > 0$$

7.7. Illustrative Examples.

Example 1. If a random variable has the gamma distribution with parameters $l = 2$ and $\lambda = \frac{1}{3}$, find the mean and the standard deviation of this distribution. Hence find the probability that the random variable will take a value less than 5.
Solution. The mean, $E(X) = \frac{l}{\lambda} = \frac{2}{1/3} = 6$.
The standard deviation $= \frac{\sqrt{l}}{\lambda} = \frac{\sqrt{2}}{1/3} = 3\sqrt{2}$.
The p.d.f. of the distribution is
$$f(x) = \frac{\frac{1}{3}\, e^{-\frac{x}{3}} \left(\frac{x}{3}\right)^{2-1}}{\Gamma(2)}, \ x \ge 0; \qquad = 0, \ \text{elsewhere}$$
i.e., $f(x) = \frac{1}{9}\, x e^{-\frac{x}{3}}$, $x \ge 0$ $\quad [\because \Gamma(2) = 1! = 1]$
$= 0$, elsewhere
So, the required probability
$$= P(X < 5) = \int_{0}^{5} f(x)\,dx = \int_{0}^{5} \frac{1}{9}\, x e^{-\frac{x}{3}}\,dx = \frac{1}{9} \left\{ \left[-3x e^{-\frac{x}{3}}\right]_{0}^{5} + 3 \int_{0}^{5} e^{-\frac{x}{3}}\,dx \right\}$$
$$= \frac{1}{9} \left\{ -15 e^{-\frac{5}{3}} - 9 e^{-\frac{5}{3}} + 9 \right\} = -\frac{24}{9}\, e^{-\frac{5}{3}} + 1 = 0.496$$

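Example 2. In a town, the daily consumption of electric power (in million kW) is a random variable having a gamma distribution with mean 6 and s.d. $2\sqrt{3}$. If the power plant of the town has a daily capacity of 12 million kW, what is the probability that this power supply will be inadequate on any given day? For how many days in a year (= 365 days) would the town suffer load shedding?

Solution. Let $X$ = daily consumption of power.
By the problem, it has a gamma distribution with parameters $l$ and $\lambda$ (say).

Since mean $= \dfrac{l}{\lambda}$, $\therefore \dfrac{l}{\lambda} = 6$ or, $l = 6\lambda$,

and since s.d. $= \dfrac{\sqrt{l}}{\lambda}$, $\therefore \dfrac{\sqrt{l}}{\lambda} = 2\sqrt{3}$ or, $l = 12\lambda^2$.

$\therefore 12\lambda^2 = 6\lambda$, $\therefore \lambda = \dfrac{1}{2}$. So, $l = 6 \times \dfrac{1}{2} = 3$.

$\therefore$ the pdf of $X$ is $f(x) = \dfrac{\frac{1}{2}\, e^{-x/2} \left(\frac{x}{2}\right)^{3-1}}{\Gamma(3)}, \quad x \ge 0$
$= 0$, elsewhere

i.e., $f(x) = \dfrac{1}{16}\, x^2 e^{-x/2}, \quad x \ge 0$
$= 0$, elsewhere

Required probability $= P(X > 12) = \displaystyle\int_{12}^{\infty} f(x)\,dx$

$= \displaystyle\int_{12}^{\infty} \frac{1}{16}\, x^2 e^{-x/2}\,dx = \frac{1}{16}\int_{12}^{\infty} x^2 e^{-x/2}\,dx$

$= \dfrac{1}{16} \displaystyle\lim_{X \to \infty} \int_{12}^{X} x^2 e^{-x/2}\,dx = \frac{1}{16} \lim_{X \to \infty} \left\{ -2\left[x^2 e^{-x/2}\right]_{12}^{X} + 4\int_{12}^{X} x e^{-x/2}\,dx \right\}$

$= \dfrac{1}{16} \times \dfrac{400}{e^6}$ (detailed evaluation is not shown)

$= 0.062$
$\therefore$ No. of days the town will suffer load shedding
$= 0.062 \times 365 = 22.63 \approx 23$ days.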
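The steps of Example 2 can be sketched in code: solve for $(l, \lambda)$ from the mean and s.d., then evaluate the tail probability. The tail formula $P(X > x) = e^{-\lambda x}\sum_{k=0}^{l-1} (\lambda x)^k / k!$ for integer $l$ is a standard identity assumed here rather than taken from the text, and the function names are illustrative.

```python
import math

def gamma_params(mean, sd):
    """Solve l/lam = mean and sqrt(l)/lam = sd for (l, lam)."""
    lam = mean / sd ** 2
    l = mean * lam
    return l, lam

def gamma_tail_integer_shape(x, l, lam):
    """P(X > x) for a gamma variate with integer shape l and rate lam."""
    partial = sum((lam * x) ** k / math.factorial(k) for k in range(l))
    return math.exp(-lam * x) * partial

l, lam = gamma_params(6.0, 2.0 * math.sqrt(3.0))
shape = round(l)                                   # integer shape used by the tail formula
p_short = gamma_tail_integer_shape(12.0, shape, lam)
days = 365 * p_short
print(shape, round(lam, 6), round(p_short, 3), round(days, 2))
```

With $l = 3$ and $\lambda = 1/2$ the tail formula gives $25 e^{-6}$, the same quantity as $\frac{1}{16} \times \frac{400}{e^6}$ in the solution.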

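Example 3. Show that when $l > 1$, the graph of the gamma density has a local maximum at $x = \dfrac{l-1}{\lambda}$.

Solution. The pdf is $f(x) = \dfrac{\lambda e^{-\lambda x} (\lambda x)^{l-1}}{\Gamma(l)}, \quad x > 0$

or, $f(x) = \dfrac{\lambda^l}{\Gamma(l)}\, x^{l-1} e^{-\lambda x}$

$\therefore f'(x) = \dfrac{\lambda^l}{\Gamma(l)} \left\{ (l-1)\, x^{l-2} e^{-\lambda x} - \lambda x^{l-1} e^{-\lambda x} \right\} = \dfrac{\lambda^l}{\Gamma(l)}\, x^{l-2} e^{-\lambda x} (-\lambda x + l - 1)$

$\therefore f'(x) = 0 \Rightarrow -\lambda x + l - 1 = 0 \Rightarrow x = \dfrac{l-1}{\lambda}$

$\because x > 0,\ e^{-\lambda x} > 0$, so when $x < \dfrac{l-1}{\lambda}$, $f'(x) > 0$

and when $x > \dfrac{l-1}{\lambda}$, $f'(x) < 0$.

$\therefore f(x)$ is maximum at $x = \dfrac{l-1}{\lambda}$.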
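A small numerical illustration of this result: the sketch below searches a grid for the maximum of the gamma density with the illustrative values $l = 3$, $\lambda = 2$ (assumed for the demonstration, not from the text) and compares it with $(l-1)/\lambda$.

```python
import math

def gamma_pdf(x, l, lam):
    """Gamma density lam^l * x^(l-1) * e^(-lam*x) / Gamma(l) for x > 0."""
    return lam ** l * x ** (l - 1) * math.exp(-lam * x) / math.gamma(l)

l, lam = 3.0, 2.0                                  # illustrative values with l > 1
grid = [i / 1000 for i in range(1, 10001)]         # x = 0.001, 0.002, ..., 10
x_max = max(grid, key=lambda x: gamma_pdf(x, l, lam))
print(x_max, (l - 1) / lam)
```

A grid search only locates the maximum to within the grid spacing, so agreement is expected up to about 0.001.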

Exercise

[I] Short Answer Questions

1. Given that for a standard normal variable $z$, $P(0 < z < 0.8) = 0.2881$. Find $P(|z| \ge 0.8)$.
[Hints: $P(|z| \ge 0.8) = 2P(z \ge 0.8)$, due to symmetry

$= 2[0.5 - P(0 < z < 0.8)]$

$= 2[0.5 - 0.2881]$

$= 0.4238$]

2. $X$ is distributed as $N(50, 4)$; find $P(41 \le X \le 56)$.

[Hints: For $x = 41$, $z_1 = \dfrac{41 - 50}{4} = -2.25$;
for $x = 56$, $z_2 = \dfrac{56 - 50}{4} = 1.5$.
$\therefore P(41 \le X \le 56) = P(-2.25 \le z \le 1.5)$

$= P(-2.25 \le z \le 0) + P(0 < z \le 1.5)$

$= P(0 \le z \le 2.25) + P(0 < z \le 1.5)$

$= 0.4878 + 0.4332 = 0.921$]
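3. Let $X$ be a chance variate having a normal distribution with mean 30 and s.d. 5. Find the probability that (i) $26 \le X \le 40$ (ii) $|X - 30| > 5$.
4. If $X \sim E(0, \lambda)$ with $P(X \le 1) = P(X > 1)$, find $\mathrm{Var}(X)$.
5. If the exponential distribution is given by $f(x) = e^{-x}$, $0 \le x < \infty$, then find the mean of the distribution. [W.B.U. Tech 2004]

[Hints: Mean $= \displaystyle\int_{-\infty}^{0} x \cdot 0\,dx + \int_0^\infty x e^{-x}\,dx = \int_0^\infty e^{-x}\,dx$ (integrating by parts)

$= \displaystyle\lim_{X \to \infty} \left[\frac{e^{-x}}{-1}\right]_0^X = -\lim_{X \to \infty} \left(e^{-X} - 1\right) = -\lim_{X \to \infty} \left(\frac{1}{e^X} - 1\right)$

$= -(0 - 1) = 1$]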

ANSWERS

1. 0.4238   2. 0.921   3. (i) 0.7654 (ii) 0.3174   4. $\dfrac{1}{\lambda^2}$   5. 1

[II] Long Answer Questions

1. The heights of men are normally distributed with mean 64.5" and s.d. 4.5". Among 10,000 men find the number of men whose height is (a) less than 69" but greater than 55.5" (b) less than 55.5" and (c) more than 73.5".

[Given $\displaystyle\int_0^1 \phi(t)\,dt = 0.0398$, $\displaystyle\int_0^2 \phi(t)\,dt = 0.4772$]
2. A sample of 100 dry battery cells tested to find the length of life produced the following results: $\bar{x} = 12$ hours, $\sigma = 3$ hours. Assuming that the data are normally distributed, what percentage of battery cells are expected to have life (i) more than 15 hours (ii) less than 6 hours (iii) between 10 and 14 hours? Given $\displaystyle\int_0^z \phi(t)\,dt = 0.4938,\ 0.4772,\ 0.3413$ and $0.2487$ according as $z = 2.5,\ 2,\ 1$ and $0.67$.

3. The weight of students in a college is normally distributed with $m = 40$ kg, $\sigma = 5$ kg. Find the percentage of the students that have weight
(i) greater than 40 kg
(ii) greater than 50 kg
(iii) between 38 kg and 52 kg.

Given $\dfrac{1}{\sqrt{2\pi}} \displaystyle\int_{-\infty}^{2} e^{-t^2/2}\,dt = 0.9772$, $\dfrac{1}{\sqrt{2\pi}} \displaystyle\int_{-\infty}^{0.4} e^{-t^2/2}\,dt = 0.6554$, $\dfrac{1}{\sqrt{2\pi}} \displaystyle\int_{-\infty}^{2.4} e^{-t^2/2}\,dt = 0.9918$ [W.B.U. Tech, 2003]

4. The life of tyres manufactured by ABC company is normally distributed with mean 34,000 miles and standard deviation 4000 miles. Find (i) the probability that such a tyre lasts over 40,000 miles (ii) the probability that it lasts between 30,000 and 35,000 miles (iii) given that it has survived 30,000 miles, the probability that it survives another 10,000 miles.
You may use the appropriate statistical table.
5. A random variable $X$ follows normal distribution with mean 100 and variance 25. Find $P(|X - 100| \le 5)$, given that $\dfrac{1}{\sqrt{2\pi}} \displaystyle\int_{-\infty}^{1} e^{-x^2/2}\,dx = 0.8413$.

6. The mean and s.d. of the I.Q. of a group of 500 children are 90 and 20 respectively. Assuming that the I.Q. is normally distributed, find the number of children with I.Q. $\ge 100$. Given that $\dfrac{1}{\sqrt{2\pi}} \displaystyle\int_{0.5}^{\infty} e^{-x^2/2}\,dx = 0.308$.

7. If $X$ has distribution $N(m, \sigma)$, find $m$ and $\sigma$ if $P(X < 89) = 0.90$, $P(X < 94) = 0.95$.

[Given $\displaystyle\int_0^{1.28} \phi(x)\,dx = 0.4$ and $\displaystyle\int_0^{1.645} \phi(x)\,dx = 0.45$]
8. Suppose that 10% of the population for a normal distribution $N(m, \sigma)$ is below 60 and 5% is above 90. What are the values of $m$ and $\sigma$? Use statistical table.
9. Assuming that the height distribution of a group of men is normal, find the mean and standard deviation, if 84% of the men have heights less than 65.2 inches and 68% have heights lying between 62.8 and 65.2 inches.
(Take the help of table for supporting data)
10. (a) As a result of tests on electric light bulbs, it was found that the lifetime of a particular make was distributed normally with an average life of 1000 burning hours and standard deviation of 200 hours. Out of 10,000 bulbs produced by the company, how many bulbs are expected to fail (i) in the first 800 burning hours (ii) between 800 and 1200 burning hours?
[Given $\Phi(1) = 0.84134$]

[Hint: (i) $P(X < 800) = ?$ Here $\Phi(1)$ means $\displaystyle\int_{-\infty}^{1} \phi(t)\,dt$]

(b) The marks obtained by 1000 students in a final examination are found to be approximately normally distributed with mean 70 and standard deviation 5. Estimate the number of students whose marks will be between 60 and 75, both inclusive. Given that the area under the normal curve $\phi(z) = \dfrac{1}{\sqrt{2\pi}} \exp\left(-\dfrac{z^2}{2}\right)$ between $z = 0$ and $z = 2$ is 0.4772 and between $z = 0$ and $z = 1$ is 0.3413.


11. The mean of the inner diameters (in inch) of a sample of 200 tubes
produced by a machine is 0.502 and the s.d is 0.005. The purpose for
which these tubes are intended allows a maximum tolerance in the diameter
of 0.496 to 0.508 (i.e. otherwise the tubes are considered defective). What
percentage of the tubes produced by the machine is defective, if the
diameters are found to be normally distributed? (Area under the standard
normal curve between z = 0 and z = 1.2 is 0.3849)
12. If skulls are classified as A, B, C according as the length-breadth
index is under 75, between 75 and 80, over 80, find approximately
(assuming that the distribution is normal) the mean and standard deviation

of a series in which A are 58%, B are 38% and C are 4%, being given that if

$f(t) = \dfrac{1}{\sqrt{2\pi}} \displaystyle\int_0^t \exp\left(-\frac{x^2}{2}\right) dx$

then $f(0.20) = 0.08$ and $f(1.75) = 0.46$.

13. Screws are manufactured to be 3 cm in length. Yet they are acceptable if their lengths lie between 2.99 cm and 3.01 cm. It is noted that 5% of the screws are rejected as over-size and 5% are rejected as under-size. The lengths of the screws are normally distributed. Find the standard deviation of the distribution. Hence find the proportion of 'would be rejects' if the permissible limits were widened to 2.983 cm and 3.015 cm. Use table for necessary data.

[Hint: $\dfrac{3 - 2.99}{\sigma} = 1.645$]
14. The average height of soldiers is 68.22 inches with s.d. $\sqrt{10.8}$ inch. How many soldiers in a regiment of 1000 would you expect to be over 6 feet tall, given that the area under the S.N.C. between the ordinates $z = 0$ and $z = 0.35$ is 0.1368 and between $z = 0$ and $z = 1.15$ is 0.3749?
15. In firing at a target, assume that the horizontal distance of a shot from the centre is normally distributed with standard deviation 2 feet. In 200 shots, how many would be expected to miss the target, if it is 10 feet wide and sufficiently high? Use the appropriate stat. table for relevant data.

16. The marks obtained by candidates in Mathematics in B.Tech examination are normally distributed. If 12.5 percent of the candidates score 60% or more marks and 39% obtain less than 30 marks, find the mean number of marks obtained by the candidates. You may use the table given below:

z      0.27    0.28    0.29    1.14    1.15    1.16
area   0.1064  0.1103  0.1141  0.3729  0.3749  0.3770

(area under S.N.C. between 0 and z)

17. (a) A fair coin is tossed 10 times. Find the probability of obtaining between 4 and 7 heads inclusive by using (i) the binomial distribution (ii) the normal approximation. (See stat. table for relevant data)

(b) 100 unbiased coins are tossed. Using normal approximation to Binomial distribution calculate the probability of getting (i) exactly 40 heads, (ii) 55 heads or more. Given $\Phi(2.1) = 0.9821$, $\Phi(1.9) = 0.9713$, $\Phi(0.9) = 0.8159$. [W.B.U. Tech 2005]

18. Among 625 random digits, find the probability that the digit 7
appears (i) between 50 and 60 times inclusive (ii) between 60 and 70 times
inclusive. (See stat. table for relevant data)

19. A pair of dice is rolled 900 times and $X$ denotes the number of times a total of 9 occurs. Find $P(80 \le X \le 120)$.

20. The ideal size of a class of a college is 150 students. The college, from past experience, knows that only 30% of the admitted students will actually attend. The college uses a policy of approving the applications of 450 students. Find the probability that more than 150 students attend the class. Area under the standard normal curve enclosed between the ordinates $z = 0$ and $z = 1.59$ is 0.441.

21. One thousand independent rolls of a fair die are made. Find the approximate probability that the number 6 appears between 150 and 200 times inclusive. If the number 6 appears exactly 200 times, find the probability that the number 5 appears less than 150 times. Use appropriate statistical table.

22. The length of a phone call in minutes has an exponential distribution with parameter $\lambda = \dfrac{1}{10}$. If somebody arrives immediately ahead of you at a public telephone booth, what is the probability that you will have to wait
(i) more than 10 minutes
(ii) between 10 and 20 minutes.

[Given $\dfrac{1}{e} = 0.368$]
23. The number of kilometres that a car runs before its fuel is finished is exponentially distributed with an average value of 10,000 km. If a person desires to take a 5000 km trip, what is the probability that he would be able to complete the trip without having to refuel? (given $e^{-1/2} = 0.604$)

(Hint: required prob. $= 1 - F(5) = e^{-5\lambda} = e^{-1/2}$)

24. The lifetime, in hours, for which a computer functions before breaking down is an exponentially distributed r.v. with average lifetime 100 hours. What is the probability that (i) a computer will function between 50 and 150 hours before breaking down (ii) it will function less than 100 hours?
Find the % of computers breaking down within the above time intervals.
25. The time (in hours) required to repair a machine in an industry is exponentially distributed with mean 2. What is (i) the probability that a particular machine takes repairing time greater than 2 hours (ii) the probability that a repair takes at least 10 hours, assuming that its duration exceeds 9 hours?
26. The number of years a computer works smoothly is exponentially distributed with variance 64. If Prof. Das buys a used computer, what is the probability that it will be working after an additional 8 years?
27. Show that the graph of the gamma density with parameters $l$ and $\lambda$ has no maximum when $0 < l < 1$, and has its maximum at $x = 0$ when $l = 1$.

Answers
1. (a) 8,200 (b) 200 (c) 200

2. (i) 15.87% (ii) 2.28% (iii) 49.74%

3. (i) 50% (ii) 2.28% (iii) 64.72%

4. (i) 0.0668 (ii) 0.44 (iii) 0.079

5. 0.6826

7. 71.465, 13.6   8. 73.1, 10.2   9. 64.0, 1.2

10. (a) (i) 1587 (ii) 6827 (b) 818.5   11. 23%

12. mean = 74.4, s.d. = 3.2   13. s.d. = 0.806, 1.24%

14. 125 5
15. 2~3 16. 36

17. (a) 0.7734, 0.7718 (b) (i) .0108, (ii) .1841

18. 0.3518, 0.5131 19. 0.96 20. .0559 21. 0.9258, 0.1762

22. (i) 0.368 (ii) 0.233 23. 0.604

24. (i) 0.384, (ii) 0.633, 38.4%; 63.3%


25. (i) $\dfrac{1}{e}$ (ii) $\dfrac{1}{\sqrt{e}}$

[III] Multiple Choice Questions
1. The p.d.f. of a random variable $X$ is
$f(x) = \dfrac{1}{6}, \quad -3 < x < 3$
$= 0$, elsewhere.
The mean of the distribution is
(a) 0   (b) 3   (c) $-3$   (d) $\dfrac{1}{3}$

2. The exponential distribution of a random variable $X$ is
$f(x) = \lambda e^{-\lambda x}, \quad x \ge 0$
$= 0$, elsewhere,
where $\lambda$ is the parameter; then the mean of the distribution is

(a) $\lambda$   (b) $\dfrac{1}{\lambda}$   (d) $\lambda^2$
3. If X is normally distributed with zero mean and unit variance, then
the expectation of X2 is
(a)1 (b) 2 (c) 8 (d)20.
[W.B.U.Tech. 2007]

4. If $X$ is a normal variate with parameters $\mu$ and $\sigma$, i.e. $X \sim N(\mu, \sigma)$, then the variance of $X$ is
1
(a) a (c) ~ (d) -.
a
5. The mean and S.D. of a standard normal distribution are
(a) 1,0 (b) 0, 1 (c) 1, 1 (d) 0, 0.

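6. If $F(x) = 0, \quad -\infty < x < 4$
$= \dfrac{x - 4}{2}, \quad 4 \le x < 6$
$= 1, \quad 6 \le x < \infty$
be the distribution function of a uniform random variable, then the value of its pdf on $[4, 6]$ is

(a) $\dfrac{1}{3}$   (b) $\dfrac{1}{6}$   (c) $\dfrac{1}{4}$   (d) $\dfrac{1}{2}$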
7. If $f(x)$ is the pdf of a two-parameter exponential variate with parameters 2 and 6 then, for $x \ge 2$, $f(x) =$
x-2 x-2
1 --x-2 1 -- (d) ~e--2
(a) -e 6 (c) -e 6
2 2 4

8. If $f(x)$ is the pdf of an exponential variate $X$ with two parameters 3 and 5 then

(a) $0 \le f(x) \le \dfrac{1}{5}$   (b) $0 \le f(x) \le \dfrac{1}{3}$
(c) $0 \le f(x) \le 3$   (d) $0 \le f(x) \le 5$.

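9. If $F(x)$ is the distribution function of an exponential variate $X$ with two parameters 6 and 8 then $F(x) =$

(a) $e^{-\frac{x-6}{8}},\ 6 \le x < \infty$   (b) $1 - e^{-\frac{x-6}{8}},\ 6 \le x < \infty$
(c) $1 - e^{-\frac{x-6}{8}},\ 8 \le x < \infty$   (d) $1 - e^{-\frac{x-8}{6}},\ 6 \le x < \infty$.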

10. If $X$ has exponential distribution with parameter $5^{-1}$ then its standard deviation is

(a) 25   (b) $\dfrac{1}{5}$   (c) $\dfrac{1}{25}$   (d) 5.

11. If $X$ has exponential distribution with parameter $\lambda = 2$ then its pdf $f(x) =$
(a) $e^{-2x},\ x > 0$   (b) $2e^{-2x},\ -\infty < x < \infty$

(c) $2e^{-2x},\ x > 0$   (d) $2e^{-x},\ x > 0$


12. If $X$ has exponential distribution with parameter $\lambda = 1$ then $P(X > 3) =$
(a) $\dfrac{1}{e}$   (b) $e^3$   (c) $e$   (d) $\dfrac{1}{e^3}$
13. If $f(x) = \dfrac{1}{3\sqrt{2\pi}}\, e^{-\frac{(x-2)^2}{18}},\ -\infty < x < \infty$ is the pdf of a normal variate then its mean and variance are

(a) 2, 3   (b) 2, 9   (c) 0, 9   (d) 1, 2.



14. Normal curve of a variate represents the


(a) distribution function
(b) skewness
(c) probability density function
(d) none of these.
15. If $X$ is a normal variate with mean $-2$ and variance 25 then which one of the following is a standard normal variate?

(a) $\dfrac{X - 2}{5}$   (b) $\dfrac{X + 2}{25}$   (c) $\dfrac{X - 25}{2}$   (d) $\dfrac{X + 2}{5}$
16. If 3z - x =5 where z is a standard normal variate then
(a) x is a normal variate with mean 5, s.d. 3
(b) x is a normal variate with mean 3, s.d 5
(c) x is a normal variate with mean -5, s.d. 3
(d) none of these.
17. If $X$ has Binomial distribution with parameters $n = 500$ and $p = 0.001$ then which of the following variates is approximately a standard normal variate?

(a) $\dfrac{X - 0.5}{0.4995}$   (b) $\dfrac{X - 0.001}{500}$   (c) $\dfrac{X - 0.001}{\sqrt{500}}$   (d) $\dfrac{X - 0.5}{\sqrt{0.4995}}$.

18. If $X$ is a normal variate with mean 4 and s.d. 0.5 then $P(3.8 < X < 4.3) =$

(a) $P(-0.4 < z < 0.6)$   (b) $P(0.4 < z < 0.6)$
(c) $P(0 < z < 0.6)$   (d) none of these.
19. If $X$ is a normal variate with mean 70 and variance 25 and if $z$ is a standard normal variate, then $X = 66$ implies $z =$
(a) 0.8   (b) 0.6   (c) $-0.8$   (d) 0.

20. If $X$ is a normal variate with mean 4 and s.d. 0.5 and $z$ is the standard normal variate, then $z = 0.6$ implies $X =$
(b) 5   (c) 1.6   (d) 4.3

21. $X$ has normal distribution with s.d. 2 and $z$ is the standard normal variate. If $X = 9.6$ implies $z = -0.9$ then the mean of $X$ is
(a) 11   (b) 11.4   (c) 13   (d) none of these.
22. The pdf of a $\chi^2$ variate is
(a) periodic function   (b) odd function
(c) even function   (d) none of these.
23. If $f(x)$ is the pdf of a $\chi^2$ variate $x$ then $f(-10) =$

(a) $f(10)$   (b) $-f(10)$   (c) undefined   (d) 0.

24. If X2 is a X2 variate with 6 degrees of freedom then for X2 > 0,

_x2(1-x 2)2
e
(b) 2
4

(e) e-f(~x2
4
r (d) none of these.

Answers
1. a   2. b   3. a   4. b   5. b   6. d   7. c   8. a
9. b   10. d   11. c   12. d   13. b   14. c   15. d   16. c
17. d   18. a   19. c   20. d   21. b   22. d   23. d   24. c
MODULE-3
BIVARIATE DISTRIBUTIONS

(CONTINUOUS VARIATES)
8.1. Introduction.
In an earlier chapter we introduced joint distribution or bivariate distribution for discrete random variables. In this chapter we present the concept of bivariate distribution or joint distribution for a pair of continuous random variables.
Just as in the discrete case, if the continuous random variables $X$ and $Y$ assume values $x$ and $y$ in the intervals $a \le x \le b$ and $c \le y \le d$ respectively, then the bivariate $(X, Y)$ assumes all the values $(x, y)$ lying within the rectangle having vertices $(a, c)$, $(b, c)$, $(b, d)$ and $(a, d)$.

[Figure: rectangle with vertices $(a, c)$, $(b, c)$, $(b, d)$ and $(a, d)$]

Here also a bivariate describes an event. For example, let $X$ = height of husband and $Y$ = height of wife. Then the region $R = (4.5 \le X \le 5.2,\ 4 \le Y < 6)$ represents the event of getting a pair where the husband's height lies between 4.5 and 5.2 and his wife's height lies between 4 and 6.
8.2. Bivariate Distribution for Continuous Variables.
In an earlier chapter we developed the theory of bivariate distribution for discrete random variables by going through the concepts of 'Joint Probability mass', 'Marginal Probability mass' etc. In the continuous case we introduce the analogous terminologies, which are defined below:
Joint Distribution Function
For any two continuous random variables $X$ and $Y$, the joint distribution function or the joint cumulative distribution function or the bivariate distribution function of $X$ and $Y$ is defined by
$F(x, y) = P(-\infty < X \le x,\ -\infty < Y \le y)$.

As usual, $(-\infty < X \le x,\ -\infty < Y \le y) = \{(a, b) : -\infty < a \le x \text{ and } -\infty < b \le y\}$.

The Marginal Distribution Function of $X$ is
$F_X(x) = F(x, \infty) = P(-\infty < X \le x,\ -\infty < Y < \infty)$.
Obviously $F_X(x) = P(-\infty < X \le x)$, which is the distribution function of the single variable $X$. Similarly the marginal distribution function of $Y$ is $F_Y(y) = F(\infty, y) = P(-\infty < X < \infty,\ -\infty < Y \le y) = P(-\infty < Y \le y)$, which is nothing but the distribution function of $Y$.
8.3. Bivariate Probability Density Function.
For a bivariate $(X, Y)$ of two continuous random variables, a function $f(x, y)$ is called a bivariate probability density function or a two-dimensional probability density function (pdf) if the joint distribution function $F(x, y)$ is given by

$F(x, y) = \displaystyle\int_{v=-\infty}^{y} \int_{u=-\infty}^{x} f(u, v)\,du\,dv$ for all $(x, y)$.

In general, $P\{(X, Y) \in R\} = \displaystyle\iint_R f(x, y)\,dx\,dy$.

Fundamental requirements for a function $f(x, y)$ to be a probability density function.
If $f(x, y)$ is a two-dimensional pdf then

(i) $f(x, y) \ge 0$   (ii) $\displaystyle\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$.

Proof. Beyond the scope of the book.

8.4. Properties of Continuous Bivariate Distribution.
If $(X, Y)$ is a bivariate having distribution function $F(x, y)$ and pdf $f(x, y)$ then
(1) $P(X = a,\ Y = b) = 0$
(2) $P(a < X \le b,\ c < Y \le d) = \displaystyle\int_c^d \int_a^b f(x, y)\,dx\,dy$
(3) $\dfrac{\partial^2 F}{\partial x\,\partial y} = f(x, y) \quad \forall\, x, y$
(4) $P(x < X \le x + dx,\ y < Y \le y + dy) = dF = f(x, y)\,dx\,dy$.

Proofs: The proofs of the above properties are kept beyond the scope of the book.
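Illustration. Consider the function $f(x, y)$ which is defined as
$f(x, y) = 6(1 - x - y)$ for $x > 0,\ y > 0,\ x + y < 1$
$= 0$, elsewhere.

As $f(x, y) \ge 0$ everywhere ($\because x + y < 1$) and

$\displaystyle\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\,dx\,dy = \iint_R f(x, y)\,dx\,dy$ (where $R$ is the triangular region $x > 0,\ y > 0,\ x + y < 1$)

$= \displaystyle\int_{x=0}^{1} \int_{y=0}^{1-x} 6(1 - x - y)\,dy\,dx$

$= 6 \displaystyle\int_{x=0}^{1} \left[ y - xy - \frac{y^2}{2} \right]_{y=0}^{1-x} dx$

$= 6 \displaystyle\int_0^1 \left\{ (1 - x) - x(1 - x) - \frac{(1 - x)^2}{2} \right\} dx$

$= \dfrac{6}{2} \displaystyle\int_0^1 \left( 2 - 2x - 2x + 2x^2 - 1 - x^2 + 2x \right) dx$

$= 3 \displaystyle\int_0^1 (x^2 - 2x + 1)\,dx = 3\left[ \frac{x^3}{3} - x^2 + x \right]_0^1 = 3\left( \frac{1}{3} - 1 + 1 \right) = 1$

So the function obeys the two fundamental requirements of a pdf.
Therefore this function is a pdf of some bivariate $(X, Y)$.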
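The same check can be sketched numerically. The helper `iterated` below is an illustrative midpoint-rule routine (its name, signature and grid size are assumptions, not part of the text); it integrates over $x$ and, for each $x$, over $y$ between limits that may depend on $x$.

```python
def iterated(g, x_lo, x_hi, y_lo, y_hi, n=400):
    """Midpoint approximation of the integral of g(x, y) over
    x in [x_lo, x_hi] and y in [y_lo(x), y_hi(x)]."""
    hx = (x_hi - x_lo) / n
    total = 0.0
    for i in range(n):
        x = x_lo + (i + 0.5) * hx
        a, b = y_lo(x), y_hi(x)
        hy = (b - a) / n
        total += hx * hy * sum(g(x, a + (j + 0.5) * hy) for j in range(n))
    return total

def f(x, y):
    """Joint density of the illustration inside the triangle."""
    return 6.0 * (1.0 - x - y)

# Integrate over the triangle x > 0, y > 0, x + y < 1.
total = iterated(f, 0.0, 1.0, lambda x: 0.0, lambda x: 1.0 - x)
print(round(total, 5))
```

The printed value should be very close to 1, in agreement with the hand calculation.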
8.5. Marginal Distribution

Let $(X, Y)$ be a continuous bivariate having distribution function $F(x, y)$ and probability density $f(x, y)$. Then the marginal distribution functions of $X$ and $Y$ are given by

$F_X(x) = F(x, \infty) = \displaystyle\int_{v=-\infty}^{\infty} \int_{u=-\infty}^{x} f(u, v)\,du\,dv$

and $F_Y(y) = F(\infty, y) = \displaystyle\int_{v=-\infty}^{y} \int_{u=-\infty}^{\infty} f(u, v)\,du\,dv$ respectively.

The Marginal Density Functions of $X$ and $Y$ are

$f_X(x) = \displaystyle\int_{-\infty}^{\infty} f(x, y)\,dy$

and $f_Y(y) = \displaystyle\int_{-\infty}^{\infty} f(x, y)\,dx$ respectively.

Example. For the joint pdf (as seen in the previous example),

$f(x, y) = 6(1 - x - y),\ x > 0,\ y > 0,\ x + y < 1$
$= 0$, elsewhere,
the marginal pdfs of $X$ and $Y$ are respectively

$f_X(x) = \displaystyle\int_{y=-\infty}^{\infty} f(x, y)\,dy = \int_{y=0}^{1-x} 6(1 - x - y)\,dy$

$= 6 \displaystyle\int_0^{1-x} (1 - x - y)\,dy = 6\left( \frac{1}{2} - x + \frac{x^2}{2} \right) = 3(1 - 2x + x^2)$

for $0 < x < 1$, because in the region $x$ has this constraint,

and $f_Y(y) = \displaystyle\int_{x=-\infty}^{\infty} f(x, y)\,dx = \int_{x=0}^{1-y} 6(1 - x - y)\,dx$

$= 3(1 - 2y + y^2)$ for $0 < y < 1$.

Note that these $f_X(x)$ and $f_Y(y)$ do satisfy all the fundamental requirements of a pdf.

8.6. Conditional Density & Conditional Distribution

If the continuous bivariate $(X, Y)$ has joint pdf $f(x, y)$ then (i) the conditional probability density function of $X$ given $Y = y_0$ is given by

$f_{X/Y}(x / y_0) = \dfrac{f(x, y_0)}{f_Y(y_0)}$

(ii) the conditional probability density function of $Y$ given $X = x_0$ is given by

$f_{Y/X}(y / x_0) = \dfrac{f(x_0, y)}{f_X(x_0)}$

where $f_Y$ and $f_X$ are the marginal pdfs of $Y$ and $X$ respectively.


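Example. For the bivariate $(X, Y)$, let the joint pdf be

$f(x, y) = \dfrac{12}{5}\, x(2 - x - y), \quad 0 < x < 1,\ 0 < y < 1$
$= 0$, otherwise.
Then the conditional pdf of $X$ given $Y = 0.5$ will be

$f_{X/Y}(x / 0.5) = \dfrac{f(x, 0.5)}{f_Y(0.5)}$

$= \dfrac{f(x, 0.5)}{\displaystyle\int_{-\infty}^{\infty} f(x, 0.5)\,dx}$, using the definition of marginal pdf

$= \dfrac{\frac{12}{5}\, x(2 - x - 0.5)}{\displaystyle\int_0^1 \frac{12}{5}\, x(2 - x - 0.5)\,dx} = \dfrac{x(2 - x - 0.5)}{\frac{2}{3} - \frac{0.5}{2}}$

$= \dfrac{6x(2 - x - 0.5)}{4 - 3 \times 0.5} = \dfrac{6x(1.5 - x)}{2.5}$

The conditional density of $X$ given $Y = y$ will be
$f_{X/Y}(x / y) = \dfrac{f(x, y)}{f_Y(y)} = \dfrac{6x(2 - x - y)}{4 - 3y}$ (evaluation is as above).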

Theorem. For a continuous bivariate $(X, Y)$ having joint pdf $f(x, y)$, the conditional probability of the event $X \in A$ on the hypothesis $Y = y_0$ is

$P(X \in A / Y = y_0) = \displaystyle\int_A f_{X/Y}(x / y_0)\,dx$

where $P(X \in A / Y = y_0)$ is defined as

$P(X \in A / Y = y_0) = \displaystyle\lim_{\Delta y \to 0} P(X \in A / y_0 < Y < y_0 + \Delta y)$.

Proof. Omitted.
Corollary. The conditional probability,
$P(a < X < b / Y = y_0) = \displaystyle\int_a^b f_{X/Y}(x / y_0)\,dx$.

Note. In fact, for a continuous variate the event $(Y = y_0)$ has zero probability, as we know $P(Y = y_0) = 0$. Because of this, in the above theorem $P(X \in A / Y = y_0)$ is defined as above.
Illustration. In the above example

$P(0.5 < X < 0.7 / Y = 0.2) = \displaystyle\int_{0.5}^{0.7} f_{X/Y}(x / 0.2)\,dx$

$= \displaystyle\int_{0.5}^{0.7} \frac{6x(2 - x - 0.2)}{4 - 3 \times 0.2}\,dx$

$= \dfrac{30}{17} \displaystyle\int_{0.5}^{0.7} x(1.8 - x)\,dx = \dfrac{30}{17} \times 0.143333$
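As a numerical cross-check of this illustration, the sketch below computes the conditional probability directly from the joint pdf $f(x, y) = \frac{12}{5}x(2 - x - y)$ of the example. The helper `integrate` and the step count are illustrative assumptions.

```python
def joint(x, y):
    """Joint pdf of the example: (12/5) x (2 - x - y) on the unit square."""
    if 0 < x < 1 and 0 < y < 1:
        return 12.0 / 5.0 * x * (2.0 - x - y)
    return 0.0

def integrate(g, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

y0 = 0.2
f_y0 = integrate(lambda x: joint(x, y0), 0.0, 1.0)          # marginal density of Y at y0
prob = integrate(lambda x: joint(x, y0) / f_y0, 0.5, 0.7)   # P(0.5 < X < 0.7 / Y = 0.2)
print(round(prob, 4), round(30 / 17 * 0.143333, 4))
```

Both printed numbers should agree to about four decimal places.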
8.7. Independent Random Variables
In a previous chapter we defined the independence of two discrete random variables. Along the same lines we define:
"Definition: Two continuous random variables $X$ and $Y$ are said to be independent if $f(x, y) = f_X(x) f_Y(y)\ \forall\, x, y$, where $f(x, y)$ is the joint pdf and $f_X, f_Y$ are the marginal pdfs of $X$, $Y$ respectively."
Theorem 1. If $X$, $Y$ are independent continuous random variables, then
$P(a < X \le b,\ c < Y \le d) = \displaystyle\int_a^b f_X(x)\,dx \int_c^d f_Y(y)\,dy$

This is obvious since $P(a < X \le b,\ c < Y \le d)$

$= \displaystyle\int_c^d \int_a^b f(x, y)\,dx\,dy = \int_c^d \int_a^b f_X(x) f_Y(y)\,dx\,dy$.

Theorem 2. If $X$ and $Y$ are independent continuous random variables then the conditional density of $X$ given $Y = y_0$ is the same as the marginal density of $X$; the conditional density of $Y$ given $X = x_0$ is also the same as the marginal density of $Y$.
Proof. The conditional density,
$f_{X/Y}(x / y_0) = \dfrac{f(x, y_0)}{f_Y(y_0)}$, by definition

$= \dfrac{f_X(x) f_Y(y_0)}{f_Y(y_0)} \quad [\because X, Y \text{ are independent}]$

$= f_X(x)$, which holds for all $x$. This completes the proof.
The second part is similar.
Theorem 3. Let $(X, Y)$ be a continuous bivariate having joint pdf $f_{X,Y}(x, y)$. Let $(U, V)$ be another bivariate which assumes the value $(u, v)$ where $u = u(x, y)$ and $v = v(x, y)$. If $u(x, y)$ and $v(x, y)$ are continuously differentiable and the Jacobian $\dfrac{\partial(u, v)}{\partial(x, y)}$ has the same sign for all $(x, y)$, then the joint pdf of $(U, V)$ is

$f_{U,V}(u, v) = f_{X,Y}(x, y) \left| \dfrac{\partial(x, y)}{\partial(u, v)} \right|$

Proof. Beyond the scope of the book.
This theorem leads us to obtain the distribution of sums and quotients of two random variables as discussed below:
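8.8. Distribution of Sums and Quotients of Random Variables
Theorem 1. If $X$ and $Y$ are independent continuous variates then the probability density function of $U = X + Y$ is

$f_U(u) = \displaystyle\int_{-\infty}^{\infty} f_X(v) f_Y(u - v)\,dv$

Proof. We consider another random variable $V$ to obtain the new continuous bivariate $(U, V)$ where $U = X + Y$ and $V = X$.
Then $u = x + y,\ v = x$ ... (1)

$\therefore$ the Jacobian $\dfrac{\partial(u, v)}{\partial(x, y)} = \begin{vmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\ \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{vmatrix} = \begin{vmatrix} 1 & 1 \\ 1 & 0 \end{vmatrix} = -1$

$\therefore \dfrac{\partial(u, v)}{\partial(x, y)}$ has the same sign for all $(x, y)$.

$\therefore$ by the above theorem,

$f_{U,V}(u, v) = f_{X,Y}(x, y) \left| \dfrac{\partial(x, y)}{\partial(u, v)} \right|$

or, $f_{U,V}(u, v) = f_X(x) f_Y(y) \times |-1| = f_X(x) f_Y(y)$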

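Now, the marginal pdf, $f_U(u) = \displaystyle\int_{-\infty}^{\infty} f_{U,V}(u, v)\,dv$

or, $f_U(u) = \displaystyle\int_{-\infty}^{\infty} f_X(x) f_Y(y)\,dv$

$= \displaystyle\int_{-\infty}^{\infty} f_X(v) f_Y(u - v)\,dv$, using (1).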
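The convolution formula can be illustrated numerically. The sketch below takes $X$ and $Y$ to be independent with density $e^{-t}$, $t \ge 0$ (an illustrative choice, not from the text) and compares the integral with $u e^{-u}$, the gamma density with parameter $(2, 1)$ that the sum is expected to have. The function names and the integration range are assumptions.

```python
import math

def exp_pdf(t):
    """Density e^(-t) for t >= 0, zero otherwise."""
    return math.exp(-t) if t >= 0 else 0.0

def sum_density(u, fx, fy, lo=0.0, hi=50.0, n=50_000):
    """Midpoint approximation of the integral of fx(v) * fy(u - v) dv.
    The range [lo, hi] assumes fx vanishes below lo and is negligible above hi."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        v = lo + (i + 0.5) * h
        total += fx(v) * fy(u - v)
    return h * total

u = 1.5
approx = sum_density(u, exp_pdf, exp_pdf)
exact = u * math.exp(-u)
print(round(approx, 5), round(exact, 5))
```

The two printed values should agree closely, since the integrand equals $e^{-u}$ on $0 \le v \le u$ and vanishes elsewhere.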

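Theorem 2. If $X$ and $Y$ are two independent normal variates with parameters $(m_x, \sigma_x)$ and $(m_y, \sigma_y)$ respectively, then $U = X + Y$ is a normal variate with parameter $(m, \sigma)$ where $m = m_x + m_y$ and $\sigma = \sqrt{\sigma_x^2 + \sigma_y^2}$.

Proof. By hypothesis, $f_X(x) = \dfrac{1}{\sigma_x \sqrt{2\pi}}\, e^{-\frac{(x - m_x)^2}{2\sigma_x^2}}, \quad -\infty < x < \infty$

and $f_Y(y) = \dfrac{1}{\sigma_y \sqrt{2\pi}}\, e^{-\frac{(y - m_y)^2}{2\sigma_y^2}}, \quad -\infty < y < \infty$.

By the previous theorem the marginal pdf of $U$,

$f_U(u) = \displaystyle\int_{-\infty}^{\infty} \frac{1}{\sigma_x \sqrt{2\pi}}\, e^{-\frac{(v - m_x)^2}{2\sigma_x^2}} \cdot \frac{1}{\sigma_y \sqrt{2\pi}}\, e^{-\frac{(u - v - m_y)^2}{2\sigma_y^2}}\,dv$ ... (1)

Now, $\dfrac{(v - m_x)^2}{\sigma_x^2} + \dfrac{(u - v - m_y)^2}{\sigma_y^2}$

$= \dfrac{(v - m_x)^2}{\sigma_x^2} + \dfrac{\{u - (m_x + m_y) - (v - m_x)\}^2}{\sigma_y^2}$

$= \dfrac{(v - \mu)^2}{\lambda^2} + \dfrac{(u - m)^2}{\sigma^2}$, on completing the square in $v$

[Putting $\mu = m_x + \dfrac{\sigma_x^2}{\sigma^2}(u - m)$ and $\lambda = \dfrac{\sigma_x \sigma_y}{\sigma}$]

$\therefore$ from (1), $f_U(u) = \dfrac{1}{2\pi \sigma_x \sigma_y}\, e^{-\frac{(u - m)^2}{2\sigma^2}} \displaystyle\int_{-\infty}^{\infty} e^{-\frac{(v - \mu)^2}{2\lambda^2}}\,dv$

$= \dfrac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(u - m)^2}{2\sigma^2}} \times 1 \quad \left[\because f(v) = \dfrac{1}{\lambda\sqrt{2\pi}}\, e^{-\frac{(v - \mu)^2}{2\lambda^2}} \text{ is a pdf}\right]$

$= \dfrac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(u - m)^2}{2\sigma^2}}$

$\therefore U$ is a normal variate with parameter $(m, \sigma)$.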
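A numerical spot check of this theorem, under illustrative parameter values that are not from the text: the sketch evaluates the convolution integral at one point $u$ and compares it with the normal density having $m = m_x + m_y$ and $\sigma = \sqrt{\sigma_x^2 + \sigma_y^2}$.

```python
import math

def normal_pdf(x, m, s):
    """Normal density with mean m and standard deviation s."""
    return math.exp(-((x - m) ** 2) / (2.0 * s * s)) / (s * math.sqrt(2.0 * math.pi))

mx, sx = 1.0, 2.0      # illustrative parameters of X
my, sy = -0.5, 1.5     # illustrative parameters of Y
m = mx + my
s = math.sqrt(sx ** 2 + sy ** 2)

def sum_density(u, lo=-40.0, hi=40.0, n=40_000):
    """Midpoint approximation of the integral of f_X(v) * f_Y(u - v) dv."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        v = lo + (i + 0.5) * h
        total += normal_pdf(v, mx, sx) * normal_pdf(u - v, my, sy)
    return h * total

u = 2.0
approx = sum_density(u)
exact = normal_pdf(u, m, s)
print(approx, exact)
```

Agreement at a single point is only an illustration of the result, not a proof of it.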
Theorem 3. If $X$ and $Y$ are independent Gamma variates with parameters $(l, 1)$ and $(m, 1)$ respectively then
(i) $(X + Y)$ is a Gamma variate with parameter $(l + m, 1)$
(ii) the pdf of $\dfrac{X}{Y}$ is $f_V(v) = \dfrac{v^{l-1}}{(1 + v)^{l+m} B(l, m)},\ v > 0$, where $V = \dfrac{X}{Y}$.

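Proof. Put $U = X + Y$ and $V = \dfrac{X}{Y}$,

i.e. $u = x + y$ and $v = \dfrac{x}{y}$. From these we get $x = \dfrac{uv}{1 + v}$ and $y = \dfrac{u}{1 + v}$.

Now, $\dfrac{\partial(u, v)}{\partial(x, y)} = \begin{vmatrix} 1 & 1 \\ \dfrac{1}{y} & -\dfrac{x}{y^2} \end{vmatrix} = -\dfrac{x}{y^2} - \dfrac{1}{y} = -\dfrac{x + y}{y^2}$

which is always negative since $x, y > 0$ for the Gamma distribution.

$\therefore$ when $u, v > 0$, $f_{U,V}(u, v) = f_{X,Y}(x, y) \left| \dfrac{\partial(x, y)}{\partial(u, v)} \right|$

$= f_X(x) f_Y(y)\, \dfrac{y^2}{x + y}$

$= \dfrac{e^{-x} x^{l-1}}{\Gamma(l)} \cdot \dfrac{e^{-y} y^{m-1}}{\Gamma(m)} \cdot \dfrac{y^2}{x + y} \quad [\because X, Y \text{ are Gamma variates}]$

$= \dfrac{e^{-(x+y)} x^{l-1} y^{m-1}}{\Gamma(l)\Gamma(m)} \cdot \dfrac{y^2}{x + y}$

$= \dfrac{e^{-u} \left(\dfrac{uv}{1+v}\right)^{l-1} \left(\dfrac{u}{1+v}\right)^{m-1}}{\Gamma(l)\Gamma(m)} \cdot \left(\dfrac{u}{1+v}\right)^2 \cdot \dfrac{1}{u}$

$= \dfrac{e^{-u}\, u^{l-1+m-1+2}\, v^{l-1}}{\Gamma(l)\Gamma(m)(1 + v)^{l-1+m-1+2}} \cdot \dfrac{1}{u}$

$= \dfrac{e^{-u}\, u^{l+m-1}\, v^{l-1}}{\Gamma(l)\Gamma(m)(1 + v)^{l+m}}$; otherwise $f_{U,V}(u, v) = 0$.

The pdf of $U$ is

$f_U(u) = \displaystyle\int_{-\infty}^{\infty} f_{U,V}(u, v)\,dv$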

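$= \displaystyle\int_{-\infty}^{0} 0\,dv + \int_0^\infty \frac{e^{-u}\, u^{l+m-1}\, v^{l-1}}{\Gamma(l)\Gamma(m)(1 + v)^{l+m}}\,dv$

$= \dfrac{e^{-u}\, u^{l+m-1}}{\Gamma(l)\Gamma(m)} \displaystyle\int_0^\infty \frac{v^{l-1}}{(1 + v)^{l+m}}\,dv = \dfrac{e^{-u}\, u^{l+m-1}}{\Gamma(l)\Gamma(m)}\, B(l, m)$

$= \dfrac{e^{-u}\, u^{l+m-1}}{\Gamma(l + m)}$ for $u > 0 \quad \left[\because B(l, m) = \dfrac{\Gamma(l)\Gamma(m)}{\Gamma(l + m)}\right]$

which is the pdf of a Gamma variate with parameter $(l + m, 1)$.

$\therefore X + Y$ is a Gamma variate with parameter $(l + m, 1)$.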

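The pdf of $V$ is

$f_V(v) = \displaystyle\int_{-\infty}^{\infty} f_{U,V}(u, v)\,du$

$= \displaystyle\int_{-\infty}^{0} 0\,du + \int_0^\infty \frac{e^{-u}\, u^{l+m-1}\, v^{l-1}}{\Gamma(l)\Gamma(m)(1 + v)^{l+m}}\,du$

$= \dfrac{v^{l-1}}{\Gamma(l)\Gamma(m)(1 + v)^{l+m}} \displaystyle\int_0^\infty e^{-u}\, u^{l+m-1}\,du$

$= \dfrac{v^{l-1}}{\Gamma(l)\Gamma(m)(1 + v)^{l+m}}\, \Gamma(l + m)$

$= \dfrac{v^{l-1}}{(1 + v)^{l+m}} \cdot \dfrac{\Gamma(l + m)}{\Gamma(l)\Gamma(m)}$

$= \dfrac{v^{l-1}}{(1 + v)^{l+m}} \cdot \dfrac{1}{B(l, m)}$

$\therefore$ the pdf of $V = \dfrac{X}{Y}$ is given by

$f_V(v) = \dfrac{v^{l-1}}{(1 + v)^{l+m}\, B(l, m)},\ v > 0$

$= 0$, otherwise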

8.9. Illustrative Examples


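Example 1. The joint pdf of a bivariate $(X, Y)$ is
$f(x, y) = C(x + y),\ x > 0,\ y > 0,\ x + y < 2$
$= 0$, elsewhere.

Find $C$ and $P\left(X < 1,\ Y > \dfrac{1}{2}\right)$.

Solution. The region $R\,\{x > 0,\ y > 0,\ x + y < 2\}$ is the triangle bounded by the axes and the line $x + y = 2$.

Now, $\displaystyle\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$

or, $\displaystyle\int_{x=0}^{2} \int_{y=0}^{2-x} C(x + y)\,dy\,dx = 1$

or, $C \cdot \dfrac{1}{2} \displaystyle\int_0^2 (4 - x^2)\,dx = 1$ (detailed calculation is not shown)

or, $C = \dfrac{3}{8}$

Now, $P\left(X < 1,\ Y > \dfrac{1}{2}\right) = P\left(0 < X < 1,\ \dfrac{1}{2} < Y < 2 - X\right)$

$= \displaystyle\int_0^1 \int_{y=1/2}^{2-x} C(x + y)\,dy\,dx = \frac{3}{8} \int_0^1 \int_{y=1/2}^{2-x} (x + y)\,dy\,dx$

$= \dfrac{35}{64}$ (detailed calculation is not shown)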
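Since the detailed calculation is not shown, the two values can be checked numerically. The helper `iterated` below is an illustrative midpoint-rule routine (an assumption, not part of the text) for an integral over $x$ with $y$-limits depending on $x$.

```python
def iterated(g, x_lo, x_hi, y_lo, y_hi, n=400):
    """Midpoint approximation of the integral of g(x, y) over
    x in [x_lo, x_hi] and y in [y_lo(x), y_hi(x)]."""
    hx = (x_hi - x_lo) / n
    total = 0.0
    for i in range(n):
        x = x_lo + (i + 0.5) * hx
        a, b = y_lo(x), y_hi(x)
        hy = (b - a) / n
        total += hx * hy * sum(g(x, a + (j + 0.5) * hy) for j in range(n))
    return total

def g(x, y):
    """Unnormalised density x + y (the constant C is found below)."""
    return x + y

mass = iterated(g, 0.0, 2.0, lambda x: 0.0, lambda x: 2.0 - x)   # integral over the triangle R
C = 1.0 / mass
prob = C * iterated(g, 0.0, 1.0, lambda x: 0.5, lambda x: 2.0 - x)
print(round(C, 4), round(prob, 4), 35 / 64)
```

The computed $C$ should be close to $3/8$ and the probability close to $35/64$.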

Example 2. The joint pdf of the bivariate $(X, Y)$ is

$f(x, y) = e^{-(x+y)},\ x \ge 0,\ y \ge 0$
$= 0$, elsewhere.
Find (i) marginal pdf of $X$ and $Y$ (ii) $P(X + Y \le 4)$

(iii) $P(X \ge 1)$ (iv) $P(X \le Y)$ (v) $P(2 < X + Y < 4)$.

Show that $X$ and $Y$ are independent.



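Solution. (i) The marginal pdf of $X$ is

$f_X(x) = \displaystyle\int_{-\infty}^{\infty} f(x, y)\,dy = \int_0^\infty e^{-(x+y)}\,dy = e^{-x} \int_0^\infty e^{-y}\,dy = e^{-x}$

$\therefore f_X(x) = e^{-x},\ x \ge 0$

Similarly, $f_Y(y) = e^{-y},\ y \ge 0$.

(ii) $P(X + Y \le 4) = \displaystyle\iint_R f(x, y)\,dx\,dy$

where $R$ is the triangular region $x \ge 0,\ y \ge 0,\ x + y \le 4$.

$\therefore P(X + Y \le 4) = \displaystyle\int_{x=0}^{4} \int_{y=0}^{4-x} e^{-(x+y)}\,dy\,dx$

$= \displaystyle\int_0^4 e^{-x} \left\{ \int_0^{4-x} e^{-y}\,dy \right\} dx = \int_0^4 e^{-x} \left(1 - e^{x-4}\right) dx$

$= 1 - 5e^{-4}$ (detailed evaluation not shown)

(iii) $P(X \ge 1) = \displaystyle\int_1^\infty f_X(x)\,dx = \int_1^\infty e^{-x}\,dx = e^{-1}$

(iv) $P(X \le Y) = P(X - Y \le 0) = \displaystyle\iint_{R'} f(x, y)\,dx\,dy$
(where $R'$ is the open region $0 \le x \le y$)

$= \displaystyle\int_{x=0}^{\infty} \int_{y=x}^{\infty} e^{-(x+y)}\,dy\,dx = \int_{x=0}^{\infty} -e^{-x} \left[ e^{-y} \right]_{y=x}^{\infty} dx$

$= \displaystyle\int_0^\infty e^{-x} e^{-x}\,dx = \int_0^\infty e^{-2x}\,dx$

$= -\dfrac{1}{2} \left[ e^{-2x} \right]_0^\infty = \dfrac{1}{2} \quad \left[\because \displaystyle\lim_{x \to \infty} e^{-2x} = 0\right]$

(v) $P(2 \le X + Y \le 4) = \displaystyle\iint_{R''} f(x, y)\,dx\,dy$
(where $R''$ is the region in the first quadrant between the lines $x + y = 2$ and $x + y = 4$)

$= \displaystyle\int_{x=0}^{2} \int_{y=2-x}^{4-x} e^{-(x+y)}\,dy\,dx + \int_{x=2}^{4} \int_{y=0}^{4-x} e^{-(x+y)}\,dy\,dx$

$= \dfrac{3}{e^2} - \dfrac{5}{e^4}$ (detailed evaluation of the double integral is not shown)

Now, $f_X(x) f_Y(y) = e^{-x} \cdot e^{-y},\ x, y \ge 0$
$= e^{-(x+y)} = f(x, y)$
$\therefore X$ and $Y$ are independent.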
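Parts (ii) and (v), whose evaluations are not shown, can be checked numerically with a short sketch. The helper `iterated` (name, signature and grid size are illustrative assumptions) integrates over $x$ with $y$-limits depending on $x$.

```python
import math

def iterated(g, x_lo, x_hi, y_lo, y_hi, n=400):
    """Midpoint approximation of the integral of g(x, y) over
    x in [x_lo, x_hi] and y in [y_lo(x), y_hi(x)]."""
    hx = (x_hi - x_lo) / n
    total = 0.0
    for i in range(n):
        x = x_lo + (i + 0.5) * hx
        a, b = y_lo(x), y_hi(x)
        hy = (b - a) / n
        total += hx * hy * sum(g(x, a + (j + 0.5) * hy) for j in range(n))
    return total

def f(x, y):
    """Joint pdf e^(-(x + y)) on the first quadrant."""
    return math.exp(-(x + y))

# (ii) the triangle x >= 0, y >= 0, x + y <= 4
p_le_4 = iterated(f, 0.0, 4.0, lambda x: 0.0, lambda x: 4.0 - x)
# (v) the strip 2 <= x + y <= 4 in the first quadrant
p_2_to_4 = iterated(f, 0.0, 4.0, lambda x: max(2.0 - x, 0.0), lambda x: 4.0 - x)

print(round(p_le_4, 4), round(1 - 5 * math.exp(-4), 4))
print(round(p_2_to_4, 4), round(3 * math.exp(-2) - 5 * math.exp(-4), 4))
```

Each printed pair should agree to about four decimal places.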
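Example 3. The joint pdf of $(X, Y)$ is given by
$f(x, y) = 2,\ 0 < x < y < 1$
$= 0$, elsewhere.
Find (i) marginal density of $X$ and $Y$
(ii) conditional density of $X$ given $Y = y$
(iii) conditional density of $Y$ given $X = x$

(iv) $P\left(Y \ge \dfrac{1}{2} \Big/ X = \dfrac{1}{3}\right)$ and $P\left(X \ge \dfrac{1}{2} \Big/ Y = \dfrac{2}{3}\right)$

Solution. (i) The marginal densities of $X$ and $Y$ are

[Figure: region $R$ given by $0 < x < y < 1$, lying above the line $y = x$ with corner $(1, 1)$]

$f_X(x) = \displaystyle\int_{-\infty}^{\infty} f(x, y)\,dy = \int_{y=x}^{1} 2\,dy,\ 0 < x < 1$

$\therefore f_X(x) = 2(1 - x),\ 0 < x < 1$

and $f_Y(y) = \displaystyle\int_{x=0}^{y} 2\,dx = 2y,\ 0 < y < 1$

(ii) Conditional density of $X$ given $Y = y$, $0 < y < 1$:

$f_{X/Y}(x / y) = \dfrac{f(x, y)}{f_Y(y)} = \dfrac{2}{2y} = \dfrac{1}{y},\ 0 < x < y$

(iii) Conditional density of $Y$ given $X = x$, $0 < x < 1$:

$f_{Y/X}(y / x) = \dfrac{f(x, y)}{f_X(x)} = \dfrac{2}{2(1 - x)} = \dfrac{1}{1 - x},\ x < y < 1$

(iv) $P\left(Y \ge \dfrac{1}{2} \Big/ X = \dfrac{1}{3}\right) = P\left(\dfrac{1}{2} \le Y \le 1 \Big/ X = \dfrac{1}{3}\right)$

$= \displaystyle\int_{1/2}^{1} f_{Y/X}\left(y \Big/ \tfrac{1}{3}\right) dy = \int_{1/2}^{1} \frac{1}{1 - \frac{1}{3}}\,dy = \frac{3}{2} \left[ y \right]_{1/2}^{1} = \frac{3}{4}$

and $P\left(X \ge \dfrac{1}{2} \Big/ Y = \dfrac{2}{3}\right) = P\left(\dfrac{1}{2} \le X \le \dfrac{2}{3} \Big/ Y = \dfrac{2}{3}\right)$

$= \displaystyle\int_{1/2}^{2/3} f_{X/Y}\left(x \Big/ \tfrac{2}{3}\right) dx = \int_{1/2}^{2/3} \frac{1}{2/3}\,dx = \frac{1}{4}$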
Example 4. Determine the value of $k$ which makes

$f(x, y) = kxy,\ 0 < x < 1,\ 0 < y < x$

a joint probability density function. Find the marginal density and conditional density for $X$ and $Y$ respectively. Show that the variables $X$ and $Y$ are not independent.

Solution. By the fundamental property, $\displaystyle\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$

or, $\displaystyle\int_{x=0}^{1} \int_{y=0}^{x} kxy\,dy\,dx = 1$ or, $k \displaystyle\int_{x=0}^{1} x \left[ \frac{y^2}{2} \right]_0^x dx = 1$

or, $\dfrac{k}{2} \displaystyle\int_{x=0}^{1} x^3\,dx = 1$ or, $\dfrac{k}{2} \cdot \dfrac{1}{4} = 1 \quad \therefore k = 8$

The marginal pdf of $X$,

$f_X(x) = \displaystyle\int_{-\infty}^{\infty} f(x, y)\,dy = \int_{y=0}^{x} kxy\,dy,\ 0 < x < 1$

$= 8x \cdot \dfrac{x^2}{2} = 4x^3$

$\therefore f_X(x) = 4x^3,\ 0 < x < 1$
$= 0$, elsewhere.

The marginal pdf of $Y$, $f_Y(y) = \displaystyle\int_{-\infty}^{\infty} f(x, y)\,dx$

Now, the region $R\,(0 < x < 1,\ 0 < y < x)$ is the triangle below the line $y = x$. In this region, $0 < y < 1$ and for any $y$, $y < x < 1$.

$\therefore f_Y(y) = \displaystyle\int_y^1 kxy\,dx = 8y \int_y^1 x\,dx = 8y \left[ \frac{x^2}{2} \right]_y^1 = 8y \left( \frac{1}{2} - \frac{y^2}{2} \right) = 4y(1 - y^2)$

$\therefore$ the marginal pdf is $f_Y(y) = 4y(1 - y^2),\ 0 < y < 1$.

The conditional density of $X$ given $Y = y_0$, $0 < y_0 < 1$, is

$f_{X/Y}(x / y_0) = \dfrac{f(x, y_0)}{f_Y(y_0)} = \dfrac{k x y_0}{4 y_0 (1 - y_0^2)} = \dfrac{2x}{1 - y_0^2},\ y_0 < x < 1$

The conditional density of $Y$ given $X = x_0$, $0 < x_0 < 1$, is

$f_{Y/X}(y / x_0) = \dfrac{f(x_0, y)}{f_X(x_0)} = \dfrac{8 x_0 y}{4 x_0^3}$

$\therefore f_{Y/X}(y / x_0) = \dfrac{2y}{x_0^2},\ 0 < y < x_0$

Now, $f_X(x) f_Y(y) = 4x^3 \cdot 4y(1 - y^2) = 16 x^3 y (1 - y^2)$ in the region $R$

$\ne f(x, y)$
$\therefore X$ and $Y$ are not independent.
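The value of $k$ and the failure of the product rule can be illustrated numerically. In the sketch below the helper `iterated` and the test point $(0.8, 0.4)$ are illustrative assumptions, not part of the text.

```python
def iterated(g, x_lo, x_hi, y_lo, y_hi, n=400):
    """Midpoint approximation of the integral of g(x, y) over
    x in [x_lo, x_hi] and y in [y_lo(x), y_hi(x)]."""
    hx = (x_hi - x_lo) / n
    total = 0.0
    for i in range(n):
        x = x_lo + (i + 0.5) * hx
        a, b = y_lo(x), y_hi(x)
        hy = (b - a) / n
        total += hx * hy * sum(g(x, a + (j + 0.5) * hy) for j in range(n))
    return total

# k is the reciprocal of the integral of x*y over 0 < x < 1, 0 < y < x.
mass = iterated(lambda x, y: x * y, 0.0, 1.0, lambda x: 0.0, lambda x: x)
k = 1.0 / mass

def f_X(x):
    return 4.0 * x ** 3               # marginal pdf of X from the solution

def f_Y(y):
    return 4.0 * y * (1.0 - y * y)    # marginal pdf of Y from the solution

x, y = 0.8, 0.4                       # an arbitrary point inside the region R
joint_value = k * x * y
product_value = f_X(x) * f_Y(y)
print(round(k, 3), round(joint_value, 4), round(product_value, 4))
```

The joint density and the product of the marginals differ at this point, which is enough to show that $X$ and $Y$ are not independent.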
Example 5. The joint pdf of $(X, Y)$ is given by
$f(x, y) = \dfrac{1}{8}(6 - x - y),\ 0 < x < 2,\ 2 < y < 4$
$= 0$, elsewhere.
Find $P(X < 1,\ Y < 3)$, $P(X + Y < 3)$, the marginal distributions, $P(X < 1 / Y = 3)$ and $P(X < 1 / Y < 3)$.

Solution. The region R {O < x < 2, 2 < Y < 4} is shown in the adjacent
figure
1st part: P(X <1, Y < 3) = P(O < X <1, 2 < Y < 3)
3 1
.
= J J j(x,y)dxdy :(2,4)
4 . - • - - - - - -.-- - - - - .
r-2x=0 3 R
3 1 1
= J J -(6-x- y)dxdy 2 8\_.
y=2x=08
1
.1 3 X2
o
=-
_ [
8 y-2
J 6x---XY
2
]
dy
x=O

=-
1
J (6---
3 1
y)dy=- J --
1 (11 3 )
Y dy=-
3
8 y=2 2 8 y=2 2 8
BIVARIATE DISTRIBUTIONS (CONTINUOUS VARIATE) 281

2nd part : the st. line x + y =3 is shown in the figure. x + y < 3 is


the origin side of this line.
RI is the region within R defined by x + y < 3 ,

:. the probability p(X + Y < 3) = II f(x,y)dxdy


Rl

I 3-x 1 I I [ 2 ]3-X
= I I g(6-x- y)dxdy =- I 6y-xy_L dx
x=Oy=2 ' 8 x=O 2 2

1
=- II[{ 6(3-x)-x(3-x)--2- (3-X)2} - { 12-2x-2 }] 1 5 =-5
dx =-x-
8 x=o 8 3 24
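3rd part: The marginal pdf of X is $f_X(x) = \int_{-\infty}^{\infty} f(x,y)\,dy$.

Now, 0 < x < 2 and, for all x, 2 < y < 4.

$$\therefore\ f_X(x) = \int_{y=2}^{4}\frac18(6-x-y)\,dy = \frac18\left[6y - xy - \frac{y^2}{2}\right]_{y=2}^{4} = \frac18(6-2x) = \frac14(3-x),\quad 0<x<2$$

The marginal pdf of Y is $f_Y(y) = \int_{-\infty}^{\infty} f(x,y)\,dx$.

Now, 2 < y < 4 and, for each y, 0 < x < 2.

$$\therefore\ f_Y(y) = \int_{0}^{2}\frac18(6-x-y)\,dx = \frac18(10-2y) = \frac14(5-y),\quad 2<y<4$$

The conditional density of X given $Y = y_0$, $2 < y_0 < 4$, is

$$f_{X/Y}(x/y_0) = \frac{f(x,y_0)}{f_Y(y_0)} = \frac{\frac18(6-x-y_0)}{\frac14(5-y_0)} = \frac{6-x-y_0}{2(5-y_0)},\quad 0<x<2$$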

Now, P(X < 1 / Y = 3) = P(0 < X < 1 / Y = 3)

$$= \int_0^1 f_{X/Y}(x/3)\,dx = \int_0^1\frac{6-x-3}{2(5-3)}\,dx = \frac14\int_0^1(3-x)\,dx = \frac14\left[3x - \frac{x^2}{2}\right]_0^1 = \frac14\left(3 - \frac12\right) = \frac58$$

Now, P(X < 1 / Y < 3) = P(0 < X < 1 / 2 < Y < 3)

$$= \frac{P(0<X<1,\ 2<Y<3)}{P(2<Y<3)}\quad\text{by definition of conditional probability}$$

$$= \frac{\int_{y=2}^{3}\int_{x=0}^{1}\frac18(6-x-y)\,dx\,dy}{\int_{2}^{3}\frac14(5-y)\,dy} = \frac{3/8}{5/8} = \frac35$$
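The probabilities found in Example 5 can be cross-checked numerically. The following is only a sketch, assuming a simple midpoint rule; the helper name `double_integral` and the grid size `n` are illustrative choices, not part of the text.

```python
# Numerical check of Example 5 with a midpoint-rule double integral.

def f(x, y):
    """Joint pdf of Example 5; zero outside 0 < x < 2, 2 < y < 4."""
    return (6 - x - y) / 8 if 0 < x < 2 and 2 < y < 4 else 0.0

def double_integral(g, x_lo, x_hi, y_lo, y_hi, n=400):
    """Midpoint rule for the integral of g(x, y); y_lo and y_hi are functions of x."""
    hx = (x_hi - x_lo) / n
    total = 0.0
    for i in range(n):
        x = x_lo + (i + 0.5) * hx
        a, b = y_lo(x), y_hi(x)
        hy = (b - a) / n
        total += sum(g(x, a + (j + 0.5) * hy) for j in range(n)) * hy * hx
    return total

p_joint = double_integral(f, 0, 1, lambda x: 2, lambda x: 3)      # P(X < 1, Y < 3)
p_sum = double_integral(f, 0, 1, lambda x: 2, lambda x: 3 - x)    # P(X + Y < 3)
p_y = double_integral(f, 0, 2, lambda x: 2, lambda x: 3)          # P(Y < 3)
p_cond = p_joint / p_y                                            # P(X < 1 / Y < 3)

print(round(p_joint, 4), round(p_sum, 4), round(p_cond, 4))
```

The three printed values should agree, to the displayed accuracy, with 3/8, 5/24 and the ratio P(0 < X < 1, 2 < Y < 3) / P(2 < Y < 3).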

Example 6. If $f(x,y) = 3x^2 - 8xy + 6y^2$, 0 < x < 1, 0 < y < 1, is the joint pdf of the bivariate (X, Y), find the conditional densities $f_X(x/y)$ and $f_Y(y/x)$. Hence find whether X and Y are independent.

Solution. The marginal pdf of X is $f_X(x) = \int_{-\infty}^{\infty} f(x,y)\,dy$

$$\text{or,}\quad f_X(x) = \int_0^1(3x^2 - 8xy + 6y^2)\,dy = 2 - 4x + 3x^2,\quad 0<x<1$$

and the marginal pdf of Y is $f_Y(y) = \int_{-\infty}^{\infty} f(x,y)\,dx$

$$\text{or,}\quad f_Y(y) = \int_0^1(3x^2 - 8xy + 6y^2)\,dx = 1 - 4y + 6y^2,\quad 0<y<1$$

By definition, the conditional density of X, for 0 < y < 1, is

$$f_X(x/y) = \frac{f(x,y)}{f_Y(y)} = \frac{3x^2 - 8xy + 6y^2}{1 - 4y + 6y^2},\quad 0<x<1,\ \text{given any } y,\ 0<y<1$$

Now,

$$f_Y(y/x) = \frac{f(x,y)}{f_X(x)} = \frac{3x^2 - 8xy + 6y^2}{2 - 4x + 3x^2},\quad 0<y<1,\ \text{given any } x,\ 0<x<1$$

Since $f_X(x/y)$ depends on y, it is not equal to the marginal $f_X(x)$; hence X and Y are not independent.
Example 7. The joint density of the bivariate (X, Y) is f(x, y) = 2, 0 < x < 1, 0 < y < x.
Obtain the marginal and conditional densities of X and Y. Evaluate $P\left(\frac14 < X < \frac34 \,\big/\, Y = \frac12\right)$.

Solution. The region R {0 < x < 1, 0 < y < x} is shown in the adjacent figure.
The marginal densities are

$$f_X(x) = \int_{-\infty}^{\infty} f(x,y)\,dy = \int_{y=0}^{x} 2\,dy = 2x\quad\text{for } 0<x<1 \text{ (and for any such } x,\ 0<y<x)$$

∴ $f_X(x) = 2x$, 0 < x < 1.

$$\text{Now,}\quad f_Y(y) = \int_{-\infty}^{\infty} f(x,y)\,dx = \int_{y}^{1} 2\,dx = 2(1-y)\quad\text{for } 0<y<1 \text{ (and for any such } y,\ y<x<1)$$

∴ $f_Y(y) = 2(1-y)$, 0 < y < 1.

The conditional density of X given $Y = y_0$, $0 < y_0 < 1$, is

$$f_{X/Y}(x/y_0) = \frac{f(x,y_0)}{f_Y(y_0)} = \frac{2}{2(1-y_0)} = \frac{1}{1-y_0},\quad y_0 < x < 1 \qquad (1)$$

The conditional density of Y given $X = x_0$, $0 < x_0 < 1$, is

$$f_{Y/X}(y/x_0) = \frac{f(x_0,y)}{f_X(x_0)} = \frac{2}{2x_0} = \frac{1}{x_0},\quad 0 < y < x_0$$

$$P\left(\frac14 < X < \frac34 \,\Big/\, Y = \frac12\right) = \int_{1/4}^{3/4} f_{X/Y}\left(x/\tfrac12\right)dx = \int_{1/2}^{3/4}\frac{1}{1-\frac12}\,dx\quad\text{using (1) with } y_0 = \tfrac12$$

(the conditional density vanishes for x < 1/2)

$$= 2\left(\frac34 - \frac12\right) = \frac12$$
Example 8. A bivariate distribution for X and Y is given by the pdf
$f(x,y) = e^{-y}$, 0 < x < y < ∞
= 0, otherwise.
Find the conditional density of X given Y = y.

Solution. The region R: {0 < x < y < ∞} is open. It is shown in the adjacent figure. In this region, 0 < y < ∞ and, for any y, 0 < x < y.
The marginal density is

$$f_Y(y) = \int_{-\infty}^{\infty} f(x,y)\,dx = \int_{x=0}^{y} e^{-y}\,dx = e^{-y}\,[x]_0^y = ye^{-y},\quad 0<y<\infty$$

The required conditional density, for 0 < x < y, is

$$f_{X/Y}(x/y) = \frac{f(x,y)}{f_Y(y)} = \frac{e^{-y}}{ye^{-y}} = \frac1y$$

i.e., $f_{X/Y}(x/y) = \frac1y$, 0 < x < y.
Example 9. The pdf of the random variable X is $f_X(x) = 1$, 0 ≤ x ≤ 1. Let the conditional density of Y given X = x be $f_{Y/X}(y/x) = \frac1x$, 0 < y < x. Find the joint pdf $f_{X,Y}(x,y)$ and the marginal distribution of Y.

Solution. $f_{Y/X}(y/x) = \frac1x$, 0 < y < x < 1
= 0, elsewhere.

We know $f_{Y/X}(y/x) = \dfrac{f_{X,Y}(x,y)}{f_X(x)}$

or, $f_{X,Y}(x,y) = f_{Y/X}(y/x)\,f_X(x) = \frac1x\cdot 1$, 0 < y < x < 1.

∴ the joint pdf is $f_{X,Y}(x,y) = \frac1x$, 0 < y < x < 1
= 0, elsewhere.

The marginal pdf of Y is

$$f_Y(y) = \int_{-\infty}^{\infty} f(x,y)\,dx = \int_{x=y}^{1}\frac1x\,dx = \log\frac1y,\quad 0<y<1$$

∴ $f_Y(y) = \log\frac1y$, 0 < y < 1
= 0, elsewhere.
Example 10. If f(x, y) = x + y, 0 < x < 1, 0 < y < 1, is the joint pdf of X and Y, find the distribution of X + Y.

Solution. We put U = X + Y and V = X. Then

$$u = x + y\quad\text{and}\quad v = x \qquad (1)$$

∴ the Jacobian

$$\frac{\partial(u,v)}{\partial(x,y)} = \begin{vmatrix}\frac{\partial u}{\partial x} & \frac{\partial u}{\partial y}\\ \frac{\partial v}{\partial x} & \frac{\partial v}{\partial y}\end{vmatrix} = \begin{vmatrix}1 & 1\\ 1 & 0\end{vmatrix} = -1$$

∴ $\frac{\partial(u,v)}{\partial(x,y)}$ has the same sign throughout.

Then, by a theorem,

$$f_{U,V}(u,v) = f_{X,Y}(x,y)\left|\frac{\partial(x,y)}{\partial(u,v)}\right| = (x+y)\times 1,\quad 0<x<1,\ 0<y<1 \qquad (2)$$

From (1), x = v, y = u − v.

The region R = {0 < x < 1, 0 < y < 1} is converted to the region R′ = {0 < v < 1, 0 < u − v < 1}. This is shown in the u-v plane; within R′, 0 < u < 2.

∴ From (2), $f_{U,V}(u,v) = u$ when (u, v) ∈ R′, i.e. when 0 < v < 1, 0 < u − v < 1.

∴ the pdf of U = X + Y is

$$f_U(u) = \int_{v=0}^{u} f_{U,V}(u,v)\,dv = \int_0^u u\,dv = u^2,\quad\text{when } 0<u<1$$

and

$$f_U(u) = \int_{v=u-1}^{1} u\,dv = u\,[v]_{u-1}^{1} = u(2-u),\quad\text{when } 1<u<2$$

Thus the pdf of U = X + Y is given by
$f_U(u) = u^2$, 0 < u < 1
= u(2 − u), 1 < u < 2.
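As a rough check on Example 10, the derived density can be integrated numerically. The sketch below assumes a plain midpoint rule (the helper `integrate` is an illustrative name): it checks that $f_U$ has total mass 1 and that P(X + Y < 1) computed from $f_U$ agrees with the same probability computed directly from the joint pdf.

```python
# Midpoint-rule checks for Example 10.

def f_u(u):
    """Density of U = X + Y derived in Example 10."""
    if 0 < u < 1:
        return u * u
    if 1 <= u < 2:
        return u * (2 - u)
    return 0.0

def integrate(g, a, b, n=2000):
    """Midpoint rule for a one-dimensional integral."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

total = integrate(f_u, 0, 2)            # total probability
p_from_density = integrate(f_u, 0, 1)   # P(X + Y < 1) from f_U

# The same probability from the joint pdf f(x, y) = x + y,
# integrating y over (0, 1 - x) for each x in (0, 1).
p_from_joint = integrate(
    lambda x: integrate(lambda y: x + y, 0, 1 - x, n=200), 0, 1, n=200
)

print(round(total, 4), round(p_from_density, 4), round(p_from_joint, 4))
```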

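Example 11. If X and Y are independent random variables each having density function $ae^{-ax}$, 0 < x < ∞, where a is a positive constant, find the density function of the quotient $\frac{X}{Y}$.

Solution. Let $U = \frac{X}{Y}$ and V = X.

∴ $u = \frac{x}{y}$ and v = x.

By the problem, the pdfs of X and Y are $f_X(x) = ae^{-ax}$, 0 < x < ∞, and $f_Y(y) = ae^{-ay}$, 0 < y < ∞, respectively.

Now, x = v, $y = \frac{v}{u}$.

Obviously, as 0 < x < ∞ and 0 < y < ∞, we have 0 < v < ∞ and 0 < u < ∞.

$$\text{Now,}\quad \frac{\partial(x,y)}{\partial(u,v)} = \begin{vmatrix}\frac{\partial x}{\partial u} & \frac{\partial x}{\partial v}\\ \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v}\end{vmatrix} = \begin{vmatrix}0 & 1\\ -\frac{v}{u^2} & \frac1u\end{vmatrix} = \frac{v}{u^2}$$

Obviously, $\frac{\partial(x,y)}{\partial(u,v)}$ is always positive.

By the theorem,

$$f_{U,V}(u,v) = f_{X,Y}(x,y)\left|\frac{\partial(x,y)}{\partial(u,v)}\right| = f_X(x)\,f_Y(y)\left|\frac{v}{u^2}\right|\quad(\because X \text{ and } Y \text{ are independent})$$

$$= a^2e^{-a(x+y)}\,\frac{v}{u^2}\quad(\because v>0)$$

$$= a^2e^{-a\left(v+\frac{v}{u}\right)}\,\frac{v}{u^2},\quad 0<u<\infty,\ 0<v<\infty$$

∴ the pdf of $U = \frac{X}{Y}$ is, for 0 < u < ∞,

$$f_U(u) = \int_{-\infty}^{\infty} f_{U,V}(u,v)\,dv = \int_0^\infty a^2e^{-a\left(v+\frac{v}{u}\right)}\frac{v}{u^2}\,dv = \frac{a^2}{u^2}\int_0^\infty v\,e^{-a\left(1+\frac1u\right)v}\,dv$$

$$= \frac{a^2}{u^2}\left\{(0-0) + \frac{1}{a\left(1+\frac1u\right)}\int_0^\infty e^{-a\left(1+\frac1u\right)v}\,dv\right\}\quad\left(\text{integrating by parts;}\ \because \lim_{x\to\infty}\frac{x}{e^x} = 0\right)$$

$$= \frac{a^2}{u^2}\cdot\frac{u}{a(u+1)}\left[\frac{e^{-a\left(1+\frac1u\right)v}}{-a\left(1+\frac1u\right)}\right]_0^\infty = \frac{a}{u(u+1)}\cdot\frac{u\,(0-1)}{-a(u+1)}\quad\left(\because \lim_{x\to\infty}\frac{1}{e^x} = 0\right)$$

$$= \frac{1}{(u+1)^2}$$

∴ the density of $U = \frac{X}{Y}$ is $f_U(u) = \frac{1}{(u+1)^2}$, 0 < u < ∞.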
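The result of Example 11 can be illustrated by simulation. The sketch below is an assumption-laden check, not part of the original solution: the rate `a = 2.0`, the seed and the sample size are arbitrary choices, and the comparison uses the distribution function $\int_0^t \frac{du}{(u+1)^2} = \frac{t}{1+t}$ implied by the derived density.

```python
import random

random.seed(0)          # fixed seed so the run is repeatable
a = 2.0                 # assumed rate; the result should not depend on it
n = 200_000

# Sample U = X / Y with X, Y independent exponential variables of rate a.
ratios = [random.expovariate(a) / random.expovariate(a) for _ in range(n)]

def cdf_u(t):
    """Distribution function implied by f_U(u) = 1/(u+1)^2, i.e. t/(1+t)."""
    return t / (1 + t)

for t in (0.5, 1.0, 3.0):
    empirical = sum(r < t for r in ratios) / n
    print(t, round(empirical, 3), round(cdf_u(t), 3))
```

For each t the empirical proportion of simulated ratios below t should be close to t/(1 + t), whatever positive value of `a` is used.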

Example 12. X, Y are two independent random variables having densities $f_X(x) = \frac{1}{x^2}$, x ≥ 1, and $f_Y(y) = 1$, 0 < y < 1, respectively. Find the pdf of X + Y.

Solution. Let U = X + Y and V = X.
∴ u = x + y, v = x.

$$\therefore\ \frac{\partial(u,v)}{\partial(x,y)} = \begin{vmatrix}1 & 1\\ 1 & 0\end{vmatrix} = -1,$$

which is always negative.

Since x ≥ 1 and 0 < y < 1,

∴ v ≥ 1 and 0 < u − v < 1 (∵ y = u − v).

$$\therefore\ f_{U,V}(u,v) = f_{X,Y}(x,y)\left|\frac{\partial(x,y)}{\partial(u,v)}\right| = f_X(x)\,f_Y(y)\,|-1|\quad(\because X, Y \text{ independent})$$

$$= \frac{1}{x^2}\cdot 1 = \frac{1}{v^2}$$

Thus the joint pdf is $f_{U,V}(u,v) = \frac{1}{v^2}$, v ≥ 1, 0 < u − v < 1.

The pdf of U is $f_U(u) = \int_{-\infty}^{\infty} f_{U,V}(u,v)\,dv$.

Now, the region R {v ≥ 1, 0 < u − v < 1} is shown in the adjacent figure.

When 1 ≤ u ≤ 2, 1 ≤ v ≤ u, and when u > 2, u − 1 ≤ v ≤ u.

$$\text{Thus}\quad f_U(u) = \int_1^u\frac{1}{v^2}\,dv,\ 1\le u\le 2;\qquad f_U(u) = \int_{u-1}^{u}\frac{1}{v^2}\,dv,\ u>2$$

or, $f_U(u) = 1 - \frac1u$, 1 ≤ u ≤ 2
$= \frac{1}{u-1} - \frac1u$, 2 ≤ u < ∞.
Exercise 8

1. The joint distribution of X and Y is given by the pdf
$f(x,y) = \frac12$, (x, y) ∈ R
= 0, elsewhere
where R is the interior of the triangle having vertices (0, 0), (2, 0), (1, 2). Find P(X ≤ 1, Y ≤ 1).

[Hint: Required probability $= \int_{0}^{1/2}\int_{y=0}^{2x}\frac12\,dy\,dx + \int_{1/2}^{1}\int_{y=0}^{1}\frac12\,dy\,dx$]

2. The joint distribution of (X, Y) is given by
f(x, y) = k, (x, y) ∈ R
= 0, elsewhere
where R is the region bounded by $y^2 = x$ and x = 4. Find k and hence find P(X < 3, Y < 0).

3. The pdf of (X, Y) is given by
f(x, y) = k(2x + 5y), 0 ≤ x ≤ 3, 2 ≤ y ≤ 4
= 0, elsewhere.
Find (i) k, (ii) the marginal densities of X and Y, (iii) P(X + Y ≤ 3), (iv) the conditional density of X given Y = y, (v) whether X and Y are independent.

[Hint: (iii) $P(X+Y\le 3) = \int_{0}^{1}\int_{y=2}^{3-x}\frac{1}{108}(2x+5y)\,dy\,dx$]


4. The joint pdf of (X, Y) is
f(x, y) = c(x + y), 0 < x < 1, 0 < y < 1
= 0, elsewhere.
Find (i) c, (ii) $f_X(x)$ and $f_Y(y)$, (iii) $f_{Y/X}(y/x)$ and $f_{X/Y}(x/y)$, (iv) $P\left(|X-Y|\le\frac12\right)$, (v) whether X and Y are independent.

[Hint: (iv) $P\left(|X-Y|\le\frac12\right) = 1 - \left\{\int_{1/2}^{1}\int_{0}^{x-\frac12}(x+y)\,dy\,dx + \int_{0}^{1/2}\int_{x+\frac12}^{1}(x+y)\,dy\,dx\right\}$]

5. $f(x,y) = \sin x\,\sin y$, $0<x<\frac{\pi}{2}$, $0<y<\frac{\pi}{2}$
= 0, elsewhere
is a joint pdf of (X, Y). Find the marginal densities of X and Y. Prove that X and Y are independent.

6. The joint pdf of the bivariate (X, Y) is
f(x, y) = 2, x + y < 1, x > 0, y > 0
= 0, otherwise.
Find the conditional density function of Y given X = x.

7. The joint pdf of (X, Y) is given by
f(x, y) = λ(y + 3x), 1 < x < 3, 0 < y < 2
= 0, otherwise.
Find λ. Find P(X + Y < 2). Find whether X and Y are independent.

8. The joint pdf of the bivariate (X, Y) is
f(x, y) = kx, 0 < y < x < 1
= 0, elsewhere.
Find k. Find the marginal density functions of X and Y. Are X and Y independent?

9. The bivariate distribution is given by
$f(x,y) = \frac{1}{25\pi}$, $x^2 + y^2 < 25$
= 0, elsewhere.
Find the marginal density of X and the conditional density of Y given X = x (|x| < 5). Hence examine whether X and Y are independent.

10. The joint pdf of the bivariate (X, Y) is given by
f(x, y) = 6(1 − x − y), x > 0, y > 0, x + y < 1
= 0, elsewhere.
Find the marginal distributions of X and Y. Are they independent?

11. The joint pdf of (X, Y) is
$f(x,y) = 3(x^2y + y^2x)$, 0 < x < 1, 0 < y < 1
= 0, otherwise.
(i) Find the marginal density functions. (ii) Find $P\left(\frac13 < Y \le \frac23 \,\big/\, \frac12 < X \le \frac34\right)$.

12. Raindrops fall at random on a square R with vertices (1, 0), (0, 1), (−1, 0), (0, −1). If (x, y) is the point struck by a random raindrop and X, Y assume the values x and y respectively, having joint distribution given by the pdf
$f(x,y) = \frac12$ if (x, y) ∈ R
= 0, otherwise,
find the marginal pdfs of X and Y. Are X and Y independent?

[Hint: $f_X(x) = \int_{y=x-1}^{1-x}\frac12\,dy$, 0 < x < 1, and $f_X(x) = \int_{y=-x-1}^{x+1}\frac12\,dy$, −1 < x ≤ 0]

13. The joint pdf of the bivariate (X, Y) is
f(x, y) = kxy(x + y), 0 ≤ x ≤ 1, 0 ≤ y ≤ 1
= 0, elsewhere.
Find (i) k, (ii) the marginal pdfs of X and Y, (iii) $P\left(\frac12\le X\le\frac34,\ \frac13\le Y\le\frac23\right)$, (iv) the conditional density of X given Y = y and the conditional density of Y given X = x.

14. If the distributions of the two independent random variables X and Y are given by $f_X(x) = 1$, 0 ≤ x ≤ 1, and $f_Y(y) = 1$, 0 ≤ y ≤ 1, find the pdf of X + Y.

15. The pdfs of two independent random variables are $f_X(x) = \frac14$, −2 < x < 2, and $f_Y(y) = \frac14$, −2 < y < 2, respectively. Find the distribution of X + Y. [Hint: Put U = X + Y, V = X]

16. The pdfs of two independent random variables are $f_X(x) = e^{-x}$, x > 0, and $f_Y(y) = e^{-y}$, y > 0, respectively. Find the pdf of X + Y.

17. The pdfs of two independent random variables X and Y are given by
$f_X(x) = 1$, 0 ≤ x ≤ 1, and $f_Y(y) = 1$, 0 ≤ y ≤ 1
= 0, elsewhere = 0, elsewhere.
Find the distribution of $\frac{X}{Y}$. [Hint: Put $U = \frac{X}{Y}$, V = X]
Answers

1. $\frac38$

3. (i) $\frac{1}{108}$
(ii) $f_X = \frac{1}{108}(4x+30)$, 0 ≤ x ≤ 3, and $f_Y = \frac{1}{108}(9+15y)$, 2 ≤ y ≤ 4
(iii) $\frac{37}{648}$ (iv) not independent

4. (i) 1
(ii) $x + \frac12$, 0 < x < 1; $y + \frac12$, 0 < y < 1
(iii) $\frac{x+y}{x+\frac12}$, 0 < y < 1, for 0 < x < 1, and $\frac{x+y}{y+\frac12}$, 0 < x < 1, for 0 < y < 1
(iv) $\frac34$ (v) dependent

5. $f_X(x) = \sin x$, $0<x<\frac{\pi}{2}$; $f_Y(y) = \sin y$, $0<y<\frac{\pi}{2}$

6. $\frac{1}{1-x}$, 0 < y < 1 − x

7. $\frac{1}{28}$; $\frac{13}{168}$; not independent

8. 3; $3x^2$ (0 < x < 1), $\frac{3(1-y^2)}{2}$ (0 < y < 1); not independent

9. $\frac{2}{25\pi}\sqrt{25-x^2}$ (−5 < x < 5); $\frac{1}{2\sqrt{25-x^2}}$ $\left(|y| < \sqrt{25-x^2}\right)$; not independent

10. $f_X = 3(1-x)^2$, 0 < x < 1, and $f_Y = 3(1-y)^2$, 0 < y < 1; X, Y are dependent

11. (i) $f_X = \frac{3x^2}{2} + x$, 0 < x < 1, and $f_Y = \frac{3y^2}{2} + y$, 0 < y < 1
(ii) $\frac{311}{1053}$

12. $f_X(x) = 1 + x$, −1 < x ≤ 0; = 1 − x, 0 < x < 1; = 0, otherwise
$f_Y(y) = 1 + y$, −1 < y ≤ 0; = 1 − y, 0 < y < 1; = 0, otherwise
X, Y are dependent

13. (i) 3
(ii) $f_X = \frac32x^2 + x$, 0 ≤ x ≤ 1; $f_Y = \frac32y^2 + y$, 0 ≤ y ≤ 1
(iii) $\frac{311}{3456}$
(iv) $f_{Y/X} = \frac{6y(x+y)}{2+3x}$, 0 ≤ y ≤ 1; $f_{X/Y} = \frac{6x(x+y)}{3y+2}$, 0 ≤ x ≤ 1

14. $f_{X+Y}(u) = u$, 0 < u < 1
= 2 − u, 1 < u < 2

15. If U = X + Y then
$f_U = \frac{u+4}{16}$, −4 < u < 0
$= \frac{4-u}{16}$, 0 ≤ u < 4

16. $f_{X+Y}(u) = ue^{-u}$, 0 < u < ∞, where u = x + y

17. If $U = \frac{X}{Y}$ then
$f_U = \frac12$, 0 < u < 1
$= \frac{1}{2u^2}$, u > 1
MODULE-4

BASIC STATISTICS

9.1. Statistics and its Related Terms

Statistics. An aggregate of facts which are affected by a number of causes, which are expressed numerically to some reasonable extent of accuracy, and which are collected in a systematic manner for a specific purpose is called statistics.
The subject in which we study the characteristics of such facts is also known as Statistics.
Illustration. Suppose we collect the facts and figures of the car accidents that took place in Kolkata during the last 15 years. Then this collection is a statistics of 'car accidents in Kolkata during the last 15 years'.
Variable. A variable is a symbol, e.g. x, y, ..., that can assume any of a prescribed set of values.
If a variable can assume only one value then it is called a constant.
Illustration. (i) Let N = number of members in a family in India. Then N can assume any of the values 1, 2, 3, .... So here N is a variable.
(ii) Let X = number of prime ministers in India. Then we see X assumes only one value, 1. So X is a constant.
(iii) Let H = weight of a person of a city. Then we see H can assume values such as 45 kg, 46.12 kg, 80.0015 kg, etc. So here H is a variable.
Discrete and Continuous Variable. A variable that can theoretically assume any value between two given values is called a continuous variable. A variable which is not continuous is called a discrete variable.
In the above illustration the variables cited in (i) and (ii) are discrete whereas the variable in (iii) is continuous.
Data / Observations. The values assumed by a variable are known as data or observations. Sometimes, in statistics, these are referred to as statistical data or statistical observations.
Illustration. Let Y = marks obtained in Mathematics by the students of a class. Then Y is a variable which can assume the data 5, 90, 0, 81, etc. The data can be presented as
Y : 5 0 81 90 70 65

Remark: In fact variable, data, etc. can be defined in different ways. In this text we keep these definitions with the intended readers in mind.
9.2. Frequency Distribution.
Frequency. The number of occurrences of an observation (datum) of a variable is called the frequency of that observation.
Illustration. Let x be the marks obtained by 30 students. Let x assume the values
30 35 31 32 34 31
30 34 42 30 57 68
42 71 20 15 10 51
57 51 51 52 51 80
51 57 20 71 35 32
Here we see the value 30 occurs three times. So the frequency of 30 is 3. Similarly the frequencies of the values 80, 71, 68 are 1, 2, 1 respectively.
Simple Frequency Distribution. The simple frequency distribution of a variable is the statistical table where the observations (assumed by the variable) are arranged in order of magnitude and the frequency of each observation is shown side by side.
Illustration. Let x be a variable which takes the values:
3 4 5 3 6 4
4 3 2 5 6 1
3 4 5 3 2 1
Then the frequency distribution of x is
x : 1 2 3 4 5 6 Total
$f_i$ : 2 2 5 4 3 2 18
The table can also be shown column-wise.
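The tally above can be reproduced mechanically. The following is a minimal sketch using Python's standard `collections.Counter`; the variable names are illustrative.

```python
from collections import Counter

# The 18 observations of x from the illustration above.
values = [3, 4, 5, 3, 6, 4,
          4, 3, 2, 5, 6, 1,
          3, 4, 5, 3, 2, 1]

# Counter tallies how often each value occurs; sorting puts the
# observations in order of magnitude, as the definition requires.
frequency = dict(sorted(Counter(values).items()))

for x, f in frequency.items():
    print(x, f)
print("Total", sum(frequency.values()))
```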
Grouped Frequency Distribution.
When a large number of data are available we cannot grasp their characteristics only by placing them individually in a table. In these cases we group the observations into a number of suitable intervals. In a table (or statistical table) these intervals are shown and the frequency of the observations included in each interval is shown side by side. This table is called a grouped frequency distribution.

Illustration. The data below give the marks secured by 70 candidates in a certain examination:
21 31 35 52 64 74 89 53 42 7
22 35 43 67 76 35 46 26 32 40
72 43 38 41 63 71 28 32 45 54
15 18 52 73 86 50 39 55 47 12
44 58 67 85 39 40 50 65 72 69
57 63 5 56 79 37 24 54 82 49
51 54 68 29 34 44 58 62 59 65
Here we see there is a large number of observations, which are almost all distinct. We group the data into the intervals 0-10, 11-21, 22-32, .... We see the values 7 and 5 are included in the interval 0-10. So the frequency (called the class frequency) of the class interval 0-10 is 2. In this way we have the following grouped frequency distribution:
Marks secured : 0-10 11-21 22-32 33-43 44-54 55-65 66-76 77-87 88-98
Frequency : 2 4 8 14 15 12 10 4 1
Note. Frequency distribution is nothing but quantitative classification.
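The grouping of the 70 marks into the classes 0-10, 11-21, ... can be sketched as follows. The trick of using integer division by the class width is an implementation choice made here for illustration; it works because every class covers exactly 11 consecutive integer marks starting from 0.

```python
from collections import Counter

marks = [21, 31, 35, 52, 64, 74, 89, 53, 42, 7,
         22, 35, 43, 67, 76, 35, 46, 26, 32, 40,
         72, 43, 38, 41, 63, 71, 28, 32, 45, 54,
         15, 18, 52, 73, 86, 50, 39, 55, 47, 12,
         44, 58, 67, 85, 39, 40, 50, 65, 72, 69,
         57, 63, 5, 56, 79, 37, 24, 54, 82, 49,
         51, 54, 68, 29, 34, 44, 58, 62, 59, 65]

WIDTH = 11  # each class 0-10, 11-21, ... covers 11 integer marks

# Integer division maps a mark to the index of its class interval.
class_counts = Counter(m // WIDTH for m in marks)

grouped = {}
for k in sorted(class_counts):
    lower, upper = k * WIDTH, k * WIDTH + WIDTH - 1
    grouped[f"{lower}-{upper}"] = class_counts[k]

for interval, freq in grouped.items():
    print(interval, freq)
```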
Terms associated with Grouped Frequency Distribution.
(1) Class interval: The intervals into which the data are grouped are called class intervals. In the previous example 0-10, 11-21, etc. are class intervals.
(2) Class limits: The two extreme values specifying a class interval are called class limits. In the previous example the lower class limit (LCL) and the upper class limit (UCL) of the class 22-32 are respectively 22 and 32.
(3) Class Boundaries: The class boundaries of a class are defined as

Lower Class Boundary (LCB) = LCL of the class $-\ \frac{d}{2}$, where d = LCL of the class − UCL of the previous class;

Upper Class Boundary (UCB) = UCL of the class $+\ \frac{d}{2}$, where d = LCL of the next class − UCL of the class.

In the previous example the LCB of the class 22-32 is $22 - \frac{22-21}{2} = 21.5$; the UCB of the class 22-32 is $32 + \frac{33-32}{2} = 32.5$.

(4) Class Mark or Mid Value: Class mark of a class $= \frac12(\text{LCL} + \text{UCL})$ of the class.
In the previous example, the class mark of the class 66-76 is $\frac12(66+76) = 71$.
(5) Width of a class: Width of a class = (UCB − LCB) of the class. In the previous example, the width of the class 22-32 is 32.5 − 21.5 = 11.
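The definitions (3) to (5) can be collected in one small helper. This is a sketch only: the function name `describe` is hypothetical, and it assumes the gap d between consecutive classes is the same throughout, so that the same d can be used for the first and the last class.

```python
# Class limits of the grouped distribution above, as (LCL, UCL) pairs.
limits = [(0, 10), (11, 21), (22, 32), (33, 43), (44, 54),
          (55, 65), (66, 76), (77, 87), (88, 98)]

def describe(k):
    """Return (LCB, UCB, class mark, width) of the k-th class (k counted from 0)."""
    lcl, ucl = limits[k]
    d = limits[1][0] - limits[0][1]      # LCL of a class minus UCL of the previous one
    lcb, ucb = lcl - d / 2, ucl + d / 2
    return lcb, ucb, (lcl + ucl) / 2, ucb - lcb

print(describe(2))   # class 22-32
print(describe(6))   # class 66-76
```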
Cumulative Frequency.
For a simple frequency distribution the total frequency of the observations less than or equal to an observation is called the "less than (≤) type" cumulative frequency of the observation; the total frequency of the observations greater than or equal to it is called its "greater than (≥) type" cumulative frequency.
For a grouped frequency distribution the total frequency of the observations less than or equal to the upper boundary of a class is called the "less than (≤) type" cumulative frequency of the class; the total frequency of the observations greater than or equal to its lower boundary is called its "greater than (≥) type" cumulative frequency.
Illustration.
(i) In the simple frequency distribution
x : 2 4 9 11
f : 3 6 4 1
the "less than (≤) type" cumulative frequency of 9 is 3 + 6 + 4 = 13; the "greater than (≥) type" cumulative frequency of 9 is 4 + 1 = 5.
(ii) In the grouped frequency distribution
Class : 0-4 5-9 10-14 15-19
Frequency : 3 6 4 1
the "less than (≤) type" cumulative frequency of the class 10-14, or against the upper boundary 14.5 of this class, is 3 + 6 + 4 = 13; the "greater than (≥) type" cumulative frequency of the class 10-14, or against the LCB 9.5 of this class, is 4 + 1 = 5.
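Both types of cumulative frequency are running totals, so they can be sketched with `itertools.accumulate` from the standard library, here for the simple distribution of illustration (i).

```python
from itertools import accumulate

x = [2, 4, 9, 11]     # observations in increasing order
f = [3, 6, 4, 1]      # their frequencies

# Running totals from the left give the "less than (≤) type" column;
# running totals from the right give the "greater than (≥) type" column.
less_than = list(accumulate(f))
greater_than = list(accumulate(reversed(f)))[::-1]

for row in zip(x, f, less_than, greater_than):
    print(row)
```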
Measure of Central tendency
9.3. Mean.
A typical value, which may or may not be among the data assumed by a variable, is considered as a representative of all the data. For example, among the data 2, 5, 6.1, 8, 9, 7 the value 7 can be treated as that representative. Generally this representative value tends to lie centrally within the set of data. This value is measured in different ways. The following is one of the best such measurements.
Arithmetic Mean. The arithmetic mean (A.M.), or briefly the mean, of the values (data) $x_1, x_2, \ldots, x_n$ is defined as

$$\bar{x} = \frac1n(x_1 + x_2 + \cdots + x_n) = \frac1n\sum_{i=1}^{n} x_i$$

If the data have the frequencies shown in the following table
Variable (x) : $x_1\ \ x_2\ \ x_3\ \ \ldots\ \ x_n$
Frequency ($f_i$) : $f_1\ \ f_2\ \ f_3\ \ \ldots\ \ f_n$
then their A.M. is

$$\bar{x} = \frac1N(x_1f_1 + x_2f_2 + \cdots + x_nf_n) = \frac1N\sum_{i=1}^{n} x_if_i,\quad\text{where } N = f_1 + f_2 + \cdots + f_n = \sum_{i=1}^{n} f_i$$

For a grouped frequency distribution the A.M. is $\bar{x} = \frac1N\sum_{i=1}^{n} x_if_i$, where $x_i$ is the mid-value of each class interval.


Illustration. If
x : 2 4 1 3 Total
$f_i$ : 3 2 4 1 10
is the frequency distribution of a variable x, then its A.M. is

$$\bar{x} = \frac{1}{10}(2\times3 + 4\times2 + 1\times4 + 3\times1) = \frac{1}{10}(6+8+4+3) = \frac{1}{10}\times21 = 2.1$$
Theorem 1. If the two variables x and y are related by the equation $y = \frac{x-c}{d}$, then $\bar{y} = \frac{\bar{x}-c}{d}$, where c and d (d ≠ 0) are any numbers.
Proof. We consider the frequency distribution of x:

x : $x_1\ \ \ldots\ \ x_n$ Total
$f_i$ : $f_1\ \ \ldots\ \ f_n$ N

Since the values of x are changed to those of y, the frequency distribution of y would be

y : $y_1\ \ \ldots\ \ y_n$ Total
$f_i$ : $f_1\ \ \ldots\ \ f_n$ N

where $y_i = \frac{x_i-c}{d}$.

Now the A.M. of y is

$$\bar{y} = \frac1N\sum_{i=1}^{n} f_iy_i = \frac1N\sum_{i=1}^{n} f_i\,\frac{x_i-c}{d} = \frac{1}{Nd}\sum_{i=1}^{n}(f_ix_i - f_ic) = \frac{1}{Nd}\left(\sum_{i=1}^{n} f_ix_i - c\sum_{i=1}^{n} f_i\right)$$

$$= \frac{1}{Nd}\left(\sum_{i=1}^{n} f_ix_i - cN\right) = \frac1d\cdot\frac1N\sum_{i=1}^{n} f_ix_i - \frac{c}{d} = \frac1d\,\bar{x} - \frac{c}{d} = \frac{\bar{x}-c}{d}.$$
Note. The above theorem is very helpful in determining the A.M. of a variable assuming large values. This is shown in the following illustration.
Illustration. We are given the following grouped frequency distribution:
Weight in gms (x) : 110-119 120-129 130-139 140-149 150-159 160-169 170-179 180-189
Frequency : 5 7 12 20 16 10 7 3
To find the A.M. we construct the following table:

Calculation of Mean
Class interval   Midpoint (x_i)   Frequency (f_i)   x_i − 154.5   y_i = (x_i − 154.5)/10   f_i y_i
110-119          114.5            5                 −40           −4                       −20
120-129          124.5            7                 −30           −3                       −21
130-139          134.5            12                −20           −2                       −24
140-149          144.5            20                −10           −1                       −20
150-159          154.5            16                0             0                        0
160-169          164.5            10                10            1                        10
170-179          174.5            7                 20            2                        14
180-189          184.5            3                 30            3                        9
Total            -                80                -             -                        −52

Here N = 80 and $\sum_{i=1}^{8} f_iy_i = -52$. So the A.M. of y is $\bar{y} = \frac{1}{80}\times(-52) = -0.65$.

Since $y_i = \frac{x_i - 154.5}{10}$, so $\bar{y} = \frac{\bar{x} - 154.5}{10}$

or, $-0.65 = \frac{\bar{x} - 154.5}{10}$ or, $\bar{x} = 148$.
Thus the required A.M. is $\bar{x} = 148$.
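The calculation above can be repeated in a few lines, comparing the direct definition of the A.M. with the shortcut of Theorem 1. This is a sketch; the variable names are illustrative.

```python
# Grouped data from the illustration: class mid-points and frequencies.
mid = [114.5, 124.5, 134.5, 144.5, 154.5, 164.5, 174.5, 184.5]
freq = [5, 7, 12, 20, 16, 10, 7, 3]
N = sum(freq)

# Direct definition: mean = (1/N) * sum of x_i * f_i.
mean_direct = sum(x * f for x, f in zip(mid, freq)) / N

# Shortcut of Theorem 1 with y = (x - c) / d.
c, d = 154.5, 10
y = [(x - c) / d for x in mid]
mean_y = sum(yi * f for yi, f in zip(y, freq)) / N
mean_shortcut = c + d * mean_y

print(N, mean_y, mean_direct, mean_shortcut)
```

Both routes give the same mean; the shortcut only keeps the intermediate numbers small, which matters for hand calculation.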
Theorem 2. (On Mean of Composite Group)
Let $\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_r$ be the A.M.s of r groups containing $n_1, n_2, \ldots, n_r$ observations respectively. Then their combined mean or composite mean is given by

$$\bar{x} = \frac{n_1\bar{x}_1 + n_2\bar{x}_2 + \cdots + n_r\bar{x}_r}{n_1 + n_2 + \cdots + n_r}$$

Proof. Omitted.

Illustration. Suppose the mean wage of 60 labourers working in the morning shift is Rs 80 and the mean wage of 40 labourers working in the evening shift is Rs 70. Then the mean wage of all the labourers (of both shifts)

$$= \frac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2} = \frac{60\times80 + 40\times70}{60+40} = 76$$
Note. There are other types of mean, viz. the Geometric mean and the Harmonic mean, but their study is beyond the scope of this book.
9.4. Median.
The median of a set of observations is the middle-most value when the observations are arranged in increasing or decreasing order of magnitude.
Thus to find the median of a set of observations it is necessary to arrange the observations in order of magnitude.
Calculation of Median.
The calculation of the median differs according as the number of observations is even or odd, and again for a grouped frequency distribution. So we classify the procedure of calculation of the median into the following three cases:

Case 1. (For a simple distribution, i.e. without any frequency)
Arrange the given n observations in ascending / descending order of magnitude.
(i) If n is odd then
Median = $\frac{n+1}{2}$ th observation of the arranged set.
(ii) If n is even then
Median = $\frac12\left\{\frac{n}{2}\text{ th observation} + \left(\frac{n}{2}+1\right)\text{ th observation}\right\}$.
Illustration. (i) If we are required to find the median of the set {5, 8, 7, 20, 13, 3, 11} then we first arrange the data in increasing order, i.e. 3, 5, 7, 8, 11, 13, 20. Here the number of observations is n = 7 (odd).
So its median = $\frac{7+1}{2}$ th = 4th observation = 8.
(ii) If we are required to find the median of the set {5, 4, 7, 3, 21, 12}, we first arrange the data in ascending order of magnitude: {3, 4, 5, 7, 12, 21}. Here the number of observations is n = 6 (even).
So its median = $\frac12\left\{\frac62\text{ th observation} + \left(\frac62+1\right)\text{ th observation}\right\}$
= $\frac12$ {3rd observation + 4th observation} = $\frac12(5+7) = 6$.
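Case 1 translates directly into a short function. This is a minimal sketch (the function name `median` is chosen here for illustration); note that Python lists are indexed from 0, so the k-th observation of the text is `data[k - 1]`.

```python
def median(observations):
    """Median by Case 1: sort, then take the middle value (n odd)
    or the mean of the two middle values (n even)."""
    data = sorted(observations)
    n = len(data)
    if n % 2 == 1:
        return data[(n + 1) // 2 - 1]          # (n+1)/2-th observation, 1-based
    return (data[n // 2 - 1] + data[n // 2]) / 2

print(median([5, 8, 7, 20, 13, 3, 11]))
print(median([5, 4, 7, 3, 21, 12]))
```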
Case 2. (For Simple Frequency Distribution)
Arrange the observations in ascending order of magnitude. Construct the "less than (≤) type" cumulative frequencies. Calculate $\frac{N+1}{2}$, where N is the total frequency. Then Median = the observation corresponding to the cumulative frequency $\frac{N+1}{2}$, or to the next higher cumulative frequency (if $\frac{N+1}{2}$ is not itself a cumulative frequency).
Illustration.
Consider the following frequency distribution
Marks : 30 40 50 60 70 80
No. of students : 8 15 23 16 8 5
The "less than (≤) type" cumulative frequencies:

Marks (x)   Frequency (f_i)   Cumulative Frequency (less than (≤) type)
(1)         (2)               (3)
30          8                 8
40          15                23
50          23                46
60          16                62
70          8                 70
80          5                 75
Total       N = 75            --

Here $\frac{N+1}{2} = \frac{75+1}{2} = 38$.
We see there is no cumulative frequency 38 in column (3). The next higher figure than 38 in column (3) is 46.
∴ the required median is the observation corresponding to the cumulative frequency 46, i.e. 50.

Note: $\frac{N+1}{2}$ may be a fraction. The procedure is the same in that case also.
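The procedure of Case 2 can be sketched as a search through the cumulative frequencies for the first one that reaches (N + 1)/2; the names used are illustrative.

```python
from itertools import accumulate

marks = [30, 40, 50, 60, 70, 80]       # already in ascending order
students = [8, 15, 23, 16, 8, 5]

cumulative = list(accumulate(students))     # "less than (≤) type" column
target = (cumulative[-1] + 1) / 2           # (N + 1) / 2

# First observation whose cumulative frequency reaches the target.
median_mark = next(x for x, cf in zip(marks, cumulative) if cf >= target)
print(cumulative, target, median_mark)
```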
Case 3. (For Grouped Frequency Distribution)
Here also construct the "less than (≤) type" cumulative frequencies against the class boundaries. Calculate $\frac{N}{2}$, where N is the total frequency. Find the median class, i.e. the class corresponding to the cumulative frequency $\frac{N}{2}$ or to the next higher cumulative frequency (if $\frac{N}{2}$ is not itself a cumulative frequency).

$$\text{Then}\quad \text{Median} = l_m + \frac{\frac{N}{2} - F}{f_m}\times i$$

where $l_m$ = lower boundary of the median class,
N = total frequency,
F = cumulative frequency of the class preceding the median class,
$f_m$ = frequency of the median class,
i = width of the median class.


Illustration.
Consider the following grouped frequency distribution :
Class-interval Frequency
130-134 5
135-139 15
140-144 28
145-149 24
150-154 17
155-159 10
160-164 1
The "less than type" cumulative frequencies against the class boundaries:

Class interval   Frequency   Upper Class Boundary   Cumulative Frequency (≤ type)
(1)              (2)         (3)                    (4)
130-134          5           134.5                  5
135-139          15          139.5                  20*
140-144          28          144.5                  48
145-149          24          149.5                  72
150-154          17          154.5                  89
155-159          10          159.5                  99
160-164          1           164.5                  100
Total            N = 100     --                     --

* It means there are 20 observations which are less than or equal to 139.5.

Now, $\frac{N}{2} = 50$. There is no cumulative frequency 50 in the 4th column of the above table. In the 4th column the next higher figure is 72. This corresponds to the median class 145-149.

Therefore, $l_m$ = lower boundary of the median class = 144.5,
F = the cumulative frequency of the class preceding the median class = 48,
$f_m$ = frequency of the median class = 24,
i = width of the median class = 5.

So the Median $= l_m + \frac{\frac{N}{2} - F}{f_m}\times i = 144.5 + \frac{50-48}{24}\times 5 = 144.92$.
Note. There is another formula to find the median, called the "interpolation formula".
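The formula of Case 3 can be sketched as a function. The name `grouped_median` and the parameter `gap` (the distance between the upper limit of one class and the lower limit of the next, here 1) are assumptions made for this illustration.

```python
from itertools import accumulate

# Classes as (lower limit, upper limit, frequency); gaps of 1 between classes.
classes = [(130, 134, 5), (135, 139, 15), (140, 144, 28), (145, 149, 24),
           (150, 154, 17), (155, 159, 10), (160, 164, 1)]

def grouped_median(classes, gap=1):
    """Median = l_m + (N/2 - F) / f_m * i, as in Case 3."""
    freqs = [f for _, _, f in classes]
    cumulative = list(accumulate(freqs))
    half = cumulative[-1] / 2
    # Index of the median class: first cumulative frequency >= N/2.
    m = next(k for k, cf in enumerate(cumulative) if cf >= half)
    lower, upper, f_m = classes[m]
    l_m = lower - gap / 2                     # lower class boundary
    width = (upper + gap / 2) - l_m           # UCB - LCB
    F = cumulative[m - 1] if m > 0 else 0     # cumulative frequency before the class
    return l_m + (half - F) / f_m * width

print(round(grouped_median(classes), 2))
```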
9.5. Mode.
The observation having the maximum frequency is called the mode.
Calculation of Mode.
Case 1. For a simple frequency distribution the mode is found simply by inspection.
Illustration. For the frequency distribution
x : 10 20 30 40 50 70
Frequency : 2 2 3 2 2 1
we see the observation 30 has the highest frequency, 3. So the mode is 30.
Case 2. (For Grouped Frequency Distribution)
For a grouped frequency distribution the mode can be calculated when there is a unique class with the highest frequency and all the classes have equal width. First find the modal class, i.e. the class having the highest frequency. Then

$$\text{Mode} = l_m + \frac{f_1 - f_0}{2f_1 - f_0 - f_2}\times i$$

where $l_m$ = lower class boundary of the modal class,
$f_1$ = frequency of the modal class,
$f_0$ = frequency of the class preceding the modal class,
$f_2$ = frequency of the class succeeding the modal class,
i = width of each class (note that this is the same for each class).

Illustration.
Consider the grouped frequency distribution
Marks     No. of Candidates
0-9 4
10-19 9
20-29 12
30-39 18
40-49 20
50-59 12
60-69 10
70-79 9
80-89 4
90-99 2
Here we see every class has the same width, 10, and only the class 40-49 has the highest frequency. So the modal class is 40-49. As we know,

$$\text{Mode} = l_m + \frac{f_1 - f_0}{2f_1 - f_0 - f_2}\times i$$

where $l_m$ = lower class boundary of the class 40-49 = 39.5,
$f_1$ = frequency of the class 40-49 = 20,
$f_0$ = frequency of the class preceding the class 40-49 = 18,
$f_2$ = frequency of the class succeeding the class 40-49 = 12,
i = width of each class = 10.

$$\therefore\ \text{Mode} = 39.5 + \frac{20-18}{40-18-12}\times 10 = 41.5$$
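The modal-class formula can be sketched in the same way. The function name `grouped_mode` and the convention of taking a missing neighbouring frequency as 0 (when the modal class is the first or the last class) are assumptions made here, not statements of the text.

```python
# Classes as (lower limit, upper limit, frequency), all of equal width.
classes = [(0, 9, 4), (10, 19, 9), (20, 29, 12), (30, 39, 18), (40, 49, 20),
           (50, 59, 12), (60, 69, 10), (70, 79, 9), (80, 89, 4), (90, 99, 2)]

def grouped_mode(classes, gap=1):
    """Mode = l_m + (f1 - f0) / (2*f1 - f0 - f2) * i for a unique modal class."""
    freqs = [f for _, _, f in classes]
    m = freqs.index(max(freqs))                       # modal class index
    f1 = freqs[m]
    f0 = freqs[m - 1] if m > 0 else 0                 # class before (0 if none)
    f2 = freqs[m + 1] if m + 1 < len(freqs) else 0    # class after (0 if none)
    lower, upper, _ = classes[m]
    l_m = lower - gap / 2                             # lower class boundary
    width = upper - lower + gap                       # UCB - LCB
    return l_m + (f1 - f0) / (2 * f1 - f0 - f2) * width

print(grouped_mode(classes))
```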
Relation among Mean, Median and Mode: For a moderately asymmetrical distribution having a single mode, the following empirical relation holds approximately: Mean − Mode ≈ 3(Mean − Median).
Note. For a symmetrical distribution the mean, median and mode coincide.
Measure of Dispersion
9.6. Variance and Standard Deviation.
As we have stated in the previous articles, the A.M. represents the entire set of data. But the degree to which the data tend to spread about the A.M. also has to be measured. It is usually measured by the variance or the standard deviation, which are discussed below:


Variance.
The mean of the squares of the differences of the observations (or data) assumed by a variable from their arithmetic mean (A.M.) is called the variance of the variable.
Standard Deviation.
The positive square root of the variance is called the standard deviation (s.d.).
Thus (i) if $x_1, x_2, \ldots, x_n$ are the data then their variance is

$$\mathrm{Var}(x) = \frac1n\left\{(x_1-\bar{x})^2 + (x_2-\bar{x})^2 + \cdots + (x_n-\bar{x})^2\right\} = \frac1n\sum_{i=1}^{n}(x_i-\bar{x})^2$$

and the standard deviation is $\sigma_x = +\sqrt{\frac1n\sum_{i=1}^{n}(x_i-\bar{x})^2}$;

(ii) if the data have the frequencies shown in the following table
Variable (x) : $x_1\ \ x_2\ \ \ldots\ \ x_n$
Frequency ($f_i$) : $f_1\ \ f_2\ \ \ldots\ \ f_n$
then the variance is

$$\mathrm{Var}(x) = \frac1N\left\{f_1(x_1-\bar{x})^2 + f_2(x_2-\bar{x})^2 + \cdots + f_n(x_n-\bar{x})^2\right\} = \frac1N\sum_{i=1}^{n} f_i(x_i-\bar{x})^2,\quad\text{where } N = f_1 + f_2 + \cdots + f_n,$$

and the standard deviation is $\sigma_x = +\sqrt{\frac1N\sum_{i=1}^{n} f_i(x_i-\bar{x})^2}$.
If the data have a grouped frequency distribution then $x_i$ will be the mid-value of each class interval and $f_i$ the corresponding frequency.
Illustration: Let x be a variable which assumes the data 20, 85, 120, 60, 40. Then $\bar{x} = \frac15(20+85+120+60+40) = 65$. To find the variance and standard deviation we go through the following table:

x_i     x_i − 65   (x_i − 65)^2
20      −45        2025
85      20         400
120     55         3025
60      −5         25
40      −25        625
Total              6100

Here $\sum_{i=1}^{5}(x_i-65)^2 = 6100$.

Then $\mathrm{Var}(x) = \frac15\sum_{i=1}^{5}(x_i-65)^2 = \frac15\times6100 = 1220$

and the s.d. is $\sigma_x = \sqrt{1220} = 34.93$.

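Theorem 2. If $x_1, x_2, \ldots, x_n$ are the data then

$$\mathrm{Var}(x) = \frac1n\sum_{i=1}^{n} x_i^2 - \left(\frac1n\sum_{i=1}^{n} x_i\right)^2$$

Proof. Left as an exercise.

Note. The above theorem can be extended to a frequency distribution also. There

$$\mathrm{Var}(x) = \frac1N\sum_{i=1}^{n} f_ix_i^2 - \left(\frac1N\sum_{i=1}^{n} f_ix_i\right)^2,\quad\text{where } N = f_1 + f_2 + \cdots + f_n.$$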
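The definition of the variance and the alternative form of Theorem 2 can be compared on the data 20, 85, 120, 60, 40 of the earlier illustration. This is a minimal sketch with illustrative names.

```python
import math

data = [20, 85, 120, 60, 40]
n = len(data)
mean = sum(data) / n

# Definition: mean of the squared deviations from the A.M.
variance = sum((x - mean) ** 2 for x in data) / n

# Theorem 2: mean of the squares minus the square of the mean.
variance_alt = sum(x * x for x in data) / n - mean ** 2

sd = math.sqrt(variance)
print(mean, variance, variance_alt, round(sd, 2))
```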
Theorem 3. If the two variables x and y are related by the equation $y = \frac{x-c}{d}$ then $\sigma_y = \frac{\sigma_x}{d}$, where d is a positive number.
Proof. Beyond the scope of this text.
Note. The above theorem is very helpful in determining the s.d. of a variable assuming large values. This is shown in the following illustration. In practice, we use the result of Theorem 2 to find the variance / s.d.
Illustration. (i) Suppose we are given the following observations with their frequency distribution:
x : 240.12 240.13 240.15 240.16 240.17 240.21 240.22
Frequency : 2 2 1 1 2 1 1
To find the s.d. we go through the following table:

x_i      f_i   x_i − 240.16   y_i = (x_i − 240.16)/0.01   f_i y_i   f_i y_i^2
240.12   2     −0.04          −4                           −8        32
240.13   2     −0.03          −3                           −6        18
240.15   1     −0.01          −1                           −1        1
240.16   1     0              0                            0         0
240.17   2     0.01           1                            2         2
240.21   1     0.05           5                            5         25
240.22   1     0.06           6                            6         36
Total    10    -              -                            −2        114

Here N = 10, $\sum_{i=1}^{7} f_iy_i = -2$ and $\sum_{i=1}^{7} f_iy_i^2 = 114$.

Now, $\mathrm{Var}(y) = \frac{\sum f_iy_i^2}{N} - \left(\frac{\sum f_iy_i}{N}\right)^2 = \frac{114}{10} - \left(\frac{-2}{10}\right)^2 = 11.36$

∴ $\sigma_y = \sqrt{11.36} = 3.37$.
Since $y_i = \frac{x_i - 240.16}{0.01}$, so $\sigma_y = \frac{\sigma_x}{0.01}$
or, $3.37 = \frac{\sigma_x}{0.01}$ ∴ $\sigma_x = 0.0337$.
(ii) Consider the following grouped frequency distribution:
Value : 90-99 80-89 70-79 60-69 50-59 40-49 30-39
Frequency : 2 12 22 20 14 4 1
To find the variance and standard deviation of this grouped frequency distribution we construct the following table:

Class interval   Mid point (x)   Frequency (f)   x − 64.5   y = (x − 64.5)/10   fy    fy^2
90-99            94.5            2               30         3                   6     18
80-89            84.5            12              20         2                   24    48
70-79            74.5            22              10         1                   22    22
60-69            64.5            20              0          0                   0     0
50-59            54.5            14              −10        −1                  −14   14
40-49            44.5            4               −20        −2                  −8    16
30-39            34.5            1               −30        −3                  −3    9
Total            --              75              --         --                  27    127

Here N = 75, $\sum f_iy_i = 27$, $\sum f_iy_i^2 = 127$.

Now, $\mathrm{Var}(y) = \frac1N\sum f_iy_i^2 - \left(\frac1N\sum f_iy_i\right)^2 = \frac{1}{75}\times127 - \left(\frac{27}{75}\right)^2 = 1.56$.

∴ the standard deviation of y is $\sigma_y = \sqrt{1.56} = 1.25$.

Now since $y = \frac{x - 64.5}{10}$, therefore $\sigma_y = \frac{\sigma_x}{10}$
or, $\sigma_x = 1.25\times10 = 12.5$.
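The change of origin and scale used in these illustrations can be checked against a direct computation. The sketch below uses the data of illustration (i); rounding the transformed values to integers is a choice made here to remove floating-point noise, and the variable names are illustrative.

```python
import math

x = [240.12, 240.13, 240.15, 240.16, 240.17, 240.21, 240.22]
f = [2, 2, 1, 1, 2, 1, 1]
N = sum(f)

c, d = 240.16, 0.01                       # origin and scale used in the table
y = [round((xi - c) / d) for xi in x]     # rounding removes floating-point noise

mean_y = sum(fi * yi for fi, yi in zip(f, y)) / N
var_y = sum(fi * yi * yi for fi, yi in zip(f, y)) / N - mean_y ** 2
sd_x_shortcut = d * math.sqrt(var_y)      # Theorem 3: sigma_x = d * sigma_y

# Direct computation from the definition, for comparison.
mean_x = sum(fi * xi for fi, xi in zip(f, x)) / N
sd_x_direct = math.sqrt(sum(fi * (xi - mean_x) ** 2 for fi, xi in zip(f, x)) / N)

print(y, round(var_y, 2), round(sd_x_shortcut, 4), round(sd_x_direct, 4))
```

The two standard deviations agree, which is exactly what Theorem 3 asserts for d > 0.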


9.7. Significance of measure of central tendency and standard deviation.

The measure of central tendency represents the set of all observations / data of a variable, and the standard deviation measures the consistency of the data. In fact, to compare the consistency of the data of two variables we find the Coefficient of Variation (C.V.) of each. The variable having the greater C.V. is the less consistent.

Coefficient of variation of a variable $= \frac{\text{Standard deviation}}{\text{Mean}}\times100$

For example: Let x = runs scored by batsman A in 20 innings and y = runs scored by batsman B in 35 innings. Then x will have 20 data and y will have 35 data.
From these data let the A.M. be $\bar{x} = 90$ and the s.d. $\sigma_x = 12$, whereas the A.M. $\bar{y} = 80$ and the s.d. $\sigma_y = 2$.
Then since $\bar{x} > \bar{y}$, the average of the runs scored by batsman A is 90, which is greater than that of B. It seems the performance of batsman A is better than that of B. But the C.V. of $x = \frac{\sigma_x}{\bar{x}}\times100 = \frac{12}{90}\times100 = 13\frac13$ and the C.V. of $y = \frac{\sigma_y}{\bar{y}}\times100 = \frac{2}{80}\times100 = 2\frac12$, which is less than that of A. We may conclude that batsman B is more consistent, i.e. more reliable, than A.
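The comparison of the two batsmen amounts to two divisions. A minimal sketch, with a hypothetical helper name:

```python
def coefficient_of_variation(sd, mean):
    """C.V. = (standard deviation / mean) * 100."""
    return sd / mean * 100

cv_a = coefficient_of_variation(12, 90)   # batsman A
cv_b = coefficient_of_variation(2, 80)    # batsman B

# The smaller C.V. indicates the more consistent batsman.
more_consistent = "A" if cv_a < cv_b else "B"
print(round(cv_a, 2), round(cv_b, 2), more_consistent)
```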
9.8. Moments. The r-th moment of the values (data) $x_1, x_2, \ldots, x_n$ about a number A is defined as
$\frac{1}{n}\{(x_1 - A)^r + (x_2 - A)^r + \cdots + (x_n - A)^r\} = \frac{1}{n}\sum_{i=1}^{n}(x_i - A)^r$
If the data have the frequencies shown in the following table
data (x)          :  $x_1$  $x_2$  $x_3$  ...  $x_n$
frequency ($f_i$) :  $f_1$  $f_2$  $f_3$  ...  $f_n$
then their r-th moment about A is
$\frac{1}{N}\{(x_1 - A)^r f_1 + (x_2 - A)^r f_2 + \cdots + (x_n - A)^r f_n\} = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - A)^r$
where $N = f_1 + f_2 + \cdots + f_n = \sum_{i=1}^{n} f_i$.
For a grouped frequency distribution, $x_1, x_2, \ldots$ will be taken as the mid-values of the class intervals.
Example. Find the second moment about 10 for the following grouped frequency distribution.
Annual Sales (Rs. '000) :  0-20  20-50  50-100  100-250  250-500  Total
No. of Firms            :  20    50     69      30       25       194
The mid-values of the class intervals are
10  35  75  175  375
$\therefore$ the second moment of 'Annual Sales of Firms' about 10
$= \frac{1}{194}\{(10-10)^2 \times 20 + (35-10)^2 \times 50 + (75-10)^2 \times 69 + (175-10)^2 \times 30 + (375-10)^2 \times 25\}$
$= \frac{4470150}{194} = 23042.01$

9.9. Central moments and Raw moments

Any moment about the A.M. of the data is called a central moment. The r-th order central moment is denoted by $m_r$ or $\mu_r$.
Any moment about 0 is called a raw moment.
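Example. Find the third central moment and the third raw moment of x from the following distribution
x :  2  4  1  3  Total
f :  3  2  4  1  10
The arithmetic mean $\bar{x} = \frac{1}{10}(2 \times 3 + 4 \times 2 + 1 \times 4 + 3 \times 1) = 2.1$
$\therefore$ the third central moment of x
$= \frac{1}{10}\{(2 - 2.1)^3 \times 3 + (4 - 2.1)^3 \times 2 + (1 - 2.1)^3 \times 4 + (3 - 2.1)^3 \times 1\}$
$= 0.912$
The third raw moment of x
$= \frac{1}{10}\{2^3 \times 3 + 4^3 \times 2 + 1^3 \times 4 + 3^3 \times 1\} = 18.3$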
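The definition of the r-th moment about a number A translates directly into code. The sketch below is illustrative only; the helper name `moment` and its `about` parameter are our own choices, and the data are those of the example above (values 2, 4, 1, 3 with frequencies 3, 2, 4, 1).

```python
def moment(values, freqs, r, about=0.0):
    """r-th moment of a frequency distribution about the number `about`."""
    n = sum(freqs)
    return sum(f * (x - about) ** r for x, f in zip(values, freqs)) / n


xs = [2, 4, 1, 3]
fs = [3, 2, 4, 1]

mean = moment(xs, fs, 1)                        # first raw moment = A.M.
third_central = moment(xs, fs, 3, about=mean)   # moment about the mean
third_raw = moment(xs, fs, 3)                   # moment about 0

print(round(mean, 3), round(third_central, 3), round(third_raw, 3))
```

A central moment is simply the same function evaluated with `about` set to the arithmetic mean, and a raw moment is the case `about=0`.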


Theorems
(1) 1st central moment = 0.
Proof. 1st central moment $= \frac{1}{N}\sum_{i=1}^{n}(x_i - \bar{x}) f_i = \frac{1}{N}\left(\sum_{i=1}^{n} x_i f_i - \bar{x}\sum_{i=1}^{n} f_i\right)$
$= \frac{1}{N}\sum_{i=1}^{n} x_i f_i - \bar{x} \cdot \frac{1}{N}\sum_{i=1}^{n} f_i = \bar{x} - \bar{x} \cdot \frac{1}{N} \times N = 0$.
(2) 2nd central moment = variance.
Proof. 2nd central moment $= \frac{1}{N}\sum_{i=1}^{n}(x_i - \bar{x})^2 f_i$ = variance, by definition.

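9.10. Relations between central moment and any moment

If $m_r$ is the r-th central moment and $m_r'$ is the r-th moment about any number A, then
(i) $m_2 = m_2' - m_1'^2$
(ii) $m_3 = m_3' - 3m_2' m_1' + 2m_1'^3$
(iii) $m_4 = m_4' - 4m_3' m_1' + 6m_2' m_1'^2 - 3m_1'^4$
Proof of (i). $m_2 = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - \bar{x})^2 = \frac{1}{N}\sum_{i=1}^{n} f_i \{(x_i - A) - (\bar{x} - A)\}^2$
$= \frac{1}{N}\sum_{i=1}^{n} f_i \{(x_i - A)^2 - 2(\bar{x} - A)(x_i - A) + (\bar{x} - A)^2\}$
$= \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - A)^2 - 2(\bar{x} - A)\frac{1}{N}\sum_{i=1}^{n} f_i (x_i - A) + \frac{1}{N}(\bar{x} - A)^2 \sum_{i=1}^{n} f_i$
$= m_2' - 2(\bar{x} - A) m_1' + (\bar{x} - A)^2$   ... (1)   $\because \sum f_i = N$
Now, $\bar{x} - A = \frac{1}{N}\sum_{i=1}^{n} f_i x_i - \frac{1}{N} A \sum_{i=1}^{n} f_i = \frac{1}{N}\sum_{i=1}^{n} f_i x_i - \frac{1}{N}\sum_{i=1}^{n} f_i A = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - A) = m_1'$
$\therefore$ from (1), $m_2 = m_2' - 2m_1' \cdot m_1' + m_1'^2 = m_2' - m_1'^2$.
Proofs of (ii) and (iii) are kept beyond the scope of the book.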
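Since the proofs of (ii) and (iii) are omitted, a numerical check may be reassuring. The following sketch is an illustration, not a proof: the function names are our own, the data and the origin A = 10 are arbitrary sample values, and exact fractions are used so that the comparison is not affected by rounding.

```python
from fractions import Fraction


def moment(values, freqs, r, about=0):
    """r-th moment of a frequency distribution about `about`, as an exact fraction."""
    n = sum(freqs)
    return sum(Fraction(f) * (Fraction(x) - about) ** r
               for x, f in zip(values, freqs)) / n


def central_from_raw(m1, m2, m3, m4):
    """2nd, 3rd and 4th central moments from the moments m1..m4 about any number A."""
    c2 = m2 - m1 ** 2
    c3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    c4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    return c2, c3, c4


# Arbitrary sample distribution and origin (assumptions for the check only).
xs, fs, A = [2, 4, 1, 3], [3, 2, 4, 1], 10

raw = [moment(xs, fs, r, A) for r in (1, 2, 3, 4)]        # moments about A
mean = moment(xs, fs, 1)
direct = tuple(moment(xs, fs, r, mean) for r in (2, 3, 4))  # computed from the definition

print(central_from_raw(*raw) == direct)
```

The same function applied to the moments 2, 10, 40, 218 (about 5) returns 6, -4 and 90.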
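Theorem. If two variables x and y are related as $y = \frac{x - c}{d}$, where c, d are constants, then $m_r(y) = \frac{1}{d^r} m_r(x)$, where $m_r(x)$ and $m_r(y)$ are the r-th central moments of x and y respectively.
Proof. We know, $\bar{y} = \frac{\bar{x} - c}{d}$
$\therefore y - \bar{y} = \frac{x - c}{d} - \frac{\bar{x} - c}{d} = \frac{x - \bar{x}}{d}$
$\therefore y_i - \bar{y} = \frac{x_i - \bar{x}}{d}$ for $i = 1, 2, 3, \ldots$
Now, $m_r(y) = \frac{1}{N}\sum_{i=1}^{n} f_i (y_i - \bar{y})^r = \frac{1}{N}\sum_{i=1}^{n} f_i \left\{\frac{x_i - c}{d} - \frac{\bar{x} - c}{d}\right\}^r = \frac{1}{N}\sum_{i=1}^{n} f_i \left(\frac{x_i - \bar{x}}{d}\right)^r$
$= \frac{1}{d^r} \cdot \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - \bar{x})^r$   $\because d^r$ is independent of i
or, $m_r(y) = \frac{1}{d^r} m_r(x) = \frac{m_r(x)}{d^r}$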
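The theorem can be checked numerically on any small distribution. In the sketch below the values, frequencies, and the constants c = 22 and d = 5 are hypothetical sample choices, and `central_moment` is our own helper; fractions keep the arithmetic exact.

```python
from fractions import Fraction


def central_moment(values, freqs, r):
    """r-th central moment of a frequency distribution (exact arithmetic)."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    return sum(f * (x - mean) ** r for x, f in zip(values, freqs)) / n


# Hypothetical data: values x_i with frequencies f_i.
xs = [Fraction(v) for v in (12, 17, 22, 27, 32)]
fs = [2, 5, 8, 4, 1]

c, d = 22, 5                              # change of origin and scale
ys = [(x - c) / d for x in xs]            # y = (x - c) / d

# m_r(x) should equal d**r * m_r(y) for every order r.
checks = [central_moment(ys, fs, r) * d ** r == central_moment(xs, fs, r)
          for r in (1, 2, 3, 4)]
print(checks)
```

This is why moments are usually computed for the coded variable y first and then scaled back by powers of d.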
9.11. Skewness and Kurtosis
The skewness and kurtosis of a frequency distribution are
$\gamma_1$ (skewness) $= \frac{m_3}{\sigma^3}$
and $\gamma_2$ (kurtosis) $= \frac{m_4}{\sigma^4} - 3$ respectively,
where $m_3$, $m_4$ are the 3rd and 4th central moments and $\sigma$ is the standard deviation of the distribution.
Note: Since skewness and kurtosis are ratios of two quantities having the same unit, they have no unit; they are pure numbers.
Example. Following is the frequency distribution of a variable x :
x :  112·45  117·45  122·45  127·45  132·45  137·45  142·45  Total
f :  5       15      20      35      10      10      5       100
Find its skewness and kurtosis.
Solution. Calculation of moments

x_i      f_i       x_i - 127·45   y_i = (x_i - 127·45)/5   f_i y_i   f_i y_i^2   f_i y_i^3   f_i y_i^4
112·45   5         -15            -3                        -15       45          -135        405
117·45   15        -10            -2                        -30       60          -120        240
122·45   20        -5             -1                        -20       20          -20         20
127·45   35        0              0                         0         0           0           0
132·45   10        5              1                         10        10          10          10
137·45   10        10             2                         20        40          80          160
142·45   5         15             3                         15        45          135         405
Total    100 = N   --             --                        -20       220         -50         1240

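The raw moments, i.e. moments of y about 0, are
$m_1'(y) = \frac{1}{N}\sum f_i y_i = \frac{1}{100} \times (-20) = -\frac{1}{5}$
$m_2'(y) = \frac{1}{N}\sum f_i y_i^2 = \frac{1}{100} \times 220 = \frac{11}{5}$
$m_3'(y) = \frac{1}{N}\sum f_i y_i^3 = \frac{1}{100} \times (-50) = -\frac{1}{2}$
$m_4'(y) = \frac{1}{N}\sum f_i y_i^4 = \frac{1}{100} \times 1240 = \frac{62}{5}$
$\therefore$ the central moments of y are
$m_2(y) = m_2'(y) - \{m_1'(y)\}^2 = \frac{11}{5} - \left(-\frac{1}{5}\right)^2 = \frac{54}{25} = 2.16$
$m_3(y) = m_3'(y) - 3m_2'(y) m_1'(y) + 2\{m_1'(y)\}^3 = -\frac{1}{2} - 3 \cdot \frac{11}{5} \cdot \left(-\frac{1}{5}\right) + 2\left(-\frac{1}{5}\right)^3$
$= -\frac{1}{2} + \frac{33}{25} - \frac{2}{125} = \frac{201}{250} = 0.804$
$m_4(y) = m_4'(y) - 4m_3'(y) m_1'(y) + 6m_2'(y)\{m_1'(y)\}^2 - 3\{m_1'(y)\}^4$
$= \frac{62}{5} - 4\left(-\frac{1}{2}\right)\left(-\frac{1}{5}\right) + 6 \cdot \frac{11}{5}\left(-\frac{1}{5}\right)^2 - 3\left(-\frac{1}{5}\right)^4$
$= \frac{62}{5} - \frac{4}{10} + \frac{66}{125} - \frac{3}{625} = \frac{7827}{625} = 12.5232$
Since $y_i = \frac{x_i - 127.45}{5}$,
$\therefore$ the central moments of x are given by
$m_2(y) = \frac{m_2(x)}{5^2}$ or, $m_2(x) = 25 \times 2.16 = 54$
$m_3(y) = \frac{m_3(x)}{5^3}$ or, $m_3(x) = 125 \times 0.804 = 100.5$
$m_4(y) = \frac{m_4(x)}{5^4}$ or, $m_4(x) = 625 \times 12.5232 = 7827$
Now we know the variance $\sigma_x^2 = m_2(x) = 54$
$\therefore \sigma_x = \sqrt{54} = 7.348$
$\therefore$ Skewness of x, $\gamma_1(x) = \frac{m_3(x)}{\sigma_x^3} = \frac{100.5}{(7.348)^3} = 0.253$
and Kurtosis of x, $\gamma_2(x) = \frac{m_4(x)}{\sigma_x^4} - 3 = \frac{7827}{54^2} - 3 = -0.316$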
9.12. Significance of Skewness
Skewness shows the extent of symmetry of the frequency diagram
of a variable. Below we draw the frequency diagram of the variables
x,y,z and w respectively.
[Figure: Frequency diagram of x, with frequency $f_i$ plotted against $x_1, x_2, \ldots, x_9$]
Note that this frequency diagram is symmetric about the frequency $f_5$. This type of frequency distribution will have skewness 0. The curve f is fitted to show the symmetry.
[Figure: Frequency diagram of y]
Note that this diagram is not symmetric. It has a positive value of skewness, as it has asymmetry towards the +ve direction of the X axis. The curve f is fitted to show this asymmetry.
[Figure: Frequency diagram of z]
Note that this diagram also has asymmetry towards the right side. Its asymmetry is more than that of y. Its measure of skewness will be greater than that of y. It is also positively skewed.
[Figure: Frequency diagram of w]
Note that this frequency diagram has asymmetry towards the left side. Its skewness will be negative.
9.13. Other Formulae for finding Skewness
Skewness can also be measured with the help of the following theorems, whose proofs are kept beyond the scope of the book.

Theorem 1. Skewness $= \frac{\text{Mean} - \text{Mode}}{\text{Standard deviation}}$
Theorem 2. Skewness $= \frac{3(\text{Mean} - \text{Median})}{\text{Standard deviation}}$
9.14. Significance of Kurtosis
Kurtosis shows the peakedness of the frequency diagram of a variable.
Below we draw the frequency diagram of the variables x, y, z and w
respectively
[Figure: Frequency diagram of x]
In this frequency diagram the greatest frequency is $f_6$. The curve f is fitted to show this peakedness.
[Figure: Frequency diagram of y]
We see that the peakedness of this frequency diagram is less than that of x. Then the kurtosis of y will be less than the kurtosis of x.
The kurtosis of a normal variate is 0. Here the peakedness of a distribution is measured by the ratio $m_4/\sigma^4$, which for the normal distribution is 3.
So, if the peakedness of a distribution is less than 3 then its kurtosis will be negative. This type of distribution is known as a Platykurtic distribution.
If the peakedness of a distribution is greater than 3 then its kurtosis will be positive. This type of distribution is known as a Leptokurtic distribution.
If the peakedness of a distribution is 3 then its kurtosis is 0. This type of distribution is known as a Mesokurtic distribution.
9.15. Illustrative Examples.
Example. 1. Find the mean from the following data:
Daily wages (Rs) : 25-29 30-34 35-39 40-44
No. of workers: 16 28 14 12
It is given that the total wage for 10 workers earning Rs 45 and more is
Rs 600.
First we are to work out the mean for the rest part without the last
class.
For that we construct the following table:

Class interval   Mid point (x_i)   Frequency (f_i)   y_i = (x_i - 37)/5   f_i y_i
25-29            27                16                -2                   -32
30-34            32                28                -1                   -28
35-39            37                14                0                    0
40-44            42                12                1                    12
Total            --                70                --                   -48

Here $N = 70$, $\sum f_i y_i = -48$
$\therefore \bar{y} = \frac{-48}{70} = -\frac{24}{35}$
But $\bar{y} = \frac{\bar{x} - 37}{5}$
i.e., $\bar{x} = 37 + 5\bar{y} = 37 - \frac{24}{35} \times 5 = 33.57$.
So the total wage of 70 workers = Rs. 33·57 × 70 = Rs. 2349·9.
Thus the wage of total 80 workers is
Rs (2349·9 + 600) = Rs 2949·9.
$\therefore$ Mean wage is Rs $\frac{2949.9}{80}$ = Rs 36·87.
Example. 2. The A.M. calculated from the following frequency distribution is known to be 72.5. Find the value of x :
Classes   :  30-39  40-49  50-59  60-69  70-79  80-89  90-99
Frequency :  2      3      11     20     x      25     7
We first construct the following table to find the A.M. using the given data :

Classes   Mid pt (x_i)   Frequency (f_i)   y_i = (x_i - 74.5)/10   y_i f_i
30-39     34.5           2                 -4                      -8
40-49     44.5           3                 -3                      -9
50-59     54.5           11                -2                      -22
60-69     64.5           20                -1                      -20
70-79     74.5           x                 0                       0
80-89     84.5           25                1                       25
90-99     94.5           7                 2                       14
Total     --             68 + x            --                      -20

$\therefore \bar{y} = \frac{\sum y_i f_i}{\sum f_i} = \frac{-20}{68 + x}$
$\therefore$ the A.M. $= 74.5 + 10\bar{y} = 74.5 - \frac{200}{68 + x}$.
By the given condition we have
$74.5 - \frac{200}{68 + x} = 72.5$ or, $68 + x = 100$ $\therefore x = 32$.
Thus the missing frequency x is 32.
Example. 3. Average marks obtained by a class of 70 students was found to be 65. Later it was found that the marks of one student was wrongly recorded as 85 in place of 58. Find the corrected mean.
Wrongly calculated mean $= \frac{1}{70}\sum_{i=1}^{70} x_i = 65$
or, $\sum_{i=1}^{70} x_i = 4550$
i.e. sum of wrong observations = 4550.
$\therefore$ sum of corrected observations = 4550 - 85 + 58 = 4523.
So the corrected mean $= \frac{4523}{70} = 64.61$.
Ex. 4. Following is a frequency distribution lacking two class frequencies. Find them if the mean is 7.74.
Value     :  3-5  5-7  7-9  9-11  11-13  Total
Frequency :  32   --   57   --    25     200
Let the two missing frequencies be $f_1$ and $f_2$ respectively. We construct the following table:
Class-interval   Frequency   Class-Mark (x)   xf
3-5              32          4                128
5-7              f_1         6                6 f_1
7-9              57          8                456
9-11             f_2         10               10 f_2
11-13            25          12               300
Total            200 = N     --               884 + 6 f_1 + 10 f_2
Now, $32 + f_1 + 57 + f_2 + 25 = 200$
or, $f_1 + f_2 = 86$   ... (1)
The mean, $\bar{x} = \frac{1}{N}\sum x_i f_i$ or, $7.74 = \frac{1}{200} \times (884 + 6f_1 + 10f_2)$
or, $3f_1 + 5f_2 = 332$   ... (2)
Solving (1) and (2) we get $f_1 = 49$, $f_2 = 37$, so the two missing frequencies are 49 and 37.

Example. 5. Two variables x and y are related by 3x + 4y = 21. The A.M. of x is 3. Find the A.M. of y.
From the given relation we get $y = \frac{21 - 3x}{4}$.
$\therefore \bar{y} = \frac{21 - 3\bar{x}}{4} = \frac{21 - 3 \times 3}{4} = \frac{12}{4} = 3$
Example. 6. Two variables x and y are related by x = 2y + 5. The median of x is 25. Find the median of y.
From the given relation we have $y = \frac{x - 5}{2}$.
Here if x increases, y also does so.
So the median of $y = \frac{\text{Median of } x - 5}{2} = \frac{25 - 5}{2} = 10$.
Example. 7. The numbers of observations of two groups are in the ratio 2:1 and their A.M.s are 8 and 128 respectively. Find the A.M. of the combined group.
Let the numbers of observations of the two groups be 2k, k. Here $\bar{x}_1 = 8$, $\bar{x}_2 = 128$.
$\therefore$ the combined mean $\bar{x} = \frac{2k \times 8 + k \times 128}{2k + k}$
or, $\bar{x} = \frac{144k}{3k} = 48$.
Example. 8. Find the median of the following frequency distribution:
x :  5  10  15  20  25  30  35  40
f :  7  10  15  18  23  21  17  8
The cumulative frequency distribution table is given below:

x    f    c.f.
5    7    7
10   10   17
15   15   32
20   18   50
25   23   73
30   21   94
35   17   111
40   8    119

Here $N = 119$ $\therefore \frac{N + 1}{2} = 60$
So the cumulative frequency just greater than $\frac{N + 1}{2}$ is 73 and the value of x corresponding to c.f. 73 is 25. Hence the median is 25.
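The cumulative-frequency search used above can be sketched in Python. The helper names are our own. As an assumption beyond the text, when N is even the sketch averages the two middle observations, which is the usual convention; the example itself only needs the odd case.

```python
def kth_value(values, freqs, k):
    """Value of the k-th observation (1-based) when the values are in ascending order."""
    cumulative = 0
    for x, f in zip(values, freqs):
        cumulative += f
        if cumulative >= k:          # the k-th observation falls at this value
            return x
    raise ValueError("k exceeds the total frequency")


def median_from_frequencies(values, freqs):
    """Median of a discrete frequency distribution."""
    n = sum(freqs)
    if n % 2 == 1:
        return kth_value(values, freqs, (n + 1) // 2)
    # Even N: average of the two middle observations (an assumed convention).
    return (kth_value(values, freqs, n // 2) + kth_value(values, freqs, n // 2 + 1)) / 2


xs = [5, 10, 15, 20, 25, 30, 35, 40]
fs = [7, 10, 15, 18, 23, 21, 17, 8]
print(median_from_frequencies(xs, fs))   # 25
```

Here N = 119, so the median is the 60th observation, which lies at x = 25 because the cumulative frequency first reaches 60 there.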
Example. 9. Calculate the mean and median, and hence find the approximate value of the mode, from the following frequency distribution:
Height (inches) :  60-63  64-67  68-71  72-75  76-79  80-83
No. of students :  8      3      18     6      16     8
To find the mean we first construct the following table :

Classes   Mid pt (x_i)   Frequency (f_i)   c.f.   y_i = (x_i - 73.5)/4   y_i f_i
60-63     61.5           8                 8      -3                     -24
64-67     65.5           3                 11     -2                     -6
68-71     69.5           18                29     -1                     -18
72-75     73.5           6                 35     0                      0
76-79     77.5           16                51     1                      16
80-83     81.5           8                 59     2                      16
Total     --             59                --     --                     -16

Here $N = \sum f_i = 59$, $\sum y_i f_i = -16$
$\therefore \bar{y} = \frac{\sum y_i f_i}{\sum f_i} = \frac{-16}{59} = -0.2712$
Now, $y_i = \frac{x_i - 73.5}{4}$ or, $x_i = 73.5 + 4y_i$
$\therefore \bar{x} = 73.5 + 4\bar{y} = 73.5 + 4(-0.2712) = 72.4152$
$\therefore$ mean of the given distribution = 72.4152.
Now $\frac{N}{2} = 29.5$.
So the median class is 72-75.
$\therefore l_m = 71.5$, $F = 29$, $f_m = 6$ and $i = 75.5 - 71.5 = 4$
$\therefore$ Median $= l_m + \frac{\frac{N}{2} - F}{f_m} \times i = 71.5 + \frac{29.5 - 29}{6} \times 4 = 71.83$.
Using the relation Mean - Mode = 3 (Mean - Median) we have
72.4152 - Mode = 3(72.4152 - 71.83) or, 72.4152 - Mode = 1.7556
$\therefore$ Mode = 70.6596
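The grouped-median formula and the empirical relation Mean - Mode = 3 (Mean - Median) can be sketched as follows. The function names are our own, and the class boundaries 59.5, 63.5, ... are obtained from the class limits by the usual half-unit adjustment. Because the code does not round intermediate values, its results may differ from the hand-rounded figures in the second decimal place.

```python
def grouped_mean(midpoints, freqs):
    """Arithmetic mean of a grouped distribution from class mid-points."""
    return sum(f * x for x, f in zip(midpoints, freqs)) / sum(freqs)


def grouped_median(lower_boundaries, width, freqs):
    """Median = l_m + (N/2 - F) / f_m * i, located in the first class whose c.f. >= N/2."""
    n = sum(freqs)
    cumulative = 0                      # F: cumulative frequency before the current class
    for lower, f in zip(lower_boundaries, freqs):
        if cumulative + f >= n / 2:
            return lower + (n / 2 - cumulative) / f * width
        cumulative += f
    raise ValueError("empty distribution")


mid = [61.5, 65.5, 69.5, 73.5, 77.5, 81.5]
low = [59.5, 63.5, 67.5, 71.5, 75.5, 79.5]
freq = [8, 3, 18, 6, 16, 8]

mean = grouped_mean(mid, freq)
median = grouped_median(low, 4, freq)
mode = mean - 3 * (mean - median)       # empirical relation, approximate only
print(round(mean, 2), round(median, 2), round(mode, 2))
```

The empirical mode is only an approximation, reasonable for moderately skewed distributions.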
Example. 10. Calculate the mode of the following data:
1, 12, 5, 8, 12, 13, 8, 1, 4, 8, 7, 8, 5.
Let us arrange the given variates with their corresponding frequencies as given below:

x    f
1    2
4    1
5    2
7    1
8    4
12   2
13   1
As the variate 8 occurs 4 times, which is the maximum, mode = 8.
Example. 11. Calculate the mode from the following distribution:
Class     :  10-15  15-20  20-25  25-30  30-35
Frequency :  6      9      11     7      7
Here the greatest frequency 11 lies in the class 20-25. Hence the modal class is 20-25.
$\therefore l_m$ = lower class boundary of the modal class = 20,
$f_1$ = frequency of the modal class = 11,
$f_0$ = frequency of the class preceding the modal class = 9,
$f_2$ = frequency of the class succeeding the modal class = 7,
$i$ = width of each class = 5.
$\therefore$ Mode $= l_m + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times i = 20 + \frac{11 - 9}{22 - 9 - 7} \times 5 = 21.67$.
Ex. 12. The median and mode of the following frequency distribution are known to be 27 and 26 respectively. Find the values of $\alpha$ and $\beta$ :
Class interval :  0-10  10-20     20-30  30-40  40-50
Frequency      :  3     $\alpha$  20     12     $\beta$
Since mode = 26, it lies within the class 20-30.
Mode $= l_m + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times i = 20 + \frac{20 - \alpha}{40 - \alpha - 12} \times 10 = 20 + \frac{20 - \alpha}{28 - \alpha} \times 10$
$\therefore 20 + \frac{10(20 - \alpha)}{28 - \alpha} = 26$ or, $\frac{10(20 - \alpha)}{28 - \alpha} = 6$ $\therefore \alpha = 8$.
So the total frequency is $N = 43 + \beta$.
But median = 27, so it lies within the class 20-30.
Then we construct the following frequency table :

Class interval   Frequency   c.f.
0-10             3           3
10-20            8           11
20-30            20          31
30-40            12          43
40-50            $\beta$     43 + $\beta$

Now median $= l_m + \frac{\frac{N}{2} - F}{f_m} \times i$
$\therefore 27 = 20 + \frac{\frac{43 + \beta}{2} - 11}{20} \times 10$ or, $\frac{21 + \beta}{4} = 7$ or, $21 + \beta = 28$ $\therefore \beta = 7$.
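As a check on the two unknowns, the grouped mode and median formulas can be evaluated with $\alpha = 8$ and $\beta = 7$ substituted back. The sketch below is illustrative; both function names and their parameter names are our own.

```python
def grouped_mode(lower_boundary, width, f0, f1, f2):
    """Mode = l_m + (f1 - f0) / (2*f1 - f0 - f2) * i for the modal class."""
    return lower_boundary + (f1 - f0) * width / (2 * f1 - f0 - f2)


def grouped_median(lower_boundary, width, n, cum_before, f_m):
    """Median = l_m + (N/2 - F) / f_m * i for the median class."""
    return lower_boundary + (n / 2 - cum_before) * width / f_m


alpha, beta = 8, 7
freqs = [3, alpha, 20, 12, beta]        # classes 0-10, 10-20, ..., 40-50

# Modal and median class are both 20-30 (lower boundary 20, width 10).
mode = grouped_mode(20, 10, f0=alpha, f1=20, f2=12)
median = grouped_median(20, 10, n=sum(freqs), cum_before=3 + alpha, f_m=20)
print(mode, median)   # 26.0 27.0
```

Both values agree with the given mode 26 and median 27, which confirms the solution.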
Ex. 13. Find the variance and standard deviation of the following frequency distribution:
Weight (in kg) :  36-40  41-45  46-50  51-55  56-60  61-65  66-70
No. of persons :  14     26     40     33     50     37     25
We construct the following table:

Class interval   Mid pt (x_i)   Frequency (f_i)   y_i = (x_i - 53)/5   f_i y_i   f_i y_i^2
36-40            38             14                -3                   -42       126
41-45            43             26                -2                   -52       104
46-50            48             40                -1                   -40       40
51-55            53             33                0                    0         0
56-60            58             50                1                    50        50
61-65            63             37                2                    74        148
66-70            68             25                3                    75        225
Total            --             225               --                   65        693

Here $N = 225$, $\sum f_i y_i = 65$, $\sum f_i y_i^2 = 693$
$\therefore \mathrm{Var}(y) = \frac{1}{N}\sum f_i y_i^2 - \left(\frac{1}{N}\sum f_i y_i\right)^2 = \frac{1}{225} \times 693 - \left(\frac{65}{225}\right)^2 = 2.996$.
$\therefore$ S.D. of y, $\sigma_y = \sqrt{2.996} = 1.731$
Since $y_i = \frac{x_i - 53}{5}$, so $\sigma_y = \frac{\sigma_x}{5}$
$\therefore \sigma_x = 1.731 \times 5 = 8.655$.
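The change-of-origin-and-scale method used above can be written as a short routine. This is a sketch under the same assumptions as the worked example; the function name and the parameter names `origin` and `scale` are our own.

```python
import math


def variance_by_step_deviation(midpoints, freqs, origin, scale):
    """Return (Var(y), sigma_x) where y = (x - origin) / scale and sigma_x = scale * sigma_y."""
    n = sum(freqs)
    ys = [(x - origin) / scale for x in midpoints]
    mean_y = sum(f * y for f, y in zip(freqs, ys)) / n
    var_y = sum(f * y * y for f, y in zip(freqs, ys)) / n - mean_y ** 2
    return var_y, math.sqrt(var_y) * scale


mid = [38, 43, 48, 53, 58, 63, 68]
freq = [14, 26, 40, 33, 50, 37, 25]

var_y, sd_x = variance_by_step_deviation(mid, freq, origin=53, scale=5)
print(round(var_y, 3), round(sd_x, 2))
```

The standard deviation does not depend on the choice of origin; only the scale factor has to be restored at the end.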

Example. 14. The mean and standard deviation of marks of 70 students were found to be 65 and 5.2 respectively. Later it was detected that the marks of one student was wrongly recorded as 85 instead of 58. Obtain the correct s.d.
Let $x_1, x_2, x_3, \ldots$ be the marks.
The incorrect $\sum x_i = 65 \times 70 = 4550$.
$\therefore$ the correct $\sum x_i = 4550 - 85 + 58 = 4523$.
$\therefore$ the correct mean $= \frac{4523}{70} = 64.61$.
We know $(\text{s.d.})^2 = \frac{1}{n}\sum x_i^2 - (\bar{x})^2$ or, $\sum x_i^2 = n\{(\text{s.d.})^2 + \bar{x}^2\}$.
Again the incorrect $\sum x_i^2 = 70\{(5.2)^2 + (65)^2\} = 297642.8$.
Then the correct $\sum x_i^2 = 297642.8 - 85^2 + 58^2 = 293781.8$.
Hence the correct $(\text{s.d.})^2 = \frac{293781.8}{70} - \left(\frac{4523}{70}\right)^2 = 4196.883 - 4175.006 = 21.88$. The mean is kept unrounded here, because the difference of two large, nearly equal numbers is sensitive to rounding.
So the correct s.d. $= \sqrt{21.88} = 4.68$.
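The correction procedure (recover the sums, replace the wrong value, recompute) can be sketched as a small function. The name `corrected_mean_sd` and its parameters are our own. The sketch keeps the corrected mean unrounded, since the variance is a difference of two large numbers and rounding the mean first can noticeably change the result.

```python
import math


def corrected_mean_sd(n, mean, sd, wrong, right):
    """Mean and s.d. after replacing one wrongly recorded value `wrong` by `right`."""
    total = n * mean - wrong + right                                  # corrected sum of x
    total_sq = n * (sd ** 2 + mean ** 2) - wrong ** 2 + right ** 2    # corrected sum of x^2
    new_mean = total / n
    new_var = total_sq / n - new_mean ** 2
    return new_mean, math.sqrt(new_var)


new_mean, new_sd = corrected_mean_sd(70, 65, 5.2, wrong=85, right=58)
print(round(new_mean, 2), round(new_sd, 2))
```

The same function handles a value that was recorded in error in either direction, since only the two sums are adjusted.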

Example. 15. Three factories A, B, C producing similar products are such that the mean daily wage of workers of factory A is Rs. 100 with a s.d. of Rs. 10, whereas in factory B the mean wage is Rs. 150 and the s.d. is Rs. 12, and in factory C the mean wage is Rs. 150 and the s.d. is Rs. 10. Which factory is most consistent in respect of the daily wage of its workers?
The s.d. of workers of factory A per unit mean $= \frac{10}{100} = 0.1$.
The s.d. of workers of factory B per unit mean $= \frac{12}{150} = 0.08$.
The s.d. of workers of factory C per unit mean $= \frac{10}{150} = 0.07$.
This shows that the daily wage of the workers of factory C is the most consistent.
Variability is highest for factory A.
Example 16. Calculate the first four central moments, the skewness and the kurtosis for the following distribution of 1083 cases of enteric fever.

Age (Years)   No. of cases   Age (Years)   No. of cases
Under 5       33             30 - 35       63
5 - 10        143            35 - 40       37
10 - 15       252            40 - 45       20
15 - 20       244            45 - 50       12
20 - 25       165            50 - 55       5
25 - 30       107            55 - 60       2
Solution.
Calculation of moments

Class interval   Mid-value (x_i)   f_i        x_i - 32·5   y_i = (x_i - 32·5)/5   f_i y_i   f_i y_i^2   f_i y_i^3   f_i y_i^4
0-5              2·5               33         -30          -6                     -198      1188        -7128       42768
5-10             7·5               143        -25          -5                     -715      3575        -17875      89375
10-15            12·5              252        -20          -4                     -1008     4032        -16128      64512
15-20            17·5              244        -15          -3                     -732      2196        -6588       19764
20-25            22·5              165        -10          -2                     -330      660         -1320       2640
25-30            27·5              107        -5           -1                     -107      107         -107        107
30-35            32·5              63         0            0                      0         0           0           0
35-40            37·5              37         5            1                      37        37          37          37
40-45            42·5              20         10           2                      40        80          160         320
45-50            47·5              12         15           3                      36        108         324         972
50-55            52·5              5          20           4                      20        80          320         1280
55-60            57·5              2          25           5                      10        50          250         1250
Total            --                1083 = N   --           --                     -2947     12113       -48055      223025
The moments of y about 0 are
$m_1'(y) = \frac{1}{N}\sum f_i y_i = \frac{1}{1083} \times (-2947) = -2.721$
$m_2'(y) = \frac{1}{N}\sum f_i y_i^2 = \frac{1}{1083} \times 12113 = 11.185$
$m_3'(y) = \frac{1}{N}\sum f_i y_i^3 = \frac{1}{1083} \times (-48055) = -44.372$
$m_4'(y) = \frac{1}{N}\sum f_i y_i^4 = \frac{1}{1083} \times 223025 = 205.932$
$\therefore$ the central moments of y are
$m_1(y) = 0$
$m_2(y) = m_2'(y) - \{m_1'(y)\}^2 = 11.185 - (-2.721)^2 = 3.781$
$m_3(y) = m_3'(y) - 3m_2'(y) m_1'(y) + 2\{m_1'(y)\}^3$
$= -44.372 - 3 \times 11.185 \times (-2.721) + 2 \times (-2.721)^3 = 6.639$
$m_4(y) = m_4'(y) - 4m_3'(y) m_1'(y) + 6m_2'(y)\{m_1'(y)\}^2 - 3\{m_1'(y)\}^4$
$= 205.932 - 4 \times (-44.372) \times (-2.721) + 6 \times 11.185 \times (-2.721)^2 - 3(-2.721)^4$
$= 55.408$
Since $y_i = \frac{x_i - 32.5}{5}$,
$\therefore$ the central moments of x are given by $m_1(x) = 0$,
$m_2(y) = \frac{m_2(x)}{5^2}$ $\therefore m_2(x) = 25 \times 3.781 = 94.525$
$m_3(y) = \frac{m_3(x)}{5^3}$ $\therefore m_3(x) = 125 \times 6.639 = 829.875$
$m_4(y) = \frac{m_4(x)}{5^4}$ or, $m_4(x) = 625 \times 55.408 = 34630.00$
The variance of x, $\sigma_x^2 = m_2(x) = 94.525$
$\therefore \sigma_x = \sqrt{94.525} = 9.722$
Skewness of x, $\gamma_1(x) = \frac{m_3(x)}{\sigma_x^3} = \frac{829.875}{(9.722)^3} = 0.903$
Kurtosis of x, $\gamma_2(x) = \frac{m_4(x)}{\sigma_x^4} - 3 = \frac{34630}{(9.722)^4} - 3 = 0.876$
Example 17. The first four moments of a distribution about 5 are 2, 10, 40 and 218. Find the first four central moments and the moments about 0.
Solution. Let $m_r$ and $m_r'$ be the central moments and the moments about 5 respectively.
Using the relation between $m_r$ and $m_r'$ we have
$m_1 = 0$
$m_2 = m_2' - m_1'^2 = 10 - 2^2 = 6$
$m_3 = m_3' - 3m_2' m_1' + 2m_1'^3 = 40 - 3 \times 10 \times 2 + 2 \times 2^3 = -4$
$m_4 = m_4' - 4m_3' m_1' + 6m_2' m_1'^2 - 3m_1'^4 = 218 - 4 \times 40 \times 2 + 6 \times 10 \times 2^2 - 3 \times 2^4 = 90$
which are the first four central moments.
Let $\mu_1, \mu_2, \mu_3, \mu_4$ be the moments about 0.
Now, first moment about 5 = 2
or, $\frac{1}{N}\sum_{i=1}^{n} f_i (x_i - 5) = 2$
or, $\frac{1}{N}\sum_{i=1}^{n} f_i x_i - 5 \cdot \frac{1}{N}\sum_{i=1}^{n} f_i = 2$
or, $\bar{x} - 5 \cdot \frac{1}{N} \times N = 2$ $\therefore \bar{x} = 7$
Now, $\mu_1 = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - 0) = \frac{1}{N}\sum_{i=1}^{n} f_i x_i = \bar{x} = 7$
Using the relation between central moments and moments about any number we have
$m_2 = \mu_2 - \mu_1^2$ or, $6 = \mu_2 - 7^2$ $\therefore \mu_2 = 55$
$m_3 = \mu_3 - 3\mu_2 \mu_1 + 2\mu_1^3$
or, $-4 = \mu_3 - 3 \times 55 \times 7 + 2 \times 7^3$ or, $\mu_3 = 465$
and $m_4 = \mu_4 - 4\mu_3 \mu_1 + 6\mu_2 \mu_1^2 - 3\mu_1^4$
or, $90 = \mu_4 - 4 \times 465 \times 7 + 6 \times 55 \times 7^2 - 3 \times 7^4$
or, $\mu_4 = 4143$
Thus 7, 55, 465 and 4143 are the first four moments about 0.
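Both parts of this example are instances of one operation: changing the number about which the moments are taken. Writing $x - b = (x - a) + (a - b)$ and expanding by the binomial theorem gives the general rule sketched below. The function `shift_moments` is our own illustration, not a formula stated in the text.

```python
from math import comb


def shift_moments(moments_about_a, a, b):
    """Moments about b, given the moments [m1, m2, ...] about a.

    Uses x - b = (x - a) + d with d = a - b and the binomial expansion.
    """
    m = [1] + list(moments_about_a)      # m[0] = 1 is the zeroth moment
    d = a - b
    return [sum(comb(r, k) * m[k] * d ** (r - k) for k in range(r + 1))
            for r in range(1, len(m))]


about_5 = [2, 10, 40, 218]

about_0 = shift_moments(about_5, a=5, b=0)
mean = about_0[0]                         # first moment about 0 is the A.M.
central = shift_moments(about_5, a=5, b=mean)

print(about_0)    # [7, 55, 465, 4143]
print(central)    # [0, 6, -4, 90]
```

Taking b equal to the mean reproduces relations (i) to (iii) of Section 9.10 as special cases.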
Example 18. The first three moments of a distribution about 7, calculated from 9 observations, are 0.2, 19·4 and -41·0 respectively. Find the mean, the standard deviation and the third moment about the origin.
Solution. Let $m_r'$ be the moments about 7.
$\therefore m_1' = 0.2$, $m_2' = 19.4$, $m_3' = -41.0$
$\therefore \frac{1}{9}\sum_{i=1}^{9}(x_i - 7) = 0.2$ or, $\frac{1}{9}\sum_{i=1}^{9} x_i - \frac{1}{9}\sum_{i=1}^{9} 7 = 0.2$
or, $\bar{x} - \frac{1}{9} \times 7 \times 9 = 0.2$ $\therefore \bar{x} = 7.2$
From the relation between central moments and moments about any number we have
$m_2 = m_2' - m_1'^2 = 19.4 - (0.2)^2 = 19.36$
$\therefore$ the variance = 19·36 $\therefore$ s.d. $= \sqrt{19.36} = 4.4$
$m_3 = m_3' - 3m_2' m_1' + 2m_1'^3$
$= -41 - 3 \times 19.4 \times 0.2 + 2 \times (0.2)^3 = -52.624$
Let $\mu_1, \mu_2, \mu_3$ be the first three moments about 0.
Now, $\mu_1 = \frac{1}{9}\sum (x_i - 0) = \bar{x} = 7.2$
Using the relation between $m_r$ and $\mu_r$ we get
$m_2 = \mu_2 - \mu_1^2$ or, $19.36 = \mu_2 - (7.2)^2$ $\therefore \mu_2 = 71.2$
and $m_3 = \mu_3 - 3\mu_2 \mu_1 + 2\mu_1^3$
or, $-52.624 = \mu_3 - 3 \times 71.2 \times 7.2 + 2 \times (7.2)^3$
$\therefore \mu_3 = 738.8$
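Example 19. The A.M. of a distribution is 5. The second and the third central moments are 20 and 140 respectively. Find the third moment about 10.
Solution. Let $m_r$ be the central moments.
$\therefore m_1 = 0$, $m_2 = 20$ and $m_3 = 140$
Let $m_r'$ be the moments about 10.
Then $m_1' = \frac{1}{n}\sum_{i=1}^{n}(x_i - 10) = \frac{1}{n}\sum_{i=1}^{n} x_i - \frac{1}{n}\sum_{i=1}^{n} 10 = \bar{x} - \frac{1}{n} \times 10n = 5 - 10 = -5$
Using the relation between $m_r$ and $m_r'$ we get
$m_2 = m_2' - m_1'^2$ or, $20 = m_2' - (-5)^2$ $\therefore m_2' = 45$
and $m_3 = m_3' - 3m_2' m_1' + 2m_1'^3$
or, $140 = m_3' - 3 \times 45 \times (-5) + 2 \times (-5)^3 = m_3' + 675 - 250$ $\therefore m_3' = -285$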

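Example 20. The distribution of a variable x has coefficient of variation = 5, variance = 4 and measure of skewness = 0·5. Find the mean and mode of the distribution.
Solution. We know, C.V. $= \frac{\text{S.D.}}{\text{Mean}} \times 100$
$\therefore 5 = \frac{\sqrt{4}}{\text{Mean}} \times 100$ $\therefore$ Mean $= \frac{2 \times 100}{5} = 40$
We know, Skewness $= \frac{\text{Mean} - \text{Mode}}{\text{S.D.}}$ or, $0.5 = \frac{40 - \text{Mode}}{2}$
$\therefore$ Mode = 40 - 1 = 39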
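Example 21. Prove that the second order moment of a variable is minimum about the mean of the variable.
Solution. Let x be the variable and $\bar{x}$ be its mean.
Let a be any real number.
Then $(x_i - a)^2 = \{(x_i - \bar{x}) + (\bar{x} - a)\}^2 = (x_i - \bar{x})^2 + 2(x_i - \bar{x})(\bar{x} - a) + (\bar{x} - a)^2$
$\therefore \sum_{i=1}^{n}(x_i - a)^2 = \sum_{i=1}^{n}(x_i - \bar{x})^2 + 2(\bar{x} - a)\sum_{i=1}^{n}(x_i - \bar{x}) + \sum_{i=1}^{n}(\bar{x} - a)^2$
$\therefore \frac{1}{n}\sum_{i=1}^{n}(x_i - a)^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 + 2(\bar{x} - a) \cdot \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x}) + \frac{1}{n} \cdot n(\bar{x} - a)^2$
$= \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 + 2(\bar{x} - a)\left\{\frac{1}{n}\sum_{i=1}^{n} x_i - \frac{1}{n} \cdot n\bar{x}\right\} + (\bar{x} - a)^2$
$= \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 + 2(\bar{x} - a)(\bar{x} - \bar{x}) + (\bar{x} - a)^2$
or, 2nd order moment about a
= 2nd order moment about $\bar{x}$ + $(\bar{x} - a)^2$
$\geq$ 2nd order moment about $\bar{x}$.
Hence proved.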
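The identity behind this proof, namely that the second moment about a equals the variance plus $(\bar{x} - a)^2$, can be checked numerically. The sketch below uses a small illustrative data set and a few arbitrary trial values of a; the helper name `second_moment` is our own.

```python
def second_moment(values, a):
    """Second order moment of the raw data about the number a."""
    return sum((x - a) ** 2 for x in values) / len(values)


data = [5, 8, 3, 10, 12]                  # illustrative sample values
mean = sum(data) / len(data)
variance = second_moment(data, mean)

trial_points = (0, 5, 7, 7.6, 9, 20)      # arbitrary choices of a
gaps = [second_moment(data, a) - (variance + (mean - a) ** 2) for a in trial_points]

# Every gap should be zero up to rounding, so the moment is smallest at a = mean.
print(all(abs(g) < 1e-9 for g in gaps))
print(min(trial_points, key=lambda a: second_moment(data, a)))
```

Since $(\bar{x} - a)^2$ is never negative and vanishes only at $a = \bar{x}$, the minimum is attained at the mean and nowhere else.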
Exercises 9

1. Find the AM of the variable assuming the datas 5, 8, 3, 10, 12.


2. Find the AM of the following frequency distribution :
x 6 5 2 8
Frequency: 4 3 1 2
3. The scores of a cricketer playing six matches are 84, 91, 72, 68,
87 and 78. Find the arithmetic mean (AM) of the scores.
4. Ten measurements of the volume of a cone were recorded by an
engineer as 3.88, 4.09, 3.97, 4.02, 3.95, 4.03, 3.92, 3.98 and 4.06 c.c.
Find the AM of the measurements.
5. Find the AM of the following frequency distribution:
x 462 480 498 516 534 552 570 588 606 624
f 98 75 56 42 30 21 15 11 6 2
6. Following is the frequency distribution for the number of minutes per week spent watching TV by 400 senior citizens.
Viewing Time (minute) :  300-399  400-499  500-599  600-699  700-799  800-899  900-999  1000-1099  1100-1199
Number of Citizens    :  14       46       58       76       68       62       48       22         6
Find the mean TV viewing time for the 400 senior citizens per week.

7. Four groups of cattle, consisting of 18, 10, 20 and 15 cattle, have reported weights of 140, 153, 148 and 162 kg respectively. Find the mean weight of all the cattle.
[Hint: It is a frequency distribution like
x         :  140  153  148  162
Frequency :  18   10   20   15 ]
8. Average marks in Engineering Mathematics in a class-test of 45 students is 62%. On scrutiny it is detected that the marks of two students were erroneously recorded as 25 and 72 instead of 52 for both of them. What should be the correct mean?
9. A teacher teaches two sections of a Mathematics class. Section A has 25 students and their average on the first test was 82. Section B has 15 students and their average on this test is 74. Find the average on this test if the teacher combines the scores for both the classes.
10. The mean weight of a lot of beams is 60 kg. The mean weight
of black beam in the lot is 70 kg and that of white beam is 55 kg. Find
the proportion of black-beams and white-beams in the lot. If we have an
additional information that there are 150 beams in the lot altogether, then
obtain the number of black-beams and the number of white beams in the
class.
11. Fifty students took up a class test carrying a total of 10 marks.
The result of those who passed the test is given below :
Marks 4 5 6 7 8 9
No. of students : 8 10 9 6 4 3
If the average marks for the 50 students were 5.16, find out the
average marks of the failed-students.
12. Following is a frequency distribution having two missing
frequencies. The mean of this distribution is 1.46. Find the missing
frequency:
x : 0 1 2 3 4 5 Total
Frequency : 46 25 10 5 200
13. The A. M. of the following frequency distribution of marks for
a group of 60 students is 30.5. Find the missing values:
x 10 20 40 50
f 8 10 20 15 7

14. Find the mean of the variable x assuming the following values:
(i) $1, 2, \ldots, n$, each with frequency equal to its value.
(ii) the first n natural numbers. [W.B. U. Tech 2004]
(iii) $1^2, 3^2, 5^2, \ldots, (2n - 1)^2$
(iv) $2, 4, 6, \ldots, 2n$.
(v) $2^3, 4^3, 6^3, \ldots, (2n)^3$
(vi) $1, 2, \ldots, n$ with frequencies $1^2, 2^2, \ldots, n^2$ respectively.
15. Suppose that the blood pressures for nine randomly selected individuals are:
118.6, 127.4, 122.0, 133.2, 108.3, 138.4, 113.7, 130.0, 131.5.
Find (i) the median.
(ii) If the values are rounded off to the nearest 5 mm Hg, what is the sample of the values and what is the median?
(iii) If the second person's blood pressure is 127.6 rather than 127.4, how does this change the median of the actual values and of the rounded values?
16. In a batch of 15 students, 5 students failed in a test. The marks
of 10 students who passed were :
90,60,70,80,80,90,60,50,40,70.
Find the median of the marks of all 15 students.
17. The following data relate to the sizes of shoes sold at a shop. Find
the median and mode size of shoes :
Size : 9.0 8.5 8.0 7.5 7.0 6.5 6.0 5.5 5.0 4.5
Frequency: 1 4 11 23 40 60 30 15 5 2
18. (i) The frequency distribution of house rent for 30 families in certain
locality is given below:
Rent : 1800-2000 2000-2200 2200-2400 2400-2700 2700-3000 3000-3500
No-of
Families: 4 7 10 5 2 2
Find the Median.
(ii) Form an ordinary frequency table from the following data:
Marks : Below 10 Below 20 Below 30 Below 40 Below 50
No.of students : 3 8 17 20 22
Hence find the mean and median. [W.B. U. Tech 2005 ]
19. Following is a grouped frequency distribution having a missing frequency. The median of the distribution is 127.5. Find the missing frequency.
Class-interval: 100-109 110-119 120-129 130-139 140-149 150-159
Frequency 5 7 8 4 6
20. Following is a grouped frequency distribution of expenditure of
1000 families. The mean and median of the distribution are both Rs. 87.50.
Find the missing frequency:
Expenditure: 40-59 60-79 80-99 100-119 120-139
No. of Family: 50 500 50
21. Show that the combined A. M. of two groups lies between the
arithmetic group means.
22. If the heights in cm of ten students are 63, 65, 66, 65, 64, 65,
65, 61, 67, 68, find the modal height. [W.B.U. Tech 2004]
23. Among drivers the number of accidents in which each was
involved during a 5-year period was recorded: 2, 0, 1, 0, 1, 3, 1.
Determine the mode.
24. (i) Find the mode of the following frequency distribution :
Class interval :  1500-1700  1700-1900  1900-2100  2100-2300  2300-2500
Frequency      :  25         30         37         27         11
(ii) Find the mean, median and mode from the following frequency distribution
Class interval :  300-600  600-1000  1000-1800  1800-2800  2800-3300  3300-3600  3600-4500
Frequency      :  10       20        30         20         10         5          5
25. Find the standard deviation of the observations:
(i) 5, 18, 10, 15,3, 7, 6 and 12
(ii) 5, 7, 1,2,6,3
(iii) 3.2,4.6, 2.8, 5.2, 4.4
26. Find the s.d and variance for the following frequency distribution:
x 2 5 9 10
f 3 6 8 4

27. Find the s.d and variance of the variable x having following
frequency distribution :
x 94.5 84.5 74.5 64.5 54.5 44.5 34.5
f· 2 12 22 20 14 4 1
28. Find the mean and s.d of the following distribution:
Class-interval :  4-6  6-8  8-10  10-12  12-14  14-16
Frequency      :  13   111  182   105    19     7
29. Find the standard deviation of the set of numbers in the arithmetic progression 4, 10, 16, 22, 28, ... 154.
30. Find the standard deviation for the frequency distribution of life time (in hours) of 80 light bulbs:
Life-hours :  500-600  600-700  700-800  800-900  900-1000  1000-1100
Frequency  :  6        12       25       18       14        5
31. The mean and standard deviation of marks of 70 students were
found to be 65 and 5.2 respectively. Later it was detected that the value
85 was recorded wrongly and therefore it was removed from the data
set. Then find the mean and s.d for the remaining 69 students.
32. Following are the maximum daily temperatures (°C) recorded in a week in Kolkata :
38, 40, 36, 35, 30, 32, 34.
Using the transformation property determine the s.d of the maximum daily temperature in the Fahrenheit scale.
[Hint : The relation between °C and °F is $\frac{C}{5} = \frac{F - 32}{9}$.]
33. Find the standard deviation for the distribution of duration of telephone calls in a telephone booth given below:
Time (in seconds)   Number of Calls
10-109              5
110-209             29
210-309             77
310-409             175
410-509             124
510-609             43
610-709             17
Total               470

34. The marks obtained by 10 students are
70 65 68 70 75 73 80 70 83 86.
Find the variance.
35. (a) The s.d of the first n positive integers is 2. Find n.
(b) The s.d of the first n even positive integers is $\sqrt{5}$. Find n.
(c) The s.d of the first n odd positive integers is 8. Find n.
36. The A. M. and variance of 20 observations were calculated by a
student as 20 and 5 respectively. But while calculating an item 13 was
mis-read as 30. Find the correct A. M. and variance.
37. Find the mean and standard deviation from the following grouped frequency distribution :
Weight    :  35.0-39.9  40.0-44.9  45.0-49.9  50.0-54.9  55.0-59.9  60.0-64.9  65.0-69.9
Frequency :  5          16         30         23         17         8          1
38. Compute the s.d for the following frequency distribution on average daily sales (in Rs.) of 80 salesmen of a departmental store:
Class     :  50-59  60-69  70-79  80-89  90-99  100-109  110-119
Frequency :  6      9      15     7      5      25       13
[W.B. U. Tech 2004]
39. Compute the arithmetic mean and standard deviation for the
following data :
Score 4-5 6-7 8-9 10-11 12-13 14-15
Frequency: 4 10 20 15 8 3
[WE. UTech 2002]
40. Find the standard deviation from the following frequency distribution
Salary            No. of workers
up to Rs. 10      8
up to Rs. 20      24
up to Rs. 30      56
up to Rs. 40      95
up to Rs. 50      136
up to Rs. 60      178
up to Rs. 70      192
up to Rs. 80      200        [W.B. U. Tech 2008]

41. The mean and s.d of the income of teachers in two schools are
given below:
Mean income s.d
School A 8000 1200
School B 9000 1350
Compare the variability of the incomes of the teachers in two schools.
42. Following are the scores of two batsmen, A and B in ten innings:
A : 84 73 115 36 19 7 12 6 119 29
B : 13 42 12 48 51 4 47 76 37 0
Who is the more consistent player.
43. For a class of students the height has a distribution with mean 162 cm and s.d 10 cm, and the weight has mean 57 kg and s.d 8 kg. Compare the variability of the distributions of height and weight.
44. The scores of two batsmen A and B in 10 innings are
A :  19  31  48  53  67  90  10  62  40  80
B :  32  28  47  63  71  39  10  60  96  14
Find which batsman is more consistent in scoring.
[Hint: Find $\bar{x} = 50$, $\sigma_x = 24.4$ $\therefore$ C.V of x = 49.
Similarly find $\bar{y} = 46$, $\sigma_y = 25.5$ and C.V of y = 55.]
45. The mean life in days and standard deviation for two types of
electric bulbs are given below:
Mean life in days s.d in days

Type I 310 9

Type II 260 14
Compare the relative variability of life of the two types of bulbs.
46. You are given the distribution of wages in two factories X and Y.
Wages (Rs): 50-100 100-150 150-200 200-250 250-300 300-350

No. of X 6 11 18 32 27 11

workers Y 2 9 29 54 11 5
State in which factory the wages are more variable.
47. Find the first three central moments of the following frequency distribution:
Yearly income (in lakh) :  3-6  6-9  9-12  12-15  15-18  18-21  21-24  Total
No. of families         :  28   292  389   212    59     18     2      1000
48. Find the first four central moments and the skewness, kurtosis for the variable x having the following frequency distribution:
x 21-24 25-28 29-32 33-36 37-40 41-44
f 40 90 190 110 50 20
49. Find mean, mode and standard deviation for the body weight of the children having the following frequency distribution. Hence find the measure of skewness :
Body weight     :  14·5  15·5  16·5  17·5  18·5  19·5  20·5  21·5
No. of children :  35    40    48    100   125   87    43    22
50. Find the skewness of the following distribution
Wages (x) 55-58 58-61 61-64 64-67 67-70
No. of workers: 12 17 23 18 11
51. Find the skewness and kurtosis for the following distribution:
x 4.5 14.5 24.5 34.5 44.5 54.5 64.5 74.5 84.5 94.5
f 1 5 12 22 17 9 4 3 1 1
52. From the following distribution, find the skewness and explain its
significance.
Weekly wages (in Rs) : 70-80 80-90 90-100 100-110 110-120 120-130 130-140 140-150
No. of employees     : 12    18    35     42      50      45      20      8
53. (a) Find the skewness based on mean and median from the following
frequency distribution.
Marks       No. of Students     Marks       No. of Students
Above 0          100            Above 50         50
Above 10          98            Above 60         35
Above 20          95            Above 70         23
Above 30          90            Above 80         13
Above 40          80            Above 90          5
(b) For a distribution, if the skewness is 0.42, the A.M. is 86 and the median is
80, find the coefficient of variation.
54. If $m_2'$ and $m_2$ are the second moments about 10 and about the A.M.
respectively, then prove that $m_2' = m_2 + k^2$ where $k = \bar{x} - 10$.
55. The first four moments of a distribution about 3 are 2, 10, 40 and
218. Find the moments about the origin and the moments about mean.
56. Find out kurtosis to the following data :
class interval 0-10 10-20 20-30 30-40
frequency 1 3 4 2
comment on the nature of the distribution .

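57. If $\alpha_r = m_r / \sigma^r$, where $m_r$ is the rth central moment and $\sigma$ is the standard
deviation, prove that $\alpha_1 = 0$ and $\alpha_2 = 1$.

58. If $m_r$ is the rth central moment of x and $\alpha_r$ is the rth central moment of
$z = \frac{x - \bar{x}}{\sigma}$, where $\sigma$ is the S.D. of x, then show that $\alpha_r = m_r / \sigma^r$.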
59. The first three moments of a distribution about 2 are 1, 22 and
10. Find its mean, s.d and the skewness.
60. The first three moments of a distribution about 2 are respectively
1, 16 and -40. Prove that the mean is 3, the variance is 15 and the third
central moment is -86. Also find the first three moments about 0.
61. The variance of a symmetrical distribution is 25. What must be
the value of the fourth central moment so that the distribution is
(i) leptokurtic (ii) mesokurtic (iii) platykurtic?
62. Calculate the first four central moments for the following data and
examine for the nature of the distribution:
x 1 2 3 4 5 6 7 8 9

f 1 6 13 25 30 22 9 5 2
63. Find the second, third and fourth moments about mean of the
following frequency distribution. Hence find the skewness and kurtosis
of the distribution.
Class limits : 100-104.9 105-109.9 110-114.9 115-119.9 120-124.9
Frequency    : 7         13        25        25        30
64. The first four moments about the mean of a distribution are 0,
2·5, 0·7 and 18·75. Find the skewness and kurtosis of the distribution.
65. The first four moments of a distribution about 5 are respectively
2, 20, 40 and 50. Obtain, as far as possible, the character of the
distribution in terms of skewness and kurtosis.
66. The following data are given to the manager of Good Year Tyres
company. Is the distribution platykurtic ?

$N = 100$, $\sum f_i x_i = 50$, $\sum f_i x_i^2 = 1967.2$, $\sum f_i x_i^3 = 2925.8$,
$\sum f_i x_i^4 = 86650.2$
67. A survey was conducted by a toffee manufacturing company to
enquire the maximum price at which persons would be willing to buy their
product. The following table gives the stated price (in Rs.) by 100 persons:
Price          : 1.80-1.90 1.90-2.00 2.00-2.10 2.10-2.20 2.20-2.30
No. of persons : 11        29        18        27        15
Calculate the skewness and interpret it.
68. Find the skewness for the frequency distribution given below.
Weekly wages (Rs)   No. of workers   Weekly wages (Rs)   No. of workers
23-27                     2          48-52                    16
28-32                     6          53-57                    12
33-37                     9          58-62                     6
38-42                    14          63-67                     2
43-47                    32          68-72                     1
69. For a variable, coefficient of skewness =-0·375, Mean =62,
Median = 65 . Find the variance.
70. Find the coefficient of variation of a variable whose median = 17.4,
mode = 15.3 and skewness = 0.35.

71. The mean, median and the coefficient of variation of the weekly
wages of a group of workers are respectively Rs 45, Rs 42 and 40. Find
the (i) mode (ii) variance (iii) skewness, for this distribution.
72. The distribution of the Wages of the Govt. employee in a country
is such that its first two moments about 4 are respectively 1.5 and 2.7.
The median of this distribution is 2.1. What will be the shape of the
frequency diagram of the distribution.
73. For a group of 10 items, $\sum x_i = 452$, $\sum x_i^2 = 24{,}270$ and the mode
= 43.7. Find the skewness.
74. For a moderately skewed distribution, mean is 160, mode is 157
and standard deviation is 50.
Find (i) coefficient of variation
(ii) skewness
(iii) median.
75. In a distribution, mean = 65, median = 70 and skewness = -0.6.
Find the mode and the coefficient of variation.
76. Given mean = 50, C.V. = 40%, skewness = -0.4. Find the s.d., mode
and median.
77. Consider the following distribution :
Distribution of x Distribution of y
Mean 100 90
Median 90 80
s.d 10 10
Find whether (i) distribution of x has same degree of the variation as
distribution of y. (ii) both the distribution have the same degree of
skewness.
78. The mean of a certain distribution is 50, its s.d. is 15 and skewness
is -1. Find the median.
79. The skewness of a distribution is 0.32. Its standard deviation and
mean are respectively 6.5 and 29.6. Find the mode and median.
80. In a frequency distribution the C.V. = 5, s.d. = 2 and skewness
= 0.5. Find the mean and mode of the distribution.

Answers
1. 7·6 2. 5·7 3. 80 4. 3·98 5. 501 6. 715 7. 150
8. 62·16 9. 79 10. 1 : 2; 50, 100 11. 2·1 12. 76, 38 13. 30

14. (i) $\frac{2n+1}{3}$ (ii) $\frac{n+1}{2}$ (iii) $\frac{4n^2-1}{3}$ (iv) $n+1$ (v) $2n(n+1)^2$
(vi) $\frac{3n(n+1)}{2(2n+1)}$

15.127,6; 130 16.60 17. Both are 6·5 18. (i) 2280
19. 10 20. 250, 150 22. 65 23. 1 24. (i) 1982·40
(ii) mean = 1765, median = 1533·33, mode = 1070

25. (i) 4.87 (ii) 2.16 (iii) 0.90 26. 2·82, 7·95 27. 12·3

28. 9·12, 1·93 29. 45 30. 131·29 hours 31. 64·71, 4·64

32. 5·69 34. 42·8 35. (a) 7 (b) 4 (c) 5 36. 19·15, (4·67)^2

37. Mean = 50·4, s.d. = 6·71 38. 15·41 40. Rs. 16·87

41. Same 42. B is consistent but A is better because mean of A is
higher than B
43. The variability is more for weight distribution.
44. batsman A 45. Type II more variable 46. factory X
47. 0,9·4,17·4
48. 31.30,23·04,26·11, 1496·68 s.k= 0·24, kurtosis = -0 ·18

49.18·07,18·40,-0·1951·0·713,0·787 50. -·048


52. -0·332; Negatively skewed 53. (a) 0·574 (b) 50
54. -0·197,-·7455 55.2,10,40,218; 0,6,4, 90
59. 7·2,4·4 60. 3, 24, 76

61. (i) $m_4 > 1875$ (ii) $m_4 = 1875$ (iii) $m_4 < 1875$

62. 0,2·49,0·7,18·33; the distribution is almost normal

63. 0,38·09, -110·672,3229·7057;0·4708; - 0·774


64. 0·031, mesokurtic

65. negatively skewed; platykurtic 66. Platykurtic

68. 0·057 69. 5.76 70. 0.49
71. (i) 36 (ii) 324 (iii) 0·5
72. The frequency curve is asymmetrical and has a longer tail on the right
hand side.
73. 0·08 74. (i) 31·25 (ii) 0·06 (iii) 159
75. 80,38·46 76. 20,58,52·67
77. (i) distribution of y is more variable than that of x (ii) Yes, both
have same degree of skewness 78. 0.55
79. 27·52, 28·9 80. 400, 399

MULTIPLE CHOICE QUESTIONS

1. The frequency distribution of a variable x is
x : 2 3 1 5
f : 1 2 3 1
Then its mean is
(a) 2 (b) $\frac{16}{7}$ (c) $\frac{1}{2}$ (d) $\frac{7}{16}$.
2. If the AM of 2, 6, x, 5, 7 be 4, then the value of x is
(a) -20 (b) 5 (c) 4 (d) O.
3. The AM of $2, 4, 6, \ldots, 2n$ is
(a) $n + 1$ (b) $n(n + 1)$ (c) $\frac{n+1}{2}$ (d) $\frac{n(n+1)}{2}$
4. The AM of 7, x - 2, 10, x + 3 is 9. The value of x is
(a) 0 (b) 9 (c) 18 (d) 2x + 18.
5. The mean of the following distribution is :
Marks 20-39 40-59 60-79 80-99
No of students 10 12 8 10
(a) 68.5 (b) 39.5 (c) 58.5 (d) 60

6. The standard deviation of a frequency distribution is given by

(c) l ".~Jj ',2 (X,. _ x)2 (d) 1"


N~Ji
,2 (Xi-X _)2
N I I

7. The variance of the following distribution is :

Xi
~1 0 3 4

/; 3 2 1 4

(a) 56.8 (b) .568


(c) 5.68 (d) 2.383
8. For the data $x_1, x_2, \ldots, x_n$, the variance of x is given by

1" 1" 2
(c) - ~Xi -- ~Xi (d) None.
n i n i
9. The standard deviation of the following observations is: 5, 7, 1, 2, 6, 3.
(a) 4.66 (b) 2.16
(c) 1.47 (d) none.
10. The mode of the following data is : 2, 1,3,2,1,5,2,2,1,6,4,21,3

(a) 5 (b) 2

(c) 3 (d) 1.
11. The median of the following distribution is: 7,9,5,3,10, 15,21,
19, 17
(a) 15 (b) 9
(c) 10 (d) 17.
12. The median of the following distribution is: 10, 13,9, 7, 37, 16,27,32
(a) 16 (b) 14.5
(c) 13 (d) 15.5.
13. The relation between mean, median and mode is


(a) Mode = 3 Median -2 Mean
(b) Mode = 3 Median +2 Mean
(c) Mode = 2 Median -3 Mean
(d) Mode = 2 Median +3 Mean.
14. For a moderately asymmetric distribution median = 27, mean = 26;
then mode =
(a) 133 (b) -24
(c) 29 (d) -29.
15. The AM of 20 data is calculated to be 89.4. Later the data 78 is
replaced by 87. The AM of the data after replacement is
(a) 89 (b) 85
(c) 89.85 (d) none of these.
16. The AM of six data is 0.3. The AM of these six data together with another
data is 0.5. Then the seventh data is
(a) 1.7 (b) 1
(c) 2 (d) 0.3.
17. If $10y_i = x_i - 85$ and $\bar{y} = -0.523$ then $\bar{x} =$
(a) 80 (b) 79.77
(c) 78.77 (d) 77.77.
18. AM of a group of 5 observation is 240; that of a group of 2
observation is 100. If the two group are merged into one then the A.M of
the merged group is
(a) 250 (b) 125
(c) 200 (d) none of these.
19. AM of a group of n observation is 540; that of a group of m
observation is 460. If the AM of the merged group is 520 then n:m =
(a) 2:1 (b) 2:2
(c) 3:1 (d)none of these.
20. In a simple frequency distribution of 200 items one frequency
against a data is missing; but the mean of all datas is known as 7740. The
missing frequency is
(a) 49 (b)50
(c) 48 (d) none of these.
21. Two sets of data $\{x_i\}$ and $\{y_i\}$ are related by $y_i = 1.2x_i - 31.5$. If
the median of the first set is 130 then the median of the second set is
(a) 124.5 (b) 130.5
(c) 140.5 (d) none of these.
22. The mode of the frequency distribution
X 0 1 2 3
f 8 24 36 10
is
(a) 0 (b) 1
(c) 2 (d) 3.
23. The mode of the frequency distribution
n 0 1 2 3 4
f 23 24 21 24 20
are

(a) 0 (b) 1
(c) 2 (d) 3.
24. The AM of 100 observation is 2.5. So the AM of 50 of these
observation is more than 2.5
(a) True (b) False.
25. The median of 100 observation is 2.5. So 50 of these observation is
2.5 or more
(a) True (b) False.
26. If you remove the largest data from a group of different datas, the
AM of the remaining datas always changes
(a) True (b) False.

27. If the relation between the two groups of observations $\{x_i\}$ and
$\{y_i\}$ is $3x_i + 4y_i = 21$ and if $\bar{x} = 3$ then $\bar{y} =$
(a) 1 (b) $\frac{9}{4}$
(c) 3 (d) none of these.

28. If the relation between two sets of observations $\{x_i\}$ and $\{y_i\}$ is
$x_i = 2y_i + 5$ and the median of x is 25 then the median of y is
(a) 20 (b) 10
(c) 12.5 (d) none of these.

29. If the relation between two sets of observations $\{x_i\}$ and $\{y_i\}$ is
$2y_i - 6x_i = 6$ and the mode of the 1st set is 21 then the mode of the second set
is

(a) 13 (b) 29
(c) 55 (d) none of these.
30. The numbers of observations of two groups of data are in the ratio
2 : 1 and their A.M. are 8 and 128 respectively; then the A.M. of the combined
group is
(a) 88 (b) $45\frac{1}{3}$
(c) 48 (d) none of these.
31. The mean of the observations $1, 2, 3, \ldots, n$ with frequencies
$1^2, 2^2, 3^2, \ldots, n^2$ respectively is
(a) $\frac{3n(n+1)}{2(2n+1)}$ (b) $\frac{n(n+1)}{2(2n+1)}$
(c) $\frac{3n(n+1)}{2n+1}$ (d) none of these

32. The AM of the frequency distribution


x : 5 6 7 ... 14
f : 5 6 7 ... 14
is
(a) 10 (b) 10.37
(c) 12 (d) none of these.
33. The AM of the datas 5,55,555,.·· upto nth term is

(a) -50 (1 0n -1 ) -- 5 (b) -50 ( 10n -1 ) -- 5


n 9 8In 9

(c) 8~ (IOn -1)


34. The data of the two groups $\{x_j\}$ and $\{y_j\}$ are related by
$y_j = \frac{x_j - 800}{50}$ and if the s.d. of the second group is 2.6257 then the s.d. of the
first group is
(a) 131.29 (b) 135.16
(c) 134 (d) none of these.
35. The s.d. of maximum daily temperatures in the centigrade scale is
3.16. Then the s.d. of those in the Fahrenheit scale is
(a) 5 (b) 7.1 (c) 5.69 (d) 6.69
[Hint: $\frac{C}{5} = \frac{F - 32}{9}$ is the relation between the centigrade and Fahrenheit
scales.]
36. $2x_j + y_j = 3$ is the relation between two sets of data $\{x_j\}$ and
$\{y_j\}$. If $\sigma_x = 3$ then $\sigma_y =$
(a) 3 (b) 4 (c) 6 (d) none of these.
37. The standard deviation of the data -5, -10, -12, -19, -20 is a
positive number
(a) True (b) False.
38. If the s.d. of n observations $x_1, x_2, \ldots, x_n$ is s, then the s.d.
of $-x_1, -x_2, \ldots, -x_n$ is $-s$
(a) True (b) False.
39. The variance of the first n natural numbers is
(a) $n^2 - 1$ (b) $\frac{n^2 - 1}{10}$
(c) $\frac{n^2}{12}$ (d) $\frac{n^2 - 1}{12}$
40. The variance of the frequency distribution
x : 1 2 3 ... n
f : 1 2 3 ... n
is
(a) $\frac{(n+2)(n-1)}{18}$ (b) $\frac{(n+2)(n+1)}{18}$
(c) $\frac{(n-2)(n-1)}{18}$ (d) none of these.

41. The first moment about 4 of the set of numbers 2, 4, 6, 8 is


(a) 0 (b) 1 (c) 2 (d) -2
42. The second moment about 4 of the set of numbers 2, 4, 6, 8 is
(a) 0 (b) 4 (c) 6 (d) 7
43. The third moment about 4 of the set of numbers 2, 4, 6, 8 is
(a) 1 (b) 4 (c) 16 (d) 12
44. The first four moments of a distribution are 1, 4, 10 and 46
respectively. Then the third central moment is
(a)1 (b) 3 (c) 4 (d) -1
45. Kurtosis reveals the shape of the distribution at the top:
(a) True . (b) False
46. The two halves of an asymmetrical distribution are mirror images
of each other
(a) True (b) False
47. Two distributions, with same mean, s.d and skewness, must have
same peakedness
(a) True (b) False
48. If the mean and the mode of a given distribution are equal then
skewness is
(a) 0 (b) -1 (c) 1 (d) $\infty$

49. Skewness is positive when
(a) mean < mode (b) mean = mode
(c) mean > mode (d) for any mode
50. If mean, mode and s.d. are 41, 45 and 8 respectively then skewness =
(a) 0.5 (b) 1 (c) 5 (d) -0.5
51. The skewness can not exceed
(a) -3 (b) 0 (c) 4 (d) 3
52. If kurtosis has a value less than 3 the distribution is called
(a) leptokurtic (b) mesokurtic
(c) normal (d) platykurtic

Answers
1.b 2.d 3. a 4. b 5.c 6. a 7. c

8. a 9. b 10. b 11. c 12.b 13. a 14. c

15. c 16. a 17. b 18. c 19.c 20. a 21. a

22. c 23. b d 24. b 25.a 26.a 27. c 28. b

29. d 30. c 31. a 32. b 33.b d 34. a 35. c

36. c 37. a 38. b 39. d 40.a 41. b 42. c

43. c 44. b 45. a 46. b 47. b 48. a 49. c


50. d 51. d 52. d
CORRELATION, REGRESSION, RANK CORRELATION

10.1 Introduction:
In an earlier chapter we introduced bivariate data and its distribution,
and consequently discussed the correlation between two variables. In basic
statistics we are concerned only with bivariate data and its correlation
and regression. So that this chapter can be studied independently, we
again give the idea of bivariate data with an example. Two or more random
variables may be defined on the same event space; two such random variables
describe two characters of the outcomes of the experiment. In this chapter
we are concerned with two random variables defined on an event space.
10.2 Bivariate Data
Let X, Y be two random variables defined on an event space S of an
experiment. Then the pair (X, Y) is called a bivariate on the same event
space. Such a bivariate (X, Y) assumes a pair of reals (x, y) corresponding
to an outcome in the event space, where x and y are assumed by the random
variables X and Y respectively against the same outcome. These pairs of
values (x, y) are called bivariate data.
Illustration: Let us consider the experiment of drawing a student from
the colleges of West Bengal. Then the event space S = set of all college
students of W.B. Let X and Y be two random variables defined on S such
that X(a student) = his/her height and Y(a student) = his/her weight.
Then the pair (X, Y) is a bivariate defined on S; a pair of values, say
(5.6, 63.4), means X assumes the value 5.6 and Y assumes 63.4 against a
particular student, i.e. the height and the weight of a particular college
student are 5.6 and 63.4 respectively. (5.6, 63.4) is a particular bivariate
data; only 5.6 or only 63.4 is a univariate data. So, if

x : $x_1$ $x_2$ ... $x_n$
y : $y_1$ $y_2$ ... $y_n$

then (x, y): $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ are the bivariate data.
Note: For a bivariate (X, Y) assuming the data (x, y) there may or
may not be any relation or association between x and y. For example, if
x = height and y = weight of a student then there may exist a relation
or association between x and y. But if x = height and y = I.Q. of a student
then, so far as we know, there is no relation or association between x and
y. In the subsequent sections of this chapter we will be mainly concerned
with the character 'association'.
10.3. Scatter Diagram
The diagrammatic representation of bivariate data is the scatter diagram.
Let $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ be n bivariate data.
For each $(x_i, y_i)$ we get a point $P_i$ on the X-Y plane whose co-ordinates
are $(x_i, y_i)$, as shown in the figure.

[Figure: the points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ plotted as dots on the X-Y plane]

The diagram constituted by these dots is the scatter diagram of the
bivariate data. Note that some of the points may have the
same abscissa or ordinate. So more than one dot may appear against an
abscissa. Similarly more than one dot may appear against an ordinate. See
the adjacent diagram.
Different Types of Scatter Diagram.
Let (x, y) be a bivariate data assuming the pairs of values
$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$. We get the nature of association
between x and y from the scatter diagram of the bivariate data.
If the scatter diagram is like Fig 1 then we say the association between
the two variables x and y is positive, i.e. if x increases y has a tendency
to increase also.
[Fig 1: scatter diagram showing positive association between x and y]
If the scatter diagram is like Fig 2 then we say the association between
the two variables x and y is negative, i.e. if x increases y has a tendency to
decrease.
[Fig 2: scatter diagram showing negative association between x and y]

If the scatter diagram is like Fig 3 we say the association between the
two variables x and y is 0, i.e. there is no association between the two
variables x and y.
[Fig 3: scatter diagram showing no association between x and y]

If the scatter diagram is like Fig 4 then we say the two variables x
and y are perfectly positively associated, i.e. if x increases y also increases
and never decreases at any stage. [W.B.U.Tech 2005]
[Fig 4: scatter diagram showing perfect positive association between x and y]
If the scatter diagram is like Fig 5 then we say the two variables x
and y are perfectly negatively associated, i.e. if x increases y decreases
and never increases at any stage. [W.B.U.Tech 2005]
[Fig 5: scatter diagram showing perfect negative association between x and y]
Illustration.
Suppose the bivariate data assumed by the bivariate (X, Y) are given
below:
x : 1   1.8 2   2.5 3   3.5 3.8 4.4 5   5.5
y : 0.9 1.6 2.1 2.1 2.4 2.3 3.1 3   3.5 3.8
We plot the points (1, 0.9), (1.8, 1.6), (2, 2.1) and so on. Thus the
scatter diagram of these data is shown below:
[Figure: scatter diagram of the above data, with both axes marked from 1 to 6]

From the scatter diagram we see x and y are positively associated but
not perfectly. As x increases y has a tendency to increase; at some stages
it decreases.
10.4. Correlation and its Different Types.
Let us have a bivariate data (x, y). The degree of association or the
strength of relationship between the two variables x and y is called the
correlation between the two random variables X and Yor the correlation
between their assumed datas x and y.
If y has a tendency to increase as x increases we say x and y are
positively correlated. That is, if the scatter diagram is like Fig 1 of the
previous Article then we say x and y are positively correlated.
If y has a tendency to decrease as x increases we say x and y are
negatively correlated. If the scatter diagram is like Fig 2 of the previous Article
then x and y are negatively correlated.
If the values of y are not affected by the changes in the values of x
then we say x and y are uncorrelated or zero correlated.
Illustrations.
Let (X; Y) and (Z, W) be two bi-variates where the datas assumed by
them are
x 1 2 3 4 5 6
y 2.5 1.5 3.5 2.5 3.5 1.5
and
z 1 2 3 4 5 6
w 2.5 3 4.5 4 5.6 6
From the two sets of data we see the degree of association between x
and y is less than that between z and w. This becomes clearer if we
draw their scatter diagrams. In this case we say the 'correlation of x and
y' is less than the 'correlation of z and w'.

Thus a measurement of correlation between two variates becomes
necessary to get an idea of the association between two variates. In the next
section we introduce such a measurement tool.
10.5. Correlation Coefficient
The correlation between two variates is measured numerically by the
correlation coefficient, which is defined below.

Co-Variance. Let (x, y) be a bivariate data assuming the n pairs of values
x : $x_1$ $x_2$ $x_3$ ... $x_n$
y : $y_1$ $y_2$ $y_3$ ... $y_n$
Then the covariance of x and y is
\[ \mathrm{Cov}(x, y) = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}) \]
where $\bar{x}$ and $\bar{y}$ are the arithmetic means of the values assumed by x and y
respectively, i.e. $\bar{x} = \frac{1}{n}(x_1 + x_2 + \cdots + x_n)$ and $\bar{y} = \frac{1}{n}(y_1 + y_2 + \cdots + y_n)$.
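Theorem 1. For the bivariate data (x, y)
\[ \mathrm{Cov}(x, y) = \frac{1}{n}\sum_{i=1}^{n} x_i y_i - \left(\frac{1}{n}\sum_{i=1}^{n} x_i\right)\left(\frac{1}{n}\sum_{i=1}^{n} y_i\right) \]

Proof: By definition
\begin{align*}
\mathrm{Cov}(x, y) &= \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}) \\
&= \frac{1}{n}\sum_{i=1}^{n}\left(x_i y_i - x_i\bar{y} - \bar{x}y_i + \bar{x}\,\bar{y}\right) \\
&= \frac{1}{n}\sum x_i y_i - \bar{y}\,\frac{1}{n}\sum x_i - \bar{x}\,\frac{1}{n}\sum y_i + \bar{x}\,\bar{y}\,\frac{1}{n}\sum 1 \\
&= \frac{1}{n}\sum x_i y_i - \bar{y}\,\bar{x} - \bar{x}\,\bar{y} + \bar{x}\,\bar{y}\cdot\frac{1}{n}\cdot n \\
&= \frac{1}{n}\sum x_i y_i - \bar{x}\,\bar{y}
 = \frac{1}{n}\sum_{i=1}^{n} x_i y_i - \left(\frac{1}{n}\sum_{i=1}^{n} x_i\right)\left(\frac{1}{n}\sum_{i=1}^{n} y_i\right)
\end{align*}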
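Illustration. For the following bivariate data

x : -2 -1 0 1 2
y :  4  1 0 1 4

we have
$\sum_{i=1}^{5} x_i = -2 - 1 + 0 + 1 + 2 = 0$
$\sum_{i=1}^{5} y_i = 4 + 1 + 0 + 1 + 4 = 10$
$\sum_{i=1}^{5} x_i y_i = (-2) \times 4 + (-1) \times 1 + 0 \times 0 + 1 \times 1 + 2 \times 4 = -8 - 1 + 1 + 8 = 0$.
Therefore their covariance
$= \frac{1}{5} \times 0 - \left(\frac{1}{5} \times 0\right)\left(\frac{1}{5} \times 10\right) = 0$.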
Correlation Coefficient.
Let (x, y) be a bivariate data assumed by a bivariate (X, Y). Then the
correlation coefficient of x and y is denoted by $r_{xy}$ and is defined by
\[ r_{xy} = \frac{\mathrm{Cov}(x, y)}{\sigma_x \sigma_y} \]
where Cov(x, y) is the covariance of x and y; $\sigma_x$ and $\sigma_y$ are the standard
deviations of the values of x and y respectively.
This definition was given by Pearson and so this is also known as
Pearson's Product Moment Correlation Coefficient.
Illustration. We illustrate the definition of the correlation coefficient with
a small set of bivariate data:
x : -6 -4 -3 -1 1 2 4 7
y : -4 -3 -1 -1 0 2 3 4

(To deal with big values we need the help of the subsequent theorems.)
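It is convenient to calculate using a table:

  x     y     xy    x^2   y^2
 -6    -4     24    36    16
 -4    -3     12    16     9
 -3    -1      3     9     1
 -1    -1      1     1     1
  1     0      0     1     0
  2     2      4     4     4
  4     3     12    16     9
  7     4     28    49    16
Total:
  0     0     84   132    56

Here n = 8, $\sum x_i = 0$, $\sum y_i = 0$, $\sum x_i y_i = 84$, $\sum x_i^2 = 132$, $\sum y_i^2 = 56$.

\[ \mathrm{Cov}(x, y) = \frac{1}{n}\sum x_i y_i - \left(\frac{1}{n}\sum x_i\right)\left(\frac{1}{n}\sum y_i\right) = \frac{1}{8} \times 84 - \left(\frac{1}{8} \times 0\right)\left(\frac{1}{8} \times 0\right) = \frac{21}{2} \]
\[ \sigma_x^2 = \frac{1}{n}\sum x_i^2 - \left(\frac{1}{n}\sum x_i\right)^2 = \frac{1}{8} \times 132 - \left(\frac{1}{8} \times 0\right)^2 = \frac{33}{2} \]
\[ \sigma_y^2 = \frac{1}{n}\sum y_i^2 - \left(\frac{1}{n}\sum y_i\right)^2 = \frac{1}{8} \times 56 - \left(\frac{1}{8} \times 0\right)^2 = 7 \]

Thus the correlation coefficient between x and y is
\[ r_{xy} = \frac{\mathrm{Cov}(x, y)}{\sigma_x \sigma_y} = \frac{21/2}{\sqrt{33/2}\,\sqrt{7}} = 0.977 \]

We shall see in a subsequent theorem that the maximum value of $r_{xy}$ is 1.
Since 0.977 is very near to 1 we understand that these two variables x
and y are highly correlated, that is, the strength of association between them
is very high.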
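The tabular calculation above can be reproduced in a few lines. The sketch below assumes the same population formulas (division by n) used in the text; `pearson_r` is a hypothetical helper name.

```python
import math

def pearson_r(xs, ys):
    """Product moment correlation coefficient from the running totals used in the table."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y
    var_x = sum(x * x for x in xs) / n - mean_x ** 2
    var_y = sum(y * y for y in ys) / n - mean_y ** 2
    return cov / math.sqrt(var_x * var_y)

xs = [-6, -4, -3, -1, 1, 2, 4, 7]
ys = [-4, -3, -1, -1, 0, 2, 3, 4]

r = pearson_r(xs, ys)
print(round(r, 3))
```

For these data the intermediate values are Cov = 10.5, variance of x = 16.5 and variance of y = 7, matching the table.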
Note:
(1) $r_{xy} = r_{yx}$; this is obvious.

(2) Since Cov(x, y) and $\sigma_x \sigma_y$ have the same unit, $r_{xy}$ is a pure number.

(3) Below we give three scatter diagrams of the bivariate data (x, y) against
which the different signs of $r_{xy}$ are shown:
[Figure: three scatter diagrams, labelled "Here $r_{xy} > 0$", "Here $r_{xy} < 0$" and "Here $r_{xy} = 0$"]

(4) The greater the value of $r_{xy}$, the greater the strength of association between x and y.

(5) Obviously $r_{xx} = 1$ and $r_{x(-x)} = -1$.

(6) If $r_{xy} = 0$ we say the two variables x and y are uncorrelated.

(7) The following theorem will help to find the correlation coefficient
for large bivariate data.
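Theorem 2. Let (x, y) and (u, v) represent two sets of bivariate data
such that u = ax + b and v = cy + d; then
\[ r_{uv} = \frac{ac}{|a||c|}\, r_{xy} \]
where a, b, c, d are constants.

Proof: Since u = ax + b and v = cy + d, we have $\bar{u} = a\bar{x} + b$ and $\bar{v} = c\bar{y} + d$.
\begin{align*}
\mathrm{Var}(u) &= \frac{1}{n}\sum_{i=1}^{n}(u_i - \bar{u})^2 = \frac{1}{n}\sum_{i=1}^{n}\{(ax_i + b) - (a\bar{x} + b)\}^2 \\
&= \frac{1}{n}\sum_{i=1}^{n} a^2 (x_i - \bar{x})^2 = a^2\,\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 = a^2\,\mathrm{Var}(x)
\end{align*}
Hence $\sigma_u = |a|\sigma_x$. Similarly $\sigma_v = |c|\sigma_y$.
\begin{align*}
\text{Now, } \mathrm{Cov}(u, v) &= \frac{1}{n}\sum_{i=1}^{n}(u_i - \bar{u})(v_i - \bar{v}) \\
&= \frac{1}{n}\sum_{i=1}^{n}(ax_i + b - a\bar{x} - b)(cy_i + d - c\bar{y} - d) \\
&= \frac{1}{n}\sum_{i=1}^{n}(ax_i - a\bar{x})(cy_i - c\bar{y}) \\
&= \frac{1}{n}\sum_{i=1}^{n} ac\,(x_i - \bar{x})(y_i - \bar{y}) \\
&= ac\,\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}) = ac\,\mathrm{Cov}(x, y)
\end{align*}
Therefore,
\[ r_{uv} = \frac{\mathrm{Cov}(u, v)}{\sigma_u \sigma_v} = \frac{ac\,\mathrm{Cov}(x, y)}{|a|\sigma_x\,|c|\sigma_y} \]
or,
\[ r_{uv} = \frac{ac}{|a||c|}\,\frac{\mathrm{Cov}(x, y)}{\sigma_x \sigma_y} = \frac{ac}{|a||c|}\, r_{xy}. \]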

Theorem 3. Let (x, y) and (u, v) represent two sets of bivariate data
such that u = ax + b and v = cy + d. If a, c have the same sign then $r_{uv} = r_{xy}$.
Proof: This is deduced directly from the previous Theorem 2.
There we get
\[ r_{uv} = \frac{ac}{|a||c|}\, r_{xy}. \]
If a, c are both positive, |a| = a, |c| = c.
So, from above, $r_{uv} = \frac{ac}{ac}\, r_{xy} = r_{xy}$.
If a, c are both negative, |a| = -a, |c| = -c.
So, from above, $r_{uv} = \frac{ac}{(-a)(-c)}\, r_{xy} = r_{xy}$.

Theorem 4. Let (x, y) and (u, v) represent two sets of bivariate data
such that u = ax + b and v = cy + d. If a, c have opposite signs then
$r_{uv} = -r_{xy}$.
Proof: From Theorem 2 we deduce $r_{uv} = \frac{ac}{|a||c|}\, r_{xy}$.
If a is positive and c negative then |a| = a, |c| = -c. So from above
we have $r_{uv} = \frac{ac}{a(-c)}\, r_{xy} = -r_{xy}$.
Similarly if a < 0 and c > 0, $r_{uv} = \frac{ac}{(-a)c}\, r_{xy} = -r_{xy}$.
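Theorems 2 to 4 can be illustrated numerically: a change of origin and scale leaves the correlation coefficient unchanged when a and c have the same sign, and reverses its sign otherwise. The sketch below uses arbitrarily chosen constants and hypothetical helper names (`pearson_r`, `transformed_r`); it is a check on one data set, not a proof.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient computed from deviations about the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sd_x = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    return cov / (sd_x * sd_y)

def transformed_r(xs, ys, a, b, c, d):
    """Correlation of u = a*x + b and v = c*y + d."""
    us = [a * x + b for x in xs]
    vs = [c * y + d for y in ys]
    return pearson_r(us, vs)

xs = [-6, -4, -3, -1, 1, 2, 4, 7]
ys = [-4, -3, -1, -1, 0, 2, 3, 4]

r_xy = pearson_r(xs, ys)
r_same_sign = transformed_r(xs, ys, 2, 5, 3, -1)       # a > 0, c > 0
r_both_negative = transformed_r(xs, ys, -2, 5, -3, 1)  # a < 0, c < 0
r_opposite = transformed_r(xs, ys, -3, 5, 2, -7)       # a < 0, c > 0
print(r_xy, r_same_sign, r_both_negative, r_opposite)
```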
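Theorem 5. If (x, y) represents bivariate data then x + y and x - y are
two variates. Then
\[ \mathrm{Var}(x + y) = \sigma_x^2 + \sigma_y^2 + 2\sigma_x\sigma_y r_{xy} \]
\[ \mathrm{Var}(x - y) = \sigma_x^2 + \sigma_y^2 - 2\sigma_x\sigma_y r_{xy} \]
where $\sigma$ stands for s.d. and $r_{xy}$ stands for the correlation coefficient.

Proof:
\begin{align*}
\mathrm{Var}(x + y) &= \frac{1}{n}\sum_{i=1}^{n}\{(x_i + y_i) - \overline{x + y}\}^2 \\
&= \frac{1}{n}\sum_{i=1}^{n}\{x_i + y_i - \bar{x} - \bar{y}\}^2 \qquad [\because\ \overline{x + y} = \bar{x} + \bar{y}] \\
&= \frac{1}{n}\sum_{i=1}^{n}\{(x_i - \bar{x}) + (y_i - \bar{y})\}^2 \\
&= \frac{1}{n}\sum_{i=1}^{n}\{(x_i - \bar{x})^2 + (y_i - \bar{y})^2 + 2(x_i - \bar{x})(y_i - \bar{y})\} \\
&= \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 + \frac{1}{n}\sum_{i=1}^{n}(y_i - \bar{y})^2 + \frac{2}{n}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}) \\
&= \mathrm{Var}(x) + \mathrm{Var}(y) + 2\,\mathrm{Cov}(x, y) \\
&= \sigma_x^2 + \sigma_y^2 + 2\,\frac{\mathrm{Cov}(x, y)}{\sigma_x\sigma_y}\,\sigma_x\sigma_y \\
&= \sigma_x^2 + \sigma_y^2 + 2\sigma_x\sigma_y r_{xy}
\end{align*}
The proof of the second part is similar.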
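As a numerical check of Theorem 5, the variances of x + y and x - y can be computed directly and compared with the right-hand sides. This is a sketch on one small data set with hypothetical helper names; population variances (division by n) are assumed, as in the text.

```python
import math

def mean(values):
    return sum(values) / len(values)

def variance(values):
    """Population variance: mean squared deviation from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def covariance(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

xs = [-6, -4, -3, -1, 1, 2, 4, 7]
ys = [-4, -3, -1, -1, 0, 2, 3, 4]

sd_x, sd_y = math.sqrt(variance(xs)), math.sqrt(variance(ys))
r_xy = covariance(xs, ys) / (sd_x * sd_y)

# Left-hand sides: variances of the sum and difference computed directly.
var_sum_direct = variance([x + y for x, y in zip(xs, ys)])
var_diff_direct = variance([x - y for x, y in zip(xs, ys)])

# Right-hand sides of Theorem 5.
var_sum_formula = sd_x ** 2 + sd_y ** 2 + 2 * sd_x * sd_y * r_xy
var_diff_formula = sd_x ** 2 + sd_y ** 2 - 2 * sd_x * sd_y * r_xy
print(var_sum_direct, var_sum_formula, var_diff_direct, var_diff_formula)
```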
Corollary: (1) If the two variates x and y are uncorrelated then $r_{xy} = 0$
and hence
\[ \mathrm{Var}(x + y) = \sigma_x^2 + \sigma_y^2 = \mathrm{Var}(x - y) \]
(2) $\mathrm{Var}(x + y) = \sigma_x^2 + \sigma_y^2 + 2\,\mathrm{Cov}(x, y)$
$\mathrm{Var}(x - y) = \sigma_x^2 + \sigma_y^2 - 2\,\mathrm{Cov}(x, y)$
Theorem 6. Let x and y be two variables whose means are $\bar{x}, \bar{y}$ and s.d.
are $\sigma_x, \sigma_y$ respectively. If $u = \frac{x - \bar{x}}{\sigma_x}$, $v = \frac{y - \bar{y}}{\sigma_y}$ then $r_{xy} = \mathrm{Cov}(u, v)$.

Proof: Left as an exercise. Since $\sigma_u = \frac{1}{\sigma_x}\sigma_x = 1$, this is nothing but an
application of Theorem 2.

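Theorem 7. For any bivariate data given by (x, y), $-1 \le r_{xy} \le 1$, where
$r_{xy}$ is the correlation coefficient of x and y.

Proof: Let (u, v) represent another bivariate data such that
\[ u = \frac{x - \bar{x}}{\sigma_x}, \qquad v = \frac{y - \bar{y}}{\sigma_y}. \]
Then we have, from the previous theorem,
\[ r_{xy} = \mathrm{Cov}(u, v) \qquad \ldots (1) \]
Now,
\[ \frac{1}{n}\sum_{i=1}^{n} u_i^2 = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{x_i - \bar{x}}{\sigma_x}\right)^2 = \frac{1}{\sigma_x^2}\,\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \frac{1}{\sigma_x^2}\,\sigma_x^2 = 1 \qquad \ldots (2) \]
Similarly,
\[ \frac{1}{n}\sum_{i=1}^{n} v_i^2 = 1 \qquad \ldots (3) \]
Now since $(u_i + v_i)^2$ is non-negative,
$(u_i + v_i)^2 \ge 0$
or, $u_i^2 + 2u_i v_i + v_i^2 \ge 0$
or, $\sum_{i=1}^{n} u_i^2 + \sum_{i=1}^{n} 2u_i v_i + \sum_{i=1}^{n} v_i^2 \ge 0$
or, $\frac{1}{n}\sum u_i^2 + \frac{2}{n}\sum u_i v_i + \frac{1}{n}\sum v_i^2 \ge 0$
or, $1 + \frac{2}{n}\sum u_i v_i + 1 \ge 0$, from (2) and (3)
or, $\frac{1}{n}\sum_{i=1}^{n} u_i v_i \ge -1 \qquad \ldots (4)$
Now, $\bar{u} = \frac{\bar{x} - \bar{x}}{\sigma_x} = 0$ and $\bar{v} = \frac{\bar{y} - \bar{y}}{\sigma_y} = 0$.
\[ \therefore\ \mathrm{Cov}(u, v) = \frac{1}{n}\sum (u_i - \bar{u})(v_i - \bar{v}) = \frac{1}{n}\sum_{i=1}^{n} u_i v_i \]
So, from (4) we get $\mathrm{Cov}(u, v) \ge -1$.
So, from (1) we have $r_{xy} \ge -1$.

Similarly, starting from the fact $(u_i - v_i)^2 \ge 0$
we get $\frac{1}{n}\sum u_i v_i \le 1$ and consequently $r_{xy} \le 1$.
Thus $-1 \le r_{xy} \le 1$.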
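Theorem 8. If $r_{xy} = \pm 1$ (the maximum and minimum values of the correlation
coefficient) then y is a linear function of x and vice-versa. [W.B.U.Tech. 2008]
Proof: Let $u = \frac{x - \bar{x}}{\sigma_x}$, $v = \frac{y - \bar{y}}{\sigma_y}$.
First part: Let $r_{xy} = \pm 1$.
Now,
\begin{align*}
\frac{1}{n}\sum (u_i \pm v_i)^2 &= \frac{1}{n}\sum u_i^2 \pm \frac{2}{n}\sum u_i v_i + \frac{1}{n}\sum v_i^2 = 1 \pm 2\,\mathrm{Cov}(u, v) + 1 \\
&\qquad \left(\frac{1}{n}\sum u_i^2 = \frac{1}{n}\sum v_i^2 = 1;\ \text{this was proved in the previous theorem}\right) \\
&= 2 \pm 2\,\mathrm{Cov}(u, v) \\
&= 2 \pm 2 r_{xy} \qquad \ldots (1) \qquad [r_{xy} = \mathrm{Cov}(u, v);\ \text{this is proved in Theorem 6}]
\end{align*}
If $r_{xy} = 1$ take the '-' sign and if $r_{xy} = -1$ take the '+' sign in (1).
In either case the right-hand side of (1) is 0, so with the chosen sign we have from (1)
\[ \sum_{i=1}^{n}(u_i \pm v_i)^2 = 0 \quad\text{or,}\quad (u_i \pm v_i)^2 = 0 \text{ for all values of } i \]
or, $u_i \pm v_i = 0$ for all values of i.
So, u and v are related by the equation
\[ u \pm v = 0 \quad\text{or,}\quad \frac{x - \bar{x}}{\sigma_x} \pm \frac{y - \bar{y}}{\sigma_y} = 0, \]
that is, $y = \bar{y} + \frac{\sigma_y}{\sigma_x}(x - \bar{x})$ when $r_{xy} = 1$ and $y = \bar{y} - \frac{\sigma_y}{\sigma_x}(x - \bar{x})$ when $r_{xy} = -1$,
which is a linear function.
Second part (converse case):
Let y be a linear function of x. Then y = ax + b where a, b are constants and $a \ne 0$.
Think of x and y as $x = 1 \cdot x + 0$, $y = ax + b$.
Then, by Theorem 2,
\[ r_{xy} = \frac{1 \cdot a}{|1||a|}\, r_{xx} = \frac{a}{|a|}\, r_{xx} \qquad \ldots (1) \]
Now,
\[ r_{xx} = \frac{\mathrm{Cov}(x, x)}{\sigma_x \sigma_x} = \frac{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})(x_i - \bar{x})}{\sigma_x^2} = \frac{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2}{\sigma_x^2} = \frac{\sigma_x^2}{\sigma_x^2} = 1. \]
Then from (1) we get $r_{xy} = \frac{a}{|a|}$.
If a > 0, $r_{xy} = \frac{a}{a} = 1$.
If a < 0, $r_{xy} = \frac{a}{-a} = -1$.
Thus $r_{xy} = \pm 1$.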
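Theorem 9. If the two variables x and y are independent then they are
uncorrelated.
Proof: If x and y are independent then it can be shown that
\[ \frac{1}{n}\sum x_i y_i = \left(\frac{1}{n}\sum x_i\right)\left(\frac{1}{n}\sum y_i\right). \]
Then
\begin{align*}
\mathrm{Cov}(x, y) &= \frac{1}{n}\sum x_i y_i - \left(\frac{1}{n}\sum x_i\right)\left(\frac{1}{n}\sum y_i\right) \\
&= \left(\frac{1}{n}\sum x_i\right)\left(\frac{1}{n}\sum y_i\right) - \left(\frac{1}{n}\sum x_i\right)\left(\frac{1}{n}\sum y_i\right) = 0.
\end{align*}
So, $r_{xy} = \frac{0}{\sigma_x\sigma_y} = 0$, i.e. x and y are uncorrelated.
The converse of the above theorem is not true. This can be observed in the
following example.
Example (Two variables are uncorrelated but they may be dependent).
Let x and y be two variables related by $y = x^2$. ... (1)
Then a set of bivariate data given by (x, y) is
x : -2 -1 0 1 2
y :  4  1 0 1 4
In this case,
$\bar{x} = \frac{1}{5}(-2 - 1 + 0 + 1 + 2) = 0$
$\bar{y} = \frac{1}{5}(4 + 1 + 0 + 1 + 4) = 2$
\[ \mathrm{Cov}(x, y) = \frac{1}{5}\sum_{i=1}^{5} x_i y_i - \left(\frac{1}{5}\sum_{i=1}^{5} x_i\right)\left(\frac{1}{5}\sum_{i=1}^{5} y_i\right) = \frac{1}{5}(-8 - 1 + 0 + 1 + 8) - 0 \times 2 = 0. \]
\[ \therefore\ r_{xy} = \frac{\mathrm{Cov}(x, y)}{\sigma_x\sigma_y} = \frac{0}{\sigma_x\sigma_y} = 0 \]
Here we see x and y are uncorrelated though they are related as $y = x^2$.
The graph of $y = x^2$ makes this clearer.
[Figure: graph of $y = x^2$]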
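The example can be confirmed by direct computation: y is completely determined by x, yet the correlation coefficient is zero. The following is a minimal sketch with a hypothetical helper name `pearson_r`.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient computed from deviations about the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sd_x = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    return cov / (sd_x * sd_y)

xs = [-2, -1, 0, 1, 2]
ys = [x * x for x in xs]          # y is an exact function of x

r = pearson_r(xs, ys)
is_function_of_x = all(y == x * x for x, y in zip(xs, ys))
print(r, is_function_of_x)
```

The positive and negative deviations of x pair with equal deviations of y and cancel, so the coefficient detects no linear association even though the dependence is exact.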

Theorem 10. Let (x, y) represent a bivariate data. If (u, v) is related
to (x, y) as
$u = a_1 x + b_1 y + c_1$;
$v = a_2 x + b_2 y + c_2$
then $\mathrm{Cov}(u, v) = a_1 a_2\,\mathrm{Var}(x) + (a_1 b_2 + b_1 a_2)\,\mathrm{Cov}(x, y) + b_1 b_2\,\mathrm{Var}(y)$
where $a_1, b_1, c_1, a_2, b_2, c_2$ are all constants.
Proof: Beyond the scope of the book.
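Although the proof is omitted, the identity of Theorem 10 can at least be checked numerically. The sketch below uses arbitrarily chosen constants and a small data set; the names are hypothetical and the check is an illustration, not a proof.

```python
def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Population covariance; covariance(xs, xs) is the variance of xs."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

xs = [-6, -4, -3, -1, 1, 2, 4, 7]
ys = [-4, -3, -1, -1, 0, 2, 3, 4]

# Arbitrary constants chosen only for this check.
a1, b1, c1 = 2, -1, 3
a2, b2, c2 = 1, 4, -5

us = [a1 * x + b1 * y + c1 for x, y in zip(xs, ys)]
vs = [a2 * x + b2 * y + c2 for x, y in zip(xs, ys)]

lhs = covariance(us, vs)
rhs = (a1 * a2 * covariance(xs, xs)
       + (a1 * b2 + b1 * a2) * covariance(xs, ys)
       + b1 * b2 * covariance(ys, ys))
print(lhs, rhs)
```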

Illustrative Examples
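Example 1. Find the correlation coefficient between the two variables x
and y where the bivariate data are given by:
x : 65 66 67 67 68 69 70 72
y : 67 68 65 68 72 72 69 71
Since the data are large we subtract 67 from x and 68 from y to
make the data small.
Calculation of Correlation Coefficient.

x_i    y_i    u_i = x_i - 67    v_i = y_i - 68    u_i^2    v_i^2    u_i v_i
65     67         -2                -1              4        1        2
66     68         -1                 0              1        0        0
67     65          0                -3              0        9        0
67     68          0                 0              0        0        0
68     72          1                 4              1       16        4
69     72          2                 4              4       16        8
70     69          3                 1              9        1        3
72     71          5                 3             25        9       15
Total   -          8                 8             44       52       32

Here n = 8.
\[ \mathrm{Cov}(u, v) = \frac{1}{n}\sum u_i v_i - \left(\frac{1}{n}\sum u_i\right)\left(\frac{1}{n}\sum v_i\right) = \frac{1}{8} \times 32 - \left(\frac{1}{8} \times 8\right)\left(\frac{1}{8} \times 8\right) = 3 \]
\[ \sigma_u^2 = \frac{1}{n}\sum u_i^2 - \left(\frac{1}{n}\sum u_i\right)^2 = \frac{1}{8} \times 44 - \left(\frac{1}{8} \times 8\right)^2 = 4.5 \]
\[ \sigma_v^2 = \frac{1}{n}\sum v_i^2 - \left(\frac{1}{n}\sum v_i\right)^2 = \frac{1}{8} \times 52 - \left(\frac{1}{8} \times 8\right)^2 = 5.5 \]
\[ \therefore\ r_{uv} = \frac{\mathrm{Cov}(u, v)}{\sigma_u\sigma_v} = \frac{3}{\sqrt{4.5}\,\sqrt{5.5}} = 0.603 \]
Since u = x - 67 and v = y - 68, therefore $r_{uv} = \frac{1 \times 1}{|1||1|}\, r_{xy}$ by Theorem 2,
or, $r_{uv} = r_{xy}$. So, $r_{xy} = 0.603$ also.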
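The effect of the change of origin in Example 1 can be verified by computing the coefficient on both the original and the shifted data. This is a sketch with a hypothetical helper name `pearson_r`; deviations about the mean are used so that the large original values cause no loss of accuracy.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient computed from deviations about the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sd_x = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    return cov / (sd_x * sd_y)

xs = [65, 66, 67, 67, 68, 69, 70, 72]
ys = [67, 68, 65, 68, 72, 72, 69, 71]

# Shifted values u = x - 67 and v = y - 68, as in the table.
us = [x - 67 for x in xs]
vs = [y - 68 for y in ys]

r_xy = pearson_r(xs, ys)
r_uv = pearson_r(us, vs)
print(round(r_xy, 3), round(r_uv, 3))
```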
Example 2. What would the scatter diagram look like if $r_{xy} = -1$?
In a previous theorem we have seen that if $r_{xy} = -1$ the relation between x
and y is $y = \bar{y} - \frac{\sigma_y}{\sigma_x}(x - \bar{x})$, a straight line having negative gradient $-\frac{\sigma_y}{\sigma_x}$.
So the scatter diagram would be like the adjacent figure. The dots, with
strictly decreasing ordinates, will lie exactly on a straight line.
Example 3. Prove that the magnitude of the correlation coefficient of x and
y does not depend on change of origin and scale.
Let x and y be changed to u and v such that $u = \frac{x - a}{b}$, $v = \frac{y - c}{d}$.
From Theorem 2 we get $r_{uv} = \frac{bd}{|b||d|}\, r_{xy}$.
This implies $|r_{uv}| = |r_{xy}|$. Hence proved.

Example 4. If u + 3x = 5, 2y - v = 7 and the correlation coefficient of x and y
is 0.12 then find the correlation coefficient of u and v.
From the given relations we get
u = -3x + 5, v = 2y - 7
\[ \therefore\ r_{uv} = \frac{(-3) \times 2}{|-3| \times |2|}\, r_{xy} = -r_{xy} = -0.12 \]
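Example 5. If x and y are two variables, determine the value of k such
that $x + ky$ and $x + \frac{\sigma_x}{\sigma_y}\, y$ are uncorrelated.
Let $u = x + ky$ and $v = x + \frac{\sigma_x}{\sigma_y}\, y$.
Since $r_{uv} = 0$, so Cov(u, v) = 0 ... (1)
Now, by Theorem 10,
\[ \mathrm{Cov}(u, v) = 1 \times 1 \times \mathrm{Var}(x) + \left(1 \cdot \frac{\sigma_x}{\sigma_y} + k \cdot 1\right)\mathrm{Cov}(x, y) + k\,\frac{\sigma_x}{\sigma_y}\,\mathrm{Var}(y) \]
or, $0 = \sigma_x^2 + \left(\frac{\sigma_x}{\sigma_y} + k\right)\mathrm{Cov}(x, y) + k\,\frac{\sigma_x}{\sigma_y}\,\sigma_y^2$
or, $0 = \sigma_x^2 + \left(\frac{\sigma_x}{\sigma_y} + k\right)\sigma_x\sigma_y r_{xy} + k\,\sigma_x\sigma_y$
or, $0 = \sigma_x^2(1 + r_{xy}) + k\,\sigma_x\sigma_y(1 + r_{xy})$
or, $k = -\frac{\sigma_x^2(1 + r_{xy})}{\sigma_x\sigma_y(1 + r_{xy})} = -\frac{\sigma_x}{\sigma_y}$ (provided $r_{xy} \ne -1$).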
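Example 6. While calculating the coefficient of correlation between two
variables x and y the following results were found:

$$\sum_{i=1}^{25} x_i = 125,\quad \sum_{i=1}^{25} y_i = 100,\quad \sum_{i=1}^{25} x_i^2 = 650,\quad \sum_{i=1}^{25} y_i^2 = 460,\quad \sum_{i=1}^{25} x_i y_i = 508.$$

Later it was discovered at the time of checking that two pairs of
observations (x, y) were copied wrongly as (6, 14) and (8, 6) while the
correct values were (8, 12) and (6, 8) respectively. Determine the
correlation coefficient between x and y.

Corrected $\sum x_i = 125 - (6 + 8) + (8 + 6) = 125$
Corrected $\sum y_i = 100 - (14 + 6) + (12 + 8) = 100$
Corrected $\sum x_i^2 = 650 - (6^2 + 8^2) + (8^2 + 6^2) = 650$
Corrected $\sum y_i^2 = 460 - (14^2 + 6^2) + (12^2 + 8^2) = 436$
Corrected $\sum x_i y_i = 508 - (6 \times 14 + 8 \times 6) + (8 \times 12 + 6 \times 8) = 520$

$$\therefore\ \text{Corrected } \operatorname{Cov}(x, y) = \frac{1}{n}\sum x_i y_i - \Big(\frac{1}{n}\sum x_i\Big)\Big(\frac{1}{n}\sum y_i\Big) = \frac{1}{25}\times 520 - \Big(\frac{1}{25}\times 125\Big)\Big(\frac{1}{25}\times 100\Big) = \frac{104}{5} - 5 \times 4 = \frac{4}{5}$$

$$\sigma_x^2 = \frac{1}{n}\sum x_i^2 - \Big(\frac{1}{n}\sum x_i\Big)^2 = \frac{1}{25}\times 650 - \Big(\frac{1}{25}\times 125\Big)^2 = 1$$

$$\sigma_y^2 = \frac{1}{n}\sum y_i^2 - \Big(\frac{1}{n}\sum y_i\Big)^2 = \frac{436}{25} - \frac{400}{25} = \frac{36}{25}$$

$\therefore$ the correlation coefficient, after correction, is

$$r_{xy} = \frac{\operatorname{Cov}(x, y)}{\sigma_x \sigma_y} = \frac{4/5}{\sqrt{1} \times \sqrt{36/25}} = \frac{2}{3}.$$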
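The correction procedure of this example can be checked with a short Python sketch. It assumes the wrongly copied pairs are (6, 14) and (8, 6), which is what the corrected sums in the solution imply; the variable names are illustrative only.

```python
from math import sqrt

n = 25
# Recorded totals: sum x, sum y, sum x^2, sum y^2, sum xy
sx, sy, sxx, syy, sxy = 125, 100, 650, 460, 508

wrong_pairs = [(6, 14), (8, 6)]   # pairs as wrongly copied (assumed, see lead-in)
right_pairs = [(8, 12), (6, 8)]   # correct pairs

# Remove the contribution of each wrong pair from every total ...
for x, y in wrong_pairs:
    sx, sy = sx - x, sy - y
    sxx, syy, sxy = sxx - x * x, syy - y * y, sxy - x * y

# ... and add the contribution of each correct pair.
for x, y in right_pairs:
    sx, sy = sx + x, sy + y
    sxx, syy, sxy = sxx + x * x, syy + y * y, sxy + x * y

cov = sxy / n - (sx / n) * (sy / n)
var_x = sxx / n - (sx / n) ** 2
var_y = syy / n - (sy / n) ** 2
r = cov / sqrt(var_x * var_y)

print(sx, sy, sxx, syy, sxy)
print(round(r, 4))
```

Working from the totals rather than the raw observations mirrors the textbook approach: only the five sums are needed to recover the corrected coefficient.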
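Example 7. If $\operatorname{Var}(x) = 9$, $\operatorname{Var}(y) = 4$ and $\operatorname{Var}(x - y) = \operatorname{Var}(x)$, find the
correlation coefficient between x and y.
We know, $\operatorname{Var}(x - y) = \sigma_x^2 + \sigma_y^2 - 2\operatorname{Cov}(x, y)$
(See corollary (2) of Theorem 6)
or, $\sigma_x^2 = 9 + 4 - 2\operatorname{Cov}(x, y)$
or, $9 = 9 + 4 - 2\operatorname{Cov}(x, y)$ or, $\operatorname{Cov}(x, y) = 2$

Now, $r_{xy} = \frac{\operatorname{Cov}(x, y)}{\sigma_x \sigma_y} = \frac{2}{3 \times 2} = \frac{1}{3}$.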

10.6. Regression
From bivariate data given by (x, y), the estimation or prediction of
the average value of one variable, say y, corresponding to a specified value
of the other variable x is called regression. The average value of x may
also be predicted corresponding to a specified value of y.

Illustration.
Below we are given pairs of marks in Mathematics and Mechanical
Science obtained by students in an Engineering College.
x (Marks in Math) : 65 66 67 67 68 69 70 72
y (Marks in Mech Sc.) : 67 68 65 68 72 72 69 71
Depending on these data, suppose we are to predict the marks in Mechanical
Science obtained by a student who scored 73 in Mathematics.
There may exist many students scoring 73 in Mathematics. Every such
student may not score the same marks in Mechanical Science. In the process
of regression the average of these marks is predicted.
If this average is 75 we say the predicted value of y is 75 corresponding
to x = 73.

10.7. Regression Line.


From bivariate data given by (x, y), the regression, i.e. the prediction
of the average value of one variable against the other, is done by a suitable curve
or straight line which best fits the scatter diagram of the data. This line is
called the regression line.
The regression line which is fitted and used to predict the average value
of y depending on x is called the Regression line of y on x. The line which is
used to predict the values of x depending on y is called the Regression line of
x on y. These two lines may not be identical.
The equation of a regression line is called the regression equation.
Illustration.
Suppose we have the following scatter diagram representing the
bivariate data given by (x, y).

[Figure: scatter diagram with the fitted lines AB and CD; the value x = 3.6 is marked on the x-axis.]

We fit the straight line AB to this diagram for the purpose of predicting the
value of y depending on the values of x. This line AB is the regression line of
y on x. Similarly CD is fitted as the regression line of x on y.
Let the equation of the regression line of y on x be $2y = x - 1$. Then if
we are to predict the value of y for x = 3.6, we find
the predicted value of y, or $y_{\text{predict}} = \frac{3.6 - 1}{2} = 1.3$. These are
shown in the figure.
The most scientific method of finding the equation of a regression line
is the 'Method of Least Squares'. This is discussed in the following
theorems.

Theorem 1. (Finding the Regression Line by the Method of Least Squares).

Let $(x_1, y_1), (x_2, y_2), (x_3, y_3), \dots, (x_n, y_n)$ be n pairs of observations
given by the bivariate data (x, y). Let $\bar{x}, \bar{y}$ be the means and $\sigma_x, \sigma_y$ be the
standard deviations of the values of x and y respectively; $r_{xy}$ is the correlation
coefficient.
Then the equation of the
(i) Regression line of y on x is $y - \bar{y} = b_{yx}(x - \bar{x})$, where $b_{yx} = r_{xy}\,\frac{\sigma_y}{\sigma_x}$;
(ii) Regression line of x on y is $x - \bar{x} = b_{xy}(y - \bar{y})$, where $b_{xy} = r_{xy}\,\frac{\sigma_x}{\sigma_y}$.

Proof: (i) Let $y = a + bx$ (1)
be the equation of the straight line 'regression of y on x'.
We shall find a, b by the method of least squares.
Requiring equation (1) to be satisfied by every pair $(x_i, y_i)$ we get

$$y_i = a + b x_i \qquad (2)$$

for all i. Summing these up we get

$$\sum_{i=1}^{n} y_i = \sum_{i=1}^{n} a + \sum_{i=1}^{n} b x_i \quad\text{or,}\quad \sum_{i=1}^{n} y_i = na + b\sum_{i=1}^{n} x_i.$$

Dividing both sides by n we get $\bar{y} = a + b\bar{x}$ (3)

Multiplying both sides of (2) by $x_i$ we get
$x_i y_i = a x_i + b x_i^2$ for all values of i.
Summing these up we get

$$\sum_{i=1}^{n} x_i y_i = a\sum_{i=1}^{n} x_i + b\sum_{i=1}^{n} x_i^2$$

or, $\sum_{i=1}^{n} x_i y_i = (\bar{y} - b\bar{x})\sum_{i=1}^{n} x_i + b\sum_{i=1}^{n} x_i^2$, putting the value of a from (3).

Dividing both sides by n we get

$$\frac{1}{n}\sum x_i y_i = (\bar{y} - b\bar{x})\,\bar{x} + b\,\frac{1}{n}\sum x_i^2$$

$$\text{or,}\quad b = \frac{\frac{1}{n}\sum x_i y_i - \bar{x}\,\bar{y}}{\frac{1}{n}\sum x_i^2 - \bar{x}^2} = \frac{\operatorname{Cov}(x, y)}{\sigma_x^2} = \frac{r_{xy}\,\sigma_x \sigma_y}{\sigma_x^2} = r_{xy}\,\frac{\sigma_y}{\sigma_x}.$$

From (3) we get $a = \bar{y} - b\bar{x}$.

Putting these values of a and b in (1) we get the equation as
$y = \bar{y} - b\bar{x} + bx$ or, $y - \bar{y} = b(x - \bar{x})$
or, $y - \bar{y} = r_{xy}\,\frac{\sigma_y}{\sigma_x}(x - \bar{x})$, which is the required 'regression line of y on x'.

(ii) Assuming the equation as $x = a + by$ we can find the result in the same way as in (i).

Regression Coefficients
The quantity $b_{yx} = r_{xy}\,\frac{\sigma_y}{\sigma_x}$ appearing in the 'regression equation of y
on x' is called the regression coefficient of y on x.

The quantity $b_{xy} = r_{xy}\,\frac{\sigma_x}{\sigma_y}$ appearing in the 'regression equation of x on
y' is called the regression coefficient of x on y.

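Illustration.
Let us consider the bivariate data given by (x, y) as
x : -6 -4 -3 -1 1 2 4 7
y : -4 -3 -1 -1 0 2 3 4
Here we see $\bar{x} = \frac{1}{n}\sum x_i = \frac{1}{8}(-6 - 4 - 3 - 1 + 1 + 2 + 4 + 7) = 0$, $\bar{y} = 0$,

$$\sigma_x = \sqrt{\frac{1}{8}\sum x_i^2 - \Big(\frac{1}{8}\sum x_i\Big)^2} = \sqrt{\frac{1}{8}\times 132 - \Big(\frac{1}{8}\times 0\Big)^2} = \sqrt{\frac{33}{2}},$$

and similarly $\sigma_y = \sqrt{7}$ and $\operatorname{Cov}(x, y) = \frac{21}{2}$.

The correlation coefficient, $r_{xy} = \frac{\operatorname{Cov}(x, y)}{\sigma_x \sigma_y} = \frac{21/2}{\sqrt{33/2}\,\sqrt{7}} = 0.97$ (approx.).

So, the Regression Coefficient 'of y on x',

$$b_{yx} = r_{xy}\cdot\frac{\sigma_y}{\sigma_x} = 0.97 \times \frac{\sqrt{7}}{\sqrt{33/2}} = 0.97 \times \frac{\sqrt{14}}{\sqrt{33}} = 0.632.$$

Therefore for these observations the equation of the regression line of
y on x is
$y - 0 = 0.632(x - 0)$ or, $y = 0.632x$.
Now if we are to predict the value of y when x = 2.3 then it is
y (predicted) = 0.632 × 2.3 = 1.454.
If there exist many values of y corresponding to x = 2.3 then
1.454 is approximately the average of all of them.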
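The least-squares slope and intercept from Theorem 1 can be computed directly from the data of this illustration. The sketch below is only a check under the formulas $b = \operatorname{Cov}(x, y)/\sigma_x^2$ and $a = \bar{y} - b\bar{x}$; the function name `regression_y_on_x` is a hypothetical helper, not something defined in the text. Without intermediate rounding the slope comes out near 0.636; the value 0.632 quoted above arises because $r_{xy}$ was first rounded to 0.97.

```python
def regression_y_on_x(xs, ys):
    """Return (a, b) of the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variance with divisor n, as in the text
    cov = sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y
    var_x = sum(x * x for x in xs) / n - mean_x ** 2
    b = cov / var_x            # regression coefficient of y on x
    a = mean_y - b * mean_x    # the line passes through (mean_x, mean_y)
    return a, b


xs = [-6, -4, -3, -1, 1, 2, 4, 7]
ys = [-4, -3, -1, -1, 0, 2, 3, 4]

a, b = regression_y_on_x(xs, ys)
print(round(a, 4), round(b, 4))
print("predicted y at x = 2.3:", round(a + b * 2.3, 3))
```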
10.8 Properties of Regression Line and Coefficients.
The properties of regression line and the regression coefficients are
presented as the following theorems.
Theorem 1. For bivariate data given by (x, y) the correlation coefficient
and the regression coefficients have the same sign.
Proof: Since the regression coefficients are given by
$b_{yx} = r_{xy}\,\frac{\sigma_y}{\sigma_x}$ and $b_{xy} = r_{xy}\,\frac{\sigma_x}{\sigma_y}$,
and since $\sigma_x, \sigma_y$ are always positive, $b_{yx}, b_{xy}$ are of the
same sign as $r_{xy}$.

Theorem 2. Product of the two regression coefficients = (Correlation
Coefficient)$^2$.
Proof: Since $b_{yx} = r_{xy}\,\frac{\sigma_y}{\sigma_x}$ and $b_{xy} = r_{xy}\,\frac{\sigma_x}{\sigma_y}$, therefore $b_{yx} \times b_{xy} = (r_{xy})^2$.

Theorem 3. The two regression lines intersect at the point $(\bar{x}, \bar{y})$.
Proof: The equations of the two regression lines are $y - \bar{y} = b_{yx}(x - \bar{x})$
and $x - \bar{x} = b_{xy}(y - \bar{y})$.
We see both the equations are satisfied by $x = \bar{x}$, $y = \bar{y}$. Hence proved.

Theorem 4. The two regression coefficients have the same sign.
Proof: Since $b_{yx} \times b_{xy} = (r_{xy})^2$ is positive, the result is proved.

Theorem 5. The gradient of the 'regression line of y on x' is $b_{yx}$ and the
gradient of the 'regression line of x on y' is $\frac{1}{b_{xy}}$.
Proof: The equation of the regression of y on x is $y - \bar{y} = b_{yx}(x - \bar{x})$
or, $y = b_{yx}\,x + (\bar{y} - b_{yx}\,\bar{x})$
$\therefore$ the gradient of this line is $b_{yx}$.

The equation of the regression of x on y is $x - \bar{x} = b_{xy}(y - \bar{y})$
or, $y = \frac{1}{b_{xy}}\,x + \Big(\bar{y} - \frac{1}{b_{xy}}\,\bar{x}\Big)$
$\therefore$ the gradient of this line is $\frac{1}{b_{xy}}$.

Theorem 6. The regression coefficients are

$$b_{yx} = \frac{\operatorname{Cov}(x, y)}{\sigma_x^2} \quad\text{and}\quad b_{xy} = \frac{\operatorname{Cov}(x, y)}{\sigma_y^2}.$$

Proof: $b_{xy} = \frac{\sigma_x}{\sigma_y}\, r_{xy} = \frac{\sigma_x}{\sigma_y}\cdot\frac{\operatorname{Cov}(x, y)}{\sigma_x \sigma_y} = \frac{\operatorname{Cov}(x, y)}{\sigma_y^2}$, and similarly $b_{yx} = \frac{\sigma_y}{\sigma_x}\cdot\frac{\operatorname{Cov}(x, y)}{\sigma_x \sigma_y} = \frac{\operatorname{Cov}(x, y)}{\sigma_x^2}$.

Note: The result obtained in Theorem 6 is more useful in finding a
regression coefficient because it does not involve $r_{xy}$ and
involves the variance of only one variable.

10.9. Illustrative Examples.

Example 1. From the following data, obtain the two regression equations.

Sales     : 91 97 108 121 67 124 51 73 111 57
Purchases : 71 75 69 97 70 91 39 61 80 47
Hence estimate the purchase when the sale will be 100. To get a purchase
of 60 what is the required sale?
Let x = sales and y = purchases.

Calculation for Regression Lines

   x      y    u = x - 90   v = y - 70     u^2     v^2      uv
  91     71         1            1            1       1       1
  97     75         7            5           49      25      35
 108     69        18           -1          324       1     -18
 121     97        31           27          961     729     837
  67     70       -23            0          529       0       0
 124     91        34           21         1156     441     714
  51     39       -39          -31         1521     961    1209
  73     61       -17           -9          289      81     153
 111     80        21           10          441     100     210
  57     47       -33          -23         1089     529     759
 Total              0            0         6360    2868    3900

$\therefore\ \sum u_i = 0,\ \sum v_i = 0,\ \sum u_i^2 = 6360,\ \sum v_i^2 = 2868,\ \sum u_i v_i = 3900$
and the number of pairs, n = 10.

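$\therefore\ \bar{u} = \frac{1}{10}\sum u_i = 0,\quad \bar{v} = \frac{1}{10}\sum v_i = 0$

$$\sigma_u^2 = \frac{1}{10}\sum u_i^2 - \Big(\frac{1}{10}\sum u_i\Big)^2 = \frac{1}{10}\times 6360 - (0)^2 = 636$$

$$\sigma_v^2 = \frac{1}{10}\sum v_i^2 - \Big(\frac{1}{10}\sum v_i\Big)^2 = \frac{1}{10}\times 2868 - 0^2 = 286.8$$

$$\operatorname{Cov}(u, v) = \frac{1}{10}\sum u_i v_i - \Big(\frac{1}{10}\sum u_i\Big)\Big(\frac{1}{10}\sum v_i\Big) = \frac{1}{10}\times 3900 - 0 \times 0 = 390.$$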

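$\therefore$ the correlation coefficient between u and v,

$$r_{uv} = \frac{\operatorname{Cov}(u, v)}{\sigma_u \sigma_v} = \frac{390}{\sqrt{636}\,\sqrt{286.8}} = \frac{390}{427.089} = 0.913.$$

Since $u = x - 90$, $v = y - 70$, therefore
$\bar{u} = \bar{x} - 90$, $\bar{v} = \bar{y} - 70$; $\sigma_u = \sigma_x$ and $\sigma_v = \sigma_y$.

So, $\bar{x} = 90$, $\bar{y} = 70$, $\sigma_x = \sqrt{636}$, $\sigma_y = \sqrt{286.8}$

and $r_{uv} = \frac{1 \times 1}{|1|\,|1|}\, r_{xy} = r_{xy}$ $\therefore\ r_{xy} = 0.913$.

So the regression coefficient of y on x,
$b_{yx} = \frac{\sigma_y}{\sigma_x}\, r_{xy} = \frac{\sqrt{286.8}}{\sqrt{636}} \times 0.913 = 0.61$
(Note that if $r_{xy}$ were not required we could find $b_{yx}$ by directly
applying Theorem 6)
and the regression coefficient of x on y,
$b_{xy} = \frac{\sigma_x}{\sigma_y}\, r_{xy} = \frac{\sqrt{636}}{\sqrt{286.8}} \times 0.913 = 1.36$
$\therefore$ the regression line of y on x is
$y - \bar{y} = b_{yx}(x - \bar{x})$ or, $y - 70 = 0.61(x - 90)$
or, $y = 0.61x + 15.1$ (1)
and the regression line of x on y is $x - \bar{x} = b_{xy}(y - \bar{y})$
or, $x - 90 = 1.36(y - 70)$ or, $x = 1.36y - 5.2$ (2)
When x = 100, $y_{\text{estimated}} = 0.61 \times 100 + 15.1 = 76.1$ units [from (1)]
When y = 60, $x_{\text{estimated}} = 1.36 \times 60 - 5.2 = 76.4$ units [from (2)]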
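As a cross-check of Example 1, the two regression coefficients can be obtained straight from the raw sales and purchase figures using Theorem 6, without computing $r_{xy}$ first. This is a minimal sketch; the helper name `regression_coefficients` is illustrative. The unrounded results (about 0.613 and 1.360) agree with the text's 0.61 and 1.36 to two decimals.

```python
def regression_coefficients(xs, ys):
    """Return (mean_x, mean_y, b_yx, b_xy) using Cov/Var with divisor n."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return mean_x, mean_y, cov / var_x, cov / var_y


sales = [91, 97, 108, 121, 67, 124, 51, 73, 111, 57]
purchases = [71, 75, 69, 97, 70, 91, 39, 61, 80, 47]

mean_x, mean_y, b_yx, b_xy = regression_coefficients(sales, purchases)

# Line of y on x predicts purchases from sales; line of x on y does the reverse.
purchase_at_100 = mean_y + b_yx * (100 - mean_x)
sale_for_60 = mean_x + b_xy * (60 - mean_y)

print(round(b_yx, 3), round(b_xy, 3))
print(round(purchase_at_100, 1), round(sale_for_60, 1))
```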


Example 2. For two variables x and y the equations of the two regression
lines are $x + 4y + 3 = 0$ and $4x + 9y + 5 = 0$. Identify which one is 'of y on
x'. Find the means of x and y. Find the correlation coefficient between x
and y. Estimate the value of x when y = 1.5. [W.B.U.Tech, 2002]

Let, if possible, $x + 4y + 3 = 0$ (1)
be the regression line 'of y on x' and $4x + 9y + 5 = 0$ (2)
be 'of x on y'.

Gradient of (1) is $-\frac{1}{4}$ $\therefore$ the regression coefficient $b_{yx} = -\frac{1}{4}$.

Gradient of (2) is $-\frac{4}{9}$ $\therefore\ \frac{1}{b_{xy}} = -\frac{4}{9}$ $\therefore\ b_{xy} = -\frac{9}{4}$.

Now, $r_{xy}^2 = b_{yx} \times b_{xy} = \Big(-\frac{1}{4}\Big) \times \Big(-\frac{9}{4}\Big) = \frac{9}{16}$, which is less than 1. So our
hypothesis is correct.
[If $r_{xy}^2 > 1$ had resulted, our hypothesis would be wrong.]
$\therefore$ Equation (1) is the regression equation of y on x.
Equation (2) is the regression equation of x on y.
We know the two regression lines intersect at the point $(\bar{x}, \bar{y})$ where
$\bar{x}, \bar{y}$ are the means of x and y respectively.
From (1), $4\bar{x} + 16\bar{y} + 12 = 0$
and from (2), $4\bar{x} + 9\bar{y} + 5 = 0$.

Subtracting, $7\bar{y} + 7 = 0$ or, $\bar{y} = -1$. Then from (1) we get
$\bar{x} = -3 - 4(-1) = 4 - 3 = 1$. So, $\bar{x} = 1$, $\bar{y} = -1$ are the means of x and y.

We got $r_{xy}^2 = \frac{9}{16}$ $\therefore\ r_{xy} = \pm\frac{3}{4}$.
Since we know the regression coefficient $b_{yx}$ and $r_{xy}$ have the same sign,
$r_{xy}$ is negative as $b_{yx}$ is negative, i.e. $r_{xy} = -\frac{3}{4}$.
To estimate the value of x we use the regression line 'of x on y'. So,
from (2) we get for y = 1.5, $4x + 9 \times 1.5 + 5 = 0$
or, $4x = -18.5$ $\therefore\ x_{\text{est}} = -\frac{18.5}{4} = -4.625$
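The trial-and-check argument of Example 2 (assume one assignment of the two lines, reject it if $r^2 > 1$) can be sketched in Python as follows. The function name `analyse_regression_lines` and the `(a, b, c)` representation of a line $ax + by + c = 0$ are assumptions made for this illustration; the sketch also assumes both lines have non-zero coefficients of x and y and that they really are a valid pair of regression lines.

```python
from math import sqrt


def analyse_regression_lines(line1, line2):
    """Each line is (a, b, c), meaning a*x + b*y + c = 0.

    Returns (line_of_y_on_x, r, mean_x, mean_y).
    """
    def coefficients(y_on_x, x_on_y):
        a1, b1, _ = y_on_x
        a2, b2, _ = x_on_y
        b_yx = -a1 / b1   # solve the 'y on x' line for y: slope is b_yx
        b_xy = -b2 / a2   # solve the 'x on y' line for x: coefficient of y is b_xy
        return b_yx, b_xy

    # Hypothesis: line1 is 'y on x'. Reject it if r^2 = b_yx * b_xy exceeds 1.
    y_on_x = line1
    b_yx, b_xy = coefficients(line1, line2)
    if b_yx * b_xy > 1:
        y_on_x = line2
        b_yx, b_xy = coefficients(line2, line1)

    # r has the same sign as the regression coefficients.
    r = sqrt(b_yx * b_xy)
    if b_yx < 0:
        r = -r

    # The means are the intersection point of the two lines (Cramer's rule).
    (a1, b1, c1), (a2, b2, c2) = line1, line2
    det = a1 * b2 - a2 * b1
    mean_x = (b1 * c2 - b2 * c1) / det
    mean_y = (a2 * c1 - a1 * c2) / det
    return y_on_x, r, mean_x, mean_y


result = analyse_regression_lines((1, 4, 3), (4, 9, 5))
print(result)
```

For the lines of this example the first line is reported as the line of y on x, with r = -0.75 and means (1, -1), in agreement with the worked solution.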
Example 3. If $x = 4y + 5$ and $y = kx + 4$ be the two regression equations
of 'x on y' and of 'y on x' respectively, then find the interval in which
k lies. [W.B.U.Tech 2004]

Since $x = 4y + 5$ is 'of x on y' and since its gradient is $\frac{1}{4}$, $\therefore\ \frac{1}{b_{xy}} = \frac{1}{4}$
or, $b_{xy} = 4$.

Again since $y = kx + 4$ is 'of y on x' and since its gradient is k,
$\therefore\ b_{yx} = k$.

We know, $b_{yx} \times b_{xy} = r_{xy}^2$ $\therefore\ r_{xy}^2 = k \times 4 = 4k$.
Since $0 \le r_{xy}^2 \le 1$,
i.e. $0 \le 4k \le 1$ or, $0 \le k \le \frac{1}{4}$.

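Example 4. The bivariate (U, V) is related to the bivariate (X, Y) by
the two relations $4U = 2X + 7$ and $6V = 2Y - 15$. Given that the regression
coefficient of Y on X is 3, find the regression coefficient of V on U.
From the given relations we have

$U = \frac{1}{2}X + \frac{7}{4},\ V = \frac{1}{3}Y - \frac{5}{2}$ $\therefore\ \bar{U} = \frac{1}{2}\bar{X} + \frac{7}{4},\ \bar{V} = \frac{1}{3}\bar{Y} - \frac{5}{2}$

so that $\sigma_U = \frac{1}{2}\sigma_X$, $\sigma_V = \frac{1}{3}\sigma_Y$ and $r_{UV} = r_{XY}$.

Now, $b_{VU} = \frac{\sigma_V}{\sigma_U}\, r_{UV} = \frac{2}{3}\cdot\frac{\sigma_Y}{\sigma_X}\, r_{XY} = \frac{2}{3}\, b_{YX} = \frac{2}{3} \times 3 = 2$.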

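Example 5. The relationship between travel expenses (y) and the duration
of travel (x) is found to be linear. A summary of the data for 102 pairs is
given below:

$\sum x = 510,\ \sum y = 7140,\ \sum x^2 = 4150,\ \sum xy = 54{,}900$ and $\sum y^2 = 7{,}40{,}200$.

(i) Find the two regression coefficients. (ii) Find the two regression
equations.
(iii) A given trip has to take seven days. How much money should a
salesman be allowed so that he will not run short of money?

$\bar{x} = \frac{1}{n}\sum x = \frac{1}{102} \times 510 = 5,\quad \bar{y} = \frac{1}{102} \times 7140 = 70$

$$\operatorname{Cov}(x, y) = \frac{1}{n}\sum xy - \Big(\frac{1}{n}\sum x\Big)\Big(\frac{1}{n}\sum y\Big) = \frac{1}{102} \times 54900 - \Big(\frac{1}{102} \times 510\Big)\Big(\frac{1}{102} \times 7140\Big) = \frac{9150}{17} - 5 \times 70 = 188.24$$

$$\sigma_x^2 = \frac{1}{n}\sum x^2 - \Big(\frac{1}{n}\sum x\Big)^2 = \frac{1}{102} \times 4150 - \Big(\frac{1}{102} \times 510\Big)^2 = \frac{2075}{51} - 25 = 15.686$$

$$\sigma_y^2 = \frac{1}{n}\sum y^2 - \Big(\frac{1}{n}\sum y\Big)^2 = \frac{1}{102} \times 7{,}40{,}200 - (70)^2 = 2356.863.$$

(i) $b_{xy} = \frac{\sigma_x}{\sigma_y} \times r_{xy} = \frac{\sigma_x}{\sigma_y} \times \frac{\operatorname{Cov}(x, y)}{\sigma_x \sigma_y} = \frac{\operatorname{Cov}(x, y)}{\sigma_y^2} = \frac{188.24}{2356.863} = 0.08$

$b_{yx} = \frac{\operatorname{Cov}(x, y)}{\sigma_x^2} = \frac{188.24}{15.686} = 12$ (by Th. 6.)

(ii) The regression line of y on x is $y - \bar{y} = b_{yx}(x - \bar{x})$
or, $y - 70 = 12(x - 5)$
or, $y = 12x + 10$

The regression line of x on y is $x - \bar{x} = b_{xy}(y - \bar{y})$
or, $x - 5 = 0.08(y - 70)$ or, $x = 0.08y - 0.6$

(iii) For x = 7, $y = 12 \times 7 + 10 = 94$.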

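Example 6. If $\theta$ is the acute angle between the two regression lines in the
case of two variables x and y, show that

$$\tan\theta = \frac{1 - r^2}{r}\cdot\frac{\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2}$$

where $r, \sigma_x, \sigma_y$ have their usual meanings.
Explain the significance of the formula when r = 0 and r = ±1.

The equations of the lines of regression of y on x and x on y are

$$y - \bar{y} = \frac{r\sigma_y}{\sigma_x}(x - \bar{x}) \quad\text{and}\quad x - \bar{x} = \frac{r\sigma_x}{\sigma_y}(y - \bar{y})$$

respectively. So their gradients are $\frac{r\sigma_y}{\sigma_x}$ and $\frac{\sigma_y}{r\sigma_x}$.

$$\therefore\ \tan\theta = \pm\frac{\dfrac{\sigma_y}{r\sigma_x} - \dfrac{r\sigma_y}{\sigma_x}}{1 + \dfrac{\sigma_y}{r\sigma_x}\cdot\dfrac{r\sigma_y}{\sigma_x}} = \pm\frac{1 - r^2}{r}\cdot\frac{\sigma_y}{\sigma_x}\cdot\frac{\sigma_x^2}{\sigma_x^2 + \sigma_y^2} = \pm\frac{1 - r^2}{r}\cdot\frac{\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2}.$$

As $r^2 \le 1$ and $\sigma_x, \sigma_y$ are both positive, the positive sign gives the
acute angle between the lines when r > 0 (for r < 0 the negative sign does, so in general r in the denominator is replaced by |r|).

Hence $\tan\theta = \frac{1 - r^2}{r}\cdot\frac{\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2}$.

When r = 0, $\theta = \frac{\pi}{2}$. So in this case the two regression lines are
perpendicular to each other.

Again when r = ±1, $\tan\theta = 0$
$\therefore\ \theta = 0$ or $\pi$.
Hence the lines of regression coincide.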
Example 7. If the equations of the two regression lines obtained in a correlation
analysis are $3x + 12y = 19$ and $3y + 9x = 46$, determine which one of
these is the regression equation of x on y. Find the means, the correlation
coefficient and the ratio of the standard deviations of x and y.
[W.B.U.Tech. 2004]

If possible, let $3x + 12y = 19$ (1)
be the regression line of y on x and
$3y + 9x = 46$ (2)
be that of x on y.
Thus the gradient of (1) is $-\frac{1}{4}$,
$\therefore$ the regression coefficient, $b_{yx} = -\frac{1}{4}$.

Also the gradient of (2) is $-3$.
$\therefore\ \frac{1}{b_{xy}} = -3$ or, $b_{xy} = -\frac{1}{3}$.

But the correlation coefficient satisfies
$r_{xy}^2 = \Big(-\frac{1}{4}\Big)\Big(-\frac{1}{3}\Big) = \frac{1}{12} < 1$.
Thus our hypothesis is correct.
Hence equation (1) is the regression equation of y on x and equation
(2) is the regression equation of x on y.
Let $\bar{x}, \bar{y}$ be the means of x and y, respectively.
$\therefore$ From (1) and (2), we have
$3\bar{x} + 12\bar{y} = 19$
$3\bar{y} + 9\bar{x} = 46$.
Solving these equations we get
$\bar{x} = 5,\ \bar{y} = \frac{1}{3}$.
Again we have
$r_{xy}^2 = \frac{1}{12}$ $\therefore\ r_{xy} = \pm\frac{1}{2\sqrt{3}}$.
But we know that the regression coefficient
$b_{yx}$ and $r_{xy}$ have the same sign, so $r_{xy}$ must be negative as $b_{yx}$ is negative.
So, the required correlation coefficient is
$r_{xy} = -\frac{1}{2\sqrt{3}}$.
Also we have $b_{yx} = r_{xy}\,\frac{\sigma_y}{\sigma_x}$, so

$$\frac{\sigma_x}{\sigma_y} = \frac{r_{xy}}{b_{yx}} = \frac{-\frac{1}{2\sqrt{3}}}{-\frac{1}{4}} = \frac{2}{\sqrt{3}}$$

$\therefore\ \sigma_x : \sigma_y = 2 : \sqrt{3}$.

10.10. Rank Correlation.


Sometimes for bivariate data (x, y), x and y cannot be measured
quantitatively but a qualitative assessment can be made by allotting ranks to the
individuals. For example, let x = beauty of ten girls $g_1, g_2, g_3, \dots, g_{10}$ and
y = intelligence of them. Then numerical values cannot be allotted to x
and y but the girls may be ranked according to their beauty and according
to their intelligence.
This is shown in the following table:
Rank of girls
x (rank according to beauty) : 1 2 3 4 5 6 7 8 9 10
y (rank according to intelligence) : 3 5 9 2 6 4 7 8 10
For each individual we have two ranks, one in each of x and y.
The correlation coefficient of the two series of ranks is called the Rank
Correlation Coefficient of x and y.
Theorem. If two series of ranks are given in the characteristics x and y
then

$$\text{Rank Correlation Coefficient} = 1 - \frac{6\sum d^2}{n^3 - n}$$

where d is the difference of the corresponding ranks in x and y and n is the
number of individuals.
Proof. Beyond the scope of the book.
Example. Calculate the rank correlation between the performance of 8
students in Physics and Chemistry.
The ranks of the eight students are given below:
Rank in Physics (X)   : 3 5 2 7 8 4 1 6
Rank in Chemistry (Y) : 1 4 5 3 2 6 7 8
Here the number of students, i.e. the number of individuals, n = 8.

Calculation of rank correlation

 Rank X   Rank Y   d = X - Y   d^2
    3        1         2         4
    5        4         1         1
    2        5        -3         9
    7        3         4        16
    8        2         6        36
    4        6        -2         4
    1        7        -6        36
    6        8        -2         4
 Total       -         -       110 = Σd^2

$$\therefore\ \text{rank correlation} = 1 - \frac{6\sum d^2}{n^3 - n} = 1 - \frac{6 \times 110}{8^3 - 8} = 1 - \frac{660}{504} = -0.3095$$
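The rank correlation formula $1 - \frac{6\sum d^2}{n^3 - n}$ used in this example translates into a few lines of Python. This is a minimal sketch for untied ranks only; the function name `rank_correlation` is chosen here for illustration.

```python
def rank_correlation(rank_x, rank_y):
    """Rank correlation 1 - 6*sum(d^2)/(n^3 - n) for two series of untied ranks."""
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n ** 3 - n)


physics = [3, 5, 2, 7, 8, 4, 1, 6]
chemistry = [1, 4, 5, 3, 2, 6, 7, 8]

r = rank_correlation(physics, chemistry)
print(round(r, 4))
```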
10.11. Rank Correlation when there is Tie Rank in any series
If there exists more than one individual with the same rank in either or both
of the series, we say there exists a Tie Rank.

If k individuals get the Tie Rank 'r' then the Average Rank of
each of these k individuals is

$$\frac{r + (r + 1) + \dots + (r + k - 1)}{k}.$$

After assigning this average rank to each of the k individuals, the next
rank assigned will be r + k, and so on.

Example:
Individual : 1 2 3 4 5 6 7 8 9
Rank x     : 5 3 5 1 2 5 4 6 2
Rank y     : 3 1 6 4 3 2 5 7 8
In series x, we see the individuals 5 and 9 obtain tie rank 2. So each of
them will be assigned the average rank $\frac{2 + 3}{2} = 2.5$. Next the individual 2
will be ranked 4 and the individual 7 will be ranked 5; each of the individuals 1, 3, 6 will be assigned the average rank
$\frac{6 + 7 + 8}{3} = 7$. Next individual 8 will be ranked 9.
In series y, individuals 1 and 5 get Tie Rank 3. So they will be assigned
the average rank $\frac{3 + 4}{2} = 3.5$. Next the individuals 4, 7, 3, 8 and 9 will be
assigned ranks 5, 6, 7, 8 and 9 respectively. In the following table the
ranks assigned to the x and y series are shown.
Individual      : 1    2  3  4  5    6  7  8  9
Assigned Rank X : 7    4  7  1  2.5  7  5  9  2.5
Assigned Rank Y : 3.5  1  7  5  3.5  2  6  8  9

Theorem (Rank correlation for Tie)

If two series of ranks are given in the characteristics x and y and if some
of the individuals get Tie Rank then

$$\text{Rank Correlation} = 1 - \frac{6\Big\{\sum d^2 + \sum \dfrac{t^3 - t}{12}\Big\}}{n^3 - n}$$

where d is the difference of the corresponding ranks and t denotes the
number of individuals having the tie rank in the first or second series.
Example. We consider the previous example:
Individual : 1 2 3 4 5 6 7 8 9
Rank x     : 5 3 5 1 2 5 4 6 2
Rank y     : 3 1 6 4 3 2 5 7 8
Here some individuals get tie rank. We assign the average rank to them
and the revised ranking is shown in the following table. Hence the Rank
correlation is determined.

Calculation of Rank Correlation

 Individual  Rank x  Rank y  Assigned rank x (R1)  Assigned rank y (R2)  d = |R1 - R2|    d^2
     1         5       3             7                    3.5                 3.5        12.25
     2         3       1             4                    1                   3           9
     3         5       6             7                    7                   0           0
     4         1       4             1                    5                   4          16
     5         2       3             2.5                  3.5                 1           1
     6         5       2             7                    2                   5          25
     7         4       5             5                    6                   1           1
     8         6       7             9                    8                   1           1
     9         2       8             2.5                  9                   6.5        42.25
   Total       -       -             -                    -                   -         107.5 = Σd^2

In series x two individuals 5, 9 get tie rank and three individuals 1, 3, 6 get tie rank,
and in series y two individuals 1, 5 get tie rank. So the values of t are 2, 3, 2.

So $\sum\frac{t^3 - t}{12} = \frac{2^3 - 2}{12} + \frac{3^3 - 3}{12} + \frac{2^3 - 2}{12} = 0.5 + 2 + 0.5 = 3$

$$\therefore\ \text{the Rank Correlation} = 1 - \frac{6\Big\{\sum d^2 + \sum\frac{t^3 - t}{12}\Big\}}{n^3 - n} = 1 - \frac{6\{107.5 + 3\}}{9^3 - 9} = 1 - \frac{6 \times 110.5}{720} = 0.079$$
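The tie-rank procedure above (average ranks, then the corrected formula) can be sketched in Python. The helper names `average_ranks`, `tie_correction` and `rank_correlation_with_ties` are hypothetical; the sketch assumes, as in this example, that a smaller given rank is the better one and that tied individuals share the mean of the positions they occupy.

```python
from collections import Counter


def average_ranks(values):
    """Assign ranks 1..n by increasing value; tied values share the mean position."""
    counts = Counter(values)
    assigned = {}
    position = 1
    for value in sorted(counts):
        k = counts[value]
        # k tied individuals occupy positions position .. position + k - 1
        assigned[value] = position + (k - 1) / 2
        position += k
    return [assigned[v] for v in values]


def tie_correction(values):
    """Sum of (t^3 - t)/12 over every group of t tied individuals."""
    return sum((t ** 3 - t) / 12 for t in Counter(values).values())


def rank_correlation_with_ties(x, y):
    n = len(x)
    rx, ry = average_ranks(x), average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    correction = tie_correction(x) + tie_correction(y)
    return 1 - 6 * (sum_d2 + correction) / (n ** 3 - n)


rank_x = [5, 3, 5, 1, 2, 5, 4, 6, 2]
rank_y = [3, 1, 6, 4, 3, 2, 5, 7, 8]

print(average_ranks(rank_x))
print(average_ranks(rank_y))
print(round(rank_correlation_with_ties(rank_x, rank_y), 3))
```

Groups of size one contribute zero to the correction term, so no special case is needed for untied values.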
10.12. Illustrative Examples.
Example 1. In a beauty competition two judges I and II rank the 12 entries
as follows. Find the rank correlation between the judgements of the two
judges.
X (Judge I)  : 1 2 3 4 5 6 7 8 9 10 11 12
Y (Judge II) : 12 9 6 10 3 5 4 7 8 2 11 1
Give your interpretation.

Solution. Here n = 12.
Calculation of Rank Correlation

 Rank X   Rank Y   d = |X - Y|   d^2
    1       12         11        121
    2        9          7         49
    3        6          3          9
    4       10          6         36
    5        3          2          4
    6        5          1          1
    7        4          3          9
    8        7          1          1
    9        8          1          1
   10        2          8         64
   11       11          0          0
   12        1         11        121
 Total       -          -        416 = Σd^2

Now, Rank Correlation Coefficient

$$= 1 - \frac{6\sum d^2}{n^3 - n} = 1 - \frac{6 \times 416}{12^3 - 12} = 1 - \frac{2496}{1716} = -0.4545$$

Since the rank correlation is negative, we may interpret that if the
rank given by judge I is higher then there is a tendency for the rank given
by judge II to be lower.
Example 2. In a class, ten students obtained the following marks in
Mathematics and Physics out of 100 each:
Mathematics : 8 36 98 25 75 82 92 62 65 35
Physics     : 84 51 91 60 68 62 86 58 35 49
Find the rank correlation coefficient of the marks obtained in the two
subjects.
Solution. In the following table we assign ranks to the marks in Mathematics
(X) and to the marks in Physics (Y). Rank 1 is given to the highest value,
Rank 2 to the second highest and so on, in every series:

Calculation of Rank Correlation Coefficient

 Marks in Math (X)  Marks in Physics (Y)  Rank in X  Rank in Y  d = |X - Y|  d^2
         8                  84               10          3           7        49
        36                  51                7          8           1         1
        98                  91                1          1           0         0
        25                  60                9          6           3         9
        75                  68                4          4           0         0
        82                  62                3          5           2         4
        92                  86                2          2           0         0
        62                  58                6          7           1         1
        65                  35                5         10           5        25
        35                  49                8          9           1         1
      Total                  -                -          -           -        90 = Σd^2

Here n = number of individuals, i.e. number of students = 10.

$$\text{Rank Correlation coefficient} = 1 - \frac{6\sum d^2}{n^3 - n} = 1 - \frac{6 \times 90}{10^3 - 10} = 1 - \frac{540}{990} = 0.455$$
Example 3. The coefficient of rank correlation of the marks obtained
by 10 students in English and Economics was found to be 0.5. It was
later discovered that the difference in ranks in the two subjects obtained by
one of the students was wrongly taken as 3 instead of 7. Find the corrected
rank correlation coefficient.
Solution. Before correction, the wrong

$$\text{Rank correlation coefficient} = 1 - \frac{6\sum d^2}{n^3 - n}$$

or, $0.5 = 1 - \frac{6 \times \sum d^2}{10^3 - 10}$ or, $\sum d^2 = 82.5$
$\therefore$ before correction $\sum d^2 = 82.5$.
After correction, $\sum d^2 = 82.5 - 3^2 + 7^2 = 122.5$

$\therefore$ the corrected Rank correlation coefficient

$$= 1 - \frac{6 \times \sum d^2}{n^3 - n} = 1 - \frac{6 \times 122.5}{10^3 - 10} = 1 - \frac{735}{990} = 0.2576$$
Example 4. Ten competitors in a beauty contest are ranked by three judges
in the following orders:
Judge I   : 1 6 5 10 3 2 4 9 7 8
Judge II  : 3 5 8 4 7 10 2 1 6 9
Judge III : 6 4 9 8 1 2 3 10 5 7
Use the rank correlation to determine which pair of judges has the nearest
approach to common taste in beauty.

Solution. Let $R_1, R_2, R_3$ be the ranks given by the three judges respectively.
Calculation of Rank Correlation

  R1   R2   R3   d12 = |R1 - R2|  d13 = |R1 - R3|  d23 = |R2 - R3|  d12^2  d13^2  d23^2
   1    3    6          2                5                3            4     25      9
   6    5    4          1                2                1            1      4      1
   5    8    9          3                4                1            9     16      1
  10    4    8          6                2                4           36      4     16
   3    7    1          4                2                6           16      4     36
   2   10    2          8                0                8           64      0     64
   4    2    3          2                1                1            4      1      1
   9    1   10          8                1                9           64      1     81
   7    6    5          1                2                1            1      4      1
   8    9    7          1                1                2            1      1      4
 Total  -    -          -                -                -          200     60    214

Here the number of individuals, n = 10.

The Rank correlation between the judgements of judges I and II

$$= 1 - \frac{6\sum d_{12}^2}{n^3 - n} = 1 - \frac{6 \times 200}{10^3 - 10} = 1 - \frac{1200}{990} = -0.2121$$

The rank correlation between the judgements of judges I and III

$$= 1 - \frac{6\sum d_{13}^2}{n^3 - n} = 1 - \frac{6 \times 60}{10^3 - 10} = 1 - \frac{360}{990} = 0.6364$$

The rank correlation between the judgements of judges II and III

$$= 1 - \frac{6\sum d_{23}^2}{n^3 - n} = 1 - \frac{6 \times 214}{10^3 - 10} = 1 - \frac{1284}{990} = -0.2970$$

We see the Rank correlation between the judgements of judges I and III is
maximum. So judges I and III have the nearest approach to common taste in
beauty.
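The pairwise comparison in this example can be automated. The sketch below computes the rank correlation for every pair of judges and picks the largest; Judge I's ranks are taken from the $R_1$ column of the calculation table, and the dictionary layout and names are illustrative assumptions.

```python
from itertools import combinations


def rank_correlation(rank_a, rank_b):
    """Rank correlation 1 - 6*sum(d^2)/(n^3 - n) for untied ranks."""
    n = len(rank_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * sum_d2 / (n ** 3 - n)


judges = {
    "I": [1, 6, 5, 10, 3, 2, 4, 9, 7, 8],
    "II": [3, 5, 8, 4, 7, 10, 2, 1, 6, 9],
    "III": [6, 4, 9, 8, 1, 2, 3, 10, 5, 7],
}

# One coefficient per unordered pair of judges
results = {
    (a, b): rank_correlation(judges[a], judges[b])
    for a, b in combinations(judges, 2)
}

# The pair with the largest coefficient has the closest taste.
best_pair = max(results, key=results.get)

for pair, value in results.items():
    print(pair, round(value, 4))
print("closest taste:", best_pair)
```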

Example 5. Marks obtained by 10 students in Physics and Mathematics
are given in the following table:
Marks in Physics : 48 33 40 9 16 16 65 24 16 57
Marks in Math    : 13 13 24 6 15 4 20 9 6 19
Find the Rank correlation coefficient of the two series of marks.
Solution. Let X = Marks in Physics; Y = Marks in Math.
Calculation of Rank correlation

   X    Y   Rank X  Rank Y  Assigned rank in X (R1)  Assigned rank in Y (R2)  d = |R1 - R2|   d^2
  48   13     3       5              3                       5.5                  2.5         6.25
  33   13     5       5              5                       5.5                  0.5         0.25
  40   24     4       1              4                       1                    3           9
   9    6     8       7             10                       8.5                  1.5         2.25
  16   15     7       4              8                       4                    4          16
  16    4     7       8              8                      10                    2           4
  65   20     1       2              1                       2                    1           1
  24    9     6       6              6                       7                    1           1
  16    6     7       7              8                       8.5                  0.5         0.25
  57   19     2       3              2                       3                    1           1
 Total  -     -       -              -                       -                    -          41 = Σd^2

In series X three individuals get tie rank 7, and in series Y two individuals
get tie rank 5 and two individuals get tie rank 7. So the values of t are 3, 2, 2.

$\therefore\ \sum\frac{t^3 - t}{12} = \frac{3^3 - 3}{12} + \frac{2^3 - 2}{12} + \frac{2^3 - 2}{12} = 2 + 0.5 + 0.5 = 3$
$\therefore$ the rank correlation coefficient

$$R = 1 - \frac{6\Big\{\sum d^2 + \sum\frac{t^3 - t}{12}\Big\}}{n^3 - n} = 1 - \frac{6\{41 + 3\}}{10^3 - 10} = 0.733$$
Exercise 10
[I] Short Answer Questions
1. If the relations between the variables X and Y, U and V are
2X + 3Y = 4, 3U + 4V = 5 and the regression coefficient of X on U is 4,
find the regression coefficient of Y on V.
2. For a set of bivariate data x and y, the lines of regression are
4x + 3y + 7 = 0 and 2x + 5y = 4. Identify the lines and hence find the
correlation coefficient between x and y.
3. The bivariate data (x, y) give the following results:
$\sum x_i y_i = 414,\ \sum x_i = 120,\ \sum y_i = 90,\ \sum x_i^2 = 600,\ \sum y_i^2 = 300$.
Calculate the correlation coefficient between x and y if the number of
data pairs is 30. [W.B.U.Tech 2005]
4. The results of a bivariate analysis are given in the following form:
$\sum x = 30,\ \sum y = 67,\ \sum x^2 = 220,\ \sum y^2 = 1059,\ \sum xy = 480$ and n = 5.
Calculate the coefficient of correlation and comment on the nature of
mutual dependence between x and y.
5. If 2u + 5x = 17 and 5v = 2y + 11, Cov(x, y) = 3, find the covariance
between u and v.
6. If 3y + 2x - 7 = 0 is the relation between x and y then find the
correlation coefficient between x and y.
7. Find the correlation coefficient between x and y if 2x + 3y = 7.

8. Two random variables X and Y are connected by the relation
3X + 4Y + 5 = 0. A sample $(x_i, y_i)$ is taken from the bivariate population;
obtain the correlation coefficient of the sample.
9. If two variables x and y are related by ax + by + c = 0 find the
correlation coefficient of x and y.
10. Find the correlation coefficient between 5x + 2y and 2x - 5y where
x and y are uncorrelated variables and their variances are 9 and 16
respectively.
11. Prove that the correlation coefficient is independent of change of
origin and numerically is also independent of change of scale.
[Hint: This is nothing but Theorem 3 under section 10.5]
12. Four variables x, y, u, v are related as 3y - x - 4 = 0; u + 4v - 9 = 0.
Given $r_{xu} = 0.3$. Find $r_{yv}$.
13. If the correlation coefficient between x and y is 0.5, what is the
correlation coefficient between 5x and -3y?
14. If the height of the wife is always less than that of the husband by 3
inches, what would be the correlation coefficient between the heights of
wife and husband?
15. Find the regression coefficient of y on x, of x on y, and the
correlation coefficient between x and y from the following values:
$\sum xy = 1500,\ \bar{x} = 15,\ \bar{y} = 12,\ \sigma_x = 6.4,\ \sigma_y = 9$, and the number of
observations is 10.
16. Given that $\bar{x} = 45,\ \sigma_x = 2.5,\ \bar{y} = 60,\ \sigma_y = 3.2$ and the correlation
coefficient is 0.75. Find
(i) the regression coefficient of y on x.
(ii) the regression equation of y on x.
(iii) predict y when x = 35.
17. The following results relate to bivariate data on (x, y):
$\sum x_i y_i = 414;\ \sum x_i = 120,\ \sum y_i = 90,\ \sum x_i^2 = 600,\ \sum y_i^2 = 300,\ n = 30$
If z = 5 + 3y obtain the estimate of z when x = 15.
[Hint: $r_{xz} = r_{xy} = 0.9$. Regression equation of z on x is z = 8.6 + 1.35x]

18. If the regression line of x on y and that of y on x coincide, determine
the value of the correlation coefficient between x and y.
19. If the two regression lines of x and y are perpendicular to each other
then find r.
20. Identify which one of the following is the regression equation of y on
x and of x on y: 2x + 3y = 10 and x + 6y = 6.
Find the means of x and y and the correlation coefficient of x, y. Estimate
x when y = -1.
21. If 3y - 2x = 9 is the regression line of y on x, the correlation coefficient
between x and y is $\frac{1}{3}$ and Var(x) = 4, then find the variance of y.
22. If u = 2x + 5 and v = -3y + 1 and the regression coefficient of y on x is
-1.2, determine the regression coefficient of v on u.
23. Length and breadth of 50 pieces of metal sheet have means 10 ft.
and 6 ft., s.d.'s 3 ft. and 2 ft. respectively, and the correlation coefficient
between x and y is 0.3. But on subsequent verification it was found that
the length and breadth of one sheet were wrongly recorded as 10 ft. and
6 ft. Omitting this sheet determine the correlation coefficient between x
and y and the regression line of x on y.
24. If Var(x) = 4 and Var(y) = 9, find Var(2x - 3y) when $r_{xy} = \frac{2}{3}$.
[Hint: $\operatorname{Var}(2x - 3y) = \operatorname{Var}(2x) + \operatorname{Var}(3y) - 2\sqrt{\operatorname{Var}(2x)}\sqrt{\operatorname{Var}(3y)} \times r_{2x,3y}$
$= 2^2\operatorname{Var}(x) + 3^2\operatorname{Var}(y) - 2\sqrt{2^2\operatorname{Var}(x)}\sqrt{3^2\operatorname{Var}(y)} \times r_{xy}$]
Answers
1. $\frac{32}{9}$
2. 4x + 3y + 7 = 0 is the line of x on y; 2x + 5y = 4 is the line of y on x
3. 0.9 4. 0.97; x and y are almost perfectly positively related.
5. -3 6. -1 8. -1 9. ±1 10. -0.2 12. -0.3 13. -0.5
14. y = x + 3, where x and y are the heights of wife and husband respectively.
15. -0.73, -0.37, -0.52 16. 0.96, y = 0.96x + 16.8, 50.4
17. $z_{\text{predict}} = 28.85$ 18. ±1 19. 0
20. First is of x on y; 2nd is of y on x.
22. 1.8 23. 0.3, x = 0.45y + 7.3

[II] Long Answer Questions
1. Construct a scatter diagram for the following data:
x : 6 5 8 8 7 6 10 4 9 7
y : 8 7 7 10 5 8 10 6 8 6
From the diagram tell the nature of correlation.
2. Construct a scatter diagram from the following bivariate data:
Height of father : 65 63 67 64 68 62 70 66 68 67 69 71
Height of son    : 68 66 68 65 69 66 68 65 71 67 68 70
From the diagram state the nature of association of the heights of
father and son.
3. Find the correlation coefficient between x and y where
x : 1 3 4 6 8 9 11 14
y : 1 2 4 4 5 7 8 9
Draw the scatter diagram of the above data.
4. Find the correlation coefficient between the two grades:
grade 1 : 6 5 8 8 7 6 10 4 9 7
grade 2 : 8 7 7 10 5 8 10 6 8 6
5. Find the correlation coefficient between age and systolic blood
pressure from the following data on 12 women:
Age            : 56 42 72 36 63 47 55 49 38 42 68 60
Blood Pressure : 147 125 160 118 149 128 150 145 115 140 152 155
6. Find the product moment correlation coefficient of x and y when
x : 65 63 67 64 68 62 70 66 68 67 69 71
y : 68 66 68 65 69 66 68 65 71 67 68 70

7. Find the correlation coefficient between x and y where
x : -3 -1 1 3
y : 9 1 1 9
Interpret your result. Can you say that there is no relation between
x and y? Give reasons for your answer.
8. Find the correlation coefficient between the two variables x and y
where
x : 1 2 3 4 5 6 7 8
y : 74 54 52 51 52 53 58 71
Interpret the result.
9. The marks obtained by nine students in Mathematics and Statistics
in an examination are as follows:
Student         : A B C D E F G H I
Mathematics (X) : 70 72 80 45 60 35 50 94 55
Statistics (Y)  : 60 83 72 63 74 54 40 85 58
Find the correlation coefficient between the marks in Math and Stat.
10. If X = height and Y = weight then the bivariate data of 5 persons
are given:
X : 64 60 67 69 69
Y : 57 60 73 52 68
Determine the correlation coefficient between X and Y.
If, by a defect in the weighing machine, weights are recorded as 2 kg
more than the true weight, then indicate the value of the correct correlation
coefficient.
11. (i) The marks obtained by eight students in History and Geography
are given below:
Marks in History   : 43 77 64 96 48 35 86 71
Marks in Geography : 36 68 49 79 50 41 82 65
Calculate the correlation coefficient between the marks in the two subjects by
Spearman's formula.

(ii) Calculate the correlation coefficient between the height an weight


of six persons given as follow:
Height 162 165 167 168 170 175
Weight: 58 60 65 67 72 75
12.Compute the correlation coefficient between the two variables X
and Y from the following observations :
X 2 4 5 6 8 11
Y: 18 12 10 8 7 5
Multiply each value of X by 2 and add 6. Multiply each value of Y by
3 and subtract 15. Find the correlation coefficient between the new sets
of values, explaining why you do or do not obtain the same result.
13. For the following bi-variate data
x -3 -'1. -1 0 1 2 3
y 3210123
show that their correlation coefficient is O. Are the variable
independent? Then why is the correlation coefficient 0 ?
14.How would the scatter diagram be look like if (i) rxy = 1, (ii)
rxy = O.

15. The following results were obtained for 25 pair of bivariate datas :
x==5, y=4, crx=2, cr~==2.25, rxy==0.6

It was later detected at the time of checking that one pair of values
x = 5, Y = 4
was copied wrongly in computing. the above results. Find the
exact correlation coefficient between x and y after correction.
16. The following results relate to bivariate data on (x, u):
$\sum x_i u_i = 414,\ \sum x_i = 120,\ \sum u_i = 90,\ \sum x_i^2 = 600,\ \sum u_i^2 = 300,\ n = 30$

Calculate the correlation coefficient between x and y, where $y = 5 + 3u$.


17. A computer, while calculating the correlation coefficient between two
variables X and Y from 50 pairs of observations, obtained the
following results: $\bar{X} = 10,\ \sigma_X = 3,\ \bar{Y} = 6,\ \sigma_Y = 2,\ r_{XY} = 0.3$. Later it was
found that one value of X (= 10) and the corresponding value of Y (= 6) were
inaccurate, and hence those were rejected. Find the correlation
coefficient of the remaining 49 pairs of values.
18. The following marks have been obtained by students in Mechanics
and Statistics (out of 100) :
Mechanics: 45 55 56 58 60 65 68 70 75 80 85
Statistics : 56 50 48 60 62 64 65 70 74 82 90
Compute the coefficient of correlation for the above data. Find also
the equations of the lines of regression. [W.B.U.Tech 2005]
19. Calculate the coefficient of correlation and obtain the lines of
regression for the following data:
x : 1 2  3  4  5  6  7  8  9
y : 9 8 10 12 11 13 14 16 15
Obtain an estimate of y which should correspond on the average to
x = 6.2. [W.B.U.Tech 2003]
20. Marks of 5 students in Mathematics and Statistics are given:
Mathematics : 38 48 43 40 41
Statistics  : 31 38 43 33 35
Find the regression lines. When the marks of a student in Mathematics
are 42, determine his most likely marks in Statistics.
21. In the following table we are given the ages of husband and wife
for 20 couples. Obtain the regression lines. Also draw the scatter diagram
for the data.
Age of Husband : 22 24 26 26 27 27 28 28 29 30 30 30 31 32 33 34 35 35 36 37
Age of Wife    : 18 20 20 24 22 24 27 24 21 25 29 32 27 27 30 27 30 31 30 32
22. Fit a straight line to the following data by the method of least squares:
x : 15 20 25 30 35
y : 12 14 18 25 31
Predict y when x = 15.
23. The following are the numbers of hours 10 students studied for a
mathematics test and their scores on the test:
Hours of study (x) :  4  9 10 14  4  7 12 22  1 17
Test score (y)     : 31 58 65 73 37 44 60 91 21 84
Find the equation of the least squares line that approximates the regression
of the test scores on the number of hours studied. Also predict the average test score of a
student who studied 14 hours for the test.
[Hint: 'least squares line' means regression line. Find both.]

24. You are given the following data:
Variable :  x   y
Mean     : 47  96
Variance : 64  81
Correlation coefficient of x and y = 0.36. Determine the equations of
the regression lines. Calculate y when x = 50, and x when y = 88.
25. Estimate the value of y when x = 15 and estimate the value of x
when y = 8 from the following information:
$\sum x_i = 120,\ \sum y_i = 90,\ \sum x_i^2 = 600,\ \sum y_i^2 = 300,\ \sum x_i y_i = 414,\ n = 30.$

26. For two variables X and Y the equations of the regression lines are
4x + 9y + 5 = 0 and x + 4y + 3 = 0. Find which one is the regression line of Y
on X and which one is the regression line of X on Y. Also find the means $\bar{x}$ and $\bar{y}$
and the correlation coefficient. [W.B.U.Tech 2004]

27. If 3y + 2x = 9 and 3x + 2y = 7 are the two regression lines, then what
is the value of the correlation coefficient?

28. If the two regression lines are 3x + 9y = 46 and 3y + 12x = 19, find which
one is the regression line of x on y. If the variance of x is 4, find the means
of x and y, the variance of y, and the correlation coefficient between x and
y.

29. If the equations of the two regression lines are 3x + 12y = 19 and
3y + 9x = 46, determine which one of these is the regression equation of
y on x and which one is that of x on y. Give reasons for your answer.
Find also the arithmetic means, the correlation coefficient and the ratio of the variances
of x and y.

30. If y = -1.2x and x = -0.6y are the two regression lines, compute $r_{xy}$ and
$\sigma_x^2 / \sigma_y^2$.

31. The 5 pairs of values of x and y are such that

x + y : 24 28 30 33 35,

Var(x) = 6, Var(y) = 2. Find the correlation coefficient of x and y.

32. Find the rank correlation coefficient for the following data:
(i)   x :  80  91  99  71  61  81  70  59
      y : 123 135 154 110 105 134 121 106
(ii)  x :  78  89  97  69  59  79  68  57
      y : 125 137 156 112 107 136 123 108
[Hint: first assign ranks to each of the series x and y.]
(iii) x :  75  88  95  70  60  80  81  50
      y : 120 134 150 115 110 140 142 100

33. The marks obtained by the students in Physics and Chemistry are
as follows:
Marks in Physics   : 30 33 45 23  8 49 12  4 31
Marks in Chemistry : 35 23 47 17 10 43  9  6 28
Compute their ranks in the two subjects and find the rank correlation
of the two series of marks. Give your interpretation of the result.

34. Production of steel in billion tonnes and the manufacture of cars
in thousands in 7 countries in the year 2005 are given below:
Countries        :  C1    C2    C3    C4    C5    C6    C7
Steel production : 97.5  99.4  90.6  96.2  95.1  98.4  97.1
Car manufacture  : 75.1  75.9  77.1  78.2  79.0  74.8  70.1
Obtain the rank correlation of the production of steel and cars in the
countries.

35. In a certain examination 10 students obtained the following marks
in Mathematics and Physics. Find the rank correlation coefficient.
Student Roll No. :  1  2  3  4  5  6  7  8  9 10
Marks in Math    : 90 30 82 45 32 65 40 88 73 66
Marks in Phy.    : 85 42 75 68 45 63 60 90 62 58

36. In a contest, two judges ranked eight candidates A, B, C, D, E,
F, G and H in order of their preference, as shown in the following table.
Find the rank correlation coefficient:
Candidates : A B C D E F G H
Judge I    : 5 2 8 1 4 6 3 7
Judge II   : 4 5 7 3 2 8 1 6
37. The ranks of ten students in Statistics and Mathematics are as follows:
Statistics : 6 4 9 8 1  2 3 10 5 7
Math.      : 3 5 8 4 7 10 2  1 6 9
Find the rank correlation coefficient. Give your interpretation.

38. Production of Rice and Wheat (in billion metric tons) in 8 countries
C1, C2, C3, C4, C5, C6, C7 and C8 is given as
Countries :  C1  C2  C3  C4  C5  C6  C7  C8
Rice      :  75  88  95  70  60  80  81  50
Wheat     : 120 134 150 115 110 140 142 100
Compute the rank correlation of the production of Rice and Wheat in
the different countries.
39. Two series of values are given as
X 130 132 128 130 127 125
y 115 134 120 130 124 128
Find the Rank correlation coefficient between the two series of values.
40. Ten competitors in a beauty contest are ranked by three judges in
the following order :
Judge I 1 5 4 8 9 6 10 7 3 2
Judge II 4 8 7 6 5 9 10 3 2 1
Judge III 6 7 8 1 5 10 9 2 3 4
Use rank correlation to measure which pair of judges have the nearest
approach to common taste in beauty.
41. Following are the I.Q and Scores of ten students in a class:
Student 1 2 3 4 5 6 7 8 9 10
I.Q 100 100 110 140 150 130 100 120 140 110
Score 35 40 25 55 85 90 65 55 45 50
Find the rank correlation coefficient between I.Q and Scores of
students.

42. Obtain the Rank correlation coefficient from the following two series
of observation :
X : 62 58 68 45 81 60 68 48 50 70
Y : 68 64 75 50 64 80 75 40 55 64
43. The following table gives the monthly family income in thousand
rupees of 11 students and their scores in English. Find the rank correlation
coefficient between monthly family income and scores in English:
Income 40 46 54 60 70 80 82 85 85 90 95
Scores 45 45 50 43 40 75 55 72 65 42 70
44. Compute the correlation coefficient of the following ranks of a
group of students in two examinations. What conclusion do you draw from
the result?
Roll Nos.         : 1 2 3 4 5 6 7 8 9 10
Rank in H.S. Exam : 5 8 6 7 4 2 3 9 10
Rank in JEE Exam  : 2 5 7 6 3 4 8 10 9
45. Ten competitors in a table tennis tournament are ranked by three
judges X, Y, Z in the following order:
Rank by X : 1 6 5 10 3  2 4  9 7 8
Rank by Y : 3 5 8  4 7 10 2  1 6 9
Rank by Z : 6 4 9  8 1  2 3 10 5 7
Discuss which two of these three judges have the nearest approach to
a common assessment.

Answers
2. Positively correlated  3. 0.977  4. 0.5533  5. 0.8961  6. 0.7027
7. $r_{xy} = 0$; there is a relation like $y = x^2$.  8. 0.0096, almost no association.
9. 0.733  10. 0.201, 0.201  11. (i) 13/14, (ii) 0.97  12. -0.9203
14. (i), (ii) [Figures: scatter diagrams]  15. 0.6
16. $0.9 = r_{xy} = r_{xu}$  17. 0.3  20. y = 0.79x + 2.82; x = 0.52y + 23.18; 36
21. y = 0.89x - 0.57, x = 0.82y + 8.6  22. y = -4.5 + 0.98x; predicted y = 10.2
23. Regression line of y on x is y = 3.47x + 21.7, that of x on y is y = 3.642x + 19.98; 70.28
24. y = 0.405x + 76.965, x = 0.32y + 16.28; 97.215, 44.44  25. 7.95, 13
26. The second equation is 'of y on x'; the first is 'of x on y'; $\bar{x} = 1,\ \bar{y} = -1,\ r_{xy} = -\frac{3}{4}$.
27. $-\frac{2}{3}$
28. 3y + 12x = 19 is 'of x on y'; $\bar{x} = \frac{1}{3},\ \bar{y} = 5,\ \mathrm{Var}(y) = \frac{16}{3},\ r_{xy} = -\frac{1}{2\sqrt{3}}$.
29. The 1st is of y on x, the 2nd is of x on y; $\bar{x} = 5,\ \bar{y} = \frac{1}{3},\ r = -\frac{1}{2\sqrt{3}},\ \sigma_x^2 : \sigma_y^2 = 4 : 3$.
30. -0.85, 0.5  31. 0.98  32. (i) 0.952 (ii) 0.952 (iii) 0.929
33. 0.9; if the rank in Physics is higher, then there is a tendency for the
rank in Chemistry to be higher also.
34. -0.357  35. 0.84  36. $\frac{2}{3}$  37. -0.3  38. 0.93  39. 0.33
40. II and III have the nearest approach.
41. 0.47  42. 0.545  43. 0.36  44. 0.64
45. The rank correlations between X and Y, X and Z, Y and Z are respectively
-0.21, 0.64, -0.30; judges X and Z have the nearest approach of
assessment.
[III] Multiple Choice Questions
1. The covariance of x and y is defined by
(a) $\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})\sum_{i=1}^{n}(y_i-\bar{y})$  (b) $\frac{1}{n}\sum_{i=1}^{n}x_i y_i-\bar{x}\,\bar{y}$

(d) none.

2. The correlation coefficient between two variables x and y is
(a) $\frac{\mathrm{cov}(x,y)}{\sigma_x\sigma_y}$  (b) $\frac{\mathrm{cov}(x,y)}{\sigma_x+\sigma_y}$
(c) $\frac{\mathrm{cov}(x,y)}{\sigma_x-\sigma_y}$  (d) 0.  [W.B.U.Tech 2007]

3. When two variables x, y are uncorrelated, then the correlation
coefficient between them is
(a) 0  (b) ±1  (c) 1  (d) -1.

4. The maximum and minimum values of the correlation coefficient are
(a) 1, 0  (b) 2, 1  (c) 0, -1  (d) 1, -1.

5. When y is a linear function of x, then the correlation coefficient
between x and y is
(a) 0  (b) 1  (c) -1  (d) ±1.

6. If two variables x and y are independent, then the correlation coefficient
between them is
(a) 0  (b) 1  (c) ±1  (d) -1.

7. The regression coefficient of y on x is defined by

(d) r«.
8. The regression coefficient of x on y is given by
(b) $\frac{\mathrm{cov}(x,y)}{\sigma_x^2}$  (c) $\frac{\mathrm{cov}(x,y)}{\sigma_y^2}$  (d) none.

9. $b_{yx}\times b_{xy}$, where $b_{xy}$, $b_{yx}$ and r are the regression and correlation
coefficients, is
(a) r  (b) $r^2$  (c) $\frac{1}{r}$  (d) $\frac{1}{r^2}$.  [W.B.U.Tech 2006]

10. For five given bivariate data, we have
$\sum_i x_i = 15,\ \sum_i y_i = -5,\ \sum_i x_i y_i = -10.$
Then the covariance of x and y is
(a) -1  (b) 5  (c) 1  (d) -5.
11. If for some bivariate data cov(x, y) = 32, var(x) = 36, var(y) = 64,
then the correlation coefficient of x, y is


1 (d) 24.
(c) 24
12. If the correlation coefficient $r_{xy} = -\frac{2}{3}$,
u = 2x + 5, v = -2y + 3, then the correlation coefficient of u and v is
2
8
(c) -- (d) 3".
3

1
13. If u == _ x + 2, v == 5x + 1 and r llv
1
== -, then rxy == [c
1
3 5
1 (d) 3.
1 (c) -
(b) - 5
(a) 5 3 then
14. If var(x) = 25, var(y) = 9 and x, y are uncorrelated, then
var(x + y) =
(a) 8  (b) 34  (c) 16  (d) none.

15. If $u = \frac{x-\bar{x}}{\sigma_x}$, $v = \frac{y-\bar{y}}{\sigma_y}$, then $r_{xy}$ =
(b) cov(x, y)  (c) cov(u, v)  (d) none.

16. For given data, we have var(x) = 1, var(y) = $\frac{4}{9}$ and $r_{xy} = \frac{1}{5}$.
Then var(x - y) =
(a) $\frac{13}{9}$  (b) $\frac{53}{45}$  (c) $\frac{77}{45}$  (d) none.

17. If x + 5u = 2, 2y + v = 7 and the correlation coefficient of x and y
is 0.25, then the correlation coefficient of u and v is
(a) 0.25  (b) -0.25  (c) 0.05  (d) 0.50.


18. Data for 7 pairs are summarised below:
$\sum x = 21,\ \sum y = -7$ and $b_{yx} = 3$.
The regression line of y on x is
(a) x + 3y = 10  (b) x - 3y = 6
(c) 3x - y = 10  (d) 3x - y = 70.

19. If two variables x and y are perfectly positively correlated, then in
the scatter diagram every point (x, y) lies on
(a) a straight line whose gradient is positive
(b) a straight line whose gradient is negative
(c) a parabola
(d) none of these.
20. If two variables x and y are perfectly negatively associated, then in
the scatter diagram every point (x, y) lies on
(a) a straight line whose gradient is positive
(b) a straight line whose gradient is negative
(c) a parabola
(d) none of these.

21. If
x : 1  1.5  2
y : 1  1.8  2
are the bivariate data assumed by (x, y), then
(a) x and y are perfectly associated
(b) x and y are positively correlated
(c) x and y are negatively correlated
(d) x and y are uncorrelated.

22. If two variates x and y are negatively correlated, then
(a) y decreases when x increases
(b) y increases when x increases
(c) y has a tendency to increase when x increases
(d) none of these.
23. For the bivariate data
x : 2  4
y : 0  0.1
the covariance of x and y is
(a) $\frac{1}{18}$  (b) $-\frac{1}{18}$  (c) $\frac{1}{20}$  (d) none of these.

24. The correlation coefficient for the bivariate data
x : -15 -12 -9 -3 0 3 9 12 15
y : -14 -11 -8 -6 0 6 8 11 14
is
(a) -1  (b) 1  (c) 2  (d) 0.
25. If the correlation coefficient of x, y is 0.8 and that of u, v is 0.6,
then
(a) the association of x and y is lower than that of u and v.
(b) the association of x and y is higher than that of u and v.
(c) the association of x and y is the same as that of u and v.
(d) none of these.
26. Which one of the following statements is not true for the correlation
coefficient $r_{xy}$ of two variables x and y?
(a) $r_{xy}$ has no unit.
(b) The maximum value of $r_{xy}$ is 1.
(c) $r_{xy}$ must be positive.
(d) The minimum value of $r_{xy}$ is -1.

27. If the bivariate (x, y) has the scatter diagram
[Figure: scatter diagram]
then $r_{xy}$ is
(a) 0  (b) -1
(c) negative  (d) positive.
28. If the two bivariates (x, y) and (u, v) are such that x = -2u + 4
and y = 3v - 6, then
(b) $r_{xy} = -6r_{uv}$
(c) $r_{xy} = 6r_{uv}$  (d) $r_{xy} = -r_{uv}$.
29. If the two bivariates (x, y) and (u, v) are such that u = ax + b and
v = cy + d, then $r_{xy} = r_{uv}$ if
(a) a, b, c, d all have the same sign
(b) b, d have the same sign
(c) a, c have the same sign
(d) b = d = 0.
30. If (x,y) is a bivariate where var(x)=0.2,var(y)=0.1 and
correlation coefficient of (x,y) is 0.01 then variance of x + y is
approximately
(a) 2.300 (b) 0.3 (c) 0.3028 (d) 0.5069.
31. For the bivariate (x, y), the s.d. of x + y is the same as the s.d. of x - y if
(a) x and y are negatively correlated
(b) x and y are positively correlated
(c) x and y are uncorrelated
(d) x and y are perfectly correlated.
32. If var(x) = 0.2, var(y) = 0.09 and the covariance of x, y is 0.1, then
the s.d. of x + y is
(a) 0.3  (b) 0.7  (c) 0.11  (d) 0.1.

33. Two variables x and y are such that $\sigma_x = 1,\ \sigma_y = 2,\ \sigma_{x-y} = \sqrt{2}$; then
cov(x, y) =
(a) $\frac{1}{2}$  (b) $\frac{2}{5}$  (c) 1  (d) $\frac{3}{2}$.
34. Two variables x and y are related as 3x + 4y = 5; then the correlation
coefficient of x and y is
(a) -1  (b) 1  (c) $\frac{1}{2}$  (d) 0.
35. If the two variables x and y are related as $2y^2 + 5x = 9$, then $r_{xy} = 1$.
(a) True  (b) False.
36. The correlation coefficient of a variable with itself is 1.
(a) True  (b) False.
37. If two variables are independent then they are uncorrelated.
(a) True  (b) False.

38. If two variables are uncorrelated then they are independent.
(a) True  (b) False.

39. If two variables x and y are related by $y = x^2$, then
(a) x, y are independent
(b) x, y are positively correlated
(c) x, y are negatively correlated
(d) x, y are uncorrelated.

40. If two variables x and y are related by $y = |x|$, then
(a) x, y are independent
(b) x, y are positively correlated
(c) x, y are negatively correlated
(d) x, y are uncorrelated.
41. The magnitude of the correlation coefficient does not depend on change of
origin and scale.
(a) True  (b) False.

42. If 2u + 4x = 1, 3y - v = 7 and the correlation coefficient of u and v is
0.4, then the correlation coefficient of x, y is
(a) 0.4  (b) -0.4  (c) 0.7  (d) -0.7.
43. If 9y + 4x + 5 = 0 is the regression line of x on y and x + 4y + 3 = 0 is the
regression line of y on x, then predict the value of y when x = 5.
(a) 2  (b) $-\frac{25}{9}$  (c) -2  (d) none of these.
44. (x, y) is a bivariate such that the regression coefficient of y on x is
3, $\sigma_y = 2$, $\sigma_x = 4$; then $r_{xy}$ is
(a) $\frac{3}{2}$  (b) 6  (c) $\frac{1}{2}$  (d) none of these.
45. If the regression coefficient of x on y is -0.5, the correlation coefficient
is -0.1 and the standard deviation of y is 0.002, then the s.d. of x is
(a) 0.01  (b) 0.001
(c) 0.002  (d) none of these.
46. If the regression coefficient of x on y is 2, then find which one of
the following may be the correlation coefficient between x and y:
(a) -0.5  (b) 1.5  (c) 0.5  (d) -0.3.
47. If the regression coefficient of y on x is 2 and the correlation
coefficient between y and x is 0.2, then the regression coefficient of x on y
is
(a) 0.2  (b) 0.002  (c) -0.02  (d) 0.02.

48. The product of the two regression coefficients must be positive.
(a) True  (b) False.
49. The regression line of x on y is 2x - 3y + 5 = 0; then the regression
coefficient of x on y is
(a) $\frac{2}{3}$  (b) $\frac{3}{2}$  (c) 2  (d) 3.
50. Two variables x and y are such that cov(x, y) = 0.32,
$\sigma_x = 0.2$, $\sigma_y = 0.4$; then the regression coefficient of x on y is
(a) 2  (b) 1
(c) 0.8  (d) none of these.

51. For a bivariate, 4y + x + 3 = 0 is the regression equation of x on y and
4x + 9y + 5 = 0 is the regression equation of y on x.
(a) True  (b) False.

52. If x + 4y + 3 = 0 and 4x + 9y + 5 = 0 be the two regression lines,
then the expectation of y is
(a) 1  (b) 2
(c) -1  (d) none of these.
53. If the means of the variates x and y are respectively 5 and 70 and the
regression coefficient of x on y is 0.08, then the regression line of x on y is
(a) y = 12x + 10  (b) x = 8y - 0.6
(c) x = 0.08y - 6  (d) x = 0.08y - 0.6.

54. If for a bivariate (x, y) the regression coefficient of y on x is 3 and
the regression coefficient of x on y is $\frac{1}{2}$, then an angle between the two
regression lines is
(a) $\frac{\pi}{4}$  (b) $\tan^{-1}\frac{1}{7}$
(c) $\tan^{-1} 7$  (d) none of these.
55. The two regression lines become identical. The correlation coefficient is
(a) ±1  (b) 0  (c) only 1  (d) none of these.

56. The rank correlation of the following two series of ranks is
Rank X : 1 2 3 4 5 6
Rank Y : 6 5 4 3 2 1
(a) 0  (b) 1  (c) -1  (d) none of these
57. The rank correlation of the following two series of data
Marks in Math : 98 20 60 75 62
Marks in Phy  : 96 24 57 71 65
is
(a) 0  (b) 1  (c) -1  (d) none of these
58. The rank correlation of the following two series of data
X : 52 52 52 52 52
Y : 62 62 62 62 62
is
(a) 0  (b) 1  (c) -1  (d) none of these
Answers
1.b 2.a 3.a 4.d 5.d 6.a 7.a 8.c
9.b 10.c 11.b 12.b 13.c 14.b 15.c 16.b
17.a 18.c 19.a 20.b 21.b 22.c 23.b 24.d
25.b 26.c 27.d 28.d 29.c 30.c 31.c 32.b
33.d 34.a 35.b 36.a 37.a 38.b 39.d 40.d
41.a 42.b 43.c 44.d 45.a 46.c 47.d 48.a
49.b 50.a 51.b 52.c 53.d 54.b 55.a 56.c
57.b 58.a
MODULE-5
CURVE FITTING BY METHOD OF LEAST SQUARE

11.1 Introduction:
Consider the bivariate (X, Y) assuming the data $(x_i, y_j)$. We consider
a scatter diagram of (X, Y) as follows:

[Figure: scatter diagram of (X, Y) with the curve $y = \phi(x)$]

In the scatter diagram we see there are many dots having co-ordinates
$(x_1, y_j)$ corresponding to $x = x_1$. The ordinates of these dots are the $y_j$
corresponding to $x = x_1$. Let $m_y(x_1)$ be the mean of these $y_j$. The
bold dot on the line $x = x_1$ represents this $m_y(x_1)$, i.e. the ordinate of
this bold dot is $m_y(x_1)$. Similarly the bold dots on the lines
$x = x_2,\ x = x_3,\ x = x_4, \ldots$ are found, which represent the means of the $y_j$
corresponding to $x = x_2,\ x = x_3,\ x = x_4, \ldots$
In the figure the smooth curve $y = \phi(x)$ is drawn in such a
way that, though this curve may not pass through every bold dot,
almost all those dots lie very close to the curve.
In this chapter we are going to obtain such a curve, known as the best fitted
curve to the dots. If we can find such a best fitted curve $y = \phi(x)$, we
can predict a value of y corresponding to a value of x. This type of curve
is known as the best fitted curve of y on x, as we considered different $y_j$
corresponding to a single $x_i$.
Similarly, corresponding to a single $y_j$ we may have different $x_i$, and
consequently corresponding to a single $y_j$ there exists a mean $m_x(y_j)$.
There also a best fitted curve $x = \psi(y)$ may be obtained. This curve is
known as the best fitted curve of x on y.
When the variable y depends on x, it is suitable to get the best
fitted curve of y on x. Similarly, if the variable x depends on y then we
should fit the curve of x on y.
11.2. Least Square Regression Curve
In this section we shall discuss how the best fitted curve stated above
is selected from many curves. This is done by the principle of
least square, which is a broad mathematical principle in the subject of
regression analysis. This principle is stated as follows:
Principle of Least Square: Let (X, Y) be a bivariate and
$y = \phi(x; A, B, \ldots)$ be a family of curves, A, B, ... being the parameters
of the family.
If the deviation of Y from $y = \phi(x; A, B, \ldots)$,
i.e. $E[\{Y - \phi(X; A, B, \ldots)\}^2]$, is minimum for A = a, B = b, ..., then
the curve $y = \phi(x; a, b, \ldots)$ is the best fitted curve of the bivariate values
$(x_i, y_j)$, which is known as the least square regression curve.
The function $\phi(x; a, b, \ldots)$ is called the Least Square Regression
Function of Y on X.
The random variable $U_y = \phi(X; a, b, \ldots)$ is the best representation
of Y by a function of X. The variable $V_y = Y - U_y$ is known as the Residual
of Y.
Example: In a night school of adult education, let (X, Y) be a bivariate
where X = age of students and Y = number of days of absence.
The average number of days of absence is 4 for the students of age
21. That is, for X = 21 the mean of the corresponding values of Y is
$m_y(21) = 4$. The following table gives data of this type:

Age (X)                   : 21 38 42 47 53 61 64
Mean days absent $m_y(X)$ :  4 10 14 17 19 34 38

The scatter diagram of these data is

[Figure: scatter diagram of the seven points with the line y = -17.41 + 0.791x]

The co-ordinates of the above dots are (21, 4), (38, 10), (42, 14), (47,
17), ... and so on. Suppose we desire to draw a straight line which will
fit best to the above dots. Consider the family of straight lines y = A + Bx
where A, B are arbitrary. It can be proved that among this family of lines
the straight line y = -17.41 + 0.791x is such that the deviation
$E[\{Y - (-17.41 + 0.791X)\}^2]$ is minimum. So
y = -17.41 + 0.791x is the best fitted straight line, or the least square
regression line, of the bivariate values $(x_i, y_j)$, and so the function
$\phi(x) = -17.41 + 0.791x$ is the least square regression function of Y on X.
$U_y = -17.41 + 0.791X$ is the best representation of Y.
$V_y = Y - (-17.41 + 0.791X)$ is the residual of Y. From this best fitted
straight line y = -17.41 + 0.791x we can predict the number of days of
absence for a student of 50 years as $-17.41 + 0.791 \times 50 = 22.14 \approx 22$
days.
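The minimising property claimed for this line can be checked numerically. The following is a minimal Python sketch, not part of the original text: the function names `mean_squared_deviation` and `fit_line` are illustrative assumptions, the seven pairs are the (age, mean absence) values of this example taken as equally likely, and full-precision arithmetic gives an intercept that differs from the rounded -17.41 in the second decimal place.

```python
def mean_squared_deviation(xs, ys, a, b):
    """S(a, b) = (1/n) * sum (y_i - a - b*x_i)^2 for equally likely pairs."""
    return sum((y - a - b * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_line(xs, ys):
    """Return (a, b) of the least square line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y
    var_x = sum(x * x for x in xs) / n - mean_x ** 2
    b = cov_xy / var_x          # slope = Cov(x, y) / Var(x)
    a = mean_y - b * mean_x     # the line passes through the means
    return a, b


ages = [21, 38, 42, 47, 53, 61, 64]
mean_absences = [4, 10, 14, 17, 19, 34, 38]

a, b = fit_line(ages, mean_absences)
best = mean_squared_deviation(ages, mean_absences, a, b)

# Moving either parameter away from the fitted value increases S.
worse_a = mean_squared_deviation(ages, mean_absences, a + 0.5, b)
worse_b = mean_squared_deviation(ages, mean_absences, a, b + 0.05)

prediction_at_50 = a + b * 50
print(round(a, 2), round(b, 3), round(prediction_at_50, 1))
```

Perturbing a or b in the other direction raises S as well, since S is a quadratic in the two parameters with its minimum at the fitted pair.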
11.3. Normal Equations
For a bivariate (X, Y), if $y = \phi(x; A, B, C, \ldots)$ be a family of curves,
A, B, C, ... being parameters, then
$$\frac{\partial S}{\partial A} = 0,\quad \frac{\partial S}{\partial B} = 0,\quad \frac{\partial S}{\partial C} = 0,\ \ldots$$
are called the normal equations, where $S = E[\{Y - \phi(X; A, B, C, \ldots)\}^2]$.

Theorem. If A = a, B = b, C = c, ... be the solutions of the normal
equations $\frac{\partial S}{\partial A} = 0,\ \frac{\partial S}{\partial B} = 0$, etc., then $y = \phi(x; a, b, c, \ldots)$ is the least
square regression function and so this will give the best fitted curve to
the given data.
Proof. Detail of the proof is not discussed.
Note. In fact the above development is done for finding the least square
regression curve of Y on X.
For the least square regression curve of X on Y a similar formulation
holds.
Observing the scatter diagram of a bivariate (X, Y) we decide the form
of the curve to be fitted to the data. It may be a straight line or a parabola
or any other curve like an exponential curve etc. In our next discussion we
shall show how to fit these curves.
11.4. Finding best fitted straight lines.
(i) For a bivariate (X, Y), y = a + bx is the best fitted straight line of
Y on X, where $a = \bar{Y} - r_{xy}\frac{\sigma_y}{\sigma_x}\bar{X}$ and $b = r_{xy}\frac{\sigma_y}{\sigma_x}$.
(ii) x = a + by is the best fitted straight line of X on Y,
where $a = \bar{X} - r_{xy}\frac{\sigma_x}{\sigma_y}\bar{Y}$ and $b = r_{xy}\frac{\sigma_x}{\sigma_y}$, where $\bar{X}, \bar{Y}$ are the means of
all values assumed by X and Y respectively.
Proof. Beyond the scope of the book.
The above two formulae can be used for any distribution of (X, Y).
But if Y assumes a unique $y_i$ corresponding to each $x_i$ then the above
formulae can be reduced to a simpler form.
This is shown below.
11.5. Finding best fitted straight line when (X, Y) assumes n pairs
of data $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$
(i) Regression line of Y on X: When (X, Y) assumes the n pairs of
values $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ (with the same probability $\frac{1}{n}$), then the
least square regression line, i.e. the best fitted straight line of y on x, becomes y = a + bx, where
a, b are the solution of the equations
$$\sum y_i = an + b\sum x_i$$
$$\sum x_i y_i = a\sum x_i + b\sum x_i^2$$
which are nothing but the normal equations for the least square
regression line.
Proof. y = A + Bx is the family of straight lines.
$$S = E[\{Y - (A + BX)\}^2] = \frac{1}{n}\sum_i (y_i - A - Bx_i)^2$$
$$\therefore\ \frac{\partial S}{\partial A} = -\frac{2}{n}\sum_i (y_i - A - Bx_i)$$
and
$$\frac{\partial S}{\partial B} = \frac{2}{n}\sum_i (y_i - A - Bx_i)(-x_i) = \frac{2}{n}\sum_i (Ax_i + Bx_i^2 - x_i y_i)$$
∴ the normal equations $\frac{\partial S}{\partial A} = 0$ and $\frac{\partial S}{\partial B} = 0$ become
$$\sum_i (y_i - A - Bx_i) = 0 \quad\text{and}\quad \sum_i (Ax_i + Bx_i^2 - x_i y_i) = 0$$
or,
$$\sum_i y_i - An - B\sum_i x_i = 0 \quad\text{and}\quad A\sum_i x_i + B\sum_i x_i^2 - \sum_i x_i y_i = 0$$
By the previous theorem, if a, b be the solution for A and B from these
two normal equations, then y = a + bx is the least square regression line,
and $\sum y_i = an + b\sum x_i$ and $\sum x_i y_i = a\sum x_i + b\sum x_i^2$.

(ii) Regression line of X on Y: When (X, Y) assumes the n pairs of
values $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, then the least square regression line,
i.e. the best fitted straight line of x on y, becomes x = a + by, where
a, b are the solution of the equations
$$\sum x_i = an + b\sum y_i$$
$$\sum x_i y_i = a\sum y_i + b\sum y_i^2$$
which are known as the normal equations for the least square regression line.
Proof. As above.
Note. Making predictions is an important objective of finding the least
square regression line. When we have to predict Y for a given value of X,
then we have to use the regression line of Y on X, in which Y is the dependent
variable and depends on the variable X. Similarly, when the problem is of
predicting X, given Y, then the regression line of X on Y has to be used.
Example 1. Fit a straight line of the form y = a + bx to the following
data:
x : 2  5  6  8  9
y : 8 14 19 20 31

Predict y when x = 9.6.
Solution. Calculation for fitting the straight line:

x_i   y_i   x_i^2   x_i y_i
 2     8      4       16
 5    14     25       70
 6    19     36      114
 8    20     64      160
 9    31     81      279

$\sum x_i = 30,\ \sum y_i = 92,\ \sum x_i^2 = 210,\ \sum x_i y_i = 639$

Here n = 5.
The normal equations for a and b are
$\sum y_i = a \times 5 + b\sum x_i$
and $\sum x_i y_i = a\sum x_i + b\sum x_i^2$

That is, 92 = 5a + 30b     (1)
639 = 30a + 210b           (2)
Solving (1) and (2) we get a = 1, b = 2.9.
∴ the equation of the best fitted straight line is y = 1 + 2.9x.
When x = 9.6, the predicted value of y = 1 + 2.9 × 9.6 = 28.84.
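The arithmetic of Example 1 can be reproduced by forming the two normal equations from the data and solving them by Cramer's rule. This is a minimal sketch in Python and not part of the original text; the function name `solve_normal_equations` is an illustrative assumption, and exact rational arithmetic is used only so that the result can be compared with a = 1, b = 2.9 without rounding.

```python
from fractions import Fraction


def solve_normal_equations(xs, ys):
    """Solve  sum(y) = a*n + b*sum(x),  sum(xy) = a*sum(x) + b*sum(x^2)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    det = n * sum_xx - sum_x * sum_x              # determinant of the 2x2 system
    a = Fraction(sum_y * sum_xx - sum_x * sum_xy, det)
    b = Fraction(n * sum_xy - sum_x * sum_y, det)
    return a, b


xs = [2, 5, 6, 8, 9]
ys = [8, 14, 19, 20, 31]

a, b = solve_normal_equations(xs, ys)
predicted = a + b * Fraction(48, 5)               # x = 9.6 written as 48/5
print(a, b, float(predicted))
```

The determinant is zero only when all the x-values coincide, in which case no line of y on x can be fitted.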
Example 2. (X, Y) is a discrete bivariate assuming $(x_i, y_j)$. Fit a straight
line of Y on X and of X on Y to the following data.

x_i \ y_j     0        1        2        3
   0       (0, 0)   (0, 1)   (0, 2)   (0, 3)
   1       (1, 0)   (1, 1)   (1, 2)
   2       (2, 0)   (2, 1)

Solution. The scatter diagram of these data is as follows:

[Figure: scatter diagram of the nine points]

Finding the least square line of Y on X:

1st Process: For x = 0, the mean of y is $m_y = \frac{0+1+2+3}{4} = \frac{6}{4} = \frac{3}{2}$;
for x = 1, $m_y = \frac{0+1+2}{3} = 1$;
for x = 2, $m_y = \frac{0+1}{2} = \frac{1}{2}$.
So we construct the following table.
x        :  0    1    2
$m_y(x)$ : 3/2   1   1/2
We shall now fit a straight line to these data:

x_i   m_y(x_i)   x_i^2   x_i m_y(x_i)
 0      3/2        0          0
 1       1         1          1
 2      1/2        4          1
Total:   3    3    5    2
Here n = 3.
The normal equations are
$\sum m_y(x_i) = an + b\sum x_i$
and $\sum m_y(x_i)\,x_i = a\sum x_i + b\sum x_i^2$

That is, 3 = 3a + 3b, i.e. a + b = 1
and 2 = 3a + 5b, i.e. 3a + 5b = 2.

Solving we get $a = \frac{3}{2},\ b = -\frac{1}{2}$.
∴ the required best fitted straight line is $y = \frac{3}{2} - \frac{1}{2}x$.
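2nd Process
From a theorem we get the best fitted straight line as
y = a + bx, where
$$a = \bar{y} - r_{xy}\frac{\sigma_y}{\sigma_x}\bar{x} = \bar{y} - \frac{\mathrm{Cov}(x,y)}{\sigma_x\sigma_y}\cdot\frac{\sigma_y}{\sigma_x}\bar{x}$$
or, $a = \bar{y} - \frac{\mathrm{Cov}(x,y)}{\sigma_x^2}\bar{x}$, and $b = r_{xy}\frac{\sigma_y}{\sigma_x} = \frac{\mathrm{Cov}(x,y)}{\sigma_x^2}$.

Calculation of a and b

x_i   y_i   x_i^2   y_i^2   x_i y_i
 0     0     0       0        0
 0     1     0       1        0
 0     2     0       4        0
 0     3     0       9        0
 1     0     1       0        0
 1     1     1       1        1
 1     2     1       4        2
 2     0     4       0        0
 2     1     4       1        2

$\sum x = 7,\ \sum y = 10,\ \sum x^2 = 11,\ \sum y^2 = 20,\ \sum xy = 5$

$$\mathrm{Cov}(x,y) = \frac{1}{n}\sum xy - \left(\frac{1}{n}\sum x\right)\left(\frac{1}{n}\sum y\right) = \frac{1}{9}\times 5 - \left(\frac{1}{9}\times 7\right)\left(\frac{1}{9}\times 10\right) = -\frac{25}{81}$$
$$\sigma_x^2 = \frac{1}{n}\sum x^2 - \left(\frac{1}{n}\sum x\right)^2 = \frac{1}{9}\times 11 - \left(\frac{1}{9}\times 7\right)^2 = \frac{50}{81}$$
$$\bar{x} = \frac{1}{n}\sum x = \frac{1}{9}\times 7 = \frac{7}{9},\quad \bar{y} = \frac{1}{n}\sum y = \frac{1}{9}\times 10 = \frac{10}{9}$$
$$\therefore\ a = \frac{10}{9} + \frac{25/81}{50/81}\times\frac{7}{9} = \frac{10}{9} + \frac{1}{2}\times\frac{7}{9} = \frac{3}{2}$$
and
$$b = \frac{-25/81}{50/81} = -\frac{1}{2}$$
∴ the best fitted line of y on x is y = a + bx, i.e. $y = \frac{3}{2} - \frac{1}{2}x$.

Finding the least square regression line of x on y:

From a theorem we get the best fitted straight line of x on y as
x = a + by,
where
$$a = \bar{x} - r_{xy}\frac{\sigma_x}{\sigma_y}\bar{y} = \bar{x} - \frac{\mathrm{Cov}(x,y)}{\sigma_x\sigma_y}\cdot\frac{\sigma_x}{\sigma_y}\bar{y} = \bar{x} - \frac{\mathrm{Cov}(x,y)}{\sigma_y^2}\bar{y}$$
and
$$b = r_{xy}\frac{\sigma_x}{\sigma_y} = \frac{\mathrm{Cov}(x,y)}{\sigma_x\sigma_y}\cdot\frac{\sigma_x}{\sigma_y} = \frac{\mathrm{Cov}(x,y)}{\sigma_y^2}$$
Now
$$\sigma_y^2 = \frac{1}{n}\sum y^2 - \left(\frac{1}{n}\sum y\right)^2 = \frac{1}{9}\times 20 - \left(\frac{1}{9}\times 10\right)^2 = \frac{80}{81}$$
$$\therefore\ a = \frac{7}{9} + \frac{25/81}{80/81}\cdot\frac{10}{9} = \frac{7}{9} + \frac{25}{80}\cdot\frac{10}{9} = \frac{9}{8} \quad\text{and}\quad b = \frac{-25/81}{80/81} = -\frac{5}{16}$$
∴ the best fitted line of x on y is
x = a + by or, $x = \frac{9}{8} - \frac{5}{16}y$.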
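Both regression lines of Example 2 can be checked from the nine points with the covariance formulae of the second process. The sketch below is a hedged Python illustration, not part of the original text; the function name `regression_lines` is an assumption, each point is taken with probability 1/9, and exact fractions are used so the output can be compared directly with 3/2, -1/2, 9/8 and -5/16.

```python
from fractions import Fraction


def regression_lines(pairs):
    """Return ((a, b) of y = a + b*x, (a, b) of x = a + b*y) for equally likely pairs."""
    n = len(pairs)
    mean_x = Fraction(sum(x for x, _ in pairs), n)
    mean_y = Fraction(sum(y for _, y in pairs), n)
    cov = Fraction(sum(x * y for x, y in pairs), n) - mean_x * mean_y
    var_x = Fraction(sum(x * x for x, _ in pairs), n) - mean_x ** 2
    var_y = Fraction(sum(y * y for _, y in pairs), n) - mean_y ** 2
    b_yx = cov / var_x                      # regression coefficient of y on x
    b_xy = cov / var_y                      # regression coefficient of x on y
    y_on_x = (mean_y - b_yx * mean_x, b_yx)
    x_on_y = (mean_x - b_xy * mean_y, b_xy)
    return y_on_x, x_on_y


points = [(0, 0), (0, 1), (0, 2), (0, 3),
          (1, 0), (1, 1), (1, 2),
          (2, 0), (2, 1)]

y_on_x, x_on_y = regression_lines(points)
print(y_on_x, x_on_y)
```

The product of the two regression coefficients, (-1/2)(-5/16) = 5/32, is the square of the correlation coefficient of these data.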

11.6. Finding best fitted Second Degree Parabola when (X, Y)
assumes n pairs of data $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$.
In this case the least square second degree parabolic curve, i.e. the best
fitted second degree parabola, will be
$$y = a + bx + cx^2$$
where a, b, c are the solution of the equations
$$an + b\sum x_i + c\sum x_i^2 = \sum y_i$$
$$a\sum x_i + b\sum x_i^2 + c\sum x_i^3 = \sum x_i y_i$$
$$a\sum x_i^2 + b\sum x_i^3 + c\sum x_i^4 = \sum x_i^2 y_i$$
which are the normal equations for least square parabolic curve fitting.

Proof. Omitted. In fact the proof is similar to that for straight line fitting.

Example. Fit a second degree parabola, $y = a + bx + cx^2$, to the following
data:
x : 0  1   2   3   4
y : 1  5  10  22  38
Predict y when x = 4.8.
Solution. Calculation for fitting the parabola

x_i   y_i   x_i^2   x_i^3   x_i^4   x_i y_i   x_i^2 y_i
 0     1      0       0       0        0          0
 1     5      1       1       1        5          5
 2    10      4       8      16       20         40
 3    22      9      27      81       66        198
 4    38     16      64     256      152        608

$\sum x_i = 10,\ \sum y_i = 76,\ \sum x_i^2 = 30,\ \sum x_i^3 = 100,\ \sum x_i^4 = 354,\ \sum x_i y_i = 243,\ \sum x_i^2 y_i = 851$

Here n = number of pairs $(x_i, y_i)$ = 5.
If $y = a + bx + cx^2$ be the second degree parabola, then a, b, c are given
by the equations (called normal equations):
$$an + b\sum x_i + c\sum x_i^2 = \sum y_i$$
$$a\sum x_i + b\sum x_i^2 + c\sum x_i^3 = \sum x_i y_i$$
$$a\sum x_i^2 + b\sum x_i^3 + c\sum x_i^4 = \sum x_i^2 y_i$$
That is, 5a + 10b + 30c = 76
10a + 30b + 100c = 243
30a + 100b + 354c = 851
Solving these three equations we get
$a = \frac{10}{7},\ b = \frac{17}{70},\ c = \frac{31}{14}$
∴ the best fitted parabola is $y = \frac{10}{7} + \frac{17}{70}x + \frac{31}{14}x^2$.
For x = 4.8, the predicted value of y
$= \frac{10}{7} + \frac{17}{70}\times 4.8 + \frac{31}{14}\times(4.8)^2 = 53.61$
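The three normal equations of the parabola example can be set up from the power sums and solved mechanically. The following Python sketch is an illustration under stated assumptions, not the book's method of solution: `solve_linear` is a hypothetical helper doing exact Gauss-Jordan elimination with fractions, and `fit_parabola` is an illustrative name.

```python
from fractions import Fraction


def solve_linear(matrix, rhs):
    """Solve a small square linear system exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(r)]
           for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Bring a row with a non-zero entry in this column into pivot position.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def fit_parabola(xs, ys):
    """Return (a, b, c) of the least square parabola y = a + b*x + c*x^2."""
    s = [sum(x ** k for x in xs) for k in range(5)]                   # s[k] = sum x^k, s[0] = n
    t = [sum((x ** k) * y for x, y in zip(xs, ys)) for k in range(3)]  # t[k] = sum x^k * y
    matrix = [[s[0], s[1], s[2]],
              [s[1], s[2], s[3]],
              [s[2], s[3], s[4]]]
    return solve_linear(matrix, t)


a, b, c = fit_parabola([0, 1, 2, 3, 4], [1, 5, 10, 22, 38])
x_new = Fraction(24, 5)                                                # x = 4.8
predicted = a + b * x_new + c * x_new ** 2
print(a, b, c, round(float(predicted), 2))
```

The same helper would serve for a cubic fit by extending the power sums, at the cost of a 4 by 4 system.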
11.7. Finding best fitted Exponential Curve when (X, Y) assumes
n pairs of data $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$
In this case the required best fitted exponential curve is
$y = ab^x,\ a > 0,\ b > 0$.
If we take log on both sides we get
log y = log a + x log b
or, Y = A + Bx, where Y = log y, A = log a and B = log b.
This is a straight line in the x - Y frame. We find the best fitted straight line
of this type in the x - Y frame. Then taking antilog of A and B we find a, b
for the exponential curve.

Example. Fit an exponential curve of the form $y = ab^x$ to the following
data.
x :   2    3    4    5    6    7
y : 640  512  410  328  262  210
Taking '$\log_{10}$' on both sides of $y = ab^x$ we get
$\log_{10} y = \log_{10} a + x\log_{10} b$ or, Y = A + Bx     (1)
where A = log a and B = log b.

We shall find A and B for the best fitted straight line like (1) in the x - Y
frame for the given data.
Calculation of A, B

x_i   y_i   Y_i = log10 y_i   x_i^2   x_i Y_i
 2    640       2.8062           4      5.6124
 3    512       2.7093           9      8.1279
 4    410       2.6128          16     10.4512
 5    328       2.5159          25     12.5795
 6    262       2.4183          36     14.5098
 7    210       2.3222          49     16.2554

$\sum x_i = 27,\ \sum Y_i = 15.3847,\ \sum x_i^2 = 139,\ \sum x_i Y_i = 67.5362$

Here n = number of pair-data = 6.
The normal equations for A and B are
$\sum Y_i = A \times n + B\sum x_i$
and $\sum x_i Y_i = A\sum x_i + B\sum x_i^2$

That is, 15.3847 = 6A + 27B
and 67.5362 = 27A + 139B
Solving these two equations we get
A = 3.0, B = -0.0968
∴ log a = 3.0, ∴ a = antilog (3.0) = 1000
and log b = -0.0968, ∴ b = antilog (-0.0968) = 0.8002
∴ the best fitted exponential curve is
$y = ab^x$ or, $y = 1000(0.8002)^x$.
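The log-linearisation used for the exponential curve can be sketched in a few lines of Python. This is a hedged illustration rather than the book's table-based procedure: `fit_exponential` is an illustrative name, and because `math.log10` works at full precision instead of four-figure logarithms, the fitted values agree with a = 1000 and b = 0.8002 only to about three significant figures.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * b**x by a least square line Y = A + B*x on Y = log10(y)."""
    logs = [math.log10(y) for y in ys]
    n = len(xs)
    sum_x, sum_log = sum(xs), sum(logs)
    sum_xx = sum(x * x for x in xs)
    sum_xlog = sum(x * ly for x, ly in zip(xs, logs))
    det = n * sum_xx - sum_x * sum_x
    big_a = (sum_log * sum_xx - sum_x * sum_xlog) / det   # A = log10(a)
    big_b = (n * sum_xlog - sum_x * sum_log) / det        # B = log10(b)
    return 10 ** big_a, 10 ** big_b                       # antilogs give a and b


xs = [2, 3, 4, 5, 6, 7]
ys = [640, 512, 410, 328, 262, 210]

a, b = fit_exponential(xs, ys)
print(round(a, 1), round(b, 4))
```

Note that this minimises the squared deviations of log y, not of y itself, which is exactly what the transformation in the text does.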
11.8. Finding best fitted Geometric Curve when (X, Y) assumes
n pair of data (Xpyl)'(X2'y2) (x ,y,,).
n

In this case the required best fitted geomentric curve is y = ax"


If we take log on both side we get
logy = loga+ blogx

or y = A + bX where Y = logy, A = loga and X = logx


This is a striaght line in X - Y frame. We find best fitted straight line
of this type in X - y frame. Then taking antilog on A and X we find a, b
for the geometric curve.

Example. Fit a geometric curve of the form $y = ax^b$ to the following data:
x    1     2     3     4     5
y    5.0   6.3   7.2   7.9   8.5
Taking $\log_{10}$ on both sides of $y = ax^b$ we get
$\log_{10} y = \log_{10} a + b \log_{10} x$
or, $Y = A + bX$   (1)
where $Y = \log_{10} y$, $A = \log_{10} a$ and $X = \log_{10} x$.
We shall find A, b for the best fitted straight line like (1) in the X - Y
frame for the given data.

Calculation of A, b

$x_i$    $y_i$    $Y_i = \log_{10} y_i$    $X_i = \log_{10} x_i$    $X_i^2$    $X_i Y_i$
1        5.0      0.699                    0                        0          0
2        6.3      0.799                    0.301                    0.091      0.240
3        7.2      0.857                    0.477                    0.228      0.409
4        7.9      0.898                    0.602                    0.362      0.540
5        8.5      0.930                    0.699                    0.489      0.650
Total             4.183                    2.079                    1.17       1.839

Here n = number of pairs of data = 5.


The normal equations for A and b are

$\sum Y_i = An + b\sum X_i$

and $\sum X_i Y_i = A\sum X_i + b\sum X_i^2$

That is, $5A + 2.079b = 4.183$
and $2.079A + 1.17b = 1.839$
Solving these two equations we get
$A = 0.7009$, $b = 0.3263$

∴ $\log a = 0.7009$ ∴ $a = \text{antilog}(0.7009) = 5.022$

∴ the best fitted geometric curve is

$y = ax^b$ or, $y = (5.022)\,x^{0.3263}$
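
The same steps can be sketched in Python as a check; this is an illustration only, and the name `fit_power` is an assumption rather than anything from the text. Both variables are transformed by log10 before the normal equations are solved.

```python
import math

def fit_power(xs, ys):
    """Fit y = a * x**b by least squares on the pairs (log10 x, log10 y)."""
    n = len(xs)
    X = [math.log10(x) for x in xs]
    Y = [math.log10(y) for y in ys]
    sX, sY = sum(X), sum(Y)
    sXX = sum(u * u for u in X)
    sXY = sum(u * v for u, v in zip(X, Y))
    # Normal equations: sY = n*A + b*sX  and  sXY = A*sX + b*sXX
    b = (n * sXY - sX * sY) / (n * sXX - sX * sX)
    A = (sY - b * sX) / n
    return 10 ** A, b                          # only A needs an antilog

a, b = fit_power([1, 2, 3, 4, 5], [5.0, 6.3, 7.2, 7.9, 8.5])
print(round(a, 3), round(b, 4))
```

With unrounded logarithms the coefficients come out close to, but not identical with, the values obtained by hand from the three-figure table, since the system is sensitive to rounding in the sums.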


Illustrative Examples
Example 1. The data below show the lengths (y) in cm attained by a
coiled spring corresponding to various weights (x) in gm. Fit a straight
line of the form y = a + bx. Hence predict the length of the coiled spring when
a weight of 698 gm is loaded.
x (gm)    100    200    300    400    500    600
y (cm)    90.2   92.3   94.2   96.3   98.2   100.3

Solution. The figures for x and y are large. We reduce them by a change of
origin and scale, as shown in the following table:

x      y       x - 350    v = y - 90    u = (x - 350)/50    $u^2$    uv
100    90.2    -250       0.2           -5                  25       -1
200    92.3    -150       2.3           -3                  9        -6.9
300    94.2    -50        4.2           -1                  1        -4.2
400    96.3    50         6.3           1                   1        6.3
500    98.2    150        8.2           3                   9        24.6
600    100.3   250        10.3          5                   25       51.5
Total                     $\sum v = 31.5$    $\sum u = 0$    $\sum u^2 = 70$    $\sum uv = 70.3$

Here n = 6.
The normal equations involving u and v are
$\sum v_i = an + b\sum u_i$

and $\sum u_i v_i = a\sum u_i + b\sum u_i^2$

That is, $31.5 = 6a + 0b$
and $70.3 = 0a + 70b$
These give $a = \frac{31.5}{6} = 5.25$ and $b = \frac{70.3}{70} = 1.004$
∴ the best fitted straight line on the (u, v) data is
$v = 5.25 + 1.004u$
∴ the required straight line on the (x, y) data becomes
$y - 90 = 5.25 + 1.004 \cdot \frac{x - 350}{50}$
or, $y - 90 = \frac{1.004x - 88.9}{50}$
or, $50y - 4500 = 1.004x - 88.9$
or, $50y = 1.004x + 4411.1$
or, $y = 0.0201x + 88.22$.
∴ when x = 698, the predicted length is $y = 0.0201 \times 698 + 88.22 \approx 102.2$ cm.
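
The change of origin and scale used in this solution can be reproduced with a short Python sketch. It is an illustration under stated assumptions: `fit_line` is a hypothetical helper name, and the length at x = 300 is taken as 94.2, as in the solution table.

```python
def fit_line(xs, ys):
    """Least square line ys = a + b*xs via the two normal equations."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a = (sy - b * sx) / n
    return a, b

x = [100, 200, 300, 400, 500, 600]
y = [90.2, 92.3, 94.2, 96.3, 98.2, 100.3]    # 94.2 as in the solution table

# Coded variables: u = (x - 350)/50, v = y - 90
u = [(xi - 350) / 50 for xi in x]
v = [yi - 90 for yi in y]
a_uv, b_uv = fit_line(u, v)

# Transform back: y - 90 = a_uv + b_uv*(x - 350)/50
slope = b_uv / 50
intercept = 90 + a_uv - b_uv * 350 / 50
predicted = intercept + slope * 698           # length for a 698 gm weight
print(round(intercept, 2), round(slope, 4), round(predicted, 1))
```

Fitting the uncoded (x, y) data directly with `fit_line(x, y)` gives the same line; the coding only keeps the hand arithmetic small.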

Example 2. An analyst for a company was studying the relationship
between travel expenses in rupees (Y) for 102 sales trips and the duration
in days (X) of these trips. He found that the relationship between Y and X is
linear. A summary of the data is given below:
$\sum X = 510$, $\sum X^2 = 4{,}150$, $\sum Y = 7{,}140$, $\sum Y^2 = 7{,}40{,}200$

and $\sum XY = 54{,}900$

(i) Obtain the two least square regression lines from the above
information.
(ii) A given trip has to take seven days. How much money should a
salesman be allowed so that he will not run short of money?
Solution. Here $\bar{X} = \frac{1}{n}\sum X = \frac{1}{102} \times 510 = 5$
and $\bar{Y} = \frac{1}{n}\sum Y = \frac{1}{102} \times 7140 = 70$.
(i) The regression line of Y on X is
$y = a + bx$
where $\sum Y = na + b\sum X$ and $\sum XY = a\sum X + b\sum X^2$.

That is, $102a + 510b = 7140$
and $510a + 4150b = 54900$
Solving these two normal equations we get a = 10, b = 12
∴ the regression line of Y on X is
$y = 10 + 12x$   (1)
The regression line of X on Y is $x = a + by$
where $\sum X = an + b\sum Y$ and $\sum XY = a\sum Y + b\sum Y^2$.

That is, $510 = 102a + 7140b$
and $54900 = 7140a + 740200b$
Solving these normal equations we get
$a = -\frac{355}{601} = -0.59$
and $b = \frac{48}{601} = 0.080$
∴ the regression line of X on Y is
$x = 0.080y - 0.59$
(ii) Y depends on X. We put x = 7 in the regression line (1) of Y on X.
We get $y = 10 + 12 \times 7 = 94$
∴ the salesman should be allowed at least Rs 94.
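
When only the summary sums are available, as in this example, both pairs of normal equations can be solved exactly. A minimal Python sketch follows; the helper name `solve_normal` is an assumption made for illustration.

```python
from fractions import Fraction

def solve_normal(n, s1, s2, t1, t12):
    """Solve t1 = n*a + b*s1 and t12 = a*s1 + b*s2 exactly for (a, b)."""
    det = n * s2 - s1 * s1
    b = Fraction(n * t12 - s1 * t1, det)
    a = (Fraction(t1) - b * s1) / n
    return a, b

n, SX, SXX, SY, SYY, SXY = 102, 510, 4150, 7140, 740200, 54900

a_yx, b_yx = solve_normal(n, SX, SXX, SY, SXY)   # line of Y on X: y = a + b x
a_xy, b_xy = solve_normal(n, SY, SYY, SX, SXY)   # line of X on Y: x = a + b y

print(a_yx, b_yx)            # coefficients of Y on X
print(a_xy, b_xy)            # coefficients of X on Y, as exact fractions
print(a_yx + 7 * b_yx)       # allowance for a seven-day trip
```

Exact fractions are used so that the values -355/601 and 48/601 appear without rounding.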
Example 3. A ball is drawn at random from a box containing 3 blue balls
numbered 0, 1, 2; two white balls numbered 0, 1 and one black ball
numbered 0. If the colours blue, white and black are numbered 0, 1 and 2
respectively, find (a) the least square regression lines of Y on X and of X
on Y, (b) fit a second order parabola.
Solution.
X = the colour number
Y = the number on the ball
Then the possible values of (X, Y) are

                Y = 0    Y = 1    Y = 2
(blue)  X = 0   (0,0)    (0,1)    (0,2)
(white) X = 1   (1,0)    (1,1)
(black) X = 2   (2,0)

(a) Calculation of Cov(x, y), $\sigma_x^2$, $\sigma_y^2$

x    y    $x^2$    $y^2$    xy
0    0    0        0        0
0    1    0        1        0
0    2    0        4        0
1    0    1        0        0
1    1    1        1        1
2    0    4        0        0
Total: $\sum x = 4$, $\sum y = 4$, $\sum x^2 = 6$, $\sum y^2 = 6$, $\sum xy = 1$

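Here n = 6.

Now, $\mathrm{Cov}(x, y) = \frac{1}{n}\sum xy - \frac{1}{n}\sum x \cdot \frac{1}{n}\sum y = \frac{1}{6} \times 1 - \frac{1}{6} \times 4 \cdot \frac{1}{6} \times 4 = \frac{1}{6} - \frac{16}{36} = -\frac{5}{18}$

$\sigma_x^2 = \frac{1}{n}\sum x^2 - \left(\frac{1}{n}\sum x\right)^2 = \frac{1}{6} \times 6 - \left(\frac{1}{6} \times 4\right)^2 = \frac{5}{9}$

$\sigma_y^2 = \frac{1}{n}\sum y^2 - \left(\frac{1}{n}\sum y\right)^2 = \frac{1}{6} \times 6 - \left(\frac{1}{6} \times 4\right)^2 = \frac{5}{9}$

$\bar{x} = \frac{1}{n}\sum x = \frac{1}{6} \times 4 = \frac{2}{3}$
$\bar{y} = \frac{1}{n}\sum y = \frac{1}{6} \times 4 = \frac{2}{3}$
(i) The least square regression line of Y on X is
$y = a + bx$ where $a = \bar{y} - r_{xy}\frac{\sigma_y}{\sigma_x}\bar{x} = \bar{y} - \frac{\mathrm{Cov}(x, y)}{\sigma_x^2}\bar{x}$

and $b = r_{xy}\frac{\sigma_y}{\sigma_x} = \frac{\mathrm{Cov}(x, y)}{\sigma_x^2}$

∴ $a = \frac{2}{3} + \frac{5/18}{5/9} \times \frac{2}{3} = \frac{2}{3} + \frac{1}{3} = 1$, $b = -\frac{5/18}{5/9} = -\frac{1}{2}$
∴ the regression line of Y on X is $y = 1 - \frac{1}{2}x$
(ii) The least square regression line of X on Y is $x = a + by$
where $a = \bar{x} - r_{xy}\frac{\sigma_x}{\sigma_y}\bar{y} = \bar{x} - \frac{\mathrm{Cov}(x, y)}{\sigma_y^2}\bar{y} = \frac{2}{3} + \frac{5/18}{5/9} \times \frac{2}{3} = \frac{2}{3} + \frac{1}{3} = 1$
and $b = r_{xy}\frac{\sigma_x}{\sigma_y} = \frac{\mathrm{Cov}(x, y)}{\sigma_y^2} = -\frac{1}{2}$

So the least square regression line of X on Y is $x = 1 - \frac{1}{2}y$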
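
The covariance route to the two regression lines can be verified with exact arithmetic. The sketch below is an illustration only; the variable names are assumptions, and the six points are the (colour number, ball number) pairs of this example.

```python
from fractions import Fraction

pts = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (2, 0)]   # (x, y) pairs
n = len(pts)

mean_x = Fraction(sum(x for x, _ in pts), n)
mean_y = Fraction(sum(y for _, y in pts), n)
cov = Fraction(sum(x * y for x, y in pts), n) - mean_x * mean_y
var_x = Fraction(sum(x * x for x, _ in pts), n) - mean_x ** 2
var_y = Fraction(sum(y * y for _, y in pts), n) - mean_y ** 2

# Y on X: y = a_yx + b_yx * x, with b_yx = Cov / var_x
b_yx = cov / var_x
a_yx = mean_y - b_yx * mean_x

# X on Y: x = a_xy + b_xy * y, with b_xy = Cov / var_y
b_xy = cov / var_y
a_xy = mean_x - b_xy * mean_y

print(cov, var_x, var_y)
print(a_yx, b_yx, a_xy, b_xy)
```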
(b) Calculation for fitting the parabola

x    y    $x^2$    $x^3$    $x^4$    xy    $x^2 y$
0    0    0        0        0        0     0
0    1    0        0        0        0     0
0    2    0        0        0        0     0
1    0    1        1        1        0     0
1    1    1        1        1        1     1
2    0    4        8        16       0     0
Total: 4    4    6    10    18    1    1

If the best fitting parabola is $y = a + bx + cx^2$
then a, b, c are given by

$an + b\sum x + c\sum x^2 = \sum y$

$a\sum x + b\sum x^2 + c\sum x^3 = \sum xy$

$a\sum x^2 + b\sum x^3 + c\sum x^4 = \sum x^2 y$

That is, $6a + 4b + 6c = 4$

$4a + 6b + 10c = 1$

$6a + 10b + 18c = 1$

Solving these three equations we get

$a = 1$, $b = -\frac{1}{2}$ and $c = 0$

∴ the best fitted parabola is $y = 1 - \frac{1}{2}x + 0x^2$
or, $y = 1 - \frac{1}{2}x$
Note that the parabola reduces to a straight line.
Example 4. Fit a second degree parabola to the following data, taking
x as the independent variable, by the method of least squares:
x    0    1     2     3     4
y    1    1.8   1.3   2.5   6.3
Find the difference between the actual value of y and the value of
y obtained from the fitted curve when x = 2.

x    y     u = x - 3   y - 1.3   v = 10(y - 1.3)   $u^2$   $u^3$   $u^4$   uv     $u^2 v$
0    1     -3          -0.3      -3                9       -27     81      9      -27
1    1.8   -2          0.5       5                 4       -8      16      -10    20
2    1.3   -1          0         0                 1       -1      1       0      0
3    2.5   0           1.2       12                0       0       0       0      0
4    6.3   1           5         50                1       1       1       50     50
Total: $\sum u = -5$, $\sum v = 64$, $\sum u^2 = 15$, $\sum u^3 = -35$, $\sum u^4 = 99$, $\sum uv = 49$, $\sum u^2 v = 43$

Here n = number of bivariate data = 5.


If $v = a + bu + cu^2$ be the second degree parabola then a, b, c are
given by the normal equations:

$an + b\sum u + c\sum u^2 = \sum v$

$a\sum u + b\sum u^2 + c\sum u^3 = \sum uv$

$a\sum u^2 + b\sum u^3 + c\sum u^4 = \sum u^2 v$
That is, $5a - 5b + 15c = 64$
$-5a + 15b - 35c = 49$
$15a - 35b + 99c = 43$
Solving these three equations we get

$a = \frac{93}{5} = 18.6$, $b = \frac{223}{10} = 22.3$, $c = \frac{11}{2} = 5.5$

∴ the parabola on the (u, v) data is

$v = 18.6 + 22.3u + 5.5u^2$

or, $10(y - 1.3) = 18.6 + 22.3(x - 3) + 5.5(x - 3)^2$

or, $10y - 13 = 18.6 + 22.3x - 22.3 \times 3 + 5.5(x^2 - 6x + 9)$

or, $10y = 14.2 - 10.7x + 5.5x^2$

or, $y = 1.42 - 1.07x + 0.55x^2$

2nd part: For x = 2, the value of y obtained from the fitted curve

$= 1.42 - 1.07 \times 2 + 0.55 \times 2^2 = 1.48$

For x = 2 the actual value of y = 1.3

∴ the required difference = 1.48 - 1.3 = 0.18.
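
The three normal equations for a parabola can also be formed and solved directly in the original x, y variables. The following Python sketch is an illustration, not the book's method of hand computation; `solve3` and `fit_parabola` are hypothetical helper names, and exact fractions are used so that the result can be compared with the coefficients obtained above.

```python
from fractions import Fraction

def solve3(M, r):
    """Gauss-Jordan elimination for a 3x3 system with exact arithmetic."""
    A = [[Fraction(v) for v in row] + [Fraction(t)] for row, t in zip(M, r)]
    for i in range(3):
        p = next(k for k in range(i, 3) if A[k][i] != 0)   # pivot row
        A[i], A[p] = A[p], A[i]
        piv = A[i][i]
        A[i] = [v / piv for v in A[i]]
        for k in range(3):
            if k != i:
                f = A[k][i]
                A[k] = [vk - f * vi for vk, vi in zip(A[k], A[i])]
    return [A[i][3] for i in range(3)]

def fit_parabola(xs, ys):
    """Return (a, b, c) of the least square parabola y = a + b x + c x^2."""
    n = len(xs)
    S = lambda p: sum(x ** p for x in xs)                       # sum of x^p
    T = lambda p: sum((x ** p) * y for x, y in zip(xs, ys))     # sum of x^p * y
    M = [[n, S(1), S(2)], [S(1), S(2), S(3)], [S(2), S(3), S(4)]]
    return solve3(M, [T(0), T(1), T(2)])

xs = [0, 1, 2, 3, 4]
ys = [Fraction(s) for s in ("1", "1.8", "1.3", "2.5", "6.3")]
a, b, c = fit_parabola(xs, ys)
difference = (a + b * 2 + c * 4) - ys[2]      # fitted minus actual at x = 2
print(float(a), float(b), float(c), float(difference))
```

Since a change of origin and scale does not alter the least square fit, the direct solution agrees with the one found through the coded variables u and v.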

Exercise 11
1. Fit a straight line to the following data by the method of least squares:
x    0    1     2     3      4
y    1    1·8   3·3   -1·5   6·3
2. The following table gives the age of cars of a certain make and
annual maintenance costs. Find the least square regression line for costs
related to age:
Age of Cars 2 4 6 8
(in years)
Maintenance 10 20 25 30
cost (in hundreds of Rs.)

3. Find the equation of the best fitted straight line for the following
bi-variate data :
x 10 12 13 16 17 20 25
y 19 22 24 27 29 33 37
4. Find the equation of the best fitted straight line for the following
observations :

x 4 6 8 12 15 17 22
y 46 42 40 36 30 25 19
5. Find the least square regression line for the following data:
(i)   x    70    80    90    100   118   120
      y    553   547   539   533   527   520
(ii)  x    5     7     9     11    13    15
      y    1·7   2·4   2·8   3·4   3·7   4·4


6. Apply the principle of least square to fit a straight line y = a + bx to
the following data :

X 2 4 6 8 10 12 14

Y 10 14 15 16 15 17 18
7. The following data relate to the results of a fertiliser experiment on crop
yields:
Units of fertiliser used (x):   0     2     4     6     8     10
Units of yield (y):             110   113   118   119   120   118
Fit a straight line to the above data and estimate the amounts of yield
when the units of fertiliser used are 3 and 7 respectively.

8. In the following table, S is the weight of potassium bromide which
will dissolve in 100 grams of water at T °C. Fit an equation of the form
S = mT + b by the method of least squares. Use this relation to estimate S
when T = 50°.
T    0    20   40   60   80
S    54   65   75   85   96
9. Obtain the equations of the two least square line of regression for
the following observation :
X 1 2 3 4 5 6 7 8 9
Y 9 8 10 12 11 13 14 16 15.
10. From the following data, obtain the two least square regression lines:
Sales:      91   97   108   121   67   124   51   73   111   57
Purchase:   71   75   69    97    70   91    39   61   80    47
Hence find the correlation coefficient between sales and purchase.

[Hint. 2nd part: $b_{yx} = 0.61$, $b_{xy} = 1.36$, ∴ $r = +\sqrt{1.36 \times 0.61}$]

11. In trying to evaluate the effectiveness of its advertising campaign,
a firm compiled the following information:
Year:                        1973   1974   1975   1976   1977   1978   1979   1980
Adv. Expense (in '000 Rs):   12     15     15     23     24     38     42     48
Sale (Lakh Rs):              5·0    5·6    5·8    7·0    7·2    8·8    9·2    9·5
Find the regression line of sales on advertising expense. Evaluate the
probable sales when the advertisement expense is Rs 60,000.
12. The following data show the evaporation coefficient of burning
fuel droplets against air velocity in an impulse engine:
Air velocity (in cm/s):    20     60     100    140    180    220    260    300    340    380
Evaporation coefficient:   0·18   0·37   0·35   0·78   0·56   0·75   1·18   1·36   1·17   1·65
Fit a straight line to these data by the method of least squares and use it to
estimate the evaporation coefficient of a droplet when the velocity of air
is 190 cm/sec.
13. An engineer found that by including small amounts of a compound
in rechargeable batteries for laptops, he could extend their lifetimes. He
experimented with different amounts of the additive and the data are
Amount of additive:   0   1   2   3   4
Life in hours:        2   4   3   7   9
Find the least square regression line of battery life on the amount of additive. Hence
estimate the average battery life when the amount of additive is 12.
14. Fit a straight line to the following data by the method of least
squares and use it to predict the extraction efficiency one can expect when
the extraction time is 35 minutes:
Extraction time (minutes):   27   45   41   19   35   39   19   49   15   31
Extraction efficiency (%):   57   64   80   46   62   72   52   77   57   68

15. During the first seven years of operation, a company's gross income
from sales was 1·4, 2·1, 2·6, 3·5, 3·7, 4·9 and 5·5 million dollars. Fit a
least square line and, assuming that the trend continues, predict the
company's gross income from sales during the eighth year of operation.
16. At the ends of years 2005 through 2012, a company had the
following investments in plants and equipment :

1·0, 1·7, 2·3, 3 ·1, 3·5, 3·4, 3·9 and 4·7 million dollars.

Fit a least square line. Assuming that the trend continues, predict the
company's investment in plant and equipment at the end of 2014.

17. Find both the least square regression lines from the following:

$\sum X = 60$, $\sum Y = 40$, $\sum XY = 1150$, $\sum X^2 = 4160$,
$\sum Y^2 = 1720$ and n = 10.
18. From the following results, obtain the two least square regression
lines and estimate the yield when the rainfall is 29 cms, and the rainfall
when the yield is 600 kg.

        Yield in kg    Rainfall in cms
Mean    508·4          26·7
S.D     36·8           4·6

19. For a bivariate (X, Y) we get the following data:

$\bar{X} = 20$, $\bar{Y} = 15$, $\sigma_x = 4$, $\sigma_y = 3$, $r_{xy} = 0.7$. Obtain the least square
regression lines. Find the most likely value of Y when X = 24.

20. A ball is drawn from a box containing nine balls numbered 0, 1,
2, ..., 8, of which the first four are green, the next three yellow and the
last two white. If the colours green, yellow and white are reckoned as colour
numbers 0, 1 and 2 respectively, find the regression lines of the random
variables: the number of the ball and the colour number.

21. By the least square method fit a parabola to the following bivariate data:
(i)    x    1    2    3    4    5
       y    11   30   59   98   147
(ii)   x    -1   0    1     2·1     3      4·2
       y    3    -1   0·9   10·13   23·1   47·7
(iii)  x    0     2      4     6      8      10
       y    2·1   -2·1   2·1   14·1   34·1   62·1
(iv)   x    -1   0   1   2    3    4
       y    4    2   6   16   32   54
22. The profits y (in lakhs of rupees) of a certain manufacturing company
in the year x are given in the following table:
x    1      2      3      4      5
y    2·18   2·44   2·78   3·25   3·83

Fit a second degree parabola $y = a + bx + cx^2$ to these data.

[Hint. Reduce x by putting u = x - 3]

23. The profit of a company in a year are given in the following table.
Fit a second order parabola and estimate its profit in the sixth year:
x (year): 1 2 3 4 5
y (profit): 1250 1400 1650 1950 2300
24. Fit a second degree parabola to the given data by the method of least
squares:
height (x):   7·5   10·0   12·5   15·0   17·5   20·0   22·5
weight (y):   1·9   4·5    10·1   17·6   27·8   40·8   56·9
25. The following are data on the drying time of a certain varnish and
the amount of an additive that is intended to reduce the drying time:
Amount of additive (gms):   0      1      2      3     4     5     6     7     8
Drying time (hours):        12·0   10·5   10·0   8·0   7·0   8·0   7·5   8·5   9·0
Draw a scatter diagram to verify that it is reasonable to assume that
the relationship is parabolic. Next fit a second degree parabola by the
method of least squares. Use the result to predict the drying time of the
varnish when 6·5 gms of the additive is used.
26. Fit an equation of the form $y = ab^x$ to the following data:

(i)    x    -1    0   3    5    7
       y    1·5   3   24   96   384
(ii)   x    -1   -2   0   1   2   3     4
       y    8    16   4   2   1   0·5   0·25
(iii)  x    1     1·5     2     2·5    3
       y    0·6   0·848   1·2   1·70   2·4
(iv)   x    -1    0   2    4    5    6
       y    1·5   3   12   48   96   192
(v)    x    2     3       4       5       6
       y    144   172·8   207·4   248·8   298·5

27. Fit the geometric curve of the form $y = ax^b$ to the following
bivariate data:

(i)    x    1   2    3     4     5
       y    3   48   243   768   1875
(ii)   x    0   1     4   9     16   25
       y    0   0·5   1   1·5   2    2·5
(iii)  x    4   5      6    7      8
       y    8   12·5   18   24·5   32
(iv)   x    1     2     3     4      5
       y    0·2   1·6   5·4   12·8   25

Answers
1. $y = 11.96 - 4.29x$

2. $y = 3.25x + 5$

3. $y = 7.75 + 1.21x$

4. $y = 52 - 1.5x$

5. (i) $y = 599.2 - 0.66x$ (ii) $y = 0.50 + 0.257x$

6. $y = 10.72 + 0.536x$

7. $y = 111.88 + 0.886x$; 114.5, 118.1 units

8. $S = 0.52T + 54.2$; 80.2 units

9. $y = 0.95x + 7.25$; $x = 0.95y - 6.4$

10. $y = 0.61x + 15.1$; $x = 1.36y - 5.2$

11. $Y = 0.125X + 3.87$; 11.37 lakh rupees

12. $y = 0.069 + 0.00383x$; 0.80

14. $y = 39.05 + 0.764x$; 65.8

15. $y = 3.39 + 0.679x$; 6.11

16. $y = 2.95 + 0.244x$; 5.61

17. reg. line of y on x is $y = 0.24x + 2.56$

reg. line of x on y is $x = 0.58y + 3.68$

18. $y = 4.16x + 397.328$, $x = 0.065y - 6.346$;

517.968 kg, 32.654 cm

19. $y = 0.525x + 4.5$; $x = 0.933y + 6$; 17.1

20. $y - \frac{7}{9} = \frac{17}{60}(x - 4)$; $x - 4 = \frac{153}{50}\left(y - \frac{7}{9}\right)$, where x is the number of the ball and y the colour number

21. (i) $y = 2 + 4x + 5x^2$ (ii) $y = -1.02 - 1.0x + 3.0x^2$
(iii) $y = 2.04 - 4.00x + 1.0x^2$ (iv) $y = 2.0 + 1.0x + 3x^2$
22. $y = 2.048 + 0.081x + 0.055x^2$
23. $y = 1140 + 72.1x + 32.2x^2$; Rs 2732
24. $y = 9.8 - 2.60x + 0.208x^2$

25. $y = 12.2 - 1.85x + 0.183x^2$; 7.9 hours

26. (i) $y = 3(2)^x$ (ii) $y = 4(0.5)^x$ (iii) $y = 0.30(2.0)^x$

(iv) $y = 3(2)^x$ (v) $y = 100(1.2)^x$

27. (i) $y = 3x^4$ (ii) $y = 0.5\sqrt{x}$

(iii) $y = 0.5x^2$ (iv) $y = 0.2x^3$
12 SAMPLING & ITS DISTRIBUTION
12.1. Population and Sample
Sometimes a part is selected from a set of statistical observations or data,
and information about the entire set is obtained from this part. The
set from which this part is taken is called the Population or Universe. The part
which is selected from a population is called a sample.
Illustration. (i) Suppose we have a list of the incomes of 1000 persons. For
some purpose we draw the incomes of 100 persons from these. Then the set
of incomes of the 1000 persons is the population and the set of incomes of the
100 persons is a sample.
(ii) The weights of students in India form a population: the population of weights
of students.
Sampling. The process of drawing a sample from a population is called
sampling.
Sampling is done in every field of science and technology, social science,
etc.
Population Size and Sample Size. The total number of data in a
population is called the population size. It is denoted by N. The total number of
data in a sample is called the sample size. It is denoted by n. In the above
illustration (i) the sample size is 100 whereas the population size is 1000.
Note. The population size may be infinite, e.g. if we draw a sample of size
10 from the set of all integers then the population size is infinite whereas
the sample size is 10.
Law of Statistical Regularity.
If a sample of moderate size is drawn at
random from a population then it has almost the same characteristics as
the population.
Sampling Error. The formation of a sample is purely a matter of
chance. So, whatever method is adopted to form a sample, the result obtained
from a sample will generally not be exactly the same as the characteristics of the
population; some error will be there. This error is called the sampling error. The
magnitude of this error depends upon the size of the sample and the variability
of the data in the population.

12.2. Random Sampling


There are different types of sampling; among these the most important
one is Random Sampling or Simple Random Sampling, which is defined
below:
Simple Random Sampling (SRS). This is the process of drawing a
sample where every member (or datum) of the population has an equal chance
of being included in the sample. Usually the members are drawn one by one. The
sample obtained is called a Simple Random Sample or, briefly, a Random Sample.
Illustration. If 10 cards are drawn at random from a pack of playing
cards then this is a Simple Random Sampling.
There are two types of SRS. These are:
Simple Random Sampling With Replacement (SRSWR)
A random sampling is called Simple Random Sampling With Replacement
(SRSWR) if the members are drawn one by one and, after drawing
a member, it is noted and returned to the population. The next drawing is
done after that. Obviously the population is not changed before each
drawing; a particular member of the population may occur more than once
in a sample under SRSWR.
Simple Random Sampling Without Replacement (SRSWOR)
A random sampling is called Simple Random Sampling Without
Replacement (SRSWOR) if the members are drawn one by one and, after
each drawing, the member is not returned to the population. Obviously a
particular member of the population cannot occur more than once in this
type of sample. The size of the population goes on decreasing after each
drawing (if the population size is finite). This drawing is equivalent to
drawing all the members of the sample at a time.
Illustration.
Let S = {2, 4, 5, 9} be a population. We are required to draw a sample of
size 2.
Under SRSWR the possible samples are
{2,2}, {2,4}, {2,5}, {2,9}, {4,4}, {4,5}, {4,9}, {5,5}, {5,9} and {9,9}, if
the order of drawing is ignored.

Under SRSWOR the possible samples are {2,4}, {2,5}, {2,9}, {4,5},
{4,9}, {5,9}, if the order of drawing is ignored.
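
The two lists of samples in this illustration can be generated mechanically. A minimal Python sketch, using only the standard `itertools` module and taking the population as {2, 4, 5, 9}, is given below as an illustration; the variable names are assumptions.

```python
from itertools import combinations, combinations_with_replacement

population = [2, 4, 5, 9]

# Order of drawing ignored in both cases
srswr = list(combinations_with_replacement(population, 2))   # repeats allowed
srswor = list(combinations(population, 2))                    # no repeats

print(len(srswr), srswr)
print(len(srswor), srswor)
```

If the order of drawing is to be taken into account, `itertools.product(population, repeat=2)` and `itertools.permutations(population, 2)` play the corresponding roles.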


Note. In fact there is a systematic procedure for drawing a member under
Simple Random Sampling. We usually take the help of a Random Number
Table to ensure that the drawing of a member is really random.
12.3. Sample Mean & Sample Variance
Let $\{x_1, x_2, \ldots, x_n\}$ be a sample drawn from a population. The A.M of
these values, i.e. $\bar{x} = \frac{1}{n}(x_1 + x_2 + \cdots + x_n) = \frac{1}{n}\sum_{i=1}^{n} x_i$, is called the Sample Mean.
The variance of these values, i.e.
$S^2 = \frac{1}{n}\{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2\} = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$,
is called the Sample Variance.
The positive square root, i.e. S, is called the Sample Standard Deviation.
Population Mean and Population Variance. The A.M of all the observations
in a population is called the Population Mean. It is denoted by $\mu$.
The variance of all the observations in a population is called the Population
Variance. It is denoted by $\sigma^2$.
The positive square root of the population variance is called the Population S.d. It
is denoted by $\sigma$.

Illustration. Let {2, 6, 10, 14} be a population of size 4, and let {6, 10}
be a sample drawn from this population. Then
$\bar{x} = \frac{1}{2}(6 + 10) = 8$ is the sample mean,
$S^2 = \frac{1}{2}(6^2 + 10^2) - \left\{\frac{1}{2}(6 + 10)\right\}^2 = 68 - 64 = 4$ is the sample variance and
$S = \sqrt{4} = 2$ is the sample S.d.
Again, $\mu = \frac{1}{4}(2 + 6 + 10 + 14) = 8$ is the population mean;
$\sigma^2 = \frac{1}{4}(2^2 + 6^2 + 10^2 + 14^2) - \left\{\frac{1}{4}(2 + 6 + 10 + 14)\right\}^2 = 84 - 64 = 20$ is the
population variance;
$\sigma = \sqrt{20} = 4.47$ is the population standard deviation.

Note. From the above illustration it is clear that sample measures
like the sample mean, sample variance, etc. vary from sample to sample, but
population measures like the population mean and population variance are fixed.
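
The figures of this illustration can be reproduced with Python's standard `statistics` module. This is a sketch for checking only; note that `pvariance` and `pstdev` divide by n, which matches the definition of $S^2$ used in this chapter, whereas `statistics.variance` divides by n - 1.

```python
from statistics import mean, pvariance, pstdev

population = [2, 6, 10, 14]
sample = [6, 10]

# Sample measures (divisor n, as in the definition of S^2 above)
x_bar = mean(sample)
S2 = pvariance(sample)
S = pstdev(sample)

# Population measures
mu = mean(population)
sigma2 = pvariance(population)
sigma = pstdev(population)

print(x_bar, S2, S)
print(mu, sigma2, round(sigma, 2))
```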

12.4. Sample Proportion and Population Proportion.


The ratio of the number of observations of a particular type in a sample
to the sample size is called the 'sample proportion' of that type of
observations. It is denoted by p.
For example, let a sample of 100 electric bulbs contain 5 defective
bulbs. Then the sample proportion of defective bulbs is $p = \frac{5}{100} = \frac{1}{20}$. It
varies from sample to sample. So p is a variable.
The ratio of the number of observations of a particular type in a population
to the population size is called the 'population proportion' of that type of
observations. It is denoted by P.

For example, let a population of 100000 electric bulbs contain 150
defective bulbs.

Then the population proportion of defective bulbs is $P = \frac{150}{100000} = \frac{3}{2000}$.
12.5. Sampling Distribution
Parameter. Any statistical measure of all the observations of a population
is called a parameter.
For example, the Population Mean and the Population S.d. are parameters.
Statistic. Let $\{x_1, x_2, \ldots, x_n\}$ be a sample (of size n) drawn from a known
population. Now $x_1, x_2, \ldots, x_n$ may vary over the observations of the population. In
other words, $x_1, x_2, \ldots, x_n$ are variables. A function of these variables
is called a statistic.
For example, $x_1^2 + x_2^2 + \cdots + x_n^2$ is a statistic.
Illustration. Let {2, 4, 5} be a population. Consider the samples $\{x_1, x_2\}$
of two distinct members drawn one by one (SRSWOR, the order of drawing being
taken into account). Then $x_1$ and $x_2$ may take any value among 2, 4, 5, with $x_1 \neq x_2$.

That is, all possible samples are

$\{x_1, x_2\}$: {2,4}, {2,5}, {4,2}, {4,5}, {5,2}, {5,4}

Now consider the sample mean, $\bar{x} = \frac{1}{2}(x_1 + x_2)$.

Then, since $\bar{x}$ is a function of $x_1$ and $x_2$, the sample mean $\bar{x}$ is a
statistic. Its possible values are

$\bar{x}$: $\frac{1}{2}(2 + 4)$, $\frac{1}{2}(2 + 5)$, $\frac{1}{2}(4 + 2)$, $\frac{1}{2}(4 + 5)$, $\frac{1}{2}(5 + 2)$, $\frac{1}{2}(5 + 4)$

i.e. $\bar{x}$: $3, \frac{7}{2}, 3, \frac{9}{2}, \frac{7}{2}, \frac{9}{2}$
Similarly, the sample variance, sample standard deviation, etc. are all statistics.

Here the population mean, $\mu = \frac{1}{3}(2 + 4 + 5) = \frac{11}{3}$, is a parameter.
Note that it is fixed.
Population Distribution. Let X be a random variable which assumes
the data or observations of a population. Since X is a random variable, X
must have a probability distribution. This distribution of X is called the population
distribution of the given population.
Illustration. (i) Consider the population of weights of students in India. Let
X assume these weights, i.e. X (a student) = his / her weight. Here X is a
continuous random variable. Suppose X has a normal distribution. Then we
say the population distribution is normal, or the population is normally
distributed.
(ii) Let a sample of 5 c.c. of water be taken from a huge tank of polluted
water, and let X = number of bacteria present in this sample. X may assume the
values 0, 1, 2, 3, .... Let X have a Poisson distribution. Then the population of
the number of bacteria, {0, 1, 2, 3, ...}, follows a Poisson distribution.
Note. If $\{x_1, x_2, \ldots, x_n\}$ be a random sample drawn from an infinite
population, or if the sampling is SRSWR, then each of $x_1, x_2, \ldots, x_n$
varies over the data of the population with the same probabilities. So each $x_i$
has the same distribution as the population.

Sampling Distribution of a Statistic

Let t be a statistic. Then, as we have observed, t may be considered as a
random variable. So t must have a probability
distribution. This probability distribution of t is known as the sampling
distribution of the statistic t.

Illustration. Let a population consist of the four numbers 2, 6, 10,
14. Consider the statistic sample mean $\bar{x}$, the size of every sample being 2
and the samples being drawn without replacement. Then the different samples
and their sample means are:
Sample $(x_1, x_2)$:      (2,6)   (2,10)   (2,14)   (6,10)   (6,14)   (10,14)
Sample mean $\bar{x}$:    4       6        8        8        10       12
Thus the probability distribution of $\bar{x}$ becomes
$\bar{x}$:       4      6      8      10     12
Probability:     1/6    1/6    2/6    1/6    1/6

This probability distribution of $\bar{x}$ is the sampling distribution of the statistic
$\bar{x}$.
Note. The sampling distribution of a statistic depends upon the size of the
sample, the population distribution and the nature of the sampling.

Standard Error (S.E). The standard deviation of the sampling distribution
of a statistic is called the standard error (S.E) of that statistic.
Illustration. Consider the sampling distribution of $\bar{x}$ (sample mean) cited
in our previous illustration. This was
$\bar{x}$:       4      6      8      10     12
Probability:     1/6    1/6    2/6    1/6    1/6
Here $E(\bar{x}) = 4 \times \frac{1}{6} + 6 \times \frac{1}{6} + 8 \times \frac{2}{6} + 10 \times \frac{1}{6} + 12 \times \frac{1}{6}$
$= (4 + 6 + 16 + 10 + 12) \times \frac{1}{6} = 8$
and $E(\bar{x}^2) = 4^2 \times \frac{1}{6} + 6^2 \times \frac{1}{6} + 8^2 \times \frac{2}{6} + 10^2 \times \frac{1}{6} + 12^2 \times \frac{1}{6}$
$= (16 + 36 + 128 + 100 + 144) \times \frac{1}{6} = \frac{212}{3}$.

$\mathrm{Var}(\bar{x}) = E(\bar{x}^2) - \{E(\bar{x})\}^2 = \frac{212}{3} - 8^2 = \frac{20}{3}$ ∴ $\sigma_{\bar{x}} = \sqrt{\frac{20}{3}} = 2.58$.

So the S.E of $\bar{x}$ is 2.58.
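
The sampling distribution and its standard error can be built by enumerating all samples. The following Python sketch is an illustration only (variable names are assumptions); it uses exact fractions for the probabilities and takes the square root only at the end.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations
from math import sqrt

population = [2, 6, 10, 14]
samples = list(combinations(population, 2))          # SRSWOR, order ignored

# Sampling distribution of the sample mean: value -> probability
counts = Counter(Fraction(sum(s), len(s)) for s in samples)
dist = {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}

e1 = sum(m * p for m, p in dist.items())             # E(x-bar)
e2 = sum(m * m * p for m, p in dist.items())         # E(x-bar squared)
var = e2 - e1 ** 2
se = sqrt(var)                                       # standard error

for m, p in dist.items():
    print(m, p)
print(e1, var, round(se, 2))
```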


Characteristics of Standard Error

The Standard Error indicates the degree of dispersion of the values assumed
by the concerned statistic. It depends on the sample size: it decreases as the
sample size increases. The standard error tells us about the extent to which a
sample is reliable. The greater the S.E, the less reliable the sample. It is useful in
estimating a parameter.

12.6. Sampling Distribution of Sample Mean


Theorem 1. Let $\bar{X}$ be the sample mean of samples (of size n) drawn at
random from a population which is normally distributed with mean $\mu$ and S.d
$\sigma$. Then the sampling distribution of $\bar{X}$ is a normal distribution with mean $\mu$
and S.d $\frac{\sigma}{\sqrt{n}}$.
Proof. Beyond the scope of the text.
Illustration. Let X = diameter of a ball bearing manufactured by a
company. X takes several values, say 2.001, 1.9998, etc. It can be
shown that X has a normal distribution. Let its mean be $\mu = 2$ and S.d $\sigma = 0.001$.
Let samples of ten balls be taken from the lot produced by the
manufacturer. Let $\bar{X}$ = A.M of the diameters of the 10 balls. Then $\bar{X}$ varies
from sample to sample, i.e. $\bar{X}$ is a random variable.
According to the above theorem, $\bar{X}$ has a normal distribution with mean
2 and S.d $\frac{0.001}{\sqrt{10}}$.
Note. (1) From the above theorem we see that the S.E of the statistic $\bar{X}$ is
$\frac{\sigma}{\sqrt{n}}$ if $\bar{X}$ is the sample mean of samples of size n drawn from a population
normally distributed with mean $\mu$ and S.d $\sigma$.
(2) If the population is not normally distributed then the following results
can be obtained.
Theorem 2. Let all possible samples of size n be drawn from a population
(not necessarily normally distributed) of size N, mean $\mu$ and S.d $\sigma$. Then
the mean and S.d of the sampling distribution of the sample mean are
respectively

(i) $\mu$ and $\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N - n}{N - 1}}$ if N is finite and the sampling is SRSWOR,

(ii) $\mu$ and $\frac{\sigma}{\sqrt{n}}$ if N is infinite,

(iii) $\mu$ and $\frac{\sigma}{\sqrt{n}}$ if the sampling is SRSWR.

Proof. Beyond the scope of the book.


Note. The above theorem says nothing about the probability distribution
of the statistic $\bar{X}$, i.e. it does not tell us whether the distribution is normal,
Poisson, etc. However, we have the following theorem:

Theorem-3. (Central Limit Theorem) For large values of the sample size
n ($\geq 30$) the sampling distribution of the sample mean is approximately a normal
distribution, irrespective of the population (the population mean and variance
should be finite and the population size at least twice the sample size).

Proof. Beyond the scope of the book.

Note. (1) The mean and S.d of the approximate normal distribution are
as given in Theorem-2.

(2) In fact the above Theorem-3 is a particular case of the Central
Limit Theorem of advanced probability theory.
(3) The accuracy of the approximation improves as n gets larger. Because
of this we say the sampling distribution of $\bar{X}$ is asymptotically normal.

Illustration. Let a population consist of the five numbers 11, 8, 6, 3 and
2. Consider all possible samples of size two that can be drawn with
replacement from this population.

The mean of the population, $\mu = \frac{11 + 8 + 6 + 3 + 2}{5} = 6$.
The variance of the population

$\sigma^2 = \frac{11^2 + 8^2 + 6^2 + 3^2 + 2^2}{5} - \left(\frac{11 + 8 + 6 + 3 + 2}{5}\right)^2 = 10.8$.

So the S.d of the population, $\sigma = \sqrt{10.8} = 3.29$.

Number of all possible samples of size two = 5 × 5 = 25.
These samples are:

(11, 11)   (11, 8)   (11, 6)   (11, 3)   (11, 2)
(8, 11)    (8, 8)    (8, 6)    (8, 3)    (8, 2)
(6, 11)    (6, 8)    (6, 6)    (6, 3)    (6, 2)
(3, 11)    (3, 8)    (3, 6)    (3, 3)    (3, 2)
(2, 11)    (2, 8)    (2, 6)    (2, 3)    (2, 2)
The corresponding sample means are
11     9.5    8.5    7      6.5
9.5    8      7      5.5    5
8.5    7      6      4.5    4
7      5.5    4.5    3      2.5
6.5    5      4      2.5    2
So the sampling distribution of $\bar{X}$ is
$\bar{X}$:       11     9.5    8.5    8      7      6.5    6      5.5    5      4.5    4      3      2.5    2
Probability:     1/25   2/25   2/25   1/25   4/25   2/25   1/25   2/25   2/25   2/25   2/25   1/25   2/25   1/25
The mean of this sampling distribution
$= 11 \times \frac{1}{25} + 9.5 \times \frac{2}{25} + 8.5 \times \frac{2}{25} + 8 \times \frac{1}{25} + 7 \times \frac{4}{25} + 6.5 \times \frac{2}{25}$
$+ 6 \times \frac{1}{25} + 5.5 \times \frac{2}{25} + 5 \times \frac{2}{25} + 4.5 \times \frac{2}{25} + 4 \times \frac{2}{25} + 3 \times \frac{1}{25}$
$+ 2.5 \times \frac{2}{25} + 2 \times \frac{1}{25} = 6$ = Population mean.

Now, $E(\bar{X}^2) = 11^2 \times \frac{1}{25} + (9.5)^2 \times \frac{2}{25} + (8.5)^2 \times \frac{2}{25}$
$+ 8^2 \times \frac{1}{25} + 7^2 \times \frac{4}{25} + (6.5)^2 \times \frac{2}{25} + 6^2 \times \frac{1}{25}$
$+ (5.5)^2 \times \frac{2}{25} + 5^2 \times \frac{2}{25} + (4.5)^2 \times \frac{2}{25} + 4^2 \times \frac{2}{25}$
$+ 3^2 \times \frac{1}{25} + (2.5)^2 \times \frac{2}{25} + 2^2 \times \frac{1}{25} = 41.4$

Then $\mathrm{Var}(\bar{X}) = E(\bar{X}^2) - \{E(\bar{X})\}^2 = 41.4 - 36 = 5.40$

So, S.d of $\bar{X} = \sqrt{5.40} = 2.32$. On the other hand $\frac{\sigma}{\sqrt{n}} = \frac{3.29}{\sqrt{2}} = 2.32$.
This illustrates the result stated in the previous Theorem-2 (iii).
Note. In the above illustration we could also state that the S.E of $\bar{X}$ is 2.32.
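
Theorem-2 can be checked numerically on this five-number population for both kinds of sampling. The sketch below is an illustration under the assumption that enumerating every possible sample is acceptable (the population is tiny); it relies only on Python's standard library.

```python
from itertools import combinations, product
from math import sqrt
from statistics import mean, pstdev

population = [11, 8, 6, 3, 2]
N, n = len(population), 2
sigma = pstdev(population)                     # population S.d (divisor N)

# Means of all samples: with replacement (25) and without replacement (10)
means_wr = [mean(s) for s in product(population, repeat=n)]
means_wor = [mean(s) for s in combinations(population, n)]

se_wr = pstdev(means_wr)                       # S.d of the sampling distribution
se_wor = pstdev(means_wor)

print(round(se_wr, 2), round(sigma / sqrt(n), 2))
print(round(se_wor, 2), round(sigma / sqrt(n) * sqrt((N - n) / (N - 1)), 2))
```

Each printed pair compares the standard deviation found by enumeration with the corresponding formula of Theorem-2, case (iii) and case (i) respectively.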

12.7. Sampling Distribution of Sample Variance


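Theorem 1. If $S^2$ be the sample variance of a sample of size n drawn
from a population with mean $\mu$ and S.d $\sigma$ then $E(S^2) = \frac{n - 1}{n}\sigma^2$ (where the
population size is infinite or the sample is drawn with replacement).

Proof. Let $\{x_1, x_2, \ldots, x_n\}$ be the sample.

Then $S^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$   ... (1)

Now, by Theorem 2 (ii), (iii) of Art. 12.6, the statistic $\bar{x}$ has mean $\mu$
and S.d $\frac{\sigma}{\sqrt{n}}$.
So, $E(\bar{x}) = \mu$ and $E\{(\bar{x} - \mu)^2\} = \left(\frac{\sigma}{\sqrt{n}}\right)^2 = \frac{\sigma^2}{n}$.
Now, since $x_i$ varies over the values of the population, $x_i$ has the same
distribution as the population. So $E(x_i) = \mu$ and $\mathrm{Var}(x_i) = \sigma^2$ also.
Now (1) can be written as

$S^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \mu + \mu - \bar{x})^2$
$= \frac{1}{n}\sum_{i=1}^{n}(x_i - \mu)^2 - (\bar{x} - \mu)^2$, since $\sum_{i=1}^{n}(x_i - \mu) = n(\bar{x} - \mu)$.

∴ $E(S^2) = \frac{1}{n}\sum_{i=1}^{n}E\{(x_i - \mu)^2\} - E\{(\bar{x} - \mu)^2\}$
$= \frac{1}{n}\sum_{i=1}^{n}\mathrm{Var}(x_i) - \frac{\sigma^2}{n}$
$= \frac{1}{n}\sum_{i=1}^{n}\sigma^2 - \frac{\sigma^2}{n} = \frac{1}{n} \cdot n\sigma^2 - \frac{\sigma^2}{n}$
$= \sigma^2 - \frac{\sigma^2}{n} = \frac{n - 1}{n}\sigma^2$

Corollary. If $s^2 = \frac{n}{n - 1}S^2$ then $E(s^2) = \sigma^2$.

Proof. $E(s^2) = \frac{n}{n - 1}E(S^2) = \frac{n}{n - 1} \cdot \frac{n - 1}{n}\sigma^2 = \sigma^2$.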
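
The theorem and its corollary can be verified exactly on a small population by averaging $S^2$ over all equally likely samples drawn with replacement. The Python sketch below is an illustration; the population {11, 8, 6, 3, 2} and the helper name `variance_n` are assumptions chosen for the check.

```python
from fractions import Fraction
from itertools import product

population = [11, 8, 6, 3, 2]
n = 2

def variance_n(values):
    """Variance with divisor len(values), computed exactly."""
    m = Fraction(sum(values), len(values))
    return sum((Fraction(v) - m) ** 2 for v in values) / len(values)

sigma2 = variance_n(population)                        # population variance

samples = list(product(population, repeat=n))          # SRSWR, all 25 samples
E_S2 = sum(variance_n(s) for s in samples) / len(samples)
E_s2 = Fraction(n, n - 1) * E_S2                       # corollary: s^2 = n/(n-1) S^2

print(sigma2, E_S2, Fraction(n - 1, n) * sigma2, E_s2)
```

The second and third printed values coincide, which is the statement $E(S^2) = \frac{n-1}{n}\sigma^2$, and the last value equals the population variance.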

Note. Though in the above theorem we found the mean of the sample
variance $S^2$, we did not get the sampling distribution of the statistic $S^2$. In
the following theorem we obtain the sampling distribution
of $\frac{nS^2}{\sigma^2}$, which is sufficient to characterise $S^2$.
Theorem 2. If $S^2$ be the sample variance of a sample of size n drawn
from a normal population with mean $\mu$ and S.d $\sigma$ then the statistic $\frac{nS^2}{\sigma^2}$
has a $\chi^2$ distribution with n - 1 degrees of freedom.

Proof. Beyond the scope of the text.

Deduction of Mean and Variance of $S^2$ from the distribution of $\frac{nS^2}{\sigma^2}$
(1) Since the mean of a $\chi^2$ distribution with n degrees of freedom is n, here

$E\left(\frac{nS^2}{\sigma^2}\right) = n - 1$ or, $\frac{n}{\sigma^2}E(S^2) = n - 1$

or, $E(S^2) = \frac{n - 1}{n}\sigma^2$, which was obtained in Theorem 1.

(2) Since the variance of a $\chi^2$ variate with n degrees of freedom is 2n,

here $\mathrm{Var}\left(\frac{nS^2}{\sigma^2}\right) = 2(n - 1)$ or, $\frac{n^2}{(\sigma^2)^2}\mathrm{Var}(S^2) = 2(n - 1)$

or, $\mathrm{Var}(S^2) = \frac{2\sigma^4(n - 1)}{n^2}$.

12.8. Illustrative Examples.


Example 1. From a population with 20 members a random sample (without replacement) of 2 members is taken. Find the possible number of such samples.
Under SRSWOR the drawing is equivalent to the drawing of two at a time.
So if we ignore the order of the sample members, the number of all possible samples $= {}^{20}C_2 = 190$.
If the order is not ignored then the required number of all possible samples $= {}^{20}P_2 = 380$.

Example 2. A random sample of two individuals is to be drawn from a population of size 43. What is the possible number of distinct samples when sampling is (i) with replacement (order of drawing to be taken into account) and (ii) without replacement (order of drawing to be ignored).
(i) For SRSWR, the number of such possible samples = number of permutations taking two out of 43 where repetitions are allowed $= 43 \times 43 = 1849$.
(ii) For SRSWOR, the number of all such possible samples = number of combinations of two out of 43 $= {}^{43}C_2 = 903$.
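The counts in Examples 1 and 2 can be reproduced with a small helper. This is a minimal sketch; the function name `count_samples` and its arguments are hypothetical, and the unordered with-replacement case is included only for completeness.

```python
import math


def count_samples(population_size, sample_size, replacement, ordered):
    """Number of possible samples under the stated sampling scheme."""
    if replacement:
        if ordered:
            # ordered draws with repetition allowed: N^n
            return population_size ** sample_size
        # unordered draws with repetition (multisets)
        return math.comb(population_size + sample_size - 1, sample_size)
    if ordered:
        # ordered draws without repetition: permutations
        return math.perm(population_size, sample_size)
    # unordered draws without repetition: combinations
    return math.comb(population_size, sample_size)


print(count_samples(20, 2, replacement=False, ordered=False))  # Example 1, order ignored
print(count_samples(20, 2, replacement=False, ordered=True))   # Example 1, order counted
print(count_samples(43, 2, replacement=True, ordered=True))    # Example 2 (i)
print(count_samples(43, 2, replacement=False, ordered=False))  # Example 2 (ii)
```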

Example 3. The standard deviation of the marks obtained in Mathematics by 112 boys is found to be 13.2. Find the standard deviation (i.e. S.E) of the sampling distribution of the statistic sample mean for a random sample of size 10
(i) taken with replacement
(ii) taken without replacement.
Here sample size $n = 10$, population size $N = 112$, population S.d $\sigma = 13.2$. The distribution of the population is not known.

(i) S.d of $\bar{x} = \frac{\sigma}{\sqrt{n}} = \frac{13.2}{\sqrt{10}} = \frac{13.2}{3.16} = 4.17$.

(ii) S.d of $\bar{x} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{13.2}{\sqrt{10}}\times\sqrt{\frac{112-10}{112-1}} = 4.00$.
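The two standard-error formulas used in Example 3 can be sketched as one function. The name `standard_error_of_mean` is hypothetical; passing a population size applies the finite population correction used for sampling without replacement.

```python
import math


def standard_error_of_mean(sigma, n, population_size=None):
    """S.E. of the sample mean.

    With population_size=None the sampling is treated as with replacement
    (sigma / sqrt(n)); otherwise the factor sqrt((N - n) / (N - 1)) for
    sampling without replacement is applied.
    """
    se = sigma / math.sqrt(n)
    if population_size is not None:
        se *= math.sqrt((population_size - n) / (population_size - 1))
    return se


se_with = standard_error_of_mean(13.2, 10)          # Example 3 (i)
se_without = standard_error_of_mean(13.2, 10, 112)  # Example 3 (ii)
print(round(se_with, 2), round(se_without, 2))
```

As expected, the value without replacement is slightly smaller than the value with replacement, because the correction factor is less than 1.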

Example 4. Compute the standard error of the mean and construct the sampling distribution of the mean for simple random samples of two families each from a population of 5 families which is given below:

Family      :  A  B  C  D  E
Family size :  4  3  2  5  7

The population is {4, 3, 2, 5, 7}.

The population S.d,

$\sigma = \sqrt{\frac{1}{5}(4^2+3^2+2^2+5^2+7^2) - \left\{\frac{1}{5}(4+3+2+5+7)\right\}^2} = 1.72$.

We suppose this is a SRSWOR.

So, S.E of $\bar{x}$ = S.d of the statistic

$\bar{x} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{1.72}{\sqrt{2}}\sqrt{\frac{5-2}{5-1}} = 1.05$.

All possible such samples and the corresponding sample means are given below:

Sample      : (4,3) (4,2) (4,5) (4,7) (3,2) (3,5) (3,7) (2,5) (2,7) (5,7)
Sample mean :  3.5    3    4.5   5.5   2.5    4     5    3.5   4.5    6

So the sampling distribution of $\bar{x}$ is

$\bar{x}$   :  2.5   3    3.5   4    4.5   5    5.5   6
Probability : 1/10  1/10  1/5  1/10  1/5  1/10  1/10  1/10
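The enumeration in Example 4 can be carried out mechanically. The sketch below (function name hypothetical) lists every SRSWOR sample of size 2 from the five family sizes, tabulates the exact sampling distribution of the mean with fractions, and derives the standard error directly from that distribution.

```python
import math
from collections import Counter
from fractions import Fraction
from itertools import combinations


def sampling_distribution_of_mean(population, n):
    """Exact distribution of the sample mean under SRSWOR (order ignored)."""
    samples = list(combinations(population, n))
    counts = Counter(Fraction(sum(s), n) for s in samples)
    total = len(samples)
    return {mean: Fraction(c, total) for mean, c in sorted(counts.items())}


dist = sampling_distribution_of_mean([4, 3, 2, 5, 7], 2)
mean_of_means = sum(m * p for m, p in dist.items())
var_of_means = sum(m * m * p for m, p in dist.items()) - mean_of_means ** 2
standard_error = math.sqrt(var_of_means)

for mean, prob in dist.items():
    print(float(mean), prob)
print("mean of sample means:", mean_of_means)
print("S.E. from the distribution:", standard_error)
```

The mean of the sample means coincides with the population mean 21/5 = 4.2, and the standard error obtained from the distribution agrees with the formula value up to rounding.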
Example 5. A normal population has a mean 0.1 and standard deviation 2.1. Find the probability that the mean of a sample of size 900 will be negative. Given that $P(|z| < 1.43) = 0.847$. [W.B.U.T. 2012, 2008, 2003]

Since the population is normal and the sampling is random, the sampling distribution of the statistic $\bar{x}$ (sample mean) is a normal distribution with mean = population mean = 0.1 and standard deviation

(S.d) $= \frac{\sigma}{\sqrt{n}} = \frac{2.1}{\sqrt{900}} = 0.07$

(as we had a theorem).

So, $z = \frac{\bar{x} - 0.1}{0.07}$ is a standard normal variate.

Now the event

"The sample mean is negative" $= (\bar{x} < 0)$.

When $\bar{x} = 0$, $z = \frac{0 - 0.1}{0.07} = -1.43$.
So, $P(\bar{x} < 0) = P(z < -1.43)$ = area under the normal curve on the left side of the ordinate $z = -1.43$.
[Figure: standard normal curve with the area to the left of $z = -1.43$ shaded]
Now we are given $P(|z| < 1.43) = 0.847$, i.e. $P(-1.43 < z < 1.43) = 0.847$. So the area under the normal curve enclosed between the two ordinates $z = -1.43$ and $z = 1.43$ is 0.847. Now the area under the normal curve on the left side of the y axis is 0.5.
So $P(z < -1.43)$ = area on the left side of the ordinate $z = -1.43$
$= 0.5 - \frac{1}{2} \times 0.847 = 0.0765$.
Hence the required probability = 0.0765.
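The same probability can be obtained without tables from the normal distribution function. This is a minimal sketch using `statistics.NormalDist` from the Python standard library; the helper name `prob_sample_mean_below` is hypothetical. The small difference from the hand value comes from rounding $z$ to two decimals.

```python
import math
from statistics import NormalDist


def prob_sample_mean_below(threshold, mu, sigma, n):
    """P(sample mean < threshold) when the sample mean is N(mu, sigma/sqrt(n))."""
    sampling_dist = NormalDist(mu, sigma / math.sqrt(n))
    return sampling_dist.cdf(threshold)


p_negative = prob_sample_mean_below(0.0, mu=0.1, sigma=2.1, n=900)
print(round(p_negative, 4))
```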

Example 6. From a normally distributed population samples of size 25 are drawn. Given that the mean of the population is equal to the S.E of the sample mean, show that the probability that the mean of a sample of size 49 drawn from the same population will be negative is 0.0808.
Let us consider all random samples of size 49 drawn from the population.
Let $\mu$ = population mean, $\sigma$ = population S.d.
By hypothesis $\mu = \frac{\sigma}{\sqrt{n}} = \frac{\sigma}{\sqrt{25}} = \frac{\sigma}{5}$ or, $\sigma = 5\mu$.

Since the population is normally distributed, $\bar{x}$ is a normal variate with mean $\mu$ and S.d $= \frac{\sigma}{\sqrt{49}} = \frac{5\mu}{7}$.
So, $z = \frac{\bar{x} - \mu}{5\mu/7} = \frac{7(\bar{x}-\mu)}{5\mu}$ is a standard normal variate.
Now "sample mean will be negative" $= (\bar{x} < 0)$.

When $\bar{x} = 0$, $z = \frac{7(0-\mu)}{5\mu} = -\frac{7}{5} = -1.4$.

∴ the required probability $= P(\bar{x} < 0) = P(z < -1.4)$ = area under the normal curve on the left side of the ordinate at $z = -1.4$.
[Figure: standard normal curve with the area to the left of $z = -1.4$ shaded]

From the table (shown at the end of the book) the area under the normal curve enclosed between the y-axis and the ordinate $z = 1.4$ is 0.4192.
So, by symmetry, the area under the normal curve enclosed between the y-axis and the ordinate $z = -1.4$ is also 0.4192.
So the required probability = 0.5 - 0.4192 = 0.0808.

Example 7. The guaranteed average life of a certain type of electric light bulbs is 1000 hours with a standard deviation of 125 hours. It is proposed to sample the output so as to assure that 90% of the bulbs do not fall short of the guaranteed average by more than 2.5 per cent. What should be the minimum size of the sample? (The area under the standard normal curve from $z = 0$ to $z = 1.28$ is 0.4000)
Here $\mu = 1000$, $\sigma = 125$. Let $n$ be the size of the sample.
Now, "short of the guaranteed average by 2.5%"

$= \mu - (2.5\% \text{ of } \mu) = 1000 - \frac{2.5}{100} \times 1000 = 975$.

So "The bulbs do not fall short of the guaranteed average by more than 2.5%" $= (\bar{x} > 975)$.

By hypothesis, the probability of this event $= 90\% = \frac{9}{10}$.

So $P(\bar{x} > 975) = \frac{9}{10}$ ... (1)

The sampling distribution of $\bar{x}$ is a normal distribution with mean $= \mu = 1000$ and S.d $= \frac{\sigma}{\sqrt{n}} = \frac{125}{\sqrt{n}}$.

So $z = \frac{\bar{x} - 1000}{125/\sqrt{n}} = \frac{\sqrt{n}(\bar{x} - 1000)}{125}$ is a standard normal variate.

When $\bar{x} = 975$, $z = \frac{\sqrt{n}(975 - 1000)}{125} = -\frac{\sqrt{n}}{5}$.

So, $P(\bar{x} > 975) = P\left(z > -\frac{\sqrt{n}}{5}\right)$.

From (1), $P\left(z > -\frac{\sqrt{n}}{5}\right) = \frac{9}{10} = 0.9$

or, $P\left(z < -\frac{\sqrt{n}}{5}\right) = 1 - 0.9 = 0.1$.

By symmetry of the normal curve,

$P\left(z > \frac{\sqrt{n}}{5}\right) = 0.1$ or, $P\left(0 < z < \frac{\sqrt{n}}{5}\right) = 0.5 - 0.1 = 0.4$.

So the area under the standard normal curve enclosed between $z = 0$ and $z = \frac{\sqrt{n}}{5}$ is 0.4.

By the supplied data, $\frac{\sqrt{n}}{5} = 1.28$ or, $n = 40.96 \approx 41$. So the minimum size of the sample should be 41.
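The last step of Example 7 amounts to solving $\frac{\sqrt{n}(\mu - c)}{\sigma} = z$ for $n$ and rounding up. A minimal sketch follows; the function name `min_sample_size` and its argument names are hypothetical, and `z_value` is the tabulated value 1.28 supplied in the problem.

```python
import math


def min_sample_size(mu, sigma, lower_limit, z_value):
    """Smallest n with sqrt(n) * (mu - lower_limit) / sigma >= z_value."""
    root_n = z_value * sigma / (mu - lower_limit)
    return math.ceil(root_n ** 2)


n_required = min_sample_size(mu=1000, sigma=125, lower_limit=975, z_value=1.28)
print(n_required)
```

Rounding up rather than to the nearest integer is the safe choice here, since a smaller sample would give a probability below the required 90%.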

Example 8. Assume that the heights of 3000 male students of a university are normally distributed with mean 68.0 inches and standard deviation 3.0 inches. 80 samples consisting of 25 students each are obtained. In how many samples would you expect to find the mean between 66.8 and 68.3 inches?
[Given: area under the standard normal curve enclosed between $z = -2$ and $z = 0$ is 0.4772 and between $z = 0$ and $z = 0.5$ is 0.1915.]
The sample mean $\bar{X}$ is normally distributed with mean = population mean = 68 and S.d $= \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{25}} = 0.6$.
So, $z = \frac{\bar{X} - 68}{0.6}$ is a standard normal variate.

When $\bar{X} = 66.8$, $z = \frac{66.8 - 68}{0.6} = -2$.

When $\bar{X} = 68.3$, $z = \frac{68.3 - 68}{0.6} = 0.5$.

∴ Probability of $\bar{X}$ lying between 66.8 and 68.3
[Figure: standard normal curve with the area between $z = -2$ and $z = 0.5$ shaded]

$= P(66.8 < \bar{X} < 68.3) = P(-2 < z < 0.5) = P(-2 < z < 0) + P(0 < z < 0.5)$
$= 0.4772 + 0.1915 = 0.6687$.
So, the expected number of required type of samples
$= 80 \times 0.6687 = 53.496 \approx 53$.
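The expected count in Example 8 is the number of samples times the probability that a sample mean falls in the interval. A short sketch under the same assumptions (normal sampling distribution with S.d $\sigma/\sqrt{n}$); the helper name is hypothetical.

```python
import math
from statistics import NormalDist


def expected_samples_in_range(num_samples, low, high, mu, sigma, n):
    """Expected number of samples whose mean lies between low and high."""
    sampling_dist = NormalDist(mu, sigma / math.sqrt(n))
    probability = sampling_dist.cdf(high) - sampling_dist.cdf(low)
    return num_samples * probability


expected = expected_samples_in_range(80, 66.8, 68.3, mu=68.0, sigma=3.0, n=25)
print(round(expected, 2))
```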

Example 9. The mean weight of 500 ball bearings is 5.02 gms. Their S.d is 0.30 gm. Find the probability that a random sample of 100 ball bearings would have a combined weight more than 510 gm. [Given $\int_0^{2.96}\phi(z)\,dz = 0.4985$ where $\phi(z)$ is the standard normal density function]

The population is not given as normally distributed. The sample size 100 is large (> 30). So by the Central Limit Theorem the statistic $\bar{X}$ is approximately normally distributed with mean = population mean = 5.02 and S.d

$= \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$ (supposing the sampling is SRSWOR)

$= \frac{0.30}{\sqrt{100}}\sqrt{\frac{500-100}{500-1}} = 0.027$.

If the combined weight of 100 bearings is 510 gm then the mean of the sample $= \frac{510}{100} = 5.1$ gm.

Now, $z = \frac{\bar{X} - 5.02}{0.027}$ is a standard normal variate.

When $\bar{X} = 5.1$, $z = \frac{5.1 - 5.02}{0.027} = 2.96$.

Probability that the sample has combined weight greater than 510 $= P(\bar{X} > 5.1) = P(z > 2.96)$
= area under the standard normal curve on the right side of the ordinate $z = 2.96$ $= 0.5 - 0.4985 = 0.0015$.

Example 10. Distribution of marks scored in an examination is normal. Samples of four students' marks are drawn and it is seen that the probability of the sample mean to be less than 61 is 0.44, to be more than 80 is 0.04. Find the mean and S.d of the distribution.
[Given $\int_0^z \phi(t)\,dt = 0.06, 0.10, 0.46$ according as $z = 0.15, 0.25$ and $1.75$]
Let mean $= \mu$ and S.d $= \sigma$ of the population.

Since the population is normal, the sampling distribution of $\bar{x}$ (sample mean) is normal with mean = population mean $= \mu$ and S.d $= \frac{\sigma}{\sqrt{n}} = \frac{\sigma}{\sqrt{4}} = \frac{\sigma}{2}$.
By the problem, $P(\bar{x} < 61) = 0.44$ ... (1)
$P(\bar{x} > 80) = 0.04$ ... (2)

Now, $Z = \frac{\bar{x} - \mu}{\sigma/2} = \frac{2(\bar{x} - \mu)}{\sigma}$ has standard normal distribution.

When $\bar{x} = 61$, $Z = \frac{2(61 - \mu)}{\sigma}$ and when $\bar{x} = 80$, $Z = \frac{2(80 - \mu)}{\sigma}$.

∴ From (1) we have, $P\left(Z < \frac{2(61-\mu)}{\sigma}\right) = 0.44$

∴ $P\left(\frac{2(61-\mu)}{\sigma} < Z < 0\right) = 0.50 - 0.44 = 0.06$

or, by symmetry, $P\left(0 < Z < \frac{2(\mu-61)}{\sigma}\right) = 0.06$ ... (3)

We are given the data $P(0 < Z < 0.15) = 0.06$ ... (4)

From (3) and (4) we have $\frac{2(\mu-61)}{\sigma} = 0.15$
or, $2(\mu - 61) = 0.15\sigma$ ... (5)

Similarly from (2), $P\left(0 < Z < \frac{2(80-\mu)}{\sigma}\right) = 0.50 - 0.04 = 0.46$, and from the given data $P(0 < Z < 1.75) = 0.46$, so we have
$2(80 - \mu) = 1.75\sigma$ ... (6)
Adding (5) and (6) we get $38 = 1.9\sigma$, so $\sigma = 20$, and then from (5) $\mu = 62.5$.
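Example 11. A population consists of the three numbers 1, 3, 4. Consider all possible samples of size two with replacement. Find the mean of the sampling distribution of the sample variance and verify the result $E(S^2) = \frac{n-1}{n}\sigma^2$. Find the standard deviation of the sampling distribution of variances.
There are $3 \times 3 = 9$ such samples.

In the following table we show these and the corresponding sample variances:

Sample           : (1,1) (1,3) (1,4) (3,1) (3,3) (3,4) (4,1) (4,3) (4,4)
Variance ($S^2$) :   0     1    9/4    1     0    1/4   9/4   1/4    0

So the sampling distribution of $S^2$ is

$S^2$       :  0    1/4    1    9/4
Probability : 3/9   2/9   2/9   2/9

So mean of $S^2 = E(S^2) = 0 \times \frac{3}{9} + 1 \times \frac{2}{9} + \frac{1}{4} \times \frac{2}{9} + \frac{9}{4} \times \frac{2}{9} = \frac{2}{9} + \frac{1}{18} + \frac{1}{2} = \frac{7}{9}$ ... (1)

Now, population variance $\sigma^2 = \frac{1^2 + 3^2 + 4^2}{3} - \left(\frac{1+3+4}{3}\right)^2 = \frac{26}{3} - \left(\frac{8}{3}\right)^2 = \frac{26}{3} - \frac{64}{9} = \frac{14}{9}$.

Then $\frac{n-1}{n}\sigma^2 = \frac{2-1}{2} \times \frac{14}{9} = \frac{7}{9}$ ... (2)

From (1) and (2) the result $E(S^2) = \frac{n-1}{n}\sigma^2$ is verified.

$Var(S^2) = \left\{0^2 \times \frac{3}{9} + 1^2 \times \frac{2}{9} + \left(\frac{1}{4}\right)^2 \times \frac{2}{9} + \left(\frac{9}{4}\right)^2 \times \frac{2}{9}\right\} - \left\{0 \times \frac{3}{9} + 1 \times \frac{2}{9} + \frac{1}{4} \times \frac{2}{9} + \frac{9}{4} \times \frac{2}{9}\right\}^2$

$= \left\{\frac{2}{9} + \frac{1}{16} \times \frac{2}{9} + \frac{81}{16} \times \frac{2}{9}\right\} - \left\{\frac{2}{9} + \frac{1}{18} + \frac{1}{2}\right\}^2 = \frac{49}{36} - \frac{49}{81} = \frac{245}{324}$.

∴ $\sigma_{S^2} = \sqrt{\frac{245}{324}} = 0.87$.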
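The verification in Example 11 can be repeated by brute-force enumeration. The sketch below (helper name hypothetical) lists all ordered samples of size 2 drawn with replacement from {1, 3, 4}, computes each sample variance with divisor $n$ using exact fractions, and then finds the mean and variance of those values.

```python
import math
from fractions import Fraction
from itertools import product


def variance_divisor_n(values):
    """Variance with divisor n (used for both sample and population here)."""
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((Fraction(v) - mean) ** 2 for v in values) / n


population = [1, 3, 4]
n = 2
samples = list(product(population, repeat=n))  # SRSWR, order counted
variances = [variance_divisor_n(s) for s in samples]

mean_s2 = sum(variances) / len(variances)
var_s2 = sum(v * v for v in variances) / len(variances) - mean_s2 ** 2
sigma2 = variance_divisor_n(population)

print("E(S^2)           =", mean_s2)
print("(n-1)/n * sigma^2 =", Fraction(n - 1, n) * sigma2)
print("Var(S^2)         =", var_s2)
print("S.d of S^2       =", round(math.sqrt(var_s2), 2))
```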

Example 12. Samples of weights of 200 students each are drawn from a very large normal population of the weights of students with S.d 10 pounds. The variance in each sample is computed. Find (i) the mean and (ii) the standard deviation of the sampling distribution of the sample variance.
If $S^2$ is the sample variance then the required mean

$E(S^2) = \frac{n-1}{n}\sigma^2 = \frac{200-1}{200} \times 10^2 = \frac{199}{2}$.

We know, $Var\left(\frac{nS^2}{\sigma^2}\right) = 2(n-1)$

∴ $Var\left(\frac{200 S^2}{10^2}\right) = 2 \times (200 - 1)$

or, $Var(2S^2) = 2 \times 199$ or, $2^2\,Var(S^2) = 2 \times 199$

or, $Var(S^2) = \frac{2 \times 199}{4} = \frac{199}{2}$

∴ $\sigma_{S^2} = \sqrt{\frac{199}{2}}$.
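The mean and variance of $S^2$ for a normal population follow directly from the formulas $E(S^2) = \frac{n-1}{n}\sigma^2$ and $Var(S^2) = \frac{2\sigma^4(n-1)}{n^2}$ deduced earlier in this chapter. A small helper (the name `moments_of_sample_variance` is hypothetical) evaluates them exactly and reproduces the figures of Example 12.

```python
import math
from fractions import Fraction


def moments_of_sample_variance(n, sigma_squared):
    """Exact mean and variance of S^2 (divisor n) for a normal population."""
    s2 = Fraction(sigma_squared)
    mean_s2 = Fraction(n - 1, n) * s2
    var_s2 = 2 * s2 ** 2 * Fraction(n - 1, n * n)
    return mean_s2, var_s2


mean_s2, var_s2 = moments_of_sample_variance(200, 10 ** 2)  # Example 12
print(mean_s2, var_s2, round(math.sqrt(var_s2), 2))
```

Note that the function takes the population variance $\sigma^2$, not the S.d, as its second argument.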

Example 13. From a normal population of S.d $\sqrt{20}$ samples of size 5 are drawn. Prove that the probability that the sample variance is greater than 2.336 is 0.90. Find the mean and variance of the sampling distribution of the sample variance. [Given $\chi^2_{0.90;4} = 0.584$]
Here $n = 5$, $\sigma = \sqrt{20}$. Let $S^2$ be the sample variance. Since the population is normally distributed, $\frac{nS^2}{\sigma^2}$ has $\chi^2$ distribution with $5 - 1 = 4$ degrees of freedom.
Given $\chi^2_{0.90;4} = 0.584$. This implies that for a $\chi^2$ variate with 4 degrees of freedom $P(\chi^2 \geq 0.584) = 0.90$,

i.e. $P\left(\frac{nS^2}{\sigma^2} > 0.584\right) = 0.90$

or, $P\left(\frac{5S^2}{(\sqrt{20})^2} > 0.584\right) = 0.90$ or, $P\left(\frac{S^2}{4} > 0.584\right) = 0.90$
or, $P(S^2 \geq 2.336) = 0.90$. Hence proved.

Since the $\chi^2$ variate $\frac{nS^2}{\sigma^2} = \frac{5S^2}{(\sqrt{20})^2} = \frac{S^2}{4}$ has 4 degrees of freedom,

$E\left(\frac{S^2}{4}\right) = 4$ (by properties of $\chi^2$ distribution)

or, $\frac{1}{4}E(S^2) = 4$ or, $E(S^2) = 16$.

∴ Mean of the sampling distribution of variance = 16; and

$Var\left(\frac{S^2}{4}\right) = 2 \times 4 = 8$ (by properties of $\chi^2$ distribution) or,

$\frac{1}{4^2}Var(S^2) = 8$ or, $Var(S^2) = 128$.

∴ Variance of the sampling distribution of variance is 128.


Example 14. The life of an electronic device is normally distributed with mean 4 and variance 6. Ten devices are drawn at random in different ways. Find the probability that the sample variance lies between 1.995 and 10.146. [Given $\chi^2_{0.950;9} = 3.325$; $\chi^2_{0.05;9} = 16.91$]
Find the mean and variance of the sampling distribution of sample variance.
If $S^2$ be the sample variance then $\frac{nS^2}{\sigma^2}$ has $\chi^2$ distribution with $10 - 1 = 9$ d.o.f.

Now $P\left(\frac{nS^2}{\sigma^2} > 3.325\right) = 0.950$ (from the 1st data)

or, $P\left(\frac{10 S^2}{6} > 3.325\right) = 0.950$ or, $P(S^2 > 1.995) = 0.950$ ... (1)

Again $P\left(\frac{nS^2}{\sigma^2} > 16.91\right) = 0.05$ (from the 2nd data)

or, $P(S^2 \geq 10.146) = 0.05$ ... (2)

∴ $P(1.995 \leq S^2 \leq 10.146) = 0.950 - 0.05 = 0.9$.

Exercises 12
1. Ages of 5 persons have been recorded (in years) as 14, 19, 17, 20, 25. For random samples of size 3 drawn without replacement from this population obtain the sampling distribution of $\bar{x}$. Show that the mean of $\bar{x}$ equals the population mean and obtain the S.E of $\bar{x}$, directly from the sampling distribution and also by using the formula.
[Hint: the mean of the distribution of $\bar{x}$ is 19 = population mean]
2. The values of the characteristic $x$ of a population containing 6 units are given by 2, 6, 5, 1, 7, 3. Take all possible samples of size two and verify that the mean of the population is exactly equal to the mean of the statistic, sample mean.
[Hint: Mean of $\bar{x}$ = 4 = population mean]
3. A simple random sample of size 4 is drawn with replacement from a population with S.d 6. What is the standard error of the statistic sample mean?

4. Random samples of size 3 are drawn without replacement from a population consisting of the data 7, 5, 3, 1. Find the mean and variance of the sampling distribution of the sample means. Hence find the standard error of the sample mean.
5. The mean weight of 500 ball bearings is 5.02 gms. Their S.d is 0.30 gm. Find the probability that a random sample of 100 ball bearings would have a combined weight between 496 and 500 gms.
[Given $\int_0^{2.22}\phi(z)\,dz = 0.4868$; $\int_0^{0.74}\phi(z)\,dz = 0.2704$]
6. The masses of 1500 ball bearings are normally distributed. The mean and S.d of those are 22.40 gm and 0.048 gm. 300 random samples of size 36 are drawn from this population. Determine the mean and S.d of the sampling distribution of sample means if the sampling is done (i) without replacement (ii) with replacement.

7. The random variable $X$ is normally distributed with mean 68 cm and S.d 2.5 cm. What should be the size of the sample whose mean shall not differ from the population mean by more than 1 cm with probability 0.95? [Given that the area under the standard normal curve to the right of the ordinate at 1.96 is 0.025] [W.B.U.Tech, 2004]

8. Distribution of life times of tyres manufactured by a company is normal. Samples of nine tyres are drawn and it is observed that the probability of the average life time of these nine tyres to be less than 75 months is 0.90 and to be more than 40 months is 0.46. Find the average life time and standard deviation of all the tyres manufactured by the company.
[Given: area under standard normal curve is 0.54 and 0.90 between the ordinates z = 0.10, y axis and z = 1.28, y axis respectively]
9. The wages of workers in a factory are normally distributed with mean Rs. 400 and standard deviation Rs. 100. Several samples of four are drawn and it is observed that 80 such samples have sample mean less than Rs. 350. How many samples were drawn? Then find the number of samples for which the average wage would be greater than Rs. 450.
[Hint: 1st part: Find $P(\bar{X} < 350) = 0.1587$, then $\frac{80}{n} = 0.1587$. See table for the data under standard normal curve]
10. Samples of 25 workers' wages are drawn from the population of wages of workers in a factory whose distribution is normal with mean Rs. 70 and S.d Rs. 25. Find the probability that the mean of the sample lies (i) between Rs. 66 and Rs. 72 (ii) less than Rs. 66 (iii) more than Rs. 72.
[Given $\int_0^{0.4}\phi(z)\,dz = 0.1554$, $\int_0^{0.8}\phi(z)\,dz = 0.2881$]
11. Samples of certain size are drawn from a normally distributed population with S.d 16. It is observed that the probability of the sample mean lying between 9.8 and 14.6 is 45.14%. Find the sample size and the population mean. [Given that $\int_{-\infty}^{0.6}\phi(z)\,dz = 0.7257$]

12. A machine produces bolts whose lengths are normally distributed with population mean 4 and variance 6.25. A bolt is defective if its length does not lie between 3.8 and 4.3. Different samples of size 25 are drawn from all such bolts. Find the percentage of samples whose sample mean lies in the defective range.
[Given: area under the normal curve enclosed on the left side of the ordinate z = 0.6 is 0.7257; on the left side of the ordinate z = 0.4 is 0.6554]
[Hint: Find $1 - P(3.8 < \bar{X} < 4.3) = 0.62$. Then the required percentage $= 0.62 \times 100$]

13. The average life time of electric bulbs manufactured by a company is 40 months and S.d 50. Samples of size 100 are drawn from the entire lot manufactured. Find the probability that the average life time of the sample is (i) greater than 40 months (ii) greater than 50 months (iii) between 38 months and 52 months.
[Given $\int_{-\infty}^{z}\phi(z)\,dz = 0.9772, 0.6554$ and $0.9918$ when $z = 2, 0.4$ and $2.4$ respectively]

14. From a normal distribution of variance 5 samples of size 20 are drawn. Find the mean and standard deviation of the sampling distribution of the sample variance. Find the probability that the sample variance lies between 8.21 and 9.645. [Given $\chi^2_{.025;19} = 32.84$, $\chi^2_{.005;19} = 38.58$]

15. The radii of the ball bearings are normally distributed with standard deviation $\sqrt{10}$. Samples of size 15 are drawn. Find the mean and S.d of the sampling distribution of the sample variance. Find the probability that the sample variance does not lie between 2.90 and 4.70.

[ Given, X~95;14 = 4.35, d95;14 = ~.6 , X~8;14 = 7.05, X~5;15 = 7.3]


16. Samples of certain size are drawn from a normally distributed population. It is found that the mean and variance of the sampling distribution of the sample variance are respectively $\frac{55}{12}$ and $\frac{275}{72}$. Find the variance of the population and the size of the samples drawn. Then find the probability of the sample variance exceeding 13.025. [Given that $\chi^2_{.001;11} = 31.26$]
Answers
3. 3   4. 4, $\frac{5}{9}$, $\frac{\sqrt{5}}{3}$   5. 0.2164

6. (i) 22.4, approx 0.008 (ii) 22.40, 0.008   7. 25

8. Mean = 37.03, S.d = 89.1   9. 504; 80
10. (i) 0.4435 (ii) 0.2119 (iii) 0.3446   11. 16; 12.2   12. 62%

13. (i) 0.5 (ii) 0.0228 (iii) 0.6472   14. $\frac{19}{4}$, $\sqrt{\frac{19}{8}}$; 0.02
15. $\frac{28}{3}$, $\frac{112}{9}$; 0.955   16. 5, 12; 0.001.

Multiple Choice Questions

1. Which one of the following is correct?

(a) Sample is an element of Population.
(b) Population size < sample size.
(c) Sample size ≤ Population size.
(d) Population ⊂ sample.
2. Which one of the following is correct?
(a) Sample ⊂ Population.
(b) Sample and Population are two different elements.
(c) Sample is always a singleton set.
(d) Population must be an infinite set.
3. Which one of the following is correct?
(a) Calculation of statistical measures of a sample is sampling.
(b) Process of drawing a sample from a Population is sampling.
(c) To make a partition in population is sampling.
(d) There should be some minimum number of elements in a sample.
4. Sample error depends on
(a) size of the sample
(b) mentality of the person who draws the samples
(c) size of the population.
(d) none of these.
5. Sampling error depends on
(a) the size of population
(b) variability of the data in the sample
(c) Population size
(d) Variability of the data in the population.
6. There is only one way of sampling
(a) True (b) False.

7. In a simple random sampling
(a) the drawing of the data is simple
(b) every data of the population is equi-likely to be included in the sample.
(c) probability of inclusion in the sample may not be the same for each data of the population.
(d) any arbitrary method of selection of data is accepted.
8. Find which one of the following is not simple random sampling:
(a) Picking a number at random from the set of integers
(b) Picking a person from the set of all persons of Kolkata
(c) Picking the first four members of the set of integers, {1, 2, 3, 4, ...}.
(d) every one of these.
9. In SRSWR
(a) the size of the population goes on decreasing after each drawing.
(b) the size of the population does not go on decreasing after each drawing.
(c) the sample size becomes equal to that of the population size.
(d) none of these.
10. An observation may repeat in a
(a) SRSWR sample (b) SRSWOR sample
(c) any type of sample (d) none of these.
11. Under SRSWR the possible samples from the population {2, 6, 7} are
(a) {2,2}, {2,6}, {6,7}
(b) {2,6}, {2,7}, {6,7}
(c) {2,2}, {6,6}, {7,7}, {2,6}, {2,7}, {6,7}
(d) none of these.

12. Under SRSWOR all the possible samples from the population
{x,y,z} are

(a) {x,x},{x,z} (b) {x,x},{y,y},{z,z}

(c) {x,y}, {y,z} (d) {x,y}, {x,z}, {y,z} .


13. Number of all possible SRSWOR samples of size three of the
population {3,5,7,9} is

(a) 1 (b) 2 (c) 3 (d) 4.

14. Number of all possible SRSWR samples

(a) 64 (b) 4 (c) 62 (d) 60.

15. SRSWOR of size two is done from the population {2, 3, 4, 6}. Which one of the following is a sample mean?
(a) 5 (b) $\frac{5}{2}$ (c) 6 (d) $\frac{1}{2}$.
16. SRSWR of size one is done from the population {1, 2, 4}. Which one of the following is a sample mean?
(a) 3 (b) $\frac{5}{2}$ (c) $\frac{3}{2}$ (d) 2.
17. Sample standard deviation is

(a) a fixed quantity. (b) a variable quantity.

(c) always zero. (d) none of these.

18. If $\{x_i, x_j\}$ $(i, j = 1, 2, 3, \ldots, 10)$ be a sample drawn from the population $\{x_1, x_2, x_3, \ldots, x_{10}\}$ then which one of the following is not a statistic?

2 2
(b) Xi +Xj

(d) X2
I +x/. _

19. If $\{x_i, x_j\}$ $(i, j = 1, 2, 3, \ldots, 10)$ be a sample drawn from the population $\{x_1, x_2, x_3, \ldots, x_{10}\}$, then which one of the following is not a parameter?

(a) x; + x j (b) x/ + x /
9 10
(c) s, + LX; (d) LX; .
;;1 ;;1

20. If {6} is a sample drawn from the population {2,4,6,8} then sample
standard deviation is
(a) 1 (b) 0.5 (c) 0 (d) 0.8.
21. Which one of the following is incorrect
(a) s.d of a sample is a statistic.
(b) mean of a sample is a parameter.
(c) variance of population is a parameter.
(d) s.d of population is a parameter.
22. Sampling distribution is the
(a) distribution of population.
(b) distribution of a sample.
(c) distribution of a sample statistic.
(d) distribution of a parameter.
23. If a sample of size two is drawn from a population {4, 6, 8} at a time then the sampling distribution of the sample mean is

(a) $\bar{x}$ :  10   12   14
    f   : 1/3  1/3  1/3

(b) $\bar{x}$ : 1/5  1/6  1/7
    f   : 2/3  1/3  1/3

(c) $\bar{x}$ :   5    6    7
    f   : 1/3  1/3  1/3

(d) none of these.

24. s.d of samples has a distribution but s.d of a population has no distribution
(a) True (b) False.
25. Standard error is standard deviation of
(a) a parameter (b) sample mean only
(c) a statistic (d) none of these.
26. If $t$ is a statistic having standard error 3 then the variance of $t$ is
(a) 9 (b) 6 (c) $\sqrt{3}$ (d) none of these.
27. If $t$ is a statistic such that $E(t^2) = 5$ and $E(t) = 2$ then the standard error of $t$ is
(a) 0 (b) 1 (c) 1.5 (d) 2.
28. Standard error of a statistic depends on
(a) population size (b) observations of sample.
(c) observations of population (d) sample size.
29. Which one of the following is incorrect?
(a) Greater the S.E, lesser the reliability of the sample.
(b) S.E decreases as sample size increases.
(c) S.E shows the variability of the observations of the population.
(d) S.E shows the variability of the values assumed by the concerned statistic.
30. If $\bar{X}$ is the sample mean of a population having normal distribution then the sampling distribution of $\bar{X}$ is
(a) normal (b) $\chi^2$
(c) uniform (d) exponential.
31. The mean and s.d of a normal population are 3 and 0.2. Then the standard error of the sample mean, with sample size 100, is
(a) 0.2 (b) 0.02 (c) 0.002 (d) 0.6.
32. A population has normal distribution with parameters $m = 5$ and $\sigma = 0.1$. Then $\bar{X}$ (the sample mean with sample size 25) is a normal variable with mean and s.d
(a) 5, 0.02 (b) 5, 0.1 (c) 5, 0.2 (d) none of these.

33. The s.d of the sampling distribution of sample mean for the population
with s.d 0.5, population size 122 and sample size 22 is
:; 05
(a) 22 (b) ~
5
(c) 11m (d) none of these.

34. For any large population, for large values of sample size the sampling
distribution of sample mean has
(a) exactly normal distribution
(b) approximately normal distribution
(c) uniform distribution
(d) t-distribution.
35. The sampling distribution of the sample mean for a large population
is approximately normal if the sample size is
(a) 2 (b) 5
(c) 10 (d) 100.
36. Consider all possible samples of size two that can be drawn with
replacement from a population with mean 6 and variance 10.8. Then the
standard error of the sample mean is

(a) 2.32 (b) 2

(c) 8.32 (d) 3.29.


37. The mean of the sample variance with sample size 9 drawn from an
infinite population with s.d 3 is
(a) 6 (b) 7
(c) 9 (d) 8.
38. If a sample of size $n$ is drawn from an infinite population with s.d $\sigma$ then, for the sample s.d $S$,

(a) $E\left(\frac{n}{n-1}S^2\right) = \sigma^2$ (b) $E(S^2) = \sigma^2$

(c) $E\left(\frac{n}{n-1}S\right) = \sigma^2$ (d) none of these.

39. If $S$ be the sample standard deviation where a sample of size 12 is drawn from a very large population with s.d 2 and if $E(t) = 4$ then $t =$
11 ?
(b) 12S-
12 2
(c) -S (d) none of these.
11
40. If $S$ be the s.d of a sample of size 4 drawn from a normal population with s.d 2 then $S^2$ has

(a) normal distribution (b) standard normal distribution

(c) $\chi^2$ distribution (d) t-distribution.

41. If $S$ be the s.d of a sample of size 9 drawn from a normal population with s.d 2 then $\frac{9}{4}S^2$ has $\chi^2$ distribution with degrees of freedom

(a) 9 (b) 8

(c) 7 (d) 10.


42. From a population with 10 members a random sample (without
replacement) of 2 members is drawn. The possible number of such sample
is (if order is ignored).

(a) 40 (b) 45

(c) 50 (d) none of these.

43. The possible number of samples of size two drawn with replacement from a population of size 25 is

(a) 625 (b) 600

(c) 300 (d) none of these.

44. A normal population has a mean 0.1 and s.d 2.1. The mean of the sampling distribution of the sample mean with sample size 900 is

(a) 1 (b) 0.1

(c) 0.001 (d) none of these.

45. A normal population has s.d 2.1. The standard error of the sampling distribution of the sample mean with sample size 900 is

(a) 0.07 (b) 7 (c) 0.007 (d) 0.002.

46. The mean and s.d of a normal population are 100 and 24 respectively. If $\bar{x}$ is the sample mean with sample size $n$ then $\frac{\bar{x} - 100}{24\sqrt{n}}$ is a standard normal variate
(a) True (b) False.
47. The mean weight of 500 guinea pigs is 5.02 gms and their s.d is 0.30. Samples of 100 guinea pigs are drawn. The s.d of the sample mean is
(a) 0.27 (b) 27

(c) 0.027 (d) none of these.

48. Samples of weights of 200 persons each are drawn from a normal population of the weights of persons with s.d 10. The mean of the sample variance is
(a) 20 (b) 200

(c) $\frac{197}{2}$ (d) $\frac{199}{2}$
Answers
1.c 2.a 3.b 4.a 5.d 6.b 7.b 8.c

9.b 10.a 11.c 12.d 13.d 14.a 15.a 16.d

17.b 18.c 19.d 20.c 21.b 22.c 23.c 24.a

25.c 26.a 27.b 28.d 29.c 30.a 31.b 32.a

33.c 34.b 35.d 36.a 37.d 38.a 39.c 40.c

41.b 42.b 43.a 44.b 45.a 46.b 47.c 48.d


13.1. Introduction:
In practice we make decisions about populations on the basis of sample information. In this chapter an assumption is made on the population and, by going through some statistical analysis of the sample information, the validity of the assumption is tested. For example, we decide on the basis of the sample data whether a new drug is really effective in curing a disease.

13.2. Statistical Hypothesis


Any assumption taken on a population, regarding its probability distribution or its parameters, is called a 'Statistical Hypothesis'. Such a 'Hypothesis' may or may not be true.
For example, let us assume that the mean of a population is 60. Then "$\mu = 60$" is a statistical hypothesis. Drawing samples from this population we can test the validity of this hypothesis and this is test of hypothesis.
There are two types of hypothesis. These are discussed below:
Simple Hypothesis
A Statistical Hypothesis which specifies the probability distribution and all related parameters of a population is called simple hypothesis.
Illustration: Consider the population of lifetimes of electric bulbs manufactured by a company. Let the lifetime (denoted by $X$) be normally distributed with standard deviation 4. We have to test its mean. Let us assume the mean $\mu = 600$ hours and we shall test its validity. Then we see that under this hypothesis the entire character of the population is specified. So this hypothesis "$\mu = 600$" is a simple hypothesis.
Composite Hypothesis
A Statistical Hypothesis which does not specify the population
completely is called composite hypothesis.
Illustration: Consider the population of weekly wages of the workers of a big industry. Let the population be normally distributed whose mean and s.d are unknown. We have to test its mean. Suppose we assume its mean $\mu = 60$ and go to test its validity.
Then we see that under this hypothesis the entire character of the population is not known because the standard deviation remains unknown. So this hypothesis "$\mu = 60$" is a composite hypothesis.
Null Hypothesis
Test of Hypothesis starts with a statistical Hypothesis. A statistical hypothesis whose possible acceptance or rejection is tested on the basis of sample observation is called a Null Hypothesis. Usually it is denoted by $H_0$.
Illustration. Suppose we assume that "the mean of a population is 40". Let a random sample drawn from this population have mean 38. By going through some statistical analysis we have to test whether our assumption may be accepted or rejected. Then the assumption "$\mu = 40$" is a Null Hypothesis; in notation $H_0(\mu = 40)$.
Alternative Hypothesis.
A statistical hypothesis which is different from the Null Hypothesis is called Alternative Hypothesis. It is denoted by $H_1$.
Illustration. Consider the null hypothesis $H_0(\mu = 60)$. Then $H_1(\mu \neq 60)$ is an alternative hypothesis. Again if it is seen from the facts that there is no chance of $\mu$ being less than 60 then we may also take $H_1(\mu > 60)$ as an Alternative Hypothesis to $H_0(\mu = 60)$. Similarly $H_1(\mu < 60)$ may also be considered as an Alternative Hypothesis.
Both Sided Alternative Hypothesis.
Let $H_0(\theta = \theta_0)$ be a null hypothesis where $\theta$ is a parameter. Then the alternative hypothesis $H_1(\theta \neq \theta_0)$ is called both sided alternative hypothesis.
For example, the alternative hypothesis $H_1(\mu \neq 60)$ is a both sided alternative hypothesis against the null hypothesis $H_0(\mu = 60)$, where $\mu$ is the mean of the population.
One Sided Alternative Hypothesis.
Let $H_0(\theta = \theta_0)$ be a null hypothesis where $\theta$ is a parameter. Then the alternative hypothesis $H_1(\theta > \theta_0)$ is called one sided (Right) Alternative Hypothesis. The alternative hypothesis $H_1(\theta < \theta_0)$ is called one sided (Left) Alternative Hypothesis.
For example, the alternative hypothesis $H_1(\mu > 60)$ is a one sided alternative hypothesis against the null hypothesis $H_0(\mu = 60)$.
13.3. Test Statistic.
To test the possible acceptance or rejection of a null hypothesis we have to take the help of a statistic whose sampling distribution is known. By evaluating the value of the statistic we decide whether the Null Hypothesis is to be rejected or not. This statistic is called Test-Statistic.
Illustration. Let $H_0(\mu = 60)$ be a Null Hypothesis where $\mu$ is the population mean. Let the population be normally distributed with s.d. $\sigma = 3$. We consider the statistic $\bar{x}$, the sample mean. Its sampling distribution is known to be normal with mean 60 and s.d. $\frac{3}{\sqrt{15}}$ where 15 is the size of the sample. We compute the value of $\bar{x}$ for a sample drawn from this population and from this computed value we decide whether $H_0$ is to be accepted or not. So here $\bar{x}$ is the test statistic.
Note: Selection of the test statistic depends on the character of the population, the sample and the Null Hypothesis.
13.4. Critical Region, Region of Acceptance and Level of
Significance.
Let $H_0$ be a null hypothesis and $t$ be the appropriate test statistic by
which we decide whether $H_0$ would be accepted or not. This decision is
based on probability considerations. Usually a small probability $\alpha$ (e.g. 0.05
or 0.01) is taken under consideration and an interval $(a, b)$ is chosen such
that $P(a \le t \le b) = 1 - \alpha$. If a computed value of $t$ lies within this interval
$(a, b)$ then we decide that $H_0$ is accepted at "$\alpha$ Level of Significance" or
"$100\alpha$% Level of Significance". The interval $(a, b)$ is called the Region of
Acceptance corresponding to "$\alpha$ Level of Significance".
If a computed value of $t$ lies beyond the interval $(a, b)$ we decide that $H_0$
is rejected at "$\alpha$ Level of Significance" or "$100\alpha$% Level of
Significance". This region beyond the interval $(a, b)$ is called the "Critical
Region" (CR) of the test at "$\alpha$ Level of Significance".
Note: (1) The region of acceptance/Critical region is not unique
corresponding to a 'Level of Significance'.
(2) Thus if a computed value of the appropriate test statistic $t$ falls in a
Critical Region then $H_0$ is rejected.
(3) The 'falling of the computed value of $t$ in the CR' is complementary to
the 'falling of the computed value of $t$ in the Region of Acceptance'.
Thus, Probability that a computed value of $t$ lies within the Critical Region
= 1 - Probability that a computed value of $t$ lies within the Region of
Acceptance $= 1 - P(a < t < b) = 1 - (1 - \alpha) = \alpha$.
(4) It would be seen in the later sections of this chapter that when the
pdf of the test statistic $t$ is symmetric about 0 it is best to take the region
of acceptance as $(-a, a)$ (i.e. an interval symmetric about 0) if the
alternative hypothesis is taken both sided.
(5) It would be seen in the later sections of this chapter that it is best to
take the region of acceptance as $(-\infty, a)$ (i.e. C.R. as $t > a$) if the
alternative hypothesis is taken right sided and $(a, \infty)$ (i.e. C.R. as $t < a$)
if the alternative hypothesis is taken left sided.
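The choices described in Notes (4) and (5) can be sketched numerically for the common case where the test statistic is a standard normal variate. This is only an illustrative sketch: the function name `acceptance_region` and the labels "both", "right" and "left" are assumptions made here, not notation from the text, and the end points come from Python's standard library rather than from a statistical table.

```python
from statistics import NormalDist

def acceptance_region(alpha, alternative):
    """Region of acceptance (low, high) for a standard normal test statistic.

    alpha is the level of significance; alternative is "both", "right" or "left".
    """
    std_normal = NormalDist()  # mean 0, s.d. 1
    if alternative == "both":
        a = std_normal.inv_cdf(1 - alpha / 2)  # area alpha/2 in each tail
        return (-a, a)
    if alternative == "right":
        return (float("-inf"), std_normal.inv_cdf(1 - alpha))  # CR is t > a
    if alternative == "left":
        return (std_normal.inv_cdf(alpha), float("inf"))  # CR is t < a
    raise ValueError("alternative must be 'both', 'right' or 'left'")

both = acceptance_region(0.05, "both")
right = acceptance_region(0.05, "right")
left = acceptance_region(0.05, "left")
```

At the 5% level this gives end points close to the tabulated values 1.96 (both sided) and 1.645 (one sided).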
Illustrations. (i) Let $\mu$ be the mean of the population of lifetimes of
electric bulbs manufactured by a company. Let the population have normal
distribution with s.d. 3.30. We want to test the null hypothesis
$H_0(\mu = 171.17)$ by drawing samples of size 400 from the population. We
know the sampling distribution of the sample mean $\bar{x}$ is normal
with mean 171.17 and s.d. $= \frac{3.30}{\sqrt{400}} = 0.165$. We select $\bar{x}$ as
the test statistic. Now $z = \frac{\bar{x} - 171.17}{0.165}$ is a standard normal variate. We take
the alternative hypothesis as $H_1(\mu \neq 171.17)$, both sided. Since
$P(-1.96 < z < 1.96) = 0.95 = 1 - 0.05$ (obtained from the statistical table), it
is best to consider $(-1.96, 1.96)$ as the region of acceptance at 5% level
of significance and the region beyond it, i.e. the region $|z| \ge 1.96$, as the
critical region at 5% level of significance. (The reason for such a selection
would be discussed later.)
Now if it is seen that a random sample (of size 400) gives its mean
$\bar{x} = 171.38$ then we compute the value of $z = \frac{171.38 - 171.17}{0.165} = 1.27$.
We see this value lies in the region of acceptance $(-1.96, 1.96)$. So we
may conclude $H_0$ is accepted at 5% level of significance, i.e. the average
lifetime of the bulbs manufactured by the company may be 171.17. This
decision of ours is 95% true.
Since the region of acceptance determined by $z$ is
$-1.96 < z < 1.96 \Rightarrow -1.96 < \frac{\bar{x} - 171.17}{0.165} < 1.96 \Rightarrow 170.85 < \bar{x} < 171.49$,
the region of acceptance determined by the test statistic $\bar{x}$ is
$(170.85, 171.49)$; consequently the CR determined by $\bar{x}$ is $\bar{x} < 170.85$
together with $\bar{x} > 171.49$.
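The arithmetic of this illustration can be checked with a few lines of Python. The variable names below are chosen here for illustration only; the numbers are those of the bulb example, and 1.96 is the tabulated 5% both sided value quoted above.

```python
from math import sqrt

mu0, sigma, n = 171.17, 3.30, 400   # hypothesised mean, population s.d., sample size
x_bar = 171.38                      # observed sample mean
z_crit = 1.96                       # P(-1.96 < z < 1.96) = 0.95

se = sigma / sqrt(n)                # s.d. of the sample mean
z = (x_bar - mu0) / se              # computed value of the test statistic

# Region of acceptance expressed in terms of the sample mean itself
low = mu0 - z_crit * se
high = mu0 + z_crit * se
accept_h0 = low < x_bar < high
```

Both routes agree: z lies inside (-1.96, 1.96) exactly when the sample mean lies inside (low, high).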
13.5. Type I Error and Type II Error [ W.B.U.Tech 2005 ]
A null hypothesis $H_0$ is tested on the basis of sample values only. For
this reason there is no guarantee that we always take the right decision regarding
acceptance or rejection of $H_0$. An error may occur in taking the
decision. Two types of error may exist:
Type I Error: This error is made when a Null Hypothesis is rejected
though it was really true.
Type II Error: This error is made when a Null Hypothesis is accepted
though it was really false.
Note: In order to ensure a good test of hypothesis, the testing should
be designed so as to minimize the Type I and Type II errors. This is not
so simple because, for any given sample size, an attempt to decrease one
type of error is accompanied by an increase in the other type of error. In
practice one type of error may be more serious than the other. Thus it is
wise to reach a compromise to limit the more serious error.
Probability of Type I Error.
Let $H_0(\theta = \theta_0)$ be a Null Hypothesis and $t$ be the test statistic. Now, the
probability of Type I error = Probability of rejection of $H_0(\theta = \theta_0)$ on
the hypothesis that $H_0$ is true
$= P\{(\text{computed value of } t \text{ lies in C.R.}) \mid \theta = \theta_0 \text{ is true}\}$ (this is a
conditional probability)
= Level of Significance of the test.
This is also called the Size of Type I Error.
Illustration. Let $p$ be the proportion of defective items in a large lot,
i.e. $p$ = (No. of defective items in the lot) ÷ (No. of items in the lot). Let
$H_0(p = 0.2)$ be a null hypothesis. A sample of 8 items is drawn from the
lot; $H_0$ is accepted if the number of defective items ($f$) in the sample
is $\le 6$. Here $p$ is the parameter, $f$ is the test statistic and the CR is [7, 8].
Now, $P(f = r)$ = Probability of $r$ defective items in the sample of 8
articles $= {}^{8}C_r\, p^r (1 - p)^{8-r}$ ($\because f$ is an $(8, p)$ binomial variate). So the
probability of Type I error = Probability of rejection of $H_0$ on the
hypothesis that $H_0(p = 0.2)$ is true
$= P(f = 7, 8)$ on the hypothesis $p = 0.2$
$= P(f = 7) + P(f = 8)$ assuming $p = 0.2$
$= {}^{8}C_7 (0.2)^7 (1 - 0.2)^{8-7} + {}^{8}C_8 (0.2)^8 (1 - 0.2)^{8-8}$
$= 0.00008448$. This is also the level of significance of the test.
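The binomial sum above is easy to verify by direct computation. The helper name `binom_pmf` is introduced here for illustration; it simply evaluates the binomial probability formula used in the illustration.

```python
from math import comb

def binom_pmf(r, n, p):
    """P(f = r) for an (n, p) binomial variate."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

n, p0 = 8, 0.2
critical_region = [7, 8]            # H0 is rejected when f = 7 or f = 8

# Probability that f falls in the critical region when p = 0.2 is true
type_1_error = sum(binom_pmf(r, n, p0) for r in critical_region)
```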
Probability of Type II Error.
Let $H_0(\theta = \theta_0)$ be a Null Hypothesis and $t$ be the test statistic. Now, the
probability of Type II error = Probability of acceptance of $H_0(\theta = \theta_0)$
on the hypothesis that $\theta \neq \theta_0$.
Now $\theta$ may assume any value other than $\theta_0$. So the probability of
this type of error depends upon the value taken by $\theta$. Say
$\theta$ assumes the value $\theta_1$ (which is $\neq \theta_0$). Then the probability of Type II
Error, assuming $\theta = \theta_1$, is the probability of acceptance of $H_0$ on the
hypothesis that $\theta$ assumes the value $\theta_1$
$= P\{(\text{computed value of } t \text{ falls in Region of Acceptance}) \mid \theta = \theta_1\}$

Illustration. Let $p$ be the proportion of defective items in a large lot,
i.e. $p$ = No. of defective items ÷ total number of items in the lot. Let
$H_0(p = 0.2)$ be a null hypothesis. A sample of 8 items is drawn from the
lot; $H_0$ is accepted if the number of defective items ($f$) in the sample is
$\le 6$. Here $p$ is the parameter, $f$ is the test statistic and the CR is [7, 8]. Now
$P(f = r)$ = probability of $r$ successes in 8 trials $= {}^{8}C_r\, p^r (1 - p)^{8-r}$ ($\because f$ is an
$(8, p)$ binomial variate).

Now to find the probability of Type II error we have to assume
a value of $p$ other than 0.2. Suppose $p = 0.1$.
∴ Probability of Type II error (when $p = 0.1$ is true)
= Probability of acceptance of $H_0$ assuming $p = 0.1$
$= P(0 \le f \le 6)$ assuming $p = 0.1$
$= 1 - \{P(f = 7) + P(f = 8)\}$ assuming $p = 0.1$
$= 1 - \{{}^{8}C_7 (0.1)^7 (0.9)^{8-7} + {}^{8}C_8 (0.1)^8 (0.9)^{8-8}\}$ [$\because f$ is an $(8, 0.1)$ Binomial variate]
$= 0.99999927$.
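Since the probability of Type II error depends on the value assumed for p, it can help to tabulate it for a few alternatives. The sketch below does this for the same test (accept if f ≤ 6); the function name `type_2_error` and the particular list of alternative values are assumptions made here for illustration.

```python
from math import comb

def binom_pmf(r, n, p):
    """P(f = r) for an (n, p) binomial variate."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def type_2_error(p_true, n=8, critical_region=(7, 8)):
    """Probability of accepting H0 (f outside the CR) when p = p_true."""
    return 1 - sum(binom_pmf(r, n, p_true) for r in critical_region)

# Type II error for a few assumed true values of p (all different from 0.2)
table = {p: type_2_error(p) for p in (0.1, 0.3, 0.5, 0.9)}
```

The entry for p = 0.1 reproduces the value found above, and the error shrinks as the assumed p moves towards 1, where 7 or 8 defectives become likely.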
Note: From the above two illustrations it is easy to understand that if the
probability of Type I error (or Type II error) is given, we can find the
critical region. Students are advised to practise this reverse process, i.e. to
find the CR given that the probability of Type I error is 0.00008448 in
the previous illustration. This will be helpful in understanding the notion of
Best Critical Region which is discussed below.
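One way to carry out this reverse process is sketched below. It assumes, as in the illustration, that the critical region consists of the largest values of f, and it grows that region downwards from f = 8 until adding another value would push the probability of Type I error above the given one. The function name and the small tolerance used to guard against floating point round-off are choices made here, not part of the text.

```python
from math import comb

def binom_pmf(r, n, p):
    """P(f = r) for an (n, p) binomial variate."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def upper_tail_critical_region(n, p0, alpha, tol=1e-12):
    """Largest region {k, ..., n} whose probability under p0 does not exceed alpha."""
    region, size = [], 0.0
    for r in range(n, -1, -1):          # try f = n, n-1, ...
        prob = binom_pmf(r, n, p0)
        if size + prob > alpha + tol:   # adding r would exceed the given size
            break
        region.insert(0, r)
        size += prob
    return region, size

region, size = upper_tail_critical_region(8, 0.2, 0.00008448)
```

For the data of the illustration this recovers the critical region [7, 8].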
13.6. Best Critical Region.
Let $H_0(\theta = \theta_0)$ be a null hypothesis which is to be tested, by the test
statistic $t$, against the alternative hypothesis $H_1$. The test is a procedure
depending on the choice of a region of acceptance or a Critical Region. It
would be good if we could select a critical region so as to minimize both the
Type I and Type II errors. This is not so simple because a decrease in one type
of error is accompanied by an increase in the other type of error. (This can be seen
from the two previous illustrations.)

To obtain the best test we find a critical region (e.g. the region where
$|t| > a$) corresponding to a level of significance $\alpha$ such that the probability
of Type I error relative to this CR is $\alpha$, i.e. $P$(computed value of $t$ lies in
the C.R. assuming $\theta = \theta_0$) $= \alpha$. There may exist several CRs like this.
Among these, that critical region is called the Best Critical Region for which
the Type II error is least. We say this is the Best Critical Region or simply the
Critical Region (CR) corresponding to $\alpha$ level of significance.
Illustration. Suppose we are to test the null hypothesis $H_0(\mu = 52)$
against the alternative hypothesis $H_1(\mu = 49)$ where $\mu$ is the mean of a
population having normal distribution with s.d. 5. We do this by drawing
a sample of size 25 from the population. Since the sample mean $\bar{x}$ has
$\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$ normal distribution we take the test statistic as

$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{\sqrt{n}(\bar{x} - \mu)}{\sigma} = \frac{\sqrt{25}(\bar{x} - \mu)}{5} = \bar{x} - \mu$

which is a standard normal variate. Take the level of significance of the test
as $\alpha = 0.01$. Assuming $\mu = 52$ we see $z = \bar{x} - 52$.
From the statistical table we have
$P(0 < z < .03) \approx 0.01$
$P(-.900 < z < -.806) = .3159 - .3051 \approx .01$
$P(2.29 < z < 3.1) = .4990 - .4890 = .01$
$P(-\infty < z < -2.32) = .01$

Thus the intervals like $(0, .03)$, $(-.900, -.806)$, $(2.29, 3.1)$, $(-\infty, -2.32)$ are all
Critical Regions corresponding to 0.01 level of significance. There are more
CRs like these.
Now it can be shown that among all these CRs, for the CR $(-\infty, -2.32)$
the probability of Type II error assuming $\mu = 49$ is least (this is not shown
due to lengthy calculation). So we conclude the region $(-\infty, -2.32)$, i.e.
$-\infty < z < -2.32$, is the Best Critical Region. This is displayed in the figure
by shading the region.

Now if for a sample its mean $\bar{x} = 50$ then the computed value of
$z = 50 - 52 = -2$ does not lie in the best critical region. Hence we accept
$H_0$ as true at 0.01 level of significance, i.e. we think the mean of the
population may be 52 - this decision is 99% correct.
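The calculation that the text omits can be sketched numerically. When μ = 49 is true, z = x̄ - 52 is normal with mean -3 and s.d. 1, so the Type II error of any candidate region can be computed directly. The candidate regions below are built from exact normal quantiles so that each has probability exactly 0.01 under the null hypothesis; they are illustrative choices made here and differ slightly from the table-based intervals quoted in the text.

```python
from statistics import NormalDist

null_dist = NormalDist(0, 1)    # distribution of z = x_bar - 52 when mu = 52
alt_dist = NormalDist(-3, 1)    # distribution of z = x_bar - 52 when mu = 49
alpha = 0.01

def region_from(lower_prob):
    """Interval with probability alpha under H0, starting at the given null quantile."""
    lo = float("-inf") if lower_prob == 0 else null_dist.inv_cdf(lower_prob)
    hi = null_dist.inv_cdf(lower_prob + alpha)
    return lo, hi

def type_2_error(region):
    """Probability that z falls outside the region when mu = 49 is true."""
    lo, hi = region
    p_lo = 0.0 if lo == float("-inf") else alt_dist.cdf(lo)
    return 1 - (alt_dist.cdf(hi) - p_lo)

candidates = [region_from(q) for q in (0.0, 0.2, 0.5, 0.98)]
errors = [type_2_error(r) for r in candidates]
best = candidates[errors.index(min(errors))]
```

Among these candidates the left tail, roughly z < -2.33, has by far the smallest Type II error (about 0.25), in agreement with the Best Critical Region stated above.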
Power of a Test.
From the previous discussion we see that the goodness of a test depends
upon the choice of the critical region. As the probability of Type II error
decreases the goodness of the test increases. This notion leads to
the definition:

Power of a Test = 1 - Probability of Type II Error assuming $H_1$.

So if the Best Critical Region can be selected the Power of the Test
becomes highest.
Illustrative Example.
Example 1. In order to test whether a coin is perfect the coin is tossed 5
times. The null hypothesis of perfectness is rejected if more than 4 heads
are obtained. What is the probability of Type I Error? Find the probability of
Type II Error when the corresponding probability of head is 0.2.
[ W.B.U.Tech 2007 ]
Let $p$ be the proportion of heads obtained among the 5
throws. The coin is perfect if $p = \frac{1}{2}$, since the probability of head for a
perfect coin is 1/2. So the null hypothesis is $H_0\left(p = \frac{1}{2}\right)$. $H_0$ is rejected
if the number of obtained heads $f > 4$.
Now, $P(f = r)$ = Probability of $r$ heads in 5 tosses
$= {}^{5}C_r\, p^r (1 - p)^{5-r}$ [$\because f$ is a $(5, p)$ Binomial variate].

So the probability of Type I Error = Probability of rejection of $H_0$
on the hypothesis that $H_0\left(p = \frac{1}{2}\right)$ is true $= P(f = 5)$ on the hypothesis $p = \frac{1}{2}$
$= {}^{5}C_5 \left(\frac{1}{2}\right)^5 \left(1 - \frac{1}{2}\right)^{5-5} = \frac{1}{32}$.
Now, Probability of Type II Error when $p = 0.2$
= Probability of acceptance of $H_0$ assuming $p = 0.2$
$= P(f \le 4)$ assuming $p = 0.2$ $= 1 - P(f = 5)$ assuming $p = 0.2$
$= 1 - {}^{5}C_5 (0.2)^5 (1 - 0.2)^{5-5} = 0.99968$.
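Both error probabilities of this example follow the same pattern, so they can be wrapped in one small function. The name `error_probabilities` and its argument order are assumptions made here for illustration; the function only applies the binomial formula used in the solution.

```python
from math import comb

def binom_pmf(r, n, p):
    """P(f = r) for an (n, p) binomial variate."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def error_probabilities(n, critical_region, p_null, p_alt):
    """Return (Type I, Type II) error probabilities for a test on a binomial count.

    Type I : f falls in the critical region although p = p_null.
    Type II: f falls outside the critical region although p = p_alt.
    """
    type_1 = sum(binom_pmf(r, n, p_null) for r in critical_region)
    type_2 = 1 - sum(binom_pmf(r, n, p_alt) for r in critical_region)
    return type_1, type_2

# Coin tossed 5 times, H0 rejected only when all 5 tosses are heads
type_1, type_2 = error_probabilities(5, [5], 0.5, 0.2)
```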
Exercises 13
1. (a) In order to test whether a coin is perfect, I shall toss it 6 times.
I shall reject the null hypothesis of perfectness if and only if I get no head
or 6 heads. What is the probability of Type I error for my test?
(b) To test the unbiasedness of a die it is thrown six times, and
unbiasedness is accepted if not more than one six is obtained. Find
the probability of Type I Error. [ W.B.U.Tech 2005 ]

2. The null hypothesis $H_0(\mu = 7)$ is tested against $H_1(\mu = 6)$ where
$\mu, \sigma$ are the mean and s.d. of a normal distribution. Given $\sigma = 2$. The test is
performed by drawing a random sample of size 25 from the given
population and using the best critical region at 0.16 level of significance.
Find the probability of Type II error.

3. The proportion of defective items in a large lot of items is $p$. To
test the hypothesis $p = 0.2$, we take a random sample of 8 items and accept
the hypothesis if the number of defectives in the sample is 6 or less. Find
the probability of Type I error of the test. What is the probability of Type II error if
$p = 0.3$? [ W.B.U.Tech 2008 ]

4. Let $p$ denote the probability of getting a head when a given coin
is tossed once. Suppose that the hypothesis $H_0: p = 0.5$ is rejected in favour
of $H_1: p = 0.6$ if 10 trials result in 7 or more heads. Calculate the probability
of Type I and Type II error.

5. Given the density function $f(x, \theta) = \frac{1}{\theta}$, $0 \le x \le \theta$
$= 0$, elsewhere
and that you are testing the null hypothesis $H_0: \theta = 1$ against $H_1: \theta = 2$ by
means of a single observed value $x_1$. Determine the size (i.e. probability)
of Type I and Type II error if you choose the interval $0.5 \le x$ as the critical
region.
Answers
1. (a) 1/32 (b) 0.263 3. 0.00008448; 0.99870967
4. 11/64, 1 - 0.382 = 0.618 5. 0.5, 0.25
Multiple Choice Questions


1. Consider the normal population of life times of tyres manufactured by
a company whose mean and s.d are unknown. Then the assumption average
life time =3600 kms is

(a) simple hypothesis (b) composite hypothesis

(c) alternative hypothesis (d) none of these.


2. Consider the normal population of the body weights of all B.Tech students
in West Bengal whose standard deviation is 5. Then the assumption "the
average bodyweight = 50kg" is a

(a) simple hypothesis (b) composite hypothesis


(c) alternative hypothesis (d) none of these.
3. Simple hypothesis does
(a) specify the population completely
(b) not specify the population completely
(c) simplify the calculation of parameters of the population
(d) none of these.
4. Let m be the mean of a population having t distribution, and let the
assumption m = 20 be tested. Then this assumption is a

(a) alternative hypothesis (b) simple hypothesis

(c) Null hypothesis (d) none of these.


5. An alternative hypothesis
(a) is same as the null hypothesis
(b) is different from the null hypothesis
(c) may or may not be the same as the null hypothesis
(d) none of these.
6. If $(\sigma = 3)$ be the null hypothesis then which one of the following is
an alternative hypothesis

(a) $(\sigma = 4)$ (b) $(\sigma = 1)$ (c) $(\sigma = 0)$



7. If $\mu$ is a parameter and $H(\mu = 5)$ is the null hypothesis, then which one
of the following is a Left sided alternative hypothesis

(a) $H(\mu \neq 5)$ (b) $H(\mu < 5)$ (c) $H(\mu > 5)$ (d) $H(\mu = 4)$
8. If $\mu$ is a parameter and $H(\mu = 7)$ is the null hypothesis, then which one
of the following is a Both sided alternative hypothesis

(a) $H(\mu \neq 7)$ (b) $H(\mu = 8)$

(c) $H(\mu < 7)$ (d) none of these.

9. If $H_1(\mu > 60)$ be an alternative hypothesis then the Null hypothesis is

(a) $H_0(\mu < 60)$ (b) $H_0(\mu \le 60)$
(c) $H_0(\mu \ge 60)$ (d) $H_0(\mu = 60)$.
10. To test the acceptance of the null hypothesis $H_0$ (population mean
= 10) for a population having normal distribution with s.d. 4 the test statistic
is

(a) sample s.d (b) sample proportion

(c) sample mean (d) none of these.


11. If x be the test statistic and (a,b) is the region of acceptance corresponding
to 3% level of significance then $P(a \le x \le b)$ =
(a) 0.9 (b) 0.97 (c) 0.99 (d) 0.03.
12. If t be test statistic and (a,b) is the critical region at 4% level of
significance then p( a < t < b) =
(a) 0.04 (b) 0.4
(c) 0.96 (d) none of these.
13. If -3.9 < t < 3.9 be a region of acceptance in a test of hypothesis
then the critical region is
(a) t < -3.9 (b) t > 3.9 (c) 0 < t < 3.9 (d) none of these.
14. In a test of hypothesis if (-4.6,6.8) is region of acceptance then the
Null hypothesis is rejected if the computed value of the test statistic is
(a) 5 (b) 6 (c) -3 (d) -6.
15. 'Falling of the computed value of the test statistic in the critical region' =
complement of the event 'Falling of the computed value of the test statistic
in the region of acceptance'.
16. If in a test of hypothesis $(-\infty, 1.02)$ is the critical region at .01 level of
significance then, if t is the test statistic,

(a) p(O < t < 1.02) = 0.01 (b) P(1.02 < t) = 0.99
(c) p(O < t) = 0.96 (d) none of these.
17. If the alternative hypothesis is taken right sided with test statistic t
then indicate which one of the following may be a possible region of
acceptance

(a) t <a (b) t >a (c) t ~ a (d) t = a .


18. If the alternative hypothesis is taken left sided then indicate which
one of the following may be a possible critical region

(a) t <a (b) t > a (c) t ~ a (d) t = a

19. In a test of hypothesis Type I Error is committed when

(a) Null hypothesis is rejected though it was really false.

(b) Null hypothesis is rejected though it was really true.

(c) Null hypothesis is accepted though it was really false.

(d) Null hypothesis is rejected though it was really false.

20. In a test of hypothesis Type II error is committed when

(a) Null hypothesis is rejected when it was really false

(b) Null hypothesis is rejected when it was really true

(c) Null hypothesis is accepted when it was really false

(d) none of these. [ WB. U. Tech 2007]

21. If in a test of hypothesis the null hypothesis $H_0$ is accepted at 2%
level of significance then the probability that the statement in $H_0$ is false is
(a) .2 (b) .02 (c) .98 (d) none of these.
22. If $H_0(t = 5)$, where t is the test statistic, is the null hypothesis then the probability
of Type I error =

(a) Probability of acceptance of $H_0$ assuming $t = 5$
(b) Probability of rejection of $H_0$ assuming $t \neq 5$
(c) Probability of rejection of $H_0$ assuming $t = 5$
(d) Probability of acceptance of $H_0$ assuming $t \neq 5$.
23. If $H_0(\theta = 2)$ is the null hypothesis then the probability of rejection of
$H_0$ though $\theta = 2$ is true is the probability of

(a) Type I error (b) Type II error

(c) Probability of $\theta \neq 2$ (d) none of these.


24. In a test of hypothesis Probability of Type I error is same as level of
significance of the test.

(a) Yes (b) No.

25. If $H_0(\theta = 10)$ be the null hypothesis then the probability of Type II
error =

(a) Probability of rejection of $H_0$ assuming $\theta = 10$
(b) Probability of rejection of $H_0$ assuming $\theta \neq 10$
(c) Probability of acceptance of $H_0$ assuming $\theta = 10$
(d) Probability of acceptance of $H_0$ assuming $\theta \neq 10$.
26. In a test of hypothesis corresponding to a particular level of
significance, among all critical regions the 'Best critical region' has

(a) least Type I error (b) least Type II error

(c) greatest Type I error (d) greatest Type II error.


27. In a test of hypothesis the critical region corresponding to a particular
level of significance is unique.

(a) True (b) False.


28. If a null hypothesis is accepted at .05 level of significance then this
decision is
(a) 5% correct (b) .05% correct
(c) .95% correct (d) 95% correct.
29. As the probability of Type II error decreases the goodness of the
test
(a) decreases (b) increases
(c) does not change (d) none of these.

Answers

1.b 2.a 3.a 4.c 5.b 6.d 7.b

8.a 9.d 10.c 11.b 12.a 13.d 14.d

15.b 16.b 17.a 18.a 19.b 20.c 21.b

22.c 23.a 24.a 25.d 26.b 27.b 28.d

29.b

14 LARGE SAMPLE TEST OF SIGNIFICANCE

14.1. Introduction:
The sampling distribution of many of the commonly used statistics is
almost normal when the statistic is measured on a large sample
(approximately more than 30). For example, for a large sample the statistic
$Z = \frac{T - \theta_0}{\text{S.E. of } T}$ approximately follows the Standard Normal distribution, where
$T$ is the appropriate test statistic and $H_0(\theta = \theta_0)$ is the null hypothesis. When
the sample size is small we do not get this advantage. In that case we
have to assume the population to be 'normal', and various test statistics are
used which follow the Standard Normal, Chi-square, t or F distribution. This
compels us to classify 'Test of Significance' as 'Large Sample Test' or 'Small
Sample Test'. In this chapter the Test of Significance or Test of Hypothesis
is conducted by drawing a large sample from the population.
In fact this procedure is nothing but an outcome of the following
theorems, which are consequences of the renowned 'Neyman-Pearson
Theorem'.

14.2. Test for Single Mean:

Theorem 1. Let a large random sample $x_1, x_2, \ldots, x_n$ $(n > 30)$ be taken
from a normal population with mean $\mu$ and s.d. $\sigma$ to test the Null
Hypothesis $H_0(\mu = \mu_0)$ against an alternative hypothesis $H_1$ at
$\alpha$-level of significance, where $\sigma$ is known (i.e. the hypotheses
are simple).
Then the Best Critical Region (CR) determined by the test statistic $\bar{x}$ (the
sample mean) is

(i) $\bar{x} - \mu_0 < -\frac{\sigma}{\sqrt{n}} z_\alpha$, i.e. $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} < -z_\alpha$, if $H_1$ is Left sided,

(ii) $\bar{x} - \mu_0 > \frac{\sigma}{\sqrt{n}} z_\alpha$, i.e. $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} > z_\alpha$, if $H_1$ is Right sided,
(iii) $|\bar{x} - \mu_0| > \frac{\sigma}{\sqrt{n}} z_{\alpha/2}$, i.e. $|z| = \frac{|\bar{x} - \mu_0|}{\sigma/\sqrt{n}} > z_{\alpha/2}$, if $H_1$ is both sided,

where $z$ is a standard normal variate and $\alpha$ is the area under the standard
normal curve enclosed between the ordinates $z = z_\alpha$ and $z = \infty$ as shown
in the adjacent figure.
Proof: Beyond the scope of the book.
Illustration. Let the null hypothesis $H_0(\mu = 2)$ be tested against the
alternative hypothesis $H_1(\mu > 2)$ by drawing a sample of size 100, where $\mu$ is
the mean of the normal population with s.d. 0.1.
Note that this is a Right Sided alternative hypothesis. The test statistic is $\bar{x}$
(sample mean) which is normally distributed (as discussed in an earlier
chapter) with mean $\mu = 2$ and s.d. $= \frac{\sigma}{\sqrt{n}} = \frac{0.1}{\sqrt{100}} = 0.01$. ∴ $z = \frac{\bar{x} - 2}{0.01}$ is a standard
normal variate. Let the testing be done at 0.01 level of significance.

As we see from the above theorem the Best Critical Region is $z > 2.32$
at .01 level of significance, since .01 is the area under the standard normal
curve enclosed between the ordinates $z = 2.32$ and $z = \infty$ as shown in the
adjacent figure. This is found from Table I given in the Appendix.
If for a random sample its mean $\bar{x}$ is 2.015 then the computed value
of $z = \frac{2.015 - 2}{0.01} = 1.5$, which does not lie in the Best CR. We accept $H_0$
as true at 0.01 level of significance.
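The three cases of Theorem 1 can be collected in one short function. This is a sketch only: the name `z_test_mean`, its argument names and the labels "left", "right" and "both" are assumptions introduced here, and the critical values are computed with Python's standard library instead of being read from Table I, so they may differ from the tabulated ones in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

def z_test_mean(x_bar, mu0, sigma, n, alpha, alternative):
    """Large sample test for a single mean.

    Returns (z, reject) where z is the computed test statistic and reject
    tells whether z falls in the best critical region of Theorem 1.
    """
    z = (x_bar - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()
    if alternative == "left":
        reject = z < -std_normal.inv_cdf(1 - alpha)
    elif alternative == "right":
        reject = z > std_normal.inv_cdf(1 - alpha)
    elif alternative == "both":
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError("alternative must be 'left', 'right' or 'both'")
    return z, reject

# Data of the illustration: H0(mu = 2) against H1(mu > 2) at 0.01 level
z, reject = z_test_mean(2.015, 2, 0.1, 100, 0.01, "right")
```

For the illustration, z is 1.5 and `reject` is False, so the null hypothesis is accepted as in the text.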
Note. In the above theorem the population may be normal or of any other type.
Moreover, if the population s.d. $\sigma$ is not known then we may take $\sigma = S$, where $S$ is
the sample standard deviation.
Example 1. From a large population a sample of size 400 is drawn with
mean 171.38. Can it be reasonably regarded that the mean of the population is
171.17? The standard deviation of the population is 3.30. Test at 5% level
of significance.
Solution. The Null hypothesis is $H_0(\mu = 171.17)$, i.e. we suppose the
population mean $\mu = 171.17$.
Alternative Hypothesis $H_1(\mu \neq 171.17)$. The alternative hypothesis is taken
both sided as $\mu$ has the possibility of being less or greater than 171.17.
The sample size $n = 400$. Since the sample size is large the sampling
distribution of the sample mean $\bar{x}$ is normal with mean
$\mu = 171.17$ and s.d. = S.E. of $\bar{x} = \frac{\sigma}{\sqrt{n}} = \frac{3.30}{\sqrt{400}} = 0.165$.
∴ $z = \frac{\bar{x} - 171.17}{0.165}$ is a standard normal variate.
(Here $z$ is the test statistic.)

Here $\bar{x} = 171.38$. Then $z = \frac{171.38 - 171.17}{0.165} = 1.27$.
The alternative hypothesis $H_1$ is both sided. So the best Critical Region
(CR) is $|z| > 1.96$ at 5% level of significance, since the area under the normal
curve is .05 for $z > 1.96$ and $z < -1.96$ as shaded in the adjacent figure.

The computed value of $z$ does not fall in the C.R. So $H_0$ is
accepted. We conclude "it may be regarded that the mean of the population
is 171.17".
Example 2. From a random sample of size 100, mean 105, s.d. 20, test at
1% level whether the mean of the population can be less than 120.
[Given $z_{.01} = 2.33$]
Solution. Null Hypothesis $H_0(\mu = 120)$.
We take the alternative hypothesis $H_1(\mu < 120)$.
Here $n = 100$. We can take the population mean $\mu = 120$.
∴ s.d. of $\bar{x}$ = S.E. of $\bar{x} = \frac{\sigma}{\sqrt{n}}$
As the sample is large we may take
$\sigma$ = s.d. of sample = 20.
∴ s.d. of $\bar{x} = \frac{20}{\sqrt{100}} = 2$
∴ $z = \frac{\bar{x} - 120}{2}$ is a standard normal variate.
Here $\bar{x} = 105$ ∴ $z = \frac{105 - 120}{2} = -7.5$
Since $z_{.01} = 2.33$ the area as shaded in the figure is .01, as $H_1$ is left
sided.
∴ the CR is $z < -2.33$.
Here the computed value of $z = -7.5 < -2.33$,
i.e. $z$ falls in the CR. So $H_0$ is rejected.
$H_1$ is accepted. We conclude "the population mean can be less than 120".
Example 3. Find the size of the random sample drawn from a normal
population with mean $\mu$ and s.d. $\sigma = 10$ if the probability of Type II
Error is 0.02 in testing the null hypothesis $H_0: \mu = 100$ against the
alternative hypothesis $H_1: \mu = 105$ with respect to the best critical region at
5% level of significance.
Let $n$ be the size of the random sample. Since $H_0: \mu = 100$, the test
statistic $z = \frac{\bar{x} - 100}{10/\sqrt{n}} = \frac{\sqrt{n}(\bar{x} - 100)}{10}$ (1)

is a standard normal variate. $\because H_1$ is right sided, the best CR is given by
$z > 1.64$ at 5% level, $\because P(z > 1.64) = .05$.

∴ the C.R. is $\frac{\sqrt{n}(\bar{x} - 100)}{10} > 1.64$
∴ the region of acceptance is $\frac{\sqrt{n}(\bar{x} - 100)}{10} < 1.64$.

Now, assuming "$\mu = 105$", $\bar{x}$ is a $\left(105, \frac{10}{\sqrt{n}}\right)$ normal variate
∴ $U = \frac{\bar{x} - 105}{10/\sqrt{n}} = \frac{\sqrt{n}(\bar{x} - 105)}{10}$ is a standard normal variate.
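Now, probability of Type II Error
= $P$(computed value of the test statistic falls in the Region of acceptance
$\mid \mu = 105$) $= P(z < 1.64 \mid \mu = 105)$ (2)

Now, $\mu = 105 \Rightarrow \bar{x}$ is a $\left(105, \frac{10}{\sqrt{n}}\right)$ normal variate

∴ $U = \frac{\bar{x} - 105}{10/\sqrt{n}} = \frac{\sqrt{n}(\bar{x} - 105)}{10}$ (3)

is a standard normal variate. From (3), $\bar{x} = \frac{10U}{\sqrt{n}} + 105$.

Putting this in (1) we get
$z = \frac{\sqrt{n}\left(\frac{10U}{\sqrt{n}} + 105 - 100\right)}{10} = \frac{\sqrt{n}\left(\frac{10U}{\sqrt{n}} + 5\right)}{10} = \frac{10U + 5\sqrt{n}}{10}$

∴ From (2), Probability of Type II Error $= P\left(\frac{10U + 5\sqrt{n}}{10} < 1.64\right)$

By the problem, $P\left(\frac{10U + 5\sqrt{n}}{10} < 1.64\right) = 0.02$

or, $P\left(U < \frac{16.4 - 5\sqrt{n}}{10}\right) = 0.02$

or, $P\left(\frac{16.4 - 5\sqrt{n}}{10} < U < 0\right) = 0.5 - 0.02 = .48$

or, $P\left(0 < U < -\frac{16.4 - 5\sqrt{n}}{10}\right) = .48$ [note that $\frac{16.4 - 5\sqrt{n}}{10}$ is a
negative quantity]

So, from the statistical table we have $-\frac{16.4 - 5\sqrt{n}}{10} = 2.05 \Rightarrow n \approx 55$

∴ the sample size is 55.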
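The last step of the solution amounts to solving $z_\alpha - \sqrt{n}\,(\mu_1 - \mu_0)/\sigma = -z_\beta$ for $n$, where $\beta$ is the required probability of Type II error. A minimal sketch of that calculation follows; the function name `sample_size` is an assumption made here, and the normal quantiles are computed rather than read from the table (1.645 and 2.054 instead of 1.64 and 2.05), which does not change the rounded answer.

```python
from math import ceil
from statistics import NormalDist

def sample_size(mu0, mu1, sigma, alpha, beta):
    """Smallest n for a one sided test of H0(mu = mu0) against H1(mu = mu1)
    with level of significance alpha and Type II error probability at most beta."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)   # area alpha to the right
    z_beta = std_normal.inv_cdf(1 - beta)     # area beta to the right
    root_n = (z_alpha + z_beta) * sigma / abs(mu1 - mu0)
    return ceil(root_n ** 2)                  # round up to a whole sample size

n = sample_size(100, 105, 10, 0.05, 0.02)
```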
Example 4. The mean breaking strength of the cables supplied by a
manufacturer is 1800 with a s.d. 100. By a new technique in the
manufacturing process it is claimed that the breaking strength of the cables
has increased. In order to test this claim a sample of 50 cables is tested.
It is found that the mean breaking strength is 1850. Can we support the
claim at 0.01 level of significance? (Given the area under the standard normal curve
is 0.01 enclosed between the ordinates $z = 2.33$ and $z = \infty$)
We suppose the population of strengths of the cables is normally
distributed with mean $\mu$ and s.d. $\sigma$. After introduction of the new technique,
if the population mean becomes greater than 1800 we can support the
claim of the manufacturer. So we take the null hypothesis $H_0(\mu = 1800)$,
i.e. the breaking strength is not increased. The alternative hypothesis is
$H_1(\mu > 1800)$, i.e. the breaking strength is increased. This is right sided.
Here the population s.d. $\sigma = 100$, sample mean $\bar{x} = 1850$, the sample size
$n = 50$. So $\bar{x}$ has normal distribution with mean $= \mu = 1800$ and
s.d. $= \frac{\sigma}{\sqrt{n}} = \frac{100}{\sqrt{50}} = 14.14$. So $z = \frac{\bar{x} - 1800}{14.14}$ is a standard normal variate.

Since $H_1$ is right sided, the best critical region at 0.01 level is $z > 2.33$
(from the supplied data). For the sample the computed value of
$z = \frac{1850 - 1800}{14.14} = 3.54$, which lies in the CR. Thus $H_0$ is rejected and so
$H_1$ is accepted at 0.01 level of significance. We conclude the breaking
strength of the cables is increased, i.e. the manufacturer's claim is
supported.
Example 5. A random sample of 200 tins of coconut oil gave an average
weight of 4.95 kg with a s.d. 0.21. Do we accept the hypothesis that the net
weight is 5 kg per tin at 1 percent level? [Given $P(|z| > 2.58) = 0.01$]
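Here the sample size 200 is large ($\because$ > 30). The null hypothesis is
$H_0(\mu = 5)$. The alternative hypothesis is $H_1(\mu \neq 5)$ since $\mu$ may be > or
< 5. The test statistic is
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{\bar{x} - \mu}{S/\sqrt{n}}$ ($\because n$ is large)
$= \frac{\bar{x} - 5}{0.21/\sqrt{200}} = \frac{\bar{x} - 5}{0.015}$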
Since we are given $P(|z| > 2.58) = 0.01$, the Best CR is $|z| > 2.58$.
Here the computed value of $z = \frac{4.95 - 5}{0.015} = -3.33$ or, $|z| = 3.33$, which is
greater than 2.58. ∴ the value of $z$ lies in the Critical Region. ∴ $H_0$ is
rejected at 1% level of significance. We conclude "the net weight per tin
is not 5 kg."
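The solution rounds the standard error 0.21/√200 to 0.015 before dividing. The short check below, with variable names chosen here for illustration, shows that carrying the unrounded standard error changes z only in the second decimal place and leaves the decision unchanged.

```python
from math import sqrt

x_bar, mu0, s, n = 4.95, 5.0, 0.21, 200
z_crit = 2.58                         # given: P(|z| > 2.58) = 0.01

se_exact = s / sqrt(n)                # unrounded standard error of the mean
se_rounded = round(se_exact, 3)       # value used in the worked solution

z_exact = (x_bar - mu0) / se_exact
z_rounded = (x_bar - mu0) / se_rounded

reject_exact = abs(z_exact) > z_crit
reject_rounded = abs(z_rounded) > z_crit
```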
Example 6. A machine part was designed to withstand an average
pressure of 120 units. A random sample of size 100 from a large batch
was tested and it was found that the average pressure which these parts
can withstand is 105 units with a s.d. of 20 units. Test at 5% level whether
the batch meets the specification. Suppose the population has normal
distribution. [ W.B.U.Tech 2006, 2015 ]
The population has normal distribution with mean $\mu$ and s.d. $\sigma$. "The
batch meets the specification" means $\mu = 120$. So the null hypothesis is
$H_0(\mu = 120)$. "The machine part cannot withstand 120 units pressure" means
$\mu < 120$. So the alternative hypothesis is $H_1(\mu < 120)$.
Sample size $n = 100$.
The sampling distribution of $\bar{x}$ is normal with mean $\mu = 120$ and
s.d. $= \frac{\sigma}{\sqrt{n}} \approx \frac{S}{\sqrt{n}}$ ($\because$ the sample is large we take
sample s.d. = population s.d.)
$= \frac{20}{\sqrt{100}} = 2$

∴ $z = \frac{\bar{x} - 120}{2}$ is a standard normal variate. Since $H_1$ is left sided, the
Best Critical Region is $z < -1.645$ at 5% level of significance
($\because$ from the statistical table we see the area under the standard normal curve
enclosed between the ordinates $z = 0$ and $z = 1.645$ is 0.45).
The computed value of $z = \frac{105 - 120}{2} = -7.5$, which lies in the CR.
So $H_0$ is rejected at 5% level. We conclude "the batch does not meet
the specification".
14.3. Test of Single Proportion:


Theorem. Let a random sample $x_1, x_2, \ldots, x_n$ be taken from a population
with proportion $P$ to test the Null Hypothesis $H_0(P = P_0)$ against an
alternative hypothesis $H_1$ at $\alpha$-level of significance. Then the Best Critical
Region determined by the test statistic $p$ (sample proportion) is given by

(i) $z = \frac{p - P_0}{\sqrt{P_0 Q_0 / n}} < -z_\alpha$ if $H_1$ is left sided,

(ii) $z = \frac{p - P_0}{\sqrt{P_0 Q_0 / n}} > z_\alpha$ if $H_1$ is right sided,

(iii) $|z| = \frac{|p - P_0|}{\sqrt{P_0 Q_0 / n}} > z_{\alpha/2}$ if $H_1$ is both sided,

where $Q_0 = 1 - P_0$ and $\alpha$ is the area under the standard normal
curve enclosed between the ordinates $z = z_\alpha$ and $z = \infty$
as shown in the adjacent figure.
Proof. Omitted.
Illustrative Example :
Example 1. In a random sample of size 400 there are 80 defective items.
Test at 5% level whether the proportion of defective items in the population
may be regarded as $\frac{1}{6}$.
[Given $\int_0^{1.96} \phi(t)\,dt = 0.475$, $\phi$ is the pdf of the standard normal variate]

Solution: The null hypothesis is $H_0\left(P = \frac{1}{6}\right)$. The alternative hypothesis
is $H_1\left(P \neq \frac{1}{6}\right)$.
The obtained sample proportion $p = \frac{80}{400} = 0.2$
The sample size $n = 400$.

Then $z = \frac{p - P_0}{\sqrt{P_0 Q_0 / n}} = \frac{p - \frac{1}{6}}{\sqrt{\frac{1}{6} \times \frac{5}{6} / 400}} = \frac{p - 0.167}{0.0186}$

This is a both sided test.
From the given data we have $P(0 < z < 1.96) = 0.475$
or, $P(1.96 < z < \infty) = 0.5 - 0.475 = 0.025$
By symmetry $P(-\infty < z < -1.96) = 0.025$ also
∴ $P(|z| > 1.96) = 2 \times 0.025 = 0.05$
∴ The Best Critical Region is $|z| > 1.96$.
Here the computed value of $z$ is $z = \frac{0.2 - 0.167}{0.0186} = 1.77$, which is not
greater than 1.96.
∴ $z$ does not lie in the Critical Region. So $H_0$ is accepted at 5% level.
We conclude "the proportion of defective items in the population may be regarded as $\frac{1}{6}$".
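The test of a single proportion can be sketched in the same way as the test of a single mean. The function name `z_test_proportion`, its arguments and the labels "left", "right" and "both" are assumptions made here for illustration. Because the sketch keeps $P_0 = 1/6$ and the standard error unrounded, it gives z ≈ 1.79 instead of the 1.77 obtained above with rounded figures; the decision is the same.

```python
from math import sqrt
from statistics import NormalDist

def z_test_proportion(successes, n, p0, alpha, alternative):
    """Large sample test for a single proportion.

    Returns (z, se, reject): the computed statistic, the standard error
    sqrt(P0*Q0/n), and whether z falls in the best critical region.
    """
    p_hat = successes / n                 # observed sample proportion
    se = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    std_normal = NormalDist()
    if alternative == "left":
        reject = z < -std_normal.inv_cdf(1 - alpha)
    elif alternative == "right":
        reject = z > std_normal.inv_cdf(1 - alpha)
    elif alternative == "both":
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError("alternative must be 'left', 'right' or 'both'")
    return z, se, reject

# 80 defectives in a sample of 400, H0(P = 1/6) against a both sided H1 at 5% level
z, se, reject = z_test_proportion(80, 400, 1 / 6, 0.05, "both")
```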
Example 2. A sample of 600 persons selected at random from a large
city shows that the percentage of males in the sample is 53%. It is believed that
the male to total population ratio in the city is $\frac{1}{2}$. Test whether this belief is
confirmed by the observation.
[Given $\int_{-\infty}^{-1.96} \phi(z)\,dz = 0.025$]
Solution: The sample size $n = 600$.

The obtained sample proportion $p = \frac{53}{100} = 0.53$.
Let $P$ = population proportion of males.

The null hypothesis is $H_0\left(P = \frac{1}{2} = 0.5\right)$.

The alternative hypothesis is $H_1(P \neq 0.5)$. This is a both sided test.

Here $z = \frac{p - P_0}{\sqrt{P_0 Q_0 / n}} = \frac{p - 0.5}{\sqrt{0.5 \times 0.5 / 600}} = \frac{p - 0.5}{0.0204}$
We are given $\int_{-\infty}^{-1.96} \phi(z)\,dz = 0.025$,
i.e. $P(-\infty < z < -1.96) = 0.025$
∴ by symmetry
$P(1.96 < z < \infty) = 0.025$
$P(|z| > 1.96) = 0.025 + 0.025 = 0.05$
∴ the Critical Region is $|z| > 1.96$.
The computed value of $z = \frac{0.53 - 0.5}{0.0204} = 1.47$, which is not greater
than 1.96. Therefore $z$ does not lie in the Critical Region. So $H_0$ is accepted at
5% level. We conclude "the belief is true".
Example 3. In a sample of 600 parts manufactured by a factory, the number of defective parts was found to be 45. The company however claimed that at most 5 percent of their product is defective. Is the claim tenable?
[Given $\int_{-\infty}^{2.33} \phi(z)\,dz = 0.99$]

Solution: The null hypothesis is $H_0(P = \frac{5}{100} = 0.05)$.
The alternative hypothesis is $H_1(P > 0.05)$.
Note that if $H_1$ is accepted then the company's claim is not tenable. This is a right sided test. The obtained sample proportion is $p = \frac{45}{600} = 0.075$; the sample size is $n = 600$.
Here
$$z = \frac{p - 0.05}{\sqrt{0.05 \times 0.95 / 600}} = \frac{p - 0.05}{0.0089}.$$
We are given $\int_{-\infty}^{2.33} \phi(z)\,dz = 0.99$. This is shown in the adjacent figure. From the figure we have
$P(2.33 < z < \infty) = 1 - 0.99 = 0.01$.
The best critical region is $z > 2.33$ at 1% level.
The computed value of z for these observations is
$z = \frac{0.075 - 0.05}{0.0089} = 2.81$, which is greater than 2.33.
So $H_0$ is rejected and $H_1$ is accepted at 1% level of significance. So we conclude "the company's claim is not tenable".
Example 4. A coin is tossed 900 times and heads appear 490 times. Does this result support the hypothesis that the coin is unbiased?
[Given $P(0 < z < 2.58) = 0.495$]

Solution: If the coin is unbiased then the proportion of occurrence of heads in 900 trials will be $\frac{1}{2}$.
We take the null hypothesis $H_0(P = \frac{1}{2})$ and the alternative hypothesis $H_1(P \neq \frac{1}{2})$. This is a both sided test.
Here the sample size is $n = 900$.
The obtained sample proportion is $p = \frac{490}{900} = 0.544$.
Now
$$z = \frac{p - \frac{1}{2}}{\sqrt{\frac{1}{2} \times \frac{1}{2} \big/ 900}} = \frac{p - 0.5}{0.0166}.$$
We are given $P(0 < z < 2.58) = 0.495$.
∴ $P(2.58 < z < \infty) = 0.5 - 0.495 = 0.005$.
By symmetry $P(-\infty < z < -2.58) = 0.005$.
∴ $P(|z| > 2.58) = 0.005 + 0.005 = 0.01$.
∴ the best critical region is given by $|z| > 2.58$ at 1% level.
Here the computed value of $z = \frac{0.544 - 0.5}{0.0166} = 2.65$, which is greater than 2.58. So z lies in the critical region. $H_0$ is rejected at 1% level. We conclude "the coin is not unbiased".
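The tabulated values 1.96, 2.33 and 2.58 quoted in these examples are upper-tail points of the standard normal distribution. As a rough cross-check, they can be recomputed with the Python standard library; the helper name `critical_value` below is an assumption made for this illustration only.

```python
from statistics import NormalDist

def critical_value(alpha, both_sided=True):
    """Return z such that the rejection region has total area alpha under the z-curve."""
    tail = alpha / 2 if both_sided else alpha   # area in one tail
    return NormalDist().inv_cdf(1 - tail)

z_5_two = critical_value(0.05)                     # both sided, 5% level
z_1_two = critical_value(0.01)                     # both sided, 1% level
z_1_right = critical_value(0.01, both_sided=False) # right sided, 1% level
print(round(z_5_two, 2), round(z_1_two, 2), round(z_1_right, 2))
```

For a left sided test the critical region is obtained by symmetry as $z < -z_\alpha$.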

Example 5. A die was thrown 9000 times and of these 3220 yielded a 3 or 4. Is this consistent with the hypothesis that the die was unbiased?
[Given $\int_{-2.58}^{2.58} \phi = 0.99$]

Solution: If the die is unbiased then the probability of "3 or 4" is $\frac{2}{6} = \frac{1}{3}$. That is, the proportion of occurrence of "3 or 4" in any number of trials will be $\frac{1}{3}$.
So we take the null hypothesis $H_0(P = \frac{1}{3})$ and the alternative hypothesis $H_1(P \neq \frac{1}{3})$. This is a both sided test. The sample size is $n = 9000$.
The obtained sample proportion is $p = \frac{3220}{9000} = 0.358$.
Here
$$z = \frac{p - \frac{1}{3}}{\sqrt{\frac{1}{3} \times \frac{2}{3} \big/ 9000}} = \frac{p - 0.333}{0.00496}.$$
The best critical region is $|z| > 2.58$ at 1% level (as for the previous example).
Here the computed value of $z = \frac{0.358 - 0.333}{0.00496} = 5.04$, which is greater than 2.58.
∴ z lies in the Critical Region. So $H_0$ is rejected at 1% level. We conclude "the die was not unbiased".
14.4. Test of Single Standard Deviation
Theorem: Let a random sample $x_1, x_2, \ldots, x_n$ be taken from a population with mean $\mu$ to test the Null Hypothesis $H_0(\sigma = \sigma_0)$ against an alternative right sided hypothesis $H_1$ at $\alpha$ level of significance. Then the Best Critical Region (CR) determined by the test statistic

(i) $\chi^2 = \dfrac{\sum_{i=1}^{n}(x_i - \mu)^2}{\sigma_0^2}$ is given by $\chi^2 > \chi^2_\alpha$ when the population mean $\mu$ is known, where $\alpha$ is the area under the $\chi^2$-curve (with n degrees of freedom) between the ordinates $\chi^2_\alpha$ and $\chi^2 = \infty$, as shown by shade in the adjacent figure;

(ii) $\chi^2 = \dfrac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{\sigma_0^2} = \dfrac{nS^2}{\sigma_0^2}$ (S is the sample s.d.) is given by $\chi^2 > \chi^2_\alpha$ when the population mean is unknown, where $\alpha$ is the area under the $\chi^2$-curve (with $n - 1$ degrees of freedom) between the ordinates $\chi^2_\alpha$ and $\chi^2 = \infty$, as shown by shade in the adjacent figure.
Example 1. A sample 4, 5, 6, ....., 32, 33 of size 30 is drawn from a population with mean 3. It is assumed that the standard deviation of the population is 3. Test this assumption at 0.1% level of significance.
[Given $\chi^2_{.001} = 59.703$ with 30 d.o.f and $\chi^2_{.001} = 58.302$ with 29 d.o.f]
Solution. Population mean, $\mu = 3$.
Let the population s.d. be $\sigma$.
Null Hypothesis $H_0(\sigma = 3)$; we take $H_1(\sigma > 3)$.
The test statistic is
$$\chi^2 = \frac{1}{\sigma_0^2}\sum_{i=1}^{n}(x_i - \mu)^2 = \frac{1}{3^2}\left\{(4-3)^2 + (5-3)^2 + \cdots + (33-3)^2\right\}$$
$$= \frac{1}{9}\left(1^2 + 2^2 + \cdots + 30^2\right) = \frac{1}{9} \cdot \frac{30(30+1)(2 \times 30 + 1)}{6} = 1050.56$$
with 30 degrees of freedom.
Here $\alpha = \frac{0.1}{100} = .001$.
Given $\chi^2_{.001} = 59.703$ (with d.o.f 30),
∴ the critical region is $\chi^2 > 59.703$ at 0.1% level. Here the computed value of $\chi^2 = 1050.56 > 59.703$.
So $H_0$ is rejected. We conclude "the population s.d. cannot be 3".
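The sum of squares in this example is easy to verify by direct computation. The sketch below is illustrative only; the function name `chi_square_known_mean` is hypothetical, and it takes $\mu = 3$ and $\sigma_0 = 3$, the values used in the computation above.

```python
def chi_square_known_mean(sample, mu, sigma0):
    """Chi-square statistic sum((x - mu)^2) / sigma0^2 for a known population mean."""
    return sum((x - mu) ** 2 for x in sample) / sigma0 ** 2

sample = range(4, 34)                  # the 30 values 4, 5, ..., 33
chi2 = chi_square_known_mean(sample, mu=3, sigma0=3)
in_critical_region = chi2 > 59.703     # tabulated value for 30 d.o.f. at 0.1% level
print(round(chi2, 2), in_critical_region)
```

The statistic has n = 30 degrees of freedom here because the population mean is treated as known.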

Example 2. A random sample of size 31 from a normal population gives a sample mean 42 and a sample s.d. of 5. Test at 5% level the hypothesis that the population s.d. is 7.
[Given $\chi^2_{.05} = 43.77$ and $\chi^2_{.05} = 42.557$ with 30 and 29 degrees of freedom respectively]
Solution. Here $n = 31$, $H_0(\sigma = 7)$.
The test statistic is $\chi^2 = \dfrac{nS^2}{\sigma_0^2} = \dfrac{31 \times 5^2}{7^2} = 15.82$ with $n - 1 = 30$ degrees of freedom.
Here $\alpha = \frac{5}{100} = .05$.
We are given $\chi^2_{.05;\,30} = 43.77$.
∴ the CR is $\chi^2 > 43.77$ at 5% level.
Our computed value of $\chi^2 = 15.82 \not> 43.77$.
∴ this value does not fall in the CR. ∴ $H_0$ is accepted at 5% level.
We conclude "the population s.d. may be 7".
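When the population mean is unknown, only n, the sample s.d. S and the hypothesised $\sigma_0$ are needed. A minimal sketch of that case follows, with the hypothetical helper name `chi_square_from_sd` (not taken from the text):

```python
def chi_square_from_sd(n, s, sigma0):
    """Chi-square statistic n * S^2 / sigma0^2, used with n - 1 degrees of freedom."""
    return n * s ** 2 / sigma0 ** 2

chi2 = chi_square_from_sd(n=31, s=5, sigma0=7)
in_critical_region = chi2 > 43.77      # tabulated value for 30 d.o.f. at 5% level
print(round(chi2, 2), in_critical_region)
```

Here S is the sample s.d. computed with divisor n, which is why the numerator is $nS^2$.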
14.5. Difference of Means (or Test of equality of means)
Theorem 1. Let two independent large random samples $\{x_1, x_2, \ldots, x_{n_1}\}$ and $\{x'_1, x'_2, \ldots, x'_{n_2}\}$ be drawn from two Normal populations $(\mu_1, \sigma_1)$ and $(\mu_2, \sigma_2)$ respectively to test the hypothesis $H_0(\mu_1 = \mu_2)$ at $\alpha$ level of significance. Let
$$\bar{x}_1 = \frac{1}{n_1}\left(x_1 + x_2 + \cdots + x_{n_1}\right), \qquad \bar{x}_2 = \frac{1}{n_2}\left(x'_1 + x'_2 + \cdots + x'_{n_2}\right).$$
Then
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
is the test statistic, which is a standard normal variate.

Accordingly the C.R. is
(i) $z < -z_\alpha$ if $H_1(\mu_1 < \mu_2)$ (left sided)
(ii) $z > z_\alpha$ if $H_1(\mu_1 > \mu_2)$ (right sided)
(iii) $|z| > z_{\alpha/2}$ if $H_1(\mu_1 \neq \mu_2)$ (both sided)
where $\alpha$ is the area under the standard normal curve enclosed between the ordinates $z = z_\alpha$ and $z = \infty$, as shown in the adjacent figure.
Note. In the above theorem the two populations may or may not be Normal. Moreover, if the population s.d. $\sigma_1$ and $\sigma_2$ are not known then we may take $\sigma_1 \approx S_1$, $\sigma_2 \approx S_2$, where $S_1, S_2$ are the sample standard deviations.
14.6. Illustrative Examples.
Example 1. The means of two large samples of sizes 1000 and 2000 are 67.5 and 68.0 respectively. Test the equality of means of the two normal populations each with s.d. 2.5 at 0.5% level. Given area under st. normal curve is 0.25 enclosed between the ordinates z = 0 and z = 0.0987.
Let the two population means be $\mu_1$ and $\mu_2$.
The two s.d. are $\sigma_1 = \sigma_2 = 2.5$.
The two sample means are $\bar{x}_1 = 67.5$, $\bar{x}_2 = 68.0$.
We take the null hypothesis $H_0(\mu_1 = \mu_2)$ against the alternative hypothesis $H_1(\mu_1 \neq \mu_2)$.
We take the test statistic
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{(2.5)^2}{1000} + \dfrac{(2.5)^2}{2000}}},$$
which is normally distributed.
Corresponding to this sample the computed value of
$$z = \frac{67.5 - 68.0}{\sqrt{\dfrac{(2.5)^2}{1000} + \dfrac{(2.5)^2}{2000}}} = \frac{-0.5}{2.5\sqrt{\dfrac{1}{1000} + \dfrac{1}{2000}}} = \frac{-0.5}{0.09682} = -5.1642.$$
Since $H_1$ is taken both sided, from the given data the critical region is $|z| > 0.0987$ at the given level of significance. Here for the computed value $|z| = |-5.1642| > 0.0987$. So $H_0$ is rejected. We conclude the two population means are not the same.
Example 2. A college conducts both day and night classes intended to be identical. A sample of 100 day students yields examination results as under:
$\bar{x} = 72.4$; $\sigma_x = 14.8$
A sample of 200 night students yields examination results as under:
$\bar{x} = 73.9$; $\sigma_x = 17.9$
Are the two means statistically equal at 10% level?
Given $P(0 < z < 1.645) = 0.45$.
Take the null hypothesis $H_0(\mu_1 = \mu_2)$ against $H_1(\mu_1 \neq \mu_2)$, where $\mu_1, \mu_2$ are the two population means.
Here, means of the two samples, $\bar{x}_1 = 72.4$, $\bar{x}_2 = 73.9$;
s.d. of the two samples, $S_1 = 14.8$, $S_2 = 17.9$;
sizes of the two samples, $n_1 = 100$, $n_2 = 200$.
Since the sample sizes are large, though the s.d. of the populations are unknown we take the test statistic
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}},$$
which is a standard normal variate.
Computed value of
$$z = \frac{72.4 - 73.9}{\sqrt{\dfrac{(14.8)^2}{100} + \dfrac{(17.9)^2}{200}}} = -0.77.$$
Since $H_1$ is both sided and since $P(0 < z < 1.645) = 0.45$, we have $P(|z| > 1.645) = 1 - 2 \times 0.45 = 0.10$, so the critical region is $|z| > 1.645$ at 10% level. Since $|-0.77| = 0.77$ does not lie in this region, $H_0$ is accepted at 10% level.
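The two computations of this section use the same formula, with the population s.d. replaced by the sample s.d. when the former are unknown. The following sketch recomputes both values; the function name `two_mean_z` is an assumption made for illustration and is not part of the text.

```python
import math

def two_mean_z(xbar1, xbar2, sd1, sd2, n1, n2):
    """z statistic for H0(mu1 = mu2) from two large independent samples."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (xbar1 - xbar2) / standard_error

# Example 1: known population s.d. 2.5 for both populations
z1 = two_mean_z(67.5, 68.0, 2.5, 2.5, 1000, 2000)
# Example 2: sample s.d. used in place of the unknown population s.d.
z2 = two_mean_z(72.4, 73.9, 14.8, 17.9, 100, 200)
print(round(z1, 3), round(z2, 2))
```

The sign of z only matters for one sided alternatives; for a both sided test it is $|z|$ that is compared with the critical value.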
14.7. Test for Difference of Proportions (i.e. Test of equality of proportions)
Equality of proportions of a particular type of items in two populations is tested by drawing two random samples from the two respective populations.
This method of testing is based on the following theorem:

Theorem: Let two large independent random samples of sizes $n_1$ and $n_2$ be drawn from two populations with proportions $P_1$ and $P_2$ respectively to test the hypothesis $H_0(P_1 = P_2)$ at $\alpha$ level of significance. Let $x_1$ and $x_2$ items of a particular type be present in the 1st and 2nd samples respectively, i.e. the two sample proportions are $p_1 = \frac{x_1}{n_1}$ and $p_2 = \frac{x_2}{n_2}$.
Then
$$z = \frac{\dfrac{x_1}{n_1} - \dfrac{x_2}{n_2}}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}, \quad \text{where } \hat{p} = \frac{x_1 + x_2}{n_1 + n_2},$$
is the test statistic, which is a standard normal variate.
Accordingly the C.R. is
(i) $z < -z_\alpha$ if $H_1(P_1 < P_2)$ left sided
(ii) $z > z_\alpha$ if $H_1(P_1 > P_2)$ right sided
(iii) $|z| > z_{\alpha/2}$ if $H_1(P_1 \neq P_2)$ both sided
where $P(z_\alpha < z < \infty) = \alpha$, i.e. $\int_{z_\alpha}^{\infty} \phi = \alpha$.
14.8. Illustrative Examples.
Example 1. From two cities two random samples of 600 and 1000 men are drawn respectively. It is found that 400 and 600 men are illiterate among the men in the two samples respectively. Test at 5% level whether the populations of the two cities have the same percentage of literacy.
Given $\int_{1.96}^{\infty} \phi(t)\,dt = 0.025$.
Let $P_1$ = proportion of illiterate men in the first city,
$P_2$ = proportion of illiterate men in the second city.
"The populations of the two cities have the same percentage of literacy" $\Leftrightarrow$ "$P_1 = P_2$".
We shall test the hypothesis $H_0(P_1 = P_2)$ against $H_1(P_1 \neq P_2)$.
Here {400} and {600} are two samples of size 1 drawn from the two populations X and Y respectively.
As we know, the test statistic is
$$z = \frac{\dfrac{x_1}{n_1} - \dfrac{x_2}{n_2}}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}, \quad \text{where } \hat{p} = \frac{x_1 + x_2}{n_1 + n_2},$$
which is a standard normal variate.
Since $P(1.96 < z < \infty) = 0.025$, by symmetry $P(|z| > 1.96) = 0.05$.
∴ Here the C.R. is $|z| > 1.96$ at 5% level.
Corresponding to the samples $x_1 = 400$, $x_2 = 600$,
$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{400 + 600}{600 + 1000} = \frac{5}{8}$$
and the computed value of
$$z = \frac{\dfrac{400}{600} - \dfrac{600}{1000}}{\sqrt{\dfrac{5}{8}\left(1 - \dfrac{5}{8}\right)\left(\dfrac{1}{600} + \dfrac{1}{1000}\right)}} = 2.67,$$
which is greater than 1.96.
So the computed value 2.67 does lie in this C.R. We do not accept $H_0$ at 5% level and $H_1$ is accepted. We conclude $P_1 \neq P_2$, i.e. the two cities do not have the same percentage of literacy.
Example 2. A company has the head office at Kolkata and a branch at Bombay. The personnel director wanted to know if the workers at the two places would like the introduction of a new plan of work, and a survey was conducted for this purpose. Out of a sample of 500 workers at Kolkata 62% favoured the new plan. At Bombay out of a sample of 400 workers 41% were against the new plan. Is there any significant difference between the two groups in their attitude towards the new plan at 5% level? Given area under standard normal curve enclosed between the ordinates z = 0 and z = 1.96 is 0.475.
Let $P_1$ = population proportion of workers who favoured the plan at Kolkata,
$P_2$ = population proportion of workers who favoured the plan at the Bombay office.
∴ "there is a significant difference between the two groups in their attitudes towards the new plan" $\Leftrightarrow$ "$P_1 \neq P_2$".
We shall test the null hypothesis $H_0(P_1 = P_2)$ against $H_1(P_1 \neq P_2)$.
Now, 62% out of 500 = 310 and (100 − 41)% out of 400 = 236.
∴ Here $x_1 = 310$, $n_1 = 500$, $x_2 = 236$, $n_2 = 400$.
As we know, the test statistic is
$$z = \frac{\dfrac{x_1}{n_1} - \dfrac{x_2}{n_2}}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}, \quad \text{where } \hat{p} = \frac{x_1 + x_2}{n_1 + n_2},$$
which is a standard normal variate.
Corresponding to the two samples
$$\hat{p} = \frac{310 + 236}{500 + 400} = \frac{546}{900} = \frac{91}{150}$$
and the computed value of
$$z = \frac{\dfrac{310}{500} - \dfrac{236}{400}}{\sqrt{\dfrac{91}{150}\left(1 - \dfrac{91}{150}\right)\left(\dfrac{1}{500} + \dfrac{1}{400}\right)}} = \frac{3}{0.0328 \times 100} = \frac{3}{3.28} = 0.915.$$
From the given data we have $P(|z| > 1.96) = 0.05$. So the critical region is $|z| > 1.96$. Our computed value does not lie in it. So $H_0$ is accepted at 5% level. We conclude "there is no significant difference".
Example 3. In a factory producing articles, 400 articles out of a sample of 500 articles were found to be of excellent quality. After taking a step of fuel reducing process it is found that 400 articles in a sample of 600 articles are of excellent quality. State whether there is a sufficient decrease in the quality of production. Test at 1% level. Given $\frac{1}{\sqrt{2\pi}}\int_0^{2.33} e^{-z^2/2}\,dz = 0.49$.
Let $P_1, P_2$ be the population proportions of excellent articles before and after the process. We test $H_0(P_1 = P_2)$ against $H_1(P_1 > P_2)$.
Here $x_1 = 400$, $x_2 = 400$, $n_1 = 500$, $n_2 = 600$,
$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{400 + 400}{500 + 600} = \frac{8}{11},$$
$$z = \frac{\dfrac{x_1}{n_1} - \dfrac{x_2}{n_2}}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{\dfrac{400}{500} - \dfrac{400}{600}}{\sqrt{\dfrac{8}{11}\left(1 - \dfrac{8}{11}\right)\left(\dfrac{1}{500} + \dfrac{1}{600}\right)}} = \frac{0.1333}{0.02697} = 4.944.$$
$H_1(P_1 > P_2)$ is right sided. From the given data we have $P(z > 2.33) = 0.5 - 0.49 = 0.01$. The C.R. is $z > 2.33$.
Here the computed value lies in this region. So $H_0$ is rejected and $H_1(P_1 > P_2)$ is accepted. We conclude "the quality of articles is sufficiently decreased".
Example 4. Two samples of sizes 900 and 1600 are drawn from two populations respectively. The percentages of defective items in the two samples are 20% and 15% respectively. Test at 1% level of significance whether the proportion of defective items in the first population is more than that in the second population.
[Given $\int_{2.33}^{\infty} \phi = 0.01$ and $\int_{1.64}^{\infty} \phi = 0.05$]
Solution. Let $P_1, P_2$ be the two population proportions.
We take the null hypothesis $H_0(P_1 = P_2)$.
The alternative hypothesis is $H_1(P_1 > P_2)$, which is right sided.
Here $n_1 = 900$, $n_2 = 1600$.
Let $x_1, x_2$ be the numbers of defective items in the two samples respectively.
∴ $x_1 = \frac{20}{100} \times 900 = 180$, $x_2 = \frac{15}{100} \times 1600 = 240$,
$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{180 + 240}{900 + 1600} = \frac{420}{2500} = \frac{21}{125}.$$
∴ the test statistic is
$$z = \frac{\dfrac{x_1}{n_1} - \dfrac{x_2}{n_2}}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{\dfrac{180}{900} - \dfrac{240}{1600}}{\sqrt{\dfrac{21}{125}\left(1 - \dfrac{21}{125}\right)\left(\dfrac{1}{900} + \dfrac{1}{1600}\right)}} = \frac{\dfrac{1}{20}}{\sqrt{\dfrac{2184}{15625} \times \dfrac{1}{576}}} = 3.21.$$
So the critical region is $z > z_\alpha$ where $\alpha = 1\% = \frac{1}{100} = .01$, i.e. $z > z_{.01}$.
Given $\int_{2.33}^{\infty} \phi = 0.01$, ∴ $z_{.01} = 2.33$.
Our computed value of $z = 3.21 > 2.33$.
∴ the value lies in the C.R. So $H_0$ is rejected and $H_1$ accepted. We conclude "the proportion of defective items in the first population is more than that of the second population".
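All the examples of this section apply one formula with the pooled estimate $\hat{p} = (x_1 + x_2)/(n_1 + n_2)$. The sketch below recomputes three of them; the function name `two_proportion_z` is hypothetical and is used only for this illustration.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for testing H0(P1 = P2) from two large samples."""
    p_hat = (x1 + x2) / (n1 + n2)                  # pooled proportion under H0
    standard_error = math.sqrt(p_hat * (1 - p_hat) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / standard_error

z_literacy = two_proportion_z(400, 600, 600, 1000)    # 400 of 600 and 600 of 1000
z_plan = two_proportion_z(310, 500, 236, 400)         # 310 of 500 and 236 of 400
z_defective = two_proportion_z(180, 900, 240, 1600)   # 180 of 900 and 240 of 1600
print(round(z_literacy, 2), round(z_plan, 2), round(z_defective, 2))
```

Pooling is appropriate because, under $H_0$, both samples estimate the same population proportion.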
14.9. Test for difference of standard deviations (or, equality of s.d.)
Theorem. Let two random samples of sizes $n_1$ and $n_2$ be drawn from two populations, with sample standard deviations $S_1$ and $S_2$, to test the hypothesis $H_0(\sigma_1 = \sigma_2)$ at $\alpha$ level of significance, where $\sigma_1, \sigma_2$ are the s.d. of the two populations respectively. Then the test statistic is
$$z = \frac{S_1 - S_2}{\sqrt{\dfrac{\sigma_1^2}{2n_1} + \dfrac{\sigma_2^2}{2n_2}}},$$
which is a standard normal variate.
[Since the samples are large we may take $\sigma_1 = S_1$, $\sigma_2 = S_2$]
As usual the best critical region (CR) is
$z < -z_\alpha$ if $H_1$ is left sided
$z > z_\alpha$ if $H_1$ is right sided
$|z| > z_{\alpha/2}$ if $H_1$ is both sided
where $\alpha$ is the area under the standard normal curve enclosed between the ordinates $z = z_\alpha$ and $z = \infty$.
14.10. Illustrative Examples.
Example 1. A company claims that the consistency regarding life time of the electric bulbs produced by them is superior to those of a competitor on the basis of a study which showed that a sample of 30 bulbs made by them has s.d. 25 hours while a sample of 40 bulbs of the competitor has s.d. 27 hours. Test at 5 percent level of significance whether the claim of the first company is justified.
[Given area under standard normal curve enclosed between the ordinates z = 1.645 and z = ∞ is .05]
Solution. The consistency of life time is better if the standard deviation is smaller.
Let $\sigma_1$ and $\sigma_2$ be the s.d. of the two populations respectively.
Null hypothesis $H_0(\sigma_1 = \sigma_2)$. Alternative hypothesis $H_1(\sigma_1 < \sigma_2)$. This is taken as the 1st company claims consistently better bulbs than the second.
Here for the 1st company, $n_1 = 30$, $S_1 = 25$,
and for the competitor, $n_2 = 40$, $S_2 = 27$.
The test statistic is
$$z = \frac{S_1 - S_2}{\sqrt{\dfrac{S_1^2}{2n_1} + \dfrac{S_2^2}{2n_2}}} = \frac{25 - 27}{\sqrt{\dfrac{25^2}{60} + \dfrac{27^2}{80}}} = \frac{-2}{\sqrt{19.529}} = -0.452.$$
Since $H_1$ is left sided, the best critical region (CR) is $z < -z_{.05}$, since here $\alpha = \frac{5}{100} = .05$.
Since the area under the standard normal curve enclosed between the ordinates z = 1.645 and z = ∞ is .05, $z_{.05} = 1.645$ and the critical region is $z < -1.645$ at 5% level.
Our computed value of $z = -0.452 \not< -1.645$.
∴ this does not fall in the CR. So $H_0$ is accepted.
∴ We conclude "the s.d. of the two populations are the same". The claim of the first company does not stand.
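The statistic of this example, with the sample s.d. substituted for the population s.d., can be recomputed as follows. This is only a sketch; the function name `sd_difference_z` is an assumption made for illustration.

```python
import math

def sd_difference_z(s1, n1, s2, n2):
    """Large-sample z statistic for H0(sigma1 = sigma2), using sample s.d. s1 and s2."""
    standard_error = math.sqrt(s1 ** 2 / (2 * n1) + s2 ** 2 / (2 * n2))
    return (s1 - s2) / standard_error

z = sd_difference_z(25, 30, 27, 40)
in_critical_region = z < -1.645        # left sided test at 5% level
print(round(z, 3), in_critical_region)
```

Because the computed value is not below −1.645, the data do not support the claim of smaller variability.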
Exercise 14
[I] Short Answers Questions
1. The mean lifetime of a sample of 100 fluorescent light bulbs produced by a company is computed to be 1570 hours with a s.d. of 120 hours. The company claims that the average life of the bulbs (distributed normally) is 1600 hours. Using a level of significance of 0.05, is the claim acceptable? (Given $\int_0^{1.96} \phi(t)\,dt = .4750$)
2. A sample of 100 iron bars is said to be drawn from a large number of bars whose lengths are normally distributed with mean 4 ft. and s.d. 0.6 ft. If the sample mean is 4.2 ft., can the sample be regarded as a truly random sample? (Null hypothesis and assumptions should be stated clearly.) Test at 1% level of significance. (Given $\int_0^{2.58} \phi(z)\,dz = 0.4951$)
3. When a certain production machine is in perfect adjustment it produces bolts with a mean diameter of 0.0600 inches and a s.d. of 0.0150 inches. In order to ascertain whether or not the machine is still in adjustment, a sample of 36 bolts is selected. The sample mean diameter is found to be 0.0575. Is the machine still in adjustment? (Use level of significance of 0.05) [Hint: take $H_1(\mu \neq .06)$]
4. 200 men out of 600 men and 300 men out of 1200 men in two cities are highly educated. Do the data indicate that the two cities are significantly different regarding the educational standard of men?
[W.B.U.Tech, 2002]
5. In a sample of 600 men from a certain city, 450 men are found to be smokers. In a sample of 900 from another city 450 are found to be smokers. Do the data indicate that the two cities are significantly different with respect to prevalence of smoking habit among men? Test at .05 level of significance. Given $z_{.025} = 1.96$.
6. A sample survey shows that out of 800 literate people 480 are employed whereas out of 600 illiterate people only 350 are employed. Can we say that the scope of employment among literate and illiterate people is the same? Test at 5% level. Given $P(0 < z < 1.96) = 0.475$.
[Hint: Out of two Binomial populations $B(800, p_1)$, $B(600, p_2)$ we have to test $H_0(p_1 = p_2)$; $x_1 = 480$, $x_2 = 350$]
7. In an industry A, 20% of a random sample of 900 razor-blades are defective. In another industry B, 15% of a random sample of 1600 blades are defective. Do the two industries produce the same quality of blades? Test at 1% level.
Given $\frac{1}{\sqrt{2\pi}}\int_0^{2.58} e^{-z^2/2}\,dz = 0.495$.
8. The mean yield of wheat from district A was 210 lbs. with s.d. = 10 lbs. from a sample of 100 plots. In another district B, the mean yield was 220 lbs. with s.d. = 12 lbs. from a sample of 150 plots. Assuming that the s.d. of yield in the entire state was 11 lbs., test whether there is any significant difference between the mean yield of crops in the two districts. Given $P(|z| > 2.58) = .01$.
[Hint: Here the population $\sigma$ is known. Take $H_0(\mu_1 = \mu_2)$ against $H_1(\mu_1 \neq \mu_2)$]

9. Intelligence tests on two groups of boys and girls gave the following results:
          Mean    S.D.    Number
Boys      70      20      250
Girls     75      15      150
Is there a significant difference in the mean scores obtained by boys and girls? [W.B.U.Tech, 2004]
[Hint: $\sigma_1 \approx 20$, $\sigma_2 \approx 15$ as the samples are large]
10. A simple sample of heights of 6400 Englishmen has a mean of 67.85 inches and a s.d. of 2.56 inches, while a simple sample of heights of 1600 Australians has a mean of 68.55 inches and a s.d. of 2.52 inches. Do the data indicate that Australians are on the average taller than the Englishmen? Test at 1% level. Given $\int_0^{2.33} \phi(t)\,dt = .49$.
[Hint: Since the sample sizes are so large we may take $\sigma_1 \approx 2.56$, $\sigma_2 \approx 2.52$. Take $H_1(\mu_1 < \mu_2)$]

11. What is the test statistic which is used to test the equality of two means of two normal populations when their common s.d. is unknown?

Answers
1. No; |z| = 2.5 > 1.96    2. No; |z| = 3.33    3. 0.065

5. z = 9.69; there is a significant difference

6. z = 0.74; employment scope is the same    7. No; z = 3.21

8. z = −7.04; there is a significant difference in the mean yields of crops.

9. z = −2.84; "there is a difference" at 5% level    10. Yes; z = −9.9

11.

[II] Long Answers Questions


1. A manufacturer of string has found from past experience that
samples of a certain type have a mean breaking strength of 15.6 kg and
s.d of 2.2 kg. A time-saving change in the manufacturing process of this
string is tried. A sample of 50 pieces is then taken, for which the mean
breaking strength turns out to be 15.5 kg. On the basis of this sample
can it be concluded that the new process has a harmful effect on the
strength of the string ? (Assume that the breaking strength of string is
normally distributed). Test at 5% level of significance. See appropriate
statistical table.
[Hint: Take Ho(j.l = 15.6) and HI (j.l < 15.6) ]
2. A company manufactures car tyres. The lives of the tyres are
normally distributed with a mean of 40,000 kms. and s.d of 3,000 kms.
A change in the production process is brought. A test sample of 64 new
tyres has a mean life of 41,200 kms. Can you conclude that the change
produces better quality of tyres ? Test at 5% level. See statistical table.
3. A new printer-head is introduced into the market. It is claimed
that it has an average life of 200 hours with s.d of 21 hours. The claim
came under severe criticism from dissatisfied customers. A customer
group tested 49 such heads and found that they have an average life of
191 hours. Is the claim justified at 1% level of significance? See table.
4. The sample mean of a random sample of size 10 is 12.1. Given
that the s.d of the population is 3.2. Can you conclude that the sample
comes from a normal population with mean 14.5 ? Test at 5% level of
significance. Given Z.4750 = 1.96 ( Hint: Take Ho(j.l = 14.5) )
5. Is it likely that a sample of 300 item whose mean is 16.0 is a
random sample from a normal population with mean 16.8 and s.d 5.2 ?
Test at 0.01 level of significance. (Given area under st. normal curve
enclosed between z = 0 and z = 2.58 is 0.4951)
(Hint: Take Ho(j.l = 16.8))

6. A random sample of 900 members is found to have a mean of 4.45 cm. Can it be reasonably regarded as a sample from a large population whose mean is 5 cm and variance is 4 cm²?
[Given area under z curve from z = −∞ to z = 1.65 is 0.95]
[Hint: take $H_1(\mu \neq 5)$. CR is $|z| > 1.65$ at 10% level]
7. A manufacturer claimed that at least 95% of the equipment which he supplied to a factory conformed to specification. An examination of a sample of 200 pieces of equipment revealed that 18 were faulty. Test his claim at a significance level of 0.01. [Given $\int_{2.33}^{\infty} \phi = .01$]
[Hint: Take $H_0(P = \frac{95}{100} = 0.95)$, $H_1(P < 0.95)$]
8. In a hospital, 132 females and 168 males were born in a month. Do these figures conform to the hypothesis that sexes are born in equal proportion? [Given $\int_{-\infty}^{-1.96} \phi = 0.025$]
[Hint: Take $H_0(P = \frac{1}{2})$, $H_1(P \neq \frac{1}{2})$]
9. In a sample of 400 burners there were 12 whose internal diameters were not within tolerance. Is this sufficient evidence for concluding that the manufacturing process is turning out more than 2% defective burners? Test at 5% level of significance. [Given $\int_0^{1.645} \phi = .45$]
[Hint: Take $H_0(P = \frac{2}{100} = 0.02)$, $H_1(P > 0.02)$]
10. In a random sample of 400 persons from a large population, 120 are females. Can it be said that males and females are in the ratio 5 : 3 in the population? Use 10% level of significance.
[Given $\int_{1.96}^{\infty} \phi = 0.025$] [Hint: $H_0(P = \frac{3}{5+3} = 0.375)$, $H_1(P \neq 0.375)$]


11. A bottle manufacturing process is "under control" if no more than 1% of the bottles are defective. A random sample of 120 bottles showed 5 to be defective. Do these data indicate that the process is out of control? Use 5% level of significance. [Given $z_{0.05} = 1.64$]
12. In a sample of 400 parts manufactured by a factory, the number of defective parts was found to be 30. The company, however, claims that only 5% of their product is defective. Is the claim tenable?
[Given $P(z > 1.645) = .05$]
[Hint: Take $H_0(P = 0.05)$, $H_1(P > 0.05)$]
13. A die was thrown 400 times and "six" resulted 80 times. Do the data justify the hypothesis that the die is unbiased?
[Given $\int_{1.96}^{\infty} \phi = 0.025$] [Hint: Take $H_0(P = \frac{1}{6})$]
14. In a sample of 600 students of a certain college 400 are found to
use bi-cycle. In another college from a sample of 900 students 450 were
found to use bi-cycles. Test at 1% level whether the two colleges are
significantly different with respect to the habit of using bi-cycle. Given
area under standard normal curve enclosed between the ordinates z = -2.58
and z = -00 is 0·005.
15. Before an increase in excise duty on tea, 400 people out of a sample of 500 persons were found to be tea drinkers. After an increase in duty, 400 people were tea drinkers in a sample of 600 people. State at 2.5% level of significance whether there is a decrease in consumption of tea due to the increase in excise duty. Given $\frac{1}{\sqrt{2\pi}}\int_{-1.96}^{0} e^{-z^2/2}\,dz = 0.475$.

16. In a year there are 956 births in town A, of which 52.5% were
males, while in town A and B combined, this proportion in a total of 1406
birth was 0.496. Is there any significant difference in the proportion of male
births in the two towns. Test at 5% level. See appropriate statistical table.
17. A machine produced 20 defective articles in a batch of 400. After
overhauling it produced 10 defectives in a batch of 300. Has the machine
improved? Test at 5% level of significance. Given area under standard
normal curve between the ordinates z = 1.645 and z = 00 is 0.05.

18. In a certain factory there are two different processes of manufacturing the same item. The average weight in a sample of 250 items produced from one process is found to be 120 grammes with s.d. of 12 grammes; the corresponding figures in a sample of 400 items from the other process are 124 and 14. Is this difference significant? See appropriate statistical table. [Hint: For large samples assume $\sigma_1 \approx 12$, $\sigma_2 \approx 14$ and so use the test statistic z.]

19. Two random samples of sizes 500 and 400 are drawn from two normal populations having the same s.d. 5. It is observed that the two sample means are 11.5 and 10.9 respectively. Can the samples be regarded as drawn from the same population? See appropriate table.

20. Intelligence tests on two groups, one group consisting of 121 girls and the other group consisting of 81 boys, give the means 84 and 81 respectively. The intelligence scores are normally distributed with s.d. 10 for girls and 12 for boys respectively. From these observations can we say the average intelligence of boys and girls are the same? Test at the level which fits the data $P(|z| < 1.96) = .95$.

21.From two normal populations two independent samples of size 30


and 55 are drawn. The two populations have a common s.d. 4.195. The
means of the two samples are seen as 23 and 21.9 respectively. Test at
5% level of significance whether the two populations have also the same
mean. Given p(z> 1.96) = 0.025.

22.A company claims that its light bulbs are superior to those of a
competitor on the basis of a study which showed that a sample of 40 of
its bulbs had an average life time of 628 hours of continuous use with a
s.d of 27 hours, while a sample of 30 bulbs made by the competitor had
an average life time of 619 hours of continuous use with a s.d of 25 hours.
Check, at the 5% level of significance, whether this claim is justified.
23. The means of two large samples of sizes 1000 and 2000 are 67.5 and 68.0 respectively. Test at 1% level of significance whether the means of the two populations, each with variance 6.25, are equal.
[Given $\int_{2.58}^{\infty} \phi = .005$, $\int_{1.59}^{\infty} \phi = .056$]

24. The mean yield of rice from a district A was 220 kg per acre with s.d. 12 kg recorded from a sample of 150 plots. In another district B, the mean yield was 210 kg with s.d. 10 kg per acre from a sample of 100 plots. Assuming that the standard deviation of the yield in the entire state was 11 kg, test at 5% level whether there is any significant difference between the mean yield of rice in the two districts.
[Given $z_{.025} = 1.96$, $z_{.05} = 1.645$]
25. LED bulbs manufactured by company A and company B gave the following results:
                          Company A    Company B
No. of bulbs used         100          100
Average life in hours     1248         1300
s.d. in hours             93           82

Find whether the average lives of the two makes are the same. Test at .05 level of significance.
[Given area under standard normal curve enclosed between the ordinates z = 1.96 and z = ∞ is .025]
26. A manufacturing company produces electric lights in each of its two factories. It is suspected that the efficiency in the factories is not the same. So a test is conducted and the following data are collected:
                                Factory A    Factory B
No. of tube lights in sample    200          100
Average life                    900          1100
s.d.                            220          240

From the above information, find whether the variability of life of tube lights from each factory is the same. Test at 1% level.
[Given $\int_{2.58}^{\infty} \phi = .005$, $\int_{2.51}^{\infty} \phi = .006$]

27. The mean yield of two sets of plots and their variability are as given below. Examine whether the difference in variability of yields is true. Test at 1% level.
[Given $z_{.005} = 2.6$ and $z_{.01} = 2.3$]
LARGE SAMPLE TEST OF SIGNIFICANCE 517

28. A potential buyer of elective bulbs bought 100 bulbs each of two
famous brands. Upon testing these he found that brand A had a mean life
of 1500 hours with a standard deviation of 50 hours whereas brand B
has a mean life of 1530hours with a standard deviation of 60 hours. Can
it be concluded at 1% level of significance that the two brands differ
significantly in quality? [Given zoos = 2.6, zOI = 2.3 ]
29. The mean height of 50 students who showed above average
participation in college atheletic was 68.2 inches with a standard deviation
of 2.5 inches; while 50 students who should no interest in such a
participation had a mean height of 67.5 inches with a standard deviation
of 2.8 inches. Test at 1% level the hypothesis that students who participate
in college atheletics are taller than other male students.

f rp= .005 and f rp= .01]


00 00

[Give [Hint: Ho : f.11 = f.12; HI : f.11 > f.12]


30. In ~6city, 20% of aVandom sample of 1100 students had a certain
physical defect. In another city, 200 out of a random sample of 900
students had the same defect. Do you think that the percentage is less in
the former city. Test at 5% level of significance.
[Hint Ho(~ = P2)' HI (P., < P2)]
31. In a certain state A, 450 persons were considered regular consumer
of alcohol out of a sample of 1000 perf-on. In another state B, 400 were
regular consumers of alcohol out of a sample of 800 persons. Do these
fact reveal that a significant difference between the two state as for as
alcohol consuming habit is concerned? Use 5% level of test.
[Given z.05 = 1.64; z.005 = 2.6 and z025 = 1.96 ]
[Hint: H 0 (~ = P2) etc. ]
32. Before imposing GST, 400 people out of a sample of 500 persons
were found to go to restaurants regularly. After imposition of the tax, 400
people out of a sample of 600 persons were found to go to restaurants.
State whether there is a significant decrease in going to restaurants due to
imposition of GST. Test at 1% level.
[Given $\int_{2.33}^{\infty}\phi(z)\,dz = .01$]
[Hint: $H_0(P_1 = P_2)$, $H_1(P_1 > P_2)$]
518 ENGINEERING MATHEMATICS - IIA

33. In a sample of 600 families in a city, 400 families are found to use
broadband internet. In another city, from a sample of 900 students 450
were found to use this type of internet. Test whether the two cities are
significantly different in using broadband internet. Test at 1% level.
[Given $z_{.05} = 1.64$; $z_{.005} = 2.6$]
Answers
1. No; z = -0.32   2. Yes   3. No; |z| = 3
4. No; |z| = 2.37   5. No; |z| = 2.67
6. |z| = 8.21; the population cannot have mean 5 cm
7. Computed value of z = -2.597. Supplier's claim is not valid
8. Computed value of z = 2.08. The sexes are not born in equal
proportion
9. Value of z = 1.429. Manufacturing process is not turning out more
than 2% defective burners
10. Value of z = -3.25. Males and females in the population are not in
5 : 3 ratio
11. Value of z = 1.69; rejected
12. Value of z = 2.27. Company's claim is rejected
13. z = 1.79. The die is unbiased
14. Yes; |z| = 6.38   15. Yes; z = 4.94
16. z = 3.37; there is difference in A and B
17. Not improved; z = 1.08
18. z = -3.9; reject $H_0(\mu_1 = \mu_2)$ at 1% level
19. Yes; z = 1.79   20. z = 1.86; No   21. Yes; z = 1.16
23. |z| = 10.12; not equal   24. |z| = 7.05; significantly different
25. |z| = 4.19; average lives are not same   26. |z| = 0.96; same
27. z = 1.3; not true, no difference
28. |z| = 3.84; differ significantly   29. |z| = 1.32; not taller
30. z = -1.21; yes   31. |z| = 2.11; yes
32. z = 4.94; yes   33. |z| = 6.38; different


MODULE-6

15.1. Introduction:
In the previous chapter the test of significance / hypothesis was conducted
on the basis of a large sample drawn from the population. In this chapter
the test will be done by drawing a sample of smaller size (< 30). Practically
it is convenient to draw a smaller random sample, and sometimes a large
sample may not be available at all.
In fact the process used in this direction is also a consequence of
some results discussed in the chapter on 'Sampling & Its Distribution' and
the well known 'Neyman-Pearson Theorem'.

15.2. Test for Single Mean:

Theorem 1. (To test Mean of Normal Population.)
Let a random sample $x_1, x_2, \ldots, x_n$ be taken from a normal population
with mean $\mu$ and s.d. $\sigma$ to test the Null Hypothesis $H_0(\mu = \mu_0)$ against
an alternative hypothesis $H_1$ at $\alpha$-level of significance, where $\sigma$ is
known (i.e., the hypotheses are simple). Then the Best Critical Region
determined by the test statistic $\bar{x}$ (the sample mean) is

(i) $\bar{x} < \mu_0 - \dfrac{\sigma}{\sqrt{n}}\, z_\alpha$, i.e., $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} < -z_\alpha$, if $H_1$ is left sided.

(ii) $\bar{x} > \mu_0 + \dfrac{\sigma}{\sqrt{n}}\, z_\alpha$, i.e., $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} > z_\alpha$, if $H_1$ is right sided.

(iii) $|\bar{x} - \mu_0| > \dfrac{\sigma}{\sqrt{n}}\, z_{\alpha/2}$, i.e., $|z| = \left|\dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}\right| > z_{\alpha/2}$, if $H_1$ is both sided,

where $z$ is a standard normal variate and $\alpha$ is the area under the standard
normal curve enclosed between the ordinates $z = z_\alpha$ and $z = \infty$ as
shown in the adjacent figure.

Proof: Beyond the scope of the book.


Note. If the population s.d. $\sigma$ is not known, we cannot assume
$\sigma = S$, the sample s.d. In that case we go to the following method.
If the sample size is small (< 30) and the sample is not drawn from
a normal population then the following theorem shows the method of
testing the hypothesis.
Theorem 2. (To test Mean of any Population, not necessarily normal)
Let a random sample $x_1, x_2, \ldots, x_n$ be taken from a population with mean
$\mu$ to test the null hypothesis $H_0(\mu = \mu_0)$ against an alternative hypothesis
$H_1$ at $\alpha$-level of significance. Then the Best Critical Region determined
by the test statistic $\bar{x}$ (the sample mean) is given by

(i) $t = \dfrac{\bar{x} - \mu_0}{S/\sqrt{n-1}} < -t_\alpha$ if $H_1$ is left sided

(ii) $t = \dfrac{\bar{x} - \mu_0}{S/\sqrt{n-1}} > t_\alpha$ if $H_1$ is right sided

(iii) $|t| = \left|\dfrac{\bar{x} - \mu_0}{S/\sqrt{n-1}}\right| > t_{\alpha/2}$ if $H_1$ is both sided

where $S$ is the sample standard deviation, $t$ is a t-variate and $\alpha$ is the
area under the probability density curve of the t distribution (with $n-1$
degrees of freedom) enclosed between the ordinates $t = t_\alpha$ and
$t = \infty$ as shown in the adjacent figure.
Proof: Omitted.
Illustration. Let the null hypothesis $H_0(\mu = 140)$ be tested against the
alternative hypothesis $H_1(\mu \neq 140)$ by drawing a sample of size 26, where
$\mu$ is the mean of the population. Let the mean and standard deviation of
the sample observations be 147 and 16.
SMALL SAMPLE TEST OF SIGNIFICANCE 521

Note that this is a two sided hypothesis.

Here $t = \dfrac{\bar{x} - \mu}{S/\sqrt{n-1}} = \dfrac{\bar{x} - 140}{16/\sqrt{25}} = \dfrac{\bar{x} - 140}{3.2}$ has t distribution with
26 - 1 = 25 degrees of freedom.
Let the testing be done at 10% level of significance. Now it is
known that the area under the t-curve with 25 d.o.f. enclosed between the
ordinates $t = 1.708$ and $t = \infty$ is 0.05 (as shown in the adjacent figure). So by
(iii) of the above theorem the Best Critical Region is $|t| > 1.708$ at 10% level.

Since here $\bar{x} = 147$,

∴ here $t = \dfrac{147 - 140}{3.2} = 2.1875$, which lies in the Best CR.
So we reject $H_0$ at 10% level of significance.
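The arithmetic of this illustration can be checked with a few lines of Python. This is only a sketch: the function name `one_sample_t` is an assumption of this example (it does not come from the text), and `S` is taken to be the sample s.d. computed with divisor n, which is why the denominator uses the square root of n - 1.

```python
import math

def one_sample_t(xbar, mu0, S, n):
    """t = (xbar - mu0) / (S / sqrt(n - 1)), S being the s.d. with divisor n."""
    return (xbar - mu0) / (S / math.sqrt(n - 1))

# Illustration: n = 26, sample mean 147, sample s.d. 16, H0: mu = 140
t = one_sample_t(147, 140, 16, 26)

t_crit = 1.708                 # t_{.05} for 25 d.o.f., as quoted in the text
reject_h0 = abs(t) > t_crit    # both sided test at 10% level

print(round(t, 4), reject_h0)
```

The critical value is passed in by hand because it is read from a statistical table, exactly as in the worked illustration.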

Illustrative Example.
Example 1. A tyre manufacturing company claims that the average life
of their product is 69 thousand miles. A car manufacturing company
planning to purchase tyres collects a sample of 10 tyre lives from the
population of 'life' of tyres, which are 65, 71, 64, 71, 70, 69, 64, 63, 67
and 68 thousand miles. It is known that the population is normal and
the standard deviation of the population is $\sqrt{7.056}$. Test at 5% level of
significance that the average life of the company's tyres is lower than
they claim.
[Given $z_{.05} = 1.65$, $z_{.01} = 2.36$, $t_{.01;9} = 2.28$]
Solution. We take the null hypothesis $H_0(\mu = 69)$ and $H_1(\mu < 69)$ as the
alternative hypothesis.
Here $\mu_0 = 69$.
Since the population is normal we take the test statistic

\[ z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{\bar{x} - 69}{\sqrt{7.056}/\sqrt{10}} = \frac{\bar{x} - 69}{0.84} \]

Here the sample mean,

\[ \bar{x} = \frac{65 + 71 + 64 + 71 + 70 + 69 + 64 + 63 + 67 + 68}{10} = 67.2 \]

∴ for this sample the computed value is $z = \dfrac{67.2 - 69}{0.84} = -2.14$.
As the alternative hypothesis is $H_1(\mu < 69)$, it is a left sided test.

The level of significance is 5%, ∴ $\alpha = \dfrac{5}{100} = .05$.

Since $z_{.05} = 1.65$, ∴ $\int_{1.65}^{\infty}\phi(z)\,dz = .05$

∴ $\int_{-\infty}^{-1.65}\phi(z)\,dz = .05$ also.

The Best CR is $z < -1.65$ at 5% level.

Our computed value of $z = -2.14 < -1.65$,
∴ this falls in the critical region.
∴ $H_0$ is rejected and $H_1$ is accepted.
We conclude 'the average life of the tyres is lower than their claim'.
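The steps of Example 1 (sample mean, z statistic, comparison with the critical value) can be reproduced as a short script. It is a sketch under the assumptions of the example: the population s.d. is known, and the helper name `one_sample_z` is chosen here purely for illustration.

```python
import math

def one_sample_z(sample, mu0, sigma):
    """Return (sample mean, z) for a normal population with known s.d. sigma."""
    n = len(sample)
    xbar = sum(sample) / n
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    return xbar, z

lives = [65, 71, 64, 71, 70, 69, 64, 63, 67, 68]   # thousand miles
xbar, z = one_sample_z(lives, 69, math.sqrt(7.056))

in_critical_region = z < -1.65   # left sided test at 5% level

print(round(xbar, 1), round(z, 2), in_critical_region)
```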
Ex. 2. A salesman is expected to effect an average sale of Rs. 3500 per
day. Observing the sales of a particular salesman for 6 days we see that he
gives an average sale of Rs. 3300 per day with s.d. 1016.53. Using 0.05 level of
significance conclude whether his work is below standard.
[Given $t_{.05} = 2.02$ for 5 d.o.f.]
Solution. Here the sample size n = 6, which is small. Let $\mu$ = mean of
sales. If $\mu < 3500$ then we say his work is below standard. We take the
null hypothesis $H_0(\mu = 3500)$ and the alternative hypothesis
$H_1(\mu < 3500)$.

$t = \dfrac{\bar{x} - \mu}{S/\sqrt{n-1}}$ has t distribution with $n - 1$ degrees of freedom.

The computed value of $t = \dfrac{3300 - 3500}{1016.53/\sqrt{5}} = -0.44$ with degrees of freedom
6 - 1 = 5.
This is a left tailed test. We are given $t_{.05} = 2.02$,
i.e., $P(2.02 < t < \infty) = 0.05$.
From symmetry of the t-curve we have $P(-\infty < t < -2.02) = 0.05$,
∴ the Critical Region is $t < -2.02$ as $H_1$ is left sided.
But here the computed value of $t = -0.44 > -2.02$.
So the value of t does not lie in the Critical Region.
∴ $H_0$ is accepted and consequently $H_1$ is rejected at 5% level.
We conclude "his work is not below standard".
Ex. 3. A certain pesticide is packed into bags by a machine. A random sample
of 10 bags is drawn and their contents are found to weigh (in kg) as
follows: 50, 49, 52, 44, 45, 48, 46, 45, 49, 45.
Test if the average packing can be taken to be 50 kg.
[Given $t_{.025;9} = 2.26$]
Solution: Here the sample size is small. Let $\mu$ = average weight of the
packing.
Calculation of sample mean ($\bar{x}$) and sample s.d.

    x       y = x - 45    y^2
    50          5          25
    49          4          16
    52          7          49
    44         -1           1
    45          0           0
    48          3           9
    46          1           1
    45          0           0
    49          4          16
    45          0           0
    Total      23         117

∴ $\bar{y} = \dfrac{1}{10}\sum y_i = \dfrac{1}{10} \times 23 = 2.3$

∵ $y = x - 45$, ∴ $\bar{y} = \bar{x} - 45$, ∴ $\bar{x} = \bar{y} + 45 = 2.3 + 45 = 47.3$

$\mathrm{Var}(y) = \dfrac{1}{n}\sum y_i^2 - \left(\dfrac{1}{n}\sum y_i\right)^2 = \dfrac{1}{10} \times 117 - (2.3)^2 = 11.7 - 5.29 = 6.41$

∴ s.d. of $y = \sqrt{6.41} = 2.53$

Since $y = x - 45$, ∴ $\sigma_y = \sigma_x$

∴ s.d. of $x$ = 2.53
Thus $S = 2.53$ is the sample standard deviation. The null hypothesis
is $H_0(\mu = 50)$ and the alternative hypothesis is $H_1(\mu \neq 50)$.

$t = \dfrac{\bar{x} - \mu}{S/\sqrt{n-1}}$ has t distribution with $n - 1$ d.o.f.

Here the computed value of $t = \dfrac{47.3 - 50}{2.53/\sqrt{9}} = -3.20$ with degrees of freedom
10 - 1 = 9.

We are given the data shown in the following figure.

[Figure: t-curve with critical values -2.26 and 2.26]

From the above figure we have $P(|t| > 2.26) = .05$

∴ the critical region is $|t| > 2.26$ at 5% level. Here $|t| = |-3.20| = 3.20$,
which is greater than 2.26.
∴ $H_0$ is rejected.
We conclude "average packing is not 50 kg".
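When only the raw observations are given, as in Ex. 3, the mean, the s.d. and the t statistic can all be computed in one pass. The sketch below assumes the book's convention that the sample s.d. S uses divisor n; the function name `t_from_sample` is illustrative only.

```python
import math

def t_from_sample(sample, mu0):
    """Return (mean, S, t), where S is the s.d. with divisor n."""
    n = len(sample)
    xbar = sum(sample) / n
    S = math.sqrt(sum((x - xbar) ** 2 for x in sample) / n)
    t = (xbar - mu0) / (S / math.sqrt(n - 1))
    return xbar, S, t

weights = [50, 49, 52, 44, 45, 48, 46, 45, 49, 45]   # kg
xbar, S, t = t_from_sample(weights, 50)

reject_h0 = abs(t) > 2.26   # both sided test at 5% level, 9 d.o.f.

print(round(xbar, 1), round(S, 2), round(t, 2), reject_h0)
```

Computing deviations from the mean directly avoids the change of origin (y = x - 45) that the hand calculation uses to keep the numbers small; both routes give the same S.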
Ex. 4. An IQ test was administered to 5 persons before and after they were
trained. The results are given below:

    Person                I     II    III    IV     V
    IQ before training   110   120   123    132   125
    IQ after training    120   118   125    136   121

Test whether there is any change in IQ after the training programme.
[Given $t_{.01}(4) = 4.6$]

Solution: The changes in IQ for the 5 persons are

120 - 110, 118 - 120, 125 - 123, 136 - 132, 121 - 125
i.e., 10, -2, 2, 4, -4.
We think of this as a sample of size 5 (small) drawn from the population of
"changes in IQ after training". If $\mu$ be the mean of this population then
$\mu = 0$ stands for 'no change in IQ'. So we take the null hypothesis
$H_0(\mu = 0)$. The alternative hypothesis is $H_1(\mu \neq 0)$.

The sample mean, $\bar{x} = \dfrac{10 - 2 + 2 + 4 - 4}{5} = 2$

Calculation of sample s.d.:

    x_i      10    -2     2     4    -4    Total  10
    x_i^2   100     4     4    16    16    Total 140

Sample s.d. $= \sqrt{\dfrac{1}{n}\sum x_i^2 - \left(\dfrac{1}{n}\sum x_i\right)^2} = \sqrt{\dfrac{140}{5} - (2)^2} = 4.90$

Here $t = \dfrac{\bar{x} - \mu}{S/\sqrt{n-1}}$ has $n - 1$ d.o.f.

The computed value of $t = \dfrac{2 - 0}{4.90/\sqrt{4}} = 0.82$

This is a two tailed test. Since $t_{.01} = 4.6$ we get the following figure:

[Figure: t-curve with critical values -4.6 and 4.6]

The figure gives $P(4.6 < t < \infty) = .01$
By symmetry $P(-\infty < t < -4.6) = .01$ also,
∴ $P(|t| > 4.6) = .02$.
∴ the critical region is given by $|t| > 4.6$ at 2% level of significance.
The computed value of $t = 0.82 < 4.6$, that is, it does not lie in the critical
region. ∴ $H_0$ is accepted at 2% level.
∴ We conclude "there is no change in IQ".
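Ex. 4 reduces a paired (before/after) comparison to a single-mean test on the differences. A minimal sketch of that reduction follows; the variable names are chosen for this illustration, and S again uses divisor n as in the worked solution.

```python
import math

before = [110, 120, 123, 132, 125]
after = [120, 118, 125, 136, 121]

# Change in IQ for each person: after minus before
d = [a - b for a, b in zip(after, before)]

n = len(d)
dbar = sum(d) / n
S = math.sqrt(sum(x * x for x in d) / n - dbar ** 2)   # s.d. with divisor n
t = (dbar - 0) / (S / math.sqrt(n - 1))                # H0: mean change = 0

in_critical_region = abs(t) > 4.6   # critical value quoted in the example

print(d, round(S, 2), round(t, 2), in_critical_region)
```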
15.3. Test for Difference of Mean (Test of equality of means)
Theorem 1. (If the population is Normal) Let two independent random
samples $\{x_1, x_2, \ldots, x_{n_1}\}$ and $\{x'_1, x'_2, \ldots, x'_{n_2}\}$ be drawn from two Normal
populations $(\mu_1, \sigma_1)$ and $(\mu_2, \sigma_2)$ respectively to test the hypothesis
$H_0(\mu_1 = \mu_2)$ at $\alpha$ level of significance;

\[ \bar{x}_1 = \frac{1}{n_1}\left(x_1 + x_2 + \cdots + x_{n_1}\right), \qquad \bar{x}_2 = \frac{1}{n_2}\left(x'_1 + x'_2 + \cdots + x'_{n_2}\right). \]

Then

\[ z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \]

is the test statistic, which is a standard normal variate. Accordingly the C.R. is
(i) $z < -z_\alpha$ if $H_1(\mu_1 < \mu_2)$ (left sided)

(ii) $z > z_\alpha$ if $H_1(\mu_1 > \mu_2)$ (right sided)

(iii) $|z| > z_{\alpha/2}$ if $H_1(\mu_1 \neq \mu_2)$ (both sided)

where $\alpha$ is the area under the standard normal curve enclosed between
the ordinates $z = z_\alpha$ and $z = \infty$ as shown in the adjacent figure.
Note. If the population s.d. $\sigma_1$ and $\sigma_2$ are not known then we cannot
take $\sigma_1 = S_1$, $\sigma_2 = S_2$, where $S_1$, $S_2$ are the sample standard deviations. In
that case we go to the following rule.
If the two small samples are not drawn from normal populations then
the following theorem shows the method of testing the hypothesis.
Theorem 2. To test the difference of mean for any population

Here

\[ t = \sqrt{\frac{n_1 n_2 (n_1 + n_2 - 2)}{n_1 + n_2}} \cdot \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{n_1 S_1^2 + n_2 S_2^2}} \]

is the test statistic, which is a Student's t variate with $n_1 + n_2 - 2$ degrees
of freedom, where $S_1^2$, $S_2^2$ are the variances of the two samples respectively.
Accordingly the C.R. is

(i) $t < -t_\alpha$ if $H_1(\mu_1 < \mu_2)$ (left sided)

(ii) $t > t_\alpha$ if $H_1(\mu_1 > \mu_2)$ (right sided)

(iii) $|t| > t_{\alpha/2}$ if $H_1(\mu_1 \neq \mu_2)$ (both sided)

where $\alpha$ is the area under the t-curve with $n_1 + n_2 - 2$ d.o.f. enclosed
between the ordinates $t = t_\alpha$ and $t = \infty$ as shown in the adjacent figure.

Illustration.
Let two samples be drawn from two normal populations $(\mu_1, \sigma)$ and
$(\mu_2, \sigma)$. The details of the two samples are

    size    mean    s.d.
     6      7.52    0.024
     5      7.49    0.032

We are to test the null hypothesis $H_0(\mu_1 = \mu_2)$, where $\sigma$ is not given, with
the help of the above sample observations. We choose the test statistic

\[ t = \sqrt{\frac{n_1 n_2 (n_1 + n_2 - 2)}{n_1 + n_2}} \cdot \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{n_1 S_1^2 + n_2 S_2^2}} \]

which is a Student's t variate with $n_1 + n_2 - 2$ degrees of freedom.
Corresponding to the drawn samples the computed value of

\[ t = \sqrt{\frac{6 \times 5\,(6 + 5 - 2)}{6 + 5}} \cdot \frac{7.52 - 7.49}{\sqrt{6 \times (0.024)^2 + 5 \times (0.032)^2}} = 1.60496 \]

having d.o.f. 6 + 5 - 2 = 9.
Let the alternative hypothesis be $H_1(\mu_1 > \mu_2)$, right sided.
From the statistical table we see $P(t > 2.26) = 0.025$ corresponding to d.o.f. 9.
So at 0.025 level of significance the CR is $t > 2.26$ corresponding to
d.o.f. 9. Here the computed value 1.60496 does not lie in this region. So
we accept $H_0$ and conclude "the two means of the two populations are
equal" at 2.5% level of significance.
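The two-sample statistic of Theorem 2 is easy to mistype by hand, so a small function helps to check the computed value. The sketch below assumes, as in the text, that the sample variances are computed with divisors n1 and n2; the name `two_sample_t` is an illustrative choice.

```python
import math

def two_sample_t(n1, n2, xbar1, xbar2, var1, var2):
    """t with n1 + n2 - 2 d.o.f.; var1, var2 are sample variances (divisor n)."""
    factor = math.sqrt(n1 * n2 * (n1 + n2 - 2) / (n1 + n2))
    return factor * (xbar1 - xbar2) / math.sqrt(n1 * var1 + n2 * var2)

# Illustration: sizes 6 and 5, means 7.52 and 7.49, s.d. 0.024 and 0.032
t = two_sample_t(6, 5, 7.52, 7.49, 0.024 ** 2, 0.032 ** 2)

in_critical_region = t > 2.26   # right sided test at 2.5% level, 9 d.o.f.

print(round(t, 5), in_critical_region)
```

The function takes variances rather than standard deviations, so the same call works for data sets that report the variance directly.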
Example 1. There are two normal populations, population I and population
II. 8 and 6 are the s.d. of the two populations respectively. Two independent
samples of sizes 10 and 12 are drawn from the two populations respectively
with sample means 20 and 27. Test at 1% level (i) whether the two
population means are equal (ii) whether the population I mean is less than
the population II mean. [Take the help of a statistical table]
Solution. If $\mu_1$ and $\mu_2$ be the population means of the two populations
then the null hypothesis is $H_0(\mu_1 = \mu_2)$.
Since the populations are normal we use the test statistic

\[ z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \]

Here $\bar{x}_1 = 20$, $\bar{x}_2 = 27$, $\sigma_1 = 8$, $\sigma_2 = 6$, $n_1 = 10$, $n_2 = 12$

∴ the computed value of $z = \dfrac{20 - 27}{\sqrt{\dfrac{8^2}{10} + \dfrac{6^2}{12}}}$ or, $z = \dfrac{-7}{3.07} = -2.28$

(i) Here consider the alternative hypothesis $H_1(\mu_1 \neq \mu_2)$. Both sided. At
1% level $\alpha = \dfrac{1}{100} = .01$, ∴ $\dfrac{\alpha}{2} = .005$.
From the table we get $z_{\alpha/2} = z_{.005} = 2.58$, i.e.,
$\int_{2.58}^{\infty}\phi(z)\,dz = .005$ and $\int_{-\infty}^{-2.58}\phi(z)\,dz = .005$ also.

∴ the CR is $|z| > 2.58$ at $\alpha = .01$ level, i.e., at 1% level of significance.

Our computed value of $|z| = |-2.28| = 2.28$ ≯ 2.58

∴ this does not lie in the CR.
∴ $H_0$ is accepted. We conclude "the two population means are equal".
(ii) Here we take the alternative hypothesis $H_1(\mu_1 < \mu_2)$. Left sided.
From the statistical table we see $\int_{2.32}^{\infty}\phi(z)\,dz = .01$ or, $\int_{-\infty}^{-2.32}\phi(z)\,dz = .01$

∴ the CR is $z < -2.32$ at 1% level.

Our computed value of $z = -2.28$ is not less than $-2.32$, so
this does not fall in the CR. $H_0$ is accepted, $H_1$ is rejected. We conclude
'population I mean is not less than the population II mean, rather they are
equal'.
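Both parts of Example 1 use the same computed z and differ only in the critical region. A short sketch makes that explicit; the function name `two_sample_z` is an assumption of this example, and the critical values 2.58 and 2.32 are the table values quoted in the solution.

```python
import math

def two_sample_z(xbar1, xbar2, sigma1, sigma2, n1, n2):
    """z for the difference of means when both population s.d. are known."""
    return (xbar1 - xbar2) / math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)

z = two_sample_z(20, 27, 8, 6, 10, 12)

reject_two_sided = abs(z) > 2.58   # (i)  H1: mu1 != mu2 at 1% level
reject_left_sided = z < -2.32      # (ii) H1: mu1 <  mu2 at 1% level

print(round(z, 2), reject_two_sided, reject_left_sided)
```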
Ex. 2. Two types of batteries are tested for their length of life, distributed
normally, and the following data are obtained.

              No. of batteries    Mean life (hours)    Variance
    Type A           9                  600               121
    Type B           8                  640               144

Is there a significant difference in the life times of the two types of
batteries? The value of t for 15 degrees of freedom at 5% level is 2.131.
Take $H_0(\mu_1 = \mu_2)$ against $H_1(\mu_1 \neq \mu_2)$. Consider the test statistic, as
the population s.d. is unknown,

\[ t = \sqrt{\frac{n_1 n_2 (n_1 + n_2 - 2)}{n_1 + n_2}} \cdot \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{n_1 S_1^2 + n_2 S_2^2}} \]

which is a t variate with degrees of freedom $n_1 + n_2 - 2$.

Corresponding to the samples the computed value of

\[ t = \sqrt{\frac{9 \times 8\,(9 + 8 - 2)}{9 + 8}} \cdot \frac{600 - 640}{\sqrt{9 \times 121 + 8 \times 144}} = 7.97 \times \frac{-40}{\sqrt{2241}} = -6.73 \]

having d.o.f. 9 + 8 - 2 = 15.
From the given data we have $P(|t| > 2.131) = 5\%$, shown by shade in
the adjacent figure (t-curve with critical values -2.131 and 2.131). Here the
computed value lies in this region. So $H_0$ is
rejected. We conclude "there is a significant difference in the average life
times of the two types of battery".
Ex. 3. A group of 5 patients treated with medicine 'A' weigh 42, 39, 48,
60 and 41 kg. A second group of 7 patients from the same hospital
treated with medicine 'B' weigh 38, 42, 56, 64, 68, 69 and 62 kg. Do
you agree with the claim that medicine 'B' increases the weight (which is
normally distributed) significantly more than A does? [The value of t at 5%
level of significance for 10 degrees of freedom is 2.2281]
Let $\mu_1$ = population mean weight of the patients treated with medicine A,

$\mu_2$ = population mean weight of the patients treated with medicine B.

Take the null hypothesis $H_0(\mu_1 = \mu_2)$ against the alternative hypothesis
$H_1(\mu_1 < \mu_2)$, left sided.
If we calculate the mean and variance of the two sets of observations
42, 39, 48, 60, 41 and 38, 42, 56, 64, 68, 69, 62

we get $\bar{x}_1 = 46$, $\bar{x}_2 = 57$, $S_1^2 = 58$, $S_2^2 = 132.28$ (detailed computation is
not shown). Since the populations are normal and their s.d. are not known
we take the test statistic

\[ t = \sqrt{\frac{n_1 n_2 (n_1 + n_2 - 2)}{n_1 + n_2}} \cdot \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{n_1 S_1^2 + n_2 S_2^2}} \]

The computed value of t is

\[ t = \sqrt{\frac{5 \times 7\,(5 + 7 - 2)}{5 + 7}} \cdot \frac{46 - 57}{\sqrt{5 \times 58 + 7 \times 132.28}} = -1.70 \]

From the given data we get $P(t < -2.2281) = .025$

Since $H_1$ is left sided the CR is $t < -2.2281$ at $.025 \times 100 = 2.5\%$ level.

Here the computed value of $t = -1.70 > -2.2281$.
So the value of t does not lie in the CR. We accept $H_0$; $H_1$ is rejected.
We conclude 'the weight of the patients treated with medicine B is not
increased more than that with A'.
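The solution of Ex. 3 quotes the sample means and variances without showing the computation. A sketch that derives them from the raw weights and then evaluates t is given below; the helper `mean_var` is a name chosen for this illustration, and variances use divisor n as elsewhere in the chapter.

```python
import math

def mean_var(xs):
    """Return (mean, variance with divisor n)."""
    n = len(xs)
    m = sum(xs) / n
    return m, sum((x - m) ** 2 for x in xs) / n

group_a = [42, 39, 48, 60, 41]            # medicine A, kg
group_b = [38, 42, 56, 64, 68, 69, 62]    # medicine B, kg

n1, n2 = len(group_a), len(group_b)
m1, v1 = mean_var(group_a)
m2, v2 = mean_var(group_b)

t = (math.sqrt(n1 * n2 * (n1 + n2 - 2) / (n1 + n2))
     * (m1 - m2) / math.sqrt(n1 * v1 + n2 * v2))

in_critical_region = t < -2.2281   # left sided test, 10 d.o.f.

print(m1, v1, m2, round(v2, 2), round(t, 2), in_critical_region)
```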
15.4. Test for a specified Correlation Coefficient
Here the population is a set of bivariate data, i.e., every element of the
population is of the type $(x_i, y_i)$, for example (height, weight) of the
students of a college. Here we shall test the correlation coefficient between
$x_i$ and $y_i$. This testing is conducted for different cases:
Theorem 1. (When the population is normal and the null hypothesis
is $H_0(\rho = 0)$)
If a random sample of n pairs $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ is
drawn from a bivariate normal population to test the hypothesis $H_0(\rho = 0)$
at $\alpha$ level then the test statistic

\[ t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}} \]

is a Student's t-variate with $n - 2$ degrees of freedom; $r$ is the correlation
coefficient of the pairs in the sample. Accordingly the CR is
(i) $t < -t_\alpha$ if $H_1(\rho < 0)$, left sided
(ii) $t > t_\alpha$ if $H_1(\rho > 0)$, right sided

(iii) $|t| > t_{\alpha/2}$ if $H_1(\rho \neq 0)$, both sided

where $\alpha$ = area under the t-curve with $n - 2$ degrees of freedom enclosed
between the ordinates $t = t_\alpha$ and $t = \infty$.

Example. From a bivariate normal population a sample of size 18 is drawn
and it is seen that the correlation coefficient in the sample is 0.3. Will it be
wise to think that the variables are uncorrelated in the population?
[Given, at 5% level the value of t for 16 degrees of freedom is 2.12]
Solution. The correlation coefficient in the sample, $r = 0.3$. Let the correlation
coefficient in the population = $\rho$. Sample size, $n = 18$ (small). The null
hypothesis is $H_0(\rho = 0)$. Alternative hypothesis $H_1(\rho \neq 0)$. Then the test
statistic is $t = \dfrac{r\sqrt{n - 2}}{\sqrt{1 - r^2}}$ with d.o.f. 18 - 2 = 16.

∴ the computed value of $t = \dfrac{0.3 \times \sqrt{18 - 2}}{\sqrt{1 - (0.3)^2}} = 1.258$

By the given information we mean $P(|t| > 2.12) = \dfrac{5}{100} = .05$

∴ the CR is $|t| > 2.12$
Our computed value of $|t| = |1.258| = 1.258$ ≯ 2.12,
i.e., this does not fall in the CR.
∴ $H_0$ is accepted at 5% level.
We conclude "the two variables are uncorrelated".
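The statistic of this example depends only on r and n, so it can be checked with a one-line function. The name `corr_t` is an assumption made for this sketch.

```python
import math

def corr_t(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 d.o.f., under H0: rho = 0."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

t = corr_t(0.3, 18)
in_critical_region = abs(t) > 2.12   # both sided test at 5% level, 16 d.o.f.

print(round(t, 3), in_critical_region)
```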
Theorem 2. (When the population is Bivariate normal and the null
hypothesis is $H_0(\rho = \rho_0)$ with $\rho_0 \neq 0$)
If a random sample of n (small) pairs
$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ is drawn from a bivariate normal population
to test the hypothesis $H_0(\rho = \rho_0)$, $\rho_0 \neq 0$, at $\alpha$ level of significance, the
statistic

\[ Z = \frac{1}{2}\log_e\frac{1 + r}{1 - r} \]

has approximate normal distribution with

\[ \text{mean} = \frac{1}{2}\log_e\frac{1 + \rho_0}{1 - \rho_0} \quad\text{and}\quad \text{s.d.} = \frac{1}{\sqrt{n - 3}}. \]

Then the test statistic $z = \dfrac{Z - \text{mean}}{\text{s.d.}}$ is an approximate standard normal
variate for which the CR is obtained accordingly.
Example. A random sample of 24 pairs of observations shows a correlation
coefficient 0.69. Is it possible that the correlation coefficient between the
variables in the population is 0.59? Test at 1% level of significance. Use
a proper statistical table.
Solution. The sample size $n = 24$ (small).
The sample correlation coefficient, $r = 0.69$.

Null hypothesis $H_0(\rho = 0.59)$. Alternative hypothesis $H_1(\rho \neq 0.59)$;
both sided.

Now, $Z = \dfrac{1}{2}\log_e\dfrac{1 + r}{1 - r}$ has approximate normal distribution with

\[ \text{mean} = \frac{1}{2}\log_e\frac{1 + \rho}{1 - \rho} = \frac{1}{2}\log_e\frac{1 + 0.59}{1 - 0.59} = \frac{1}{2}\log_e 3.878 = \frac{1}{2} \times 1.355 = 0.6775 \]

and

\[ \text{s.d.} = \frac{1}{\sqrt{n - 3}} = \frac{1}{\sqrt{24 - 3}} = 0.218 \]

∴ $z = \dfrac{Z - 0.6775}{0.218}$ is a standard normal variate.
Here the observed correlation, $r = 0.69$,

∴ the computed value of $Z = \dfrac{1}{2}\log_e\dfrac{1 + 0.69}{1 - 0.69} = 0.848$

∴ the computed value of $z = \dfrac{0.848 - 0.6775}{0.218} = 0.78$

This is a both sided test at 1% level, so $\alpha/2 = .005$. From the table we get

$\int_{2.58}^{\infty}\phi(z)\,dz = 0.005$ ∴ $\int_{-\infty}^{-2.58}\phi(z)\,dz = .005$ also

∴ for this test the CR is $|z| > 2.58$

Our computed value of $|z| = |0.78|$ ≯ 2.58
∴ this does not fall in the CR.
∴ $H_0$ is accepted. We conclude 'it would be right to think that the
correlation coefficient between the variables in the population is 0.59'.
Note. This method can be used to test $H_0(\rho = 0)$ also, but this method
gives an approximate result.
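The transformation used in Theorem 2 can be sketched in Python as follows. The function names `fisher_z` and `corr_z_test` are assumptions of this example; natural logarithms are used throughout, as in the theorem.

```python
import math

def fisher_z(r):
    """Z = (1/2) * log_e((1 + r) / (1 - r))."""
    return 0.5 * math.log((1 + r) / (1 - r))

def corr_z_test(r, rho0, n):
    """Approximate standard normal statistic for H0: rho = rho0."""
    sd = 1 / math.sqrt(n - 3)
    return (fisher_z(r) - fisher_z(rho0)) / sd

# Example: r = 0.69 from n = 24 pairs, H0: rho = 0.59
z = corr_z_test(0.69, 0.59, 24)

print(round(fisher_z(0.59), 4), round(fisher_z(0.69), 3), round(z, 2))
```

Since |z| is about 0.78, the computed value falls outside the critical region whether the both sided critical value is taken as 1.96 (5% level) or 2.58 (1% level).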
15.5. Test for Difference of Correlation Coefficients (or, Equality
of two Correlations)
If two independent random samples of pairs $(x_1, y_1), (x_2, y_2), \ldots, (x_{n_1}, y_{n_1})$ and
$(x'_1, y'_1), (x'_2, y'_2), \ldots, (x'_{n_2}, y'_{n_2})$ are drawn from two bivariate normal
populations to test the hypothesis $H_0(\rho_1 = \rho_2)$ at $\alpha$ level of significance
then the two statistics

\[ Z_1 = \frac{1}{2}\log_e\frac{1 + r_1}{1 - r_1} \quad\text{and}\quad Z_2 = \frac{1}{2}\log_e\frac{1 + r_2}{1 - r_2} \]

($r_1$, $r_2$ are the observed correlation coefficients of the pairs in the two samples
respectively) are such that $Z_1 - Z_2$ is a normal variate with mean
0 (∵ $\rho_1 = \rho_2$) and

\[ \text{s.d.} = \sqrt{\frac{1}{n_1 - 3} + \frac{1}{n_2 - 3}}, \]

i.e., the test statistic

\[ z = \frac{Z_1 - Z_2}{\sqrt{\dfrac{1}{n_1 - 3} + \dfrac{1}{n_2 - 3}}} \]

is a standard normal variate for which the CR is obtained accordingly.
Example. Two independent samples of sizes 16 and 18 are drawn from
two normal bivariate populations. It is observed that the correlation
coefficients between the variables in the two samples are 0.40 and 0.68
respectively. Test at 5% level of significance whether the correlation
coefficients of the variates in the two populations are the same.
[Given: the area under the normal curve between the ordinates
z = -1.96 and z = 1.96 is 0.95]
Solution. The sample sizes are $n_1 = 16$, $n_2 = 18$. The sample correlations are
$r_1 = 0.40$, $r_2 = 0.68$. Let $\rho_1$, $\rho_2$ be the correlation coefficients in the two
populations respectively. The null hypothesis is $H_0(\rho_1 = \rho_2)$.
The alternative hypothesis is $H_1(\rho_1 \neq \rho_2)$. Both sided.

$Z_1 = \dfrac{1}{2}\log_e\dfrac{1 + r_1}{1 - r_1}$ and $Z_2 = \dfrac{1}{2}\log_e\dfrac{1 + r_2}{1 - r_2}$

$Z_1 - Z_2$ has normal distribution with mean 0 and

s.d. $= \sqrt{\dfrac{1}{n_1 - 3} + \dfrac{1}{n_2 - 3}} = \sqrt{\dfrac{1}{16 - 3} + \dfrac{1}{18 - 3}} = 0.38$

∴ the test statistic $z = \dfrac{Z_1 - Z_2}{0.38}$ is a standard normal variate.
Now the computed values of $Z_1$ and $Z_2$ are

$Z_1 = \dfrac{1}{2}\log_e\dfrac{1 + 0.40}{1 - 0.40} = 0.424$ and $Z_2 = \dfrac{1}{2}\log_e\dfrac{1 + 0.68}{1 - 0.68} = 0.829$

∴ the computed value of $z = \dfrac{0.424 - 0.829}{0.38} = -1.07$

Now we are given $\int_{-1.96}^{1.96}\phi(z)\,dz = 0.95$,
∴ the CR is $|z| > 1.96$ at 5% level.

Our computed value of $|z| = |-1.07| = 1.07$ ≯ 1.96

∴ this does not fall in the CR.
∴ $H_0$ is accepted. We conclude "the two populations have the same
correlation coefficient".
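The comparison of two sample correlations can be checked numerically with the sketch below. The names `fisher_z` and `two_corr_z` are illustrative assumptions. Evaluating the natural logarithms directly gives Z1 of about 0.424 and Z2 of about 0.829, hence z of about -1.07, so |z| stays below 1.96 and the conclusion of the example (accept H0) is unaffected.

```python
import math

def fisher_z(r):
    """Z = (1/2) * log_e((1 + r) / (1 - r))."""
    return 0.5 * math.log((1 + r) / (1 - r))

def two_corr_z(r1, n1, r2, n2):
    """Standard normal statistic for H0: rho1 = rho2."""
    sd = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (fisher_z(r1) - fisher_z(r2)) / sd

z = two_corr_z(0.40, 16, 0.68, 18)
in_critical_region = abs(z) > 1.96   # both sided test at 5% level

print(round(fisher_z(0.40), 3), round(fisher_z(0.68), 3), round(z, 2))
```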
15.6. Test for Ratio of Variances (or, equality of two s.d.)
Theorem: Let two random samples of sizes $n_1$ and $n_2$ be drawn from
two normal populations with sample standard deviations $S_1$ and $S_2$ to test
the hypothesis $H_0(\sigma_1 = \sigma_2)$ at $\alpha$ level of significance. $\sigma_1$, $\sigma_2$ are the s.d. of
the two populations respectively.
Then the test statistic is

\[ F = \frac{s_1^2}{s_2^2} \quad\text{where}\quad s_1^2 = \frac{n_1}{n_1 - 1} S_1^2 \quad\text{and}\quad s_2^2 = \frac{n_2}{n_2 - 1} S_2^2, \]

which is Snedecor's F variate with degrees of freedom $(n_1 - 1, n_2 - 1)$.
The Best Critical Region is $F > F_\alpha$ where $H_1(\sigma_1 > \sigma_2)$ is right sided,
where $P(F_\alpha < F < \infty) = \alpha$, i.e., the area under the F-curve between the
ordinates at $F_\alpha$ and $\infty$ is $\alpha$.
Since the F curve is skewed to the right only and does not exist to the left
of the origin, the alternative hypothesis should be taken as $H_1(\sigma_1 > \sigma_2)$.
It is convenient to select $S_1$ and $S_2$ in such a way that $s_1 > s_2$.
Example 1. Two random samples of sizes 9 and 13 are drawn from two
normal populations. It is observed that the standard deviations of the two
samples are 2.1 and 1.8 respectively. May we suppose that the standard
deviation of the second population is greater than that of the first population?
Test at 5% level. [Given $P(F > 3.28) = .05$ for d.o.f. 12, 8]
Solution. Let $n_1$ = size of the sample drawn from the second population = 13,
$S_1 = 1.8$, and $n_2 = 9$, $S_2 = 2.1$.
Null hypothesis $H_0(\sigma_1 = \sigma_2)$.
Alternative hypothesis $H_1(\sigma_1 > \sigma_2)$; right sided.

Now $s_1^2 = \dfrac{n_1}{n_1 - 1} S_1^2 = \dfrac{13}{12} \times (1.8)^2 = 3.51$

$s_2^2 = \dfrac{n_2}{n_2 - 1} S_2^2 = \dfrac{9}{8} \times (2.1)^2 = 4.961$

Computed value of the test statistic,

$F = \dfrac{s_1^2}{s_2^2} = \dfrac{3.51}{4.961} = 0.708$

with degrees of freedom 13 - 1, 9 - 1, i.e., 12, 8.

Now $\alpha = 5\% = \dfrac{5}{100} = .05$, ∴ $F_\alpha = 3.28$

∴ the CR is $F > 3.28$.
Our computed value of $F = 0.708$ ≯ 3.28. ∴ F does not fall in the CR.
∴ $H_0$ is accepted. $H_1$ is rejected.
We conclude that the s.d. of the second population is not greater than that
of the first; rather they are equal.
Example 2. In a sample of 8 observations, the sum of the squared
deviations of items from the mean was 94.5. In another sample of 10
observations, the value was found to be 101.7. Test whether the difference
is significant at 5% level. [Given: the area under the F-curve enclosed between
the ordinates at 3.29 and $\infty$ is .05 for d.o.f. (7, 9), and at 3.07 for d.o.f. (8, 10)]

Solution. Here $n_1 = 8$, $S_1^2 = \dfrac{1}{8} \times 94.5$

∴ $s_1^2 = \dfrac{n_1}{n_1 - 1} S_1^2 = \dfrac{8}{7} \times \dfrac{1}{8} \times 94.5 = 13.5$

$n_2 = 10$, $S_2^2 = \dfrac{1}{10} \times 101.7$

∴ $s_2^2 = \dfrac{n_2}{n_2 - 1} S_2^2 = \dfrac{10}{9} \times \dfrac{1}{10} \times 101.7 = 11.3$

Computed value of the test statistic

$F = \dfrac{s_1^2}{s_2^2} = \dfrac{13.5}{11.3} = 1.195$ with degrees of freedom $(n_1 - 1, n_2 - 1) = (7, 9)$.

Now, $\alpha = 5\% = \dfrac{5}{100} = .05$, ∴ $F_\alpha = 3.29$.
∴ the critical region is $F > 3.29$.

Our computed value of $F = 1.195$ ≯ 3.29

∴ it does not fall in the CR.
∴ $H_0$ is accepted.
We conclude "the difference of the two s.d. is not significant".
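The F statistic of Example 2 can be reproduced with the following sketch. The function names are assumptions of this illustration; `S_sq` denotes a sample variance with divisor n, which is converted to the estimate with divisor n - 1 as in the theorem.

```python
def unbiased_var(S_sq, n):
    """s^2 = n / (n - 1) * S^2, where S^2 is the sample variance with divisor n."""
    return n / (n - 1) * S_sq

def f_statistic(S1_sq, n1, S2_sq, n2):
    """F = s1^2 / s2^2 with (n1 - 1, n2 - 1) degrees of freedom."""
    return unbiased_var(S1_sq, n1) / unbiased_var(S2_sq, n2)

# Example 2: sums of squared deviations 94.5 (n = 8) and 101.7 (n = 10)
F = f_statistic(94.5 / 8, 8, 101.7 / 10, 10)
in_critical_region = F > 3.29   # 5% value for (7, 9) d.o.f., as given

print(round(F, 3), in_critical_region)
```

When the sum of squared deviations is given directly, as here, s^2 is simply that sum divided by n - 1, so the factor n cancels.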
Example 3. The following results were obtained from two independent
random samples.

                  Sample size    Mean    s.d.
    Sample I :         6          29     4.0
    Sample II :        5          25     2.1

Test at 5% level whether the two samples may be regarded as drawn
from the same normal population.
[Given $t_{.025} = 2.26$ for 9 d.o.f. and $F_{.05} = 6.26$ for (5, 4) d.o.f.]
Solution. A normal population has two parameters: its mean and standard
deviation. If we find that these two are the same for the two populations then
we may conclude that "the two samples are drawn from the same normal
population".
So we shall test $H_0(\mu_1 = \mu_2)$ as well as $H_0(\sigma_1 = \sigma_2)$, where $\mu_1$, $\mu_2$
are the two population means and $\sigma_1$, $\sigma_2$ are the population s.d.
We should first test $H_0(\sigma_1 = \sigma_2)$ because if $\sigma_1 = \sigma_2$ then
$H_0(\mu_1 = \mu_2)$ can be tested.

So first we shall test $H_0(\sigma_1 = \sigma_2)$.

Now $s_1^2 = \dfrac{n_1}{n_1 - 1} S_1^2 = \dfrac{6}{6 - 1} \times (4.0)^2 = 19.20$

$s_2^2 = \dfrac{n_2}{n_2 - 1} S_2^2 = \dfrac{5}{5 - 1} \times (2.1)^2 = 5.51$

Computed value of $F = \dfrac{s_1^2}{s_2^2} = \dfrac{19.20}{5.51} = 3.48$
with degrees of freedom $(n_1 - 1, n_2 - 1) = (5, 4)$.
The CR is $F > 6.26$.
Our computed value of $F = 3.48$ ≯ 6.26,
∴ it does not lie in the CR. $H_0$ is accepted.
We conclude $\sigma_1$ and $\sigma_2$ are equal.

Next test the hypothesis $H_0(\mu_1 = \mu_2)$.

The common s.d. of the two populations is not known.
So we find

\[ s^2 = \frac{n_1 S_1^2 + n_2 S_2^2}{n_1 + n_2 - 2} = \frac{6 \times (4.0)^2 + 5 \times (2.1)^2}{6 + 5 - 2} = 13.12 \]

∴ computed value of

\[ t = \frac{\bar{x}_1 - \bar{x}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{29 - 25}{\sqrt{13.12}\,\sqrt{\dfrac{1}{6} + \dfrac{1}{5}}} = 1.82 \]

t has $n_1 + n_2 - 2 = 6 + 5 - 2 = 9$ degrees of freedom.

By the given data the critical region is $|t| > 2.26$. The computed value
$|t| = 1.82$ does not lie in the CR.
∴ $H_0(\mu_1 = \mu_2)$ is accepted. Thus the two samples may be regarded as
drawn from the same normal population.
15.7. Chi-square test for Goodness of Fit.
In a random sample drawn from a population there are different classes
of observations having different frequencies. The test by which we decide
whether the observations are in good agreement with a hypothetical
distribution of the population is known as the Test of Goodness of Fit.
Illustration.
Let a random sample of 500 students be drawn from the population of all
college students of West Bengal. Let it be observed that there are four classes
of students among these, which are Hindus, Muslims, Christians and Buddhists.
The frequencies of these classes, i.e., the numbers of students of these classes,
are noted as 200, 150, 100 and 50 respectively in the sample. We consider
the hypothesis that "the ratio of students of these classes in West Bengal
colleges is 3 : 2 : 2 : 1". If we want to test whether the frequencies of the
classes that occurred in our sample agree well with this hypothesis then this is a
'test of Goodness of Fit'.
Observed Frequency and Expected Frequency
The frequency of a class in a sample is called the observed frequency of
that class in the sample. It is denoted by $f_o$. In the previous Illustration
the observed frequency of the class 'Muslim', "$f_o$ of Muslim" = 150.
Under the hypothesis (on the population) which is to be tested by the 'test
of goodness of fit', the 'should be' frequency of a class is called the
expected frequency of that class in the sample. It is denoted by $f_e$.
In the previous Illustration the expected frequency of the class
'Muslim', "$f_e$ of Muslim" $= \dfrac{2}{8} \times 500 = 125$.
Theorem. Let $H_0$ be a null hypothesis regarding the occurrence of different
classes in the population.
Suppose there are k classes. If $f_o$ and $f_e$ be the observed
and expected frequency (under $H_0$) of a class then

\[ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}, \]

the summation being taken over all classes, is the test statistic which has $\chi^2$
distribution with $k - 1$ degrees of freedom.
The Best Critical Region is
$\chi^2 > \chi^2_\alpha$ at $\alpha$ level of significance,
where $\alpha$ is the area under the $\chi^2$ curve
(of $k - 1$ d.o.f.) enclosed between the
ordinates at $\chi^2 = \chi^2_\alpha$ and $\chi^2 = \infty$ as
shown in the adjacent figure.
Proof: Beyond the scope of the book.
For illustration and application of the above theorem go through the
following illustrative examples.
15.8. Illustrative Examples.
Example 1. In a sample of 500 students drawn from the college students
in West Bengal it is observed that the numbers of students reading in H.S.,
1st Year, 2nd Year and 3rd Year are 200, 150, 100 and 50 respectively.
Will it support the hypothesis that in West Bengal the numbers of students
reading in H.S., 1st year, 2nd year and 3rd year are in proportion 3 : 2 : 2
: 1? Test at 5% level of significance. Given that the 5% value of $\chi^2$ for 3
and 4 d.o.f. is 7.815 and 7.416 respectively.
The 'students reading in H.S.', 'students reading in 1st year' etc. are
the classes. So here we have k = 4 classes. The null hypothesis is $H_0$
(In W.B. No. of students reading in H.S. : 1st year : 2nd year : 3rd year
= 3 : 2 : 2 : 1). The alternative hypothesis is $H_1$ (In W.B. No. of students
reading in H.S. : 1st year : 2nd year : 3rd year ≠ 3 : 2 : 2 : 1).
Let $f_o$ = observed frequency of a class in the sample,
$f_e$ = expected frequency of a class in the sample, under $H_0$.

Computation of $f_o$ and $f_e$

    Class                   Observed          Probability     Expected frequency
                            frequency (f_o)   (under H0)      (f_e)
    Students in H.S.            200              3/8          (3/8) x 500 = 187.5
    Students in 1st Year        150              2/8          (2/8) x 500 = 125
    Students in 2nd Year        100              2/8          (2/8) x 500 = 125
    Students in 3rd Year         50              1/8          (1/8) x 500 = 62.5
    Total                       500               1           500*

* The total of the expected frequencies should be equal to the total of
the observed frequencies.

The test statistic is $\chi^2 = \sum \dfrac{(f_o - f_e)^2}{f_e}$, with 4 - 1 = 3 degrees of freedom.
The summation extends over all classes. Computation of $\chi^2$
corresponding to the sample is shown below:

    f_o      f_e       (f_o - f_e)^2    (f_o - f_e)^2 / f_e
    200      187.5        156.25              0.83
    150      125          625                 5
    100      125          625                 5
     50       62.5        156.25              2.5
    Total    500            -                13.33

∴ the computed value of $\chi^2 = 13.33$ with 3 d.o.f.

From the supplied data we have $P(\chi^2 > 7.815) = 0.05$
∴ the best critical region is $\chi^2 > 7.815$.
Since our computed value lies in this region (as 13.33 > 7.815) we reject
$H_0$ at 5% level of significance. We conclude "In W.B. the numbers of
students reading in H.S., 1st year, 2nd year and 3rd year are not in the
ratio 3 : 2 : 2 : 1". (Note that here the second supplied data is useless.)
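The two tables of this example (expected frequencies, then the chi-square sum) follow a fixed recipe, sketched below. The function name `chi_square_gof` and its argument names are assumptions made for this illustration; the hypothesised proportions are passed as a ratio such as 3 : 2 : 2 : 1.

```python
def chi_square_gof(observed, ratio):
    """Return (expected frequencies, chi-square) for a hypothesised ratio."""
    total = sum(observed)
    parts = sum(ratio)
    expected = [total * r / parts for r in ratio]
    chi2 = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))
    return expected, chi2

observed = [200, 150, 100, 50]          # H.S., 1st, 2nd, 3rd year
expected, chi2 = chi_square_gof(observed, [3, 2, 2, 1])

dof = len(observed) - 1                 # k - 1 = 3
reject_h0 = chi2 > 7.815                # 5% value for 3 d.o.f., as given

print(expected, round(chi2, 2), dof, reject_h0)
```

Because the expected frequencies are scaled from the observed total, their sum automatically equals the total of the observed frequencies, as the footnote to the first table requires.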

----
540 ENGINEERING MATHEMATlCS -IIA

Ex. 2. A die is thrown 150 times with the following results:

No. turned up :   1    2    3    4    5    6
Frequency     :  19   23   28   17   32   31

Test the hypothesis that the die is unbiased. Given $\chi^2_{0.05;5} = 11.07$ and
$\chi^2_{0.05;6} = 12.59$.

Here the number of classes, k =6.


The die will be unbiased if the frequencies of all faces in many throws
are the same, i.e. the ratio of occurrence of the classes = 1 : 1 : 1 : 1 : 1 : 1.
Let $H_0$ (in several throws the occurrence of Face 1 : Face 2 : Face 3
: Face 4 : Face 5 : Face 6 = 1 : 1 : 1 : 1 : 1 : 1).
Let $f_o$ = observed frequency of a class in the sample,
$f_e$ = expected frequency of a class in the sample under $H_0$.
The test statistic is $\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$ (the summation is
extended over all possible classes), which is a $\chi^2$ variate with
k - 1 = 6 - 1 = 5 degrees of freedom.
Corresponding to the sample, the value of $\chi^2$ is computed in the
following table:

Class     f_o   Probability   f_e          f_o - f_e   (f_o - f_e)^2   (f_o - f_e)^2 / f_e
                              = (3) x 150
(1)       (2)   (3)           (4)          (5)         (6)             (7) = (6) / (4)

Face 1    19    1/6           25           -6          36              1.44
Face 2    23    1/6           25           -2           4              0.16
Face 3    28    1/6           25            3           9              0.36
Face 4    17    1/6           25           -8          64              2.56
Face 5    32    1/6           25            7          49              1.96
Face 6    31    1/6           25            6          36              1.44
Total    150    1            150*           -           -              7.92

* This should be equal to $\sum f_o$.


∴ the computed value of $\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 7.92$ with 5 d.o.f.

From the supplied data we have $P(\chi^2 > 11.07) = 0.05$.

So the critical region is $\chi^2 > 11.07$ at the 5% level of significance.

Since our computed value does not lie in this region (as 7.92 $\not>$ 11.07)
we accept $H_0$ at the 5% level and conclude

"The die may be regarded as unbiased".


Ex. 3. The numbers of road accidents per week in a certain area were as
follows:
12, 8, 20, 2, 14, 10, 15, 6, 9, 4
Are these frequencies in agreement with the belief that accident conditions
were the same during the 10-week period?
Given the following data:

Statistic      d.o.f.    Value at 5% level
$\chi^2$         9         16.919

This is a sample of 12 + 8 + 20 + 2 + 14 + 10 + 15 + 6 + 9 + 4 = 100 observations.

These observations are arranged as

Week             : 1st  2nd  3rd  4th  5th  6th  7th  8th  9th  10th
No. of accidents :  12    8   20    2   14   10   15    6    9    4

Here k = no. of classes = no. of weeks = 10.
That the accident conditions are the same during this 10-week period means
that there is an equal number of accidents in each week.

We consider $H_0$ (there is an equal number of accidents in each week)
and $H_1$ (the number of accidents in each week is not the same).

Let $f_o$ = observed frequency of each class in the sample,
$f_e$ = expected frequency of each class in the sample under $H_0$
(= 10 for each class).
The test statistic is $\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$
(the summation being extended over all classes), which is a $\chi^2$ variate
with 10 - 1 = 9 d.o.f. Corresponding to the sample, the value of $\chi^2$ is
computed in the following table:

Class        f_o    f_e    f_o - f_e   (f_o - f_e)^2   (f_o - f_e)^2 / f_e

1st week      12     10        2             4              0.4
2nd week       8     10       -2             4              0.4
3rd week      20     10       10           100             10
4th week       2     10       -8            64              6.4
5th week      14     10        4            16              1.6
6th week      10     10        0             0              0
7th week      15     10        5            25              2.5
8th week       6     10       -4            16              1.6
9th week       9     10       -1             1              0.1
10th week      4     10       -6            36              3.6
Total        100    100        -             -             26.6

∴ the computed value of $\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 26.6$ with 9 d.o.f.

From the given data we see $P(\chi^2 > 16.919) = 0.05$,
∴ the CR is $\chi^2 > 16.919$. Since the computed value lies in this region
we reject $H_0$ at the 5% level and conclude "The accident conditions are not
the same during the 10-week period".

Ex. 4. Of 160 offspring of a certain cross between guinea pigs, 102
were red, 24 were black and 34 were white. According to the genetic model
the probabilities of red, black and white are respectively 9/16, 3/16 and
1/4. Test at the 5% level of significance whether the data are consistent
with the model. For 2 degrees of freedom $P(\chi^2 > 5.99) = 0.05$.
Here k = no. of types of guinea pig = 3. Take $H_0$ (the probabilities of
red, black and white are 9/16, 3/16 and 1/4 respectively); $H_1$ (the
probabilities are not as prescribed in the genetic model).
$f_o$ = observed frequency of the class in the sample
$f_e$ = expected frequency of the class in the sample

$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$ is a $\chi^2$ variate with 3 - 1 = 2 d.o.f.

Computation of $\chi^2$

Class    f_o   Probability   f_e           f_o - f_e   (f_o - f_e)^2   (f_o - f_e)^2 / f_e
(1)      (2)   (3)           (4)
                             = (3) x 160
Red      102   9/16           90           12          144             1.6
Black     24   3/16           30           -6           36             1.2
White     34   1/4            40           -6           36             0.9
Total    160   1             160            -            -             3.7

∴ the computed value of $\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 3.7$ with 2 d.o.f.

From the given data the CR is $\chi^2 > 5.99$. Our computed value does
not lie in this region (as 3.7 $\not>$ 5.99). Hence $H_0$ is accepted at the 5% level of
significance and we conclude "the data are consistent with the model".
Ex. 5. A survey of 320 families with 5 children each revealed the following
distribution:

No. of boys     :   5    4     3    2    1    0
No. of girls    :   0    1     2    3    4    5
No. of families :  14   56   110   88   40   12

Is the result consistent with the hypothesis that male and female births
are equally probable? The 5% value of $\chi^2$ with 5 d.o.f. is 11.07.
[W.B.U.Tech. 2004, 2015]
This is a sample of 320 families, which are arranged as:

No. of boys     :   5    4     3    2    1    0
No. of families :  14   56   110   88   40   12

Here we take 'a family with 5 boys', 'a family with 4 boys', etc. as the
classes. So here the number of classes, k = 6.
We take $H_0$ (male and female births are equiprobable), i.e. $H_0$ (the
probability of a male birth = 1/2).
Then, among 5 children, the probability of r boys
$= {}^5C_r \left(\frac{1}{2}\right)^r \left(1 - \frac{1}{2}\right)^{5-r} = \frac{1}{32} \times {}^5C_r$.
Therefore, among 320 families, the expected number of families having
r boys $= 320 \times \frac{1}{32} \times {}^5C_r = 10 \times {}^5C_r$.
Let $f_o$ = observed frequency in the sample,
$f_e$ = expected frequency in the sample under $H_0$.
Now, $\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$ is a $\chi^2$ variate with 6 - 1 = 5 d.o.f.
Corresponding to the sample we compute the value of $\chi^2$:

Class           f_o   Probability       f_e           f_o - f_e   (f_o - f_e)^2   (f_o - f_e)^2 / f_e
(No. of boys)                           = (3) x 320
(1)             (2)   (3)               (4)           (5)         (6)             (7)

5                14   (1/32) x 5C5       10             4          16             1.6
4                56   (1/32) x 5C4       50             6          36             0.72
3               110   (1/32) x 5C3      100            10         100             1
2                88   (1/32) x 5C2      100           -12         144             1.44
1                40   (1/32) x 5C1       50           -10         100             2
0                12   (1/32) x 5C0       10             2           4             0.4
Total           320   1                 320             -           -             7.16

The computed value of $\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 7.16$ with 5 d.o.f. From the
supplied data we have the CR as $\chi^2 > 11.07$. Our computed value does
not lie in this region. So $H_0$ is accepted at the 5% level and we conclude
"male and female births are equiprobable".
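
When the classes come from a binomial model, as in Ex. 5, the expected frequencies follow directly from the binomial probabilities. The sketch below assumes Python's standard `math.comb`; the helper name `binomial_expected` is illustrative and not from the text.

```python
from math import comb


def binomial_expected(n_trials, p, n_samples):
    """Expected frequency of r successes (r = 0, 1, ..., n_trials)
    among n_samples independent repetitions of n_trials Bernoulli trials."""
    return [
        n_samples * comb(n_trials, r) * p**r * (1 - p) ** (n_trials - r)
        for r in range(n_trials + 1)
    ]


# Ex. 5: observed numbers of families with 0, 1, 2, 3, 4, 5 boys
observed = [12, 40, 88, 110, 56, 14]
expected = binomial_expected(5, 0.5, 320)  # 10 x 5Cr for r = 0..5

chi2 = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))
print(expected)
print(round(chi2, 2))
```

Note that the classes are listed here in increasing order of the number of boys, the reverse of the order used in the table above; the value of $\chi^2$ does not depend on the order.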
Ex. 6. Four dice were thrown 112 times and the number of times 1, 3 or
5 was thrown were as under:

No. of dice throwing 1, 3 or 5 :   0    1    2    3    4
Frequency                      :  10   25   40   30    7

Find the value of chi-square presuming that all dice were fair.

We assume all dice were fair.

Let the event "1, 3 or 5" be called 'success',
∴ the probability of 'success' in a throw of a die is $p = \frac{3}{6} = \frac{1}{2}$.
Then 'No. of dice throwing 1, 3 or 5' means 'No. of successes' in
four trials. Therefore

$p_0$ = probability of 0 successes in 4 trials $= {}^4C_0 \left(\frac{1}{2}\right)^0 \left(\frac{1}{2}\right)^{4-0} = \frac{1}{16}$

$p_1$ = probability of 1 success in 4 trials $= {}^4C_1 \left(\frac{1}{2}\right)^1 \left(\frac{1}{2}\right)^{4-1} = \frac{1}{4}$

$p_2$ = probability of 2 successes in 4 trials $= {}^4C_2 \left(\frac{1}{2}\right)^2 \left(\frac{1}{2}\right)^{4-2} = \frac{3}{8}$

$p_3 = {}^4C_3 \left(\frac{1}{2}\right)^3 \left(\frac{1}{2}\right) = \frac{1}{4}$

$p_4 = {}^4C_4 \left(\frac{1}{2}\right)^4 \left(\frac{1}{2}\right)^0 = \frac{1}{16}$

Let $f_o$ = observed frequency of the success,
$f_e$ = expected frequency of the success.
$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$ is a $\chi^2$ variate with 5 - 1 = 4 d.o.f.
Ie
Computation of the $\chi^2$ variate

Class              f_o   Probability   f_e           f_o - f_e   (f_o - f_e)^2   (f_o - f_e)^2 / f_e
(No. of success)                       = (3) x 112
(1)                (2)   (3)           (4)           (5)         (6)             (7)

0                   10   1/16            7             3           9             1.286
1                   25   1/4            28            -3           9             0.321
2                   40   3/8            42            -2           4             0.095
3                   30   1/4            28             2           4             0.143
4                    7   1/16            7             0           0             0
Total              112   1             112             -           -             1.845

∴ $\chi^2$ = 1.845 with 4 degrees of freedom.



15.9. Test for Independence of Attributes


A characteristic of statistical information which cannot be expressed
as a quantity but can be expressed as a quality is called an attribute.
Let A and B be two attributes; A is shown in the categories
$A_1, A_2, \ldots, A_m$ and B in the categories $B_1, B_2, \ldots, B_n$. For example, let
A ≡ 'Family condition of students in an Engineering college'
B ≡ 'Results in the Final Examination'
Then A may be shown in the categories
$A_1$ ≡ 'Poor class',
$A_2$ ≡ 'Middle class',
$A_3$ ≡ 'Rich class'
whereas B may be shown in the categories $B_1$ ≡ Fail, $B_2$ ≡ 1st class,
$B_3$ ≡ 2nd class, $B_4$ ≡ 3rd class.
The frequencies of each category obtained from a random sample are
placed in a table containing m x n cells.
This table is known as a Contingency Table.
As an example, a contingency table of A and B (of the above example) is

Results (B)            Family condition (A)           Total
                   Poor     Middle class    Rich
                   A_1      A_2             A_3
Fail        B_1      5        4               3         12
1st class   B_2      6        4               3         13
2nd class   B_3     10        8               2         20
3rd class   B_4     12       11               4         27
Total               33       27              12         72

that is, the frequency under the category $A_1B_1$ is 5, i.e. the number of failed
students coming from the poor class is 5, etc.
Now from these observed frequencies we are going to test whether
the attributes A and B are independent or not, i.e. whether one of the two
attributes has influence on another.


This test is performed by using the $\chi^2$ (chi-square) distribution. If the null
hypothesis is taken as $H_0$ (attributes A and B are independent) then it can
be shown that
$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$
is a $\chi^2$ variate with (r - 1)(c - 1) degrees of freedom, where
r = number of rows in the contingency table,
c = number of columns in the contingency table,
$f_o$ = observed frequency in the contingency table,
$f_e$ = expected frequency
$= \frac{\text{Row total of a category} \times \text{Column total of that category}}{\text{Total frequency}}$.
In the above table, the expected frequency $f_e$ of the category $A_2B_1$
is $\frac{12 \times 27}{72} = 4.5$.

If the computed value of $\chi^2$ lies in the critical region then $H_0$ is
rejected, otherwise accepted. See the subsequent Illustrative Examples.
For a 2 x 2 contingency table the above form of the $\chi^2$ variate becomes
simpler. If the table is

  B \ A                       Total
    -        α       β        R_1
    -        γ       δ        R_2
  Total     C_1     C_2        N

($R_1 = \alpha + \beta$, $R_2 = \gamma + \delta$; $C_1 = \alpha + \gamma$, $C_2 = \beta + \delta$, $N = R_1 + R_2 = C_1 + C_2$)

then $\chi^2$ becomes
$$\chi^2 = \frac{N(\alpha\delta - \beta\gamma)^2}{R_1 R_2 C_1 C_2}$$
with (2 - 1) x (2 - 1) = 1 degree of freedom. But here a correction is
necessary since the degree of freedom becomes 1. This correction is
known as Yates' correction.

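Yates says:

(i) if $\alpha\delta > \beta\gamma$, replace $\alpha, \delta$ by $\left(\alpha - \frac{1}{2}\right), \left(\delta - \frac{1}{2}\right)$ and $\beta, \gamma$ by
$\left(\beta + \frac{1}{2}\right), \left(\gamma + \frac{1}{2}\right)$;

(ii) if $\alpha\delta < \beta\gamma$, replace $\alpha, \delta$ by $\left(\alpha + \frac{1}{2}\right), \left(\delta + \frac{1}{2}\right)$ and $\beta, \gamma$ by
$\left(\beta - \frac{1}{2}\right), \left(\gamma - \frac{1}{2}\right)$.

Thus after Yates' correction, for the case (i),
$$\text{Corrected } \chi^2 = \frac{N\left\{\alpha\delta - \frac{1}{2}\alpha - \frac{1}{2}\delta + \frac{1}{4} - \beta\gamma - \frac{1}{2}\beta - \frac{1}{2}\gamma - \frac{1}{4}\right\}^2}{R_1 R_2 C_1 C_2}$$
$$= \frac{N\left\{(\alpha\delta - \beta\gamma) - \frac{1}{2}(\alpha + \beta + \gamma + \delta)\right\}^2}{R_1 R_2 C_1 C_2} = \frac{N\left\{(\alpha\delta - \beta\gamma) - \frac{1}{2}N\right\}^2}{R_1 R_2 C_1 C_2}$$
and for the case (ii),
$$\text{Corrected } \chi^2 = \frac{N\left\{(\beta\gamma - \alpha\delta) - \frac{1}{2}N\right\}^2}{R_1 R_2 C_1 C_2}.$$
That is, considering both cases,
$$\text{Corrected } \chi^2 = \frac{N\left\{|\alpha\delta - \beta\gamma| - \frac{1}{2}N\right\}^2}{R_1 R_2 C_1 C_2}.$$
See the following Illustrative Examples.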


15.10. Illustrative Examples.


Example 1. (For a contingency table of size $\ne$ 2 x 2)
A random sample of 500 students who appeared in an entrance examination
is classified as follows:

Result               Economic condition               Total
               Poor    Middle class   Well-to-do
Successful     160         30            10            200
Unsuccessful   140        120            40            300
Total          300        150            50            500

Test at the 5% level whether 'Result' and 'Economic condition' are
independent or not. [Given $\chi^2_{.05} = 5.99$ for 2 d.o.f.]
Solution. We take the null hypothesis $H_0$ (the two attributes are
independent). The observed frequencies ($f_o$) are given in the following
2 x 3 contingency table:

Result               Economic condition               Total
               Poor    Middle class   Well-to-do
Successful     160         30            10            200
Unsuccessful   140        120            40            300
Total          300        150            50            500

The expected frequency of a category $= \frac{\text{Row total} \times \text{Column total}}{\text{Total frequency}}$.

The expected frequency ($f_e$) of each category:

Result                       Economic condition                                Total
               Poor                 Middle class          Well-to-do
Successful     200 x 300 / 500      200 x 150 / 500       200 x 50 / 500        200
               = 120                = 60                  = 20
Unsuccessful   300 x 300 / 500      300 x 150 / 500       300 x 50 / 500        300
               = 180                = 90                  = 30
Total          300                  150                   50                    500

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = \frac{(160-120)^2}{120} + \frac{(30-60)^2}{60} + \frac{(10-20)^2}{20} + \frac{(140-180)^2}{180} + \frac{(120-90)^2}{90} + \frac{(40-30)^2}{30} = 55.56$$
with degrees of freedom (r - 1)(c - 1) = (2 - 1)(3 - 1) = 2.

Since $\chi^2_{.05} = 5.99$, ∴ $P(\chi^2 > 5.99) = .05$,
∴ the CR is $\chi^2 > 5.99$ at the .05 x 100 = 5% level.
Our computed value of $\chi^2$ is 55.56 > 5.99,
∴ this lies in the critical region. So $H_0$ is rejected at the 5% level of
significance. We conclude "'Result' and 'Economic condition' of the
students are not independent; they are associated".
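
The computation in Example 1 can be reproduced for a table of any size. The following is a minimal sketch; the function name `chi_square_independence` is an illustrative choice. It forms each expected frequency as (row total x column total) / total frequency and then sums $(f_o - f_e)^2 / f_e$ over all cells.

```python
def chi_square_independence(table):
    """Return (expected table, chi-square, degrees of freedom) for an r x c
    contingency table given as a list of rows of observed frequencies."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected frequency = row total x column total / total frequency
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(table, expected)
        for fo, fe in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, chi2, dof


# Example 1: rows = Successful, Unsuccessful; columns = Poor, Middle class, Well-to-do
observed = [[160, 30, 10], [140, 120, 40]]
expected, chi2, dof = chi_square_independence(observed)
print(expected)
print(round(chi2, 2), dof)
```

No continuity correction is applied in this sketch, so for a 2 x 2 table it gives the uncorrected value of $\chi^2$.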
Example 2. (For a 2 x 2 contingency table)
In an experiment on the immunization of cattle against tuberculosis the
following results were obtained:

                    Affected    Not affected
Inoculated      :      12           25
Not inoculated  :      16            6

Calculate $\chi^2$ and discuss the effect of the vaccine in controlling
susceptibility to tuberculosis.
[The 5% value of $\chi^2$ for one degree of freedom is 3.84]


Solution. We take the null hypothesis $H_0$ (the vaccine has no effect),
i.e. $H_0$ ('immunization' and 'affection' are independent).
The observed frequencies ($f_o$) are given in the following 2 x 2
contingency table:

                    Affected     Not affected    Total
Inoculated      :   12 = α       25 = β          37 = R_1
Not inoculated  :   16 = γ        6 = δ          22 = R_2
Total           :   28 = C_1     31 = C_2        59 = N

After Yates' correction,
$$\text{the corrected } \chi^2 = \frac{N\left\{|\alpha\delta - \beta\gamma| - \frac{1}{2}N\right\}^2}{R_1 R_2 C_1 C_2} = \frac{59\left\{|12 \times 6 - 25 \times 16| - \frac{59}{2}\right\}^2}{37 \times 22 \times 28 \times 31} = 7.44$$
with degrees of freedom (r - 1)(c - 1) = (2 - 1)(2 - 1) = 1.
Given: the 5% value of $\chi^2$ for 1 d.o.f. is 3.84.
So the area under the chi-square curve (with 1 d.o.f.) enclosed between the
ordinates $\chi^2 = 3.84$ and $\chi^2 = \infty$ is .05,
∴ the CR is $\chi^2 > 3.84$ at the 5% level.
The computed value of $\chi^2$ = 7.44 > 3.84.
That is, this falls in the CR. So $H_0$ is rejected at the 5% level of significance.
We conclude "The vaccine has an effect in controlling susceptibility to
tuberculosis".
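
The corrected formula for a 2 x 2 table is easy to evaluate directly. The sketch below is illustrative only (the function name `yates_chi_square` is not from the text); its arguments a, b, c, d stand for α, β, γ, δ in the notation of Section 15.9.

```python
def yates_chi_square(a, b, c, d):
    """Yates-corrected chi-square for the 2 x 2 table [[a, b], [c, d]],
    using N * (|ad - bc| - N/2)^2 / (R1 * R2 * C1 * C2)."""
    r1, r2 = a + b, c + d  # row totals
    c1, c2 = a + c, b + d  # column totals
    n = r1 + r2            # total frequency
    return n * (abs(a * d - b * c) - n / 2) ** 2 / (r1 * r2 * c1 * c2)


# Example 2: inoculated (12 affected, 25 not), not inoculated (16 affected, 6 not)
chi2 = yates_chi_square(12, 25, 16, 6)
print(round(chi2, 2))  # 7.44
```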
Exercise 15
1. Prices of shares of a company on different days in a month
were found to be 66, 65, 69, 70, 69, 71, 70, 63, 64 and 68. Discuss
whether the average price of the shares in the month is 65.

[Hint: First find $\bar{x} = 67.5$, $S^2 = \frac{70.5}{9}$]

2. A machinist is making engine parts with axle diameters of 0.700
inch. A random sample of 10 parts shows a mean diameter of 0.742 inch
with a standard deviation of 0.040 inch. Compute the statistic you would
use to test whether the work is meeting the specification. Also state how
you would proceed further.
3. Ten cartons are taken at random from an automatic filling machine.
The mean net weight of the 10 cartons is 11.8 kg and the standard deviation
is 0.15 kg. Does it support the claim that the average net weight of all cartons
should be 12 kg? You are given that $t_{0.05} = 2.26$ for 9 d.o.f.
4. An automobile tyre manufacturer claims that the average life of a
particular grade of tyre is more than 20,000 km when used under normal
driving conditions. A random sample of 16 tyres was tested and a mean
and s.d. of 22,000 and 5,000 km respectively were computed. Assuming
the lives of tyres in km to be approximately normally distributed, decide
whether the manufacturer's product is as good as claimed. Use the 5% level.
[Given $t_{0.05;15} = 1.75$] [Hint: $H_0(\mu = 20{,}000)$, $H_1(\mu > 20{,}000)$]

5. Two independent random samples of observations are drawn from
two normal populations and the following information is obtained:

                  First population    Second population
Sample size             10                  12
Sample mean             20                  27
Population s.d.          8                   6

Can we conclude that the first population's mean is smaller than that of the
second population? Use the 1% level of significance. Given: the area under the
standard normal curve enclosed between the ordinates z = 0 and z = 2.33
is 0.49.

[Hint: $\sigma_1, \sigma_2$ are known; take $H_0(\mu_1 = \mu_2)$, $H_1(\mu_1 < \mu_2)$]

6. Two samples of 6 and 5 items respectively gave the following data:

Mean of the first sample = 40, s.d. of the first sample = 8
Mean of the second sample = 50, s.d. of the second sample = 10
Is the difference of the means of the two normal populations
significant? (The value of t for 9 d.o.f. at the 5% level is 2.26)
[Hint: The given data means $P(|t| > 2.26) = .05$]

7. Two independent random samples of sizes 8 and 6 having means
38.4 and 33.7 are drawn from two normal populations. The sums of the
squares of deviations from the respective sample means are 20.8 and 15.7.
Do you think the population mean of the first is the larger?
Given $t_{.01} = 2.68$, $t_{.005} = 3.06$ for 12 d.o.f.
8. Ten soldiers visit a rifle range for two consecutive weeks. Their
scores are:

Soldier  :  A   B   C   D   E   F   G   H   I   J
1st week : 67  24  57  55  63  54  56  68  33  43
2nd week : 70  38  58  58  56  67  68  72  42  38

Examine if there is any significant improvement in their performance.
Assume the scores are normally distributed with the same s.d. Given
$t_{.01} = 2.82$ for 9 d.o.f.
9. Two independent samples are drawn from two normal populations with
common s.d. The following are observed:

                 Size    Mean    Variance
First sample       9      600      121
Second sample      8      640      144

Test at the 5% level of significance whether the two populations have the
same mean. Given $t_{.025;15} = 2.13$.
10. What is the test statistic which is used to test the equality of the
means of two normal populations when their common s.d. is unknown?

11. A sample of size 10 is drawn from each of two normal populations with
the same unknown variance and the following results are obtained:

              Mean    Variance
Sample I        7        26
Sample II       4        10

Test at 5% significance if the two populations have the same mean.
Given the following data:

Statistic    d.o.f.    Value at 5% level
t              18        2.101
z               -        1.96

12. A die is rolled 60 times and the following results are observed:

Face-point            :  1    2    3    4    5    6   Total
No. of times occurred :  6   10    8   13   11   12    60

Are the data consistent with the hypothesis that the die is unbiased?
(Given $\chi^2_{.01} = 15.09$ for 5 degrees of freedom.)

13. In an experiment on pea-breeding the following results are obtained:
round and yellow - 315; wrinkled and yellow - 101; round and green
- 108; wrinkled and green - 32; total - 556. According to theory we
should get the frequency ratio 9 : 3 : 3 : 1 for the above mentioned
varieties. Examine at the 5% level of significance whether the results obtained
in the experiment agree with the theory. The 5% value of $\chi^2$ for 3 d.o.f. is
7.815.
14. 200 digits from 0 to 9 are taken at random from a page of a certain
random number table. The frequency distribution of the digits is given:

Digit     :   0   1   2   3   4   5   6   7   8   9
Frequency :  18  19  13  21  16  25  22  20  21  25

Can this be regarded as random? Given $\chi^2_{.05;9} = 16.92$.

[Hint: Random number means that the drawing of every digit would be
equiprobable, i.e. the probabilities of the digits 0, 1, 2, ... are
$\frac{1}{10}, \frac{1}{10}, \frac{1}{10}, \ldots$ respectively]
15. In 60 throws of a die, face one turned up 6 times, face two or
three 18 times, face four or five 24 times, and face six 12 times. Test at the
10% significance level if the die is honest, it being given that
$P(\chi^2 > 6.25) = 0.1$ for 3 degrees of freedom.

16. In 360 tosses of a pair of dice, 74 sevens and 24 elevens are
observed. Using the 0.05 significance level, test the hypothesis that the dice
are fair. Given $\chi^2_{.05} = 3.84$ for 1 d.o.f.
17. The wages of factory workers are assumed to be normally distributed
with mean m and variance 25. A random sample of 25 workers gives the
total wages equal to 1250 units. Test the hypothesis m = 52 against the
alternative m = 49 at the 1% level of significance.
[Given $\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{-2.32} e^{-\mu^2/2}\, d\mu = 0.01$]

18. The sample mean of a random sample of size 10 is 12.1. Given
that the s.d. of the population is 3.2, can you conclude that the sample
comes from a normal population with mean 14.5? Test at the 5% level of
significance. Given $z_{.4750} = 1.96$. (Hint: Take $H_0(\mu = 14.5)$)
19. A soap manufacturing company was distributing a particular brand
of soap through a large number of retail shops. Before a heavy advertisement
campaign, the mean sales per week per shop was 140 dozens. After the
campaign, a sample of 26 shops was taken and the mean sales was found
to be 147 dozens with s.d. 16. Can you consider the advertisement effective?
[Given $t_{.05} = 1.71$ for 25 d.o.f.]
[Hint: $H_0(\mu = 140)$, $H_1(\mu > 140)$. If $H_0$ is rejected then $H_1$ will
be accepted, i.e. "the advertisement is effective".
Here n = 26, small (as it is < 30)]
20. A certain diet newly introduced to each of 12 pigs resulted in the
following increases in body weight:
6, 3, 8, -2, 3, 0, -1, 1, 6, 0, 5 and 4. Can we conclude that the diet is
effective in increasing the weight of the pigs?
[Given $t_{.05;11} = 2.20$] [Hint: Take $H_0(\mu = 0)$, $H_1(\mu > 0)$]
21. A drug is given to 10 patients and the increases in their blood pressure
were recorded to be 3, 6, -2, 4, -3, 4, 6, 0, 0, 2. Is it reasonable to believe
that the drug has no effect on change of blood pressure?
[The 5% value of t for 9 d.o.f. is 2.26]
22. There are two populations, population I and population II. The
standard deviations of the two populations are 8 and 6 respectively. Two
independent samples of sizes 10 and 12 are drawn from the two
populations respectively, with sample means 20 and 27. Test at the 1% level
of significance whether the two population means are equal. Given
$P(|z| > 2.58) = .01$.
23. 10 and 25 observations are drawn at random from two populations
respectively whose variances are 9.61 and 7.29. It is found that the means
of the two sets of observations are 23.0 and 20.3 respectively. Test at the 1%
significance level the hypothesis that the mean of the 1st population is
larger.

24. Two sets of ten students are selected at random from a college.
One set was given a memory test without any training and the other set
was given the same test after two weeks of training. The scores obtained
are given below:

Set A :  10   8   7   9   8  10   9   6   7   8
Set B :  12   8   8  10   8  11   9   8   9   9

Do you think there is any significant effect of the training?
(Given $t_{.05} = 2.10$ at 18 d.o.f.)

25. Two working designs are under consideration for adoption in a plant.
A time and motion study shows that 12 workers using design A have a
mean assembly time of 300 seconds with a standard deviation of 12
seconds and that 15 workers using design B have a mean assembly time
of 335 seconds with a s.d. of 15 seconds. Is the difference in the mean
assembly time between the two working designs significant at the 1% level
of significance?

26. The incomes of a random sample of engineers in industry A are
Rs. 630, 650, 680, 690, 690, 710 and 720 per month. The incomes of a
similar sample from industry B are Rs. 610, 620, 650, 660, 690, 700, 710,
720 and 730 per month. Discuss the validity of the suggestion that industry
A pays its engineers much better than industry B. Test at the 5% level. (Given
$t_{.05} = 2.145$ at the appropriate d.o.f., to be mentioned by you.)

27. Samples of sales in similar shops in towns A and B regarding a new
product yielded the following information:

For town A: $\bar{x}_1 = 3.45$, $\sum x_1 = 38$, $\sum x_1^2 = 228$, $n_1 = 11$

For town B: $\bar{x}_2 = 4.44$, $\sum x_2 = 40$, $\sum x_2^2 = 222$, $n_2 = 9$

Is there any evidence of a difference in the sales in the two towns?
Test at the 5% level. The value of t at the 5% level for 18 d.o.f. is 2.101.
[Hint: i.e. $P(|t| > 2.101) = .05$; $S_1^2 = \frac{\sum x_1^2}{11} - \left(\frac{\sum x_1}{11}\right)^2 = 8.79$ etc.]
28. 5 identical coins are tossed 320 times, and the number of heads
appearing each time is recorded as follows:

No. of heads :   0    1    2     3    4   5   Total
Frequency    :  14   45   80   112   61   8    320

(i) Are the coins unbiased? Given $\chi^2_{.05} = 11.07$ and $\chi^2_{.01} = 15.09$ for 5
degrees of freedom.
(ii) Is the sample from a binomial population? (Given $P(\chi^2 > 11.07) = 0.05$
for 5 d.o.f.)
29.A sample analysis of certain examination results of 200 students in
a school was made. It was found that 46 students failed, 68 secured third
division, 62 secured second division and rest were placed in first division.
Are these results commensurate with the annual examination results which
are in the ratio of 2 : 3 : 3 : 2 for above various categories respectively?
(The tabulated value of chi-square for 3 d.o.f at 5% level of
significance is 7.81)
30. Test whether the following distribution fits the Poisson distribution
with mean 0.6:

x         :    0    1    2   3   4   5 or more
Frequency :  140   75   25   6   3   1

Given $e^{-0.6} = 0.5488$, $\chi^2_{.05;3} = 7.81$, $\chi^2_{.05;4} = 9.49$, $\chi^2_{.05;5} = 11.07$.
31. A bird watcher sitting in a park has spotted a number of birds
belonging to 6 categories. The exact classification is given below.

Category  :  1   2    3    4   5   6
Frequency :  6   7   13   17   6   5

Test at the 5% level of significance whether or not the data are compatible
with the assumption that this particular park is visited by birds belonging
to these six categories in the proportion 1 : 1 : 2 : 3 : 1 : 1. [Given
$P(\chi^2 > 11.07) = 0.05$ for 5 d.o.f.]
32. The following table gives the number of aircraft accidents that
occurred during the various days of the week. Test whether the accidents
are uniformly distributed over the week using the 5% level of significance.

Day              :  Sun  Mon  Tue  Wed  Thu  Fri  Sat
No. of accidents :   13   14   18   12   11   15   14

Given $P(\chi^2 > 12.59) = 0.05$ for 6 d.o.f. [W.B.U.Tech. 2005, 2008]

[Hint: If the accidents are uniformly distributed over the week
then the number of accidents will be equal on every day, i.e. $\frac{97}{7}, \frac{97}{7}, \frac{97}{7}, \ldots$]

33. A discrete random variable X has the following distribution:

X :   0    1    2    3   4
f :  30   62   46   10   2

Use the chi-square test to determine whether X follows a binomial
distribution with p = 0.32. Test at the 5% level.
34. The following figures show the distribution of digits in numbers
chosen at random from a telephone directory:

Digit     :     0      1     2     3      4     5      6
Frequency :  1026   1107   997   966   1075   933   1107

Digit     :     7     8     9    Total
Frequency :   972   964   853   10,000

Test whether the digits may be taken to occur equally frequently in
the directory. Test at the 5% level of significance. Given $P(\chi^2 > 16.919) = 0.05$ for 9
d.o.f. [Hint: $f_e$ : 1000, 1000, 1000, ... etc.]
35. Out of 8,000 graduates in a town 800 are female, out of 1600
graduate employees 120 are females. Use X2 to determine if any distinction
is made in appointment on the basis of sex. Value of X2 at 5% level for
one degree of freedom is 3.84
36. Two researchers adopted different sampling techniques while
investigating the same group of students to find the number of students falling
in different intelligence levels. The results are as follows:

Researcher              No. of students in each level
             Below average   Average   Above average   Genius   Total
X                 86            60          44            10      200
Y                 40            33          25             2      100
Total            126            93          69            12      300

Would you say that the sampling techniques adopted by the two researchers
are really different?
[Given: the 5% values of $\chi^2$ for 3 d.o.f. and 4 d.o.f. are 7.82 and 9.49
respectively]
[Hint: $H_0$ (the data obtained are independent of the sampling technique
adopted by the researchers)]

37. The average numbers of articles produced by two machines per
day are 200 and 250 with standard deviations 20 and 25 respectively,
on the basis of records of 25 days' production. Can you regard both
the machines as equally efficient at the 1% level of significance?
Given $t_{.01;43} = 2.58$.
38. A certain drug was administered to 456 males out of a total of 720
in a certain locality to test its efficacy against typhoid. The incidence of
typhoid is shown below. Find out at the 10% level the effectiveness of the
drug against the disease.

                                 Infection   No infection   Total
Administering the drug         :    144          312          456
Without administering the drug :    192           72          264
Total                          :    336          384          720

Use the statistical table for your data.
[Hint: $H_0$ (incidence of typhoid and administration of drug are
independent)]
39. In the marketing of cosmetics, packaging is an important
consideration. One particular company has to decide which of the 4
suggested packages has to be used for a new product. 4 random samples
of 200 customers each are offered the new product, a different package
being used for each sample.
Do the following results indicate any difference in sales appeal between
the 4 packages?

              Pack 1   Pack 2   Pack 3   Pack 4
Number sold     38       55       45       42

[Given $\chi^2_{.05;3} = 7.815$]
[Hint: $H_0$ (sales appeal and packaging are independent)]
40. Test at the 1% level whether inoculation is effective from the
following data:

                   Attacked   Not attacked   Total
Inoculated     :      20          300          320
Not inoculated :      80          600          680
Total          :     100          900         1000

41. In a survey of 200 boys, of which 75 were intelligent, 40 had
educated fathers, while 85 of the unintelligent boys had uneducated fathers.
Do these figures support the hypothesis that educated fathers have
intelligent boys? Given: the value of $\chi^2$ for 1 degree of freedom is 3.841.
42. A manufacturing company of tubeless tyres wants to make a choice
amongst certain makes of tyres. It gathered information on the average
running life and bursting strength of tyres on the basis of samples drawn at
random from large lots of these makes. The information in regard to the
two makes A and B is given below. The firm wants to know if the
variances of the two makes are really different by applying the F-test at the 5%
level of significance.

                      Brand A   Brand B
Sample size       :     21        16
S.d.              :     2.5       1.5
Mean running life :    100        95

(Given $F_{.025;(20,15)} = 2.76$) [Hint: a both-sided test should be considered]
43. Two sources of raw materials are under consideration by a
company. Both sources seem to have similar characteristics, but the
company is not sure about their respective consistency. A sample of
10 items from source A gives a variance of 225 and a sample of 11
items from source B gives variance of 200. Test at 1% level of
significance whether the variance of source A is greater than the
variance of source B. [Given $F_{.01;(9,10)} = 4.93$, $F_{.01;(10,11)} = 5$]
44. In a sample of 8 observation, the sum of squared deviation of item
from the mean is 94.5. In another sample of 10 observation, the value
was found to be 101.7
Test at 5% level of significance whether the difference is significant.
[Given, for degree of freedom (7,9), the area under F-curve enclosed
between the ordinates F = 3.29 and F = 00 is .05]
45. Two random samples are drawn from two normal populations and
the following results were obtained:

Sample I  : 16 17 18 19 20 21 22 24 26 27
Sample II : 19 22 23 25 26 28 29 30 31 32 35 36

Test at the 5% level whether the two populations have the same s.d.
[Given $F_{.05;(9,10)} = 3.10$]

46. It is known that the average diameters of rivets produced by two
firms A and B are practically the same but the variances differ. For 22
rivets produced by firm A, the variance is 8.41, while for 16 rivets
manufactured by firm B, the variance is 14.44. Find at the 5% level whether
the product of firm A has the same variability as that of firm B.
[Given $F_{.05;(15,21)} = 2.20$]

[Hint: Take $S_1 = \sqrt{14.44}$ and $S_2 = \sqrt{8.41}$, as $\sqrt{14.44} > \sqrt{8.41}$]


47. A sample consisting of 19 pairs of observations gives a correlation
coefficient of 0.5 and another independent sample of 23 pairs shows a
correlation coefficient of 0.6. Are the two correlation coefficients
really different? Test at the 1% level of significance.
[Given $z_{.005} = 1.96$]
48. The correlation coefficient obtained from a random sample of 19
pairs of observations from a bivariate normal population is 0.8. Does it
support that the population correlation coefficient is 0.55? Given
$z_{.005} = 1.96$.
49. A random sample of 27 pair of observations drawn from a bivariate
normal population gives a correlation coefficients of 0.42. Test at 5% level
of significance whether the variables are uncorrelated in the population.
Given t.025 = 2.06 for 25 degrees of freedom.
50. The correlation coefficients 0.70 and 0.45 were obtained from two
independent random samples of 28 and 19 pairs of observations
respectively, drawn from bivariate normal populations. Do these results
support the hypothesis that the correlation coefficients in the two
populations are equal? Test at the 5% level of significance. Use the statistical table.
51. A random sample of 28 pairs of observations shows a correlation
coefficient 0.74. Is it reasonable to believe that the sample comes from a
bivariate normal population with correlation coefficient 0.6?
Test at the .05 level of significance.
[Given: the area under the standard normal curve between the ordinates
z = -1.96 and z = 1.96 is .95]
52. A correlation coefficient of 0.2 is discovered in a sample of 28
pairs. Find at the 5% level of significance whether the population has non-zero
correlation. Use the statistical table.


Answer
1. t = 2.825 with 9 degrees of freedom; "average price is not 65"
2. t = 3.15
3. t = -4; Does not support
4. t = 1.555 with 15 degrees of freedom; H0 is accepted
5. z = -2.28. First population mean may not be smaller than second
population mean.
6. No, i.e. H1 (μ1 ≠ μ2) is rejected; t = 1.67
7. Yes; t = 4.99
8. No; t = 2.04
9. t = -6.74; not same
11. t = 1.5; two means are equal
12. χ² = 3.4; the die is unbiased
13. χ² = 0.51; experimental results agree with the theory.
14. Yes; χ² = 6.3
15. Yes; χ² = 3.0
16. χ² = 4.07; not fair
17. z = -2; m = 52 is true at 1% level
18. No; |z| = 2.37
19. t = 2.19; H0 is rejected
20. t = 3.01 > 2.20. The diet is effective in increasing body weight
21. t = 2 < 2.26. The drug has no effect in change of blood pressure
22. z = -2.28; two means are equal
23. Yes; z = 2.41
24. No effect
25. t = 6.37
26. t = 0.099; suggestion not valid
27. t = -0.78; no difference in sales


28. (i) χ² = 10.36; coins are unbiased
    (ii) χ² = 10.36; the sample is from binomial distribution.
29. χ² = 8.43; No
31. χ² = 0.472; assumption is correct
32. χ² = 2.33; accidents are uniformly distributed over the week.
33. The fit is good
34. χ² = 5854; not equally frequent occurrence.
35. χ² = 13.89 with d.o.f. 1; "distinction is made in appointment"
36. χ² = 2.097 with d.o.f. 3; H0 is accepted
37. t = -7.65; H0 is rejected
38. χ² = 113.74; H0 is rejected
39. χ² = 4.530 with d.o.f. 3; H0 is accepted
41. χ² = 8.88; Yes, educated fathers have intelligent boys.
42. F = 2.73; the two population variances are same.
43. F = 1.13; the two population variances are same
44. F = 1.195. The difference of variance is insignificant
45. F = 1.195; the variability in two populations is same
46. F = 1.74; same variability
47. |z| = 0.43; not different
48. |z| = 1.92; yes
49. |t| = 2.31; no; correlated
50. Z = -1.19; 'equal correlation' is valid
51. Z = 1.285; "correlation is 0.6" is true
52. Z = 1.015; H0 is rejected
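
The statistics quoted in Answers 47 to 49 can be checked numerically. The following is a minimal sketch, assuming the usual Fisher z-transformation is the intended method for the tests on correlation coefficients; the function names are illustrative and do not come from the text.

```python
import math

def fisher_z(r):
    """Fisher's z-transformation of a sample correlation coefficient r."""
    return 0.5 * math.log((1 + r) / (1 - r))

def z_two_correlations(r1, n1, r2, n2):
    """Statistic for H0: rho1 = rho2, from two independent samples."""
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (fisher_z(r1) - fisher_z(r2)) / se

def z_one_correlation(r, n, rho0):
    """Statistic for H0: rho = rho0, where rho0 is not zero."""
    return (fisher_z(r) - fisher_z(rho0)) * math.sqrt(n - 3)

def t_zero_correlation(r, n):
    """Statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

print(round(abs(z_two_correlations(0.5, 19, 0.6, 23)), 2))  # Exercise 47: 0.43
print(round(z_one_correlation(0.8, 19, 0.55), 2))           # Exercise 48: 1.92
print(round(t_zero_correlation(0.42, 27), 2))               # Exercise 49: 2.31
```

The computed value is then compared with the critical value given in each exercise or read from the statistical tables.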


STATISTICAL TABLES

Shaded area is shown in the body of the table :


[Figures: (i) φ curve; (ii) χ² curve; (iii) t-curve; (iv) F-curve (5 percent F-points); (v) F-curve (1 percent F-points)]

Table (i). Standard Normal Distribution

x .00 .01 .02 .03 .04 .05 .06 .07 .08 .09

0.0 .5000 .4960 .4920 .4880 .4841 .4801 .4761 .4721 .4681 .4641
0.1 .4602 .4562 .4522 .4483 .4443 .4404 .4364 .4325 .4286 .4247
0.2 .4207 .4168 .4129 .4091 .4052 .4013 .3974 .3936 .3897 .3859
0.3 .3821 .3783 .3745 .3707 .3669 .3632 .3594 .3557 .3520 .3483
0.4 .3446 .3409 .3372 .3336 .3300 .3264 .3228 .3192 .3156 .3121

0.5 .3085 .3050 .3015 .2981 .2946 .2912 .2877 .2843 .2810 .2776
0.6 .2743 .2709 .2676 .2644 .2611 .2579 .2546 .2514 .2483 .2451
0.7 .2420 .2389 .2358 .2327 .2297 .2266 .2236 .2207 .2177 .2148
0.8 .2119 .2090 .2061 .2033 .2005 .1977 .1949 .1922 .1894 .1867
0.9 .1841 .1814 .1788 .1762 .1736 .1711 .1685 .1660 .1635 .1611

1.0 .1587 .1563 .1539 .1515 .1492 .1469 .1446 .1423 .1401 .1379
1.1 .1357 .1335 .1314 .1292 .1271 .1251 .1230 .1210 .1190 .1170
1.2 .1151 .1131 .1112 .1094 .1075 .1057 .1038 .1020 .1003 .0985
1.3 .0968 .0951 .0934 .0918 .0901 .0885 .0869 .0853 .0838 .0823
1.4 .0808 .0793 .0778 .0764 .0749 .0735 .0721 .0708 .0694 .0681

1.5 .0668 .0655 .0643 .0630 .0618 .0606 .0594 .0582 .0571 .0559
1.6 .0548 .0537 .0526 .0516 .0505 .0495 .0485 .0475 .0465 .0455
1.7 .0446 .0436 .0427 .0418 .0409 .0401 .0392 .0384 .0375 .0367
1.8 .0359 .0351 .0344 .0336 .0329 .0322 .0314 .0307 .0301 .0294
1.9 .0287 .0281 .0274 .0268 .0262 .0256 .0250 .0244 .0239 .0233

2.0 .0228 .0222 .0217 .0212 .0207 .0202 .0197 .0192 .0188 .0183
2.1 .0179 .0174 .0170 .0166 .0162 .0158 .0154 .0150 .0146 .0143
2.2 .0139 .0136 .0132 .0129 .0125 .0122 .0119 .0116 .0113 .0110
2.3 .0107 .0104 .0102 .0099 .0096 .0094 .0091 .0089 .0087 .0084
2.4 .0082 .0080 .0078 .0075 .0073 .0071 .0069 .0068 .0066 .0064

2.5 .0062 .0060 .0059 .0057 .0055 .0054 .0052 .0051 .0049 .0048
2.6 .0047 .0045 .0044 .0043 .0041 .0040 .0039 .0038 .0037 .0036
2.7 .0035 .0034 .0033 .0032 .0031 .0030 .0029 .0028 .0027 .0026
2.8 .0026 .0025 .0024 .0023 .0023 .0022 .0021 .0021 .0020 .0019
2.9 .0019 .0018 .0018 .0017 .0016 .0016 .0015 .0015 .0014 .0014

3.0 .0013 .0013 .0013 .0012 .0012 .0011 .0011 .0011 .0010 .0010
3.1 .0010 .0009 .0009 .0009 .0008 .0008 .0008 .0008 .0007 .0007
3.2 .0007 .0007 .0006 .0006 .0006 .0006 .0006 .0005 .0005 .0005
3.3 .0005 .0005 .0005 .0004 .0004 .0004 .0004 .0004 .0004 .0003
3.4 .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0002
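
Entries of Table (i) can be reproduced from the complementary error function. This is a minimal sketch, assuming the body of the table is the upper-tail area P(Z > x) of the standard normal distribution (the first entry .5000 at x = 0.0 is consistent with that reading); the function name is illustrative.

```python
import math

def upper_tail_area(x):
    """P(Z > x) for a standard normal variable Z."""
    return 0.5 * math.erfc(x / math.sqrt(2))

# Print a few entries in the same four-decimal form as the table.
for x in (0.0, 1.0, 1.96, 2.58):
    print(f"{x:4.2f}  {upper_tail_area(x):.4f}")
```

For x = 1.96 this gives .0250, the entry in row 1.9, column .06.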
Table (ii). χ²-Distribution

n \ α .99 .98 .95 .90 .80 .70 .50 .30 .20 .10 .05 .02 .01 .001

1 .000 .001 .004 .016 .064 .148 .455 1.074 1.642 2.706 3.841 5.412 6.635 10.827
2 .020 .040 .103 .211 .446 .713 1.386 2.408 3.219 4.605 5.991 7.824 9.210 13.815
3 .115 .185 .352 .584 1.005 1.424 2.366 3.665 4.642 6.251 7.815 9.837 11.345 16.266
4 .297 .429 .711 1.064 1.649 2.195 3.357 4.878 5.989 7.779 9.488 11.668 13.277 18.467
5 .554 .752 1.145 1.610 2.343 3.000 4.351 6.064 7.289 9.236 11.070 13.388 15.086 20.515

6 .872 1.134 1.635 2.204 3.070 3.828 5.348 7.231 8.558 10.645 12.592 15.033 16.812 22.457
7 1.239 1.564 2.167 2.833 3.822 4.671 6.346 8.383 9.803 12.017 14.067 16.622 18.475 24.322
8 1.646 2.032 2.733 3.490 4.594 5.527 7.344 9.524 11.030 13.362 15.507 18.168 20.090 26.125
9 2.088 2.532 3.325 4.168 5.380 6.393 8.343 10.656 12.242 14.684 16.919 19.679 21.666 27.877
10 2.558 3.059 3.940 4.865 6.179 7.267 9.342 11.781 13.442 15.987 18.307 21.161 23.209 29.588

11 3.053 3.609 4.575 5.578 6.989 8.148 10.341 12.899 14.631 17.275 19.675 22.618 24.725 31.264
12 3.571 4.178 5.226 6.304 7.807 9.034 11.340 14.011 15.812 18.549 21.026 24.054 26.217 32.909
13 4.107 4.765 5.892 7.042 8.634 9.926 12.340 15.119 16.985 19.812 22.362 25.472 27.688 34.528
14 4.660 5.368 6.571 7.790 9.467 10.821 13.339 16.222 18.151 21.064 23.685 26.873 29.141 36.123
15 5.229 5.985 7.261 8.547 10.307 11.721 14.339 17.322 19.311 22.307 24.996 28.259 30.578 37.697

16 5.812 6.614 7.962 9.312 11.152 12.624 15.338 18.418 20.465 23.542 26.296 29.633 32.000 39.252
17 6.408 7.255 8.672 10.085 12.002 13.531 16.338 19.511 21.615 24.769 27.587 30.995 33.409 40.790
18 7.015 7.906 9.390 10.865 12.857 14.440 17.338 20.601 22.760 25.989 28.869 32.346 34.805 42.312
19 7.633 8.567 10.117 11.651 13.716 15.352 18.338 21.689 23.900 27.204 30.144 33.687 36.191 43.820
20 8.260 9.237 10.851 12.443 14.578 16.266 19.337 22.775 25.038 28.412 31.410 35.020 37.566 45.315

21 8.897 9.915 11.591 13.240 15.445 17.182 20.337 23.858 26.171 29.615 32.671 36.343 38.932 46.797
22 9.542 10.600 12.338 14.041 16.314 18.101 21.337 24.939 27.301 30.813 33.924 37.659 40.289 48.268
23 10.196 11.293 13.091 14.848 17.187 19.021 22.337 26.018 28.429 32.007 35.172 38.968 41.638 49.728
24 10.856 11.992 13.848 15.659 18.062 19.943 23.337 27.096 29.553 33.196 36.415 40.270 42.980 51.179
25 11.524 12.697 14.611 16.473 18.940 20.867 24.337 28.172 30.675 34.382 37.652 41.566 44.314 52.620

26 12.198 13.409 15.379 17.292 19.820 21.792 25.336 29.246 31.795 35.563 38.885 42.856 45.642 54.052
27 12.879 14.125 16.151 18.114 20.703 22.719 26.336 30.319 32.912 36.741 40.113 44.140 46.963 55.476
28 13.565 14.847 16.928 18.939 21.588 23.647 27.336 31.391 34.027 37.916 41.337 45.419 48.278 56.893
29 14.256 15.574 17.708 19.768 22.475 24.577 28.336 32.461 35.139 39.087 42.557 46.693 49.588 58.302
30 14.953 16.306 18.493 20.599 23.364 25.508 29.336 33.530 36.250 40.256 43.773 47.962 50.892 59.703
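
Table (ii) stops at n = 30. For a rough check of an entry, or for larger n, the Wilson-Hilferty cube-root approximation can be used. This is a sketch of that approximation only, not the method by which the table was computed, and the function name is illustrative. It assumes the column headings are upper-tail areas α, so that each entry x satisfies P(χ² > x) = α.

```python
from statistics import NormalDist

def chi2_upper_point(alpha, n):
    """Approximate x with P(chi-square with n d.o.f. > x) = alpha."""
    z = NormalDist().inv_cdf(1 - alpha)   # standard normal upper alpha point
    c = 2 / (9 * n)
    return n * (1 - c + z * c ** 0.5) ** 3

# Compare with the .05 column of the table for n = 10, 20, 30.
for n in (10, 20, 30):
    print(n, round(chi2_upper_point(0.05, n), 2))
```

For the rows checked here the approximation agrees with the tabulated values to within a few hundredths; it is less accurate for very small n.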

Table (iii). t-Distribution

n \ α .45 .40 .35 .30 .25 .20 .15 .10 .05 .025 .01 .005 .0005

1 .158 .325 .510 .727 1.000 1.376 1.963 3.078 6.314 12.706 31.821 63.657 636.619
2 .142 .289 .445 .617 .816 1.061 1.386 1.886 2.920 4.303 6.965 9.925 31.598
3 .137 .277 .424 .584 .765 .978 1.250 1.638 2.353 3.182 4.541 5.841 12.924
4 .134 .271 .414 .569 .741 .941 1.190 1.533 2.132 2.776 3.747 4.604 8.610
5 .132 .267 .408 .559 .727 .920 1.156 1.476 2.015 2.571 3.365 4.032 6.869

6 .131 .265 .404 .553 .718 .906 1.134 1.440 1.943 2.447 3.143 3.707 5.959
7 .130 .263 .402 .549 .711 .896 1.119 1.415 1.895 2.365 2.998 3.499 5.408
8 .130 .262 .399 .546 .706 .889 1.108 1.397 1.860 2.306 2.896 3.355 5.041
9 .129 .261 .398 .543 .703 .883 1.100 1.383 1.833 2.262 2.821 3.250 4.781
10 .129 .260 .397 .542 .700 .879 1.093 1.372 1.812 2.228 2.764 3.169 4.587

11 .129 .260 .396 .540 .697 .876 1.088 1.363 1.796 2.201 2.718 3.106 4.437
12 .128 .259 .395 .539 .695 .873 1.083 1.356 1.782 2.179 2.681 3.055 4.318
13 .128 .259 .394 .538 .694 .870 1.079 1.350 1.771 2.160 2.650 3.012 4.221
14 .128 .258 .393 .537 .692 .868 1.076 1.345 1.761 2.145 2.624 2.977 4.140
15 .128 .258 .393 .536 .691 .866 1.074 1.341 1.753 2.131 2.602 2.947 4.073

16 .128 .258 .392 .535 .690 .865 1.071 1.337 1.746 2.120 2.583 2.921 4.014
17 .128 .257 .392 .534 .689 .863 1.069 1.333 1.740 2.110 2.567 2.898 3.965
18 .127 .257 .392 .534 .688 .862 1.067 1.330 1.734 2.101 2.552 2.878 3.922
19 .127 .257 .391 .533 .688 .861 1.066 1.328 1.729 2.093 2.539 2.861 3.883
20 .127 .257 .391 .533 .687 .860 1.064 1.325 1.725 2.086 2.528 2.845 3.850

21 .127 .257 .391 .532 .686 .859 1.063 1.323 1.721 2.080 2.518 2.831 3.819
22 .127 .256 .390 .532 .686 .858 1.061 1.321 1.717 2.074 2.508 2.819 3.792
23 .127 .256 .390 .532 .685 .858 1.060 1.319 1.714 2.069 2.500 2.807 3.767
24 .127 .256 .390 .531 .685 .857 1.059 1.318 1.711 2.064 2.492 2.797 3.745
25 .127 .256 .390 .531 .684 .856 1.058 1.316 1.708 2.060 2.485 2.787 3.725

26 .127 .256 .390 .531 .684 .856 1.058 1.315 1.706 2.056 2.479 2.779 3.707
27 .127 .256 .389 .531 .684 .855 1.057 1.314 1.703 2.052 2.473 2.771 3.690
28 .127 .256 .389 .530 .683 .855 1.056 1.313 1.701 2.048 2.467 2.763 3.674
29 .127 .256 .389 .530 .683 .854 1.055 1.311 1.699 2.045 2.462 2.756 3.659
30 .127 .256 .389 .530 .683 .854 1.055 1.310 1.697 2.042 2.457 2.750 3.646

40 .126 .255 .388 .529 .681 .851 1.050 1.303 1.684 2.021 2.423 2.704 3.551
60 .126 .254 .387 .527 .679 .848 1.046 1.296 1.671 2.000 2.390 2.660 3.460
120 .126 .254 .386 .526 .677 .845 1.041 1.289 1.658 1.980 2.358 2.617 3.373
∞ .126 .253 .385 .524 .674 .842 1.036 1.282 1.645 1.960 2.326 2.576 3.291
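
An entry of Table (iii) can be checked by integrating the t density numerically. This is a minimal sketch, assuming the column headings are one-tail areas P(T > t); Simpson's rule is used here only for illustration, and the function names and the step count are arbitrary choices.

```python
import math

def t_density(x, n):
    """Density of Student's t with n degrees of freedom."""
    c = math.gamma((n + 1) / 2) / (math.sqrt(n * math.pi) * math.gamma(n / 2))
    return c * (1 + x * x / n) ** (-(n + 1) / 2)

def t_upper_tail(t, n, steps=2000):
    """P(T > t) for t >= 0, by Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_density(0, n) + t_density(t, n)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, n)
    return 0.5 - total * h / 3

print(round(t_upper_tail(2.228, 10), 3))  # row n = 10, column .025
print(round(t_upper_tail(2.060, 25), 3))  # row n = 25, column .025
```

Both calls return 0.025 to three decimals, in agreement with the column heading.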


Table (iv). F-Distribution: 5 Percent Points

n \ m 1 2 3 4 5 6 8 12 24 ∞

1 161.4 199.5 215.7 224.6 230.2 234.0 238.9 243.9 249.0 254.3
2 18.51 19.00 19.16 19.25 19.30 19.33 19.37 19.41 19.45 19.50
3 10.13 9.55 9.28 9.12 9.01 8.94 8.84 8.74 8.64 8.53
4 7.71 6.94 6.59 6.39 6.26 6.16 6.04 5.91 5.77 5.63
5 6.61 5.79 5.41 5.19 5.05 4.95 4.82 4.68 4.53 4.36

6 5.99 5.14 4.76 4.53 4.39 4.28 4.15 4.00 3.84 3.67
7 5.59 4.74 4.35 4.12 3.97 3.87 3.73 3.57 3.41 3.23
8 5.32 4.46 4.07 3.84 3.69 3.58 3.44 3.28 3.12 2.93
9 5.12 4.26 3.86 3.63 3.48 3.37 3.23 3.07 2.90 2.71
10 4.96 4.10 3.71 3.48 3.33 3.22 3.07 2.91 2.74 2.54

11 4.84 3.98 3.59 3.36 3.20 3.09 2.95 2.79 2.61 2.40
12 4.75 3.88 3.49 3.26 3.11 3.00 2.85 2.69 2.50 2.30
13 4.67 3.80 3.41 3.18 3.02 2.92 2.77 2.60 2.42 2.21
14 4.60 3.74 3.34 3.11 2.96 2.85 2.70 2.53 2.35 2.13
15 4.54 3.68 3.29 3.06 2.90 2.79 2.64 2.48 2.29 2.07

16 4.49 3.63 3.24 3.01 2.85 2.74 2.59 2.42 2.24 2.01
17 4.45 3.59 3.20 2.96 2.81 2.70 2.55 2.38 2.19 1.96
18 4.41 3.55 3.16 2.93 2.77 2.66 2.51 2.34 2.15 1.92
19 4.38 3.52 3.13 2.90 2.74 2.63 2.48 2.31 2.11 1.88
20 4.35 3.49 3.10 2.87 2.71 2.60 2.45 2.28 2.08 1.84

21 4.32 3.47 3.07 2.84 2.68 2.57 2.42 2.25 2.05 1.81
22 4.30 3.44 3.05 2.82 2.66 2.55 2.40 2.23 2.03 1.78
23 4.28 3.42 3.03 2.80 2.64 2.53 2.38 2.20 2.00 1.76
24 4.26 3.40 3.01 2.78 2.62 2.51 2.36 2.18 1.98 1.73
25 4.24 3.38 2.99 2.76 2.60 2.49 2.34 2.16 1.96 1.71

26 4.22 3.37 2.98 2.74 2.59 2.47 2.32 2.15 1.95 1.69
27 4.21 3.35 2.96 2.73 2.57 2.46 2.30 2.13 1.93 1.67
28 4.20 3.34 2.95 2.71 2.56 2.44 2.29 2.12 1.91 1.65
29 4.18 3.33 2.93 2.70 2.54 2.43 2.28 2.10 1.90 1.64
30 4.17 3.32 2.92 2.69 2.53 2.42 2.27 2.09 1.89 1.62

40 4.08 3.23 2.84 2.61 2.45 2.34 2.18 2.00 1.79 1.51
60 4.00 3.15 2.76 2.52 2.37 2.25 2.10 1.92 1.70 1.39
120 3.92 3.07 2.68 2.45 2.29 2.17 2.02 1.83 1.61 1.25
∞ 3.84 2.99 2.60 2.37 2.21 2.10 1.94 1.75 1.52 1.00


Table (v). F-Distribution: 1 Percent Points

n \ m 1 2 3 4 5 6 8 12 24 ∞

1 4052 4999 5403 5625 5764 5859 5982 6106 6234 6366
2 98.50 99.00 99.17 99.25 99.30 99.33 99.37 99.42 99.46 99.50
3 34.12 30.82 29.46 28.71 28.24 27.91 27.49 27.05 26.60 26.12
4 21.20 18.00 16.69 15.98 15.52 15.21 14.80 14.37 13.93 13.46
5 16.26 13.27 12.06 11.39 10.97 10.67 10.29 9.89 9.47 9.02

6 13.74 10.92 9.78 9.15 8.75 8.47 8.10 7.72 7.31 6.88
7 12.25 9.55 8.45 7.85 7.46 7.19 6.84 6.47 6.07 5.65
8 11.26 8.65 7.59 7.01 6.63 6.37 6.03 5.67 5.28 4.86
9 10.56 8.02 6.99 6.42 6.06 5.80 5.47 5.11 4.73 4.31
10 10.04 7.56 6.55 5.99 5.64 5.39 5.06 4.71 4.33 3.91

11 9.65 7.20 6.22 5.67 5.32 5.07 4.74 4.40 4.02 3.60
12 9.33 6.93 5.95 5.41 5.06 4.82 4.50 4.16 3.78 3.36
13 9.07 6.70 5.74 5.20 4.86 4.62 4.30 3.96 3.59 3.16
14 8.86 6.51 5.56 5.03 4.69 4.46 4.14 3.80 3.43 3.00
15 8.68 6.36 5.42 4.89 4.56 4.32 4.00 3.67 3.29 2.87

16 8.53 6.23 5.29 4.77 4.44 4.20 3.89 3.55 3.18 2.75
17 8.40 6.11 5.18 4.67 4.34 4.10 3.79 3.45 3.08 2.65
18 8.28 6.01 5.09 4.58 4.25 4.01 3.71 3.37 3.00 2.57
19 8.18 5.93 5.01 4.50 4.17 3.94 3.63 3.30 2.92 2.49
20 8.10 5.85 4.94 4.43 4.10 3.87 3.56 3.23 2.86 2.42

21 8.02 5.78 4.87 4.37 4.04 3.81 3.51 3.17 2.80 2.36
22 7.94 5.72 4.82 4.31 3.99 3.76 3.45 3.12 2.75 2.31
23 7.88 5.66 4.76 4.26 3.94 3.71 3.41 3.07 2.70 2.26
24 7.82 5.61 4.72 4.22 3.90 3.67 3.36 3.03 2.66 2.21
25 7.77 5.57 4.68 4.18 3.86 3.63 3.32 2.99 2.62 2.17

26 7.72 5.53 4.64 4.14 3.82 3.59 3.29 2.96 2.58 2.13
27 7.68 5.49 4.60 4.11 3.78 3.56 3.26 2.93 2.55 2.10
28 7.64 5.45 4.57 4.07 3.75 3.53 3.23 2.90 2.52 2.06
29 7.60 5.42 4.54 4.04 3.73 3.50 3.20 2.87 2.49 2.03
30 7.56 5.39 4.51 4.02 3.70 3.47 3.17 2.84 2.47 2.01

40 7.31 5.18 4.31 3.83 3.51 3.29 2.99 2.66 2.29 1.80
60 7.08 4.98 4.13 3.65 3.34 3.12 2.82 2.50 2.12 1.60
120 6.85 4.79 3.95 3.48 3.17 2.96 2.66 2.34 1.95 1.38
∞ 6.64 4.60 3.78 3.32 3.02 2.80 2.51 2.18 1.79 1.00
INDEX

Boole's inequality, 8
Bernoulli Trial, 46
Best fitted curve, 412
Best Critical Region, 479
Bivariate Data, 129
Bivariate, 352
Bivariate Probability Density function, 266

Certain Event, 2
Complementary Event, 2
Composite Event, 3
Composite Hypothesis, 473
Correlation Coefficient, 356
Conditional Density, 268
Conditional Distribution, 268
chi-square, 537

Data / Observations, 295
Disjoint, 2
density function, 195
Density Curve, 196
Discrete, 295

Event Points, 1
Event, 1
Exhaustive Events, 3
Exponential Distribution, 243
Exponential distribution, 248

Gamma Distribution, 247
Gamma density Curve, 250

Impossible Event, 2

Joint Distribution Function, 265

Kurtosis, 313

Leptokurtic distribution, 318

Multinomial distribution, 134
Marginal Distribution Function, 266
Mean, 298
Median, 301
Mode, 305

Normal Probability Density, 227
Normal Density Curve, 228
Null Hypothesis, 474
Normal Equations, 414

Poisson Distribution, 112
Platykurtic distribution, 318

Random Experiment, 1
Random Variable, 62
Rank Correlation, 383
Random Sampling, 441
Regression, 370
Regression Line, 371

Sample Points, 1
Simple Event, 3
Statistical Regularity, 440
Sampling Error, 440
Simple Hypothesis, 473
S.D, 198
stochastic variable, 62
Spectrum, 62
Skewness, 313
Significance of Kurtosis, 317

Variance, 198

Zero correlated, 356
