3.2 Characteristics of an Ideal Measure of Dispersion
An ideal measure of dispersion should satisfy the following characteristics :
i) It should be based on all observations in the data set.
ii) It should be easy to understand.
* These measures have the same units as the observations, e.g. cm, hours, etc., and are therefore called absolute measures of dispersion.
* Absolute measures of dispersion possess units and hence create difficulty in comparing the dispersion of two or more frequency distributions. They use the original units of the data and are most useful for understanding dispersion within the context of the experiment and its measurements.
L = Largest / highest value
S = Smallest / lowest value
Range = L - S
a. The range does not tell us the number of data points.
b. The range cannot be used to find the mean, median or mode.
c. The range cannot be used for open-ended distributions.
* When the difference Q3 - Q1 is divided by two, it is called the quartile deviation or semi-interquartile range :
Quartile deviation (semi-interquartile range) = (Q3 - Q1)/2
* For raw data with n observations, Q1 is the value of the ((n + 1)/4)th observation. With n = 10, Q1 = the value of the ((10 + 1)/4)th = 2.75th observation, which lies between the 2nd and 3rd observations in the given data set. Similarly, Q3 = the value of the (3(n + 1)/4)th = 8.25th observation, which lies between the 8th and 9th observations in the given data set, so
Q3 = 8th observation + 0.25 (9th observation - 8th observation)
For a frequency distribution, the class of Q1 is the class containing the (N/4)th observation (this value provides the class in which Q1 lies), and the class of Q3 is the class containing the (3N/4)th observation. The quartiles are then

Q1 = l + ((N/4 - C.F.)/f) × w,   Q3 = l + ((3N/4 - C.F.)/f) × w

where
N - total frequency
C.F. - cumulative frequency of the class preceding the quartile class
f - frequency of the quartile class
l - lower bound of the class in which the respective quartile lies
w - class width of the class in which the respective quartile lies

Solution : i) To find the range and coefficient of range :
Range = L - S = 108 - 14 = 94
Coefficient of range = (L - S)/(L + S) = 94/(108 + 14) = 94/122 ≈ 0.7704
ii) To find the quartile deviation and coefficient of quartile deviation :
We arrange the data in ascending order :
14, 19, 21, [22], 26, 29, 30, 62, 69, 74, 92, [100], 100, 104, 108
Here n = number of observations = 15
Q1 = the value of the ((n + 1)/4)th = ((15 + 1)/4)th = 4th observation = 22
Q3 = the value of the (3(n + 1)/4)th = 12th observation = 100
Quartile deviation = (Q3 - Q1)/2 = (100 - 22)/2 = 39
Coefficient of Q.D. = (Q3 - Q1)/(Q3 + Q1) = (100 - 22)/(100 + 22) = 78/122 ≈ 0.6393
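The arithmetic above can be cross-checked with a short sketch (Python is our choice of language here, not the book's; the quartile rule implemented is the (n + 1)/4 position rule used in the text):

```python
# Sketch: range and quartile deviation for the worked data set above,
# using the (n + 1)/4 interpolation rule from the text.
def quartile(sorted_data, k):
    """k-th quartile (k = 1 or 3) by the k*(n + 1)/4 position rule."""
    n = len(sorted_data)
    pos = k * (n + 1) / 4            # 1-based fractional position
    lo = int(pos)                    # whole observation just at/below the position
    frac = pos - lo
    if lo >= n:                      # position at or beyond the last observation
        return sorted_data[-1]
    # interpolate between the lo-th and (lo + 1)-th observations
    return sorted_data[lo - 1] + frac * (sorted_data[lo] - sorted_data[lo - 1])

data = sorted([14, 19, 21, 22, 26, 29, 30, 62, 69, 74, 92, 100, 100, 104, 108])
rng = data[-1] - data[0]                        # Range = L - S = 94
coeff_range = rng / (data[-1] + data[0])        # (L - S)/(L + S)
q1, q3 = quartile(data, 1), quartile(data, 3)   # 22 and 100
qd = (q3 - q1) / 2                              # quartile deviation = 39
coeff_qd = (q3 - q1) / (q3 + q1)                # 78/122
```

Running the sketch reproduces the example's values: Q1 = 22, Q3 = 100, quartile deviation 39.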
For raw data with n = 10 observations :
Q1 = the value of the ((10 + 1)/4)th = 2.75th observation.
The 2.75th observation lies between the 2nd and 3rd observations in the given data, so
Q1 = 2nd observation + 0.75 (3rd observation - 2nd observation)
= 12 + 0.75 × (17 - 12) = 12 + 3.75
Q1 = 15.75
Q3 = the value of the (3(10 + 1)/4)th = 8.25th observation
= 8th observation + 0.25 (9th observation - 8th observation)
= 23 + 0.25 (27 - 23) = 23 + 1
Q3 = 24
Quartile deviation = (Q3 - Q1)/2 = (24 - 15.75)/2 = 4.125
Coefficient of Q.D. = (Q3 - Q1)/(Q3 + Q1) = (24 - 15.75)/(24 + 15.75) ≈ 0.2075

For the frequency distribution with N = 180 :
Class of Q1 = the class containing the (N/4)th = 45th observation; therefore (20 - 30) is the Q1 class, and
Q1 = l + ((N/4 - C.F.)/f) × w = 25
Class of Q3 = the class containing the (3N/4)th = (3 × 180/4)th = 135th observation, and
Q3 = l + ((3N/4 - C.F.)/f) × w = 46.25
Quartile deviation = (Q3 - Q1)/2 = (46.25 - 25)/2 = 10.625
TECHNICAL PUBLICATIONS® - an up-thrust for knowledge
Statistics 3-11 Descriptive Statistics : Measures of Dispersion
Coefficient of quartile deviation = (Q3 - Q1)/(Q3 + Q1)   (for a frequency distribution)
= (46.25 - 25)/(46.25 + 25)
≈ 0.2982
Note : Coefficient of M.D. about mode = (M.D. about mode)/Mode
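The grouped-data quartile formula can be sketched the same way. The class frequencies below are invented for illustration only (chosen so that N = 180 and the Q1 class is (20 - 30), matching the example above; they are not the book's table):

```python
# Sketch of the grouped-data quartile formula Q_i = l + ((i*N/4 - CF)/f) * w.
# The classes below are hypothetical illustrative values, not the book's data.
def grouped_quartile(classes, i):
    """classes: list of (lower_bound, width, frequency); i = 1 or 3."""
    N = sum(f for _, _, f in classes)
    target = i * N / 4
    cf = 0                                   # cumulative frequency so far
    for lower, width, f in classes:
        if cf + f >= target:                 # quartile class found
            return lower + (target - cf) / f * width
        cf += f
    raise ValueError("target position beyond total frequency")

classes = [(0, 10, 10), (10, 10, 20), (20, 10, 30),
           (30, 10, 60), (40, 10, 40), (50, 10, 20)]   # N = 180
q1 = grouped_quartile(classes, 1)   # 45th observation falls in class (20 - 30)
q3 = grouped_quartile(classes, 3)   # 135th observation
```

With these assumed frequencies, Q1 = 20 + (45 - 30)/30 × 10 = 25, as in the example; Q3 depends on the invented frequencies and will differ from the book's 46.25.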
Mean Deviation
* We have seen that the range is calculated from the extreme observations only, while in the quartile deviation the 25 % extreme observations are left out while calculating Q1 and Q3.
* In both cases only a few observations are considered in the calculation and the remaining observations are ignored, so neither measure can give a clear picture of all the observations.
* Here we discuss a measure of dispersion which takes into account all the observations of the data set and can be calculated from an average. Such measures, calculated from an average, are known as "average deviations".
Definition : The arithmetic mean of the absolute deviations from any average (mean, median or mode) is called the Mean Deviation (M.D.) about the respective average.
i) Mean Deviation (M.D.) about mean :
M.D. = (1/n) Σ |xi - x̄|   (for raw data or individual observations)
M.D. = (1/N) Σ fi |xi - x̄|, where N = Σ fi   (for a frequency distribution)
Coefficient of M.D. about mean = (M.D. about mean)/Mean
ii) Mean Deviation (M.D.) about median :
M.D. = (1/n) Σ |xi - Median|   (for raw data or individual observations)
M.D. = (1/N) Σ fi |xi - Median|   (for a frequency distribution)
Coefficient of M.D. about median = (M.D. about median)/Median
iii) Mean Deviation (M.D.) about mode :
M.D. (about mode) = (1/n) Σ |xi - Mode|   (for raw data or individual observations)
M.D. (about mode) = (1/N) Σ fi |xi - Mode|   (for a frequency distribution)
From the worked solution :
M.D. about median = Σ fi |xi - Median| / N = 1.8571
Coefficient of M.D. about median = 1.8571/70 ≈ 0.0265
Coefficient of M.D. about mean = (M.D. about mean)/Mean ≈ 0.0281
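The definition can be sketched directly; the data values below are made up for illustration and are not the book's example:

```python
# Minimal sketch: mean deviation about an arbitrary average (mean, median
# or mode) for raw data. The data values are illustrative, not the book's.
def mean_deviation(data, about):
    return sum(abs(x - about) for x in data) / len(data)

data = [2, 4, 6, 8, 10]
mean = sum(data) / len(data)              # 6
md_mean = mean_deviation(data, mean)      # (4 + 2 + 0 + 2 + 4)/5 = 2.4
coeff = md_mean / mean                    # coefficient of M.D. about mean
```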
a) Mean square deviation : Let u = x - a, which is a deviation. Using u we can develop a measure of dispersion which is better than the mean deviation.
The arithmetic mean of u² is known as the mean square deviation, in short M.S.D., and is given by :
i) For raw data or individual observations :
M.S.D. = (1/n) Σ (xi - a)²
ii) For a frequency distribution :
M.S.D. = (1/N) Σ fi (xi - a)²
For different values of 'a' we get different values of the M.S.D. This creates a problem in measuring dispersion.
Note : σ(S.D.) = √Var(x) is the relation between S.D. and variance.
b) Standard deviation : The positive square root of the arithmetic mean of the squares of the deviations taken from the arithmetic mean is called the Standard Deviation (S.D.).
In other words, the positive square root of the variance is called the standard deviation or root mean square deviation. It is denoted by σ (sigma).
σ(S.D.) = √((1/n) Σ (xi - x̄)²)   (for raw data or individual observations)
σ(S.D.) = √((1/N) Σ fi (xi - x̄)²)   (for a frequency distribution)
Simplified form of the above formulae :
σ² = (1/n) Σ xi² - (x̄)²,   σ² = (1/N) Σ fi xi² - ((1/N) Σ fi xi)²
Coefficient of Variation (C.V.) :
C.V. = (S.D./x̄) × 100 = (σ/x̄) × 100 %
Remarks :
1) The coefficient of variation is always expressed as a percentage.
2) In many cases the value of the coefficient of variation is too small; for convenience it is multiplied by 100.
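The simplified formula and the coefficient of variation can be sketched as follows (the data values are illustrative, not from the book):

```python
# Sketch: sigma via the simplified formula sigma^2 = (1/n) sum(x_i^2) - xbar^2,
# and the coefficient of variation C.V. = sigma / xbar * 100 %.
import math

def std_dev(data):
    n = len(data)
    xbar = sum(data) / n
    return math.sqrt(sum(x * x for x in data) / n - xbar ** 2)

data = [10, 20, 30, 40, 50]            # illustrative values, not the book's
sigma = std_dev(data)                  # sqrt(1100 - 900) = sqrt(200)
cv = sigma / (sum(data) / len(data)) * 100   # about 47.14 %
```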
We have to find the standard deviation of the first n natural numbers, i.e. 1, 2, 3, …, n.
Sum of the first n natural numbers = Σ xi = 1 + 2 + … + n = n(n + 1)/2, so x̄ = (n + 1)/2.
Using Σ xi² = n(n + 1)(2n + 1)/6 in the simplified formula,
σ² = (n + 1)(2n + 1)/6 - ((n + 1)/2)² = (n² - 1)/12, so σ = √((n² - 1)/12).
2) For batsman B :
x̄ = 55.4
C.V. of batsman B = (σ_B/x̄_B) × 100 = (24.30/55.4) × 100 ≈ 43.86 %
‘Solution : Data A :
Moments
The rth moment, or moment of order r, denoted μ′r, is given by :
i) About the origin :
μ′r = (1/n) Σ xiʳ   (for individual observations)
μ′r = (1/N) Σ fi xiʳ   (for a frequency distribution)
ii) About any observation A :
μ′r = (1/n) Σ (xi - A)ʳ   (for individual observations)
μ′r = (1/N) Σ fi (xi - A)ʳ   (for a frequency distribution)
Substituting r = 0, 1, 2, 3, 4 in the moments about any point A, we get
μ′0 = 1
μ′1 = (1/N) Σ fi (xi - A) = (1/N) Σ fi xi - A = x̄ - A
μ′2 = (1/N) Σ fi (xi - A)² = mean square deviation
μ′3 = (1/N) Σ fi (xi - A)³,   μ′4 = (1/N) Σ fi (xi - A)⁴   …(3.9.1)
b) Central Moments : Moments about the mean are known as central moments. The central moment of order r (or rth-order moment) is denoted by μr and is given by
μr = (1/n) Σ (xi - x̄)ʳ   (for individual observations)
μr = (1/N) Σ fi (xi - x̄)ʳ   (for a frequency distribution)
In particular, μ2 = (1/N) Σ fi (xi - x̄)² = variance = σ².
Relation between Raw and Central Moments
We know that, from equation (3.9.1), μ′1 = x̄ - A, so x̄ = A + μ′1. Also,
μr = (1/N) Σ fi (xi - x̄)ʳ = (1/N) Σ fi (xi - A + A - x̄)ʳ = (1/N) Σ fi ((xi - A) - μ′1)ʳ
Using the binomial expansion,
μr = μ′r - rC1 μ′(r-1) μ′1 + rC2 μ′(r-2) (μ′1)² - … + (-1)ʳ (μ′1)ʳ
Substituting r = 2, 3, 4 we get
μ2 = μ′2 - (μ′1)²
μ3 = μ′3 - 3μ′2 μ′1 + 2(μ′1)³
μ4 = μ′4 - 4μ′3 μ′1 + 6μ′2 (μ′1)² - 3(μ′1)⁴
Note :
i) The expression for μr contains (r + 1) terms.
ii) The first term in the expression for μr is positive and the terms alternate in sign.
iii) The sum of the coefficients on the R.H.S. of each of the relations is zero.
iv) The last term in the expression for μr is (-1)ʳ (μ′1)ʳ.
Skewness
* A frequency distribution is symmetric about the value 'a' if the corresponding frequency curve is symmetric about 'a', i.e. the ordinate x = a divides the frequency curve into two equal parts (Fig. 3.9.1(a)).
i) Positive skewness : If the mean lies on the right side of the mode, or the curve is elongated (stretched) towards the right side, then the distribution is said to be positively skewed (Fig. 3.9.1(b)). In this case we observe that mode < median < mean.
ii) Negative skewness : If the mean lies on the left side of the mode, or the curve is elongated (stretched) towards the left side, then the distribution is said to be negatively skewed (Fig. 3.9.1(c)). In this case we observe that mean < median < mode.
Fig. 3.9.1 : (a) symmetric : Mode = Median = Mean; (b) positively skewed : Mode < Median < Mean; (c) negatively skewed : Mean < Median < Mode.
Different measures of skewness are :
i) Skewness = 3(Mean - Median)/Standard deviation
Sheppard's Correction for Central Moments
* While computing the arithmetic mean and S.D. of the frequency distribution of a continuous variable, we assume that all the items in a class interval are concentrated at the midpoint of the class. The same assumption is made for computing moments. This enables us to compute the moments, but it introduces some amount of error.
* In the case of odd-order moments the sum of the errors is small and negligible; therefore odd-order moments need not be corrected.
* On the other hand, while computing even-ordered central moments the errors are raised to an even power, hence all errors are effectively positive and their sum is considerably large. In this case W. F. Sheppard suggested the following corrections (h = class width) :
μ2 (corrected) = μ2 - h²/12
μ4 (corrected) = μ4 - (h²/2) μ2 + (7/240) h⁴
3.10 Kurtosis
* We have studied various aspects of the comparison of frequency distributions : average, dispersion and symmetry. However, these aspects are not enough for comparison.
* Two frequency distributions may have the same average, dispersion and the same amount of skewness, but they may still differ in the relative height of the curve.
* Observe Fig. 3.10.1, which contains three curves C1, C2, C3 that are symmetrical about the mean and have the same range.
* Kurtosis is the property of a distribution which expresses the peakedness or flatness of its curve.
The coefficient of skewness is β1 = μ3²/μ2³, and kurtosis is measured by the coefficient β2, given by
β2 = μ4/μ2²   (with γ2 = β2 - 3)
a) Mesokurtic curve (normal curve) : The curve which is neither flat nor peaked is called a mesokurtic curve or normal curve. In this case β2 = 3 or γ2 = 0 (see Fig. 3.10.1).
b) Platykurtic curve : The curve which is flatter, i.e. has less of a peak than the normal curve, is called a platykurtic curve. In this case β2 < 3 or γ2 < 0 (see Fig. 3.10.1).
c) Leptokurtic curve : The curve which is more peaked than the normal curve is called a leptokurtic curve. In this case β2 > 3 or γ2 > 0 (see Fig. 3.10.1).
Example 3.10.1 : The first four raw moments of a distribution about A = 4 are μ′1 = 2, μ′2 = 20, μ′3 = 40, μ′4 = 100.
i) Find the four central moments. ii) Find the coefficients of skewness and kurtosis.
Solution : Given, A = 4 and μ′1 = 2, μ′2 = 20, μ′3 = 40, μ′4 = 100.
i) To find the first four central moments :
μ1 = 0
μ2 = μ′2 - (μ′1)² = 20 - 2² = 16
μ3 = μ′3 - 3μ′2 μ′1 + 2(μ′1)³ = 40 - 3 × 20 × 2 + 2 × (2)³ = -64
μ4 = μ′4 - 4μ′3 μ′1 + 6μ′2 (μ′1)² - 3(μ′1)⁴ = 100 - 4 × 40 × 2 + 6 × 20 × (2)² - 3 × (2)⁴ = 212
ii) Coefficient of skewness β1 = μ3²/μ2³ = (-64)²/(16)³ = 1; since μ3 < 0, the distribution is negatively skewed.
Coefficient of kurtosis β2 = μ4/μ2² = 212/256 ≈ 0.83; since β2 < 3, the distribution is platykurtic.
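The raw-to-central conversion in Example 3.10.1 can be checked mechanically with a short sketch:

```python
# Sketch: converting the raw moments of Example 3.10.1 into central moments,
# then the coefficients beta1 = mu3^2/mu2^3 and beta2 = mu4/mu2^2.
def central_moments(m1, m2, m3, m4):
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    return mu2, mu3, mu4

mu2, mu3, mu4 = central_moments(2, 20, 40, 100)   # -> 16, -64, 212
beta1 = mu3 ** 2 / mu2 ** 3                       # 4096/4096 = 1
beta2 = mu4 / mu2 ** 2                            # 212/256
```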
i) To find the four moments about an arbitrary point A :
μ′1 = Σ fi ui / N = 428/244 = 1.75
μ′2 = Σ fi ui² / N = 1426/244 = 5.84
μ′3 = Σ fi ui³ / N = 3554/244 = 14.56
μ′4 = Σ fi ui⁴ / N = 15766/244 = 64.61
ii) To find the four central moments :
μ1 = 0
μ2 = μ′2 - (μ′1)² = 5.84 - (1.75)² = 2.78
μ3 = μ′3 - 3μ′2 μ′1 + 2(μ′1)³ = 14.56 - 3 × 5.84 × 1.75 + 2(1.75)³ = -5.38
Bivariate Distribution
The distribution of one variate x is known as a univariate distribution. A distribution which involves more than one variable is known as a bivariate distribution.
Suppose X and Y are two variables. Whenever X and Y are measured on the same event, they are likely to be correlated, for example the rainfall (X) and agricultural production (Y). We record the values of X and Y for each village of a district. Suppose this gives a set of pairs (x1, y1), (x2, y2), …, (xn, yn), where xi is the rainfall and yi the agricultural production of the ith village. This set of n pairs is a bivariate data set.
Correlation
In real life we come across many situations where two variables are correlated, for example work and production, income and expense, price and demand, etc. In all these cases we get an idea about the two variables; such interrelated variables are called correlated variables, and a linear relation between two such variables is called correlation.
Types of correlation :
i) Positive correlation : If variable X increases and variable Y also increases, or X decreases and Y also decreases, then that type of correlation is called positive correlation, e.g. work and production.
ii) Negative correlation : If variable X increases and variable Y decreases, or variable X decreases and variable Y increases, then that type of correlation is called negative correlation.
e.g. If the supply of a product is more, its price decreases, and if there is a shortage of the product, its price increases. Hence there is a negative correlation between the supply of a product and its price.
There are three measures of correlation : i) Scatter diagram, ii) Karl Pearson's coefficient of correlation, iii) Rank correlation.
Scatter Diagram
When we plot correlated variables in the XY-plane on graph paper, taking one variable on the X-axis and the other on the Y-axis, the representation is called a scatter diagram. Fig. 3.13.1 illustrates the types, e.g. (c) perfect negative correlation and (d) negative correlation.
Karl Pearson's Coefficient of Correlation
To measure the intensity or degree of linear relationship between two variables, Karl Pearson developed a formula called the correlation coefficient.
a) The correlation coefficient between two variables x and y is denoted by r(x, y) and is defined as
r(x, y) = cov(x, y)/(σx σy)
where cov(x, y) = covariance of (x, y) = (1/n) Σ (xi - x̄)(yi - ȳ).
Now cov(x, y) = (1/n) Σ (xi - x̄)(yi - ȳ) = (1/n) Σ (xi yi - x̄ yi - ȳ xi + x̄ ȳ) = (1/n) Σ xi yi - x̄ ȳ
Also σx² = (1/n) Σ (xi - x̄)² = (1/n) Σ xi² - (x̄)², and similarly σy² = (1/n) Σ yi² - (ȳ)².
Rank Correlation
One of the best measures of correlation is Karl Pearson's coefficient of correlation, but there is a difficulty in measuring the correlation between qualitative characteristics. We can arrange the items in ascending or descending order according to their merit; this arrangement is called ranking, and the number indicating the position in the ranking is called the rank.
The Karl Pearson correlation coefficient between these ranks is called Spearman's rank correlation coefficient; it is denoted by R and is given by
R = 1 - 6 Σ di²/(n(n² - 1))
where di is the difference between the two ranks of the ith item.
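Both coefficients can be sketched on small illustrative data (the values below are made up, not the book's example):

```python
# Sketch: Karl Pearson's r from the covariance form, and Spearman's R from
# rank differences. The data values are illustrative, not the book's.
import math

def pearson_r(xs, ys):
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    cov = sum(x * y for x, y in zip(xs, ys)) / n - xbar * ybar
    sx = math.sqrt(sum(x * x for x in xs) / n - xbar ** 2)
    sy = math.sqrt(sum(y * y for y in ys) / n - ybar ** 2)
    return cov / (sx * sy)

def spearman_R(rank_x, rank_y):
    n = len(rank_x)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d2 / (n * (n * n - 1))

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]                               # perfect linear relation
r = pearson_r(xs, ys)                               # -> 1.0
R = spearman_R([1, 2, 3, 4, 5], [5, 4, 3, 2, 1])    # reversed ranks -> -1.0
```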
d) Method of step deviation :
If ui = (xi - A)/h and vi = (yi - B)/k, then r(x, y) = r(u, v), i.e. the correlation coefficient is unchanged by a change of origin and scale.
Solution :
Let ui = xi - A and vi = yi - B with B = 24. From the table,
cov(u, v) = 1.75
σu σv = 3.1666 - 0.25 = 2.9166
r(x, y) = r(u, v) = cov(u, v)/(σu σv) = 1.75/2.9166 ≈ 0.6
For the next worked solution's table :
cov(u, v) = Σ ui vi / n - ū v̄ = 140 - 12.554 = 127.446
σu σv = 128.512 (with σv = 11.336), v̄ = 12.2353
r(u, v) = 127.446/128.512 ≈ 0.9917
Example 3.15.3 :
Solution : Let A = 68, B = 70, so that ui = xi - 68 and vi = yi - 70. Consider the following table.
Regression
As we discussed, there is a correlation between rainfall and agricultural production. If we want to calculate the agricultural production of this year when we know this year's rainfall, regression helps us to calculate such a value.
"A method of estimating the value of one variable when the value of the other is known, when the two variables are correlated, is called regression."
Fig. 3.17.1
ii) Regression line of Y on X :
The distances of the points from the line are measured along the y-axis; if we minimise these distances 'd' of the points from the line, we get the line of regression of Y on X (Fig. 3.17.2). It is given by Y = a + bX, and its use is to find the value of Y when the value of X is given.
The equation of the regression line of y on x is
(y - ȳ) = byx (x - x̄)
Note :
1) The regression coefficients can also be expressed as
byx = (Σxy/n - (Σx/n)(Σy/n)) / (Σx²/n - (Σx/n)²),   bxy = (Σxy/n - (Σx/n)(Σy/n)) / (Σy²/n - (Σy/n)²)
2) The coefficient of correlation is the geometric mean of the regression coefficients bxy and byx, i.e. r = ±√(bxy · byx).
3) If one regression coefficient is greater than one, then the other must be less than one.
Regression Coefficients
i) Regression coefficient of X on Y : The coefficient of y in the equation of the regression line of X on Y, x = a + bxy · y, is called the regression coefficient of X on Y. It is denoted by bxy and is given by bxy = r σx/σy.
Solution : i) To find the means x̄ and ȳ :
The means x̄ and ȳ satisfy the equations of both regression lines, so when we solve the given lines of regression we get the values of x̄ and ȳ :
3x + 2y = 26   …(1)
6x + y = 31   …(2)
Solving equations (1) and (2) simultaneously, we get x̄ = 4, ȳ = 7.
ii) Rearranging equation (1) as x = 26/3 - (2/3)y, the coefficient of y gives the regression coefficient bxy = -2/3.
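The simultaneous solution for the means can be sketched directly (Cramer's rule is our choice here; the book simply solves by elimination):

```python
# Sketch: the means (xbar, ybar) satisfy both regression lines, so they are
# found by solving 3x + 2y = 26 and 6x + y = 31 simultaneously.
def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y

xbar, ybar = solve_2x2(3, 2, 26, 6, 1, 31)   # -> (4.0, 7.0)
```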
i) Regression line of y on x :
(y - ȳ) = byx (x - x̄)
(y - 8.4) = -0.679 (x - 5.6)
y = 12.202 - 0.679x   …(1)
ii) Similarly, the regression line of x on y, (x - 5.6) = -0.236 (y - 8.4), gives
x = 7.582 - 0.236y   …(2)
iii) Value of y when x = 12 : putting x = 12 in (1) we get y = 12.202 - 0.679 × 12 = 4.054.
iv) Value of x when y = 8 : putting y = 8 in (2) we get x = 7.582 - 0.236 × 8 = 5.694.
Binomial and Multinomial Distributions
Let us consider an experiment involving n trials performed under identical conditions, each of which can result either in a success, whose probability is p, or in a failure, whose probability is q. Such a distribution is called binomial. As there are only two outcomes, p + q = 1.
Let us consider r successes out of n trials, hence (n - r) failures.
∴ The probability of r successes in n trials is given by
p{(sss…s)(r times)} × p{(fff…f)((n - r) times)} = (pp…p)(r times) × (qq…q)((n - r) times) = pʳ qⁿ⁻ʳ
The r successes and (n - r) failures can occur in nCr mutually exclusive ways, so
p[r successes] = B(n, p, r) = nCr pʳ qⁿ⁻ʳ,   r = 0, 1, 2, …, n
Consider the binomial expansion
(q + p)ⁿ = qⁿ + nC1 p qⁿ⁻¹ + nC2 p² qⁿ⁻² + … + pⁿ
The terms of the R.H.S. give the probabilities of r = 0, 1, 2, …, n successes.
We use the multinomial distribution for experiments in which the following conditions hold good :
* The experiment consists of repeated trials, e.g. rolling a die five times instead of just once.
* Each trial must be independent of the others.
* The probability of each outcome must be the same across each trial of the experiment.
3.20 Poisson Distribution
The Poisson distribution is a limiting case of the binomial distribution when n, the number of trials, is infinitely large, p, the probability of success, is very small, and np is finite.
The probability of r successes in a series of n independent trials is
B(n, p, r) = nCr pʳ qⁿ⁻ʳ,   r = 0, 1, 2, …
Let z = np, so that p = z/n. Then
B(n, p, r) = [n(n - 1)(n - 2)…(n - r + 1)/r!] (z/n)ʳ (1 - z/n)ⁿ⁻ʳ
As n → ∞ with z fixed, (1 - z/n)ⁿ → e⁻ᶻ and the remaining factors tend to 1, so
P(r) = e⁻ᶻ zʳ/r!,   r = 0, 1, 2, …
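The limiting argument can be checked numerically: for large n and small p with np = z, the binomial pmf is already very close to the Poisson pmf.

```python
# Sketch: the Poisson pmf e^-z z^r / r! as the limit of the binomial pmf
# nCr p^r q^(n-r) with p = z/n, illustrating the derivation above.
import math

def binomial_pmf(n, p, r):
    return math.comb(n, r) * p ** r * (1 - p) ** (n - r)

def poisson_pmf(z, r):
    return math.exp(-z) * z ** r / math.factorial(r)

z, r = 2.0, 3
approx = binomial_pmf(100000, z / 100000, r)   # large n, small p, np = z
exact = poisson_pmf(z, r)                      # e^-2 * 2^3 / 3!
```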
P(X = x) = 1/n,   x = 1, 2, …, n
where n is the parameter of the uniform distribution. When P(X = x) = 1, this distribution is called the standard uniform distribution, shown in Fig. 3.21.1.
For the Gaussian (normal) curve, the mean, median and mode coincide, therefore the Gaussian curve has one maximum point. The two tails of the curve extend to -∞ and +∞ in the negative and positive directions of the X-axis, i.e. the curve is asymptotic to the X-axis. The line x = a (the mean) divides the distribution curve into two equal parts above the X-axis (Fig. 3.23.1). The total area under this curve is 1.
Exponential Distribution
A continuous random variable X taking non-negative values is said to have an exponential distribution with parameter λ > 0 if its probability density function is f(x) = λe^(-λx), x ≥ 0 (the parameter form is the standard one; the book's formula is garbled at this point). The probability of X lying between x1 and x2 is
P(x1 ≤ X ≤ x2) = ∫ from x1 to x2 of f(x) dx = e^(-λx1) - e^(-λx2)
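The interval probability has the closed form shown above, so it can be sketched without numerical integration (the λ values below are illustrative):

```python
# Sketch of P(x1 <= X <= x2) for the exponential density f(x) = lam*e^(-lam*x),
# x >= 0, using the closed form e^(-lam*x1) - e^(-lam*x2).
import math

def expon_prob(lam, x1, x2):
    return math.exp(-lam * x1) - math.exp(-lam * x2)

total = expon_prob(1.0, 0.0, math.inf)   # whole support -> probability 1
p01 = expon_prob(2.0, 0.0, 1.0)          # 1 - e^-2
```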
Table of areas under the standard normal curve (values of P(Z ≤ z)) :

z   | 0.00   | 0.01   | 0.02   | 0.03   | 0.04   | 0.05   | 0.06   | 0.07   | 0.08   | 0.09
0.0 | 0.5000 | 0.5040 | 0.5080 | 0.5120 | 0.5160 | 0.5199 | 0.5239 | 0.5279 | 0.5319 | 0.5359
0.1 | 0.5398 | 0.5438 | 0.5478 | 0.5517 | 0.5557 | 0.5596 | 0.5636 | 0.5675 | 0.5714 | 0.5753
0.2 | 0.5793 | 0.5832 | 0.5871 | 0.5910 | 0.5948 | 0.5987 | 0.6026 | 0.6064 | 0.6103 | 0.6141
0.3 | 0.6179 | 0.6217 | 0.6255 | 0.6293 | 0.6331 | 0.6368 | 0.6406 | 0.6443 | 0.6480 | 0.6517
0.4 | 0.6554 | 0.6591 | 0.6628 | 0.6664 | 0.6700 | 0.6736 | 0.6772 | 0.6808 | 0.6844 | 0.6879
0.5 | 0.6915 | 0.6950 | 0.6985 | 0.7019 | 0.7054 | 0.7088 | 0.7123 | 0.7157 | 0.7190 | 0.7224
0.6 | 0.7257 | 0.7291 | 0.7324 | 0.7357 | 0.7389 | 0.7422 | 0.7454 | 0.7486 | 0.7517 | 0.7549
0.7 | 0.7580 | 0.7611 | 0.7642 | 0.7673 | 0.7703 | 0.7734 | 0.7764 | 0.7793 | 0.7823 | 0.7852
0.8 | 0.7881 | 0.7910 | 0.7939 | 0.7967 | 0.7995 | 0.8023 | 0.8051 | 0.8078 | 0.8106 | 0.8133
0.9 | 0.8159 | 0.8186 | 0.8212 | 0.8238 | 0.8264 | 0.8289 | 0.8315 | 0.8340 | 0.8365 | 0.8389
1.0 | 0.8413 | 0.8438 | 0.8461 | 0.8485 | 0.8508 | 0.8531 | 0.8554 | 0.8577 | 0.8599 | 0.8621
1.1 | 0.8643 | 0.8665 | 0.8686 | 0.8708 | 0.8729 | 0.8749 | 0.8770 | 0.8790 | 0.8810 | 0.8830
1.2 | 0.8849 | 0.8869 | 0.8888 | 0.8906 | 0.8925 | 0.8943 | 0.8962 | 0.8980 | 0.8997 | 0.9015
1.3 | 0.9032 | 0.9049 | 0.9066 | 0.9082 | 0.9099 | 0.9115 | 0.9131 | 0.9147 | 0.9162 | 0.9177
1.4 | 0.9192 | 0.9207 | 0.9222 | 0.9236 | 0.9251 | 0.9265 | 0.9279 | 0.9292 | 0.9306 | 0.9319
1.5 | 0.9332 | 0.9345 | 0.9357 | 0.9370 | 0.9382 | 0.9394 | 0.9406 | 0.9418 | 0.9429 | 0.9441
1.6 | 0.9452 | 0.9463 | 0.9474 | 0.9484 | 0.9495 | 0.9505 | 0.9515 | 0.9525 | 0.9535 | 0.9545
1.7 | 0.9554 | 0.9564 | 0.9573 | 0.9582 | 0.9591 | 0.9599 | 0.9608 | 0.9616 | 0.9625 | 0.9633

2) P(0 < x < x1) = P(0 < z < z1) = area under the normal curve from 0 to z1 = A(z1)   (Figs. 3.23.4 and 3.23.5)
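Entries of the table can be reproduced (and OCR damage spot-checked) with the error function, since Φ(z) = 0.5 (1 + erf(z/√2)):

```python
# Sketch: reproduce tabulated normal areas P(Z <= z) from the error function.
import math

def phi(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# spot-check a few entries of the table above, to 4 decimal places
checks = {0.61: 0.7291, 1.00: 0.8413, 1.32: 0.9066}
ok = all(abs(round(phi(z), 4) - v) < 5e-5 for z, v in checks.items())
```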
6) P(x1 < x < x2) = P(-z1 < z < z2) = A(z1) + A(z2)   (Fig. 3.23.8)
Chi-square Distribution
The chi-square distribution is mainly used in :
i) Testing hypotheses.
ii) Tests for the independence of two criteria of classification of qualitative data.
iii) Tests for the goodness of fit of an observed distribution to a theoretical one.
The chi-square distribution is one of the important distributions in statistical theory. It is denoted by χ²n, where n is the parameter of the distribution, called the "degrees of freedom".
The variate χ² is the sum of squares of n independent standard normal variables [N(0, 1)]. Let X1, X2, …, Xn be n independent N(0, 1) variables; then the chi-square variate is given by
χ² = X1² + X2² + … + Xn²
According to R. A. Fisher, the null hypothesis is defined as a hypothesis of no difference; it is denoted by H0.
e.g. H0 : μ1 = μ2. This hypothesis states that there is no difference between the two population means.
Alternative Hypothesis (H1) :
The alternative hypothesis is the hypothesis accepted when the null hypothesis is rejected. In other words, the alternative hypothesis is a hypothesis complementary to the null hypothesis, and it is denoted by H1.
e.g. If H0 : μ1 = μ2, then the alternative hypothesis may be H1 : μ1 ≠ μ2, or H1 : μ1 < μ2, or H1 : μ1 > μ2.
Some Definitions
i) Test of hypothesis : A rule which leads to the decision of acceptance of H0 or rejection of H0 on the basis of random samples is called a test of hypothesis.
ii) One-sided and two-sided tests : The tests used for testing the null hypothesis are called one-sided or two-sided tests according as the alternative hypothesis is a one-sided or two-sided hypothesis.
iii) Test statistic : A function of the random sample observations which is used to test the null hypothesis H0 is called a test statistic.
iv) Critical region : The set of values X1, X2, …, Xn for which the null hypothesis H0 is rejected is called the critical region or rejection region; it is denoted by w, and the acceptance region is denoted by w′ (Fig. 3.31.1).
v) Level of significance : The probability of rejecting the null hypothesis H0 when it is true is called the level of significance, i.e. it is the probability of committing a type-I error. It is denoted by α. If we try to minimise the level of significance, the probability of a type-II error increases, so the level of significance cannot be made zero. However, we can fix it in advance at 0.01 or 0.05 (i.e. 1 % or 5 %); in most cases it is 5 %.
vi) Test for goodness of fit (χ² distribution) : For a given frequency distribution we try to fit a probability distribution. We know that there are several probability distributions; among these we have to identify the distribution which will fit the given data, and for that we have to test the appropriateness of the fit.
Let H0 : the fitting of the probability distribution to the given data is proper (good), i.e. there is no significant difference between the observed and theoretical frequencies. The test based on the χ² distribution used to test this H0 is called the χ² test of goodness of fit.
Suppose o1, o2, …, oi, …, ok is the set of observed frequencies and e1, e2, …, ei, …, ek the set of corresponding expected (theoretical) frequencies, with
Σ oi = Σ ei = N
Suppose P = number of parameters estimated for fitting the probability distribution. If H0 is true, then the statistic
χ² = Σ (oi - ei)²/ei   (sum over i = 1 to k)
has a χ² distribution with (k - P - 1) degrees of freedom, which is the parameter of the chi-square distribution. The critical region at level of significance α is χ² > χ²(k-P-1; α), where χ²(k-P-1; α) is the table value (critical value) corresponding to (k - P - 1) degrees of freedom and level of significance α, as shown in Fig. 3.31.1(a).
Note :
1) If the total of the cell frequencies is sufficiently large (greater than 50) and the expected frequencies are greater than or equal to 5 (i.e. ei ≥ 5), then we can apply this test.
2) While fitting a probability distribution, or in finding expected frequencies, if no parameters are estimated then the value of P is zero.
3) When the expected frequency of a class is less than 5, the class is merged into a neighbouring class along with its observed and expected frequencies until the total of the expected frequencies becomes greater than or equal to five. This procedure is called "pooling the classes". In this case k is the number of class frequencies after pooling.
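The statistic itself is a one-line computation; the observed counts below are illustrative (not the book's example), fitted against a uniform expectation:

```python
# Sketch: the goodness-of-fit statistic chi2 = sum((o - e)^2 / e) for
# observed vs expected class frequencies. Counts are illustrative.
def chi_square_stat(observed, expected):
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [16, 18, 16, 14, 12, 12]
expected = [sum(observed) / 6.0] * 6   # uniform fit: e_i = N/k, P = 0
chi2 = chi_square_stat(observed, expected)
# degrees of freedom = k - P - 1 = 6 - 0 - 1 = 5, since no parameter is estimated
```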
Solution : Here n = 10.
p = probability of getting a doublet = 6/36 = 1/6
q = probability of not getting a doublet = 1 - 1/6 = 5/6
We know that p(r) = nCr pʳ qⁿ⁻ʳ.
ii) Probability of getting at least 6 successes :
p(r ≥ 6) = p(6) + p(7) + p(8) + p(9) + p(10)
= [10C6 + 10C7 + 10C8 + 10C9 + 10C10] (1/2)¹⁰
Hence the probability that r numbers are called out of the 6 called is
p(r) = 6Cr pʳ q⁶⁻ʳ
Solution : A die is thrown 10 times; the odd numbers on a die are 1, 3 and 5.
∴ Probability of getting an odd number p = 3/6 = 1/2, and q = 1/2.
We know that p(r) = nCr pʳ qⁿ⁻ʳ.
i) Probability of 8 successes :
p(8) = 10C8 (1/2)⁸ (1/2)² = (10 × 9/2) (1/2)¹⁰ = 45/1024
For the busy-calls example :
i) For not more than 3 calls busy :
p(r ≤ 3) = p(0) + p(1) + p(2) + p(3) ≈ 0.9997
ii) For at least 3 calls to be busy :
p(r ≥ 3) = 1 - [p(r = 0) + p(r = 1) + p(r = 2)]
i) Expected number of families having no girls :
p(no girls) = p(r = 4 boys) = 4C4 (1/2)⁴ = 1/16
2000 × p(r = 4) = 2000 × 1/16 = 125
ii) Expected number of families having at least one boy :
2000 × p(r ≥ 1) = 2000 × 15/16 = 1875
iii) Probability of two boys :
p(having 2 boys) = p(r = 2) = 4C2 (1/2)² (1/2)² = 6/16 = 3/8
Hence the expected number of families having 2 boys = 2000 × 3/8 = 750.
iv) Probability of families having 1 or 2 girls, which is equivalent to finding the probability of having 3 or 2 boys :
p(1 or 2 girls) = p(3 or 2 boys) = (4 + 6)/16 = 10/16 = 5/8, so the expected number = 2000 × 5/8 = 1250.
Example 3.34.6 :
Solution : Here 1 in 500 tyres is defective, hence p = 1/500 = 0.002.
With z = np = 0.02, p(r) = e⁻ᶻ zʳ/r!, r = 0, 1, 2, 3, 4, …
i) Probability of no defective tyre : p(0) = e⁻⁰·⁰² ≈ 0.980
In the accompanying Poisson example with z = 0.5 :
p(0) = e⁻⁰·⁵ = 0.6065
p(1) = 0.6065 × 0.5 = 0.3033
p(2) = 0.6065 × 0.125 ≈ 0.0758
and the expected numbers of lots are p(3) × 200 and p(4) × 200 for three and four defectives respectively.
Solution : Here p = 0.001 and n = 2000, so z = np = 2.
p(r ≥ 3) = 1 - [p(r = 0) + p(r = 1) + p(r = 2)]
= 1 - e⁻² [1 + 2 + 2²/2]
= 1 - 5e⁻² ≈ 0.32
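A quick check of this tail probability:

```python
# Sketch: verifying the Poisson tail above, with z = n*p = 2000 * 0.001 = 2
# and P(r >= 3) = 1 - e^-z (1 + z + z^2/2).
import math

z = 2000 * 0.001
p_at_least_3 = 1.0 - math.exp(-z) * (1 + z + z * z / 2)
```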
Fig. 3.31.2
Solution : Given that 5 phone calls (on average) are coming into a call centre, the probability of getting at most two phone calls is required.
A1 = area corresponding to z1 = 0.61
= (value at 0.61 in the normal table : 0.6 row, 0.01 column) - 0.5
= 0.7291 - 0.5 = 0.2291
A2 = area corresponding to z2 = 1.32 = 0.3023
∴ Required probability = A1 + A2 = 0.2291 + 0.3023 = 0.5314
(From the related example : 274 candidates score above 60 and 451 candidates score between 30 and 60.)
Example :
i) Number of candidates whose score exceeds 60 :
When x = 60, z = (60 - 45)/25 = 0.6
p(x > 60) = p(z > 0.6) = 0.5 - p(0 ≤ z ≤ 0.6) = 0.5 - 0.2257 = 0.2743
∴ 1000 × 0.2743 ≈ 274 candidates score above 60.   (Fig. 3.31.6)
ii) Number of candidates whose score is between 30 and 60 :
p(30 < x < 60) = p(-0.6 < z < 0.6) = p(0 < z < 0.6) + p(0 < z < 0.6) = 0.2257 + 0.2257 = 0.4514
∴ 1000 × 0.4514 ≈ 451 candidates score between 30 and 60.   (Fig. 3.31.5)
For the LED bulb example :
p(x < 700) = p(z < -1.5) = 0.5 - p(0 ≤ z ≤ 1.5) = 0.5 - 0.4332 = 0.0668
∴ Number of LED bulbs that might be expected to fail within 700 hrs = 0.0668 × 2000 ≈ 134 bulbs.
A1 = 0.2257 (area corresponding to z1 = 0.6)
A2 = area corresponding to z2 = 1 = 0.3413
p(13.5 < x < 16.5) = A2 - A1 = 0.3413 - 0.2257 = 0.1156   (Fig. 3.3.7)
∴ Number of students scoring 16 = 1000 × 0.1156 = 115.6 ≈ 116
(From the related part : required number of students = 1000 × 0.4435 ≈ 443.)
ii) Students scoring below 8 :
Here x = 8, with x1 = 34 as the observed value.
4. For the lines of regression 8x − 10y + 66 = 0 and 40x − 18y = 214, find i) Mean of x and y, ii) Coefficient of correlation r. [Ans. : x̄ = 13, ȳ = 17, r = + 0.6]

5. The following equations represent regression lines :
   y = 0.516x + 33.73
   x = 0.512y + 32.52
   Find i) Mean of x and y, ii) Coefficient of correlation (r). [Ans. : x̄ = 67, ȳ = 68.6, r = 0.514]

Problems on Binomial Distribution

6. An unbiased coin is thrown 10 times. Find the probability of getting i) exactly 6 heads, ii) at least 6 heads. [Ans. : i) 0.205 ii) 0.3769]

7. The probability that a man aged 60 years will live for 70 years is 1/10. Find the probability that out of 5 men selected at random, 2 will live for 70 years. [Ans. : 0.05467]

8. Out of 800 families with 4 children each, how many families would be expected to have i) 2 boys and 2 girls, ii) At least one boy, iii) No girl, iv) At most two girls. Assume equal probability for boys and girls. [Ans. : i) 300 ii) 750 iii) 50 iv) 550]

9. A group of 20 aeroplanes is sent on an operational flight. The chance that an aeroplane fails to return from the flight is 5 %. Determine the probability that i) No plane fails to return, ii) At most 3 planes do not return. [Ans. : i) 0.3585 ii) 0.9844]

10. Team A has a probability of 2/3 of winning whenever the team plays a particular game. If team A plays 4 games, find the probability that the team wins i) Exactly two games, ii) At least two games. [Ans. : i) 0.2962 ii) 0.8889]

Problems on Poisson Distribution

11. Find the probability that at most 5 defective fuses will be found in a box of 200 fuses if 2 % of such fuses are defective. [Ans. : 0.735]

12. A car hire firm has 2 cars which it hires out day by day. The number of demands for the cars on each day is distributed as a Poisson distribution with parameter 1.5. Calculate a) the proportion of days on which neither car is used, and b) the proportion of days on which demand is refused. [Ans. : (a) 0.22 (b) 0.2025]

13. A manufacturer knows that the razor blades he makes contain on the average 0.5 % defectives. He packs them in packets of 5. What is the probability that a packet picked at random will contain 3 or more faulty blades ? [Ans. : 0.00000255]

14. There are 300 misprints distributed randomly throughout a book of 500 pages. Find the probability that a given page contains i) Exactly two misprints, ii) Two or more misprints. [Ans. : i) 0.1 ii) 0.122]

15. Using the Poisson distribution, find the probability that the ace of spades will be drawn from a pack of well-shuffled cards at least once in 104 consecutive trials. [Ans. : 0.864]

Problems on Normal Distribution

16. In a certain examination test 200 students appeared in the subject of statistics. Average marks obtained were 50 % with standard deviation 5 %. How many students do you expect to obtain more than 60 % of marks, supposing that marks are distributed normally ? [Ans. : approximately 5 students]

17. In a distribution exactly normal, 7 % of the items are under 35 and 89 % are under 63. Find the mean and standard deviation of the distribution. [Ans. : mean = 50.3, σ = 10.33]

18. A normal distribution has a mean 15.73 and a standard deviation of 2.08. Find the % of cases that fall between 13.65 and 17.81. [Ans. : 68.26]

19. A fair coin is tossed 600 times. Using the normal distribution, find the probability of getting i) Number of heads less than 270, ii) Number of heads between 280 and 360. [Ans. : i) 0.0074 ii) 0.4845]

20. The average test marks in a particular class is 59 and S.D. is 9. If the marks are normally distributed, how many students in a class of 70 received i) marks more than 70 ? ii) marks below 50 ?

Problems on Chi-square Distribution

21. The table below gives the number of accidents that occurred in a certain country on various days of the week :

   Days      | Mon | Tue | Wed | Thurs | Fri | Sat
   Accidents | 126 | 130 | 110 | 115   | 135 | 170

   Test at 5 % level of significance whether the accidents are uniformly distributed over the days. [Ans. : χ²cal = 4.58, χ²tab = 14.07, H₀ accepted]

22. The demand for a particular spare part in a factory was found to vary from day to day. In a sample study the following information was obtained :

   Days                  | Mon  | Tue  | Wed  | Thurs | Fri  | Sat
   No. of parts demanded | 1124 | 1125 | 1170 | 1120  | 1126 | 1135

   [Ans. : χ²cal = 3.9222, χ²tab = 9.488, H₀ is accepted]

Multiple Choice Questions

Q.1 There are two types of measures of dispersion :
[a] nominal measure of dispersion and real measure of dispersion
[b] nominal measure of dispersion and relative measure of dispersion
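The Poisson answers in problems 14 and 15 above can be checked directly from the p.m.f.; this sketch assumes m = 300/500 = 0.6 misprints per page for problem 14 and m = 104/52 = 2 expected aces for problem 15.

```python
from math import exp

# Problem 14: misprints per page, m = 300/500 = 0.6
m = 0.6
p_two = exp(-m) * m**2 / 2              # P(exactly two misprints)
p_two_or_more = 1 - exp(-m) * (1 + m)   # 1 - P(0) - P(1)
print(round(p_two, 4), round(p_two_or_more, 3))  # 0.0988 0.122

# Problem 15: m = 104 * (1/52) = 2; P(at least one ace) = 1 - e^-2
p_at_least_once = 1 - exp(-2)
print(round(p_at_least_once, 3))  # 0.865 (text rounds to 0.864)
```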
[c] Coefficient of kurtosis   [d] Coefficient of variation

[a] range   [b] mean absolute deviation   [c] moments about origin   [d] all of the above

[a] (Q₃ + Q₁)/2   [b] (Q₃ − Q₁)/2   [c] Q₃ + Q₁   [d] Q₃ − Q₁

Q.11 In a frequency distribution of a variable x, mean = 34, median = 28. The distribution is ______.
[a] mean is at right of the median   [b] mean is less than the mode
[c] mean > median > mode             [d] mode < median < mean

[a] mean deviation     [b] mean square deviation
[c] square deviation   [d] root mean square deviation

Q.6 Square of standard deviation is called ______.
[a] Mean Square Deviation   [b] Mean Deviation
[c] Variance                [d] Root Mean Square Deviation

Q.13 For a distribution, mean is 42, median is 42.8 and mode is 44. The distribution is ______.
[a] normal              [b] positively skewed
[c] negatively skewed   [d] mesokurtic

Q.7 Calculate the mean deviation from the median and its coefficient for the following data : 110, 150, 100, 90, 160, 200, 140.
[a] 0.2   [b] 0.4
[c] 1     [d] 0

Q.14 The second moment (μ₂) and fourth moment (μ₄) about the mean are 4 and 18 respectively. What is the value of the kurtosis coefficient β₂ ?
[a] 0.875   [b] 1.125
[c] 40.75   [d] 42.75

[a] Coefficient of standard deviation   [b] Coefficient of skewness

[a] maximizes         [b] minimizes
[c] reduces to zero   [d] approaches infinity

Q.19 The slope of the regression line Y on X is also called ______.
[a] the correlation coefficient of X on Y
[b] the correlation coefficient of Y on X
[c] the regression coefficient of X on Y
[d] the regression coefficient of Y on X

Q.25 The value of the coefficient of correlation lies ______.
[a] between 1 and 0     [b] between −1 and 0
[c] between −1 and +1   [d] equals 1

Q.26 Line of regression y on x is ______.

Q.28 If the two regression coefficients are … and … then the correlation coefficient is ______.

Q.21 If a scatter diagram is drawn and the scatter points lie on a straight line, then it indicates ______.
Q.__ For the lines of regression 8x − 10y + 66 = 0 and 40x − 18y = 214, the values of x̄ and ȳ are ______.
[a] x̄ = 12, ȳ = 15   [b] x̄ = 10, ȳ = 11

Answer Keys for Multiple Choice Questions :

Q.1 c    Q.2 a    Q.3 b    Q.4 c    Q.5 d
Q.7 b    Q.8 d    Q.9 b    Q.10 d
Q.23 c   Q.24 b   Q.25 c

Unit IV

Random Variables and Distribution Functions

Syllabus
Random Variables and Distribution Functions : Random Variable, Distribution Function, Properties of Distribution Function, Discrete Random Variable, Probability Mass Function, Discrete Distribution Function, Continuous Random Variable, Probability Density Function.
Theoretical Discrete Distributions : Bernoulli Distribution, Binomial Distribution, Mean Deviation about Mean of Binomial Distribution, Mode of Binomial Distribution, Additive Property of Binomial Distribution, Characteristic Function of Binomial Distribution, Cumulants of Binomial Distribution.

Contents
4.1 Introduction
4.2 Random Variable
4.1 Introduction

• Already in the theory of probability we have studied random experiments and their corresponding sample spaces. Now our interest is not in constructing the sample space of a random experiment but in some number obtained from the outcomes.
• For example, in the experiment of tossing a coin twice, our interest may be only in the number of heads when the coin is tossed two times. So in probability problems it is convenient to think of a variable and consider the outcomes as the values the variable takes; such a variable is called a random variable.
• In simple words, a variable used to denote the real value of the outcome of a random experiment is a random variable.

4.2 Random Variable

• A random variable is a real-valued function defined on the sample space of a random experiment. In other words, the random variable X can be considered as a function that maps all the elements of the sample space S to points on the real line.
Thus X : S → R is a random variable.
• A random variable X may take finitely many, countably infinitely many, or uncountably infinitely many values.

Example :
Consider the experiment of tossing two fair coins.
The sample space S = {HH, HT, TH, TT}. There are 4 possible outcomes.
But if we are interested in the number of heads when two coins are tossed, then there are only 3 different possible values, from 0 to 2 :
1) Getting no head, 2) Getting one head, 3) Getting two heads.
S = {HH, HT, TH, TT} is mapped to the points 0, 1, 2 on the real line.

Types of Random Variables

There are two types of random variables.
1) Discrete random variable
If the random variable X takes only finitely many or countably infinitely many values, then it is called a discrete random variable. Examples of discrete random variables are the number of children in a family, the number of cars sold by a dealer, the number of stars in the sky, and so on.
2) Continuous random variable
If the random variable X takes uncountably infinitely many values, then it is called a continuous random variable.
Examples of continuous random variables are the height of a person selected at random and the age of a person selected at random, because x can take any value within a specified range.

Probability Mass Function (P.M.F.)

• If a discrete random variable X takes values x₁, x₂, x₃, …, xₙ with corresponding probabilities p₁, p₂, p₃, …, pₙ respectively, where Σᵢ pᵢ = 1, then this assignment is called the discrete probability distribution (p.m.f.) of X.

Properties : In p.m.f.
i) 0 ≤ P(xᵢ) ≤ 1 for all i
ii) Σᵢ P(xᵢ) = 1
iii) P(x₁ ≤ X ≤ xₖ) = P(x₁) + P(x₂) + … + P(xₖ)
iv) Mean or expected value : the mean or expected value of X is denoted by μ or E(X) and is defined as
    μ = E(X) = Σᵢ pᵢxᵢ / Σᵢ pᵢ = Σᵢ pᵢxᵢ   (since Σᵢ pᵢ = 1)
v) The variance of X is denoted by V(X) and V(X) = σ² = Σᵢ (xᵢ − μ)² pᵢ = E(X²) − [E(X)]²
vi) The cumulative distribution function (c.d.f.) of a discrete random variable X is denoted by F(x) and is defined as
    F(x) = P[X ≤ x] = Σ_{xᵢ ≤ x} P(X = xᵢ)

Example : For a continuous random variable x, the probability density function is defined by

    f(x) = x²/9,  0 < x ≤ 3
         = 0,     otherwise

Properties : In p.d.f.
i) f(x) ≥ 0 for all x
ii) ∫ f(x) dx = 1, the integral taken over (−∞, ∞)
iii) P(a < x ≤ b) = ∫ₐᵇ f(x) dx
iv) Mean or expected value : E(x) = ∫ x f(x) dx over (−∞, ∞)
v) Variance = V(x) = σ² = ∫ (x − μ)² f(x) dx = E[x²] − [E(x)]²
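The p.m.f. and p.d.f. properties above can be checked numerically; this is a minimal sketch using the two-coin distribution from the example and the stated density f(x) = x²/9 on (0, 3].

```python
from math import isclose

# Discrete p.m.f.: X = number of heads in two fair coin tosses
xs = [0, 1, 2]
ps = [0.25, 0.50, 0.25]

assert isclose(sum(ps), 1.0)                  # property (ii)
mean = sum(p * x for x, p in zip(xs, ps))     # E(X) = sum of p_i * x_i
var = sum(p * x**2 for x, p in zip(xs, ps)) - mean**2  # E(X^2) - [E(X)]^2
print(mean, var)  # 1.0 0.5

# Continuous p.d.f. f(x) = x^2/9 on (0, 3]: check total probability = 1
n = 100_000
dx = 3 / n
total = sum(((i + 0.5) * dx) ** 2 / 9 * dx for i in range(n))  # midpoint rule
print(round(total, 6))  # ~1.0
```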
ii) P[X < 2] = P[X = 0] + P[X = 1] = 0.1 + k = 0.1 + 0.15 = 0.25

iii) P[X ≥ 3] = P[X = 3] + P[X = 4] = 2k + k = 3k = 3(0.15) = 0.45

iv) P[1 ≤ X ≤ 3] = P[X = 1] + P[X = 2] + P[X = 3] = k + 2k + 2k = 5k = 5(0.15) = 0.75

Solution : Given random variable X = −2, −1, 0, 1, 2.
By the definition of a probability mass function,
P(X = −2) + P(X = −1) + P(X = 0) + P(X = 1) + P(X = 2) = 1    …(I)
Given, P(X = −2) = P(X = −1) = P(X = 1) = P(X = 2) = k (say)
Also given P(X < 0) = P(X = 0)
∴ P(X = 0) = P(X = −1) + P(X = −2) = 2k    …(II)
From (I) and (II),
k + k + 2k + k + k = 1
∴ 6k = 1
∴ k = 1/6
Hence the probability distribution (p.m.f.) is determined.

Solution : i) From the probability distribution,
P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4) + P(X = 5) + P(X = 6) + P(X = 7) = 1
∴ 7k + 8k² = 1
∴ 8k² + 7k − 1 = 0
∴ 8k² + 8k − k − 1 = 0
∴ (k + 1)(8k − 1) = 0
∴ k = −1 or k = 1/8
Since P(X = 1) = k and k > 0, the value of k = 1/8.

ii) P(X > 5) = P(X = 6) + P(X = 7) = 2k² + 4k² = 6k² = 6(1/8)² = 3/32

iii) P(0 ≤ X ≤ 5) = P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4) + P(X = 5)

Solution : Since f(x) ≥ 0 for all x …
Solution : Given p.d.f.

    f(x) = kx(2 − x),  0 < x < 2
         = 0,          otherwise

i) Since f(x) ≥ 0 for all x and ∫₀² f(x) dx = 1,
∫₀² kx(2 − x) dx = k[x² − x³/3]₀² = k(4 − 8/3) = 4k/3 = 1
∴ k = 3/4

ii) E(x) = ∫₀² x f(x) dx = ∫₀² (3/4) x²(2 − x) dx = (3/4)[2x³/3 − x⁴/4]₀² = (3/4)(16/3 − 4) = 1

iii) Variance = V(x) = E(x²) − [E(x)]²
E(x²) = ∫₀² (3/4) x³(2 − x) dx = (3/4)[x⁴/2 − x⁵/5]₀² = (3/4)(8 − 32/5) = 6/5
∴ V(x) = 6/5 − 1 = 1/5
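The constants just derived for f(x) = kx(2 − x) can be verified numerically; this sketch integrates the density with the midpoint rule and checks k = 3/4, E(x) = 1 and V(x) = 1/5.

```python
# Numerical check of f(x) = (3/4) * x * (2 - x) on (0, 2):
# total probability, mean and variance via the midpoint rule.
n = 200_000
dx = 2 / n

total = mean = ex2 = 0.0
for i in range(n):
    x = (i + 0.5) * dx
    f = 0.75 * x * (2 - x)
    total += f * dx       # integral of f           -> 1
    mean += x * f * dx    # integral of x*f         -> E(x)
    ex2 += x * x * f * dx # integral of x^2 * f     -> E(x^2)

var = ex2 - mean**2
print(round(total, 4), round(mean, 4), round(var, 4))  # 1.0 1.0 0.2
```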
2) Poisson distribution
3) Geometric distribution

P(2 < x < 5) = ∫₂⁵ x e⁻ˣ dx
             = [−x e⁻ˣ − e⁻ˣ]₂⁵
             = (−5e⁻⁵ − e⁻⁵) − (−2e⁻² − e⁻²)
             = 3e⁻² − 6e⁻⁵

Mean of the Binomial distribution : μ = E(X) = Σ_{r=0}^{n} r · P(X = r) = np

Variance of the Binomial distribution is denoted by V and is defined as
V = σ² = Σ_{r=0}^{n} r² P(r) − μ² = npq

• Frequency distributions can be classified as,
1) Observed frequency distributions
• Depending on the values of the two parameters n and p, the Binomial distribution may be uni-modal or bi-modal.
• To find the mode of the Binomial distribution, first we have to find the value of (n + 1)p.
1) If (n + 1)p is a non-integer then the distribution is uni-modal and mode = the largest integer contained in (n + 1)p.
2) If (n + 1)p is an integer then the distribution is bi-modal and the modes are (n + 1)p and (n + 1)p − 1.

Additive Property of Binomial Distribution

• Let X and Y be two independent Binomial variables such that X follows a Binomial distribution with (n₁, p) and Y follows a Binomial distribution with (n₂, p); then X + Y follows a Binomial distribution with (n₁ + n₂, p).

Characteristic Function of Binomial Distribution

The characteristic function of the Binomial distribution is denoted by φ_X(t) and is defined as
φ_X(t) = E[e^{itX}] = (q + pe^{it})ⁿ = (1 − p + pe^{it})ⁿ

Cumulants of Binomial Distribution

The moment generating function of the Binomial distribution is denoted by M_X(t) and is defined as
M_X(t) = E[e^{tX}] = (q + pe^t)ⁿ = (1 − p + pe^t)ⁿ
The cumulant generating function of the Binomial variable X is k_X(t) = log_e (q + pe^t)ⁿ = n log_e (q + pe^t)

P(r ≤ 1) = P(r = 0) + P(r = 1) = ⁹C₀ p⁰ q⁹ + ⁹C₁ p q⁸ = 0.00097

Solution : Given : n = 10
Let p be the probability of a defective transistor
p = 20/100 = 0.2
∴ q = 1 − p = 0.8
Since P(r defectives) = ¹⁰C_r p^r q^{10−r} = ¹⁰C_r (0.2)^r (0.8)^{10−r}
(i) P(all defective transistors) = P(r = 10) = ¹⁰C₁₀ (0.2)¹⁰ (0.8)⁰ ≈ 0.0000001
(ii) P(all are non-defective) = P(r = 0) = ¹⁰C₀ (0.2)⁰ (0.8)¹⁰ = 0.1074
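The mode rule and the transistor computations above can be checked with a short Binomial p.m.f. sketch; n = 10 and p = 0.2 are taken from the worked example.

```python
from math import comb

def binom_pmf(n, p, r):
    """Binomial probability P(X = r) = nCr * p^r * q^(n-r)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)

n, p = 10, 0.2
# Mode rule: (n+1)p = 2.2 is a non-integer, so the mode is floor(2.2) = 2
mode_rule = int((n + 1) * p)
mode_pmf = max(range(n + 1), key=lambda r: binom_pmf(n, p, r))
print(mode_rule, mode_pmf)  # 2 2

# Transistor example: P(all non-defective) = (0.8)^10
print(round(binom_pmf(n, p, 0), 4))  # 0.1074
```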
P(r = 6) = ¹⁰C₆ (0.5)⁶ (0.5)⁴ = (10!/(6!·4!)) (0.5)¹⁰ = ((10·9·8·7)/(4·3·2·1)) (0.5)¹⁰ = 210 (0.5)¹⁰ = 0.205

(ii) P(getting at least 6 heads)
= P(getting 6 or 7 or 8 or 9 or 10 heads)
= P(r = 6) + P(r = 7) + P(r = 8) + P(r = 9) + P(r = 10)
= ¹⁰C₆(0.5)⁶(0.5)⁴ + ¹⁰C₇(0.5)⁷(0.5)³ + ¹⁰C₈(0.5)⁸(0.5)² + ¹⁰C₉(0.5)⁹(0.5) + ¹⁰C₁₀(0.5)¹⁰
= (0.5)¹⁰ (¹⁰C₆ + ¹⁰C₇ + ¹⁰C₈ + ¹⁰C₉ + ¹⁰C₁₀)
= (0.5)¹⁰ (210 + 120 + 45 + 10 + 1)
= (0.5)¹⁰ (386)
= 0.3769

Since the theoretical frequencies are given by
N · P(r) = 128 (1/2 + 1/2)⁷ = 128 [⁷C₀(1/2)⁷ + ⁷C₁(1/2)⁷ + ⁷C₂(1/2)⁷ + ⁷C₃(1/2)⁷ + ⁷C₄(1/2)⁷ + ⁷C₅(1/2)⁷ + ⁷C₆(1/2)⁷ + ⁷C₇(1/2)⁷]
the expected frequencies are 1, 7, 21, 35, 35, 21, 7, 1.

Probability of getting 5 or 6 = 1/6 + 1/6 = 1/3
Since (n + 1)p = 12.5 is a non-integer, the given binomial distribution is uni-modal
and its mode = the largest integer contained in 12.5
∴ Mode = 12
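The coin-toss probabilities and the 7-coin expected frequencies above follow directly from binomial coefficients:

```python
from math import comb

# Exactly 6 heads and at least 6 heads in 10 tosses of a fair coin
p_exact_6 = comb(10, 6) * 0.5**10
p_at_least_6 = sum(comb(10, r) for r in range(6, 11)) * 0.5**10
print(round(p_exact_6, 3), round(p_at_least_6, 4))  # 0.205 0.377

# Expected frequencies for 7 coins tossed 128 times: 128 * 7Cr * (1/2)^7
freqs = [128 * comb(7, r) // 2**7 for r in range(8)]
print(freqs)  # [1, 7, 21, 35, 35, 21, 7, 1]
```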
Poisson Distribution

• The Poisson distribution is used when the probability p of success is very small and n, the number of trials, is very large, i.e. p → 0, n → ∞, such that np = m remains finite.
• In some cases where an experiment results in two outcomes, success or failure, only the number of successes can be observed and not the number of failures; e.g. we can observe how many persons die of corona (COVID-19) but we cannot observe how many do not die of corona. In such cases the Binomial distribution cannot be used.
• A random variable X is said to follow the Poisson distribution if the probability of X is given by,
P(x) = e⁻ᵐ mˣ / x!,  x = 0, 1, 2, …
• The Poisson process is one of the most widely used counting processes in probability. It is the limiting case of the Binomial distribution such that n → ∞, p → 0 and np = m = finite.
• The Poisson distribution is a uniparametric distribution as it is characterised by only one parameter m.

Properties :
1) Mean of the Poisson distribution is μ = E(X) = m
2) Variance of the Poisson distribution is V = σ² = m
3) The Poisson distribution may be uni-modal or bi-modal depending upon the value of the parameter m :
(i) If m is a non-integer, then the distribution is uni-modal and mode = the largest integer contained in m.
(ii) If m is an integer, then the distribution is bi-modal and the modes are m and m − 1.
4) Additive property of the Poisson distribution : Let X and Y be two independent Poisson variables such that X follows a Poisson distribution having parameter m₁ and Y follows a Poisson distribution having parameter m₂; then X + Y is also a Poisson variable with parameter m₁ + m₂.

Solution : For a Poisson distribution with m = 2,
P(x) = e⁻² 2ˣ / x!

i) P(exactly 3 will suffer a bad reaction)
P(x = 3) = e⁻² 2³ / 3! = 0.1804

ii) P(more than 1 will suffer a bad reaction)
= 1 − [P(x = 0) + P(x = 1)]
= 1 − [0.1353 + 0.2707]
= 1 − 0.4060
= 0.5940

Solution : m = np = 50 × 0.2 = 10
Now the theoretical frequency = N · P(x). Putting x = 0, 1, 2, 3, 4 we get the theoretical frequencies as 109, 66, 20, 4, 1.
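The bad-reaction probabilities above can be reproduced from the Poisson p.m.f. with m = 2 as in the worked example:

```python
from math import exp, factorial

def poisson_pmf(m, x):
    """Poisson probability P(X = x) = e^(-m) * m^x / x!"""
    return exp(-m) * m**x / factorial(x)

m = 2  # parameter from the text's bad-reaction example
p3 = poisson_pmf(m, 3)
p_more_than_1 = 1 - (poisson_pmf(m, 0) + poisson_pmf(m, 1))
print(round(p3, 4), round(p_more_than_1, 4))  # 0.1804 0.594
```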
For a Poisson distribution with m = 10,
P(x ≥ 3) = 1 − [P(x = 0) + P(x = 1) + P(x = 2)]
= 1 − [0.000045 + 0.000454 + 0.002269]
= 1 − 0.002768
= 0.997232

Solution : The mean of the given frequency distribution is
m = (109×0 + 65×1 + 22×2 + 3×3 + 1×4)/200 = 122/200 = 0.61
For a Poisson distribution,
P(x) = e⁻⁰·⁶¹ (0.61)ˣ / x!,  x = 0, 1, 2, …
P(x = 0) = e⁻⁰·⁶¹ = 0.5434
P(x = 1) = e⁻⁰·⁶¹ (0.61) = 0.3315
P(x = 2) = e⁻⁰·⁶¹ (0.61)² / 2! = 0.1011

Solution : Given that m = 2.
For a Poisson distribution,
P(x) = e⁻ᵐ mˣ / x!,  x = 0, 1, 2, …
∴ P(a page from the book has at least one printing mistake) = P(x ≥ 1)
= 1 − P(x < 1)
= 1 − P(x = 0)
= 1 − e⁻²
= 1 − 0.1353
= 0.8647
Geometric Distribution

• In a series of Bernoulli trials (independent trials with constant probability p of success), let the random variable X denote the number of trials until the first success; then X is a geometric random variable with parameter p such that 0 < p < 1, and the probability mass function (p.m.f.) of X is
P(X = x) = q^{x−1} p,  where x = 1, 2, 3, … and q = 1 − p

Properties of Geometric Distribution

1) The mean of the geometric distribution is μ = E(X) = 1/p
2) The variance of the geometric distribution is V(X) = E(X²) − [E(X)]² = q/p²

Solution : Given : For a geometric distribution,
P(X = x) = q^{x−1} p, where q = 1 − p
Here p = 0.4, ∴ q = 1 − 0.4 = 0.6
∴ P(X = 3) = (0.6)^{3−1} (0.4) = (0.6)² (0.4) = 0.144

Exercise

1) A random variable X has the following probability distribution : … Find : i) the value of k, ii) Mean, iii) Variance, iv) P(X < −2), v) P(X ≤ −2).
2) The probability mass function of a random variable is defined as P(X = 0) = 3C³, P(X = 1) = 4C − 10C² and P(X = 2) = 5C − 1, where C > 0; then find the value of C.
3) If the random variable X takes the values 1, 2, 3, 4 such that 2P(X = 1) = 3P(X = 2) = P(X = 3) = 5P(X = 4), then find the probability distribution and P(2 < X < 4).
4) Find the value of k if the probability density function is f(x) = kx e⁻ˣ, 0 < x < ∞.
5) Find the value of k if the probability density function is f(x) = k/(1 + x²), −∞ < x < ∞.
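The geometric p.m.f. value and mean above can be checked with a small simulation; the Monte-Carlo estimate of E(X) = 1/p is approximate by nature.

```python
from random import random, seed

p, q = 0.4, 0.6
# p.m.f. check: P(X = 3) = q^2 * p
print(round(q**2 * p, 3))  # 0.144

# Monte-Carlo check of the mean E(X) = 1/p = 2.5
seed(1)
def trials_until_success(p):
    n = 1
    while random() >= p:  # each trial fails with probability q
        n += 1
    return n

mean = sum(trials_until_success(p) for _ in range(100_000)) / 100_000
print(round(mean, 1))  # ~2.5
```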
Q.1 If the discrete random variable X has the following probability distribution

   x        | 1  | 2 | 3 | 4
   P(X = x) | 3k | k | k | 3k

then the value of k is ______.

Q.2 If the discrete random variable X has the following probability distribution

   x        | 0   | 1 | 2  | 3  | 4
   P(X = x) | 0.1 | k | 2k | 2k | k

then the value of k is ______.
[a] 0.15   [b] 0.25
[c] 0.10   [d] 0.35

Q.3 Three coins are tossed simultaneously; X is the number of heads. Then P(X = 2) is ______.

Q.4 If the discrete random variable X has the following probability distribution

   x        | 1    | 2   | 3    | 4
   P(X = x) | 1/10 | 1/5 | 3/10 | 2/5

then the value of P(2 ≤ X ≤ 4) = ______.

Q.5 The discrete random variable X has the following probability distribution

   x        | 0   | 1   | 2   | 3
   P(X = x) | 1/8 | 3/8 | 3/8 | 1/8

Then the value of P(X ≥ 2) is ______.
[a] 0.5    [b] 0.25
[c] 0.3    [d] 0.8

Q.6 If the discrete random variable X has the following probability distribution

   x        | 0    | 1   | 2    | 3    | 4    | 5
   P(X = x) | 0.15 | 0.1 | 0.15 | 0.30 | 0.15 | 0.15

then the value of P(X < 2) is ______.
[a] 0.15   [b] 0.25
[c] 0.4    [d] 0.45

Q.7 If the discrete random variable X has the following probability distribution

   x        | 0  | 1  | 2 | 3 | 4 | 5
   P(X = x) | 2k | 2k | k | … | … | …

then the value of k is ______.

Q.9 If the probability mass function of a discrete random variable X is P(X = x) = c/x³ for x = 1, 2, 3, then the value of c is ______.
[a] 216/251   [b] 294/776
[c] 216/294   [d] 294/557

Q.10 If the probability mass function of a discrete random variable X is P(X = x) = c/x³ for x = 1, 2, 3, then E(X) is ______.
[a] 565/297   [b] 343/557
[c] 294/251   [d] 297/294
Q.11 If the probability function of a discrete random variable X which assumes values x₁, x₂, x₃ is such that P(x₁) = 2P(x₂) = 3P(x₃), then P(x₁) = ______.

Q.13 If the p.d.f. of a continuous random variable X is defined by

   f(x) = k(1 − 2x),  0 < x < 1
        = 0,          otherwise

then the value of k is ______.

Q.14 If the probability density function of a continuous random variable X is defined by

   f(x) = kx²,  −3 < x < 3
        = 0,    otherwise

then the value of k is ______.
[a] 1/27   [b] 1/28

Q.15 The probability density function of a continuous random variable X is

   f(x) = 6x(1 − x),  0 ≤ x ≤ 1
        = 0,          otherwise

If P(X < k) = P(X > k), then the value of k is ______.

Q.16 If the probability density function of a continuous random variable X is

   f(x) = …,  −1 ≤ x ≤ 4
        = 0,  otherwise

then P(0 < x < 1) = ______.

Q.17 If f(x) = e⁻ˣ, 0 < x < ∞, then the value of P(0 < x < 2) is ______.
[a] 1 − e⁻²   [b] e⁻²
[c] e² − 1    [d] e⁻² − e

Q.19 The probability density function of X is f(x) = k/(1 + x²), −∞ < x < ∞; then the value of k = ______.

Q.20 Mean and variance are equal in ______.
[a] Binomial distribution   [b] Poisson distribution
Q.35 In a Poisson probability distribution, which of the following is correct ?
[a] Mean = Variance   [b] Mean > Variance
[c] Mean < Variance   [d] None of these

Q.36 The Poisson probability distribution is ______.
[a] uni-modal               [b] bi-modal
[c] uni-modal or bi-modal   [d] None of the above

Q.37 In a Poisson probability distribution, if m = np = … is an integer, then the mode of the distribution is ______.
[a] 1   [b] 3
[c] 2   [d] 4

Q.39 The mean and variance of the geometric distribution are ______.
[a] q/p and q/p²    [b] 1/p and q/p²
[c] q/p² and 1/p    [d] p and q/p²

Q.40 The probability mass function of a geometric distribution is P(X = x) = ______.
[a] pq         [b] q^{x−1} p
[c] pˣ q       [d] qˣ p

Answer Keys for Multiple Choice Questions :

Q.5 b    Q.7 d    Q.12 a    Q.17 a
Q.22 b   Q.27 a   Q.32 c    Q.37 c
Unit V

Inferential Statistics : Hypothesis

Syllabus
Statistical Inference - Testing of Hypothesis, Non-parametric Methods and Sequential Analysis : Introduction, Statistical Hypothesis (Simple and Composite), Test of a Statistical Hypothesis, Null Hypothesis, Alternative Hypothesis, Critical Region, Two Types of Errors, Level of Significance, Power of the Test.

Contents
5.1 Introduction
5.2 Statistical Hypothesis
5.3 Test of Statistical Hypothesis (Test of Significance)
5.4 Null and Alternate Hypothesis
5.7 Level of Significance
2) Composite hypothesis (Definition) : It is composed of many simple hypotheses and hence does not specify the population completely.
For example, if x₁, x₂, …, xₙ is a random sample of size n from a population having mean μ and variance σ², then the hypothesis
H₀ : μ = μ₀, σ² = σ₀²  is a simple hypothesis,
whereas
a) H₀ : μ = μ₀ (where σ² is unknown)
b) H₀ : σ² = σ₀² (where μ is unknown)
c) H₀ : μ < μ₀, σ² < σ₀²
are composite hypotheses.

• The null hypothesis is a very useful tool in testing the significance of difference. In other words, the null hypothesis states that there is no real difference between the sample and the population in the particular characteristic under consideration. Therefore it is also defined as the "hypothesis of no difference". The null hypothesis is denoted by H₀.

2) Alternate hypothesis :
The hypothesis that is different from the null hypothesis is called the alternate hypothesis.
For Example :
If we want to test the null hypothesis that "the average percentile of students in a college has specified mean 70", then we have
H₀ : μ = 70
and the alternate hypothesis, denoted by H₁ or Hₐ, is
H₁ : μ ≠ 70

Type I error
A Type I error happens when the null hypothesis (H₀) is rejected even though it is true.

Type II error
A Type II error happens when the null hypothesis (H₀) is accepted when it is false.

• The aim of hypothesis testing is to minimise both types of errors. But in practice it is not possible to reduce both errors simultaneously. The reduction of one type of error usually increases the other type of error. Therefore, a compromise must be found to limit the more serious error.

Critical Region

• Hypothesis testing involves determining a region such that the hypothesis will be rejected if the sample point falls in the region and accepted if the sample point lies outside the region. This region of rejection is called the "critical region". It is denoted by W, and the acceptance region is denoted by W′.
• The value of the test statistic which separates the critical region from the acceptance region is called the "critical value" or significance value.

Fig. : Acceptance and critical regions for (i) left tailed test and (ii) right tailed test.

Let Z_α be the critical value at level of significance α. For a two tailed test,
2P(Z > Z_α) = α,  i.e.  P(Z > Z_α) = P(Z < −Z_α) = α/2
i.e. the area of the critical region on each tail is the same and equal to α/2. Thus the total area of the critical region for a two tailed test is α.
Note :
1) The critical value Z_α for a one tailed test (left or right) at significance level α is equal to the critical value Z_α for a two tailed test at significance level 2α.

Power of Test

• The probability of a Type II error is denoted by β, i.e. β is the probability of accepting a wrong null hypothesis. Therefore 1 − β gives the probability of rejecting the null hypothesis when it is false.
• This probability (i.e. 1 − β) is called the power of the test.
• The power of the test helps to measure the goodness of the test (i.e. how well the test is working).
β = P[Accept H₀ | H₀ is false]
α = P[Reject H₀ | H₀ is true] = P[Type I error]

Solution : Given β = …
Then power of test = 1 − β
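The relationship between α, β and power can be illustrated numerically. The numbers below (H₀ : μ = 50 against the specific alternative μ = 52, with σ = 5 and n = 25) are illustrative assumptions, not taken from the text's example.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Hypothetical right-tailed z-test: H0: mu = 50 vs H1: mu = 52,
# sigma = 5, n = 25, alpha = 0.05 (critical z = 1.645).
se = 5 / sqrt(25)
crit = 50 + 1.645 * se               # reject H0 when x-bar exceeds this

alpha = 1 - phi((crit - 50) / se)    # P[Type I error] = 0.05 by construction
beta = phi((crit - 52) / se)         # P[Type II error] when mu = 52 is true
power = 1 - beta
print(round(alpha, 3), round(beta, 3), round(power, 3))
```

Lowering α pushes the critical value to the right, which raises β: the trade-off described in the text.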
General Procedure Followed in Testing of Statistical Hypothesis

Step V : Compare the test statistic value (Z) with Z_α (the critical value) at level of significance α.
Draw a conclusion about the rejection or acceptance of H₀.
Note : 1. If |Z| > Z_α, we reject H₀.
       2. If |Z| ≤ Z_α, we accept H₀.

H₁ : P ≠ 1/2
∴ The critical region is
W = {x | x ≤ 2}
α = P[Reject H₀ | H₀] = P[x ≤ 2 | P = 1/2]

5.10 Test of Significance

Depending on the sample size, we categorise testing of significance into two types :
1. Test of significance for large samples (sample size n > 30)
2. Test of significance for small samples (sample size n ≤ 30)

Test of Significance for Single Proportion

We use this test to check the significant difference between the proportion of the sample and the population.
Notations used :
X - number of successes in n independent trials
P - probability of success for each trial
Q = 1 − P = 1 − 0.5 = 0.5
E(X) = nP
V(X) = nPQ
E(p) = P
V(p) = PQ/n,  S.E.(p) = √(PQ/n)

Test statistic :
Z = (p − P)/√(PQ/n)
where q = 1 − p
Note : If the confidence level is not given, we can safely take 3σ limits.

Solution : Probability of getting head P = 1/2 = 0.5
H₀ : The coin is unbiased (i.e. P = 0.5)
H₁ : The coin is not unbiased (P ≠ 0.5)

Solution : Let H₀ : Die is unbiased, i.e. P = 1/3
Test statistic :
Z = (p − P)/√(PQ/n) = (0.36 − 0.33)/√((1/3 × 2/3)/9000)
|Z| < Z_α, where Z_α is the critical value at 5 % l.o.s.

Solution : Here, n = 600, X = 36
p = Proportion of defective bolts = 36/600 = 0.06
H₀ : 4 % of products are defective (i.e. P = 0.04)
H₁ : p ≠ 0.04
Test statistic :
|Z| = |p − P|/√(PQ/n) = |0.06 − 0.04|/√((0.04 × 0.96)/600) = 2.5 > 1.96 = Z_α (l.o.s. at 5 %)
Since |Z| > Z_α,
H₀ is rejected, i.e. defective products are more than 4 %.

Solution : Given, X = 40, n = 400
p = Proportion of bad apples = 40/400 = 0.1
∴ q = 1 − p = 0.9
Since the confidence limit is not given, we assume it is 95 %.
∴ Level of significance α = 5 %, Z_α = 1.96
Also the population proportion P is not provided.
∴ The limits are p ± Z_α √(pq/n).

Test of Significance for Difference of Two Proportions

Let p₁ and p₂ be the sample proportions in two large samples of sizes n₁ and n₂. Then to test the significance of the difference between p₁ and p₂ we use the test statistic
Z = (p₁ − p₂)/√(PQ(1/n₁ + 1/n₂))
where P = (p₁n₁ + p₂n₂)/(n₁ + n₂) and Q = 1 − P.

Solution : Given : n₁ = 500, n₂ = 100
p₁ = 16/500 = 0.032
p₂ = 3/100 = 0.03
P = (p₁n₁ + p₂n₂)/(n₁ + n₂) = ((0.032 × 500) + (0.03 × 100))/(500 + 100) = 19/600 ≈ 0.032
Q = 1 − P = 1 − 0.032 = 0.968
Since we have to test whether the machine has improved or not, we state the null hypothesis as
H₀ : Machine has not improved after overhauling, i.e. p₁ = p₂
H₁ : p₁ > p₂ (right tailed)
Under H₀, Z = (p₁ − p₂)/√(PQ(1/n₁ + 1/n₂)) = (0.032 − 0.03)/√(0.032 × 0.968 × (1/500 + 1/100))
= 0.104 < 1.645 = Z_α (α = 5 %)
Conclusion : As |Z| < Z_α (at 5 % l.o.s.), H₀ is accepted, i.e. the machine has not improved.
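The two proportion tests worked above can be written as short helper functions; the inputs are taken from the defective-bolts and machine-overhaul examples.

```python
from math import sqrt

def one_prop_z(x, n, P0):
    """Z statistic for a single-proportion test against H0: P = P0."""
    p = x / n
    return (p - P0) / sqrt(P0 * (1 - P0) / n)

# Defective-bolts example: n = 600, X = 36, H0: P = 0.04
z1 = one_prop_z(36, 600, 0.04)
print(round(z1, 2))  # 2.5 -> exceeds 1.96, reject H0 at 5 %

def two_prop_z(x1, n1, x2, n2):
    """Z statistic for a difference of two proportions (pooled P)."""
    p1, p2 = x1 / n1, x2 / n2
    P = (x1 + x2) / (n1 + n2)
    return (p1 - p2) / sqrt(P * (1 - P) * (1 / n1 + 1 / n2))

# Machine-overhaul example: 16/500 before vs 3/100 after
z2 = two_prop_z(16, 500, 3, 100)
print(round(z2, 3))  # ~0.104 -> below 1.645, H0 accepted
```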
Solution : Given : n₁ = 1000, n₂ = 1200
P = (p₁n₁ + p₂n₂)/(n₁ + n₂) = 1600/2200 = 8/11
Q = 1 − P = 3/11

Solution :
H₀ : The proportion of men and women in favour of the proposal is the same, i.e. p₁ = p₂
H₁ : p₁ ≠ p₂ (two tailed test)
P = 0.525, Q = 1 − P = 0.475
Under H₀, Z = (p₁ − p₂)/√(PQ(1/n₁ + 1/n₂)) = −1.29
|Z| = 1.29 < 1.96 = Z_α
Conclusion : H₀ is accepted.

Test of Significance for Single Mean

Z = (x̄ − μ)/(σ/√n); if σ is unknown, S (the standard deviation of the sample) is used in its place.

H₀ : There is no significant difference between x̄ (sample mean) and μ (population mean).
|Z| = |(x̄ − μ)/(σ/√n)| = |(6.75 − 6.8)/(1.5/√400)| = |−0.67| = 0.67
As |Z| = 0.67 < Z_α = 1.96 at 5 % level of significance, H₀ is accepted.
There is no significant difference between the sample mean and the population mean.

Solution : Given data : n = 100, x̄ = 51, S = 6, μ = 50, σ unknown
H₀ : The sample is drawn from the population with mean μ = 50.
H₁ : μ ≠ 50
Under H₀ : |Z| = |x̄ − μ|/(S/√n) = |51 − 50|/(6/√100) = 10/6 = 1.666
Since |Z| = 1.666 < Z_α = 1.96,
H₀ is accepted.
Therefore the sample is drawn from the population with mean μ = 50.

Solution : Given data, n = 400, μ = 162.5 cms, x̄ = 160 cms, σ = 4.5 cms
Note :
σ : S.D. of population
x̄ : Sample mean
μ : Population mean
n : Sample size
H₀ : Sample is drawn from the population with mean μ = 162.5.
Here |Z| = |(x̄ − μ)/(σ/√n)| = |(160 − 162.5)/(4.5/√400)| = 11.11

Test of Significance for Two Means

Let x̄₁ be the mean of a sample of size n₁ from the population with mean μ₁ and variance σ₁². Let x̄₂ be the mean of an independent sample of size n₂ from another population with mean μ₂ and variance σ₂². The test statistic is given by
Z = (x̄₁ − x̄₂)/√(σ₁²/n₁ + σ₂²/n₂)
Case I : σ₁ = σ₂ = σ
Z = (x̄₁ − x̄₂)/(σ√(1/n₁ + 1/n₂))
We know σ² = (n₁s₁² + n₂s₂²)/(n₁ + n₂)

Solution : H₀ : There is no significant difference between the mean scores of boys and girls, i.e. x̄₁ = x̄₂.
H₁ : x̄₁ ≠ x̄₂
We need to calculate the sample variances :
s₁² = Σ(x₁ − x̄₁)²/n₁ = 784000/1000 = 784
s₂² = Σ(x₂ − x̄₂)²/n₂ = 2400000/1500 = 1600
Now we calculate the sample means :
x̄₁ = Σx₁/n₁ = 49000/1000 = 49
x̄₂ = Σx₂/n₂ = 70500/1500 = 47
The test statistic Z = (x̄₁ − x̄₂)/√(s₁²/n₁ + s₂²/n₂) = (49 − 47)/√(784/1000 + 1600/1500) = 2/1.36 = 1.47
As |Z| = 1.47 < 1.96 = Z_α at 5 % level of significance, H₀ is accepted, i.e. there is no significant difference between the sample means.
∴ There is no significant difference between the means of the intelligence of boys and girls.
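The boys/girls comparison above reduces to one line of arithmetic; the inputs are the sample sizes, means and variances computed in the example.

```python
from math import sqrt

# Two-sample Z statistic (large samples):
# Z = (x1bar - x2bar) / sqrt(s1^2/n1 + s2^2/n2)
n1, n2 = 1000, 1500
x1bar, x2bar = 49.0, 47.0
s1_sq, s2_sq = 784.0, 1600.0

z = (x1bar - x2bar) / sqrt(s1_sq / n1 + s2_sq / n2)
print(round(z, 2))  # 1.47 -> below 1.96, H0 accepted at 5 %
```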
Test of Significance of Small Samples

When the sample size is less than 30, the sample is called a small sample. The following two tests are used to test significance for small samples :
1) Student's t-test (t-test)   2) F-test (Snedecor's variance test)

Solution : H0 : There is no significant difference between the mean height of the newcomers and the mean height of the student population of the previous year,
i.e. H0 : μ = 1.60,  H1 : μ ≠ 1.60

   t = (x̄ − μ) / (S/√n) = (1.67 − 1.60) / (0.16/√16) = 1.75

Here, degrees of freedom = n − 1 = 16 − 1 = 15. From the t-table, |t| = 1.75 < t0.05,15 = 2.13
∴ H0 is accepted at 5 % l.o.s.
Therefore we conclude that the mean height of the newcomers is not significantly different from the mean height of the student population of the previous year.

Solution : (testing H0 : the average weight of the contents of a package is 1 kg, with n = 12)

   S² = Σ(x − x̄)²/(n − 1) = 0.011967/11 = 0.001088,  so S = 0.032983

   t = (x̄ − μ) / (S/√n) = (0.9883 − 1) / (0.032983/√12) = −1.228815

∴ |t| = 1.228815 < t0.05 = 2.201 with 11 d.f.
∴ H0 is accepted at 5 % level of significance.
∴ The average weight of the contents of a package can be taken as 1 kg.
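The small-sample t computations above follow one pattern; a minimal sketch (the critical value must still be read from a t-table for n − 1 degrees of freedom):

```python
from math import sqrt

def t_test_single_mean(x_bar, mu0, s, n, t_crit):
    # Small-sample t-test for H0: mu = mu0.  s is the sample standard
    # deviation with divisor (n - 1); degrees of freedom = n - 1.
    t = (x_bar - mu0) / (s / sqrt(n))
    return t, abs(t) < t_crit

# Package-weight example: n = 12, x-bar = 0.9883 kg, S = 0.032983,
# H0: mu = 1 kg, two-sided critical value t(0.05, 11 d.f.) = 2.201
t, accept = t_test_single_mean(0.9883, 1.0, 0.032983, 12, 2.201)
print(round(abs(t), 2), accept)   # 1.23 True -> H0 accepted
```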
   t = (x̄1 − x̄2) / (S √(1/n1 + 1/n2))

where

   S² = [Σ(x1 − x̄1)² + Σ(x2 − x̄2)²] / (n1 + n2 − 2)

with n1 + n2 − 2 degrees of freedom.

(Yields example) Degrees of freedom = n1 + n2 − 2 = 10 + 8 − 2 = 16.
For 16 degrees of freedom, the probability of getting a value of t as high as 2.79 is less than 0.05. Therefore the value of t at 5 % level of significance rejects the null hypothesis. There is a significant difference in the mean yields.

Note :
1) If the two sample standard deviations S1, S2 are given, then we have
   S² = (n1 S1² + n2 S2²) / (n1 + n2 − 2)
2) Let x11, x12, ..., x1n1 and x21, x22, ..., x2n2 be the values of two random samples from two different normal populations N(μ1, σ) and N(μ2, σ), having means μ1 and μ2 but the same standard deviation σ. Then the t-test statistic above applies, with n1 + n2 − 2 degrees of freedom.

Solution : H0 : μ1 = μ2,  H1 : μ1 ≠ μ2
Here x̄1 = 20, n1 = 10, S1 = 3.5 and x̄2 = 18.6, n2 = 14, S2 = 5.2

   S² = (n1 S1² + n2 S2²)/(n1 + n2 − 2) = (10 (3.5)² + 14 (5.2)²)/(10 + 14 − 2) = 22.775

∴ S = 4.772
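The pooled two-sample t-test above can be sketched as follows, using the book's pooled variance S² = (n1 S1² + n2 S2²)/(n1 + n2 − 2); the function name is ours:

```python
from math import sqrt

def t_test_two_means(x1, x2, s1, s2, n1, n2):
    # Pooled two-sample t-test; degrees of freedom = n1 + n2 - 2.
    s2_pooled = (n1 * s1**2 + n2 * s2**2) / (n1 + n2 - 2)
    t = (x1 - x2) / sqrt(s2_pooled * (1 / n1 + 1 / n2))
    return t, s2_pooled

# Example from the text: x1 = 20, x2 = 18.6, S1 = 3.5, S2 = 5.2
t, s2p = t_test_two_means(20, 18.6, 3.5, 5.2, 10, 14)
print(round(s2p, 3), round(t, 2))   # 22.775 0.71
```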
F-Test

Let X11, X12, ..., X1n1 and X21, X22, ..., X2n2 be two independent random samples from two normal populations N(μ1, σ1) and N(μ2, σ2) respectively. Then the statistic of the F-distribution is

   F = S1² / S2²,  where S1² > S2²

with

   S1² = Σ(X1i − x̄1)² / (n1 − 1)  and  S2² = Σ(X2j − x̄2)² / (n2 − 1)

(n1 and n2 are the sizes of the two samples with variances S1² and S2².)

Note :
1. The F-test tests the hypothesis that the population variances are equal, i.e. σ1² = σ2².
2. The degrees of freedom are (n1 − 1, n2 − 1).

Test result : If the calculated value of F exceeds F0.05 for (n1 − 1, n2 − 1) degrees of freedom from the F-table, we conclude that the ratio is significant at 5 % level.

Solution :

   S1² = Σ(x1 − x̄1)²/(n1 − 1) = 264/(13 − 1) = 22
   S2² = Σ(x2 − x̄2)²/(n2 − 1) = 144/(10 − 1) = 16

F test statistic : F = S1²/S2² = 22/16 = 1.375
Degrees of freedom = (n1 − 1, n2 − 1) = (13 − 1, 10 − 1) = (12, 9)
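The variance-ratio statistic can be computed mechanically, always placing the larger sample variance in the numerator so that F ≥ 1; a minimal sketch (the helper name is ours):

```python
def f_statistic(var1, var2):
    # Variance-ratio F statistic; the larger sample variance goes in
    # the numerator.  Degrees of freedom are (n - 1) for the numerator
    # sample and (n - 1) for the denominator sample.
    if var1 >= var2:
        return var1 / var2
    return var2 / var1

# Variances as in one of the worked examples: 15.4 and 8.81
F = f_statistic(15.4, 8.81)
print(round(F, 2))   # 1.75
```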
Remark : When readymade data is given instead of long tabular data, we use

   S1² = (Σx1² − n1 x̄1²) / (n1 − 1)  and  S2² = (Σx2² − n2 x̄2²) / (n2 − 1)
TECHNICAL PUBLICATIONS® - an up-thrust for knowledge
Statistics  5-30  Inferential Statistics : Hypothesis
For Sample I :

   S1² = (Σx1² − n1 x̄1²)/(n1 − 1) = (61.52 − 8 (1.2)²)/(8 − 1) = 7.14

For Sample II :

   S2² = (Σx2² − n2 x̄2²)/(n2 − 1) = (73.26 − 11 (1.5)²)/(11 − 1) = 4.85

∴ F test statistic : F = S1²/S2² = 7.14/4.85 = 1.47

Solution : H0 : σ1² = σ2² (the samples are drawn from the same population).
F test statistic : F = 1.057
Here the tabulated value of F at 5 % level of significance with degrees of freedom (n1 − 1, n2 − 1) = (7, 9) is 3.29.
Since F = 1.057 < F0.05 = 3.29, H0 is accepted.

Solution :

   S1² = n1 s1²/(n1 − 1) = 8.81
   S2² = n2 s2²/(n2 − 1) = 16 (3.8)²/(16 − 1) = 15.4

Test statistic : F = 15.4/8.81 = 1.75
Solution :
H0 : There is no significant difference between the population variances, i.e. σ1² = σ2².
H1 : There is a significant difference between the population variances.

Solution :
H0 : The stability of both workers is the same, i.e. σ1² = σ2².
H1 : Worker A is more stable than worker B.
n1 = 6, n2 = 8
For the sample : x̄ = Σx/n = 315/8 = 39.38
∴ H0 is accepted, i.e. the stability of both workers is the same.
Case Study :
1. Do dieters lose more fat than exercisers ?
- Diet only :
  Sample mean : 5.5 kg
  Sample standard deviation : 4.2 kg
  Sample size : n = 46
Testing a hypothesis
Step III : Decision
α = 0.05, i.e. 5 %
As |Z| = 1.56 < 1.96 = Zα, we accept H0, the null hypothesis, and conclude that there is no significant difference between the two methods.

Chi-square Test

When any experiment is performed, it is observed that theoretical considerations and observed data are not the same; there is always some difference between observed data and theoretical data. The χ² (chi-square) test helps to measure the magnitude of this discrepancy between observed and expected data. The chi-square test is the most commonly used test to check the goodness of fit of a theoretical distribution to an actual distribution.

The test statistic for the χ² test is given by

   χ² = Σ_{i=1}^{n} (Oi − Ei)² / Ei    ... (5.11.1)

where Oi (i = 1, 2, ..., n) are observed frequencies and Ei (i = 1, 2, ..., n) are expected frequencies, such that

   Σ Oi = Σ Ei = N (total frequency)

Simplifying the RHS of equation (5.11.1), we get

   χ² = Σ (Oi² − 2 Oi Ei + Ei²)/Ei = Σ Oi²/Ei − 2N + N = Σ Oi²/Ei − N    ... (5.11.2)

which is another form to calculate the χ² value.

Remarks :
i) The χ² value ranges from 0 to ∞.
ii) If χ² = 0, the observed and theoretical frequencies match exactly.
iii) If χ² > 0, there is a difference between observed and theoretical frequencies; the greater the discrepancy, the greater the χ² value.

Notes :
1) If expected frequencies are not provided, then they can be calculated by using the formula
   Expected frequency = (Row total × Column total) / Total frequency   [the formula is used for a contingency table]
2) The χ² test can be used as
   a) a test of goodness of fit
   b) a test of homogeneity of independent estimates of population variance
   c) a test of a hypothetical value of population variance
   d) a test of homogeneity of independent estimates of population correlation.
3) Once the expected frequency for one cell of a 2 × 2 contingency table is calculated, the remaining frequencies can also be calculated by using either (Row total − calculated frequency) or (Column total − calculated frequency).

Conditions for applying the χ² test :
1) The total frequency (N) should be large (N > 50).
2) Expected frequencies should be greater than or equal to five, i.e. Ei ≥ 5 ∀ i = 1, 2, ..., n, since the χ² value is over-estimated for small expected frequencies.
3) When Ei < 5 for some class i, that class is merged with a neighbouring class until the total expected frequency becomes greater than or equal to 5 (pooling of classes).
4) Note that the number of degrees of freedom is to be determined from the number of classes after pooling.
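Both forms of the χ² statistic, (5.11.1) and (5.11.2), can be verified to agree numerically; a minimal sketch with hypothetical die-throw frequencies (not taken from the text):

```python
def chi_square(observed, expected):
    # Goodness-of-fit statistic, equation (5.11.1):  sum of (O - E)^2 / E.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Short form, equation (5.11.2):  sum of O^2/E minus N,
    # valid because the observed and expected totals are both N.
    n = sum(observed)
    short = sum(o * o / e for o, e in zip(observed, expected)) - n
    return chi2, short

# Hypothetical data: 60 throws of a fair die, expected 10 per face
chi2, short = chi_square([8, 12, 9, 11, 10, 10], [10] * 6)
print(chi2, short)   # both equal 1.0
```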
Solution : Given proportions 8 : 2 : 2 : 1, so the total of the ratio is 8 + 2 + 2 + 1 = 13.
Total frequency = 524
Then the expected frequencies Ei are in the same proportion; for example, the last class has

   E = (1/13) × 524 ≈ 40

Solution : From the given data :
No. of coins tossed n = 5
No. of times the coins are tossed N = 210
∴ By the binomial distribution, the expected frequency for getting r heads is

   Expected frequency (Er) = N · P(X = r) = 210 · ⁿCr pʳ qⁿ⁻ʳ = 210 · ⁵Cr (1/2)⁵

Solution : From the given data, we observe that the expected frequencies in the first and last classes are less than 5. Hence we need to apply pooling of classes : we merge these classes with their neighbouring classes. Thus we get

   χ² = 38.392
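Expected frequencies from a hypothesised ratio, as in the 8 : 2 : 2 : 1 example above, can be computed as follows (the helper name is ours):

```python
def expected_from_ratio(ratio, total):
    # Expected frequencies when the categories are hypothesised to
    # occur in a fixed ratio (chi-square goodness of fit).
    s = sum(ratio)
    return [r * total / s for r in ratio]

# Ratio 8 : 2 : 2 : 1 with total frequency 524, as in the text
E = expected_from_ratio([8, 2, 2, 1], 524)
print([round(e, 1) for e in E])   # [322.5, 80.6, 80.6, 40.3]
```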
2. A psychologist feels that playing soft music during a test will change the result of the test. The psychologist is not sure whether the grades will be higher or lower. In the past, the mean of the scores was 74. State H0 and H1.
3. Let P be the probability of getting a head when a coin is tossed once. Suppose that the hypothesis H0 : P = 0.5 is rejected in favour of H1 : P = 0.6 if 7 or more heads are obtained in 10 trials. Calculate the probability of type I error and the power of the test.
4. In a Bernoulli distribution with parameter P, H0 : P = 1/2 against H1 : P = 2/3 is rejected if more than 3 heads are obtained out of 6 throws of a coin. Find the probabilities of type I and type II error. [Ans. : α = 0.3437, β = 0.3196]
5. An urn contains 6 marbles of which θ are white and the others black. In order to test the null hypothesis H0 : θ = 3 vs H1 : θ = 4, two marbles are drawn at random (without replacement) and H0 is rejected if both marbles are white; otherwise H0 is accepted. Find the probabilities of committing type I and type II error. [Ans. : α = 0.20, β = 0.60]
6. Let X1, X2, ..., Xn be a random sample from N(θ, 25). If for testing H0 : θ = 20 against H1 : θ = 26 the critical region W is defined by W = {x : x̄ > 23.266}, then find the size of the critical region and the power. [Ans. : 0.2389]
7. 325 women out of 600 women chosen from a city were found to be working in different sectors. Test the hypothesis that the majority of women in the city are working. [Ans. : H0 is rejected at 5 % l.o.s.]
8. A random sample of 500 bolts was taken from a large consignment and 65 were found to be defective. Find the percentage of defective bolts in the consignment. [Ans. : between 8.49 and 17.51]
9. A bag contains defective articles, the exact number of which is not known. A sample of 100 from the bag gives 10 defective articles. Find the limits for the proportion of defective articles in the bag. [Ans. : limits are (0.0412, 0.1588)]
10. In a city a sample of 1000 people was taken and out of them 540 are vegetarian and the rest are non-vegetarian. Can we say that both habits of eating are equally popular in the city at i) 1 % level of significance ii) 5 % level of significance ? [Ans. : H0 rejected at 5 % l.o.s., H0 accepted at 1 % l.o.s.]

Test of significance for difference of proportion

11. A manufacturing firm claims that its brand A product outsells its brand B product by 8 %. It is found that 42 out of a sample of 200 persons prefer brand A and 18 out of another sample of 100 persons prefer brand B. Test whether the 8 % difference is a valid claim. [Ans. : H0 accepted]
12. In a poll submitted to the student body at a university, 850 men and 560 women voted. 500 men and 320 women voted "yes". Does this indicate a significant difference of opinion between men and women on this matter at 1 % level of significance ? [Ans. : H0 accepted]
13. In town A there were 956 births of which 52.5 % were males, while in towns A and B combined this proportion in a total of 1406 births was 0.496. Is there any significant difference in the proportion of male births in the two towns ? [Ans. : H0 rejected]

Test of significance for single mean

14. A random sample of 900 members has a mean 3.4 cms. Can it be reasonably regarded as a sample from a large population of mean 3.2 cms and standard deviation 2.3 cms ? [Ans. : H0 accepted]
15. The mean weight obtained from a random sample of size 100 is 64 gms. The S.D. of the weight distribution of the population is 3 gms. Test the statement that the mean weight of the population is 67 gms at 5 % level of significance. [Ans. : H0 rejected]
16. Five measurements of the tar content of a certain kind of cigarette yielded 14.5, 14.2, 14.4, 14.3 and 14.6 mg/cigarette. Assuming that the data are a random sample from a normal population, show that at the 0.05 level of significance the null hypothesis μ = 14 must be rejected in favour of the alternative hypothesis μ ≠ 14. (Hint : calculate the S.D. of the sample.)
17. A sample of 1000 students from a university was taken and their average weight was found to be 112 pounds with S.D. of 26 pounds. Could the mean weight of students in the population be 120 pounds ? [Ans. : H0 rejected]
18. The average income of persons was Rs. 210 with a S.D. of Rs. 10 in a sample of 100 people of a city. For another sample of 150 persons the average income was Rs. 220 with S.D. of Rs. 12. The S.D. of incomes of the people of the city was Rs. 11. Test whether there is any significant difference between the average incomes of the localities. [Ans. : H0 rejected]
19. Intelligence tests on two groups of boys and girls gave the following results. Examine whether the difference is significant. [Ans. : H0 accepted]
20. A product developer is interested in reducing the drying time of a primer paint. Two formulations of the paint are tested; formulation 1 is the standard chemistry and formulation 2 has a new drying ingredient that should reduce the drying time. The standard deviation of drying time is 8 minutes, and this inherent variability should be unaffected by the addition of the new ingredient. 10 specimens are painted with formulation 1 and another 10 specimens are painted with formulation 2. The two sample average drying times are x̄1 = 121 min and x̄2 = 112 min respectively. What conclusion can the product developer draw about the effectiveness of the new ingredient ?
21. A study of the number of lunches that executives in the insurance and banking industries claim as deductible expenses per month was based on random samples and yielded the following results : n1 = 40, x̄1 = 9.1, S1 = 1.9; n2 = 50, x̄2 = 8.0, S2 = 2.1. Test the null hypothesis against the alternative hypothesis. [Ans. : H0 rejected]
22. To find out whether the inhabitants of two islands may be regarded as having the same racial ancestry, an anthropologist determines the cephalic indices of six adult males from each island, getting x̄1 = 77.4 and x̄2 = 72.2 with the corresponding standard deviations s1 = 3.3 and s2 = 2.1. Test whether the difference between the two sample means can reasonably be attributed to chance. [Ans. : H0 rejected]
23. A group of 10 boys was fed on diet A and another group of 8 boys was fed on a different diet B; the following increases in weight (kgs) were recorded.
    Diet A : 5  6  8  1  12  4  3  9  6  10
    Diet B : 10  1  2  7  8 …
24. A random sample of 10 boys had the I.Q.'s 70, 120, 110, 101, 88, 83, 95, 98, 107 and 100. Do these data support the assumption of a population mean I.Q. of 100 ? [Ans. : H0 accepted]
25. A sample of 18 items has a mean of 24 units and standard deviation of 3 units. Test the hypothesis that it is a random sample from a normal population with mean 27 units. [Ans. : H0 rejected]
26. The average numbers of articles produced by two machines per day are 200 and 250, with standard deviations 20 and 25 respectively, on the basis of records of 25 days' production. Can you regard both machines as equally efficient at 5 % level of significance ? [Ans. : H0 rejected]
27. For filling each bottle with 170 tablets on a particular machine, an automatic machine was installed. From the production a sample of 9 bottles was taken, and the numbers of tablets found in these 9 bottles are as follows : X : 168, 164, 166, 167, 168, 169, 170, 170, 171. Test whether the machine has been installed properly or not. [Ans. : H0 rejected]

F-Test

28. The following data show the runs scored by 2 batsmen. Can it be said that the performance of batsman A is more consistent than the performance of batsman B ? Use 1 % level of significance. [Ans. : H0 accepted]
29. Two random samples are drawn from 2 normal populations as follows. Test whether the samples are drawn from the same population. [Ans. : H0 accepted]
30. The two random samples reveal the following data.
31. Two independent samples of sizes 8 and 9 had the following values of the variables. Do the estimates of the population variance differ significantly ?
32. In a sample study, the following information regarding the demand for a particular spare part on various days was obtained. From the given data, at 5 % l.o.s. test that the demand for the number of spare parts does not depend on the day. [Ans. : χ² = 0.18029, accept H0]
33. The table below gives the number of accidents that occurred in a certain factory on the different days. Test whether the accidents are uniformly distributed over the different days. [Ans. : χ² = 5.25, accept H0]
34. Using the data given in the following contingency table, test whether the eye colour in the son is associated with the eye colour in the father.
37. The numbers of defects per unit in a sample of 330 units of a manufactured product were found as follows. Fit a Poisson distribution to the data and test for goodness of fit. [Ans. : χ² = 0.0292, accept H0] [Hint : apply pooling of classes]
38. A set of 5 coins is tossed 3200 times and the number of heads appearing each time is noted. The results are given below.

Multiple Choice Questions

Q.1 An assertion made about a population for testing is called .
  [a] level of significance   [b] confidence level   [c] power of test   [d] critical value
  [a] Sample size   [b] Power of test   [c] Critical value   [d] None of these
Q.7 A region where the null hypothesis gets rejected is called .
  [a] critical region   [b] acceptance region
Q.8 If the critical region lies on one side of the distribution, then the test is referred to as .
  [a] Zero tailed   [b] One tailed   [c] Two tailed   [d] Three tailed
Q.10 The type of test (one tailed / two tailed) depends on .
Q.16 The probability of Type I error is referred to as .
  [a] research hypothesis   [b] composite hypothesis   [c] null hypothesis   [d] simple hypothesis
Q.18 If a null hypothesis (H0) is accepted, then the value of the test statistic (z) lies in the .
Q.23 200 digits are chosen at random from a set of tables. The frequencies of the digits 0 to 9 are as follows. The expected frequencies are .
  [a] 20 and 10   [b] 21 and 9   [c] 20 and 9   [d] 15 and 8
Q.28 Among 64 offsprings of a certain cross between guinea pigs, 34 were red, 10 were black and 20 were white. According to the genetic model, these numbers should be in the ratio 9 : 3 : 4. The expected frequencies, in order, are .
Q.29 A sample analysis of examination results of 500 students was made. The observed frequencies are 220, 170, 90 and 20, and the numbers are in the ratio 4 : 3 : 2 : 1 for the various categories. Then the expected frequencies are .
Inferential Statistics :
Tests for Hypothesis

Syllabus
Steps in Solving Testing of Hypothesis Problem, Optimum Tests Under Different Situations, Most Powerful Test (MP Test), Uniformly Most Powerful Test, Likelihood Ratio Test, Properties of Likelihood Ratio Test, Test for the Mean of a Normal Population, Test for the Equality of Means of Two Normal Populations, Test for the Equality of Means of Several Normal Populations, Test for the Variance of a Normal Population, Test for Equality of Variances of Two Normal Populations, Non-parametric Methods, Advantages and Disadvantages of Non-parametric Methods.

Contents
6.1 Pre-requisites
6.2 Optimum Test Under Different Situations
6.3 Best Critical Region (BCR) / Most Powerful Critical Region / Most Powerful Test (MP)
6.4 Uniformly Most Powerful Critical Region or UMP Test
6.5 Likelihood Ratio Test (LRT)
6.1 Pre-requisites

Parameter space
The parameter space is the set of all values that a parameter of a population can take. It is denoted by Θ.
e.g. :
1) Consider the normal distribution N(μ, σ). Here the parameter μ (mean) can take any value from −∞ to ∞; therefore the parameter space for μ is
   Θ = {μ / −∞ < μ < ∞}
   And the parameter σ² (variance) can take only positive values; therefore the parameter space for σ² is
   Θ = {σ² / σ² > 0}
2) Consider a null hypothesis H0 : μ ≤ 4. Here μ can take any value from −∞ to 4 :
   Θ0 = {μ / −∞ < μ ≤ 4}
   Θ0 is the parameter space w.r.t. H0. Clearly Θ0 ⊂ Θ.

Likelihood function
It describes how likely a specific population is to fit the observed data.
Definition : Consider a population with p.d.f. f(X, θ1, θ2, ..., θk), where X : X1, X2, X3, ..., Xn is a random sample. Then the likelihood function of the given sample observations is
   L = Π_{i=1}^{n} f(xi, θ1, θ2, ..., θk)

6.2 Optimum Test Under Different Situations
In hypothesis testing, our main focus is to minimize the type II error (β), or to maximize the power (1 − β), under fixed α. This can be done by proper choice of the test statistic and critical region. Henceforth, the following methods help us to choose the best critical region :
1. Most powerful test (MP)
2. Uniformly most powerful test (UMP)

6.3 Best Critical Region (BCR) / Most Powerful Critical Region / Most Powerful Test (MP)
In testing a simple null hypothesis (H0) vs a simple alternative hypothesis (H1), the critical region which has maximum power (1 − β) among the set of all critical regions belonging to the sample space gives the most powerful test. The region is called the most powerful region.
While testing the simple null hypothesis H0 : θ = θ0 vs H1 : θ = θ1, the critical region W is said to be the most powerful critical region of size α if
   P(x ∈ W / H0) = ∫_W L0 dx = α
and
   P(x ∈ W / H1) ≥ P(x ∈ W1 / H1)
where W1 is any other critical region of size α.

6.4 Uniformly Most Powerful Critical Region or UMP Test
In testing a simple or composite null hypothesis vs a composite alternative hypothesis, the critical region which has maximum power among the set of all critical regions belonging to the sample space is called the UMPCR or UMP test.
A UMP region may or may not exist. However, for H0 : θ = θ0 vs H1 : θ > θ0 or H1 : θ < θ0, a UMP test exists.
Note : The BCR or UMPCR is found out with the help of the Neyman-Pearson lemma.

Neyman-Pearson Fundamental Lemma (N-P Lemma)
Statement : Let k > 0 and W be a critical region of size α such that
   W = {x ∈ S : L1/L0 > k}    ... (6.4.1)
and
   W̄ = {x ∈ S : L1/L0 ≤ k}    ... (6.4.2)
where L0 and L1 are the likelihood functions of the sample under H0 and H1 respectively. Then W is the most powerful critical region of size α for testing H0 : θ = θ0 vs H1 : θ = θ1.
1. Most powerful test (MP)
<k © (nw
Fl [1
Px € W/H,) = 1-B «JL dx ... (6.4.4) 0 .
= L,< kL,
Let, W, be any other CR of size &, < &.
ſues, Shore [Lies
P(x © W,/H,) = «= ) To dx ... (6.4.5) B B. A |
1
ſt, a < {Li a
P(x . W/H,) = 1-6, . = Ww,J Ly a . .
... (6.4.6) B . * A |
Tues | ia
Consider ,
;
|
W
.
W,
S
Buc ~ AUC
aS a
from (6.4.5)
and (6.4.6) Ww ſua.
1 «J ſie
! | - 16 aud.6)
From (6.4 (6.4.4)1
> L, >kL,
_ ſua, [Load
W Ww
fx, dx >k- ſL&,. J to ax
A A B
Solution : Given :
1] Power of the test 1 − β = 0.3
2] β = P [Accept H0 / H1] = P (X ≤ 1.7 / H1) = 0.7
3] There are two critical regions for which α ≤ 0.2 :
   i) W1 = {X = 2} : power of the test = P [Reject H0 / H1] = P [X = 2 / H1] = 0.3
   ii) W2 = {X = 1} : power of the test = P [Reject H0 / H1] = P [X = 1 / H1] = 0.4
4] W2 is more powerful than W1.
5] Yes, the critical region {X = 1} is more powerful than {X = 2}.

Solution : For the uniform distribution f(x, θ) = 1/θ, 0 < x < θ, with H0 : θ = 1 and H1 : θ = 2 :
i) C.R. W = {x : x > 0.5}
   α = size of the type I error = P {x > 0.5 / θ = 1} = P {0.5 < x < 1 / θ = 1} = ∫_{0.5}^{1} 1 dx = 0.5
   β = size of the type II error = P {x ≤ 0.5 / θ = 2} = ∫_{0}^{0.5} (1/2) dx = 0.25
   Therefore the sizes of the type I and type II errors are α = 0.5 and β = 0.25 respectively.
ii) C.R. W = {x : 1 ≤ x ≤ 1.5}
   α = P {x ∈ W / θ = 1} = ∫_{1}^{1.5} f(x, θ)|_{θ=1} dx = 0, since under H0 : θ = 1 we have f(x, θ) = 0 for 1 < x < 1.5.
   β = P {x ∈ W̄ / θ = 2} = 1 − ∫_{1}^{1.5} (1/2) dx = 0.75

Solution : For f(x, θ) = (1/θ) e^{−x/θ}, x > 0, with W = {x : x ≥ 1}, W̄ = {x : x < 1} and H0 : θ = 2, H1 : θ = 1 :
   α = size of the type I error = P [x ∈ W / H0] = P [x ≥ 1 / θ = 2] = ∫_{1}^{∞} (1/2) e^{−x/2} dx = e^{−1/2} = 1/√e
   β = size of the type II error = P [x ∈ W̄ / H1] = P [x < 1 / θ = 1] = ∫_{0}^{1} e^{−x} dx = 1 − 1/e

Example 6.4.5
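The error sizes in the uniform example can be verified numerically; a minimal sketch (the `uniform_cdf` helper is ours, not from the text):

```python
def uniform_cdf(x, theta):
    # CDF of the Uniform(0, theta) distribution.
    return min(max(x / theta, 0.0), 1.0)

# Critical region W = {x : x > 0.5} for H0 : theta = 1 vs H1 : theta = 2
alpha = 1 - uniform_cdf(0.5, 1)   # type I error,  P(x > 0.5 | theta = 1)
beta = uniform_cdf(0.5, 2)        # type II error, P(x <= 0.5 | theta = 2)
print(alpha, beta)   # 0.5 0.25
```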
Solution : The p.d.f. of the normal distribution with mean μ and variance σ² is

   f(x, μ) = (2πσ²)^{−1/2} exp{ −(x − μ)² / (2σ²) }
The likelihood function (with n = 16 and σ² = 16) is

   L(μ) = (32π)^{−8} exp{ −(1/32) Σ_{i=1}^{16} (xi − μ)² }

so that, for μ1 > 10, the ratio of likelihood functions is

   L(μ1)/L(10) = exp{ (1/32) [ 2 (μ1 − 10) Σ xi − 16 (μ1² − 10²) ] }

Taking natural log on both sides, the N-P condition L(μ1)/L(10) > k reduces to

   2 (μ1 − 10) Σ xi − 16 (μ1² − 10²) ≥ 32 ln k
   ⟹ Σ xi ≥ [32 ln k + 16 (μ1² − 10²)] / [2 (μ1 − 10)],  since μ1 > 10

Therefore the BCR of size α for H0 : μ = 10 vs H1 : μ > 10 is given by

   C.R. = {(x1, x2, ..., xn) : x̄ ≥ k'},  where k' is chosen so that α = P(x̄ > k') when μ = 10.

For f(x; β, γ) = β exp{ −β (x − γ) }, x > γ, the likelihood function is

   L(β) = βⁿ exp{ −β Σ_{i=1}^{n} (xi − γ) }

and the ratio of likelihood functions for β0 vs β1 is

   L(β1)/L(β0) = (β1/β0)ⁿ exp{ −(β1 − β0) Σ (xi − γ) } ≥ k  (say)

Taking logs,

   n ln(β1/β0) − (β1 − β0) Σ (xi − γ) ≥ ln k
   ⟹ Σ xi ≤ [n ln(β1/β0) − ln k] / (β1 − β0) + nγ,  when β1 > β0

so the BCR is of the form {x : Σ xi ≤ k'}.
Example 6.4.7
By the N-P lemma, the ratio of likelihood functions L(θ1)/L(θ0) > k involves terms of the form Σ ln(xi + θ0), which cannot be put in the form of a function of the sample observations not depending on the hypotheses. Hence no BCR exists in this case.

6.5 Likelihood Ratio Test (L.R.T.)
The N-P lemma is applicable only if both H0 and H1 are simple hypotheses. But if either H0 or H1 or both are composite, then no such theorem is available. In such cases we can employ the likelihood ratio test (LRT) to find the BCR and the power of the test.
Statement :
Let Θ be a parameter space, and Θ0 and Θ1 the parameter spaces w.r.t. the null and alternative hypotheses respectively, such that Θ = Θ0 ∪ Θ1. Consider a random variable X with p.d.f. f(X, θ), where θ ∈ Θ. The LRT statistic is
   λ(x) = max_{θ ∈ Θ0} L(x, θ) / max_{θ ∈ Θ} L(x, θ)
where L(x, θ) is the likelihood function. Then the LRT says :
   Reject H0 if λ < C
where C is a constant chosen such that
   P [λ < C / H0] = α
i.e. C is chosen s.t. the probability of type I error is of size α.
Remarks :
i) 0 ≤ λ ≤ 1.
ii) The smaller the value of λ, the greater the chance of rejecting H0.
iii) If H0 and H1 are both simple hypotheses, then the L.R.T. is equivalent to the N-P lemma.
iv) If either one or both of the hypotheses (H0 and H1) are composite, then the LRT statistic λ is a function of every sufficient statistic for θ.

Properties of Likelihood Ratio Test
1) If the null and alternative hypotheses are both simple, then the LRT leads to the N-P lemma.
2) If a UMP test exists at all, then the LRT is generally UMP. In this case the following asymptotic properties of the LRT are satisfied :
   a) Under particular conditions, −2 log_e λ has an asymptotic chi-square distribution.
   b) Under certain assumptions, the LRT is consistent.
Test for the Mean of a Normal Population

Let x1, x2, ..., xn be a random sample from N(μ, σ²) with both μ and σ² unknown. To test H0 : μ = μ0 against H1 : μ ≠ μ0, the parameter spaces are
   Θ = {(μ, σ) : −∞ < μ < ∞, σ > 0}  and  Θ0 = {(μ, σ) : μ = μ0, σ > 0}
The likelihood function of the sample observations x1, x2, ..., xn is

   L = [1/(2πσ²)]^{n/2} exp{ −Σ (xi − μ)² / (2σ²) }    ... (6.7.1)

In Θ the MLEs are

   μ̂ = x̄  and  σ̂² = (1/n) Σ (xi − x̄)² = s²    ... (6.7.2)

Substituting equation (6.7.2) in equation (6.7.1), we get

   L(Θ̂) = [1/(2πs²)]^{n/2} e^{−n/2}    ... (6.7.3)

In Θ0, μ = μ0 and the MLE of σ² is

   σ̂0² = (1/n) Σ (xi − μ0)² = s² + (x̄ − μ0)²    ... (6.7.4)

Substituting equation (6.7.4) in equation (6.7.1), we get

   L(Θ̂0) = [1/(2π (s² + (x̄ − μ0)²))]^{n/2} e^{−n/2}    ... (6.7.5)

Thus the LRT statistic is

   λ = L(Θ̂0)/L(Θ̂) = [s² / (s² + (x̄ − μ0)²)]^{n/2} = [1 + (x̄ − μ0)²/s²]^{−n/2}    ... (6.7.6)

Now define

   t = (x̄ − μ0) / (S/√n),  where S² = Σ (xi − x̄)²/(n − 1) = n s²/(n − 1)

so that (x̄ − μ0)²/s² = t²/(n − 1) and

   λ = [1 + t²/(n − 1)]^{−n/2}    ... (6.7.7)

λ is a monotonically decreasing function of t², so the critical region 0 < λ < λ0 may be well defined by

   |t| ≥ K  (say)

where the constant K is derived such that the size of the critical region is α, i.e.

   P (|t| ≥ K | H0) = α

Since, under H0, the statistic t follows Student's t-distribution with (n − 1) degrees of freedom,

   K = t_{n−1}(α/2)

and we accept H0 if

   |t| < t_{n−1}(α/2)

Remark :
1) Consider testing H0 : μ = μ0, 0 < σ² < ∞ against H1 : μ > μ0, 0 < σ² < ∞. The parameter space is
   Θ = {(μ, σ²) : −∞ < μ < ∞, σ² > 0}
Following the same procedure (as in the case of testing H0 : μ = μ0 v/s H1 : μ ≠ μ0), we get the critical region, viz. 0 < λ < λ0, equivalent to
   t = √n (x̄ − μ0)/S ≥ k
Here t is a right-tailed t-test and k is calculated as P(t > k) = α, i.e.
   k = t_{n−1}(α)
Similarly, to test H0 : μ = μ0, σ > 0 against H1 : μ < μ0, σ > 0, the critical region is given by
   t ≤ −t_{n−1}(α)

Test for Equality of Means of Two Normal Populations

Let X1 and X2 be two independent random variables from the normal populations N(μ1, σ1²) and N(μ2, σ2²), with sample sizes m and n respectively, where μ1, μ2, σ1 and σ2 are all unknown. Consider the problem of testing the hypothesis
   H0 : μ1 = μ2 = μ ;  σ1, σ2 > 0
against the alternative hypothesis
   H1 : μ1 ≠ μ2 ;  σ1, σ2 > 0

Case I : σ1² ≠ σ2² (population variances are not equal)
The likelihood function is given by

   L = [1/(2πσ1²)]^{m/2} [1/(2πσ2²)]^{n/2} exp{ −Σ_{i=1}^{m} (x1i − μ1)²/(2σ1²) − Σ_{j=1}^{n} (x2j − μ2)²/(2σ2²) }    ... (6.8.1)
Now, in the parameter space defined under H0 we have μ1 = μ2 = μ, and the MLE of μ is obtained as the root of the cubic equation

   m (x̄1 − μ)/σ̂1² + n (x̄2 − μ)/σ̂2² = 0,  with σ̂1² = (1/m) Σ (x1i − μ)²,  σ̂2² = (1/n) Σ (x2j − μ)²    ... (6.8.5)

which is a complicated function of the sample observations. Consequently the LRT statistic also becomes a complex function of the sample observations and its distribution becomes tedious. Therefore the critical region for a given l.o.s. α cannot be obtained.

Case II : The population variances σ1² and σ2² are equal, i.e. σ1² = σ2² = σ² (say)
In Θ the MLEs are μ̂1 = x̄1, μ̂2 = x̄2 and

   σ̂² = (m s1² + n s2²)/(m + n),  where s1² = (1/m) Σ (x1i − x̄1)²,  s2² = (1/n) Σ (x2j − x̄2)²

so that

   L(Θ̂) = [(m + n)/(2π (m s1² + n s2²))]^{(m+n)/2} e^{−(m+n)/2}    ... (6.8.8)

Under H0 the MLEs of μ and σ² are obtained from ∂ log L/∂μ = 0 :

   Σ (x1i − μ) + Σ (x2j − μ) = 0  ⟹  m x̄1 + n x̄2 = (m + n) μ
   ⟹  μ̂ = (m x̄1 + n x̄2)/(m + n)    ... (6.8.9)

Similarly, from ∂ log L/∂σ² = 0,

   σ̂0² = [Σ (x1i − μ̂)² + Σ (x2j − μ̂)²]/(m + n)    ... (6.8.10)

Now

   Σ (x1i − μ̂)² = Σ (x1i − x̄1)² + m (x̄1 − μ̂)² = m s1² + m [n (x̄1 − x̄2)/(m + n)]²
   [since x̄1 − μ̂ = x̄1 − (m x̄1 + n x̄2)/(m + n) = n (x̄1 − x̄2)/(m + n)]

and similarly

   Σ (x2j − μ̂)² = n s2² + n [m (x̄1 − x̄2)/(m + n)]²

Substituting these values in equation (6.8.10), we get

   σ̂0² = [m s1² + n s2² + mn (x̄1 − x̄2)²/(m + n)] / (m + n)    ... (6.8.11)

Thus

   L(Θ̂0) = [(m + n)/(2π (m s1² + n s2² + mn (x̄1 − x̄2)²/(m + n)))]^{(m+n)/2} e^{−(m+n)/2}    ... (6.8.12)

From equations (6.8.8) and (6.8.12), the LRT statistic is given by

   λ = L(Θ̂0)/L(Θ̂) = [(m s1² + n s2²) / (m s1² + n s2² + mn (x̄1 − x̄2)²/(m + n))]^{(m+n)/2}    ... (6.8.14)

For Student's t-distribution with (m + n − 2) degrees of freedom, the test statistic is given by (see the explanation in the test for the mean of a normal population)

   t = (x̄1 − x̄2) / (S √(1/m + 1/n)),  where S² = (m s1² + n s2²)/(m + n − 2)

so that

   t² = (m + n − 2) · mn (x̄1 − x̄2)² / [(m + n)(m s1² + n s2²)]

and equation (6.8.14) becomes

   λ = [1 + t²/(m + n − 2)]^{−(m+n)/2}    ... (6.8.15)

The critical region of the LRT is given by

   0 < λ < C  ⟺  t² ≥ k²  ⟺  |t| ≥ k  (say)

where k is calculated as

   P (|t| > k | H0) = α    ... (6.8.16)
Since, under $H_0$, the statistic $t$ follows the Student's t-distribution with $(m+n-2)$ degrees of freedom, therefore from equation (6.8.16) we get

$$k = t_{m+n-2}(\alpha/2) \qquad \dots(6.8.17)$$

Thus the LRT for testing $H_0 : \mu_1 = \mu_2,\ \sigma_1^2 = \sigma_2^2 = \sigma^2$ against $H_1 : \mu_1 \neq \mu_2,\ \sigma_1^2 = \sigma_2^2 = \sigma^2$ is equivalent to the two-tailed t-test, where we reject $H_0$ if

$$|t| \geq t_{m+n-2}(\alpha/2)$$

Otherwise $H_0$ may be accepted.

Remark :
1) The LRT for testing $H_0 : \mu_1 = \mu_2,\ \sigma_1^2 = \sigma_2^2 = \sigma^2$ against the one-sided alternative $H_1 : \mu_1 > \mu_2,\ \sigma_1^2 = \sigma_2^2 = \sigma^2$ is, by the same argument, equivalent to the one-tailed t-test.

6.9 Test of Equality of Means of Several Normal Populations

Consider $k$ normal populations with means $\mu_i$ ($i = 1, 2, \dots, k$) and common variance $\sigma^2$, which is unknown. We wish to test the null hypothesis $H_0 : \mu_1 = \mu_2 = \dots = \mu_k = \mu$ against the alternative hypothesis $H_1 : \mu_i \neq \mu_j$ for at least one pair $i \neq j$.

Let $x_{ij}$ ($i = 1, 2, \dots, k$ and $j = 1, 2, \dots, n_i$) be the $j^{th}$ observation from the $i^{th}$ normal population. In this case the parameter spaces are defined as

$$\Omega = \{(\mu_1, \dots, \mu_k, \sigma^2) : -\infty < \mu_i < \infty,\ \sigma^2 > 0,\ i = 1, 2, \dots, k\}$$
$$\omega = \{(\mu, \sigma^2) : -\infty < \mu < \infty,\ \sigma^2 > 0\}$$

The MLEs of $\mu_i$ ($i = 1, 2, \dots, k$) and $\sigma^2$ over $\Omega$ are

$$\hat{\mu}_i = \bar{x}_i \quad \text{and} \quad \hat{\sigma}_\Omega^2 = \frac{S_0}{n}, \qquad \text{where } S_0 = \sum_{i=1}^{k}\sum_{j=1}^{n_i}(x_{ij}-\bar{x}_i)^2 \ \text{ and } \ n = \sum_{i=1}^{k} n_i$$
Under $\omega$ the MLEs are $\hat{\mu} = \bar{x}$ and

$$\hat{\sigma}_\omega^2 = \frac{1}{n}\sum_{i=1}^{k}\sum_{j=1}^{n_i}(x_{ij}-\bar{x})^2, \qquad \text{where } \bar{x} = \frac{1}{n}\sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}$$

Note : To find the MLEs and the maximum likelihood functions, follow the procedure as discussed in the previous cases.

Then the LRT statistic is

$$\lambda = \frac{L(\hat{\omega})}{L(\hat{\Omega})} = \left[\frac{S_0}{S_T^2}\right]^{n/2} \qquad \dots(6.9.1)$$

where $S_T^2 = \sum_i \sum_j (x_{ij}-\bar{x})^2$. Consider

$$S_T^2 = \sum_{i=1}^{k}\sum_{j=1}^{n_i}(x_{ij}-\bar{x}_i+\bar{x}_i-\bar{x})^2 = \sum_{i=1}^{k}\sum_{j=1}^{n_i}(x_{ij}-\bar{x}_i)^2 + \sum_{i=1}^{k} n_i(\bar{x}_i-\bar{x})^2 = S_0 + S_1 \qquad \dots(6.9.2)$$

where $S_1 = \sum_{i=1}^{k} n_i(\bar{x}_i-\bar{x})^2$. From equation (6.9.2), equation (6.9.1) becomes

$$\lambda = \left[\frac{1}{1+S_1/S_0}\right]^{n/2} \qquad \dots(6.9.3)$$

Since $\lambda$ is a decreasing function of $S_1/S_0$, the critical region $0 < \lambda < c$ is equivalent to

$$F = \frac{S_1/(k-1)}{S_0/(n-k)} > A \ \text{(say)}$$

where $A$ is obtained as $P(F > A \mid H_0) = \alpha$. Under $H_0$ the statistic $F$ follows the F-distribution with $(k-1, n-k)$ degrees of freedom, so from the F-table at $100\alpha$ % l.o.s. and $(k-1, n-k)$ degrees of freedom we get

$$A = F_{k-1,\, n-k}(\alpha)$$

Thus $H_0$ is rejected if $F \geq F_{k-1,\, n-k}(\alpha)$; otherwise it is accepted.

Remark : In analysis of variance (ANOVA) terminology,
i) $S_0$ is called the Within Samples Sum of Squares (W.S.S.)
ii) $S_T^2$ is called the Total Sum of Squares (T.S.S.)
iii) $S_1$ is called the Between Samples Sum of Squares (B.S.S.)
iv) $S_1/(k-1)$ is called the between samples Mean Sum of Squares (M.S.S.)
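The decomposition $S_T^2 = S_0 + S_1$ and the resulting F ratio can be checked numerically. The sketch below is plain Python; the three small groups are invented illustrative data, and the function names are ours.

```python
def anova_f(groups):
    """One-way ANOVA: returns (S0, S1, F) for the decomposition S_T^2 = S0 + S1."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n          # grand mean x-bar
    means = [sum(g) / len(g) for g in groups]        # group means x-bar_i
    s0 = sum((x - mi) ** 2 for g, mi in zip(groups, means) for x in g)   # W.S.S.
    s1 = sum(len(g) * (mi - grand) ** 2 for g, mi in zip(groups, means)) # B.S.S.
    f = (s1 / (k - 1)) / (s0 / (n - k))              # between MSS / within MSS
    return s0, s1, f

groups = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [5.0, 5.0, 5.0]]
s0, s1, f = anova_f(groups)

# total sum of squares, computed independently for the check
allx = [x for g in groups for x in g]
grand = sum(allx) / len(allx)
tss = sum((x - grand) ** 2 for x in allx)
```

For these data $S_0 + S_1$ equals the independently computed T.S.S., illustrating (6.9.2); $F$ is then compared with $F_{k-1,\,n-k}(\alpha)$ from the tables.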
The MLEs of the parameters over $\Omega$ are (refer section 6.7 for the MLEs)

$$\hat{\mu} = \bar{x} \quad \text{and} \quad \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2 = s^2$$

Then the LRT criterion becomes

$$\lambda = \frac{L(\hat{\omega})}{L(\hat{\Omega})} = \left(\frac{s^2}{\sigma_0^2}\right)^{n/2} e^{n/2}\, e^{-ns^2/(2\sigma_0^2)}$$

Writing $\chi^2 = \dfrac{ns^2}{\sigma_0^2} = \dfrac{\sum_i (x_i-\bar{x})^2}{\sigma_0^2}$, this is

$$\lambda = \left(\frac{\chi^2}{n}\right)^{n/2} e^{(n-\chi^2)/2}$$

so the critical region $\lambda \leq c$ reduces to

$$(\chi^2)^{n/2}\, e^{-\chi^2/2} \leq B \ \text{(say)}$$

The function on the left increases up to $\chi^2 = n$ and decreases thereafter. As the statistic $\chi^2$ follows the chi-square distribution with $(n-1)$ degrees of freedom under $H_0$, the critical region is obtained as a pair of intervals $0 < \chi^2 \leq \chi_1^2$ and $\chi_2^2 \leq \chi^2 < \infty$, where $\chi_1^2$ and $\chi_2^2$ are determined such that the two tail probabilities together equal $\alpha$; in practice one takes $\chi_1^2 = \chi_{n-1}^2(1-\alpha/2)$ and $\chi_2^2 = \chi_{n-1}^2(\alpha/2)$, where $\chi_n^2(\alpha)$ is the upper $\alpha$-point of the chi-square distribution with $n$ degrees of freedom.
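A minimal sketch of the $\chi^2$ statistic for $H_0 : \sigma^2 = \sigma_0^2$ (plain Python; the sample values and the hypothesised $\sigma_0^2$ are invented for illustration):

```python
def chi2_stat(x, sigma0_sq):
    """chi^2 = sum (x_i - xbar)^2 / sigma0^2; compare with chi-square_{n-1} points."""
    n = len(x)
    xbar = sum(x) / n
    return sum((xi - xbar) ** 2 for xi in x) / sigma0_sq

x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
chi2 = chi2_stat(x, 4.0)   # hypothesised sigma0^2 = 4; here n - 1 = 7 d.f.
```

$H_0$ is rejected when the computed value falls below $\chi_{n-1}^2(1-\alpha/2)$ or above $\chi_{n-1}^2(\alpha/2)$ from the chi-square table.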
6.11 Test for Equality of Variances of Two Normal Populations

Let $N(\mu_1, \sigma_1^2)$ and $N(\mu_2, \sigma_2^2)$ be two normal populations, where $\mu_1, \mu_2, \sigma_1^2$ and $\sigma_2^2$ are all unknown. Let $x_i$ ($i = 1, 2, \dots, m$) and $y_j$ ($j = 1, 2, \dots, n$) be independent random samples from the two normal populations, of sizes $m$ and $n$ respectively.

Suppose we want to test

$$H_0 : \sigma_1^2 = \sigma_2^2 = \sigma^2 \quad \text{vs} \quad H_1 : \sigma_1^2 \neq \sigma_2^2$$

The parameter spaces are defined as

$$\Omega = \{(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2) : -\infty < \mu_1, \mu_2 < \infty\ ;\ \sigma_1^2, \sigma_2^2 > 0\}$$
$$\omega = \{(\mu_1, \mu_2, \sigma^2) : -\infty < \mu_1, \mu_2 < \infty\ ;\ \sigma^2 > 0\}$$

and the likelihood function is given by

$$L = \left(\frac{1}{2\pi\sigma_1^2}\right)^{m/2}\left(\frac{1}{2\pi\sigma_2^2}\right)^{n/2} \exp\left[-\frac{1}{2\sigma_1^2}\sum_{i=1}^{m}(x_i-\mu_1)^2 - \frac{1}{2\sigma_2^2}\sum_{j=1}^{n}(y_j-\mu_2)^2\right] \qquad \dots(6.11.1)$$

Calculating the MLEs of $\mu_1, \mu_2, \sigma_1^2$ and $\sigma_2^2$ (as discussed in the previous topics) in $\Omega$ and $\omega$, and substituting them in equation (6.11.1), we get

$$L(\hat{\Omega}) = (2\pi)^{-\frac{m+n}{2}}\, (s_1^2)^{-m/2}\, (s_2^2)^{-n/2}\, e^{-\frac{m+n}{2}} \qquad \dots(6.11.2)$$

$$L(\hat{\omega}) = (2\pi)^{-\frac{m+n}{2}} \left[\frac{ms_1^2+ns_2^2}{m+n}\right]^{-\frac{m+n}{2}} e^{-\frac{m+n}{2}} \qquad \dots(6.11.3)$$

where $ms_1^2 = \sum_i(x_i-\bar{x})^2$ and $ns_2^2 = \sum_j(y_j-\bar{y})^2$. Hence the LRT statistic is

$$\lambda = \frac{L(\hat{\omega})}{L(\hat{\Omega})} = \frac{(m+n)^{\frac{m+n}{2}}}{m^{m/2}\, n^{n/2}} \cdot \frac{(ms_1^2)^{m/2}(ns_2^2)^{n/2}}{(ms_1^2+ns_2^2)^{\frac{m+n}{2}}} \qquad \dots(6.11.4)$$

Consider the statistic

$$F = \frac{\sum_i (x_i-\bar{x})^2/(m-1)}{\sum_j (y_j-\bar{y})^2/(n-1)} \qquad \dots(6.11.5)$$

which, under $H_0$, follows the F-distribution with $(m-1, n-1)$ degrees of freedom. Substituting equation (6.11.5) in equation (6.11.4) we get

$$\lambda = \frac{(m+n)^{\frac{m+n}{2}}}{m^{m/2}\, n^{n/2}} \cdot \frac{\left[\dfrac{m-1}{n-1}F\right]^{m/2}}{\left[1+\dfrac{m-1}{n-1}F\right]^{\frac{m+n}{2}}} \qquad \dots(6.11.6)$$

But $\lambda$ defined in (6.11.6) is a function of $F$ alone, and small values of $\lambda$ correspond to very small or very large values of $F$. Therefore the test can be carried out using the F-distribution, with the test statistic as given in equation (6.11.5).

The critical region is given by a pair of intervals as

$$F \leq F_1 \quad \text{and} \quad F \geq F_2$$

where $F_1$ and $F_2$ are obtained such that

$$P(F \geq F_1) = 1-\frac{\alpha}{2} \qquad \text{and} \qquad P(F \geq F_2) = \frac{\alpha}{2}$$
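The statistic (6.11.5) is simple to compute from raw samples. A sketch in plain Python (the two samples are illustrative values of our own choosing):

```python
def variance_ratio_f(x, y):
    """F = s_x^2 / s_y^2 with unbiased sample variances, as in (6.11.5)."""
    m, n = len(x), len(y)
    xbar, ybar = sum(x) / m, sum(y) / n
    sx2 = sum((v - xbar) ** 2 for v in x) / (m - 1)
    sy2 = sum((v - ybar) ** 2 for v in y) / (n - 1)
    return sx2 / sy2

F = variance_ratio_f([1.0, 3.0, 5.0, 7.0], [2.0, 4.0, 6.0])
# compare F with the two-tailed points F_{m-1,n-1}(alpha/2), F_{m-1,n-1}(1-alpha/2)
```

The significance decision then only needs the tabulated F points for $(m-1, n-1)$ degrees of freedom.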
Since under $H_0$ the LRT is equivalent to the F-test, from the F-table,

$$F_1 = F_{m-1,\,n-1}\left(1-\frac{\alpha}{2}\right), \qquad F_2 = F_{m-1,\,n-1}\left(\frac{\alpha}{2}\right)$$

where $F_{m,n}(\alpha)$ is the upper $\alpha$-point of the F-distribution with $(m, n)$ degrees of freedom. Thus we get a two-tailed F-test with critical region

$$F \geq F_{m-1,\,n-1}\left(\frac{\alpha}{2}\right) \quad \text{and} \quad F \leq F_{m-1,\,n-1}\left(1-\frac{\alpha}{2}\right)$$

Example : Let X be a random sample from the normal population $N(\mu, \sigma^2)$, where $\sigma^2$ is known. Derive the likelihood ratio test for $H_0 : \mu = 10$ against $H_1 : \mu \neq 10$ at 5 % level of significance.

Solution : Given :

$$H_0 : \mu = 10, \qquad H_1 : \mu \neq 10$$

As the variance $\sigma^2$ is known, $\mu$ is the only parameter of the distribution. We define the parameter spaces as

$$\Omega = \{\mu \mid -\infty < \mu < \infty\}, \qquad \omega = \{\mu \mid \mu = 10\}$$

We know that the p.d.f. for $N(\mu, \sigma^2)$ is

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/(2\sigma^2)}$$

The likelihood function becomes

$$L = \left(\frac{1}{2\pi\sigma^2}\right)^{n/2} \exp\left[-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\mu)^2\right]$$

Replacing $\mu$ by its MLE, the likelihood functions under $\Omega$ and $\omega$ are

$$L(\hat{\Omega}) = \left(\frac{1}{2\pi\sigma^2}\right)^{n/2} \exp\left[-\frac{1}{2\sigma^2}\sum_i(x_i-\bar{x})^2\right], \qquad L(\hat{\omega}) = \left(\frac{1}{2\pi\sigma^2}\right)^{n/2} \exp\left[-\frac{1}{2\sigma^2}\sum_i(x_i-10)^2\right]$$

Since $\sum_i(x_i-10)^2 = \sum_i(x_i-\bar{x})^2 + n(\bar{x}-10)^2$, the LRT statistic is

$$\lambda = \frac{L(\hat{\omega})}{L(\hat{\Omega})} = \exp\left[-\frac{n(\bar{x}-10)^2}{2\sigma^2}\right]$$

The critical region $0 < \lambda < c$ becomes

$$\frac{n(\bar{x}-10)^2}{\sigma^2} \geq -2\ln c, \qquad \text{i.e.} \qquad \frac{|\bar{x}-10|}{\sigma/\sqrt{n}} \geq k \ \text{(say)}$$

But from the S.N.D., $Z = \dfrac{\bar{x}-10}{\sigma/\sqrt{n}} \sim N(0, 1)$, so we reject when $|Z| \geq k$, where $k$ is derived such that the size of the critical region is $\alpha = 5$ %, i.e.

$$P(|Z| \geq k \mid H_0) = \alpha \qquad \Rightarrow \qquad k = Z_{\alpha/2} = 1.96$$

Thus we reject $H_0$ if $|Z| > 1.96$.
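The resulting two-sided Z test is a one-liner in practice. A sketch in plain Python (the sample and the known $\sigma$ are invented illustration values):

```python
import math

def z_test_known_sigma(x, mu0, sigma, z_crit=1.96):
    """Two-sided Z test for H0: mu = mu0 with sigma known.
    Returns (Z, reject?), rejecting when |Z| exceeds the critical point."""
    n = len(x)
    xbar = sum(x) / n
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    return z, abs(z) > z_crit

z, reject = z_test_known_sigma([11.0, 12.0, 9.0, 13.0, 10.0], 10.0, 2.0)
```

For this sample $\bar{x} = 11$ and $Z = \sqrt{5}/2 \approx 1.118 < 1.96$, so $H_0 : \mu = 10$ would be accepted at the 5 % level.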
Solution : Given : the p.d.f.

$$f(x, \theta) = \theta\, e^{-\theta x}, \qquad x > 0,\ \theta > 0 \qquad \dots(1)$$

to test $H_0 : \theta \leq \theta_0$ against $H_1 : \theta > \theta_0$. The parameter spaces are

$$\Omega = \{\theta \mid \theta > 0\}, \qquad \omega = \{\theta \mid \theta \leq \theta_0\}$$

Using equation (1), the likelihood function can be written as

$$L = \prod_{i=1}^{n} \theta\, e^{-\theta x_i} = \theta^n\, e^{-\theta \sum_i x_i}$$

The MLE of $\theta$ can be obtained by solving the equation $\dfrac{\partial \ln L}{\partial \theta} = 0$, i.e.

$$\frac{n}{\theta} - \sum_i x_i = 0 \qquad \Rightarrow \qquad \hat{\theta} = \frac{n}{\sum_i x_i} = \frac{1}{\bar{x}}$$

Then the LRT statistic becomes

$$\lambda = \frac{L(\hat{\omega})}{L(\hat{\Omega})} = \begin{cases} 1, & \text{if } \theta_0\bar{x} \geq 1 \\[4pt] (\theta_0\bar{x})^n\, e^{\,n(1-\theta_0\bar{x})}, & \text{if } \theta_0\bar{x} < 1 \end{cases}$$

Let $\theta_0\bar{x} = t$. Note that $t^n e^{\,n(1-t)}$ attains its maximum value 1 at $t = 1$, i.e. when $\bar{x} = 1/\theta_0$, and is increasing for $0 < t < 1$. Hence the critical region $\lambda \leq c$ is equivalent to $t \leq k$, so that

$$\alpha = P(\lambda \leq c \mid \theta = \theta_0) = P\!\left(\theta_0 \textstyle\sum_i x_i \leq nk \,\middle|\, \theta_0\right)$$

Since $2\theta_0 \sum_i x_i$ follows the chi-square distribution with $2n$ degrees of freedom when $\theta = \theta_0$, the constant $k$ can be obtained from the chi-square table.
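The piecewise LRT statistic above can be written directly. The sketch below (plain Python) assumes the exponential density as reconstructed here; the data values are purely illustrative.

```python
import math

def lrt_lambda_exp(x, theta0):
    """LRT statistic for H0: theta <= theta0 with f(x) = theta * exp(-theta*x).
    lambda = 1 when the MLE 1/xbar already lies in omega (theta0*xbar >= 1)."""
    n = len(x)
    t = theta0 * (sum(x) / n)          # t = theta0 * xbar
    if t >= 1.0:
        return 1.0
    return (t ** n) * math.exp(n * (1.0 - t))

lam_inside = lrt_lambda_exp([2.0, 2.0], 1.0)              # theta0*xbar = 2 >= 1
lam_outside = lrt_lambda_exp([0.5, 0.5, 0.5, 0.5], 1.0)   # theta0*xbar = 0.5 < 1
```

When $\theta_0\bar{x} \geq 1$ the statistic is exactly 1 (no evidence against $H_0$); below 1 it decreases towards 0 as $\bar{x}$ shrinks, which is the rejection direction for $H_1 : \theta > \theta_0$.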
tests which deal with drawing conclusions about the parameters of a population are called "Parametric tests".

But in practice, many situations arise where the population under consideration does not follow any specific distribution. To deal with such problems, statisticians have derived several methods and tests in which no assumptions are made about the distributions, or about the parameters, of the populations. As these methods are independent of the population distribution, they are also known as "Distribution free" methods.

Advantages of Non-parametric Tests (N.P.)

N.P. methods have many advantages over the parametric methods.
1) N.P. methods are easy and simple to understand, and easy and simple to implement when the sample size is small.
2) N.P. methods do not require complicated computation; therefore these methods are less time consuming.
3) In N.P. methods no assumption is made regarding the distribution of the parent population from which the sample is drawn.
4) They need only nominal or ordinal data.
5) N.P. methods not only make less rigorous assumptions but also work well even when certain assumptions are violated.

Disadvantages of Non-parametric Tests

1) Due to the simplicity of N.P. tests, they are often used even if there exists an appropriate parametric method. In such cases (where a parametric test is available and can be more powerful), applying N.P. is just a waste of time and data.
2) All N.P. tests are not as simple as they are claimed to be.
3) It is not possible to determine the actual power of a N.P. test.

Exercise

1. Define i) Parameter space ii) Likelihood function iii) Likelihood ratio test statistic.
2. Consider a random sample $x_1, x_2, \dots, x_n$ of size n from a normal distribution $N(\mu, \sigma^2)$. Develop the LRT to test $H_0 : \mu = \mu_0$ (specified) against i) $H_1 : \mu > \mu_0$ ii) $H_1 : \mu < \mu_0$ iii) $H_1 : \mu \neq \mu_0$. [Given : $\sigma^2$ is known]
3. Let $x_1, x_2, \dots, x_n$ be a random sample of size n from a normal distribution $N(\mu, \sigma^2)$, where $\mu$ and $\sigma^2$ both are unknown. Show that the LRT used to test $H_0 : \mu = \mu_0$ vs $H_1 : \mu \neq \mu_0$, $0 < \sigma < \infty$, is the two-tailed t-test.
4. Prove that the LRT for testing the equality of variances of two normal populations is the same as the two-tailed F-test.
5. Develop the LRT based on a random sample of size n from a Poisson distribution.
6. What is a composite and a simple hypothesis ? Give examples.
7. Write short notes on : 1) Most powerful test 2) Uniformly most powerful test 3) The best critical region 4) Level of significance.
8. What is the importance of type I and type II errors in the Neyman and Pearson theory of testing ?
9. State and prove the Neyman-Pearson lemma for testing a simple hypothesis against a simple alternative hypothesis.
10. Let $x_1, x_2, \dots, x_n$ be a random sample from the p.d.f.
$$f(x, \theta) = \theta x^{\theta-1},\ 0 < x < 1,\ \theta > 0\ ;\quad 0 \text{ otherwise.}$$
Find a U.M.P. test of size $\alpha$ for testing $H_0 : \theta = 1$ against $H_1 : \theta > 1$.

Multiple Choice Questions

Q.1 The set of all possible values of a parameter of some p.d.f. is called
[a] null space  [b] parameter space  [c] critical region  [d] none of these

Q.2 The ratio of the maximum likelihood function defined over $\omega$ (null parameter space) to that defined over $\Omega$ (complete parameter space) is called
[a] likelihood ratio statistic  [b] p-value  [c] maximum likelihood estimate  [d] t-test statistic
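For exercise 10, the family $f(x, \theta) = \theta x^{\theta-1}$ has a monotone likelihood ratio in $T = -\sum_i \ln x_i$ (the ratio $L(\theta)/L(1) = \theta^n e^{-(\theta-1)T}$ is decreasing in $T$ for $\theta > 1$), so the U.M.P. test rejects for small $T$; under $H_0 : \theta = 1$ each $-\ln x_i$ is standard exponential, hence $2T \sim \chi^2_{2n}$. A sketch of the statistic (plain Python; sample values are illustrative):

```python
import math

def ump_statistic(x):
    """T = -sum(log x_i). UMP test for H0: theta = 1 vs H1: theta > 1
    rejects for small T (equivalently large prod x_i); 2T ~ chi-square(2n) under H0."""
    return -sum(math.log(v) for v in x)

T = ump_statistic([0.9, 0.8, 0.95])
```

The size-$\alpha$ cut-off is then read from the lower $\alpha$-point of $\chi^2_{2n}$ applied to $2T$.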
Q.3 The likelihood ratio test statistic $\lambda$ =
[a] $\sup_{\theta\in\Omega} L(x,\theta)\, /\, \sup_{\theta\in\omega} L(x,\theta)$  [b] $\sup_{\theta\in\omega} L(x,\theta)\, \cdot\, \sup_{\theta\in\Omega} L(x,\theta)$
[c] $\sup_{\theta\in\omega} L(x,\theta)\, /\, \sup_{\theta\in\Omega} L(x,\theta)$  [d] none of these

Q.4 According to the LRT, $H_0$ is rejected if $0 < \lambda < \lambda_0$, where $\lambda_0$ is derived from
[a] $P(\lambda < \lambda_0 \mid H_1) = \alpha$  [b] $P(\lambda < \lambda_0 \mid H_0) = \alpha$
[c] $P(\lambda > \lambda_0 \mid H_1) = \alpha$  [d] $P(\lambda > \lambda_0 \mid H_0) = \alpha$

Q.5 If $x_1, x_2, \dots, x_n$ is a random sample from a population with p.d.f. $f(x, \theta_1, \theta_2, \dots, \theta_k)$, then the likelihood function is given by
[a] $L = \sum_{i=1}^{n} f(x_i, \theta_1, \theta_2, \dots, \theta_k)$  [b] $L = \prod_{i=1}^{n} f(x_i, \theta_1, \theta_2, \dots, \theta_k)$

Q.6 The LRT for testing $H_0 : \mu = \mu_0$ against $H_1 : \mu \neq \mu_0$, based on a random sample of size n from a normal population with unknown $\mu$ and $\sigma^2$, is equivalent to
[a] two-tailed Z-test  [b] chi-square test  [c] F-test  [d] two-tailed t-test

Q.8 The Neyman-Pearson lemma provides the best critical region for testing a ______ null hypothesis against a ______ alternative hypothesis.
[a] composite, composite  [b] simple, composite  [c] composite, simple  [d] simple, simple

Q.9 Suppose the t-test is applied to test the equality of means of two random variables X and Y, where X and Y are independent. Then
[a] The distributions of both X and Y must be normal with equal variances.
[b] The distributions of both X and Y must be normal.
[c] The distributions of X and Y may not be normal.
[d] Both X and Y must have the same variance.

Answer Keys for Multiple Choice Questions :

Q.1 b | Q.2 a | Q.3 c | Q.4 b
Q.6 d | Q.7 b | Q.8 d | Q.9 b
b) Define sampling. Explain random sampling. (Refer sections 1.7 and 1.7.2) [4]

Q.2 a) What are the methods of estimation ? Give a brief note on testing of hypothesis. (Refer sections 1.10.1 and 1.10.2) [5]

b) Give the advantages and disadvantages of statistical analysis. (Refer sections 1.3 and 1.4) [4]

c) Explain the scope of statistics in engineering and technology. (Refer section 1.2.2) [4]

Q.3 a) What is a histogram ? Draw the histogram for the following data - [5]
Ans. :

Fig. 1 : Histogram (X-axis : Age in years, scale 1 cm = 1 year ; Y-axis : No. of boys, scale 1 cm = 1)

b) State merits and demerits of arithmetic mean (two each). (Refer section 2.10) [4]

c) Ans. :

C.I.      | f  | Less than c.f.
0 - 100   | 9  | 9
100 - 200 | 15 | 24
200 - 300 | 18 | 42
300 - 400 | 21 | 63
400 - 500 | 18 | 81
500 - 600 | 14 | 95
600 - 700 | 5  | 100

N = 100, N/2 = 50, so the median class is 300 - 400.

Median = L + ((N/2 − c.f.)/f) × h = 300 + ((50 − 42)/21) × 100 = 338.0952

d) Define geometric mean and harmonic mean. Compare them on the basis of merits.

OR

Q.4 a) Daily expenditure of 100 families on transport is given below : [5]

Expenditure      | 20 - 29 | 30 - 39 | 40 - 49 | 50 - 59 | 60 - 69
No. of families  | 14      | ?       | 27      | ?       | 15

Given Mode = 43.5, find the missing frequencies.

Ans. : Converting to class boundaries :

C.I.        | f
19.5 - 29.5 | 14
29.5 - 39.5 | f₁
39.5 - 49.5 | 27
49.5 - 59.5 | f₂
59.5 - 69.5 | 15
N = 100

Here the maximum frequency is 27, so the modal class is 39.5 - 49.5 and Mode = 43.5.
Mode = L + ((f_m − f₁)/(2f_m − f₁ − f₂)) × h

43.5 = 39.5 + ((27 − f₁)/(54 − f₁ − f₂)) × 10

⟹ 4(54 − f₁ − f₂) = 10(27 − f₁)
⟹ 216 − 4f₁ − 4f₂ = 270 − 10f₁
⟹ 6f₁ − 4f₂ = 54 ... (1)

As N = Σf = 100,
14 + f₁ + 27 + f₂ + 15 = 100
f₁ + f₂ = 100 − 56
f₁ + f₂ = 44 ... (2)

Solving equations (1) and (2),
f₁ = 23, f₂ = 21

b) Give merits and demerits of median. (Refer section 2.13) [4]

c) Calculate the harmonic mean of the following series : [4]

Values x    | 2 | 6  | 10 | 14 | 18
Frequency f | 4 | 12 | 20 | 9  | 5

Ans. :

x  | f  | f/x
2  | 4  | 2.0000
6  | 12 | 2.0000
10 | 20 | 2.0000
14 | 9  | 0.6429
18 | 5  | 0.2778
Σf = 50 | Σ(f/x) = 6.9207

H.M. = Σf / Σ(f/x) = 50/6.9207 = 7.2248

d) State the advantages and limitations of graphical representation of data. (Refer section 2.5) [4]

SOLVED MODEL QUESTION PAPER (End Sem)
Statistics
S.E. (AI and DS) Semester - IV (As per 2020 Pattern)
Time : 2½ Hours] [Maximum Marks : 70
N.B. :
i) Attempt Q.1 or Q.2, Q.3 or Q.4, Q.5 or Q.6, Q.7 or Q.8.
ii) Neat diagrams must be drawn wherever necessary.
iii) Figures to the right side indicate full marks.
iv) Assume suitable data, if necessary.

Q.1 a) For the regression lines 3x + 2y − 26 = 0 and 6x + y − 31 = 0, find i) x̄ and ȳ ii) the correlation coefficient r between x and y. (Refer example 3.18.1) [6]

b) Compute M.D. about i) Mean ii) Median, and also find the coefficient of M.D., for the following frequency distribution. (Refer example 3.6.2) [5]

Class interval | 1 - 3 | 3 - 5 | 5 - 7 | 7 - 9
Frequency      | 3     | 4     | 2     | 1

c) An MNC company conducted an aptitude test for 1000 candidates. The average score is 45 and the standard deviation of the score is 25. Assuming a normal distribution for the result, find i) the number of candidates whose scores exceed 60, ii) the number of candidates whose scores lie between 30 and 60. (Refer example 3.31.11) [6]
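Q.1 c) only needs the standard normal c.d.f., available through the error function. A sketch in plain Python (the function name is ours; the counts follow from Φ(0.6) ≈ 0.7257):

```python
import math

def norm_cdf(x, mu, sigma):
    """Normal c.d.f. via the error function: Phi((x - mu)/sigma)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

N = 1000                      # candidates
mu, sigma = 45.0, 25.0        # given mean and s.d. of scores
above_60 = N * (1.0 - norm_cdf(60.0, mu, sigma))                        # P(X > 60)
between_30_60 = N * (norm_cdf(60.0, mu, sigma) - norm_cdf(30.0, mu, sigma))
```

Rounding to whole candidates gives about 274 scores above 60 and about 451 between 30 and 60.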
OR

Q.2 a) The following marks have been obtained by a class of students in two papers of mathematics : [5]

b) Let a random variable X take the values −2, −1, 0, 1, 2 such that P(X = −2) = P(X = −1) = P(X = 1) = P(X = 2) and P(X < 0) = P(X = 0) = P(X > 0). Determine the probability mass function of X. (Refer example 4.5.3) [6]

Q.3 a) ... is a probability mass function. Find i) k ii) P[X < 2] iii) P[X > 3] iv) P[1 < X < 3]. (Refer example 4.5.1) [6]

b) In a continuous distribution the density function is f(x) = k x(2 − x), 0 < x < 2. Find ...

OR

Q.4 a) Fit a Poisson distribution to the following data and calculate the theoretical frequencies. (Refer example 4.17.4) [6]

x | 0   | 1  | 2  | 3 | 4 | Total
f | 109 | 65 | 22 | 3 | 1 | 200

Q.6 a) In an experiment on pea breeding, the following frequencies of seeds were obtained :

Round and green | Wrinkled and green | Round and yellow | Wrinkled and yellow | Total
222             | 120                | 32               | 150                 | 524

Theory predicts that the frequencies should be in the proportion 8 : 2 : 2 : 1. Examine the correspondence between theory and experiment. (Refer example 5.11.3)

b) From the data given below, intelligence tests of two groups of boys and girls gave the following results. Examine whether the difference is significant. (Refer example 5.10.12) [5]

c) In two independent samples of sizes 8 and 10, the sums of squares of deviations of the sample values from the respective sample means were 84.4 and 102.6. Test whether the difference of the variances of the populations is significant or not. (Refer example 5.10.22) [6]

Q.7 a) State and prove the N-P lemma. (Refer section 6.4) [8]

b) Check whether a BCR exists for testing the null hypothesis $H_0 : \theta = \theta_0$ vs $H_1 : \theta = \theta_1 > \theta_0$ for the parameter $\theta$ of the distribution $f(x, \theta) = \dfrac{\theta}{x^{\theta+1}}$, $1 \leq x < \infty$. (Refer example 6.4.7) [5]

c) Let X be a random sample having the normal distribution $N(\mu, \sigma^2)$, where $\mu$ is unknown and $\sigma^2 = 2$. Derive the likelihood ratio test for $H_0 : \mu = 10$ against $H_1 : \mu \neq 10$ at 5 % level of significance.
(Given that the maximum likelihood functions under $\Omega$ and $\omega$ are
$$L(\hat{\Omega}) = \left(\frac{1}{2\pi\sigma^2}\right)^{n/2} \exp\left[-\frac{1}{2\sigma^2}\sum_i(x_i-\bar{x})^2\right],$$
$$L(\hat{\omega}) = \left(\frac{1}{2\pi\sigma^2}\right)^{n/2} \exp\left[-\frac{1}{2\sigma^2}\Big\{\sum_i(x_i-\bar{x})^2 + n(\bar{x}-10)^2\Big\}\right].)$$
Ans. : $H_0$ is rejected if $|Z| > 1.96$.
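For Q.6 c), only the summary figures are given, so the F ratio is computed directly from the sums of squares. A sketch in plain Python (the convention of placing the larger variance in the numerator is standard; the table comparison itself is left to the F tables):

```python
def f_from_ss(ss1, n1, ss2, n2):
    """F ratio from sums of squared deviations about the sample means.
    Returns (F, (df_numerator, df_denominator)) with the larger variance on top."""
    s1 = ss1 / (n1 - 1)      # unbiased variance estimate, sample 1
    s2 = ss2 / (n2 - 1)      # unbiased variance estimate, sample 2
    if s1 >= s2:
        return s1 / s2, (n1 - 1, n2 - 1)
    return s2 / s1, (n2 - 1, n1 - 1)

# printed data: sizes 8 and 10, sums of squares 84.4 and 102.6
F, df = f_from_ss(84.4, 8, 102.6, 10)
```

Here $F = (84.4/7)/(102.6/9) \approx 1.058$ with (7, 9) degrees of freedom; a value this close to 1 is then judged against the tabulated two-tailed F points.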
OR