Module III 1


Module III

Correlation:-
To examine whether two RVs $X$ and $Y$ are inter-related, we collect $n$ pairs of values
of $X$ and $Y$ corresponding to $n$ repetitions of the random experiment. Let them
be $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$. Then we plot the points with
co-ordinates $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ on graph paper. The simple
figure consisting of the plotted points is called a scatter diagram. From the
scatter diagram, we can form a fairly good, though rough, idea of the
relationship between $X$ and $Y$. If the points are dense or closely packed, we
may conclude that $X$ and $Y$ are correlated. On the other hand, if the points
are widely scattered throughout the graph paper, we may conclude that $X$
and $Y$ are either not correlated or poorly correlated.
Further, if the points in the scatter diagram appear to lie near a straight line,
we assume that the RVs have linear correlation. If they cluster round a
well-defined curve other than a straight line, the RVs are assumed to have
non-linear correlation.
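Karl Pearson's Product Moment Correlation Coefficient
(Correlation Coefficient between $X$ and $Y$)

$$r_{XY} = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left(n\sum x^2 - \left(\sum x\right)^2\right)\left(n\sum y^2 - \left(\sum y\right)^2\right)}}$$

$$r_{XY} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{E(XY) - E(X)E(Y)}{\sigma_X \sigma_Y}$$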

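Note:-
1) When $X$ and $Y$ are independent, $E(XY) = E(X)E(Y)$. Hence
$\mathrm{Cov}(X, Y) = 0$ and thus $r_{XY} = 0$.

3) Correlation coefficient is independent of change of origin and scale,
i.e., if $U = \frac{X - a}{h}$ and $V = \frac{Y - b}{k}$, where $h, k > 0$, then $r_{XY} = r_{UV}$.
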
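The invariance under change of origin and scale can be checked numerically. The sketch below is only an illustration: the data values and the constants `a`, `h`, `b`, `k` are hypothetical choices, not taken from these notes, and `pearson_r` is a helper name introduced here.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient from the raw-sum formula."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / math.sqrt((n * sxx - sx**2) * (n * syy - sy**2))

# Hypothetical data, chosen only for this check
xs = [2, 4, 5, 7, 9]
ys = [10, 13, 12, 18, 21]

# Change of origin and scale: U = (X - a)/h, V = (Y - b)/k with h, k > 0
a, h, b, k = 5, 2, 12, 3
us = [(x - a) / h for x in xs]
vs = [(y - b) / k for y in ys]

r_xy = pearson_r(xs, ys)
r_uv = pearson_r(us, vs)
print(r_xy, r_uv)  # the two values agree up to floating-point rounding
```

This is why the worked problems may subtract a convenient constant from each variable before forming the sums.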
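1. Compute the coefficient of correlation between $X$ and $Y$, using the
following data:

X 1 3 5 7 8 10
Y 8 12 15 17 18 20

Sol:-

        x     y    x^2    y^2    xy
        1     8      1     64     8
        3    12      9    144    36
        5    15     25    225    75
        7    17     49    289   119
        8    18     64    324   144
       10    20    100    400   200
Total  34    90    248   1446   582

Here $n = 6$, so

$$r_{XY} = \frac{6(582) - (34)(90)}{\sqrt{\left(6(248) - 34^2\right)\left(6(1446) - 90^2\right)}} = \frac{432}{\sqrt{(332)(576)}} \approx 0.9879$$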
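As a cross-check of the working table, the column totals and the coefficient can be recomputed directly from the data of this problem. This is a minimal sketch; the variable names are introduced here for illustration.

```python
import math

xs = [1, 3, 5, 7, 8, 10]
ys = [8, 12, 15, 17, 18, 20]
n = len(xs)

# Column totals of the working table
sum_x = sum(xs)
sum_y = sum(ys)
sum_x2 = sum(x * x for x in xs)
sum_y2 = sum(y * y for y in ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))

# Raw-sum form of Karl Pearson's coefficient
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x**2) * (n * sum_y2 - sum_y**2))
r = numerator / denominator

print(sum_x, sum_y, sum_x2, sum_y2, sum_xy)  # 34 90 248 1446 582
print(round(r, 4))                           # 0.9879
```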

2. Compute the coefficient of correlation between $X$ and $Y$, using the
following data:

X 65 67 66 71 67 70 68 69
Y 67 68 68 70 64 67 72 70
For the data tabulated below (weight $X_1$, height $X_2$ and age $X_3$ of 12 boys),
compute the coefficient of linear partial correlation $r_{12.3}$ and the
coefficient of multiple correlation $R_{1.23}$.
Sol:-
X1(Weight)  X2(Height)  X3(Age)  u=X1-77  v=X2-55  w=X3-10    uv    vw    uw   u^2   v^2   w^2
    64          57          8      -13       2       -2      -26    -4    26   169     4     4
    71          59         10       -6       4        0      -24     0     0    36    16     0
    53          49          6      -24      -6       -4      144    24    96   576    36    16
    67          62         11      -10       7        1      -70     7   -10   100    49     1
    55          51          8      -22      -4       -2       88     8    44   484    16     4
    58          50          7      -19      -5       -3       95    15    57   361    25     9
    77          55         10        0       0        0        0     0     0     0     0     0
    57          48          9      -20      -7       -1      140     7    20   400    49     1
    56          52         10      -21      -3        0       63     0     0   441     9     0
    51          42          6      -26     -13       -4      338    52   104   676   169    16
    76          61         12       -1       6        2       -6    12    -2     1    36     4
    68          57          9       -9       2       -1      -18    -2     9    81     4     1
Total                             -171     -17      -14      724   119   344  3325   413    56


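$$r_{12} = \frac{n\sum uv - \sum u \sum v}{\sqrt{\left(n\sum u^2 - \left(\sum u\right)^2\right)\left(n\sum v^2 - \left(\sum v\right)^2\right)}} = \frac{12(724) - (-171)(-17)}{\sqrt{\left(12(3325) - (-171)^2\right)\left(12(413) - (-17)^2\right)}} = 0.8196$$

$$r_{23} = \frac{n\sum vw - \sum v \sum w}{\sqrt{\left(n\sum v^2 - \left(\sum v\right)^2\right)\left(n\sum w^2 - \left(\sum w\right)^2\right)}} = \frac{12(119) - (-17)(-14)}{\sqrt{\left(12(413) - (-17)^2\right)\left(12(56) - (-14)^2\right)}} = 0.7984$$

Similarly,

$$r_{13} = \frac{n\sum uw - \sum u \sum w}{\sqrt{\left(n\sum u^2 - \left(\sum u\right)^2\right)\left(n\sum w^2 - \left(\sum w\right)^2\right)}} = \frac{12(344) - (-171)(-14)}{\sqrt{\left(12(3325) - (-171)^2\right)\left(12(56) - (-14)^2\right)}} = 0.7698$$

The partial correlation coefficient between $X_1$ and $X_2$, with $X_3$ held fixed, is

$$r_{12.3} = \frac{r_{12} - r_{13} r_{23}}{\sqrt{\left(1 - r_{13}^2\right)\left(1 - r_{23}^2\right)}} \approx 0.533$$

It is concluded that the correlation coefficient between weight and height is
approximately 0.533 for boys of the same age.

The multiple correlation coefficient of $X_1$ on $X_2$ and $X_3$ is

$$R_{1.23} = \sqrt{\frac{r_{12}^2 + r_{13}^2 - 2 r_{12} r_{13} r_{23}}{1 - r_{23}^2}} = 0.8418$$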
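The three simple correlations and the two derived coefficients can be recomputed from the raw weight, height and age columns of the table. The sketch below assumes the standard formulas for the partial coefficient $r_{12.3}$ and the multiple coefficient $R_{1.23}$; `pearson_r` is a helper name introduced here.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient from the raw-sum formula."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / math.sqrt((n * sxx - sx**2) * (n * syy - sy**2))

# Columns X1, X2, X3 of the table
weight = [64, 71, 53, 67, 55, 58, 77, 57, 56, 51, 76, 68]
height = [57, 59, 49, 62, 51, 50, 55, 48, 52, 42, 61, 57]
age = [8, 10, 6, 11, 8, 7, 10, 9, 10, 6, 12, 9]

r12 = pearson_r(weight, height)
r13 = pearson_r(weight, age)
r23 = pearson_r(height, age)

# Partial correlation of weight and height with age held fixed
r12_3 = (r12 - r13 * r23) / math.sqrt((1 - r13**2) * (1 - r23**2))

# Multiple correlation of weight on height and age
R1_23 = math.sqrt((r12**2 + r13**2 - 2 * r12 * r13 * r23) / (1 - r23**2))

print(round(r12, 4), round(r13, 4), round(r23, 4))
print(round(r12_3, 4), round(R1_23, 4))
```

Because the correlation coefficient does not depend on the origin, the raw columns give the same values as the shifted variables u, v, w used in the table.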
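Regression Equations
Regression Equation of $Y$ on $X$

$$y - \bar{y} = b_{YX}\,(x - \bar{x})$$

where
$$b_{YX} = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}$$
(OR)
$$b_{YX} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X^2}$$
(OR)
$$b_{YX} = r\,\frac{\sigma_Y}{\sigma_X}$$

$b_{YX}$ is called the regression coefficient of $Y$ on $X$.

Regression Equation of $X$ on $Y$

$$x - \bar{x} = b_{XY}\,(y - \bar{y})$$

where
$$b_{XY} = \frac{n\sum xy - \sum x \sum y}{n\sum y^2 - \left(\sum y\right)^2}$$
(OR)
$$b_{XY} = \frac{\mathrm{Cov}(X, Y)}{\sigma_Y^2}$$
(OR)
$$b_{XY} = r\,\frac{\sigma_X}{\sigma_Y}$$

$b_{XY}$ is called the regression coefficient of $X$ on $Y$.
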

Note:-
1) The correlation coefficient is the geometric mean of the two
regression coefficients,
i.e., $r = \pm\sqrt{b_{YX}\, b_{XY}}$.
2) $b_{YX}$ and $b_{XY}$ have the same sign as $r$.
3) When there is a perfect linear correlation between $X$ and
$Y$, namely, when $r = \pm 1$, the two regression lines
coincide.
4) The point of intersection of the two regression lines is
clearly the point whose co-ordinates are $(\bar{x}, \bar{y})$.
5) When there is no linear correlation between $X$ and $Y$,
namely, when $r = 0$, the equations of the regression
lines become $y = \bar{y}$ and $x = \bar{x}$, which are at right angles.
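The first two notes follow in one line from the usual forms of the regression coefficients, $b_{YX} = r\sigma_Y/\sigma_X$ and $b_{XY} = r\sigma_X/\sigma_Y$ (standard definitions, restated here so that the step is self-contained):

$$b_{YX}\, b_{XY} = \left(r\,\frac{\sigma_Y}{\sigma_X}\right)\left(r\,\frac{\sigma_X}{\sigma_Y}\right) = r^2 \quad\Longrightarrow\quad r = \pm\sqrt{b_{YX}\, b_{XY}}$$

Since $\sigma_X > 0$ and $\sigma_Y > 0$, each coefficient is $r$ multiplied by a positive number, so both carry the sign of $r$, and that common sign is the one to take in front of the square root.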
1) Obtain the equations of the lines of regression from the following data.
X 22 26 29 30 31 31 34 35
Y 20 20 21 29 27 24 27 31

Hence find the coefficient of correlation between $X$ and $Y$. Also
estimate the value of (a) $Y$ when $X$ = … and (b) $X$ when $Y$ = …
         X     Y   u=X-29  v=Y-27   u^2   v^2    uv
        22    20     -7      -7      49    49    49
        26    20     -3      -7       9    49    21
        29    21      0      -6       0    36     0
        30    29      1       2       1     4     2
        31    27      2       0       4     0     0
        31    24      2      -3       4     9    -6
        34    27      5       0      25     0     0
        35    31      6       4      36    16    24
Total  238   199      6     -17     128   163    90


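Here $n = 8$, $\bar{x} = 238/8 = 29.75$ and $\bar{y} = 199/8 = 24.875$.

Regression coefficients

$$b_{YX} = \frac{n\sum uv - \sum u \sum v}{n\sum u^2 - \left(\sum u\right)^2} = \frac{8(90) - (6)(-17)}{8(128) - 6^2} = \frac{822}{988} = 0.8320$$

$$b_{XY} = \frac{n\sum uv - \sum u \sum v}{n\sum v^2 - \left(\sum v\right)^2} = \frac{8(90) - (6)(-17)}{8(163) - (-17)^2} = \frac{822}{1015} = 0.8099$$

Correlation coefficient (positive, since both regression coefficients are positive)

$$r = \sqrt{b_{YX}\, b_{XY}} = \frac{822}{\sqrt{(988)(1015)}} \approx 0.8208$$

Regression equation of $Y$ on $X$

$$y - 24.875 = 0.8320\,(x - 29.75), \quad \text{i.e.,} \quad y \approx 0.832x + 0.12$$

The estimate of $Y$ for a given $X$ is obtained by substituting that value of $X$ in this equation.
Regression equation of $X$ on $Y$

$$x - 29.75 = 0.8099\,(y - 24.875), \quad \text{i.e.,} \quad x \approx 0.810y + 9.60$$

The estimate of $X$ for a given $Y$ is obtained by substituting that value of $Y$ in this equation.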
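The two regression lines for this data set can be recomputed directly. The sketch below works with deviations from the means rather than from the assumed origins 29 and 27, which gives the same coefficients; `predict_y` and `predict_x` are helper names introduced here for illustration.

```python
import math

xs = [22, 26, 29, 30, 31, 31, 34, 35]
ys = [20, 20, 21, 29, 27, 24, 27, 31]
n = len(xs)

x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Sums of squares and products of deviations from the means
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_yy = sum((y - y_bar) ** 2 for y in ys)

b_yx = s_xy / s_xx  # regression coefficient of Y on X
b_xy = s_xy / s_yy  # regression coefficient of X on Y

# r is the geometric mean of the two coefficients, with their common sign
r = math.copysign(math.sqrt(b_yx * b_xy), b_yx)

def predict_y(x):
    """Estimate Y from X using the regression line of Y on X."""
    return y_bar + b_yx * (x - x_bar)

def predict_x(y):
    """Estimate X from Y using the regression line of X on Y."""
    return x_bar + b_xy * (y - y_bar)

print(round(b_yx, 4), round(b_xy, 4), round(r, 4))
```

Note that each estimate uses its own line: `predict_y` is not the algebraic inverse of `predict_x` unless the correlation is perfect.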
In a partially destroyed laboratory record of an analysis of correlation
data, the following results only are legible: Variance of $X$ = … . The
regression equations are … and … . What
were (a) the mean values of $X$ and $Y$, (b) the standard deviation of
$Y$, and (c) the correlation coefficient between $X$ and $Y$?
Sol:-
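The method for this kind of problem can be sketched in code: the means are the intersection of the two regression lines, the assignment of the lines is decided by requiring $b_{YX}\, b_{XY} \le 1$, and $\sigma_Y$ follows from $b_{YX} = r\sigma_Y/\sigma_X$. The line coefficients and the variance used below are hypothetical illustrative values, not the figures of this record, and `analyse_regression_pair` is a name introduced here. The sketch assumes both lines have non-zero x and y coefficients and that $r \neq 0$.

```python
import math

def analyse_regression_pair(line1, line2, var_x):
    """Each line is (a, b, c), meaning a*x + b*y + c = 0.

    Returns (x_bar, y_bar, r, sd_y).
    """
    (a1, b1, c1), (a2, b2, c2) = line1, line2

    # Both regression lines pass through (x_bar, y_bar): solve by Cramer's rule
    det = a1 * b2 - a2 * b1
    x_bar = (b1 * c2 - b2 * c1) / det
    y_bar = (a2 * c1 - a1 * c2) / det

    # Try both assignments; the valid one has 0 < b_yx * b_xy <= 1
    for y_on_x, x_on_y in ((line1, line2), (line2, line1)):
        b_yx = -y_on_x[0] / y_on_x[1]  # slope when solved for y
        b_xy = -x_on_y[1] / x_on_y[0]  # slope when solved for x
        product = b_yx * b_xy
        if 0 < product <= 1:
            r = math.copysign(math.sqrt(product), b_yx)
            sd_y = b_yx * math.sqrt(var_x) / r  # from b_yx = r * sd_y / sd_x
            return x_bar, y_bar, r, sd_y
    raise ValueError("no assignment of the lines gives |r| <= 1")

# Hypothetical example: 8x - 10y + 66 = 0, 40x - 18y - 214 = 0, Var(X) = 9
x_bar, y_bar, r, sd_y = analyse_regression_pair((8, -10, 66), (40, -18, -214), 9)
print(x_bar, y_bar, round(r, 4), round(sd_y, 4))
```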
A study of prices of rice at Chennai and Mumbai gave the following
data:
        Chennai   Mumbai
Mean     19.5     17.75
S.D.     1.75      2.5

Also the coefficient of correlation between the two is 0.8. Estimate
the most likely price of rice (a) at Chennai corresponding to the price
of Rs. 18 at Mumbai, and (b) at Mumbai corresponding to the price
of Rs. 17 at Chennai.
Sol:-
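One way to work this out, assuming the usual summary-statistic form of the regression coefficients ($r\sigma_X/\sigma_Y$ and $r\sigma_Y/\sigma_X$), is sketched below. The variable names are introduced here for illustration.

```python
# Summary statistics from the problem statement
mean_chennai, sd_chennai = 19.5, 1.75
mean_mumbai, sd_mumbai = 17.75, 2.5
r = 0.8

# Regression coefficient of the Chennai price on the Mumbai price
b_chennai_on_mumbai = r * sd_chennai / sd_mumbai
# Regression coefficient of the Mumbai price on the Chennai price
b_mumbai_on_chennai = r * sd_mumbai / sd_chennai

# (a) Chennai price when the Mumbai price is Rs. 18
price_chennai = mean_chennai + b_chennai_on_mumbai * (18 - mean_mumbai)

# (b) Mumbai price when the Chennai price is Rs. 17
price_mumbai = mean_mumbai + b_mumbai_on_chennai * (17 - mean_chennai)

print(round(price_chennai, 2), round(price_mumbai, 2))  # 19.64 14.89
```

Part (a) uses the line of Chennai on Mumbai and part (b) the line of Mumbai on Chennai, so the two estimates come from different equations.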
Ten competitors in a beauty contest were ranked by three judges as follows:

Competitors
Judges 1 2 3 4 5 6 7 8 9 10
A 6 5 3 10 2 4 9 7 8 1
B 5 8 4 7 10 2 1 6 9 3
C 4 9 8 1 2 3 10 5 7 6
Discuss which pair of judges has the nearest approach to common tastes in beauty.
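A problem of this type is usually handled with Spearman's rank correlation coefficient, $r = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$, where $d$ is the difference between the two ranks given to the same competitor. That formula is not stated in these notes, so its use here is an assumption. The sketch below applies it to each pair of judges using the ranks exactly as printed in the table; `spearman_rho` is a name introduced here, and the formula is valid because no judge has awarded tied ranks.

```python
from itertools import combinations

def spearman_rho(rank_a, rank_b):
    """Spearman's rank correlation for untied ranks."""
    n = len(rank_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * sum_d2 / (n * (n**2 - 1))

# Ranks as printed in the table above
ranks = {
    "A": [6, 5, 3, 10, 2, 4, 9, 7, 8, 1],
    "B": [5, 8, 4, 7, 10, 2, 1, 6, 9, 3],
    "C": [4, 9, 8, 1, 2, 3, 10, 5, 7, 6],
}

# Coefficient for every pair of judges
rho = {
    (p, q): spearman_rho(ranks[p], ranks[q])
    for p, q in combinations(ranks, 2)
}

for pair, value in rho.items():
    print(pair, round(value, 4))
```

The pair with the largest coefficient is the one whose rankings agree most closely.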
