U-3 Correlation and Regression


Definition: The correlation coefficient (Karl Pearson's coefficient of correlation) between two random variables X and Y is defined by

r(X, Y) = Cov(X, Y) / (σX σY), where Cov(X, Y) = E(XY) − E(X) E(Y).

(2) By the Cauchy–Schwarz inequality, we have [Cov(X, Y)]² ≤ Var(X) · Var(Y), so that r²(X, Y) ≤ 1.

Properties: (1) −1 ≤ r(X, Y) ≤ 1,

i.e., the correlation coefficient always lies between −1 and +1.

(2) If X and Y are independent, then Cov(X, Y) = 0 and hence r(X, Y) = 0,

i.e., independent variables are uncorrelated.

The converse of the result is not true, i.e., two uncorrelated variables may not
be independent.
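This remark can be checked numerically. In the sketch below (the data and variable names are my own, purely for illustration), X is symmetric about 0 and Y = X², so Y is completely determined by X and yet Cov(X, Y) = 0.

```python
# Uncorrelated does not imply independent: take X symmetric about 0
# and Y = X**2, so that Y is a deterministic function of X.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x * x for x in xs]

n = len(xs)
mx = sum(xs) / n                      # mean of X (= 0 here)
my = sum(ys) / n                      # mean of Y

# Sample covariance: Cov(X, Y) = (1/n) * sum((x - mx)(y - my))
cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
print(cov)   # 0.0: X and Y are uncorrelated, yet Y depends entirely on X
```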

Problems: (10.1)

(10.2)

Regression:

Lines of regression: The line of regression of Y on X is

y − ȳ = r (σy/σx) (x − x̄),

and the line of regression of X on Y is

x − x̄ = r (σx/σy) (y − ȳ).

Both lines pass through the point of means (x̄, ȳ).

Remark:

Regression Coefficients: The coefficient of regression of Y on X is byx = r σy/σx, and the coefficient of regression of X on Y is bxy = r σx/σy.

Properties: (1) byx · bxy = r², so the correlation coefficient is the geometric mean of the two regression coefficients; (2) both regression coefficients have the same sign as r; (3) if one of the regression coefficients is greater than unity, the other must be less than unity.

Angle between two lines of regression: If θ is the acute angle between the two lines of regression, then

tan θ = ((1 − r²)/|r|) · (σx σy / (σx² + σy²)).

When r = 0 the two lines are perpendicular, and when r = ±1 they coincide.
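As a numerical sketch (sample data invented for illustration), the code below computes r, the two regression coefficients byx = r σy/σx and bxy = r σx/σy, checks that their product is r², and evaluates the acute angle between the two regression lines from tan θ = ((1 − r²)/|r|) · σxσy/(σx² + σy²).

```python
import math

# Invented sample data for illustration
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)   # sigma_x
sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)   # sigma_y
r = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n * sx * sy)

byx = r * sy / sx          # regression coefficient of Y on X
bxy = r * sx / sy          # regression coefficient of X on Y
assert abs(byx * bxy - r * r) < 1e-12   # product of the coefficients is r^2

# Acute angle between the two lines of regression
tan_theta = (1 - r * r) / abs(r) * (sx * sy) / (sx * sx + sy * sy)
theta = math.degrees(math.atan(tan_theta))
print(round(r, 3), round(theta, 2))     # 0.775 14.04
```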



Properties of residuals:
Property 1. The sum of the products of any variable with every residual
is zero, provided the subscript of the variable occurs among the
secondary subscripts of the residual. That is, for example,

Σ X2 X1.23 = 0 and Σ X3 X1.23 = 0.

Property 2. The sum of the products of any two residuals is unaltered if
we omit, from one of the residuals, any or all of the secondary subscripts
which are common to those of the other. That is, for example,

Σ X1.2 X3.12 = Σ X1 X3.12.

Property 4. The variance of the residual of a variable given by the plane
of regression can be expressed in terms of the variable itself. That is,

σ²1.23 = (1/N) Σ X²1.23 = (1/N) Σ X1 X1.23.
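Properties 1 and 4 can be checked numerically by fitting the plane of regression of X1 on X2 and X3 by least squares. All data below are invented for illustration, and the variables are measured from their means.

```python
# Invented trivariate data; X1 is regressed on X2 and X3.
X1 = [2.0, 4.0, 6.0, 5.0, 8.0, 7.0]
X2 = [1.0, 2.0, 3.0, 3.0, 5.0, 4.0]
X3 = [3.0, 1.0, 4.0, 2.0, 6.0, 5.0]
n = len(X1)

dev = lambda xs: [v - sum(xs) / n for v in xs]   # deviations from the mean
d1, d2, d3 = dev(X1), dev(X2), dev(X3)
S = lambda a, b: sum(p * q for p, q in zip(a, b))

# Normal equations for X1 = b12.3 * X2 + b13.2 * X3 (deviation form)
det = S(d2, d2) * S(d3, d3) - S(d2, d3) ** 2
b123 = (S(d1, d2) * S(d3, d3) - S(d1, d3) * S(d2, d3)) / det
b132 = (S(d1, d3) * S(d2, d2) - S(d1, d2) * S(d2, d3)) / det
resid = [a - b123 * b - b132 * c for a, b, c in zip(d1, d2, d3)]  # X1.23

# Property 1: sum(X2 * X1.23) = 0 and sum(X3 * X1.23) = 0
print(abs(S(d2, resid)) < 1e-9, abs(S(d3, resid)) < 1e-9)   # True True
# Property 4: sum(X1.23^2) = sum(X1 * X1.23)
print(abs(S(resid, resid) - S(d1, resid)) < 1e-9)           # True
```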

Properties of multiple correlation coefficient:


(1) The multiple correlation coefficient is a non-negative quantity; it lies
between 0 and 1.
That is, 0 ≤ R1.23 ≤ 1, 0 ≤ R2.31 ≤ 1, 0 ≤ R3.12 ≤ 1.
(2) The order of the subscripts to the right of the dot makes no
difference to the meaning.
That is, R1.23 = R1.32, R2.31 = R2.13, R3.12 = R3.21.
(3) If R1.23 = 0, then r12 = 0 and r13 = 0, i.e., X1 is uncorrelated with both X2 and X3.

The following are some of the properties of the multiple correlation coefficient:

1. The multiple correlation coefficient is the degree of association between the observed
value of the dependent variable and its estimate obtained by multiple
regression.

2. If the multiple correlation coefficient is 1, the association is perfect and the
multiple regression equation may be said to be a perfect prediction formula.

3. If the multiple correlation coefficient is 0, the dependent variable is uncorrelated
with the other independent variables. From this it can be concluded that the
multiple regression equation fails to predict the value of the dependent variable
when the values of the independent variables are known.

5. The multiple correlation coefficient is never less than the simple correlation
coefficient between the dependent variable and any one of the regressors:
if R1.23 is the multiple correlation coefficient, then R1.23 ≥ r12 and R1.23 ≥ r13;
similarly for R2.31 and R3.12.

6. The multiple correlation coefficient obtained by the method of least squares is
never less than the multiple correlation coefficient obtained by any other method.

Note:

(1) R²i.jk = (r²ij + r²ik − 2 rij rjk rik) / (1 − r²jk)

(2) σ²1.23 = σ1² (1 − R²1.23)

(3) For an n-variate distribution, R²1.23…n = 1 − σ²1.23…n / σ1², and
σ²1.23…n = σ1² (1 − R²1.23…n).

(4) Since Var(e1.23) = Cov(X1, e1.23) = σ1² − σ²1.23, we have Cov(X1, e1.23) ≥ 0 and
hence R1.23 ≥ 0; that is, 0 ≤ R1.23 ≤ 1.

(5) If R1.23 = 1, then σ²1.23 = (1/N) Σ X²1.23 = 0, that is, all the regression
residuals are zero and hence X1 = b12.3 X2 + b13.2 X3, that is, the equation
of the regression plane may be treated as a perfect prediction formula for
X1.
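The sketch below (invented data) checks formulas (1) and (2) numerically: R²1.23 computed from the three simple correlation coefficients agrees with 1 − σ²1.23/σ1² obtained from the fitted regression plane, and property 5 above (R1.23 not less than r12 or r13) also holds.

```python
import math

# Invented trivariate data, as a check of formulas (1) and (2).
X1 = [2.0, 4.0, 6.0, 5.0, 8.0, 7.0]
X2 = [1.0, 2.0, 3.0, 3.0, 5.0, 4.0]
X3 = [3.0, 1.0, 4.0, 2.0, 6.0, 5.0]
n = len(X1)
dev = lambda xs: [v - sum(xs) / n for v in xs]
d1, d2, d3 = dev(X1), dev(X2), dev(X3)
S = lambda a, b: sum(p * q for p, q in zip(a, b))
r = lambda a, b: S(a, b) / math.sqrt(S(a, a) * S(b, b))

r12, r13, r23 = r(d1, d2), r(d1, d3), r(d2, d3)
R2 = (r12**2 + r13**2 - 2 * r12 * r13 * r23) / (1 - r23**2)   # formula (1)

# Residual variance from the least-squares regression plane
det = S(d2, d2) * S(d3, d3) - S(d2, d3) ** 2
b123 = (S(d1, d2) * S(d3, d3) - S(d1, d3) * S(d2, d3)) / det
b132 = (S(d1, d3) * S(d2, d2) - S(d1, d2) * S(d2, d3)) / det
resid = [a - b123 * b - b132 * c for a, b, c in zip(d1, d2, d3)]

var1 = S(d1, d1) / n          # sigma_1^2
var123 = S(resid, resid) / n  # sigma_1.23^2
print(abs(var123 - var1 * (1 - R2)) < 1e-9)   # formula (2): True
print(R2 >= r12**2 and R2 >= r13**2)          # property 5: True
```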

Method-1:
Cov(X1.3, X2.3) = Cov{(X1 − b13 X3)(X2 − b23 X3)}

= Cov(X1, X2) − b13 Cov(X2, X3) − b23 Cov(X1, X3) + b13 b23 Cov(X3, X3)

= r12 σ1 σ2 − r13 (σ1/σ3) r23 σ2 σ3 − r23 (σ2/σ3) r13 σ1 σ3 + r13 (σ1/σ3) r23 (σ2/σ3) σ3²,
since b13 = r13 σ1/σ3, b23 = r23 σ2/σ3 and Cov(X3, X3) = Var(X3) = σ3²

= r12 σ1 σ2 − r13 r23 σ1 σ2 = (r12 − r13 r23) σ1 σ2, and

Var(X1.3) = Var(X1 − b13 X3) = Var(X1) + b13² Var(X3) − 2 b13 Cov(X1, X3)

= σ1² + r13² (σ1²/σ3²) σ3² − 2 r13 (σ1/σ3) r13 σ1 σ3 = σ1² − r13² σ1² = σ1² (1 − r13²).

Similarly, Var(X2.3) = σ2² − r23² σ2² = σ2² (1 − r23²).

Hence r12.3 = Cov(X1.3, X2.3) / √(Var(X1.3) Var(X2.3)) = (r12 − r13 r23) / √((1 − r13²)(1 − r23²)).


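Method-1 expresses Cov(X1.3, X2.3) and the two residual variances in terms of the simple correlations; dividing gives the partial correlation coefficient r12.3. A numerical check with invented data, comparing the residual-based correlation with the closed form:

```python
import math

# Invented data: compare the correlation of the residuals X1.3, X2.3 with
# the closed form (r12 - r13*r23) / sqrt((1 - r13^2)(1 - r23^2)).
X1 = [2.0, 4.0, 6.0, 5.0, 8.0, 7.0]
X2 = [1.0, 2.0, 3.0, 3.0, 5.0, 4.0]
X3 = [3.0, 1.0, 4.0, 2.0, 6.0, 5.0]
n = len(X1)
dev = lambda xs: [v - sum(xs) / n for v in xs]
d1, d2, d3 = dev(X1), dev(X2), dev(X3)
S = lambda a, b: sum(p * q for p, q in zip(a, b))
r = lambda a, b: S(a, b) / math.sqrt(S(a, a) * S(b, b))

r12, r13, r23 = r(d1, d2), r(d1, d3), r(d2, d3)

b13 = S(d1, d3) / S(d3, d3)          # b13 = r13 * sigma1 / sigma3
b23 = S(d2, d3) / S(d3, d3)          # b23 = r23 * sigma2 / sigma3
x13 = [a - b13 * c for a, c in zip(d1, d3)]   # residual X1.3
x23 = [b - b23 * c for b, c in zip(d2, d3)]   # residual X2.3

lhs = r(x13, x23)   # partial correlation r12.3 computed from residuals
rhs = (r12 - r13 * r23) / math.sqrt((1 - r13**2) * (1 - r23**2))
print(abs(lhs - rhs) < 1e-9)   # True
```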

Method-2:

************************
