Covariance
1
Covariance
The covariance of two variables x and y in a data set measures how
the two are linearly related. A positive covariance would indicate a
positive linear relationship between the variables, and a negative
covariance would indicate the opposite.
2
Covariance
Variables may change in relation to each other
3
Variance vs Covariance
First, a note on your sample:
– If you wish to treat your sample as representative of the
general population (RANDOM EFFECTS MODEL), use the degrees
of freedom (n – 1) in your calculations of variance or covariance.
4
Variance vs Covariance
Do two variables change together?
Variance:
• Gives information on variability of a
single variable.
$$\mathrm{Var}(X) = S_x^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n - 1}$$
Covariance:
• Gives information on the degree to
which two variables vary together.
• Note how similar the covariance is to
variance: the equation simply
multiplies x’s error scores by y’s error
scores as opposed to squaring x’s error
scores.
$$\mathrm{cov}(x, y) = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$
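The parallel between the two formulas can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names and the data are hypothetical, and both functions use the (n - 1) divisor discussed above.

```python
def sample_variance(xs):
    """Sample variance: sum of squared deviations divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    return sum((x - mean_x) ** 2 for x in xs) / (n - 1)


def sample_covariance(xs, ys):
    """Sample covariance: sum of cross-products of deviations divided by n - 1."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


# Hypothetical illustration data (not from the slides).
xs = [2.0, 4.0, 6.0, 8.0]
ys = [1.0, 3.0, 2.0, 6.0]

var_x = sample_variance(xs)
cov_xx = sample_covariance(xs, xs)  # covariance of x with itself
cov_xy = sample_covariance(xs, ys)
print(var_x, cov_xx, cov_xy)
```

Passing the same list twice to `sample_covariance` reproduces the variance, which is the similarity the slide points out.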
5
Covariance
$$\mathrm{cov}(x, y) = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$
When X increases and Y increases: cov(x, y) is positive.
When X increases and Y decreases: cov(x, y) is negative.
When there is no consistent linear relationship: cov(x, y) is approximately 0.
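The three cases can be checked numerically. The sketch below uses small made-up datasets (hypothetical, chosen only so that the sign of each result is easy to see) and a helper function that applies the (n - 1) formula above.

```python
def sample_covariance(xs, ys):
    """Sum of cross-products of deviations divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


# Hypothetical illustration data (not from the slides).
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 5, 7, 9]    # tends to rise as x rises
y_down = [9, 7, 5, 4, 2]  # tends to fall as x rises
y_flat = [3, 1, 5, 1, 3]  # symmetric pattern with no linear trend

cov_up = sample_covariance(x, y_up)
cov_down = sample_covariance(x, y_down)
cov_flat = sample_covariance(x, y_flat)
print(cov_up, cov_down, cov_flat)
```

The first result is positive, the second negative, and the third is zero up to floating-point rounding.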
6
Covariance
When two random variables X and Y are not independent,
it is frequently of interest to assess how strongly they are
related to one another.
The covariance between two rv’s X and Y is
$$\mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$$
7
Covariance
For two rv’s X and Y and constants a and c,
$$\mathrm{Cov}(aX, cY) = ac\,\mathrm{Cov}(X, Y)$$
8
Covariance
For two rv’s X and Y and constants a, b, c and d,
$$\mathrm{Cov}(aX + b, cY + d) = ac\,\mathrm{Cov}(X, Y)$$
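The effect of the constants can be worked out from the definition of covariance as the expected product of deviations from the means. A short derivation, writing $\mu_X = E(X)$ and $\mu_Y = E(Y)$ and using $E(aX + b) = a\mu_X + b$:

$$
\begin{aligned}
\mathrm{Cov}(aX + b,\; cY + d)
  &= E\big[(aX + b - (a\mu_X + b))\,(cY + d - (c\mu_Y + d))\big] \\
  &= E\big[a(X - \mu_X)\cdot c(Y - \mu_Y)\big] \\
  &= ac\,E\big[(X - \mu_X)(Y - \mu_Y)\big] \\
  &= ac\,\mathrm{Cov}(X, Y)
\end{aligned}
$$

The shifts b and d cancel, and only the scale factors a and c survive. Setting b = d = 0 gives the result for Cov(aX, cY).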
9
Problem with Covariance:
The value of the covariance depends on the size of the
data’s standard deviations: when they are large, the covariance
will be larger in magnitude than when they are small, even if the
relationship between x and y is exactly the same in the two
datasets.
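A change of units makes the problem concrete. In the sketch below, the heights and weights are hypothetical values invented for illustration; the same heights are expressed once in metres and once in centimetres, so the relationship with weight is identical in both cases.

```python
def sample_covariance(xs, ys):
    """Sum of cross-products of deviations divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


# Hypothetical illustration data (not from the slides).
height_m = [1.50, 1.60, 1.70, 1.80, 1.90]
weight_kg = [50.0, 58.0, 66.0, 75.0, 86.0]

# Same heights in centimetres: only the unit (and so the spread) changes.
height_cm = [100 * h for h in height_m]

cov_m = sample_covariance(height_m, weight_kg)
cov_cm = sample_covariance(height_cm, weight_kg)
print(cov_m, cov_cm)
```

The second covariance is 100 times the first, although nothing about the underlying relationship has changed.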
10
Covariance
Ex: Calculate the covariance of the following pairs of
observations of two variates X and Y.
(1,6), (2,9), (3,6), (4,7), (5,8), (6,5), (7,12), (8,3), (9,17),
(10,1)
Sol:
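One way to carry out the computation is the short Python sketch below. The exercise does not state whether to divide by n or by n - 1, so both versions are computed; the variable names are illustrative.

```python
# The ten (x, y) pairs given in the exercise.
pairs = [(1, 6), (2, 9), (3, 6), (4, 7), (5, 8),
         (6, 5), (7, 12), (8, 3), (9, 17), (10, 1)]

n = len(pairs)
mean_x = sum(x for x, _ in pairs) / n  # 55 / 10 = 5.5
mean_y = sum(y for _, y in pairs) / n  # 74 / 10 = 7.4

# Sum of cross-products of deviations from the means.
cross = sum((x - mean_x) * (y - mean_y) for x, y in pairs)

cov_sample = cross / (n - 1)  # divisor n - 1
cov_population = cross / n    # divisor n
print(cross, cov_sample, cov_population)
```

The sum of cross-products is 4 (up to floating-point rounding), giving a covariance of about 0.44 with the n - 1 divisor and 0.4 with the n divisor.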
11
Example Covariance
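[Figure: scatter plot of the five (x, y) points]

  x    y    $x_i - \bar{x}$    $y_i - \bar{y}$    $(x_i - \bar{x})(y_i - \bar{y})$
  0    3         -3                 0                     0
  2    2         -1                -1                     1
  3    4          0                 1                     0
  4    0          1                -3                    -3
  6    6          3                 3                     9
$\bar{x} = 3$, $\bar{y} = 3$                              $\sum = 7$

$$\mathrm{cov}(x, y) = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{n - 1} = \frac{7}{4} = 1.75$$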
12
Covariance
13
Smoking v Lung Capacity Data
  #    Cigs (X)    Cap (Y)
  1       0          45
  2       5          42
  3      10          33
  4      15          31
  5      20          29
14
Smoking and Lung Capacity
[Figure: scatter plot of Lung Capacity (Y) against Smoking (yrs)]
15
Smoking v Lung Capacity
Observe that as smoking exposure goes up, corresponding
lung capacity goes down
16
Covariance
17
The Sample Covariance
As with the variance, for theoretical reasons the average is
typically computed using (N - 1), not N. Thus,
$$S_{xy} = \frac{1}{N - 1}\sum_{i=1}^{N}(X_i - \bar{X})(Y_i - \bar{Y})$$
18
Calculating Covariance
19
Calculating Covariance
Cigs (X)    $X - \bar{X}$    $(X - \bar{X})(Y - \bar{Y})$    $Y - \bar{Y}$    Cap (Y)
    0          -10                  -90                      9            45
    5           -5                  -30                      6            42
   10            0                    0                     -3            33
   15            5                  -25                     -5            31
   20           10                  -70                     -7            29
                                 ∑ = -215
20
Covariance Calculation
Evaluation yields,
$$S_{xy} = \frac{1}{4}(-215) = -53.75$$
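The hand calculation for the smoking data can be reproduced with a few lines of Python. This is a sketch using the five observations from the data table; the variable names are illustrative.

```python
# Smoking exposure (X) and lung capacity (Y) from the data table.
cigs = [0, 5, 10, 15, 20]
cap = [45, 42, 33, 31, 29]

n = len(cigs)
mean_x = sum(cigs) / n  # 10.0
mean_y = sum(cap) / n   # 36.0

# One cross-product of deviations per observation.
cross_products = [(x - mean_x) * (y - mean_y) for x, y in zip(cigs, cap)]

# Sample covariance with the N - 1 divisor.
s_xy = sum(cross_products) / (n - 1)
print(cross_products, sum(cross_products), s_xy)
```

The cross-products are -90, -30, 0, -25 and -70, which sum to -215, so the sample covariance is -215 / 4 = -53.75. The negative sign matches the observation that lung capacity goes down as smoking exposure goes up.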
21
Computational Formula
22
Covariance
That is, since $X - \mu_X$ and $Y - \mu_Y$ are the deviations of the
two variables from their respective mean values, the
covariance is the expected product of deviations. Note
that $\mathrm{Cov}(X, X) = E[(X - \mu_X)^2] = V(X)$.
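For discrete random variables, the expected product of deviations is a probability-weighted sum over the possible pairs. The sketch below assumes a small joint pmf invented for illustration (it is not from the slides) and checks that the covariance of X with itself equals V(X).

```python
# Hypothetical joint pmf p(x, y); the probabilities sum to 1.
pmf = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

# Means of X and Y under the joint pmf.
mu_x = sum(x * p for (x, _), p in pmf.items())
mu_y = sum(y * p for (_, y), p in pmf.items())

# Cov(X, Y): expected product of deviations from the means.
cov_xy = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in pmf.items())

# Cov(X, X): the same formula with Y replaced by X.
cov_xx = sum((x - mu_x) * (x - mu_x) * p for (x, _), p in pmf.items())

# V(X) computed directly as the expected squared deviation.
var_x = sum((x - mu_x) ** 2 * p for (x, _), p in pmf.items())
print(mu_x, mu_y, cov_xy, cov_xx, var_x)
```

Here both means are 0.5, Cov(X, Y) is about 0.15, and Cov(X, X) agrees with V(X) at 0.25.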
23
Covariance
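Suppose X and Y have a strong positive relationship, so that
large values of X tend to occur with large values of Y and
small values of X with small values of Y.
Then most of the probability mass or density will be
associated with $(x - \mu_X)$ and $(y - \mu_Y)$ either both positive
(both X and Y above their respective means) or both
negative, so the product $(x - \mu_X)(y - \mu_Y)$ will tend to be
positive.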
25
Covariance
Figure 5.4 illustrates the different possibilities. The
covariance depends on both the set of possible pairs
and the probabilities. In Figure 5.4, the probabilities
could be changed without altering the set of possible
pairs, and this could drastically change the value of
Cov(X, Y).
Figure 5.4: (a) positive covariance; (b) negative covariance; (c) covariance near zero
26
Covariance
The following shortcut formula for Cov(X, Y) simplifies the
computations.
Proposition
$\mathrm{Cov}(X, Y) = E(XY) - \mu_X \mu_Y$
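The shortcut follows from expanding the product of deviations and using linearity of expectation. A brief derivation, writing $\mu_X = E(X)$ and $\mu_Y = E(Y)$:

$$
\begin{aligned}
\mathrm{Cov}(X, Y)
  &= E\big[(X - \mu_X)(Y - \mu_Y)\big] \\
  &= E\big[XY - \mu_X Y - \mu_Y X + \mu_X \mu_Y\big] \\
  &= E(XY) - \mu_X E(Y) - \mu_Y E(X) + \mu_X \mu_Y \\
  &= E(XY) - \mu_X \mu_Y - \mu_X \mu_Y + \mu_X \mu_Y \\
  &= E(XY) - \mu_X \mu_Y
\end{aligned}
$$

Taking Y = X gives the familiar shortcut for the variance, $V(X) = E(X^2) - \mu_X^2$.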
27