
One-way Analysis of Covariance (ANCOVA)
Conceptual Tutorial

First
How did we get here?

Consider a similar problem

A pizza café owner wants to know which type of high school athlete she should market to.
Should she market to football, basketball, or soccer players?
So she measures the ounces of pizza eaten by 12 football, 12 basketball, and 12 soccer players in one sitting.

Here are the raw data (ounces of pizza eaten):

Football Players   Basketball Players   Soccer Players
29                 15                   32
24                 28                   27
14                 13                   15
27                 36                   23
27                 29                   26
28                 27                   17
27                 31                   25

The owner wondered how much these athletes like pizza to begin with and how that might affect the results. She surveyed them prior to their eating the pizza.

The Survey
On a scale of 1 to 10, how much do you like pizza?
1 2 3 4 5 6 7 8 9 10

Here were the results:

Football   Basketball   Soccer
7.0        3.0          7.5
5.0        8.0          4.5
3.5        4.5          3.5
9.0        9.5          6.0
7.0        6.5          6.0
8.0        7.0          4.5
6.5        7.5          6.0
7.5        9.0          1.5
2.5        8.5          6.5
9.0        4.0          5.0
8.0        7.5          5.5
5.0        8.0          4.0

Based on this information, let's determine how we got here.

Here is the problem again:

A pizza café owner wants to know which type of high school athlete she should market to, by comparing how many ounces of pizza are consumed across all three athlete groups.
She will control for pizza preference.

Is this: Inferential or Descriptive?

Based on the data set of 36 athletes, this is a sample from which the owner would like to make generalizations about potential athlete customers, so the question is INFERENTIAL.

Is this a question of: Difference or Relationship?

Because the owner wants to compare group differences, we are dealing with DIFFERENCE.

Is the distribution: Normal or Not Normal?

After graphing each column we find that the distributions are mostly normal.

Are the data: Scaled (ratio/interval/ordinal) or Categorical (ordinal)?

The data are scaled (ounces of pizza, a ratio-level measure).

Is there: 1 DV or 2 or more DVs?

[Ounces of pizza eaten is the only Dependent Variable (DV)]

Is there: 1 IV or 2 or more IVs?

[Type of Athlete is the only Independent Variable (IV)]

Is there: 1 IV Level or 2 or more IV Levels?

[Type of Athlete has three levels: football, basketball, and soccer]

Are the samples: Repeated or Independent?

No individual is in more than one group, so the samples are INDEPENDENT.

Is there: A Covariate or Not a Covariate?

[Pizza preference is the covariate]

[Decision tree: Inferential or Descriptive; Normal or Not Normal; Scaled or Categorical; 1 DV or 2 or more DV; 1 IV or 2 or more IV; 1 IV Level or 2 or more IV Levels; Independent or Repeated; A Covariate or Not a Covariate]

Now that we know how we got here, let's consider what Analysis of Covariance is.
First, . . . what is covariance?
As you know, variance is a statistic that helps you determine how much the data in a distribution vary.

[Figure: Number of pizza slices eaten by basketball players (axis values 5, 6, 7): not much variance. Number of pizza slices eaten by soccer players (axis values 5, 6, 9, 10): a lot of variance.]
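The contrast between a tightly bunched distribution and a widely spread one can be checked numerically. The following is a minimal sketch in Python; the slice counts are illustrative values chosen to mimic the two pictures and are an assumption, not data from the study.

```python
from statistics import pvariance

# Illustrative slice counts (assumed values, not study data)
tight_group = [5, 6, 7]        # scores bunched close together
spread_group = [5, 6, 9, 10]   # scores spread far apart

# Population variance: the average squared distance from the mean
tight_variance = pvariance(tight_group)
spread_variance = pvariance(spread_group)

print(tight_variance, spread_variance)
```

The more the scores spread away from their mean, the larger the variance.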

Covariance is a statistic that helps us determine how much two distributions that have some relationship covary.

Let's imagine that students take a math test and their ordered scores look like this. Then let's imagine they take a math anxiety survey.

Student   Test Scores
Bambi     98
Belle     92
Billy     84
Boston    77
Bryne     73
Bubba     68
Etc.      Etc.

How much do they covary?

Bambi has the highest Math Test Score and the highest Math Anxiety Score.
Belle has the 2nd highest Math Test Score and the 2nd highest Math Anxiety Score.
Billy has the 3rd highest Math Test Score and the 3rd highest Math Anxiety Score.
Boston has the 4th highest Math Test Score and the 4th highest Math Anxiety Score.
Bryne has the 5th highest Math Test Score and the 5th highest Math Anxiety Score.
Bubba has the 6th highest Math Test Score and the 6th highest Math Anxiety Score.

These two data sets perfectly covary.
This means as one changes the other changes, either in the same direction or in opposite directions.

Student   Test Scores   Math Anxiety Scores    Math Anxiety Scores
                        (same direction)       (opposite directions)
Bambi     98            6                      2
Belle     92            5                      3
Billy     84
Boston    77
Bryne     73            3                      5
Bubba     68            2                      6
Etc.      Etc.

Covariance is a statistic that describes that relationship. The larger the covariance statistic (either positive or negative), the more the two samples covary.
Let's demonstrate how to calculate covariance by hand.
(Although most statistical software can do it for you automatically.)

Here is the data set, with the sum and the mean of each column:

        Test Scores (X)   Anxiety Scores (Y)
        98                5
        92                6
        84                4
        77                4
        73                3
        68                2
Sum     492               24
Mean    82                4

Remember: covariance can only be computed between two or more variables (e.g., test questions, test scores, etc.) with scores across each variable for each person.

Here is the formula for covariance:

$$\sigma_{XY} = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{N}$$

With an explanation for each value:

X = Test Scores, Y = Anxiety Scores
$X_i$ = each Test Score
$\bar{X}$ = the mean for Test Scores (82)
$Y_i$ = each Anxiety Score
$\bar{Y}$ = the mean for Anxiety Scores (4; the Anxiety Scores sum to 24)

Let the Covariance Calculations Begin!

Deviation between each math score and the mean, $X_i - \bar{X}$:

98 - 82 = 16
92 - 82 = 10
84 - 82 = 2
77 - 82 = -5
73 - 82 = -9
68 - 82 = -14

Deviation between each Anxiety score and its mean, $Y_i - \bar{Y}$:

5 - 4 = 1
6 - 4 = 2
4 - 4 = 0
4 - 4 = 0
3 - 4 = -1
2 - 4 = -2

Multiply the two paired deviations to get what is called the cross product, $(X_i - \bar{X})(Y_i - \bar{Y})$:

16 x 1 = 16
10 x 2 = 20
2 x 0 = 0
-5 x 0 = 0
-9 x -1 = 9
-14 x -2 = 28

Here's the covariance equation again:

$$\sigma_{XY} = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{N}$$

So far we have calculated a portion of the numerator of this equation.
Now we will sum or add up the cross products:

16 + 20 + 0 + 0 + 9 + 28 = 73

Then we divide the result by the number of subjects, N, which in this case is 6:

73 / 6 = 12.2

Covariance = 12.2
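The hand calculation can be reproduced in a few lines of code. This is a minimal sketch in Python, assuming the six test scores and the six anxiety scores used in the deviation steps (5, 6, 4, 4, 3, 2); the function name `covariance` is only an illustrative choice.

```python
# Scores as they appear in the worked calculation
test_scores = [98, 92, 84, 77, 73, 68]
anxiety_scores = [5, 6, 4, 4, 3, 2]

def covariance(xs, ys):
    """Population covariance: sum of paired cross products divided by N."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Cross product = (deviation of x from its mean) * (deviation of y from its mean)
    cross_products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]
    return sum(cross_products) / n

cov_xy = covariance(test_scores, anxiety_scores)
print(round(cov_xy, 1))  # 12.2
```

Note that this divides by N, as the slides do. Many statistical packages divide by N - 1 by default (the sample covariance), which would give a slightly larger value.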

Let's see what the covariance looks like when the data go in the opposite direction:

[Tables: BEFORE and AFTER]

First we calculate the deviations from the mean.
We now compute the cross products.
We sum the cross products and then divide the total by the number of students:

-73 / 6 = -12.2

This is the covariance!
Notice that when there is a negative relationship between two variables, the covariance is negative.
On the other hand, when the relationship between two variables is positive . . .
The covariance will be positive.

16 + 20 + 0 + 0 + 9 + 28 = 73
73 / 6 = 12.2

Why is this important to know?
Especially in light of the question we are trying to answer?

The Problem: A pizza café owner wants to know which type of high school athlete she should market to, by comparing how many ounces of pizza are consumed across all three athlete groups. She will control for pizza preference.

Computing covariance helps us control for pizza preference, meaning that we want to know what the difference between the three groups would be if we took away all of the covariance between pizza preference and ounces of pizza eaten.

If we did not take out the covariance, then a bunch of soccer players may like pizza not because they are soccer players (our research question) but because they just LOVE PIZZA (not our research question).

Because their love of pizza is not what we are testing, we will control for it by computing covariance and see how much the fact that they are soccer players really affects the number of ounces of pizza they eat.

So, let's begin by running a One-way ANOVA without removing the covariance between pizza preference and ounces eaten by athlete type.

Here's the data (ounces of pizza eaten):

Football Players   Basketball Players   Soccer Players
29                 15                   32
24                 28                   27
14                 13                   15
27                 36                   23
27                 29                   26
28                 27                   17
27                 31                   25
32                 33                   14
13                 32                   29
35                 15                   22
32                 30                   30
17                 26                   25

Here are the results of the one-way ANOVA


for this data set:

Football
Players
29 oz. of pizza
24
14
27
27
28
27
32
13
35
32
17

eaten
oz. of pizza
eaten
oz. of pizza
eaten
oz. of pizza
eaten
oz. of pizza
eaten
oz. of pizza
eaten
oz. of pizza
eaten
oz. of pizza
eaten
oz. of pizza
eaten
oz. of pizza
eaten
oz. of pizza
eaten
oz. of pizza
eaten

Basketball
Players

Soccer
Players
32
oz. of pizza
15 oz. of pizza eaten
28 oz. of pizza eaten 27
13 oz. of pizza eaten 15
36 oz. of pizza eaten 23
29 oz. of pizza eaten 26
27 oz. of pizza eaten 17
31 oz. of pizza eaten 25
33 oz. of pizza eaten 14
32 oz. of pizza eaten 29
15 oz. of pizza eaten 22
30 oz. of pizza eaten 30
26 oz. of pizza eaten 25

eaten
oz. of pizza
eaten
oz. of pizza
eaten
oz. of pizza
eaten
oz. of pizza
eaten
oz. of pizza
eaten
oz. of pizza
eaten
oz. of pizza
eaten
oz. of pizza
eaten
oz. of pizza
eaten
oz. of pizza
eaten
oz. of pizza
eaten

Sums of
Squares
Between Groups
Within Groups
(error)

38.9
1587.4

Total

1626.3

df

Mean Square F-Ratio


0.
2
19.4
4

33

48.1

Here are the results of the one-way ANOVA for this data set:

Ounces of pizza eaten in one sitting

Football Players    Basketball Players    Soccer Players
      29                   15                  32
      24                   28                  27
      14                   13                  15
      27                   36                  23
      27                   29                  26
      28                   27                  17
      27                   31                  25
      32                   33                  14
      13                   32                  29
      35                   15                  22
      32                   30                  30
      17                   26                  25

                         Sums of Squares    df    Mean Square    F-Ratio
Between Groups                 38.9          2       19.4          0.4
Within Groups (error)        1587.4         33       48.1
Total                        1626.3

As you will recall, an F-ratio of 1 or lower with any ANOVA method is not significant.
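The ANOVA table above can be reproduced by hand. Here is a minimal sketch in plain Python, assuming the ounces listed on the slides are the complete data set; the function name `one_way_anova` and the dictionary keys are illustrative, not part of the original presentation.

```python
# Ounces of pizza eaten, copied from the slides (12 athletes per sport).
ounces = {
    "Football":   [29, 24, 14, 27, 27, 28, 27, 32, 13, 35, 32, 17],
    "Basketball": [15, 28, 13, 36, 29, 27, 31, 33, 32, 15, 30, 26],
    "Soccer":     [32, 27, 15, 23, 26, 17, 25, 14, 29, 22, 30, 25],
}


def one_way_anova(groups):
    """Return the sums of squares, df, mean squares and F for a one-way ANOVA."""
    all_values = [y for ys in groups.values() for y in ys]
    grand_mean = sum(all_values) / len(all_values)

    # Total variation: every score compared with the grand mean.
    ss_total = sum((y - grand_mean) ** 2 for y in all_values)
    # Between-groups variation: each group mean compared with the grand mean.
    ss_between = sum(
        len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2 for ys in groups.values()
    )
    # Within-groups (error) variation is what is left over.
    ss_within = ss_total - ss_between

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ss_total": ss_total,
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f_ratio": ms_between / ms_within,
    }


anova = one_way_anova(ounces)
for name, value in anova.items():
    print(f"{name}: {value:.1f}")
```

Rounded to one decimal place, the printed values should agree with the table on the slide.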

Then
After calculating the covariance between pizza preference and ounces of pizza eaten in one sitting, we find that there is a positive relationship.

Ounces of pizza eaten

Football Players    Basketball Players    Soccer Players
      29                   15                  32
      24                   28                  27
      14                   13                  15
      27                   36                  23
      27                   29                  26
      28                   27                  17
      27                   31                  25
      32                   33                  14
      13                   32                  29
      35                   15                  22
      32                   30                  30
      17                   26                  25

Pizza Preference (scale 1-10)

Football    Basketball    Soccer
  7.0          3.0          7.5
  5.0          8.0          4.5
  3.5          4.5          3.5
  9.0          9.5          6.0
  7.0          6.5          6.0
  8.0          7.0          4.5
  6.5          7.5          6.0
  7.5          9.0          1.5
  2.5          8.5          6.5
  9.0          4.0          5.0
  8.0          7.5          5.5
  5.0          8.0          4.0

Covariance = 12.1
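The covariance figure can be checked with a few lines of plain Python. This is a sketch under two assumptions that the slides do not state outright: each preference score is paired with the ounces value in the same row of the same column, and the covariance is the sample covariance (divided by n - 1) across all 36 athletes. The function name `sample_covariance` is illustrative.

```python
# Data copied from the slides. Pairing by row order is an assumption.
ounces = {
    "Football":   [29, 24, 14, 27, 27, 28, 27, 32, 13, 35, 32, 17],
    "Basketball": [15, 28, 13, 36, 29, 27, 31, 33, 32, 15, 30, 26],
    "Soccer":     [32, 27, 15, 23, 26, 17, 25, 14, 29, 22, 30, 25],
}
preference = {
    "Football":   [7.0, 5.0, 3.5, 9.0, 7.0, 8.0, 6.5, 7.5, 2.5, 9.0, 8.0, 5.0],
    "Basketball": [3.0, 8.0, 4.5, 9.5, 6.5, 7.0, 7.5, 9.0, 8.5, 4.0, 7.5, 8.0],
    "Soccer":     [7.5, 4.5, 3.5, 6.0, 6.0, 4.5, 6.0, 1.5, 6.5, 5.0, 5.5, 4.0],
}


def sample_covariance(xs, ys):
    """Sample covariance: summed cross-products of deviations over n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross_products / (n - 1)


# Ignore the sport and treat all 36 athletes as one sample.
all_preference = [x for sport in ounces for x in preference[sport]]
all_ounces = [y for sport in ounces for y in ounces[sport]]

covariance = sample_covariance(all_preference, all_ounces)
print(f"Covariance = {covariance:.1f}")
```

A positive value means that athletes who say they like pizza more also tend to eat more of it.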

After running the Analysis of Covariance on the data and partialling out pizza preference, here is the resulting ANOVA table:

                         SS      df     MS     F-Ratio
Adjusted means (BG)     74.5      2    37.2      3.8
Adjusted error (WG)    314.1     32     9.8
Adjusted total         388.6     34

Adjusted means (BG): the means after we took out the covariance between the two variables, Type of Athlete and Pizza Preference (the covariate).

Notice the F-ratio (3.8) is larger, making it more likely to be significant.
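For readers who want to see where the adjusted sums of squares come from, here is a sketch of the usual textbook one-covariate calculation: each sum of squares for ounces is reduced by the part that pizza preference can predict. It assumes the row-by-row pairing of preference and ounces used in the slide tables; the helper name `cross_products` and the variable names are illustrative.

```python
# Data copied from the slides. Pairing by row order is an assumption.
ounces = {
    "Football":   [29, 24, 14, 27, 27, 28, 27, 32, 13, 35, 32, 17],
    "Basketball": [15, 28, 13, 36, 29, 27, 31, 33, 32, 15, 30, 26],
    "Soccer":     [32, 27, 15, 23, 26, 17, 25, 14, 29, 22, 30, 25],
}
preference = {
    "Football":   [7.0, 5.0, 3.5, 9.0, 7.0, 8.0, 6.5, 7.5, 2.5, 9.0, 8.0, 5.0],
    "Basketball": [3.0, 8.0, 4.5, 9.5, 6.5, 7.0, 7.5, 9.0, 8.5, 4.0, 7.5, 8.0],
    "Soccer":     [7.5, 4.5, 3.5, 6.0, 6.0, 4.5, 6.0, 1.5, 6.5, 5.0, 5.5, 4.0],
}


def cross_products(xs, ys):
    """Return (Sxx, Sxy, Syy): sums of squares and cross-products about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxx, sxy, syy


sports = list(ounces)
all_x = [x for s in sports for x in preference[s]]
all_y = [y for s in sports for y in ounces[s]]

# Total: deviations from the grand means, ignoring the groups.
sxx_t, sxy_t, syy_t = cross_products(all_x, all_y)
# Within: deviations from each group's own means, pooled over the groups.
per_group = [cross_products(preference[s], ounces[s]) for s in sports]
sxx_w, sxy_w, syy_w = (sum(column) for column in zip(*per_group))

# Remove the part of each sum of squares that the covariate predicts.
adj_total = syy_t - sxy_t ** 2 / sxx_t
adj_error = syy_w - sxy_w ** 2 / sxx_w
adj_means = adj_total - adj_error

df_means = len(sports) - 1
df_error = len(all_y) - len(sports) - 1  # one extra df is spent on the covariate
ms_means = adj_means / df_means
ms_error = adj_error / df_error
f_ratio = ms_means / ms_error

print(f"Adjusted means (BG): SS={adj_means:.1f} df={df_means} MS={ms_means:.1f}")
print(f"Adjusted error (WG): SS={adj_error:.1f} df={df_error} MS={ms_error:.1f}")
print(f"Adjusted total:      SS={adj_total:.1f}")
print(f"F-ratio: {f_ratio:.1f}")
```

The error sum of squares shrinks far more than the between-groups sum of squares grows, which is why the F-ratio rises.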

Let's compare the F-ratio for just the ANOVA with the F-ratio for the ANCOVA.

Before

                          SS      df     MS     F-Ratio
Between Groups           38.9      2    19.4      0.4
Within Groups (error)  1587.4     33    48.1
Total                  1626.3

After

                          SS      df     MS     F-Ratio
Adjusted means (BG)      74.5      2    37.2      3.8
Adjusted error (WG)     314.1     32     9.8
Adjusted total          388.6     34

So we would conclude that there is a significant difference between football, basketball, and soccer players in terms of the ounces of pizza they eat, that is, when we control for pizza preference.

We can even adjust the original means for ounces of pizza eaten, after controlling for preference.

Original Means

Athlete        Mean
Football       25.4
Basketball     26.3
Soccer         23.8

Adjusted Means (after controlling for the covariance)

Athlete        Mean
Football       24.3
Basketball     23.8
Soccer         27.3

Notice that after controlling for pizza preference, the mean for Basketball players drops.

And the mean for Soccer players INCREASES!

That's the Power of ANCOVA!
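The adjusted means can be reproduced with a short sketch. It uses the common textbook adjustment, which the slides do not spell out: each group's mean is moved along the pooled within-group slope by the distance between that group's average preference and the overall average preference. The names `within_slope` and `adjusted_means` are illustrative, and pairing the two tables row by row is an assumption.

```python
# Data copied from the slides. Pairing by row order is an assumption.
ounces = {
    "Football":   [29, 24, 14, 27, 27, 28, 27, 32, 13, 35, 32, 17],
    "Basketball": [15, 28, 13, 36, 29, 27, 31, 33, 32, 15, 30, 26],
    "Soccer":     [32, 27, 15, 23, 26, 17, 25, 14, 29, 22, 30, 25],
}
preference = {
    "Football":   [7.0, 5.0, 3.5, 9.0, 7.0, 8.0, 6.5, 7.5, 2.5, 9.0, 8.0, 5.0],
    "Basketball": [3.0, 8.0, 4.5, 9.5, 6.5, 7.0, 7.5, 9.0, 8.5, 4.0, 7.5, 8.0],
    "Soccer":     [7.5, 4.5, 3.5, 6.0, 6.0, 4.5, 6.0, 1.5, 6.5, 5.0, 5.5, 4.0],
}


def mean(values):
    return sum(values) / len(values)


# Pooled within-group slope of ounces on preference.
sxy_within = 0.0
sxx_within = 0.0
for sport in ounces:
    mean_x = mean(preference[sport])
    mean_y = mean(ounces[sport])
    for x, y in zip(preference[sport], ounces[sport]):
        sxy_within += (x - mean_x) * (y - mean_y)
        sxx_within += (x - mean_x) ** 2
within_slope = sxy_within / sxx_within

# Average preference across all 36 athletes.
grand_preference = mean([x for sport in preference for x in preference[sport]])

# Shift each group mean to where it would sit at the overall average preference.
adjusted_means = {
    sport: mean(ounces[sport])
    - within_slope * (mean(preference[sport]) - grand_preference)
    for sport in ounces
}

for sport in ounces:
    print(f"{sport}: original {mean(ounces[sport]):.1f}, "
          f"adjusted {adjusted_means[sport]:.1f}")
```

Basketball players report the highest average preference, so their mean is adjusted downward; soccer players report the lowest, so their mean is adjusted upward.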

Important note:

The more the covariate (pizza preference) covaries with the independent variable (type of athlete), the bigger the adjustment will be between original and adjusted means.

Meaning they share a larger covariance value (either positive or negative).

Big adjustments between the original and adjusted means:

Athlete        Original Mean    Adjusted Mean (after controlling for the covariance)
Football           25.4             24.3
Basketball         26.3             23.8
Soccer             23.8             27.3
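One way to see why the size of the adjustment depends on the covariate is the common textbook formula for an adjusted mean. The symbols are not from the slides: \bar{Y}_j is the original mean for group j, \bar{X}_j is that group's average pizza preference, \bar{X} is the average preference across all athletes, and b_w is the pooled within-group slope of ounces on preference. The numbers in the worked line are computed from the data tables shown earlier, not quoted from the presentation.

```latex
\bar{Y}_j^{\,\text{adj}} = \bar{Y}_j - b_w \,(\bar{X}_j - \bar{X})

% Soccer, with b_w \approx 3.23, \bar{X}_j \approx 5.04, \bar{X} \approx 6.15:
\bar{Y}_{\text{soccer}}^{\,\text{adj}} \approx 23.75 - 3.23\,(5.04 - 6.15) \approx 27.3
```

A group whose average preference sits far from the overall average gets a large adjustment, and a group sitting right at the overall average gets none.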

One final note

In this case the covariate (the thing we were controlling for) was a continuous variable, like:
  Ounces of pizza eaten
  Time it takes to eat pizza
  The weight of each athlete
But it can also be categorical (one category or another):
  Year in School (Sophomores, Juniors, or Seniors)
  Gender (Male or Female)
  Religious Affiliation (Muslim, Catholic, etc.)

In summary

Analysis of Covariance is a powerful tool that makes it possible to control for any variable that is not of interest (e.g., pizza preference) in order to see the true effect of the variable of interest (type of athlete) on a dependent variable of interest (ounces of pizza eaten).

There are more complex methods, such as Factorial ANCOVA, Repeated Measures ANCOVA, and Multivariate ANCOVA.

This presentation gives you the conceptual foundation necessary to understand the Analysis of Covariance elements of these methods.
