
ANOVA

What is Analysis of Variance

Analysis of variance (ANOVA) is an analysis tool used in statistics that splits the observed aggregate variability found inside a data set into two parts: systematic factors and random factors. The systematic factors have a statistical influence on the given data set, while the random factors do not. Analysts use the ANOVA test to determine the influence that independent variables have on the dependent variable in a regression study.

What Does the Analysis of Variance Reveal?

The ANOVA test is the initial step in analyzing factors that affect a given data set. Once the test is finished, an analyst performs additional testing on the methodical factors that measurably contribute to the data set's inconsistency. The analyst utilizes the ANOVA test results in an f-test to generate additional data that aligns with the proposed regression models.

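The Analysis of Variance is used to test whether the MEANS of two or more populations are SIGNIFICANTLY DIFFERENT. It does so by comparing two estimates of the variance: one from the variation between groups and one from the variation within groups.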

1. ONE WAY ANOVA or SINGLE FACTOR - samples of size n are selected from each of k populations.
It is assumed that the k populations are independent and normally distributed.

The hypothesis will be:   Hₒ: µ₁ = µ₂ = µ₃ = ... = µk
                          H₁: at least two of the means are not equal
samples   objects                        SQUARES OF VALUES
STUDENT   MATH   ENGLISH   SCIENCE       M²
1         85     87        86            7225
2         81     82        77            6561
3         78     80        81            6084
4         92     94        90            8464
5         88     86        85            7744
Sums      424    429       419           36078
Means     84.8   85.8      83.8

T = 424 + 429 + 419 = 1272     ƩX² = 108214     n = 5     k = 3

SST = ƩX² - T² / (nk) = 108214 - (1272)² / ((5)(3)) = 108214 - 107865.6 = 348.4
SSC = [(424)² + (429)² + (419)²] / 5 - (1272)² / ((5)(3)) = 107875.6 - 107865.6 = 10
SSE = SST - SSC = 348.4 - 10 = 338.4

S₁² = SSC / (k - 1) = 10 / (3 - 1) = 5

S₂² = SSE / [k(n - 1)] = 338.4 / [3(5 - 1)] = 28.2

f = S₁² / S₂² = 5 / 28.2 = 0.177     (f computed)

From Table A7 (found at the 2nd column and 12th row), because:
    v₁ = k - 1 = 3 - 1 = 2             2nd column
    v₂ = k(n - 1) = 3(5 - 1) = 12      12th row

The critical value of f = 3.89

Analysis: f critical > f computed (3.89 > 0.177)

Conclusion: There is NO SIGNIFICANT DIFFERENCE between the means.

The mean grades of the students in MATH, ENGLISH and SCIENCE do not differ significantly at the 0.05 level.
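The hand computation above can be checked with a short script. This is a minimal sketch that applies the same sum-of-squares formulas to the grades in the table. The variable names are illustrative and not part of the worksheet.

```python
# One-way ANOVA by the computational (sum-of-squares) formulas.
# The grades are copied from the worked example; all names are illustrative.
grades = {
    "MATH":    [85, 81, 78, 92, 88],
    "ENGLISH": [87, 82, 80, 94, 86],
    "SCIENCE": [86, 77, 81, 90, 85],
}

k = len(grades)                                   # number of groups (subjects)
n = len(grades["MATH"])                           # observations per group
T = sum(sum(col) for col in grades.values())      # grand total
sum_x2 = sum(x * x for col in grades.values() for x in col)

correction = T ** 2 / (n * k)                     # T² / (nk)
sst = sum_x2 - correction                         # total sum of squares
ssc = sum(sum(col) ** 2 for col in grades.values()) / n - correction
sse = sst - ssc                                   # error sum of squares

s1_sq = ssc / (k - 1)                             # between-groups mean square
s2_sq = sse / (k * (n - 1))                       # within-groups mean square
f_computed = s1_sq / s2_sq

print(f"SST={sst:.1f}  SSC={ssc:.1f}  SSE={sse:.1f}  f={f_computed:.3f}")
```

The computed f is then compared with the tabulated critical value for v₁ = k - 1 and v₂ = k(n - 1) degrees of freedom.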

Anova: Single Factor

SUMMARY
Groups    Count   Sum   Average   Variance
MATH      5       424   84.8      30.7
ENGLISH   5       429   85.8      29.2
SCIENCE   5       419   83.8      24.7

ANOVA
Source of Variation          SS      df   MS     F       P-value   F crit
Between Groups (SUBJECTS)    10      2    5      0.177   0.84      3.89
Within Groups                338.4   12   28.2

Total                        348.4   14

F crit is read from Table A7 at the 2nd column, 12th row.

The value of f computed IS WITHIN THE ACCEPTANCE REGION.

Using the p-value (0.84), compare the p-value with the level of significance α:

IF p is LESS THAN α, REJECT the null hypothesis (the value is within the REJECTION REGION)
IF p is GREATER THAN α, ACCEPT (that is, do not reject) the null hypothesis (the value is inside the ACCEPTANCE REGION)

Here p is GREATER THAN α (0.84 > 0.05), so ACCEPT the null hypothesis.

THERE IS NO SIGNIFICANT DIFFERENCE BETWEEN THE MEANS

Problem #2 / page 400

The following data represent the number of packages of five popular brands of cigarettes
sold by a supermarket on 8 randomly selected days:

                  BRANDS
DAY      A     B     C      D     E
1        21    35    25     32    45
2        35    12    60     53    29
3        32    27    33     29    31
4        28    41    36     42    22
5        14    19    31     40    36
6        47    23    40     23    29
7        25    31    43     35    42
8        38    20    48     42    30
means    30    26    39.5   37    33

Perform an analysis of variance, at the 0.05 level of significance, and determine whether or not
the 5 brands sell, on the average, the same number of cigarettes at this supermarket.

Solution: Using the Excel Analysis ToolPak (Data Analysis)

STEPS
1   Hₒ: µa = µb = µc = µd = µe
2   H₁: at least two of the cigarette brands have different sales means

Anova: Single Factor

SUMMARY
BRANDS   Count   Sum   Average   Variance
A        8       240   30        106.86
B        8       208   26        88.86
C        8       316   39.5      120.29
D        8       296   37        86.29
E        8       264   33        57.14

ANOVA
Source of Variation        SS       df   MS      F      P-value   F crit
Between Groups (BRANDS)    929.6    4    232.4   2.53   0.0579    2.64
Within Groups              3216     35   91.89

Total                      4145.6   39

At an alpha = 0.05:
Analysis: F computed < F crit (2.53 < 2.64)
ACCEPT the null hypothesis

Conclusion: There is NO SIGNIFICANT DIFFERENCE between the mean sales of the cigarette brands.
There is NOT enough evidence to REJECT the null hypothesis at 0.05 alpha.
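The ANOVA table can also be rebuilt from the SUMMARY block alone, which is a quick way to sanity-check spreadsheet output. The sketch below is not how Excel computes it. It assumes only the counts, averages and rounded variances printed above, so the within-groups figures agree with the table only up to rounding.

```python
# Rebuild a one-way ANOVA table from per-group summary statistics.
# (count, mean, sample variance) per brand, copied from the SUMMARY table.
summary = {
    "A": (8, 30.0, 106.86),
    "B": (8, 26.0, 88.86),
    "C": (8, 39.5, 120.29),
    "D": (8, 37.0, 86.29),
    "E": (8, 33.0, 57.14),
}

k = len(summary)                                    # number of groups
N = sum(n for n, _, _ in summary.values())          # total observations
grand_mean = sum(n * m for n, m, _ in summary.values()) / N

# Between groups: weighted squared deviations of group means from the grand mean.
ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in summary.values())
# Within groups: each sample variance times its degrees of freedom.
ss_within = sum((n - 1) * v for n, _, v in summary.values())

ms_between = ss_between / (k - 1)
ms_within = ss_within / (N - k)
f_computed = ms_between / ms_within

print(f"SS between={ss_between:.1f}  SS within={ss_within:.1f}  F={f_computed:.2f}")
```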

#4 / page 400   Three sections of the same elementary mathematics course are taught by 3 teachers.
The final grades are recorded as follows:

                         TEACHER                SQUARES
                   A        B         C         A²
                   73       88        68        5329
                   89       78        79        7921
                   82       48        56        6724
                   43       91        91        1849
                   80       51        71        6400
                   73       85        71        5329
                   66       74        87        4356
                   60       77        41        3600
                   45       31        59        2025
                   93       78        68        8649
                   36       62        53        1296
                   77       76        79        5929
                            96        15
                            80
                            36
Sums (column)      817      1051      838       T = 2706     ƩA² = 59407
Square of Sums     667489   1104601   702244
Means              68.08    70.07     64.46

n₁ = 12     n₂ = 15     n₃ = 13     N = 40     k = 3     ƩX² = 197622

SST = ƩX² - T² / N = 197622 - 2706² / 40 = 14561.1

SSC = (817² / 12 + 1051² / 15 + 838² / 13) - (2706² / 40) = 222.02

SSE = SST - SSC = 14339.08

S₁² = SSC / (k - 1) = 222.02 / (3 - 1) = 111.01

S₂² = SSE / (N - k) = 14339.08 / (40 - 3) = 387.54

f = S₁² / S₂² = 0.286     (computed value)
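With unequal sample sizes, each column total is divided by its own nᵢ and the error degrees of freedom become N - k. The following sketch reproduces the computation for the three teachers under exactly those formulas. The dictionary layout and names are assumptions made for illustration.

```python
# One-way ANOVA with unequal group sizes (grades by teacher).
teachers = {
    "A": [73, 89, 82, 43, 80, 73, 66, 60, 45, 93, 36, 77],
    "B": [88, 78, 48, 91, 51, 85, 74, 77, 31, 78, 62, 76, 96, 80, 36],
    "C": [68, 79, 56, 91, 71, 71, 87, 41, 59, 68, 53, 79, 15],
}

k = len(teachers)                                   # number of groups
N = sum(len(g) for g in teachers.values())          # total observations
T = sum(sum(g) for g in teachers.values())          # grand total
sum_x2 = sum(x * x for g in teachers.values() for x in g)

correction = T ** 2 / N
sst = sum_x2 - correction
# Each squared column total is divided by that column's own sample size.
ssc = sum(sum(g) ** 2 / len(g) for g in teachers.values()) - correction
sse = sst - ssc

s1_sq = ssc / (k - 1)
s2_sq = sse / (N - k)            # error df is N - k, not k(N - 1)
f_computed = s1_sq / s2_sq

print(f"SST={sst:.1f}  SSC={ssc:.2f}  SSE={sse:.2f}  f={f_computed:.3f}")
```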

Is there a significant difference in the average grades given by the 3 teachers?
Use a 0.05 level of significance.

STEPS
1   Hₒ: µa = µb = µc
2   H₁: at least two of the means are not equal
3   Level of significance: α = 0.05
4   F critical from Table A7, at v₁ = k - 1 = 3 - 1 = 2 and v₂ = N - k = 40 - 3 = 37:
    F crit = 3.26 (note that interpolation is needed)
5   Find the F comp by EXCEL

Anova: Single Factor

SUMMARY
TEACHERS   Count   Sum    Average       Variance
A          12      817    68.08333333   343.901515
B          15      1051   70.06666667   398.638095
C          13      838    64.46153846   414.602564

ANOVA
Source of Variation   SS         df   MS       F
Between Groups        222.02     2    111.01   0.286
Within Groups         14339.08   37   387.54

Total                 14561.1    39

6   DECISION: F crit > F comp (3.26 > 0.286), ACCEPT the null hypothesis

Conclusion: There is NO SIGNIFICANT DIFFERENCE in the mean grades of the groups of math students taught by the three teachers.
THE LEVEL OF COMPETENCE of the students in the three classes shows no significant difference.

#10 / page 402
                BLENDS OF COFFEE
         1        2        3        4
         25.6     25.2     20.8     31.6
         24.3     28.6     26.7     29.8
         27.9     24.7     22.2     34.3
Means    25.933   26.167   23.233   31.900

a   Is there a significant difference in the average percentage reduction in yield for different blends?
    Use a 0.05 level of significance.

STEPS
1   Hₒ: µ₁ = µ₂ = µ₃ = µ₄
2   H₁: at least two of the means are not equal
3   Level of significance: α = 0.05
4   F critical from Table A7 (v₁ = 3, v₂ = 8): F crit = 4.07
5   Anova: Single Factor

SUMMARY
Groups   Count   Sum    Average       Variance
1        3       77.8   25.93333333   3.32333333
2        3       78.5   26.16666667   4.50333333
3        3       69.7   23.23333333   9.50333333
4        3       95.7   31.9          5.13

ANOVA (BLENDS OF COFFEE)
Source of Variation   SS           df   MS      F
Between Groups        119.65       3    39.88   7.10
Within Groups         44.92        8    5.62

Total                 164.569167   11

6   Analysis: F crit < F comp (4.07 < 7.10), REJECT the null hypothesis
CONCLUSION: There are at least two blends whose means SIGNIFICANTLY DIFFER.

Using the p-value: 0.01 < 0.05, so the computed value is WITHIN THE REJECTION REGION.

TWO WAY CLASSIFICATION - a set of observations classified according to two criteria at once

A. WITH SINGLE OBSERVATION OR WITHOUT REPLICATION

EXAMPLE:
PROBLEM #3 / page 418

The following data represent the final grades obtained by 5 students in Math, English, French and Biology.
Use a 0.05 level of significance to test the hypothesis that:

a. the courses are of equal difficulty     Hₒ: µ math = µ English = µ French = µ biology
b. the students have equal ability         Hₒ: µa = µb = µc = µd = µe

                              SUBJECT
STUDENT         MATH     ENGLISH   FRENCH   BIOLOGY
A               68       57        73       61
B               83       94        91       86
C               72       81        63       59
D               55       73        77       66
E               92       68        75       87
sum (columns)   370      373       379      359
Sqr of Sums
of Columns      136900   139129    143641   128881

T = 1481     ƩX² = 112441     r = 5 rows (students)     c = 4 columns (subjects)

SST = ƩX² - T² / (rc) = 112441 - 1481² / (5*4) = 2772.95

SSC = (370² + 373² + 379² + 359²) / 5 - 1481² / (5*4) = 42.15

SSR = (259² + 354² + 275² + 271² + 322²) / 4 - 1481² / (5*4) = 1618.7

SSE = SST - SSR - SSC = 1112.1

S₁² = SSR / (r - 1) = 1618.7 / (5 - 1) = 404.675

S₂² = SSC / (c - 1) = 42.15 / (4 - 1) = 14.05

S₃² = SSE / [(r - 1)(c - 1)] = 1112.1 / [(5 - 1)(4 - 1)] = 92.675

f₁ = S₁² / S₃² = 4.37

f₂ = S₂² / S₃² = 0.15
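The two-way computation without replication follows the same pattern, with one sum of squares for rows and one for columns. As a hedged sketch using the grades above (the data layout and names are illustrative only):

```python
# Two-way ANOVA without replication: rows = students, columns = subjects.
grades = {            # student -> [Math, English, French, Biology]
    "A": [68, 57, 73, 61],
    "B": [83, 94, 91, 86],
    "C": [72, 81, 63, 59],
    "D": [55, 73, 77, 66],
    "E": [92, 68, 75, 87],
}

rows = list(grades.values())
r, c = len(rows), len(rows[0])                      # 5 students, 4 subjects
T = sum(sum(row) for row in rows)                   # grand total
correction = T ** 2 / (r * c)

sst = sum(x * x for row in rows for x in row) - correction
ssr = sum(sum(row) ** 2 for row in rows) / c - correction        # students
ssc = sum(sum(col) ** 2 for col in zip(*rows)) / r - correction  # subjects
sse = sst - ssr - ssc

s3_sq = sse / ((r - 1) * (c - 1))                   # error mean square
f_rows = (ssr / (r - 1)) / s3_sq                    # f₁, students
f_cols = (ssc / (c - 1)) / s3_sq                    # f₂, subjects

print(f"SSR={ssr:.1f}  SSC={ssc:.2f}  SSE={sse:.1f}  f1={f_rows:.2f}  f2={f_cols:.2f}")
```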
STEPS
1   Hₒ: µ math = µ English = µ French = µ biology
        µa = µb = µc = µd = µe

2   H₁: At least two of the subjects have significantly different levels of difficulty
        At least two of the students have significantly different levels of ability

3   Level of significance: α = 0.05

4   F critical from Table A7: 3.26 for the students (v₁ = 4, v₂ = 12) and 3.49 for the subjects (v₁ = 3, v₂ = 12)

5   Find the F comp by EXCEL
    1. Go to the Data Analysis ToolPak
    2. Use the "Anova: Two-Factor Without Replication" function
    3. Provide data to the function

Anova: Two-Factor Without Replication

SUMMARY   Count   Sum   Average   Variance
A         4       259   64.75     50.9166667
B         4       354   88.5      24.3333333
C         4       275   68.75     96.25
D         4       271   67.75     92.9166667
E         4       322   80.5      120.333333

MATH      5       370   74        201.5
ENGLISH   5       373   74.6      193.3
FRENCH    5       379   75.8      101.2
BIOLOGY   5       359   71.8      186.7

ANOVA
Source of Variation    SS        df   MS       F
STUDENTS (Rows)        1618.7    4    404.68   4.37
SUBJECTS (Columns)     42.15     3    14.05    0.15
Error                  1112.1    12   92.68

Total                  2772.95   19

6   THERE EXISTS A SIGNIFICANT DIFFERENCE IN THE LEVEL OF ABILITY OF AT LEAST TWO STUDENTS
    THERE IS NO SIGNIFICANT DIFFERENCE IN THE LEVEL OF DIFFICULTY OF THE SUBJECTS (TEACHERS HAVE SAME COMPETENCE)
Problem #6 / page 419
A study is made to determine the force required to pull apart pieces of glued plastic.
Three types of plastic were tested using four levels of humidity. The results, in kilograms,
are given as follows:
                        HUMIDITY
Plastic type   30%    40%    50%    60%
A              39     33.1   33.8   33
B              36.9   27.2   29.7   28.5
C              27.4   29.2   26.7   30.9

Use a 0.05 level of significance to test the hypothesis that
there is NO SIGNIFICANT DIFFERENCE IN THE MEAN FORCE required to pull the glued plastic apart

a   when different types of plastics are used
b   for different humidity conditions

STEPS

1   Hₒ: µA = µB = µC
        µ 30% = µ 40% = µ 50% = µ 60%

2   H₁: At least two tests have a significant difference in the mean force when different types of plastics are used
        At least two tests have a significant difference in the mean force for different humidity conditions

3   Level of significance: α = 0.05

4   F critical from Table A7: 5.14 for the types of plastics (v₁ = 2, v₂ = 6) and 4.76 for humidity (v₁ = 3, v₂ = 6)

5   Find the F comp by EXCEL
    1. Go to the Data Analysis ToolPak
    2. Use the "Anova: Two-Factor Without Replication" function
    3. Provide data to the function

Anova: Two-Factor Without Replication

SUMMARY          Count   Sum     Average       Variance
PLASTICS   A     4       138.9   34.725        8.24916667
           B     4       122.3   30.575        18.8225
           C     4       114.2   28.55         3.56333333

HUMIDITY   30%   3       103.3   34.43333333   38.2033333
           40%   3       89.5    29.83333333   9.00333333
           50%   3       90.2    30.06666667   12.7033333
           60%   3       92.4    30.8          5.07

ANOVA
Source of Variation         SS       df   MS      F
Types of Plastics (Rows)    79.27    2    39.64   4.69
Humidity (Columns)          41.22    3    13.74   1.63
Error                       50.69    6    8.45

Total                       171.18   11

6   There is NO SIGNIFICANT DIFFERENCE IN THE MEAN FORCE required to pull the glued plastic apart when different types of plastics are used
    There is NO SIGNIFICANT DIFFERENCE IN THE MEAN FORCE required to pull the glued plastic apart for the different humidity conditions

B. WITH INTERACTION OR REPLICATION

#8 / page 420   The following data represent the results of four quizzes obtained by five students in Math,
English, French and Biology.

Student   Math   English   French   Biology   SUMS
1         88     51        73       87        299
          79     72        77       92        320
          63     58        81       81        283
          80     65        77       76        298
2         79     85        82       80        326
          56     67        80       62        265
          96     95        36       93        320
          68     88        68       67        291
3         67     74        91       77        309
          51     59        59       84        253
          66     47        95       70        278
          89     82        92       73        336
4         35     76        43       55        209
          64     26        42       53        185
          60     49        52       49        210
          70     76        32       56        234
5         99     84        95       83        361
          87     83        98       87        355
          77     94        81       76        328
          95     76        96       80        347
SUMS      1469   1407      1450     1481      5807
Squares of Sums of Columns   2157961   1979649   2102500   2193361   (total 8433471)

Use a 0.05 level of significance to test the hypothesis that

a   the courses have equal difficulty
b   the students have equal ability
c   the students and the subjects do not interact

STEPS

1   Hₒ: µ math = µ English = µ French = µ biology
        µa = µb = µc = µd = µe
        The students and the subjects do not interact

2   H₁: At least two of the subjects have significantly different levels of difficulty
        At least two of the students have significantly different levels of ability
        There is interaction between the students' abilities and the subjects' difficulty

3   Level of significance: α = 0.05

4   F critical from Table A7: 2.53 for the students (v₁ = 4, v₂ = 60), 2.76 for the subjects (v₁ = 3, v₂ = 60) and 1.92 for the interaction (v₁ = 12, v₂ = 60)

5   Find the F comp by EXCEL
    1. Go to the Data Analysis ToolPak
    2. Use the "Anova: Two-Factor With Replication" function
    3. Provide data to the function

Anova: Two-Factor With Replication

SUMMARY Math English French Biology


1
Count 4 4 4 4
Sum 310 246 308 336
Average 77.5 61.5 77 84
Variance 109.666667 81.66666666667 10.66666667 48.6666667

2
Count 4 4 4 4
Sum 299 335 266 302
Average 74.75 83.75 66.5 75.5
Variance 288.916667 142.25 451.6666667 193.666667

3
Count 4 4 4 4
Sum 273 262 337 304
Average 68.25 65.5 84.25 76
Variance 244.916667 243 286.25 36.6666667

4
Count 4 4 4 4
Sum 229 227 169 213
Average 57.25 56.75 42.25 53.25
Variance 236.916667 582.25 66.91666667 9.58333333
5
Count 4 4 4 4
Sum 358 337 370 326
Average 89.5 84.25 92.5 81.5
Variance 94.3333333 54.91666666667 60.33333333 21.6666667

Total
Count 20 20 20 20
Sum 1469 1407 1450 1481
Average 73.45 70.35 72.5 74.05
Variance 272.892105 313.1868421053 456.0526316 173.839474

ANOVA
Source of Variation                  SS           df   MS        F
STUDENTS (Rows)                      10040.95     4    2510.24   15.38
SUBJECTS (Columns)                   157.94       3    52.65     0.32
STUDENTS & SUBJECTS (Interaction)    3267.75      12   272.31    1.67
Within                               9794.75      60   163.25

Total                                23261.3875   79

Analyses / Conclusion
For students' ability: THERE EXISTS A SIGNIFICANT DIFFERENCE IN STUDENTS' ABILITY
For course difficulty: THERE IS NO SIGNIFICANT DIFFERENCE IN THE COURSE DIFFICULTY
For interaction: THERE IS NO SIGNIFICANT INTERACTION BETWEEN STUDENTS' ABILITY AND SUBJECT DIFFICULTY
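With replication, the cell totals make it possible to separate an interaction sum of squares from the within-cell error. A minimal sketch of that decomposition for the quiz data follows. It is one way to cross-check the spreadsheet output and is not Excel's internal method. The nested-dictionary layout and names are assumptions for illustration.

```python
# Two-way ANOVA with replication: 5 students x 4 subjects x 4 quizzes.
scores = {
    1: {"Math": [88, 79, 63, 80], "English": [51, 72, 58, 65],
        "French": [73, 77, 81, 77], "Biology": [87, 92, 81, 76]},
    2: {"Math": [79, 56, 96, 68], "English": [85, 67, 95, 88],
        "French": [82, 80, 36, 68], "Biology": [80, 62, 93, 67]},
    3: {"Math": [67, 51, 66, 89], "English": [74, 59, 47, 82],
        "French": [91, 59, 95, 92], "Biology": [77, 84, 70, 73]},
    4: {"Math": [35, 64, 60, 70], "English": [76, 26, 49, 76],
        "French": [43, 42, 52, 32], "Biology": [55, 53, 49, 56]},
    5: {"Math": [99, 87, 77, 95], "English": [84, 83, 94, 76],
        "French": [95, 98, 81, 96], "Biology": [83, 87, 76, 80]},
}

students = list(scores)
subjects = list(scores[students[0]])
a, b = len(students), len(subjects)
n = len(scores[students[0]][subjects[0]])           # replicates per cell

values = [x for s in students for j in subjects for x in scores[s][j]]
correction = sum(values) ** 2 / (a * b * n)

ss_total = sum(x * x for x in values) - correction
ss_rows = sum(sum(sum(scores[s][j]) for j in subjects) ** 2
              for s in students) / (b * n) - correction
ss_cols = sum(sum(sum(scores[s][j]) for s in students) ** 2
              for j in subjects) / (a * n) - correction
ss_cells = sum(sum(scores[s][j]) ** 2
               for s in students for j in subjects) / n - correction
ss_inter = ss_cells - ss_rows - ss_cols             # interaction
ss_within = ss_total - ss_cells                     # error

ms_within = ss_within / (a * b * (n - 1))
f_rows = ss_rows / (a - 1) / ms_within
f_cols = ss_cols / (b - 1) / ms_within
f_inter = ss_inter / ((a - 1) * (b - 1)) / ms_within

print(f"F students={f_rows:.2f}  F subjects={f_cols:.2f}  F interaction={f_inter:.2f}")
```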

#10 / page 421   In an experiment conducted to determine which of the 3 missile systems is preferable,
the propellant burning rates for 24 static firings were measured. Four propellant types
were used. The experiment yielded duplicate observations of burning rates at each of
the treatments. The data after coding were recorded as follows:

                        Propellant Type
Missile System   B1     B2     B3     B4
A1               34     30.1   29.8   29
                 32.7   32.8   26.7   28.9
A2               32     30.2   28.7   27.6
                 33.2   29.8   28.1   27.8
A3               28.4   27.3   29.7   28.8
                 29.3   28.9   27.3   29.1

Use a 0.05 level of significance to test the hypothesis that

A   There is no difference in the mean propellant burning rates when different missile systems are used
B   There is no difference in the mean propellant burning rates of the four propellant types
C   There is no interaction between the missile systems and the different propellant types

STEPS

1   Hₒ: µA1 = µA2 = µA3 (missile systems)
        µB1 = µB2 = µB3 = µB4 (propellant types)
        There is no interaction between the missile systems and the different propellant types

2   H₁: At least two of the missile systems have a difference in the mean propellant burning rates
        At least two of the propellant types have a difference in the mean propellant burning rates
        There is significant interaction between the missile systems and the different propellant types

3   Level of significance: α = 0.05

4   F critical from Table A7:

5   Find the F comp by EXCEL
    1. Go to the Data Analysis ToolPak
    2. Use the "Anova: Two-Factor With Replication" function
    3. Provide data to the function

Anova: Two-Factor With Replication

SUMMARY B1 B2 B3 Total
A1
Count 2 2 2 6
Sum 66.7 62.9 56.5 186.1
Average 33.35 31.45 28.25 31.0166667
Variance 0.845 3.645 4.805 7.17366667

A2
Count 2 2 2 6
Sum 65.2 60 56.8 182
Average 32.6 30 28.4 30.3333333
Variance 0.72 0.08 0.18 3.79066667

A3
Count 2 2 2 6
Sum 57.7 56.2 57 170.9
Average 28.85 28.1 28.5 28.4833333
Variance 0.405 1.28 2.88 1.02566667

Total
Count 6 6 6
Sum 189.6 179.1 170.3
Average 31.6 29.85 28.38333333
Variance 5.044 3.259 1.585666667

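ANOVA
Source of Variation                                  SS           df   MS            F
Missile Systems (Rows)                               20.6144444   2    10.30722222   6.251
Propellant Types (Columns)                           31.1211111   2    15.56055556   9.437
Missile Systems & Propellant Types (Interaction)     13.9888889   4    3.497222222   2.121
Within                                               14.84        9    1.648888889

Total                                                80.5644444   17

Note: the Excel output above covers propellant types B1, B2 and B3 only (3 x 3 x 2 = 18 observations, so the total df is 17). Column B4 of the data was not included in the input range.

A   At least two of the missile systems have a SIGNIFICANT DIFFERENCE in the mean propellant burning rates
B   At least two of the propellant types have a SIGNIFICANT DIFFERENCE in the mean propellant burning rates
C   There is NO significant interaction between the missile systems and the different propellant types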
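Because the spreadsheet output leaves out propellant type B4, it may help to run the same decomposition on all 24 observations. The function below is a hypothetical helper written for this note and is not part of the worksheet. It assumes a balanced layout indexed as cells[row][column][replicate].

```python
def two_way_anova(cells):
    """Sums of squares, df and F ratios for a balanced two-way layout.

    cells[i][j] is the list of replicates for row level i and column level j.
    """
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    values = [x for row in cells for cell in row for x in cell]
    cf = sum(values) ** 2 / (a * b * n)              # correction term
    ss_total = sum(x * x for x in values) - cf
    ss_rows = sum(sum(map(sum, row)) ** 2 for row in cells) / (b * n) - cf
    ss_cols = sum(sum(sum(row[j]) for row in cells) ** 2
                  for j in range(b)) / (a * n) - cf
    ss_cells = sum(sum(cell) ** 2 for row in cells for cell in row) / n - cf
    ss = {"rows": ss_rows, "cols": ss_cols,
          "inter": ss_cells - ss_rows - ss_cols, "within": ss_total - ss_cells}
    df = {"rows": a - 1, "cols": b - 1,
          "inter": (a - 1) * (b - 1), "within": a * b * (n - 1)}
    ms_within = ss["within"] / df["within"]
    f = {key: ss[key] / df[key] / ms_within for key in ("rows", "cols", "inter")}
    return ss, df, f


# All three missile systems (A1-A3) and all four propellant types (B1-B4).
burning_rates = [
    [[34.0, 32.7], [30.1, 32.8], [29.8, 26.7], [29.0, 28.9]],   # A1
    [[32.0, 33.2], [30.2, 29.8], [28.7, 28.1], [27.6, 27.8]],   # A2
    [[28.4, 29.3], [27.3, 28.9], [29.7, 27.3], [28.8, 29.1]],   # A3
]

ss, df, f = two_way_anova(burning_rates)
for key in ("rows", "cols", "inter"):
    print(f"{key:6s} SS={ss[key]:.2f}  df={df[key]}  F={f[key]:.2f}")
print(f"within SS={ss['within']:.2f}  df={df['within']}")
```

With the full data the degrees of freedom are 2, 3, 6 and 12, so the computed F ratios would be compared with critical values for (2, 12), (3, 12) and (6, 12) degrees of freedom rather than those of the 3 x 3 table.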

PROBLEM SET # 10 SOLVE # 3, 5 & 7 PP 400-401 / #7,9 & 11 PP 420-421


DEADLINE: SEPT 25, 2021

EXAMPLE (one-way ANOVA), remaining SQUARES OF VALUES columns:

STUDENT   E²      S²
1         7569    7396
2         6724    5929
3         6400    6561
4         8836    8100
5         7396    7225
Sums      36925   35211

ƩX² = 36078 + 36925 + 35211 = 108214
Problem #2 / page 400, decision using the P-value:

p-value > alpha (0.0579 > 0.050), so ACCEPT the null hypothesis.

#4 / page 400, remaining SQUARES columns:

B²      C²
7744    4624
6084    6241
2304    3136
8281    8281
2601    5041
7225    5041
5476    7569
5929    1681
961     3481
6084    4624
3844    2809
5776    6241
9216    225
6400
1296
79221   58994     ƩX² = 59407 + 79221 + 58994 = 197622

Excel output: P-value = 0.7525786, F crit = 3.252

Interpolation for F crit from Table A7 (2nd column, because v₁ = 2; v₂ = 37 lies between the 30th and 40th rows):

v₂    f(v₁, v₂)
30    3.32
37    x
40    3.23

(37 - 30) / (40 - 30) = (x - 3.32) / (3.23 - 3.32)
7 / 10 = (x - 3.32) / (-0.09)
x - 3.32 = 7 * (-0.09) / 10 = -0.063
x = 3.257, or about 3.26
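The linear interpolation between the two tabulated rows can be written as a small helper. This is only a sketch. The function name is made up for illustration, and the two table values are the ones quoted from Table A7 for v₁ = 2.

```python
def interpolate(x, x0, y0, x1, y1):
    """Linear interpolation of y at x between the points (x0, y0) and (x1, y1)."""
    return y0 + (x - x0) * (y1 - y0) / (x1 - x0)


# Table A7, v1 = 2: f = 3.32 at v2 = 30 and f = 3.23 at v2 = 40.
f_crit = interpolate(37, 30, 3.32, 40, 3.23)
print(round(f_crit, 2))
```

The interpolated value is slightly larger than the exact critical value reported by Excel, which is the usual small error of linear interpolation in an F table.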


PROBLEM #3 / page 418, row sums and squares of values:

STUDENT   sums (rows)   sqr of sums of rows   M²      E²      F²      B²
A         259           67081                 4624    3249    5329    3721
B         354           125316                6889    8836    8281    7396
C         275           75625                 5184    6561    3969    3481
D         271           73441                 3025    5329    5929    4356
E         322           103684                8464    4624    5625    7569
Totals    1481          445147                28186   28599   29133   26523     ƩX² = 112441

            P-value   F crit   ANALYSES
STUDENTS    0.02      3.26     REJECT the null hypothesis (the p-value is lower than 0.05)
SUBJECTS    0.93      3.49     ACCEPT the null hypothesis (the p-value is higher than 0.05)
Problem #6 / page 419:

                    P-value   F crit   Decision    Table A7
Types of Plastics   0.06      5.14     ACCEPT Ho   2nd column, 6th row
Humidity            0.28      4.76     ACCEPT Ho   3rd column, 6th row

note:

r = number of rows
c = number of columns
n = number of replicates

sqr of sums Math English French Biology


89401 7744 2601 5329 7569
102400 6241 5184 5929 8464
80089 3969 3364 6561 6561
88804 6400 4225 5929 5776
106276 6241 7225 6724 6400
70225 3136 4489 6400 3844
102400 9216 9025 1296 8649
84681 4624 7744 4624 4489
95481 4489 5476 8281 5929
64009 2601 3481 3481 7056
77284 4356 2209 9025 4900
112896 7921 6724 8464 5329
43681 1225 5776 1849 3025
34225 4096 676 1764 2809
44100 3600 2401 2704 2401
54756 4900 5776 1024 3136
130321 9801 7056 9025 6889
126025 7569 6889 9604 7569
107584 5929 8836 6561 5776
120409 9025 5776 9216 6400
1735047 113083 104933 113790 112971 444777

#8 / page 420, Total column of the "Anova: Two-Factor With Replication" SUMMARY (per student):

Student    1           2        3           4           5
Count      16          16       16          16          16
Sum        1200        1202     1176        838         1391
Average    75          75.125   73.5        52.375      86.9375
Variance   123.06667   255.05   219.06667   218.11667   66.0625

                                    P-value   F crit   Decision   Table A7
STUDENTS (Rows)                     0.00000   2.53     REJECT     4th column, 60th row
SUBJECTS (Columns)                  0.81      2.76     ACCEPT     3rd column, 60th row
STUDENTS & SUBJECTS (Interaction)   0.10      1.92     ACCEPT     12th column, 60th row

Number of replicates n, from the total degrees of freedom:

(5*4*n) - 1 = 79
20n = 80
n = 4 replicates

#10 / page 421, P-values and critical values for the Excel ANOVA table:

                                                  P-value   F crit   Decision    Degrees of freedom
Missile Systems (Rows)                            0.020     4.256    REJECT Ho   v₁ = 2, v₂ = 9
Propellant Types (Columns)                        0.006     4.256    REJECT Ho   v₁ = 2, v₂ = 9
Missile Systems & Propellant Types (Interaction)  0.160     3.633    ACCEPT Ho   v₁ = 4, v₂ = 9
