STAT-702 Unit # 1
Introduction to Statistics
Objectives
Understand the complexity of managerial decisions
Recognize the need for a quantitative approach to
managerial decisions
Appreciate the role of statistical methods in data
analysis.
INTRODUCTION TO STATISTICS
Statistics:-
May be defined as the science of collecting, presenting,
analysing and interpreting numerical data under
conditions of uncertainty.
Descriptive Statistics:
Gives numerical and graphic procedures to summarize a
collection of data in a clear and understandable way
Inferential Statistics:
Provides procedures to draw inferences about a
population from a sample.
Statistics presents a rigorous scientific method for
gaining insight into data. For example, suppose we
measure the weight of 100 patients in a study. With
so many measurements, simply looking at the data
fails to provide an informative account. However,
statistics can give an instant overall picture of the data
through graphical presentation or numerical
summarization, irrespective of the number of data
points. Besides data summarization, another
important task of statistics is to make inferences and
predict relations between variables.
Population
A population is the totality of the observations made on all the objects
(under investigation) possessing some common specific characteristics,
which are of particular interest to researchers. It is the entire group
whose characteristics are to be estimated.
For example, the heights of all the students enrolled at UAF in M.Com
degree in Spring 2014, the wages of all employees of a mill in a given
year, etc. A population may be finite or infinite. The number of
observations in a finite population is called the size of the population
and is denoted by the letter N.
Sample
A sample is a representative part of the population which is selected to
obtain information concerning the characteristics of the population.
The number of observations in a sample is called the size of the sample
which is denoted by n.
Sampling
The process of drawing a sample from a population is called
sampling.
Why take a sample
Why take a sample instead of studying every member of the
population?
A sample of registered voters is necessary because of the
prohibitive cost of contacting millions of voters before an election.
Testing wheat for moisture content destroys the wheat, thus
making a sample imperative.
If the soft drink tasters tested all the soft drinks, none would be
available for sale.
Parameter
A parameter is a numerical characteristic of a population, such
as its mean or standard deviation, etc. Parameters are fixed
constants that characterize a population. They are denoted by
Greek letters. A parameter is a fixed quantity.
Statistic
A statistic is a numerical characteristic of a sample such as its
mean or standard deviation, etc. The statistics are used to draw
valid inferences about the population. They are denoted by Latin
letters. A statistic is a variable quantity.
Variables
Any characteristic which may vary with respect to individual,
time, and place. For example:
Number of products produced by a machine during a specified
period of time
Number of workers
Weight of an individual
Price, Sale, Adv. expenditures
Quality, Design, Performance
Types of variables
Fixed variables:-
1. Design
2. Quality
3. Adv. Expenditures
4. Diet
5. Dose of a medicine
Random Variables:-
1. Sales
2. Growth
3. Recovery Time
Types of variables
Qualitative variable
Characteristic which varies in quality (not numerically) from one
individual to another, also called attribute, e.g. eye color, education
level, Behavior, quality, Design, Performance.
Quantitative variable
A variable is called quantitative when it varies in quantity (or
numerically) from one individual to another, e.g. age, income,
temperature, Price, Sale, Advertising Expenditures
Discrete variable
A variable that can take only specified values, or that takes values by jumps or breaks,
e.g. number of rooms in a house, number of deaths in an accident, etc.
Continuous variable
A variable that can assume any value (fractional or integral) between two specified
values ‘a’ and ‘b’, e.g. height of a plant, speed of a car, Sale, Price
Measurement
The process of assigning numbers or labels to objects, persons,
states, or events in accordance with specific logically accepted
rules for representing quantities or qualities of attributes or
characteristics. Data can be classified according to levels of
measurement. The level of measurement of the data often
dictates the calculations that can be done to summarize and
present the data. It will also determine the statistical tests that
should be performed. There are actually four levels of
measurement: nominal, ordinal, interval, and ratio [Stevens
1951].
The lowest, or the most primitive, measurement is the nominal
level. The highest, or the level that gives the most information
about the observation, is the ratio level of measurement.
Levels of Measurement
Nominal-Level
The data is only descriptive; the categories have no order.
For example, eye color, religion, gender, country name, region,
Product Name, Reg. #, etc.
Ordinal-Level
The data has rank order, though intervals between data points
cannot be considered equal (e.g. high/medium/low income). For
example, cricket teams standings in ICC ranking, students’
grades, etc.
Interval-Level
The data has equal intervals between data points. For example,
temperature, shoe size and IQ scores, etc.
Ratio-Level
The data has equal intervals between data points and a true zero.
For example, bank balance, weight, height, etc.
Observation
The numerical recording of information is called an observation
or datum.
Observations can be divided into three types:
Categorical, where the observations fall into a limited
number of categories which have no obvious scale (e.g.
‘Pass’, ‘Fail’, ‘Yes or No’);
Discrete, where there is a real scale but not all values are
possible (e.g. ‘number of products’ or ‘number of students’);
Continuous, where any value is theoretically possible,
restricted only by the measuring device (e.g. lengths,
concentrations, weight).
Data
A collection of related observations is called data.
Classification of data
Data that have been originally collected and have not
undergone any sort of statistical treatment are called
Primary data, while data that have undergone any sort
of statistical treatment at least once are called Secondary
data.
Data may be available from existing sources e.g. records
and publications or the same may have to be collected
afresh.
Collection of primary data
(1) Direct personal investigation
(2) Personal interview
(3) Collection through questionnaires.
(4) Collection through enumerators.
(5) Collection through local sources
Collection of Secondary data:
1. Official Publications
Federal Bureau of Statistics
Population Census Organization
Ministries of Health, Food, Agriculture, Finance etc.
Provincial Bureaus of Statistics
2. Semi-official Sources
Publication of State Bank of Pakistan
NBP
District Councils
WAPDA
3. Private Sources
Chamber of Commerce & Industry
Co-Operative Societies
4. Research Organizations
PARC, NARC, Universities
A Taxonomy of Statistics
Arithmetic Mean
The arithmetic mean is defined as a value obtained by
dividing the sum of all the observations by their
number, that is
$$\text{Arithmetic Mean} = \frac{\text{Sum of all the observations}}{\text{Number of the observations}}$$
If $X_1, X_2, \ldots, X_n$ are n observations of a variable X, then
their AM is defined as:
$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n} = \frac{\sum_{i=1}^{n} X_i}{n}$$
Arithmetic Mean
The marks obtained by 8 students are given below.
Find the Arithmetic Mean. Let X = Marks, then
X: 67, 72, 68, 70, 65, 68, 75, 63
$\sum X = 548$, $n = 8$
$$\bar{X} = \frac{\sum X}{n} = \frac{548}{8} = 68.5 \text{ Marks}$$
NOTE: At least one observation will be below
and at least one will be above the mean
(unless all the observations are equal).
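The definition of the arithmetic mean can be checked with a few lines of Python. This is only a minimal sketch; the function name `arithmetic_mean` and the variable names are illustrative assumptions, not part of the course material.

```python
# Arithmetic mean: sum of the observations divided by their number.
# The function and variable names here are illustrative assumptions.
def arithmetic_mean(values):
    if not values:
        raise ValueError("need at least one observation")
    return sum(values) / len(values)

marks = [67, 72, 68, 70, 65, 68, 75, 63]  # marks of the 8 students
total = sum(marks)
mean_marks = arithmetic_mean(marks)
print(total, len(marks), mean_marks)  # 548 8 68.5
```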
Example: The heights of 15 plants are given below. Find the
Arithmetic Mean. Let X = Plant Height.
Listing   X
1         14
2         17
3         31
4         28
5         42
6         43
7         51
8         51
9         66
10        70
11        67
12        70
13        78
14        62
15        47
Total     737
$\sum X = 737$, $n = 15$
$$\text{A.M} = \bar{X} = \frac{\sum X}{n} = \frac{737}{15} = 49.13$$
Example Days Off per Year
The data represent the number of days off per year for a
sample of individuals selected from nine different countries.
Find the mean.
20, 26, 40, 36, 23, 42, 35, 24, 30
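$$\bar{X} = \frac{\sum X}{n} = \frac{X_1 + X_2 + X_3 + \cdots + X_n}{n}$$
$$\bar{X} = \frac{20 + 26 + 40 + 36 + 23 + 42 + 35 + 24 + 30}{9} = \frac{276}{9} = 30.7$$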
$\sum_{i=1}^{7} X_i = £259.85$
$$\bar{X} = \frac{\sum_{i=1}^{7} X_i}{n} = \frac{259.85}{7} = £37.12 \text{ (to the nearest penny)}$$
Interpretation:
Types of Measures of Dispersion
There are two main types of measures of dispersion:
1. Absolute Measure of Dispersion
2. Relative Measure of Dispersion
1. Absolute Measure of Dispersion
The absolute measure of dispersion measures the variation present
among the observations in the unit of the variable or square of
the unit of the variable.
2. Relative Measure of Dispersion
The relative measure of dispersion measures the variation present
among the observations relative to their average. It is expressed
in the form of ratio, or percentage. It is independent of the unit
of measurement.
Measures of Dispersion
The commonly used measures of absolute dispersion
are:
1. Range
2. Quartile Deviation
3. Mean (Average) Deviation
4. Variance and Standard Deviation
Their corresponding measures of relative dispersion are:
1. Coefficient of Range/Coefficient of dispersion
2. Coefficient of Quartile Deviation
3. Coefficient of Mean (Average) Deviation
4. Coefficient of Variation (CV)
Measures of Spread
– Distance Based Measures of Spread
• The range
• The Semi interquartile range
– Centre Based Measures of Spread
• The mean deviation
• The variance
• The standard deviation
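Variance: is defined as the arithmetic mean
of the squared deviations of the observations
from their mean. It is denoted by $S^2$.
(Biased Estimate of Variance)
$$S^2 = \frac{\sum (X - \bar{X})^2}{n} \quad \text{or} \quad S^2 = \frac{1}{n}\left[\sum X^2 - \frac{(\sum X)^2}{n}\right]$$
(Unbiased Estimate of Variance)
$$S^2 = \frac{\sum (X - \bar{X})^2}{n-1} \quad \text{or} \quad S^2 = \frac{1}{n-1}\left[\sum X^2 - \frac{(\sum X)^2}{n}\right]$$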
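Q. Find the Variance of the given data, $S^2$.
X       $(X - \bar{X})^2$
4       36
6       16
9       1
12      4
13      9
16      36
Total   60      102
$\bar{X} = \frac{\sum X}{n} = \frac{60}{6} = 10$
$$S^2 = \frac{\sum (X - \bar{X})^2}{n} = \frac{102}{6} = 17$$
$$S^2 = \frac{\sum (X - \bar{X})^2}{n-1} = \frac{102}{5} = 20.4$$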
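Q. Find the Variance of the given data, $S^2$, using the shortcut formula.
X (cm)   $X^2$
4        16
6        36
9        81
12       144
13       169
16       256
Total    60       702
$\bar{X} = \frac{\sum X}{n} = \frac{60}{6} = 10$, $n = 6$
$$S^2 = \frac{1}{n}\left[\sum X^2 - \frac{(\sum X)^2}{n}\right] = \frac{1}{6}\left[702 - \frac{(60)^2}{6}\right] = \frac{102}{6} = 17 \text{ cm}^2$$
$$S^2 = \frac{1}{n-1}\left[\sum X^2 - \frac{(\sum X)^2}{n}\right] = \frac{1}{5}\left[702 - \frac{(60)^2}{6}\right] = \frac{102}{5} = 20.4 \text{ cm}^2$$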
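As a quick check that the definitional and shortcut forms of the variance agree on the data 4, 6, 9, 12, 13, 16, the following sketch computes both. The function names `variance` and `variance_shortcut` and the `unbiased` flag are assumptions made for illustration.

```python
# Variance two ways: from squared deviations and from the shortcut formula.
# Names (variance, variance_shortcut, unbiased) are illustrative assumptions.
def variance(values, unbiased=False):
    n = len(values)
    mean = sum(values) / n
    ss = sum((x - mean) ** 2 for x in values)      # sum of squared deviations
    return ss / (n - 1 if unbiased else n)

def variance_shortcut(values, unbiased=False):
    n = len(values)
    ss = sum(x * x for x in values) - sum(values) ** 2 / n  # sum X^2 - (sum X)^2 / n
    return ss / (n - 1 if unbiased else n)

x = [4, 6, 9, 12, 13, 16]
biased = variance(x)
unbiased = variance(x, unbiased=True)
print(biased, variance_shortcut(x))  # 17.0 17.0
print(round(unbiased, 1))            # 20.4
```

Both forms divide the same sum of squares, 102, by n or by n - 1; the shortcut only avoids computing each deviation.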
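Standard Deviation: is defined as the positive
square root of the arithmetic mean of the squared
deviations of the observations from their mean. It is denoted
by S.
(Biased Estimate of S.D)
$$S = \sqrt{\frac{\sum (X - \bar{X})^2}{n}} \quad \text{or} \quad S = \sqrt{\frac{1}{n}\left[\sum X^2 - \frac{(\sum X)^2}{n}\right]}$$
(Unbiased Estimate of S.D)
$$S = \sqrt{\frac{\sum (X - \bar{X})^2}{n-1}} \quad \text{or} \quad S = \sqrt{\frac{1}{n-1}\left[\sum X^2 - \frac{(\sum X)^2}{n}\right]}$$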
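Q. Find the S.D of the given data, S.
X       $(X - \bar{X})^2$
4       36
6       16
9       1
12      4
13      9
16      36
Total   60      102
$\bar{X} = \frac{\sum X}{n} = \frac{60}{6} = 10$
$$S = \sqrt{\frac{\sum (X - \bar{X})^2}{n}} = \sqrt{\frac{102}{6}} = 4.123$$
$$S = \sqrt{\frac{\sum (X - \bar{X})^2}{n-1}} = \sqrt{\frac{102}{5}} = 4.517$$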
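Q. Find the S.D and Coefficient of
Variation of the given data.
X (cm)   $X^2$
4        16
6        36
9        81
12       144
13       169
16       256
Total    60       702
$\bar{X} = \frac{\sum X}{n} = \frac{60}{6} = 10$, $n = 6$
$$S = \sqrt{\frac{1}{n}\left[\sum X^2 - \frac{(\sum X)^2}{n}\right]} = \sqrt{\frac{1}{6}\left[702 - \frac{(60)^2}{6}\right]} = \sqrt{17} = 4.123 \text{ cm}$$
$$S = \sqrt{\frac{1}{n-1}\left[\sum X^2 - \frac{(\sum X)^2}{n}\right]} = \sqrt{\frac{1}{5}\left[702 - \frac{(60)^2}{6}\right]} = \sqrt{20.4} = 4.517 \text{ cm}$$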
Coefficient of Variation (CV)
• Always in percentage (%)
• Shows relative variability, that is, variability relative
to the magnitude of the data i.e variation relative to
mean
• Can be used to compare two or more sets of data
measured in different units or same units but
different average size
$$C.V = \frac{S}{\bar{X}} \times 100\%$$
Coefficient of Variation (CV)
$\bar{X} = 10$, $S = 4.123$
$$CV = \frac{4.123}{10} \times 100\% = 41.23\%$$
Comparing Coefficient of Variation
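Mr. Ali
Average Marks $\bar{X}_1 = 80$
Standard deviation $S_1 = 5$
$$CV_{Ali} = \frac{S_1}{\bar{X}_1} \times 100\% = \frac{5}{80} \times 100\% = 6.25\%$$
Mr. Zain
Average Marks $\bar{X}_2 = 80$
Standard deviation $S_2 = 15$
$$CV_{Zain} = \frac{S_2}{\bar{X}_2} \times 100\% = \frac{15}{80} \times 100\% = 18.75\%$$
Mr. Ali is more consistent in performance.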
Comparing Coefficient
of Variation
• Stock A:
o Average price last year = $50
o Standard deviation = $5
$$CV_A = \frac{s}{\bar{x}} \times 100\% = \frac{\$5}{\$50} \times 100\% = 10\%$$
• Stock B:
o Average price last year = $100
o Standard deviation = $5
$$CV_B = \frac{s}{\bar{x}} \times 100\% = \frac{\$5}{\$100} \times 100\% = 5\%$$
Both stocks have the same standard deviation, but stock B is less variable
relative to its price.
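The comparisons above all apply the same ratio, so a small helper makes the pattern explicit. This is a sketch under the assumption that the mean is non-zero; the function name `coefficient_of_variation` is hypothetical.

```python
# Coefficient of variation: standard deviation relative to the mean, in percent.
# The helper name is an illustrative assumption; the mean must be non-zero.
def coefficient_of_variation(sd, mean):
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100 * sd / mean

cv_stock_a = coefficient_of_variation(5, 50)    # Stock A
cv_stock_b = coefficient_of_variation(5, 100)   # Stock B
cv_ali = coefficient_of_variation(5, 80)        # Mr. Ali
cv_zain = coefficient_of_variation(15, 80)      # Mr. Zain
print(cv_stock_a, cv_stock_b)  # 10.0 5.0
print(cv_ali, cv_zain)         # 6.25 18.75
```

The smaller CV marks the less variable (more consistent) series relative to its own average.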
Coefficient of Variation
Summary statistics for WEIGHT and HEIGHT (both ratio variables) of Pakistani adults in different
units:
        Weight            Height
Mean    160 pounds        66 inches
        72.6 kilograms    5.5 feet
        0.08 tons         168 centimeters
SD      30 pounds         4 inches
        13.6 kilograms    0.33 feet
        0.015 tons        10.2 centimeters
Which variable [WEIGHT or HEIGHT] has greater dispersion? [No meaningful answer can be
given]
Which variable has greater dispersion relative to its average, e.g., greater Coefficient of
Dispersion (SD relative to mean)?
$$CV_{Weight} = \frac{S}{\bar{X}} \times 100\% = \frac{30}{160} \times 100\% \approx \frac{13.6}{72.6} \times 100\% \approx \frac{0.015}{0.08} \times 100\% \approx 18.7\%$$
$$CV_{Height} = \frac{S}{\bar{X}} \times 100\% = \frac{4}{66} \times 100\% \approx \frac{0.33}{5.5} \times 100\% \approx \frac{10.2}{168} \times 100\% \approx 6.1\%$$
Note that the Coefficient of Variation is a pure number, not expressed in any units, and is the
same whatever units the variable is measured in.
STATISTICAL INFERENCE
Statistical inference is the process of reaching conclusions about
characteristics of an entire population using data from a subset, or sample,
of that Population.
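Population parameters (the truth, not observable):
$$\mu = \frac{\sum_{i=1}^{N} x_i}{N}, \qquad \sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$
Sample (observation), used to make guesses about the whole population:
$$\hat{\sigma}^2 = s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{X}_n)^2}{n-1}$$
*The hat notation ^ is often used to indicate “estimate”.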
Estimation…
There are two types of inference: estimation and hypothesis
testing; estimation is introduced first.
Point Estimator
Interval Estimator
Point Estimator…
Hypothesis Testing
Left-tailed Test:- Average Marks at least 45
H0: µ ≥ 45
H1: µ < 45
The critical region points left: values that differ significantly
from 45 lie in the left tail.
Right-tailed Test:- Average Marks at most 45
H0: µ ≤ 45
H1: µ > 45
The critical region points right: values that differ significantly
from 45 lie in the right tail.
Two-tailed Test:- Average Marks equal to 45
H0: µ = 45
H1: µ ≠ 45
α is divided equally between the two tails of the critical
region.
Decision                                  H0 is true                          H0 is false
We decide to reject the null hypothesis   Type I error (α)                    Correct Decision, No Error (1-β)
We don't reject the null hypothesis       Correct Decision, No Error (1-α)    Type II error (β)
Significance Level
Probability of committing a Type-I error is called the
level of significance, denoted by α . By α =5% we mean
that there are 5 chances in 100 of incorrectly rejecting a
true null hypothesis. To put it in another way we say that
we are 95% confident in making the correct decision.
Level of Confidence
The probability of not committing a Type-I error, (1- α ), is
called the level of confidence, or confidence co-efficient.
Power of a Test
The probability of not committing a Type-II error, (1-β), is
called the power of the test.
Test Statistic
A statistic on which the decision to reject or not reject the
null hypothesis is based is called a test statistic.
In testing of hypothesis the sampling distribution of the test statistic
is based on the assumption that the null hypothesis is true.
Decision Rule Critical Value
General Procedure for Hypothesis Testing
Step-1:- Formulate the null and alternative hypotheses
Step-2:- Decide upon a significance level, α
Step-3:- Choose an appropriate test statistic
Step-4:- Calculation
Step-5:- Determine the Critical Region (CR). The location of the
CR depends upon the form of the alternative
hypothesis.
• If >, choose the right tail as the CR
• If <, choose the left tail as the CR
• If ≠ , choose a two-tailed CR
Step-6:- Conclusion: Reject the null hypothesis if the computed
value of the test statistic falls in the CR; otherwise do not reject the null
hypothesis. Then state the decision in managerial terms.
EXAMPLE:- It is claimed that an automobile is driven on the
average more than 12,000 miles per year. To test this claim a
random sample of 100 automobiles owners are asked to keep
a record of the miles they travel. Would you agree with the
claim if the random sample showed an average of 12500
miles and a standard deviation of 2400 miles?
Construction of hypotheses
H0: µ ≤ 12000
H1: µ > 12000
Level of significance
α = 5%
Test statistic
$$t = \frac{\bar{X} - \mu_0}{\sqrt{s^2/n}}$$
Sample: n = 100, $\bar{X}$ = 12500, S = 2400
$$t_{Cal} = \frac{12500 - 12000}{\sqrt{(2400)^2/100}} = 2.08$$
Step-5 Critical Region:-
$t > t_{\alpha,\,(n-1)\,d.f}$
$t > t_{0.05,\,99\,d.f}$
$t > 1.66$
Step-6
Conclusion: Since $t_{cal} = 2.08$ falls in the Rejection Region
($t_{tab} = 1.66$), we reject H0.
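The six steps can be sketched in Python for the one-sample case. This is a minimal illustration, not a library routine: the function names `one_sample_t` and `falls_in_critical_region` are assumptions, and the tabulated value 1.66 is taken from the t-table as in the example rather than computed.

```python
import math

# Test statistic for a single mean: t = (x_bar - mu0) / sqrt(s^2 / n).
# Function names are illustrative assumptions.
def one_sample_t(sample_mean, mu0, sd, n):
    return (sample_mean - mu0) / math.sqrt(sd ** 2 / n)

# Decision rule: the tail of the critical region follows the sign in H1.
# t_tab is the positive tabulated value looked up by the reader.
def falls_in_critical_region(t_cal, t_tab, alternative):
    if alternative == "greater":      # H1: mu > mu0, right tail
        return t_cal > t_tab
    if alternative == "less":         # H1: mu < mu0, left tail
        return t_cal < -t_tab
    if alternative == "two-sided":    # H1: mu != mu0, both tails
        return abs(t_cal) > t_tab
    raise ValueError("unknown alternative")

t_cal = one_sample_t(12500, 12000, 2400, 100)
reject = falls_in_critical_region(t_cal, 1.66, "greater")
print(round(t_cal, 2), reject)  # 2.08 True
```

For a two-tailed test the caller passes the table value for α/2 instead of α.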
EXAMPLE:- It is claimed that an automobile is driven on the
average at most 12,000 miles per year. To test this claim a
random sample of 100 automobiles owners are asked to keep
a record of the miles they travel. Would you agree with the
claim if the random sample showed an average of 12500
miles and a standard deviation of 2400 miles?
Construction of hypotheses
H0: µ ≤ 12000
H1: µ > 12000
Level of significance
α = 5%
Test statistic
$$t = \frac{\bar{X} - \mu_0}{\sqrt{s^2/n}}$$
Sample: n = 100, $\bar{X}$ = 12500, S = 2400
$$t_{Cal} = \frac{12500 - 12000}{\sqrt{(2400)^2/100}} = 2.08$$
Step-5 Critical Region:-
$t > t_{\alpha,\,(n-1)\,d.f}$
$t > t_{0.05,\,99\,d.f}$
$t > 1.66$
Step-6
Conclusion: Since $t_{cal} = 2.08$ falls in the Rejection Region
($t_{tab} = 1.66$), we reject H0.
EXAMPLE:- It has been found from experience that the mean
breaking strength of a particular brand of thread is 9.63N.
Recently a sample of 36 pieces of thread showed a mean
breaking strength of 8.93N and standard deviation of 1.40N.
Can we conclude that the thread has become inferior?
Construction of hypotheses
H0: µ ≥ 9.63
H1: µ < 9.63
Level of significance
α = 5%
Test statistic
$$t = \frac{\bar{X} - \mu_0}{\sqrt{s^2/n}}$$
Sample: n = 36, $\bar{X}$ = 8.93, S = 1.40
$$t_{Cal} = \frac{8.93 - 9.63}{\sqrt{(1.40)^2/36}} = -3.0$$
Step-5 Critical Region:-
$t < -t_{\alpha,\,(n-1)\,d.f}$
$t < -t_{0.05,\,35\,d.f}$
$t < -1.690$
Step-6
Conclusion: Since $t_{cal} = -3.00$ falls in the Rejection Region
($t_{tab} = -1.690$), we reject H0.
EXAMPLE:- The mean lifetime of bulbs produced by a
company has in past been 1120 hours. A sample of 9
electric light bulbs recently chosen from a supply of
newly produced battery showed a mean lifetime of 1170
hours with a standard deviation of 120 hours. Test that
mean lifetime of the bulbs has not changed
Step-1 Construction of hypotheses
H0: µ = 1120
H1: µ ≠ 1120
Step-2. Level of significance
α = 5%
Test statistic
$$t = \frac{\bar{X} - \mu_0}{\sqrt{s^2/n}}$$
Sample: n = 9, $\bar{X}$ = 1170, S = 120
$$t_{Cal} = \frac{1170 - 1120}{\sqrt{(120)^2/9}} = 1.25$$
Step-5 Critical Region:-
$t < -t_{\alpha/2,\,(n-1)\,d.f}$ or $t > t_{\alpha/2,\,(n-1)\,d.f}$
$t < -t_{0.025,\,8\,d.f}$ or $t > t_{0.025,\,8\,d.f}$
$t < -2.306$ or $t > 2.306$
Step-6
Conclusion: Since $t_{cal} = 1.25$ does not fall in the Rejection Region
($-2.306 < 1.25 < 2.306$), we do not reject H0.
Example:- A researcher wishes to estimate the average marks of the
students in Math-101 course of A section. A random sample of 25
students is selected and the sample mean is found to be 50 with standard
deviation 2. Estimate 90 % confidence Interval for the Average Marks.
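Sample: n = 25, $\bar{X}$ = 50, S = 2; α = 0.10, α/2 = 0.05
$$\bar{X} \pm t_{\alpha/2,\,(n-1)\,d.f} \sqrt{\frac{S^2}{n}}$$
$$50 \pm t_{0.05,\,24} \sqrt{\frac{2^2}{25}}$$
$50 \pm (1.711)(0.4)$
$50 \pm 0.684$
(50 - 0.684, 50 + 0.684)
(49.316, 50.684)
Width of C.I = 50.684 - 49.316 = 1.368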
Example:- A researcher wishes to estimate the average marks of the
students in Math-101 course of A section. A random sample of 25
students is selected and the sample mean is found to be 50 with
standard deviation of 2. Estimate 95 % confidence Interval for the
Average Marks.
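Sample: n = 25, $\bar{X}$ = 50, S = 2; α = 0.05, α/2 = 0.025
$$\bar{X} \pm t_{\alpha/2,\,(n-1)\,d.f} \sqrt{\frac{S^2}{n}}$$
$$50 \pm t_{0.025,\,24} \sqrt{\frac{2^2}{25}}$$
$50 \pm (2.064)(0.4)$
$50 \pm 0.826$
(50 - 0.826, 50 + 0.826)
(49.174, 50.826)
Width of C.I = 50.826 - 49.174 = 1.652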
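The 90% and 95% intervals for the Math-101 marks differ only in the tabulated t value, which the following sketch makes explicit. The function name `t_confidence_interval` is an assumption, and the t values 1.711 and 2.064 are supplied from the table (24 d.f.) rather than computed.

```python
import math

# Confidence interval for a mean: x_bar +/- t_tab * sqrt(s^2 / n).
# The function name is an illustrative assumption; t_tab comes from a t-table.
def t_confidence_interval(sample_mean, sd, n, t_tab):
    margin = t_tab * math.sqrt(sd ** 2 / n)
    return sample_mean - margin, sample_mean + margin

lo90, hi90 = t_confidence_interval(50, 2, 25, 1.711)  # alpha/2 = 0.05
lo95, hi95 = t_confidence_interval(50, 2, 25, 2.064)  # alpha/2 = 0.025
print(round(lo90, 3), round(hi90, 3))  # 49.316 50.684
print(round(lo95, 3), round(hi95, 3))  # 49.174 50.826
```

Raising the confidence level from 90% to 95% widens the interval from 1.368 to 1.652 for the same sample.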
Example:- A researcher wishes to estimate the average amount of money
that a student from university spends for food per day. A random sample of
36 students is selected and the sample mean is found to be Rs 45 with
standard deviation of Rs.3. Estimate 90 % confidence limits for the
average amount of money that the students from the university spend on
food per day.
Sample: n = 36, $\bar{X}$ = 45, S = 3; α = 0.10, α/2 = 0.05
$$\bar{X} \pm t_{\alpha/2,\,(n-1)\,d.f} \sqrt{\frac{S^2}{n}}$$
$$45 \pm t_{0.05,\,35} \sqrt{\frac{3^2}{36}}$$
$45 \pm (1.69)(0.5)$
$45 \pm 0.84$
(45 - 0.84, 45 + 0.84)
(44.16, 45.84)
TEST OF HYPOTHESIS FOR DIFFERENCE BETWEEN POPULATION MEANS
EXAMPLE: The average marks of 20 Students in Math-101 of A Section are 50
with a standard deviation 2 and the average marks of 15 Students in Math-101
of B Section are 40 with a standard deviation 2.5. On the basis of above sample
information can we conclude that students of Section A perform better than
Section B students . (Assume Population variances are equal) Use 5% level of
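significance.
Step-1: Construction of hypotheses
H0: µ1 ≤ µ2, i.e. µ1 - µ2 ≤ 0
H1: µ1 > µ2, i.e. µ1 - µ2 > 0
Step-2: Level of significance
α = 5%
Step-3: Test statistic
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
Sample: n1 = 20, $\bar{X}_1$ = 50, S1 = 2; n2 = 15, $\bar{X}_2$ = 40, S2 = 2.5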
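Step-4:-Calculation
$$S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}$$
$$S_p^2 = \frac{(20 - 1)(4) + (15 - 1)(6.25)}{20 + 15 - 2} = \frac{76 + 87.50}{33} = 4.95$$
$$t_{Cal} = \frac{(50 - 40) - 0}{\sqrt{4.95\left(\frac{1}{20} + \frac{1}{15}\right)}} = \frac{10}{0.7603} = 13.15$$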
Step-5 Critical Region:-
$t > t_{\alpha,\,(n_1 + n_2 - 2)\,d.f}$
$t > t_{0.05,\,33\,d.f}$
$t > 1.692$
Step-6
Conclusion: Since $t_{cal} = 13.15$ falls in the Rejection Region
($t_{tab} = 1.692$), we reject H0.
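The pooled-variance calculation for Sections A and B can be reproduced with the sketch below. The function names `pooled_variance` and `pooled_t` and the parameter `delta0` (the hypothesised difference, 0 here) are assumptions for illustration; equal population variances are assumed, as the example states.

```python
import math

# Pooled variance of two samples, assuming equal population variances.
# Names (pooled_variance, pooled_t, delta0) are illustrative assumptions.
def pooled_variance(s1, n1, s2, n2):
    return ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)

# Two-sample t statistic for the difference between means.
def pooled_t(mean1, mean2, s1, n1, s2, n2, delta0=0.0):
    sp2 = pooled_variance(s1, n1, s2, n2)
    return ((mean1 - mean2) - delta0) / math.sqrt(sp2 * (1 / n1 + 1 / n2))

sp2 = pooled_variance(2, 20, 2.5, 15)
t_cal = pooled_t(50, 40, 2, 20, 2.5, 15)
print(round(sp2, 2), round(t_cal, 2))  # 4.95 13.15
```

Note that the second group's contribution uses $S_2^2 = 2.5^2 = 6.25$, giving $14 \times 6.25 = 87.5$.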
Comparing more than two population means
We can use the two-sample t-test to test the equality of
more than two population means, but this
procedure
– Requires a large number of two-sample t-tests
– Performing many two-sample t-tests, each at level α, tends to
inflate the overall α risk.
For example, to test the equality of 10
population means, we have to perform 45 t-tests.
If the tests are independent and each test
uses α = 0.05, then we expect 45(0.05) = 2.25 false rejections,
and the overall α = 1 - (0.95)^45 ≈ 0.90.
We require a procedure for carrying out a test of
hypothesis about the equality of several population
means simultaneously
– We can use the F-distribution in ANOVA, which yields a single
test statistic for comparing all means so that the overall risk
of Type-I error is controlled
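The inflation of the overall Type-I error risk with 10 means can be checked numerically. This sketch assumes the 45 pairwise tests are independent, as the slide does; the variable names are illustrative. Since a probability cannot exceed 1, the product 45 × 0.05 is read here as the expected number of false rejections, and the overall risk is the chance of at least one.

```python
from math import comb

k = 10                      # number of population means
alpha = 0.05                # significance level of each t-test
pairs = comb(k, 2)          # number of two-sample t-tests needed

# Expected number of false rejections if every null hypothesis is true.
expected_false = pairs * alpha
# Probability of at least one Type-I error, assuming independent tests.
overall_alpha = 1 - (1 - alpha) ** pairs

print(pairs, round(expected_false, 2), round(overall_alpha, 3))  # 45 2.25 0.901
```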
Analysis of Variance (ANOVA)
Analysis of Variance is a procedure that partitions the total
variability present in the data set into meaningful and distinct
components. Each component represents the variation due
to a recognized source of variation, in addition, one
component represents the variation due to uncontrolled
factors and random errors associated with the
(One-Way ANOVA) Four groups of students (all with
approximately the same attributes) were subjected to different
teaching techniques and tested at the end of a specified period
of time. Due to dropouts in the experimental groups (sickness,
transfers, etc.) the number of students varied from group to
group.
Method 1   Method 2   Method 3   Method 4
65         75         59         94
87         69         78         89
73         83         67         80
79         81         62         88
81         72         83
69         79         76
           90
Result:- As Fcal = 3.77 > F.05(3,19) = 3.13, we reject Ho and conclude that
there is a difference in the mean achievements for the four teaching methods.
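The F value quoted in the result can be reproduced from the teaching-method scores with a correction-factor calculation. This is a minimal sketch in plain Python; the function name `one_way_anova` is an assumption, and the assignment of scores to methods follows the table layout (6, 7, 6 and 4 students).

```python
# One-way ANOVA via the correction factor C.F = (G.T)^2 / n.
# The function name is an illustrative assumption.
def one_way_anova(groups):
    n = sum(len(g) for g in groups)          # total number of observations
    k = len(groups)                          # number of groups
    grand_total = sum(sum(g) for g in groups)
    cf = grand_total ** 2 / n
    tss = sum(x * x for g in groups for x in g) - cf
    ss_between = sum(sum(g) ** 2 / len(g) for g in groups) - cf
    ss_error = tss - ss_between
    f_cal = (ss_between / (k - 1)) / (ss_error / (n - k))
    return f_cal, k - 1, n - k

methods = [
    [65, 87, 73, 79, 81, 69],        # Method 1
    [75, 69, 83, 81, 72, 79, 90],    # Method 2
    [59, 78, 67, 62, 83, 76],        # Method 3
    [94, 89, 80, 88],                # Method 4
]
f_cal, df_between, df_error = one_way_anova(methods)
print(round(f_cal, 2), df_between, df_error)  # 3.77 3 19
```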
(Two-Way ANOVA) Four breeds of cattle were fed on three
different diets (rations). Gains in weight in pounds over a given period
were recorded.
Rations   Breeds                           Total
          B1      B2      B3      B4
R1        46.5    62      41      45       194.5
R2        47.5    41.5    22      31.5     142.5
R3        50      40      25.5    28.5     144.0
Total     144     143.5   88.5    105      481 (G.T)
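$$C.F = \frac{(G.T)^2}{n} = \frac{(481)^2}{12} = 19280.08$$
$$TSS = (46.5)^2 + (47.5)^2 + \cdots + (28.5)^2 - C.F$$
$$TSS = 20729.5 - 19280.08 = 1449.42$$
$$SS_{Breed} = \frac{(144)^2}{3} + \frac{(143.5)^2}{3} + \frac{(88.5)^2}{3} + \frac{(105)^2}{3} - C.F$$
$$SS_{Breed} = 20061.83 - 19280.08 = 781.75$$
$$SS_{Ration} = \frac{(194.5)^2}{4} + \frac{(142.5)^2}{4} + \frac{(144)^2}{4} - C.F$$
$$SS_{Ration} = 19718.125 - 19280.08 = 438.05$$
$$SS_{Error} = TSS - SS_{Breed} - SS_{Ration}$$
$$SSE = 1449.42 - 781.75 - 438.05 \approx 229.63$$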
ANOVA TABLE
SOV      d.f       SS        MS       F.cal      F.tab
Breed    b-1=3     781.75    260.58   F1=6.81    F.05(3,6)=4.76
Ration   r-1=2     438.05    219.02   F2=5.72    F.05(2,6)=5.14
Error    6         229.63    38.27
Total    n-1=11    1449.42
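The sums of squares and F ratios for the cattle data can be verified with a short two-way ANOVA sketch (one observation per cell, no interaction term). The function name `two_way_anova` and the dictionary keys are assumptions made for illustration; rows are rations and columns are breeds.

```python
# Two-way ANOVA without replication, using the correction factor method.
# table[i][j] is the single observation for row level i and column level j.
# The function name and result keys are illustrative assumptions.
def two_way_anova(table):
    r, c = len(table), len(table[0])
    n = r * c
    grand_total = sum(sum(row) for row in table)
    cf = grand_total ** 2 / n
    tss = sum(x * x for row in table for x in row) - cf
    ss_rows = sum(sum(row) ** 2 for row in table) / c - cf
    ss_cols = sum(sum(col) ** 2 for col in zip(*table)) / r - cf
    ss_error = tss - ss_rows - ss_cols
    ms_error = ss_error / ((r - 1) * (c - 1))
    return {
        "ss_rows": ss_rows,
        "ss_cols": ss_cols,
        "ss_error": ss_error,
        "ms_error": ms_error,
        "f_rows": ss_rows / (r - 1) / ms_error,
        "f_cols": ss_cols / (c - 1) / ms_error,
    }

gains = [
    [46.5, 62, 41, 45],        # R1
    [47.5, 41.5, 22, 31.5],    # R2
    [50, 40, 25.5, 28.5],      # R3
]
result = two_way_anova(gains)
print(round(result["ss_cols"], 2), round(result["f_cols"], 2))  # 781.75 6.81
print(round(result["f_rows"], 2))                               # 5.72
```

Both F ratios exceed their tabulated values (4.76 and 5.14), so the breed and ration effects are each significant at the 5% level.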
Two-Way ANOVA. The Black Rock candy company was
planning a test of three new candy flavors (F1,F2,F3). In the test
company wished also to measure the effect of three different retail
price levels (P1=79 Cents, P2=89 Cents, P3=99 Cents). Because
each flavor was to be tested at each price a total of nine different
flavor-price level combinations were to be tested. The following
data represent the number of candies sold (in hundreds).
Price    F1    F2    F3    Total
P1       8     13    5     26
P2       4     18    6     28
P3       4     22    10    36
Total    16    53    21    90
Do the data provide sufficient evidence to indicate
a difference in the mean for flavors and prices?
Step-1: Construction of hypotheses
H0: µ1 = µ2 = µ3, i.e. all the flavors have equal sales
H1: At least one µ is different
H0′: µ1 = µ2 = µ3, i.e. all the prices have equal sales
H1′: At least one µ is different
Step-2. Level of significance
α = 5%
Step-3. Test Statistic
F test (ANOVA)
Step 4 Calculation
Price    Candy Flavors       Total
         F1    F2    F3
P1       8     13    5       26
P2       4     18    6       28
P3       4     22    10      36
Total    16    53    21      90 (G.T)
$$C.F = \frac{(G.T)^2}{n} = \frac{(90)^2}{9} = 900.00$$
$$TSS = (8)^2 + (4)^2 + \cdots + (10)^2 - C.F$$