Quantitative Techniques - II: Dr. Pritha Guha
MORE THAN TWO POPULATIONS
Comparing the means of more than two populations
• Annual savings using public transportation in 4 large American cities (in $):
• We believe that the mean annual savings are the same across the 4 cities. The hypotheses are:
𝐻0 : mean savings for all cities are same
𝐻1 : mean savings vary across the cities
# Assumes the vectors savings and city are already available in the workspace
boxplot(savings ~ city, col = c("red", "blue", "yellow", "plum"),
        main = "Boxplot of Annual savings using public transportation in 4 large American cities")
Comparing the means of more than two populations: Set up
• Suppose we have k (≥ 3) samples.
• We would like to know whether they are from the same distribution.
• Assumption: all samples are from normal distributions with same variance (unknown)
• Samples:
• Sample 1: $X_{11}, X_{12}, \dots, X_{1n_1}$ IID $N(\mu_1, \sigma^2)$
• Sample 2: $X_{21}, X_{22}, \dots, X_{2n_2}$ IID $N(\mu_2, \sigma^2)$
…
• Sample k: $X_{k1}, X_{k2}, \dots, X_{kn_k}$ IID $N(\mu_k, \sigma^2)$
Assumptions
• Normality: all samples have to be from normal distributions.
• Independence: samples need to be independent.
• Equal variance: all populations must have equal variance
• Difference, if any, is therefore only through the means.
Alternative Representation
• Let μ: overall mean,
𝛼𝑖 : Differential effect of the i-th factor/ treatment/ group, then,
𝜇𝑖 = 𝜇 + 𝛼𝑖
• The model becomes: 𝑋𝑖𝑗 = 𝜇 + 𝛼𝑖 + 𝜖𝑖𝑗 , 𝑗 = 1, 2, ⋯ , 𝑛𝑖 , 𝑖 = 1, 2, ⋯ , 𝑘
• We now test for
𝐻0 : 𝛼1 = 𝛼2 = ⋯ = 𝛼𝑘 = 0, 𝐻1 : at least one 𝛼𝑖 is not 0
• A Model Restriction: The 𝛼𝑖 ’s are the differential effects from mean level, thus,
• $\sum_{i=1}^{k} n_i \alpha_i = 0$, if the model is unbalanced.
• $\sum_{i=1}^{k} \alpha_i = 0$, if the model is balanced.
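The restriction can be checked numerically on the estimated effects (group mean minus grand mean). The sketch below uses small, made-up, unbalanced data, not the savings figures from the slides; the values and the names `n_i`, `mu_hat`, and `alpha_hat` are assumptions for illustration only.

```r
# Hypothetical, unbalanced data (NOT the savings table from the slides).
savings <- c(850, 940, 1010, 780,    # Boston
             1200, 1130, 1270,       # NY
             990, 1040, 960, 1110,   # SF
             880, 910, 830)          # Chicago
city <- factor(rep(c("Boston", "NY", "SF", "Chicago"), times = c(4, 3, 4, 3)))

n_i       <- tapply(savings, city, length)         # group sizes n_i
mu_hat    <- mean(savings)                         # estimate of the overall mean
alpha_hat <- tapply(savings, city, mean) - mu_hat  # estimated differential effects

weighted_sum   <- sum(n_i * alpha_hat)  # zero up to rounding error
unweighted_sum <- sum(alpha_hat)        # generally non-zero for unbalanced data
```

With equal group sizes the two sums coincide, which is why the simpler restriction is enough in the balanced case.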
Some Estimates
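• The grand sample mean, $\bar{X}_{00} = \frac{1}{n} \sum_{i=1}^{k} \sum_{j=1}^{n_i} X_{ij}$ (with $n = \sum_{i=1}^{k} n_i$ the total sample size), is unbiased for the overall mean $\mu$. Thus $\hat{\mu} = \bar{X}_{00}$.
• Sample mean for each group: $\bar{X}_{i0} = \frac{1}{n_i} \sum_{j=1}^{n_i} X_{ij}$. $\bar{X}_{i0}$ is unbiased for $\mu_i = \mu + \alpha_i$.
• Total Sum of Squares (SST): $\sum_{i=1}^{k} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_{00})^2$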
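The test statistic later in these notes uses SSTR and SSE, but their definitions do not appear in this text. Assuming the standard one-way ANOVA definitions are intended, SST splits into a between-group part (SSTR) and a within-group part (SSE):

```latex
\[
\underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n_i}\bigl(X_{ij}-\bar{X}_{00}\bigr)^2}_{SST}
=
\underbrace{\sum_{i=1}^{k} n_i\bigl(\bar{X}_{i0}-\bar{X}_{00}\bigr)^2}_{SSTR}
+
\underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n_i}\bigl(X_{ij}-\bar{X}_{i0}\bigr)^2}_{SSE}
\]
```

The cross term vanishes because, within each group, the deviations $X_{ij}-\bar{X}_{i0}$ sum to zero.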
In R
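# Calculation by hand: grand mean and per-city means
unique(city)                               # city labels present in the data
xGT = mean(savings)                        # grand mean
xB = mean(savings[city == "Boston"])       # mean for Boston
xNY = mean(savings[city == "NY"])          # mean for NY
xSF = mean(savings[city == "SF"])          # mean for SF
xC = mean(savings[city == "Chicago"])      # mean for Chicago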
In R
SST $= \sum_{i=1}^{k} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_{00})^2$
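The R code for this step is not preserved in the text. A minimal sketch of the hand calculation, assuming `savings` is a numeric vector, could look as follows; the numbers are hypothetical and are not the savings data from the slides.

```r
# Hypothetical values (NOT the savings table from the slides).
savings <- c(850, 940, 1010, 780, 1200, 1130, 1270,
             990, 1040, 960, 1110, 880, 910, 830)

xGT <- mean(savings)             # grand mean
SST <- sum((savings - xGT)^2)    # total sum of squares
```

As a sanity check, SST equals `(length(savings) - 1) * var(savings)`, since `var` divides the same sum by n - 1.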
Test Statistic
• Under $H_0$, $F = \dfrac{SSTR/(k-1)}{SSE/(n-k)} \sim F_{k-1,\,n-k}$
Note: We define
• $MSTR = SSTR/(k-1)$
• $MSE = SSE/(n-k)$
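The statistic can be computed by hand in R. The sketch below assumes the usual definitions (SSTR as the between-group sum of squares, SSE as the within-group sum of squares) and uses made-up, unbalanced data; the values and the helper name `group_means` are illustrative assumptions.

```r
# Hypothetical, unbalanced data (NOT the savings table from the slides).
savings <- c(850, 940, 1010, 780,    # Boston
             1200, 1130, 1270,       # NY
             990, 1040, 960, 1110,   # SF
             880, 910, 830)          # Chicago
city <- factor(rep(c("Boston", "NY", "SF", "Chicago"), times = c(4, 3, 4, 3)))

k <- nlevels(city)      # number of groups
n <- length(savings)    # total sample size

xGT         <- mean(savings)        # grand mean
group_means <- ave(savings, city)   # each observation replaced by its group mean

SSTR <- sum((group_means - xGT)^2)      # between-group sum of squares
SSE  <- sum((savings - group_means)^2)  # within-group sum of squares

MSTR  <- SSTR / (k - 1)
MSE   <- SSE / (n - k)
F_obs <- MSTR / MSE                     # observed F statistic
```

Summing over observations rather than over groups gives the factor $n_i$ in SSTR automatically, because each group mean is repeated $n_i$ times in `group_means`.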
Analysis Of Variance (ANOVA) Table
• Under $H_0$, $F \sim F_{k-1,\,n-k}$
• Reject $H_0$ if observed $F > F_{k-1,\,n-k;\,\alpha}$ at level $\alpha$.
• Rejection of 𝐻0 means that not all group means are equal.
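The decision rule can be sketched in R with `qf` (critical value) and `pf` (p-value). The inputs below are hypothetical numbers chosen only to illustrate the calls; they are not results from the savings data.

```r
# Hypothetical inputs, for illustration only.
F_obs <- 5.2    # observed F statistic
k     <- 4      # number of groups
n     <- 14     # total sample size
alpha <- 0.05   # significance level

# Upper-alpha critical value of the F distribution with (k - 1, n - k) df
F_crit <- qf(1 - alpha, df1 = k - 1, df2 = n - k)

# p-value: probability of an F at least as large as the one observed, under H0
p_value <- pf(F_obs, df1 = k - 1, df2 = n - k, lower.tail = FALSE)

reject <- F_obs > F_crit   # same decision as p_value < alpha
```

Comparing `F_obs` with `F_crit` and comparing `p_value` with `alpha` always lead to the same decision.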
In R
PublicT.Anova = aov(savings ~ city)
summary(PublicT.Anova)
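To use the table entries in further calculations rather than just printing them, the first element of the summary can be treated as a data frame. The sketch below uses made-up data, not the savings figures from the slides; `anova_table`, `F_obs`, and `p_value` are illustrative names.

```r
# Hypothetical, unbalanced data (NOT the savings table from the slides).
savings <- c(850, 940, 1010, 780,    # Boston
             1200, 1130, 1270,       # NY
             990, 1040, 960, 1110,   # SF
             880, 910, 830)          # Chicago
city <- factor(rep(c("Boston", "NY", "SF", "Chicago"), times = c(4, 3, 4, 3)))

PublicT.Anova <- aov(savings ~ city)

# summary() returns a list; its first element holds the ANOVA table with
# columns Df, Sum Sq, Mean Sq, F value and Pr(>F).
anova_table <- summary(PublicT.Anova)[[1]]

F_obs   <- anova_table[["F value"]][1]   # row 1 is the city (treatment) row
p_value <- anova_table[["Pr(>F)"]][1]
```

Rows are indexed by position here because the row names printed by `summary` are padded with spaces.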
Test for Equality of Variances (Bartlett's Test)
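• To test $H_0: \sigma_1^2 = \sigma_2^2 = \cdots = \sigma_k^2$ against $H_1$: not all $\sigma_i^2$'s are equal.
• Assumption: the data in each group come from a normal distribution.
• Under $H_0$, the test statistic approximately follows $\chi^2_{k-1}$.
• Reject $H_0$ at significance level $\alpha$ if
• the observed test statistic value $> \chi^2_{k-1;\,\alpha}$,
• or, equivalently, if p-value $< \alpha$.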
In R
bartlett.test(savings ~ city)
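The object returned by `bartlett.test` carries the statistic, its degrees of freedom, and the p-value, so the rejection rule can be applied explicitly. A minimal sketch on made-up data (not the savings figures from the slides), with `bt`, `stat`, `df`, and `crit` as illustrative names:

```r
# Hypothetical, unbalanced data (NOT the savings table from the slides).
savings <- c(850, 940, 1010, 780,    # Boston
             1200, 1130, 1270,       # NY
             990, 1040, 960, 1110,   # SF
             880, 910, 830)          # Chicago
city <- factor(rep(c("Boston", "NY", "SF", "Chicago"), times = c(4, 3, 4, 3)))
alpha <- 0.05

bt <- bartlett.test(savings ~ city)

stat <- unname(bt$statistic)   # Bartlett's K-squared
df   <- unname(bt$parameter)   # degrees of freedom, k - 1
crit <- qchisq(1 - alpha, df)  # upper-alpha chi-square critical value

reject <- stat > crit          # same decision as bt$p.value < alpha
```

Failing to reject here is what supports the equal-variance assumption behind the ANOVA F test.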
A Problem
• Does salary depend on gender?
• A data set with observations on three variables for 52 tenure-track professors in a small college was collected to examine this question (see DisSalary.csv).
• The variables are:
• Gender: Male/Female
• JobRank: Full Professor (full), Associate Professor (associate), Assistant Professor (assistant)
• Salary: Salary of the faculty member (in '000 Rs.)
• Cut-off value:
• p-value:
• Conclusion:
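One possible way to obtain the cut-off value and p-value is a one-way ANOVA of Salary on Gender. The sketch below is an assumption-laden outline, not the course solution: it assumes DisSalary.csv sits in the working directory with columns named exactly `Gender` and `Salary`, it ignores `JobRank`, and `salary_by_gender` is a hypothetical helper name.

```r
# Hypothetical helper: one-way ANOVA of Salary on Gender at level alpha.
# Assumes `dat` has columns named Gender and Salary.
salary_by_gender <- function(dat, alpha = 0.05) {
  dat$Gender <- factor(dat$Gender)
  k <- nlevels(dat$Gender)   # number of groups
  n <- nrow(dat)             # total sample size
  tab <- summary(aov(Salary ~ Gender, data = dat))[[1]]
  list(
    F_obs   = tab[["F value"]][1],
    cutoff  = qf(1 - alpha, df1 = k - 1, df2 = n - k),
    p_value = tab[["Pr(>F)"]][1]
  )
}

# Runs only if the file is present; the path is an assumption.
if (file.exists("DisSalary.csv")) {
  DisSalary <- read.csv("DisSalary.csv")
  print(salary_by_gender(DisSalary))
}
```

With only two groups this F test is equivalent to the pooled-variance two-sample t test, so either can be used to fill in the conclusion.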