Module4_Market_Research_1


MKTG 631

Marketing Analytics
Module 4 – Market research: Hypothesis Testing I

Jinhee Huh
Hypothesis Testing 1 – Chi-Square and Student’s t Testing
Group Comparison Example

• Which gender group is more likely to own an iPhone?

          iPhone = 0   iPhone = 1   Total
Male         250          150        400
Female       450          150        600
Total        700          300       1000

• Men: 150/400 = 37.5%
• Women: 150/600 = 25%
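The ownership rates above can be reproduced in R. This is a minimal sketch: the matrix name `own` and its dimension labels are chosen here for illustration and do not come from the slides.

```r
# Cross-tab from the slide: rows = gender, columns = iPhone ownership (0 = no, 1 = yes)
own <- matrix(c(250, 150,
                450, 150),
              nrow = 2, byrow = TRUE,
              dimnames = list(gender = c("Male", "Female"),
                              iPhone = c("0", "1")))

# Row-wise proportions: the share of each gender that owns an iPhone
own_rate <- prop.table(own, margin = 1)[, "1"]
print(round(100 * own_rate, 1))  # Male 37.5, Female 25.0
```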
Group Comparison Example

• Are these numbers different?


• Mathematical difference?
• If numbers are not exactly the same, they are different
• Managerial difference?
• A difference is important from a managerial perspective only if results or numbers are
sufficiently different
• Statistical or significant difference
• If a difference is large enough to be unlikely to have occurred by chance or due to sampling
error, then the difference is statistically significant

Hypothesis testing
Hypothesis Testing Process

• Step 1: State the hypotheses


• Step 2: Choose the test
• Step 3: Develop a decision rule
• Step 4: Compute the test statistic
• Step 5: State the conclusion
Step 1: State the hypotheses

• A hypothesis is a claim about the population

• Null Hypothesis (H0)


• Default hypothesis
• Typically, “no relationship” or “no difference”

• Alternative Hypothesis (Ha or H1)


• Opposite of the null hypothesis
Step 1: State the hypotheses

• Null Hypothesis (H0)

• Men and women are equally likely to purchase an iPhone

• There is no significant relationship between gender and iPhone ownership

• Alternative Hypothesis (H1 or Ha)

• Men and women differ in their likelihood of purchasing an iPhone

• There is a significant relationship between gender and iPhone ownership
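In symbols, writing $p_M$ and $p_F$ for the population shares of men and women who own an iPhone (notation assumed here, not taken from the slides), the two hypotheses can be sketched as:

```latex
H_0:\; p_M = p_F \qquad \text{versus} \qquad H_1:\; p_M \neq p_F
```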


Step 2: Choose the Appropriate Test

• t-test
• One sample t-test (one variable)
• Testing if a variable’s mean is significantly different from a specific value

• Two sample t-test (two variables or two groups)


• Testing whether the means of two groups/variables are significantly different from each other
Step 2: Choose the Appropriate Test

• Chi-square test
• One sample chi-square test
• Testing if the observed proportions are equal to a set of known proportions

• Two sample chi-square test


• Testing the relationship between two categorical variables
Step 2: Choose the Appropriate Test

[Figure panels: Student’s t distribution; Chi-square distribution]

Source: https://www.globalspec.com/reference/69593/203279/10-7-the-student-s-t-distribution
Step 3: Develop a Decision Rule

• Significance Level (𝛼) of a test


• The basis for the decision rule
• 0.05, 0.01 or 0.1

• P-value: 0.03
• For 0.05 significance level, we reject the null hypothesis
• For 0.01 significance level, we cannot reject the null hypothesis
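The decision rule amounts to comparing the p-value with the chosen significance level. A small sketch in R, using the p-value from the slide; the variable names are illustrative only:

```r
p_value <- 0.03                 # p-value from the slide
alpha   <- c(0.10, 0.05, 0.01)  # significance levels listed on the slide

# Reject H0 whenever the p-value falls below the significance level
decision <- ifelse(p_value < alpha, "reject H0", "fail to reject H0")
names(decision) <- c("0.10", "0.05", "0.01")
print(decision)
```

The same p-value leads to rejection at the 0.10 and 0.05 levels but not at the 0.01 level, so the significance level should be fixed before looking at the result.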
Step 3: Develop a Decision Rule

• The Significance Level (𝛼) determines the “critical value”


• Reject the null hypothesis if the test statistic is larger than or equal to the critical value

                     Area in Upper Tail
Degree of freedom    .10      .05      .025
 9                   1.383    1.833    2.262
10                   1.372    1.812    2.228
11                   1.363    1.796    2.201
12                   1.356    1.782    2.179
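Critical values like these do not have to be read from a printed table; R's `qt()` returns the quantile of Student's t distribution directly. A minimal sketch, with object names chosen here for illustration:

```r
df_vals    <- 9:12
upper_tail <- c(0.10, 0.05, 0.025)

# qt(1 - a, df) is the value with upper-tail area a under the t distribution
crit <- sapply(upper_tail, function(a) qt(1 - a, df = df_vals))
dimnames(crit) <- list(df = as.character(df_vals),
                       upper_tail = c(".10", ".05", ".025"))
print(round(crit, 3))
```

For a two-tailed test at the 0.05 level, the relevant column is the one with upper-tail area .025.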


Step 4: Calculate the Value of the Test Statistic

• Calculate the Expected Value for each cell in the cross-tab table, assuming H0 is true
• Example
• H0 : Gender and phone use are NOT related

               Male   Female   Row Total
iPhone           ?       ?        200
Not iPhone       ?       ?        200
Column Total    300     100       400


Step 4: Calculate the Value of the Test Statistic

• Calculate the Expected Value for each cell in the cross-tab table, assuming H0 is true
• Example
• H0 : Gender and phone use are NOT related

               Male   Female   Row Total
iPhone           a       b        200
Not iPhone       ?       ?        200
Column Total    300     100       400

• $a + b = 200$ and $\frac{a}{300} = \frac{b}{100}$, so $a = 150$ and $b = 50$
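Solving for each cell this way is equivalent to computing (row total × column total) / grand total. A short R sketch under that standard formula; the vector names are illustrative:

```r
row_tot <- c(iPhone = 200, `Not iPhone` = 200)
col_tot <- c(Male = 300, Female = 100)
n <- sum(row_tot)  # grand total, 400

# Expected count under H0 for every cell: row total * column total / grand total
expected <- outer(row_tot, col_tot) / n
print(expected)
```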
Step 4: Calculate the Value of the Test Statistic

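• For each cell, compute $\frac{(O_{ij} - E_{ij})^2}{E_{ij}}$

Observed counts (expected counts in parentheses)
           Male        Female
iPhone     120 (150)   80 (50)
Android    180 (150)   20 (50)

Contribution of each cell
           Male                   Female
iPhone     (120-150)^2/150 = 6    (80-50)^2/50 = 18
Android    (180-150)^2/150 = 6    (20-50)^2/50 = 18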


Step 4: Calculate the Value of the Test Statistic
• Sum over all cells: $\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{k} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$
• $r$: number of categories of variable 1
• $k$: number of categories of variable 2

               Male                   Female               Row Total
iPhone         (120-150)^2/150 = 6    (80-50)^2/50 = 18    6 + 18 = 24
Android        (180-150)^2/150 = 6    (20-50)^2/50 = 18    6 + 18 = 24
Column Total   300                    100                  24 + 24 = 48
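The hand calculation can be checked with `chisq.test()`. One assumption to flag: for 2 × 2 tables R applies Yates' continuity correction by default, so `correct = FALSE` is needed to match the uncorrected statistic computed by hand. The object names are illustrative.

```r
obs <- matrix(c(120, 80,
                180, 20),
              nrow = 2, byrow = TRUE,
              dimnames = list(phone = c("iPhone", "Android"),
                              gender = c("Male", "Female")))

# correct = FALSE switches off the continuity correction used for 2 x 2 tables
res <- chisq.test(obs, correct = FALSE)

print(res$expected)                           # expected counts under H0
print((obs - res$expected)^2 / res$expected)  # per-cell contributions: 6, 18, 6, 18
print(unname(res$statistic))                  # 48
```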


Step 5: State The Conclusion

• Calculate the degrees of freedom

• The number of values in the final calculation of a statistic that are free to vary
• For an r × k cross-tab, df = (r − 1)(k − 1); for a 2 × 2 table, df = 1

• Look up the critical value in the chi-square distribution table
Step 5: State The Conclusion

• Decision Rule

• Reject the null hypothesis if the test statistic is larger than the critical value
                     Area in Upper Tail
Degree of freedom    .10     .05     .01
1                    2.71    3.84    6.63
2                    4.61    5.99    9.21
3                    6.25    7.81    11.34
4                    7.78    9.49    13.28

• The test statistic 48 is larger than 3.84 (df = 1, upper-tail area .05), so we reject the null hypothesis
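The table lookup and the comparison can also be done in R with `qchisq()` and `pchisq()`. A minimal sketch for the 2 × 2 example, with df = (2 − 1)(2 − 1) = 1; the variable names are illustrative:

```r
chi_sq <- 48                  # test statistic from Step 4
df     <- (2 - 1) * (2 - 1)   # (rows - 1) * (columns - 1)
alpha  <- 0.05

crit    <- qchisq(1 - alpha, df = df)                   # critical value, about 3.84
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)  # upper-tail probability
reject  <- chi_sq > crit

print(round(crit, 2))
print(reject)  # TRUE: reject H0 at the 0.05 level
```

Comparing the statistic with the critical value and comparing the p-value with the significance level always give the same decision.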


Step 5: State the Conclusion

• At the 5% significance level (95% confidence), we reject the null hypothesis that there is no relationship between gender and iPhone purchase

• More intuitively, we find that there is a significant relationship between gender and iPhone purchase
Step 5: State the Conclusion

• What if the chi-square test statistic < 3.84?

• At the 5% significance level (95% confidence), we fail to reject the null hypothesis

• More intuitively, we find no significant evidence of an association (relationship) between gender and iPhone ownership
In-Class Practice
• Assume that a marketing manager wishes to compare five different colors of package design. He is interested in knowing which of the five is the most preferred one so that it can be introduced in the market. Analyze the frequency table below using a one-sample chi-square test and draw your conclusion.

Package color   Preference by consumers
Red              70
Blue            106
Green            80
Pink             70
Orange           74
Total           400
In-Class Practice

• A marketing manager of a laundry detergent company wants to know whether the income level of consumers influences their choice of brand. Analyze the cross-table below using a chi-square test of independence and draw your conclusion.

                Brand 1   Brand 2   Total
Income – high      35        25       60
Income – low       45        95      140
Total              80       120      200

Student’s t Test

• Hypothesis

• H0: The average sales of stores in city A is the same as that of stores in city B

• H1: The average sales of stores in city A is not the same as that of stores in city B

• Data
City Store sales
A 1000 2825 5639 5792 7238 5224 6277
B 3100 2393 4012 1928 8321 7082
Student’s t Test – Unpaired
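• Test statistic
• Equal variance: $t = \frac{\bar{x}_A - \bar{x}_B}{s_p \sqrt{\frac{1}{n_A} + \frac{1}{n_B}}}$, where $s_p^2 = \frac{(n_A - 1) s_A^2 + (n_B - 1) s_B^2}{n_A + n_B - 2}$
• Unequal variance: $t = \frac{\bar{x}_A - \bar{x}_B}{\sqrt{\frac{s_A^2}{n_A} + \frac{s_B^2}{n_B}}}$
• $\bar{x}_A \approx 4856.4$; $\bar{x}_B \approx 4472.7$
• $t \approx 0.29$
• $df = n_A + n_B - 2 = 11$ (for equal variance)

                     Area in Upper Tail
Degree of freedom    .10      .05      .025
 9                   1.383    1.833    2.262
10                   1.372    1.812    2.228
11                   1.363    1.796    2.201
12                   1.356    1.782    2.179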
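The city A versus city B comparison can be reproduced with `t.test()`. This sketch assumes equal variances (`var.equal = TRUE`), which gives the pooled-variance statistic with 11 degrees of freedom; the vector names are illustrative.

```r
city_a <- c(1000, 2825, 5639, 5792, 7238, 5224, 6277)
city_b <- c(3100, 2393, 4012, 1928, 8321, 7082)

# Pooled-variance (equal variance) two-sample t-test, two-sided by default
res <- t.test(city_a, city_b, var.equal = TRUE)
print(round(unname(res$statistic), 2))  # t is about 0.29
print(unname(res$parameter))            # df = 11

# Two-tailed critical value at the 0.05 level: upper-tail area .025, df = 11
crit <- qt(1 - 0.025, df = 11)
print(abs(unname(res$statistic)) > crit)  # FALSE: fail to reject H0
```

Since |t| is well below the critical value of about 2.201, the data give no significant evidence that average sales differ between the two cities.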
In-Class Practice

• Was the marketing campaign successful? Use a t-test to draw your conclusion.

            Pre Marketing   During Marketing
Monday          649              1070
Tuesday         654               799
Wednesday       961               575
Thursday        816               940
Friday          663               917
Saturday        623               714
Sunday          599               748

R program Practice
Chi-square Test
• Chi-square test function
• chisq.test(frequency table)
• chisq.test operates on a frequency table

• Example
• tmp.tab <- table(rep(c(1:4), times=c(25,25,25,20)))
• tmp.tab
• chisq.test(tmp.tab)

• tmp.tab <- table(rep(c(1:4), times=c(25,25,25,10)))


• tmp.tab
• chisq.test(tmp.tab)
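For a one-dimensional frequency table, `chisq.test()` by default compares the observed counts with equal expected counts in every category. A minimal sketch that recomputes the statistic by hand for the second example (counts 25, 25, 25, 10) and compares it with the function's output; the variable names are illustrative:

```r
counts   <- c(25, 25, 25, 10)
expected <- rep(sum(counts) / length(counts), length(counts))  # equal shares under H0

# Same formula as in Step 4: sum of (O - E)^2 / E over all categories
stat <- sum((counts - expected)^2 / expected)

res <- chisq.test(counts)  # default: equal expected proportions
print(c(by_hand = stat, chisq.test = unname(res$statistic)))

# df = number of categories - 1 = 3
print(stat > qchisq(1 - 0.05, df = length(counts) - 1))
```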
Chi-Square Test
• One sample chi-square test

• Are there any significant differences in segment size?


• table(seg.df$Segment)
• chisq.test(table(seg.df$Segment))

You can reject the null hypothesis


There are significant differences in segment size
Chi-Square Test
• Two sample chi-square test

• Is the subscription status independent of home ownership?


• table(sub$subscribe, sub$ownHome)
• chisq.test(table(sub$subscribe, sub$ownHome))

You cannot reject the null hypothesis


Home ownership is not related to the subscription status
Chi-Square Test
• In-class practice

• Is the subscription rate different between males and females?

• Does the subscription rate vary by segments?


Student’s t Test
• t-test function

• t.test(x, mu=m0)
• One sample t-test
• H0: mu = m0 (m0 is a specific numeric value. If not specified, m0 = 0)

• t.test(x1, x2)
• x1 and x2 are numeric variables
• Comparing average values

• t.test(x1~x2)
• x1 is numeric and x2 is binary factor variable
• Comparing the average value of group 1 in x2 and that of group 2 in x2
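The three calling styles can be tried on a small simulated data set. Everything in this sketch is hypothetical: the data frame `demo`, its columns, and the simulated values are made up for illustration and are not the course data.

```r
set.seed(1)  # make the simulated data reproducible

# Hypothetical data: income for 20 non-owners and 20 home owners
demo <- data.frame(
  income  = c(rnorm(20, mean = 50, sd = 10), rnorm(20, mean = 55, sd = 10)),
  ownHome = factor(rep(c("ownNo", "ownYes"), each = 20))
)

# One sample: is mean income different from 50?
one <- t.test(demo$income, mu = 50)

# Two samples given as two numeric vectors
two <- t.test(demo$income[demo$ownHome == "ownNo"],
              demo$income[demo$ownHome == "ownYes"])

# Two samples given as a formula: numeric ~ binary factor
form <- t.test(income ~ ownHome, data = demo)

print(one$p.value)
print(c(two_vectors = unname(two$statistic), formula = unname(form$statistic)))
```

The vector form and the formula form describe the same comparison, so they return the same statistic. Both use the unequal-variance (Welch) test unless `var.equal = TRUE` is set.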
Student’s t Test
• Two sample t-test

• Is the average age different from average income?

• Check if the variances are the same


• var.test(sub$age, sub$income)
• t.test(sub$age, sub$income)
Student’s t Test
• Two sample t-test

• Does the income vary by the home ownership status?


• var.test(owner_dt$income, no_dt$income)
• t.test(owner_dt$income, no_dt$income, var.equal = TRUE)

• Is the average income different between “Travelers” (segment: “Travelers”) home owners and “Travelers” non-home owners?
• var.test(income ~ ownHome, data=subset(sub, Segment=="Travelers"))
• t.test(income ~ ownHome, data=subset(sub, Segment=="Travelers"))
Student’s t Test
• In-class practice

• Is the home owner’s mean age different from the non-home owner’s average
age?

• Is the average income different between male “Urban hip” (segment: “Urban
hip”) consumers and female “Urban hip” consumers?
