
STAR DIGITAL CASE

PRAVEEN PAITHANKER-043
GAURAV CHAUHAN-022
ANITA RANJAN-083
PARTH SHAH-112
Q1. Assess the validity of the random assignment of users to test group and control group in
the experiment. To answer this question think about what random assignment is intended
to accomplish, and how you can rigorously examine in the data (after the experiment is
completed) whether that goal was accomplished.
Answer1
In this experiment users were divided into two groups, test and control, and as per the
case they were assigned to each group randomly. Random assignment is intended to make the
two groups comparable on everything except the ad shown, so that any difference in purchases
can be attributed to the ad. To check this, we ran ANOVA to test whether the mean number of
impressions is statistically equal between the two groups.
Results:

 The means for the two groups (7.86 and 7.92, table 1, averages column) are statistically
indistinguishable: since the P value is more than 0.05, we fail to reject the NULL
hypothesis that the means are equal
 This implies that the distribution of impressions is similar in the two groups, which is
consistent with random assignment
 We will further run other tests to examine the relationship between the purchase outcome
and number of impressions. If the predictor variable is not useful in predicting the
purchase outcome, we will have to check for other variables such as activity bias
(a confounding variable) and devise a technique to identify and remove it from
our analysis
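
As a rough illustration of the balance check described above, the sketch below compares the mean impressions of two groups. With only two groups, the one-way ANOVA F statistic equals the square of the pooled two-sample t statistic, so that is what the code computes. The p-value uses a normal approximation, which is an assumption that is reasonable only for large samples such as the 25303 users here. The function name and the toy impression counts are hypothetical and are not the case data.

```python
import math
from statistics import mean, variance


def balance_check(control, test):
    """Compare group means; returns (F, approximate two-sided p-value).

    With two groups, the one-way ANOVA F statistic is the square of the
    pooled-variance t statistic. The p-value uses a normal approximation
    (an assumption: adequate for large samples only).
    """
    n0, n1 = len(control), len(test)
    # Pooled within-group variance (sample variances use n - 1 denominators).
    pooled = ((n0 - 1) * variance(control) + (n1 - 1) * variance(test)) / (n0 + n1 - 2)
    t = (mean(test) - mean(control)) / math.sqrt(pooled * (1 / n0 + 1 / n1))
    p_approx = math.erfc(abs(t) / math.sqrt(2))
    return t * t, p_approx


# Hypothetical impression counts, for illustration only (not the case data).
control = [3, 8, 12, 5, 9, 7, 10, 6]
test = [4, 7, 11, 6, 9, 8, 10, 6]
f_stat, p_value = balance_check(control, test)
print(f"F = {f_stat:.3f}, approx. p = {p_value:.3f}")
```

A large p-value here means the data give no evidence of imbalance. It does not prove the groups are identical, so the same check would ideally be repeated on other pre-treatment variables.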

Q2 Examine whether or not Star Digital’s advertising campaign had an impact on purchases.
For this question ignore the fact that different consumers received different number of ad
impressions. In other words, advertising exposure is measured as a binary variable – exposed
or not. Use the following methods:
Answer2: Our assumptions are:

 The total number of impressions makes no difference: every exposed user is treated the
same, regardless of how many impressions they received
 We will use the Chi Square test and logistic regression (between purchase outcome
and group status)
 Here we coded both categorical variables as binary variables and ran the tests
to establish association
a) Use a 2-sample t-test (Chi Square test)
Case Processing Summary

                                      Cases
                  Valid             Missing           Total
                  N       Percent   N       Percent   N       Percent
test * purchase   25303   100.0%    0       0.0%      25303   100.0%

 The test was run for all 25303 observations

test * purchase Crosstabulation

                                  purchase
                                  0         1         Total
test 0    Count                   1366      1290      2656
          % within test           51.4%     48.6%     100.0%
          % within purchase       10.9%     10.1%     10.5%
          % of Total              5.4%      5.1%      10.5%
test 1    Count                   11213     11434     22647
          % within test           49.5%     50.5%     100.0%
          % within purchase       89.1%     89.9%     89.5%
          % of Total              44.3%     45.2%     89.5%
Total     Count                   12579     12724     25303
          % within test           49.7%     50.3%     100.0%
          % within purchase       100.0%    100.0%    100.0%
          % of Total              49.7%     50.3%     100.0%
 Of the users who purchased, 89.9% were shown the company ad whereas 10.1% were
shown the charity ad
 Of the users who did not purchase, 89.1% were shown the company ad whereas 10.9%
were shown the charity ad
 Further we will check the statistical significance of this association

Chi-Square Tests

                        Value   df   Asymptotic Significance   Exact Sig.   Exact Sig.
                                     (2-sided)                 (2-sided)    (1-sided)
Pearson Chi-Square      3.501   1    .061
Continuity Correction   3.424   1    .064
Likelihood Ratio        3.501   1    .061
We can see here that χ²(1) = 3.501, p = .061. This tells us that there is no
statistically significant association between purchase outcome and group
status at the 5% level; that is, the purchase rates of the two groups (50.5% for
test and 48.6% for control) are not significantly different.
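
The Pearson statistic can be checked by hand from the four cell counts of the crosstab. The sketch below uses the standard shortcut formula for a 2x2 table and the fact that, with 1 degree of freedom, the upper-tail probability can be written with the complementary error function. The function name is hypothetical; the counts are the ones reported in the crosstab.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: control (test = 0) and test (test = 1); columns: purchase = 0 and purchase = 1.
chi2, p_value = chi_square_2x2(1366, 1290, 11213, 11434)
print(f"chi2(1) = {chi2:.3f}, p = {p_value:.3f}")
```

Up to rounding, this should agree with the Pearson Chi-Square row of the output (3.501, .061).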

b) Logistic Regression

Test of the null hypothesis H0: Y=0.503 (Variable purchase):

Statistic            DF   Chi-square   Pr > Chi²
-2 Log(Likelihood)   1    3.501        0.061
Score                1    3.501        0.061
Wald                 1    3.499        0.061
 The likelihood ratio Chi Square value of 3.501 with a P value of 0.061 shows that the
model fitted to the data as it is does not predict the purchase outcome significantly
better than the intercept-only model at the 5% level

Model parameters (Variable purchase):

Source      Value    Standard   Wald         Pr >    Wald Lower    Wald Upper    Odds    Odds ratio    Odds ratio
                     error      Chi-Square   Chi²    bound (95%)   bound (95%)   ratio   Lower bound   Upper bound
                                                                                         (95%)         (95%)
Intercept   -0.057   0.039      2.174        0.140   -0.133        0.019
test-0      0.000    0.000
test-1      0.077    0.041      3.499        0.061   -0.004        0.157         1.080   0.996         1.170

 Group status is not significant in predicting the purchase outcome at the 5% level, as
the Wald Chi Square value is 3.499 and the P value (0.061) is more than 0.05. The odds
ratio of 1.080 for test group users in comparison to control group users has a 95%
interval (0.996 to 1.170) that includes 1
 This suggests that though the users are assigned to the two groups randomly, there is
some other variable such as activity bias affecting the predictive ability of the
experiment data
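
With a single binary predictor, the logistic regression estimates have a closed form, so the parameter table can be cross-checked directly from the crosstab counts. The sketch below is one way to do that: the intercept is the control-group log odds, the slope is the log odds ratio, and the standard error comes from the usual sum of reciprocal cell counts. The function name and the returned dictionary keys are hypothetical.

```python
import math


def logit_from_2x2(ctrl_no, ctrl_yes, test_no, test_yes, z=1.96):
    """Closed-form logistic fit of purchase on a binary group indicator.

    With one binary predictor the maximum-likelihood estimates are the
    control-group log odds (intercept) and the log odds ratio (slope).
    """
    intercept = math.log(ctrl_yes / ctrl_no)
    slope = math.log(test_yes / test_no) - intercept
    # Standard error of a log odds ratio from the four cell counts.
    se = math.sqrt(1 / ctrl_no + 1 / ctrl_yes + 1 / test_no + 1 / test_yes)
    lower, upper = slope - z * se, slope + z * se
    return {
        "intercept": intercept,
        "slope": slope,
        "se": se,
        "wald": (slope / se) ** 2,
        "odds_ratio": math.exp(slope),
        "or_ci": (math.exp(lower), math.exp(upper)),
    }


# Counts from the crosstab: control (no, yes), then test (no, yes).
fit = logit_from_2x2(1366, 1290, 11213, 11434)
for name, value in fit.items():
    print(name, value)
```

Up to rounding, the values should line up with the Intercept and test-1 rows of the parameter table.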

Q3 Specify and estimate a logistic regression model to examine the hypothesis that
consumers who received more Star Digital ad impressions are more likely to purchase than
those who received fewer impressions. Consider only the linear effect of impressions.

Answer3 To understand the activity bias, we modified the analysis by further splitting the
group wise data. As per the analysis, 1290 users from the control group purchased the product,
which shows that there is a natural tendency to buy the product even when the stimulus (company
advertisement) is absent. For this group the average number of impressions is 10.8.
When we checked the data for test group users, there is huge variation in the number of
impressions shown to the group (1 to 521).
Also, 85% of test group users (19283 out of 22648) have total impressions more than 10.8 (11
to 521).
Therefore, if we take the number of impressions as a proxy for the time spent online, we can
say that people who have more than 10.8 impressions may be purchasing anyway because they
spend more of their time online. Here we assume that time spent online, rather than the
advertisement, is affecting the purchase outcome. To control for this activity bias we divided
the test group into two separate groups.

Group1: Users who have more than 10.8 total impressions


Group2: Users who have less than 10.8 total impressions

We again ran logistic regression for these two data sets.

Group1:

Statistic            DF   Chi-square   Pr > Chi²
-2 Log(Likelihood)   1    8.962        0.003
Score                1    8.055        0.005
Wald                 1    7.902        0.005
 The likelihood ratio value and its significance level show that the model fits the data

Predictor Equation:
Pred (purchase) = 1 / (1 + exp(-(1.10467415490076+3.14761711405013E-03*total)))
 P value shows significance of the predictor variable (total impressions)
 Coefficient is 0.003 (3.14761711405013E-03)

Now the same analysis is done for the other set of data, group2:

Statistic            DF   Chi-square   Pr > Chi²
-2 Log(Likelihood)   1    417.832      < 0.0001
Score                1    415.439      < 0.0001
Wald                 1    401.191      < 0.0001
 The log likelihood value and its significance level show that the model fits the data

Predictor Equation:
Pred(purchase) = 1 / (1 + exp(-(-0.489360889497943+0.122690961752537*total)))
 P value shows significance of the predictor variable (total impressions)
 Coefficient is 0.123 (0.122690961752537)

So the coefficient for group 2 (test group users with fewer than 10.8 impressions) is about
39 times the coefficient for group 1 (test group users with more than 10.8 impressions):
0.1227 / 0.00315 ≈ 39. This shows that the odds of purchase increase faster with each
additional impression for users who have been shown fewer impressions, as their
coefficient is larger, or we can say that users who have actually spent more time online
are barely affected by a change in the number of impressions.
Thus model prediction improved when we removed the activity bias from our analysis
for the experiment data.
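
To make the comparison of the two fitted models concrete, the sketch below evaluates both predictor equations and converts each slope into an odds multiplier per additional impression, which is exp(coefficient). The coefficients are copied from the two predictor equations; the function and variable names are hypothetical. Note that dividing the rounded values 0.123 and 0.003 gives 41, while the unrounded coefficients give a ratio closer to 39.

```python
import math


def predicted_purchase(intercept, slope, total):
    """Logistic prediction: 1 / (1 + exp(-(intercept + slope * total)))."""
    return 1 / (1 + math.exp(-(intercept + slope * total)))


# (intercept, slope) pairs copied from the two predictor equations.
GROUP1 = (1.10467415490076, 3.14761711405013e-03)  # more than 10.8 impressions
GROUP2 = (-0.489360889497943, 0.122690961752537)   # fewer than 10.8 impressions

# The odds of purchase are multiplied by exp(slope) for each extra impression.
odds_mult_1 = math.exp(GROUP1[1])
odds_mult_2 = math.exp(GROUP2[1])
slope_ratio = GROUP2[1] / GROUP1[1]

print(f"group 1: odds x {odds_mult_1:.4f} per extra impression")
print(f"group 2: odds x {odds_mult_2:.4f} per extra impression")
print(f"ratio of slopes (group 2 / group 1): {slope_ratio:.1f}")
print(f"group 2 prediction at 1 impression: {predicted_purchase(*GROUP2, 1):.3f}")
print(f"group 1 prediction at 11 impressions: {predicted_purchase(*GROUP1, 11):.3f}")
```

Reading the slopes as odds multipliers keeps the comparison on an interpretable scale: roughly a 13% increase in odds per impression for group 2, against well under 1% for group 1.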
