Marketing Analytics: No Document Allowed


Marketing Analytics

Stéphanie Aerts
Master's degree in Management Science – 1st year

June 2018 – Series A - Theoretical part (1/2): 1h


No document allowed

STUDENT ID: _________________ NAME: _______________________

1. (5 points) Multiple-choice questions. Correct answer: 1 point. Incorrect answer: −0.25 point.

1/ You consider several techniques to make a prediction. Based on which dataset will you decide which method is the
best one?
a. Validation
b. Training
c. Comparison
d. Test
2/ Why do we prune a decision tree?
a. To predict new cases
b. To try to avoid over-fitting
c. To compute the confusion matrix
d. Something else
3/ What is the name of the dataset used to avoid overfitting?
a. Validation
b. Test
c. Training
d. Something else
4/ Which method can be used to predict the risk of a customer (high risk/low risk)?
a. Linear Regression
b. Market Basket Analysis
c. Logistic regression
d. Several of the previous ones.
5/ What is the formula for entropy?
a. $-\sum_i \log_2 p(i; t)$
b. $1 - \sum_i p(i; t)^2$
c. $-\sum_i p(i; t) \log_2 p(i; t)$
d. Something else

2. (3 points) Define overfitting:

3. (4 points) If we perform a Market Basket Analysis on the following transactional dataset, what
will be the lift, support and confidence of the rule Butter → Milk?
Detail your calculation and briefly explain the meaning of each of these three measures.

Basket ID   Product
145         Milk
153         Milk
153         Butter
162         Bread
165         Eggs
235         Milk
235         Butter
250         Butter
250         Cucumber
265         Tomatoes
269         Milk
271         Butter
273         Eggs

Support:

Confidence:

Lift:
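
As a study aid, the three measures can be sketched in Python. This is a minimal sketch, assuming the usual definitions: support is the share of baskets containing both items, confidence is the share of baskets with the antecedent that also contain the consequent, and lift is the confidence divided by the share of baskets containing the consequent. The function name `rule_measures` and the dictionary layout are illustrative choices, not part of the exam.

```python
# Transactions from the question, grouped by basket ID.
baskets = {
    145: {"Milk"},
    153: {"Milk", "Butter"},
    162: {"Bread"},
    165: {"Eggs"},
    235: {"Milk", "Butter"},
    250: {"Butter", "Cucumber"},
    265: {"Tomatoes"},
    269: {"Milk"},
    271: {"Butter"},
    273: {"Eggs"},
}


def rule_measures(baskets, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    Assumes at least one basket contains the antecedent and at least one
    contains the consequent, otherwise a division by zero occurs.
    """
    n = len(baskets)
    n_antecedent = sum(1 for items in baskets.values() if antecedent in items)
    n_consequent = sum(1 for items in baskets.values() if consequent in items)
    n_both = sum(
        1
        for items in baskets.values()
        if antecedent in items and consequent in items
    )
    support = n_both / n                   # share of baskets with both items
    confidence = n_both / n_antecedent     # estimate of P(consequent | antecedent)
    lift = confidence / (n_consequent / n) # confidence relative to the baseline
    return support, confidence, lift


support, confidence, lift = rule_measures(baskets, "Butter", "Milk")
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 would suggest that the two products appear together more often than expected if they were bought independently; a lift below 1 would suggest the opposite.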

4. (8 points) A renowned retailing corporation would like to predict the weekly sales of its different
stores on the basis of the following dataset:

Store ID   Nb_store   TypeStore   Fuel_Price   Unemployment   Sales (target)
2356       2          0           1.70         12.0           10 230
2569       1          1           1.50         8.0            15 256

The regression coefficients have been computed with SAS EM and are given in the table below:

Input          Meaning                                        Regression coefficient
Nb_store       number of stores in the same zip-code          $\beta_1 = -30.0$
TypeStore      small (0) or big (1)                           $\beta_2 = 20.0$
Fuel_Price     average fuel price in the region (in euros)    $\beta_3 = 10.0$
Unemployment   unemployment rate in the region (in %)         $\beta_4 = -20.0$
Intercept                                                     $\beta_0 = 10\,200$

i) What is the predicted sales amount for those two stores? (Detail your calculation)

Store 2356 :

Store 2569:

ii) Give the interpretation of the regression coefficients associated with the variables TypeStore
and Fuel_Price. (Just provide a sentence explaining the effect of each variable on the
target.)

Fuel_Price :

TypeStore :

iii) If we assess the performance of the model on the basis of this dataset,

a. Which performance measure should be used?


b. Detail the computation of this measure on the above dataset (Just give the formula,
not the result).

iv) Suppose we recode the target variable Sales as a nominal variable taking one of
two possible values: high amount (coded as 1) or low amount (coded as 0). Using a logistic
regression to predict this new target variable, the regression coefficient obtained
for the variable Fuel_Price is:

Input        Meaning                                        Regression coefficient
Fuel_Price   average fuel price in the region (in euros)    $e^{\beta_3} = 1.10$

Provide a sentence explaining the effect of this variable on the new target.
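
As a study aid, the arithmetic behind question 4 can be sketched in Python. This is a minimal sketch under stated assumptions: the sales figures "10 230" and "15 256" are read as 10230 and 15256 (space as thousands separator), the average squared error is used as one common performance measure for an interval target (the exam does not say which measure it expects), and $e^{\beta_3}$ is read as an odds ratio. The helper names `predict_sales`, `average_squared_error` and `odds_multiplier` are illustrative and do not come from SAS EM.

```python
# Coefficients as given in the question's table.
INTERCEPT = 10200.0
COEFFICIENTS = {
    "Nb_store": -30.0,
    "TypeStore": 20.0,
    "Fuel_Price": 10.0,
    "Unemployment": -20.0,
}

# The two stores from the dataset; "Sales" is the observed target
# (assumption: the space in "10 230" is a thousands separator).
stores = {
    2356: {"Nb_store": 2, "TypeStore": 0, "Fuel_Price": 1.70,
           "Unemployment": 12.0, "Sales": 10230.0},
    2569: {"Nb_store": 1, "TypeStore": 1, "Fuel_Price": 1.50,
           "Unemployment": 8.0, "Sales": 15256.0},
}


def predict_sales(store):
    """Linear prediction: intercept plus the sum of coefficient * input value."""
    return INTERCEPT + sum(beta * store[name] for name, beta in COEFFICIENTS.items())


def average_squared_error(stores):
    """Mean of the squared differences between observed and predicted sales."""
    squared_errors = [(s["Sales"] - predict_sales(s)) ** 2 for s in stores.values()]
    return sum(squared_errors) / len(squared_errors)


def odds_multiplier(odds_ratio, delta):
    """Factor applied to the odds of the event when the input rises by delta units."""
    return odds_ratio ** delta


for store_id, store in stores.items():
    print(store_id, predict_sales(store))
print("ASE:", average_squared_error(stores))
# With an odds ratio of 1.10, a one-euro rise in Fuel_Price multiplies
# the odds of a high sales amount by 1.10, other inputs held constant.
print("odds factor for +1 euro:", odds_multiplier(1.10, 1))
```

In the linear model each coefficient is an additive change in predicted sales per unit of the input, whereas in the logistic model the exponentiated coefficient acts multiplicatively on the odds, which is why the two interpretations are worded differently.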
