Final Exam Data Analysis Process 2020-21


Master in International Management and Sustainability

AY 2021-2022

THE DATA ANALYSIS PROCESS:
TOOLS FOR BETTER DECISION MAKING
Prof. Laura Trinchera
6 May 2021

REMEMBER:
1) Give your (developed) answers directly on these sheets
2) Provide methodological justifications for all your answers
3) Once finished save your file as a .pdf and upload it on Moodle
Good luck!!

Data Presentation

The data used here come from a study by Touzani and Temessek (2004)¹ aimed at
measuring the effect of several cognitive and affective antecedents of brand loyalty
for shampoo.

An adapted version of the questionnaire used by the authors is presented in Appendix 1.
The items marked with a * are reversed items whose answers were reversed before being
included in the analysis (pay attention to the interpretation!). The questionnaire is
composed of three parts. In the first part, the respondents gave their general opinion
on shampoo brands. In the second part, each respondent answered about the brand they
use most. In the third part, socio-demographic variables were collected.

400 customers were selected by quota sampling, controlling for age and
socio-professional category.

Each variable has been measured on a five-point Likert scale: 1 means Strongly
disagree and 5 Strongly agree.

The authors wished to verify:

H1: How many factors (and which ones) explain the cognitive and affective
antecedents of brand loyalty.

¹ Touzani Mourad, Temessek Azza (2004). Une approche intégrative pour l'étude des antécédents de la
fidélité à la marque, 1-19. In Colloque de l’Association Tunisienne du Marketing.

Q1. (1pts):
One of the goals of Principal Component Analysis is to reduce the original dimension
of the data. How would you choose the number of principal components to retain?

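The goal is to retain the few principal components that keep most of the information
(inertia) of the original data set. Common criteria are the Kaiser rule (keep the
components with an eigenvalue greater than 1 when the variables are standardized), the
elbow of the scree plot, and a target share of cumulative explained variability.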

Component  Eigenvalue  Variability (%)  Cumulative (%)
F1         4.40        25.89            25.89
F2         2.08        XXX              38.11
F3         1.23        7.25             45.37
F4         1.09        6.40             51.77
F5         1.01        5.97             57.73
F6         0.92        5.42             63.15
F7         0.87        5.14             68.29
F8         0.74        4.33             72.62
F9         0.71        4.16             76.78
F10        0.64        3.79             80.57
F11        0.57        3.35             83.92
F12        0.55        3.22             87.14
F13        0.50        2.95             90.09
F14        0.47        2.79             92.88
F15        0.43        2.52             95.40
F16        0.39        2.32             97.72
F17        0.39        2.28             100

Table 1: Eigenvalues

Q2. (1pts):
According to the results in Table 1, how many principal components would you think
can be detected in the collected data (apply the Kaiser rule)?

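Applying the Kaiser rule, 5 principal components can be retained: F1 to F5 are the only
components with an eigenvalue greater than 1 (4.40, 2.08, 1.23, 1.09 and 1.01). Together
they explain 57.73% of the total variability. The 100% cumulative variability reached at
F17 only reflects that there are 17 components in total, one per variable; it is not a
retention criterion.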
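
As a quick check, the Kaiser rule can be applied to the eigenvalues of Table 1 in a few lines. This is only a minimal sketch: the function name `kaiser_rule` is an illustrative choice, and the threshold of 1 assumes the PCA was run on standardized variables.

```python
# Eigenvalues of F1..F17, copied from Table 1.
eigenvalues = [4.40, 2.08, 1.23, 1.09, 1.01, 0.92, 0.87, 0.74, 0.71,
               0.64, 0.57, 0.55, 0.50, 0.47, 0.43, 0.39, 0.39]


def kaiser_rule(values, threshold=1.0):
    """Return the 1-based indices of components whose eigenvalue exceeds the threshold."""
    return [i for i, value in enumerate(values, start=1) if value > threshold]


retained = kaiser_rule(eigenvalues)
print(retained)  # [1, 2, 3, 4, 5]
```

Note that F5 (1.01) passes the threshold only narrowly, so the rule is best read together with the other criteria.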

Q3. (1pts):
What is the percentage of total inertia explained by the first 4 principal components?

The percentage of total inertia explained by the first 4 principal components is
51.77%. This is the cumulative variability at F4 in Table 1.
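
The cumulative figure can also be recomputed from the eigenvalues. The sketch below assumes a total inertia of 17 (one unit of variance per standardized variable); `cumulative_share` is an illustrative helper name, and the result differs slightly from Table 1 because the eigenvalues there are rounded to two decimals.

```python
# Eigenvalues of F1..F17, copied from Table 1.
eigenvalues = [4.40, 2.08, 1.23, 1.09, 1.01, 0.92, 0.87, 0.74, 0.71,
               0.64, 0.57, 0.55, 0.50, 0.47, 0.43, 0.39, 0.39]
n_variables = 17  # assumed total inertia for standardized variables


def cumulative_share(values, k, total):
    """Percentage of the total inertia carried by the first k components."""
    return 100 * sum(values[:k]) / total


share_first_4 = cumulative_share(eigenvalues, 4, n_variables)
print(round(share_first_4, 1))
```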

Q4. (1pts):
Table 1 is missing a value. Fill in the missing value (i.e. the XXX) by answering the
following question: what is the portion of the inertia explained by the second principal
component?

The missing value is 12.22: the second principal component explains 12.22% of the
inertia. It is the difference between the cumulative variability at F2 and at F1:
38.11 - 25.89 = 12.22. Equivalently, since the total inertia equals the number of
variables (17), we can divide the eigenvalue of F2 by 17: 2.08 / 17 ≈ 12.2%. The small
difference in the second decimal comes from the rounding of the eigenvalue in Table 1.
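
Both routes to the missing value can be verified numerically. The variable names below are illustrative, the inputs are the rounded values printed in Table 1, and the total inertia of 17 is an assumption that holds for standardized variables.

```python
# Values read from Table 1.
cumulative_f1 = 25.89
cumulative_f2 = 38.11
eigenvalue_f2 = 2.08
n_variables = 17  # assumed total inertia

# Route 1: difference between successive cumulative percentages.
by_difference = cumulative_f2 - cumulative_f1

# Route 2: eigenvalue divided by the total inertia.
by_eigenvalue = 100 * eigenvalue_f2 / n_variables

print(round(by_difference, 2), round(by_eigenvalue, 2))
```

The two results agree to within rounding error, which is expected because the eigenvalue is printed with only two decimals.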

Q5. (1pts):
What is the amount of TOTAL Inertia? Justify your answer.

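The total inertia is 17. In a PCA on standardized variables each variable contributes a
variance of 1, so the total inertia equals the number of variables (17). It is also the
sum of the eigenvalues in Table 1 (about 16.99 because of rounding), and the variability
percentages are consistent with it (4.40 / 17 ≈ 25.9%).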
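
A short check of this claim, using only the rounded eigenvalues printed in Table 1 (so the sum is expected to be close to, not exactly, 17):

```python
import math

# Eigenvalues of F1..F17, copied from Table 1.
eigenvalues = [4.40, 2.08, 1.23, 1.09, 1.01, 0.92, 0.87, 0.74, 0.71,
               0.64, 0.57, 0.55, 0.50, 0.47, 0.43, 0.39, 0.39]

# Total inertia is the sum of all eigenvalues.
total_inertia = math.fsum(eigenvalues)
print(len(eigenvalues), round(total_inertia, 2))
```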



Q6. (3pts):
Briefly interpret the first two principal components in this example. That is, what
aspect of the original variables is captured by the first principal component? By the
second? Pay attention to the values in Tables 2 and 3 on the following page for
interpreting the principal components.

Figure 1: Correlation Circle

Here, axes F1 and F2 together carry 38.11% of the information in the original data set.

The first principal component mainly captures the variables c1, c3, e1, e7 and sat1.
These variables contribute most to the construction of F1: 12.79% for c1, 14.24% for
c3, 12.58% for e1, 14.98% for e7 and 14.20% for sat1. Their squared cosines on F1
(0.44, 0.49, 0.43, 0.51 and 0.49) are clearly higher than those of the other variables,
so F1 represents them well. According to the questionnaire in Appendix 1, these items
concern security and confidence in the most-used brand (c1, c3), attachment to it
(e1, e7) and satisfaction with it (sat1), so F1 can be read as the respondent's
relationship with their usual brand. These variables are positively related to one
another, as can be observed on the correlation circle, where their red lines (vectors)
lie close to each other.

The second principal component, on the other hand, mainly captures the variables s1,
s2, s3, s4, d1 and d3. For instance, s1 contributes 12.33% to F2. Their squared cosines
on F2 are 0.38, 0.45, 0.53, 0.46, 0.41 and 0.38 respectively, clearly higher than those
of the other variables, so F2 represents them well. These items come from the first
part of the questionnaire and concern the importance given to the brand when buying
(s1 to s4) and the perceived differences between brands (d1, d3). Since s3, s4 and d1
are reversed items, high scores on them mean that the brand matters or that differences
are perceived.

However, it is worth pointing out that these two principal components represent only
38.11% of the information in the original data set, which is fairly low. Furthermore,
the red lines on the correlation circle do not reach the circle, which suggests that
the variables are not well enough represented by F1 and F2 alone. It would be useful to
examine further principal components, such as F3.

Table 2: Contribution of the variables (%)

Variable  F1     F2
s1        2.58   12.33
s2        1.86   14.61
s3        0.04   17.28
s4        0.08   15.05
s5        2.70   3.44
d1        0.13   13.35
d2        0.95   6.28
d3        0.02   12.40
c1        12.79  1.45
c3        14.24  1.51
c4        8.27   0.20
e1        12.58  0.07
e2        9.73   0.27
e7        14.98  0.00
sat1      14.20  0.77
sat2      0.35   0.50
sat3      4.51   0.47

Table 3: Squared cosines of the variables

Variable  F1    F2
s1        0.09  0.38
s2        0.06  0.45
s3        0.00  0.53
s4        0.00  0.46
s5        0.09  0.11
d1        0.00  0.41
d2        0.03  0.19
d3        0.00  0.38
c1        0.44  0.04
c3        0.49  0.05
c4        0.28  0.01
e1        0.43  0.00
e2        0.33  0.01
e7        0.51  0.00
sat1      0.49  0.02
sat2      0.01  0.02
sat3      0.15  0.01
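
Tables 2 and 3 are linked: the contribution of a variable to an axis is its squared cosine divided by the sum of the squared cosines on that axis. The sketch below recomputes contributions from the rounded values of Table 3 and lists the variables that are best represented on each axis. The function names and the 0.35 cut-off are illustrative assumptions, not part of the exam, and the results match Table 2 only approximately because of rounding. The column sums (about 3.40 and 3.07) need not equal the eigenvalues of Table 1, for instance if the tabulated factors were rotated; the source does not say.

```python
# Squared cosines from Table 3, stored as (F1, F2) for each variable.
cos2 = {
    "s1": (0.09, 0.38), "s2": (0.06, 0.45), "s3": (0.00, 0.53),
    "s4": (0.00, 0.46), "s5": (0.09, 0.11), "d1": (0.00, 0.41),
    "d2": (0.03, 0.19), "d3": (0.00, 0.38), "c1": (0.44, 0.04),
    "c3": (0.49, 0.05), "c4": (0.28, 0.01), "e1": (0.43, 0.00),
    "e2": (0.33, 0.01), "e7": (0.51, 0.00), "sat1": (0.49, 0.02),
    "sat2": (0.01, 0.02), "sat3": (0.15, 0.01),
}


def contributions(cos2_table, axis):
    """Contribution (%) of each variable to one axis: cos2 / column sum * 100."""
    total = sum(values[axis] for values in cos2_table.values())
    return {name: 100 * values[axis] / total
            for name, values in cos2_table.items()}


def well_represented(cos2_table, axis, threshold=0.35):
    """Variables whose squared cosine on the axis reaches the chosen threshold."""
    return [name for name, values in cos2_table.items()
            if values[axis] >= threshold]


contrib_f1 = contributions(cos2, 0)
contrib_f2 = contributions(cos2, 1)
print(round(contrib_f1["c1"], 1), round(contrib_f2["s1"], 1))
print(well_represented(cos2, 0))  # ['c1', 'c3', 'e1', 'e7', 'sat1']
print(well_represented(cos2, 1))  # ['s1', 's2', 's3', 's4', 'd1', 'd3']
```

With this cut-off the two lists reproduce the groups of variables used to interpret F1 and F2.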

Q7. (2pts):
We applied a hierarchical cluster analysis (agglomerative) on the survey data. The
obtained dendrogram is reported in Figure 2.

How many classes would you identify according to the results in Figure 2? Cut the
dendrogram according to your answer.

If we cut the dendrogram at the top, just below the last merge, we can identify 2 classes.
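
To illustrate how a dendrogram cut translates into classes, here is a minimal pure-Python sketch of agglomerative clustering. Everything in it is an assumption made for illustration: the survey data and Figure 2 are not reproduced here, the exam does not state the linkage method (average linkage is used below), and `toy_scores` is a hypothetical one-dimensional data set. The tree is cut where the jump between successive merge heights is largest.

```python
from itertools import combinations


def average_linkage(points):
    """Agglomerative clustering of 1-D points with average linkage.

    Returns the merge heights and the partition obtained after each merge.
    """
    clusters = [[p] for p in points]
    history = [[list(c) for c in clusters]]
    heights = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest average pairwise distance.
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            dist = sum(abs(a - b) for a in clusters[i] for b in clusters[j])
            dist /= len(clusters[i]) * len(clusters[j])
            if best is None or dist < best[0]:
                best = (dist, i, j)
        dist, i, j = best
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
        heights.append(dist)
        history.append([list(c) for c in clusters])
    return heights, history


def cut_at_largest_gap(heights, history):
    """Return the partition just before the largest jump in merge height."""
    gaps = [heights[k + 1] - heights[k] for k in range(len(heights) - 1)]
    k = max(range(len(gaps)), key=gaps.__getitem__)
    return history[k + 1]  # partition obtained after k + 1 merges


toy_scores = [1.0, 1.2, 1.1, 4.8, 5.0, 4.9]  # hypothetical data
heights, history = average_linkage(toy_scores)
classes = cut_at_largest_gap(heights, history)
print(len(classes), sorted(sorted(c) for c in classes))
```

On this toy data the largest jump occurs at the final merge, so the cut yields two classes, which mirrors cutting a dendrogram just below its top node.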

END OF EXAMINATION PAPER

Thank you for attending this course



Appendix 1: Questionnaire

First part

Express your level of agreement or disagreement with the following statements about shampoos.
For each question, cross the box that best represents your feelings.

Please answer the following questions according to this scale:

1. Strongly disagree
2. Disagree
3. Neither agree nor disagree
4. Agree
5. Strongly agree

Answer columns: Strongly disagree (1), Disagree (2), Neither agree nor disagree (3), Agree (4), Strongly agree (5)

S1    When I buy a shampoo I care for the brand  1 2 3 4 5
D1*   I don’t see any differences between the main brands of shampoo  1 2 3 4 5
S2    When buying a shampoo I take the brand into account  1 2 3 4 5
D2    For me there are big differences between the main brands of shampoo  1 2 3 4 5
S3*   I don’t choose a shampoo according to the brand  1 2 3 4 5
D3    The only difference between the main brands of shampoo is the price  1 2 3 4 5
S4*   For a shampoo the brand is not very important  1 2 3 4 5
S5    When I buy shampoos I prefer well-known brands  1 2 3 4 5



Second part

What brand of shampoo do you buy most often?

Answer:...................................................................

When answering the following questions please refer to the brand indicated above.

Answer columns: Strongly disagree (1), Disagree (2), Neither agree nor disagree (3), Agree (4), Strongly agree (5)

C1    The products of this brand bring me security  1 2 3 4 5
E1    When I cannot find this brand in my local store I prefer to wait before buying  1 2 3 4 5
Sat1  I am satisfied with this brand of shampoo  1 2 3 4 5
C3    I have confidence in the quality of products of this brand  1 2 3 4 5
Sat2  The packaging of this brand of shampoo is nice  1 2 3 4 5
E2*   When I cannot find this brand in my local store, I take another  1 2 3 4 5
Sat3  For this brand, shampoo smells good  1 2 3 4 5
E’7   For shampoos, one can say that I am attached to this brand  1 2 3 4 5
C4    I think this brand is continually seeking to improve its response to consumer needs  1 2 3 4 5

Do you intend to buy this brand at your next purchase?

YES NO

If "No", which brand do you intend to buy?


Answer:..................................................................

Some questions on your buying behaviours


A/ Order chronologically the last three shampoo brands you bought (it may be the same brand)

1 ..............................................
2................................................
3................................................
B/ Are you:

1. Loyal to a unique brand
2. Loyal to several brands
3. Indifferent to brands

Demographic questions

What is your age?

Under 25 years old
25 - 34 years old
35 - 44 years old
45 - 54 years old
55 - 60 years old
60 years or older

What is your total monthly household income?

Less than 100 TND
100 to 400 TND
400 to 800 TND
800 to 1500 TND
More than 1500 TND

What is your marital status?

Single, never married
Married or domestic partnership without children
Married or domestic partnership with children
Divorced
Widowed
