Final Exam Data Analysis Process 2020-21

Master in International Management and Sustainability
AY 2021-2022
THE DATA ANALYSIS PROCESS:
TOOLS FOR BETTER DECISION MAKING
Prof. Laura Trinchera
6 May 2021
REMEMBER:
1) Give your (developed) answers directly on these sheets
2) Provide methodological justifications for all your answers
3) Once finished, save your file as a .pdf and upload it to Moodle
Good luck!!
Data Presentation
The data used here come from a study by Touzani and Temessek (2004)¹ aimed at
measuring the effect of several cognitive and affective antecedents of brand loyalty
for shampoo.
An adapted version of the questionnaire used by the authors is presented in
Appendix 1. The items marked with a * are reversed items whose answers were
reversed before being included in the analysis (pay attention to the
interpretation!). The questionnaire is composed of three parts. In the first
part, the respondents gave their general opinion on shampoo brands. In
the second part, each respondent answered about the brand they use most. In the
third part, socio-demographic variables were collected.
400 customers were selected by quota sampling, controlling for age and
socio-professional category.
Each variable has been measured on a five-point Likert scale: 1 means Strongly
disagree and 5 Strongly agree.
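The reversal applied to the starred items can be sketched as follows. This is a minimal illustration, assuming the usual convention that a reversed score on a five-point scale is 6 minus the raw score; the function name and the sample answers are hypothetical and do not come from the study.

```python
# Reverse-code a five-point Likert item: 1 <-> 5, 2 <-> 4, 3 stays 3.
SCALE_MIN, SCALE_MAX = 1, 5


def reverse_item(score: int) -> int:
    """Return the reversed score of a Likert answer on the 1..5 scale."""
    if not SCALE_MIN <= score <= SCALE_MAX:
        raise ValueError(f"score must be between {SCALE_MIN} and {SCALE_MAX}")
    return SCALE_MIN + SCALE_MAX - score


# Hypothetical raw answers to a reversed item such as D1*.
raw_d1 = [1, 2, 3, 4, 5]
reversed_d1 = [reverse_item(x) for x in raw_d1]
print(reversed_d1)  # [5, 4, 3, 2, 1]
```

After reversal, a high score on a starred item points in the same direction as a high score on the non-reversed items of the same group, which matters when reading the sign of a correlation.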
The authors wished to verify:
H1: How many factors (and which ones) explain the cognitive and affective
antecedents of brand loyalty.
¹ Touzani Mourad, Temessek Azza (2004) Une approche intégrative pour l'étude des antécédents de la
fidélité à la marque, 1-19. In Colloque de l’Association Tunisienne du Marketing.
Q1. (1pts):
One of the goals of Principal Component Analysis is to reduce the original dimension
of the data. How would you choose the number of principal components to retain?
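The goal is to retain the few principal components that preserve most of the
information (variance) in the original data set. Common criteria are the Kaiser rule
(keep the components with an eigenvalue greater than 1), the scree plot (keep the
components before the elbow) and a target share of cumulative explained variability.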
                     F1     F2     F3     F4     F5     F6     F7     F8     F9    F10    F11    F12    F13    F14    F15    F16    F17
Eigenvalue         4.40   2.08   1.23   1.09   1.01   0.92   0.87   0.74   0.71   0.64   0.57   0.55   0.50   0.47   0.43   0.39   0.39
Variability (%)   25.89    XXX   7.25   6.40   5.97   5.42   5.14   4.33   4.16   3.79   3.35   3.22   2.95   2.79   2.52   2.32   2.28
Cumulative (%)    25.89  38.11  45.37  51.77  57.73  63.15  68.29  72.62  76.78  80.57  83.92  87.14  90.09  92.88  95.40  97.72 100.00
Table 1: Eigenvalues
Q2. (1pts):
According to the results in Table 1, how many principal components would you think
can be detected in the collected data (apply the Kaiser rule)?
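According to the Kaiser rule, we retain the principal components whose eigenvalue is
greater than 1. In Table 1 this is the case for F1 to F5 (4.40, 2.08, 1.23, 1.09 and
1.01), so 5 principal components can be detected. Together they account for 57.73%
of the total variability. Note that F5 (1.01) is only marginally above the threshold.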
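Applying the Kaiser rule to the eigenvalues of Table 1 can be sketched as below. The eigenvalues are copied from the table; the variable names are illustrative only.

```python
# Eigenvalues of F1..F17 as reported in Table 1.
eigenvalues = [4.40, 2.08, 1.23, 1.09, 1.01, 0.92, 0.87, 0.74, 0.71,
               0.64, 0.57, 0.55, 0.50, 0.47, 0.43, 0.39, 0.39]

# Kaiser rule: keep the components whose eigenvalue is greater than 1.
retained = [f"F{i}" for i, ev in enumerate(eigenvalues, start=1) if ev > 1.0]
print(len(retained), retained)  # 5 ['F1', 'F2', 'F3', 'F4', 'F5']
```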
Q3. (1pts):
What is the percentage of total inertia explained by the first 4 principal components?
The percentage of total inertia explained by the first 4 principal components is
51.77%. This is the cumulative variability at F4 in Table 1.
Q4. (1pts):
Table 1 is missing a value. Fill in the missing value (i.e. the XXX) by answering the
following question: what is the portion of the inertia explained by the second principal
component?
The missing value is 12.22. The portion of the inertia explained by the second
principal component is 12.22%. The most direct way to find it is to subtract the
cumulative variability at F1 from the cumulative variability at F2:
38.11 - 25.89 = 12.22. Alternatively, since the total inertia equals the number of
variables (17), we can divide the eigenvalue by 17: 2.08 / 17 ≈ 12.24%. The small
difference comes from the eigenvalue being rounded to two decimals in Table 1.
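Both routes to the missing value can be checked with a few lines of arithmetic. This is only a sketch using the rounded figures printed in Table 1, so the two results differ slightly in the second decimal.

```python
# Figures read from Table 1 (already rounded to two decimals).
eigenvalue_f2 = 2.08
n_variables = 17            # total inertia of a PCA on 17 standardized variables
cumulative_f1 = 25.89
cumulative_f2 = 38.11

# Route 1: share of inertia = eigenvalue / total inertia.
share_from_eigenvalue = 100 * eigenvalue_f2 / n_variables

# Route 2: difference between two consecutive cumulative percentages.
share_from_cumulative = cumulative_f2 - cumulative_f1

print(round(share_from_eigenvalue, 2))  # 12.24
print(round(share_from_cumulative, 2))  # 12.22
```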
Q5. (1pts):
What is the amount of TOTAL Inertia? Justify your answer.
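It is 17. The PCA is run on standardized variables, so each of the 17 variables has
a variance of 1 and the total inertia equals the number of variables. It is also the
sum of the eigenvalues (16.99 in Table 1, because of rounding).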
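The link between total inertia and the eigenvalues can be verified on the values of Table 1. This is a quick check rather than a proof; the printed sum falls slightly short of 17 only because the table rounds each eigenvalue to two decimals.

```python
# Eigenvalues of F1..F17 as reported in Table 1.
eigenvalues = [4.40, 2.08, 1.23, 1.09, 1.01, 0.92, 0.87, 0.74, 0.71,
               0.64, 0.57, 0.55, 0.50, 0.47, 0.43, 0.39, 0.39]

total_inertia = sum(eigenvalues)
print(round(total_inertia, 2))  # 16.99
print(round(total_inertia))     # 17, the number of variables
```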

Q6. (3pts):
Briefly interpret the first two principal components in this example. That is, what
aspect of the original variables is captured by the first principal component? By the
second? Pay attention to the values in Tables 2 and 3 on the following page for
interpreting the principal components.
Figure 1: Correlation Circle
Axes F1 and F2 together carry 38.11% of the information in the original data set.
The first principal component mainly captures the variables c1, c3, e1, e7 and sat1,
which contribute 12.79%, 14.24%, 12.58%, 14.98% and 14.20% to F1 respectively
(Table 2). Their squared cosines on F1 (0.44, 0.49, 0.43, 0.51 and 0.49, Table 3) are
the highest of all the variables, so F1 represents them best. These variables are
positively correlated with one another, as can be observed on the correlation
circle: their vectors point in the same direction and lie close to each other. All
of them come from the second part of the questionnaire (confidence in, attachment to
and satisfaction with the most used brand), so F1 can be read as the strength of the
respondent's relationship with his or her usual brand.
On the other hand, the second principal component mainly captures the variables s1,
s2, s3, s4, d1 and d3, which contribute 12.33%, 14.61%, 17.28%, 15.05%, 13.35% and
12.40% to F2 respectively. Their squared cosines on F2 are 0.38, 0.45, 0.53, 0.46,
0.41 and 0.38, the highest of all the variables, so F2 represents them best. These
items come from the first part of the questionnaire, so F2 can be read as the
importance given to brands of shampoo in general and the perceived differences
between them (keeping in mind that s3, s4 and d1 were reversed).
However, these two principal components represent only 38.11% of the information in
the original data set, which is low. Furthermore, the vectors in the correlation
circle do not reach the circle, meaning that the variables are not well represented
by F1 and F2 alone. It might be useful to look at further principal components, such
as F3.
Table 2: Contribution of the variables (%)
          F1      F2
s1      2.58   12.33
s2      1.86   14.61
s3      0.04   17.28
s4      0.08   15.05
s5      2.70    3.44
d1      0.13   13.35
d2      0.95    6.28
d3      0.02   12.40
c1     12.79    1.45
c3     14.24    1.51
c4      8.27    0.20
e1     12.58    0.07
e2      9.73    0.27
e7     14.98    0.00
sat1   14.20    0.77
sat2    0.35    0.50
sat3    4.51    0.47
Table 3: Squared cosines of the variables
          F1      F2
s1      0.09    0.38
s2      0.06    0.45
s3      0.00    0.53
s4      0.00    0.46
s5      0.09    0.11
d1      0.00    0.41
d2      0.03    0.19
d3      0.00    0.38
c1      0.44    0.04
c3      0.49    0.05
c4      0.28    0.01
e1      0.43    0.00
e2      0.33    0.01
e7      0.51    0.00
sat1    0.49    0.02
sat2    0.01    0.02
sat3    0.15    0.01
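One way to screen the contributions of Table 2 is to keep the variables that contribute more than the average, i.e. more than 100 / 17 ≈ 5.88%. This threshold is a common rule of thumb and is an assumption of this sketch, not a rule stated in the exam; the values are copied from Table 2.

```python
# Contributions (%) of each variable to F1 and F2, copied from Table 2.
contributions = {
    "s1": (2.58, 12.33), "s2": (1.86, 14.61), "s3": (0.04, 17.28),
    "s4": (0.08, 15.05), "s5": (2.70, 3.44), "d1": (0.13, 13.35),
    "d2": (0.95, 6.28), "d3": (0.02, 12.40), "c1": (12.79, 1.45),
    "c3": (14.24, 1.51), "c4": (8.27, 0.20), "e1": (12.58, 0.07),
    "e2": (9.73, 0.27), "e7": (14.98, 0.00), "sat1": (14.20, 0.77),
    "sat2": (0.35, 0.50), "sat3": (4.51, 0.47),
}

# Assumed rule of thumb: a variable matters for an axis when its
# contribution exceeds the average contribution, 100 / number of variables.
threshold = 100 / len(contributions)

main_f1 = [v for v, (f1, _) in contributions.items() if f1 > threshold]
main_f2 = [v for v, (_, f2) in contributions.items() if f2 > threshold]

print(main_f1)  # ['c1', 'c3', 'c4', 'e1', 'e2', 'e7', 'sat1']
print(main_f2)  # ['s1', 's2', 's3', 's4', 'd1', 'd2', 'd3']
```

Under this rule c4 and e2 also stand out on F1, and d2 on F2, which is consistent with reading F1 through the items on the most used brand and F2 through the items on brands in general.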
Q7. (2pts):
We applied a hierarchical cluster analysis (agglomerative) on the survey data. The
obtained dendrogram is reported in Figure 2.
How many classes would you identify according to the results in Figure 2? Cut the
dendrogram according to your answer.
If we cut the dendrogram at the top, we can identify 2 classes.
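A usual heuristic for choosing where to cut a dendrogram is to cut across the largest gap between two successive merge heights. The sketch below illustrates the idea only: Figure 2 is not reproduced here, so the merge heights are hypothetical and the function name is illustrative.

```python
def n_clusters_from_heights(heights):
    """Number of clusters obtained by cutting in the largest gap between
    successive merge heights (heights sorted in increasing order)."""
    gaps = [b - a for a, b in zip(heights, heights[1:])]
    # Index of the last merge performed below the cut.
    i = max(range(len(gaps)), key=gaps.__getitem__)
    # n observations give n - 1 merges; after i + 1 merges, n - (i + 1)
    # clusters remain, which equals len(heights) - i.
    return len(heights) - i


# Hypothetical merge heights for 6 observations (NOT read from Figure 2).
merge_heights = [0.5, 0.8, 1.1, 1.4, 6.0]
print(n_clusters_from_heights(merge_heights))  # 2
```

Here the last merge happens much higher than all the others, so cutting just below it leaves 2 classes, which is the situation described by "cutting the dendrogram at the top".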
END OF EXAMINATION PAPER
Thank you for attending this course

Appendix 1: Questionnaire
First part
Express your level of agreement or disagreement with the following statements on shampoos.
For each question, cross the box that best represents your feelings.
Please answer the following questions according to:
1. Strongly disagree
2. Disagree
3. Neither agree nor disagree
4. Agree
5. Strongly agree
Strongly disagree / Disagree / Neither agree nor disagree / Agree / Strongly agree
S1 When I buy a shampoo I care for the brand 1 2 3 4 5
D1* I don’t see any differences between the main brands of shampoo 1 2 3 4 5
S2 When buying a shampoo I take the brand into account 1 2 3 4 5
D2 For me there are big differences between the main brands of shampoo 1 2 3 4 5
S3* I don’t choose a shampoo according to the brand 1 2 3 4 5
D3 The only difference between the main brands of shampoo is the price 1 2 3 4 5
S4* For a shampoo the brand is not very important 1 2 3 4 5
S5 When I buy shampoos I prefer well-known brands 1 2 3 4 5

Second part
What brand of shampoo do you buy most often?

Answer:...................................................................
When answering the following questions please refer to the brand indicated above.
Strongly disagree / Disagree / Neither agree nor disagree / Agree / Strongly agree
C1 The products of this brand bring me security 1 2 3 4 5
E1 When I cannot find this brand in my local store I prefer to wait before buying 1 2 3 4 5
Sat1 I am satisfied with this brand of shampoo 1 2 3 4 5
C3 I have confidence in the quality of products of this brand 1 2 3 4 5
Sat2 The packaging of this brand of shampoo is nice 1 2 3 4 5
E2* When I cannot find this brand in my local store, I take another 1 2 3 4 5
Sat3 For this brand, shampoo smells good 1 2 3 4 5
E’7 For shampoos, one can say that I am attached to this brand 1 2 3 4 5
C4 I think this brand is continually seeking to improve its response to consumer needs 1 2 3 4 5
Do you intend to buy this brand at your next purchase?
YES NO
If "No", which brand do you intend to buy?

Answer:..................................................................
Some questions on your buying behaviours

A/ Order chronologically the last three shampoo brands you bought (it may be the same brand)
1 ..............................................
2................................................
3................................................
B/ Are you:
1. Loyal to a unique brand
2. Loyal to several brands
3. Indifferent to brands
Demographic questions
What is your age?
Under 25 years old
25 - 34 years old
35 - 44 years old
45 - 54 years old
55 - 60 years old
60 years or older
What is your total monthly household income?
Less than 100 TND
100 to 400 TND
400 to 800 TND
800 to 1500 TND
More than 1500 TND
What is your marital status?
Single, never married
Married or domestic partnership without children
Married or domestic partnership with children
Divorced
Widowed
