Analysis of Retail Outlet

Ambudheesh Parasar
About the Project:

As a consulting assignment, this report analyses data collected from 403
customers in Noida through convenience sampling. The main objective is to
present the insights to higher management so that necessary action can be
taken on the basis of the report.

Critical analysis points:

To analyse the data thoroughly, the first chapter presents the data with the
help of graphs so that management can understand it. Factor analysis is then
performed to identify the key factors independently for two sets of variables:

Set 1 consisted of variables like:

• What was the type of product(s) that you have bought during this visit to
the retail outlet?
• I avoid crowded places whenever possible
• If I see a crowded place I won't even go inside
• I avoid crowded stores
• If I see a store crowded, I won't even go inside
• The store seemed very spacious while shopping in the store
• The store had an airy feeling to it
• I felt cramped while shopping in the store
• The store feels very spacious since the ceilings are high and the light is bright
• I felt confined while shopping in the store since the ceiling is low and the
light is dim
• The ceiling and the light in this store give an open, airy feeling
• I felt annoyed while shopping in the store
• I felt suffocated while shopping in the store
• I felt despairing while shopping in the store
• I felt I could not move freely in the retail outlet
• The retail outlet seemed very crowded to me during this visit
• The retail outlet seemed very crowded to me in this shopping trip
• The retail outlet was a little too busy during this shopping trip
• I would recommend this store to other people
• It is likely that I will continue purchasing the products from this store in
the future
• I intend to continue purchasing products from this store in the future

Set 2 consisted of variables like:

• Age
• Gender
• Income
• Marital Status
• Dependents
• Store Visited
• Distance
• Accompanied
• Time Spent
• Intention
• Frequency
• Amount Spent

Factor analysis was then run on each set and the critical factors were
identified.

This was the basis of the first research question:

R1: What are the critical factors in set 1 and in set 2?

After the critical factors were identified, regression analysis was performed
to measure the impact of the factors from set 1 and set 2 on the amount spent.
This formed the basis of the second research question:

R2: Which key factors affect the amount spent by a user, and how strong is
the relationship?

Once the regression analysis was done, clustering was used to segment the data
into meaningful clusters that help in identifying customer groups. This framed
the third research question:

R3: How can users be segmented on the basis of the data available in set 1 and
set 2?

Data preparation through visualization


Age
                Frequency   Percent   Valid Percent   Cumulative Percent
18-25                 181      44.9            44.9                 44.9
26-35                  85      21.1            21.1                 66.0
36-45                  72      17.9            17.9                 83.9
46-55                  42      10.4            10.4                 94.3
55 above               23       5.7             5.7                100.0
Total                 403     100.0           100.0
Key takeaway: about 45% of the respondents are in the 18-25 age range.
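
The Percent and Cumulative Percent columns follow directly from the
frequencies. A minimal Python sketch, using the age counts reported in the
table above; the function name `frequency_table` is illustrative and is not
part of the original analysis:

```python
# Age-group counts as reported in the frequency table of this report.
age_counts = {
    "18-25": 181,
    "26-35": 85,
    "36-45": 72,
    "46-55": 42,
    "55 above": 23,
}


def frequency_table(counts):
    """Return rows of (label, frequency, percent, cumulative percent)."""
    total = sum(counts.values())
    rows = []
    running = 0
    for label, n in counts.items():
        running += n
        percent = round(100 * n / total, 1)
        # Cumulative percent is computed from running counts, not from
        # the already-rounded percentages, to avoid rounding drift.
        cumulative = round(100 * running / total, 1)
        rows.append((label, n, percent, cumulative))
    return rows


age_rows = frequency_table(age_counts)
for label, n, percent, cumulative in age_rows:
    print(f"{label:<10}{n:>5}{percent:>8.1f}{cumulative:>8.1f}")
```

The same function can be applied to the counts of any other table in this
chapter (income, gender, marital status, dependents, store visited).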

Income
                Frequency   Percent   Valid Percent   Cumulative Percent
Less than 10000       134      33.3            33.3                 33.3
10001-20000            91      22.6            22.6                 55.8
20001-35000            87      21.6            21.6                 77.4
35001-50000            53      13.2            13.2                 90.6
50001 and above        38       9.4             9.4                100.0
Total                 403     100.0           100.0
Key takeaway: the largest group of respondents (roughly 33%) reports an income
of less than 10000.

Gender
                Frequency   Percent   Valid Percent   Cumulative Percent
Male                  202      50.1            50.1                 50.1
Female                198      49.1            49.1                 99.3
3                       3       0.7             0.7                100.0
Total                 403     100.0           100.0
Key takeaway: the sample has a nearly equal distribution of male and female
respondents.

Marital Status
                Frequency   Percent   Valid Percent   Cumulative Percent
Single                209      51.9            51.9                 51.9
Married               192      47.6            47.6                 99.5
3                       1       0.2             0.2                 99.8
4                       1       0.2             0.2                100.0
Total                 403     100.0           100.0
Key takeaway: the sample has a roughly equal distribution of single and married
respondents.

Dependents
                Frequency   Percent   Valid Percent   Cumulative Percent
One                   122      30.3            30.3                 30.3
Two                   103      25.6            25.6                 55.8
Three                  69      17.1            17.1                 73.0
More than three        54      13.4            13.4                 86.4
NA                     55      13.6            13.6                100.0
Total                 403     100.0           100.0
Key takeaway: more than half of the respondents (55.8%) have one or two
dependents.

Store Visited
                Frequency   Percent   Valid Percent   Cumulative Percent
Big Bazar             231      57.3            57.3                 57.3
(no label)             19       4.7             4.7                 62.0
Spencer                77      19.1            19.1                 81.1
Easy Day               32       7.9             7.9                 89.1
(no label)             44      10.9            10.9                100.0
Total                 403     100.0           100.0

Marketing Assignment Retail Outlet Submission G19005

To analyse the data better, the reliability of the data was first tested with
the help of Cronbach's alpha.

Reliability Statistics: The first table to look at in the output is the
Reliability Statistics table, which gives Cronbach's alpha coefficient.

The value of Cronbach's alpha came out to be 0.631. After analysing the means
of the individual variables, gender and marital status came out as outliers.
On further examination, gender and marital status were found not to be
correlated with any of the other variables (all correlation values were below
0.3). Gender was therefore dropped, the analysis was performed again, and
Cronbach's alpha improved to 0.639 (a moderate level of reliability, taken as
acceptable for this analysis).
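
For reference, Cronbach's alpha for k items is k/(k-1) multiplied by
(1 - sum of item variances / variance of the total score). The sketch below
shows that calculation in Python. The responses in `demo_items` are
hypothetical and are not the survey data, and the function name is
illustrative only:

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item columns.

    `items` holds one list of scores per item; all lists must have the
    same length (one score per respondent).
    """
    k = len(items)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Total score of each respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(column) for column in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Hypothetical responses: 3 items answered by 5 respondents.
demo_items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
alpha = cronbach_alpha(demo_items)
print(f"Cronbach's alpha = {alpha:.3f}")
```

Dropping a variable and recomputing, as was done with gender above, amounts to
calling the function again without that item's column.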

Factor analysis was performed to extract the key factors from the list of
variables in set 1.

Variables whose loading was less than 0.7 were excluded in the process.
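
The 0.7 cut-off can be applied mechanically to a rotated loading matrix. The
sketch below is illustrative only: the variable names and loading values are
hypothetical, not the loadings obtained in this study.

```python
# Hypothetical rotated loadings: variable -> loading on each component.
loadings = {
    "avoid_crowded_places": [0.82, 0.05, 0.10],
    "store_spacious": [0.08, 0.78, -0.12],
    "felt_annoyed": [0.15, -0.20, 0.74],
    "felt_cramped": [0.30, -0.45, 0.52],  # never reaches 0.7, so excluded
}


def high_loading_items(loadings, threshold=0.7):
    """Group variables under the component where |loading| >= threshold."""
    groups = {}
    for name, values in loadings.items():
        # Index of the component with the largest absolute loading.
        best = max(range(len(values)), key=lambda i: abs(values[i]))
        if abs(values[best]) >= threshold:
            groups.setdefault(best + 1, []).append(name)
    return groups


groups = high_loading_items(loadings)
for component, names in sorted(groups.items()):
    print(f"Component {component}: {', '.join(names)}")
```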

KMO and Bartlett's Test
Kaiser-Meyer-Olkin Measure of Sampling Adequacy          0.743
Bartlett's Test of Sphericity    Approx. Chi-Square   2527.771
                                 df                        300
                                 Sig.                    0.000

Prior to the extraction of the factors, two tests were used to assess the
suitability of the respondent data for factor analysis: the Kaiser-Meyer-Olkin
(KMO) Measure of Sampling Adequacy and Bartlett's Test of Sphericity. The KMO
value in this research came out to be 0.743, which lies between 0.7 and 0.8,
so the data are considered suitable for factor analysis.
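
As an illustration of what the KMO statistic measures, the sketch below
computes the overall KMO value from a correlation matrix with NumPy: squared
correlations are compared against squared partial correlations. The matrix
`demo_corr` is hypothetical, and `kaiser_label` uses Kaiser's commonly quoted
descriptive bands; neither is taken from the original analysis.

```python
import numpy as np


def kmo(corr):
    """Overall Kaiser-Meyer-Olkin measure for a correlation matrix."""
    corr = np.asarray(corr, dtype=float)
    inverse = np.linalg.inv(corr)
    # Partial correlations are obtained by rescaling the inverse matrix.
    d = np.sqrt(np.diag(inverse))
    partial = -inverse / np.outer(d, d)
    off_diagonal = ~np.eye(corr.shape[0], dtype=bool)
    r2 = np.sum(corr[off_diagonal] ** 2)
    p2 = np.sum(partial[off_diagonal] ** 2)
    return r2 / (r2 + p2)


def kaiser_label(value):
    """Kaiser's descriptive label for a KMO value."""
    bands = [
        (0.9, "marvelous"),
        (0.8, "meritorious"),
        (0.7, "middling"),
        (0.6, "mediocre"),
        (0.5, "miserable"),
    ]
    for cutoff, label in bands:
        if value >= cutoff:
            return label
    return "unacceptable"


# Hypothetical 3-variable correlation matrix (all correlations 0.5).
demo_corr = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
demo_kmo = kmo(demo_corr)
print(f"KMO = {demo_kmo:.3f} ({kaiser_label(demo_kmo)})")
```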
Bartlett's Test of Sphericity
The null hypothesis is that the inter-correlation matrix comes from a
population in which the variables are uncorrelated (i.e. the population
correlation matrix is an identity matrix) and that the non-zero correlations in
the sample matrix are due to sampling error.
Test Results for Bartlett's Test of Sphericity
χ2 = 2527.771
df = 300
p < 0.001
Statistical Decision
The null hypothesis is rejected: the sample inter-correlation matrix does not
come from a population in which the inter-correlation matrix is an identity
matrix.
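
The test statistic can be sketched as follows, assuming the standard formula
chi-square = -(n - 1 - (2p + 5)/6) * ln|R| with p(p - 1)/2 degrees of freedom,
where n is the sample size, p the number of variables and R the correlation
matrix. The function name and the demonstration matrix are hypothetical; the
sketch returns the statistic and degrees of freedom only.

```python
import numpy as np


def bartlett_sphericity(corr, n_obs):
    """Return (chi-square statistic, degrees of freedom) for Bartlett's test."""
    corr = np.asarray(corr, dtype=float)
    p = corr.shape[0]
    # slogdet is numerically safer than log(det(...)) for larger matrices.
    _sign, log_det = np.linalg.slogdet(corr)
    chi_square = -(n_obs - 1 - (2 * p + 5) / 6) * log_det
    degrees_of_freedom = p * (p - 1) // 2
    return chi_square, degrees_of_freedom


# Hypothetical 3-variable correlation matrix with the report's sample size.
demo_corr = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
chi_square, df = bartlett_sphericity(demo_corr, n_obs=403)
print(f"chi-square = {chi_square:.3f}, df = {df}")
```

The degrees of freedom reported in this document are consistent with this
formula: df = 300 corresponds to 25 variables and df = 66 to 12 variables. A
p-value would come from the upper tail of the chi-square distribution with
that many degrees of freedom.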

The factors were rotated, and the 6 retained components together explain close
to 61% of the total variance.
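
The cumulative variance figure comes from the eigenvalues of the correlation
matrix: each component's share is its eigenvalue divided by the number of
variables. An orthogonal rotation redistributes variance among the retained
components but leaves their combined total unchanged. A minimal sketch with a
hypothetical correlation matrix (not the survey data):

```python
import numpy as np


def cumulative_variance_explained(corr):
    """Cumulative share of total variance, largest component first."""
    corr = np.asarray(corr, dtype=float)
    # eigvalsh returns eigenvalues in ascending order for symmetric matrices.
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]
    return np.cumsum(eigenvalues) / eigenvalues.sum()


# Hypothetical 3-variable correlation matrix (all correlations 0.5).
demo_corr = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
cumulative = cumulative_variance_explained(demo_corr)
for i, share in enumerate(cumulative, start=1):
    print(f"first {i} component(s): {100 * share:.1f}% of variance")
```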

High loading factors in component 1

• I avoid crowded places whenever possible
• If I see a crowded place I won't even go inside
• I avoid crowded stores
• If I see a store crowded, I won't even go inside

We can term it as crowd-averse people

High loading factors in component 2

• The store seemed very spacious while shopping in the store
• The store had an airy feeling to it
• The ceiling and the light in this store give an open, airy feeling

We can term it as store infrastructure

High loading factors in component 3

• I felt annoyed while shopping in the store
• I felt suffocated while shopping in the store

We can term it as discomfort in shopping

High loading factors in component 4

• I would recommend this store to other people
• It is likely that I will continue purchasing the products from this store in
the future
• I intend to continue purchasing products from this store in the future

We can term it as store loyalty

High loading factors in component 5

• The retail outlet seemed very crowded to me in this shopping trip
• The retail outlet seemed very crowded to me during this visit
• The retail outlet was a little too busy during this shopping trip

We can term it as crowd in retail outlet

High loading factors in component 6

• I felt confined while shopping in the store since the ceiling is low and the
light is dim

We can term it as inconvenience in retail outlet

Factor analysis was performed once more on the set 2 variables.

Variables whose loading was less than 0.7 were excluded in the process.

KMO and Bartlett's Test
Kaiser-Meyer-Olkin Measure of Sampling Adequacy          0.685
Bartlett's Test of Sphericity    Approx. Chi-Square    693.176
                                 df                         66
                                 Sig.                    0.000

As with set 1, the Kaiser-Meyer-Olkin (KMO) Measure of Sampling Adequacy and
Bartlett's Test of Sphericity were used to assess the suitability of the
respondent data before the factors were extracted. The KMO value came out to be
0.685, which is close to 0.7, so the data are considered acceptable for factor
analysis.
Bartlett's Test of Sphericity
The null hypothesis is that the inter-correlation matrix comes from a
population in which the variables are uncorrelated (i.e. the population
correlation matrix is an identity matrix) and that the non-zero correlations in
the sample matrix are due to sampling error.
Test Results for Bartlett's Test of Sphericity
χ2 = 693.176
df = 66
p < 0.001
Statistical Decision
The null hypothesis is rejected: the sample inter-correlation matrix does not
come from a population in which the inter-correlation matrix is an identity
matrix.


The factors were rotated, and the 4 retained components together explain close
to 58% of the total variance.

High loading factors in component 1

• Age
• Income
• Marital Status

They can be termed as profile

High loading factors in component 2

• Time spent

It can be termed as length of stay

High loading factors in component 4

• Dependents

Studying the impact of factors on amount spent

With the factors extracted through factor analysis, we proceeded with
regression analysis to:

1. Study the impact of the variables x1, x6, …, x52 on the amount of money
spent

2. Study the impact of variables like age, gender, … on the amount of money
spent

To study the impact of these variables on the amount of money spent, we ran a
linear regression.

In case 1, we found that R^2 is close to 4%, which shows that about 4% of the
variation in the amount of money spent can be attributed to factors like crowd,
store infrastructure, internal spacing of the retail stores etc.
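
R^2 is the share of the variation in the dependent variable that the fitted
line accounts for: 1 minus the ratio of the residual sum of squares to the
total sum of squares. A minimal sketch with NumPy, using hypothetical numbers
rather than the survey data (the function name is illustrative):

```python
import numpy as np


def r_squared(X, y):
    """Fit ordinary least squares with an intercept; return (R^2, coefficients)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the model includes an intercept.
    design = np.column_stack([np.ones(len(y)), X])
    coefficients, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coefficients
    ss_residual = np.sum(residuals ** 2)
    ss_total = np.sum((y - y.mean()) ** 2)
    return 1 - ss_residual / ss_total, coefficients


# Hypothetical predictor (e.g. a factor score) and amount spent.
demo_x = [0, 1, 2, 3]
demo_y = [0, 1, 0, 1]
r2, coefficients = r_squared(demo_x, demo_y)
print(f"R^2 = {r2:.2f}")
```

An R^2 close to 0.04, as in case 1, means the predictors leave about 96% of
the variation in the amount spent unexplained.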


However, the model is significant, so even though it explains only 4% of the
variation, we can go ahead and identify the critical factors behind this
variation.

On further analysis, we found that only factor 4 (store loyalty) has a
significant impact on the amount spent: the higher the store loyalty, the more
the customer spends.

In case 2, we found that R^2 is close to 25%, which shows that about 25% of the
variation in the amount of money spent can be attributed to factors like the
profile of the people, the length of their stay etc.

On further analysis, we found 5 factors, namely:

• Income
• Marital status
• Who accompanies
• Time spent
• Preference for a particular retail store

These 5 factors show a strong association with the amount of money spent.


Of these 5 factors, marital status and time spent are the most influential in
determining the amount spent.

Clustering of the data set was then used to segment people based on their
consumption pattern, profile, crowd aversion, store loyalty and store
infrastructure, to name a few.

We divided the data set into 4 clusters. Before clustering, we computed
z-scores for the variables in set 2 (variables like Age, Gender, Marital
status, Dependents, Store visited, Distance covered, Who accompanied, Time
spent, Purchase intention, Frequency of visit, Amount spent and Type of
product).
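
Assuming the z analysis refers to standardizing each variable (subtracting its
mean and dividing by its standard deviation) so that variables on different
scales contribute comparably to the clustering, the step can be sketched as
below. The values are hypothetical and the sample standard deviation is an
assumption:

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize values to mean 0 and sample standard deviation 1."""
    centre = mean(values)
    spread = stdev(values)  # sample standard deviation (n - 1 denominator)
    if spread == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(v - centre) / spread for v in values]


# Hypothetical amounts spent by three respondents.
demo_amounts = [200, 400, 600]
standardized = z_scores(demo_amounts)
print(standardized)
```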

Cluster Number of cases

Cluster1 172
Cluster2 110
Cluster3 78
Cluster4 42

[Figure: bar chart of the number of cases in Cluster1, Cluster2, Cluster3 and
Cluster4]
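
The report does not state which clustering algorithm was used. As an
illustration only, the sketch below assumes k-means (Lloyd's algorithm) on
standardized data; the points, the function name and the seed are
hypothetical.

```python
import numpy as np


def k_means(points, k, n_iter=100, seed=0):
    """Plain k-means (Lloyd's algorithm); returns (labels, centers)."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Start from k distinct observations chosen at random.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        distances = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = distances.argmin(axis=1)
        # Move each center to the mean of its members (keep it if empty).
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


# Hypothetical standardized observations forming two obvious groups.
demo_points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centers = k_means(demo_points, k=2)
sizes = np.bincount(labels, minlength=2)
print("cluster sizes:", sizes.tolist())
```

Counting the members of each label, as in the last lines, gives a table of
cases per cluster like the one above.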

Characteristics of the clusters:

Cluster 1: They spend a higher amount of money, are in the mid-level age group,
have higher incomes and are generally married. They are the target customers
as they bring profit to the firm. More promotional events should be carried out
to cater to this audience.

Cluster 2: Mostly female, with two or more dependents, and usually accompanied
by other members, but they do not spend much. They may be termed economical
customers.

Cluster 3: People who do not mind travelling long distances, are generally
accompanied by two or more people and visit the store more frequently. They are
budget customers and mostly buy monthly items in bulk.

Cluster 4: Mostly female, with higher purchase intention, but they mostly
purchase low-cost items.


1. Customers who are married and spend more time in the retail outlet generally
tend to spend more. Advertising should therefore be planned to strike an
emotional chord with them, along with better aisle facilities that help them
move freely inside the retail outlet.

2. Some customers are very loyal to the store and purchase on a regular basis,
so discounts should be given to them to strengthen their loyalty towards the
retail outlet. They may not contribute much to profit, but they buy in bulk and
hence are a source of regular income for the firm.

3. Segment 1 customers (who spend a higher amount of money, are in the
mid-level age group, have higher incomes and are generally married, and are the
target customers as they bring profit to the firm) must be targeted and given
incentives to travel the distance, as they visit infrequently but are a source
of high income for the retail outlet.



