
MA (Marketing Analytics)

Hypothesis techniques:
 Chi-Square
 T-test
1. One Sample
2. Two Sample
3. Paired Sample
 Predictive/Forecasting (Regression)
1. Simple Linear
2. Multiple

Non-Hypothesis techniques:
 EFA (Factor Analysis)
 Cluster Analysis
 Discriminant Analysis
 Multi-Dimensional mapping

Exploratory Factor Analysis

It is a multivariate statistical technique in which no distinction is made
between dependent and independent variables. It is a data reduction technique:
it helps in identifying the most important attributes or factors out of all
those measured.

In factor analysis, all the variables under investigation are analysed together
to extract the underlying factors.
It is a very useful method for reducing a large number of attributes to a
few manageable factors. These factors explain most of the variation in the
original set of data.
Example - A market researcher might have collected data on 50 attributes of
a product, which may be difficult to analyse. Therefore, the researcher applies
FA (Factor Analysis), which reduces the 50 attributes to 5-6 manageable factors.

CASE -1 (Investment behaviour of employees in PSUs)

A study was carried out in 2007 to understand and analyse the investment
behaviour of the employees of Public Sector Units (PSUs) and the Government. A
sample of 80 respondents was drawn from the PSU and government
employees in Delhi.
SPSS Commands for EFA

1. Click Analyze on the SPSS menu bar.
2. Click on Dimension Reduction (Data Reduction in older versions) followed by
Factor.
3. In the dialogue box which appears, select all the variables required for
the factor analysis by clicking on the right arrow to transfer them from
the variable list on the left to the variable box on the right.
4. Click on Extraction in the lower part of the dialogue box.
I. Select Principal components as the method.
II. Under Display, select Unrotated factor solution.
III. Under Extract, select Eigenvalues over 1.
IV. Under Analyze, choose Correlation matrix.
V. Click Continue.
5. Click on Rotation in the lower part of the main dialogue box. Select Varimax
from the options under Method. Click Continue.
6. Click on Descriptives in the lower part of the dialogue box. Select KMO and
Bartlett’s test of sphericity. Click Continue.
7. Click on Scores, click on Save as variables and select Regression as the
method, then click on Display factor score coefficient matrix.
8. Click on OK to get the factor analysis output, including the unrotated factor
matrix, the rotated factor matrix using varimax rotation, and the
extracted factors along with their eigenvalues and cumulative variance.

Interpretation

The KMO value should be more than 0.5, and the Sig. value of Bartlett’s test
should be less than 0.05, for the factor analysis to be accepted.
The cumulative percentage of variance explained by the extracted factors should
be more than 50%.
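As a rough illustration of what Bartlett's test of sphericity checks, the sketch below computes the usual chi-square approximation, chi-square = -(n - 1 - (2p + 5)/6) x ln|R| with df = p(p - 1)/2, where R is the correlation matrix, p the number of variables and n the sample size. The 3-variable correlation matrix is hypothetical (it is not taken from either case), and the helper names are assumptions made for this example only.

```python
import math

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def bartlett_sphericity(corr, n_obs):
    """Return (approximate chi-square, degrees of freedom)."""
    p = len(corr)
    chi_square = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(determinant(corr))
    df = p * (p - 1) // 2
    return chi_square, df

# Hypothetical correlation matrix for three survey items.
corr = [
    [1.0, 0.6, 0.3],
    [0.6, 1.0, 0.4],
    [0.3, 0.4, 1.0],
]
chi_square, df = bartlett_sphericity(corr, n_obs=100)
print(f"chi-square = {chi_square:.2f}, df = {df}")
```

With 10 variables the same formula gives df = 10 x 9 / 2 = 45, which is the df that SPSS reports for a 10-item analysis.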

Component Matrix (a)
      Component
      1       2
X1    -.182   .756
X2    .507    .127
X3    .341    -.692
X4    .312    .126
X5    .765    -.198
X6    .562    .645
X7    .781    .076
Extraction Method: Principal Component Analysis.
a. 2 components extracted.

Interpretation for CASE-1


Looking at the KMO & Bartlett’s Test table, we can see that the Sig. value is
0.000. The assumed level of significance (alpha) is 0.05; since p is less than
alpha, H0 is rejected and H1 is accepted. This rejects the hypothesis that the
correlations among the variables are insignificant, so the factor analysis is
appropriate. It may be noted that the sample size should be greater than 5 times
the number of attributes (7 x 5 = 35; the sample size of 80 is greater than 35).
Bartlett’s test of sphericity indicates the significance of the correlation
matrix: the correlation coefficient matrix is significant, as indicated by the
p-value corresponding to the Chi-Square statistic.
Looking at the Total Variance Explained table, we can see that the 7 attributes
have been reduced to 2 factors. In the second last column (% of Variance), the
first factor explains 28.207% and the second explains 22.635%. Adding the two
values (28.207 + 22.635 = 50.842) gives the cumulative variance, which must be
greater than or equal to 50, indicating that the reduced factors explain at
least 50% of the variation in all the attributes.
Looking at the Rotated Component Matrix table, we can see the factor loadings
of each single attribute distributed between the two separate factors. We
select the factor loadings which are equal to or greater than 0.5. A factor
loading with a negative sign must be interpreted inversely. (Refer to the
table).
[For rotated component matrix interpretation for negative value]
For example, X1 (Risk awareness = -0.776) indicates that the employees’
investment will decrease with an increase in risk awareness. We can also say:
Investment behaviour of employees = 1 / risk awareness

Reliability Testing Interpretation
A reliability test was done on the PSU employees data in order to calculate
Cronbach’s Alpha value.

Looking at the Case Processing Summary table, we can see that valid
cases = 100, which also signifies the sample size; therefore N = 100. We can
also see that Excluded = 0, which means 100% of the uploaded data was used in
the calculation.
Looking at the Reliability Statistics table, we can see that
Cronbach’s alpha is .401, which means that the calculated value of Cronbach’s
Alpha is below the minimum desired value, i.e. 0.5. We can also say that some
of the items in the data can be dropped. Further, “N of items” = 10, which
means that the test has been performed on 10 items only.
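For reference, Cronbach's alpha can be written as alpha = k/(k - 1) x (1 - sum of item variances / variance of the total score), where k is the number of items. The sketch below applies that formula with the standard library only. The response matrix is hypothetical (it is not the PSU data set), and the 0.5 cut-off is simply the minimum used in these notes.

```python
from statistics import variance

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent."""
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical answers of 6 respondents to 4 items on a 5-point scale.
responses = [
    [4, 5, 4, 3],
    [2, 2, 3, 2],
    [5, 4, 5, 4],
    [3, 3, 2, 3],
    [1, 2, 1, 2],
    [4, 4, 5, 5],
]
alpha = cronbach_alpha(responses)
verdict = "acceptable" if alpha >= 0.5 else "below the minimum"
print(f"Cronbach's alpha = {alpha:.3f} ({verdict})")
```

When the items move together, the variance of the total score is much larger than the sum of the item variances, which pushes alpha towards 1.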

Note:
Items/Attributes
Items or attributes are basically the factors which play an important role in
the decision making of a consumer.

CASE – 2
A study was conducted to determine the factors responsible for the
satisfaction level among consumers of aerated drinks. A survey was conducted
with a sample of 100 consumers of soft drinks from different age and income
groups; the respondents were a mix of males and females. Some of the statements
used in the survey are as follows.
X1 = Aerated soft drinks are refreshing
X2 = are bad for health
X3 = are very convenient to serve
X4 = should be avoided with age
X5 = are very tasty
X6 = are not good for children
X7 = should be consumed occasionally
X8 = should not be taken in large quantity
X9 = are not as good as energy drinks
X10 = are better than fruit juices

Response scale:
1 = Strongly disagree
2 = Disagree
3 = Neither disagree nor agree
4 = Agree
5 = Strongly agree

KMO and Bartlett's Test
Kaiser-Meyer-Olkin Measure of Sampling Adequacy      .722
Bartlett's Test of Sphericity   Approx. Chi-Square   224.769
                                df                   45
                                Sig.                 .000

Total Variance Explained
           Initial Eigenvalues          Extraction Sums of           Rotation Sums of
                                        Squared Loadings             Squared Loadings
Component  Total  % of Var.  Cum. %     Total  % of Var.  Cum. %     Total  % of Var.  Cum. %
1          2.935  29.349     29.349     2.935  29.349     29.349     2.857  28.572     28.572
2          1.842  18.424     47.773     1.842  18.424     47.773     1.623  16.231     44.803
3          1.020  10.202     57.975     1.020  10.202     57.975     1.317  13.172     57.975
4          .922   9.223      67.198
5          .833   8.335      75.532
6          .699   6.994      82.526
7          .554   5.540      88.066
8          .487   4.870      92.936
9          .442   4.423      97.359
10         .264   2.641      100.000
Extraction Method: Principal Component Analysis.

Interpretation
Looking at the KMO & Bartlett’s Test table, we can see that the sampling
adequacy value (KMO) = 0.722. It may be noted that a KMO value is considered
only if it is equal to or above 0.5. Bartlett’s test indicates the significance
of the correlation matrix: the correlation coefficients are significant, as
indicated by the p-value (0.000) corresponding to the Chi-Square statistic,
which is less than alpha = 0.05.

Looking at the Total Variance Explained table, we can see that the 10 items/
attributes have been reduced to 3 factors. The % of Variance values in
the second last column add up to 28.572 + 16.231 + 13.172 = 57.975. The sum of
the % of Variance of all 3 factors must be equal to or above 50.
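The arithmetic behind the Total Variance Explained table can be sketched as follows. With a correlation matrix each item contributes one unit of variance, so % of variance = eigenvalue / number of items x 100, and only eigenvalues above 1 are retained. The eigenvalues are the initial ones from the Case 2 table; the variable names are assumptions made for this example.

```python
# Initial eigenvalues from the Case 2 "Total Variance Explained" table.
eigenvalues = [2.935, 1.842, 1.020, 0.922, 0.833,
               0.699, 0.554, 0.487, 0.442, 0.264]
n_items = len(eigenvalues)

# Rule used in these notes: keep components with an eigenvalue over 1.
retained = [ev for ev in eigenvalues if ev > 1]

# Each of the 10 standardised items contributes one unit of variance.
pct_variance = [ev / n_items * 100 for ev in retained]
cumulative = sum(pct_variance)

for number, pct in enumerate(pct_variance, start=1):
    print(f"Factor {number}: {pct:.2f}% of variance")
print(f"Factors retained: {len(retained)}, cumulative = {cumulative:.2f}%")
print("Meets the 50% rule:", cumulative >= 50)
```

Rotation redistributes the variance among the retained factors (28.572, 16.231 and 13.172 in the table), but the cumulative total stays the same.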

Rotated Component Matrix (Refer to notebook)


Looking at the rotated component matrix table, we can see the
factor loadings of X1 to X10 and how they are shortlisted to their respective
factors. The 3 factors from the above table are:
1. Health Concerns
2. Benefits
3. Comparative Features

The items/attributes with a negative factor loading will be interpreted
inversely. For example, X10 = -0.640, which means that aerated drinks = 1/
better than fruit juices. That means aerated drinks are not better than fruit
juices.
STP (Segmentation, Targeting, Positioning)

Type of Segmentation:
1. Demographic segmentation
Age, Income, Education level, Qualification level, Gender
2. Geographic segmentation
It is based upon area, topography, region
3. Behavioural segmentation/Benefit Sort
What are the different benefits that a customer is getting, added value,
Consumer actions, Decision making patterns, Market Data
4. Psychographic segmentation (Lifestyle)
When a product is sold on the basis of the customer’s lifestyle, passion,
personality, opinions
5. Firmographic
When the employees of an organization are segmented on the basis of the above
segments
Positioning
It is the ranking that you give in your mind to a brand or to a product.

There are two techniques used for segmentation:
1. Discriminant
2. Cluster
Notes:
 Discount is a Sales Promotion technique.
 The 7P’s of marketing are generally used for services, whereas the 4P’s are
used for products.
 Push & Pull Strategy; Sales Promotion is a Push Strategy.
 Promotion and Sales Promotion are two different things.
 Cash and carry fundamental – Coca-Cola and Thums Up
TYPES OF POSITIONING
 By Attributes
 By Quality
 By Use and Application
 By Product Class
 By Country or Culture
 By Product User

The hashtag of Ariel is “SHARE THE LOAD”

What is Differentiation?
1. Unique Attribute
2. Cutting Edge Technology
3. One of a kind Packaging
4. Price
5. Service
T-Test
Types of T-Test
1. One sample
2. Two sample – The two-sample T-test is about the difference of opinion
between two different populations. Ex – Male and female.
3. Paired sample – It is the comparison of an event before and after. The
population remains the same but the scenario changes.

The test value for the one-sample T-test is fixed at 3 (the parameter used to
measure satisfaction level).
All types of T-test are hypothesis tests, which are used to
measure satisfaction or dissatisfaction level.
One sample T-test
Case: Ban of Plastic Bag

Objectives:
1. To identify the parameters of plastic bags on which consumers have a
favorable opinion (dissatisfaction/satisfaction level), or to understand
the satisfaction and dissatisfaction level of the respondents w.r.t. the
ban on plastic bags.
2. To examine whether the views of male and female respondents are the same,
or to examine the difference in the perception of male and female
respondents.
3. To understand the difference in the perception of plastic bags
before and after the ban.
4. To analyze the satisfaction level of the respondents w.r.t. age.

Note:
Chi-square test is used for demographic aspect of the data i.e. age.

Steps for T-test


1. Click on Analyze on the SPSS menu bar.
2. Click on Compare Means followed by One-Sample T Test.
3. Select the test variable for which the test is to be done by clicking on
the arrow after highlighting the appropriate variable to transfer it from
left to right.
4. Specify the test value, which is the hypothesized value, and click OK.
For Objective 1:
H0: There is no significant difference between the satisfaction and
dissatisfaction level of the respondents w.r.t. the ban on plastic bags.
H1: There is a significant difference between the satisfaction and
dissatisfaction level of the respondents w.r.t. the ban on plastic bags.

OUTPUT FOR SHEET 2 ( PLASTIC BAGS)

One-Sample Statistics
N Mean Std. Deviation Std. Error Mean
x12_a 44 2.80 1.374 .207
x12_b 44 1.41 .871 .131
x12_c 44 3.50 1.151 .174
x12_d 44 2.77 1.138 .172
x12_e 44 1.80 .930 .140
x12_f 44 3.16 1.328 .200

One-Sample Test (Test Value = 3)
                                                       95% Confidence Interval
                                                       of the Difference
        t        df   Sig. (2-tailed)  Mean Difference  Lower    Upper
x12_a   -.988    43   .329             -.205            -.62     .21
x12_b   -12.113  43   .000             -1.591           -1.86    -1.33
x12_c   2.881    43   .006             .500             .15      .85
x12_d   -1.324   43   .192             -.227            -.57     .12
x12_e   -8.595   43   .000             -1.205           -1.49    -.92
x12_f   .794     43   .431             .159             -.24     .56
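The t values in the One-Sample Test table can be reproduced from the One-Sample Statistics table with t = (mean - test value) / (std. deviation / sqrt(N)). The sketch below does this for three of the attributes. Small differences from the SPSS figures are expected because the printed means are rounded to two decimals, and the function name is an assumption made for this example.

```python
import math

def one_sample_t(mean, std_dev, n, test_value):
    """Return (t statistic, degrees of freedom, standard error of the mean)."""
    std_error = std_dev / math.sqrt(n)
    return (mean - test_value) / std_error, n - 1, std_error

TEST_VALUE = 3  # neutral midpoint of the 5-point scale

# (mean, std. deviation, N) copied from the One-Sample Statistics table.
summary = {
    "x12_b": (1.41, 0.871, 44),
    "x12_c": (3.50, 1.151, 44),
    "x12_f": (3.16, 1.328, 44),
}

for name, (mean, std_dev, n) in summary.items():
    t, df, std_error = one_sample_t(mean, std_dev, n, TEST_VALUE)
    side = "above" if mean > TEST_VALUE else "below"
    print(f"{name}: t = {t:.3f}, df = {df}, SE = {std_error:.3f}, "
          f"mean is {side} the test value")
```

The sign of t shows on which side of the test value the mean lies; the Sig. (2-tailed) column then tells whether that distance is significant.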
INTERPRETATION OF ONE SAMPLE TEST
Looking at the One-Sample Statistics table, we can see that
N = 44, which signifies that all the respondents have been taken into
consideration and none of the respondents have been left out.

Looking at the One-Sample Test table, we can see that Test Value = 3. This
is because the respondents’ responses are measured on a 5-point scale. Since
the scale captures perception, which cannot be measured in decimals,
we cannot take the test value as 2.5 (the test value is
always the mid value). Therefore, we take 3 as the mid value, which is neutral.

Looking at the third column, Sig. (2-tailed), we can see that X12_B,
X12_C and X12_E are the only significant values (p is less than alpha; for
example, in the case of X12_B, 0.000 is less than 0.05. Alpha is the assumed
level of significance, which is 0.05). The values of X12_A, X12_D and X12_F are
insignificant because p is greater than alpha.

The alternate hypothesis H1 is accepted only in the case of X12_B,
X12_C and X12_E, while H1 is rejected in the case of X12_A, X12_D and X12_F.
This means that “There is a significant difference between the satisfaction and
the dissatisfaction level of the respondents w.r.t. the ban on plastic bags”
with reference to B, C and E (plastic bags are harmful for the environment, I
do not wish to quit using plastic bags, the plastic bag ban is not enforced
properly).

After filtering out the significant values, which were B, C and E, we look
back at the One-Sample Statistics table. From the mean values of X12_B, X12_C
and X12_E, we find that only X12_C is above the test value, while X12_B and
X12_E are below the test value. This means that only the X12_C attribute
results in satisfaction of the respondents. In
other words, respondents are strongly satisfied with the
statement that they do not wish to quit using plastic bags. In the case of
X12_B and X12_E, we can conclude that respondents are strongly dissatisfied
with the statements that plastic bags are harmful for the environment and that
the plastic bag ban is not enforced properly.
2 Sample T-test

2 sample T- Test is also known as independent sample t-test

1. Click analyze at SPSS Menu bar.


2. Click on Compare means followed by independent sample T-Test.
3. Select the test variable for which test is to be done, by clicking on the
arrow after highlighting the appropriate variables to transfer it from left
to right.
4. Select the grouping variable in the same way and transfer it to right side
box. This variable defines the codes of segregating the test variable into
two groups.
5. Then define the codes for the two groups by clicking on define groups
just below the grouping variable and typing the codes.
6. Click okay to get the output for an independent sample T- test.

Output

Group Statistics
Gender N Mean Std. Deviation Std. Error Mean
x12_a 1 31 2.90 1.491 .268
2 13 2.54 1.050 .291
x12_b 1 31 1.42 .958 .172
2 13 1.38 .650 .180
x12_c 1 31 3.35 1.170 .210
2 13 3.85 1.068 .296
x12_d 1 31 2.84 1.241 .223
2 13 2.62 .870 .241
x12_e 1 31 1.87 .957 .172
2 13 1.62 .870 .241
x12_f 1 31 3.29 1.346 .242
2 13 2.85 1.281 .355
Independent Samples Test
(F and Sig.: Levene's Test for Equality of Variances; remaining columns:
t-test for Equality of Means; Lower/Upper: 95% Confidence Interval of the
Difference)
                                                        Sig.        Mean    Std. Error
       Variances    F      Sig.   t       df      (2-tailed)  Diff.   Diff.       Lower   Upper
x12_a  Assumed      5.327  .026   .800    42      .000        .365    .456        -.555   1.285
       Not assumed                .922    31.787  .000        .365    .396        -.441   1.171
x12_b  Assumed      .331   .568   .119    42      .906        .035    .291        -.553   .622
       Not assumed                .139    32.888  .890        .035    .249        -.473   .542
x12_c  Assumed      .192   .663   -1.302  42      .001        -.491   .377        -1.253  .270
       Not assumed                -1.352  24.628  .001        -.491   .363        -1.240  .257
x12_d  Assumed      3.195  .081   .589    42      .559        .223    .379        -.542   .988
       Not assumed                .680    31.926  .501        .223    .328        -.446   .892
x12_e  Assumed      .123   .727   .829    42      .412        .256    .308        -.367   .878
       Not assumed                .863    24.733  .397        .256    .296        -.355   .866
x12_f  Assumed      .841   .364   1.012   42      .003        .444    .439        -.441   1.330
       Not assumed                1.033   23.663  .003        .444    .430        -.444   1.332
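The two rows that SPSS prints for each attribute correspond to two versions of the test: the pooled-variance t-test ("equal variances assumed") and Welch's t-test ("equal variances not assumed"). The sketch below recomputes the standard error, t and df for x12_a from the Group Statistics table. The function names are assumptions made for this example, and the results differ slightly from SPSS because the printed means are rounded.

```python
import math

def pooled_t(m1, s1, n1, m2, s2, n2):
    """Equal variances assumed: returns (t, df, std. error of the difference)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / std_error, df, std_error

def welch_t(m1, s1, n1, m2, s2, n2):
    """Equal variances not assumed (Welch): returns (t, df, std. error)."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    std_error = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return (m1 - m2) / std_error, df, std_error

# x12_a from the Group Statistics table: gender 1 (N = 31), gender 2 (N = 13).
group_1 = (2.90, 1.491, 31)
group_2 = (2.54, 1.050, 13)

t_eq, df_eq, se_eq = pooled_t(*group_1, *group_2)
t_w, df_w, se_w = welch_t(*group_1, *group_2)
print(f"Equal variances assumed:     t = {t_eq:.3f}, df = {df_eq}, SE = {se_eq:.3f}")
print(f"Equal variances not assumed: t = {t_w:.3f}, df = {df_w:.3f}, SE = {se_w:.3f}")
```

Levene's test (the F and Sig. columns) is the usual guide for choosing which of the two rows to read.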

Interpretation
The independent T-test is mainly used to understand the difference in the
perception of 2 different populations (male and female in our case).
On applying the independent sample T-test we get two major tables:
1. Group Statistics table
2. Independent Samples Test table

Looking at the Group Statistics table, we can see that the total N is
44 (group 1 is 31 + group 2 is 13 = 44).

Moving on to the Independent Samples Test table, we can see that the Sig.
(2-tailed) value of none of the variables (x1 -x5) is less than alpha (where
alpha = assumed level of significance = .05; p is the calculated level of
significance); therefore we can say that H02 is accepted and H2 is rejected.

Interpretation Part 2 (Changed Value in the table):

Looking at the Independent Samples Test table, we can see that the
Sig. (2-tailed) values of the attributes x12a, x12c and x12f are significant,
being 0.000, 0.001 and 0.003 respectively (p < alpha; H02 is rejected and H2 is
accepted). Therefore we move forward by considering only the attributes
x12a, x12c and x12f.

1. In case of x12a: There is a significant difference between male and female
respondents in the perception that a plastic bag is a must when buying grocery
and vegetables, w.r.t. the ban on plastic bags.
2. In case of x12c: There is a significant difference between male and female
respondents in the perception regarding the wish not to quit using plastic
bags, w.r.t. the ban on plastic bags.
3. In case of x12f: There is a significant difference between male and female
respondents in the perception of paper bags as a useful substitute for
plastic bags, w.r.t. the ban on plastic bags.
Looking at the Group Statistics table, we can see that for
x12a the mean value of males is higher than that of females, which means that
males are contributing more than females in forming the perception of
attribute x12a.
In case of x12c, females are contributing or agreeing more than their male
counterparts in forming the perception of the attribute x12c.
In case of x12f, males are contributing or agreeing more than their
female counterparts in forming the perception of the attribute x12f.

Paired Sample T Test


Steps for Paired Sample T-test
1. Repeat Step 1 (of the Two sample/Independent T-test above) after
your data has been entered.
2. Click on Compare Means followed by Paired-Samples T Test.
3. Select two variables from the variable list appearing on the left
side. These should be transferred to the box on the right by clicking on
the arrow.
4. Click on OK to get the desired output.
Case:
Comparative perception of Mess food versus Dhabas: a case of IIFT.
Objective: The students of IIFT conducted a comparative study of both the IIFT
Mess and the Dhabas to find out the factors that could improve the Mess for the
benefit of the student community at IIFT. The questionnaire was e-mailed to 260
students but only 45 responses were received. The attributes on which the
questions were asked are:
1. Taste of food
2. Quality of ingredients
3. Hygiene
4. Cost
5. Ambience
6. Nutrition
7. Menu variety
8. Quality of service
9. Timings at which they are open
10. Total time taken for the meal
The responses of the respondents were rated on a 5-point Likert scale where 1 =
extremely unsatisfied and 5 = extremely satisfied.
Q1. Find out the response rate of the survey.
Q2. Frame the objective of the above case problem.
Q3. Frame the hypothesis for the above case problem.
Q4. Analyze and interpret the paired sample T-test.
Q5. Summarize and conclude the result.

Response Rate: 45/260 x 100 = 17.31%
Formula: (Total responses received / Total no. of questionnaires circulated) x 100
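The response rate calculation is simple enough to check in a couple of lines. The sketch below uses the figures from the case (45 responses out of 260 questionnaires); the function name is an assumption made for this example.

```python
def response_rate(responses_received, questionnaires_sent):
    """Response rate as a percentage."""
    return responses_received / questionnaires_sent * 100

rate = response_rate(45, 260)
print(f"Response rate = {rate:.2f}%")
```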
Objective: To understand the difference in the perception of the IIFT students
w.r.t. the Mess food and the Dhabas.
Hypothesis
H03: There is no significant difference in the perception of IIFT students
w.r.t. the Mess food and the Dhabas.
H3: There is a significant difference in the perception of IIFT students w.r.t.
the Mess food and the Dhabas.

Paired Samples Statistics


Mean N Std. Deviation Std. Error Mean
Pair 1 X1 2.27 45 1.116 .166
Y1 3.93 45 .889 .133
Pair 2 X2 2.31 45 1.164 .174
Y2 3.73 45 .939 .140
Pair 3 X3 3.20 45 1.036 .154
Y3 3.13 45 .968 .144
Pair 4 X4 2.69 45 .821 .122
Y4 2.69 45 .763 .114
Pair 5 X5 2.89 45 .959 .143
Y5 2.09 45 .793 .118
Pair 6 X6 3.11 45 .959 .143
Y6 2.98 45 .839 .125
Pair 7 X7 3.18 45 .936 .140
Y7 2.62 45 .960 .143
Pair 8 X8 2.96 45 .976 .145
Y8 2.49 45 .815 .122
Pair 9 X9 3.38 45 1.193 .178
Y9 3.73 45 1.116 .166
Pair 10 X10 3.80 45 .842 .126
Y10 3.11 45 .982 .146

Paired Samples Test
                    Paired Differences
                                                   95% Confidence Interval
                             Std.       Std. Error  of the Difference              Sig.
                    Mean     Deviation  Mean        Lower    Upper    t       df   (2-tailed)
Pair 1   X1 - Y1    -1.667   1.552      .231        -2.133   -1.200   -7.203  44   .000
Pair 2   X2 - Y2    -1.422   1.712      .255        -1.937   -.908    -5.572  44   .000
Pair 3   X3 - Y3    .067     1.629      .243        -.423    .556     .274    44   .785
Pair 4   X4 - Y4    .000     1.066      .159        -.320    .320     .000    44   1.000
Pair 5   X5 - Y5    .800     1.100      .164        .470     1.130    4.881   44   .000
Pair 6   X6 - Y6    .133     1.307      .195        -.259    .526     .684    44   .497
Pair 7   X7 - Y7    .556     1.099      .164        .225     .886     3.392   44   .001
Pair 8   X8 - Y8    .467     1.307      .195        .074     .859     2.395   44   .021
Pair 9   X9 - Y9    -.356    1.824      .272        -.903    .192     -1.308  44   .198
Pair 10  X10 - Y10  .689     1.411      .210        .265     1.113    3.274   44   .002
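Each t value in the Paired Samples Test table is the mean of the paired differences divided by its standard error, t = mean difference / (std. deviation of differences / sqrt(N)), with df = N - 1. The sketch below recomputes a few rows from the table. The function name is an assumption made for this example, and small rounding differences from SPSS are expected.

```python
import math

def paired_t(mean_diff, sd_diff, n):
    """Return (t statistic, degrees of freedom, standard error of the mean difference)."""
    std_error = sd_diff / math.sqrt(n)
    return mean_diff / std_error, n - 1, std_error

N = 45  # responses received

# (mean difference, std. deviation of differences) from the table; X = Mess, Y = Dhaba.
pairs = {
    "Pair 1 (Taste of food)": (-1.667, 1.552),
    "Pair 4 (Cost)": (0.000, 1.066),
    "Pair 5 (Ambience)": (0.800, 1.100),
}

for name, (mean_diff, sd_diff) in pairs.items():
    t, df, std_error = paired_t(mean_diff, sd_diff, N)
    if mean_diff < 0:
        better = "Dhaba rated higher"
    elif mean_diff > 0:
        better = "Mess rated higher"
    else:
        better = "no difference"
    print(f"{name}: t = {t:.3f}, df = {df}, SE = {std_error:.3f} ({better})")
```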

Interpretation (Paired Sample Test)


Referring to objective 3 and H03 & H3, below is the interpretation of the
paired sample T-test. Looking at the Paired Samples Test table, we can
separate the significant pairs from the insignificant pairs using the value of
Sig. (2-tailed). Pairs 1, 2, 5, 7, 8 and 10 are significant because their
calculated level of significance (p) is less than the assumed level of
significance (alpha = 0.05). The p-values of pairs 1, 2, 5, 7, 8 and 10 are
.000, .000, .000, .001, .021 and .002 respectively.
Therefore, we will interpret only the significant pairs and leave out the
insignificant pairs.

Looking at the 1st column of the Paired Samples Test table, i.e. the mean
values, we can see that the mean values of pairs 1 and 2 are negative (-1.667
and -1.422 respectively), while the mean values of pairs 5, 7, 8 and 10 are
positive (0.800, 0.556, 0.467 and 0.689 respectively). The negative values of
pair 1 and pair 2 mean that Y (Dhaba food) is better than (>) X (Mess food).
In the case of the positive values, X (Mess food) is better than (>) Y (Dhaba
food).

We can conclude that in terms of “Taste of food” and “Quality of ingredients”
Dhaba food is better than Mess food, while in the case of the attributes
“Ambience”, “Menu variety”, “Quality of service” and “Total time taken for the
meal” Mess food is better than Dhaba food.

T3
CLUSTER ANALYSIS
Cluster Analysis is also referred to as a classification technique, wherein
objects, individuals and entities can be grouped.
In Cluster Analysis, the whole population sample is undifferentiated; the
attempt is to assess similarity in the responses to the variables, and the
grouping happens after the clustering. Cluster Analysis is the best
classification technique when multiple factors are involved in data collection.

Types of Cluster Analysis and its usage:


1) Market Segmentation – Market segmentation is the process of
splitting customers/potential customers within a market into different groups
or segments, where customers have the same or similar requirements satisfied by
a distinct marketing mix. For example, McDonald's. This is the area that has
seen the most theorization on the basis of the technique. The best example of a
clustered solution is in the area of benefit sort segmentation. Here the
consumers are divided into groups based on the benefits they seek from the
product category. These groups could then cut across age group, gender and
other variables. Thus, a marketer could design his products on the basis of this
segmentation approach.

2) Segmenting Industries/Sectors – The researcher could also go about
grouping products or sectors (for example: Hospitality, Health, Education)
into blocks that have some common traits. This makes it easier for both
organizations and policy makers while planning or evaluating the performance
of the groups.

3) Segmenting Markets – Cities or regions with some common traits like
population mix, infrastructure development, climatic or socio-economic
conditions could be clustered together. Say one city in Kerala and another
in Andhra Pradesh are in 1 cluster; then the organization/brand is able to plan
and execute a similar business approach in both the areas.

4) Segmenting Financial Sectors or Instruments – This is an emerging area
where different factors like raw material, cost and financial allocation,
seasonality and other factors are being used to group sectors together to
understand the growth and performance of a group of industries.

5) Career Planning and Training Analysis – In the area of HR (Human
Resources), the technique can be used to group people into clusters on the
basis of their educational qualification, experience, aptitude and aspiration.
This grouping can assist the HR division in effectively managing training and
manpower development for the members of the different clusters.

STEPS FOR CLUSTER ANALYSIS

1. At the top of the screen, go to Analyze, Classify, Hierarchical Cluster.
2. A dialog box will open for the technique. Now select all the variables to
be used for the analysis by moving them to the right into the variable
box.
3. Then select Cases, as we are going to cluster the sample.
4. In the Display box, check Statistics and Plots.
5. Now go to Method to choose the cluster method.
6. Select group linkage. In the Measure box, check the scale as Interval or
Count or Binary, as the case may be for the clustering variable.
7. Once you select the measure, the option for calculating the distance for the
measure gets activated.
8. For binary data, select Simple matching coefficient. Click Continue.
9. For interval data, select Squared Euclidean distance. Click Continue.
10. For count data, select Chi-square. Click Continue.
11. Now go to Statistics; in the pop-up window, check Agglomeration schedule.
Click Continue.
12. Click on Plots and check Dendrogram. Next, in the Icicle box, check All
clusters, and in the Orientation box, check Vertical. Click Continue.
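To make the idea behind the hierarchical procedure concrete, the sketch below implements a small agglomerative clustering by hand: it starts with every case as its own cluster and repeatedly merges the two clusters with the smallest average squared Euclidean distance, recording each merge as one stage of an agglomeration schedule. It is a simplified illustration and not SPSS's implementation; the four data points and all function names are hypothetical.

```python
from itertools import combinations

def squared_euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def average_linkage(cluster_a, cluster_b, points):
    """Mean squared distance over all cross-cluster pairs of cases."""
    total = sum(squared_euclidean(points[i], points[j])
                for i in cluster_a for j in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))

def agglomerate(points):
    """Return the merge history as (cluster A, cluster B, coefficient) tuples."""
    clusters = [[i] for i in range(len(points))]
    schedule = []
    while len(clusters) > 1:
        # Find the closest pair of current clusters.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(clusters[pair[0]], clusters[pair[1]], points),
        )
        coefficient = average_linkage(clusters[a], clusters[b], points)
        schedule.append((sorted(clusters[a]), sorted(clusters[b]), coefficient))
        merged = clusters[a] + clusters[b]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)] + [merged]
    return schedule

# Hypothetical scores of four respondents (cases 0-3) on two statements.
points = [(1, 1), (1, 2), (5, 5), (5, 6)]
schedule = agglomerate(points)
for stage, (left, right, coefficient) in enumerate(schedule, start=1):
    print(f"Stage {stage}: merge {left} and {right}, coefficient = {coefficient}")
```

A large jump in the coefficient between two consecutive stages (here from 1.0 to 32.5) suggests stopping before that merge, which in this toy example leaves two clusters.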

K means for Cluster Analysis


1. At the top of the screen, go to Analyze, Classify, K-Means Cluster.
2. A dialog box will open for the technique. Now select all the variables to
be used for the analysis by moving them to the right, into the variable
box.
3. There is an option for the number of clusters; enter a number here.
4. Click on Options. In the pop-up window, in the Statistics box, check Initial
cluster centers, ANOVA table and Cluster information for each case. Click
Continue.
5. Go to Save and check Cluster membership.
6. Go to the main dialog box and click on OK.
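The K-means procedure itself can be sketched in a few lines: assign every case to its nearest cluster centre, recompute each centre as the mean of its members, and repeat until the memberships stop changing. The version below is a simplified illustration and not SPSS's algorithm. In particular, taking the first k cases as initial centres is an assumption made to keep the example deterministic, and the data are hypothetical.

```python
def squared_euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, k, max_iter=100):
    """Return (cluster label per case, final cluster centres)."""
    centres = [tuple(points[i]) for i in range(k)]  # simplification: first k cases
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centre for every case.
        new_labels = [
            min(range(k), key=lambda c: squared_euclidean(p, centres[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # memberships are stable, so the centres will not move either
        labels = new_labels
        # Update step: each centre becomes the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centres[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centres

# Hypothetical scores of six respondents on two statements.
points = [(1, 1), (1, 2), (5, 5), (5, 6), (2, 1), (6, 5)]
labels, centres = k_means(points, k=2)
print("Cluster membership:", labels)
print("Final cluster centres:", centres)
```

The list of labels plays the role of the saved cluster membership variable, and the final centres are what would be compared across clusters when naming them.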

OUTPUT FOR JEWELRY CASE CLUSTER ANALYSIS

Agglomeration Schedule
Cluster Combined Stage Cluster First Appears
Stage Cluster 1 Cluster 2 Coefficients Cluster 1 Cluster 2 Next Stage
1 5 9 .000 0 0 7
2 3 7 .444 0 0 5
3 4 10 .548 0 0 7
4 6 8 .558 0 0 5
5 3 6 .609 2 4 6
6 2 3 .810 0 5 8
7 4 5 .811 3 1 9
8 1 2 1.037 0 6 9
9 1 4 1.850 8 7 0

JEWELRY CLUSTER ANALYSIS INTERPRETATION


Enchant is a jewelry designer who wishes to know if the population of young
teenage girls aged 13-19 can be divided into smaller groups who might
be looking at jewelry very differently.
Five statements were used to conduct the survey on a 5-point
Likert scale.
From the dendrogram of the jewelry case we get two major clusters:
Cluster 1 = 5,9,4,10
Cluster 2 = 6,8,3,2,7,1
Based on the data obtained, we plot the inter-respondent distance against the
cases based on proximities, and we get a grouping of the 10 teenage girls into
two distinct clusters. This plot is called a dendrogram. If we look at the
original statements that they agreed with, we find that the 1st cluster
(5,9,4,10) seems to be a socially concerned group, as they show a higher degree
of agreement with X3 and X4. The other girls (6,8,3,2,7,1) are more self-driven,
as they show a higher degree of agreement with X1, X2 and X5.
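The two clusters can also be read directly off the agglomeration schedule, without the dendrogram: replay the merges stage by stage and stop when the desired number of clusters remains. The sketch below replays the "Cluster Combined" columns of the jewelry schedule. The function name is an assumption made for this example, and it relies on the convention that the merged cluster keeps the label in the Cluster 1 column.

```python
# (Cluster 1, Cluster 2) pairs from the "Cluster Combined" columns, stages 1-9.
schedule = [(5, 9), (3, 7), (4, 10), (6, 8), (3, 6),
            (2, 3), (4, 5), (1, 2), (1, 4)]

def clusters_at(schedule, n_cases, n_clusters):
    """Replay merges until n_clusters remain; return sorted member lists."""
    clusters = {case: {case} for case in range(1, n_cases + 1)}
    stages_to_apply = n_cases - n_clusters
    for keep, absorb in schedule[:stages_to_apply]:
        # The merged cluster keeps the label shown in the Cluster 1 column.
        clusters[keep] |= clusters.pop(absorb)
    return sorted(sorted(members) for members in clusters.values())

two_clusters = clusters_at(schedule, n_cases=10, n_clusters=2)
for members in two_clusters:
    print(members)
```

Stopping one stage before the end corresponds to cutting the dendrogram just below the final merge, where the coefficient jumps from 1.037 to 1.850.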

OUTPUT OF MILK SUPPLEMENTARY CASE

Agglomeration Schedule
Cluster Combined Stage Cluster First Appears
Stage Cluster 1 Cluster 2 Coefficients Cluster 1 Cluster 2 Next Stage
1 5 9 .000 0 0 7
2 3 7 .444 0 0 5
3 4 10 .548 0 0 7
4 6 8 .558 0 0 5
5 3 6 .609 2 4 6
6 2 3 .810 0 5 8
7 4 5 .811 3 1 9
8 1 2 1.037 0 6 9
9 1 4 1.850 8 7 0
INTERPRETATION FOR CLUSTER ANALYSIS (Milk Supplementary case)
Looking at the agglomeration schedule, the first cluster consumes more of
Bournvita, Milo and Horlicks. Thus, we name it the MILK
ADDITIVE cluster. The second cluster is the Chyawanprash-consuming cluster, and
we name it the MILK ACCOMPANIMENT AYURVEDIC FOCUSED cluster. The third
cluster consumes only Protinex, Horlicks and Complan. Thus, we name it the MILK
ADDITIVE NUTRITION FOCUSED cluster. The number of cases in each cluster
can be seen in “Cluster Summary”.
REGRESSION ANALYSIS

Commands for Correlation:


1. Click on Analyze.
2. Click on Correlate followed by Bivariate. In the dialog box which appears,
select all the variables for which correlations are required by clicking on the
right arrow to transfer them from the variable list on the left. Then select
Pearson under the heading Correlation Coefficients and select Two-tailed
under the heading Test of Significance.
3. Click OK to get the matrix of pairwise Pearson correlations among all the
variables selected, along with the two-tailed significance of each pairwise
correlation.
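For one pair of variables, the Pearson correlation in that matrix is r = sum((x - mean of x)(y - mean of y)) / sqrt(sum((x - mean of x)^2) x sum((y - mean of y)^2)), and its two-tailed significance is based on t = r x sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom. The sketch below computes both with the standard library. The ratings are hypothetical and the function names are assumptions made for this example.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

def correlation_t(r, n):
    """t statistic used to test r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Hypothetical ratings given by five respondents to two attributes.
taste = [1, 2, 3, 4, 5]
preference = [2, 4, 5, 4, 5]

r = pearson_r(taste, preference)
t = correlation_t(r, len(taste))
print(f"r = {r:.4f}, t = {t:.4f}, df = {len(taste) - 2}")
```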

Command for Regression:


1. Click on Analyze on the SPSS menu bar.
2. Click on Regression followed by Linear.
3. In the dialog box which appears, select a dependent variable by clicking on
the arrow leading to the Dependent box after highlighting the appropriate
variable in the list of variables on the left side. Select the independent
variables to be included in the regression model in the same way, transferring
them from the left side to the right by clicking on the arrow leading to the
box called Independent(s).
4. In the same dialog box, select the Method. Choose
 Enter if you want all independent variables to be included in the model.
 Stepwise if you want to use forward stepwise regression.
 Backward if you want to use backward regression.
5. Select Options if you want additional output.
6. Select Plots if you want to see plots such as residual plots; select those
you want and click Continue.
7. Click OK in the main dialog box to get the regression output.
OUTPUT FOR REGRESSION

Model Summary
Model  R        R Square  Adjusted R Square  Std. Error of the Estimate
1      .928(a)  .860      .849               .699
a. Predictors: (Constant), p_qlty, nutrition, taste

ANOVA (a)
Model          Sum of Squares  df  Mean Square  F       Sig.
1  Regression  108.375         3   36.125       73.891  .000(b)
   Residual    17.600          36  .489
   Total       125.975         39
a. Dependent Variable: pref
b. Predictors: (Constant), p_qlty, nutrition, taste

Coefficients (a)
               Unstandardized Coefficients  Standardized Coefficients
Model          B      Std. Error            Beta                       t      Sig.
1  (Constant)  .733   .301                                             2.436  .020
   nutrition   .295   .103                  .284                       2.865  .007
   taste       .170   .103                  .198                       1.655  .107
   p_qlty      .548   .118                  .522                       4.660  .000
a. Dependent Variable: pref

The minimum Adjusted R Square needed to accept the model is 0.5.
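
The Model Summary figures can be cross-checked from the ANOVA table. The short
sketch below recomputes them using only the sums of squares and degrees of
freedom printed above; the variable names are chosen for this sketch. Small
differences in the third decimal place come from rounding in the printed output.

```python
import math

# Values copied from the ANOVA table above.
ss_regression, df_regression = 108.375, 3
ss_residual, df_residual = 17.600, 36
ss_total, df_total = 125.975, 39

# R Square: share of the total variation explained by the regression.
r_square = ss_regression / ss_total
# Adjusted R Square: the same idea, penalised for the number of predictors.
adjusted_r_square = 1 - (ss_residual / df_residual) / (ss_total / df_total)

mean_square_regression = ss_regression / df_regression
mean_square_residual = ss_residual / df_residual
f_value = mean_square_regression / mean_square_residual
std_error_estimate = math.sqrt(mean_square_residual)

print("R =", round(math.sqrt(r_square), 3))
print("R Square =", round(r_square, 3))
print("Adjusted R Square =", round(adjusted_r_square, 3))
print("F =", round(f_value, 2))
print("Std. Error of the Estimate =", round(std_error_estimate, 3))
```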


Objective 1- To understand the impact of Nutrition of MRP biscuits with
respect to preference of choice of the respondents in Ambala city.
H0(1): There is no significant relationship between Nutrition of MRP biscuits
and preference of choice of the respondents in Ambala city.
H1(1): There is a significant relationship between Nutrition of MRP biscuits
and preference of choice of the respondents in Ambala city.

Objective 2- To understand the impact of Taste of MRP biscuits with respect to
preference of choice of the respondents in Ambala city.
H0(2): There is no significant relationship between the taste of MRP biscuits
and preference of choice of the respondents in Ambala city.
H1(2): There is a significant relationship between the taste of MRP biscuits and
preference of choice of the respondents in Ambala city.

Objective 3- To understand the impact of Quality of MRP biscuits with respect
to preference of choice of the respondents in Ambala city.
H0(3): There is no significant relationship between the quality of MRP biscuits
and preference of choice of the respondents in Ambala city.
H1(3): There is a significant relationship between the quality of MRP biscuits
and preference of choice of the respondents in Ambala city.

Impact, cause and effect, prediction and forecast are the keywords that
identify a regression problem.

INTERPRETATION OF REGRESSION
In the MRP biscuit regression case, Preference of choice is taken as the
dependent variable (DV), while Nutrition, Taste and Preservation Quality are
taken as the independent variables (IV).
The “Model Summary” table reports both R2 and Adjusted R2. We will use
Adjusted R2 for further interpretation. The Adjusted R2 value is 0.849, which
means that about 84.9% of the variation in preference of choice is explained by
the three independent variables together. The minimum acceptable value of
Adjusted R2 is taken as 0.5. Since the obtained value of 0.849 is greater than
0.5, the model is accepted for further interpretation.

ANOVA TABLE
From the ANOVA table, we can see that the significance (Sig.) value of the
entire predictive model is 0.000, which is less than 0.05. The model as a whole
is therefore significant and can be considered for further analysis.
Nutrition
With reference to Objective 1 and H0(1) and H1(1), the coefficient table shows
that the sig value of Nutrition is 0.007, which is significant (p is less than
alpha; 0.007 < 0.05. Therefore, H0(1) is rejected and H1(1) is accepted). The
unstandardized coefficient (B) of Nutrition is 0.295. B is the slope for
Nutrition, that is, the change in preference of choice for a one-unit change in
Nutrition; it is not a percentage of the model. The positive sign of B shows
that Nutrition and preference of choice move in the same direction.
Nutrition increases -> Preference of choice increases.
This means that, if everything else remains constant, preference of choice
increases by 0.295 when Nutrition increases by 1 unit.

Taste
The sig value of Taste is 0.107, which is insignificant (p is greater than
alpha; 0.107 > 0.05. Therefore, H0(2) is not rejected and H1(2) is not
supported).
Preservation Quality
In this case, the sig value is 0.000, which is significant (p is less than
alpha; therefore H0(3) is rejected and H1(3) is accepted). The unstandardized
coefficient (B) of Preservation Quality is 0.548, the largest of the three
independent variables. The positive sign shows that Preservation Quality and
preference of choice move in the same direction.

Preservation Quality increases -> Preference of choice increases.

This means that, if everything else remains constant, preference of choice
increases by 0.548 when Preservation Quality increases by 1 unit.
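
The decision rule applied to each independent variable (compare Sig. with
alpha = 0.05) can be written out as a short sketch. The B and Sig. values are
copied from the Coefficients table; the function name decide is made up for
this illustration.

```python
ALPHA = 0.05  # level of significance used in these notes

# (B, Sig.) pairs copied from the Coefficients table.
coefficients = {
    "nutrition": (0.295, 0.007),
    "taste": (0.170, 0.107),
    "p_qlty": (0.548, 0.000),
}

def decide(sig, alpha=ALPHA):
    """Reject H0 only when the Sig. value is below alpha."""
    return "reject H0" if sig < alpha else "fail to reject H0"

decisions = {name: decide(sig) for name, (b, sig) in coefficients.items()}
for name, (b, sig) in coefficients.items():
    print(f"{name}: B = {b:.3f}, Sig. = {sig:.3f} -> {decisions[name]}")
```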

PREDICTIVE MODEL

[Figure: Preservation Quality and Nutrition shown as predictors of Preference of choice]

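
The fitted model can also be written as a prediction equation. The sketch below
uses the constant and all three B values from the Coefficients table (Taste is
kept because it is part of the fitted equation, even though it is
insignificant). The rating of 3 on each variable is a hypothetical input, and
predict_pref is a name made up for this sketch.

```python
CONSTANT = 0.733
B = {"nutrition": 0.295, "taste": 0.170, "p_qlty": 0.548}

def predict_pref(nutrition, taste, p_qlty):
    """Predicted preference of choice from the fitted regression equation."""
    return (CONSTANT
            + B["nutrition"] * nutrition
            + B["taste"] * taste
            + B["p_qlty"] * p_qlty)

# Hypothetical respondent who rates every attribute as 3.
base = predict_pref(3, 3, 3)
# Same respondent with Nutrition one unit higher, everything else constant.
more_nutrition = predict_pref(4, 3, 3)

print("predicted pref:", round(base, 3))
print("change for +1 nutrition:", round(more_nutrition - base, 3))
```

The change in the prediction equals the B value of Nutrition, which is what
"if everything else remains constant" means in the interpretation.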
HYPOTHETICAL QUESTION AND ITS EXAMPLE (Values changed)

Model Summary
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .928 (a)   .860       .849                .699
a. Predictors: (Constant), p_qlty, nutrition, taste

Coefficients (a)
                 Unstandardized Coefficients   Standardized Coefficients
Model            B       Std. Error            Beta                        t       Sig.
1   (Constant)   .733    .301                                              2.436   .020
    nutrition    -.295   .103                  .284                        2.865   .007
    taste        .170    .103                  .198                        1.655   .107
    p_qlty       .548    .118                  .522                        4.660   .000
a. Dependent Variable: pref

DV = Preference of choice
IV = Nutrition, Taste, Preservation Quality
Interpretation
The coefficient of Nutrition is assumed to be -0.295.
From the above “Coefficients” table, the sig value of Nutrition is 0.007, which
is less than the assumed level of significance (p is less than alpha;
0.007 < 0.05; H0(1) is rejected, H1(1) is accepted). The unstandardized
coefficient (B) is -0.295. The negative sign indicates that Nutrition is
inversely related to preference of choice.
Nutrition increases -> Preference of choice decreases.
This means that, if everything else remains constant, preference of choice
decreases by 0.295 when Nutrition increases by 1 unit.
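
Written as an equation, the hypothetical case changes only the sign of the
Nutrition coefficient. A sketch, where the hat over pref is an assumed notation
for the predicted preference of choice:

```latex
\widehat{\text{pref}} = 0.733 - 0.295\,\text{nutrition} + 0.170\,\text{taste} + 0.548\,\text{p\_qlty}
\qquad\Longrightarrow\qquad
\Delta\widehat{\text{pref}} = -0.295 \ \text{per one-unit increase in nutrition, other variables held constant}
```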

Conclusion for the first interpretation


Nutrition: We have already confirmed that Nutrition [IV(1)] is significant and
positively related to Preference of choice (DV). Therefore, we can conclude
that Nutrition plays a positive role in the choice of MRP biscuits. Respondents
consider the nutrition of the biscuit an important factor while choosing MRP
biscuits.

Taste: From the above analysis, Taste [IV(2)] is insignificant. Therefore, the
data do not show that Taste has a significant relationship with, or impact on,
the choice of MRP biscuits.

Preservation Quality: From the above analysis, we confirm that Preservation
Quality [IV(3)] is significant and positively related to Preference of choice
[DV]. Therefore, we can say that preservation quality positively impacts the
choice of MRP biscuits. We can also say that respondents/customers consider
preservation quality the most important factor while choosing MRP biscuits. Out
of all the independent variables, Preservation Quality has the largest
coefficient (B = .548; standardized Beta = .522) and so has the MAXIMUM impact
on the respondents' decision while choosing MRP biscuits.
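
When independent variables are compared for relative importance, the
standardized Beta column is the usual choice because it does not depend on the
units of measurement. A minimal sketch, using the Beta and Sig. values from the
first Coefficients table (the variable names are chosen for this illustration):

```python
# Standardized Beta and Sig. values copied from the Coefficients table.
standardized_beta = {"nutrition": 0.284, "taste": 0.198, "p_qlty": 0.522}
sig = {"nutrition": 0.007, "taste": 0.107, "p_qlty": 0.000}

# Keep only the significant predictors, then sort by the size of Beta.
significant = [name for name in standardized_beta if sig[name] < 0.05]
ranking = sorted(significant, key=lambda name: abs(standardized_beta[name]),
                 reverse=True)

print(ranking)  # ['p_qlty', 'nutrition']
```

Preservation Quality comes first and Nutrition second; Taste is left out
because it is not significant at the 0.05 level.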

Conclusion for the Hypothetical example


From the above analysis, we can confirm that Nutrition is significant but
inversely related to Preference of Choice, which means that the higher the
nutrition, the lower the preference for MRP biscuits. It means that
