Cluster Analysis

Nadya Fionalita
19016041
Problem Identification
Marketing managers want to create consumer groups based on consumer profiles, namely:
1. Age
2. Number of children
3. Income
4. Time spent reading newspapers each week
5. Time spent watching TV each week
6. Number of motorcycles owned
7. Number of cars owned
8. Number of credit cards
9. Weekly purchase level
10. Spending level
11. Working hours per week
12. Number of shopping activities per week
Methodology
To find groups of consumers with similar profiles, the researcher uses K-Means cluster
analysis. The number of clusters is fixed at four. The data consist of 60 responses. The table
below maps each variable name to the label used in the following steps.
Variable Name        Label            Variable Name         Label
Usia                 Age              Number of Children    People
Pendapatan           Income           Menonton TV           Watching Hour
Membaca Koran        Reading Hour     Motorcycle Owned      Motorcycle
Car Owned            Car              Credit Card Owned     ATM
Tingkat Pembelian    Purchase Level   Tingkat Pengeluaran   Spending
Jam Kerja            Working Hour     Kegiatan Belanja      Shopping
Result
Descriptive Statistics
                                           N    Minimum    Maximum    Mean       Std. Deviation
Zscore: Age                                60   -1.74613   2.14071    .0000000   1.00000000
Zscore: Number of children                 60   -.63104    2.97489    .0000000   1.00000000
Zscore: Average monthly income             60   -.91197    3.08493    .0000000   1.00000000
Zscore: Hours reading newspapers per week  60   -1.60876   2.26950    .0000000   1.00000000
Zscore: Hours watching TV per week         60   -1.88693   2.01707    .0000000   1.00000000
Zscore: Motorcycles owned                  60   -1.47158   1.68180    .0000000   1.00000000
Zscore: Cars owned                         60   -.87521    2.21377    .0000000   1.00000000
Zscore: Credit/ATM cards owned             60   -1.67616   2.51425    .0000000   1.00000000
Zscore: Weekly purchase level              60   -1.22890   1.89212    .0000000   1.00000000
Zscore: Monthly spending level             60   -.88103    3.11566    .0000000   1.00000000
Zscore: Working hours per week             60   -1.18493   2.69195    .0000000   1.00000000
Zscore: Hours shopping per week            60   -1.38277   3.00381    .0000000   1.00000000
Valid N (listwise)                         60
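Every variable in the table above has mean 0 and standard deviation 1 because the analysis runs on Z-scores. A minimal sketch of that standardization follows, assuming the sample standard deviation (n - 1 denominator). The ages are made up for illustration and are not the 60 survey responses.

```python
from statistics import mean, stdev

def zscores(values):
    """Standardize values to mean 0 and sample standard deviation 1."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator), an assumption
    return [(v - m) / s for v in values]

# Hypothetical ages, not taken from the survey data.
ages = [23, 31, 35, 42, 58]
z = zscores(ages)
print([round(v, 3) for v in z])
```

Standardizing first keeps variables measured on large scales, such as income, from dominating the distance calculation in K-Means.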
Initial Cluster Centers
                                           Cluster
                                           1          2          3          4
Zscore: Age                                -.50941    .37396     1.25733    2.14071
Zscore: Number of children                 -.63104    -.63104    -.63104    2.97489
Zscore: Average monthly income             3.08493    .72057     -.68679    1.95904
Zscore: Hours reading newspapers per week  .97675     .97675     -1.17784   -.31601
Zscore: Hours watching TV per week         -1.32922   .90164     .06507     -.49265
Zscore: Motorcycles owned                  .10511     1.68180    -1.47158   1.68180
Zscore: Cars owned                         2.21377    -.87521    -.87521    .66928
Zscore: Credit/ATM cards owned             2.51425    1.67616    -1.67616   .83808
Zscore: Weekly purchase level              1.50199    1.89212    -.83877    1.50199
Zscore: Monthly spending level             3.11566    -.05153    -.67366    1.98452
Zscore: Working hours per week             2.69195    2.09550    -1.12529   .72368
Zscore: Hours shopping per week            3.00381    1.90717    -.78460    .81052
Iteration History (a)
            Change in Cluster Centers
Iteration   1       2       3       4
1           2.587   2.360   2.451   2.648
2           .000    .256    .075    .000
3           .000    .000    .000    .000
a. Convergence achieved due to no or small change in cluster centers. The maximum
absolute coordinate change for any center is .000. The current iteration is 3. The
minimum distance between initial centers is 5.634.
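The iteration history records how far each cluster center moved in each pass, and the run stops once no center moves. The sketch below shows that loop in plain Python on four made-up two-dimensional points. It is a generic K-Means illustration under these assumptions and does not reproduce how SPSS chooses initial centers or its convergence tolerance.

```python
import math

def kmeans(points, centers, max_iter=10):
    """Plain K-Means: returns final centers and the per-iteration center shifts."""
    history = []
    for _ in range(max_iter):
        # Assign each point to its nearest center (Euclidean distance).
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            groups[nearest].append(p)
        # Move each center to the mean of the points assigned to it.
        new_centers = []
        for k, g in enumerate(groups):
            if g:
                new_centers.append(tuple(sum(c) / len(g) for c in zip(*g)))
            else:
                new_centers.append(centers[k])  # an empty cluster keeps its center
        shifts = [math.dist(a, b) for a, b in zip(centers, new_centers)]
        history.append(shifts)
        centers = new_centers
        if max(shifts) == 0:  # no center moved, so the run has converged
            break
    return centers, history

# Hypothetical data and starting centers, for illustration only.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
centers, history = kmeans(points, [(0, 0), (10, 10)])
for i, shifts in enumerate(history, start=1):
    print(i, shifts)
```

Each row printed here plays the role of one row of the Iteration History table: the last row is all zeros, which is the stopping condition.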
Final Cluster Centers
                                           Cluster
                                           1          2          3          4
Zscore: Age                                -.68609    .38755     -.18698    .76265
Zscore: Number of children                 -.63104    -.42300    -.02254    1.53252
Zscore: Average monthly income             3.08493    .54735     -.54605    1.71135
Zscore: Hours reading newspapers per week  1.40767    .81101     -.44528    .89056
Zscore: Hours watching TV per week         -.21379    .38682     -.06739    -.38110
Zscore: Motorcycles owned                  .10511     .83282     -.36789    .73579
Zscore: Cars owned                         1.44152    .43166     -.37325    1.28707
Zscore: Credit/ATM cards owned             1.67616    1.22489    -.56571    .67047
Zscore: Weekly purchase level              .72174     .99182     -.51692    1.26791
Zscore: Monthly spending level             3.11566    .49664     -.53321    1.72813
Zscore: Working hours per week             1.64817    1.23296    -.56762    .67597
Zscore: Hours shopping per week            1.50839    1.23998    -.54583    .53935
ANOVA
                                           Cluster              Error
                                           Mean Square   df     Mean Square   df    F        Sig.
Zscore: Age                                2.400         3      .925          56    2.595    .061
Zscore: Number of children                 4.962         3      .788          56    6.299    .001
Zscore: Average monthly income             16.500        3      .170          56    97.249   .000
Zscore: Hours reading newspapers per week  8.137         3      .618          56    13.173   .000
Zscore: Hours watching TV per week         .982          3      1.001         56    .981     .409
Zscore: Motorcycles owned                  5.720         3      .747          56    7.655    .000
Zscore: Cars owned                         6.811         3      .689          56    9.890    .000
Zscore: Credit/ATM cards owned             13.391        3      .336          56    39.828   .000
Zscore: Weekly purchase level              10.852        3      .472          56    22.982   .000
Zscore: Monthly spending level             16.309        3      .180          56    90.653   .000
Zscore: Working hours per week             13.456        3      .333          56    40.441   .000
Zscore: Hours shopping per week            12.637        3      .377          56    33.555   .000
The F tests should be used only for descriptive purposes because the clusters have been chosen to maximize the
differences among cases in different clusters. The observed significance levels are not corrected for this and thus
cannot be interpreted as tests of the hypothesis that the cluster means are equal.
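Each F value in the ANOVA table is the cluster mean square divided by the error mean square. The short check below uses the Age row. The function name is only illustrative, and because the table prints mean squares rounded to three decimals, other rows may differ slightly from the printed F when recomputed this way.

```python
def f_ratio(cluster_mean_square, error_mean_square):
    """F statistic as the ratio of between-cluster to within-cluster mean square."""
    return cluster_mean_square / error_mean_square

# Age row of the ANOVA table: cluster MS 2.400, error MS .925.
f_age = round(f_ratio(2.400, 0.925), 3)
print(f_age)  # 2.595, matching the F column for Age
```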
Number of Cases in each Cluster
Cluster   1   2.000
          2   13.000
          3   40.000
          4   5.000
Valid         60.000
Missing       .000
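The cluster sizes above translate directly into each cluster's share of the 60 respondents. A small sketch of that arithmetic, using only the counts from the table:

```python
# Cluster sizes from the "Number of Cases in each Cluster" table.
sizes = {1: 2, 2: 13, 3: 40, 4: 5}
total = sum(sizes.values())

# Percentage of all respondents in each cluster, rounded to two decimals.
shares = {cluster: round(100 * n / total, 2) for cluster, n in sizes.items()}
print(total, shares)
```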
Analysis
K-Means creates clusters and reassigns cases to them through an iterative process. The
Iteration History table shows that the cluster centers changed during the first two iterations
and no longer changed in the third, so three iterations were performed. The minimum
distance between the initial centers is 5.634.
The interpretation of the four clusters begins with the variables that distinguish the four
clusters from one another.
This research uses a 5% significance level, as is usual in SPSS. The hypothesis is that there
is no significant difference between clusters if the significance value is > 0.05; a significance
value < 0.05 shows the opposite. The ANOVA table presents the variables that distinguish
the segments.
1. Income, reading hour, motorcycle, car, ATM, purchasing level, spending, working hour,
and shopping have a significance value of .000, which means they differ significantly
between clusters. Clusters 1, 2, 3, and 4 are distinguished by all of these variables.
2. Watching hour has a significance value of .409, which is above 0.05, so it does not
differ significantly between clusters. It is still listed in the comparisons below for
description only.
3. Children has a significance value of .001, which has the same meaning as for the
variables in number 1.
4. Age has a significance value of .061, which means this variable does not differ
significantly between clusters. Hence it is not considered in the rest of the analysis.
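The decision rule can be applied mechanically to the Sig. column of the ANOVA table. The sketch below does so with the English variable labels and a 0.05 threshold. As the SPSS note under the table says, these F tests are descriptive only, so the split is a screening aid and not a formal hypothesis test.

```python
ALPHA = 0.05  # 5% significance level used in this report

# Sig. values copied from the ANOVA table.
sig = {
    "Age": 0.061, "Children": 0.001, "Income": 0.000, "Reading hour": 0.000,
    "Watching hour": 0.409, "Motorcycle": 0.000, "Car": 0.000, "ATM": 0.000,
    "Purchasing level": 0.000, "Spending": 0.000, "Working hour": 0.000,
    "Shopping": 0.000,
}

# Variables whose cluster means differ at the 5% level, and those that do not.
differs = [name for name, p in sig.items() if p < ALPHA]
no_difference = [name for name, p in sig.items() if p >= ALPHA]
print(no_difference)
```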
Based on the table, the highest F value is 97.249, for income. This shows that income
differs the most between clusters.
Next, the cities in each cluster can be analyzed based on the Final Cluster Centers table.
From the table, we can see that eleven variables (age is excluded after the ANOVA result)
form four groups. The numbers in the Final Cluster Centers table are standardized Z-scores,
so a negative value (-) means the cluster is below the total average and a positive value (+)
means it is above. Hence we can see that:
1. Age, number of children, and watching hour are below the total average in cluster 1.
2. Only number of children is below the total average in cluster 2.
3. All variables are negative in cluster 3.
4. Only watching hour has a negative value in cluster 4.
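The sign of each final cluster center can be read off programmatically. The sketch below copies the Final Cluster Centers table into a dictionary (English labels, clusters 1 to 4 in order) and lists, per cluster, the variables with a negative Z-score. Age is included here because it appears in the table.

```python
# Final cluster centers (Z-scores) for clusters 1, 2, 3, 4.
final_centers = {
    "Age":              (-0.68609, 0.38755, -0.18698, 0.76265),
    "Children":         (-0.63104, -0.42300, -0.02254, 1.53252),
    "Income":           (3.08493, 0.54735, -0.54605, 1.71135),
    "Reading hour":     (1.40767, 0.81101, -0.44528, 0.89056),
    "Watching hour":    (-0.21379, 0.38682, -0.06739, -0.38110),
    "Motorcycle":       (0.10511, 0.83282, -0.36789, 0.73579),
    "Car":              (1.44152, 0.43166, -0.37325, 1.28707),
    "ATM":              (1.67616, 1.22489, -0.56571, 0.67047),
    "Purchasing level": (0.72174, 0.99182, -0.51692, 1.26791),
    "Spending":         (3.11566, 0.49664, -0.53321, 1.72813),
    "Working hour":     (1.64817, 1.23296, -0.56762, 0.67597),
    "Shopping":         (1.50839, 1.23998, -0.54583, 0.53935),
}

# For each cluster, collect the variables whose center is below the overall mean (z < 0).
below_average = {
    cluster: [name for name, row in final_centers.items() if row[cluster - 1] < 0]
    for cluster in (1, 2, 3, 4)
}
for cluster, names in below_average.items():
    print(cluster, names)
```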
The following is the analysis of each variable shown in the Final Cluster Centers table:
1. Children: the average is ordered cluster 4 (1.53252) > cluster 3 (-.02254) > cluster 2
(-.42300) > cluster 1 (-.63104). This shows that number of children affects behavior
the most in cluster 4, while people who live in the cities included in cluster 1 are the
least influenced by number of children.
2. Income: cluster 1 (3.08493) > cluster 4 (1.71135) > cluster 2 (.54735) > cluster 3
(-.54605). Consumer behavior depends on income the most in cluster 1, and income
has the least influence in cluster 3.
3. Reading Hour: cluster 1 (1.40767) > cluster 4 (.89056) > cluster 2 (.81101) >
cluster 3 (-.44528). People who live in Jakarta Utara and Jakarta Timur spend the
most time reading, and people who live in the cities included in cluster 3 spend the
least.
4. Watching Hour: cluster 2 (.38682) > cluster 3 (-.06739) > cluster 1 (-.21379) >
cluster 4 (-.38110). People who live in the cities categorized in cluster 2 (Bandung,
Semarang, Jakarta Selatan, Surabaya, Jakarta Barat, Yogya, Solo, Malang) watch the
most, and cluster 4 has the fewest watching hours.
5. Motorcycle: cluster 2 (.83282) > cluster 4 (.73579) > cluster 1 (.10511) > cluster 3
(-.36789). Cluster 2 owns the most motorcycles and cluster 3 owns the fewest.
6. Car: ownership is highest in cluster 1 (1.44152) > cluster 4 (1.28707) > cluster 2
(.43166) > cluster 3 (-.37325).
7. ATM: ownership is highest in cluster 1 (1.67616) > cluster 2 (1.22489) > cluster 4
(.67047) > cluster 3 (-.56571).
8. Purchasing Level: highest in cluster 4 (1.26791) > cluster 2 (.99182) > cluster 1
(.72174) > cluster 3 (-.51692).
9. Spending: highest in cluster 1 (3.11566) > cluster 4 (1.72813) > cluster 2 (.49664) >
cluster 3 (-.53321).
10. Working Hour: highest in cluster 1 (1.64817) > cluster 2 (1.23296) > cluster 4
(.67597) > cluster 3 (-.56762).
11. Shopping: takes the longest time in cluster 1 (1.50839) > cluster 2 (1.23998) >
cluster 4 (.53935) > cluster 3 (-.54583).
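The orderings in this list come from sorting the four final centers of a variable from highest to lowest. A minimal sketch of that ranking follows, shown for three rows of the Final Cluster Centers table. The helper name `rank_clusters` is hypothetical.

```python
def rank_clusters(centers):
    """Return cluster numbers (1-based) ordered from highest to lowest center."""
    order = sorted(range(len(centers)), key=lambda i: centers[i], reverse=True)
    return [i + 1 for i in order]

# Rows copied from the Final Cluster Centers table (clusters 1 to 4).
rows = {
    "Children":      (-0.63104, -0.42300, -0.02254, 1.53252),
    "Income":        (3.08493, 0.54735, -0.54605, 1.71135),
    "Watching hour": (-0.21379, 0.38682, -0.06739, -0.38110),
}
rankings = {name: rank_clusters(values) for name, values in rows.items()}
for name, ranking in rankings.items():
    print(name, " > ".join(f"cluster {c}" for c in ranking))
```

Running the same helper over every row gives a quick consistency check on the printed orderings and on which cluster leads each variable.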
Conclusion
Based on the data above, we can conclude that the manager can divide consumers by
behavior into four groups: 3.33% in cluster 1 (Jakarta Utara and Jakarta Timur), 21.67% in
cluster 2 (Surabaya, Jakarta Selatan, Jakarta Timur, Jakarta Barat, and Bandung), 66.67% in
cluster 3 (Tegal, Yogya, Solo, Banjarnegara, Madiun, Pekalongan, Jepara, Blora, Karawang,
Magelang, Parakan, Tuban, Ciamis, Pati, Cepu, Wonogiri, Pacitan, Malang, Caruban,
Mojokerto, Jombang, Kebumen, Kutoarjo, Purworejo, Cirebon, Tasikmalaya, Bogor, Jakarta
Barat, Jakarta Timur, Bekasi, Ambarawa, Purwodadi, Tangerang, Sidoarjo), and 8.33% in
cluster 4 (Bandung, Jakarta Barat, Jakarta Timur, Jakarta Selatan, Surabaya).
In summary, cluster 1 has the highest values for income, reading hour, car, ATM, spending,
working hour, and shopping. Cluster 2 has the highest values for watching hour and
motorcycle. Cluster 3 has no highest value in any variable and is in last place for income,
reading hour, motorcycle, car, ATM, purchasing level, spending, working hour, and
shopping. Finally, cluster 4 is highest in children and purchasing level, and second highest
in spending.
It is reasonable that cluster 1 has the highest values for almost everything needed for living
in the city, while cluster 3 consists mostly of districts whose residents do not weigh these
things as heavily as people in clusters 1, 2, and 4, owing to socio-demographic conditions.
People in cluster 4, on the other hand, tend to live in provincial capitals, which offer high
salaries and many attractions such as shopping malls and restaurants. This leads to high
spending and purchasing in those areas, driven by the place of residence and supporting
factors such as salary. Groups 1, 2, and 4 therefore fall into a higher economic
classification than group 3.
Therefore, the researcher suggests that the marketing manager give more exposure to
consumer groups 1, 2, and 4, as they have the highest values in the variables that matter
most for shopping activity. The manager can target these three groups for different purposes.
Marketing managers would likely focus on cluster 1 if they want to achieve higher profit,
keeping in mind that this group reads a lot. They can focus on cluster 2 if they want to
promote practicality through advertising media, and on cluster 4 if mother-care supplies are
the main purpose (new mothers are likely to shop more for baby needs). Although group 3
has the lowest results, managers can still turn this to their advantage by focusing on product
function, as this group does not pay much attention to promotions. Managers can make the
most of children and watching hour in this group: TV advertising can be the best alternative
for brand exposure and for selling more children's products.
