
Cluster Analysis on PCA on Wholesale Customers data

PGP/24/111
Sarbani Mishra

Principal Component Analysis:

PCA is an exploratory data analysis technique that is used to reduce the number of features.
To decrease complexity, features are grouped together based on the loading (correlation) of
each item with each component.
In our case, PCA was done with six features: Milk, Grocery, Detergent, Fresh, Frozen and
Delicatessen.

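Deletion criteria:
1. A component is dropped if its eigenvalue is less than 1.
2. An item is dropped if its communality (the sum of its squared loadings along its row) is
less than 0.4.
3. An item is dropped if its loading on every group is less than 0.4.
4. An item is dropped if cross-loading is present, i.e. it loads strongly on more than one
group.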
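
The item-level criteria (2 to 4) can be checked mechanically from a loadings table. The sketch below is in Python rather than the R used for the analysis. The loadings are made-up illustrative numbers, not the values from this report, and the function name items_to_drop and the use of 0.4 as the cross-loading cutoff are assumptions.

```python
# Hypothetical loadings (item -> loadings on two components); NOT the report's values.
loadings = {
    "Milk":  [0.85, 0.10],
    "Fresh": [0.05, 0.80],
    "ItemX": [0.30, 0.25],  # weak on both components
    "ItemY": [0.60, 0.55],  # loads on both components
}

def items_to_drop(loadings, cutoff=0.4):
    """Return {item: reason} for items that fail criteria 2 to 4."""
    flagged = {}
    for item, row in loadings.items():
        communality = sum(l * l for l in row)  # sum of squared loadings along the row
        strong = [l for l in row if abs(l) >= cutoff]
        if communality < cutoff:
            flagged[item] = "low communality"
        elif not strong:
            flagged[item] = "no loading reaches the cutoff"
        elif len(strong) > 1:
            flagged[item] = "cross-loading"
    return flagged

print(items_to_drop(loadings))
```

With these made-up numbers, ItemX is flagged for low communality and ItemY for cross-loading, while Milk and Fresh are kept.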

From the elbow diagram (scree plot), it is evident that keeping more than 2 groups would
include components with eigenvalues less than 1, and hence nfactors=2 (number of groups = 2)
is taken.
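
As a rough illustration of this rule, the Python sketch below counts the components whose eigenvalue is at least 1 and computes the share of variance they explain. It assumes standardized variables, so the eigenvalues of the correlation matrix sum to the number of features. The eigenvalues are hypothetical, chosen only so that they sum to 6, and are not read from the report's elbow diagram.

```python
# Hypothetical eigenvalues of a 6-variable correlation matrix (they sum to 6);
# illustrative only, not the values behind the report's elbow diagram.
eigenvalues = [2.5, 1.5, 0.8, 0.6, 0.4, 0.2]

def kaiser_count(eigenvalues):
    """Number of components kept under the 'eigenvalue >= 1' rule."""
    return sum(1 for ev in eigenvalues if ev >= 1)

def explained_variance(eigenvalues, k):
    """Share of total variance explained by the k largest components."""
    ordered = sorted(eigenvalues, reverse=True)
    return sum(ordered[:k]) / sum(ordered)

n_keep = kaiser_count(eigenvalues)
print(n_keep, round(explained_variance(eigenvalues, n_keep), 3))
```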

Using this feature-reduction method, 2 groups were made:

RC1 - Milk + Grocery + Detergent
RC2 - Fresh + Frozen + Delicatessen
Total cumulative variance: 72% is explained by RC1 and RC2 together,
where the values (scores) of RC1 and RC2 are weighted linear combinations of the features.
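
To make "weighted linear combination" concrete, here is a minimal Python sketch that standardizes each feature and sums the weighted z-scores per customer. The spend figures and weights are hypothetical, the function names are illustrative, and the weights actually used by the software may differ from the raw loadings.

```python
from statistics import mean, pstdev

def standardize(column):
    """Z-score a list of numbers (population standard deviation)."""
    m, s = mean(column), pstdev(column)
    return [(x - m) / s for x in column]

def component_scores(data, weights):
    """data: {feature: [values]}, weights: {feature: weight} -> one score per row."""
    z = {f: standardize(col) for f, col in data.items()}
    n_rows = len(next(iter(data.values())))
    return [sum(weights[f] * z[f][i] for f in weights) for i in range(n_rows)]

# Hypothetical spend figures and weights, for illustration only.
data = {"Milk": [10.0, 20.0, 30.0], "Grocery": [5.0, 15.0, 25.0]}
weights = {"Milk": 0.5, "Grocery": 0.5}
print(component_scores(data, weights))
```

A customer who spends more than average on the heavily weighted features gets a positive score, and one who spends less gets a negative score.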

Varimax: This rotation is used to get independent (orthogonal) groups, i.e. RC1 and RC2 are
each dominated by a different set of features.
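
The idea behind varimax can be sketched for the two-group case: rotate the loadings so that the variance of the squared loadings within each group is as large as possible, which pushes each feature towards one group. The Python sketch below uses a brute-force search over the rotation angle, which is an assumption made for readability; standard implementations use an iterative algorithm instead. The starting loadings are hypothetical.

```python
import math

def rotate(loadings, angle):
    """Apply an orthogonal rotation by `angle` to two-column loadings."""
    c, s = math.cos(angle), math.sin(angle)
    return [(a * c + b * s, -a * s + b * c) for a, b in loadings]

def varimax_criterion(loadings):
    """Sum over the two groups of the variance of the squared loadings."""
    total = 0.0
    for j in range(2):
        sq = [row[j] ** 2 for row in loadings]
        m = sum(sq) / len(sq)
        total += sum((v - m) ** 2 for v in sq) / len(sq)
    return total

def naive_varimax(loadings, steps=9000):
    """Brute-force search over angles in [0, pi/2) for the largest criterion."""
    angles = (k * (math.pi / 2) / steps for k in range(steps))
    best = max(angles, key=lambda ang: varimax_criterion(rotate(loadings, ang)))
    return rotate(loadings, best)

# Hypothetical unrotated loadings: every item loads on both groups.
unrotated = [(0.7, 0.5), (0.8, 0.4), (0.6, -0.6), (0.5, -0.7)]
rotated = naive_varimax(unrotated)
print(varimax_criterion(unrotated), varimax_criterion(rotated))
```

Because the rotation is orthogonal, each item's communality is unchanged; only the way it is split between the two groups changes.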

Algorithm:
1. Import data - We used the wholesale customers data, which is analysed to create the
basket.
2. Creating test data and train data - We divided the data 80-20 into training and test
sets, for cross-validation after the training.
3. Analysis using the "psych" package - We used the R package psych, which is designed for
psychological analysis and also supports Principal Component Analysis (PCA). PCA was
used to determine the eigenvalues and communalities, in order to understand which
products commonly go into the basket together.
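
Step 2 can be sketched as a seeded shuffle followed by a cut at 80%. This is a Python illustration, not the code used for the report; the function name, the seed and the 100 placeholder row ids are assumptions.

```python
import random

def train_test_split(rows, train_fraction=0.8, seed=42):
    """Shuffle a copy of `rows` and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

customers = list(range(100))  # hypothetical row ids, not the real data
train, test = train_test_split(customers)
print(len(train), len(test))
```

Fixing the seed keeps the same customers in the training set on every run, so the PCA and clustering results can be reproduced.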

Observation: In the first component, Detergents_Paper, Milk and Grocery are the original
features with the strongest correlations. In the second component, Fresh and Frozen
have the strongest correlations.

In the above figure, nfactors=2 (number of groups = 2), so the features are combined into
2 groups.

Fig 2

In the above figure, nfactors=3 (number of groups = 3), so the features are combined into
3 groups. However, the third group, RC3, contains only one feature (Delicatessen). Although
this solution explains more variance than nfactors=2, the two-group analysis is more
wholesome.

Finally, the RC1 and RC2 scores are stored in the test and training data.

Hierarchical Analysis on PCA:

With the reduced features, cluster analysis is done using hierarchical clustering. This is
done to analyse the target customers. Basically, clusters 1 and 2 are customer groups formed
on the basis of the basket each customer has chosen.
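
A minimal Python sketch of agglomerative (bottom-up hierarchical) clustering on (RC1, RC2) scores is given below. The report does not state which linkage or distance was used, so complete linkage with Euclidean distance is an assumption, and the six points are hypothetical scores rather than the report's data.

```python
import math

def agglomerative(points, n_clusters):
    """Naive complete-linkage clustering; returns clusters as lists of point indices."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Complete linkage: the largest pairwise distance between two clusters.
        return max(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest linkage distance and merge it.
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical (RC1, RC2) scores: four customers near the origin, two high on RC2.
scores = [(0.1, 0.0), (0.2, -0.1), (-0.1, 0.1), (0.0, 0.2), (0.3, 4.0), (0.1, 4.5)]
clusters = agglomerative(scores, n_clusters=2)
print([len(c) for c in clusters])
```

Counting the members of each returned cluster gives the kind of membership table that the report shows for its two customer groups.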

From the below figure, it was evident that the number of customer clusters should be 2.
From the below diagram, it was evident that most of the customers lie in cluster 1, which
consists of 349 of the 352 customers. Due to this bias, cluster 1 has far more members.

No. of members in the Cluster:

Correlations between RC1 and RC2 to its customers

As the number of data points (customer transactions) for Group 2 is lower and the group
shows high affinity towards RC2, it can be inferred that Group 2 mainly preferred the RC2
basket of Fresh + Frozen + Delicatessen.
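
One simple way to read a group's affinity is to average the RC1 and RC2 scores within each cluster. The Python sketch below does this for hypothetical scores and labels; the function name cluster_profile is illustrative and the numbers are not the report's results.

```python
from collections import defaultdict
from statistics import mean

def cluster_profile(scores, labels):
    """Mean (RC1, RC2) score for each cluster label."""
    grouped = defaultdict(list)
    for point, label in zip(scores, labels):
        grouped[label].append(point)
    return {label: (mean(p[0] for p in pts), mean(p[1] for p in pts))
            for label, pts in grouped.items()}

# Hypothetical scores and cluster labels, for illustration only.
scores = [(0.0, -0.5), (0.2, 0.5), (0.4, 3.0), (0.6, 5.0)]
labels = [1, 1, 2, 2]
print(cluster_profile(scores, labels))
```

A cluster whose mean RC2 score is well above its mean RC1 score, like cluster 2 in this made-up example, leans towards the RC2 basket.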
