Professional Documents
Culture Documents
An Introduction To Data Mining: Prof. S. Sudarshan CSE Dept, IIT Bombay
An Introduction To Data Mining: Prof. S. Sudarshan CSE Dept, IIT Bombay
Prof. S. Sudarshan
CSE Dept, IIT Bombay
❚ Predictive:
❙ Regression
❙ Classification
❙ Collaborative Filtering
❚ Descriptive:
❙ Clustering / similarity matching
❙ Association rules and variants
❙ Deviation detection
Classification
(Supervised learning)
Classification
• Pros • Cons
+ Fast training – Slow during application.
– No feature selection.
– Notion of proximity vague
Decision trees
❚ Hierarchical clustering
❙ agglomerative Vs divisive
❙ single link Vs complete link
❚ Partitional clustering
❙ distance-based: K-means
❙ model-based: EM
❙ density-based:
Agglomerative
Hierarchical clustering
Rangeela
Anand QSQTQSQT Rangeela
100 daysAnand Sholay
100 days VertigoDeewar
DeewarVertigo
Sholay
Smita
Vijay
Vijay
Rajesh
Mohan
Rajesh
Nina
Nina
Smita
Nitin ? ? ? ? ? ? ? ? ?? ??
Model-based approach
Industry Application
Finance Credit Card Analysis
Insurance Claims, Fraud Analysis
Telecommunication Call record analysis
Transport Logistics management
Consumer goods promotion analysis
Data Service providersValue added data
Utilities Power usage analysis
Why Now?