DTminer - Presentation
BY
Tapomay Dey,
Tapomay.dey@gmail.com,
9960156254,
Final year, B.E. computers.
1. Data mining.
2. Decision tree learning.
3. Decision tree.
4. Supervised data.
Data mining is the analysis of (often large) observational data
sets to find unsuspected relationships and to summarize the
data in novel ways that are both understandable and useful to
the data owner.
'Decision tree learning is a method for approximating discrete-
valued target functions, in which the learned function is
represented by a decision tree. Decision tree learning is one of
the most widely used and practical methods for inductive
inference'. (Tom M. Mitchell, 1997, p. 52)
A decision tree is a tree in which each branch node represents
a choice between a number of alternatives (attributes), and
each leaf node represents a decision (class).
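As a rough illustration of this definition, the sketch below models branch and leaf nodes and walks an example from the root to a leaf. The class names (`Leaf`, `Branch`), the `classify` helper, and the toy "outlook/windy" tree are all hypothetical; they are not taken from DTminer.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Leaf:
    """Leaf node: holds the decision (class label)."""
    label: str


@dataclass
class Branch:
    """Branch node: tests one attribute, with one child per attribute value."""
    attribute: str
    children: Dict[str, Union["Branch", Leaf]] = field(default_factory=dict)


def classify(node, example):
    """Follow the example's attribute values from the root down to a leaf."""
    while isinstance(node, Branch):
        node = node.children[example[node.attribute]]
    return node.label


# Hypothetical toy tree: decide whether to play outside.
tree = Branch("outlook", {
    "sunny": Branch("windy", {"yes": Leaf("stay in"), "no": Leaf("play")}),
    "rainy": Leaf("stay in"),
})

print(classify(tree, {"outlook": "sunny", "windy": "no"}))  # play
```

An attribute value with no matching child raises a `KeyError` here; a real classifier would need a policy for unseen values.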
Supervised data is represented by (x, y) pairs, where x is
data about an entity and y is its class.
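A small sketch of what such (x, y) pairs might look like in practice. The attribute names, values, and the `class_counts` helper are made up for illustration only.

```python
from collections import Counter

# Each record is an (x, y) pair: x describes the entity, y is its class.
# The attribute names and values are invented for this example.
training_data = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "stay in"),
    ({"outlook": "rainy", "windy": "no"}, "stay in"),
]


def class_counts(data):
    """Count how many examples fall into each class y."""
    return Counter(y for _, y in data)


print(class_counts(training_data))
```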
[Figure: block diagram. Recoverable labels: Database, Schema, Attribute values, Data preprocessing, Class attribute, Data structure storage, Decision tree generator (ID3), Decision tree, Classifier, Future data sets, Class, Accuracy.]
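The diagram names a classifier and an accuracy measure but does not say how accuracy is computed. A common choice, assumed here, is the fraction of labelled examples that the classifier labels correctly. The `accuracy` function, the stand-in `toy_classifier`, and the data below are hypothetical.

```python
def accuracy(classifier, labelled_examples):
    """Fraction of (x, y) examples for which classifier(x) equals y."""
    if not labelled_examples:
        raise ValueError("need at least one labelled example")
    correct = sum(1 for x, y in labelled_examples if classifier(x) == y)
    return correct / len(labelled_examples)


# Hypothetical stand-in classifier: a single hard-coded rule.
def toy_classifier(x):
    return "play" if x["outlook"] == "sunny" else "stay in"


# Invented labelled examples kept aside for evaluation.
held_out = [
    ({"outlook": "sunny"}, "play"),
    ({"outlook": "rainy"}, "stay in"),
    ({"outlook": "sunny"}, "stay in"),
    ({"outlook": "rainy"}, "stay in"),
]

print(accuracy(toy_classifier, held_out))  # 0.75
```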
ALGORITHMS USED – ID3
1. Input
2. Basic concepts and measures used
3. Strategy
4. Output
5. Constraints
6. Future enhancements
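The slide lists only headings, so the following is a minimal sketch of textbook ID3 as described by Mitchell (1997), not necessarily what DTminer implements. It assumes discrete-valued attributes, uses entropy and information gain as the measures, and greedily splits on the attribute with the highest gain. The function names, the nested-dict output format, and the training set are assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(examples, attribute):
    """Entropy reduction from splitting (x, y) examples on one attribute."""
    labels = [y for _, y in examples]
    subsets = {}
    for x, y in examples:
        subsets.setdefault(x[attribute], []).append(y)
    remainder = sum(len(s) / len(examples) * entropy(s)
                    for s in subsets.values())
    return entropy(labels) - remainder


def id3(examples, attributes):
    """Return a nested-dict tree; leaves are class labels."""
    labels = [y for _, y in examples]
    if len(set(labels)) == 1:      # pure node: every example has the same class
        return labels[0]
    if not attributes:             # nothing left to split on: majority class
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: information_gain(examples, a))
    rest = [a for a in attributes if a != best]
    tree = {best: {}}
    for value in sorted({x[best] for x, _ in examples}):
        subset = [(x, y) for x, y in examples if x[best] == value]
        tree[best][value] = id3(subset, rest)
    return tree


# Invented training set for illustration.
data = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "stay in"),
    ({"outlook": "rainy", "windy": "no"}, "stay in"),
    ({"outlook": "rainy", "windy": "yes"}, "stay in"),
    ({"outlook": "overcast", "windy": "no"}, "play"),
    ({"outlook": "overcast", "windy": "yes"}, "play"),
]
tree = id3(data, ["outlook", "windy"])
print(tree)
```

On this data "outlook" has the higher information gain, so it becomes the root; only the "sunny" branch needs a further split on "windy". The sketch handles neither continuous attributes nor missing values, which is the usual constraint of plain ID3.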
REQUIREMENTS - 3 SUN
TECHNOLOGIES USED:
Glassfish
Netbeans IDE
MySQL
REFERENCES:
Tom M. Mitchell, (1997). Machine Learning,
Singapore, McGraw-Hill.
Paul E. Utgoff and Carla E. Brodley, (1990). 'An
Incremental Method for Finding Multivariate Splits
for Decision Trees', Machine Learning: Proceedings
of the Seventh International Conference, (pp. 58).
Palo Alto, CA: Morgan Kaufmann.
MIT OCW, 15.062 Data Mining, Spring 2003.
Wei Peng, Juhua Chen and Haiping Zhou. 'An
Implementation of ID3 --- Decision Tree Learning
Algorithm', Machine Learning, University of New South
Wales, School of Computer Science & Engineering,
Sydney, NSW 2032, Australia.