
DTminer - a webapp

BY
Tapomay Dey,
Tapomay.dey@gmail.com,
9960156254,
Final year, B.E. computers.

PUNE INSTITUTE OF COMPUTER TECHNOLOGY
DHANKAWADI, PUNE – 43.
DEPARTMENT: COMPUTER ENGINEERING
ACADEMIC YEAR: 2008-09
PROBLEM DEFINITION:
The input is supervised historical data. The system
must learn from this data to generate a
decision tree.
The decision tree classifies an instance by
traversing from the root node to a leaf node. We start
at the root node, test the attribute specified by
that node, and then move down the branch that
matches the instance's value for that attribute.
This process is then repeated at the sub-tree
level until a leaf node (class prediction) is reached.
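
The traversal described above can be sketched in a few lines of Python. This is only an illustration, not DTminer's actual code: the nested-dictionary tree layout, the `classify` function, and the weather-style attribute names are all assumptions made for the example.

```python
# Hypothetical tree layout (an assumption for this sketch):
# an internal node is {"attribute": name, "branches": {value: subtree}};
# a leaf is simply the predicted class label (a string).
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": {
            "attribute": "humidity",
            "branches": {"high": "no", "normal": "yes"},
        },
        "overcast": "yes",
        "rain": {
            "attribute": "wind",
            "branches": {"strong": "no", "weak": "yes"},
        },
    },
}


def classify(node, instance):
    """Walk from the root to a leaf, following the branch that matches
    the instance's value for the attribute tested at each node."""
    while isinstance(node, dict):
        value = instance[node["attribute"]]
        node = node["branches"][value]
    return node  # leaf reached: this is the class prediction


prediction = classify(
    tree, {"outlook": "sunny", "humidity": "normal", "wind": "weak"}
)
```

In this sketch an attribute value that has no matching branch raises a `KeyError`; a real system would need a policy for unseen values, such as falling back to the majority class.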
NEED & APPLICATION
FINANCIAL INSTITUTIONS
GENOMICS SUPERCOMPUTER
DISEASE RECOGNITION
TELECOMMUNICATIONS
SUPER MARKETS
ETC.
BASIC CONCEPTS:

1. Data mining.
2. Decision tree learning.
3. Decision tree.
4. Supervised data.
• Data mining is the analysis of (often large) observational data
sets to find unsuspected relationships and to summarize the
data in novel ways that are both understandable and useful to
the data owner.
• 'Decision tree learning is a method for approximating
discrete-valued target functions, in which the learned function is
represented by a decision tree. Decision tree learning is one of
the most widely used and practical methods for inductive
inference'. (Tom M. Mitchell, 1997, p. 52)
• A decision tree is a tree in which each branch node represents
a choice between a number of alternatives (attributes), and
each leaf node represents a decision (class).
• Supervised data is represented as (x, y) pairs, where x is
the data describing an entity and y is its class.
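
As a small illustration of the (x, y) representation, the Python sketch below stores each record as a pair of an attribute dictionary and a class label. The attribute names and values are invented for the example and are not taken from DTminer's data.

```python
# Each record is an (x, y) pair: x maps attribute names to values,
# y is the class. All names and values here are made up for illustration.
training_data = [
    ({"outlook": "sunny", "wind": "weak"}, "no"),
    ({"outlook": "overcast", "wind": "strong"}, "yes"),
    ({"outlook": "rain", "wind": "weak"}, "yes"),
]

# Separate the entity descriptions from their class labels.
instances = [x for x, _ in training_data]
labels = [y for _, y in training_data]

# The distinct classes the learner has to choose between.
class_values = sorted(set(labels))
```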
[Figure: system architecture. Labelled components: DATABASE, SCHEMA, ATTRIBUTE VALUES, DATA PREPROCESSING, Class attribute, DECISION TREE GENERATOR (ID3), DATA STRUCTURE STORAGE, Future Data Sets, DECISION TREE CLASSIFIER, CLASS, ACCURACY.]

ALGORITHMS USED – ID3

1. Input
2. Basic concepts and measures used
3. Strategy
4. Output
5. Constraints
6. Future enhancements
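
The slide gives only an outline, so the sketch below falls back on the standard textbook form of ID3 (as in Mitchell, 1997) rather than DTminer's own implementation. It assumes the usual measures: entropy H(S) = -Σ p_i log2 p_i over the class proportions p_i, and information gain Gain(S, A) = H(S) - Σ_v (|S_v| / |S|) H(S_v). The function names, the nested-dictionary tree layout, and the toy data are all assumptions made for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (n / total) * math.log2(n / total) for n in Counter(labels).values()
    )


def information_gain(rows, labels, attribute):
    """Reduction in entropy obtained by splitting on `attribute`."""
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attribute], []).append(label)
    remainder = sum(len(s) / len(labels) * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder


def id3(rows, labels, attributes):
    """Return a class label (leaf) or a node of the form
    {"attribute": name, "branches": {value: subtree}}."""
    if len(set(labels)) == 1:      # all examples share one class: leaf
        return labels[0]
    if not attributes:             # nothing left to test: majority class
        return Counter(labels).most_common(1)[0][0]
    # Greedy step: pick the attribute with the highest information gain.
    best = max(attributes, key=lambda a: information_gain(rows, labels, a))
    remaining = [a for a in attributes if a != best]
    node = {"attribute": best, "branches": {}}
    for value in sorted({row[best] for row in rows}):
        pairs = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        sub_rows = [r for r, _ in pairs]
        sub_labels = [l for _, l in pairs]
        node["branches"][value] = id3(sub_rows, sub_labels, remaining)
    return node


# Toy training data, invented for this example.
rows = [
    {"outlook": "sunny", "wind": "weak"},
    {"outlook": "sunny", "wind": "strong"},
    {"outlook": "overcast", "wind": "weak"},
    {"outlook": "rain", "wind": "weak"},
    {"outlook": "rain", "wind": "strong"},
]
labels = ["no", "no", "yes", "yes", "no"]
tree = id3(rows, labels, ["outlook", "wind"])
```

On this toy data the root test is `outlook`, because splitting on it leaves only the `rain` subset impure. The sketch handles discrete attribute values only and does no pruning, which matches basic ID3 but may differ from what DTminer does.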
REQUIREMENTS - 3 SUN TECHNOLOGIES USED:
Glassfish
Netbeans IDE
MySQL
REFERENCES:
Tom M. Mitchell (1997). Machine Learning.
Singapore: McGraw-Hill.
Paul E. Utgoff and Carla E. Brodley (1990). 'An
Incremental Method for Finding Multivariate Splits
for Decision Trees', Machine Learning: Proceedings
of the Seventh International Conference (pp. 58).
Palo Alto, CA: Morgan Kaufmann.
MIT OCW, 15.062 Data Mining, Spring 2003.
Wei Peng, Juhua Chen and Haiping Zhou. 'An
Implementation of ID3: Decision Tree Learning
Algorithm', Machine Learning, University of New South Wales,
School of Computer Science & Engineering, Sydney,
NSW 2032, Australia.
