Professional Documents
Culture Documents
Lect12-Rule Based Classifier
Lect12-Rule Based Classifier
Rule-Based Classifier
• Classify records by using a collection of “if … then …” rules
• Rule: (Condition) y
– where
• Condition is a conjunction of some attribute tests
• y is the class label
(Status=Single) No
Coverage = 40%, Accuracy = 50%
Decision Trees vs. rules
From trees to rules.
• Easy: converting a tree into a set of rules
– One rule for each leaf
– Antecedent contains a condition for every node on the path from
the root to the leaf
– Consequent is the class assigned by the leaf
– Straightforward, but rule set might be overly complex
Decision Trees vs. rules
From rules to trees
• More difficult: transforming a rule set into a tree
– Tree cannot easily express disjunction among
different rules
• Example:
If a and b
then x If c
and d then x
• Exhaustive rules
– There exists a rule for each combination of attribute values.
– This ensures that every record is covered by at least one rule.
• Direct Method
Extract rules directly from data
e.g.: RIPPER, Holte’s 1R (OneR)
• Indirect Method
Extract rules from other classification
models
(e.g. decision trees, etc). e.g:
C4.5rules
A Direct Method: Sequential Covering
1. Start
from an empty rule {} class = C
2. Grow a rule by adding a test to LHS (a = v)
3. Repeat Step (2) until stopping criterion is met
Two issues:
– How to choose the best test? Which attribute to
choose?
– When to stop building a rule?
Example: Generating a rule
b a
b b b a b a b a a
a y b a y b b b
b b a a b b
a a a a
b b a a b b
y b b
b b b 2·6
b
b a b b b b
b b a b b a b
b b b
b b b b b
x x
x 1·2 1·2
space of
examples
• t total number of instances covered by rule
rule so far
• p positive examples of the class predicted
by rule rule after
• t – p number of errors made by rule adding new
term
•Rules accuracy = p/t
When to Stop Building a Rule
Accuracy of
candidate rules
The rule isn’t very accurate, getting only 4 out of 12 that it covers.
So, it needs further refinement.
Further refinement
Modified rule and resulting data
Should we stop here? Perhaps. But let’s say we are going for exact
rules, no matter how complex they become.
So, let’s refine further.
Further refinement