Chapter 4: Machine Learning


Can Machines Learn?
 Learning? ~ to improve automatically with experience.
 We do not yet know how to make computers learn nearly as well as people learn ~ machines and humans are two different things.
 "Concept" ~ humans learn from their experience (trial and error, or being guided, like an infant or a student).
 Example: a baby keeps attempting to walk after falling down several times. Pain, and learning how to balance, are the best guidance.
 Problem ~ how can a machine learn these? Do we need to fit sensory devices to detect pain? How would we represent pain ~ as an "electronic pulse" of pain?
Machine Learning
 Machine learning ~ draws on concepts from statistics, artificial intelligence, philosophy, information theory, biology, cognitive science, computational complexity and control theory (and many more!).
 How does a machine learn? A computer is said to learn from experience E with respect to tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.
 Need a well-defined learning problem, based on three features: the class of tasks, the measure of performance and the source of experience.
Well-Posed Learning Problems
 A checkers learning problem:
 Task T: playing checkers.
 Performance measure P: percent of games won against opponents.
 Training experience E: playing practice games against itself.
 A handwriting recognition problem:
 Task T: recognizing and classifying handwritten words within images.
 Performance measure P: percent of words correctly classified.
 Training experience E: a database of handwritten word images.
General Machine Learning Model

[Figure: a database of training experience feeds a machine learning algorithm; the algorithm's performance is evaluated, its learning parameters are adjusted based on the performance results, and the trained model is then applied to new problems.]
Designing a Learning System
• Basic four steps to design a learning system:

• Determine the type of training experience ~ "what is the best training approach?"
• Determine the target function ~ "what is to be solved, and how can the performance be evaluated?"
• Determine the representation of the learned function ~ "how to represent the learning process?"
• Determine the learning algorithm ~ "how will the learning process take place?"
Types of Machine Learning
•Basic machine learning methods:
•Concept learning
•Decision tree learning
•Supervised and unsupervised learning
•Statistical learning & computational learning theory
•Instance based learning
•Explanation based learning
•Evolutionary learning
•Reinforcement learning
1) Decision Tree Learning
 One of the most widely used and practical methods of classification.
 Approximates discrete-valued functions ~ using "divide and conquer".
 Concept: node → "tests an attribute"; branch → "a possible value of the attribute".
 Classification method → an instance is sorted down the tree from the root to a particular leaf node.
 Decision tree algorithms: ID3, ASSISTANT, C4.5, C5.0, CART, SPRINT.
Decision Tree Learning
[Figure: an example decision tree. The root node tests Gender; the Male and Female branches each lead to a Height test node whose branches (thresholds such as <1.3m, <1.5m, >1.8m, >2.0m) end in leaf nodes labelled Short, Medium or Tall.]
Example: Decision Tree Learning

ID | Refund | Marital  | Tax | Cheat
1  | Yes    | Single   | 125 | No
2  | No     | Married  | 100 | No
3  | No     | Single   | 70  | No
4  | Yes    | Married  | 120 | No
5  | No     | Divorced | 95  | Yes

The learned tree:

REFUND = Yes → No
REFUND = No → test MARITAL:
    MARITAL = Married → No
    MARITAL = Single or Divorced → test TAX:
        TAX < 80 → No
        TAX > 80 → Yes
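As an illustration (an addition, not part of the original slides), here is a minimal sketch that fits the table above using scikit-learn's DecisionTreeClassifier. The one-hot encoding of the categorical attributes is an assumption of this sketch, and with only five rows the induced tree may come out simpler than the one drawn above.

```python
# A minimal sketch: learning a tree from the Refund/Marital/Tax table above.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

data = pd.DataFrame({
    "Refund":  ["Yes", "No", "No", "Yes", "No"],
    "Marital": ["Single", "Married", "Single", "Married", "Divorced"],
    "Tax":     [125, 100, 70, 120, 95],
    "Cheat":   ["No", "No", "No", "No", "Yes"],
})

# One-hot encode the categorical attributes; Tax is already numeric.
X = pd.get_dummies(data[["Refund", "Marital", "Tax"]])
y = data["Cheat"]

tree = DecisionTreeClassifier(criterion="entropy").fit(X, y)
print(export_text(tree, feature_names=list(X.columns)))
```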
Why Decision Tree ?
•Advantages:
•Easy to use and efficient.
•Tree structures are easy to interpret and understand.
•Direct representation of the learned rules.
•Disadvantages:
•Do not easily handle continuous data.
•Difficult to handle missing data.
•Correlations between attributes are ignored by the decision tree process.
•Subtrees may be replicated, making the tree unnecessarily large.
2) Instance Based Learning
•Instance-based learning ~ a straightforward approach to approximating the target function.
•Basic idea: when a new query instance is encountered, a set of similar related instances is retrieved from memory and used to classify the new instance.
•Sometimes referred to as a "lazy learner" ~ the learning process takes place only when a new instance must be classified.
•Main concept ~ find the nearest existing example that might be similar to the new one!
•Common methods ~ K-Nearest Neighbor and Case-Based Reasoning.
K-Nearest Neighbor
•A "lazy learner" method: it requires comparison with the training set and is primarily based on the "nearest" distance.
•Calculate the similarity of two points (the new data and the training data).
•The training point with the lowest distance is voted as the neighbor of the new data, assigning it to the respective class.
•Assumption: if the nearest neighbor's class is A, then the class of the new data is also A.

[Figure: a new point lying between two clusters, GROUP A and GROUP B ~ "Group A or B?"]
K-Nearest Neighbor (cont)

•Assumption: for n-dimensional Euclidean space, the distance between two points, x = (x1, x2, ..., xn) and y = (y1, y2, ..., yn), is defined through:

Euclidean distance: $E_d = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$

Manhattan distance: $M_d = \sum_{i=1}^{n} |x_i - y_i|$

Minkowski distance: $Mink_d = \left( \sum_{i=1}^{n} |x_i - y_i|^q \right)^{1/q}$
K-Nearest Neighbor (cont)
Attributes | X1 | X2 | X3 | CLASS
A          | 5  | 1  | 3  | GOOD
B          | 3  | 1  | 3  | GOOD
C          | 4  | 1  | 5  | BAD

•Given a new set of data, D = (2, 1, 3), find the possible class for D using KNN (K = 1).

$d(D,A) = \sqrt{(2-5)^2 + (1-1)^2 + (3-3)^2} = 3$
$d(D,B) = \sqrt{(2-3)^2 + (1-1)^2 + (3-3)^2} = 1$
$d(D,C) = \sqrt{(2-4)^2 + (1-1)^2 + (3-5)^2} = 2.83$

•Based on the calculation, the distance d(D,B) is the minimum. Since K = 1, only one neighbor is involved, therefore dataset D is classified as GOOD (based on dataset B).
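The same classification can be reproduced in a few lines. This sketch (an addition, not from the slides) uses NumPy to compute the Euclidean distances and pick the single nearest neighbour.

```python
# A minimal sketch: 1-NN classification of D = (2, 1, 3) against A, B, C.
import numpy as np

train = np.array([[5, 1, 3],   # A
                  [3, 1, 3],   # B
                  [4, 1, 5]])  # C
labels = ["GOOD", "GOOD", "BAD"]
D = np.array([2, 1, 3])

dist = np.sqrt(((train - D) ** 2).sum(axis=1))  # [3.0, 1.0, 2.83]
print(labels[int(dist.argmin())])               # GOOD (nearest is B)
```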
K-Nearest Neighbor (cont)
•Advantages:
•Easy to program.
•No optimization / training is required.
•Incremental learning (information is retained).
•Robust to noisy data (only the nearest data are involved).
•Disadvantages:
•Exhaustive learning (more data, more memory).
•"Curse of dimensionality" → what happens if the dimension is too big, or infinite?
Case Based Reasoning
Uses various techniques to match a situation or a problem description with the most similar stored cases → similarity assessment.

•Definition (Schank and Abelson, 1977): a "technique to solve new problems by adapting solutions that were used to solve old problems".

•Refers to both a cognitive and a computational model of reasoning by analogy.

•Basic idea → many problems are not unique, but rather variations of a problem type.
Case Based Reasoning (cont)
•All cases are independent from each other → each case describes one particular situation.

•Widely implemented in legal, medical and diagnostic domains.

•Example of a case (case study: paddy disease):

•CASE 1:
Leaf color → green with yellow stripes
Stalk color → green
Spot → yes
Spot condition → stripes
Panicle → yes
DISEASE = bacterial leaf blight
Case Based Reasoning (cont)
[Figure: the CBR cycle. A new problem case (1) RETRIEVEs the most similar cases from the stored cases; the retrieved case is (2) REUSEd to produce a suggested solution; the suggested solution is (3) REVISEd (tested and repaired) into a confirmed solution; and the solved case is (4) RETAINed by adding it to the stored cases as a learned case.]
Case Based Reasoning (cont)
CASE F12:
Leaf color → green
Stalk color → green
Spot → yes
Spot condition → stripes
Panicle → yes
Disease: Bacterial Leaf Streak

CASE B3:
Leaf color → yellowish
Stalk color → green
Spot → no
Spot condition → no
Panicle → no
Disease: Bakanae

NEW CASE:
Leaf color → green
Stalk color → green
Spot → no
Spot condition → stripes
Panicle → yes
Disease: ?

Compare similarities (local): the new case is almost similar to case F12 → possibly Bacterial Leaf Streak disease.
Case Based Reasoning (cont)
•Advantages:
•Easy to represent (by case representation).
•Incremental learning (reuse, retention and adaptation process).
•Capable of handling missing values.
•Disadvantages:
•Exhaustive learning (more cases, more memory).
•Cases should be updated regularly.
•Complex cases are sometimes hard to represent.
3) Supervised Learning
Supervised Learning
Essential ingredient: the availability of an external indicator ("teacher") → the teacher provides the desired or target response for a particular training vector.

[Figure: block diagram of supervised learning. The environment supplies a vector describing its state to both the teacher and the learning system; the teacher produces the desired response, the learning system produces the actual response, and their difference (desired − actual) forms the error signal fed back to the learning system.]
Supervised Learning
Example: Multilayer Perceptron Neural Networks
•Inspired by the observation that biological learning systems are built from very complex networks of interconnected neurons.
•Learning algorithm: error-correction learning (error signal), where d_k(n) is the desired response and y_k(n) is the actual response:

$e_k(n) = d_k(n) - y_k(n)$

•Aim: to minimize the cost function:

$\mathcal{E}(n) = \frac{1}{2} \sum_k e_k^2(n)$

•Minimized by the gradient descent method.
MLP Neural Networks
[Figure: an MLP with an input layer, a hidden layer and an output layer; weighted connections link the input nodes to the hidden nodes, and the hidden nodes to the output nodes.]
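To show the error-correction rule in action, here is a sketch of a single gradient-descent step for one linear output neuron (a deliberate simplification of a full MLP, added here; the input, target and learning rate are made-up values).

```python
# A minimal sketch: one error-correction step, e_k(n) = d_k(n) - y_k(n),
# followed by a gradient-descent update on E(n) = 0.5 * e**2.
import numpy as np

w = np.array([0.1, -0.2, 0.05])  # weights (arbitrary start)
x = np.array([0.5, -1.0, 2.0])   # input vector
d = 1.0                          # desired response
eta = 0.1                        # learning rate

y = w @ x                        # actual response
e = d - y                        # error signal
w = w + eta * e * x              # dE/dw = -e*x, so the step is +eta*e*x
print(e, w)
```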
3) Unsupervised Learning
Unsupervised Learning
Essential ingredient: no external "teacher" oversees the learning process (no specific examples of the function to be learned by the network).
•A sequence of input vectors is provided, but NO target vectors.
•Basically, similar groups of data will be clustered together (self-organized learning) ~ "winner takes all" strategies ~ clustering.
•Examples: Kohonen Self Organizing Map (SOM), Adaptive Resonance Theory (ART).
Unsupervised Learning
Example: Kohonen Self Organizing Map
•Also known as a "topology preserving map".
•The weight vector of a cluster unit serves as an exemplar of the input patterns associated with that cluster.
•Basically ~ the cluster unit whose weight vector matches the input pattern most closely is chosen as the winner.
•Euclidean distance ~ the unit with the minimum distance is the winner:

$E_d = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$
Kohonen SOM

[Figure: a Kohonen SOM. Input nodes in the input layer are fully connected by weights to the cluster units in the output layer; each output unit represents one cluster.]
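A sketch of the winner-takes-all step (an addition to the slides; the network size, input and learning rate are arbitrary, and the neighbourhood update around the winner is omitted for brevity).

```python
# A minimal sketch: SOM winner selection and weight update.
import numpy as np

rng = np.random.default_rng(1)
W = rng.random((3, 2))      # one weight vector per cluster unit (3 units)
x = np.array([0.2, 0.7])    # input pattern
eta = 0.5                   # learning rate

# The winner is the unit whose weight vector is closest to the input.
winner = int(np.argmin(np.linalg.norm(W - x, axis=1)))
W[winner] += eta * (x - W[winner])  # move the winner toward the input
print(winner, W[winner])
```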
4) Reinforcement Learning
Reinforcement Learning
•Addresses the question of how an autonomous agent that senses and acts in its environment can learn to choose optimal actions → to achieve its goals.
•Concept ~ each time the agent performs an action in its environment, a reward or penalty is given (based on the desirability of the resulting state).
•Task ~ the agent must learn which actions gain the most reward (reinforcement signal) ~ a strengthened signal or reward indicates satisfactory actions.
•Learning algorithms: Q learning, adaptive heuristic critic and temporal-difference methods.
Reinforcement Learning
Agent: Reinforcement Learning

[Figure: the agent-environment loop. The agent observes state s and reward r from the environment and responds with action a; the interaction produces a sequence s0 →(a0) s1 →(a1) s2 ... with rewards r0, r1, r2, ...]

•Aim: to maximize $r_0 + \gamma r_1 + \gamma^2 r_2 + \ldots$, where $0 \le \gamma < 1$.
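Finally, a toy tabular Q-learning sketch (an addition; the two-state environment, rewards and hyperparameters are invented for illustration) showing how the agent learns which action gains the most discounted reward.

```python
# A minimal sketch: tabular Q-learning on a made-up 2-state environment
# where action 1 taken in state 1 yields reward 1, everything else 0.
import numpy as np

Q = np.zeros((2, 2))        # Q[state, action]
alpha, gamma = 0.5, 0.9
rng = np.random.default_rng(2)

def step(s, a):
    return a, (1.0 if (s, a) == (1, 1) else 0.0)  # next state, reward

s = 0
for _ in range(500):
    a = int(rng.integers(2))                     # random exploration
    s2, r = step(s, a)
    # Move Q(s, a) toward r + gamma * max_a' Q(s2, a').
    Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
    s = s2
print(Q)  # Q[1, 1] ends up largest
```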
Machine Learning Applications
•Pattern recognition ~ recognizing handwriting, signatures, biometrics, textures, stock-exchange patterns, signals, or even human emotions.
•Control ~ autonomous robots, self-guided underwater rovers, manufacturing, autonomous vehicles, washing machines, smart homes.
•Medical applications ~ automated heart attack detection, cancer cell analysis, disease diagnosis, outbreak analysis.
•Gaming ~ chess-playing programs, soccer-playing games, etc.