Setting Up ML Problem
Rao Vemuri, UC Davis
rvemuri@gmail.com

Definition of ML
A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at T, as measured by P, improves with experience E.

Example:
  T: Playing chess
  P: Percentage of games won
  E: Number of games played

Quantifying P
Percentage of games won
f(won, lost, draw)
  Win = 1 point
  Lost = -1 point
  Draw = 0 points

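The two ways of quantifying P can be sketched in a few lines of Python. This is only an illustration: the function names and the game record (6 wins, 3 losses, 1 draw) are assumptions, not part of the slides.

```python
def score(won, lost, drawn):
    """Point-based f(won, lost, draw): +1 per win, -1 per loss, 0 per draw."""
    return 1 * won + (-1) * lost + 0 * drawn


def win_percentage(won, lost, drawn):
    """Percentage of games won out of all games played."""
    total = won + lost + drawn
    if total == 0:
        raise ValueError("no games played yet")
    return 100.0 * won / total


# Hypothetical record, used only to exercise the two measures.
record = {"won": 6, "lost": 3, "drawn": 1}
print(score(**record))           # 3
print(win_percentage(**record))  # 60.0
```

Either number can serve as P, as long as the same measure is used every time performance is compared.
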
Capturing Experience E: A Data Set

Instance  Outlook   Temperature  Humidity  Windy  Surfing
1         Sunny     Mild         Normal    True   Yes
2         Sunny     Hot          High      False  No
3         Rainy     Mild         Normal    False  No
4         Overcast  Cool         High      True   Yes

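One possible in-memory form of this data set is a list of records, one per row. The sketch below is an assumption about representation only: the names FEATURES, LABEL, DATA and split_features_and_labels are hypothetical, and the values are copied from the table.

```python
# Column names taken from the table; "Surfing" is treated as the label column.
FEATURES = ["Outlook", "Temperature", "Humidity", "Windy"]
LABEL = "Surfing"

DATA = [
    {"Outlook": "Sunny", "Temperature": "Mild", "Humidity": "Normal", "Windy": True, "Surfing": "Yes"},
    {"Outlook": "Sunny", "Temperature": "Hot", "Humidity": "High", "Windy": False, "Surfing": "No"},
    {"Outlook": "Rainy", "Temperature": "Mild", "Humidity": "Normal", "Windy": False, "Surfing": "No"},
    {"Outlook": "Overcast", "Temperature": "Cool", "Humidity": "High", "Windy": True, "Surfing": "Yes"},
]


def split_features_and_labels(rows):
    """Separate each row into its feature values and its label."""
    X = [[row[name] for name in FEATURES] for row in rows]
    y = [row[LABEL] for row in rows]
    return X, y


X, y = split_features_and_labels(DATA)
print(X[0])  # ['Sunny', 'Mild', 'Normal', True]
print(y)     # ['Yes', 'No', 'No', 'Yes']
```
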
Attributes or Features
The columns Outlook, Temperature, Humidity and Windy of the data set are its attributes, also called features.

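[Table fragment from a second, medical example: instance 1 has attribute values such as normal and high, a Glucose attribute, and the label Heart Attack? = Yes.]
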
Notation: Instance
Each row of the data set is an instance. For example, instance 1 is (Outlook = Sunny, Temperature = Mild, Humidity = Normal, Windy = True, Surfing = Yes).

Note on Features
Each instance is described by the same set of features.
The features may be:
  continuous (e.g. Temperature)
  discrete (e.g. Cost in $)
  binary (e.g. True/False)
  categorical (e.g. Red/Blue/Yellow)

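Many learning algorithms expect numbers, so the four feature types are often turned into a numeric vector. The sketch below shows one common convention, not the only one: the helper names, the example instance and its values are all hypothetical.

```python
def encode_binary(value):
    """Binary feature: True/False becomes 1/0."""
    return 1 if value else 0


def one_hot(value, categories):
    """Categorical feature: one position per category, 1 marks the match."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == category else 0 for category in categories]


COLOURS = ["Red", "Blue", "Yellow"]

# Hypothetical instance with one feature of each type.
instance = {"temperature": 21.5, "cost": 12, "windy": True, "colour": "Blue"}

vector = (
    [instance["temperature"]]              # continuous: used as-is
    + [instance["cost"]]                   # discrete: used as-is
    + [encode_binary(instance["windy"])]   # binary: 0 or 1
    + one_hot(instance["colour"], COLOURS) # categorical: one-hot
)
print(vector)  # [21.5, 12, 1, 0, 1, 0]
```
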
Test Set
In a test set, the last column (Surfing) has no labels.
Our job is to find those labels.

Hypothesis
A combination of attributes and our guess as to what the label should be for that combination.
(Outlook = don't care) ^ (Temperature = Cool) ^ (Humidity = Normal) ^ (Windy = True) is one possible hypothesis.
For this hypothesis our machine should answer YES or NO.

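A conjunctive hypothesis like the one above can be written as a mapping from attribute to required value, with a special marker for "don't care". This is a minimal sketch under two assumptions that the slides do not state: DONT_CARE is represented by None, and an instance that satisfies the hypothesis is answered YES while any other is answered NO.

```python
DONT_CARE = None  # assumed marker for "any value is acceptable"


def matches(hypothesis, instance):
    """True if every constrained attribute has the required value."""
    return all(
        required is DONT_CARE or instance[attribute] == required
        for attribute, required in hypothesis.items()
    )


def answer(hypothesis, instance):
    """Assumed convention: YES when the hypothesis is satisfied, else NO."""
    return "YES" if matches(hypothesis, instance) else "NO"


hypothesis = {
    "Outlook": DONT_CARE,
    "Temperature": "Cool",
    "Humidity": "Normal",
    "Windy": True,
}

# Hypothetical unlabeled instance, plus instance 4 from the data set.
unlabeled = {"Outlook": "Rainy", "Temperature": "Cool", "Humidity": "Normal", "Windy": True}
instance_4 = {"Outlook": "Overcast", "Temperature": "Cool", "Humidity": "High", "Windy": True}

print(answer(hypothesis, unlabeled))   # YES
print(answer(hypothesis, instance_4))  # NO
```

Instance 4 fails only on Humidity (High instead of Normal), which is enough to reject a conjunction.
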
Types of ML Algorithms
Supervised: You are given labeled training data. Create a function that fits the data.
  Classification (looking for discrete categories)
  Regression (looking for a continuous function)

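The difference between the two supervised tasks can be seen in a small sketch. The slides do not name a specific algorithm, so two simple stand-ins are assumed here: 1-nearest-neighbour with Hamming distance for classification, and a least-squares straight line for regression. The query instance and the numeric data are hypothetical.

```python
def hamming(a, b):
    """Number of positions where two feature lists differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def nearest_neighbour_classify(train_X, train_y, query):
    """Classification: return the label of the closest training instance."""
    best = min(range(len(train_X)), key=lambda i: hamming(train_X[i], query))
    return train_y[best]


def fit_line(xs, ys):
    """Regression: least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Labeled training data: the four instances of the surfing data set.
train_X = [
    ["Sunny", "Mild", "Normal", True],
    ["Sunny", "Hot", "High", False],
    ["Rainy", "Mild", "Normal", False],
    ["Overcast", "Cool", "High", True],
]
train_y = ["Yes", "No", "No", "Yes"]

# Discrete output: a category.
print(nearest_neighbour_classify(train_X, train_y, ["Overcast", "Cool", "Normal", True]))  # Yes

# Continuous output: a function fitted to hypothetical numeric data.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)  # 2.0 1.0
```
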
WEKA
W(aikato) E(nvironment) for K(nowledge) A(nalysis)
Developed by the University of Waikato in New Zealand
Machine Learning Tools and Techniques in Java
Comprehensive suite of Java class libraries
Implements many state-of-the-art machine learning and data mining algorithms
http://www.cs.waikato.ac.nz/~ml/index.html

WEKA Consists of
  Explorer
  Experimenter
  Knowledge Flow
  Simple Command Line Interface
  Java Interface

Explorer
Is WEKA's main graphical user interface.
The WEKA package consists of:
  Filters
  Classifiers
  Clusterers
  Associations
  Attribute Selection
  Visualization tool

Pre-Processing
Data loaded from URL or DB
Preprocessing routines in WEKA are called filters:
  MergeAttributeValuesFilter
  NominalToBinaryFilter
  DiscretiseFilter
  ReplaceMissingValuesFilter

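To give a feel for what such filters do, here is a plain-Python sketch of two of the ideas: filling in missing values and discretising a numeric attribute. It is not WEKA's API or implementation (WEKA itself is Java). The function names are hypothetical, and the mean/mode fill and equal-width binning strategies are assumptions chosen for simplicity.

```python
from collections import Counter


def replace_missing(column):
    """Fill None entries with the mean (numeric column) or the mode (other columns)."""
    present = [v for v in column if v is not None]
    if not present:
        return list(column)  # nothing to estimate a fill value from
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    if numeric:
        fill = sum(present) / len(present)
    else:
        fill = Counter(present).most_common(1)[0][0]
    return [fill if v is None else v for v in column]


def discretise(values, bins):
    """Map numeric values to equal-width bin indices 0 .. bins - 1."""
    low, high = min(values), max(values)
    width = (high - low) / bins
    if width == 0:
        return [0] * len(values)
    # The maximum value would land in bin index `bins`, so clamp it.
    return [min(int((v - low) / width), bins - 1) for v in values]


print(replace_missing([20.0, None, 30.0]))                 # [20.0, 25.0, 30.0]
print(replace_missing(["Sunny", None, "Sunny", "Rainy"]))  # ['Sunny', 'Sunny', 'Sunny', 'Rainy']
print(discretise([10, 20, 30, 40], bins=3))                # [0, 1, 2, 2]
```
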
Homework Assignment 1
Search for WEKA on the Web and write
  (a) 4 short sentences about what the best features of WEKA are.
  (b) One sentence on where WEKA is useful.