Foundations of Machine Learning

(CSE-4132)
Lecture – 1: Introduction to Learning

Ajit K Nayak, Ph.D.


Prof., Dept. of CSIT
Room# C-118
ajitnayak@soa.ac.in
9338749992
Machine Learning Problems
• Identify the risk factors for prostate cancer.
• Classify a recorded phoneme based on a log-periodogram.
• Predict whether someone will have a heart attack on the
basis of demographic, diet and clinical measurements.
• Customize an email spam detection system.
• Identify the numbers in a handwritten postal Pin Code.
• Classify a tissue sample into one of several cancer classes,
based on a gene expression profile.
• Establish the relationship between salary and demographic
variables in population survey data.
• Classify the pixels in a LANDSAT image, by usage.
• ...
Identify the risk factors for prostate cancer
• Predict PSA (prostate-specific antigen) from other clinical measurements.
[Figure: PSA plotted against the other measurements]
Classify a recorded phoneme
• Classify two phoneme examples (vowel sounds) from each other,
  based on their log-periodograms.
[Figure: log-periodogram (0–25) vs. frequency (0–250) for the two phonemes]
Customize an email spam detection system
• Data from 4601 emails sent to a single person.
• Each message is labeled as spam or genuine email.
• Goal: build a customized spam filter.
• Input features: relative frequencies of 57 of the most
  commonly occurring words and punctuation marks in
  these email messages.
Average relative frequency (%) of each word/character by class:

        george   you     hp    free    !      edu   remove
spam     0.00    2.26   0.02   0.52   0.51   0.01   0.28
email    1.27    1.27   0.90   0.07   0.11   0.29   0.01
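The word-frequency features above can be sketched in a few lines. This is a minimal, hypothetical example (the toy messages and the four-word vocabulary are stand-ins for the real 4601-message corpus and its 57 features):

```python
from collections import Counter

# Hypothetical toy vocabulary; the real filter uses 57 words/characters.
VOCAB = ["george", "you", "hp", "free"]

def word_frequencies(text, vocab=VOCAB):
    """Relative frequency (%) of each vocabulary word in one message."""
    words = [w.strip(",.!?") for w in text.lower().split()]
    counts = Counter(words)
    total = len(words)
    return {w: 100.0 * counts[w] / total for w in vocab}

# Hypothetical messages, one per class.
print(word_frequencies("free money, click now, free offer!"))
print(word_frequencies("george, you fixed the hp printer"))
```

Each message becomes one row of the feature table; a classifier is then trained on those rows with the spam/email labels as targets.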
Handwritten postal Pin Code recognition
• Classify each handwritten digit into one of 10 digit classes (0–9).
Classify the pixels in a LANDSAT image
[Figure: Spectral Bands 1–4 of a LANDSAT scene, with Predicted Land Usage and Actual Land Usage maps]
Learning
• Acquisition of knowledge or skill
  – through study, experience, or being taught
• Learning is of two broad kinds:
  – Supervised: given a data set and the corresponding outputs, we can
    find out the relationship between the input and the output: y = f(X).
  – Unsupervised: we have no idea how the results should look;
    we can only derive structure from the data.
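The two settings can be contrasted on toy 1-D data. This is a minimal sketch with hypothetical values: supervised learning is shown as a 1-nearest-neighbour predictor, unsupervised as a largest-gap split into two groups (neither technique is prescribed by the notes; they are just the simplest instances of each setting):

```python
# Supervised: labeled pairs (x, y) are given; learn to predict y for new x.
labeled = [(1.0, "small"), (1.2, "small"), (9.8, "large"), (10.1, "large")]

def predict(x):
    """1-nearest-neighbour: copy the label of the closest training point."""
    return min(labeled, key=lambda pair: abs(pair[0] - x))[1]

# Unsupervised: only x values are given; derive structure (two groups
# separated at the largest gap between consecutive sorted values).
unlabeled = sorted([1.1, 9.9, 1.3, 10.0])
split = max(range(1, len(unlabeled)),
            key=lambda i: unlabeled[i] - unlabeled[i - 1])
groups = [unlabeled[:split], unlabeled[split:]]

print(predict(1.5))   # → small
print(groups)         # → [[1.1, 1.3], [9.9, 10.0]]
```

The supervised half needs the labels to answer "what is this?"; the unsupervised half never sees a label and can only report "these belong together".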
Example – 1
• Input 1: labeled (supervised)
  – Training: "Give me a white marble"
  – Testing: "What color marble is this?" / "White"
• Output 1
• Input 2: unlabeled (unsupervised)
  – Task: "Arrange them in 6 groups"
• Output 2
Supervised Learning Exp – Wage-I
• Consider wage data (age, wage).
• Wage increases with age but then decreases again after
  approximately age 60.
[Figure: wage (50–300) vs. age (20–80), with the fitted average curve]
• This may be used to predict wage given age.
• But there is a significant amount of variability associated with
  this average value.
• So age alone is unlikely to provide an accurate prediction of wage.
• Let's consider two more parameters: year and education level.
Supervised Learning Exp – Wage-II
[Figure: wage (50–300) vs. age (20–80), vs. year (2003–2009), and vs. education level (1–5)]
• Wages increase by approximately ₹10,000, in a roughly linear
(or straight-line) fashion between 2003 and 2009
• Wages are also typically greater for individuals with higher
education levels
• So, the most accurate prediction of wage will be obtained by
combining age, education, and the year.
• This is often referred to as a regression problem
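A regression fit like the year-vs-wage trend above can be sketched with ordinary least squares. The numbers below are hypothetical toy data, not the survey data from the slides; the closed-form simple-regression formulas are standard:

```python
# Toy (age, wage-in-thousands) pairs — hypothetical values for illustration.
ages  = [25, 35, 45, 55]
wages = [80, 110, 140, 170]

n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(wages) / n

# Simple least-squares line: slope = Σ(x-x̄)(y-ȳ) / Σ(x-x̄)²
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, wages)) \
        / sum((x - mean_x) ** 2 for x in ages)
intercept = mean_y - slope * mean_x

def predict_wage(age):
    return intercept + slope * age

print(predict_wage(40))   # → 125.0
```

With real wage data, a single straight line in age is inadequate (wage falls after 60), which is why the slides combine age, year, and education in one model.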
Supervised Learning Exp: Stock Market
• Considers a stock index over a 5-year period.
• The goal is to predict whether the index will increase (Up) or
  decrease (Down) on a given day, using the past 5 days'
  percentage changes in the index.
[Figure: boxplots of the percentage change in the index yesterday, two
days previous, and three days previous, split by today's direction (Down vs. Up)]
• Two boxplots per panel: one for the days on which the market
  increased, and one for the days on which the market decreased.
• This is known as a classification problem.
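A classifier for this problem maps a vector of recent percentage changes to a discrete label. As a minimal sketch (the history rows are hypothetical, and 1-nearest-neighbour is used only because it is the simplest classifier, not because the slides prescribe it):

```python
import math

# Hypothetical rows: (pct changes over the previous 3 days, today's direction).
history = [
    (( 0.5,  0.2,  0.1), "Up"),
    (( 0.4,  0.3,  0.0), "Up"),
    ((-0.6, -0.1, -0.2), "Down"),
    ((-0.4, -0.3,  0.1), "Down"),
]

def classify(lags):
    """1-nearest-neighbour by Euclidean distance in lag space."""
    return min(history, key=lambda row: math.dist(row[0], lags))[1]

print(classify((0.45, 0.25, 0.05)))   # → Up
```

Unlike the wage example, the output here is a class (Up/Down) rather than a number — the defining difference between classification and regression.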
Data Representation
• n: number of distinct data points/observations/rows
• p: number of variables/attributes/columns/predictors/features
• X: a matrix of order n × p: the dataset/sample

  X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{pmatrix}

• x_i is a vector of length p, containing the p variable
  measurements for the ith observation:
  x_i = (x_{i1}, x_{i2}, \ldots, x_{ip})^T
• x_j is a vector of length n, containing the jth variable of the
  complete set of observations:
  x_j = (x_{1j}, x_{2j}, \ldots, x_{nj})^T
• y_i denotes the ith observation of the variable on which
  predictions are to be made.
• y: the set of all observations (responses),
  y = (y_1, y_2, \ldots, y_n)^T, with y = f(X).
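The X / x_i / x_j notation maps directly onto a row-major array. A minimal sketch with a hypothetical 3 × 2 dataset:

```python
# Toy dataset: 3 observations (rows), 2 variables (columns) — hypothetical values.
X = [
    [5.1, 3.5],   # observation 1
    [4.9, 3.0],   # observation 2
    [4.7, 3.2],   # observation 3
]
n = len(X)        # number of observations (rows)
p = len(X[0])     # number of variables (columns)

i, j = 2, 1       # 1-based indices, matching the notation above
x_i = X[i - 1]                     # ith observation: vector of length p
x_j = [row[j - 1] for row in X]    # jth variable: vector of length n

print(n, p)   # → 3 2
print(x_i)    # → [4.9, 3.0]
print(x_j)    # → [5.1, 4.9, 4.7]
```

Note that x_i slices across a row while x_j slices down a column — the same symbol x with one subscript means either, depending on whether the index runs over observations or variables.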
Example – IRIS Dataset
• Contains 150 observations.
• Each observation has four variables.

  sepal_length  sepal_width  petal_length  petal_width  species
  5.1           3.5          1.4           0.2          setosa
  4.9           3.0          1.4           0.2          setosa
  4.7           3.2          1.3           0.2          setosa
  4.6           3.1          1.5           0.2          setosa
  …             …            …             …            …
  7.0           3.2          4.7           1.4          versicolor
  6.4           3.2          4.5           1.5          versicolor
  6.9           3.1          4.9           1.5          versicolor
  5.5           2.3          4.0           1.3          versicolor
  …             …            …             …            …
  6.3           3.3          6.0           2.5          virginica
  5.8           2.7          5.1           1.9          virginica
  7.1           3.0          5.9           2.1          virginica
  6.3           2.9          5.6           1.8          virginica
  …             …            …             …            …

• n = 150
• p = 4
• X: the 150 × 4 matrix of the four measurements
• y: the species column, with y = f(X)
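In code, the table splits into the feature matrix X and the response vector y. A sketch using only the first few rows shown above (so here n = 4, whereas the full dataset has n = 150):

```python
# First rows of the IRIS table, as (x_i, y_i) pairs taken from the slide.
rows = [
    ([5.1, 3.5, 1.4, 0.2], "setosa"),
    ([4.9, 3.0, 1.4, 0.2], "setosa"),
    ([7.0, 3.2, 4.7, 1.4], "versicolor"),
    ([6.3, 3.3, 6.0, 2.5], "virginica"),
]

X = [x for x, _ in rows]       # measurements only
y = [label for _, label in rows]  # species labels: the responses
p = len(X[0])                  # p = 4 variables per observation

print(p, y)
```

Learning y = f(X) here means predicting the species from the four measurements, which is a classification problem like the stock-market example.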
Thank You
ధన్యవాదాలు (Thank you)
