
Estimation (Linear Regression)
Muhamad Fathurahman
Data Mining Session 26-27 March 2020
These slides are adapted from Andrew Ng's lecture notes on linear regression.
Outline
• Prediction With Linear Regression
• Linear Regression Model
• Example
Linear Regression
• A type of supervised learning algorithm.
• An approach to modelling the relationship between a target variable and one
or more explanatory variables.
• The target variable is a continuous value (e.g. housing price).
Dataset : Housing Price
Living Area (ft²)   Price ($1000s)
2104 400
1600 330
2400 369
1416 232
3000 540
.... ....
Source : Lecture Notes, Andrew Ng
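Dataset : Housing Price (Cont. 2)
Living Area (ft²)   Price ($1000s)
2104   400
1600   330
2400   369
1416   232
3000   540
....   ....
• $x^{(i)}$ denotes the input variables (living area).
• $y^{(i)}$ denotes the output or target variable (price).
• A pair $(x^{(i)}, y^{(i)})$ is a training example.
• A list of $m$ training examples $\{(x^{(i)}, y^{(i)});\ i = 1, \dots, m\}$ is called a training set.
• $\mathcal{X}$ and $\mathcal{Y}$ denote the spaces of input and output values.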
Dataset : Housing Price (Cont. 3)
• Example with $i = 3$: $(x^{(3)}, y^{(3)}) = (2400, 369)$, that is, $x^{(3)} = 2400$ and $y^{(3)} = 369$.
• $m = 3$ (the number of training examples, as marked on the slide).
• $\mathcal{X}$ is the space of input values and $\mathcal{Y}$ is the space of output values.
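The index notation can be checked against the table with a short sketch. The names `living_area`, `price`, and `training_example` are assumptions made for illustration and do not come from the slides. Python indexes from 0, so the slides' $i = 3$ corresponds to position 2 in the arrays.

```python
import numpy as np

# The five rows visible in the housing table (array names are illustrative).
living_area = np.array([2104, 1600, 2400, 1416, 3000])  # x, in square feet
price = np.array([400, 330, 369, 232, 540])             # y, in $1000s

def training_example(i):
    """Return (x^(i), y^(i)) using the slides' 1-based index i."""
    return int(living_area[i - 1]), int(price[i - 1])

x3, y3 = training_example(3)
print(x3, y3)  # 2400 369
```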
Linear Regression Model
Training Set → Learning Algorithm → h
x (living area of house) → h → predicted y (predicted price of house)
Linear Regression : Dataset
Living Area (ft²)   Bedrooms   Price ($1000s)
2104   3   400
1600   3   330
2400   3   369
1416   2   232
3000   4   540
....   ....   ....
• The $x^{(i)}$ are two-dimensional vectors in $\mathbb{R}^2$.
• $x_1^{(i)}$ is the living area of the $i$-th house and $x_2^{(i)}$ is its number of bedrooms.
Linear Regression: Model or Hypothesis
• To perform supervised learning, we need to represent the function/hypothesis $h$ in a computer.
• As an initial choice, let's say we decide to approximate $y$ as a linear function of $x$:
  $h(x) = \sum_{i=0}^{n} \theta_i x_i = \theta^T x$
• The $\theta_i$ are called parameters or weights.
• $x_0 = 1$ is the intercept term.
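The hypothesis $h(x) = \theta^T x$ is a dot product, which the following minimal sketch evaluates for one house. The parameter values in `theta` are hypothetical and chosen only to make the arithmetic easy to follow; they are not fitted values from the slides.

```python
import numpy as np

def h(theta, x):
    """Hypothesis h(x) = theta^T x, where x already includes x_0 = 1."""
    return float(np.dot(theta, x))

# Hypothetical parameter values, for illustration only.
theta = np.array([50.0, 0.1, 20.0])   # theta_0, theta_1, theta_2
x = np.array([1.0, 2104.0, 3.0])      # x_0 = 1, living area, bedrooms

# 50 + 0.1 * 2104 + 20 * 3, up to floating-point rounding
prediction = h(theta, x)
print(prediction)
```

Prepending $x_0 = 1$ to the feature vector lets the intercept $\theta_0$ be handled by the same dot product as the other weights.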
Linear Regression: Model or Hypothesis (Cont. 2)
Living Area (ft²)   Bedrooms   Price ($1000s)
2104   3   400
1600   3   330
2400   3   369
• $x^{(2)}$ is the second training example: living area 1600, 3 bedrooms.
• $\hat{y}$ denotes the price predicted by the model.
Linear Regression : Cost Function
• Now, given a training set, how do we pick, or learn, the parameters $\theta$?
• We need to define a cost function that measures how close the $h_\theta(x^{(i)})$'s are to the
corresponding $y^{(i)}$'s. We define the cost function:
  $J(\theta) = \frac{1}{2} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$
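As a rough illustration, the cost $J(\theta)$ can be evaluated directly with NumPy. The sketch below uses the first three rows of the two-feature table, with a leading column of ones for $x_0 = 1$; the function name `cost` and the all-zero starting point `theta_zero` are assumptions for the example.

```python
import numpy as np

def cost(theta, X, y):
    """J(theta) = 1/2 * sum_i (h_theta(x^(i)) - y^(i))^2."""
    residuals = X @ theta - y
    return 0.5 * float(residuals @ residuals)

# First three rows of the table; the first column is x_0 = 1.
X = np.array([[1.0, 2104.0, 3.0],
              [1.0, 1600.0, 3.0],
              [1.0, 2400.0, 3.0]])
y = np.array([400.0, 330.0, 369.0])

theta_zero = np.zeros(3)  # hypothetical starting point: every prediction is 0
print(cost(theta_zero, X, y))  # 0.5 * (400^2 + 330^2 + 369^2) = 202530.5
```

A smaller value of $J(\theta)$ means the predictions lie closer to the observed prices, so learning amounts to choosing $\theta$ to make this number small.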
Linear Regression in Python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error as cal_mse
import matplotlib.pyplot as plt
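# Load dataset
# Note: load_boston was removed in scikit-learn 1.2, so this call needs an
# older version; fetch_california_housing is one possible substitute.
data = datasets.load_boston()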
Linear Regression in Python (2)
# Split features and target
X = data.data
Y = data.target
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.33, random_state=5)

# Create model
lm = LinearRegression()
lm.fit(X_train, Y_train)
Y_pred = lm.predict(X_test)
Linear Regression in Python (3)
# Print mean squared error
mse = cal_mse(Y_test, Y_pred)
print(mse)
# Plot actual prices against predicted prices
plt.scatter(Y_test, Y_pred)
plt.xlabel("Prices: $Y_i$")
plt.ylabel(r"Predicted prices: $\hat{Y}_i$")
plt.title(r"Prices vs Predicted prices: $Y_i$ vs $\hat{Y}_i$")
plt.show()
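`LinearRegression` fits an ordinary least-squares model, that is, it chooses the parameters that minimize the sum of squared errors. The sketch below reproduces that idea with plain NumPy on the five single-feature rows from the housing table, so it runs without downloading a dataset. The names `area`, `theta`, and `predict` are assumptions for illustration, and `np.linalg.lstsq` is swapped in here in place of scikit-learn.

```python
import numpy as np

# Five rows from the housing table; a column of ones supplies x_0 = 1.
area = np.array([2104.0, 1600.0, 2400.0, 1416.0, 3000.0])
y = np.array([400.0, 330.0, 369.0, 232.0, 540.0])
X = np.column_stack([np.ones_like(area), area])

# Least-squares solution: the theta that minimizes the sum of squared errors.
theta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)

def predict(x_area):
    """Predicted price (in $1000s) for a given living area, using the fitted theta."""
    return theta[0] + theta[1] * x_area

print(theta, predict(2000.0))
```

Here `theta[0]` plays the role of the intercept $\theta_0$ and `theta[1]` the weight $\theta_1$ on living area.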
Linear Regression in Python (4)
