
Regression

• The main goal of regression is to construct
an efficient model that predicts a dependent
attribute from a set of attribute variables.
• A regression problem is one in which the output
variable is a real or continuous value, e.g.
salary, weight, or area.
• Types of Regression Models
– Simple Linear Regression
– Polynomial Regression
– Support Vector Regression
– Decision Tree Regression
– Random Forest Regression
• Simple Linear Regression
• Predicts a target variable Y from a single input
variable X. A linear relationship should exist
between the target variable and the predictor,
hence the name Linear Regression.
• Polynomial Regression
• We transform the original features into
polynomial features of a given degree and
then apply Linear Regression to them.
• For example, the linear model Y = a + bX is
transformed into something like
Y = a + b1X + b2X² + ... + bnXⁿ for degree n.
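The transformation described above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the helper names (polynomial_features, solve_linear_system, fit_polynomial) and the sample data are made up for the example, and the fit uses the ordinary least-squares normal equations.

```python
def polynomial_features(x, degree):
    """Return [1, x, x**2, ..., x**degree] for a single input value."""
    return [x ** k for k in range(degree + 1)]


def solve_linear_system(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    # Augmented matrix [A | b], copied so the inputs are not modified.
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution.
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * w[c] for c in range(r + 1, n))
        w[r] = (M[r][n] - s) / M[r][r]
    return w


def fit_polynomial(xs, ys, degree):
    """Linear regression applied to the polynomial features of xs."""
    rows = [polynomial_features(x, degree) for x in xs]
    m = degree + 1
    # Normal equations (X^T X) w = X^T y.
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    Xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(m)]
    return solve_linear_system(XtX, Xty)


# Made-up data generated from y = 1 + 2x + 3x^2, so the fit should recover
# coefficients close to [1, 2, 3].
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1 + 2 * x + 3 * x ** 2 for x in xs]
coeffs = fit_polynomial(xs, ys, degree=2)
print([round(c, 6) for c in coeffs])
```

The model stays linear in its coefficients; only the inputs are expanded, which is why ordinary linear regression can still be used for the fit.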
• Support Vector Regression
• In SVR, we identify a hyperplane with
maximum margin such that the maximum
number of data points lie within that margin.
• The best-fit line is the hyperplane that has the
maximum number of points within its margin.
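To make the idea of points lying "within the margin" concrete, the sketch below evaluates one candidate line against a margin of half-width epsilon, using the epsilon-insensitive loss commonly associated with SVR. It does not train an SVR model; the function names, the candidate line, and the data points are all assumptions made for illustration.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Zero inside the margin of half-width epsilon, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def points_within_margin(xs, ys, w, b, epsilon):
    """Count points whose residual to the line y = w*x + b is at most epsilon."""
    return sum(1 for x, y in zip(xs, ys) if abs(y - (w * x + b)) <= epsilon)


# Made-up points for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 1.9, 3.2, 3.9, 6.0]

# Hypothetical candidate line y = 1.0 * x + 0.0 with a margin of half-width 0.25.
w, b, epsilon = 1.0, 0.0, 0.25
inside = points_within_margin(xs, ys, w, b, epsilon)
total_loss = sum(
    epsilon_insensitive_loss(y, w * x + b, epsilon) for x, y in zip(xs, ys)
)
print(inside, round(total_loss, 2))
```

Here four of the five points fall inside the margin and contribute no loss; only the last point, which lies outside it, is penalised.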
• Decision Tree Regression
• A decision tree is built by partitioning the data
into subsets containing instances with similar
values (homogeneous).
• Standard deviation is used to calculate the
homogeneity of a numerical sample. If the
numerical sample is completely
homogeneous, its standard deviation is zero.
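A short sketch of how standard deviation can measure homogeneity when choosing a split. The helper names and the target values are invented for the example, and the population standard deviation (statistics.pstdev) is assumed here; the slides do not say which variant is intended.

```python
from statistics import pstdev


def weighted_std(groups):
    """Size-weighted average of the standard deviations of the groups."""
    total = sum(len(g) for g in groups)
    return sum(len(g) / total * pstdev(g) for g in groups)


def std_reduction(parent, groups):
    """Drop in standard deviation achieved by splitting parent into groups."""
    return pstdev(parent) - weighted_std(groups)


# Made-up target values for illustration.
homogeneous = [7.0, 7.0, 7.0]
print(pstdev(homogeneous))  # a completely homogeneous sample

parent = [10.0, 12.0, 11.0, 30.0, 32.0, 31.0]
left, right = parent[:3], parent[3:]
print(round(std_reduction(parent, [left, right]), 3))
```

A split that separates the low values from the high values leaves two nearly homogeneous subsets, so the reduction in standard deviation is large.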
• Random Forest Regression
• Random forest is an ensemble approach
where we take into account the predictions of
several regression decision trees.
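One common way to combine the trees' predictions is to average them. The sketch below assumes that averaging rule and uses three hand-written, hypothetical one-split "trees" in place of trees trained on data.

```python
from statistics import mean


def forest_predict(trees, x):
    """Average the predictions of the individual regression trees."""
    return mean(tree(x) for tree in trees)


# Three hypothetical, hand-written "trees" (one split each), standing in for
# trees that would normally be trained on different samples of the data.
trees = [
    lambda x: 2.0 if x < 3 else 6.0,
    lambda x: 1.0 if x < 4 else 7.0,
    lambda x: 3.0 if x < 2 else 5.0,
]

print(forest_predict(trees, 1.0))
print(forest_predict(trees, 3.5))
```

Each tree disagrees about where to split, and the averaged prediction smooths out those individual differences.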
• Simple Linear Regression
X Y
1 1
2 2
3 1.3
4 3.75
5 2.25
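• Solution: fit the line
y = a + bx
• Steps to find a and b:
first, find the means, the variance and the covariance.
• The means of x and y are given by
x̄ = (1/n) Σxi and ȳ = (1/n) Σyi
• The variance of x is given by
Var(x) = Σ(xi – x̄)² / (n – 1)
• The covariance of x and y, denoted by Cov(x, y),
is defined as
Cov(x, y) = Σ(xi – x̄)(yi – ȳ) / (n – 1)
• Now the values of a and b can be computed
using the following formulas:
b = Cov(x, y) / Var(x)
a = ȳ – b·x̄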
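• First, find the mean of x and y:
x̄ = (1 + 2 + 3 + 4 + 5) / 5 = 3
ȳ = (1 + 2 + 1.3 + 3.75 + 2.25) / 5 = 2.06
• Next, find the covariance between x and y:
Cov(x, y) = Σ(xi – x̄)(yi – ȳ) / (n – 1) = 4.25 / 4 = 1.0625
• Now find the variance of x:
Var(x) = Σ(xi – x̄)² / (n – 1) = 10 / 4 = 2.5
• Now, find the intercept and coefficient:
b = Cov(x, y) / Var(x) = 1.0625 / 2.5 = 0.425
a = ȳ – b·x̄ = 2.06 – (0.425)(3) = 0.785
• Therefore, the linear regression model for the
data is y = 0.785 + 0.425x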
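The steps of this worked example can be checked with a few lines of Python. The function name fit_simple_linear is made up for the sketch, and the sample (n – 1) divisor is an assumption; the divisor cancels when the slope is computed, so the choice does not affect a or b.

```python
def fit_simple_linear(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sample variance and covariance (divisor n - 1); the divisor cancels in b.
    var_x = sum((x - x_mean) ** 2 for x in xs) / (n - 1)
    cov_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / (n - 1)
    b = cov_xy / var_x
    a = y_mean - b * x_mean
    return a, b


# Data from the X/Y table of this example.
xs = [1, 2, 3, 4, 5]
ys = [1, 2, 1.3, 3.75, 2.25]
a, b = fit_simple_linear(xs, ys)
print(round(a, 3), round(b, 3))
```

For the table's data this gives an intercept of about 0.785 and a slope of about 0.425.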
Subject Age (X) Glucose Level (y)
1 43 99
2 21 65
3 25 79
4 42 75
5 57 87
6 59 81
7 55 ?
• y’ = b0 + b1 * x
• y’ = 65.14 + (0.385225 * x)
• Prediction – the value of y for the given value
of x = 55
• y’ = 65.14 + (0.385225 * 55)
• y’ = 86.327
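The coefficients used above can be reproduced from the six complete rows of the table. The sketch below uses the standard least-squares formulas written in terms of sums; the variable names are chosen for the example.

```python
# Age (x) and glucose level (y) for subjects 1 to 6 in the table.
ages = [43, 21, 25, 42, 57, 59]
glucose = [99, 65, 79, 75, 87, 81]

n = len(ages)
sum_x = sum(ages)
sum_y = sum(glucose)
sum_xy = sum(x * y for x, y in zip(ages, glucose))
sum_x2 = sum(x * x for x in ages)

# Least-squares slope and intercept.
b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
b0 = (sum_y - b1 * sum_x) / n

# Predicted glucose level for subject 7 (age 55).
prediction = b0 + b1 * 55
print(round(b0, 2), round(b1, 6), round(prediction, 2))
```

With unrounded coefficients the prediction is about 86.329; the value 86.327 results from rounding b0 to 65.14 before substituting.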
• Multiple Linear Regression
• Multiple regression is like linear regression,
but with more than one independent variable,
meaning that we try to predict a value based
on two or more variables.
• Step 1: Calculate X1², X2², X1y, X2y and X1X2.
• Step 2: Calculate Regression Sums.
• Step 3: Calculate b0, b1, and b2
• The formula to calculate b1 is: [(Σx2²)(Σx1y) – (Σx1x2)(Σx2y)] / [(Σx1²)(Σx2²) – (Σx1x2)²]
• Thus, b1 = [(194.875)(1162.5) – (-200.375)(-953.5)] / [(263.875)(194.875) – (-200.375)²] = 3.148
• The formula to calculate b2 is: [(Σx1²)(Σx2y) – (Σx1x2)(Σx1y)] / [(Σx1²)(Σx2²) – (Σx1x2)²]
• Thus, b2 = [(263.875)(-953.5) – (-200.375)(1162.5)] / [(263.875)(194.875) – (-200.375)²] = -1.656

• The formula to calculate b0 is: ȳ – b1·x̄1 – b2·x̄2
• Thus, b0 = 181.5 – 3.148(69.375) – (-1.656)(18.125) = -6.867 (using the unrounded values of b1 and b2)

• Step 4: Place b0, b1, and b2 in the estimated linear regression equation.
• The estimated linear regression equation is: ŷ = b0 + b1*x1 + b2*x2
• ŷ = -6.867 + 3.148x1 – 1.656x2
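Steps 3 and 4 can be checked numerically from the regression sums and means quoted in this example. The sketch below takes those numbers as given (the underlying data table is not reproduced here), uses Σx1y = 1162.5 as in the b1 computation, and picks variable names for the example.

```python
# Regression sums and means as quoted in the worked example.
sum_x1_sq = 263.875
sum_x2_sq = 194.875
sum_x1y = 1162.5
sum_x2y = -953.5
sum_x1x2 = -200.375
y_mean, x1_mean, x2_mean = 181.5, 69.375, 18.125

# Step 3: coefficients from the two-predictor least-squares formulas.
denominator = sum_x1_sq * sum_x2_sq - sum_x1x2 ** 2
b1 = (sum_x2_sq * sum_x1y - sum_x1x2 * sum_x2y) / denominator
b2 = (sum_x1_sq * sum_x2y - sum_x1x2 * sum_x1y) / denominator
b0 = y_mean - b1 * x1_mean - b2 * x2_mean


def predict(x1, x2):
    """Step 4: evaluate the estimated regression equation."""
    return b0 + b1 * x1 + b2 * x2


print(round(b0, 3), round(b1, 3), round(b2, 3))
```

Evaluating predict at the means (69.375, 18.125) returns the mean of y, 181.5, which is a quick sanity check on the intercept.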
• Linear regression is an algorithm that
models a linear relationship between an
independent variable and a dependent
variable in order to predict the outcome of
future events.
• It is a statistical method used in data science
and machine learning for predictive analysis.
• Simple Linear Regression
• Predicts the outcome of a dependent variable
based on a single independent variable, where
the relationship between the variables is linear.
