Dr. S. Vairachilai Department of CSE CVR College of Engineering Mangalpalli Telangana
Machine Learning
Machine learning (ML) is a branch of artificial intelligence (AI) that enables computers to learn and improve over time from the data they are given, without being explicitly programmed. Machine learning algorithms are able to detect and learn from patterns in data and make predictions on new data.
16-04-2022 S.Vairachilai 2
Machine Learning
Tom Mitchell provides a more modern definition: “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”
Machine Learning Algorithm
Machine learning algorithms fall into two broad groups:
• Regression
• Classification

Regression Analysis
Regression analysis models the relationship between one or more independent (predictor/explanatory/input/cause) variables and a dependent (response/outcome/output/effect) variable.
Simple Linear Regression (SLR)

ŷ = β₀ + (β₁ * x)

slope (β₁) = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)²

Worked example:

Number of TV Ads (x)   Number of Cars Sold (y)
1                      14
3                      24
2                      18
1                      17
3                      27

x̄ = 2, ȳ = 20

Σ(x − x̄)(y − ȳ) = (1 − 2)(14 − 20) + (3 − 2)(24 − 20) + (2 − 2)(18 − 20) + (1 − 2)(17 − 20) + (3 − 2)(27 − 20)
                 = 6 + 4 + 0 + 3 + 7 = 20

Σ(x − x̄)² = (1 − 2)² + (3 − 2)² + (2 − 2)² + (1 − 2)² + (3 − 2)² = 1 + 1 + 0 + 1 + 1 = 4

slope (β₁) = 20 / 4 = 5
intercept (β₀) = ȳ − (β₁ * x̄) = 20 − 5 * 2 = 10

Estimated Regression Equation: ŷ = 10 + (5 * x)
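The slope and intercept computation above can be sketched in plain Python (an illustrative sketch, not part of the original slides):

```python
# Simple linear regression coefficients for the TV-ads example,
# using the deviation formulas from the slide.
x = [1, 3, 2, 1, 3]        # number of TV ads
y = [14, 24, 18, 17, 27]   # number of cars sold

x_bar = sum(x) / len(x)    # 2.0
y_bar = sum(y) / len(y)    # 20.0

sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))  # 20.0
sxx = sum((xi - x_bar) ** 2 for xi in x)                        # 4.0

beta1 = sxy / sxx              # slope = 5.0
beta0 = y_bar - beta1 * x_bar  # intercept = 10.0
```

This reproduces the estimated regression equation ŷ = 10 + 5x.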
• We have a large variety of vehicles tailored to suit all budgets and needs, priding ourselves on our knowledge and innovation within the motor trade industry.

Number of TV Ads (x)   Number of Cars Sold (y)
1                      14
3                      24
2                      18
1                      17
3                      27
Multiple Linear Regression (MLR) Analysis

Recap of Simple Linear Regression (SLR):

ŷ = β₀ + (β₁ * x)

slope (β₁) = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)²
Intercept (β₀) = ȳ − (slope * x̄)
Row Vector: A matrix having only one row is called a row vector
Column Vector: A matrix having only one column is called a column vector
Eg: the 2 × 1 matrix A = [5; 4] (the entries 5 and 4 stacked in a single column) is a column vector because it has only one column.
Scalars: A matrix having only one row and one column is called a scalar.
Eg: 1 * 1 matrix 𝑨 = [ 𝟑] is a scalar. In other words, a scalar is a single number.
Transpose Matrix: The transpose of a matrix A, written Aᵀ, is obtained by interchanging its rows and columns; a row vector transposes to a column vector, and vice versa.
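These vector and transpose notions can be illustrated with numpy (numpy is used here only for illustration; it is not part of the original slides):

```python
import numpy as np

row = np.array([[5, 4]])      # 1 x 2 matrix: a row vector
col = np.array([[5], [4]])    # 2 x 1 matrix: a column vector
scalar = np.array([[3]])      # 1 x 1 matrix: a scalar

# Transposing swaps rows and columns: a row vector becomes a column vector.
assert row.T.shape == (2, 1)
assert np.array_equal(row.T, col)
```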
Linear Regression Analysis in Matrix Form

ŷ = β₀ + (β₁ * x)

ŷ   Dependent Variable
x   Independent Variable
β₀  Intercept Parameter
β₁  Slope Parameter

Writing the model for each of the n observations:

ŷ₁ = β₀ + (β₁ * x₁)
ŷ₂ = β₀ + (β₁ * x₂)
ŷ₃ = β₀ + (β₁ * x₃)
...
ŷₙ = β₀ + (β₁ * xₙ)

In matrix form, Y = Xβ:

| ŷ₁ |     | 1  x₁ |
| ŷ₂ |     | 1  x₂ |   | β₀ |
| ŷ₃ |  =  | 1  x₃ | * | β₁ |
| ⋮  |     | ⋮   ⋮ |
| ŷₙ |     | 1  xₙ |

ŷ   n × 1 column vector
X   n × 2 matrix
β   2 × 1 column vector
SLR Model in Matrix Form

ŷ = β₀ + (β₁ * x)

The normal equations Xᵀ * Y = Xᵀ * X * β give:

| Σy  |   | n   Σx  |   | β₀ |
| Σxy | = | Σx  Σx² | * | β₁ |

For the TV-ads data (n = 5, Σx = 10, Σy = 100, Σx² = 24, Σxy = 220):

Number of TV Ads (x)   Number of Cars Sold (y)
1                      14
3                      24
2                      18
1                      17
3                      27

| 100 |   | 5   10 |   | β₀ |
| 220 | = | 10  24 | * | β₁ |

5β₀ + 10β₁ = 100     Equation (1)
10β₀ + 24β₁ = 220    Equation (2)

Solving the linear equations (1) & (2):

Intercept (β₀) = 10,  Slope (β₁) = 5

ŷ = 10 + 5 * x
Estimated Regression Equation
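The matrix-form solution can be checked with numpy (an illustrative sketch; numpy is an assumption, not something the slides use):

```python
import numpy as np

x = np.array([1, 3, 2, 1, 3])
y = np.array([14, 24, 18, 17, 27])

# Design matrix with a column of ones for the intercept (n x 2).
X = np.column_stack([np.ones(len(x)), x])

# Normal equations: (X^T X) beta = X^T y
beta = np.linalg.solve(X.T @ X, X.T @ y)
```

`beta` recovers the intercept 10 and slope 5 found by hand above.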
MLR Model in Matrix Form

ŷ = β₀ + (β₁ * x₁) + (β₂ * x₂) + (β₃ * x₃)

ŷ   n × 1 column vector
X   n × (k+1) matrix
β   (k+1) × 1 column vector

Y = X * β, and the normal equations are Xᵀ * Y = Xᵀ * X * β.

For two independent variables, ŷ = β₀ + (β₁ * x₁) + (β₂ * x₂):

| Σy   |   | n    Σx₁    Σx₂   |   | β₀ |
| Σx₁y | = | Σx₁  Σx₁²   Σx₁x₂ | * | β₁ |
| Σx₂y |   | Σx₂  Σx₂x₁  Σx₂²  |   | β₂ |

For three independent variables:

| Σy   |   | n    Σx₁    Σx₂    Σx₃   |   | β₀ |
| Σx₁y | = | Σx₁  Σx₁²   Σx₁x₂  Σx₁x₃ | * | β₁ |
| Σx₂y |   | Σx₂  Σx₂x₁  Σx₂²   Σx₂x₃ |   | β₂ |
| Σx₃y |   | Σx₃  Σx₃x₁  Σx₃x₂  Σx₃²  |   | β₃ |
Example of Multiple Linear Regression

Three performance measures for 5 students:

IQ (x₁)   Study Hours (x₂)   Test Score (y)
110       40                 100
120       30                 90
100       20                 80
90        0                  70
80        10                 60

Independent Variables: IQ, Study Hours
Dependent Variable: Test Score
Multiple Linear Regression

ŷ = β₀ + (β₁ * x₁) + (β₂ * x₂)

Y = X * β; the normal equations Xᵀ * Y = Xᵀ * X * β expand to:

| Σy   |   | n    Σx₁    Σx₂   |   | β₀ |
| Σx₁y | = | Σx₁  Σx₁²   Σx₁x₂ | * | β₁ |
| Σx₂y |   | Σx₂  Σx₂x₁  Σx₂²  |   | β₂ |
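The MLR normal equations can be solved for the student data with numpy (an illustrative sketch, assuming numpy; the slides work the numbers by hand):

```python
import numpy as np

iq    = np.array([110, 120, 100, 90, 80])   # x1
hours = np.array([40, 30, 20, 0, 10])       # x2
score = np.array([100, 90, 80, 70, 60])     # y

# n x (k+1) design matrix: intercept column, then x1 and x2.
X = np.column_stack([np.ones(5), iq, hours])

# Solve X^T y = X^T X beta for beta = [b0, b1, b2].
beta = np.linalg.solve(X.T @ X, X.T @ score)
```

For this data the least-squares solution works out to β₀ = 20, β₁ = 0.5, β₂ = 0.5, i.e. ŷ = 20 + 0.5·IQ + 0.5·Hours.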
Root Mean Square Error (RMSE)

RMSE is the square root of the mean squared difference between observed and predicted values: RMSE = √(Σ(yᵢ − ŷᵢ)² / n). A lower RMSE indicates a better fit.
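For the SLR example (ŷ = 10 + 5x), RMSE can be computed in a few lines of Python (an illustrative sketch, not from the slides):

```python
import math

x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 27]
y_hat = [10 + 5 * xi for xi in x]   # fitted values from y_hat = 10 + 5x

# Sum of squared errors, then root of the mean squared error.
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
rmse = math.sqrt(sse / len(y))
```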
Adjusted R Squared
• Indicates how important a particular feature is to the model.
• Adjusted R² imposes a penalty for adding an additional independent variable to a model.
• Adjusted R² increases only if the new independent variable improves the model.
• It allows for the number of regressors, and may either increase or decrease.
Correlation Coefficient R
• Correlation is a statistical measure that indicates the extent to which two or more variables move together
• Redundancies can be detected by correlation analysis. Values range from −1 (perfect negative correlation) to +1 (perfect positive correlation).
R = Cov(x, y) / √(var(x) * var(y))

Correlation coefficient for a population:  ρ_xy = Cov(x, y) / (σₓ σ_y)
Correlation coefficient for a sample:      r_xy = Cov(x, y) / (sₓ s_y)

• Cohen (1992) proposed these guidelines for the interpretation of a correlation coefficient: |r| ≈ 0.10 small, ≈ 0.30 medium, ≈ 0.50 large.
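As a sketch, the sample correlation for the TV-ads data can be computed directly from the covariance and variance sums (the shared 1/n or 1/(n−1) factors cancel):

```python
import math

x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 27]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# Deviation sums; the normalising constants cancel in the ratio.
cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))  # 20
var_x  = sum((xi - x_bar) ** 2 for xi in x)                        # 4
var_y  = sum((yi - y_bar) ** 2 for yi in y)                        # 114

r = cov_xy / math.sqrt(var_x * var_y)   # ≈ 0.937: a strong positive correlation
```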
Correlation Coefficient R
Positive Correlation: A positive correlation indicates that the variables increase or decrease together (both variables move in the same direction).
Negative Correlation: A negative correlation indicates that if one variable increases, the other decreases, and vice versa (the variables move in opposite directions).
R² = 1 − SSE/SST,  where SST = SSR + SSE

SSE = Σᵢ₌₁ⁿ (yᵢ − ŷᵢ)² = Σ(actual − predicted)²
SST = Σᵢ₌₁ⁿ (yᵢ − ȳ)²  = Σ(actual − mean)²

R = Cov(x, y) / √(var(x) * var(y))
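For the TV-ads fit ŷ = 10 + 5x, SSE, SST, and R² can be computed as follows (an illustrative sketch, not from the slides):

```python
x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 27]
y_hat = [10 + 5 * xi for xi in x]
y_bar = sum(y) / len(y)

sse = sum((a - p) ** 2 for a, p in zip(y, y_hat))   # sum of squared errors
sst = sum((a - y_bar) ** 2 for a in y)              # total sum of squares
ssr = sst - sse                                     # regression sum of squares
r2  = 1 - sse / sst                                 # coefficient of determination
```

Here SSE = 14, SST = 114, so R² = 100/114 ≈ 0.877, the square of the correlation r ≈ 0.937 computed earlier.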
ANOVA (Analysis of Variance)
• The F-test is used to test the overall significance of a regression model.
• Total variation: SST = Σᵢ₌₁ⁿ (yᵢ − ȳ)², with DFT = n − 1 degrees of freedom.
• If the p-value is higher than the significance level, the results are not statistically significant.
• If the p-value is lower than the significance level, the results are statistically significant: the independent variables have explanatory power.
Adjusted R²

Indicates how important a particular feature is to the model.

Adjusted R² = 1 − ((1 − R²) * (n − 1)) / (n − p − 1)

• n is the number of data samples.
• p is the number of independent variables.

Equivalently, using the ANOVA mean squares:

Adjusted R² = 1 − MSE / MST

SSR = Σᵢ₌₁ⁿ (ŷᵢ − ȳ)²   DFR = p           MSR = SSR / DFR
SSE = Σᵢ₌₁ⁿ (yᵢ − ŷᵢ)²  DFE = n − p − 1   MSE = SSE / DFE
SST = Σᵢ₌₁ⁿ (yᵢ − ȳ)²   DFT = n − 1       MST = SST / DFT
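Applying the adjusted R² formula to the SLR example (n = 5 observations, p = 1 predictor, R² = 100/114):

```python
n, p = 5, 1          # 5 observations, 1 predictor (number of TV ads)
r2 = 1 - 14 / 114    # R^2 from the TV-ads fit (SSE = 14, SST = 114)

# Penalise R^2 for the number of predictors.
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
```

The adjustment shrinks R² from about 0.877 to about 0.836, reflecting the small sample size.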
Standard Error of Estimate (SEE)
• A smaller SSE value indicates that the observations are closer to the fitted line.
• A high SSE value indicates that the observations are far away from the fitted line.

SEE = √MSE = √(SSE / DFE), with DFE = n − p − 1 (DFR = p, DFT = n − 1).
Regression Model Assumptions and Diagnostics
• Linear Relationship
• Heteroscedasticity — Breusch-Pagan Test
• Autocorrelation
• Multicollinearity
• Scatter plots are used to detect non-linearity, unequal error variances, and outliers.
• Predicted (fitted) values are plotted on the x-axis, and residuals on the y-axis.
• A good residuals-vs-fitted plot should be relatively shapeless: no clear patterns in the data, no obvious outliers, and a generally symmetric distribution around the 0 line.

If the residuals are equally spread around a horizontal line without distinct patterns, the model has no problem; if the residual plot shows underlying patterns, the model has a problem.
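The residuals behind such a plot are simple to compute; for the TV-ads fit (an illustrative sketch, not from the slides):

```python
x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 27]
fitted = [10 + 5 * xi for xi in x]

# Residual = observed minus fitted; these are the y-axis values of the plot.
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# For a healthy fit the residuals average to zero and show no trend in x.
mean_resid = sum(residuals) / len(residuals)
```

Here the residuals are [-1, -1, -2, 2, 2], scattered on both sides of the 0 line with mean 0.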
Correlation Coefficient R /Multiple R
• Values range from −1 (perfect negative correlation) to +1 (perfect positive correlation).

R = Cov(x, y) / √(var(x) * var(y))

• Cohen (1992) proposed these guidelines for the interpretation of a correlation coefficient: |r| ≈ 0.10 small, ≈ 0.30 medium, ≈ 0.50 large.