
AI for Mechanical Engineering

Dr. Arsalan Arif

Artificial Intelligence: A Modern Approach


Stuart J. Russell and Peter Norvig

Spring 2024
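Linear Regression

The dependent variable is plotted on the vertical axis and the independent variable on the horizontal axis.

Positive relationship
The dependent variable increases as the independent variable increases.
Example: weight gain vs. intake of food

Negative relationship
The independent variable increases and the dependent variable decreases.
Example: weight gain vs. expenditure

Regression Line
The line that minimizes the difference between the estimated and actual values.

Error
The difference between the estimated value and the actual value.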
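Linear Regression

$$\hat{y} = b_0 + b_1 x_1 \qquad \text{(same form as } y = mx + C\text{)}$$

Where $b_1$ is the slope of the line and $b_0$ is the intercept:

$$b_1 = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{6}{10} = 0.6$$

The regression line passes through the point of means $(\bar{x}, \bar{y}) = (3, 4)$, so

$$4 = b_0 + 0.6 \times 3 \quad\Rightarrow\quad b_0 = 2.2$$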
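[Figure: scatter plot of the five (x, y) data points]

x     y     x - x̄      y - ȳ      (x - x̄)²   (x - x̄)(y - ȳ)
1     2     1-3=-2     2-4=-2     4          4
2     4     2-3=-1     4-4=0      1          0
3     5     3-3=0      5-4=1      0          0
4     4     4-3=1      4-4=0      1          0
5     5     5-3=2      5-4=1      4          2

Mean: x̄ = 3, ȳ = 4
Sum:  Σ(x - x̄)² = 10, Σ(x - x̄)(y - ȳ) = 6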
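The slope and intercept can be checked with a few lines of plain Python. This is a minimal sketch using the five (x, y) points from the slide; the variable names (`s_xy`, `s_xx`) are illustrative and do not come from the lecture.

```python
# Data from the worked example on the slide.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

x_mean = sum(xs) / len(xs)   # 3.0
y_mean = sum(ys) / len(ys)   # 4.0

# Sum of cross-deviations and sum of squared x-deviations.
s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
s_xx = sum((x - x_mean) ** 2 for x in xs)

b1 = s_xy / s_xx             # slope
b0 = y_mean - b1 * x_mean    # intercept: the line passes through the means

print(f"b1 = {b1:.1f}, b0 = {b0:.1f}")   # b1 = 0.6, b0 = 2.2
```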
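Linear Regression

$$R^2 = \frac{\sum (\hat{y} - \bar{y})^2}{\sum (y - \bar{y})^2} = \frac{3.6}{6} = 0.6$$

The range of $R^2$ is from 0 to 1.
$R^2 = 0.6$ means the fitted line explains 60% of the variation in y, which is a reasonably good fit.

$$\hat{y} = 2.2 + 0.6x$$

[Figure: the data points with the fitted regression line]

x   y   x - x̄     y - ȳ     (y - ȳ)²   ŷ     ŷ - ȳ         (ŷ - ȳ)²   (x - x̄)²   (x - x̄)(y - ȳ)
1   2   1-3=-2    2-4=-2    4          2.8   2.8-4=-1.2    1.44       4          4
2   4   2-3=-1    4-4=0     0          3.4   -0.6          0.36       1          0
3   5   3-3=0     5-4=1     1          4     0             0          0          0
4   4   4-3=1     4-4=0     0          4.6   0.6           0.36       1          0
5   5   5-3=2     5-4=1     1          5.2   1.2           1.44       4          2

Mean: x̄ = 3, ȳ = 4
Sum:  Σ(y - ȳ)² = 6, Σ(ŷ - ȳ)² = 3.6, Σ(x - x̄)² = 10, Σ(x - x̄)(y - ȳ) = 6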
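The goodness-of-fit value can be reproduced the same way. A minimal sketch, assuming the fitted line ŷ = 2.2 + 0.6x from the slide; `ss_reg` and `ss_tot` are illustrative names for the explained and total sums of squares.

```python
# Data and fitted coefficients from the slide.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6

y_mean = sum(ys) / len(ys)
y_hat = [b0 + b1 * x for x in xs]                 # 2.8, 3.4, 4.0, 4.6, 5.2

ss_reg = sum((yh - y_mean) ** 2 for yh in y_hat)  # explained variation
ss_tot = sum((y - y_mean) ** 2 for y in ys)       # total variation

r_squared = ss_reg / ss_tot
print(f"R^2 = {r_squared:.2f}")                   # R^2 = 0.60
```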
Anaconda

Anaconda is an open-source (free) Python distribution for machine learning that comes with tools like
Spyder.
Jupyter Notebook is a platform for writing and running code interactively.
Google provides free online GPU access; a faster paid option is also available.
More cores allow more parallel computation.
Pandas Application Programming Interface (API)
Numerical Python (NumPy)

Pandas is an open-source, high-performance Python library for data analysis.

Read a "csv" file.

Preview the data: by default the first 5 samples are shown.

A different number of samples can be chosen by writing the required number.

Data visibility: summary statistics show how the values are distributed.
25% of your data is less than or equal to 7.

50% of your data is less than or equal to 7.5.
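The steps above map onto three pandas calls: `read_csv`, `head`, and `describe`. A minimal sketch follows; the CSV content and column names (`hours`, `score`) are hypothetical stand-ins, not the dataset used in the lecture, so the quartiles it prints will differ from the 7 and 7.5 quoted on the slide.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a file on disk.
csv_text = """hours,score
1,6.5
2,7.0
3,7.0
4,7.5
5,8.0
6,8.5
7,9.0
"""

# For a real file this would be pd.read_csv("your_file.csv").
df = pd.read_csv(io.StringIO(csv_text))

print(df.head())      # first 5 rows by default
print(df.head(3))     # first 3 rows: pass the required number
print(df.describe())  # count, mean, std, min, 25%, 50%, 75%, max per column
```

In the `describe()` output, the `25%` and `50%` rows are the values that 25% and 50% of the data fall at or below.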

Split the data


Training, Validation and Testing
Test size = 40% of the data
Call the scikit-learn (sklearn) library to split the data.

Train the system and validate it.


Set aside some percentage of the data (unseen by the system) for testing, then report the results on the test data.
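A split like this is commonly done with scikit-learn's `train_test_split`. The sketch below holds out 40% for testing, as on the slide, and then splits the remainder again for validation. The data, the validation fraction (0.33), and `random_state=0` are assumptions made for illustration only.

```python
from sklearn.model_selection import train_test_split

# Hypothetical data: 10 samples with one feature each.
X = [[i] for i in range(10)]
y = [2 * i + 1 for i in range(10)]

# Hold out 40% of the data as the unseen test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=0
)

# Split the remaining data again to get a validation set
# (the 0.33 fraction is an assumption, not from the slides).
X_fit, X_val, y_fit, y_val = train_test_split(
    X_train, y_train, test_size=0.33, random_state=0
)

print(len(X_fit), len(X_val), len(X_test))
```

The model would be trained on `X_fit`, tuned against `X_val`, and the final results reported on `X_test` only.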
Bias: Underfitting
Bias is the gap between the actual and estimated values.
High bias means the estimated values are far from the actual
values; low bias means they are close.
It occurs when the algorithm has limited flexibility to learn.
A high-bias model pays little attention to the training data and oversimplifies the
problem.
Such models always lead to high error on both training and test
data.

Variance:
Variance is how scattered the estimated values are.
A model with high variance pays a lot of attention to the training
data and does not generalize.
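One way to see both effects is to fit polynomials of increasing flexibility to the same noisy data and compare training and test error. This is only a sketch under stated assumptions: the data is synthetic (a straight line plus noise, generated with a fixed seed), and the polynomial degrees 0, 1, and 5 are arbitrary choices, none of which come from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    # Hypothetical data: a straight line plus Gaussian noise.
    x = rng.uniform(0, 5, n)
    y = 2.2 + 0.6 * x + rng.normal(0, 0.5, n)
    return x, y

x_train, y_train = make_data(8)
x_test, y_test = make_data(50)

def mse(coeffs, x, y):
    # Mean squared error of the polynomial with the given coefficients.
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

results = {}
for degree in (0, 1, 5):
    # degree 0: a constant (limited flexibility, high bias)
    # degree 5: very flexible (can chase noise, high variance)
    coeffs = np.polyfit(x_train, y_train, degree)
    results[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_test, y_test))
    print(degree, results[degree])   # (training MSE, test MSE)
```

Training error can only fall as the degree rises, because each model contains the simpler ones. A large gap between a low training error and a higher test error is the usual sign of overfitting.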

Overfitting
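Regularization calibrates regression models to prevent under- or overfitting by minimizing an adjusted loss
function.
Ridge regularization
Uses the L2 norm
Loss = sum of the squares of the errors + λ × (sum of the squared coefficients)
Lasso regularization
Uses the L1 norm
Loss = sum of the squares of the errors + λ × (sum of the absolute values of the coefficients)
Here λ is the regularization strength; λ = 0 gives ordinary linear regression.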
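For a single feature, both penalties have a closed-form slope, which makes their effect easy to see. The sketch below reuses the five data points from the linear regression example. It assumes the intercept is not penalized, and `lam` (the penalty strength) is an illustrative name that does not appear in the slides.

```python
# Data from the linear regression example.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

x_mean = sum(xs) / len(xs)
y_mean = sum(ys) / len(ys)
s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))  # 6
s_xx = sum((x - x_mean) ** 2 for x in xs)                        # 10

def ridge_slope(lam):
    # Minimizes sum((y - b0 - b1*x)^2) + lam * b1^2  (L2 penalty).
    return s_xy / (s_xx + lam)

def lasso_slope(lam):
    # Minimizes sum((y - b0 - b1*x)^2) + lam * |b1|  (L1 penalty),
    # which gives the soft-thresholding rule.
    shrunk = max(abs(s_xy) - lam / 2, 0.0)
    return (shrunk if s_xy >= 0 else -shrunk) / s_xx

for lam in (0, 2, 20):
    print(lam, round(ridge_slope(lam), 3), round(lasso_slope(lam), 3))
# 0 0.6 0.6
# 2 0.5 0.5
# 20 0.2 0.0
```

With `lam = 0` both reduce to the ordinary slope of 0.6. As `lam` grows, ridge shrinks the slope toward zero without reaching it, while lasso can set it exactly to zero.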
