Artificial Intelligence: Assignment On ML
The objective of the classifier is to predict whether a given fruit is a ‘Banana’, an ‘Orange’, or ‘Other’
when only the three features (Long, Sweet, and Yellow) are known.
Step 1: Compute the prior probabilities for each class of fruit.
o P(Y=Banana) = 500 / 1000 = 0.50
o P(Y=Orange) = 300 / 1000 = 0.30
o P(Y=Other) = 200 / 1000 = 0.20
Step 2: Compute the probability of evidence that goes in the denominator.
This is the product of the marginal probabilities P(X) of the three features. The step is
optional because the denominator is the same for all classes and so does not change which
class scores highest.
• P(x1=Long) = 500 / 1000 = 0.50
• P(x2=Sweet) = 650 / 1000 = 0.65
• P(x3=Yellow) = 800 / 1000 = 0.80
Step 3: Compute the likelihood of the evidence that goes in the numerator.
It is the product of the conditional probabilities of the 3 features. The formula calls for
P(X1 | Y=k); here X1 is ‘Long’ and k is ‘Banana’, i.e. the probability that the fruit is
‘Long’ given that it is a Banana. The table has 500 Bananas, of which 400 are Long. So,
P(Long | Banana) = 400/500 = 0.80.
The computation below is for Banana alone.
Probability of Likelihood for Banana
• P(x1=Long | Y=Banana) = 400 / 500 = 0.80
• P(x2=Sweet | Y=Banana) = 350 / 500 = 0.70
• P(x3=Yellow | Y=Banana) = 450 / 500 = 0.90
So, the overall probability of Likelihood of evidence for Banana = 0.8 * 0.7 * 0.9 = 0.504
Step 4: Substitute all three terms into the Naive Bayes formula to get the probability
that the fruit is a Banana.
P(Banana | Long, Sweet, Yellow)
= [P(Long|Banana) × P(Sweet|Banana) × P(Yellow|Banana) × P(Banana)] / [P(Long) × P(Sweet) × P(Yellow)]
= (0.8 × 0.7 × 0.9 × 0.5) / P(Evidence)
= 0.252 / P(Evidence)
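Steps 1 through 4 can be checked numerically. The sketch below uses only the counts from the worked table, and also divides through by the evidence term (0.50 × 0.65 × 0.80 = 0.26) to obtain the actual posterior:

```python
# Counts taken from the fruit table in the worked example (1000 fruits total).
total = 1000

# Step 1: prior probability of Banana.
p_banana = 500 / total  # 0.50

# Step 2: probability of evidence (product of the feature marginals).
p_evidence = (500 / total) * (650 / total) * (800 / total)  # 0.26

# Step 3: likelihood of the evidence for Banana.
likelihood_banana = (400 / 500) * (350 / 500) * (450 / 500)  # 0.8 * 0.7 * 0.9

# Step 4: Naive Bayes numerator and posterior.
numerator = likelihood_banana * p_banana   # 0.252
posterior_banana = numerator / p_evidence  # ≈ 0.969
print(round(numerator, 3), round(posterior_banana, 3))
```

Dividing by the evidence shows the fruit is classified as Banana with roughly 97% probability under the naive independence assumption.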
Linear Regression:
Linear Regression is a machine learning algorithm based on supervised learning. It performs a
regression task. Regression models a target prediction value based on independent variables. It is
mostly used for finding out the relationship between variables and forecasting.
There are mainly two types of linear regression models:
• Simple Linear Regression
• Multivariable (Multiple) Linear Regression
The representation of linear regression is a linear equation that combines a specific set of input
values (x), the solution to which is the predicted output (y) for that set of inputs. As such,
both the input values (x) and the output value (y) are numeric.
Simple linear regression uses the traditional slope-intercept form, where m and b are the variables
our algorithm will try to “learn” in order to produce the most accurate predictions; x represents our
input data and y represents our prediction.
y=mx+b
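For simple linear regression, m and b have a closed-form least-squares solution; a minimal sketch on hypothetical data that lies exactly on the line y = 2x + 1:

```python
# Hypothetical data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 * x + 1 for x in xs]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Least-squares slope: covariance(x, y) / variance(x).
m = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
    / sum((x - mean_x) ** 2 for x in xs)
b = mean_y - m * mean_x  # intercept from the means

print(m, b)  # 2.0 1.0
```

Because the data lies exactly on a line, least squares recovers m = 2 and b = 1 exactly.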
Multivariable regression uses a more complex, multi-variable linear equation, which might look like
this, where the w's represent the coefficients, or weights, our model will try to learn.
f(x, y, z) = w1x + w2y + w3z
The variables x, y, and z represent the attributes, or distinct pieces of information, we have about each
observation. For sales predictions, these attributes might include a company’s advertising spend
on radio, TV, and newspapers.
Sales=w1Radio+w2TV+w3News
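That prediction is just a weighted sum of the features; a minimal sketch, where the weight values and spend figures are made up for illustration:

```python
def predict(features, weights):
    # Sales = w1*Radio + w2*TV + w3*News: a weighted sum of the features.
    return sum(w * x for w, x in zip(weights, features))

weights = [2.0, 0.5, 0.25]  # hypothetical learned weights w1, w2, w3
spend = [3.0, 4.0, 8.0]     # hypothetical radio, TV, newspaper spend

print(predict(spend, weights))  # 2*3 + 0.5*4 + 0.25*8 = 10.0
```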
When working with linear regression, our main goal is to find the best-fit line, meaning the
error between the predicted values and the actual values should be minimized. The best-fit line
is the one with the least error.
Our prediction function outputs an estimate of sales given a company’s radio advertising spend
and our current values for Weight and Bias.
Sales=Weight⋅Radio+Bias
Weight: the coefficient for the Radio independent variable. In machine learning we call
coefficients weights.
Radio: the independent variable. In machine learning we call these variables features.
Bias: the intercept, where our line intersects the y-axis. In machine learning we call
intercepts bias. Bias offsets all predictions that we make.
Our algorithm will try to learn the correct values for Weight and Bias. By the end of our training,
our equation will approximate the line of best fit.
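One common way for the algorithm to learn Weight and Bias is gradient descent on the mean squared error; a minimal sketch, with hypothetical data generated from Sales = 2·Radio + 1:

```python
# Hypothetical training data generated from Sales = 2*Radio + 1.
radio = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2 * x + 1 for x in radio]

weight, bias = 0.0, 0.0
lr = 0.01  # learning rate
n = len(radio)

for _ in range(20000):
    preds = [weight * x + bias for x in radio]
    errors = [p - y for p, y in zip(preds, sales)]
    # Gradients of the mean squared error with respect to weight and bias.
    grad_w = (2 / n) * sum(e * x for e, x in zip(errors, radio))
    grad_b = (2 / n) * sum(errors)
    weight -= lr * grad_w
    bias -= lr * grad_b

print(round(weight, 3), round(bias, 3))  # approaches 2.0 and 1.0
```

After training, the learned Weight and Bias approximate the line of best fit for the data.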
Figure 2: Regression Plot