Professional Documents
Culture Documents
Unit 7
Unit 7
• Once we find the best θ1 and θ2 values, we get the best fit line. So when
we are finally using our model for prediction, it will predict the value of y
for the input value of x.
Linear Regression
How to update θ1 and θ2 values to get the best
fit line ?
• Cost Function (J):
• By achieving the best-fit regression line, the model aims to predict y
value such that the error difference between predicted value and true
value is minimum. So, it is very important to update the θ1 and θ2 values,
to reach the best value that minimize the error between predicted y value
(pred) and true y value (y).
Linear Regression
Consider the following two variables x and y, you are required to calculate the R Squared in Regression.
Linear regression:
Problems with R-squared statistic
• The R-squared statistic isn’t perfect. In fact, it suffers from a major flaw.
Its value never decreases no matter the number of variables we add to our
regression model.
• That is, even if we are adding redundant variables to the data, the value of
R-squared does not decrease.
• It either remains the same or increases with the addition of new
independent variables.
• This clearly does not make sense because some of the independent
variables might not be useful in determining the target variable.
• Adjusted R-squared deals with this issue.
Adjusted R-squared statistic
• The Adjusted R-squared takes into account the number of independent
variables used for predicting the target variable.
• In doing so, we can determine whether adding new variables to the model
actually increases the model fit.
• Let’s have a look at the formula for adjusted R-squared to better
understand its working.
• Here,
• n represents the number of data points in our dataset
• k represents the number of independent variables, and
• R represents the R-squared values determined by the model.
• So, if R-squared does not increase significantly on the addition of a new
independent variable, then the value of Adjusted R-squared will actually
decrease.
• On the other hand, if on adding the new independent variable we see a
significant increase in R-squared value, then the Adjusted R-squared
value will also increase.
Logistic Regression in Machine Learning
• Logistic regression is one of the most popular Machine Learning algorithms,
which comes under the Supervised Learning technique. It is used for predicting
the categorical dependent variable using a given set of independent variables.
• Logistic regression predicts the output of a categorical dependent variable.
Therefore the outcome must be a categorical or discrete value. It can be either
Yes or No, 0 or 1, true or False, etc. but instead of giving the exact value as 0
and 1, it gives the probabilistic values which lie between 0 and 1.
• Logistic Regression is much similar to the Linear Regression except that how
they are used. Linear Regression is used for solving Regression problems,
whereas Logistic regression is used for solving the classification problems.
Logistic Regression in Machine Learning
• In Logistic regression, instead of fitting a regression line, we fit an "S"
shaped logistic function, which predicts two maximum values (0 or 1).
• The curve from the logistic function indicates the likelihood of something
such as whether the cells are cancerous or not, etc.
• Logistic Regression is a significant machine learning algorithm because it
has the ability to provide probabilities and classify new data using
continuous and discrete datasets.
Logistic Regression in Machine Learning
• Binary logistic regression - When we have two possible outcomes, like
example of whether a person is likely to be infected with COVID-19 or
not.
• Multinomial logistic regression - When we have multiple outcomes, say
if we build out our original example to predict whether someone may
have the flu, an allergy, a cold, or COVID-19.
Logistic Regression in Machine Learning
Logistic Regression in Machine Learning
• The hypothesis of logistic regression tends it to limit the cost function
between 0 and 1. Therefore linear functions fail to represent it as it can
have a value greater than 1 or less than 0 which is not possible as per the
hypothesis of logistic regression.
• hΘ(x) = β₀ + β₁X
• For Example, We have 2 classes, let’s take them like cats and dogs(1 —
dog , 0 — cats). We basically decide with a threshold value above which
we classify values into Class 1 and of the value goes below the threshold
then we classify it in Class 2.
Logistic Regression in Machine Learning
Logistic Regression in Machine Learning
• As shown in the above graph we have chosen the threshold as 0.5, if the
prediction function returned a value of 0.7 then we would classify this
observation as Class 1(DOG).
• If our prediction returned a value of 0.2 then we would classify the
observation as Class 2(CAT).
Cost Function
• For logistic regression, the Cost function is defined as:
• −log(hθ(x)) if y = 1
• −log(1−hθ(x)) if y = 0
Logistic Regression in Machine Learning
Logistic Regression in Machine Learning
• The above two functions can be compressed into a single function i.e.
• Now to minimize our cost function we need to run the gradient descent
function on each parameter i.e.
Logistic Regression in Machine Learning
• The Logistic Regression model can be generalized to support multiple
classes directly, without having to train and combine multiple binary
classifiers. This is called Softmax Regression, or Multinomial Logistic
Regression.
• The idea is simple: when given an instance x, the Softmax Regression
model first computes a score sk(x) for each class k, then estimates the
probability of each class by applying the softmax function (also called the
normalized exponential) to the scores.
Logistic Regression in Machine Learning
K-Nearest Neighbors Algorithm
• K-Nearest Neighbour is one of the simplest Machine Learning algorithms
based on Supervised Learning technique.
• K-NN algorithm assumes the similarity between the new case/data and
available cases and put the new case into the category that is most similar to
the available categories.
• K-NN algorithm stores all the available data and classifies a new data point
based on the similarity. This means when new data appears then it can be
easily classified into a well suite category by using K- NN algorithm.
• K-NN algorithm can be used for Regression as well as for Classification
but mostly it is used for the Classification problems.
K-Nearest Neighbors Algorithm
• K-NN is a non-parametric algorithm, which means it does not make any
assumption on underlying data.
• It is also called a lazy learner algorithm because it does not learn from the
training set immediately instead it stores the dataset and at the time of
classification, it performs an action on the dataset.
• KNN algorithm at the training phase just stores the dataset and when it
gets new data, then it classifies that data into a category that is much
similar to the new data.
K-Nearest Neighbors Algorithm
• Example: Suppose, we have an image of a creature that looks similar to
cat and dog, but we want to know either it is a cat or dog. So for this
identification, we can use the KNN algorithm, as it works on a similarity
measure. Our KNN model will find the similar features of the new data
set to the cats and dogs images and based on the most similar features it
will put it in either cat or dog category.
K-Nearest Neighbors Algorithm
K-Nearest Neighbors Algorithm
• The K-NN working can be explained on the basis of the below algorithm:
• Step-1: Select the number K of the neighbors
• Step-2: Calculate the Euclidean distance of K number of neighbors
• Step-3: Take the K nearest neighbors as per the calculated Euclidean
distance.
• Step-4: Among these k neighbors, count the number of the data points in
each category.
• Step-5: Assign the new data points to that category for which the number of
the neighbor is maximum.
• Step-6: Our model is ready.
K-Nearest Neighbors Algorithm
K-Nearest Neighbors Algorithm
• Firstly, we will choose the number of neighbors, so we will choose the
k=5.
• Next, we will calculate the Euclidean distance between the data points.
The Euclidean distance is the distance between two points, which we
have already studied in geometry. It can be calculated as:
K-Nearest Neighbors Algorithm
K-Nearest Neighbors Algorithm
• By calculating the Euclidean distance we got the nearest neighbors, as
three nearest neighbors in category A and two nearest neighbors in
category B. Consider the below image:
K-Nearest Neighbors Algorithm
• As we can see the 3 nearest neighbors are from category A, hence this new data
point must belong to category A.
• How to select the value of K in the K-NN Algorithm?
• Below are some points to remember while selecting the value of K in the K-NN
algorithm:
• There is no particular way to determine the best value for "K", so we need to try
some values to find the best out of them. The most preferred value for K is 5.
• A very low value for K such as K=1 or K=2, can be noisy and lead to the effects
of outliers in the model.
• Large values for K are good, but it may lead to more calculations
K-Nearest Neighbors Algorithm
• Advantages of KNN Algorithm:
• It is simple to implement.
• It is robust to the noisy training data
• It can be more effective if the training data is large.
• Disadvantages of KNN Algorithm:
• Always needs to determine the value of K which may be complex some
time.
• The computation cost is high because of calculating the distance between
the data points for all the training samples.