
To start your implementation of logistic regression you will use an IDE: Google Colab, or Jupyter Notebook if you have it installed.


You will implement the logistic regression step by step. When you first open the notebook you can see that it is in read-only mode, so you need to create a copy: just click Save a copy in Drive and this will create a copy. This way you are able to re-implement the whole notebook from scratch.
You must first import the libraries that you are going to use.
Import the dataset that you will use, so you have the data for your implementation.
Splitting the dataset into the training set and test set: this is one of the most essential steps that happens almost every time you build a model.

Here you are able to change the test size so that you get nice round numbers.
In most datasets you have to make sure first that the features are in the first columns and the dependent variable is in the last column.
You can print the data here to see what it looks like after the split. One selection will automatically select all the columns except the last one (the features), and the other will automatically select the last column (the dependent variable), regardless of the number of features you have in your dataset.
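As a minimal sketch of these first steps, assuming pandas is available and a hypothetical CSV file named 'Social_Network_Ads.csv' with the features first and the dependent variable in the last column:

# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Importing the dataset (the file name is an assumption)
dataset = pd.read_csv('Social_Network_Ads.csv')
X = dataset.iloc[:, :-1].values  # automatically selects all columns except the last one
y = dataset.iloc[:, -1].values   # automatically selects the last column

# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)  # change test_size to get nice round numbers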

Feature scaling is not required for logistic regression; however, it can improve the training performance and therefore also improve the final predictions. For some other models, feature scaling is an absolute necessity.

You can insert a print here again to see what the data looks like after feature scaling. The scaler is fitted only on the training set; but when we deploy our model to predict on the data we want, we will have to apply the predict method on scaled values, otherwise the operations will be nonsense: the predict method has to be called on a set of features with the same scale as the one that was applied during training.
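A minimal sketch of this step, using scikit-learn's StandardScaler, fitted on the training set only:

# Feature scaling: fit the scaler on the training set only,
# then apply the exact same transformation to the test set
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
print(X_train)  # see what the data looks like after feature scaling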

-After this step, learn to use scikit-learn.
-Logistic regression is a linear model.
-Go to scikit-learn: there you have the whole documentation of the LogisticRegression class.

The next step is training the logistic regression model on the training set.


To import our class and then create the object: the syntax starts with from, so from sklearn.linear_model, the linear model module of scikit-learn, you import LogisticRegression; that's the usual scenario, importing the LogisticRegression class from the linear model module of the scikit-learn library. The next step is to create an object of this LogisticRegression class, which will be exactly the logistic regression model itself. We call it classifier, because we are no longer in regression: what we build in this part three are classification models. To create an instance of a class we call the class and add some parentheses, and now the question is: do we have to input any parameters? No, we would just like to build the model with its default values.

Final step: we take our classifier, and what we want to do is train it on the training set, because remember that the previous line of code only builds the logistic regression model but doesn't train it yet. To train our classifier on the training set we have to call the fit method, which takes as input two entities, two sets of data: the first one is the matrix of features of the training set, and the second one is the dependent variable vector of the training set (all the purchase decisions of the dependent variable vector correspond to the same customers as the matrix of features). So we input X_train for the matrix of features of the training set and y_train for the dependent variable vector of the same training set. There you go: that's how you build the logistic regression model. That's one extra model; you are going to have many more, and we will train on how to select the best one for any dataset. So let's run this cell; this will indeed build and train the logistic regression model on X_train and y_train. By the way, you can see all the parameters of the logistic regression; the most famous one to tune is C, which is the inverse of the regularization strength, meaning that the smaller C is, the stronger the regularization will be, and therefore the more it will protect you from overfitting.
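Putting this together, a minimal sketch (the random_state value is an assumption, added for reproducibility):

# Training the Logistic Regression model on the Training set
from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression(random_state=0)  # C defaults to 1.0, the inverse of regularization strength
classifier.fit(X_train, y_train)  # matrix of features and dependent variable vector of the training set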

Predicting a New Result

Here we'll just keep the default value (C defaults to 1), and we're ready to move into the next step: predicting a new result. Take your classifier and then call the predict method to predict the result of a single observation. It takes the two feature values, thirty years old and eighty-seven thousand dollars as the estimated salary, and now we would like to predict whether this customer bought the SUV, yes or no. The answer is in y_test: you take the first result of y_test, and thus you'll be able to compare your prediction to the real result and figure out whether your prediction was correct. The first customer is actually 30 years old and earns an estimated salary of $87,000; if we have a look at our test set (the original one, before feature scaling), this is exactly the customer you have to input in your method in order to predict the purchase decision. We're going to check two things: first, that you got the right implementation of the predict method, and mostly the right input inside the predict method; and secondly, that the prediction is correct. We'll have a look at y_test, where the purchase decision of that first customer is exactly this one, because y_test contains all the real results, meaning all the real purchase decisions. So we're going to see if our model was indeed able to make a correct prediction by predicting zero for this first customer of age 30 and salary 87,000. Let's proceed to the solution: here it is, predicting a new result. We leave the training cell as it is and create a new code cell.

The first step is to take our classifier, because of course the predict method, like any other method, has to be called from the classifier object itself; so we take our classifier object and from this object we call the predict method. You have other prediction methods: predict_log_proba, which returns the logarithm of the probability that your prediction is one, i.e. that the purchase decision is one; and predict_proba, which returns directly the probability that the purchase decision is one. So depending on whether you want directly the final prediction, zero or one, or the probability that the dependent variable is equal to one, you have the choice between predict and predict_proba.
Because we directly want to get the prediction of whether or not the first customer of the test set bought the SUV, yes or no, we're going to add some parentheses. What do you put inside the predict method in order to predict the purchase decision of that customer who is 30 years old and earns a salary of eighty-seven thousand dollars? First of all, let's recall something essential: any single observation inside the predict method has to be input in a double pair of square brackets. That's because the predict method expects a two-dimensional array as input, and that is how we create a two-dimensional array with only one observation, meaning only one row: what we are about to input is a two-dimensional matrix of one row and two columns, where the row corresponds to that single customer and the columns correspond to the features, age in the first column and salary in the second column. So inside the second pair of square brackets, which corresponds to those columns, you have to input the two values of these two features, the age and the salary; and since our first customer is 30 years old and has a salary of $87,000, the two values that come here are first 30 and then 87000. The first thing you must absolutely do, the right reflex, is to input your single observation within a double pair of square brackets, to give the expected two-dimensional array format to your predict method. Then you have to do something else, which is to scale that single observation, because 30 years old and $87,000 as the estimated salary are in the original scale, before applying feature scaling (standardization), whereas our model was actually trained on X_train and y_train, which were previously feature scaled.

The predict method can only be applied to observations where the features have the exact same scale as the one that was used for the training, and therefore here we need to apply the transform method of our scaler, which has to take as input this whole single observation array.
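A minimal sketch of this single prediction, assuming the scaler object from the feature scaling step is named sc:

# Predicting the purchase decision of a single customer: age 30, estimated salary $87,000.
# The double pair of square brackets creates the expected 2D array (one row, two columns),
# and transform puts the observation on the same scale as the training set.
print(classifier.predict(sc.transform([[30, 87000]])))
# predict_proba would instead return the probability of each class:
print(classifier.predict_proba(sc.transform([[30, 87000]])))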

Print the output of the predict method; let's press play and see if our classifier manages to predict the right outcome, meaning the right purchase decision, which according to the test set (which contains the real results) is zero, meaning that this first customer of age 30 with salary 87,000 isn't buying that new SUV. So let's press play and see if it is zero... and great, it is zero: the model did amazing here on this single observation, or customer. We're going to move on to the next step, which will be to predict the test set results. You surely know exactly how to do it; make sure to figure out whether you need to apply feature scaling or not, and you will get to the right solution. Please display the vector of predictions, the vector of predicted values.

Predicting the Test Set Results


Now we are predicting the test set results and displaying the vector of predictions next to the vector of real results, meaning the real purchase decisions. We want you to juggle with all your toolkits, whether it is the data pre-processing toolkit or your other machine learning models, where you indeed have several tools inside, because the tool we would like to use now is that little piece of code that allows us to display the two vectors of predicted results and real results. To be as efficient as you can, go to part two, regression, then into the multiple linear regression folder, and open the multiple linear regression implementation, which indeed contains that tool allowing us to display the two vectors of predicted results and real results.

All we have to do here is copy this little piece of code, that tool, and paste it in a new code cell here to predict the test set results. We keep all the same names, with the vector of predictions being the result of the predict method applied to the test set and called, of course, not from a regressor object but from our classifier object; that's the first change you have to make. Then, since this time our predicted purchase decisions and the real purchase decisions are either zero or one, you don't need anything that sets the number of decimals after the comma to two: here we only deal with integers, so we can remove this. Then, final question: do you have to change anything else? Absolutely not.
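A minimal sketch of that little tool, assuming numpy is imported as np and the variable names used above:

# Predicting the Test set results and displaying the predicted
# purchase decisions next to the real ones, column by column
y_pred = classifier.predict(X_test)
print(np.concatenate((y_pred.reshape(len(y_pred), 1),
                      y_test.reshape(len(y_test), 1)), 1))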


That's what I mean by grabbing a tool and applying it on your new model: the only thing to change is the name of the model, from regressor to classifier. Let's check it out, let's press play, and we get two vectors next to each other. First, on the left, your vector of predictions, the predicted purchase decisions for all the customers of the test set (the predict method was applied to X_test, so that's all the customers of the test set), and on the right, in the second column, you have the real purchase decisions. What is interesting here is to compare the predicted purchase decisions to the real ones for all the customers in the test set. Let's see: for the first customer of the test set, remember, of age 30 and salary $87,000, the prediction is no, this customer didn't buy the new SUV, and the real result is no, in reality that customer didn't buy the new SUV. Same for the second customer, who was predicted not to buy and indeed did not buy that new SUV. The third customer actually bought it, and our model predicted that indeed this customer bought it. We actually have a lot of correct predictions so far. Here is our first incorrect prediction: our logistic regression model predicted that this particular customer didn't buy, because we have a prediction of zero here, but in reality that customer bought that new amazing SUV, because the real result here is a one. Then here it is correct, then another incorrect prediction, where our model predicted again that this customer didn't buy the SUV whereas in reality the customer bought the new SUV. Still, that looks really good: we'll get a very nice confusion matrix (we'll see what it is) and mostly a very good accuracy, because the accuracy on the test set is simply the number of correct predictions divided by the total number of observations in the test set. And this is exactly what we're about to compute on the test set, where we will not only get the confusion matrix.

Making a Confusion Matrix

The confusion matrix will show us exactly the number of correct predictions and the number of incorrect predictions for the two cases where the real result was zero or one: we'll have a nice matrix showing how many mistakes and correct predictions our model made. Inside the same new step, or code cell, we will also compute the accuracy: we'll see what percentage of correct predictions our model made on the test set. You should try it yourself first, because once again you just have to go to the scikit-learn API and figure out how to make that confusion matrix and how to compute the accuracy; you'll have to look into the metrics module from scikit-learn, and there you can really find the tools you need.
With that, we have three steps left, and the one we're about to implement is the confusion matrix, which is a simple two-by-two matrix, two rows and two columns, showing the number of correct predictions we made in both cases, predicting zero or one, and how many incorrect predictions we made in those same cases, zero or one. That will be a nice way to see quickly where we did right and where we did wrong; after that we will compute the accuracy. As at the end of the previous step, we ask you to try to figure it out on your own by looking at the scikit-learn API, and we're going to show you how to navigate it to find the information we want and the tool we need. Let's go back to the welcome page of scikit-learn; then you have to go to the API section, which contains all the classes and functions from the different modules, and look into the module called metrics. We just have to scroll down a bit and we will very soon find metrics (it's in alphabetical order; there it is). It's very well organized: you have the regression metrics, in which we already covered the most important ones, and the classification metrics, and since here we are dealing with classification, this is where you have to look. Now we're getting closer: inside we see confusion_matrix. We are training you to be independent whenever you need new information or any utility, and training you on how to find it.

In the scikit-learn API it's the same as when working with TensorFlow, and this is very important. So inside this confusion_matrix function, what do we have to replace? The vector of real results, y_true. Since we want to distinguish the vector of real results of the training set from that of the test set, our y_true vectors are called y_train for the training set and y_test for the test set; and since the confusion matrix is usually evaluated on the test set, on new observations, here we have to replace y_true by y_test, so that we indeed get the confusion matrix showing the correct predictions and the incorrect predictions for both cases, 0 and 1, on the test set. We will put the output of this confusion_matrix function, applied to y_test and y_pred, into a new variable which we're going to call cm, which stands for confusion matrix, and which will be exactly the output returned by this confusion_matrix function; and we'll add a final print(cm) in order to print that confusion matrix. These are the lines of code that build that confusion matrix and print it. Now, remember we also asked you to compute the accuracy, and to do this we just have to do exactly the same as what we just did to find that information (try to pause and find it yourself if you haven't already). We go back to that metrics module, remember, the metrics module from the scikit-learn library, and we look back into this metrics section to find the accuracy. It is actually the first one: accuracy_score, which computes the accuracy classification score, the rate of correct predictions. Let's click this link and we get all the documentation on this accuracy_score function, which returns the accuracy of your model on whatever set of data, and which we will apply of course to the test set. First of all, as you can see, this accuracy_score function belongs to the same metrics module, so we don't have to retype all of the import again: we can just take the name of the function, and just next to confusion_matrix add a comma and paste this other function we need, which we still import from that metrics module of the scikit-learn library. Now we have it and we can use it.
Just below, we will actually call it. To do this efficiently, we can just take the example from the documentation, paste it right below, and once again replace y_true by y_test, so that we indeed get the accuracy on the test set. Here we don't have to do a print, because this accuracy_score call will directly return that accuracy, that rate of correct predictions, as the output of the cell. So there you go: you can just run the cell.
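A minimal sketch of this cell, assuming y_pred holds the test set predictions from the previous step:

# Making the Confusion Matrix and computing the accuracy on the test set
from sklearn.metrics import confusion_matrix, accuracy_score
cm = confusion_matrix(y_test, y_pred)  # real results vs predicted results
print(cm)
accuracy_score(y_test, y_pred)  # the rate of correct predictions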
We have here the confusion matrix, showing that we have indeed 65 correct predictions of the class 0, meaning the customers of the test set who didn't buy the new SUV; then 24 correct predictions of the class 1, meaning correct predictions of the customers who bought the SUV; then 3 incorrect predictions of the class 1, meaning incorrect predictions of customers who in reality bought the SUV but were predicted not to; and 8 incorrect predictions of the class 0, meaning 8 customers who in reality didn't buy the SUV but were predicted to buy it. So you see, the confusion matrix is very easy to read, and from it we can get the main information about our predictions. Finally, that little number that we have here is of course the accuracy, and we got 0.89, which means that we had 89 percent of correct predictions on the test set. There are one hundred observations in the test set, which means that we had indeed 89 correct predictions (65 plus 24 is indeed equal to 89), but for any size of test set this would mean that we had 89 percent of correct predictions. That's exactly what the accuracy is, the rate of correct predictions, and it lets you quickly evaluate a classification model; the accuracy is usually the right metric to use when evaluating your classification models, and now you have it in the toolkit. So here we go for our final step: we're going to visualize not only the training set results but also the test set results, and this will be super interesting because we will actually see how the logistic regression classifier was trained to classify our customers, our observations, into two different classes, zero or one. We will have super visualizations showing all the real results, in both the training set and the test set, and also the prediction regions: the region where our predictions are zero and the other region where the predictions are one.

Visualizing the Training Set Results

So here we are at the final step of this implementation, the most exciting one, because this is the step where we're going to visualize, on a nice 2D plot, the prediction curve and the prediction regions of the logistic regression model. More specifically, what we're about to plot is a two-dimensional pyplot figure with two axes, x and y: on the x-axis we have the first feature, corresponding to age, and on the y-axis we have the estimated salary. Each of the observation points you will see on the plot will correspond to a specific customer: either a customer of the training set, on the plot of the training set results, or a customer of the test set, on the plot of the test set results. What is most interesting to see in this plot will be the prediction regions, meaning the region where our logistic regression model predicts the class 0, the customers who didn't buy the SUV, and the other region where our model predicts the class 1, the customers who bought the SUV. Lastly, what will be really interesting is the curve separating these two regions, the region of prediction zero and the region of prediction one, and this is exactly how we're going to see the difference between linear classifiers and non-linear classifiers. We are starting with a linear classification model, but we will see in the next sections of this part that the prediction boundary (that's what we call the boundary between these two prediction regions) will be different depending on whether or not your classifier is linear. So let's start by visualizing the training set and test set results for the model. The code to visualize this is pretty advanced, and not only is it advanced, you will also never have to implement it again in your career. Why? Because you will mostly work with datasets having many features, more than two, and here the only reason why we have a dataset of two features is so that we can visualize the prediction regions and the prediction boundary: to visualize these, we need at most two features, because one feature corresponds to one dimension of this plot. So this visualization code is only useful for the learning process and will not be useful for your future models; we suggest that you don't waste too much time understanding the whole code and re-implementing it yourself, because we're going to show you the right way with the original implementation. You will see that the code is advanced; it's not like plotting a regression curve like we did in part two. To show you the training set results, it uses a lot of tricks to plot all the observation points.

If you want to have a look at it and understand it, it's fine, but it's also totally okay if we don't cover this code in detail. Just so that we can show you how it's done: basically, what we do is create a grid, a frame containing all the ranges of your features, and we create this grid with high density, meaning that the points of this grid are not separated one by one but every 0.25: for example, age goes from 10 to 10.25 to 10.5 to 10.75 to 11, and so on up to around 69, and similarly for the salary axis up to somewhere around 149,000, resulting in super dense points inside the grid.

Then the trick: not only did we plot all the real observation points in the grid (all the points that you see here are the customers of either your training set or, later on, your test set; the green points are of course the customers who bought the SUV, represented by one, and the red points are the customers who didn't buy the SUV, represented by zero), but the way to plot the prediction regions, with that prediction boundary separating the two regions, is to apply the predict method to each of these dense points in the grid. All the dense points in the red region were actually predicted to be zero, meaning all the customers inside this region are predicted not to buy the SUV, and all the points in the green region are predicted to buy the SUV, because the green region represents the predictions of one. That is how it works; you don't have to understand all the techniques used to implement this, since once again you will probably never have to implement that kind of code. So what we're going to do now is get that whole code from the original implementation and place it inside our new implementation, and we're going to do the same for the test set, but we won't look at the test results just yet. Let's get the code, paste it below in a new code cell, and enjoy the results.
what
we're gonna do is first execute this cell it's gonna take a little while because you know
the step is point 25 meaning we will get a very dense grid as i've just explained the
predict method is applied on each of thisdense points of the grid and that's why there
are actually a lot of predictions to compute that's why it's taking a little time but there
you go it's coming
we just got the results of the training set let's also plat now the results of the test set so
we can just let it run and start by observing and interpreting the results of the training
set
So, just to recap, you have to understand four things in this plot. First, all the points that you see, whether they're red or green, are the real customers and the real results of the training set: the green points correspond to the customers who bought the SUV, and the red points correspond to the customers who didn't buy any SUV. The other two things to understand are that those colored regions, the red and the green, are the prediction regions: the red region is where our model predicts that the customers didn't buy the SUV, and the green region is where it predicts that the customers bought the SUV. In order to figure out where the correct and incorrect predictions are: the correct predictions are where we have observation points with the same color as the region, and the incorrect predictions are where we have observation points with a color different from their prediction region. For example, this customer here, who in reality bought the SUV (corresponding to one), is actually an incorrect prediction because it falls into the wrong region, the red region; and vice versa, this one, who in reality didn't buy the SUV (it corresponds to zero), was a wrong prediction because it falls into the green region, where customers are predicted to buy the SUV.

Then finally, what's most interesting: the prediction boundary. As you understood, the prediction boundary lies between those two prediction regions, the green one and the red one; it is where your classifier separates the two classes, class zero and class one. And now you have to understand something very important: the prediction curve of the logistic regression model is actually a straight line, for one specific reason, namely that the logistic regression model is a linear classifier. For any linear classifier the prediction boundary, or prediction curve, will always be a straight line in two dimensions; in three dimensions it will be a flat plane.

So that's what we get for linear classifiers, and what will be really interesting to see in the next sections, when building our other classification models, is that for the non-linear models, like naive Bayes or SVM with a non-linear kernel, the prediction curve or prediction boundary will actually not be a straight line. Now, remember that the model was trained on these very observations, so it's kind of easy to produce such results, because these are exactly the observations it learned from. What we would like to see now are the results on the test set, which contains new customers on which our model wasn't trained; we have to see if our model is still able to separate the two classes, meaning the customers who bought the SUV and the customers who didn't, despite the fact that these are new customers the model wasn't trained on. And that's exactly what we see when visualizing the test set results (we already executed the cells here): our logistic regression model was still able to separate the two classes, zero for all those red points and one for all those green points. There are still some incorrect predictions, like these customers who didn't buy the brand new SUV but were predicted to buy it, and a few incorrect predictions of the other class, meaning these customers who in reality bought the SUV but were predicted not to, because they fell in the red region instead of the green region.

So how can we conclude, and what are the takeaways for our future classification models? Well, first, the model does a very good job at separating our two classes, and therefore at predicting whether the customers bought the SUV; but we would hope to build a model that makes even fewer prediction errors. And how could a boundary bend and go around, in order to catch all the red points, the red customers, and leave all the green points, the green customers, inside the green region? As you might guess, this is what we might be able to get with non-linear classifiers: be ready for some even more important classification models that manage to separate these two classes even better.

In order to implement the other classification models, we will only have to change one single cell of the implementation, which is the cell where we build and train the classification model: we will only have to change the cell where we build the classifier and train it on the training set, and all the rest will stay the same. We won't have to change anything else, because we will just call our classifier to predict this single result and then all the results of the test set; and the same goes for the visualization, where we won't have to change anything except the name of the model in the title, because all the variable names will be the same, with classifier here being the logistic regression model.
