DMML Unit4 Ppt.pptx
What is Regression?
• Sales Forecasting
• To predict the prices / rent of houses and other factors
• Finance applications: to predict stock prices, investment evaluation, etc.
• Rainfall and weather prediction

Why do we use Regression Analysis?
• Sales Forecasting
• Risk Analysis
• Housing applications: to predict prices and other factors
• Finance applications: to predict stock prices, investment evaluation, etc.
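A concrete way to see what regression does in these applications is to fit a straight line by least squares. Below is a minimal sketch in plain Python; the house sizes and prices are made-up illustrative numbers, not data from the slides:

```python
# Minimal least-squares fit of y = b0 + b1*x.
# Hypothetical data: house size (sq. ft.) vs. price (lakhs), for illustration only.
sizes = [650.0, 800.0, 1100.0, 1400.0, 1800.0]
prices = [30.0, 38.0, 52.0, 65.0, 82.0]

n = len(sizes)
mean_x = sum(sizes) / n
mean_y = sum(prices) / n

# Closed-form ordinary least squares for a single feature
b1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, prices)) \
     / sum((x - mean_x) ** 2 for x in sizes)
b0 = mean_y - b1 * mean_x

# Predict the price of a 1000 sq. ft. house
print(round(b0 + b1 * 1000.0, 1))
```

The fitted slope b1 is the price increase per extra square foot; prediction is just evaluating the fitted line at a new input.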
• Binary (0/1, pass/fail)
• Multi-class (cats, dogs, lions)
• Ordinal (low, medium, high)
• With binary classification, let 'x' be some feature and 'y' be the output, which can be either 0 or 1.
• The probability that the output is 1 given its input can be represented as:

log(p(x) / (1 - p(x))) = b0 + b1*x

• where the left-hand side is called the logit or log-odds function, and p(x)/(1 - p(x)) is called the odds.
• The odds signify the ratio of the probability of success to the probability of failure. Therefore, in Logistic Regression, a linear combination of the inputs is mapped to the log-odds of the output being equal to 1.
• If we take the inverse of the above function, we get:

p(x) = 1 / (1 + e^-(b0 + b1*x))
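The logit and its inverse (the sigmoid) can be sketched numerically. The coefficients b0 and b1 below are assumed values chosen for illustration, not fitted to any data:

```python
import math

# Hypothetical logistic-regression coefficients (illustrative, not fitted)
b0, b1 = -4.0, 1.5

def log_odds(x):
    # Linear combination of inputs maps to log(p/(1-p))
    return b0 + b1 * x

def p_of_x(x):
    # Inverse of the logit: the sigmoid function
    z = log_odds(x)
    return 1.0 / (1.0 + math.exp(-z))

x = 3.0
print(round(log_odds(x), 2))  # 0.5
print(round(p_of_x(x), 3))    # 0.622
```

Note that the sigmoid squeezes any real-valued log-odds into (0, 1), which is what lets the output be read as a probability.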
Performance Metrics for Classification Problems in Machine Learning
What is Naive Bayes?
• P(YES|X) = ?
• P(NO|X) = ?

By Bayes' theorem:

P(YES|X) = P(X|YES) P(YES) / P(X)
P(NO|X) = P(X|NO) P(NO) / P(X)

Since P(X) is the same for both classes, it is enough to compare the numerators:

P(YES|X) ∝ P(X|YES) P(YES)
P(NO|X) ∝ P(X|NO) P(NO)
P(X|YES) P(YES) = P(age<=30|YES) * P(income=medium|YES) * P(student=yes|YES) * P(credit=fair|YES) * P(YES)
                = 0.222 * 0.444 * 0.667 * 0.667 * 0.643 = 0.028

P(X|NO) P(NO) = P(age<=30|NO) * P(income=medium|NO) * P(student=yes|NO) * P(credit=fair|NO) * P(NO)
              = 0.6 * 0.4 * 0.2 * 0.4 * 0.357 = 0.007

Since 0.028 > 0.007, the classifier predicts YES.
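The two numerators can be checked in a few lines, reproducing the slide's arithmetic with the conditional probabilities quoted above:

```python
# X = (age<=30, income=medium, student=yes, credit=fair)
# Each factor is a class-conditional probability taken from the slide.
p_x_given_yes_times_p_yes = 0.222 * 0.444 * 0.667 * 0.667 * 0.643
p_x_given_no_times_p_no = 0.6 * 0.4 * 0.2 * 0.4 * 0.357

print(round(p_x_given_yes_times_p_yes, 3))  # 0.028
print(round(p_x_given_no_times_p_no, 3))    # 0.007
```

The larger numerator wins, so the prediction for this X is YES.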
Compare P(N|X) and P(Y|X) for X = (Sunny, Cool, High, windy=true):

P(X|N) P(N) = P(Sunny|N) * P(Cool|N) * P(High|N) * P(true|N) * P(N)
P(X|Y) P(Y) = P(Sunny|Y) * P(Cool|Y) * P(High|Y) * P(true|Y) * P(Y)
            = 2/9 * 3/9 * 3/9 * 3/9 * 9/14 ≈ 0.0053
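The same count-then-multiply procedure can be written as a tiny generic classifier. The toy weather rows below are invented for illustration (the slide's full table is not reproduced here):

```python
from collections import Counter, defaultdict

# Made-up mini dataset: (outlook, windy) -> play. Illustrative only.
data = [
    (("Sunny", "true"), "N"),
    (("Sunny", "false"), "Y"),
    (("Rain", "true"), "N"),
    (("Overcast", "false"), "Y"),
    (("Overcast", "true"), "Y"),
    (("Rain", "false"), "Y"),
]

class_counts = Counter(label for _, label in data)
feature_counts = defaultdict(Counter)  # (feature index, label) -> value counts
for features, label in data:
    for i, value in enumerate(features):
        feature_counts[(i, label)][value] += 1

def score(features, label):
    # P(X|label) * P(label), estimated from relative frequencies
    s = class_counts[label] / len(data)
    for i, value in enumerate(features):
        s *= feature_counts[(i, label)][value] / class_counts[label]
    return s

x = ("Sunny", "true")
prediction = max(class_counts, key=lambda label: score(x, label))
print(prediction)  # N - in this toy data, Sunny + windy resembles the "no play" rows
```

A real implementation would also apply Laplace smoothing so an unseen feature value does not zero out the whole product.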
Types of Naïve Bayes Model:
• There are three types of Naive Bayes model, which are given below:
• Gaussian: The Gaussian model assumes that features follow a normal distribution. This means that if the predictors take continuous values instead of discrete ones, the model assumes these values are sampled from a Gaussian distribution.
• Multinomial: The Multinomial Naïve Bayes classifier is used when the data is multinomially distributed. It is primarily used for document classification problems, i.e., deciding which category a particular document belongs to, such as sports, politics, education, etc. The classifier uses the frequency of words as the predictors.
• Bernoulli: The Bernoulli classifier works similarly to the Multinomial classifier, but the predictor variables are independent Boolean variables, such as whether or not a particular word is present in a document. This model is also well known for document classification tasks.
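For the Gaussian variant, the per-class likelihood of a continuous feature comes from the normal density instead of a frequency count. A sketch with hypothetical class means and standard deviations (assumed numbers, not from the slides):

```python
import math

def gaussian_pdf(x, mean, std):
    # Normal density: the Gaussian model's estimate of P(feature = x | class)
    coeff = 1.0 / (std * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mean) ** 2) / (2.0 * std ** 2))

# Hypothetical per-class statistics for a continuous feature such as "age"
mean_yes, std_yes = 35.0, 8.0
mean_no, std_no = 50.0, 10.0

age = 30.0
likelihood_yes = gaussian_pdf(age, mean_yes, std_yes)
likelihood_no = gaussian_pdf(age, mean_no, std_no)
print(likelihood_yes > likelihood_no)  # True: age 30 is more typical of the YES class
```

These densities then take the place of the P(value|class) factors in the same product-and-compare computation used above.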
Advantages of Naive Bayes Classifier