
S. No.  Question

1 What are simple ensemble learning techniques?

2 What are advanced ensemble techniques?

3 What is Bayes' theorem?

4 Pros and cons of the Naive Bayes model

5 What is imputing, and what are its different methods?


Answer
Averaging
Max voting
Weighted average
Bagging
Boosting
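The three simple techniques listed above can be sketched in a few lines of plain Python. This is only an illustration: the function names and the sample model outputs are hypothetical, and bagging and boosting are not shown because they involve training several models.

```python
from collections import Counter

def averaging(predictions):
    """Mean of the models' numeric predictions."""
    return sum(predictions) / len(predictions)

def max_voting(predictions):
    """Most common class label among the models' predictions."""
    return Counter(predictions).most_common(1)[0][0]

def weighted_average(predictions, weights):
    """Mean of numeric predictions, weighted by each model's importance."""
    return sum(p * w for p, w in zip(predictions, weights)) / sum(weights)

# Hypothetical outputs of three models for a single sample.
scores = [4.0, 5.0, 3.0]
labels = ["spam", "ham", "spam"]

avg = averaging(scores)                      # every model counts equally
vote = max_voting(labels)                    # majority label wins
wavg = weighted_average(scores, [1, 2, 1])   # second model counts double
```

With these made-up numbers the plain average is 4.0, the majority vote is "spam", and the weighted average is 4.25 because the second model's score of 5.0 is given twice the weight.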

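Combining the two expressions for the joint probability, P(A and B) = P(A|B) * P(B) and P(A and B) = P(B|A) * P(A), gives what is known as Bayes' theorem:
P(A|B) = P(B|A) * P(A) / P(B)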


Bayes' theorem provides a way to calculate the posterior probability P(A|B) from P(A),
P(B), and P(B|A).
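As a small worked example of the posterior calculation described above, the sketch below plugs numbers into Bayes' theorem. The events and all probabilities are made up for illustration, and the helper name `posterior` is hypothetical.

```python
def posterior(prior_a, likelihood_b_given_a, evidence_b):
    """Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood_b_given_a * prior_a / evidence_b

# Hypothetical numbers: A = "email is spam", B = "email contains the word 'offer'".
p_spam = 0.2                # P(A)
p_offer_given_spam = 0.5    # P(B|A)
p_offer_given_ham = 0.05    # P(B|not A)

# P(B) from the law of total probability.
p_offer = p_offer_given_spam * p_spam + p_offer_given_ham * (1 - p_spam)

# Posterior P(A|B): probability the email is spam given that it contains "offer".
p_spam_given_offer = posterior(p_spam, p_offer_given_spam, p_offer)
```

Here P(B) works out to 0.14, so the posterior is 0.1 / 0.14, roughly 0.71: seeing the word raises the spam probability from the 0.2 prior.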

Pros
The Naive Bayes model is simple to implement and fast in processing.
Requires few examples in the training set to work with.
Performs well with noisy data and missing values.

Cons
Performs poorly if the dataset contains many continuous input features.
Predictions are based on the assumption of independent features, which is almost
impossible in real-life scenarios.
Sometimes the estimated probabilities are less reliable.
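To make the independence assumption concrete, here is a minimal sketch of a categorical Naive Bayes classifier in plain Python. It is an illustration rather than a reference implementation: the toy data, the function names, and the choice of add-one (Laplace) smoothing with two possible values per feature are all assumptions.

```python
from collections import Counter, defaultdict
import math

def train(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    # feature_counts[label][feature_index][value] -> count
    feature_counts = defaultdict(lambda: defaultdict(Counter))
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[label][i][value] += 1
    return class_counts, feature_counts

def predict(row, class_counts, feature_counts, n_values=2):
    """Return the class with the highest log posterior (up to a constant)."""
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        for i, value in enumerate(row):
            # Independence assumption: per-feature likelihoods are multiplied
            # (their logs are added). Add-one smoothing avoids zero probabilities;
            # n_values is the assumed number of distinct values per feature.
            seen = feature_counts[label][i][value]
            score += math.log((seen + 1) / (count + n_values))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy data: (outlook, windy) -> play?
rows = [("sunny", "no"), ("sunny", "yes"), ("rainy", "yes"),
        ("rainy", "no"), ("sunny", "no")]
labels = ["yes", "yes", "no", "no", "yes"]

model = train(rows, labels)
prediction = predict(("sunny", "yes"), *model)
```

The scores it compares are products of small, separately estimated probabilities, which is also why the resulting probability estimates can be unreliable even when the predicted class is right.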
Which technique is used for representing text in structured form?
1. Bag of words
2. Semantic

Cleaning text
Remove stop words
Stemming
Lowercase conversion
Remove punctuation
Strip extra white space
Remove numbers
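The cleaning steps listed above can be chained into one small function. The sketch below uses only the standard library; the stop-word list is a tiny illustrative sample, and `crude_stem` is a hypothetical stand-in for a real stemmer such as a Porter stemmer.

```python
import re
import string

# Tiny illustrative stop-word list; real lists are much longer.
STOP_WORDS = {"the", "a", "an", "is", "are", "and", "of", "to", "in"}

def crude_stem(word):
    """Very rough suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def clean_text(text):
    """Apply the cleaning steps in order and return a list of tokens."""
    text = text.lower()                                               # lowercase conversion
    text = text.translate(str.maketrans("", "", string.punctuation))  # remove punctuation
    text = re.sub(r"\d+", "", text)                                   # remove numbers
    text = re.sub(r"\s+", " ", text).strip()                          # strip extra white space
    tokens = [t for t in text.split() if t not in STOP_WORDS]         # remove stop words
    return [crude_stem(t) for t in tokens]                            # stemming

cleaned = clean_text("The 3 dogs   are BARKING, and the cat jumped!")
# cleaned == ["dog", "bark", "cat", "jump"]
```

The order matters a little: lowercasing first lets the stop-word lookup use a lowercase list, and punctuation is removed before splitting so that "barking," and "barking" become the same token.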

Document Term Matrix
Corpus -> Term + Doc -> Vector
Consider each document as a vector and analyze which vector is close to which vector
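A minimal sketch of the idea above: build a document-term matrix from raw counts and compare document vectors. Cosine similarity is used here as the closeness measure, which is an assumption since the notes do not name one; the three-document corpus and the function names are hypothetical.

```python
import math
from collections import Counter

def document_term_matrix(docs):
    """Return (vocabulary, matrix) where matrix[i][j] counts term j in doc i."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        matrix.append([counts[term] for term in vocab])
    return vocab, matrix

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical three-document corpus.
corpus = ["the cat sat", "the cat ran", "dogs bark loudly"]
vocab, dtm = document_term_matrix(corpus)

sim_01 = cosine_similarity(dtm[0], dtm[1])  # documents share "the" and "cat"
sim_02 = cosine_similarity(dtm[0], dtm[2])  # documents share no terms
```

The first two documents share two of their three terms, so their similarity is 2/3, while the first and third share nothing and score 0.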

Sentiment Analysis
1. Polarity detection
2. Aspect
3. Intent
4. Emotion

Lexicon based

ML Model

Challenges
Facts vs opinions
Irony and sarcasm
Comparison vs emojis

What is NLTK? Natural Language Toolkit.

NLTK Corpus
The NLTK corpus is a massive dump of all kinds of natural language data
sets that are definitely worth taking a look at. Almost all of the files in the
NLTK corpus follow the same rules for accessing them by using the NLTK
module, but nothing is magical about them.
Corpus
Corpora is a group presenting multiple collections of text
documents. A single collection is called a corpus.
Word embedding methods
SVD - Singular value decomposition
Word2Vec - Neural network
GloVe
Bayesian spam filter: More popular and frequently n-gram model, bi-gram method
Named entity identification
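For the n-gram model and bi-gram method mentioned above, a bi-gram is simply a pair of adjacent tokens. The helper below is a hypothetical sketch of extracting n-grams from a token list; how the notes intend them to feed a spam filter is not stated, so that part is left out.

```python
def ngrams(tokens, n):
    """All contiguous runs of n tokens, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Hypothetical message text.
tokens = "win a free prize now".split()
bigrams = ngrams(tokens, 2)
# bigrams == [("win", "a"), ("a", "free"), ("free", "prize"), ("prize", "now")]
```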

Optional

Polarity: positive, negative
Intent: complaint, suggestion, query
Emotion: happy, sad

Lexicon based: Specific to language. Map each word to a polarity and count the number of positive and negative words. If the number of positive words is greater, conclude a positive sentiment.
ML model: A little less specific to language. Doc --> Vector --> Labelling
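The lexicon-based approach described above can be sketched as a word lookup followed by a count comparison. The lexicon below is a tiny hypothetical sample, and returning "neutral" on a tie is an assumption, since the notes only cover the case where positives outnumber negatives.

```python
# Tiny hypothetical lexicon; real lexicons are language-specific and much larger.
POLARITY = {"good": 1, "great": 1, "love": 1, "bad": -1, "poor": -1, "hate": -1}

def lexicon_sentiment(text):
    """Count positive and negative words and compare the two counts."""
    words = text.lower().split()
    positives = sum(1 for w in words if POLARITY.get(w, 0) > 0)
    negatives = sum(1 for w in words if POLARITY.get(w, 0) < 0)
    if positives > negatives:
        return "positive"
    if negatives > positives:
        return "negative"
    return "neutral"  # assumption: ties and unknown words count as neutral

result = lexicon_sentiment("great phone but poor battery and i love the screen")
# result == "positive" (two positive words, one negative word)
```

Because the lookup table is tied to one vocabulary, this approach has to be rebuilt for every language, which is the trade-off against the vector-based ML route.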
Word embedding
FastText
ELMo
Can interpret a word in different ways, like "kill":
you kill the competition
you kill the product
