Bag of Words Algorithm: Paragraph
Paragraph –
We can use NLP to create chatbots and we will be making health chatbots now!
Sentence segmentation
Sent 2: We can use NLP to create chatbots and we will be making health chatbots now!
Tokenization
Sent 2: We can use NLP to create chatbots and we will be making health chatbots now !
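The two steps above can be sketched in a few lines of Python. This is only a minimal illustration using regular expressions, applied to the one sentence shown above. The function names segment_sentences and tokenize are made up for this example, and real text would need more careful rules (abbreviations, decimals and so on).

```python
import re

def segment_sentences(paragraph):
    # Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [p for p in parts if p]

def tokenize(sentence):
    # Each run of letters/digits is one token; each punctuation mark is its own token.
    return re.findall(r"\w+|[^\w\s]", sentence)

paragraph = "We can use NLP to create chatbots and we will be making health chatbots now!"
sentences = segment_sentences(paragraph)
tokens = tokenize(sentences[0])
print(tokens)
```

Note that the final "!" comes out as a separate token, which matches the tokenized sentence above where "now" and "!" are split apart.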
Stemming
Lemmatization
        we  health  chatbot  treat  stress  nlp  create  make  replace  counselor  human
Sent 1  1   1       1        1      1       0    0       0     0        0          0
Sent 2  1   1       1        0      0       1    1       1     0        0          0
Sent 3  0   1       1        0      0       0    0       0     1        1          1
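The document vector table can be rebuilt with a short Python sketch. The word lists below are read off the table itself (the order of words inside each list is an assumption), and the helper names build_vocabulary and document_vector are made up for this example.

```python
def build_vocabulary(docs):
    # Collect each distinct word once, in order of first appearance.
    vocab = []
    for tokens in docs:
        for tok in tokens:
            if tok not in vocab:
                vocab.append(tok)
    return vocab

def document_vector(tokens, vocab):
    # One entry per vocabulary word: how many times it occurs in this sentence.
    return [tokens.count(word) for word in vocab]

# Normalized words per sentence, taken from the table (order assumed).
docs = [
    ["we", "health", "chatbot", "treat", "stress"],
    ["we", "health", "chatbot", "nlp", "create", "make"],
    ["health", "chatbot", "replace", "counselor", "human"],
]

vocab = build_vocabulary(docs)
vectors = [document_vector(d, vocab) for d in docs]
for i, vec in enumerate(vectors, start=1):
    print(f"Sent {i}:", vec)
```

Each word is listed once per sentence here, so the counts are all 0 or 1, the same as in the table.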
Step 4: TFIDF
Term Frequency
we  health  chatbot  treat  stress  nlp  create  make  replace  counselor  human
2   3       3        1      1       1    1       1     1        1          1
Inverse document frequency (total number of sentences / number of sentences containing the word)
we   health  chatbot  treat  stress  nlp  create  make  replace  counselor  human
3/2  3/3     3/3      3/1    3/1     3/1  3/1     3/1   3/1      3/1        3/1
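The counts and ratios above can be checked with a small Python sketch. It assumes the document vectors from the earlier table as input; the variable names df and idf_ratio are chosen only for this example. Fractions are used so the ratios print in the same 3/2 form as the table.

```python
from fractions import Fraction

vocab = ["we", "health", "chatbot", "treat", "stress", "nlp",
         "create", "make", "replace", "counselor", "human"]
vectors = [
    [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0],  # Sent 1
    [1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0],  # Sent 2
    [0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1],  # Sent 3
]

n_docs = len(vectors)

# Number of sentences that contain each word.
df = [sum(1 for vec in vectors if vec[j] > 0) for j in range(len(vocab))]

# Total sentences divided by the number of sentences containing the word.
idf_ratio = [Fraction(n_docs, d) for d in df]

for word, d, ratio in zip(vocab, df, idf_ratio):
    print(word, d, f"{n_docs}/{d}", "=", ratio)
```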
TFIDF(W) = TF(W)*log[IDF(W)]
Word       Sent 1      Sent 2      Sent 3
we         1*log(3/2)  1*log(3/2)  0*log(3/2)
health     1*log(3/3)  1*log(3/3)  1*log(3/3)
chatbot    1*log(3/3)  1*log(3/3)  1*log(3/3)
treat      1*log(3/1)  0*log(3/1)  0*log(3/1)
stress     1*log(3/1)  0*log(3/1)  0*log(3/1)
nlp        0*log(3/1)  1*log(3/1)  0*log(3/1)
create     0*log(3/1)  1*log(3/1)  0*log(3/1)
make       0*log(3/1)  1*log(3/1)  0*log(3/1)
replace    0*log(3/1)  0*log(3/1)  1*log(3/1)
counselor  0*log(3/1)  0*log(3/1)  1*log(3/1)
human      0*log(3/1)  0*log(3/1)  1*log(3/1)
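The formula TFIDF(W) = TF(W)*log[IDF(W)] can be worked out numerically with the sketch below. It assumes log base 10, since the text does not say which base is used, and it takes the document vectors from the earlier table as input. The function names tfidf_matrix and top_words are made up for this example.

```python
import math

vocab = ["we", "health", "chatbot", "treat", "stress", "nlp",
         "create", "make", "replace", "counselor", "human"]
vectors = [
    [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0],  # Sent 1
    [1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0],  # Sent 2
    [0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1],  # Sent 3
]

def tfidf_matrix(vectors, base=10):
    # base=10 is an assumption; another base changes the scale, not the ranking.
    n_docs = len(vectors)
    n_terms = len(vectors[0])
    # Every vocabulary word occurs in at least one sentence, so df is never 0 here.
    df = [sum(1 for vec in vectors if vec[j] > 0) for j in range(n_terms)]
    return [[tf * math.log(n_docs / df[j], base) for j, tf in enumerate(vec)]
            for vec in vectors]

def top_words(row, vocab):
    # Words whose TFIDF value equals the largest value in this sentence.
    best = max(row)
    return [w for w, x in zip(vocab, row) if math.isclose(x, best)]

weights = tfidf_matrix(vectors)
for i, row in enumerate(weights, start=1):
    print(f"Sent {i}:", [round(x, 3) for x in row], "->", top_words(row, vocab))
```

Words found in all three sentences (health, chatbot) get log(3/3) = 0, so they carry no weight, while words found in only one sentence get the largest value, log(3/1).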
IDF Values
we  health  chatbot  treat  stress  nlp  create  make  replace  counselor  human
CONCLUSION
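In sentence 1, the highest TFIDF values go to stress and treat, because these words appear only in that sentence.
In sentence 2, the highest TFIDF values go to NLP, create and make, because these words appear only in that sentence.
In sentence 3, the highest TFIDF values go to replace, counselor and human, because these words appear only in that sentence.
Health and chatbot appear in all three sentences, so their value is log(3/3) = 0 everywhere.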
-O.Varshitha
10C