1 - Introduction - Rec
Natural Language Processing (NLP)
Introduction
Zakariae EN-NAIMANI
2023/2024
CBOW: pseudocode
Function Entraîner_CBOW(corpus, vector_size, window, learning_rate, nombre_iterations)
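The body of the pseudocode is not given here, so the following is only a
minimal sketch of how such a training function might look, assuming the
standard CBOW formulation: average the context word vectors, predict the
center word with a full softmax, and update by gradient descent. The name
train_cbow and the parameter n_iterations are English stand-ins for
Entraîner_CBOW and nombre_iterations; the initialization range and the
plain-Python implementation are assumptions made for readability.

```python
import math
import random

def train_cbow(corpus, vector_size=8, window=2, learning_rate=0.05,
               n_iterations=50, seed=0):
    """Train CBOW with a full softmax on tokenized sentences (illustrative sketch)."""
    vocab = sorted({w for sentence in corpus for w in sentence})
    index = {w: i for i, w in enumerate(vocab)}
    size = len(vocab)
    rng = random.Random(seed)
    # One input (context) vector and one output (center) vector per word.
    w_in = [[rng.uniform(-0.1, 0.1) for _ in range(vector_size)] for _ in range(size)]
    w_out = [[rng.uniform(-0.1, 0.1) for _ in range(vector_size)] for _ in range(size)]

    for _ in range(n_iterations):
        for sentence in corpus:
            ids = [index[w] for w in sentence]
            for pos, target in enumerate(ids):
                context = ids[max(0, pos - window):pos] + ids[pos + 1:pos + 1 + window]
                if not context:
                    continue
                # Hidden layer: average of the context word vectors.
                h = [sum(w_in[c][d] for c in context) / len(context)
                     for d in range(vector_size)]
                # Softmax over the vocabulary (shifted by the max for stability).
                scores = [sum(h[d] * w_out[j][d] for d in range(vector_size))
                          for j in range(size)]
                top = max(scores)
                exps = [math.exp(s - top) for s in scores]
                total = sum(exps)
                # Gradient of the cross-entropy loss w.r.t. the scores.
                errors = [e / total for e in exps]
                errors[target] -= 1.0
                # Gradient w.r.t. h, computed before w_out is modified.
                grad_h = [sum(errors[j] * w_out[j][d] for j in range(size))
                          for d in range(vector_size)]
                for j in range(size):
                    for d in range(vector_size):
                        w_out[j][d] -= learning_rate * errors[j] * h[d]
                for c in context:
                    for d in range(vector_size):
                        w_in[c][d] -= learning_rate * grad_h[d] / len(context)
    return vocab, w_in

# Hypothetical toy corpus, for illustration only.
toy_corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
vocab, vectors = train_cbow(toy_corpus, vector_size=4, window=1, n_iterations=20)
```

The full softmax keeps the sketch short but costs time proportional to the
vocabulary size at every step; practical implementations usually replace it
with negative sampling or a hierarchical softmax.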
The intuition and the example above show how this probabilistic model can
be used to make a prediction. Let us now discuss the applications where it
can be used.
As the applications above show, the HMM is suited to sequential data:
time series, audio, video, and text (NLP) data. This article focuses on
the NLP applications where an HMM can improve model performance. One of
the applications in the list above is part-of-speech (POS) tagging, and
the rest of the article shows how an HMM can be used for it.
Direct evaluation
As taught in school, the part of speech of a word indicates the function
it has in a sentence. Nine parts of speech are commonly distinguished:
noun, pronoun, verb, adverb, article, adjective, preposition, conjunction,
and interjection. A word has to be assigned the proper part of speech for
the sentence to make sense.
POS tagging is a very useful text preprocessing step in NLP. NLP aims to
make a machine able to communicate with a human or with another machine,
so the machine needs to understand the part of speech of each word.
Classifying words by their part of speech and labeling them accordingly
is called part-of-speech tagging, POS tagging, or POST. The set of labels
(tags) is called a tagset. In the article, we have seen how to implement
POS tagging at a beginner level with NLTK, where the tagsets package of
NLTK helped us assign part-of-speech tags to our documents.
The HMM is a stochastic technique for POS tagging. Let's take an example
to make it clearer how an HMM helps select the correct POS tags for a
sentence.
As seen in the example of the HMM process in POS tagging, the transition
probability is the likelihood of one tag following another: for example,
how likely a modal is to come after a noun, a verb after a modal, and a
noun after a verb.
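Transition probabilities are usually estimated by counting tag pairs in a
tagged corpus. The following is a minimal sketch under that assumption;
the three tagged sentences are a hypothetical toy corpus invented for
illustration, and the markers "<s>" and "</s>" for sentence start and end
are a convention chosen here, not something the text prescribes.

```python
from collections import Counter, defaultdict

# Hypothetical hand-tagged toy corpus (N = noun, M = modal, V = verb).
tagged_sentences = [
    [("Rahul", "N"), ("will", "M"), ("eat", "V"), ("food", "N")],
    [("Mary", "N"), ("can", "M"), ("see", "V"), ("Rahul", "N")],
    [("Rahul", "N"), ("eats", "V"), ("food", "N")],
]

def transition_probabilities(sentences):
    """Estimate P(tag | previous tag) by relative frequency of tag pairs."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # Pad with start/end markers so sentence boundaries are counted too.
        tags = ["<s>"] + [tag for _, tag in sentence] + ["</s>"]
        for previous, current in zip(tags, tags[1:]):
            counts[previous][current] += 1
    return {
        previous: {tag: n / sum(following.values()) for tag, n in following.items()}
        for previous, following in counts.items()
    }

transitions = transition_probabilities(tagged_sentences)
print(transitions["M"])  # in this toy corpus a modal is always followed by a verb
```

In this toy corpus a noun is followed by a modal in 2 of its 6 occurrences,
so transitions["N"]["M"] comes out as 1/3.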
Let's take the sentence “Rahul will eat food”, where Rahul is a noun, will
is a modal, eat is a verb, and food is also a noun. The probability of a
word given its part-of-speech tag, for example the probability that a
modal is the word will, is called the emission probability.
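To show how transition and emission probabilities combine to tag the
sentence, here is a minimal sketch that estimates emission probabilities
by counting and then runs the Viterbi algorithm, the usual way to find the
most probable tag sequence in an HMM. The toy corpus and the transition
table are hypothetical values made up for illustration, and lowercasing
the words is a simplification assumed here.

```python
from collections import Counter, defaultdict

# Hypothetical hand-tagged toy corpus (N = noun, M = modal, V = verb).
tagged_sentences = [
    [("Rahul", "N"), ("will", "M"), ("eat", "V"), ("food", "N")],
    [("Mary", "N"), ("can", "M"), ("see", "V"), ("Rahul", "N")],
    [("Rahul", "N"), ("eats", "V"), ("food", "N")],
]

# Illustrative transition probabilities P(tag | previous tag), made up for
# this example; "<s>" marks the start of a sentence.
transitions = {
    "<s>": {"N": 1.0},
    "N": {"M": 0.5, "V": 0.25, "N": 0.25},
    "M": {"V": 1.0},
    "V": {"N": 1.0},
}

def emission_probabilities(sentences):
    """Estimate P(word | tag) by relative frequency."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        for word, tag in sentence:
            counts[tag][word.lower()] += 1
    return {
        tag: {word: n / sum(words.values()) for word, n in words.items()}
        for tag, words in counts.items()
    }

def viterbi(words, tags, transitions, emissions):
    """Return the most probable tag sequence for `words` and its probability."""
    # best[i][tag] = (probability of the best path ending in tag at i, previous tag)
    best = [{
        tag: (transitions["<s>"].get(tag, 0.0) * emissions[tag].get(words[0], 0.0), None)
        for tag in tags
    }]
    for i in range(1, len(words)):
        column = {}
        for tag in tags:
            emit = emissions[tag].get(words[i], 0.0)
            column[tag] = max(
                (best[i - 1][prev][0] * transitions.get(prev, {}).get(tag, 0.0) * emit, prev)
                for prev in tags
            )
        best.append(column)
    # Follow the back-pointers from the best final tag.
    last = max(tags, key=lambda tag: best[-1][tag][0])
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        path.append(best[i][path[-1]][1])
    return path[::-1], best[-1][last][0]

emissions = emission_probabilities(tagged_sentences)
sentence = "rahul will eat food".split()
best_tags, best_prob = viterbi(sentence, ["N", "M", "V"], transitions, emissions)
print(best_tags)  # ['N', 'M', 'V', 'N']
```

With these toy numbers the winning path multiplies one transition and one
emission probability per word, giving 1/72 for the sequence N, M, V, N. A
practical tagger would add smoothing for unseen words and work with log
probabilities to avoid underflow on long sentences.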