Lemmatization
Atinkut Muche
Bizuayehu Tadege
Tiruedle Asteraye
Outline
Introduction
Statement of the problem
Objective
General Objective
Specific Objective
Significance of the Project
Methodology
Scope of the Project
Literature Review
Architecture
Experimental Result
Conclusion
Introduction
General Objective
The main objective of the project is to design a lemmatizer for the
Amharic language.
Specific Objective
To gain a clear understanding of the area by conducting a
relevant literature review, and to identify the lemmatization
algorithms that have been developed for other languages.
Specific Objectives ...
Literature review
Document analysis is used to understand the characteristics of the
language. Because studying the language's morphology is an
important component of the project, a literature survey was
conducted to gather information about the language and to understand it.
Data Source
A text corpus is one of the resources required for stemming and
lemmatization in NLP work. The corpus was used to compile
stop words, prefixes, and suffixes. It was also
used to test the algorithm.
For this project, texts were gathered from Amharic
fiction books, the Amhara Mass Media Agency (AMMA), and other
sources.
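The slides do not say how the stop-word list was compiled from the corpus. One common approach, shown here only as a minimal sketch, is to count token frequencies and review the most frequent tokens by hand. The function name `candidate_stop_words`, the whitespace tokenization, and the placeholder tokens are assumptions made for illustration and do not come from the project.

```python
from collections import Counter


def candidate_stop_words(text, top_n=2):
    """Return the top_n most frequent whitespace-separated tokens.

    Hypothetical helper: the output is a list of candidates for manual
    review, not a finished stop-word list.
    """
    tokens = text.split()  # assumption: tokens are separated by whitespace
    counts = Counter(tokens)
    return [token for token, _ in counts.most_common(top_n)]


# Placeholder tokens stand in for corpus words.
sample = "a b a c a b"
candidates = candidate_stop_words(sample)
print(candidates)
```

Frequency alone tends to over-select, so the candidates would still need to be checked against the language before being treated as stop words.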
Methodology …
Notepad
Scope of the Project
The lemmatization process returns the lemma of a word, that is, a base form that
is meaningful in context.
In a rule-based lemmatizer, adding more rules for a given sentence or word allows the
lemmatizer to identify the lemma of the word correctly.