UG Council Research Project Proposal
Speech Enabled Bilingual Machine Translation System for English-Assamese using NLP and Deep Neural Networks
GROUP DETAILS
Broad View
Natural Language Processing.
Natural language processing (NLP) technology can turn text or audio speech into encoded, structured
information based on an appropriate ontology. The structured data may be used to identify discovery,
conduction, medications, sensitivities and participants.
The project will attempt to translate English text and speech into Assamese and vice versa, and to
vocalize the translated text, whether the user types the input or speaks it through a microphone.
Various efforts have already been made to develop machine translation from English to
North-eastern languages, including Assamese; however, the field is still developing. Existing
methods fall short because they are neither user-friendly enough nor widely available to the
general public.
NLP is an area of computing that deals with the interactions between computers and natural languages.
It is used to design computer programs that analyze, understand and generate speech or text in
natural languages[1]. At present, it is a very popular research area in computer science that
explores how computers can understand and manipulate natural-language text or speech to do
useful things[2]. A large number of natural language processing applications have already
been developed across the globe as well as in India.
In this project, we will focus on machine translation, together with a feature similar to the
text-to-speech function in Google Translate, using corpora from various sources. We will develop
methods that make the system more computationally efficient and modular, so that it has a greater
outreach and is more convenient for users (people who speak only this language, people who do not
speak it at all, and everyone between these two extremes). This includes evaluating and extending
existing methods and developing new ones, such as adding voice output.
When we joined this institute, we observed that a few students native to this region were not well
acquainted with languages such as English or Hindi, which are commonly spoken in other parts of the
country, and therefore faced serious communication problems. As a group of machine learning
enthusiasts, we have always wanted to apply our knowledge and understanding of the subject to
real-life, day-to-day problems.
Let us now look at a much broader aspect, in line with this fast-growing world.
According to a survey by the research firm Common Sense Advisory [15] of more than 3,000 global
consumers in 10 non-Anglophone countries in Europe, Asia, and South America, nearly three-quarters of
the consumers spend the majority of their time on sites in their own language, and 72.4% say they
would be more willing to buy a product with content in their own language.
This illustrates the general online shopping trend and the need for search results customized to
the customer's needs.
With the help of machine translation, consumers can get more personalized search results, which
increases their satisfaction. For retailers, machine translation is a way to reach and engage
consumers who only buy products available in their native language.
This one development could change the course of the current system, since language barriers have
been a persistent obstacle to e-commerce from the start.
We therefore searched for an online machine translator from English to Assamese, as it could be the
best possible solution to the above-mentioned problems, but the results were not satisfactory
because such systems were still at an early stage. At that moment, we all felt an urge to build a
machine translation system with a text-to-speech feature.
When the call for project proposals under the UG Council was announced, we were delighted to have
an opportunity to use our skills to the best of our abilities. We thus settled on the idea of using
Natural Language Processing (NLP), an interdisciplinary research area related to machine learning,
together with machine translation, the process of translating text or speech from one language into
another.
Review of the status of Research and Development in the subject
Machine Translation (MT) is the automatic translation of large amounts of text from one natural
language into another using computers. It is one of the most critical applications and most
challenging research tasks in NLP, and it was the first application of computers to NLP.
In India, the field is comparatively new; progress has been made from 1980 onwards at institutions
such as IIIT Hyderabad, IIT Bombay, IIT Kanpur, NCST Mumbai and the University of Hyderabad. The
Technology Development for Indian Languages (TDIL), the Centre for Development of Advanced Computing
(CDAC) and the Ministry of Communications and Information Technology are also playing a pivotal role
in developing MT systems[4].
(1) Google Translate: Google Translate is a multilingual statistical machine translation service by
Google. Although Google Translate translates directly between most language pairs, it uses a
different approach for pairs where direct language-to-language translation is not possible. In those
cases, instead of translating from one language straight into another (Language 1 → Language 2), it
first translates the source language into English and then into the target language (Language 1 →
ENGLISH → Language 2). Presently, it supports 64 languages, which do not include Assamese or many
other North-eastern languages, either as text or as speech.
(2) English to Assamese Machine Translation System:- An English to Assamese machine translation
system is currently in progress (Sudhir et al, 2007). The following actions have been undertaken to
make this possible:
• Re-designing the graphical user interface (GUI) of the MT system (the modifications were made using
Java modules) so that it can display Assamese text.
• Developing the process for entering Assamese words (the equivalents of English words) into the
lexical database.
The system uses a rule-based approach that relies on a bilingual English to Assamese dictionary.
Generating Assamese text from English text using the dictionary is considered a major step in this
MT system. Each dictionary entry provides information about an English lexeme and all of its
Assamese equivalents. The dictionary is annotated with morphological, syntactic, and partial
semantic information, and has nearly 5000 root words. At present, the system translates simple
sentences from English to Assamese by means of bilingual dictionary lookup.
These are the recent machine translation projects that have been carried out in our country.
However, most of this work is on text-based translation systems; adding speech support could help
offset the slow supply of technical translators relative to demand, a classic case of the “law of
supply and demand”.
Importance of the Proposed project in the context of the current status
There is no universal language in this world, and wherever diversity is cherished there is a need for
seamless translation. Language barriers should not be an impediment to communication, and in this
ever-growing world removing them is only the most immediate benefit of machine translation. In a
more comprehensive view, many further benefits can be derived according to human needs.
Let us now take the case of Assam, the North-Eastern state of India on which we are predominantly
focusing. The literacy rate is about 72.19% according to the 2011 census. This indicates that more
than a quarter of the natives of Assam cannot read and write properly. For female literacy, the
figure drops to 66.27%; that is, nearly a third of the female population is illiterate.
This largely hampers the development of illiterate people, who mostly come from economically weak
backgrounds. Many government policies and schemes are launched specifically for such people, yet for
lack of basic skills they are unable to benefit and are disadvantaged even further. With the
assistance of machine translation, anything written or spoken in English can be translated into a
language in which they can grasp the knowledge and information provided. With the proposed system,
the translation can even be vocalized, which makes it more accessible still.
Speech recognition is a flourishing engineering technology. There are hundreds of people in this
state who live with various disabilities, and many of them are unable to see or to use their hands
effectively. With the help of our proposed model, they can share information with other people
through voice input, and do so in a translated form that the other person will understand. Our
project will be able to recognize speech (convert the input audio into text), translate it into
English or vice versa and, if needed, speak out the translated text.
Our proposed system can contribute significantly towards the growth of trade at the local and
national levels, and sometimes even globally. This may sound like an overstatement, but it probably
rings true to the majority of global business leaders, who face an almost unimaginable range of
problems when a buyer and a seller are not able to have a direct conversation[11]. The indication is
quite clear: as machine learning breaks down the language barrier between widely spoken languages,
world trade flows should rise.
Work Plan
Our project will be broken into several parts before the final model is assembled and delivered. The
first part covers Machine Translation (MT); after that, we will move on to speech-to-text conversion
and vice versa.
● Tokenizing: The data has to be tokenized so that spaces are inserted
between words and punctuation[13].
● Truecasing: The data has to be truecased so that the first word of every sentence is
converted to its most probable casing[13].
● Cleaning: The last step is to clean the data, i.e. to remove empty lines, redundant spaces,
stray punctuation and lines that are either too short or too long[13].
After performing these steps, we will be able to train our translation model.
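The three preparation steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Moses scripts referred to in [13]: the function names, the punctuation set and the length limits are all hypothetical, and the truecaser simply learns the most frequent casing of each word from non-initial positions.

```python
import re
from collections import Counter

def tokenize(line):
    """Insert spaces between words and punctuation (simplified rule set)."""
    spaced = re.sub(r"([.,!?;:()\"])", r" \1 ", line)
    return re.sub(r"\s+", " ", spaced).strip()

def train_truecaser(sentences):
    """Learn the most frequent casing of each word, ignoring sentence-initial words."""
    counts = {}
    for sentence in sentences:
        for token in sentence.split()[1:]:
            counts.setdefault(token.lower(), Counter())[token] += 1
    return {low: forms.most_common(1)[0][0] for low, forms in counts.items()}

def truecase(sentence, model):
    """Rewrite only the first word to its most probable casing; unknown words are kept."""
    tokens = sentence.split()
    if tokens:
        tokens[0] = model.get(tokens[0].lower(), tokens[0])
    return " ".join(tokens)

def clean(pairs, min_len=1, max_len=80):
    """Drop empty, too short or too long sentence pairs and squeeze extra spaces."""
    kept = []
    for src, tgt in pairs:
        src, tgt = " ".join(src.split()), " ".join(tgt.split())
        if (min_len <= len(src.split()) <= max_len
                and min_len <= len(tgt.split()) <= max_len):
            kept.append((src, tgt))
    return kept

# Hypothetical usage on a tiny English sample.
corpus = [tokenize(s) for s in ["We like the river.", "Is the river long?"]]
model = train_truecaser(corpus)
print(truecase(tokenize("The river is long."), model))
```

In practice the same steps would be applied to both sides of the parallel corpus before training.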
After the machine translation process, we will perform transliteration to handle nouns.
4. Transliteration:-
As the available corpus will be very small, it will be difficult to cover all words. Words
that are not present in our corpus, such as proper nouns (names of people, places, etc.), are not
translated. A transliteration system is therefore introduced into our SMT system. In
transliteration, each character of one language is mapped automatically to a character of
another language such that the pronunciation of the original source word is preserved.
A Perl script is used to retrieve the words that are not translated or are not present in
our corpus. All the Assamese characters, along with their corresponding English transliterations,
are stored in a Perl file, and the Perl script is executed with Moses.
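The character-mapping idea can be illustrated with a short Python sketch. It is not the Perl script described above: the table below is a hypothetical, partial Assamese-to-Latin mapping chosen for illustration, the function names are assumptions, and inherent vowels and conjunct consonants are deliberately ignored.

```python
# Hypothetical, partial Assamese-to-Latin character table (illustrative only).
ASSAMESE_TO_LATIN = {
    "\u0995": "k",  # ক
    "\u09a8": "n",  # ন
    "\u09ae": "m",  # ম
    "\u09b2": "l",  # ল
    "\u09f0": "r",  # ৰ (Assamese ra)
    "\u09be": "a",  # া (vowel sign aa)
    "\u09bf": "i",  # ি (vowel sign i)
    "\u09c0": "i",  # ী (vowel sign ii)
}

def is_assamese(token):
    """True if any character lies in the Bengali-Assamese Unicode block."""
    return any("\u0980" <= ch <= "\u09ff" for ch in token)

def transliterate(word, table=ASSAMESE_TO_LATIN):
    """Map each character through the table; unknown characters pass through."""
    return "".join(table.get(ch, ch) for ch in word)

def transliterate_untranslated(tokens, table=ASSAMESE_TO_LATIN):
    """Transliterate only the tokens the translator left in Assamese script."""
    return [transliterate(t, table) if is_assamese(t) else t for t in tokens]

# A translated sentence in which one proper noun was left untranslated.
output = ["\u09f0\u09be\u09ae", "is", "here"]
print(transliterate_untranslated(output))  # ['ram', 'is', 'here']
```

A full table would cover every character of the script and handle inherent vowels, which this sketch omits.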
5. Result Evaluation:- We will evaluate the translated sentences by computing their BLEU (Bilingual
Evaluation Understudy) score using the BLEU toolkit. The BLEU score will give us a measure of the
quality of our translation system.
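To make the metric concrete, the sketch below computes a simplified sentence-level BLEU score in Python: clipped n-gram precisions combined by a geometric mean and multiplied by a brevity penalty. It assumes a single reference and no smoothing, and the function names are hypothetical; the evaluation toolkit itself works at corpus level and is not reproduced here.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Simplified BLEU for one candidate and one reference (lists of tokens)."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
        total = sum(cand.values())
        if total == 0 or overlap == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    c, r = len(candidate), len(reference)
    brevity = 1.0 if c > r else math.exp(1.0 - r / c)
    return brevity * math.exp(sum(log_precisions) / max_n)

reference = "the cat sat down".split()
print(bleu("the cat sat down".split(), reference))      # identical sentences
print(bleu("the cat sat".split(), reference, max_n=2))  # shorter candidate
```

An identical candidate scores 1.0, while the shorter candidate is reduced only by the brevity penalty, since all of its unigrams and bigrams appear in the reference.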
A deep feedforward neural network (DNN) is an ANN that has more than one hidden layer of
units between the input and output layers. Like any traditional Automatic Speech
Recognition (ASR) system, a DNN model needs two fundamental steps: feature extraction,
and training and testing of the model. The features extracted from the data are the input to the
model. The performance of the model can then be evaluated in light of the size of the data and the
training process. In our work, the MFCC feature will be used.
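As an illustration of the definition above, the following Python sketch runs the forward pass of a small feedforward network with two hidden layers. Everything in it is an assumption made for the example: the layer sizes, the ReLU and softmax activations, the random initialisation and the 13-value input vector standing in for one frame of features. Training is not shown.

```python
import math
import random

def dense(x, weights, biases):
    """Affine layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    return [max(0.0, v) for v in values]

def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def init_layer(n_in, n_out, rng):
    """Random weights and zero biases (illustrative initialisation)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(x, layers):
    """Hidden layers use ReLU; the output layer yields class probabilities."""
    for weights, biases in layers[:-1]:
        x = relu(dense(x, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(x, weights, biases))

rng = random.Random(0)
# 13 input features -> two hidden layers of 8 units -> 3 output classes.
layers = [init_layer(13, 8, rng), init_layer(8, 8, rng), init_layer(8, 3, rng)]
probs = forward([0.1] * 13, layers)
print(probs)
```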
A. Mel Frequency Cepstral Coefficients (MFCC):-
In Automatic Speech Recognition (ASR), MFCC features are widely used. There are
several features that we can extract from a speech signal, among them linear
predictive coefficients (LPC), linear predictive cepstral coefficients (LPCC) and human factor
cepstral coefficients (HFCC). Of all these features, MFCC is regarded as the standard
feature because of its higher recognition accuracy. Fig. 2 shows a block diagram of the
steps involved in extracting MFCC features from speech data. Pre-emphasis is the first step;
it boosts the high frequencies of the speech signal, which may be lost during
speech production. This step also helps to suppress additive noise in the speech
data. Instead of analysing the whole speech signal at once, we split it into user-defined
frames with a short time span. These frames may or may not overlap. After
the frames are defined, windowing is applied to avoid discontinuities in the waveform. To get
the signal in the frequency domain, the Fast Fourier Transform (FFT) is computed for each
frame. Because human hearing does not follow a linear frequency scale, the Mel scale is used to
extract features. The Mel scale is approximately proportional to the logarithm of the linear
frequency, and the resulting signals are called the Mel spectrum. In the final step, the log of the
Mel spectrum is converted back from the frequency domain into the time domain using the Discrete
Cosine Transform; the results are the MFCCs of the input speech data.
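The extraction steps described above can be traced in a compact, dependency-free Python sketch. It is an illustration under assumptions rather than the implementation planned for the project: the pre-emphasis coefficient 0.97, the Hamming window, the common Mel formula m = 2595 log10(1 + f/700), the frame and filter counts and all function names are choices made for the example, and a naive DFT stands in for the FFT.

```python
import math

def pre_emphasis(signal, alpha=0.97):
    """Boost high frequencies: y[t] = x[t] - alpha * x[t-1]."""
    if not signal:
        return []
    return [signal[0]] + [signal[i] - alpha * signal[i - 1]
                          for i in range(1, len(signal))]

def split_frames(signal, size, step):
    """Cut the signal into (possibly overlapping) short frames."""
    return [signal[s:s + size] for s in range(0, len(signal) - size + 1, step)]

def hamming(frame):
    """Taper the frame edges to avoid discontinuities."""
    n = len(frame)
    return [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
            for i, x in enumerate(frame)]

def power_spectrum(frame):
    """Naive DFT power spectrum (an FFT would be used in practice)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        spectrum.append((re * re + im * im) / n)
    return spectrum

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the Mel scale."""
    top = hz_to_mel(sample_rate / 2)
    mels = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sample_rate) for m in mels]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank

def dct(values, n_out):
    """DCT-II of the log Mel energies, keeping the first n_out coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
            for k in range(n_out)]

def mfcc(signal, sample_rate, frame_size=64, step=32, n_filters=10, n_coeffs=5):
    bank = mel_filterbank(n_filters, frame_size, sample_rate)
    features = []
    for frame in split_frames(pre_emphasis(signal), frame_size, step):
        spectrum = power_spectrum(hamming(frame))
        # A small constant avoids log(0) for empty filters.
        energies = [math.log(sum(w * p for w, p in zip(row, spectrum)) + 1e-10)
                    for row in bank]
        features.append(dct(energies, n_coeffs))
    return features

# Synthetic 440 Hz tone sampled at 8 kHz in place of real speech.
signal = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(256)]
features = mfcc(signal, sample_rate=8000)
print(len(features), len(features[0]))
```

Each row of `features` holds the coefficients of one frame; these rows would be the input vectors to the recognition model.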
B. Long Short-Term Memory (LSTM) Network:-
In a feed-forward neural network, inputs are fed into the network, and the network produces an
output. In supervised learning, the output would be a class id or label. A Recurrent Neural
Network (RNN), by contrast, uses not only the current input but also the previous inputs
to produce the output. RNNs are useful for learning sequential data. An RNN has a weight matrix
that connects the current hidden state to the previous hidden state. The sequential information is
preserved in the hidden state. To classify sequential input, RNNs rely on backpropagation
of error and gradient descent. The major problem faced by RNNs is the vanishing gradient
problem: the gradient shrinks exponentially as it is backpropagated through time. By using an
LSTM network, we can overcome this problem to some extent. An LSTM network is a special kind of
RNN built from LSTM blocks or units. These units preserve the error that is
backpropagated through time, and allow the RNN to learn over many time steps by maintaining a
more constant error. An LSTM unit contains gated cells, which control the
flow of data in the cell. Fig. 3 shows a block diagram of an LSTM unit.
A gated cell can store information, and read and write operations can be performed on it.
Each unit has an input gate, an output gate and a forget gate, and each gate has its own
weights. Instead of using a constant long-term memory, these gated cells forget
unnecessary information and store information that is
useful. The decision about which information to throw away is made by the forget gate,
which is a sigmoid layer.
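The gating mechanism can be written out for a single time step. The Python sketch below uses scalar inputs and states to keep the arithmetic visible; the weight layout (one input weight, one recurrent weight and one bias per gate) and the names are assumptions for illustration, and a real network would use vectors and learned weight matrices.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One LSTM time step with scalar input, hidden state and cell state.

    w maps each gate name to (input weight, recurrent weight, bias).
    """
    def pre(gate):
        w_x, w_h, b = w[gate]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("f"))    # forget gate: how much old memory to keep
    i = sigmoid(pre("i"))    # input gate: how much new information to write
    g = math.tanh(pre("g"))  # candidate value to write into the cell
    o = sigmoid(pre("o"))    # output gate: how much of the cell to expose
    c = f * c_prev + i * g   # updated cell state (long-term memory)
    h = o * math.tanh(c)     # updated hidden state (output)
    return h, c

# Hypothetical weights: a strongly open forget gate and a closed input gate,
# so the cell state is carried forward almost unchanged.
weights = {"f": (0.0, 0.0, 20.0), "i": (0.0, 0.0, -20.0),
           "g": (1.0, 0.0, 0.0), "o": (0.0, 0.0, 0.0)}
h, c = lstm_step(x=0.5, h_prev=0.0, c_prev=2.0, w=weights)
print(h, c)
```

With the forget gate near 1 and the input gate near 0, the cell state stays close to its previous value of 2.0, which is how an LSTM unit carries information across many time steps.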
These are the methodologies we will use, chosen with the currently available technology for
carrying out the various complex computations in our project in mind. Since machine learning, deep
learning and artificial intelligence (AI) are rapidly evolving fields, we will amend these procedures
if we find a better way for our algorithm to tackle the problem at hand, one that could increase the
accuracy or efficiency of the model.
Suggested Plan of action for utilization of research outcome expected from the project
Machine translation is a tool that can be used to practise a foreign language, including English,
and the text-to-speech tool provided by our model will serve as a handy pronunciation
guide. One can readily use it to learn how routine or uncommon words are pronounced,
even in English. It will be especially useful for getting the correct pronunciation of common
Assamese words that have found a place in the English lexicon. One can also use the audio option
to have our model read pasted text aloud, like an audiobook.
Most businesses based in Assam operate
either locally or at the state level. Very few of them are able to reach a national or
global scale, but with the help of our proposed model they can look towards expanding their
trade without worrying about the language barrier.
Agriculture is the major contributor to the economy of Assam, with 69% of the
population engaged in it. Assam produces more than half of India's tea, and some of the finest and
costliest teas in the world are produced in Assam [8].
decades [10]. The machine translation tool that we will build will contribute significantly to
agricultural modernisation by overcoming trade-hindering language barriers to a certain extent, and
will help the tea market expand nationally and globally. It will not be perfect, but being able to
converse freely with someone who speaks a different language is nothing short of
remarkable, and it will provide a significant stimulus to trade.
To gauge the pay-offs of machine translation, US exports on eBay to treated
countries and to control countries were compared after the policy change. The treated countries are
Spanish-speaking Latin American countries. The other countries that US sellers export to on eBay
come under the control group. Surveys and statistics showed parallel trends between the two groups.
Using this control group, eMT was found to increase US exports on eBay to treated countries by 17.5%.
Even if we achieve only 20% of the above-mentioned figure (an increase of about 3.5%), it will be a
huge boon for local residents as well as for the nation. When comparing the
aforementioned statistics with our own results, we have to keep in mind that the literacy rate and
internet familiarity in the United States are significantly higher than in Assam. Our model
will be computationally efficient and, with further research and development, it can be turned into
a prototype for online shopping websites or other websites, which will then be able to
translate English text and speech into Assamese for the Assamese-speaking population.
When we consider the primary application of the proposed machine translation model and the
individuals who will benefit most from it, i.e. the economically weaker
section of society that is deprived of basic amenities, we can foresee a giant leap in their
present condition.
With further research and development, and once the model is established on a much larger scale, it
can be deployed to translate government schemes, whether central or state[16].
Owing to the poor implementation of various schemes over the years, Assam’s economic development is
falling behind the rest of the country.
According to the development report of the Planning Commission [18], the gap, which has widened over
a period of 40 years due to various factors, would take nearly a decade to close and bring
in line with the national economy. A sudden and rapid change is thus unlikely, but an
appropriate and well-established model would surely lay a strong foundation for a better future.
Declaration
We declare that this research proposal is our own idea and it has not been submitted
anywhere else for consideration.
………………… ……………………. …………………... ………………………………………...
Sachin Rai Saumya Garg Shreya Sinha Dr. Malaya Dutta Borah
(17-14-077) (17-14-078) (17-14-123) (Mentor)
REFERENCES
[1] S. Islam and B. S. Purkayastha, “Development of Multilingual Assamese Electronic Dictionary”,
International Journal of Computer Science and Information Technologies, Vol. 6, No. 6, pp. 5446-5452,
2015.
[4] S. Sanyal and R. Borgohain, “Machine Translation system in India”, Annals of Faculty Engineering
Hunedoara – International Journal of Engineering, pp-137-142, 2013.
[5] N.J. Kalita and B. Islam, “Bengali to Assamese Statistical Machine Translation using Moses
(Corpus-Based)”, Proceedings of the International Conference on Cognitive Computing and Information
Processing, 2015.
[6] A. Godase and S. Govilkar, “Machine Translation Development for Indian Languages and its
Approaches”, International Journal on Natural Language Computing, Vol. 4, No. 2, pp-55-74, 2015.
[7] S. S. Kshirsagar, “Machine Translation projects in India: Current Status and Future Prospects”,
M.E. (app. ExTC), PG-DAC, CDAC Pune.
[11] R. Baldwin, Professor of International Economics, Graduate Institute of Geneva, Annual Meeting
of the New Champions.
[12] K.K. Baruah, P. Das, A. Hannan and S. Kr. Sarma, “Assamese-English Bilingual Machine
Translation”, International Journal on Natural Language Computing (IJNLC), Vol. 3, No. 3, June
2014.
[13] “Statistical Machine Translation System User Manual and Code Guide”, Available:
http://www.statmt.org//moses/manual/manual.pdf.
[14] M.T. Singh, P.P. Barman and R. Gogoi, “Speech Recognition Model for Assamese Language Using
Deep Neural Network”, CSE Department, Dibrugarh Institute of Engineering and Technology, IEEE,
June 2018.
[15] CSA survey of 3000 online shoppers: https://csa-research.com
[17] K. Papineni, S. Roukos, T. Ward and W.-J. Zhu, “Bleu: a Method for Automatic
Evaluation of Machine Translation”, Proceedings of the 40th Annual Meeting of the Association for
Computational Linguistics (ACL), Philadelphia, pp. 311-318, July 2002.
[19] P. F. Brown et al., “A Statistical Approach to Machine Translation”, Computational Linguistics,
Vol. 16, No. 2, pp. 79-85, June 1990.
[21] Antony P.G., “Machine Translation Approaches and Survey for Indian Languages”, Computational
Linguistics and Chinese Language Processing, Vol. 18, No. 1, pp. 47-78, March 2013.
[22] N. Sharma, P. Bhatia and V. Singh, “English to Hindi Statistical Machine Translation”, June 2011.