Machine learning project Report1 (2)
SCHOOL OF COMPUTING
DEPARTMENT OF SCIENCE AND ENGINEERING
DECLARATION BY THE STUDENT
Date:
BONAFIDE CERTIFICATE
Certified that this project report “AUTOMATED LANGUAGE
TRANSLATION FOR INDIGENOUS LANGUAGES” is the
bonafide work of “Sk Daniya Sulthana Begam (99220040194),
K. Lakshmi Sai Deepika (99220040283), N. Yasaswini
(99220040320), P. Manasa (99220040326)” who carried out the
project work under my supervision.
Dr. N. Suresh Kumar
Head of the Department
Professor
Department of CSE
Kalasalingam Academy of Research and Education
Krishnankoil-626126

Mr. R. Mari Selvan
Supervisor
Assistant Professor
Department of CSE
Kalasalingam Academy of Research and Education
Krishnankoil-626126
ACKNOWLEDGEMENT
Table of Contents
Chapter No. Title
1 Abstract
2 Introduction
3 Literature Survey
3.1 Features and advantages
3.2 Limitations and challenges
4 Methodology
4.1 Packages used
4.2 Data collection methods, sources
5 Proposed works
5.1 Flowchart
5.2 Code Implementation
6 Reference papers
CHAPTER 1
ABSTRACT
CHAPTER 2
INTRODUCTION
• Automated language translation for indigenous languages
uses machine learning algorithms to translate text from one
language to another. The translator app described in this
report is a tool for communicating across different languages.
• Built with Streamlit, speech recognition, and translation
APIs, the application lets users translate both text and
speech in real time.
• As globalization brings speakers of different languages into
closer contact, the need for efficient language translation
tools continues to grow.
• The translator app addresses this need by providing a
user-friendly platform that helps individuals and businesses
overcome language barriers.
• Whether users are traveling abroad, conducting international
business, or connecting with people from diverse linguistic
backgrounds, the app helps them communicate in the
languages it supports.
• The following chapters describe the features and
functionality of the translator app.
CHAPTER 3
LITERATURE SURVEY
This survey discusses challenges, data sources, and future directions for enhancing
the processing of Indian regional languages in various language processing tasks.
Modules/Libraries used:
Feature Engineering and Selection:
Feature extraction:
Character n-grams: Extract sequences of characters (e.g.,
bigrams, trigrams) from the text. These can capture patterns
specific to certain languages.
Word n-grams: Similarly, extract sequences of words to capture
language-specific patterns.
Language-specific Features: Incorporate linguistic features that
are known to be characteristic of certain languages. For example,
certain phonetic or orthographic features may be unique to specific
indigenous languages.
Statistical Features: Compute statistics such as word frequencies,
average word length, or entropy of the text. These can provide
insights into the language's characteristics.
Syntactic Features: Utilize syntactic features like part-of-speech
tags or syntactic dependencies. Some languages may exhibit
specific syntactic patterns.
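
The character n-gram, word n-gram, and statistical features listed above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the project's actual implementation; the function names `char_ngrams`, `word_ngrams`, and `text_statistics` are hypothetical, words are split on whitespace as a simplifying assumption, and entropy is computed over characters.

```python
import math
from collections import Counter


def char_ngrams(text, n):
    """Return every overlapping character n-gram of `text`, in order."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def word_ngrams(text, n):
    """Return every overlapping word n-gram as a tuple of words.

    Assumption: words are separated by whitespace, which will not hold
    for every language or script.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    words = text.split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def text_statistics(text):
    """Compute word frequencies, average word length, and character entropy."""
    words = text.split()
    word_frequencies = Counter(words)
    average_word_length = (
        sum(len(word) for word in words) / len(words) if words else 0.0
    )

    # Shannon entropy (in bits) of the character distribution.
    char_counts = Counter(text)
    total_chars = len(text)
    if total_chars:
        char_entropy = -sum(
            (count / total_chars) * math.log2(count / total_chars)
            for count in char_counts.values()
        )
    else:
        char_entropy = 0.0

    return {
        "word_frequencies": word_frequencies,
        "average_word_length": average_word_length,
        "char_entropy": char_entropy,
    }
```

For example, `char_ngrams("abcd", 2)` yields the bigrams `ab`, `bc`, and `cd`. Counting such n-grams per document (for instance with `Counter`) gives a feature vector that a language-identification classifier could consume.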
3.1 FEATURES AND ADVANTAGES
Features:
Users can input text in their preferred language and translate it into
multiple target languages with just a few clicks.
With its intuitive design and robust features, the proposed system
seeks to bridge language barriers and promote global connectivity
and understanding.
3.2 LIMITATIONS AND CHALLENGES
LIMITATIONS:
Limited Training Data: Indigenous languages often have
limited digital presence and resources compared to widely
spoken languages.
Complex Linguistic Structures: Many indigenous languages
have complex linguistic structures, including unique
grammatical rules, syntax, and semantics.
Lack of Standardization: Indigenous languages often lack
standardization across different dialects and regions.
Cultural Context: Indigenous languages are deeply
embedded in their respective cultures, and translations often
involve conveying cultural nuances, idiomatic expressions,
and contextual meanings.
CHALLENGES:
Code-Switching and Borrowing
Quality Control and Evaluation
Accessibility and Infrastructure
Community Involvement and Ownership
Ethical and Socioeconomic Implications
CHAPTER 4
METHODOLOGY
• The provided code initializes a Streamlit app with two main
sections: text translation and voice translation.
• In the text translation section, users can enter text, choose
source and target languages, and click the “translate”
button.
• The translated text is displayed, and an audio file is
generated and played.
• The voice translation section allows users to select the
source language for speech input.
• The recognized speech is translated, and the translated text
is displayed along with an audio playback.
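
The flow described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the project's code: the names `TranslationResult`, `translate_and_speak`, `googletrans_translate`, `gtts_synthesize`, `recognize_speech`, and `run_app` are hypothetical, the language table is a placeholder, and the target-language selector in the voice section is an assumption. The translation and speech back ends are passed in as functions so the core step can be exercised without network access. The `googletrans` call assumes a release with a synchronous `translate` method; some newer releases expose an asynchronous API instead.

```python
import io
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TranslationResult:
    source_text: str
    translated_text: str
    audio_bytes: Optional[bytes]


def translate_and_speak(text, src, dest, translate, synthesize=None):
    """Translate `text`, then optionally synthesize audio for the result.

    `translate(text, src, dest)` returns a string and
    `synthesize(text, lang)` returns audio bytes; both are injected.
    """
    cleaned = text.strip()
    if not cleaned:
        raise ValueError("No text to translate")
    translated = translate(cleaned, src, dest)
    audio = synthesize(translated, dest) if synthesize else None
    return TranslationResult(cleaned, translated, audio)


def googletrans_translate(text, src, dest):
    """Assumed back end: googletrans with a synchronous Translator."""
    from googletrans import Translator  # third-party, imported lazily
    return Translator().translate(text, src=src, dest=dest).text


def gtts_synthesize(text, lang):
    """Assumed back end: gTTS writing MP3 data to an in-memory buffer."""
    from gtts import gTTS  # third-party, imported lazily
    buffer = io.BytesIO()
    gTTS(text=text, lang=lang).write_to_fp(buffer)
    return buffer.getvalue()


def recognize_speech(language):
    """Record from the default microphone and return the recognized text."""
    import speech_recognition as sr  # third-party, imported lazily
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        audio = recognizer.listen(source)
    return recognizer.recognize_google(audio, language=language)


def run_app():
    """Build the two-section Streamlit page (text and voice translation)."""
    import streamlit as st  # third-party, imported lazily

    # Placeholder language table; the real app's list is not given here.
    languages = {"English": "en", "Hindi": "hi", "Tamil": "ta", "Telugu": "te"}
    names = list(languages)

    st.title("Translator")

    st.header("Text translation")
    text = st.text_area("Enter text")
    src = st.selectbox("Source language", names, key="text_src")
    dest = st.selectbox("Target language", names, key="text_dest")
    if st.button("Translate") and text.strip():
        result = translate_and_speak(
            text, languages[src], languages[dest],
            googletrans_translate, gtts_synthesize,
        )
        st.write(result.translated_text)
        st.audio(result.audio_bytes, format="audio/mp3")

    st.header("Voice translation")
    voice_src = st.selectbox("Spoken language", names, key="voice_src")
    voice_dest = st.selectbox("Target language", names, key="voice_dest")
    if st.button("Record and translate"):
        spoken = recognize_speech(languages[voice_src])
        result = translate_and_speak(
            spoken, languages[voice_src], languages[voice_dest],
            googletrans_translate, gtts_synthesize,
        )
        st.write(result.translated_text)
        st.audio(result.audio_bytes, format="audio/mp3")
```

To try the page, add a call to `run_app()` at the bottom of the script and launch it with `streamlit run`. Keeping `translate_and_speak` free of library imports means the same function serves both sections and can be tested with stand-in back ends.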
4.1 PACKAGES USED
streamlit
SpeechRecognition (imported as speech_recognition)
pyttsx3
googletrans
gTTS (imported as gtts)
CHAPTER 5
PROPOSED WORKS
CHAPTER 6
REFERENCE PAPERS
“Ethical Considerations for Machine Translation of
Indigenous Languages”:
Discusses ethical challenges and emphasizes community
involvement.
Canadian Indigenous Languages Technology Project:
Develops language technologies for Indigenous
languages.
“IndT5: A Text-to-Text Transformer for 10 Indigenous
Languages”:
First Transformer model for Indigenous languages.
“Enhancing Translation for Indigenous Languages”:
Investigates multilingual models for translation.