Adi Mad

PUNE DISTRICT EDUCATION ASSOCIATION
THE INSTITUTE OF TECHNOLOGY, HADAPSAR
INSTITUTE OF TECHNOLOGY
HADAPSAR- 411028
MICRO PROJECT
Academic Year: 2022-23
Title of Project
“Image Generator Application”
Department: Computer Eng.    Course Year: 22-23
Subject: MAD    Subject Code: 22617
Guidance Teacher
MRS. Gade M.A.
Submitted By:
Rohit Ghavate
Aditya Yadav
ANNEXURE 1
1.0 Brief Introduction:
A text detector app is software that extracts text from images or scanned documents using
OCR techniques. It is useful for automating tasks such as data entry, document management,
and information retrieval. Some apps offer additional features such as translation and text
editing. Overall, it is a tool that saves time and increases productivity.
2.0 Aim of the Micro Project
The aim of a text detector app is to automate the process of extracting text from images
or scanned documents. It helps to increase efficiency and accuracy in tasks such as data entry,
document management, and information retrieval. The app can save time and resources by
reducing the need for manual text extraction. Additionally, it can be used to convert paper-
based documents into digital format.
3.0 Outcomes in Affective Domain-
• Appreciation for creativity
• Confidence in design skills
• Sense of accomplishment
• Improved conflict resolution and problem-solving abilities
• Increased cultural competence and tolerance
Roll No.   Student Name     Marks out of 6 for     Marks out of 4 for     Total out
                            performance in         performance in         of 10
                            group activity         oral
25.        Rohit Ghavate
26.        Aditya Yadav
• ACTION PLAN
INDEX
SR. NO TITLE
1. INTRODUCTION
2. METHODOLOGY
3. PROGRAM CODE
4. OUTPUT
5. CONCLUSION
6. REFERENCE
1. INTRODUCTION
Image text detection is a computer vision and image processing technique that
involves identifying and localizing text regions within an image or video. The goal of text
detection is to accurately locate and extract text from images, which can then be further
processed using OCR techniques to recognize and extract the text itself.
Deep learning approaches have revolutionized text detection in recent years, with state-
of-the-art methods utilizing convolutional neural networks (CNNs) and other deep learning
architectures to achieve high accuracy and robustness in text detection. These methods are
trained on large datasets of labeled images and can learn to detect text in a variety of
conditions, including different fonts, sizes, and orientations, as well as complex backgrounds
and image noise. Text detection has numerous practical applications, such as document
analysis, license plate recognition, and scene text recognition in videos. Text detection can
also be used in conjunction with natural language processing (NLP) techniques to analyze the
content of text within images or video frames, providing valuable insights into the meaning
and context of visual information. Overall, text detection is an important tool in modern
computer vision and machine learning research, with a wide range of real-world
applications.
Text detection typically involves several steps. The first
step is to preprocess the image to enhance its quality and remove any noise or unwanted
elements that may interfere with text detection. This may involve applying filters, adjusting
image brightness and contrast, or resizing the image to a standardized format.
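The preprocessing step described above can be illustrated with a minimal sketch in plain Java. It assumes the image is available as an array of packed ARGB integers (the format that Android's Bitmap.getPixels produces). The class name PreprocessSketch and its method names are hypothetical and are not part of the project code. The sketch converts each pixel to gray and then stretches the contrast so that the darkest pixel becomes 0 and the brightest becomes 255.

```java
final class PreprocessSketch {

    /** Converts packed ARGB pixels to 0-255 gray values using the common 0.299/0.587/0.114 weights. */
    public static int[] toGray(int[] argb) {
        int[] gray = new int[argb.length];
        for (int i = 0; i < argb.length; i++) {
            int r = (argb[i] >> 16) & 0xFF;
            int g = (argb[i] >> 8) & 0xFF;
            int b = argb[i] & 0xFF;
            gray[i] = (299 * r + 587 * g + 114 * b) / 1000;
        }
        return gray;
    }

    /** Linearly rescales gray values so the darkest maps to 0 and the brightest to 255. */
    public static int[] stretchContrast(int[] gray) {
        int min = 255;
        int max = 0;
        for (int v : gray) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        if (max <= min) {
            return gray.clone(); // flat or empty image: nothing to stretch
        }
        int[] out = new int[gray.length];
        for (int i = 0; i < gray.length; i++) {
            out[i] = (gray[i] - min) * 255 / (max - min);
        }
        return out;
    }

    public static void main(String[] args) {
        // Three gray-ish pixels with low contrast (hypothetical sample data).
        int[] pixels = {0xFF404040, 0xFF808080, 0xFFC0C0C0};
        int[] stretched = stretchContrast(toGray(pixels));
        System.out.println(java.util.Arrays.toString(stretched)); // prints [0, 127, 255]
    }
}
```

A real application would normally follow this with noise filtering and resizing. Those steps are left out here to keep the sketch short.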
Next, the image is analyzed using algorithms and techniques that can identify regions of
interest where text is likely to be found. This may involve detecting edges and lines, looking
for regions of high contrast or texture, or analyzing the statistical properties of image regions
to determine their likelihood of containing text. Once regions of interest have been identified,
the next step is to localize and segment individual characters or words within these regions.
This may involve applying machine learning models that can recognize and classify different
types of characters or words, or using rule-based methods that rely on patterns and heuristics
to separate individual characters or words. Finally, the segmented text regions can be further
processed using OCR techniques to recognize and extract the text itself. This may involve
applying language models and dictionaries to improve recognition accuracy and correct
errors, or using machine learning models to classify and recognize different types of text.
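One simple rule-based way to find regions of interest is a horizontal projection profile: count the dark pixels in every row and treat consecutive rows with enough "ink" as one text line. The sketch below is only an illustration of that idea, not the method used by the app (which relies on Firebase ML Kit). It assumes the image has already been binarized into a boolean grid, and the names LineLocatorSketch, findTextRows and minInkPerRow are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class LineLocatorSketch {

    /**
     * Finds horizontal bands of rows that contain dark pixels.
     * binary[y][x] is true for a dark ("ink") pixel.
     * Returns one {firstRow, lastRow} pair per band.
     */
    public static List<int[]> findTextRows(boolean[][] binary, int minInkPerRow) {
        List<int[]> bands = new ArrayList<>();
        int start = -1; // -1 means we are currently between bands
        for (int y = 0; y < binary.length; y++) {
            int ink = 0;
            for (boolean dark : binary[y]) {
                if (dark) {
                    ink++;
                }
            }
            boolean isTextRow = ink >= minInkPerRow;
            if (isTextRow && start < 0) {
                start = y; // a new band begins
            } else if (!isTextRow && start >= 0) {
                bands.add(new int[] {start, y - 1}); // the band ended on the previous row
                start = -1;
            }
        }
        if (start >= 0) {
            bands.add(new int[] {start, binary.length - 1}); // band touches the bottom edge
        }
        return bands;
    }

    public static void main(String[] args) {
        // Tiny hypothetical page: two text lines separated by one blank row.
        boolean[][] page = {
            {false, false, false, false},
            {true,  true,  false, true },
            {true,  false, true,  true },
            {false, false, false, false},
            {false, true,  true,  false},
        };
        for (int[] band : findTextRows(page, 1)) {
            System.out.println("rows " + band[0] + " to " + band[1]);
        }
        // prints "rows 1 to 2" and then "rows 4 to 4"
    }
}
```

Projection profiles work best on clean, horizontally aligned documents. Scene text with complex backgrounds usually needs the learning-based detectors mentioned above.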
2. METHODOLOGY
Text detector applications typically use a combination of image processing and machine
learning techniques to detect and extract text from images or scanned documents. Here are
some of the common methodologies used in text detector applications:
1. Image Preprocessing: This involves enhancing the quality of the image or document,
removing noise, and normalizing the color and contrast to improve the accuracy of text
detection.
2. Text Localization: This involves identifying and localizing regions in the image that
contain text. This is typically done using techniques such as edge detection, thresholding, or
template matching.
3. Character Segmentation: This involves segmenting individual characters or words within
the text regions that have been identified. This can be done using techniques such as
connected component analysis, contour detection, or machine learning-based methods.
4. Optical Character Recognition (OCR): This involves recognizing and extracting the text
from the segmented characters or words. This can be done using machine learning-based
OCR models that have been trained on large datasets of labeled images.
5. Post-processing: This involves improving the accuracy of text recognition and correcting
any errors in the extracted text. This can be done using techniques such as language modeling,
spell checking, or context analysis.
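Connected component analysis, named in step 3, can be sketched in a few lines of plain Java. The example assumes a rectangular binary image in which true marks an ink pixel, and it groups pixels that touch horizontally or vertically (4-connectivity). Each group gets one bounding box, which is a rough candidate for a character. The class and method names are hypothetical and do not come from the project code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class ConnectedComponentsSketch {

    /** Returns one bounding box {minX, minY, maxX, maxY} per 4-connected group of true pixels. */
    public static List<int[]> boundingBoxes(boolean[][] binary) {
        int h = binary.length;
        int w = h == 0 ? 0 : binary[0].length;
        boolean[][] seen = new boolean[h][w];
        List<int[]> boxes = new ArrayList<>();
        int[][] steps = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};

        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                if (!binary[y][x] || seen[y][x]) {
                    continue;
                }
                // Breadth-first flood fill starting from an unvisited ink pixel.
                int[] box = {x, y, x, y};
                Deque<int[]> queue = new ArrayDeque<>();
                queue.add(new int[] {x, y});
                seen[y][x] = true;
                while (!queue.isEmpty()) {
                    int[] p = queue.poll();
                    box[0] = Math.min(box[0], p[0]);
                    box[1] = Math.min(box[1], p[1]);
                    box[2] = Math.max(box[2], p[0]);
                    box[3] = Math.max(box[3], p[1]);
                    for (int[] s : steps) {
                        int nx = p[0] + s[0];
                        int ny = p[1] + s[1];
                        if (nx >= 0 && nx < w && ny >= 0 && ny < h
                                && binary[ny][nx] && !seen[ny][nx]) {
                            seen[ny][nx] = true;
                            queue.add(new int[] {nx, ny});
                        }
                    }
                }
                boxes.add(box);
            }
        }
        return boxes;
    }

    public static void main(String[] args) {
        // Hypothetical 4x3 image holding an "L" shape and a short vertical bar.
        boolean[][] glyphs = {
            {true,  false, false, true },
            {true,  false, false, true },
            {true,  true,  false, false},
        };
        for (int[] b : boundingBoxes(glyphs)) {
            System.out.println(java.util.Arrays.toString(b));
        }
        // prints [0, 0, 1, 2] and then [3, 0, 3, 1]
    }
}
```

Characters made of separate strokes (for example "i") produce more than one component, so practical systems merge nearby boxes before recognition.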
Overall, the methodology used in text detector applications involves a combination of
techniques from image processing, machine learning, and natural language processing, to
accurately detect and extract text from images and scanned documents. The choice of
methodology may vary depending on the specific requirements of the application and the
nature of the images or documents being analyzed.
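As an illustration of the post-processing stage, the following sketch corrects a recognized word against a small dictionary using Levenshtein edit distance, which is one common way to implement spell checking. The dictionary contents, the maxEdits limit and the class name SpellFixSketch are assumptions made for this example only.

```java
import java.util.Arrays;
import java.util.List;

final class SpellFixSketch {

    /** Levenshtein edit distance computed with two rolling rows of the dynamic-programming table. */
    public static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Returns the closest dictionary word within maxEdits edits, or the input unchanged. */
    public static String correct(String word, List<String> dictionary, int maxEdits) {
        String best = word;
        int bestDistance = maxEdits + 1;
        for (String candidate : dictionary) {
            int d = editDistance(word.toLowerCase(), candidate.toLowerCase());
            if (d < bestDistance) {
                bestDistance = d;
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical dictionary for an invoice-scanning use case.
        List<String> dictionary = Arrays.asList("invoice", "total", "amount");
        System.out.println(correct("t0tal", dictionary, 1)); // prints total
        System.out.println(correct("xyz", dictionary, 1));   // prints xyz
    }
}
```

OCR engines often confuse visually similar characters such as "0" and "o", so a small edit limit catches many such errors without rewriting words that are simply absent from the dictionary.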
3. PROGRAM CODE
• activity_main.xml file
<?xml version="1.0" encoding="utf-8"?>
<RelativeLayout
    xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    tools:context=".MainActivity">

    <!-- Preview of the captured photo. -->
    <ImageView
        android:id="@+id/image"
        android:layout_width="200dp"
        android:layout_height="200dp"
        android:layout_alignParentTop="true"
        android:layout_centerHorizontal="true"
        android:layout_marginTop="29dp"
        android:scaleType="centerCrop" />

    <!-- Shows the text detected in the photo. -->
    <TextView
        android:id="@+id/text"
        android:layout_width="match_parent"
        android:layout_height="50dp"
        android:layout_below="@+id/image"
        android:layout_marginTop="10dp"
        android:textSize="15sp"
        android:textStyle="bold" />

    <!-- Opens the camera to take a picture. -->
    <Button
        android:id="@+id/snapbtn"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_alignParentStart="true"
        android:layout_alignParentLeft="true"
        android:layout_alignParentBottom="true"
        android:layout_marginStart="53dp"
        android:layout_marginLeft="53dp"
        android:layout_marginBottom="100dp"
        android:text="Snap"
        android:textAllCaps="false" />

    <!-- Runs text detection on the captured picture. -->
    <Button
        android:id="@+id/detectbtn"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_alignTop="@+id/snapbtn"
        android:layout_alignParentEnd="true"
        android:layout_alignParentRight="true"
        android:layout_marginEnd="39dp"
        android:layout_marginRight="39dp"
        android:text="Detect"
        android:textAllCaps="false" />

</RelativeLayout>
• MainActivity.java file
import android.content.Intent;
import android.graphics.Bitmap;
import android.os.Bundle;
import android.provider.MediaStore;
import android.view.View;
import android.widget.Button;
import android.widget.ImageView;
import android.widget.TextView;
import android.widget.Toast;
import androidx.annotation.NonNull;
import androidx.appcompat.app.AppCompatActivity;
import com.google.android.gms.tasks.OnFailureListener;
import com.google.android.gms.tasks.OnSuccessListener;
import com.google.firebase.ml.vision.FirebaseVision;
import com.google.firebase.ml.vision.common.FirebaseVisionImage;
import com.google.firebase.ml.vision.text.FirebaseVisionText;
import com.google.firebase.ml.vision.text.FirebaseVisionTextDetector;
import java.util.List;
public class MainActivity extends AppCompatActivity {

    // Views declared in activity_main.xml.
    private ImageView img;
    private TextView textview;
    private Button snapBtn;
    private Button detectBtn;

    // Most recent photo returned by the camera; null until one is taken.
    private Bitmap imageBitmap;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);

        // Look up the views by their ids.
        img = (ImageView) findViewById(R.id.image);
        textview = (TextView) findViewById(R.id.text);
        snapBtn = (Button) findViewById(R.id.snapbtn);
        detectBtn = (Button) findViewById(R.id.detectbtn);

        // Detect button: run text detection on the captured image.
        detectBtn.setOnClickListener(new View.OnClickListener() {
            @Override
            public void onClick(View v) {
                detectTxt();
            }
        });

        // Snap button: open the camera to capture an image.
        snapBtn.setOnClickListener(new View.OnClickListener() {
            @Override
            public void onClick(View v) {
                dispatchTakePictureIntent();
            }
        });
    }
    // Request code that identifies the camera result in onActivityResult.
    static final int REQUEST_IMAGE_CAPTURE = 1;

    private void dispatchTakePictureIntent() {
        // Ask an installed camera app to take a picture.
        Intent takePictureIntent = new Intent(MediaStore.ACTION_IMAGE_CAPTURE);

        // Start the camera only if some app can handle the intent.
        // On Android 11 and above, resolveActivity may return null unless
        // the manifest declares a matching <queries> element.
        if (takePictureIntent.resolveActivity(getPackageManager()) != null) {
            startActivityForResult(takePictureIntent, REQUEST_IMAGE_CAPTURE);
        }
    }

    @Override
    protected void onActivityResult(int requestCode, int resultCode, Intent data) {
        super.onActivityResult(requestCode, resultCode, data);
        // Handle only a successful camera result that actually carries data.
        if (requestCode == REQUEST_IMAGE_CAPTURE && resultCode == RESULT_OK
                && data != null && data.getExtras() != null) {
            // The camera app returns a small thumbnail under the "data" key.
            Bundle extras = data.getExtras();
            imageBitmap = (Bitmap) extras.get("data");
            // Show the captured image in the image view.
            img.setImageBitmap(imageBitmap);
        }
    }
    private void detectTxt() {
        // Nothing to analyze until the user has taken a picture.
        if (imageBitmap == null) {
            Toast.makeText(MainActivity.this, "Please snap an image first",
                    Toast.LENGTH_SHORT).show();
            return;
        }

        // Wrap the bitmap in the image type that Firebase ML Kit expects.
        FirebaseVisionImage image = FirebaseVisionImage.fromBitmap(imageBitmap);

        // Get the text detector. FirebaseVisionTextDetector belongs to early
        // releases of firebase-ml-vision; later releases replace it with
        // FirebaseVisionTextRecognizer.
        FirebaseVisionTextDetector detector =
                FirebaseVision.getInstance().getVisionTextDetector();

        // Detection runs asynchronously; the listeners receive the outcome.
        detector.detectInImage(image)
                .addOnSuccessListener(new OnSuccessListener<FirebaseVisionText>() {
                    @Override
                    public void onSuccess(FirebaseVisionText firebaseVisionText) {
                        // Display the text that was extracted.
                        processTxt(firebaseVisionText);
                    }
                })
                .addOnFailureListener(new OnFailureListener() {
                    @Override
                    public void onFailure(@NonNull Exception e) {
                        // Tell the user that detection failed.
                        Toast.makeText(MainActivity.this,
                                "Fail to detect the text from image..",
                                Toast.LENGTH_SHORT).show();
                    }
                });
    }
    private void processTxt(FirebaseVisionText text) {
        // Each block is a group of text lines found in the image.
        List<FirebaseVisionText.Block> blocks = text.getBlocks();

        // If no block was found, report that and stop.
        if (blocks.size() == 0) {
            Toast.makeText(MainActivity.this, "No Text ", Toast.LENGTH_LONG).show();
            return;
        }

        // Join the text of every block. Calling setText inside the loop
        // would leave only the last block visible.
        StringBuilder result = new StringBuilder();
        for (FirebaseVisionText.Block block : blocks) {
            result.append(block.getText()).append("\n");
        }
        textview.setText(result.toString().trim());
    }
}
4. OUTPUT
5. CONCLUSION
In conclusion, text detector applications have become an essential tool for automating tasks
related to data entry, document management, and information retrieval. These applications
utilize advanced image processing and machine learning techniques to accurately detect and
extract text from images and scanned documents, improving efficiency and reducing the need
for manual intervention.
Text detection technology has come a long way, with deep learning approaches and other
advanced methodologies achieving high levels of accuracy and robustness in detecting and
extracting text. These techniques are not only able to detect text in various conditions but can
also handle complex backgrounds, different fonts, sizes, and orientations.
Text detection has numerous practical applications, including document analysis, license
plate recognition, and scene text recognition in videos. With the integration of natural
language processing (NLP) techniques, text detection can provide valuable insights into the
meaning and context of visual information, unlocking new opportunities for automated
content creation and data analysis.
In summary, text detector applications are an essential tool for modern-day businesses and
organizations, helping to increase productivity, reduce costs, and unlock new insights and
opportunities from visual data.
6. REFERENCE
1. "Scene Text Detection and Recognition: The Deep Learning Era" by Minghui Liao et al., IEEE
Transactions on Pattern Analysis and Machine Intelligence (2018):
https://ieeexplore.ieee.org/document/8237601
2. "A Review on Text Detection and Recognition in Images and Videos" by Arindam Das et al.,
ACM Computing Surveys (2019): https://dl.acm.org/doi/abs/10.1145/3347678
3. "Text Detection and Recognition in Imagery: A Survey" by Saima Rathore et al., Journal of
Visual Communication and Image Representation (2019):
https://www.sciencedirect.com/science/article/abs/pii/S1047320318303756
