
PROJECT SYNOPSIS

ON
HANDWRITTEN TEXT RECOGNITION

SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR


THE AWARD OF DEGREE OF

BACHELOR OF TECHNOLOGY
IN
INFORMATION TECHNOLOGY

Submitted By:
Name                 University Roll No.
Vaidhi Kapoor        1903868
Ayush Julka          1903823
Gurkirat Singh       1903829
Jatin Wadhwa         1903838

Group No. - 10

Under the supervision of


Mrs. Reeta Bhardwaj
Assistant Professor, IT Department

DEPARTMENT OF INFORMATION TECHNOLOGY

DAV INSTITUTE OF ENGINEERING & TECHNOLOGY


Jalandhar – 144008
1.0 Project Overview
Handwritten character recognition is the process of converting handwritten work on a page
into a presentable digital format. There is a growing demand for software applications that
recognise characters when information is scanned from paper documents, since many
historical and mythological books and newspapers still exist in printed form [1]. They
deteriorate day by day because of climatic conditions or improper handling. As a result, there
is currently a strong need for "saving the information available in these paper documents in a
computer storage disc and then utilising this information through a searching procedure."
Scanning the documents is a straightforward way to get the information from them into a
computer system. A scanner saves each document in the computer system as an image, and
the user cannot edit the text that such an image contains [1]. Moreover, reading the contents
of these documents and searching them line by line and word by word is extremely difficult
for a computer system. The difficulty arises because the font characteristics of characters in
paper documents differ from those of characters in a computer system, so the computer is
unable to recognise the characters when reading them. Document processing is the process of
saving the contents of paper documents in a computer storage location and then reading and
searching the material [2]. This processing sometimes involves information in languages
other than English. The procedure is also known as Document Image Analysis (DIA).
Researchers have proposed many techniques for DIA in recent years; each has its own
advantages and limitations, which will be described in depth in the next portion of this
study [1].

2.0 Existing System


Most present recognition systems rely on classical recognition technology, which is
primarily focused on image-level matching and needs manual feature extraction, making the
model vulnerable to external influences such as illumination and deformation [1]. A system
that uses a convolutional neural network to extract image features replaces the usual manual
feature extraction approach and can achieve superior recognition performance because it is
less sensitive to illumination and deformation.
The existing system of handwritten text recognition has two main disadvantages. First, it is
not 100% accurate: some mistakes are likely to be made during recognition. Second, the
images produced require a lot of storage space.
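
To make the idea of convolutional feature extraction concrete, the following is a minimal sketch of one convolution filter applied to a small image patch. It is not a trained convolutional neural network: the function name convolve2d, the 4x4 patch and the edge kernel are assumptions chosen for illustration. It also shows, for this one zero-sum kernel only, that a uniform brightness offset does not change the extracted feature map.

```python
# Illustrative sketch only: a single hand-written convolution filter,
# not a trained convolutional neural network.

def convolve2d(image, kernel):
    """Valid-mode 2D cross-correlation of a list-of-lists image with a kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 4x4 grayscale patch: dark (0) on the left, bright (9) on the right.
patch = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

# Hypothetical vertical-edge kernel; its weights sum to zero.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(patch, vertical_edge)

# The same patch under brighter, uniform illumination (every pixel + 5).
brighter = [[pixel + 5 for pixel in row] for row in patch]
same_under_offset = convolve2d(brighter, vertical_edge) == feature_map
```

In a real convolutional network the kernel weights are learned from training data rather than written by hand, and many such filters are stacked in layers.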
3.0 Proposed System
Many regional languages throughout the world have different writing styles, which OCR
systems can recognise using proper algorithms and strategies. We apply learning to the
recognition of English characters [2]. Recognition of handwritten characters becomes
difficult when odd characters are present or when several characters have similar shapes. The
scanned image is pre-processed to obtain a cleaned image, which is then segmented into
individual characters.
A new handwriting recognition approach based on geometric properties of letters is proposed.
This work deals with the recognition of isolated handwritten characters. The characters are
drawn with a pen on an ordinary sheet of paper and then converted to a binary image that a
computer can examine. We try to enhance the accuracy of the approach by running tests on
sample inputs.
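
As an illustration of what geometric properties of a letter might look like in practice, the sketch below computes a few simple descriptors from a binary character image. The synopsis does not specify which properties are used, so the chosen features (bounding-box width and height, aspect ratio, ink density), the function name geometric_features and the sample glyph are all assumptions.

```python
def geometric_features(glyph):
    """Return simple geometric descriptors of a binary glyph (1 = ink, 0 = paper)."""
    # Coordinates of every ink pixel in the image.
    ink = [(r, c) for r, row in enumerate(glyph) for c, v in enumerate(row) if v]
    if not ink:
        raise ValueError("glyph contains no ink pixels")
    rows = [r for r, _ in ink]
    cols = [c for _, c in ink]
    # Size of the tightest box that encloses all ink pixels.
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    return {
        "width": width,
        "height": height,
        "aspect_ratio": width / height,
        # Fraction of the bounding box that is covered by ink.
        "ink_density": len(ink) / (width * height),
    }


# Hypothetical 5x5 binary image of the letter "L".
letter_l = [
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

features = geometric_features(letter_l)
```

Descriptors like these could be passed to a classifier as a feature vector; which ones actually separate similar-looking characters would have to be established by testing.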
Handwritten character recognition generally involves the following modules:
A. Image Acquisition: The images for the system are acquired by scanning handwritten
documents or books, or by photographing the document with a camera. The input image may
be grayscale or colour.
B. Pre-processing: Pre-processing prepares the scanned image for the extraction of text. It
consists of a series of operations performed on the scanned input image, including
background noise reduction, image restoration and filtering.
C. Segmentation: Segmentation is the most important process in the text recognition module.
It separates the individual characters of an image by breaking the text into lines and words
until all the characters are isolated.
D. Feature Extraction: Feature extraction is the process of retrieving the most important data
from the raw data, that is, finding a set of parameters that uniquely defines the character.
E. Classification: Classification is the process of identifying each character and assigning it
to the correct character class, so that the text in images is converted into a
computer-understandable form.
F. Post-processing: The output of the text recognition module is text data that the computer
can understand, so it needs to be stored in a suitable format.
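
The pre-processing and segmentation modules above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: it uses a fixed threshold of 128 for binarization and a vertical projection to split one text line at empty columns, and the names binarize, segment_columns and the sample image are hypothetical.

```python
def binarize(gray, threshold=128):
    """Pre-processing: map grayscale pixels (0-255) to 1 (ink) or 0 (paper)."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray]


def segment_columns(binary):
    """Segmentation: split a text line into characters at empty columns.

    Returns a list of (start, end) column ranges, with end exclusive.
    """
    width = len(binary[0])
    # Vertical projection: number of ink pixels in each column.
    projection = [sum(row[c] for row in binary) for c in range(width)]
    segments, start = [], None
    for c, count in enumerate(projection):
        if count > 0 and start is None:
            start = c                      # a character begins
        elif count == 0 and start is not None:
            segments.append((start, c))    # a character ends
            start = None
    if start is not None:
        segments.append((start, width))    # character touching the right edge
    return segments


# Hypothetical 3x7 grayscale line: two dark strokes on a bright background.
gray_line = [
    [255,  20,  30, 255, 255,  10, 255],
    [255,  25, 255, 255, 255,  15, 255],
    [255,  20,  30, 255, 255,  10, 255],
]

binary_line = binarize(gray_line)
char_ranges = segment_columns(binary_line)
```

Projection-based splitting only works when characters are separated by at least one empty column; touching or cursive characters would need a different segmentation strategy.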
3.1 Problem Formulation
We are going to work on three main problem areas in the existing system:

i The existing system is not user friendly.
ii The accuracy is quite low.
iii It cannot recognise handwritten text in different languages.

3.2 Objectives
i To convert handwritten text into digital document text.
ii To recognise handwritten text in any language.
iii To provide an easy-to-use, i.e. user-friendly, environment.
iv To increase the accuracy of the result.

4.0 Features of the Project

i. Keeping the interface simple and avoiding unnecessary elements. The best interfaces are
almost invisible to the user.
ii. Using clear language.
iii. To be effective, the solution must be flexible enough to accommodate a wide range of
users in terms of physical characteristics.

5.0 Facilities Required

5.1 Hardware Requirements:
i. Processor: 64-bit, Core 2 Duo, 2.93 GHz.
ii. RAM: 4 GB or more.
iii. HDD: 20 GB of available space or more.
iv. Display: Generic Non-PnP Monitor (1024 x 768) or higher resolution monitor.
v. Input: Optical Scanner.
vi. Keyboard: A standard keyboard.
vii. Stable Internet Connection.

5.2 Software Requirements:

i. Operating system: Linux (Ubuntu 16.04 to 17.10) or Windows 7 to 10, with 2 GB
RAM (4 GB preferable).
ii. Python 3.6 and related packages.
6.0 Project Planning

Name of the Activity    Date of Completion   Deliverables                  Name of Team Member
Requirement Analysis    10-05-2022           SRS Document                  Vaidhi, Ayush, Gurkirat, Jatin
Design                  06-06-2022           Design Document               Vaidhi, Ayush, Gurkirat, Jatin
Coding                  15-09-2022           Software Code (Prototype)     Vaidhi, Ayush, Gurkirat, Jatin
Testing                 10-10-2022           Test Document                 Vaidhi, Ayush, Gurkirat, Jatin
Implementation          30-11-2022           Final Project Demonstration   Vaidhi, Ayush, Gurkirat, Jatin

References
[1] Manwatkar, P. M. "A Technical Review on Text Recognition from Images". IEEE
Sponsored 9th International Conference on Intelligent Systems and Control (ISCO), 2019.
[2] Dzulkifli, M., Muhammad, F. & Razib, O. "On-Line Cursive Handwriting Recognition: A
Survey of Methods and Performance". The 4th International Conference on Computer
Science and Information Technology (CSIT2018), Amman, Jordan, 5-7 April 2018.

(Signature) (Signature)
Team Leader (Project Guide)

Date: 25-03-2022
