
Supervised By

Dr. Matheel

Done by
Ali Abid Husaen & Arwa Sabah
What is Optical Character Recognition (OCR)?

Machine replication of human functions, such as reading, is an ancient dream; the origins of character recognition can actually be traced back to 1870.

The traditional way of entering data into a computer is through the keyboard. However, this is not always the best or the most efficient solution, and in many cases automatic identification may be an alternative.
Optical character recognition is needed when the information should be readable both to humans and to a machine, and alternative inputs cannot be predefined.
The more constrained the input is, the better the performance of the OCR system will be; when it comes to totally unconstrained handwriting, however, recognition is still far from reliable.
Methodically, character recognition is a subset of the pattern recognition area.

Optical character recognition (also optical character reader, OCR) is


the mechanical or electronic conversion of images of typed, handwritten or printed text
into machine-encoded text, whether from a scanned document, a photo of a document, a
scene-photo (for example the text on signs and billboards in a landscape photo) or from
subtitle text superimposed on an image (for example from a television broadcast). It is
widely used as a form of information entry from printed paper data records, whether
passport documents, invoices, bank statements, computerised receipts, business cards,
mail, printouts of static data, or any suitable documentation. It is a common method of
digitising printed texts so that they can be electronically edited, searched, stored more
compactly, displayed on-line, and used in machine processes such as cognitive
computing, machine translation, (extracted) text-to-speech, key data and text mining. OCR
is a field of research in pattern recognition, artificial intelligence and computer vision.

Methods of OCR
The main principle in automatic recognition of patterns is first to teach the machine
which classes of patterns may occur and what they look like. In OCR the patterns
are letters, numbers and some special symbols like commas, question marks etc., while
the different classes correspond to the different characters. The machine is taught
by showing it examples of characters from all the different classes.
Based on these examples the machine builds a prototype or a description of each class
of characters. Then, during recognition, the unknown characters are compared to the
previously obtained descriptions and assigned the class that gives the best match.
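The teach-then-match principle above can be sketched in a few lines. This is a minimal illustrative example, not a production recognizer: each class prototype is the mean of its training bitmaps, and an unknown character is assigned to the class whose prototype matches best. The 3x3 toy bitmaps and class labels are assumptions for demonstration only.

```python
def build_prototypes(training_examples):
    """Average the bitmaps of each class into one prototype per class."""
    prototypes = {}
    for label, bitmaps in training_examples.items():
        n = len(bitmaps)
        rows, cols = len(bitmaps[0]), len(bitmaps[0][0])
        prototypes[label] = [
            [sum(b[r][c] for b in bitmaps) / n for c in range(cols)]
            for r in range(rows)
        ]
    return prototypes

def classify(bitmap, prototypes):
    """Assign the class whose prototype has the smallest pixel-wise distance."""
    def distance(proto):
        return sum(
            (proto[r][c] - bitmap[r][c]) ** 2
            for r in range(len(bitmap)) for c in range(len(bitmap[0]))
        )
    return min(prototypes, key=lambda label: distance(prototypes[label]))

# Toy training set: 3x3 bitmaps of a vertical bar ("I") and a ring ("O").
training = {
    "I": [[[0, 1, 0], [0, 1, 0], [0, 1, 0]]],
    "O": [[[1, 1, 1], [1, 0, 1], [1, 1, 1]]],
}
protos = build_prototypes(training)
print(classify([[0, 1, 0], [0, 1, 0], [0, 1, 0]], protos))  # -> I
```

Real systems use richer class descriptions than raw pixel averages (see the feature extraction section below), but the teach/describe/match structure is the same.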

Components of an OCR system


A typical OCR system consists of several components; figure 1 illustrates a common
setup. The first step in the process is to digitize the analog document using an
optical scanner. When the regions containing text have been located, each symbol is
extracted through a segmentation process. The extracted symbols may then be
preprocessed, eliminating noise, to facilitate the extraction of features in the next step.

Figure 1: Components of an OCR system: optical scanning → location and segmentation → preprocessing → feature extraction → recognition → post-processing.

1- Optical scanning.

Through the scanning process a digital image of the original document is captured. In
OCR optical scanners are used, which generally consist of a transport mechanism plus a
sensing device that converts light intensity into gray-levels. Printed documents usually
consist of black print on a white background. Hence, when performing OCR, it is common
practice to convert the multilevel image into a bilevel image of black and white. Often this
process, known as thresholding, is performed in the scanner to save memory space and
computational effort.

Figure 2: Problems in thresholding. Top: original grey-level image; middle: image thresholded with a global method; bottom: image thresholded with an adaptive method.
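Global thresholding, as used in figure 2, can be sketched as follows. This is an assumed minimal example with a fixed threshold; adaptive methods instead vary the threshold across the image, which is what rescues images like the bottom one in the figure.

```python
def global_threshold(gray, threshold=128):
    """Map a 2D grey-level image (values 0-255) to a bilevel image: 1=black, 0=white."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray]

# Toy "scan": a dark vertical stroke on a light background.
page = [
    [250, 40, 245],
    [248, 35, 240],
]
print(global_threshold(page))  # -> [[0, 1, 0], [0, 1, 0]]
```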

2- Location and segmentation.

Segmentation is a process that determines the constituents of an image. It is necessary to
locate the regions of the document where data have been printed and to distinguish them
from figures and graphics. The most common technique is to isolate each symbol as a
connected component of black pixels. This technique is easy to implement, but problems
occur if characters touch or if characters are fragmented and consist of several parts. The
main problems in segmentation may be divided into three groups:
1- Extraction of touching and fragmented characters.
Such distortions may lead to several joined characters being interpreted as one single
character, or to a fragment of a character being interpreted as an entire symbol.
2- Distinguishing noise from text.
Dots and accents may be mistaken for noise, and vice versa.
3- Mistaking text for graphics or geometry.
In this case the text will not be passed to the recognition stage. This often happens if
characters are connected to graphics.

Figure 3: Degraded symbols.
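A common way to extract symbols (assumed here as an illustration, not the text's specific algorithm) is connected-component labelling: every group of touching black pixels becomes one candidate symbol. Touching characters then come out as a single component, and a fragmented character as several, which is exactly the first problem group described above.

```python
def connected_components(image):
    """Return a list of components; each is a set of (row, col) black pixels."""
    rows, cols = len(image), len(image[0])
    seen, components = set(), []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] == 1 and (r, c) not in seen:
                stack, comp = [(r, c)], set()
                seen.add((r, c))
                while stack:  # flood fill over 4-connected neighbours
                    y, x = stack.pop()
                    comp.add((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            stack.append((ny, nx))
                components.append(comp)
    return components

# Two separate vertical strokes -> two candidate symbols.
image = [
    [1, 0, 1],
    [1, 0, 1],
]
print(len(connected_components(image)))  # -> 2
```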

3- Preprocessing

The image resulting from the scanning process may contain a certain amount of noise.
Depending on the resolution of the scanner and the success of the applied thresholding
technique, the characters may be smeared or broken. Some of these defects, which
may later cause poor recognition rates, can be eliminated by using a preprocessor to
smooth the digitized characters.
The smoothing implies both filling and thinning. Filling eliminates small breaks, gaps
and holes in the digitized characters, while thinning reduces the width of the line. The
most common techniques for smoothing move a window across the binary image of
the character, applying certain rules to the contents of the window.

Figure 4: Normalization and smoothing of a symbol.

4- Feature extraction

The objective of feature extraction is to capture the essential characteristics of the
symbols, and it is generally accepted that this is one of the most difficult problems of
pattern recognition. The most straightforward way of describing a character is by the
actual raster image. Another approach is to extract certain features that still characterize
the symbols but leave out the unimportant attributes. The techniques for extraction of
such features are often divided into three main groups, where the features are found from:
1- The distribution of points.
2- Transformations and series expansions.
3- Structural analysis.
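A simple representative of the first group (features from the distribution of points) is zoning: the character image is divided into a grid of zones, and the black-pixel density of each zone forms the feature vector. The 4x4 bitmap and 2x2 zoning below are illustrative assumptions.

```python
def zoning_features(image, zones=2):
    """Return black-pixel densities of a zones x zones partition of the image."""
    rows, cols = len(image), len(image[0])
    zr, zc = rows // zones, cols // zones
    features = []
    for zy in range(zones):
        for zx in range(zones):
            pixels = [
                image[r][c]
                for r in range(zy * zr, (zy + 1) * zr)
                for c in range(zx * zc, (zx + 1) * zc)
            ]
            features.append(sum(pixels) / len(pixels))
    return features

# An "L"-like 4x4 character: left column and bottom row are black.
glyph = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]
print(zoning_features(glyph))  # -> [0.5, 0.0, 0.75, 0.5]
```

The resulting short vector, rather than the full raster image, is what gets compared against the class descriptions during recognition.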

Applications:

Many applications exist, and some of them are mentioned below.

1- Aid for the blind.

In the early days, before digital computers and the need to input large amounts of
data emerged, this was the imagined area of application for reading machines.
Combined with a speech synthesis system, such a reader would enable the blind to
understand printed documents. A problem, however, has been the high cost of reading
machines, though this may become a growing area as the cost of microelectronics falls.

2- Automatic number-plate readers.

A few systems for the automatic reading of car number plates exist. As opposed to other
applications of OCR, the input image is not a natural bilevel image and must be
captured by a very fast camera. This creates special problems and difficulties, even
though the character set is limited and the syntax restricted.

3- Signature verification and identification.

This application is especially useful in the banking environment. Such a system
establishes the identity of the writer without attempting to read the handwriting. The
signature is simply considered as a pattern which is matched against signatures stored
in a reference database.

Constrained handwriting:

Recognition of constrained handwriting deals with the problem of unconnected, normal
handwritten characters. Optical readers with such capabilities are not yet very common,
but they do exist. However, these systems require well-written characters, and most of
them can only recognize digits unless certain standards for the hand-printed characters
are followed (see figure 6). The characters should be printed as large as possible to
retain good resolution, and entered in specified boxes. The writer is also instructed to
keep to certain models provided, avoiding gaps and extra loops. Commercially, the term
ICR (Intelligent Character Recognition) is often used for systems able to recognize
hand-printed characters.

Figure 6: Instructions for OCR handwriting.

OCR performance evaluation


In the evaluation of an OCR system, three different performance rates should be investigated:

1- Recognition rate.
The proportion of correctly classified characters.
2- Rejection rate.
The proportion of characters which the system was unable to recognize. Rejected
characters can be flagged by the OCR system and are therefore easily retraceable for
manual correction.
3- Error rate.
The proportion of characters erroneously classified. Misclassified characters go
undetected by the system, and manual inspection of the recognized text is necessary
to detect and correct these errors.
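The three rates above follow directly from counts of correctly recognized, rejected, and misclassified characters (the function name and counts below are assumed for illustration):

```python
def performance_rates(correct, rejected, errors):
    """Recognition, rejection and error rates as fractions of all characters."""
    total = correct + rejected + errors
    return {
        "recognition_rate": correct / total,
        "rejection_rate": rejected / total,
        "error_rate": errors / total,
    }

# Example: 1000 characters, of which 970 recognized, 20 rejected, 10 wrong.
rates = performance_rates(correct=970, rejected=20, errors=10)
print(rates)  # -> {'recognition_rate': 0.97, 'rejection_rate': 0.02, 'error_rate': 0.01}
```

Note the practical asymmetry the text points out: rejections cost only manual correction, while errors additionally cost the inspection needed to find them.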

Content-based image retrieval (CBIR)

CBIR is the application of computer vision techniques to the image retrieval problem, that
is, the problem of searching for digital images in large databases.

CBIR is also known as query by image content (QBIC) and content-based visual
information retrieval (CBVIR). It is the process of retrieving images from a database or
library of digital images according to the visual content of the images; in other words,
retrieving images that have similar content in terms of colours, textures or shapes. For
example, you can pick a landscape image of mountains and try to find similar scenes with
similar colours and/or similar shapes.

CBIR uses the visual contents of an image, such as colour, shape, texture and spatial
layout, to represent and index the image.

Image retrieval techniques are useful in many image-processing applications. CBIR
works with whole images, and searching is based on comparison with the query. General
techniques for image retrieval are based on colour, texture and shape.

The current state of content-based image retrieval:
The history of content-based image retrieval can be divided into three phases:
1- Retrieval based on artificial annotations (labelling images with text).
2- Retrieval based on the visual character of image contents (image feature extraction;
low-level features).
3- Retrieval based on image semantic features (high-level features).

Some types of features used for image retrieval:
A- Shape retrieval
Shape does not refer to the shape of an image but to the shape of a particular region that is
being sought. Shapes are often determined by:
1. Applying segmentation or edge detection to an image.
2. Other methods, such as using shape filters to identify given shapes in an image.
B- Colour retrieval
Computing distance measures based on colour similarity is achieved by computing a colour
histogram for each image that identifies the proportion of pixels within the image holding
specific values (which humans express as colours).
Common colour feature extraction methods include:
1. Colour space.
2. Colour histogram.
3. Colour moments.
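The colour-histogram idea above can be sketched as follows: each image is reduced to the proportion of pixels falling in each colour bin, and two images are compared by histogram intersection (one common similarity choice). The coarse 2-levels-per-channel quantisation and the toy pixel lists are illustrative assumptions.

```python
def colour_histogram(pixels, levels=2):
    """Normalised histogram over a levels^3 quantisation of RGB space."""
    def q(v):  # quantise a 0-255 channel value to a bin index
        return min(v * levels // 256, levels - 1)
    bins = [0.0] * (levels ** 3)
    for r, g, b in pixels:
        bins[q(r) * levels * levels + q(g) * levels + q(b)] += 1
    return [count / len(pixels) for count in bins]

def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(h1, h2))

reddish = [(200, 10, 10), (220, 30, 20), (180, 5, 15)]
bluish = [(10, 20, 200), (30, 10, 220), (5, 25, 240)]
print(intersection(colour_histogram(reddish), colour_histogram(reddish)))  # -> 1.0
print(intersection(colour_histogram(reddish), colour_histogram(bluish)))   # -> 0.0
```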
C- Texture retrieval
Texture measures look for visual patterns in images and how they are spatially defined.
Textures are represented by texels, which are then placed into a number of sets depending on
how many textures are detected in the image. These sets not only define the texture, but also
where in the image the texture is located.
The texture feature description categories are explained below:
1. Statistical methods.
2. Model-based approaches.
3. Transform domain features.

Content-based image retrieval (CBIR) system

A content-based retrieval system is divided into:


1. Off-line feature extraction:
The system automatically extracts visual attributes at a low level (such as colour,
texture, and shape), at a high level (semantic features), or both, for each image in
the database based on its pixel values, and stores them in a separate database within the
system called the feature database. The feature data (also known as the image signature)
for each of the visual attributes of an image is much smaller in size than the image
data, so the feature database contains an abstraction (a compact form) of the images in the
image database.

2. Online image retrieval:

The user submits a query example to the retrieval system to search for desired images.
The system represents this example with a feature vector, and the distances (i.e.,
similarities) between the feature vector of the query example and those of the images in
the feature database are then computed and ranked. Retrieval is conducted by applying an
indexing scheme to provide an efficient way of searching the image database. Finally, the
system ranks the search results and returns those that are most similar to the query
example.

Similarity matching:
This component performs the comparison itself: it measures the distance between the
query's feature vector and each stored feature vector according to the chosen similarity
measure, and the final result is produced from this ranking.
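The online matching step can be sketched as follows: the distance between the query's feature vector and each stored feature vector is computed (Euclidean distance here, one common choice), and image identifiers are returned ranked by similarity. The feature database and file names below are illustrative assumptions.

```python
def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

def rank_images(query_vector, feature_db, top_k=2):
    """Return image ids ordered from most to least similar to the query."""
    ranked = sorted(feature_db,
                    key=lambda name: euclidean(query_vector, feature_db[name]))
    return ranked[:top_k]

# Pretend feature database: one short feature vector per stored image.
feature_db = {
    "sunset.jpg":   [0.9, 0.1, 0.0],
    "forest.jpg":   [0.1, 0.8, 0.1],
    "mountain.jpg": [0.3, 0.4, 0.3],
}
print(rank_images([0.8, 0.2, 0.0], feature_db))  # -> ['sunset.jpg', 'mountain.jpg']
```

In a real system the full linear scan shown here is replaced by the indexing scheme mentioned above, so that not every feature vector must be compared against the query.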

Application area of CBIR

• Search for one specific image.
• Crime prevention.
• Architectural and engineering design.
• Art collections.
• Intellectual property.
• Photograph archives.
• Retail catalogs.
• Military.
• Medical diagnosis.
• Face finding.
