OCR & CBIR
Dr. Matheel
Done by: Ali Abid Husaen & Arwa Sabah
What is Optical Character Recognition (OCR)?
Methods of OCR
The main principle in automatic recognition of patterns is first to teach the machine
which classes of patterns may occur and what they look like. In OCR the patterns
are letters, numbers and some special symbols like commas, question marks etc., while
the different classes correspond to the different characters. The teaching of the machine
is performed by showing the machine examples of characters of all the different classes.
Based on these examples the machine builds a prototype or a description of each class
of characters. Then, during recognition, the unknown characters are compared to the
previously obtained descriptions and assigned the class that gives the best match.
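This teach-then-match principle can be sketched in a few lines of Python. The flat 0/1 bitmaps, the averaging rule for building prototypes, and the squared-distance match are illustrative assumptions, not details taken from the text:

```python
# Minimal sketch of prototype-based character recognition.
# Training averages the example bitmaps of each class into one prototype;
# recognition assigns the class whose prototype gives the best match.

def build_prototypes(examples):
    """examples: dict mapping class label -> list of equally sized bitmaps
    (flat lists of 0/1 pixels). Returns one averaged prototype per class."""
    prototypes = {}
    for label, bitmaps in examples.items():
        n = len(bitmaps)
        prototypes[label] = [sum(px) / n for px in zip(*bitmaps)]
    return prototypes

def classify(bitmap, prototypes):
    """Return the label whose prototype has the smallest squared distance."""
    def dist(proto):
        return sum((a - b) ** 2 for a, b in zip(bitmap, proto))
    return min(prototypes, key=lambda label: dist(prototypes[label]))
```

Real OCR systems use far richer class descriptions, but the structure — a training phase that builds descriptions and a recognition phase that picks the best match — is the same.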
The components of an OCR system: Optical scanning → Location and segmentation → Preprocessing → Feature extraction → Recognition → Post-processing.
1- Optical scanning.
Through the scanning process a digital image of the original document is captured. In
OCR optical scanners are used, which generally consist of a transport mechanism plus a
sensing device that converts light intensity into gray-levels. Printed documents usually
consist of black print on a white background. Hence, when performing OCR, it is common
practice to convert the multilevel image into a bilevel image of black and white. Often this
process, known as thresholding, is performed in the scanner to save memory space and
computational effort.
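Thresholding can be as simple as comparing each gray level against a fixed cut-off. A minimal sketch, where the threshold value 128 and the pixel convention (1 = black print, 0 = white background) are illustrative assumptions:

```python
# Sketch of fixed-threshold binarisation of a multilevel image.

def threshold(gray_image, t=128):
    """Convert a multilevel (0-255) image into a bilevel image:
    1 = black print, 0 = white background."""
    return [[1 if pixel < t else 0 for pixel in row] for row in gray_image]
```

In practice the threshold is often chosen adaptively from the image's gray-level histogram rather than fixed in advance.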
2- Location and segmentation.
3- Preprocessing
The image resulting from the scanning process may contain a certain amount of noise.
Depending on the resolution of the scanner and the success of the applied thresholding
technique, the characters may be smeared or broken. Some of these defects, which
may later cause poor recognition rates, can be eliminated by using a preprocessor to
smooth the digitized characters.
The smoothing implies both filling and thinning. Filling eliminates small breaks, gaps
and holes in the digitized characters, while thinning reduces the width of the line. The
most common techniques for smoothing move a window across the binary image of
the character, applying certain rules to the contents of the window.
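The window-based smoothing described above can be sketched with a 3x3 window. The specific rules below (fill a white pixel when most of its neighbours are black, delete isolated black specks) are illustrative assumptions, not the text's own rules:

```python
# Sketch of rule-based smoothing: slide a 3x3 window across a binary image
# (1 = black, 0 = white) and apply simple fill/clean rules to each pixel.

def smooth(binary, fill_at=6):
    h, w = len(binary), len(binary[0])
    out = [row[:] for row in binary]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            neighbours = sum(binary[y + dy][x + dx]
                             for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                             if (dy, dx) != (0, 0))
            if binary[y][x] == 0 and neighbours >= fill_at:
                out[y][x] = 1          # fill a small gap or hole
            elif binary[y][x] == 1 and neighbours == 0:
                out[y][x] = 0          # remove an isolated speck of noise
    return out
```

Thinning works the same way — a window moved over the image with rules deciding which black pixels may be deleted without breaking the stroke — but needs more careful rules to preserve connectivity.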
4- Feature extraction
The objective of feature extraction is to capture the essential characteristics of the
symbols, and it is generally accepted that this is one of the most difficult problems of
pattern recognition. The most straightforward way of describing a character is by the
actual raster image. Another approach is to extract certain features that still characterize
the symbols but leave out the unimportant attributes. The techniques for extraction of
such features are often divided into three main groups, where the features are found from:
1- The distribution of points.
2- Transformations and series expansions.
3- Structural analysis.
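A simple example of the first group (distribution of points) is zoning: the character bitmap is divided into a coarse grid and the density of black pixels in each zone forms the feature vector. The 2x2 grid size here is an illustrative choice:

```python
# Sketch of a "distribution of points" feature: zoning.
# The bitmap (1 = black, 0 = white) is split into zones x zones cells and
# the black-pixel density of each cell becomes one feature.

def zoning_features(bitmap, zones=2):
    h, w = len(bitmap), len(bitmap[0])
    zh, zw = h // zones, w // zones
    feats = []
    for zy in range(zones):
        for zx in range(zones):
            black = sum(bitmap[y][x]
                        for y in range(zy * zh, (zy + 1) * zh)
                        for x in range(zx * zw, (zx + 1) * zw))
            feats.append(black / (zh * zw))
    return feats
```

Such a vector is insensitive to small amounts of noise, unlike the raw raster image, which is why feature-based descriptions usually recognize better than direct template matching.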
Applications:
1- Aid for the blind.
In the early days, before digital computers and the need for input of large amounts of
data emerged, this was the imagined area of application for reading machines.
Combined with a speech synthesis system, such a reader would enable the blind to
understand printed documents. A problem has been the high cost of reading
machines, but this may be a growing area as the cost of microelectronics falls.
2- Automatic number-plate readers.
A few systems for automatic reading of the number plates of cars exist. As opposed to
other applications of OCR, the input image is not a natural bilevel image and must be
captured by a very fast camera. This creates special problems and difficulties, although
the character set is limited and the syntax restricted.
3- Signature verification.
This is an application especially useful in the banking environment. Such a system
establishes the identity of the writer without attempting to read the handwriting. The
signature is simply considered as a pattern which is matched against signatures stored
in a reference database.
Constrained handwriting:
Figure ( 6 ) : Instructions for OCR handwriting
The performance of an OCR system is usually described by three rates:
1- Recognition rate.
The proportion of correctly classified characters.
2- Rejection rate.
The proportion of characters which the system was unable to recognize. Rejected
characters can be flagged by the OCR system and are therefore easily retraceable for
manual correction.
3- Error rate.
The proportion of characters erroneously classified. Misclassified characters go
undetected by the system, and manual inspection of the recognized text is necessary
to detect and correct these errors.
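The three rates partition all processed characters, so they sum to one. A minimal sketch of the computation (the function name and the sample counts are illustrative):

```python
# Every character is either correctly recognised, rejected, or misclassified,
# so the three rates always sum to 1.

def ocr_rates(correct, rejected, errors):
    total = correct + rejected + errors
    return (correct / total,    # recognition rate
            rejected / total,   # rejection rate
            errors / total)     # error rate
```

There is a trade-off between the last two rates: tuning the system to reject more doubtful characters lowers the error rate but raises the rejection rate, and which balance is better depends on the relative cost of manual correction versus undetected errors.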
Content-based image retrieval (CBIR)
CBIR, also known as query by image content (QBIC) and content-based visual
information retrieval (CBVIR), is the application of computer vision techniques to the
image retrieval problem: the problem of searching a library of digital images according
to the visual content of the images. In other words, it is the retrieving of images that
have similar content of colours, textures or shapes. For example, you can pick a
landscape image of mountains and try to find similar scenes. CBIR uses the visual
contents of an image such as colour, shape, texture and spatial layout to represent and
index the image. CBIR systems work with whole images, and searching is based on
comparison with the query.
The current state of the content-based image retrieval:
The history of the content-based image retrieval can be divided into three phases:
1- Retrieval based on artificial notes (labelling images using text).
2- Retrieval based on the visual character of image contents (image feature extraction, low-level features).
3- Retrieval based on image semantic features (high-level features).
Some types of features used for image retrieval:
A- Shape retrieval
Shape does not refer to the shape of an image but to the shape of a particular region that is
being sought out. Shapes are often determined by:
1. Applying segmentation or edge detection to an image.
2. Other methods, such as using shape filters to identify given shapes in an image.
B- Colour retrieval
Computing distance measures based on colour similarity is achieved by computing a colour
histogram for each image that identifies the proportion of pixels within an image holding
specific values (that humans express as colours). The following colour feature extraction
methods are commonly used:
1. Colour space.
2. Colour Histogram.
3. Colour moment.
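A colour histogram can be computed by quantising each pixel into a small number of bins per channel and normalising the counts. The 2-bins-per-channel quantisation below is an illustrative assumption; real systems use finer grids or perceptual colour spaces:

```python
# Sketch of a colour histogram feature: each (r, g, b) pixel is quantised
# into bins**3 cells and the normalised counts form the feature vector.

def colour_histogram(pixels, bins=2):
    """pixels: list of (r, g, b) tuples with channel values 0-255."""
    hist = [0.0] * (bins ** 3)
    step = 256 // bins
    for r, g, b in pixels:
        index = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[index] += 1
    return [count / len(pixels) for count in hist]
```

Because the histogram discards pixel positions, two images with very different layouts but similar overall colouring will produce similar vectors — the usual strength and weakness of colour retrieval.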
C- Texture retrieval
Texture measures look for visual patterns in images and how they are spatially defined.
Textures are represented by texels which are then placed into a number of sets, depending on
how many textures are detected in the image. These sets not only define the texture, but also
where in the image the texture is located.
The texture feature description categories are explained below:
1. Statistical methods.
2. Model-based approaches.
3. Transform domain features.
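As an illustration of the statistical category, simple first-order statistics of a gray-level region already give a crude texture description; the smoothness measure below is a standard textbook quantity, chosen here as an example rather than taken from the text:

```python
# Sketch of first-order statistical texture measures for a gray-level region:
# mean, variance, and a normalised smoothness value.

def texture_stats(region):
    """region: 2D list of gray levels (0-255)."""
    values = [p for row in region for p in row]
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    smoothness = 1 - 1 / (1 + variance / 255 ** 2)  # 0 = flat, toward 1 = rough
    return mean, variance, smoothness
```

First-order statistics ignore how gray levels are arranged; second-order methods such as co-occurrence matrices capture the spatial relationships that actually distinguish many textures.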
Content-based image retrieval (CBIR) system
The retrieval system ranks the search results and then returns the results that are most
similar to the query examples.
Similarity matching:
This component performs the comparison: it compares the query's features with those of
the stored images according to the technique it uses, and the final ranked result is
produced by the similarity matcher.
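The similarity matcher can be sketched as ranking database images by the distance between feature vectors. Euclidean distance is one common choice; the text does not fix a particular measure:

```python
import math

# Sketch of a similarity matcher: compare the query's feature vector with
# every stored vector and return image ids ranked best-match-first.

def rank_by_similarity(query, database):
    """database: list of (image_id, feature_vector) pairs."""
    def distance(vec):
        return math.sqrt(sum((q - v) ** 2 for q, v in zip(query, vec)))
    return [image_id for image_id, vec
            in sorted(database, key=lambda item: distance(item[1]))]
```

The same structure works whichever features (colour histograms, shape or texture descriptors) are used — only the vectors and, possibly, the distance measure change.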
Applications of CBIR:
• Crime prevention
• Art collections
• Intellectual property
• Photograph archives
• Retail catalogs
• Military
• Medical diagnosis
• Face finding