Professional Documents
Culture Documents
Title: Spatial Cohesion Refers To The Fact That Text
Title: Spatial Cohesion Refers To The Fact That Text
ABSTRACT
Many Text Extraction methodologies have
been proposed, but none of them are
suitable to be part of a real system implemented on a device with low
computational resources, either because
their accuracy is insufficient, or because
their performance is too slow. In this sense,
we propose a Text Extraction algorithm for
the context of language translation of scene
text images which is fast and accurate at the
same time. The algorithm uses very
efficient computations to calculate the
Principal Color Components of a previously
quantized image, and decides which ones
are the main foreground-background colors,
after which it extracts the text in the image.
1. Introduction
Recent studies in the field of computer
vision and pattern recognition show a great
amount of interest in content retrieval from
images and videos. This content can be in
the form of objects, color, texture, shape as
well as the relationships between them. The
semantic information provided by an image
can be useful for content based image
retrieval, as well as for indexing and
classification purposes [4,10]. As stated by
Jung, Kim and Jain in [4], text data is
particularly interesting, because text can be
used to easily and clearly describe the
contents of an image. Since the text data can
be embedded in an image or video in
different font styles, sizes, orientations,
colors, and against a complex background,
the problem of extracting the candidate text
region becomes a challenging one [4]. Also,
current Optical Character Recognition
(OCR) techniques can only handle text
against a plain monochrome background and
cannot extract text from a complex or
Detection
This section corresponds to Steps 1 to 4 of
3.1.
Given an input image, the region with a
possibility of text in the image is detected . A
Gaussian pyramid is created by successively
filtering the input image with a Gaussian
kernel of size 3x3 and down- sampling the
image in each direction by half. Down
sampling refers to the process whereby an
image is resized to a lower resolution from
its original resolution. A Gaussian filter of
size 3x3 will be used. Each level in the
pyramid corresponds to the input image at a
different resolution. These images are next
convolved with directional filters at different
orientation kernels for edge detection in the
horizontal (00), vertical (900) and diagonal
(450 ,1350) directions.
RESULT
Conclusion
Theresultsobtainedbyeachalgorithmona
varied set of images were compared
withrespecttoprecisionandrecallrates.In
terms of scale variance, the connected
componentalgorithm is more robust as
compared to the edge based algorithm for
text regionextraction. In terms of lighting
variance also, the connected component
basedalgorithmismorerobustthantheedge
based algorithm. In terms of rotation or
orientation variance,the precision rate
obtainedbytheconnectedcomponentbased
algorithmishigherthantheedgebased,and
therecallrateobtainedbytheedgebasedis
higher than theconnected component based
Theaverageprecisionratesobtainedbyeach
algorithm forthe remaining test images are
similar, whereas the average recall rate
obtained by theconnected component
algorithm is a little lower than the edge
basedalgorithm.
REFERENCES