Optical Character Recognition: Chuan-Kai Yang

Optical Character Recognition
Chuan-kai Yang
Outline

OCR Systems
Historical Perspective
Commercial Applications
OCR Systems
Image scanner
OCR software/hardware
Output interface
Image Scanner

Detector
Illumination source
Scan lens
Document transport
OCR Software/Hardware

Document analysis
Character recognition
Contextual processing
Document Analysis

Character segmentation/isolation
Compensate for poor scanning quality
Image enhancement
Underline removal
Noise removal
Character Recognition
Feature extractor
Determines the descriptors, or feature set
Classifier
The derived feature set is fed into the classifier
Template matching (matrix matching)
Structural classification
Bayesian classifier
Artificial neural networks

Template Matching 1/2

One of the most common methods
Individual image pixels are used as features
Classification is performed by comparing an input character image with a set of templates (prototypes)
Each comparison is made pixel by pixel and yields a similarity measure between the input character and the template
The input character is assigned the identity of the most similar template
Template Matching 2/2
Template matching is a trainable process because the template characters can be changed
In many commercial systems, PROMs (programmable read-only memories) store templates for individual fonts
If a suitable PROM exists for a font, the system can be trained to recognize that font
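
The pixel-by-pixel comparison described above can be illustrated with a minimal sketch. The 3x3 bitmaps, the `TEMPLATES` dictionary, and the agreement-fraction similarity measure are assumptions made for illustration, not taken from any particular OCR product.

```python
# Hypothetical 3x3 binary templates (1 = ink, 0 = background).
# Real systems would store much larger bitmaps for each font.
TEMPLATES = {
    "I": [[0, 1, 0],
          [0, 1, 0],
          [0, 1, 0]],
    "L": [[1, 0, 0],
          [1, 0, 0],
          [1, 1, 1]],
    "T": [[1, 1, 1],
          [0, 1, 0],
          [0, 1, 0]],
}


def similarity(image, template):
    """Fraction of pixels that agree between two equally sized bitmaps."""
    total = 0
    matches = 0
    for img_row, tpl_row in zip(image, template):
        for a, b in zip(img_row, tpl_row):
            total += 1
            matches += (a == b)
    return matches / total


def classify(image, templates=TEMPLATES):
    """Return (label, score) of the most similar template."""
    best = max(templates, key=lambda label: similarity(image, templates[label]))
    return best, similarity(image, templates[best])


# A "T" with one flipped pixel is still closest to the "T" template.
noisy_t = [[1, 1, 1],
           [0, 1, 0],
           [0, 1, 1]]
label, score = classify(noisy_t)
```

In this sketch, "training" amounts to adding or replacing entries in `TEMPLATES`, which mirrors the idea of swapping in the templates for a new font.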
Structural Classification 1/2
Utilizes structural features and decision rules to classify characters
Features may be defined in terms of character strokes, character holes, or other character attributes such as concavities
For instance, the letter P may be described as a vertical stroke with a hole attached on the upper right side
Structural Classification 2/2
For an input character image, the structural features are extracted and a rule-based system is applied to classify the character
Structural methods are also trainable
Constructing a good feature set and a good rule base can be time-consuming
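
As a rough illustration of structural classification, the sketch below extracts two features (a full-height left stroke, and enclosed holes) and applies a few hand-written rules, including the "P" description given above. The feature names, the rule set, and the 5x7 bitmaps are all hypothetical; a practical system would need far richer features and rules.

```python
def find_holes(image):
    """Return the enclosed background regions (holes) of a binary bitmap."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]

    def flood(r, c):
        # Iterative 4-connected flood fill over unvisited background pixels.
        region, stack = [], [(r, c)]
        while stack:
            y, x = stack.pop()
            if (0 <= y < rows and 0 <= x < cols
                    and not seen[y][x] and image[y][x] == 0):
                seen[y][x] = True
                region.append((y, x))
                stack.extend([(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)])
        return region

    # Background reachable from the border lies outside the character.
    for r in range(rows):
        flood(r, 0)
        flood(r, cols - 1)
    for c in range(cols):
        flood(0, c)
        flood(rows - 1, c)

    # Any background still unvisited is enclosed by ink: a hole.
    holes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] == 0 and not seen[r][c]:
                holes.append(flood(r, c))
    return holes


def extract_features(image):
    holes = find_holes(image)
    mid = len(image) / 2
    return {
        "left_stroke": all(row[0] == 1 for row in image),
        "num_holes": len(holes),
        "holes_in_upper_half": all(y < mid for hole in holes for (y, _) in hole),
    }


def classify(image):
    """Apply hand-written decision rules to the structural features."""
    f = extract_features(image)
    if f["left_stroke"] and f["num_holes"] == 2:
        return "B"
    if f["left_stroke"] and f["num_holes"] == 1 and f["holes_in_upper_half"]:
        return "P"  # vertical stroke with a hole in the upper part
    if f["left_stroke"] and f["num_holes"] == 1:
        return "D"
    return "?"


LETTER_P = [[1, 1, 1, 1, 0],
            [1, 0, 0, 0, 1],
            [1, 0, 0, 0, 1],
            [1, 1, 1, 1, 0],
            [1, 0, 0, 0, 0],
            [1, 0, 0, 0, 0],
            [1, 0, 0, 0, 0]]
LETTER_B = [[1, 1, 1, 1, 0],
            [1, 0, 0, 0, 1],
            [1, 0, 0, 0, 1],
            [1, 1, 1, 1, 0],
            [1, 0, 0, 0, 1],
            [1, 0, 0, 0, 1],
            [1, 1, 1, 1, 0]]
```

Because the rules look at strokes and holes rather than exact pixels, the same rule set can in principle cover several fonts, at the cost of designing the features and rules by hand.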
Other methods
Discriminant function classifiers use hypersurfaces to separate the feature descriptions of characters
Bayesian methods seek to minimize the loss function associated with misclassification through the use of probability theory
ANNs, which are closer to human perception, employ mathematical minimization techniques
These techniques are used in commercial OCR systems
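
To make the Bayesian idea of minimizing the loss associated with misclassification concrete, here is a small sketch of a minimum-expected-loss decision. The posterior probabilities and loss tables are invented for illustration; in practice the posteriors would come from a probabilistic model of the extracted features.

```python
def bayes_decide(posteriors, loss):
    """Pick the decision with the smallest expected loss.

    posteriors: {true_class: P(true_class | features)}
    loss: {(decision, true_class): cost of choosing `decision`
           when the character is really `true_class`}
    """
    def expected_loss(decision):
        return sum(loss[(decision, truth)] * p for truth, p in posteriors.items())

    decisions = sorted({decision for decision, _ in loss})
    return min(decisions, key=expected_loss)


# Hypothetical posteriors for an ambiguous glyph: digit zero vs. letter O.
posteriors = {"0": 0.6, "O": 0.4}

# Zero-one loss: every error costs the same, so the most probable class wins.
zero_one = {("0", "0"): 0, ("0", "O"): 1,
            ("O", "0"): 1, ("O", "O"): 0}

# Asymmetric loss (assumed): reading a letter O as a digit is ten times worse.
asymmetric = {("0", "0"): 0, ("0", "O"): 10,
              ("O", "0"): 1, ("O", "O"): 0}

choice_equal_costs = bayes_decide(posteriors, zero_one)
choice_unequal_costs = bayes_decide(posteriors, asymmetric)
```

With equal costs the decision follows the larger posterior; with the asymmetric table the expected loss of answering "0" is 10 x 0.4 = 4.0 versus 1 x 0.6 = 0.6 for "O", so the less probable class is chosen.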
Recognition Rate
For machine-printed characters, the rate can reach over 99%
For handwritten characters, the rate is typically lower
Contextual Processing
The number of word choices for a given field can be limited by the content of another field; for example, knowing the ZIP code helps determine the address
Post-processing to correct recognition errors: spelling checker, interactive verification by the user

Non-Roman Character Recognition
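
The two contextual-processing ideas above (one field limiting the choices for another, and spell-check style post-processing) can be combined in a short sketch. The `CITIES_BY_ZIP_PREFIX` table and the `correct_city` helper are hypothetical, and the standard-library `difflib.get_close_matches` stands in for whatever matching a real system would use.

```python
import difflib

# Hypothetical lookup table: ZIP-code prefix -> city names allowed for it.
CITIES_BY_ZIP_PREFIX = {
    "021": ["BOSTON", "CAMBRIDGE"],
    "787": ["AUSTIN"],
}


def correct_city(ocr_text, zip_code, cutoff=0.6):
    """Snap an OCR'd city name to the closest name allowed by the ZIP code.

    Returns (text, accepted). When accepted is False, no candidate was
    similar enough and the field should be verified by the user.
    """
    candidates = CITIES_BY_ZIP_PREFIX.get(zip_code[:3])
    if candidates is None:
        # Unknown prefix: fall back to every known city name.
        candidates = [c for cities in CITIES_BY_ZIP_PREFIX.values() for c in cities]
    matches = difflib.get_close_matches(ocr_text.upper(), candidates,
                                        n=1, cutoff=cutoff)
    if matches:
        return matches[0], True
    return ocr_text, False


# The letter O was misread as the digit 0 in two places.
fixed = correct_city("B0ST0N", "02134")
unresolved = correct_city("XQZ", "78701")
```

The `cutoff` value trades silent corrections against fields passed on for interactive verification; 0.6 is simply the `difflib` default, not a recommended setting.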
Output Interface
The output interface allows character recognition results to be electronically transferred into the domain that uses the results:

Spreadsheets
Databases
Word processors
Historical Perspective
OCR was born in 1951 with GISMO, a robot reader-writer by M. Sheppard
In 1954, J. Rainbow developed a prototype machine that was able to read uppercase typewritten output at the fantastic speed of one character per minute
Systems that cost one million dollars were not uncommon
Some ANSI Standard Fonts

[Figure: font samples, two machine-printed and one handwritten]
Today's OCR Systems
It is not uncommon to find PC-based OCR systems for under $800 that can recognize several hundred characters per minute
Some systems advertise themselves as omnifont-capable
Commercial Applications
Task-Specific Readers

Assigning ZIP codes to letter mail
Reading data entered in forms, e.g. tax forms
Automatic accounting procedures used in processing utility bills
Verification of account numbers and courtesy amounts on bank checks
Automatic accounting of airline passenger tickets
Automatic validation of passports
Address Readers
Up to 400 fonts, and up to 45000 mail pieces per hour.
Form Readers

Trained with a blank form
Scan the regions that should be filled with data
Some systems can process forms at a rate of 5800 forms per hour
Check Readers

Capture the check image
Cross-reference the amounts specified in both places on the check
An operator can correct misclassified characters by cross-validating the recognition results
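
The cross-referencing step can be sketched as follows, assuming the two places are the numeric courtesy amount and an amount written out in words, and assuming whole-dollar amounts below one million. The helper names `words_to_amount` and `cross_check` are hypothetical.

```python
UNITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19,
}
TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}


def words_to_amount(text):
    """Convert a written-out whole-dollar amount to an integer."""
    total = 0    # value of completed "thousand" groups
    current = 0  # value of the group currently being read
    for word in text.lower().replace("-", " ").split():
        if word == "and":
            continue
        if word in UNITS:
            current += UNITS[word]
        elif word in TENS:
            current += TENS[word]
        elif word == "hundred":
            current *= 100
        elif word == "thousand":
            total += current * 1000
            current = 0
        else:
            raise ValueError(f"unrecognized word: {word}")
    return total + current


def cross_check(courtesy_digits, written_words):
    """Return True when both recognized amounts agree.

    A False result means the check should be routed to an operator.
    """
    try:
        return int(courtesy_digits) == words_to_amount(written_words)
    except ValueError:
        # Either field contains something that cannot be parsed.
        return False


written = "One thousand two hundred thirty-four"
agrees = cross_check("1234", written)
digit_misread = cross_check("1284", written)  # a 3 recognized as an 8
```

Agreement between two independently recognized fields gives some confidence in both, while a mismatch tells the operator which check to inspect.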
Bill Processing Systems
Focus on certain regions of a document where the expected information is located:

Account number
Payment value
Airline Ticket Readers
Scan/Match

Reservation record
Travel agent record
Passenger ticket
Some systems can scan up to 260000 tickets per day (17 tickets per second)
Passport Readers
Reads the traveler's:

Name
Date of birth
Passport number
Matches these against database records containing information on fugitive felons and smugglers
General Purpose Page Readers
High-end:
Higher data throughput and more advanced capabilities
Can adapt the recognition engine to customer data to improve accuracy
Can even detect typeface style (boldface and italic)
Low-end:
Mostly used in offices with desktop workstations
Can handle a broad range of documents, at a lower rate and accuracy
