Online Handwritten Script Recognition


ONLINE HANDWRITTEN SCRIPT RECOGNITION

Presentation By:
Priya Ahuja
CSE 6C
10-CSU-110

CONTENTS
Online Recognition
Why Handwriting Recognition?
Why is Handwriting Recognition difficult?
Properties of Scripts
Features of Handwritten Script
Steps in Handwritten Script Recognition
Future Scope
References

Online Recognition
On-line handwriting recognition involves the automatic conversion of
text as it is written on a special digitizer or PDA, where a sensor picks
up the pen-tip movements as well as pen-up/pen-down switching. That
kind of data is known as digital ink and can be regarded as a dynamic
representation of handwriting. The obtained signal is converted into
letter codes which are usable within computer and text-processing
applications.
The elements of an on-line handwriting recognition interface typically
include:
1) a pen or stylus for the user to write with.
2) a touch sensitive surface, which may be integrated with, or adjacent
to, an output display.
3) a software application which interprets the movements of the stylus
across the writing surface, translating the resulting strokes into digital
text.

Devices that accept on-line handwritten data: From the top left, Pocket
PC, CrossPad, Ink Link, Cell Phone, Smart Board, Tablet with display,
Anoto pen, Wacom Tablet, Tablet PC

Why Handwriting Recognition?


Online documents may be written in different languages and scripts. A
single document page may itself contain text written in multiple scripts.
A script is defined as a graphic form of a writing system. Different scripts
may follow the same writing system. For example, the alphabetic system is
adopted by scripts like Roman and Greek, and the phonetic-alphabetic
system is adopted by most Indian scripts, including Devnagari. A specific
script like Roman may be used by multiple languages such as English,
German, and French.
The general class of Han-based scripts includes Chinese, Japanese, and
Korean (we do not consider Kana or Han-Gul).
Devnagari script is used by many Indian languages, including Hindi,
Sanskrit, Marathi, and Rajasthani.
Arabic script is used by Arabic, Farsi, Urdu, etc.
Roman script is used by many European languages like English, German,
French, and Italian.

The most important characteristic of online documents is that they
capture the temporal sequence of strokes while writing the document.
We use stroke properties as well as the spatial and temporal information
of a collection of strokes to identify the script used in the document.

Why is Handwriting Recognition Difficult?
High variability of individual characters:
Writing style
Stroke width and quality
Size of the writing
Variation even for a single writer!
Reliable segmentation of cursive script is extremely problematic due to
merging of adjacent characters.

Properties of scripts
Arabic: Arabic is written from right to left within a line and the lines
are written from top to bottom. A typical Arabic character contains a
relatively long main stroke which is drawn from right to left, along with
one to three dots. The character set contains three long vowels.
Short markings (diacritics) may be added to the main character to
indicate short vowels. Due to these diacritical marks and the dots in
the script, the length of the strokes varies considerably.
Cyrillic: Cyrillic script looks very similar to cursive Roman script. The
most distinctive features of Cyrillic script, compared to Roman script, are:
1) individual characters, connected together in a word, form one long
stroke, and
2) the absence of delayed strokes. Delayed strokes cause movement of the
pen in the direction opposite to the regular writing direction.

The word trait contains three delayed strokes, shown as bold dotted
lines here.

Devnagari: The most important characteristic of Devnagari script is the
horizontal line present at the top of each word, called the Shirorekha.
These lines are usually drawn after the word is written and hence are
similar to delayed strokes in Roman script. The words are written from
left to right in a line.
The word devnagari written in Devnagari script. The Shirorekha is shown
in bold.

Han: Characters of Han script are composed of multiple short strokes. The
strokes are usually drawn from top to bottom and left to right within a character.
The direction of writing of words in a line is either left to right or top to bottom.

Hebrew: Words in a line of Hebrew script are written from right to left and,
hence, the script is temporally similar to Arabic. The most distinguishing factor
of Hebrew from Arabic is that the strokes are more uniform in length in the
former.

Roman: Roman script has the same writing direction as Cyrillic, Devnagari,
and Han scripts. In addition, the length of the strokes tends to fall between that
of Devnagari and Cyrillic scripts.

Features of Handwritten Script

Horizontal Interstroke Direction (HID): This is the sum of the
horizontal directions between the starting points of consecutive strokes
in the pattern. The feature essentially captures the writing direction
within a line:

HID = Σ_{i=r+1}^{n} sgn( Xstart(s_i) - Xstart(s_{i-r}) )

where Xstart(.) denotes the x-coordinate of the pen-down position of the
stroke, n is the number of strokes in the pattern, and r is set to 3 to
reduce errors due to abrupt changes in direction between successive
strokes.
The value of HID falls in the range [r - n, n - r].
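A minimal sketch of HID, assuming strokes are lists of (x, y) sample points and sgn(.) is the sign function:

```python
def hid(strokes, r=3):
    """Horizontal Interstroke Direction: sum of signed horizontal
    directions between the start points of strokes i and i - r.
    `strokes` is a list of point lists [(x, y), ...]."""
    def sgn(v):
        return (v > 0) - (v < 0)
    n = len(strokes)
    x_start = [s[0][0] for s in strokes]   # pen-down x of each stroke
    return sum(sgn(x_start[i] - x_start[i - r]) for i in range(r, n))
```

For n strokes this sums n - r terms, each in {-1, 0, +1}, which gives the stated range [r - n, n - r].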

Average Stroke Length (ASL): Each stroke is resampled during
preprocessing so that the sample points are equidistant. Hence, the
number of sample points in a stroke is used as a measure of its length.
The Average Stroke Length is defined as the average length of the
individual strokes in the pattern:

ASL = (1/n) Σ_{i=1}^{n} length(s_i)

where n is the number of strokes in the pattern.
The value of ASL is a real number which falls in the range [1.0, R0],
where the value of R0 depends on the resampling distance used during
preprocessing.
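With the same stroke representation, ASL is just the mean sample count per stroke:

```python
def asl(strokes):
    """Average Stroke Length: mean number of (equidistant) sample
    points per stroke in the pattern."""
    return sum(len(s) for s in strokes) / len(strokes)
```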
Shirorekha Strength: This feature measures the strength of the
horizontal line component in the pattern using the Hough transform. The
value of this feature is computed as:

SS = Σ_{θ=-10°}^{10°} Σ_r H(r, θ) / Σ_θ Σ_r H(r, θ)

where H(r, θ) denotes the number of votes in the (r, θ)-th bin in the
two-dimensional Hough transform space. The Hough transform can be
computed efficiently for dynamic data by considering only the sample
points. The numerator is the sum of the bins corresponding to line
orientations between -10° and 10°, and the denominator is the sum of
all the bins in the Hough transform space. The value of Shirorekha
Strength is a real number which falls in the range [0.0, 1.0].
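A rough sketch of a point-based Hough accumulator for this feature. One implementation assumption: since each sample point votes once in every orientation column, summing every (r, θ) bin would make the ratio a constant, so this sketch uses the peak vote count per orientation instead; the bin resolutions are also made-up parameters, not values from the source.

```python
import math
from collections import Counter

def shirorekha_strength(points, r_res=2.0, theta_bins=36):
    """Strength of the horizontal-line component via a point Hough
    transform. Orientation 0 is a horizontal line; a line with
    orientation phi through (x, y) has offset r = -x*sin(phi) + y*cos(phi)."""
    votes = Counter()
    for x, y in points:
        for b in range(theta_bins):
            phi = math.pi * (b / theta_bins - 0.5)   # orientation in [-90, 90)
            r = -x * math.sin(phi) + y * math.cos(phi)
            votes[(round(r / r_res), b)] += 1
    peak = {}                                        # peak votes per orientation
    for (rb, b), v in votes.items():
        peak[b] = max(peak.get(b, 0), v)
    horiz = [b for b in range(theta_bins)
             if abs(math.degrees(math.pi * (b / theta_bins - 0.5))) <= 10]
    return sum(peak[b] for b in horiz) / sum(peak.values())
```

A long horizontal run of points should score noticeably higher than a diagonal one.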

Shirorekha Confidence: We compute a confidence measure for a stroke
being a Shirorekha. Each stroke in the pattern is inspected for three
different properties of a Shirorekha: Shirorekhas span the width of a
word, always occur at the top of the word, and are horizontal. The
confidence C(s) of a stroke s is computed from these three properties.

Stroke Density: This is the number of strokes, n, per unit length (along
the x-axis) of the pattern. Note that the Han script is written using
short strokes, while Roman and Cyrillic are written using longer strokes.
The value of Stroke Density is a real number and can vary within the
range (0.0, R1), where R1 is a positive real number.
Aspect Ratio: This is the ratio of the width to the height of a pattern.
The value of Aspect Ratio is a real number and can vary within the range
(0.0, R2), where R2 is a positive real number.
Reverse Distance: This is the distance by which the pen moves in the
direction opposite to the normal writing direction. The normal writing
direction is different for different scripts. The value of Reverse
Distance is a nonnegative integer and its observed values were in the
range [0, 1200].
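The three features above can be sketched directly from the stroke points (strokes as lists of (x, y) samples; left-to-right is taken as the normal writing direction by default):

```python
def stroke_density(strokes):
    """Strokes per unit length along the x-axis of the pattern."""
    xs = [x for s in strokes for x, _ in s]
    return len(strokes) / (max(xs) - min(xs))

def aspect_ratio(strokes):
    """Width-to-height ratio of the pattern's bounding box."""
    xs = [x for s in strokes for x, _ in s]
    ys = [y for s in strokes for _, y in s]
    return (max(xs) - min(xs)) / (max(ys) - min(ys))

def reverse_distance(strokes, writing_dir=+1):
    """Total distance moved opposite to the normal writing direction
    (+1 = left-to-right), summed over consecutive sample points."""
    d = 0
    for s in strokes:
        for (x0, _), (x1, _) in zip(s, s[1:]):
            step = (x1 - x0) * writing_dir
            if step < 0:
                d += -step
    return d
```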

Average Horizontal Stroke Direction: The Horizontal Stroke Direction
(HD) of a stroke, s, can be understood as the horizontal direction from
the start of the stroke to its end. Formally, we define HD(s) as:

HD(s) = sgn( Xpen-up(s) - Xpen-down(s) )

where Xpen-down(.) and Xpen-up(.) are the x-coordinates of the pen-down
and pen-up positions, respectively.
For an n-stroke pattern, the Average Horizontal Stroke Direction is
computed as the average of the HD values of its component strokes. The
value of Average Horizontal Stroke Direction falls in the range [-1.0, 1.0].
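A sketch of the average horizontal direction; the vertical analogue defined next simply swaps the y-coordinate for the x-coordinate:

```python
def avg_horizontal_direction(strokes):
    """Average of sgn(x_pen_up - x_pen_down) over all strokes,
    a value in [-1, 1]; strokes are lists of (x, y) samples."""
    def sgn(v):
        return (v > 0) - (v < 0)
    return sum(sgn(s[-1][0] - s[0][0]) for s in strokes) / len(strokes)
```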

Average Vertical Stroke Direction: It is defined similarly to the
Average Horizontal Stroke Direction. The Vertical Direction (VD) of a
single stroke s is defined as:

VD(s) = sgn( Ypen-up(s) - Ypen-down(s) )

where Ypen-down(.) and Ypen-up(.) are the y-coordinates of the pen-down
and pen-up positions, respectively. For an n-stroke pattern, the Average
Vertical Stroke Direction is computed as the average of the VD values of
its component strokes. The value of Average Vertical Stroke Direction
falls in the range [-1.0, 1.0].

Vertical Interstroke Direction (VID): The Vertical Interstroke Direction
is defined as:

VID = Σ_{i=2}^{n} sgn( Ȳ(s_i) - Ȳ(s_{i-1}) )

where Ȳ(s) is the average of the y-coordinates of the stroke points and
n is the number of strokes in the pattern. The value of VID is an
integer and falls in the range [1 - n, n - 1].
Variance of Stroke Length: This is the variance in sample lengths of
individual strokes within a pattern. The value of Variance of Stroke
Length is nonnegative.
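Both remaining features can be sketched from the same stroke representation (lists of (x, y) samples):

```python
def vid(strokes):
    """Vertical Interstroke Direction: sum of sgn(Ybar(s_i) - Ybar(s_{i-1})),
    where Ybar is the mean y-coordinate of a stroke's sample points."""
    def sgn(v):
        return (v > 0) - (v < 0)
    ybar = [sum(y for _, y in s) / len(s) for s in strokes]
    return sum(sgn(b - a) for a, b in zip(ybar, ybar[1:]))

def stroke_length_variance(strokes):
    """Variance of the per-stroke sample counts within a pattern."""
    lens = [len(s) for s in strokes]
    m = sum(lens) / len(lens)
    return sum((l - m) ** 2 for l in lens) / len(lens)
```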

STEPS IN HANDWRITTEN SCRIPT RECOGNITION

1. Preprocessing: Goal is to remove unwanted variation.
Common methods - skew / slant / size normalization:
Trajectory data mapped to a 2D representation
Baselines / core area estimated as in the offline case
Special online methods:
Outlier elimination: remove position measurements caused by interference
Resampling and smoothing of the trajectory
Elimination of delayed strokes

Resampling and smoothing of the trajectory:
Goal: Normalize variations in writing speed (no identification!)
Equidistant resampling & interpolation.
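The equidistant resampling step can be sketched with linear interpolation along the trajectory; the sampling distance is a free parameter:

```python
import math

def resample_equidistant(points, step=2.0):
    """Resample a stroke so consecutive samples are `step` apart
    along the trajectory (linear interpolation); this normalizes
    away variations in writing speed."""
    out = [points[0]]
    residual = 0.0                 # arc length already covered toward next sample
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        if seg == 0:
            continue
        d = step - residual
        while d <= seg:
            t = d / seg
            out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            d += step
        residual = seg - (d - step)
    return out
```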

Elimination of delayed strokes:
Handling of delayed strokes is problematic: additional time variability!
Removed by heuristic rules.

2. Feature Extraction
Basic idea: Describe the shape of the pen trajectory locally.
Typical features:
Slope angle of the local trajectory (represented as sin and cos for
continuous variation)
Binary pen-up vs. pen-down feature
Hat feature for describing delayed strokes (strokes that spatially
correspond to removed delayed strokes are marked)
Feature dynamics: In all applications of HMMs, dynamic features greatly
enhance performance.
Discrete time derivative of features
Here: differences between successive slope angles

3. CLASSIFICATION
The last big step is classification. In this step, various models are
used to map the extracted features to different classes and thus
identify the characters or words the features represent.

Andrei Andreyevich Markov
Born: 14 June 1856 in Ryazan, Russia
Died: 20 July 1922 in Petrograd (now St. Petersburg), Russia
Markov is particularly remembered for his study of Markov chains:
sequences of random variables in which the future variable is determined
by the present variable but is independent of the way in which the
present state arose from its predecessors. This work launched the theory
of stochastic processes.

Markov random processes


A random sequence has the Markov property
if its distribution is determined solely by its
current state. Any random process having this
property is called a Markov random process.
For observable state sequences (state is
known from data), this leads to a Markov
chain model.
For non-observable states, this leads to a
Hidden Markov Model (HMM).

Chain Rule & Markov Property

Bayes rule:
P(qt, qt-1, ..., q1) = P(qt | qt-1, ..., q1) · P(qt-1, ..., q1)
                     = P(qt | qt-1, ..., q1) · P(qt-1 | qt-2, ..., q1) · P(qt-2, ..., q1)
and, applying the chain rule repeatedly:
P(qt, qt-1, ..., q1) = P(q1) · Π_{i=2}^{t} P(qi | qi-1, ..., q1)

Markov property:
P(qi | qi-1, ..., q1) = P(qi | qi-1) for i > 1
so that
P(qt, qt-1, ..., q1) = P(q1) · Π_{i=2}^{t} P(qi | qi-1)
                     = P(q1) · P(q2 | q1) · ... · P(qt | qt-1)
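The Markov factorization can be checked numerically on a made-up 2-state chain: defining the joint by the product of one-step conditionals, the probabilities of all length-3 sequences must sum to 1.

```python
# Made-up 2-state chain: initial distribution and transition matrix.
pi = [0.6, 0.4]
A = [[0.7, 0.3],
     [0.2, 0.8]]

def joint(seq):
    """P(q1, ..., qt) via the Markov factorization P(q1) * prod P(qi|qi-1)."""
    p = pi[seq[0]]
    for a, b in zip(seq, seq[1:]):
        p *= A[a][b]
    return p

# Total probability over all length-3 state sequences.
total = sum(joint((i, j, k)) for i in (0, 1) for j in (0, 1) for k in (0, 1))
```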

A Markov System

Has N states, called s1, s2, ..., sN.
There are discrete timesteps, t = 0, 1, 2, ...
On the t-th timestep the system is in exactly one of the available
states; call it qt, the current state.
Note: qt ∈ {s1, s2, ..., sN}.
Between each timestep, the next state is chosen randomly; the current
state determines the probability distribution for the next state.

Example (N = 3, t = 1, qt = q1 = s2):
P(qt+1=s1 | qt=s1) = 0    P(qt+1=s1 | qt=s2) = 1/2    P(qt+1=s1 | qt=s3) = 1/3
P(qt+1=s2 | qt=s1) = 0    P(qt+1=s2 | qt=s2) = 1/2    P(qt+1=s2 | qt=s3) = 2/3
P(qt+1=s3 | qt=s1) = 1    P(qt+1=s3 | qt=s2) = 0      P(qt+1=s3 | qt=s3) = 0
These transition probabilities are often notated with arcs between the
states in a directed graph.

Markov Property

qt+1 is conditionally independent of {qt-1, qt-2, ..., q1, q0} given qt.
In other words:
P(qt+1 = sj | qt = si) = P(qt+1 = sj | qt = si, any earlier history)
Notation:
aij = P(qt+1 = sj | qt = si)
πi = P(q1 = si)
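A minimal simulation of the example system, using the transition probabilities above (rows index the current state s1..s3 as 0..2):

```python
import random

# Transition matrix of the 3-state example; rows are the current state.
A = [[0.0, 0.0, 1.0],
     [0.5, 0.5, 0.0],
     [1 / 3, 2 / 3, 0.0]]

def simulate(q0, steps, seed=0):
    """Run the chain for `steps` transitions from state q0 and
    return the state sequence (length steps + 1)."""
    rng = random.Random(seed)
    seq = [q0]
    for _ in range(steps):
        seq.append(rng.choices((0, 1, 2), weights=A[seq[-1]])[0])
    return seq
```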

Example: A Simple Markov Model for Weather Prediction
On any given day, the weather can be described as being in one of three
states:
State 1: precipitation (rain, snow, hail, etc.)
State 2: cloudy
State 3: sunny
Transitions between states are described by the transition matrix, and
the model can then be described by a directed graph with the states as
nodes and the transition probabilities as arc labels.

Basic Calculations - 1
Example: What is the probability that the weather for eight consecutive
days is sun-sun-sun-rain-rain-sun-cloudy-sun?
Solution:
O = (sun, sun, sun, rain, rain, sun, cloudy, sun) = (3, 3, 3, 1, 1, 3, 2, 3)
P(O | model) = P(q1 = 3) · a33 · a33 · a31 · a11 · a13 · a32 · a23
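The transition matrix itself is not reproduced here; as a hedged illustration, the sketch below uses the values from Rabiner's well-known weather example (an assumption about the intended matrix), under which the stated eight-day sequence has probability 1.536e-4.

```python
# Rows/cols: 0 = rain, 1 = cloudy, 2 = sunny. These values come from
# Rabiner's classic weather example; the original slide's matrix is
# assumed, not reproduced.
A = [[0.4, 0.3, 0.3],
     [0.2, 0.6, 0.2],
     [0.1, 0.1, 0.8]]

def sequence_prob(states, A):
    """P(O | model), conditioning on the first observed state."""
    p = 1.0
    for a, b in zip(states, states[1:]):
        p *= A[a][b]
    return p

O = [2, 2, 2, 0, 0, 2, 1, 2]   # sun sun sun rain rain sun cloudy sun
```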

From Markov to Hidden Markov
The previous model assumes that each state can be uniquely associated
with an observable event:
Once an observation is made, the state of the system is trivially
retrieved.
This model, however, is too restrictive to be of practical use for most
realistic problems.
To make the model more flexible, we will assume that the outcomes or
observations of the model are a probabilistic function of each state:
Each state can produce a number of outputs according to a unique
probability distribution, and each distinct output can potentially be
generated at any state.
These are known as Hidden Markov Models (HMMs), because the state
sequence is not directly observable; it can only be approximated from
the sequence of observations produced by the system.

HMM Formal Definition
An HMM, λ, is a 5-tuple consisting of:
N, the number of states.
M, the number of possible observations.
{π1, π2, ..., πN}, the starting state probabilities: P(q0 = Si) = πi.
A = {aij}, the state transition probabilities: P(qt+1 = Sj | qt = Si) = aij.
B = {bi(k)}, the observation probabilities: P(Ot = k | qt = Si) = bi(k).

The Coin-Toss Problem

To illustrate the concept of an HMM, consider the following scenario:
Assume that you are placed in a room with a curtain.
Behind the curtain there is a person performing a coin-toss experiment.
This person selects one of several coins and tosses it: heads (H) or
tails (T).
The person tells you the outcome (H, T), but not which coin was used
each time.
Your goal is to build a probabilistic model that best explains a
sequence of observations O = {o1, o2, o3, o4, ...} = {H, T, T, H, ...}.
The coins represent the states; these are hidden because you do not know
which coin was tossed each time.
The outcome of each toss represents an observation.
A likely sequence of coins may be inferred from the observations, but
this state sequence will not be unique.
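The scenario can be sketched as a tiny generative simulation of a two-coin HMM; the switching probabilities and coin biases below are made up for illustration.

```python
import random

def generate(n, seed=0):
    """Sample n tosses from a hypothetical 2-coin HMM: which coin is in
    hand follows a (hidden) Markov chain, and each coin has its own
    probability of heads. All parameter values are illustrative."""
    switch = [[0.8, 0.2], [0.3, 0.7]]   # P(next coin | current coin)
    p_heads = [0.9, 0.2]                # per-coin bias P(H)
    rng = random.Random(seed)
    coin, obs, states = 0, [], []
    for _ in range(n):
        states.append(coin)
        obs.append("H" if rng.random() < p_heads[coin] else "T")
        coin = rng.choices((0, 1), weights=switch[coin])[0]
    return obs, states
```

Only `obs` would be visible to the observer; `states` is the hidden coin sequence the model must infer.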

The Coin Toss Example - 1 coin
With a single coin, the Markov model is observable: the states are the
actual observations themselves (see figure).
The model parameter P(H) may be found from the ratio of heads and tails.
O = H H H T T H
S = 1 1 1 2 2 1

The Coin Toss Example - 2 coins
From Markov to Hidden Markov Model: with two coins and a hidden rule for
switching between them, the state (which coin is in hand) can no longer
be read directly off the observation.

The Coin Toss Example - 3 coins

1, 2, or 3 coins?
Which of these models is best?
Since the states are not observable, the best we can do is select the
model that best explains the data (e.g., by the Maximum Likelihood
criterion).
Whether the observation sequence is long and rich enough to warrant a
more complex model is a different story, though.

Future Scope
Over the past three decades, many different methods have been explored
by a large number of researchers to recognize handwritten characters. A
variety of approaches have been proposed and tested in different parts
of the world to improve recognition accuracy and usability.

References
ieeexplore.ieee.org/iel5/34/28182/01261096.pdf
F. Coulmas, Writing Systems: An Introduction to Their Linguistic
Analysis. Cambridge University Press, 2003.
R. Plamondon and S. N. Srihari, "On-line and off-line handwriting
recognition: A comprehensive survey," IEEE Transactions on Pattern
Analysis and Machine Intelligence, vol. 22, pp. 63-84, 2000.
A. L. Spitz, "Determination of the script and language content of
document images," IEEE Transactions on Pattern Analysis and Machine
Intelligence, vol. 19, pp. 235-245, 1997.
G. X. Tan, C. Viard-Gaudin, and A. Kot, "Automatic Writer Identification
Framework for Online Handwritten Documents Using Character Prototypes,"
Pattern Recognition, 2009.

THANK YOU!
