
OCR Guideline

General Requirements

OCR image annotation consists of four main operations: judging whether the picture is
valid, selecting the picture scene type, framing the text, and transcribing the text. Frame
all target-language text in the picture, one frame per line.

Step 1. Valid or Invalid

Before annotating, judge whether the picture is valid:

If invalid, mark the image as invalid; nothing more needs to be done with this picture.
If valid, select the picture scene type and annotate (frame and transcribe).

In the following scenarios, mark the picture as “invalid” and do not transcribe it.


1) Picture with exposure or backlight problems.
Example:

2) Watermarked image.
Example:
3) Picture that was not captured by a camera. A picture captured from a computer screen
or downloaded from the internet is “invalid”.
Example:

4) The whole picture is blurred or unidentifiable.


Example:

5) Complicated text background that makes the text hard to recognize.


Example:

6) Printed document (not handwritten).


Example:
7) Picture rotated more than 90°.
8) Picture quality: identifiable characters must make up at least 90% of the total
characters in each picture. Otherwise, mark the picture “invalid”.
Example:

9) If target-language characters account for less than 70% and the blurred part is over
30%, mark the picture as bad data/invalid.
Example:

10) The picture scene type is incorrect
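
The percentage thresholds in rules 8 and 9 can be sketched as a small check. This is only an illustration: the function name and its three ratio parameters are assumptions, since the guideline expects the annotator to judge these proportions by eye rather than compute them.

```python
def is_invalid_by_quality(identifiable_ratio: float,
                          target_language_ratio: float,
                          blurred_ratio: float) -> bool:
    """Return True if rule 8 or rule 9 marks the picture as invalid.

    All arguments are hypothetical fractions between 0.0 and 1.0.
    """
    # Rule 8: identifiable characters must be at least 90% of all characters.
    if identifiable_ratio < 0.90:
        return True
    # Rule 9: target-language share below 70% AND blurred part above 30%.
    if target_language_ratio < 0.70 and blurred_ratio > 0.30:
        return True
    return False
```

The other invalid scenarios (watermarks, rotation, screen captures, and so on) are visual judgments and are not covered by this sketch.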


Step 2. Select picture scene type

Classify each picture into one scene type. For pictures that do not fit any scene type,
select “other”.
Picture scene type:
1) slogan
2) Business card/menu = Business-menu
3) map
4) Store
5) Receipt slip/invoice = list
6) board
7) advertisement
8) packaging
9) written
10) other
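
For tooling that consumes these labels, the ten scene types could be held in an enumeration. The sketch below is an assumption about how such a tool might store them: the class name, member names, and the `scene_label` helper are hypothetical, and the label strings are copied from the list above.

```python
from enum import Enum


class SceneType(Enum):
    SLOGAN = "slogan"
    BUSINESS_MENU = "Business-menu"  # business card / menu
    MAP = "map"
    STORE = "Store"
    LIST = "list"                    # receipt slip / invoice
    BOARD = "board"
    ADVERTISEMENT = "advertisement"
    PACKAGING = "packaging"
    WRITTEN = "written"
    OTHER = "other"


def scene_label(name: str) -> str:
    """Return the label for a known scene type, falling back to "other"."""
    try:
        return SceneType(name).value
    except ValueError:
        # Pictures that fit no listed scene type are labelled "other".
        return SceneType.OTHER.value
```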

Step 3. Annotation Rules

Frame & transcribe

1) One line, one frame. Frame only one line at a time and transcribe the characters exactly
as they appear in the picture. (Exception: only special formulas may span more than one line.)

Incorrect Example:

(As shown in the picture, it is wrong to frame two lines at the same time. If there are
characters above a line, such as inserted characters or phonetic notation, frame them
separately.)
Correct example:

Example 3: Include formulas within one frame (never separate them)

(When there are complicated formulas in the picture, draw one frame and do not transcribe
the content.)

2) Frame accuracy requirements: draw the frame close to the characters but NEVER on them.

(As shown in the picture, it is wrong to draw a frame that is not close to the characters.)

(As shown in the picture, it is wrong to draw a frame on the characters.)
Correct Example:

3) Mark the upper and lower boundary points of the box:


The upper boundary point (red point) and lower boundary point (green point) are shown in
the picture below.

Horizontal text: the point (red point) at the top-left corner of the frame is the upper
boundary point, and the point (green point) at the bottom-right corner is the lower boundary point.

Vertical text: the point (red point) at the top-right corner is the upper boundary point, and
the point (green point) at the bottom-left corner is the lower boundary point.

Rectangle frame: the upper and lower boundary points are set automatically by default when
framing.
Polygon frame: the upper and lower boundary points must be added manually. Right-click to
select the upper and lower boundary points respectively.
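
The corner rule for rectangle frames can be written down as a short function. This is a sketch under stated assumptions: the `Rect` type, the `boundary_points` name, and the image coordinate convention (y grows downward) are all hypothetical and not taken from the annotation tool.

```python
from typing import NamedTuple, Tuple


class Rect(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float  # image coordinates: y grows downward, so bottom > top


Point = Tuple[float, float]


def boundary_points(rect: Rect, vertical: bool) -> Tuple[Point, Point]:
    """Return (upper, lower) boundary points for a rectangular frame."""
    if vertical:
        # Vertical text: upper = top-right corner, lower = bottom-left corner.
        return (rect.right, rect.top), (rect.left, rect.bottom)
    # Horizontal text: upper = top-left corner, lower = bottom-right corner.
    return (rect.left, rect.top), (rect.right, rect.bottom)
```

Polygon frames are not covered here, because their boundary points are chosen manually by the annotator.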
Attributes

Each frame must be given exactly one corresponding attribute tag.
“Horizontal or vertical” should be judged according to the actual text.

Arabic-horizontal (or Arabic-vertical):

“Horizontal” or “vertical” refers to the direction of the text layout. The text needs to be
transcribed.

English-horizontal (or English-vertical):

If English characters or English numerals appear in the picture, frame and transcribe the
text and select English-horizontal (or English-vertical).
Smear Attribute

Smear-horizontal (or smear-vertical)


Choose the smear attribute when the Arabic characters are incomplete, truncated, or
obscured, or when the text is incomplete due to reflection, etc. “Horizontal” and “vertical”
refer to the text layout. The text needs to be transcribed.

Example: text is truncated.

Example: reflection
If truncated, obscured, or reflection-affected characters can be inferred from the
semantics and the shape of the remaining characters, they need to be transcribed.
Within one frame, if some truncated, obscured, or reflection-affected characters cannot be
inferred from the remaining parts of the characters and the semantics, replace the unclear
characters with <ERR>. A single <ERR> can represent multiple characters as needed.
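
A minimal sketch of the <ERR> convention follows. It assumes a hypothetical input in which each character of a frame is either a string or `None` when it cannot be inferred, and it collapses each run of unreadable characters into one <ERR>. That is one reading of “can represent multiple characters”, not a rule stated by the guideline.

```python
from typing import List, Optional

ERR = "<ERR>"


def transcribe_with_err(chars: List[Optional[str]]) -> str:
    """Join readable characters, marking unreadable runs with <ERR>."""
    out: List[str] = []
    for ch in chars:
        if ch is None:
            # Collapse a run of unreadable characters into a single <ERR>.
            if not out or out[-1] != ERR:
                out.append(ERR)
        else:
            out.append(ch)
    return "".join(out)
```

For example, `transcribe_with_err(["O", "P", None, None, "N"])` yields `OP<ERR>N`.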

Blurry& Phonetic symbols

When a whole line of text cannot be recognized, or a whole line consists of phonetic symbols,
select the “Blurry& Phonetic symbols” attribute. The line must be framed but does not need
to be transcribed.
Example:

Other-language

For languages other than Arabic and English, select the other-language attribute and do not
transcribe.
Example:
Formula

When there are complex physics, chemistry, or mathematics formulas that are difficult
to transcribe, frame the text, select the formula attribute, and do not transcribe.
Example:

apostrophe

It means multiple dots (multi-point). Just draw a frame; do not transcribe.

Example:
Frame and mark “apostrophe”; for the number “3”, draw a frame, select “English-horizontal”, and
transcribe “3” in the text box.
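
The attribute tags described in this section, and whether each one calls for a transcription, can be summarised in code. The sketch below is illustrative only: the enumeration, the `FRAME_ONLY` set, and the `needs_transcription` helper are hypothetical names, and the tag strings follow the spellings used in this guideline.

```python
from enum import Enum


class Attribute(Enum):
    ARABIC_HORIZONTAL = "Arabic-horizontal"
    ARABIC_VERTICAL = "Arabic-vertical"
    ENGLISH_HORIZONTAL = "English-horizontal"
    ENGLISH_VERTICAL = "English-vertical"
    SMEAR_HORIZONTAL = "Smear-horizontal"
    SMEAR_VERTICAL = "Smear-vertical"
    BLURRY_PHONETIC = "Blurry& Phonetic symbols"
    OTHER_LANGUAGE = "Other-language"
    FORMULA = "Formula"
    APOSTROPHE = "apostrophe"


# Attributes whose frames are drawn but not transcribed.
FRAME_ONLY = {
    Attribute.BLURRY_PHONETIC,
    Attribute.OTHER_LANGUAGE,
    Attribute.FORMULA,
    Attribute.APOSTROPHE,
}


def needs_transcription(attribute: Attribute) -> bool:
    """Return True if frames with this attribute must be transcribed."""
    return attribute not in FRAME_ONLY
```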
Other requirements:

Blurred text

1) The whole picture is blurred: mark as bad data/invalid, don’t transcribe


Example:

2) If part of the text in one frame is blurred: replace characters that are too blurry to be
inferred from the semantics with <ERR>, and transcribe the remaining clearly recognizable
text normally. The attribute should be Target language-horizontal (or Target
language-vertical).
Example:

3) If a whole line of text is unclear and cannot be transcribed, select the “Blurry& Phonetic
symbols” attribute and do not transcribe.
Example:
Incomplete text

1) Text that is truncated, blocked by leaves, wires, or other obstacles, affected by
reflections, or missing strokes or characters is called incomplete text.
2) No matter how much of the text is truncated, you must draw a frame, select the smear
attribute, and transcribe. Use <ERR> to replace characters you cannot identify.
Example:

Non-Target language characters among Arabic/English characters

1) If the non-target-language characters cannot be transcribed (e.g. emojis/emoticons):

Frame and transcribe only the Arabic/English text. Do not frame characters that cannot be
transcribed.
Example:

2) If one line of text contains multiple languages and the Arabic/English words make up the
larger proportion, the languages can be framed separately: the Arabic/English text is
transcribed normally, and the other languages are framed with the “other-language”
attribute without transcribing the content.
If the Arabic/English words make up only a small proportion, frame the whole line, select
the “other-language” attribute, and do not transcribe the content.
Example:

3) If Arabic characters and English characters are on one line and the space between the two
languages is less than two characters wide, draw a single frame and transcribe them all.
The attribute should be Arabic-horizontal (or Arabic-vertical).
Table image

A picture in the form of a table, such as an ingredients list, must be framed and
transcribed according to the table frames and must also follow the “one line, one frame”
rule.

Space problem between words

If the space between characters is wider than two characters, the text must be split into
two frames.
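
The two-character spacing rule can be sketched as a grouping step. The input format here is an assumption: each word is a hypothetical `(text, left, right)` tuple on one horizontal line, sorted by position, and `char_width` is an estimated width of one character in the same units.

```python
from typing import List, Optional, Tuple

Word = Tuple[str, float, float]  # (text, left x, right x), hypothetical layout


def split_into_frames(words: List[Word], char_width: float) -> List[List[str]]:
    """Group words into frames, splitting where a gap exceeds two characters."""
    frames: List[List[str]] = []
    prev_right: Optional[float] = None
    for text, left, right in words:
        if prev_right is None or left - prev_right > 2 * char_width:
            # First word, or a gap wider than two characters: start a new frame.
            frames.append([text])
        else:
            frames[-1].append(text)
        prev_right = right
    return frames
```

With a character width of 10, the words `("OPEN", 0, 40)`, `("DAILY", 50, 100)`, and `("9-5", 140, 170)` fall into two frames, because only the second gap (40) is wider than two characters.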
Space problem in one word

If there is a relatively large gap between letters or characters in one word, transcribe
normally without adding spaces between letters in the same word.
Example:

should be transcribed as:

should be transcribed as:

Target language stress symbol

Transcribe only the letters under the symbol. For example:

should be transcribed as:

Tone symbol

If the tone symbol affects the meaning of the word, transcribe it. If it does not affect the
meaning of the word, do not transcribe it.

Special example

1) When drawing a frame, keep it close to the words without touching them. Different
frames may overlap and cross.
Example:
Use a polygonal frame to select the required text on the arc stamp as a whole, and manually
mark the upper and lower boundary points accordingly.

3) Reflection words
a. Clear reflection words (readable or recognizable) should be framed and marked with the
“Blurry& Phonetic symbols” attribute.

b. Reflection words that are not clearly visible can be ignored; no frame is needed.
4) Special symbol:
a. The bullet before the word:
If the bullet can be transcribed, include it in the frame. For special symbols that
cannot be transcribed, frame only the text.

The yellow frame is the correct way to frame. (If the distance between the symbol and the
first character is within two characters, frame them together.)
b. Graphic symbols

Graphical symbols that can be transcribed require a frame, such as a “w” drawn as a graphic within a word.

c. Special symbols that cannot be typed on the keyboard do not need to be framed:

Example: don’t transcribe the symbol marked in red.

5) Underline
a. If an underline has no text before or after it (underline only), ignore it and do not frame it.
b. If there is text before or after the underline and no text above the underline,
transcribe a single “_” no matter how long the underline is.
Example:
If there is text on the underline, frame only the text and ignore the underline. (The
spacing rule mentioned earlier still applies.)
Example:
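
As an illustration of rule b (one “_” no matter how long the underline is), a transcription could be normalized as sketched below. The function name is hypothetical, and the sketch assumes the underline was typed as a run of underscore characters.

```python
import re


def normalize_underlines(text: str) -> str:
    """Collapse every run of underscores into a single "_"."""
    return re.sub(r"_+", "_", text)
```

For example, `normalize_underlines("Name: ________")` gives `Name: _`.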

7) If a superscript or subscript is on the same horizontal line as the text, frame it
together with that line. There is no need for a separate frame.
Example:
