ComarchOSSUserDoc - OCR v. 5-25
OCR – 5-25-0
User Reference
©2023 Comarch SA
All rights reserved
DOCUMENT DESCRIPTION
Version 1.0
Company COMARCH
Number of pages 58
REVISION
1.0 2023-04-12 Monika Rutkowska Document creation
TABLE OF CONTENTS
Document Description
Revision
Table of contents
Copyright Notice
Abbreviations
Notation Conventions
1 Preface
2 How to open the Comarch OSS Console
3 OCR
3.1 Autotagging
3.2 How to do OCR on a single file
3.3 How to do OCR and Autotagging on a folder
3.4 Rules Group Management
3.4.1 How to open Rules Group Management
3.4.2 Rules Group Management – Overview
3.4.3 Working in Rules Group Management
3.4.3.1 How to add a new rule group
3.4.3.1.1 How to add a new manual rule group
3.4.3.1.2 How to add a new automatic rule group
3.4.3.2 How to edit a rule group
3.4.3.3 How to delete a rule group
3.4.3.4 How to add a new rule to a rule group
3.4.3.5 How to add an existing rule to a rule group
3.4.3.6 How to remove a rule from a rule group
3.4.3.7 How to reload data for automatic rule groups
3.4.4 Rules Management View
3.4.4.1 How to add a new rule
3.4.4.2 How to edit a rule
3.4.4.3 How to delete a rule
3.4.4.4 How to display automatic rules
3.5 OCR Result View
3.5.1 OCR Result View – Overview
3.5.2 How to manage Massive OCR
List of Figures
List of Tables
COPYRIGHT NOTICE
Comarch OSS Suite Documentation
Copyright © 2023 COMARCH
All rights reserved.
No part of this document may be copied, distributed, transmitted, transcribed, stored in a retrieval system, or
translated into any human or computer language without the prior written permission of COMARCH.
The manufacturer has made every effort to ensure that the instructions contained in the documents are
adequate and free of errors and omissions. The manufacturer’s liability for any errors in the documents is limited
to the correction of the errors. The manufacturer reserves the right to change the documents without notice. Some
parts of the documents may not follow the current version of the software. Please refer to the release notes of the
installed software version to find out about new features of the software, error corrections and specific
installation instructions.
The documents have been prepared to be used by professional and properly trained personnel, and the
customer assumes full responsibility when using them. The manufacturer welcomes customer comments as part
of the process of continual development and improvement of the documentation in the best way possible from
the user’s viewpoint. Please submit your comments to the nearest Comarch representative.
ABBREVIATIONS
The following abbreviations apply throughout this document:
IP Internet Protocol
MO Managed Object
NOTATION CONVENTIONS
The following typographic conventions apply throughout this document:
Style Description
(icon) Tips for the users are distinguished by the icon presented on the left.
Objects All names of objects start with a capital letter, e.g. Address, Location.
'quotation marks' All attribute values are marked by quotation marks, e.g. ‘PATH_1’.
Bold Names of other documents, commands, windows, tabs and other information that you must use literally appear in bold, e.g. Framework Console Operations, Layers.
[BUTTON] Names of buttons appear in Courier New in square brackets and in bold, e.g. [FINISH].
Blue bold underlined References to other specific parts of the document appear as blue, bold and underlined text, e.g. Blue bold underlined.
1 Preface
Please note that the screenshots in this document are provided for illustrative purposes only and
should not be treated as an exact representation of the Console Graphical User Interface. It may
differ depending on the software version and installed components. Additionally, the user may not
have access to some functionalities or modules because of the lack of appropriate privileges.
This document describes the out-of-the-box version of the product and does not include
project-specific features or customisations.
Recommended reading
To read and understand this document, you need basic knowledge of the Comarch
OSS system: its basic concepts, data model, generic views and functionalities. For
an introduction to the Comarch OSS system, please refer to ComarchOSSUserDoc_Web_Console.
2 How to open the Comarch OSS Console
4. Once the username and password are entered correctly, the Home page is displayed.
Recommended reading
To learn more about the Console and its basic capabilities, see
ComarchOSSUserDoc_Web_Console.
3 OCR
Tesseract is an open-source OCR engine that is under continuous development and released under the Apache
License 2.0. It supports more than 100 languages, including Polish and English.
Images submitted for OCR should have a resolution of at least 200 DPI (300 DPI is typical), be 1 bpp (bit per
pixel) monochrome or 8 bpp grayscale, and be saved as TIFF or PNG. PNG files are usually smaller than other
image formats and still keep high quality thanks to lossless compression.
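The image guidelines above can be checked before a file is uploaded. The following is a minimal sketch, assuming the input is a PNG file; the function names read_png_info and meets_ocr_guidelines are hypothetical and are not part of Comarch OSS or Tesseract. It reads the bit depth and colour type from the IHDR chunk and the resolution from the optional pHYs chunk, using only the Python standard library.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
METERS_PER_INCH = 0.0254


def read_png_info(data: bytes) -> dict:
    """Read bit depth, colour type and DPI from raw PNG bytes (hypothetical helper)."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    info = {"bit_depth": None, "color_type": None, "dpi": None}
    pos = 8
    while pos + 8 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"IHDR":
            # Fields: width, height, bit depth, colour type, compression, filter, interlace.
            _, _, info["bit_depth"], info["color_type"], _, _, _ = struct.unpack(">IIBBBBB", body)
        elif chunk_type == b"pHYs":
            x_ppu, _, unit = struct.unpack(">IIB", body)
            if unit == 1:  # unit 1 means pixels per metre
                info["dpi"] = round(x_ppu * METERS_PER_INCH)
        elif chunk_type in (b"IDAT", b"IEND"):
            break  # IHDR and pHYs always precede the image data
        pos += 12 + length  # 4 (length) + 4 (type) + body + 4 (CRC)
    return info


def meets_ocr_guidelines(info: dict, min_dpi: int = 200) -> bool:
    """True for greyscale PNGs at 1 or 8 bits per pixel with at least min_dpi."""
    grayscale = info["color_type"] == 0  # PNG colour type 0 is greyscale
    depth_ok = info["bit_depth"] in (1, 8)
    dpi_ok = info["dpi"] is not None and info["dpi"] >= min_dpi
    return grayscale and depth_ok and dpi_ok
```

A file without a pHYs chunk carries no resolution information, so this sketch treats it as not meeting the guidelines; a real check might instead ask the user to confirm the scan resolution.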
3.1 Autotagging
The automatic tagging engine is a REST microservice that performs OCR on files using Tesseract. It classifies the
extracted text by matching it against Regex rules and tags the Attachment accordingly.
Uploaded Attachments with extensions compatible with Tesseract, or with the .txt extension, can be tagged
once their text has been extracted.
Tagging is regulated by selected rules (for more information see Rules Group Management).
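The idea of Regex-based tagging can be illustrated with a short sketch. It is not the engine's actual implementation: the rule titles, the patterns, and the tag_text function are hypothetical, and the sketch assumes that a rule contributes its title as a tag whenever its Regex matches the extracted text.

```python
import re

# Hypothetical rule group: rule title -> Regex expression.
RULES = {
    "Invoice": r"\binvoice\s+no\.?\s*\d+",
    "Postal code": r"\b\d{2}-\d{3}\b",
}


def tag_text(text: str, rules: dict[str, str]) -> list[str]:
    """Return the titles of all rules whose Regex matches the text."""
    return [
        title
        for title, pattern in rules.items()
        if re.search(pattern, text, flags=re.IGNORECASE)
    ]


# Text as it might come back from OCR of an uploaded Attachment.
tags = tag_text("Invoice No. 1234, Krakow 31-864", RULES)
```

Here tags contains both rule titles, because the text includes an invoice number and a postal code in the assumed formats. Case-insensitive matching is a choice made for this sketch only.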
3.2 How to do OCR on a single file
2. Optionally, select a Rule group from the drop-down list. If you do not select anything in this field, OCR will be
done without Autotagging.
3. Select Priority.
4. Click [Save].
OCR is done. The icon turns green. You can click it to display the results of the process.
3.3 How to do OCR and Autotagging on a folder
1. Select a folder.
2. Click [Create] > [Automatic tagging].
3. Complete the Select rule group wizard (the mandatory fields are marked with an asterisk). Please note that
by selecting optional parameters, you can narrow down the group of Attachments from a folder in order to
speed up the process.
Attribute Description
Rule group Specifies the rule group according to which the autotagging will be performed.
Priority * The priority of the OCR process. The possible values are:
• 'Low' (default value)
• 'Medium'
• 'High'
• 'Urgent'.
File extension The extensions of Attachment files on which the process will be performed. You can select more than one value.
Type The type of Attachment on which the process will be performed. You can select more than one value.
4. Click [Save].
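The effect of the optional File extension and Type parameters can be sketched as a simple filter over the Attachments in a folder. This is an illustration under stated assumptions, not the product's logic: the Attachment class, the select_attachments function, and the type names used below are hypothetical.

```python
from dataclasses import dataclass
from pathlib import PurePath


@dataclass
class Attachment:
    """Hypothetical stand-in for an Attachment stored in a folder."""
    name: str
    type: str


def select_attachments(attachments, extensions=None, types=None):
    """Keep only Attachments that match the optional extension and type filters."""
    selected = []
    for att in attachments:
        ext = PurePath(att.name).suffix.lower().lstrip(".")
        if extensions and ext not in extensions:
            continue  # extension filter set and not matched
        if types and att.type not in types:
            continue  # type filter set and not matched
        selected.append(att)
    return selected


folder = [
    Attachment("scan_01.PNG", "Scan"),
    Attachment("contract.tiff", "Contract"),
    Attachment("notes.txt", "Note"),
]
# Several values can be selected for each filter, as in the wizard.
to_process = select_attachments(folder, extensions={"png", "tiff"}, types={"Scan"})
```

With no filters, every Attachment in the folder is selected; each added filter shrinks the set that has to go through OCR, which is why narrowing the selection speeds up the process.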
3.4.2 Rules Group Management – Overview
1. Toolbar.
2. List of rule groups.
• Automatic – rules are generated automatically from global search with available types: Region, District,
Commune, City, Street.
3.4.3.1.1 How to add a new manual rule group
2. The Add new rule group wizard is displayed. Enter the Title of the rule group and select 'Manually' from the
Data Input Type field.
3. Click [Save].
3.4.3.1.2 How to add a new automatic rule group
2. The Add new rule group wizard is displayed. Enter the Title and select 'Automatic' from the Data Input Type field.
3. Two mandatory fields are added to the wizard: Attribute and Type. The 'Name' value in the Attribute field is
predefined and cannot be changed. Select one of the following values for Type:
• 'Region'
• 'District'
• 'Commune'
• 'City'
• 'Street'.
4. Click [Save].
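How rules might be derived automatically from object names can be sketched as follows. The document does not describe the actual generation logic, so this is only an assumption: each name returned by the global search for the chosen Type (for example 'City') becomes one rule whose Regex matches that name literally as a whole word. The build_rules function and the sample names are hypothetical.

```python
import re


def build_rules(names: list[str]) -> dict[str, str]:
    """Create one literal, whole-word Regex per object name (hypothetical helper).

    re.escape keeps characters such as '.' from being read as Regex syntax.
    The word boundaries assume that names start and end with a letter or digit.
    """
    return {name: r"\b" + re.escape(name) + r"\b" for name in names}


# Names as they might be returned by the global search for Type 'City'.
city_rules = build_rules(["Nowy Targ", "Kraków"])
found = [
    title
    for title, pattern in city_rules.items()
    if re.search(pattern, "Branch office in Nowy Targ, Poland")
]
```

Escaping matters for names such as street names that contain punctuation; without it, a dot in the name would match any character.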
3.4.3.2 How to edit a rule group
3. The Edit rule group wizard is displayed. You can modify the Title only.
4. Click [Save].
Please note that if you do not introduce any changes, an error notification is displayed.
3.4.3.4 How to add a new rule to a rule group
3. The Add new rule to rule group wizard is displayed. Enter the Title of the rule and the Regex expression.
4. Click [Save].
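Before entering a Regex expression in the wizard, it can be useful to check that it compiles. The sketch below uses Python's re module for that purpose; the validate_rule function is hypothetical, and the Regex dialect used by the autotagging service may differ from Python's, so a pattern that passes here is not guaranteed to behave identically in the product.

```python
import re


def validate_rule(title: str, regex: str) -> list[str]:
    """Return a list of problems found in a rule; an empty list means none were found."""
    errors = []
    if not title.strip():
        errors.append("Title is required")
    if not regex:
        errors.append("Regex expression is required")
    else:
        try:
            re.compile(regex)
        except re.error as exc:
            errors.append(f"Invalid Regex: {exc}")
    return errors


ok = validate_rule("Postal code", r"\b\d{2}-\d{3}\b")
broken = validate_rule("", "(")
```

In this example ok is an empty list, while broken reports both the missing title and the unbalanced parenthesis.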
3.4.3.5 How to add an existing rule to a rule group
3. The Add existing rule to rule group wizard is displayed. Start typing the name of the rule you want to add
to the rule group; the system displays matching rules in the drop-down list. Select the rule from the list.
4. Click [Save].
3.4.4 Rules Management View
1. Search field
2. Toolbar
3. List of rules
3.4.4.1 How to add a new rule
2. The Add new rule wizard is displayed. The process of adding a new rule is the same as in How to add a
new rule to a rule group.
3.4.4.2 How to edit a rule
3. The Edit rule wizard is displayed. Modify the attributes you want to update.
4. Click [Save].
2. Click [Search].
Figure 45 Opening OCR Result View from the navigation panel (1)
Figure 46 Opening OCR Result View from the navigation panel (2)
1. Filter the results by Status or User Name using the Filter section. The available Status options are as
follows:
Status Description
Success Automatic Tagging ended with the 'Success' status.
Failed Automatic Tagging ended with the 'Failed' status.
Scheduled automatic Automatic Tagging is waiting in a queue with a lower rank.
Scheduled manually Automatic Tagging is waiting in a queue with a higher rank.
2. Click [Filter].
A list with OCR results is displayed. You can use the hyperlinks in the Attachment Id column to open the
Attachment in Information View.
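The relationship between the two 'Scheduled' statuses can be pictured as an ordering of waiting jobs. The sketch below is an assumption made for illustration, not a description of the actual scheduler: it places manually scheduled jobs ahead of automatically scheduled ones, as the status descriptions suggest, and then, as a further assumption, orders jobs of the same rank by the Priority chosen in the wizard. The QueuedJob class and the processing_order function are hypothetical.

```python
from dataclasses import dataclass

# Lower number = processed earlier. Using Priority as a tie-breaker is an assumption.
RANK_ORDER = {"Scheduled manually": 0, "Scheduled automatic": 1}
PRIORITY_ORDER = {"Urgent": 0, "High": 1, "Medium": 2, "Low": 3}


@dataclass
class QueuedJob:
    """Hypothetical waiting OCR job."""
    attachment_id: int
    status: str
    priority: str = "Low"  # 'Low' is the default value in the wizard


def processing_order(jobs):
    """Sort jobs by rank first, then by priority; sorted() keeps ties in input order."""
    return sorted(jobs, key=lambda j: (RANK_ORDER[j.status], PRIORITY_ORDER[j.priority]))


queue = [
    QueuedJob(1, "Scheduled automatic", "Urgent"),
    QueuedJob(2, "Scheduled manually"),
    QueuedJob(3, "Scheduled manually", "High"),
]
order = [job.attachment_id for job in processing_order(queue)]
```

Under these assumptions the manually scheduled jobs (3, then 2) come before the automatically scheduled job 1, even though job 1 has the highest priority.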
3.5.2 How to manage Massive OCR
You can manage the Massive OCR process by using the [Start/Stop Massive OCR] button in OCR Result View.
LIST OF FIGURES
Figure 1 Opening the Web Console
Figure 8 Opening Rules Group Management from the Home page
Figure 9 Opening Rules Group Management from the navigation panel
Figure 13 Newly added manual rule group displayed on the list
Figure 17 Newly added automatic rule group displayed on the list
Figure 44 Opening OCR Result View from the Home page
Figure 45 Opening OCR Result View from the navigation panel (1)
Figure 46 Opening OCR Result View from the navigation panel (2)
LIST OF TABLES
Table 1 Automatic tagging – wizard values