
Documents are everywhere!!!

Documents exist in almost every business operation scenario!
Source: Deloitte-UK-Intelligent Document Processing Report, February 2020

How to process them?

Source: https://www.parascript.com/blog

Mohtasim Mapkar
RPA Developer
Table of Contents
What is IDP?
Critical Capabilities
Microsoft Power Platform – AI Builder Document Automation
    Architecture
    Hardware & Licensing
    Technical Support
UiPath Document Understanding
    Architecture
    Hardware & Licensing
    Technical Support
ABBYY FlexiCapture
    Architecture
    Hardware & Licensing
    Technical Support
Pros & Cons
Success Stories
References
Did you know?

The total amount of data created, captured, copied, and consumed globally is forecast to increase rapidly, reaching 97
zettabytes in 2022. Over the next three years up to 2025, global data creation is projected to grow to more than 180
zettabytes. Most of this data sits in emails, PDFs, and similar documents. How can it be made useful across different
business scenarios?

[Figure: bar chart of global data volume in zettabytes, rising from 2 to 181. Source: Statista]

What is IDP?
Document processing is a task that organizations have performed routinely for many years. It involves interpreting the
content of a physical or electronic document and turning it into an actionable task. Intelligent Document Processing
(IDP) makes this highly repetitive, labor-intensive task much simpler: technologies such as Machine Learning, Natural
Language Processing, and Intelligent Character Recognition process the documents and eliminate steps that would
otherwise rely on human intelligence. This makes it practical for businesses that receive high volumes of documents,
such as sales orders, invoices, and customer correspondence, to process the relevant data. Simply put, IDP solutions
transform unstructured and semi-structured information into usable data.
Intelligent document processing goes through several stages that may vary depending on the system and business
requirements. Typically, IDP systems ingest documents, classify documents into categories, locate and extract the
desired data, validate the data, and perform corrective action as necessary to ensure delivery of the highest levels
of data integrity and accuracy.

IDP platforms are designed to manage the multiple steps of data extraction and integration efficiently and
accurately. The typical stages are listed below.

Data Collection
• Gathering paper/electronic docs
• Integration with hardware to digitize paper/handwritten docs
• Ingestion of digital docs via built-in integrations

Pre-processing
IDP relies on structured, accurate and cleaned data created by
• Cleaning and grooming fed information
• Deskewing and decreasing noise
• Cropping and binarization

Classification
• AI driven document classification
• Use of computer vision algorithms in case of scans or pictures
• NLP techniques to divide docs in different categories by structure, content and/or type

Data extraction
• OCR to extract textual data from images
• NLP to decide type of data being extracted
• ML trained models to make data consistent
• Tailored ML models to extract data from common docs such as invoices, bank statements, etc.

Post processing & validation
• Correct common misspellings and transform data to standard output format
• Validating extracted data to ensure accuracy

Export
• Assembly of extracted data into final output file
• Use of API to feed data downstream
• JSON or XML export
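
Two of the stages above are simple enough to sketch directly: binarization during pre-processing and the JSON or XML export at the end. The Python below is a minimal illustration only. The function names (`binarize`, `to_json`, `to_xml`), the fixed threshold of 128, and the sample field names are assumptions made for this example and do not come from any vendor's product; production systems typically work on real images and choose thresholds adaptively.

```python
import json
import xml.etree.ElementTree as ET


def binarize(gray, threshold=128):
    """Map each grayscale pixel (0-255) to 0 (black) or 255 (white)."""
    return [[255 if pixel >= threshold else 0 for pixel in row] for row in gray]


def to_json(fields):
    """Serialize extracted fields as a JSON object with sorted keys."""
    return json.dumps(fields, sort_keys=True)


def to_xml(fields, root_tag="document"):
    """Serialize extracted fields as flat XML, one child element per field.

    Field names must be valid XML tag names.
    """
    root = ET.Element(root_tag)
    for name, value in fields.items():
        ET.SubElement(root, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


# A tiny 2x3 "scan": dark text pixels on a light, slightly noisy background.
scan = [[12, 200, 240], [130, 90, 255]]
clean = binarize(scan)

# Hypothetical extraction result handed to the export stage.
fields = {"invoice_number": "A1023", "total": "1250.00"}
json_out = to_json(fields)
xml_out = to_xml(fields)
```

Either string could then be passed to a downstream system through whatever API that system exposes.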

More than 50 vendors in the market offer IDP solutions, and some specialize in a specific phase of the process. Choosing
an enterprise partner while augmenting your automation services can be overwhelming. Choosing the right vendor requires
an in-depth analysis of the requirements and aligning them with the capabilities of the vendors.
Critical Capabilities
With so much going on under the hood to take a variety of documents, such as invoices, bank statements, and engineering
drawings, from one end of the process to polished, structured data or business insights at the other end, there is a
lot to consider when selecting the IDP platform that best suits your business needs. This white paper first looks at a
few key questions that should be asked when selecting an IDP platform.

Is the platform AI-native?


Relying on OCR alone to digitize and extract data won't cut it for unstructured documents.
An IDP platform should allow the integration of multiple AI techniques, such as machine learning, natural
language processing, and deep learning, to ensure that the solution works in an operational setting
and can be supported throughout its lifecycle.

A system better suited for AI will outperform its counterparts by huge margins!

Can it handle unstructured & complex documents?


Documents are diverse: they can be structured, semi-structured, or unstructured. An
IDP platform must be able to extract data accurately from all kinds of documents. Failure
to handle any of these types means a team has to handle the exceptions manually.
If the solution is designed primarily to handle unstructured documents, it can
handle the full spectrum of documents.

The fewer documents touched by humans, the better the performance of the system!

Does it promise enterprise level scalability?


Once the pilot project succeeds, you are suddenly flooded with multiple use cases demanding quick
development and deployment. An IDP platform should allow the enterprise to scale up as document
volume increases or as surges occur, and to add new document types and use cases easily. This
scalability should not come at the cost of financial or staffing problems.

Think big, start small, then scale or fail fast!

How easily does it integrate?


In any organization, a process is often just a small piece of a larger puzzle that requires constant
exchange of information. An IDP platform should not only be flexible in how it receives input
but also allow integration with different processes and systems downstream. An IDP platform can
never exist as an island cut off from the mainland; rather, it should be one of a chain of cities
interconnected by high-speed transport.

Seamless integration is the key to successful deployment!


Microsoft Power Platform – AI Builder Document Automation
The document automation toolkit provided in Microsoft Power
Platform makes it possible to build a rich and robust document
automation solution using Power Automate to orchestrate
the overall process, AI Builder to bring the intelligence
required to efficiently extract information from documents,
Power Apps to allow users to manually review and approve
documents, and Dataverse to manage the document queue
and store all the data, files, and configuration required.

Power Automate and Power Apps are commonly used
components in the digitalization domain of RPA Competence.
AI Builder, however, is a newer tool that requires further
exploration, and many of its features are still in preview.

Source: Microsoft Docs

AI Builder is a Microsoft Power Platform capability that provides AI models designed to optimize business processes.
AI Builder enables businesses to use intelligence to automate processes and glean insights from their data in Power
Apps and Power Automate. With AI Builder, no coding or data science skills are needed to access the power of AI. One
can build custom models tailored to specific needs or choose a prebuilt model that is ready to use for many common
business scenarios.

Getting started with document processing using the document automation toolkit is as easy as launching Power Automate
and installing the toolkit. The only prerequisites are an active Power Automate license and the System Customizer
permission in the environment in which the solution is created.

Once the required solution is installed, integration of AI Builder in Power Automate flows can be achieved by following
the steps shown below.

Extensive documentation and numerous training modules on Microsoft Learn are available to help implement document
automation:

Automate the processing of documents with the AI Builder prepackaged solution - Learn | Microsoft Docs
Document automation toolkit - AI Builder | Microsoft Docs
Architecture
All the phases of intelligent document processing (document
ingestion, pre-processing, classification, extraction, validation,
and export) are performed by various components of Power
Platform with seamless integration. The involvement of the
different components in each phase is shown in the figure.

The following Power Platform components are used in creating
a document automation solution:

• Power Automate
• Power Apps
• AI Builder
• Dataverse

Hardware & Licensing

All the components in AI Builder Document Automation run exclusively in the cloud and do not require any
additional hardware!

AI Builder is licensed as an add-on to a Power Apps, Power Automate, or Dynamics 365 license, so an AI Builder trial
can be started on the basis of any of these licenses. The trial license enables free use of AI Builder features during
the 30-day trial period: creating and using AI models in any environment (trial or production), storing AI model
results in Dataverse, and using the created AI models in apps, flows, and more. However, the trial license includes
only a limited amount of AI Builder capacity, which is consumed when running or training models. When that capacity is
exceeded, an over-capacity notification is sent to users. To continue using AI Builder after the trial, AI Builder
add-on capacity has to be purchased and allocated to environments.

The AI Builder capacity add-on can be purchased by a billing administrator in the Microsoft 365 admin centre or through
the usual purchasing channel. The AI Builder calculator helps estimate the required add-on capacity based on expected
consumption. The Power Apps per app plan, Power Apps per user plan, and Power Automate per user plan with attended RPA
include some AI Builder capacity. An environment admin can check the entitlement under Capacity add-ons in the Power
Platform admin centre. When this amount isn't enough, it can be augmented with one or several AI Builder capacity
add-ons. Once a tenant is entitled to AI Builder capacity, the credits are unallocated and available as a pool on the
tenant, which can be used in any environment. The administrator can restrict usage by allocating all credits to
specific environments.

For more details pertaining to licensing of AI Builder, visit AI Builder Licensing - Microsoft Docs

Technical Support
Features that have been released for general availability are eligible for support through Microsoft Support. The Power
Platform admin centre can be used to request support from Microsoft. Support for AI Builder is also available on the AI
Builder community forums.

AI Builder Calculator | Microsoft Power Apps


AI Builder Licensing - AI Builder | Microsoft Docs
AI Builder community forums
Power Platform Admin Center
UiPath Document Understanding
The UiPath Document Understanding framework facilitates the processing of incoming files, from file digitization to
extracted data validation, all in an open, extensible, and versatile environment. Document Understanding is designed to
help combine different approaches to extract information from multiple document types. The main aim is to make the
process of extracting data as easy as possible: creating one single workflow that will extract data from a variety of
documents.
Source: UiPath Docs

All the components of the framework, discussed in detail below, are added as part of a single workflow in a UiPath
project.

1. Taxonomy: Used to define the document types and the pieces of information targeted for data extraction
(fields) for each document type, and formalizes this information into a dedicated Taxonomy structure.

2. Digitization: Used to obtain the textual content and the structure of the incoming document, turning a file
into machine-readable content so it can be further processed downstream.

3. Document Classification: Used to automatically determine what document types are found within a
digitized file.

4. Document Classification Validation: Used for assisting in the human validation and correction of the automatic
classification and document splitting results.

5. Classification Training: Used to pass the human validated information back to the classifiers, to use it to
improve their future predictions.

6. Data Extraction: Used to capture the information required for the identified document type, within the
given input document and classification page range.

7. Data Extraction Validation: Used for assisting in the human validation and correction of the automatically
extracted data results.

8. Data Extraction Training: Used to pass the human validated extracted data back to the extractors, to use it
to improve their extraction predictions.

9. Export: Used to export the validated data in order to consume it.
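
To make the Taxonomy step more concrete, the sketch below models a taxonomy as a plain Python mapping from document type to its target fields and checks an extraction result against it. This is only an illustration of the idea: the `TAXONOMY` contents and the `missing_fields` helper are hypothetical and are not UiPath's Taxonomy format or API.

```python
# Hypothetical taxonomy: document type -> fields targeted for extraction.
TAXONOMY = {
    "Invoice": ["InvoiceNumber", "InvoiceDate", "Total"],
    "PurchaseOrder": ["PONumber", "Vendor"],
}


def missing_fields(doc_type, extracted):
    """Return taxonomy fields that are absent or empty in an extraction result.

    Raises KeyError if the document type is not defined in the taxonomy.
    """
    expected = TAXONOMY[doc_type]
    return [name for name in expected if not extracted.get(name)]


# Illustrative extraction result with no invoice date.
result = {"InvoiceNumber": "A1023", "Total": "1250.00"}
to_review = missing_fields("Invoice", result)
```

A non-empty list is one possible signal for routing the document to the human validation step.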
All the steps mentioned above are programmed into the UiPath workflow and run on the existing unattended
bot architecture. The validation steps in both classification and extraction, however, can be performed in two ways:
with an attended bot that opens the Validation Station locally, or with Action Center (a web-based portal that enables
validation by multiple users). It is also important to note that different classifiers and extractors can be used
depending on the type of documents being processed. The available classifiers and extractors are:

Classifiers
• Keywords based classifier
• Intelligent keyword classifier
• FlexiCapture classifier
• Machine learning classifier

Extractors
• Regex based extractor
• Form extractor
• Intelligent form extractor
• Machine learning extractor
• FlexiCapture extractor
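
The two simplest approaches in this list, keyword-based classification and regex-based extraction, can be illustrated in a few lines of Python. This is a sketch of the general techniques, not of the UiPath activities themselves: the keyword lists, the regular expressions, and the `classify` and `extract` functions are all assumptions made for this example.

```python
import re

# Hypothetical keyword lists per document type (illustrative only).
KEYWORDS = {
    "Invoice": ["invoice", "amount due"],
    "BankStatement": ["statement", "closing balance"],
}

# Hypothetical regular expressions for two invoice fields.
INVOICE_PATTERNS = {
    "InvoiceNumber": re.compile(r"Invoice\s*(?:No\.?|#)\s*:?\s*(\w+)", re.IGNORECASE),
    "Total": re.compile(r"Total\s*:?\s*\$?([\d,]+\.\d{2})", re.IGNORECASE),
}


def classify(text):
    """Return the type whose keywords occur most often, or None if none match."""
    lowered = text.lower()
    scores = {doc_type: sum(lowered.count(word) for word in words)
              for doc_type, words in KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def extract(text, patterns):
    """Apply each pattern to the text; fields with no match map to None."""
    results = {}
    for name, pattern in patterns.items():
        match = pattern.search(text)
        results[name] = match.group(1) if match else None
    return results


# Digitized text of a made-up invoice.
sample = "INVOICE\nInvoice No: A1023\nAmount due by 30 June\nTotal: $1,250.00"
doc_type = classify(sample)
fields = extract(sample, INVOICE_PATTERNS)
```

Rule-based approaches like these suit documents with a stable layout and vocabulary; the machine learning classifiers and extractors in the list are intended for documents that vary more.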

Detailed information on how to use the different classifiers and extractors, and on the special requirements of each
type, can be found in the UiPath documentation. A detailed course is also available in UiPath Academy to get started
with the Document Understanding framework.

UiPath Document Understanding | Docs


UiPath Document Understanding | Learning Path
Architecture
The Document Understanding framework leverages the existing UiPath architecture: the package is developed in Studio and
deployed to Orchestrator to create a process, which is picked up by an attended or unattended bot that processes
documents and classifies and extracts information. The results can be validated in the attended-bot-based Validation
Station or in the web-based portal called Action Center, and are finally exported for processing downstream. The ML
models are accessed through API endpoints by providing an API key. Common models for processing invoices and identity
documents are available on the community cloud. Customized ML models can be accessed in the same way to better suit
business needs. The following components are used by the UiPath Document Understanding framework:

• UiPath Studio
• UiPath Orchestrator
• UiPath Assistant – Attended/Unattended Bot
• UiPath Action Center
• Community and/or Customized ML models

Hardware & Licensing


In addition to the existing infrastructure, UiPath Action Center and AI Center have to be purchased, licensed, and
configured, each of which comes with its own list of hardware and computing requirements. For detailed hardware
requirements, visit UiPath Docs. AI Center is the infrastructure on top of which the UiPath Document Understanding
machine learning models run. These models can be deployed or instantiated for retraining with a few clicks. If
validation is to be performed by users on a workstation, an additional attended bot license is required. Action Center,
however, gives multiple users access through its web-based portal and is the recommended solution.

Technical Support
The enterprise versions of AI Center and Action Center come with technical support from UiPath, which can be accessed
using the license key. The UiPath forum is a vast trove of knowledge for troubleshooting common issues and is free to
access.

UiPath Action Center | Hardware Requirements


UiPath Forum
ABBYY FlexiCapture
ABBYY FlexiCapture is more than a data capture and extraction solution: it brings together NLP, machine learning, and
advanced recognition capabilities in a single, enterprise-scale document capture platform designed to handle every type
of document and every job size. It replaces costly and inefficient manual operations with a transparent electronic
process, transforming streams of unstructured documents into structured, business-ready data in three steps:
identifying documents, capturing business-critical data, and delivering the data into business processes.
Source: ABBYY University

It is available in both standalone and distributed architectures. FlexiCapture
standalone provides intelligent document processing for
small business processes of up to 50k pages per month with single-user
verification. All actions are performed on a single machine.

FlexiCapture distributed (explained in detail in the Architecture
section) is designed for large projects and enterprises. It allows
processing of more than 50k pages per month with multi-user
verification, installs on multiple machines, and scales well.

[Figure: ABBYY FlexiCapture Standalone]
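
The sizing guidance above can be written down as a small decision helper. The function below is a hypothetical illustration of that rule of thumb, not an ABBYY tool; it assumes that needing either more than 50k pages per month or more than one verifier is enough to call for the distributed architecture.

```python
def recommend_architecture(pages_per_month, multiple_verifiers):
    """Suggest 'standalone' or 'distributed' from the sizing guidance in the text."""
    # Standalone: up to 50k pages per month with single-user verification.
    if pages_per_month > 50_000 or multiple_verifiers:
        return "distributed"
    return "standalone"


small_team = recommend_architecture(20_000, multiple_verifiers=False)
shared_service = recommend_architecture(120_000, multiple_verifiers=True)
```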
ABBYY FlexiCapture enables intelligent document processing by using
various components to deliver business-critical data from a mixed stream of structured, semi-structured, and
unstructured documents in the following steps:

1. Import: ABBYY FlexiCapture automatically processes all types of documents from files and scanners in a single flow,
including office documents and image formats, email attachments and message bodies.

2. Classification: The neural-based automatic document classification technology enables sorting of documents by types
(e.g., driver license, bank statement, tax form, contract, invoice, etc.) and custom subcategories (e.g., invoices from vendor
A, invoices from vendor B, etc.) by text content and image patterns. It learns quickly and easily, enabling it to perform as
an auto-classifier.

Source: ABBYY University

3. Extraction: Document images are assembled into multi-page document sets. Their content and data are automatically
extracted and validated. FlexiCapture provides highly accurate OCR/ICR/OMR and barcode recognition to recognize
printed text in up to 200 languages, hand-printed text in over 130 languages, a variety of 1D and 2D barcodes, and a wide
range of checkmarks. Automatic validation includes comparison against databases, conformity with built-in validation
rules, compliance with formats, data normalization, and user-defined checks.

4. Verification: Verification station allows checking if extracted fields match those of the original document. Alternatively,
verification can be started manually using the easily accessible web-based verification station. Any of the following
techniques can be used: group verification, verification in document window, field verification.

5. Export: ABBYY FlexiCapture automatically exports recognized data to different file formats, or to databases, systems of
record, and other destination points in line with user-defined rules. This includes corporate file storage repositories such
as SharePoint, ODBC compatible databases such as Oracle, ERP and ECM systems such as SAP, and RPA workflows to
deliver data in legacy systems.
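
The automatic validation described in step 3 (comparison against databases, format compliance, data normalization, and user-defined checks) can be sketched generically in Python. Nothing below is FlexiCapture's rule syntax or API: the vendor table, the VAT number pattern, the accepted date formats, and the `validate` function are assumptions chosen for illustration.

```python
import re
from datetime import datetime

# Hypothetical vendor master data standing in for a database lookup.
KNOWN_VENDORS = {"V001": "Acme Ltd", "V002": "Globex GmbH"}

# Assumed VAT number format (two letters, 8 to 12 digits), for illustration.
VAT_FORMAT = re.compile(r"^[A-Z]{2}\d{8,12}$")


def normalize_date(raw):
    """Convert a DD/MM/YYYY or YYYY-MM-DD string to ISO format, else None."""
    for fmt in ("%d/%m/%Y", "%Y-%m-%d"):
        try:
            return datetime.strptime(raw, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def validate(record):
    """Return a list of rule violations for one extracted record."""
    errors = []
    if record.get("vendor_id") not in KNOWN_VENDORS:  # database comparison
        errors.append("unknown vendor")
    if not VAT_FORMAT.match(record.get("vat_number", "")):  # format compliance
        errors.append("bad VAT format")
    if normalize_date(record.get("invoice_date", "")) is None:  # normalization
        errors.append("unparseable date")
    # User-defined check: line items should sum to the stated total.
    # Floats keep the sketch short; Decimal is safer for real amounts.
    if round(sum(record.get("line_items", [])), 2) != record.get("total"):
        errors.append("total mismatch")
    return errors


good = {"vendor_id": "V001", "vat_number": "GB123456789",
        "invoice_date": "30/06/2022", "line_items": [1000.00, 250.00],
        "total": 1250.00}
bad = {"vendor_id": "V999", "vat_number": "123",
       "invoice_date": "June 30", "line_items": [10.0], "total": 12.0}

good_errors = validate(good)
bad_errors = validate(bad)
```

Records that come back with violations are the natural candidates for the manual verification described in step 4.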
Architecture

The ABBYY FlexiCapture architecture comprises several kinds of
components: multiple server and client components,
remote stations, web applications, and tools. Together they
support everything from project setup, creating document
definitions, and configuring data extraction to publishing and
day-to-day operation, processing more than 50k documents
per month. The architecture also provides multi-level administration,
automatic notifications for critical failures, and
comprehensive reporting.
Source: ABBYY University

The major components of this distributed architecture are:

Administration & Monitoring Console: A web-based module used for setting up ABBYY FlexiCapture and
monitoring its operation.

License Manager: Allows administrators to view license information as well as to activate, deactivate, and switch licenses.

Processing Server Monitor: A module that allows an admin to set up & manage the Processing Server & Stations.

Project Setup Station: The module where projects are created, set up, and debugged. A project allows you to create and
edit Document Definitions, batch types, image import profiles, document exports, and various processing options.

Scanning Station: A software solution for batch scanning large amounts of documents. It may acquire images either
directly from a scanner or load them from a folder by means of a virtual scanner.

Data Verification Station: A station that allows a data verification operator to verify unreliably recognized
characters.

Verification Station: A module that allows comprehensive verification of recognized and extracted documents, including
but not limited to correcting assembly errors, resolving data rule violations, and checking batch composition integrity.

Additionally, there are tools like FlexiLayout Studio and Form Designer that aid in creating document definitions which
will be utilized to classify and extract data from structured and semi-structured documents.

Hardware & Licensing


For the standalone architecture, the only requirement is a single machine that hosts the different stations and tools;
its configuration can be requested from the ABBYY Support team.

The distributed architecture, however, comes with a set of specifications for the required servers and stations,
discussed in detail in the Specifications section of the ABBYY FlexiCapture website.

Technical Support
ABBYY Help Center is a one-stop solution for all technical support needs for ABBYY products. Access is granted
once the enterprise version of ABBYY FlexiCapture is configured.

ABBYY FlexiCapture | Specifications


ABBYY FlexiCapture | Help Center
Pros & Cons
Success Stories

PepsiCo automates invoice processing with ABBYY FlexiCapture

The use case called for an automated capture solution capable of supporting multiple languages, multiple currencies,
large-volume batch processing, and complex data fields. Along with vendor and customer names, value added
tax (VAT) information had to be captured for both customers and vendors, as well as purchase orders,
invoice amounts, and line items.

• over 21,000 documents
• more than 40,000 pages
• mix of 5 languages

Sonae tracks COVID-19 results with zero manual intervention using AI Builder

Sonae, a multinational company with a diversified portfolio of companies spanning 62 countries, used AI Builder to
read the test outcomes listed in PDFs, specifically positive, negative, and inconclusive results. The use
case proved successful with zero errors, freeing doctors and nurses to perform their duties without
excessive after-hours requests. Results are communicated on time, with the ability to track and trace
them.

• 21,000 tests with no intervention
• over 25 hours/month for doctors & nurses
• more than 17 hours/month for tech

Automating payment processing with UiPath Document Understanding

The solution involved automating P2P invoice processing through Coupa, an enterprise business spend management
solution. The process required extracting data from PDFs received from Coupa: invoices,
purchase orders, and related documents. UiPath Action Center was then used to validate the extracted
results or to ask for any missing details.

• over 824,000 invoices annually
• 70% reduction in processing time
• accuracy of 85%

Submit your
idea here!!!
References
[1] https://www.statista.com/statistics/871513/worldwide-data-created/

[2] https://www.infrrd.ai/whitepaper/idp-critical-capabilities

[3] https://www.thoughttrace.com/blog/what-is-intelligent-document-processing/
[4] https://www.pinclipart.com/pindetail/ibiixJ_know-cliparts-did-you-know-icon-png-download/
[5] https://www.altexsoft.com/blog/intelligent-document-processing/
[6] https://becominghuman.ai/who-are-the-top-intelligent-document-processing-idp-vendors-96c721844dc1
[7] https://docs.microsoft.com/en-us/learn/modules/get-started-ai-builder-document-automation/1-introduction
[8] https://docs.microsoft.com/en-us/ai-builder/overview
[9] https://docs.microsoft.com/en-us/ai-builder/administer-licensing
[10] https://docs.uipath.com/document-understanding/v2020.10/docs/introduction
[11] https://www.uipath.com/resources/automation-webinars/document-understanding-ai-enhanced-webinar-recording
[12] https://docs.uipath.com/document-understanding/docs
[13] https://www.abbyy.com/flexicapture/
[14] https://university.abbyy.com/
[15] https://www.abbyy.com/resources/report/intelligent-document-processing-quadrant-knowledge-solutions/
[16] http://mtssoftwaresolutions.com/technologies/abbyy/abbyy-flexicapture
[17] https://tekenable.ie/microsoft-platform/microsoft-power-platform/microsoft-ai-builder/
[18] https://www.abbyy.com/customer-stories/pepsico-automates-invoice-processing-with-abbyy-flexicapture/
[19] https://customers.microsoft.com/en-us/story/1401648712668027785-progressive-retailer-resolves-challenges-with-low-code-automation-and-ai-solutions
[20] https://www.parascript.com/blog/
