
Document processing

Document processing is a field of research and a set of production processes aimed at making an analog
document digital. Document processing does not simply aim to photograph or scan a document to obtain a
digital image, but also to make it digitally intelligible. This includes extracting the structure of the document
or the layout and then the content, which can take the form of text or images. The process can involve
traditional computer vision algorithms, convolutional neural networks or manual labor. The problems
addressed are related to semantic segmentation, object detection, optical character recognition (OCR),
handwritten text recognition (HTR) and, more broadly, transcription, whether automatic or not.[1] The term
can also include the phase of digitizing the document using a scanner and the phase of interpreting the
document, for example using natural language processing (NLP) or image classification technologies. It is
applied in many industrial and scientific fields for the optimization of administrative processes, mail
processing and the digitization of analog archives and historical documents.

Background
Document processing was initially, and to some extent still is, a kind of production line work dealing with the
treatment of documents, such as letters and parcels, with the aim of sorting them or extracting data from them
on a large scale. This work could be performed in-house or through business process outsourcing.[2][3] Document
processing can indeed involve some kind of externalized manual labor, such as Mechanical Turk.

As an example of manual document processing, as recently as 2007,[4] the processing of
"millions of visa and citizenship applications" relied on "approximately 1,000 contract workers"
working to "manage mail room and data entry."

While document processing involved data entry via keyboard well before the use of a computer mouse or a
computer scanner, a 1990 article in The New York Times regarding what it called the "paperless office"
stated that "document processing begins with the scanner".[5] In this context, a former Xerox
vice-president, Paul Strassmann, expressed a critical opinion, saying that computers add to rather than
reduce the volume of paper in an office.[5] It was said that the engineering and maintenance documents
for an airplane weigh "more than the airplane itself".

Automatic document processing


As the state of the art advanced, document processing transitioned to handling "document components ...
as database entities."[6]
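
The idea of treating document components as database entities can be illustrated with a minimal sketch. The table layout, the column names and the sample document identifier below are assumptions made for illustration only; they are not taken from the cited source.

```python
import sqlite3


def store_components(conn, doc_id, components):
    """Store (kind, page, content) tuples as rows linked to one document."""
    # Hypothetical schema: one row per document component.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS component ("
        "id INTEGER PRIMARY KEY, doc_id TEXT, kind TEXT, "
        "page INTEGER, content TEXT)"
    )
    conn.executemany(
        "INSERT INTO component (doc_id, kind, page, content) "
        "VALUES (?, ?, ?, ?)",
        [(doc_id, kind, page, content) for kind, page, content in components],
    )
    conn.commit()


def components_of_kind(conn, doc_id, kind):
    """Return (page, content) pairs of one kind, in reading order."""
    rows = conn.execute(
        "SELECT page, content FROM component "
        "WHERE doc_id = ? AND kind = ? ORDER BY page, id",
        (doc_id, kind),
    )
    return rows.fetchall()


# Illustrative data: an in-memory database and a made-up document.
conn = sqlite3.connect(":memory:")
store_components(conn, "manual-001", [
    ("heading", 1, "Maintenance schedule"),
    ("paragraph", 1, "Inspect the landing gear every 500 flight hours."),
    ("heading", 2, "Parts list"),
])
headings = components_of_kind(conn, "manual-001", "heading")
```

Once components are stored this way, a query such as the one for headings can rebuild an outline without reopening the scanned pages.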

A technology called automatic document processing or sometimes intelligent document processing (IDP)
emerged as a specific form of Intelligent Process Automation (IPA), combining artificial intelligence
technologies such as Machine Learning (ML), Natural Language Processing (NLP) or Intelligent Character
Recognition (ICR) to extract data from several types of documents.[7][8]

Applications
Automatic document processing applies to a whole range of documents, whether structured or not. For
instance, in the world of business and finance, technologies may be used to process paper-based invoices,
forms, purchase orders, contracts, and currency bills.[9] Financial institutions use intelligent document
processing to process high volumes of forms such as regulatory forms or loan documents. IDP uses AI to
extract and classify data from documents, replacing manual data entry.[10]
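
The two steps named here, classification and data extraction, can be sketched on text that has already been transcribed. The sketch below is a deliberately simple rule-based stand-in (keyword counts and regular expressions), not the machine learning approach used by intelligent document processing products. The keyword lists, field patterns and sample invoice are all hypothetical.

```python
import re

# Hypothetical field patterns; real forms vary widely.
FIELD_PATTERNS = {
    "invoice_number": re.compile(
        r"Invoice\s*(?:No\.?|Number|#)\s*[:\-]?\s*([A-Z0-9\-]+)", re.IGNORECASE
    ),
    "date": re.compile(r"\b(\d{4}-\d{2}-\d{2})\b"),
    "total": re.compile(
        r"Total\s*(?:due)?\s*[:\-]?\s*\$?\s*([\d,]+\.\d{2})", re.IGNORECASE
    ),
}

# Hypothetical keyword lists used as a crude document classifier.
KEYWORDS = {
    "invoice": ("invoice", "amount due", "bill to"),
    "loan_application": ("loan", "borrower", "interest rate"),
}


def classify(text):
    """Pick the label whose keywords occur most often, or 'unknown'."""
    lowered = text.lower()
    scores = {
        label: sum(lowered.count(keyword) for keyword in keywords)
        for label, keywords in KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


def extract_fields(text):
    """Return the first match of each field pattern found in the text."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        if match:
            fields[name] = match.group(1)
    return fields


# Made-up transcription of a scanned invoice.
sample = """ACME Supplies
Invoice No: INV-2041
Date: 2021-03-09
Bill to: Example Bank
Total due: $1,250.00
"""
label = classify(sample)
fields = extract_fields(sample)
```

In practice such rules break down on unstructured or noisy documents, which is one reason trained models are preferred for this task.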

In medicine, document processing methods have been developed to facilitate patient follow-up and
streamline administrative procedures, in particular by digitizing medical or laboratory analysis reports. The
goal is also to standardize medical databases.[11] Algorithms are also directly used to assist physicians in
medical diagnosis, e.g. by analyzing magnetic resonance images,[12][13] or microscopic images.[14]

Document processing is also widely used in the humanities and digital humanities, in order to extract
historical big data from archives or heritage collections. Specific approaches were developed for various
sources, including textual documents, such as newspaper archives,[15] but also images,[16] or maps.[17][18]

Technologies

From the 1980s onward, traditional computer vision algorithms were widely used to solve document
processing problems,[19][20] but they were gradually replaced by neural network technologies in the
2010s.[21] However, traditional computer vision technologies are still used in some sectors, sometimes
in conjunction with neural networks.

Many technologies support the development of document processing, in particular optical character
recognition (OCR) and handwritten text recognition (HTR), which allow the text to be transcribed
automatically. The text segments themselves are identified using instance or object detection algorithms,
which can sometimes also be used to detect the structure of the document. Structure detection can also
rely on semantic segmentation algorithms.
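
As a rough illustration of how text segments can be located before transcription, the sketch below uses a horizontal projection profile, one of the traditional computer vision techniques for this problem rather than a neural detection or segmentation model. It assumes a binarized page given as rows of 0 (background) and 1 (ink); the function name and the tiny sample page are made up for the example.

```python
def find_text_lines(binary_image, min_ink=1):
    """Return inclusive (top, bottom) row ranges that contain ink.

    A row counts as part of a text line when it holds at least
    `min_ink` ink pixels; consecutive such rows form one line.
    """
    lines = []
    start = None
    for y, row in enumerate(binary_image):
        has_ink = sum(row) >= min_ink
        if has_ink and start is None:
            start = y  # a new text line begins
        elif not has_ink and start is not None:
            lines.append((start, y - 1))  # the line ended on the previous row
            start = None
    if start is not None:
        # The last line runs to the bottom edge of the image.
        lines.append((start, len(binary_image) - 1))
    return lines


# Made-up 7x6 page with two text lines separated by blank rows.
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
lines = find_text_lines(page)
```

This approach only works on straight, well-separated lines; skewed scans, multi-column layouts and handwriting are cases where learned detectors tend to do better.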

These technologies often form the core of document processing. However, other algorithms may intervene
before or after these processes. Document digitization technologies are also involved, whether in the
form of classical or three-dimensional scanning.[22] The digitization of 3D documents can, in particular,
rely on techniques derived from photogrammetry. Sometimes, specific 2D scanners must also be developed
to adapt to the size of the documents or for reasons of scanning ergonomics.[16] Document processing also
depends on the digital encoding of the documents in a suitable file format. Furthermore, the processing of
heterogeneous databases can rely on image classification technologies.

At the other end of the chain are various image completion, extrapolation or data cleanup algorithms. For
textual documents, the interpretation can use natural language processing (NLP) technologies.
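
One common form of data cleanup is the correction of transcription errors after OCR. The sketch below is a minimal example that snaps unknown tokens to the closest word in a reference vocabulary, using the `difflib` module of the Python standard library. The vocabulary, the similarity cutoff and the sample tokens are assumptions chosen for illustration.

```python
import difflib


def correct_tokens(tokens, vocabulary, cutoff=0.8):
    """Replace tokens missing from the vocabulary with their closest match.

    Tokens already in the vocabulary are kept unchanged. A replaced token
    takes the lowercase vocabulary form. Tokens with no match at or above
    `cutoff` are left as they are.
    """
    corrected = []
    for token in tokens:
        if token.lower() in vocabulary:
            corrected.append(token)
            continue
        matches = difflib.get_close_matches(
            token.lower(), vocabulary, n=1, cutoff=cutoff
        )
        corrected.append(matches[0] if matches else token)
    return corrected


# Hypothetical vocabulary and OCR output with typical confusions
# ("rn" read for "m", "1" read for "i").
vocabulary = ["document", "processing", "archive", "scanner"]
cleaned = correct_tokens(["docurnent", "processing", "arch1ve", "xyz"], vocabulary)
```

A fixed vocabulary is a strong assumption for historical sources, where spelling varies; statistical or neural language models are often used instead in that setting.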

See also
Document automation
Document modelling
Data processing
Document imaging
Duplex scanning
Text mining
Workflow

References
1. Len Asprey; Michael Middleton (2003). Integrative Document & Content Management:
Strategies for Exploiting Enterprise Knowledge (https://books.google.com/books?id=gYOpFl
MXcs0C&q=%22document+processing%22+ocr&pg=PA368). Idea Group Inc (IGI).
ISBN 9781591400554.
2. Vinod V. Sople (2009-05-25). Business Process Outsourcing: A Supply Chain of Expertises
(https://books.google.com/books?id=g4dxNB05dgoC&q=document+processing+bpo&pg=P
A47). PHI Learning Pvt. Ltd. ISBN 978-8120338159.
3. Mark Kobayashi-Hillary (2005-12-05). Outsourcing to India: The Offshore Advantage (https://
books.google.com/books?id=zdxbEwgfQzQC&q=%22document+processing%22+bpo&pg=
PA167). Springer Science & Business Media. ISBN 9783540247944.
4. Julia Preston (December 2, 2007). "Immigration Contractor Trims Wages"
(https://www.nytimes.com/2007/12/02/us/02immig.html). The New York Times.
5. Lawrence M. Fisher (July 7, 1990). "Paper, Once Written Off, Keeps a Place in the Office"
(https://www.nytimes.com/1990/07/07/business/paper-once-written-off-keeps-a-place-in-the-office.html).
The New York Times.
6. Al Young; Dayle Woolstein; Jay Johnson (February 1996). "Unknown Title". Object
Magazine. p. 51.
7. "Intelligent Document processing by Floriana Esposito, Stefano Ferilli, Teresa M. A. Basile,
Nicola Di Mauro" (http://www.di.uniba.it/~ndm/pubs/esposito05icdar.pdf) (PDF). Department
of Computer Science – University of Bari. 2005-04-07. Retrieved 2018-09-08.
8. Floriana Esposito; Stefano Ferilli; Teresa M. A. Basile; Nicola Di Mauro (2005-04-01).
"Intelligent Document Processing"
(https://www.computer.org/csdl/proceedings-article/icdar/2005/24201100/12OmNqIQS59).
Proceedings. Eighth International Conference on Document Analysis and Recognition,
Seoul, South Korea, 2005. pp. 1100–1104.
doi:10.1109/ICDAR.2005.144 (https://doi.org/10.1109%2FICDAR.2005.144).
S2CID 17302169 (https://api.semanticscholar.org/CorpusID:17302169).
9. US active US7873576B2 (https://patents.google.com/patent/US7873576B2/en), John E.
Jones; William J. Jones & Frank M. Csultis, "Financial document processing system",
published 2011-01-18, issued 2011-01-18
10. Bridgwater, Adrian. "Appian Adds Google Cloud Intelligence To Low-Code Automation Mix"
(https://www.forbes.com/sites/adrianbridgwater/2020/03/09/appian-adds-google-cloud-intelli
gence-to-low-code-automation-mix/). Forbes. Retrieved 2021-04-21.
11. Adamo, Francesco; Attivissimo, Filippo; Di Nisio, Attilio; Spadavecchia, Maurizio (February
2015). "An automatic document processing system for medical data extraction" (https://www.
sciencedirect.com/science/article/pii/S0263224114005016). Measurement. 61: 88–99.
Bibcode:2015Meas...61...88A (https://ui.adsabs.harvard.edu/abs/2015Meas...61...88A).
doi:10.1016/j.measurement.2014.10.032 (https://doi.org/10.1016%2Fj.measurement.2014.1
0.032). Retrieved 31 January 2021.
12. Changwan, Kim; Seong-Il, Lee; Won Joon, Cho (September 2020). "Volumetric assessment
of extrusion in medial meniscus posterior root tears through semi-automatic segmentation on
3-tesla magnetic resonance images" (https://www.sciencedirect.com/science/article/abs/pii/
S1877051720301994). Orthopaedics & Traumatology: Surgery & Research. 101 (5): 963–
968. doi:10.1016/j.rcot.2020.06.003 (https://doi.org/10.1016%2Fj.rcot.2020.06.003).
S2CID 225215597 (https://api.semanticscholar.org/CorpusID:225215597). Retrieved
31 January 2021.
13. Despotović, Ivana; Bart, Goossens; Wilfried, Philips (1 March 2015). "MRI Segmentation of
the Human Brain: Challenges, Methods, and Applications" (https://www.ncbi.nlm.nih.gov/pm
c/articles/PMC4402572). Computational Intelligence Techniques in Medicine. 2015: 963–
968. doi:10.1155/2015/450341 (https://doi.org/10.1155%2F2015%2F450341).
PMC 4402572 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4402572). PMID 25945121
(https://pubmed.ncbi.nlm.nih.gov/25945121).
14. Putzu, Lorenzo; Caocci, Giovanni; Di Ruberto, Cecilia (November 2014). "Leucocyte
classification for leukaemia detection using image processing techniques" (https://www.scie
ncedirect.com/science/article/pii/S0933365714001031). Artificial Intelligence in Medicine.
63 (3): 179–191. doi:10.1016/j.artmed.2014.09.002 (https://doi.org/10.1016%2Fj.artmed.201
4.09.002). hdl:11584/94592 (https://hdl.handle.net/11584%2F94592). PMID 25241903 (http
s://pubmed.ncbi.nlm.nih.gov/25241903).
15. Ehrmann, Maud; Romanello, Matteo; Clematide, Simon; Ströbel, Phillip; Barman, Raphaël
(2020). "Language Resources for Historical Newspapers: the Impresso Collection" (https://w
ww.zora.uzh.ch/id/eprint/191270/). Proceedings of the 12th Language Resources and
Evaluation Conference. Marseille, France. pp. 958–968.
16. Seguin, Benoit; Costiner, Lisandra; di Lenardo, Isabella; Kaplan, Frédéric (April 1, 2018).
"New Techniques for the Digitization of Art Historical Photographic Archives - the Case of
the Cini Foundation in Venice" (https://www.ingentaconnect.com/content/ist/ac/2018/000020
18/00000001/art00001). Archiving 2018 Final Program and Proceedings. Society for
Imaging Science and Technology. pp. 1–5. doi:10.2352/issn.2168-3204.2018.1.0.2 (https://d
oi.org/10.2352%2Fissn.2168-3204.2018.1.0.2).
17. Ares Oliveira, Sofia; di Lenardo, Isabella; Tourenc, Bastien; Kaplan, Frédéric (11 July 2019).
A deep learning approach to Cadastral Computing (https://infoscience.epfl.ch/record/26828
2). Digital Humanities Conference. Utrecht, Netherlands.
18. Petitpierre, Rémi (July 2020). Neural networks for semantic segmentation of historical city
maps: Cross-cultural performance and the impact of figurative diversity (https://www.researc
hgate.net/publication/343017681) (MSc). arXiv:2101.12478 (https://arxiv.org/abs/2101.1247
8). doi:10.13140/RG.2.2.10973.64484 (https://doi.org/10.13140%2FRG.2.2.10973.64484).
19. Fujisawa, H.; Nakano, Y.; Kurino, K. (July 1992). "Segmentation methods for character
recognition: from segmentation to document structure analysis" (https://ieeexplore.ieee.org/d
ocument/156471). Proceedings of the IEEE. 80 (7): 1079–1092. doi:10.1109/5.156471 (http
s://doi.org/10.1109%2F5.156471). Retrieved 3 February 2021.
20. Tang, Yuan Y.; Lee, Seong-Whan; Suen, Ching Y. (1996). "Automatic document processing:
a survey" (https://www.sciencedirect.com/science/article/abs/pii/S0031320396000441).
Pattern Recognition. 29 (12): 1931–1952. Bibcode:1996PatRe..29.1931T (https://ui.adsabs.
harvard.edu/abs/1996PatRe..29.1931T). doi:10.1016/S0031-3203(96)00044-1 (https://doi.or
g/10.1016%2FS0031-3203%2896%2900044-1). Retrieved 3 February 2021.
21. Ares Oliveira, Sofia; Seguin, Benoit; Kaplan, Frederic (5–8 August 2018). dhSegment: A
Generic Deep-Learning Approach for Document Segmentation (https://ieeexplore.ieee.org/d
ocument/8563218). 2018 16th International Conference on Frontiers in Handwriting
Recognition (ICFHR). Niagara Falls, NY, USA: IEEE. arXiv:1804.10371 (https://arxiv.org/ab
s/1804.10371). doi:10.1109/ICFHR-2018.2018.00011 (https://doi.org/10.1109%2FICFHR-20
18.2018.00011).
22. "Revolutionary Scanning Technology for Art" (https://artmyn.com/). Artmyn. Retrieved
3 February 2021.

Retrieved from "https://en.wikipedia.org/w/index.php?title=Document_processing&oldid=1154835533"
