
INTERNSHIP REPORT

DEVELOP AN OCR SOFTWARE

Host Company:
AMEN BANK
Created by:
Wissem Ellefi

Summer of 2022
Acknowledgement

On this occasion, I would like to thank all the people who brought me here and gave me
the strength and courage to accomplish this work, especially the AMEN BANK team, who
welcomed me during this internship. I am grateful to my parents for their quiet support,
their prayers, and their faith in me, which empowered me to complete this task.
Table of Contents

CHAPTER 1: INTRODUCTION
1.1. Introduction
1.2. Features of the Balance-Sheet
1.2.1. What is a Balance-Sheet
1.2.2. Balance-Sheet Components
1.2.3. Balance-Sheet Example
1.3. Enterprise Presentation
1.4. Report Structure
CHAPTER 2: RESEARCH QUESTIONS
2.1. Introduction
2.2. Problem Description
2.3. Proposed Solution
2.4. Solution Analysis
2.5. Research Questions
CHAPTER 3: OBJECTIVE OF THE STUDY
CHAPTER 4: LITERATURE REVIEW
4.1. Introduction
4.2. The SSD Algorithm
4.2.1. SSD Review
4.2.2. SSD Architecture
4.3. Optical Character Recognition (OCR)
4.3.1. OCR Review
4.3.2. OCR Architecture
CHAPTER 5: STUDY FRAMEWORK
5.1. Introduction
5.2. CRISP-DM
5.3. CRISP-DM Architecture
CHAPTER 6: RESULTS AND ANALYSIS
CONCLUSION
List of Figures
Figure 1: page 1 of the Balance-Sheet
Figure 2: page 2 of the Balance-Sheet
Figure 3: page 3 of the Balance-Sheet
Figure 4: AMEN BANK Headquarters
Figure 5: SSD Architecture
Figure 6: Optical Character Recognition Architecture
Figure 7: CRISP-DM Architecture
Figure 8: 1st Screenshot of the Application
Figure 9: 2nd Screenshot of the Application
CHAPTER 1: INTRODUCTION
1.1. Introduction

Transferring scanned text from an image to a computer-readable form has always been
one of the attractive research challenges. The objective of text recognition is to develop
a robust and accurate system capable of achieving human-level performance in reading.
Off-line text recognition applications can speed up data entry and reduce the possibility
of human error by avoiding the retyping of a captured document. Such a system takes as
input a raster image of text captured by a scanner or from a screen and converts it into
machine-editable text. Consequently, several computing disciplines are involved in text
recognition, including image and signal processing, pattern recognition, natural language
processing, and information systems and databases. Although researchers have
investigated the field intensively, existing systems still have not achieved human reading
capabilities. Automatic off-line text recognition is inherently difficult because of the
great variability of writing and printing styles: letters appear in different sizes and
skews, and strokes vary in width and shape. The research presented in this report is a
contribution toward the creation of a reliable recognition system for an official
document, the balance sheet.

1.2. Features of the Balance-Sheet

1.2.1. What is a Balance-Sheet

The balance sheet is one of the three fundamental financial statements and is key to
both financial modeling and accounting. It displays the company’s total assets and how
those assets are financed, either through debt or equity. It can also be referred to as a
statement of net worth or a statement of financial position.

This project deals with the French version of the balance sheet (Bilan Financier).

1.2.2. Balance-Sheet Components

The balance sheet is divided into three major sections. The first section outlines the
company’s assets, which include all property and rights owned by the business.

The second section outlines the company’s liabilities and shareholders’ equity. The third
is the income statement, a key financial document of the business: it shows what the
business earns, what it spends, and whether it makes a profit over a given period.

The balance sheet gives the value of each component for the current year and for the
previous year; we are interested in extracting the values of the current year.
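As a minimal sketch of what keeping only the current-year values could look like, the snippet below parses French-formatted amounts and builds a label-to-value mapping. It is an illustration, not the application's actual code: the names `parse_amount` and `current_year_values` are hypothetical, and it assumes each row has already been split into three cells (label, current year, previous year) with the current-year column first.

```python
from typing import Dict, Iterable, Tuple


def parse_amount(text: str) -> float:
    """Convert a French-formatted amount such as "1 250 000,50" to a float.

    Assumption: spaces (regular or non-breaking) separate thousands, a comma
    marks the decimals, and parentheses denote a negative amount.
    """
    cleaned = text.strip()
    negative = cleaned.startswith("(") and cleaned.endswith(")")
    cleaned = cleaned.strip("()")
    for space in (" ", "\u00a0", "\u202f"):
        cleaned = cleaned.replace(space, "")
    value = float(cleaned.replace(",", "."))
    return -value if negative else value


def current_year_values(rows: Iterable[Tuple[str, str, str]]) -> Dict[str, float]:
    """Keep only the current-year amount of each (label, current, previous) row."""
    return {label: parse_amount(current) for label, current, _previous in rows}


if __name__ == "__main__":
    rows = [("Total des actifs", "10 000,000", "9 500,000")]
    print(current_year_values(rows))
```

In practice the column order should be checked against the year printed in the table header rather than assumed.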
1.2.3. Balance-Sheet Example

Figure 1: page 1 of the Balance-Sheet


Figure 2: page 2 of the Balance-Sheet

Figure 3: page 3 of the Balance-Sheet

1.3. Enterprise Presentation

Amen Bank was founded in 1966, as a result of the independence from the “Crédit
Foncier d’Algérie et de Tunisie” (CFAT), a local branch of the French banking system
“Société Centrale de Banque” (later known as “Société Générale”), which was established
as far back as 1880 and headquartered in Algeria. In 1966, it changed its name to
“Crédit Foncier et Commercial de Tunisie” (CFCT).

The CEO was Ismail Zouiten, yet all its shareholders were French citizens. In 1971, it
was bought by the “Banque Générale d'Investissement”, later known as PGI Holding,
and opened to Tunisian shareholders as Rachid Ben Yedder became the new CEO. In
1995, it changed its name again to Amen Bank. In 2009, Amen Bank launched Tunisia's
first online bank. In 2015, it launched Tunisia's first online direct bank. Amen Bank
also made a request to the Central Bank of Tunisia to create a subsidiary specialized in
Islamic banking and finance. Its headquarters are in Tunis, Tunisia.

Figure 4: AMEN BANK Headquarters

1.4. Report Structure

This report is organized as follows:

· Chapter 2 presents the research questions on text-image recognition that the
project sets out to answer.

· Chapter 3 discusses the objective of the study: what the research is trying to
achieve and why we are pursuing it.

· Chapter 4 presents the literature review, covering previously published work on
object detection and, in particular, on optical character recognition.

· Chapter 5 describes the study framework: the methodology used to solve the
problem and the key project dates.

· Chapter 6 presents the results and analysis of the work: the resulting application
and some screenshots of it in use.

· The Conclusion reviews the results and presents the system's limitations and
future work.
CHAPTER 2: RESEARCH QUESTIONS
2.1. Introduction

This chapter is composed of four sections, organized as follows:

• Problem Description presents an overview of the issue.

• Proposed Solution describes the idea proposed to solve that problem.

• Solution Analysis discusses the requirements and the impact of the proposed solution.

• Research Questions states the questions our project sets out to answer.

2.2. Problem Description

The bank agent responsible for handling balance sheets must read each one and then
manually enter the extracted information into the database, typing each value separately
(about 50 values). This is a significant waste of time and resources.

2.3. Proposed Solution

Build software that accurately extracts balance-sheet information using optical
character recognition, eliminating the time lost to manual entry.

2.4. Solution Analysis

To build this software successfully, we must first work cross-functionally (across multiple
departments within the bank) to understand the business objective of the project, its
requirements, and the elements of the balance sheet.

Various tools and technologies available on the market can be used to perform this task.
After discussing the project's dependencies with the bank, we decided to develop the AI
system (which is responsible for the OCR task) in Python, the backend in Java with
Spring Boot, and the frontend with Angular.

Technologies used for this project:

· Python 3.10.5
· PyTorch with support of CUDA
· EasyOCR
· NumPy
· pdf2image
· OpenCV
· Flask
· JDK 11
· Oracle Database
· Spring Boot
· ANGULAR 13.3.9
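
To show how the Python libraries listed above could fit together, here is a hedged sketch of the OCR side only (the Flask, Spring Boot, and Angular layers are left out). It is not the application's actual code: the function names, the 300 dpi setting, and the 0.5 confidence threshold are assumptions made for illustration. The heavy libraries are imported lazily, and the two stages are passed in as parameters so the flow can be exercised without them.

```python
from typing import Callable, List, Tuple

# One OCR detection: (bounding box, recognised text, confidence score).
Detection = Tuple[list, str, float]


def render_pdf(pdf_path: str) -> list:
    """Rasterise each PDF page with pdf2image (needs Poppler installed)."""
    from pdf2image import convert_from_path

    return convert_from_path(pdf_path, dpi=300)  # dpi value is an assumption


def recognise_page(page) -> List[Detection]:
    """Convert one page to grayscale with OpenCV, then run EasyOCR on it."""
    import cv2
    import easyocr
    import numpy as np

    gray = cv2.cvtColor(np.array(page), cv2.COLOR_RGB2GRAY)
    # Creating a Reader loads the models; a real service would create it once.
    reader = easyocr.Reader(["fr"])
    return reader.readtext(gray)


def extract_text(
    pdf_path: str,
    render: Callable[[str], list] = render_pdf,
    recognise: Callable[[object], List[Detection]] = recognise_page,
    min_confidence: float = 0.5,  # hypothetical threshold
) -> List[str]:
    """Return the recognised strings of every page, dropping weak detections."""
    texts: List[str] = []
    for page in render(pdf_path):
        for _box, text, confidence in recognise(page):
            if confidence >= min_confidence:
                texts.append(text)
    return texts
```

Passing the stages in as arguments is a design choice for this sketch: it lets the control flow be tested with stand-in functions before the OCR models are installed.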

2.5. Research Questions

Is this application capable of completing these tasks successfully, and does it
effectively reduce the time lost to manual entry?
CHAPTER 3: OBJECTIVE OF THE STUDY
The aim of this work is to develop a sentence recognition system inspired by the human
reading process. Cognitive studies have observed that humans tend to read one word at a
time: the reader considers the global word shape and uses contextual knowledge to infer
and discriminate a word among other possible words. The sentence recognition system is a
fully integrated system: a word-level recognizer (the baseline system) integrated with a
linguistic-knowledge post-processing module. The baseline system is a holistic word-based
recognition approach characterized as a probabilistic ranking task. The output of the
system is a set of multiple recognition hypotheses (an N-best word lattice). The basic
unit is the word rather than the character; the system relies on segmentation and requires
baseline and object detection. The application is therefore an OCR system that depends on
object detection.

The goal is to build software that accurately extracts balance-sheet information using
optical character recognition, then injects the extracted information into the database
automatically, eliminating the time lost to manual entry.


CHAPTER 4: LITERATURE REVIEW
4.1. Introduction

This chapter gives an overview of previously published work in the fields of object
detection (the SSD algorithm) and optical character recognition.

4.2. The SSD Algorithm

4.2.1. SSD Review

SSD stands for Single Shot MultiBox Detector.

The SSD paper presents a method for detecting objects in images using a single deep
neural network. This approach discretizes the output space of bounding boxes into a set
of default boxes over different aspect ratios and scales per feature map location. At
prediction time, the network generates scores for the presence of each object category in
each default box and produces adjustments to the box to better match the object shape.
Additionally, the network combines predictions from multiple feature maps with different
resolutions to naturally handle objects of various sizes. The SSD model is simple relative
to methods that require object proposals because it eliminates proposal generation and
the subsequent pixel or feature resampling stages and encapsulates all computation in a
single network. This makes SSD easy to train and straightforward to integrate into systems
that require a detection component. Experimental results on the PASCAL VOC, MS COCO, and
ILSVRC datasets confirm that SSD has comparable accuracy to methods that use an
additional object proposal step and is much faster, while providing a unified framework
for both training and inference.

The original SSD paper is available at the link below:

https://arxiv.org/pdf/1512.02325.pdf
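
The default boxes described above can be sketched as follows. This is an illustrative reconstruction of the scale and aspect-ratio rule from the SSD paper (scales spaced linearly between 0.2 and 0.9, width and height derived from the aspect ratio, one extra square box per location), not the detector used in the application. The function names and the reduced set of aspect ratios are assumptions.

```python
import math
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (cx, cy, w, h), relative to the image


def scale_for_layer(k: int, m: int, s_min: float = 0.2, s_max: float = 0.9) -> float:
    """Scale of the default boxes on feature map k (1-based) out of m maps."""
    return s_min + (s_max - s_min) * (k - 1) / (m - 1)


def default_boxes(
    fmap_size: int,
    k: int,
    m: int,
    aspect_ratios: Sequence[float] = (1.0, 2.0, 0.5),
) -> List[Box]:
    """Generate the default boxes of one square feature map."""
    s_k = scale_for_layer(k, m)
    # One box per aspect ratio a: width s_k * sqrt(a), height s_k / sqrt(a).
    shapes = [(s_k * math.sqrt(a), s_k / math.sqrt(a)) for a in aspect_ratios]
    # Extra square box whose scale lies between this layer and the next one.
    # For the last layer this extrapolated scale exceeds s_max.
    s_extra = math.sqrt(s_k * scale_for_layer(k + 1, m))
    shapes.append((s_extra, s_extra))

    boxes: List[Box] = []
    for i in range(fmap_size):  # row index
        for j in range(fmap_size):  # column index
            cx = (j + 0.5) / fmap_size
            cy = (i + 0.5) / fmap_size
            boxes.extend((cx, cy, w, h) for w, h in shapes)
    return boxes


if __name__ == "__main__":
    boxes = default_boxes(fmap_size=4, k=1, m=6)
    print(len(boxes), boxes[0])
```

Each feature-map cell therefore carries one box per aspect ratio plus the extra square one; the network predicts class scores and offsets relative to these fixed boxes.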
4.2.2. SSD Architecture

Figure 5: SSD Architecture

4.3. Optical Character Recognition (OCR)

4.3.1. OCR Review

Optical character recognition based on artificial intelligence relies on a Convolutional
Neural Network (CNN) model. The OCR model uses text recognition to extract and process
the text found in images. What makes OCR valuable is its ability to recognize text across
different images that use the same font, and this is where deep-learning-based character
recognition excels.

The OCR process starts with text localization, followed by character segmentation. Once
these steps are done, the system performs character recognition to produce the final
text.
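
Once localization and recognition have produced a set of text boxes, a table-like document such as a balance sheet still has to be reassembled into rows. The sketch below groups detections into lines in reading order. It is an illustration only: it assumes each detection is a (box, text, confidence) triple whose box is four [x, y] corner points, which is the shape EasyOCR returns by default, and the function name and the 10-pixel tolerance are hypothetical.

```python
from typing import List, Sequence, Tuple

# (four [x, y] corner points, recognised text, confidence)
Detection = Tuple[Sequence[Sequence[float]], str, float]


def _center_y(box: Sequence[Sequence[float]]) -> float:
    """Vertical centre of a box given by its corner points."""
    return sum(point[1] for point in box) / len(box)


def _left_x(box: Sequence[Sequence[float]]) -> float:
    """Leftmost x coordinate of a box."""
    return min(point[0] for point in box)


def group_into_lines(
    detections: Sequence[Detection], y_tolerance: float = 10.0
) -> List[List[str]]:
    """Group detections into rows (top to bottom), each sorted left to right."""
    ordered = sorted(detections, key=lambda det: _center_y(det[0]))
    lines: list = []  # each entry: [reference_y, [detections on that row]]
    for det in ordered:
        y = _center_y(det[0])
        if lines and abs(y - lines[-1][0]) <= y_tolerance:
            lines[-1][1].append(det)  # same row as the previous detection
        else:
            lines.append([y, [det]])  # start a new row
    return [
        [det[1] for det in sorted(members, key=lambda det: _left_x(det[0]))]
        for _ref_y, members in lines
    ]


if __name__ == "__main__":
    sample = [
        ([[200, 10], [300, 10], [300, 30], [200, 30]], "1 000,000", 0.90),
        ([[10, 12], [150, 12], [150, 32], [10, 32]], "Total des actifs", 0.95),
        ([[10, 60], [150, 60], [150, 80], [10, 80]], "Capital", 0.90),
    ]
    print(group_into_lines(sample))
```

A fixed pixel tolerance is the simplest choice; on scans of varying resolution it would make more sense to derive it from the median box height.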


4.3.2. OCR Architecture

Figure 6: Optical Character Recognition Architecture


CHAPTER 5: STUDY FRAMEWORK
5.1. Introduction

This chapter describes the methodology used to solve the problem and build an accurate
solution: the Cross-Industry Standard Process for Data Mining (CRISP-DM).

5.2. CRISP-DM

Data science is a complex process in which projects involve a wide variety of
stakeholders, data sources, and goals. To maintain order, we used the CRISP-DM
methodology, one of the most popular methodologies in use to date. It is split into six
phases and is iterative, which means that the steps can be repeated as often as necessary
until we reach our goals.

1. Business Understanding:

It is important at this point that both the data scientist and the stakeholders are on the
same page, in order to prevent a model that does not achieve its intended goal, and also so
that everyone understands and has the same expectations regarding the outcome of the
model being built.

2. Data Understanding:

Once we have a firm understanding of the objectives of our project, it is only natural
that we must have the same understanding of the data available to us. It is important to
know not only the information contained in our data but also where our data is coming
from.


3. Data Preparation:

At this stage, we understand our data well enough to begin preparing it for modeling.

4. Modeling:

Once our data has been sufficiently cleaned and prepared, we can begin to create models
with it. At this point it is important to consider the type of model we are hoping to
build, guided by our understanding of our objectives and the data at hand.

5. Evaluation:

During this phase we take a step back and evaluate the models we have made. The
evaluation stage is important because it not only helps us assess the current progress
and success of our model but also gives us the opportunity to see whether new insights can
be derived from our work.

6. Deployment:

At this point the goal is to work with the stakeholders to determine how to put the model
into production.
5.3. CRISP-DM Architecture

Figure 7: CRISP-DM Architecture


CHAPTER 6: RESULTS AND ANALYSIS
The result is a program that extracts information from the balance-sheet document
automatically instead of manually. Below are some screenshots of the final application.

Figure 8: 1st Screenshot of the Application

As Figure 8 shows, on this page the user fills in the form and chooses the PDF file
containing the balance sheet. Clicking ‘Traiter’ (“Process”) passes the file to the OCR
system for processing.

Figure 9: 2nd Screenshot of the Application

Figure 9 shows the results of the OCR system.

The program has an accuracy of 90%, which we consider reliable.
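
The report does not state how the accuracy figure was measured. One plausible definition, sketched here purely as an assumption, is field-level exact match: the share of balance-sheet fields whose extracted value equals the value a human read from the document. The function name and the sample labels are hypothetical.

```python
from typing import Dict


def field_accuracy(expected: Dict[str, float], extracted: Dict[str, float]) -> float:
    """Share of expected fields whose extracted value matches exactly.

    A field that is missing from `extracted` counts as an error.
    """
    if not expected:
        raise ValueError("expected must contain at least one field")
    correct = sum(
        1 for label, value in expected.items() if extracted.get(label) == value
    )
    return correct / len(expected)


if __name__ == "__main__":
    truth = {f"field_{i}": float(i) for i in range(10)}
    ocr_output = dict(truth)
    ocr_output["field_3"] = 99.0  # simulate one misread value
    print(field_accuracy(truth, ocr_output))
```

Other definitions, such as character-level accuracy, would give different numbers for the same output, so the chosen metric should be stated alongside the figure.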


CONCLUSION
This phase of our data science project included a description of the work context and
methodology. We also discussed the data science objective of the project, including the
components of the balance sheet, in order to have a clear idea of the work to be done in
the earlier stages, and the design, where we detailed the system architecture and
described the software environment that helped us achieve our goals. We then presented
the algorithms we worked with and how they work. It was also essential to provide AMEN
BANK with a user-friendly interface.

The project was technically beneficial, as we were able to master the entire value chain
of a data science project using multiple tools and technologies. Computer vision is a very
broad and very promising field with several areas of application, such as optical
character recognition. Being a specialist engineer in this field requires a sense of
analysis, design, and organization, as well as a sense of commitment to the community and
a willingness to assume all assigned responsibilities, given the crucial importance of
this discipline for the strategy of the organization.
