Professional Documents
Culture Documents
Expiry Date Extraction Pre Print
Expiry Date Extraction Pre Print
Abstract—In this article we evaluate the impact of possibility of including product validity and expi-
using two image pre-processing approaches with the ration date within the barcode [4], however, this
objective of aiding an Optical Character Recognition requires the usage of an extended barcode format
(OCR) software in correctly retrieving an expiry on the product. Many retail stores still use Bar-
date from an image of a product containing it. In
codes including only the Global Trade Identification
particular, we analyze the impact of finding the
rotation angle of an image using the Hough transform
Number (GTIN). It becomes then necessary to
and the impact of image binarization using adaptive acquire the expiry date of a product displayed on a
Gaussian threshold. We attempt to further increase shelf by means other than barcode reading.
OCR accuracy through a sliding window approach. The digitization of the expiry date brings many
Our results show that applying the Hough transform advantages to both donors and beneficiaries. The
noticeably improves OCR performance with minimal donors can testify to the validity of their donated
impact on the execution time. material for legal reasons and the beneficiaries are
able to know in advance the state of the donated
I. I NTRODUCTION
food to better prepare for receiving it and avoid
The growing number of people living in relative unnecessary waste. This can be achieved by Opti-
poverty are leading to an increase of food insecurity cal Character Recognition (OCR) software. In this
in Europe. Cities and regions are devising local paper we explore the impact of pre-processing on
policies to ensure food security for their inhabitants OCR performance.
and promote resilient food systems. Such policies The rest of the paper is organized as follows.
have led to a growth of food recovery initiatives [1]. Section II gives a survey of approaches to expiry
At the same time, food waste is exerting a pressing date recognition present in the literature as well
challenge in the design of sustainable food systems as similar problems. In Sec. III we introduce the
[2]. Recent studies suggested the relevance of food Hough transform and other image pre-processing
waste induced emissions, water consumption, land approaches we adopted. In Sec. IV we introduce
use, and related economic and social impacts, repre- the algorithms for rotation correction using Hough
senting the third emitter globally, with an estimated transform and sliding window approach to Optical
cost of about 940 billion USD [3]. Consequently, Character Recognition (OCR) while, in Sec. V, we
food waste is now at the core of several interna- comment the results in terms of algorithm execution
tional policy agenda including the UN Sustainable time and OCR accuracy. Finally, Sec. VII concludes
Development Goals and the European plan for a the paper.
Circular Economy. Within this context it becomes
imperative to keep track of the expiration date of II. R ELATED W ORKS
food products in order to identify and reduce waste. Other works have already attempted the extrac-
New standards of barcode such as the GS1 tion of expiry dates from images of products. In
DataBar, also known as EAN-128, introduce the [5] the authors use an approach based on Stretched
Gabor Feature Extraction and a multi-layer dense
neural network for character recognition. The main
difference with our work is that the OCR algorithm
was designed specifically for expiry dates while
we used a generic OCR and studied the impact
of image pre-processing techniques to improve its
performance. Another approach is shown in [6]
where the authors focus on the aspect of localizing
the expiry date from images of a product. The
authors implement an approach where the image
is directly passed through an OCR at 8 different
angles and 3 magnification levels. Our approach
differs since we use Hough transform to immedi- (a) Original Image (b) Hough transformed
ately find the rotation angle of text in the picture
[7]. The problem of detecting the expiry dates from
images shares many similarities with the problem
of reading car licence plate from traffic cameras.
One interesting approach to this problem is shown
in [8] where the authors train a neural network
on an artificially enlarged dataset but that work
differs greatly from our approach as we take a
general purpose OCR software and study the impact
of image pre-processing algorithms on its perfor-
mance. Usage of Hough transform to generally
improve OCR performance outside of the specific
problem of extracting expiry dates has already been (c) Binarized Image (d) Binarized and Hough
implemented in [9], [10]. transformed image