A Review of Voice-Base Person Identification State-Of-The-Art


Covenant Journal of Engineering Technology (CJET) Vol. 3 No. 1, June 2019
ISSN: p 2682-5317 e 2682-5325 DOI: 10.20370/2cdk-7y54

An Open Access Journal Available Online

A Review of Voice-Base Person Identification: State-of-the-Art

Folorunso C. O.*1, Asaolu O. S.2 & Popoola O. P.3

1, 2, 3 Department of Systems Engineering, University of Lagos, Akoka, Lagos, Nigeria
*1 comfortfolorunso@gmail.com, 2 oasaolu@unilag.edu.ng, 3 toyin_net@yahoo.com
Received: 22.10.2018 Accepted: 11.04.2019 Date of Publication: June, 2019

Abstract - Automated person identification and authentication systems are
useful for national security, the integrity of electoral processes, the
prevention of cybercrime and many access control applications. They are a
critical component of information and communication technology, which is
central to national development. Biometric systems are fast replacing
traditional identification methods such as names, personal identification
numbers (PINs), codes and passwords, since nature bestows on individuals
distinct personal imprints and signatures. Different traits have been used
for person identification, ranging from the face to the fingerprint and so
on. This paper highlights the key approaches and schemes developed in the
last five decades for voice-based person identification systems. Voice-based
recognition has gained interest because its data acquisition is non-intrusive
and because such systems can increasingly learn from and adapt to changes in
a person's voice. Information on the benefits and challenges of various
biometric systems is also presented. The present and prominent voice-based
recognition methods are discussed. It was observed that the application areas
of these systems include intelligent monitoring, surveillance, population
management, election forensics, immigration and border control.
Keywords: Person-identification, biometrics, information and communication
technology, authentication, voice-base person identification.

Folorunso C.O., et al CJET (2019) 3(1) 38-60
URL: http://journals.covenantuniversity.edu.ng/index.php/cjet

1. Introduction

Person identification is the process of recognizing an individual based on specific characteristics. Identification by name is the most common method of distinguishing between persons. Conventional person identification technologies are based on “something we know” (a password or a personal identification number (PIN)) or “something we have” (a token, an identity card, an access control card or a Radio Frequency Identification (RFID) card), unlike “something we are” (i.e. bodily features such as fingerprint, finger-vein, voice and gait). These identification schemes have their limitations. For instance, a password or a PIN can easily be forgotten or even guessed by an impersonator, while a token or an identity card can be lost, stolen or even cloned. As such, the conventional forms of identification may not reliably distinguish between an authorized user and an impersonator [1].

The need to verify the identity of an individual before granting access to restricted places is very important, as hackers develop new means of breaching privacy every day. The value of state-of-the-art person identification technology based on biometrics cannot be over-emphasized, given the many advantages this type of recognition system offers. Biometrics is the science and technology of measuring, analysing and comparing human physiological and sociological data [2]. Many biometrics-based identification systems have been employed for both domestic and commercial applications. The use of biometrics has shown increasing significance in our daily life and, as such, many books, journals and conference proceedings have enumerated advances in the theory and application of this technology. In all, the use of biometrics has emerged as the best way of identifying an individual, since no two persons share entirely the same biometric traits [3].

With numerous developments in technology, the security sector has seen significant advancement. We can safely say that technology has transformed security. The challenge was how to make computers identify or recognise an individual. Biometrics, being based on measuring physical characteristics, uses features which are common and readily available to all classes of people. These features are distinct, easily collected and tested, and have high variability to cater for repetition of data class. Some of the biometric features currently used include: facial thermogram, hand vein, gait, keystroke, odour, ear, hand geometry, fingerprint, face, retina, iris, palm-print, voice, signature and DNA.

In a voice-based recognition system, the features of an individual’s voice are based on the physical characteristics of their vocal tract, nasal cavities and the articulators (which include the mouth, lips, teeth and so on) used in generating sound. These features are unchanging for an individual, though the behavioural features may change over time due to age, location, medical conditions and/or emotional state [4]. Voice-based recognition methods are classified into two types: Automatic Speaker Verification (ASV) and Automatic Speaker Identification (ASI) systems [4]. The limitations of conventional person identification systems have always been challenging, hence the need to research state-of-the-art biometric technologies and novel biometric features that can be proposed.

2. Review of related works

So much research has been done on person identification using biometrics technology. Biometric traits or characteristics are divided into two major categories. The first category is physiological characteristics, which are based on how one is, using data (features) derived from direct measurements of parts of the human body (fingerprints, face, finger geometry, iris scans, retina scans, hand geometry and finger-vein, among others), as shown in Figure 1. The second category is behavioural characteristics, which are based on what one does (a person’s action), using data derived from indirect measurement of a person’s action (voice recognition, keystroke scans, signature, gait and so on), as shown in Figure 2 [1], [5].

Figure 1: Selected physiological biometrics in use [3]

Figure 2: Selected behavioural biometrics in use [3]

A biometric authentication system can be further classified as either an identification or a verification system. An identification system is a one-to-many system in which biometrics is used to determine a person’s identity from the data of many people stored in a database. For example, the police may try to identify an individual’s fingerprint or face from a forensic database. Verification, on the other hand, is a one-to-one system in which biometrics is used to confirm a person’s claimed identity for access control [6], [7].

Kaur and Kaur [8] presented a brief survey of different voice biometrics for speaker verification in an attendance system. They proposed the use of voice after considering the different methods that have been employed for automatic student attendance, and argued that voice is an important and welcome option for this purpose. They used a Gammatone filter bank instead of a Mel filter bank, after which the discrete cosine transform was applied to separate overlaying signals. Gammatone frequency cepstral coefficients (GFCC) were used with the Gaussian Mixture Model and Artificial Neural Networks for the training and matching tasks respectively.

Tran et al [9] introduced a normalization technique based on fuzzy set theory to improve the performance of voice-based verification systems. To authenticate a claimed identity, a likelihood value was compared with a threshold in order to accept or reject the person. Noise clustering and the c-means clustering membership function were introduced to address the problem of ratio-type scores, which affects the false acceptance rate. The results showed a great reduction in the false acceptance and false rejection rates.

Delac and Grgic [4] and Dugelay et al [10] presented a concise summary of different biometric methods, covering single as well as multiple biometric systems. A voice-based biometric system uses those features of human speech that do not change for a particular individual, though the behavioural features of the same speech vary over time due to age, medical, emotional and environmental conditions. Voice-based biometric systems are classified into automatic speaker verification (ASV) and automatic speaker identification (ASI). The former uses voice as a validation characteristic in a one-to-one verification scenario, while the latter uses voice to recognise who a person truly is. A particular voice feature of an individual is matched against a stored pattern in a database. A typical voice feature can be formants or any other sound characteristics that are unique to each individual’s vocal tract.

Belin et al [11], working from a neuro-cognitive point of view, looked at the neural association of voice perception. The ability to estimate a person’s gender and age bracket from listening to their voice was a strong motivation behind this work. They treated the voice as an auditory face, by comparison with face recognition, and proposed Bruce and Young’s model of face perception as a framework for understanding the perceptual and cognitive processes involved in voice perception. The proposed model predicts functional dissociations similar to those observed for faces. For future work, the use of face archetypes to test the predictions of the proposed model was suggested. Similarly, more research is needed to answer questions such as the following:
- Are voices more noticeable than any other surrounding sound, irrespective of whether such sound contains speech or not?
- To what degree does the voice perception system permit us to extract identity and emotional information from the vocal expressions of other species such as cats, dogs and so on?
- What is the perceptual emergent of voices?

Aronowitz [12] noted that, given the rapid development of smart phones and mobile internet banking, there is an imperative need for a very strong authentication system. The current use of a password or pattern to unlock phones is not sufficient, considering the high risk of the device falling into the hands of an unauthorized user. Current developments in voice-based biometric systems present a huge potential for robust authentication on a mobile smart phone using voice. This is very crucial for the financial and banking industry, where organizations seek flexible mobile customer services and easy authentication while maintaining security and, most importantly, reducing fraudulent usage.

The author presented current text-independent and text-dependent speaker authentication technologies for user verification. Speaker-specific digit strings, global digit strings, prompted digit strings and text-independent authentication were evaluated using Joint Factor Analysis (JFA), Gaussian mixture models with nuisance attribute projection (GMM-NAP) and Hidden Markov Models (HMM) with NAP to improve the security as well as the reliability of the system.

Korshunov and Marcel [13] considered the fact that most biometric systems are vulnerable to spoofing, which limits their wide use; hence they presented the need to develop anti-spoofing methods, also referred to as presentation attack detection (PAD) systems. They presented an integration of PAD and Automatic Speaker Verification (ASV) systems in a hybrid ASV-PAD system based on i-vectors and local binary pattern histograms. The AVspoof database, which contains practical presentation attacks, was used to validate this method, and the results show a significantly enhanced resistance of the hybrid system to attacks at the cost of slightly degraded performance in scenarios without spoofing attacks. The use of a cascading scheme for score fusion, evaluated against the joint ASV-PAD system, was suggested for future research. In addition, state-of-the-art spoofing databases can be employed to explore multi-modal systems: ASV and PAD systems of different modalities, for instance speech and image, may be combined to enhance performance in both ‘licit’ and ‘spoof’ scenarios.

Krawczyk and Jain [14] considered the large shift from paper-based medical records towards electronic medical records. Guaranteeing the security of such private and highly sensitive data cannot be over-emphasized, since the health care giver only needs to edit and update a patient’s record on a tablet, personal
computer (PC) or even a smart phone; hence the need to protect the patient’s privacy as required by governmental regulations. A secure authentication system must be in place whenever such records are accessed, hence the need for biometric-based access. Their research showed that online signature integrated with voice is the most appropriate combination of modalities for users of such a verification system, since a tablet PC is built with the associated devices. A hybrid system of online signature and voice-based biometrics was evaluated in this work. A dynamic programming method of string matching was used for signature verification, while an off-the-shelf commercial software development kit was used for voice verification. Fusion of these two methods was carried out at the matching score level after normalization of the data. The prototype of this hybrid system was tested using a small, truly multi-modal database of fifty users, and an equal error rate (EER) of 0.86% was reported.

Mazaira-Fernandez et al [15] started from the observation that social media, although greatly improved by technology, is used by some people as a channel of terrorism to transmit their messages. In such a situation, typical biometric recognition methods such as face or fingerprint are replaced by other biometric traits such as voice, which may be readily available in such a scenario. They proposed gender-dependent extended biometry factors (GDEB). The GDEB factors classify features extracted from voice source and tract factors and other pertinent features such as formant data, bearing in mind that male and female voices show both acoustic-phonetic variations and physiological differences. The main idea was to improve the classification rate in speaker recognition using few parameters.

Scheffer et al [16] worked on two of the challenges facing voice biometrics technology: non-ideal recording conditions (often operational problems such as noise, echoes, voice channels and so on) and audio compression. They adapted SRI International’s innovations arising from the Intelligence Advanced Research Projects Activity (IARPA) Biometrics Exploitation Science and Technology (BEST) and the Defense Advanced Research Projects Agency (DARPA) Robust Automatic Transcription of Speech (RATS) projects. SRI’s approach uses various features extracted from speech, which are modelled using the latest machine learning algorithms. They recommended a general audio classification system that extracts metadata using i-vectors. Logistic regression and cross entropy were used to combine the various features that are automatically extracted from the audio signal at the i-vector level.

Bhogal et al [17] showed that voice is a more natural means of communication and that verbal communication is quicker and more efficient than textual communication. Considering a virtual environment,
they evaluated the use of virtual universe (VU) residents, also known as avatars, in online services employing audio biometrics. The method can include, for example, prompting client applications with a request for an utterance, processing the response to the request and developing a voiceprint or voice profile of the speaker or participant. This voiceprint can be associated with an avatar and, when an utterance is received, the avatar can be recognized by comparing the utterance with the voiceprint. Such comparison can be used to recognise avatars as they move from one online service to another and, as such, voice biometrics can be used to verify an avatar for a particular activity. The main idea of their work was to use a voiceprint to approve an operation limited to an authorized user. In other words, they demonstrated the possibility of using biometrics in internet-based activities.

Zhang et al [18] started from the inability of a system to recognize an individual from a non-frontal view. They presented the People In Photo Albums (PIPA) corpus, with very high variation in pose, clothing, image resolution, illumination and camera viewpoint. They proposed the Pose Invariant Person Recognition (PIPER) algorithm to tackle this problem. PIPER combines the signals of poselet-level person recognizers trained by Deep Convolutional Neural Networks (DCNN) to reduce the differences in pose, and then combines them with a global recognizer and a face recognizer. The results showed that this algorithm outperforms one of the best face recognizers (DeepFace).

Khitrov [19] at the Speech Technology Center reviewed the use of voice-based biometrics for data access and security. He identified some challenges and advantages of the system. However, voice can be combined with any other biometric technology with the aim of achieving near-100% accurate recognition and verification. Voice-based person identification systems can be used in the following areas to safeguard security and provide convenient user verification. These areas include:
1. Call centres and interactive voice response systems (IVRs), for example to obtain an account balance or medical result, which are private and highly sensitive information; once the system can recognize the caller’s voice, time is saved while these services are delivered to the rightful owners.
2. Mobile voice-based verification systems help in securing the sensitive and private information stored on smart devices.
3. For cloud computing and Bring Your Own Device (BYOD) applications, employees prefer to use their personal smart devices to access organization resources. There is a dire need to secure the organizational information on these smart devices. Hence the use of a voice signature presents a natural, appropriate and legally binding alternative to hand-written signatures.
4. Voice-based authentication systems can be employed to enhance the security of our social
networking platforms such as Facebook, LinkedIn and so on.

Vatsa et al [20] researched various biometric technologies and identified some serious problems affecting them, which they categorised into accuracy, computational speed, security, cost, real-time attacks and scalability. They also identified the various possible attacks on biometric technology; these include impersonation, coercion and replay attacks, as well as attacks on the feature extractor, template database, matcher and matching results, to mention a few. To improve the performance of biometric technology, they identified two major ways to protect biometric information from such attacks: encryption and watermarking. They introduced a three-level Redundant Discrete Wavelet Transform (RDWT) biometric watermarking algorithm to embed the voice biometric Mel-frequency cepstral coefficients (MFCC) in a colour face image of the same person for enhanced robustness, security and accuracy. In biometric watermarking, a particular quantity of information, known as the watermark, is embedded into the initial cover image using a private key such that the constituents of the cover image are not altered. The result obtained was over 90% verification accuracy.

2.1 Biometric system architecture

A biometric system for identification encompasses different stages, as shown in Figure 3. In recognition mode, the system executes a one-to-many comparison against a biometric database in an attempt to establish the identity of an unknown individual. The system will succeed in identifying the individual if the comparison of a biometric sample to a template in the database falls within a previously set threshold. This mode can be used for positive recognition (so that the user does not have to provide any information about the template to be used) or for negative recognition of the person "where the system establishes whether the person is who he or she (implicitly or explicitly) denies to be". The latter function can only be achieved through biometrics, since other methods of personal recognition such as passwords, PINs or cards are ineffective [6].

When an individual uses a biometric system for the first time, it is referred to as enrolment. During this process, biometric data from the individual is captured and saved. In subsequent uses, biometric data is sensed and compared with the data saved at the time of enrolment. The storage and retrieval of such data must be critically protected if the biometric system is to be robust. In Figure 3, the data acquisition phase is made up of sensors, which act as the link between the real world and the system; it has to obtain all the required data, e.g. an image or fingerprint. All necessary pre-processing of the raw signal (for voice, the raw audio) to remove sensor noise and unwanted sections is carried out in this phase using some kind of standardization. The second stage is the feature extraction phase, where highly descriptive and discriminative features are extracted and represented. This stage is a very important stage of
a biometric system architecture. At the third stage, a vector of numbers or an image with specific characteristics is used to build a template. A template is a representation of the most descriptive features. It is created and then stored in a database for future use. During the enrolment phase, the template is stored on a card, within a database or both. At the matching stage, the obtained input features are passed to a matcher that compares them with the stored templates and estimates the distance between them using a metric (e.g. Hamming distance). The decision, such as granting or denying access to a restricted area, is based on the outcome of this comparison [6].

The use of biometrics as a verification system consists of three phases. In the first phase, a sample of a biometric feature is captured, processed and stored in the database as a reference model for future use. In the second phase, some samples are compared with a reference model in order to see if the newly presented biometrics matches or

Figure 3: Biometrics system architecture, for identification process

Figure 4: Biometrics System Architecture, for Verification Process

not, after which a threshold is computed. During the third phase, access is given if the features match, or access is denied if the features do not match. Positive recognition is a common use of the verification mode, "where the aim is to stop many people from registering into the system by using similar identity but separate biometric data each time they enrol" [21].
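
As a rough illustration of the enrolment, verification (one-to-one) and identification (one-to-many) steps described in this section, the following Python sketch stores binary templates and compares a probe against them using the Hamming distance and a fixed threshold. The class name BiometricStore, the 8-bit templates and the threshold value are hypothetical choices made only for this example; a real system would derive templates from a feature extractor and tune the threshold on data.

```python
from typing import Dict, Optional, Tuple

# A template is modelled here as a fixed-length tuple of bits (an assumption
# for illustration; real templates come from a feature extraction stage).
Template = Tuple[int, ...]


def hamming_distance(a: Template, b: Template) -> int:
    """Count the positions at which two equal-length templates differ."""
    if len(a) != len(b):
        raise ValueError("templates must have the same length")
    return sum(x != y for x, y in zip(a, b))


class BiometricStore:
    """Hypothetical template database with a single decision threshold."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.templates: Dict[str, Template] = {}

    def enrol(self, user_id: str, template: Template) -> None:
        # Enrolment: capture once and save the reference template.
        self.templates[user_id] = template

    def verify(self, claimed_id: str, probe: Template) -> bool:
        # Verification: one-to-one comparison against the claimed identity.
        reference = self.templates.get(claimed_id)
        if reference is None:
            return False
        return hamming_distance(reference, probe) <= self.threshold

    def identify(self, probe: Template) -> Optional[str]:
        # Identification: one-to-many search for the closest template,
        # accepted only if it falls within the threshold.
        best_id: Optional[str] = None
        best_distance: Optional[int] = None
        for user_id, reference in self.templates.items():
            distance = hamming_distance(reference, probe)
            if best_distance is None or distance < best_distance:
                best_id, best_distance = user_id, distance
        if best_distance is not None and best_distance <= self.threshold:
            return best_id
        return None


store = BiometricStore(threshold=1)
store.enrol("alice", (1, 0, 1, 1, 0, 0, 1, 0))
store.enrol("bob", (0, 1, 0, 0, 1, 1, 0, 1))
probe = (1, 0, 1, 1, 0, 0, 1, 1)  # differs from alice's template in one bit
print(store.verify("alice", probe), store.identify(probe))
```

Raising the threshold makes the sketch accept more genuine users but also more impostors, which is the trade-off between false rejection and false acceptance discussed in the reviewed works.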

Figure 5: Word cloud of popular biometric signatures

Figure 5 shows the popularity of various biometrics used to develop person identification technologies. The popularity is determined by these key factors: universality, permanence, collectability, uniqueness, acceptability, reducibility, circumvention, privacy, performance and inimitability. Brief explanations of these factors are as follows:
i. Universality implies that every person must possess the characteristic,
ii. Permanence implies that the attributes should not change with time,
iii. Collectability requires that the properties must be suitable for capture and quantitative measurement without delay,
iv. Uniqueness means no two people should have the same characteristics,
v. Acceptability implies that the system must be accepted by the majority,
vi. Reducibility means the extracted data should be reducible to a file,
vii. Circumvention requires that the property should not be easily masked, manipulated or fooled,
viii. Privacy demands that the capturing process should not violate the privacy of the individual,
ix. Performance requires that the identification accuracy should be very high, and
x. Inimitability means the attribute should be irreproducible by other means.

2.2 Commonly used algorithm in biometrics technology

i. Artificial Neural Networks (ANN): ANNs are computational tools for analysing many complex real-world problems, modelled on biological neuronal networks, that is, on the structure and functions of neurons. An ANN is made up of layers of computing
up of layers of computing
neurons that are interconnected by weighted links and are capable of performing very large parallel computations for data processing. ANNs have been used in various areas ranging from speech recognition to prediction as well as in medicine [22]–[24].
ii. Gaussian Mixture Model (GMM): The GMM is a parametric probability density function, represented as a weighted sum of Gaussian component densities. It is mainly used for modelling continuous feature distributions in biometrics, such as vocal tract related spectral features in speaker recognition systems. It can also be used in automatic laughter recognition systems as well as hand geometry detection [25], [26].
iii. Support Vector Machine (SVM): SVM is an algorithm for solving binary classification and regression analysis problems. The algorithm maximizes the margin by defining an optimal hyperplane. It can also be used for non-linearly separable problems: it maps the data to a higher-dimensional space where it can be classified with linear decision surfaces. Its uses range from handwritten digit recognition to face recognition to medicine [25], [27].
iv. Hidden Markov Model (HMM): HMM is a statistical tool for modelling generative sequences characterized by a set of observable sequences. An HMM is made up of two stochastic processes: an invisible process of hidden states and a visible process of observable symbols. The hidden states form a Markov chain, and the probability distribution of the observed symbol depends on the underlying state. It is used for modelling time-varying spectral vector sequences [28]–[30].
v. Deep Neural Network (DNN): Deep learning is a machine learning approach that has been introduced to the area of large vocabulary speech recognition. A DNN has many more hidden layers than a conventional artificial neural network (ANN), and it incorporates new methods for training [30], [31].

The pros and cons of these algorithms are presented in Table 1.
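
To make the description of the GMM as a weighted sum of Gaussian component densities concrete, that is p(x) = sum over k of w_k N(x | mu_k, Sigma_k), a minimal Python sketch follows. It assumes diagonal covariances and hand-set parameters (in practice these would be estimated from enrolment speech, for example with expectation-maximization), and the speaker names and feature values are invented for illustration. A test utterance is assigned to the speaker model with the highest average log-likelihood.

```python
import math
from typing import Dict, List, Sequence, Tuple

# One mixture component: (weight, mean vector, variance vector).
# Diagonal covariance is an assumption made to keep the sketch short.
Component = Tuple[float, Sequence[float], Sequence[float]]


def log_gaussian_diag(x: Sequence[float], mean: Sequence[float],
                      var: Sequence[float]) -> float:
    """Log density of a Gaussian with diagonal covariance at point x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x: Sequence[float], components: List[Component]) -> float:
    """log sum_k w_k N(x | mu_k, var_k), computed with the log-sum-exp trick."""
    terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in components]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def score_utterance(frames: List[Sequence[float]],
                    components: List[Component]) -> float:
    """Average per-frame log-likelihood of an utterance under one model."""
    return sum(gmm_log_likelihood(f, components) for f in frames) / len(frames)


def identify_speaker(frames: List[Sequence[float]],
                     models: Dict[str, List[Component]]) -> str:
    """Return the name of the model that scores the utterance highest."""
    return max(models, key=lambda name: score_utterance(frames, models[name]))


# Hypothetical one-dimensional speaker models with two components each.
models = {
    "speaker_a": [(0.5, [0.0], [1.0]), (0.5, [1.0], [1.0])],
    "speaker_b": [(0.5, [5.0], [1.0]), (0.5, [6.0], [1.0])],
}
frames = [[0.2], [0.7], [0.4]]
print(identify_speaker(frames, models))
```

For verification rather than identification, the same score would instead be compared with a threshold for the claimed speaker's model.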

Table 1: Selected person identification algorithms pros and cons


Algorithm: i. ANN
  Pros:
    - It has been used to train very complex models.
    - It is easy to conceptualize.
    - It has lots of libraries for easy implementation.
    - It can be used extensively for academic research work.
  Cons:
    - It requires more data for training.
    - Choosing the right parameters may be very complex.
    - The final model may use up a lot of memory.
    - Multi-layered neural networks are usually harder to train and require tuning of many parameters [32].

Algorithm: ii. GMM
  Pros:
    - It is a lot more flexible with regard to cluster covariance.
    - It allows mixed membership of points to clusters.
  Cons:
    - Learning may be a little difficult.
    - There are few libraries for implementation [33].

Algorithm: iii. SVM
  Pros:
    - It reduces overfitting in the course of training.
    - It manages high dimensional problems.
    - It is used for linearly and non-linearly separable data.
    - Its optimality is guaranteed due to its convex optimization.
    - It can be implemented in various programming languages (Python, Matlab), and its libraries are easily accessible.
  Cons:
    - It can be time consuming to train.
    - It may not be explanatory enough as it does not return a probabilistic confidence value.
    - It does not support structured representations of text, which give high performance in Natural Language Processing (NLP) [34].

Algorithm: iv. HMM
  Pros:
    - It is highly flexible.
    - It is suitable for better data fitting.
    - It is highly statistical in nature.
    - It can learn from raw data.
  Cons:
    - It has a huge number of unstructured parameters.
    - It can only represent a small fraction of the entire distributions.
    - Dependencies between layers may not be explicitly represented [35], [36].

Algorithm: v. DNN
  Pros:
    - It continuously grows data to improve its training.
    - It creates its new features.
  Cons:
    - It requires a very large amount of unlabeled data for training.
    - It requires very high performance hardware for its implementation.
    - It requires more time to train [37].
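
The two stochastic processes described above for the HMM (a hidden Markov chain of states and a visible process of observed symbols) can be illustrated with the forward algorithm, which computes the probability of an observed symbol sequence by summing over all hidden state paths. The following is a minimal sketch with a hypothetical two-state model; the state names, symbols and probabilities are invented for illustration and are not taken from any of the reviewed systems.

```python
from typing import Dict, Sequence


def forward_probability(
    observations: Sequence[str],
    start_prob: Dict[str, float],
    trans_prob: Dict[str, Dict[str, float]],
    emit_prob: Dict[str, Dict[str, float]],
) -> float:
    """Return P(observations) under the HMM using the forward algorithm."""
    if not observations:
        raise ValueError("need at least one observation")
    states = list(start_prob)
    # Initialisation: alpha[s] = P(first symbol, first hidden state = s).
    alpha = {s: start_prob[s] * emit_prob[s][observations[0]] for s in states}
    # Induction: fold in one observed symbol at a time.
    for symbol in observations[1:]:
        alpha = {
            s: emit_prob[s][symbol]
            * sum(alpha[p] * trans_prob[p][s] for p in states)
            for s in states
        }
    # Termination: sum over the final hidden state.
    return sum(alpha.values())


# Hypothetical two-state model emitting two symbols.
START = {"voiced": 0.6, "unvoiced": 0.4}
TRANS = {
    "voiced": {"voiced": 0.7, "unvoiced": 0.3},
    "unvoiced": {"voiced": 0.4, "unvoiced": 0.6},
}
EMIT = {
    "voiced": {"high": 0.5, "low": 0.5},
    "unvoiced": {"high": 0.1, "low": 0.9},
}

print(forward_probability(["high", "low", "low"], START, TRANS, EMIT))
```

In a speaker recognition setting the symbols would be replaced by spectral feature vectors with continuous emission densities, and long sequences would normally be scored in the log domain to avoid numerical underflow.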

3. Timeline of Biometrics technology


The advancement of biometric recognition is still in progress, hence the need for
continuous improvements in performance and usability.


Figure 6: Biometrics timeline over the past one and a half centuries

3.1 Importance and application of biometrics technology timeline

The importance of biometrics technology cannot be over-emphasized. There are various areas of application of this technology around the world. Among the various applications are:
- As far back as 1792-1750 BC, the kings of Babylon used the imprint of their right hand to authenticate the written codes of law printed on clay and also for business transactions [38], [39].
- 618-907, the Chinese used both fingerprints and handprints on clay as authentication. They also employed palm prints and footprints to differentiate one child from another [39], [40].
- 1800, Thomas Bewick, an English naturalist, used the print of his fingerprint to identify his published books [41].
- 1856, Sir William Herschel, a British officer who worked for the Indian civil service, commenced the use of thumbprints on documents as an alternative to the written signature, mostly for illiterate and other people [39].

- 1892, Juan Vucetich, an Argentinean police researcher, used a fingerprint to identify a mother in the murder of her children; hence, Argentina was the first country to substitute fingerprints for anthropometry [42], [43].
- 1920s, fingerprint identification was employed by law enforcement all over the world, including the US military and the Federal Bureau of Investigation (FBI), as a form of identification [39].
- 1974, the first commercial hand geometry system for access control, time and attendance, and personal identification became available [40].
- 1975, the FBI sponsored the development of devices and minutiae-extraction expertise [44].
- 1996, iris scanning was implemented in a prison in the USA [45]. In addition, a hand geometry recognition system was implemented at the Olympic Games, where over 65,000 people were successfully enrolled [46].
- 1998, Nationwide Building Society in Britain introduced iris recognition in its cash dispensing machines as a substitute for the Personal Identity Number (PIN) [45].
- 1999, National Bank United in the USA introduced iris recognition technology at three ATM outlets in Houston, Dallas and Ft Worth [45].
- Late 1990s and early 2000s, biometrics was introduced for verification of the electorate during elections. As research in this area intensified, different biometric traits were used to verify voters. Somaliland was the first in the world to use an iris biometric voting system [47]. Other countries have also implemented biometric technology for elections, among them Kenya, Nigeria, Ghana and Angola, which use fingerprint biometric systems to identify registered voters. Many other biometric technologies have been used over the years all over the world [48].
- 2001, face recognition was used at the Super Bowl in Tampa, Florida [49]. Also, Malaysia was one of the first countries to implement a thumbprint on its national identity card. Australia included chips for biometric identification in its international passports. In addition, Canada used biometrics as an anti-terrorism measure [39].
- 2002, the Enhanced Border Security and Visa Entry Reform Act was signed into law. This act provides for the inclusion of biometric data in the passports of Visa Waiver Program travelers, such as those from Belgium and other European Union Member States [39].
- 2003, the US Government's National Science and Technology Council inaugurated a subcommittee on biometrics to coordinate biometrics research and development, policy, outreach and international collaboration [50]. Also, in December 2003, the United Linkers Company was founded in India. This company offers biometric solutions in car security, implementing various biometric systems ranging from voice to fingerprint [51].
- 2004, the US implemented biometrics in its immigration services under a program tagged US-VISIT. Also, the US government called for a compulsory government-wide personal identification card for all federal workers and contractors [50]. In addition, Europe adopted the use of biometric systems for passports and travel documents, using both facial images and fingerprints [39].
- 2005, OMRON Corporation showcased the world’s first face recognition technology at the Security Show Japan 2005. This technology can be deployed on Personal Digital Assistants (PDAs), mobile phones and other mobile devices with a camera function [52]. Moreover, seven European countries (Austria, Belgium, The Netherlands, France, Germany, Luxembourg and Spain) signed an agreement on the improvement of cross-border information exchange and data comparison using DNA profiles, fingerprints and vehicle registration data [39].
- 2008, the US Government commenced coordinating the use of biometric databases [40]. Also, [53] presented experimental results on the fusion of iris and palm print for a multi-biometric system. The results showed a substantial improvement over a single biometric system. In addition, the FBI set up a 1-billion-dollar database which consists of fingerprint, iris, facial image and DNA samples [39].
- 2010, the US national security apparatus employed biometrics for terrorist identification [40]. In addition, SBTel and Devices commenced the sale of biometric devices for attendance and access control solutions in Nigeria [50].
- 2011, biometric identification was employed in the identification of Osama Bin Laden. Also, SBTel, a biometric device vendor, launched web-based biometric attendance for a cloud-based attendance system [50].
- 2013, Apple included a fingerprint scanner in its smartphones [40].
- In January 2017, Australia’s Department of Immigration and Border Protection announced a strategy to deploy a novel system at the country’s international airports by 2020. This biometric system makes use of face, iris and/or fingerprint recognition and is intended to replace the existing paper identity cards and passports [54]. In addition, in January of the same year, President Donald Trump ordered that all non-citizens be subjected to biometric checks when entering or leaving the United States [54]. Moreover, in August 2017, First-Biometrics announced the use of Next Biometrics' flexible fingerprint sensor for smart payment and identity cards [54].

3.2 Evolution of voice-based recognition system
The initial objective of speech recognition was to develop a system that imitates a person’s speech communication ability.
- 1773 – The Russian scientist Christian Kratzenstein, a professor of physiology in Copenhagen, created vowel sounds using resonance tubes attached to organ pipes [55].
- 1791 – Wolfgang von Kempelen in Vienna developed the acoustic-mechanical speech machine [55].
- Mid 1800s – Charles Wheatstone constructed a version of von Kempelen’s speaking machine employing leather resonators, such that different speech-like sounds could be produced by changing the configuration by hand [55].
- 1879 – Thomas Edison invented the first dictation machine [56].
- 1930s – Homer Dudley developed a speech synthesizer called the Voice Operating Demonstrator (VODER), which was an electrical equivalent of Wheatstone’s work [55].
- 1952 – Davis, Biddulph and Balashek of Bell Laboratories developed Audrey, a system for isolated digit recognition for a single talker [55].
- 1950s – Olson and Belar of RCA Laboratories constructed a ten (10) syllable recognizer for a single talker [55].
- 1959 – Fry and Denes of University College in England constructed a phoneme recognizer which identifies four (4) vowels and nine (9) consonants [55].
- 1960s – Atal and Itakura formulated the basic theory of Linear Predictive Coding (LPC), which simplified the estimation of the vocal tract response from the speech waveform [55].
- 1962 – The IBM Shoebox was developed; it could understand 16 English words [56].
- 1971 – Alexander Waibel developed the Harpy machine at Carnegie Mellon University. This machine could understand 1,011 words with some phrases [56].
- 1970s – Tom Martin started a leading speech recognition company named Threshold Technology, Inc., which built the first Automatic Speech Recognition (ASR) product, known as the VIP-100 system. Also, Lenny Baum of Princeton University developed a mathematical approach to speech recognition called the Hidden Markov Model (HMM). In addition, the application of pattern recognition technology to speech identification systems using LPC was introduced [55].
- 1980s – Artificial Neural Networks were introduced for speech; hence, speech recognition moved towards prediction.
- 1986 – IBM Tangora was developed, employing the HMM to predict the next phonemes in speech. This system was the world’s fastest typist at the time, as it was able to recognize up to 20,000 English words and a number of sentences [56].
- 1990s – The emergence of Automatic Speech Recognition (ASR) [55].
- 1997 – The world’s leading continuous speech recognizer was announced as Dragon NaturallySpeaking. This software was able to understand 100 words per minute without any pause between the words [55].
- 2006 – The National Security Agency (NSA) commenced the use of speech recognition to identify key terms in recorded speech [56].
- 2008 – Google introduced a voice search app on mobile devices [56].
- 2011 – Apple launched Siri, a voice-based digital assistant [56].

3.3 Benefits and challenges of biometrics systems
Despite this progress, a number of challenges continue to restrain the full potential of biometrics to automatically recognize humans. Some specific challenges, grouped by biometric type, are presented in Table 2 [6].
The need for a reliable system for person identification cannot be overemphasized, and state-of-the-art biometric identification systems are taking over access control and many security establishments. These systems have been in existence for over five decades, though they only started gaining attention in recent times. However, many challenges need to be addressed for these systems to be robust and more accurate (Table 2). A major challenge of these systems is the spoofing attack, in which a person effectively impersonates another person. This is often done in order to gain illegal access to a facility or system.

3.4 Areas of implementation of voice-based recognition system
1. Attendance systems.
2. Auto texting on mobile phones.
3. Forensics.
4. Banking and financial institutions.
5. Recognition using one’s indigenous language.
6. Medicine.
7. Assisting the aged in text composition, bank transactions and many more.
8. According to Boyd [56], “cloud-based computers have entered millions of homes and can be controlled by voice, even offering conversational responses to a wide range of queries”.
9. Voice-activated home speakers.
10. Purchases made over the internet using voice-enabled machines.
According to Akhtar et al. [57], the US National Institute of Standards and Technology enumerates the susceptibility of biometrics to spoofing in its National Vulnerability Database. Most of the existing biometric systems are susceptible to this attack. For instance, iris, fingerprint and face images taken from impersonators were found to be identical to those of the real users. In order to curb this attack, there is a need to introduce liveness detection into biometric systems. This is an area of future research. In addition, the combination of different biometric systems, that is, multi-modal biometric technology, has the potential to reduce this attack [58-60]. Thus, more studies are required in this direction. Finally, biometric trait databases are needed in order to validate new biometric systems such as those based on the tongue, ear, sneeze, cough, laughter and so on.
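As a rough illustration of why a multi-modal combination can make spoofing harder, the sketch below applies weighted-sum fusion at the score level: each modality produces a match score in [0, 1], and access is granted only if the fused score clears a threshold. The function names, weights, scores and threshold are hypothetical values chosen for illustration; they are not taken from the cited studies.

```python
def fuse_scores(scores, weights):
    """Weighted-sum (score-level) fusion of per-modality match scores in [0, 1]."""
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same modalities")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Normalise by the total weight so the fused score stays in [0, 1].
    return sum(scores[m] * weights[m] for m in scores) / total


def accept(scores, weights, threshold=0.7):
    """Grant access only when the fused score reaches the (assumed) threshold."""
    return fuse_scores(scores, weights) >= threshold


# Hypothetical equal weighting of two modalities.
WEIGHTS = {"voice": 0.5, "face": 0.5}

# A genuine user matches well on both traits.
genuine = {"voice": 0.90, "face": 0.85}
# An impostor replays a recorded voice (high score) but cannot match the face.
spoofed = {"voice": 0.95, "face": 0.20}

print(accept(genuine, WEIGHTS))
print(accept(spoofed, WEIGHTS))
```

With these assumed numbers, the replayed voice alone would pass a voice-only check, but the fused score falls below the threshold because the second trait does not match. Real systems would normally pair such fusion with liveness detection rather than rely on it alone.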

Table 2: Benefits and challenges of Biometrics systems

4. Conclusions
Even though there are some challenges facing biometrics technology, it has tremendous usage in this information technology age. The choice of biometric type depends on the characteristics to be measured and on user requirements. Other factors which affect the choice of a biometric are sensor and device availability, computational time and reliability, cost, sensor size and power consumption. In addition, cultural disposition also affects biometric technology selection. Currently, due to the diverse nature of biometric applications, no single biometric trait is likely to be ideal and satisfy all the requirements of all applications; hence, the use of multiple biometric features to ensure very high reliability is essential. Investigating new biometric features for identification is an area where more research is needed.
Technology has advanced from paper to electronic and online records, in areas ranging from medical records to the banking and financial industry as well as social media. The huge and sensitive data involved must be adequately protected from being hacked. Voice-based recognition has been considered a very appropriate biometric for this task, given its speed, efficiency and customer relation characteristics.
According to [56], “Voice is the future. The world’s technology giants are clamouring for vital market share, with ComScore agitating that 50% of all searches will be voice searches come 2020.”

Acknowledgements
The authors wish to acknowledge Ighravwe D. E. for taking his time to peer review this work.

References
[1] A. K. Jain, K. Nandakumar, and A. Ross, “50 years of biometric research: Accomplishments, challenges, and opportunities,” vol. 79, pp. 80–105, 2016.
[2] T. Reuters, “Global directory,” 2017. [Online]. Available: https://blogs.thomsonreuters.com/answerson/biometrics-technology-convenience-data-privacy/. [Accessed: 03-Oct-2018].
[3] D. Carlson, “Biometrics - Your Body as a Key,” 2014. [Online]. Available: http://www.dynotech.com/articles/biometrics.shtml. [Accessed: 04-May-2017].
[4] K. Delac and M. Grgic, “A survey of biometric recognition methods,” 46th Int. Symp. Electron. Mar., 2004.
[5] A. M. Bojamma, B. Nithya, and C. N. Prasad, “An Overview of Biometric System,” Int. J. Comput. Sci. Eng. Inf. Technol. Res. ISSN 2249-6831, vol. 3, no. 2, pp. 153–160, 2013.
[6] R. B. Jadhao and A. Bakshi, “An Overview of Biometric System,” Int. J. Sci. Res. Educ., vol. 3, no. 7, 2015.
[7] B. P. Salil and W. L. Damon, “Biometric authentication and identification using keystroke dynamics: A survey,” J. Pattern Recognit. Res., vol. 7, no. 1, pp. 116–139, 2012.
[8] J. Kaur and S. Kaur, “A Brief Review: Voice Biometric For Speaker Verification in Attendance Systems,” Imp. J. Interdiscip. Res., vol. 2, no. 10, 2016.
[9] D. Tran, M. Wagner, Y. W. Lau, and M. Gen, “Fuzzy methods for voice-based person authentication,” IEEJ Trans. Electron. Inf. Syst., vol. 124, no. 10, pp. 1958–1963, 2004.
[10] J.-L. Dugelay, J.-C. Junqua, C. Kotropoulos, R. Kuhn, F. Perronnin, and I. Pitas, “Recent advances in biometric person authentication,” in International Conference on Acoustics, Speech, and Signal Processing (ICASSP), IEEE, 2002, vol. 4, pp. 4060–4063.
[11] P. Belin, S. Fecteau, and C. Bedard, “Thinking the voice: neural correlates of voice perception,” Trends Cogn. Sci., vol. 8, no. 3, pp. 129–135, 2004.
[12] H. Aronowitz, R. Hoory, J. Pelecanos, and D. Nahamoo, “New developments in voice biometrics for user authentication,” in Twelfth Annual Conference of the International Speech Communication Association, 2011.
[13] P. Korshunov and S. Marcel, “Joint operation of voice biometrics and presentation attack detection,” 8th Int. Conf. Biometrics Theory, Appl. Syst. (BTAS), ieeexplore.ieee.org, 2016.
[14] S. Krawczyk and A. K. Jain, “Securing electronic medical records using biometric authentication,” in International Conference on Audio-and Video-Based Biometric Person Authentication, 2005, pp. 1110–1119.
[15] L. M. Mazaira-Fernandez, A. Álvarez-Marquina, and P. Gómez-Vilda, “Improving Speaker Recognition by Biometric Voice Deconstruction,” Front. Bioeng. Biotechnol., vol. 3, Sep. 2015.
[16] N. Scheffer, L. Ferrer, A. Lawson, Y. Lei, and M. McLaren, “Recent developments in voice biometrics: Robustness and high accuracy,” in International Conference on Technologies for Homeland Security (HST), IEEE, 2013, pp. 447–452.
[17] K. S. Bhogal, A. Rick, D. Kanevsky, and C. A. Pickover, “Using voice biometrics across virtual environments in association with an avatar’s movements,” US Patent, Google Patents, 2012.
[18] N. Zhang, M. Paluri, Y. Taigman, R. Fergus, and L. Bourdev, “Beyond frontal faces: Improving person recognition using multiple cues,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 4804–4813.
[19] M. Khitrov, “Talking passwords: voice biometrics for data access and security,” Biometric Technol. Today, vol. 2, pp. 9–11, 2013.
[20] M. Vatsa, R. Singh, and A. Noore, “Feature based RDWT watermarking for multimodal biometric system,” Image Vis. Comput., vol. 27, no. 3, pp. 293–304, 2009.
[21] P. Ghosh and R. Dutta, “A new approach towards biometric authentication system in palm vein domain,” Int. J. Adv. Innov. IJAITI, vol. 1, no. 2, pp. 1–10, 2012.
[22] O. N. A. Al-Allaf, “Review of face detection systems based artificial neural networks algorithms,” arXiv Prepr. arXiv:1404.1292, 2014.
[23] F. Amato et al., “Artificial neural networks in medical diagnosis,” J. Appl. Biomed., vol. 11, pp. 47–58, 2013.
[24] A. S. George, E. Roy, A. Antony, and M. Job, “An Efficient Gait Recognition System for Human Identification using Neural Networks,” Int. J. Innov. Adv. Comput. Sci., vol. 6, no. 5, 2017.
[25] J. R. Pinto, J. S. Cardoso, A. Lourenço, and C. Carreiras, “Towards a Continuous Biometric System Based on ECG Signals Acquired on the Steering Wheel,” Sensors, vol. 17, p. 2228, 2017.
[26] J. S. Edwards, “Using Gaussian Mixture Model and Partial Least Squares regression classifiers for robust speaker verification with various enhancement methods,” Rowan University, Rowan Digital Works, 2017.
[27] C. D. Smyser et al., “Prediction of brain maturity in infants using machine-learning algorithms,” Neuroimage, vol. 136, pp. 1–9, 2016.
[28] U. Jerome, “Acoustic Laughter Processing,” TCTS Lab. Publ. Univ. Mons., 2014.
[29] Ç. Hüseyin, U. Jérôme, and D. Thierry, “Synchronization rules for HMM-based audio-visual laughter synthesis,” in IEEE International Conference on Acoustics Speech and Signal Processing ICASSP, 2015.
[30] A. Hannun et al., “Deep speech: Scaling up end-to-end speech recognition,” arXiv Prepr. arXiv:1412.5567, 2014.
[31] F. Lingenfelser, J. Wagner, J. Deng, R. Bruckner, B. Schuller, and E. Andre, “Asynchronous and Event-based Fusion Systems for Affect Recognition on Naturalistic Data in Comparison to Conventional Approaches,” IEEE Trans. Affect. Comput., 2016.
[32] L. Matthew, “Deep Learning,” 2014. [Online]. Available: https://www.quora.com/What-are-the-pros-and-cons-of-neural-networks-from-a-practical-perspective-Personal-comments-from-heavy-users-welcome. [Accessed: 03-Jul-2018].
[33] O. Wibisono, “Deep Learning,” 2016. [Online]. Available: https://www.quora.com/What-are-the-advantages-to-using-a-Gaussian-Mixture-Model-clustering-algorithm. [Accessed: 03-Jul-2018].
[34] A. Ng, “Deep Learning,” 2016. [Online]. Available: https://www.quora.com/What-are-some-pros-and-cons-of-Support-Vector-Machines. [Accessed: 03-Jul-2018].
[35] P. J. Rani, “Advantages and disadvantages of hidden markov model,” 2016. [Online]. Available: https://www.slideshare.net/joshiblog/advantages-and-disadvantages-of-hidden-markov-model. [Accessed: 04-Jul-2018].
[36] K. Tuzcuoglu, “Deep Learning,” 2018. [Online]. Available: https://www.quora.com/What-are-the-advantages-and-disadvantages-of-Hidden-Markov-Models-in-forecasting-values-of-a-time-series-compared-to-other-methods-e-g-ARIMA-Can-we-say-that-HMM-is-a-Non-Statistical-learning-methods. [Accessed: 03-Jul-2018].
[37] O. Maslovska, “Deep Learning: Definition, Benefits, and Challenges,” 2017. [Online]. Available: https://stfalcon.com/en/blog/post/deep-learning-benefits-and-challenges. [Accessed: 03-Jul-2018].
[38] J. Ashbourn, “The social implications of the wide scale implementation of biometric and related technologies,” Backgr. Pap. Inst. Prospect. Technol. Stud. DG Jt. Res. Centre, Eur. Comm., 2005.
[39] E. J. Kindt, “Privacy and data protection issues of biometric applications,” Springer, 2016.
[40] S. Mayhew, “History of Biometrics,” 2016. [Online]. Available: http://www.biometricupdate.com/201501/history-of-biometrics. [Accessed: 09-May-2017].
[41] K. Fauci, “Prezi Presentations,” 2014. [Online]. Available: https://prezi.com/5nzrgdiroqzn/1800s-thomas-bewick-an-english-naturalist-used-engraving/.
[42] USA-Govt, “Visible Proofs, U.S. National Library of Medicine,” vol. 2017. U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda, MD 20894, 2014.
[43] D. A. Glaeser, “The Fingerprint, in the Service of the SWISS Confederation.” Federal Office of Police fedpol, 2013.
[44] S. M. Martinez, “The FBI-Federal Bureau of Investigation,” 2013. [Online]. Available: https://archives.fbi.gov/archives/news/testimony/overview-of-fbi-biometrics-efforts.
[45] K. Chadwick, J. Good, G. Kerr, F. McGee, and F. O’Mahony, “Biometric Authentication For Network Access And Network Applications,” 2001. [Online]. Available: http://ntrg.cs.tcd.ie/undergrad/4ba2.02/biometrics/now.html.
[46] N. Duta, “A survey of biometric technology based on hand shape,” Pattern Recognit., vol. 42, no. 11, pp. 2797–2806, 2009.
[47] TeleSur, “Somaliland: 1st in World to Use Iris Scanner Technology to Stem Voter Fraud,” 2017. [Online]. Available: https://www.telesurtv.net/english/news/Somaliland-1st-in-World-to-Use-Iris-Scanner-Technology-to-Stem-Voter-Fraud-20171114-0006.html.
[48] P. Wolf, A. Alim, B. Kasaro, P. Namugera, M. Saneem, and T. Zorig, “Introducing Biometric Technology in Elections,” Voter Information, Commun. Educ. Netw. Glob. Knowl. Netw. Voter Educ., 2017.
[49] L. K. Nadeau, “Tracing the History of Biometrics,” 2012. [Online]. Available: http://www.govtech.com/Tracing-the-History-of-Biometrics.html.
[50] Solutions-Broadway-Digest, “History of Biometrics,” 2017. [Online]. Available: https://www.sbtelecoms.com/history-of-biometrics-conclusion/.
[51] United-Linkers-Biometric-and-Robotic-Solutions, “World’s First Face Recognition Biometric for Mobile Phones,” 2017. [Online]. Available: http://automobile-security.com.
[52] World’s-First-Face-Recognition-Biometric-for-Mobile-Phones, “World’s First Face Recognition Biometric for Mobile Phones,” 2005. [Online]. Available: https://phys.org/news/2005-03-world-recognition-biometric-mobile.html.
[53] V. C. Subbarayudu and M. V. N. K. Prasad, “Multimodal biometric system,” in Emerging Trends in Engineering and Technology, 2008. ICETET’08. First International Conference on, 2008, pp. 635–640.
[54] J. Lee, “Biometric Update.com,” 2016. [Online]. Available: http://www.biometricupdate.com/201701/biometrics-to-replace-passports-at-australian-airports.
[55] B.-H. Juang and L. R. Rabiner, “Automatic speech recognition–a brief history of the technology development,” Georg. Inst. Technol. Atlanta Rutgers Univ. Univ. California. St. Barbar., vol. 1, p. 67, 2005.
[56] C. Boyd, “The startup,” 2018. [Online]. Available: https://medium.com/swlh/the-past-present-and-future-of-speech-recognition-technology-cf13c179aaf. [Accessed: 03-Oct-2018].
[57] Z. Akhtar, C. Micheloni, and G. L. Foresti, “Biometric liveness detection: Challenges and research opportunities,” IEEE Secur. Priv., vol. 13, no. 5, pp. 63–72, 2015.
[58] K. O. Okokpujie, S. N. John, E. Noma-Osaghae, C. Ndujiuba, and I. P. Okokpujie, “An enhanced voters registration and authentication application using iris recognition technology,” International Journal of Civil Engineering and Technology (IJCIET), vol. 10, no. 2, pp. 57–68, Feb. 2019.
[59] K. Okokpujie, E. Noma-Osaghae, O. Okesola, O. Omoruyi, C. Okereke, S. John, and I. P. Okokpujie, “Fingerprint Biometric Authentication Based Point of Sale Terminal,” in International Conference on Information Science and Applications, Springer, Singapore, Jun. 2018, pp. 229–237.
[60] K. Okokpujie, E. Noma-Osaghae, O. Okesola, O. Omoruyi, C. Okereke, S. John, and I. P. Okokpujie, “Integration of Iris Biometrics in Automated Teller Machines for Enhanced User Authentication,” in International Conference on Information Science and Applications, Springer, Singapore, Jun. 2018, pp. 219–228.
