Literature Review

VOICE VERIFICATION
LITERATURE SURVEY REPORT
SUBMITTED BY: RANA MUHAMMAD BILAL

COURSE: ADVANCED DIGITAL SYSTEM DESIGN
INSTRUCTOR: DR. REHAN HAFIZ
ABSTRACT
Voice processing is an emerging research area with many
applications in security and automation. A typical voice processing
system implements feature extraction, storage and feature
matching to characterize voice sources, store their particulars
and later match features of a sample from a claiming user
against his previous record. The most common feature extraction
technique in voice processing is Mel Frequency Cepstrum
Coefficients (MFCC). For efficient storage and retrieval,
techniques such as vector quantization and the LBG algorithm
(for code book generation) are available. Similarly, a number of
options are available for feature matching, such as Euclidean
distance and correlation. This paper describes a top-level block
design of a Voice Verification System that uses MFCC, LBG and
Euclidean distance. Calculation of MFCC is further detailed into
next-level blocks: Hamming Window, FFT, Mel Frequency
Filter Bank and DCT. A literature survey of computation
algorithms available for the DCT is then presented.
TABLE OF CONTENTS
1. DESCRIPTION OF PROJECT
a. OVERVIEW
b. VOICE PROCESSOR
i. DATA PATH
ii. Control Logic
c. VOICE VERIFICATION ALGORITHM
i. To enroll new user
ii. Current User Login
d. PROJECT PARTITIONING
2. APPLIED DCT ALGORITHMS
a. INTRODUCTION TO MFCC FUNCTIONAL BLOCK
b. BRIEF NOTE ON DCT
c. DCT IMPLEMENTATION ALGORITHMS
i. CHEN ET AL ALGORITHM
ii. LEE ALGORITHM
iii. LOEFFLER ALGORITHM
iv. LIU AND CHIU ALGORITHM
d. SUMMARY OF DCT IMPLEMENTATION ALGORITHMS
3. REFERENCES
DESCRIPTION OF PROJECT
a. OVERVIEW
The proposed hardware design for “Voice Verification” includes a Voice
Processor, RAM, Microphone, Analog to Digital Converter, Liquid Crystal
Display, Keypad, Storage medium (magnetic disc, tape or optical disc) and a
main Controller that manages all these resources to implement the desired
functionality. The interconnection of these blocks is depicted in the figure below.
[Figure: system block diagram. Blocks: MIC, ADC, Controller, LCD, Keypad,
RAM, Voice Processor, Storage.]
When the system starts, two options are displayed on the LCD, from which the
user selects using the Keypad. These options are:
1. Current User Login
2. Enrolment Administration
If the user selects Enrolment Administration, the controller presents the
features Add new enrolment and Delete/Modify previous enrolments. The features
of concern in this paper are Add new enrolment and Current user login.
When the user selects the Add new enrolment option, the controller asks him to
enter his user name through the keypad and stores it in memory. The
controller then generates signals to initiate a sequence of operations that
captures a voice sample of the new user, extracts characteristic features
(MFCC, LBG) from it and stores these features against the user name.
Similarly, when a user requests authentication against his user name, the
controller generates the necessary sequence of signals to capture a voice sample,
extract characteristic features and match them to the previously stored features.
b. VOICE PROCESSOR
The Voice Processor is the core functional element in this architecture. The
design of this block is divided into data path and control logic, each of which
is discussed separately below.
i. DATA PATH
The data path includes a Floating Point Unit (FPU), RAM, two registers
(A and B), a Tri-State Buffer and a bus connecting the data pins of the
RAM to the inputs of the A and B registers and the output of the
Tri-State Buffer. The FPU supports only multiply and add operations.
The operands of the FPU are the outputs of the A and B registers. The
output of the FPU is transferred to and stored in RAM through the bus
by enabling the Tri-State Buffer. The entire operation of the data path
is dictated by a control word (or instruction) that is generated by the
control logic and includes the concerned RAM address, the Read/Write
signal of the RAM, the Tri-State Buffer's Enable, the Load signals of
the A and B registers and the operation code for the FPU.
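The data path described above can be modelled in a few lines of software. The following is only a behavioural sketch under stated assumptions: the field names of `ControlWord`, the string opcodes and the one-word-per-step timing are hypothetical and are not taken from the design itself.

```python
from dataclasses import dataclass


@dataclass
class ControlWord:
    """One instruction for the data path (field names are illustrative)."""
    address: int                 # RAM address concerned
    write: bool = False          # RAM Read/Write signal (True = write)
    buffer_enable: bool = False  # Tri-State Buffer drives the bus with the FPU output
    load_a: bool = False         # latch the bus into register A
    load_b: bool = False         # latch the bus into register B
    opcode: str = "add"          # FPU operation: "add" or "mul"


class DataPath:
    def __init__(self, ram_size=16):
        self.ram = [0.0] * ram_size
        self.a = 0.0
        self.b = 0.0

    def fpu(self, opcode):
        # The FPU supports only add and multiply.
        if opcode == "add":
            return self.a + self.b
        if opcode == "mul":
            return self.a * self.b
        raise ValueError("unsupported FPU opcode: " + opcode)

    def step(self, cw):
        # The bus carries either the FPU result (buffer enabled) or RAM data.
        bus = self.fpu(cw.opcode) if cw.buffer_enable else self.ram[cw.address]
        if cw.write:
            self.ram[cw.address] = bus
        if cw.load_a:
            self.a = bus
        if cw.load_b:
            self.b = bus


# Example: RAM[2] = RAM[0] * RAM[1], expressed as three control words.
dp = DataPath()
dp.ram[0], dp.ram[1] = 3.0, 4.0
program = [
    ControlWord(address=0, load_a=True),
    ControlWord(address=1, load_b=True),
    ControlWord(address=2, write=True, buffer_enable=True, opcode="mul"),
]
for word in program:
    dp.step(word)
```

In this model every functional unit of the control logic would simply be a generator of such control-word sequences.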
ii. Control Logic

The control logic is further divided into three functional units, namely
MFCC, LBG and EUCLIDEAN DISTANCE CALCULATOR. Each functional
unit implements its function by generating a sequence of
appropriate control words that process data from RAM in the FPU and
store the results back in RAM. Authority to generate the control word
(that is, to control the data path) is granted to the desired unit by
selection through a Multiplexer, which is operated by the Main
Controller. The Main Controller also issues the flag signal “Start” to
the desired unit and receives the status signal “Done”.
In brief, the Main Controller implements the User Interface (with the
help of the LCD and Keypad) and manages the sequence of operation of
its three subunits MFCC, LBG and EUCLIDEAN DISTANCE. It also manages
the operation of the external functional unit ADC, which samples voice
through the Microphone and stores it in RAM.
MFCC, LBG and EUCLIDEAN DISTANCE perform their respective
operations when instructed and when given authority over the data path
by the Main Controller.
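[Figure: Voice Processor block diagram. Blocks: MFCC, LBG and EUCLIDEAN
DISTANCE CALCULATOR units, each with START and DONE signals; MUX with
SELECT input; CONTROL WORD; FPU with A and B registers and BUFFER;
memory labelled Scratch Pad-1, Scratch Pad-2, Scratch Pad-3, STORAGE and
SAMPLE MEMORY.]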
c. VOICE VERIFICATION ALGORITHM
The hardware described above is versatile enough to house a range of voice
processing algorithms. A brief description of the intended algorithm follows.
i. To enroll new user

A voice sample is captured. The sample is sliced on the time axis and
each slice is passed through a Hamming window. Each windowed slice is
Fourier transformed. The transformed magnitude spectrum is squared to
estimate power. The power spectrum is passed through a Mel Frequency
Filter Bank to simulate human hearing characteristics. The output of
the Mel Frequency Filter Bank is mapped onto a log scale and Discrete
Cosine Transformed to generate Mel Frequency Cepstrum Coefficients
(MFCC). The Linde, Buzo and Gray (LBG) algorithm is applied to the
calculated MFCC to determine a region around the sample MFCC where
other sample MFCCs from this user may lie. The coordinates of this
region, termed the Sample Finger Print, are stored in memory against
this user.
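The MFCC part of the enrolment sequence above can be sketched in software as follows. This is a minimal illustration that assumes NumPy floating-point arithmetic rather than the control-word driven FPU; the sample rate, frame length, hop size, filter count and the mel-scale formula (2595 log10(1 + f/700), one common convention) are all assumptions chosen for the example, not values from the design. The LBG step is not shown.

```python
import numpy as np


def hz_to_mel(f):
    # One common mel-scale convention (an assumption, not from the report).
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filter_bank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    n_bins = n_fft // 2 + 1
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    bin_idx = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    bank = np.zeros((n_filters, n_bins))
    for m in range(1, n_filters + 1):
        left, centre, right = bin_idx[m - 1], bin_idx[m], bin_idx[m + 1]
        for k in range(left, centre):       # rising edge of the triangle
            bank[m - 1, k] = (k - left) / (centre - left)
        for k in range(centre, right):      # falling edge of the triangle
            bank[m - 1, k] = (right - k) / (right - centre)
    return bank


def dct_ii(v, n_coeffs):
    """Unnormalized DCT-II, keeping the first n_coeffs outputs."""
    n = len(v)
    k = np.arange(n_coeffs)[:, None]
    i = np.arange(n)[None, :]
    return np.cos(np.pi * (2 * i + 1) * k / (2 * n)) @ v


def mfcc(signal, sample_rate=8000, frame_len=256, hop=128,
         n_filters=20, n_coeffs=12):
    window = np.hamming(frame_len)
    bank = mel_filter_bank(n_filters, frame_len, sample_rate)
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len] * window   # Hamming window
        power = np.abs(np.fft.rfft(frame)) ** 2            # FFT, squared magnitude
        log_mel = np.log(bank @ power + 1e-10)             # mel bank, log scale
        features.append(dct_ii(log_mel, n_coeffs))         # DCT gives the MFCCs
    return np.array(features)


# Example: MFCCs of a synthetic 440 Hz tone (one row per time slice).
tone = np.sin(2 * np.pi * 440 * np.arange(1024) / 8000)
coefficients = mfcc(tone)
```

The small constant added before the logarithm only guards against taking the log of zero for an empty filter.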
ii. Current User Login

The user is asked for a voice sample, and the sequence of operations
described above is carried out to calculate its finger print. The
Euclidean distance between the new finger print and the stored finger
print is calculated and compared to a threshold. If the calculated
distance is less than the set threshold the user is authenticated;
otherwise he is not.
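The login decision reduces to one distance computation and one comparison. The sketch below assumes that a finger print can be treated as a flat list of numbers of equal length; the function name `authenticate` and the example threshold are hypothetical.

```python
import math


def authenticate(stored_print, sample_print, threshold):
    """Return (accepted, distance) for two equal-length finger prints."""
    distance = math.dist(stored_print, sample_print)  # Euclidean distance
    return distance < threshold, distance


# Example with made-up three-element finger prints.
accepted, distance = authenticate([1.0, 2.0, 3.0], [1.0, 2.0, 7.0], threshold=5.0)
```

Choosing the threshold trades false acceptances against false rejections and would have to be tuned on real enrolment data.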
d. PROJECT PARTITIONING
The team working on this project has three members: Rana Muhammad
Bilal, Waqar Akhter Khan and Mirza Qasim. I, Rana Muhammad Bilal, am
to work on the DCT functional unit. Mr. Mirza Qasim is to work on the FFT
and Mr. Waqar Akhter Khan is to work on the LBG algorithm. The next
chapter presents the literature review concerning applied DCT
algorithms in detail.
APPLIED DCT ALGORITHMS
a. INTRODUCTION TO MFCC FUNCTIONAL BLOCK
The MFCC functional block houses an MFCC Controller, which allows the
overall function to be broken into smaller control blocks. These smaller
control blocks implement the Window, FFT, Mel Filter Bank and DCT
procedures. Authority to generate the control word is again granted to one
of these units through a Multiplexer. Each functional unit receives the
trigger signal “Start” from the MFCC Controller and returns the status
signal “Done” to it.
[Figure: MFCC functional block. Blocks: MFCC CONTROLLER with START and
DONE signals; WINDOW, FFT, MEL SPECTRUM and DCT units; MUX; CONTROL
WORD output labelled TO MUX.]
b. BRIEF NOTE ON DCT
The Discrete Cosine Transform (DCT) is a mathematical technique similar to
the Fourier Transform. It also transforms a signal from the time domain to
the frequency domain, but unlike the Fourier Transform it uses only real
numbers. The cosine components produced by this transform are considered
more efficient than Fourier coefficients, as fewer are needed to
approximate a signal. For an N-point input x_0 ... x_{N-1}, the operation
can be written, up to a constant scale factor, as:

$$ y_k = c_k \sum_{n=0}^{N-1} x_n \cos\left(\frac{(2n+1)k\pi}{2N}\right), \qquad k = 0, 1, \dots, N-1 $$

where $c_0 = 1/\sqrt{2}$ and $c_k = 1$ for $k > 0$.
c. DCT IMPLEMENTATION ALGORITHMS
Listed below are some algorithms that are used for Hardware Computation
of Discrete Cosine Transform.
i. CHEN ET AL ALGORITHM
In this algorithm, the 8-point DCT of an input is written in matrix
form as
Y = AX
where X = [x0 x1 x2 x3 x4 x5 x6 x7]^T is the input signal,
Y = [y0 y1 y2 y3 y4 y5 y6 y7]^T is the output signal,
and A is the transform matrix:
 C4   C4   C4   C4   C4   C4   C4   C4
 C1   C3   C5   C7  -C7  -C5  -C3  -C1
 C2   C6  -C6  -C2  -C2  -C6   C6   C2
 C3  -C7  -C1  -C5   C5   C1   C7  -C3
 C4  -C4  -C4   C4   C4  -C4  -C4   C4
 C5  -C1   C7   C3  -C3  -C7   C1  -C5
 C6  -C2   C2  -C6  -C6   C2  -C2   C6
 C7  -C5   C3  -C1   C1  -C3   C5  -C7
in which Cn = cos(nπ / 16).
Due to symmetry, this matrix can be broken down into two 4x4
matrices for parallel computation. However, since our proposed
architecture supports only serial operation, this 4x4 breakdown is
not of interest.
Calculating y0 with this algorithm on our architecture requires
7 additions and one multiplication. This is achieved by factoring out
the common coefficient (the distributive law) and rewriting the
equation for y0 as:
y0 = C4 * (x0 + x1 + x2 + x3 + x4 + x5 + x6 + x7)
Similarly, 9 additions and 3 multiplications are required for y1, and
8 additions and 2 multiplications are required for y2.
Overall, 66 additions and 18 multiplications are required.
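The matrix form and the factoring of y0 can be checked numerically. The sketch below builds A from the standard 8-point DCT definition, entry (k, n) = cos((2n + 1)kπ/16) with row 0 set to C4, which is an assumption about the intended convention. It also shows the symmetry-based split into two 4x4 products purely for illustration, even though the serial architecture does not exploit it. Function names are hypothetical.

```python
import math

N = 8
C = [math.cos(n * math.pi / 16) for n in range(N)]  # C[n] = cos(n*pi/16)


def transform_matrix():
    """Build A from the DCT definition; row 0 is C4 throughout."""
    rows = [[C[4]] * N]
    for k in range(1, N):
        rows.append([math.cos((2 * n + 1) * k * math.pi / 16) for n in range(N)])
    return rows


def dct_direct(x):
    """Y = AX evaluated row by row."""
    return [sum(a * xn for a, xn in zip(row, x)) for row in transform_matrix()]


def dct_split(x):
    """Same result using the symmetry of A: even rows act on the sums
    x[n] + x[7-n], odd rows on the differences x[n] - x[7-n]."""
    A = transform_matrix()
    sums = [x[n] + x[N - 1 - n] for n in range(N // 2)]
    diffs = [x[n] - x[N - 1 - n] for n in range(N // 2)]
    y = []
    for k in range(N):
        half = sums if k % 2 == 0 else diffs
        y.append(sum(A[k][n] * half[n] for n in range(N // 2)))
    return y


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = dct_direct(x)
y0_factored = C[4] * sum(x)   # 7 additions and 1 multiplication
y_from_split = dct_split(x)
```

Both routes give the same output vector; the split version only needs half as many products per row.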
ii. LEE ALGORITHM

This algorithm is based on an even and odd decomposition of the signal.
An N-point DCT is thus broken down into two N/2-point DCTs.
This breakdown can be repeated recursively as long as N is an integral
power of 2.
For an 8-point DCT this boils down to 13 multiplications and 25 additions.
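The recursive halving can be sketched as below. This is a floating-point illustration of the decomposition, not a hardware mapping: it computes the unnormalized DCT (so its first output lacks the C4 factor of the matrix A), and the divisions by 2cos(...) would in practice be multiplications by precomputed constants. The function names are hypothetical.

```python
import math


def lee_dct(x):
    """Unnormalized DCT-II of a sequence whose length is a power of 2."""
    n = len(x)
    if n == 1:
        return list(x)
    half = n // 2
    # Sums feed the even-indexed outputs, scaled differences the odd ones.
    sums = [x[i] + x[n - 1 - i] for i in range(half)]
    diffs = [(x[i] - x[n - 1 - i]) / (2.0 * math.cos((i + 0.5) * math.pi / n))
             for i in range(half)]
    even = lee_dct(sums)    # N/2-point DCT
    odd = lee_dct(diffs)    # N/2-point DCT
    out = []
    for k in range(half - 1):
        out.append(even[k])
        out.append(odd[k] + odd[k + 1])
    out.append(even[half - 1])
    out.append(odd[half - 1])
    return out


def dct_reference(x):
    """Direct O(N^2) evaluation of the same unnormalized DCT-II."""
    n = len(x)
    return [sum(x[i] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                for i in range(n))
            for k in range(n)]


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
fast = lee_dct(x)
slow = dct_reference(x)
```

Comparing `fast` with `slow` is a simple way to confirm that the decomposition reproduces the direct transform.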
iii. LOEFFLER ALGORITHM
Proposed by Loeffler, this algorithm employs the block diagram given
below to calculate the 8-point DCT. Using the same factoring technique
as above (i.e. a*b + a*c = a*(b + c)), 11 multiplications and
29 additions are required to calculate the transformed output.
[Figure: block diagram of the Loeffler 8-point DCT.]
iv. LIU AND CHIU ALGORITHM
In this approach, the number of input samples needs to be larger than
the intended number of DCT points. A running DCT of the desired length
(N) is calculated, and each subsequent DCT is obtained by adding a
difference term to the previous DCT. As our application does not
require a running DCT and has an on-demand processing structure, these
implementations are not feasible.
d. SUMMARY OF DCT IMPLEMENTATION ALGORITHMS
The additions and multiplications required by the various DCT algorithms
are tabulated below.

ALGORITHM   ADDITIONS   MULTIPLICATIONS
CHEN        66          18
LEE         25          13
LOEFFLER    29          11
Each of these algorithms can be parallelized and pipelined to a different
extent. However, since only one set of inputs is processed at a time,
pipelining is not of interest here. Similarly, the single processing
element of the architecture makes parallelism unimportant. Since
multiplication is usually the costlier operation, the LOEFFLER algorithm,
which needs the fewest multiplications, is the best option for our
architecture from an operation-count perspective. However, since this is
an HID (Human Interface Device) and HIDs typically have ample
processing time available, CHEN's algorithm may be pursued
owing to its simplicity and ease of implementation. The final selection
between the CHEN and LOEFFLER algorithms will be made on the basis of
timing information from the other functional units of the architecture and
the timing delays of the Multiplier and Adder.
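How the adder and multiplier delays would enter this decision can be sketched with a simple serial cost model: total time = additions x adder delay + multiplications x multiplier delay. The operation counts are those tabulated above; the delay values used in the example (1 and 4 cycles) are purely hypothetical placeholders, since the real timing information is not yet available.

```python
# Operation counts from the summary table: (additions, multiplications).
OP_COUNTS = {
    "CHEN": (66, 18),
    "LEE": (25, 13),
    "LOEFFLER": (29, 11),
}


def serial_cycles(additions, multiplications, t_add, t_mul):
    """Cycles on a single FPU that performs one operation at a time."""
    return additions * t_add + multiplications * t_mul


def rank_algorithms(t_add, t_mul):
    """Algorithms sorted from cheapest to most expensive."""
    costs = {name: serial_cycles(a, m, t_add, t_mul)
             for name, (a, m) in OP_COUNTS.items()}
    return sorted(costs.items(), key=lambda item: item[1])


# Hypothetical delays: adder 1 cycle, multiplier 4 cycles.
ranking = rank_algorithms(t_add=1, t_mul=4)
```

Under this model the ordering depends on the ratio of the two delays, which is why the final choice is deferred until the timing figures are known.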
REFERENCES
1. An Efficient Implementation of the 1D DCT using FPGA Technology by
Hassan El-Banna, Alaa A. El-Fattah and Waleed Fakhr, in 11th IEEE
International Conference and Workshop on the Engineering of
Computer-Based Systems (ECBS'04)
2. Implementation of Loeffler Algorithm on Stratix DSP Compared to
Classical FPGA Solutions by A. Ben Atitallah, P. Kadionik, F. Ghozzi,
P. Nouel, N. Masmoudi and Ph. Marchegay
3. A Comparison of Bit Serial and Bit Parallel DCT Designs by David
Crook and John Fulcher, in VLSI Design, 1995, Vol. 3, No. 1, pp. 59-65
