Die Trifan 1
net/publication/306350420
All content following this page was uploaded by Kang Adiwijaya on 21 January 2017.
Learning the Hijaiyyah letters is the first stage in learning to read the Holy Qur’an. This process usually involves a
learner and an advisor who introduces the Hijaiyyah letters and teaches how to read and pronounce them. Speech recognition
is a system that processes a voice signal into data so that it can be recognized by a computer. By applying such a system,
the role of the advisor who introduces the Hijaiyyah letters and corrects their pronunciation is expected to be replaced,
so that learning can be done independently. The recognition problem is addressed by using Mel Frequency Cepstral
Coefficients (MFCC) to extract the features of each voice signal and a Hidden Markov Model (HMM) to build the models and
classify the voice. After testing the system under several scenarios, the best accuracy obtained is 67.75% in recognizing
50 words. This accuracy is obtained with a 16 kHz sample rate, a codebook size of 64, and 5 HMM states.
1. INTRODUCTION
Hijaiyyah letters are the letters used in the Al-Qur’an. To read and understand the Al-Qur’an, a person needs to study the Hijaiyyah letters and Tajweed. Studying the Hijaiyyah letters is the very first step in learning the Al-Qur’an. In practice, a dedicated advisor who has mastered the Al-Qur’an is needed to introduce and teach the Hijaiyyah letters. Speech recognition is a technology that recognizes voice and converts it into a data representation that a computer can process. By using a speech recognition system, it is hoped that the advisor/corrector in Hijaiyyah study can be replaced, so that the study can be completed independently and flexibly.

Mel-frequency cepstral coefficients (MFCC) are a feature extraction method. The method imitates the human hearing system, so it can capture the important characteristics of a sound [11]. A Hidden Markov Model (HMM) is a statistical model in which the modeled system is assumed to be a Markov process with unknown parameters, and the aim is to determine the hidden parameters from the known parameters [9]. HMM is applied in the modeling and recognition processes.

The pronunciation of each Hijaiyyah word is slightly different from the others. We need to understand tajweed, and being able to recite the letters correctly is the foundation of tajweed; this is achieved by knowing where the sound originates (mahkraaij). This in turn helps in practising the correct pronunciation of the letters. Building a speech recognition system for this case is therefore challenging and may differ from building one for another language. Because MFCC imitates the human hearing system, it is expected to extract the features of Hijaiyyah word voices well.

* Email Address: rifan.refun@gmail.com
2043 Adv. Sci. Lett. Vol. 22, No. 8, 2016 1936-6612/2016/22/2043/005
doi: 10.1166/asl.2016.7769
Adv. Sci. Lett. 22, 2043–2047, 2016 RESEARCH ARTICLE

2. METHODOLOGY

A. Voice data recording
The list of Hijaiyyah words used as the dataset for this system is taken from the Qronis curriculum. The data consist of 50 words. Each word is spoken by 21 people during the recording.

B. Normalization
The aim of normalization is to standardize the maximum amplitude and the sample rate of the voice signal so that amplitude changes do not influence the subsequent processes. The normalization applied in this paper covers conversion from stereo to mono, resampling to 16 kHz, centering of the amplitude, and dividing every discrete amplitude by the maximum amplitude value.

C. Characterization extraction
Characterization extraction is an important process in speech recognition because it captures the characteristics of a voice signal [7]. The characterization extraction method applied in this paper is Mel Frequency Cepstral Coefficients (MFCC), illustrated in Figure 1. First, the voice signal is filtered with a pre-emphasis filter with a parameter of 0.95. Next, the pre-emphasized signal is divided into frames with a width of 240 and an overlap of 160. Third, the result of frame blocking enters the next process, which is windowing with a Hamming window.

... anticipate the loss of information when filtering with the mel filter bank is applied. The last phase of MFCC is the Discrete Cosine Transform (DCT). This algorithm converts the log mel spectrum from the frequency domain to the time domain, resulting in 12 MFCC features. Next, the first-order derivative of these 12 features is computed, producing 12 more features. The feature vector produced by MFCC therefore has 24 features (12 MFCC features and 12 derivative features).

D. Vector Quantization
Vector Quantization (VQ) as used in this paper is divided into two parts: the formation of the codebook and the determination of the codebook index. The codebook is formed during training using the K-Means clustering algorithm, while the codebook index is determined during both training and testing by mapping each characterization vector to the codebook index with the smallest Euclidean distance.

E. Training
Training is the process of modeling the voice data into a model that can be used in testing. The data used in the training process are sequences of codebook indexes resulting from vector quantization. These indexes serve as the HMM observation symbols. Training is done using the Baum-Welch algorithm. The result of training is an HMM λ = (A,B,π), where A is the matrix of transition probabilities between states, B is the matrix of observation symbol probabilities, and π is the initial state probability. In this paper, the HMM is an ergodic discrete HMM whose parameters A, B, and π are generated randomly and normalized so that the values sum to one. The values of A, B, and π are then re-estimated by the training process to obtain optimal parameters. The HMM parameter training process is explained in Figure 2.
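As a minimal sketch of the signal preparation in Sections B and C and the codebook lookup in Section D, the following NumPy code may help. All function names are hypothetical. It assumes the recording is already sampled at 16 kHz (resampling is left out), reads the frame width of 240 and overlap of 160 as sample counts, and approximates the first-order derivative by a plain difference between consecutive frames. The mel filter bank and DCT stages are omitted, so this is not the full MFCC chain.

```python
import numpy as np

def normalize(x):
    """Stereo to mono, centering, and division by the maximum amplitude."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 2:                      # shape (samples, channels)
        x = x.mean(axis=1)               # stereo to mono
    x = x - x.mean()                     # centering of the amplitude
    peak = np.max(np.abs(x))
    return x / peak if peak > 0 else x

def pre_emphasis(x, a=0.95):
    """y[n] = x[n] - a * x[n-1], with the first sample kept as is."""
    return np.append(x[0], x[1:] - a * x[:-1])

def frame_signal(x, width=240, overlap=160):
    """Frame blocking followed by a Hamming window (needs len(x) >= width)."""
    hop = width - overlap                # 80 samples between frame starts
    n_frames = 1 + (len(x) - width) // hop
    idx = np.arange(width)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx] * np.hamming(width)

def add_deltas(coeffs):
    """Append frame-to-frame differences (assumed form of the derivative)."""
    deltas = np.diff(coeffs, axis=0, prepend=coeffs[:1])
    return np.hstack([coeffs, deltas])   # 12 columns become 24

def quantize(features, codebook):
    """Index of the codebook vector with the smallest Euclidean distance."""
    dist = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return dist.argmin(axis=1)

# Synthetic one-second stereo tone standing in for a recording.
t = np.arange(16000) / 16000.0
stereo = np.stack([np.sin(2 * np.pi * 440 * t)] * 2, axis=1)
frames = frame_signal(pre_emphasis(normalize(stereo)))
```

The indices returned by `quantize` play the role of the HMM observation symbols; the codebook itself would come from K-Means clustering on the training features, which is not shown here.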
HMM initialization means initializing the model λ = (A,B,π) with random values that are normalized to sum to one. After that, the forward variable $\alpha_t(i)$ and the backward variable $\beta_t(i)$ are computed using the forward and backward algorithms; $\alpha_t(i)$ can be computed inductively using the three steps of the forward algorithm [10]:
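The step equations are not reproduced in this text. In the standard formulation of the forward algorithm, assumed here, the three steps are initialization, induction, and termination. Here $N$ is the number of states, $T$ the length of the observation sequence $O = O_1 \dots O_T$, $a_{ij}$ a transition probability, $b_j(O_t)$ the probability of observing $O_t$ in state $j$, and $\pi_i$ an initial state probability.

```latex
\begin{align*}
\text{Initialization:}\quad & \alpha_1(i) = \pi_i \, b_i(O_1), && 1 \le i \le N \\
\text{Induction:}\quad & \alpha_{t+1}(j) = \Big[ \sum_{i=1}^{N} \alpha_t(i) \, a_{ij} \Big] b_j(O_{t+1}), && 1 \le t \le T-1,\ 1 \le j \le N \\
\text{Termination:}\quad & P(O \mid \lambda) = \sum_{i=1}^{N} \alpha_T(i)
\end{align*}
```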
Mathematical description
$N$: number of states
$M$: the number of distinct observation symbols per state
$A = \{a_{ij}\}$: the state transition probability distribution
$B = \{b_j(k)\}$: the observation symbol probability distribution in state $j$
$\pi = \{\pi_i\}$: the initial state distribution
$O = O_1 O_2 \dots O_T$: observation sequence
$T$: length of the observation sequence
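The random initialization and the forward algorithm described above can be sketched as follows. This is a minimal illustration with hypothetical function names, not the authors' implementation. It uses the unscaled forward recursion, which is only numerically safe for short observation sequences. The 5 states and 64 symbols mirror the best configuration reported in the abstract, and the observation sequence is made up.

```python
import numpy as np

def random_hmm(n_states, n_symbols, seed=0):
    """Random ergodic discrete HMM with every distribution summing to one."""
    rng = np.random.default_rng(seed)
    A = rng.random((n_states, n_states))
    B = rng.random((n_states, n_symbols))
    pi = rng.random(n_states)
    A /= A.sum(axis=1, keepdims=True)    # each row of A sums to one
    B /= B.sum(axis=1, keepdims=True)    # each row of B sums to one
    pi /= pi.sum()
    return A, B, pi

def forward(A, B, pi, obs):
    """Return the forward variables alpha[t, i] and P(O | lambda)."""
    T, N = len(obs), len(pi)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]                        # initialization
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]    # induction
    return alpha, alpha[-1].sum()                       # termination

A, B, pi = random_hmm(n_states=5, n_symbols=64)
obs = [3, 17, 17, 42, 8]                 # made-up codebook indices
alpha, likelihood = forward(A, B, pi, obs)
```

A common way to use such a function for recognition is to score the same observation sequence against one model per word and pick the model with the highest likelihood.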
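$$\hat{\pi}_i = \gamma_1(i), \qquad \hat{a}_{ij} = \frac{\sum_{t=1}^{T-1} \xi_t(i,j)}{\sum_{t=1}^{T-1} \gamma_t(i)}, \qquad \hat{b}_j(k) = \frac{\sum_{t=1,\, O_t = v_k}^{T} \gamma_t(j)}{\sum_{t=1}^{T} \gamma_t(j)} \tag{8}$$
in every signal.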
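One iteration of the Baum-Welch re-estimation named in the Training section can be sketched as below. This is a hedged illustration for a single short observation sequence: the function names are hypothetical, the recursions are unscaled, and the quantities `gamma` and `xi` follow the standard textbook definitions rather than anything confirmed by this text.

```python
import numpy as np

def forward(A, B, pi, obs):
    alpha = np.zeros((len(obs), len(pi)))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, len(obs)):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha

def backward(A, B, obs):
    beta = np.ones((len(obs), A.shape[0]))
    for t in range(len(obs) - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    return beta

def baum_welch_step(A, B, pi, obs):
    """Re-estimate (A, B, pi) once; also return P(O | old model)."""
    obs = np.asarray(obs)
    alpha, beta = forward(A, B, pi, obs), backward(A, B, obs)
    likelihood = alpha[-1].sum()
    # gamma[t, i]: probability of being in state i at time t
    gamma = alpha * beta / likelihood
    # xi[t, i, j]: probability of moving from state i at t to state j at t+1
    xi = (alpha[:-1, :, None] * A[None, :, :]
          * (B[:, obs[1:]].T * beta[1:])[:, None, :]) / likelihood
    new_pi = gamma[0]
    new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    new_B = np.zeros_like(B)
    for k in range(B.shape[1]):
        new_B[:, k] = gamma[obs == k].sum(axis=0)   # frames where symbol k occurs
    new_B /= gamma.sum(axis=0)[:, None]
    return new_A, new_B, new_pi, likelihood

# Random, normalized starting model and a made-up observation sequence.
rng = np.random.default_rng(0)
A = rng.random((5, 5)); A /= A.sum(axis=1, keepdims=True)
B = rng.random((5, 64)); B /= B.sum(axis=1, keepdims=True)
pi = rng.random(5); pi /= pi.sum()
obs = rng.integers(0, 64, size=20)
for _ in range(3):
    A, B, pi, likelihood = baum_welch_step(A, B, pi, obs)
```

In practice the recursions are usually scaled or computed in the log domain, and the sums are accumulated over all training sequences of a word before the parameters are updated.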