Dahlia Conf - Paper Effat-V5
Abstract— Correct understanding of the Holy Quran is an essential duty for all Muslims. Tajweed rules guide the reciter to perform Holy Quran reading exactly as it was uttered by Prophet Muhammad, peace be upon him. This work focused on the recognition of one Quranic recitation rule: the Qalqalah rule, which is applied to five letters of the Arabic alphabet (Baa/Daal/Jeem/Qaaf/Taa) having sukun vowelization. The proposed system used the Mel Frequency Cepstral Coefficients (MFCC) as the feature extraction technique, and the Convolutional Neural Networks (CNN) model was used for recognition. The available dataset consists of 3322 audio samples from different surahs of the Quran for four professional readers (Sheikh) AlHussary, AlMinshawy, Abdel Baset, and Ayman Swayed. The best results were obtained using Ayman Swayed's audio samples, with a validation accuracy of 90.8%.

Keywords— Quranic recitation rules, Qalqalah rule, Mel Frequency Cepstral Coefficients (MFCC), Convolutional Neural Networks (CNN)

I. INTRODUCTION

The Holy Quran is the main source of guidance for all Muslims. Delivering the precise meaning of the Holy Quran's words is a crucial issue. Although the Holy Quran is presented in classical Arabic, it is completely different from any other Arabic content. For an accurate reading of the Holy Quran, recitation rules must be followed. These rules, also called Tajweed rules, apply certain pronunciation manners, articulation positions, and intonation characteristics to letters in specific situations. These rules include merging two letters' sounds, applying high stress when pronouncing a letter, prolonging a letter's pronunciation for a specific duration, and many more [1]. Tajweed rules are hard to localize and to apply, especially for non-Arabic speakers [2]. Teaching Tajweed rules may be considered a complicated and confusing task: it needs prolonged sessions of direct contact between the instructor and his/her students. Tajweed rules should be applied to deliver the correct pronunciation/meaning of the Quran and consequently to maintain its integrity and authenticity [3].

Automatic Speech Recognition (ASR) is an interactive process of recognizing human spoken words by machines based on the information embedded in those words [4-5]. ASR enables the machine to receive, interpret, and translate audio signals (words or orders) and react accordingly [6].

Deep Neural Networks (DNN) are promising and efficient techniques used for ASR [7]. DNNs have profoundly revolutionized the field of speech recognition through different kinds of models such as Recurrent Neural Networks (RNN), Convolutional Neural Networks (CNN), and transformer networks [8].

CNNs showed promising results in pattern recognition and prediction, so they are used to detect and locate different patterns [9-10]. CNNs have proved reliable in handling speech signals and improving the speaker invariance of the acoustic model [11].

The main objective of our research is to construct a detection system for one Quranic recitation rule, the Qalqalah rule. CNN was used as the speech recognition model. The proposed system should help users read Quran verses correctly, especially readers with limited knowledge of Tajweed rules and non-Arabic speakers.

II. LITERATURE REVIEW

Alagrami et al. [1] worked on four Tajweed rules (Edgham Meem, Ekhfaa Meem, Tafkheem Lam, Tarqeeq Lam). The dataset used contained a total of 657 recordings. All audio segmentation was performed manually, with an average sample duration of 4 seconds. The applied feature extraction technique employs 70 filter banks as the main method. A Support Vector Machine (SVM) was adopted for classification. In the test stage, every model of the system used 30% of the data, and the validation accuracy was 99%.

Damer et al. [12] considered eight Tajweed rules and applied several feature extraction techniques such as Mel Frequency Cepstral Coefficients (MFCC), Wavelet Packet Decomposition (WPD), and Linear Predictive Coding (LPC). Different classification techniques were used, such as k-Nearest Neighbors (KNN), Support Vector Machines (SVM), and Multilayer Perceptron (MLP) neural networks. They concluded that MFCC achieved the highest accuracy in the feature extraction phase, and SVM scored the highest classification accuracy of 94.4%; this was obtained when all features except the LPC features were applied to the SVM.

Hassan et al. [13] created a system for Holy Quran Tajweed rules recognition using the Sphinx tools. In the proposed system, the MFCC technique was used to find the most informative audio features, and a Hidden Markov Model (HMM) was used for classification. The system scored 85-90% accuracy when tested with a small chapter of the Quran.

The E-Hafiz system was created in [14] to help ordinary readers recite the Quran correctly by training them on Tajweed rules depending on expert readers. The MFCC technique was applied to extract the features of recorded voices of specific verses. These features were used to build a speech recognition model using Vector Quantization (VQ). This model was used to compare readers' trials against experts' reference readings; any mismatch at the word level was highlighted.
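Most of the systems reviewed above, as well as the proposed one, rely on MFCC feature extraction. As an illustration only, the standard MFCC pipeline (pre-emphasis, framing and windowing, power spectrum, mel filterbank, log compression, DCT) can be sketched in plain NumPy/SciPy; the parameter values below (16 kHz sampling, 25 ms frames, 26 filters, 13 coefficients) are common defaults assumed for the sketch, not values reported in this paper:

```python
import numpy as np
from scipy.fftpack import dct

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mfcc(signal, sr=16000, frame_len=0.025, frame_step=0.010,
         n_filters=26, n_ceps=13, n_fft=512):
    """Compute MFCC features for a 1-D signal (assumed longer than one frame)."""
    # Pre-emphasis boosts the high-frequency content
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    # Slice into overlapping frames and apply a Hamming window
    flen = int(round(frame_len * sr))
    fstep = int(round(frame_step * sr))
    n_frames = 1 + (len(emphasized) - flen) // fstep
    frames = np.stack([emphasized[i * fstep: i * fstep + flen]
                       for i in range(n_frames)])
    frames *= np.hamming(flen)
    # Power spectrum of each frame (rfft zero-pads to n_fft)
    pow_spec = (np.abs(np.fft.rfft(frames, n_fft)) ** 2) / n_fft
    # Triangular mel filterbank, equally spaced on the mel scale
    mel_points = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sr).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):
            fbank[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):
            fbank[m - 1, k] = (right - k) / (right - center)
    feat = pow_spec @ fbank.T
    feat = np.where(feat == 0, np.finfo(float).eps, feat)  # avoid log(0)
    # Log filterbank energies, then DCT to decorrelate -> cepstral coefficients
    return dct(np.log(feat), type=2, axis=1, norm='ortho')[:, :n_ceps]
```

For a one-second 16 kHz signal this yields a (98, 13) feature matrix, one row of 13 coefficients per 10 ms frame; such matrices are what the classifiers in the reviewed systems consume.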
IV. RESULTS

Several experiments were conducted representing different combinations of reciters' samples. Six classes are used to represent the five Qalqalah letters and the absence-of-Qalqalah case. The highlighted row in each table represents the highest validation accuracy.

The first experiment was conducted by training the model with the combined audio samples of the four reciters and testing the model with samples of one, two, three, or four reciters. Table 2 presents the results of the first experiment.

TABLE 2. TESTING WITH DIFFERENT RECITERS' COMBINATIONS

Reader's ID      Accuracy
1  2  3  4
√  √  √          0.790
√  √  √          0.796
√  √  √          0.771
√  √  √          0.855

TABLE 6. RESULTS USING SAMPLES OF FOUR RECITERS

Reader's ID      Accuracy
1  2  3  4
√  √  √  √       0.811

Table 7 introduces a comparison between the obtained results and the results of Ismail et al. [15].
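The pipeline evaluated above feeds MFCC feature maps to a CNN with six output classes (five Qalqalah letters plus the no-Qalqalah case). The forward pass of such a classifier can be illustrated with a minimal NumPy sketch; the layer shapes used here (eight 3x3 kernels, global average pooling, one dense softmax layer) are hypothetical and are not the architecture evaluated in these experiments:

```python
import numpy as np

def conv2d(x, kernels):
    """Valid 2-D cross-correlation: x (H, W), kernels (n, kh, kw) -> (n, H-kh+1, W-kw+1)."""
    n, kh, kw = kernels.shape
    H, W = x.shape
    out = np.empty((n, H - kh + 1, W - kw + 1))
    for i in range(H - kh + 1):
        for j in range(W - kw + 1):
            patch = x[i:i + kh, j:j + kw]
            out[:, i, j] = np.tensordot(kernels, patch, axes=([1, 2], [0, 1]))
    return out

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def forward(mfcc_map, kernels, W_out, b_out):
    """One forward pass: conv + ReLU, global average pooling, dense softmax."""
    a = np.maximum(conv2d(mfcc_map, kernels), 0.0)  # feature maps after ReLU
    pooled = a.mean(axis=(1, 2))                    # one scalar per kernel
    return softmax(pooled @ W_out + b_out)          # probabilities over 6 classes

# Demo with random (untrained) weights on a hypothetical 98x13 MFCC map
rng = np.random.default_rng(0)
kernels = rng.standard_normal((8, 3, 3)) * 0.1
W_out = rng.standard_normal((8, 6)) * 0.1
b_out = np.zeros(6)
probs = forward(rng.standard_normal((98, 13)), kernels, W_out, b_out)
```

In practice the kernels and dense weights would be learned by backpropagation over the labeled audio samples; the sketch only shows how a feature map is reduced to a six-way class distribution.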