A Study of VoIP Quality Evaluation - User Perception of Voice Quality From G.729 G.711 and G.722 - 16jan2012
III. METHODOLOGY
the open-source Asterisk software, version 1.6.2. Theoretically,
VoIP systems under the direct condition can provide the same QN of
infinity, as mentioned in [29-30]. The packet delay is very low
(< 10 ms), and there is no packet loss.
D. Subjects
Each codec test requires at least 24 subjects, both male
and female. Therefore, this study requires at least 72
subjects in total. The subjects, who represent a group
of Thai native listeners, were student volunteers from
KMUTNB.
E. Data Gathering
This study was conducted using a paper-based form.
Subjects first read and listened to brief instructions about
Richard’s task. After finishing the task, each pair of
subjects had to answer questions about voice quality and some
IV. RESULTS
G.729 and G.722 were each tested by a different group
of 12 pairs of subjects, whereas G.711 was tested by a group of
13 pairs of subjects. The total number of subjects was
therefore 74, consisting of 47 male and 27 female
subjects with an average age of 20.66 years (SD = 1.80
years). The results are presented in Figure 3.
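As an illustration of how the two quantities plotted in Figure 3 relate to the raw votes, the following minimal sketch computes a mean opinion score and the percentage of votes per score from a list of 5-point opinion scores. The function name `mos_and_vote_share` and the example votes are hypothetical; the votes are made up for illustration and are not the data collected in this study.

```python
from collections import Counter


def mos_and_vote_share(votes):
    """Return (MOS, {score: percent of votes}) for a list of 1-5 opinion scores."""
    if not votes:
        raise ValueError("at least one vote is required")
    if any(v not in (1, 2, 3, 4, 5) for v in votes):
        raise ValueError("opinion scores must be integers from 1 to 5")
    counts = Counter(votes)
    # MOS is the arithmetic mean of all opinion scores.
    mos = sum(votes) / len(votes)
    # Percentage of votes received by each score on the 5-point scale.
    share = {s: 100.0 * counts.get(s, 0) / len(votes) for s in range(1, 6)}
    return mos, share


# Hypothetical votes from 24 subjects (assumption, NOT the study's raw data).
example_votes = [5] * 6 + [4] * 14 + [3] * 4
mos, share = mos_and_vote_share(example_votes)
print(f"MOS = {mos:.2f}")
for score in range(5, 0, -1):
    print(f"score {score}: {share[score]:.1f}% of votes")
```

Two codecs can have similar MOS values while the votes are distributed differently between the scores of 4 and 5, which is why both views are reported.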
V. ANALYSIS
Figure 3. Comparison among G.729, G.711 and G.722 for (a) MOS-CQS and
(b) percent of votes.

From the results in Figure 3 (a), it can be seen that the
MOS-CQS values of G.729, G.711 and G.722 are not very different.
Therefore, to verify whether the user perception of these three
codecs is significantly different, the raw data was analyzed
using ANOVA, a statistical tool, with the following hypotheses:
H0: The user perception of G.729, G.711 and G.722 is the same.
H1: The user perception of G.729, G.711 and G.722 is different.
The result of the ANOVA is presented in Table II.

TABLE II. HYPOTHESIS TEST RESULT
Hypotheses                                          p-value
The user perception of G.729 vs. G.711 vs. G.722    0.880
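The hypothesis test above can be sketched as a one-way ANOVA over the individual opinion scores of each codec group. The paper does not state the exact ANOVA variant or the tool used, so the following is only a minimal sketch under that assumption: it computes the F statistic and its degrees of freedom in plain Python. The function name `one_way_anova_f` and the example scores are hypothetical and are not the study's raw data. The p-value would then be read from the F distribution with these degrees of freedom, for example with a statistics library such as SciPy (`scipy.stats.f_oneway`).

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of per-group score lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n_total <= k:
        raise ValueError("need at least two non-empty groups and more scores than groups")

    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        raise ValueError("within-group variance is zero; F is undefined")

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical opinion scores per codec (assumption, NOT the study's raw data).
g729 = [4, 4, 4, 5, 3, 4, 4, 5]
g711 = [4, 5, 4, 4, 3, 5, 4, 4]
g722 = [5, 4, 4, 5, 3, 4, 5, 4]
f_stat, df_b, df_w = one_way_anova_f([g729, g711, g722])
print(f"F({df_b}, {df_w}) = {f_stat:.4f}")
```

When the group means differ little relative to the spread of scores inside each group, F is small and the corresponding p-value is large, so H0 is not rejected at the 0.05 level.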
When considering the votes in Figure 3 (b), no codec received a
score of 1 or 2, while all of them received scores of 3 with
almost the same percentage. However, for the score of 5, G.722
clearly obtained the highest share of votes, G.711 was in the
middle and G.729 was the lowest. For the score of 4, the order is
reversed: G.729 obtained the highest share, G.711 the middle and
G.722 the lowest. This could be the reason that G.722 obtains a
MOS-CQS of 4.21, while G.729 obtains a MOS-CQS of 4.13.
However, when considering Table II, the p-value obtained from the
hypothesis test using ANOVA is 0.880, which is higher than 0.05.
This means that H0 cannot be rejected: the user perception of
these three codecs, G.729, G.711 and G.722, is not significantly
different. This is inconsistent with the general understanding,
given that G.729 uses only 8 kbps for its payload, whereas G.711,
a narrowband codec, and G.722, a wideband codec, use 64 kbps.
VII. CONCLUSION AND FUTURE WORK
After studying a group of Thai subjects, the MOS-CQS for G.729,
G.711 and G.722 have been obtained. Although the subjects were
certainly not representative of the general Thai population, the
results could serve as a benchmark for VoIP quality evaluation
based on Thai users and might be used for calibration of
objective measurement tools to be used in Thai environments.
Also, based on the Thai users, G.729 could be recommended to
obtain voice quality as good as that of G.711 and G.722 while
reducing network traffic, because it requires a payload bandwidth
of only 8 kbps, whereas G.711 and G.722 require a payload
bandwidth of 64 kbps. From another point of view, this study
presents evidence on the user perception of these three codecs
and found that they are not significantly different. This is
inconsistent with the general understanding about the voice
quality provided by G.729, G.711 and G.722. Therefore, this
process can be considered for verifying other languages. Thus,
this evidence can be used to challenge or improve the development
of voice quality provided by novel codecs for VoIP.
ACKNOWLEDGMENT
Thank you very much to all reviewers for their useful comments.
Thank you to the lecturers, students, and staff at KMUTNB who
supported this work, particularly Mr. Wiwat Suwanuntawong, the
Central Library Studio staff, and Mr. Gary Sherriff, the
international coordinator, Faculty of Information Technology
(for editing). Lastly, the first author would like to dedicate
this paper to Dr. Gareth Clayton, an advisor who sadly passed
away.
REFERENCES
[1] F. D. Rango, M. Tropea, P. Fazio, and S. Marano, “Overview on VoIP: Subjective and Objective Measurement Methods,” IJCSNS, Vol. 6, No. 1B, 2006.
[2] J. Ren, H. Zhang, Y. Zhu, and C. Gao, “Assessment of effects of different language in VOIP,” ICALIP2008, Shanghai, 2008.
[3] F. Chong, I. McLoughlin, and K. Pawlikowski, “A methodology for improving PESQ accuracy for chinese speech,” presented at the IEEE Region 10 Conf., TENCON, Melbourne, Nov. 2005.
[4] F. L. Chong, K. Pawlikowski, and I. V. McLoughlin, “Evaluation of ITU-T G.728 as a Voice over IP codec for Chinese Speech,” Australian Telecommunication Networks and Applications Conference, Dec 2003.
[5] Z. Ding, I. V. McLoughlin, and E. C. Tan, “Intelligibility evaluation of GSM coder for Mandarin speech using CDRT,” Speech Communication, vol. 38(1), pp. 161–165, Sep 2002.
[6] J.-H. Chen and J. Thyssen, “BroadVoice®16: A PacketCable Speech Coding Standard for Cable Telephony,” Proc. Asilomar Conf. Signals, Systems, Computers, Asilomar, CA, Oct 2006.
[7] J.-H. Chen and J. Thyssen, “The BroadVoice Speech Coding Algorithm,” Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, pp. IV-537 - IV-540, Apr 2007.
[8] C. Quinquis, “Quality comparison of wideband coders including tandeming and transcoding,” ETSI Workshop on Speech and Noise in Wideband Communication, May 2007.
[9] N. Kitawaki and T. Tamada, “Subjective and Objective Quality Assessment for Noise Reduced Speech,” ETSI Workshop on Speech and Noise in Wideband Communication, May 2007.
[10] Z. Cai, N. Kitawaki, T. Yamada, and S. Makino, “Comparison of MOS evaluation characteristics for Chinese, Japanese, and English in IP telephony,” in Proc. International Universal Communication Symposium, pp. 1-4, Oct. 2010.
[11] A. W. Rix, “Comparison between subjective listening quality and P.862 PESQ score,” Psytechnics, Sep 2003.
[12] E. M. Yiu, B. Murdoch, K. Hird, P. Lau, and E. M. Ho, “Cultural and language differences in voice quality perception: a preliminary investigation using synthesized signals,” Folia Phoniatr Logop, Vol. 60 (3), 2008, pp. 107–119.
[13] J. Gandour, D. Wong, L. Hsieh, B. Weinzapfel, D. V. Lancher, and G. D. Hutchins, “A crosslinguistic PET study of tone perception,” J. Cogn. Neurosci., Massachusetts Institute of Technology, Jan 2000, Vol. 12, No. 1, pp. 207-222.
[14] W. Sittiprapaporn, C. Chindaduangratn, and N. Kotchabhakdi, “Long-term memory traces for familiar spoken words in tonal languages as revealed by the Mismatch negativity,” Songklanakarin J. Sci. Technol., 2004, Vol. 26, No. 6, pp. 779-786.
[15] H. Batteram et al., “Delivering Quality of Experience in Multimedia Networks,” Bell Labs Tech. J., 2010, Vol. 15(1), pp. 175-194.
[16] Nokia, “Quality of Experience (QoE) of mobile services: Can it be measured and improved?,” White paper, 2004.
[17] K. Kilkki, “Quality of Experience in Communications Ecosystem,” J. UCS, Mar 2008, Vol. 14, No. 5, pp. 615-624.
[18] M. Goudarzi, “Evaluation of Voice Quality in 3G Mobile Networks,” Thesis, University of Plymouth, Jun 2009.
[19] T. A. Hall, “Objective speech quality measures for Internet telephony,” in: Voice over IP (VoIP) Technology, Proceedings of SPIE, vol. 4522, Denver, CO, USA, 2001, pp. 128-136.
[20] ITU-T Recommendation P.800, “Methods for subjective determination of transmission quality,” Aug 1996.
[21] ITU-T Recommendation P.800.1, “Mean Opinion Score (MOS) terminology,” Jul 1996.
[22] Avaya Labs., “Avaya IP Voice Quality Network Requirements,” Avaya Inc, CO, Apr 2006.
[23] O. Hersent, J. Petit, and D. Gurle, “IP Telephony Deploying Voice-over-IP Protocols,” Wiley, 2005.
[24] S. Karapantazis and F.-N. Pavlidou, “Voip: A comprehensive survey on a promising technology,” Comput. Networks, vol. 53, no. 12, pp. 2050-2090, August 2009.
[25] ITU-T Recommendation G.729, “Coding of speech at 8 kbit/s using conjugate-structure algebraic-code-excited linear prediction (CS-ACELP),” Jan 2007.
[26] ITU-T Recommendation G.722, “7 kHz Audio - Coding within 64 kbit/s,” 1988.
[27] ITU-T Recommendation P.805, “Subjective evaluation of conversational quality,” Apr 2007.
[28] T. Daengsi and K. Tontiwattanakul, “A Case of Improvement of Building Acoustics Using Available Equipments and Limited Resources,” Naresuan Research Conference 2010, Phitsanulok, Thailand, Jul 2010.
[29] ITU-T Recommendation P.830, “Subjective Performance Assessment of Telephone-Band and Wideband Digital Codecs,” Feb 1996.
[30] ITU-T Recommendation P.810, “Modulated Noise Reference Unit (MNRU),” Feb 1996.