n Técnico, Vol.55, Issue 15, 2017, pp.206-210

Research on Resonance State of Music Signal based on Filter Model

Zhao Yuanzheng
Xi'an Conservatory of Music, Xi'an, Shaanxi, 710061, China

The basic structure of the source-filter model is the time-domain convolution of a signal source with a linear
filter to form the sound signal. In this paper, the author analyzes the resonance state of music signals based on
the filter model. For speech signals, semantic information is the most important, followed by the individual
characteristics of the voice. For singing signals, resonance state and rhythm matter far more than semantic
information. By applying features from existing research on bel canto resonance together with features used in
timbre recognition, the author classifies bel canto singing and puts forward some suggestions for bel canto.

Key words: Amplitude perturbation, Prediction coefficient, Singing signal, Filter model

Human voice signals include speech signals and singing signals. For speech signals, semantic information is
the most important, followed by the individual characteristics of the voice; for singing signals, resonance state
and rhythm matter far more than semantic information. To obtain a beautiful singing voice, the most important
thing, in addition to controlling the breath and mastering the correct vocal method, is to master the correct
resonance state. Studying singing resonance can not only improve the efficiency of vocal training, but is also of
great significance for promoting the interdisciplinary development of singing training and signal processing
technology. In the source-filter model, the source models the glottal pulse and the filter models the vocal tract
impulse response. The vocal tract impulse response carries semantic, timbre and other information, and is the
basis for studying the vocal resonance state. Resonance refers to the change in tone and the increase in strength
that the glottal wave undergoes as it passes through the air-filled cavities to the outside. There are four kinds of
resonance state: oral resonance, nasal resonance, head cavity resonance and chest resonance.
Oral resonance is the kind most used in daily life; its tone is plain and lacks artistic expression, so it needs to
be supplemented by the other resonant cavities to be effective. Nasal resonance is used to assist oral resonance in
producing clear sounds. Head cavity resonance can greatly improve the timbre of the sound, strengthen the
overtones in the treble range, and make the timbre brilliant. Chest resonance is often used when singing low and
alto tones. Chest resonance can be perceived through vibration of the chest, and head resonance through a slight
vibration and swelling sensation between the eyebrows, while the other resonance states cannot be perceived
through body vibration; an auditory sense of them can only be acquired through long-term vocal training. Bel
canto requires the cavities to be fully opened during phonation, and balances head cavity, chest cavity, oral cavity
and nasal cavity resonance. It is impractical to separate the effects of the individual resonant cavities on bel canto
timbre, but the combination of these cavities is the key factor that makes bel canto mellow and full, and
distinguishes it from popular and rural music genres.
The characteristic parameters widely used in existing studies are limited to the long-term average spectrum
and the singing power ratio, which measures the amplitude of the singer's formant; this feature set is relatively
simple. Research on how reliably the long-term average power spectrum and the singing power ratio classify bel
canto singing and voice parts is also relatively scarce. Therefore, this paper applies features from existing research
on bel canto resonance, together with other features used in timbre recognition, to the recognition of professional
and non-professional singers and to the classification and recognition of bel canto singing and voice parts, and
examines them carefully. Studying the resonance state of vocal music in this way is of great significance.
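The two baseline features named above can be sketched as follows. This is a minimal illustration, not the author's implementation: the function names, the frame length and hop size are assumptions, and the singing power ratio follows one common definition (strongest spectral peak in 2-4 kHz relative to the strongest peak in 0-2 kHz, in dB), which the paper itself does not spell out.

```python
import numpy as np

def long_term_average_spectrum(signal, sample_rate, frame_len=1024, hop=512):
    """Average the power spectra of Hann-windowed frames (hypothetical helper)."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    starts = range(0, len(signal) - frame_len + 1, hop)
    frames = np.array([signal[s:s + frame_len] * window for s in starts])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    return freqs, power.mean(axis=0)

def singing_power_ratio(freqs, ltas):
    """Strongest peak in 2-4 kHz over strongest peak in 0-2 kHz, in dB (assumed definition)."""
    low_peak = ltas[freqs < 2000.0].max()
    high_peak = ltas[(freqs >= 2000.0) & (freqs <= 4000.0)].max()
    return 10.0 * np.log10(high_peak / low_peak)
```

For a test tone whose 3 kHz component has one tenth the amplitude of its 500 Hz component, this ratio comes out close to -20 dB, as expected from the amplitude ratio.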


2.1. Singing signal

The singing signal shares the same production model as speech, namely the source-filter model. This model is
the basic theory of speech signal processing, and is widely used in speech recognition, synthesis, conversion and
so on. The basic structure of the source-filter model is the time-domain convolution of a signal source with a
linear filter to form the sound signal. In speech signal processing, the signal source is the glottal excitation, and
the linear filter is composed of the glottal pulse shape, the vocal tract transfer function and the lip radiation
response. The glottal excitation differs with the type of speech sound: for voiced sounds it is a periodic impulse
train, while for unvoiced sounds it resembles Gaussian white noise. In the
source-filter model, the frequency of the source determines the pitch, and the linear filter determines acoustic
characteristics such as semantics and timbre. During pronunciation, people change the pitch by changing the
frequency at which the glottis opens and closes. By adjusting the position of the tongue, the size of the lip
opening and the position of the soft palate, the vocal tract transfer function can be changed to produce different
sounds.
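The convolution structure described in this section can be illustrated with a short sketch. It assumes a voiced source (a periodic impulse train) and, purely for illustration, a filter reduced to a single damped resonance; the function names and parameter values are hypothetical and do not come from the paper.

```python
import numpy as np

def impulse_train(f0_hz, sample_rate, duration_s):
    """Voiced excitation: one unit impulse per pitch period."""
    n = int(sample_rate * duration_s)
    period = int(round(sample_rate / f0_hz))
    source = np.zeros(n)
    source[::period] = 1.0
    return source

def resonator_impulse_response(center_hz, bandwidth_hz, sample_rate, length=400):
    """Impulse response of one damped resonance (stand-in for the vocal tract filter)."""
    t = np.arange(length) / sample_rate
    return np.exp(-np.pi * bandwidth_hz * t) * np.sin(2 * np.pi * center_hz * t)

def synthesize(f0_hz, center_hz, bandwidth_hz, sample_rate=16000, duration_s=0.5):
    """Source-filter synthesis: time-domain convolution of source and filter."""
    source = impulse_train(f0_hz, sample_rate, duration_s)
    h = resonator_impulse_response(center_hz, bandwidth_hz, sample_rate)
    return np.convolve(source, h)[:len(source)]
```

Changing `f0_hz` moves the harmonic spacing (pitch) without touching the envelope, while changing `center_hz` moves the spectral peak (timbre) without touching the pitch, which is the separation of roles the model describes.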

Figure 1. Sound signal

2.2. Resonant cavity
During vocal production, glottal pulses produced by vocal cord vibration pass through the vocal tract and
radiate from the lips to form speech or song. Changes in the position and shape of the vocal organs change the
shape of the vocal tract, producing different resonance sensations, which can be divided into oral resonance,
nasal resonance, head cavity resonance and chest resonance. Oral resonance refers to the natural resonance
produced when the mouth opens naturally, the risorius muscles lift slightly, the palate is raised, and the lower jaw
drops and pulls back slightly. At this time the sound waves, pushed by the breath, cause vibration in the front part
of the mouth and pharynx, and the sound is reflected in a concentrated way at the front of the hard palate, giving
an open, smooth feeling in both nasal passages. In the middle register, the oral and pharyngeal cavities are the
main resonators. Nasal resonance is the vibration of the sound wave against the nasal bone, with the focus of the
sound located in the nasal cavity. For this reason nasal resonance is also called mask singing; it is used to assist
oral resonance and produce a clear voice. Head cavity resonance builds on oral resonance: the concentrated
reflection point of the sound wave on the hard palate moves slightly backwards, together with a heightened
sensation at the upper gum; the soft palate and uvula are raised and the tongue feels slightly lowered, so that the
passage between the mouth, the nose and the pharyngeal cavity becomes more spacious, and the sound travels
from the palate along the nasopharynx, the nasal cavity and the paranasal sinuses, including the sphenoid sinus,
where it produces an acoustic echo. A slight swelling and vibration is felt between the eyebrows. This resonance
is bright, concentrated and soft. Singers want this head resonance feeling when singing in the treble range.

Figure 2. Resonant cavity

2.3. Singer formant and its mode of production

The concept of the singer's formant was put forward early on to characterize the concentration of energy that
bel canto singers show in a particular spectral range. Later scholars carried out a great deal of experimental
research on the singer's formant and redefined it as the region of the voice spectrum in which the third, fourth and fifth
formants cluster to form an energy-concentrated region. Research shows that when the ratio of the cross-sectional
area at the entrance of the larynx tube to the cross-sectional area of the pharynx falls below a certain threshold,
the larynx tube is acoustically decoupled from the rest of the vocal tract and becomes a new resonant cavity,
which produces a new resonance peak. The cross-sectional area of the vocal tract was obtained from X-ray images
taken of bel canto singers. The new formant lies between the third and fourth formants of the speech spectrum.
The new resonance peak reduces the distance between adjacent resonance peaks. On the one hand this raises the
amplitude of the neighbouring peaks, and on the other hand it raises the amplitude of the troughs between them,
so that the third, fourth and fifth resonance peaks merge into a cluster, that is, the singer's formant. One of the
characteristics of bel canto, unlike other singing methods, is that it uses a lowered larynx position, resulting in a
bright, full, relaxed, rounded, resonant tone. So which parts of the vocal tract change when the larynx is lowered?
Explanations of why sopranos lack a singer's formant fall mainly into two kinds. One is that soprano pitch is
usually high, so the harmonics are sparsely distributed. As a result, at some pitches and vowels a harmonic falls
within the effective bandwidth of the singer's formant, while at others none does. This produces loudness
differences between pitches, so the singer's formant would not show a consistent benefit and could even impair
the singing.
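The sparse-harmonics argument can be made concrete with a small worked example. The band edges used here (2800 to 3200 Hz) are an assumed, illustrative placement of the singer's formant and are not taken from the paper; the helper name is hypothetical.

```python
def harmonics_in_band(f0_hz, band_hz=(2800.0, 3200.0)):
    """List the harmonics of f0_hz that fall inside band_hz (assumed formant band)."""
    low, high = band_hz
    top = int(high // f0_hz)
    return [k * f0_hz for k in range(1, top + 1) if low <= k * f0_hz <= high]

# A low voice at 110 Hz places several harmonics in the band,
# a high soprano note at 880 Hz places none (its harmonics are 2640 and 3520 Hz),
# and a note at 1000 Hz places exactly one.
low_voice = harmonics_in_band(110.0)
high_soprano = harmonics_in_band(880.0)
other_note = harmonics_in_band(1000.0)
```

Whether a high note excites the band at all therefore depends on the exact pitch, which is the source of the loudness unevenness described above.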


3.1. Music sound
Musical sound has four physical attributes: pitch, duration, intensity and timbre. Timbre is the characteristic
quality of a sound; it distinguishes two voices or musical instruments that have the same pitch, intensity and
duration. Differences in timbre depend first on the number and strength of the overtones, which in the spectrum
correspond to the number and amplitude of the harmonics. Secondly, timbre is closely related to the irregular
components of each overtone. Which of the two aspects is more important is hard to say. Resonance
characteristics are one branch of the information carried by timbre, so the study of singing resonance state can
draw inspiration from research on timbre. In the source-filter model, timbre information is reflected in the filter,
that is, in the shape of the spectral envelope. This chapter investigates experimentally the effect of the filter on the
timbre of signals synthesized with the source-filter model.

Figure 3. Sound information

3.2. Source filter model

The spectral envelope is the frequency response of the filter in the source-filter model. It contains semantic
information and timbre information, and its accurate extraction lays the foundation for recognition, synthesis and
transformation. Six algorithms, including the vocoder, linear prediction coefficients, discrete all-pole modelling,
cepstrum coefficients and regularized discrete cepstrum coefficients, are introduced in this paper.
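As an illustration of one of the listed methods, the sketch below estimates linear prediction coefficients with the autocorrelation method and the Levinson-Durbin recursion, then evaluates the all-pole envelope. The paper does not say which LPC variant was used, so this choice, the function names and the FFT size are assumptions.

```python
import numpy as np

def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC via the Levinson-Durbin recursion.

    Returns (a, err): a = [1, a1, ..., ap] are the coefficients of the
    prediction-error filter A(z), and err is the residual energy.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Correlation of the current predictor with lag i
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err  # reflection coefficient
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

def lpc_envelope(a, err, n_fft=1024):
    """All-pole spectral envelope sqrt(err) / |A(e^{jw})| on n_fft // 2 + 1 bins."""
    return np.sqrt(err) / np.abs(np.fft.rfft(a, n_fft))
```

Applied to the impulse response of a known two-pole filter, the recursion recovers that filter's coefficients, and the envelope peaks at the pole frequency.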
In the source-filter model, the voice signal in the time domain is the convolution of the glottal excitation and
the vocal tract impulse response. In the frequency domain this convolution becomes a product, and taking the
logarithm turns the product into a sum; that is, the cepstrum of the voice signal is the sum of the cepstrum of the
glottal excitation and the cepstrum of the vocal tract impulse response. The glottal excitation cepstrum is
distributed at integer multiples of the pitch period, while the cepstrum of the vocal tract impulse response is
mainly distributed near the origin, within the first pitch period. Therefore, the low-quefrency part of the speech
cepstrum mainly carries the vocal tract transfer function information, while the high-quefrency part mainly
reflects the fundamental frequency information. Consequently, the Fourier transform of the first few cepstrum
coefficients can be used to estimate the spectral envelope. According to the source-filter model, an exponential
pulse train is used to model the excitation source, and the spectral envelope extracted by the above six
algorithms is used to model the filter. Convolution in the time domain corresponds to multiplication in the
frequency domain, so the source and filter spectra are multiplied frame by frame, and the inverse Fourier
transform of each product is then spliced with the other frames in the time domain.
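The cepstral separation described in this section can be sketched as follows: compute the real cepstrum of a frame, keep only the low-quefrency coefficients, and transform back to obtain a smoothed log-magnitude envelope. The function name and the choice of how many coefficients to keep (`n_keep`) are assumptions for illustration; in practice `n_keep` must stay below the pitch period in samples.

```python
import numpy as np

def cepstral_envelope(frame, n_keep):
    """Return (cepstrum, log_envelope) for one windowed frame.

    cepstrum     : real cepstrum of the frame (length len(frame))
    log_envelope : log-magnitude envelope rebuilt from the first n_keep
                   cepstral coefficients, on len(frame) // 2 + 1 bins
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    log_mag = np.log(np.abs(np.fft.fft(frame)) + 1e-12)  # small offset avoids log(0)
    cepstrum = np.fft.ifft(log_mag).real
    lifter = np.zeros(n)
    lifter[:n_keep] = 1.0
    lifter[n - n_keep + 1:] = 1.0  # mirrored part keeps the cepstrum symmetric
    log_envelope = np.fft.fft(cepstrum * lifter).real
    return cepstrum, log_envelope[:n // 2 + 1]
```

The discarded high-quefrency part is where the excitation shows up: for a voiced frame, the largest cepstral value beyond the low-quefrency region sits at the pitch period in samples.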

Figure 4. Vocal signal spectrum


4.1. Fundamental perturbation and amplitude perturbation

The perturbations of fundamental frequency and amplitude mainly reflect the stability of vocal cord vibration;
the smaller their values, the smaller the variation of the acoustic signal during phonation. The noise-to-harmonic
ratio reflects the amount of noise energy in the signal: the more noise energy there is relative to the energy of the
harmonic components, the larger the ratio. Fundamental frequency perturbation and amplitude perturbation
describe how stable the vocal cord vibration is. During the onset and decay of a sound, however, vocal cord
vibration is irregular, and its state there does not measure the differences between singers in their ability to
control vocal fold vibration. The material selected for analysis is therefore the stable segment that remains after
removing the onset and decay.
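A minimal sketch of the two perturbation measures is given below. It assumes the per-cycle pitch periods and peak amplitudes have already been extracted, and it uses one common "local" definition (mean absolute difference between consecutive cycles divided by the mean value); the paper does not state its exact formula, and the helper names and the trim fraction are hypothetical.

```python
def stable_segment(values, trim_fraction=0.1):
    """Drop the first and last trim_fraction of the cycles (onset and decay)."""
    cut = int(len(values) * trim_fraction)
    return values[cut:len(values) - cut]

def relative_perturbation(values):
    """Mean absolute difference between consecutive values, divided by the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))

def jitter_and_shimmer(periods, amplitudes, trim_fraction=0.1):
    """Fundamental (period) perturbation and amplitude perturbation of the stable part."""
    jitter = relative_perturbation(stable_segment(periods, trim_fraction))
    shimmer = relative_perturbation(stable_segment(amplitudes, trim_fraction))
    return jitter, shimmer
```

Trimming before measuring keeps the irregular first and last cycles from inflating both values, in line with the material selection described above.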

Figure 5. Bel canto singing

Bel canto lowers the larynx during singing, so that a new formant arises with the laryngeal cavity acting as an
independent resonant cavity, whereas the larynx is not lowered during reading. Pop singing differs from bel canto
singing; its manner of phonation resembles reading aloud, and it lies between bel canto and reading. In the
recordings of the second group, consisting of students' bel canto songs, popular songs and sustained-tone samples,
the mean singer's formant frequency obtained from the spectral energy distribution varies with voice part: it is
lowest for bass and highest for soprano. The differences in singer's formant frequency between voice parts are
obvious, except between tenor and contralto. This confirms that voice parts can be distinguished according to the
singer's formant frequency. In addition, the standard deviation for soprano is larger, which indicates that the
singer's formant frequency of sopranos is unstable. Professional bel canto singers have rich singing experience,
and show obvious differences in formant frequency between voice parts. Bel canto students have had relatively
little training time and lack experience, and their formant frequencies differ from those of professional singers.

4.2. Bel canto majors and classifiers

The singer's formant frequencies of bel canto majors are higher than those of professional bel canto singers,
whether tenor or soprano, which may be due to their short period of study. The formant frequencies of the
reading samples differ significantly from those of bel canto and pop singing, because in reading there is a split
peak in that region and no new resonance peak produced by a lowered larynx. In theory, the singer's formant
frequency of pop singing should be consistent with that of reading, but under the influence of their training, bel
canto students carry some bel canto technique into their pop singing, so that its singer's formant frequency is
similar to that of bel canto. Judging from the standard deviation, the singer's formant frequency of tenors, in both
bel canto and pop singing, is more stable than that of sopranos.
The purpose of a classifier is to let the computer learn to classify given data automatically. In terms of the
underlying mathematical model, commonly used classifiers include the Bayes classifier, decision tree algorithms,
clustering algorithms, support vector machines and so on. The author mainly used the Bayes classifier. Its
principle is as follows. In the learning phase, the mean and variance of the samples of each class are learned from
the training set. In the test phase, the class-conditional likelihood is obtained from the mean and variance, and
Bayes' formula is then used to calculate the posterior probability of each test object, i.e. the probability that the
object belongs to each class; the class with the maximum posterior probability is chosen as the class of the object.
In other words, the Bayes classifier is optimal in the sense of minimum error rate.
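The learning and test phases just described can be sketched as a small Gaussian Bayes classifier. This is an illustration under stated assumptions rather than the author's code: features are treated as independent within each class (the naive Bayes simplification), the class prior is taken from class frequencies in the training set, and all names are hypothetical.

```python
import math
from collections import defaultdict

def fit_gaussian_bayes(samples, labels):
    """Learning phase: per-class prior, feature means and feature variances."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)
    model = {}
    for label, rows in grouped.items():
        dims = len(rows[0])
        means = [sum(r[d] for r in rows) / len(rows) for d in range(dims)]
        # Small constant keeps the variance strictly positive
        variances = [sum((r[d] - means[d]) ** 2 for r in rows) / len(rows) + 1e-9
                     for d in range(dims)]
        model[label] = (len(rows) / len(samples), means, variances)
    return model

def log_scores(model, x):
    """Test phase: unnormalized log posterior, log prior + log Gaussian likelihood."""
    scores = {}
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for value, mean, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        scores[label] = score
    return scores

def predict(model, x):
    """Choose the class with the maximum posterior probability."""
    scores = log_scores(model, x)
    return max(scores, key=scores.get)
```

Working with log scores avoids numerical underflow when several features are multiplied, and the normalizing constant of Bayes' formula can be skipped because it is the same for every class.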


Experiments based on the source-filter model were used to study the resonance state of the singing voice.
Acoustic features related to resonance were extracted in the time domain, the frequency domain and the cepstral
domain, and were used to classify bel canto versus popular singing, reading aloud versus bel canto, and
professional versus non-professional singers, in order to determine how effective the different features are for
classification. During singing, the singer's formant is generated by adjusting the shape of the vocal organs; it is a
good characteristic parameter for distinguishing bel canto singing and voice parts, and also provides a theoretical
basis for extracting other characteristic parameters. The existence of the singer's formant makes the classification
of bel canto, voice parts and popular singing proceed more smoothly. Classification experiments were run with
time-domain, frequency-domain and cepstral features on three tasks: bel canto versus popular singing versus
reading, sung vowels versus vowels read aloud, and vowels read by professional versus non-professional singers.
The results show that the time-domain features classify well, indicating that bel canto singers are better able to
control the vibration frequency of the vocal folds; that among the frequency-domain features the singer's formant
is not suitable for classifying bel canto versus reading aloud, while the other frequency-domain features can be
used for classification; and that the cepstral features give the best classification results, with the classification
effect of the cepstral coefficients improving further after harmonic suppression.
Vector features classify better than scalar features, so this paper also combines the features and classifies
with each combination. Some feature combinations improve the classification effect, but none exceeds the best
single-feature result. By selecting appropriate features, computational complexity can be reduced while the
classification effect is maintained. Research on the resonance state of the singing voice has always been difficult,
and its development in signal processing has been slow, because the singing signal carries not only the semantic
information of a speech signal but also the melody of the music and the differing characteristics of different
voice parts.


