Lecture16 Hanoi 4up
Foundations of Phonology and Phonetics (Cơ sở âm vị học và ngữ âm học)
Three ways to find F0 (fundamental frequency, pitch height; tần số cơ bản, độ cao):
• From waveform;
• From (wide-band) spectrogram;
• From spectrum.
From waveform:
• F0 (Hz) = 1000 / duration of a single period (in ms)
[Figure: waveform; x-axis: Time, 0 to 0.32; y-axis: Intensity, -0.2 to 0.2]
• How many times does the cycle repeat per second?
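As a quick worked sketch of the waveform method, the code below synthesises a pure tone, measures the mean period between positive-going zero crossings, and applies F0 = 1000 / period (ms). The 125 Hz tone, the 16 kHz sample rate, and the function names are assumptions chosen for illustration, not values from the lecture. Real speech is far noisier than a sine wave, so this only illustrates the arithmetic.

```python
import math

def f0_from_period_ms(period_ms):
    """F0 in Hz from the duration of one period in milliseconds."""
    return 1000.0 / period_ms

def mean_period_ms(samples, sample_rate):
    """Mean time between positive-going zero crossings, in ms."""
    crossings = []
    for i in range(1, len(samples)):
        if samples[i - 1] < 0.0 <= samples[i]:
            # Linear interpolation gives a sub-sample crossing time.
            frac = -samples[i - 1] / (samples[i] - samples[i - 1])
            crossings.append((i - 1 + frac) / sample_rate)
    if len(crossings) < 2:
        raise ValueError("need at least two periods")
    total_s = crossings[-1] - crossings[0]
    return 1000.0 * total_s / (len(crossings) - 1)

# Hypothetical test signal: 0.32 s of a 125 Hz sine sampled at 16 kHz.
SAMPLE_RATE = 16000
TRUE_F0 = 125.0
signal = [math.sin(2 * math.pi * TRUE_F0 * n / SAMPLE_RATE - 0.5)
          for n in range(int(0.32 * SAMPLE_RATE))]

period = mean_period_ms(signal, SAMPLE_RATE)
estimated_f0 = f0_from_period_ms(period)
print(f"period = {period:.2f} ms, F0 = {estimated_f0:.1f} Hz")
```

Averaging over many periods, rather than measuring a single one, reduces the error from misplacing any one period boundary.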
From (wide-band) spectrogram:
• F0 (Hz) = 1000 / duration of interval between striations (in ms)
[Figure: wide-band spectrogram; x-axis: Time (s), 0 to 0.3; y-axis: Frequency (Hz), 0 to 5000]
• One line (striation) = one vocal fold pulse
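Since each striation marks one vocal fold pulse, the interval formula can be applied pulse by pulse to get a rough F0 contour. A minimal sketch, assuming the striation times have already been read off the spectrogram by hand; the function name and the times below are made up for illustration.

```python
def f0_contour_from_pulses(pulse_times_s):
    """Return (midpoint_time_s, f0_hz) for each pair of adjacent pulses."""
    contour = []
    for earlier, later in zip(pulse_times_s, pulse_times_s[1:]):
        interval_ms = (later - earlier) * 1000.0
        if interval_ms <= 0:
            raise ValueError("pulse times must be strictly increasing")
        contour.append(((earlier + later) / 2.0, 1000.0 / interval_ms))
    return contour

# Hypothetical striation times in seconds (not measured from the lecture figure).
pulses = [0.100, 0.110, 0.120, 0.128, 0.136]
for t, f0 in f0_contour_from_pulses(pulses):
    print(f"t = {t:.3f} s  F0 = {f0:.0f} Hz")
```

Here the pulses move closer together over time, so the contour rises.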
F0 from spectrum
• Locate first harmonic (tần số cộng hưởng thứ nhất = F0)...
• ...or H10 (H10/10 = F0)
• Harmonics (cộng hưởng) are always multiples of F0 (bội số của F0)
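The two spectrum rules above can be sketched in a few lines. The first function divides a known harmonic by its number (H10 / 10); the second uses the fact that harmonics are integer multiples of F0, so the greatest common divisor of a set of harmonic frequencies is the largest F0 consistent with them. The function names and frequencies are hypothetical, and the GCD version assumes exact integer Hz values, which measured peaks rarely are.

```python
from functools import reduce
from math import gcd

def f0_from_harmonic(harmonic_hz, harmonic_number):
    """F0 = Hn / n, for example H10 / 10."""
    if harmonic_number < 1:
        raise ValueError("harmonic numbers start at 1")
    return harmonic_hz / harmonic_number

def f0_from_harmonic_series(harmonics_hz):
    """Largest frequency (integer Hz) that divides every listed harmonic."""
    return reduce(gcd, harmonics_hz)

# Hypothetical values: H10 found at 1250 Hz.
print(f0_from_harmonic(1250, 10))
# Hypothetical harmonic peaks at 250, 375 and 500 Hz.
print(f0_from_harmonic_series([250, 375, 500]))
```

Reading off a higher harmonic such as H10 and dividing spreads any measurement error over ten harmonics, which is one reason to prefer it over H1 alone.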
[Figure: spectra (three panels); x-axis: Frequency (Hz), 0 to 5000; y-axis: Sound pressure level (dB/Hz)]
How do languages use F0?
• At the lexical level, as tone (thanh điệu)
• At the utterance level, as prosody (điệu tính) and intonation (ngữ điệu)
• All three uses can occur simultaneously and (semi-)independently.
• Most F0-extraction software (like Praat) essentially uses the last method
• But... software can make mistakes, e.g. octave jump (thinking H2 is H1)
• F0 trace is unreliable when speech is unvoiced/partially devoiced or creaky (dấu nặng, dấu ngã...in Praat?)
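To make the octave-jump problem concrete, here is a simple post-processing heuristic, not Praat's own algorithm: if a frame sits roughly one octave above or below the previous voiced frame, fold it back by a factor of two. The function name, the 0.2-octave tolerance, and the sample track are all assumptions for illustration.

```python
import math

def fix_octave_jumps(f0_track, tolerance=0.2):
    """Fold frames that sit about an octave away from the previous voiced frame.

    f0_track: list of F0 values in Hz, with None for unvoiced frames.
    tolerance: allowed deviation, in octaves, from an exact octave jump.
    """
    corrected = []
    previous = None  # last corrected voiced value; the first one is trusted
    for f0 in f0_track:
        if f0 is None:
            corrected.append(None)
            continue
        if previous is not None:
            jump = math.log2(f0 / previous)
            if abs(jump - 1.0) <= tolerance:
                f0 = f0 / 2.0  # tracker probably took H2 for H1
            elif abs(jump + 1.0) <= tolerance:
                f0 = f0 * 2.0  # tracker probably halved the true F0
        corrected.append(f0)
        previous = f0
    return corrected

# Hypothetical track with one upward and one downward octave error.
track = [120.0, 122.0, 246.0, 250.0, None, 124.0, 61.0, 126.0]
print(fix_octave_jumps(track))
```

The heuristic trusts the first voiced frame and would wrongly fold a genuine octave leap, so checking the trace against the waveform or spectrogram by eye is still worthwhile.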
[Figure: pitch trace of "HARRY's going to Hawaii"; x-axis: Time (s), 0 to 1.32816; y-axis: Pitch (Hz), 0 to 300]
[Figure: pitch trace of "Harry's going to HAWAII"; x-axis: Time (s), 0 to 1.23816; y-axis: Pitch (Hz), 0 to 350]
Prosodic structure
• In tone languages, such effects are (often) achieved through pitch range expansion
Revision
• What acoustic properties of speech do the visual representations we have covered show?
– Waveform
– Spectrum
– Spectrogram
• What does each axis (trục) represent?
• Which distinctive features of speech sounds can/can’t we measure in each representation?
– Why are certain representations better/worse for different classes of speech sounds?
Formant ≠ harmonic
• A formant is not the same as a harmonic!
• Formants are properties of the vocal tract (đường dẫn âm); they are independent of pitch
Spectrogram reading
• Be able to identify broad classes of sounds: vowels (nguyên âm), semivowels (bán nguyên âm), fricatives (âm xát), stops (âm tắc)...
http://www.cns.nyu.edu/ david/courses/perception/lecturenotes/speech/speech.html
[Figure: vowel chart of F1 against F2 (Hertz); axis ticks 200 to 1000 and 3000 to 1000; vowels shown include i, u, æ]
• Given F1 and F2, can you guess which vowel?
• Given a vowel, can you say if F1/F2 are high or low?
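One way to practise the first question is a nearest-neighbour lookup: compare a measured (F1, F2) pair against reference values and pick the closest vowel. The sketch below is illustrative only. The reference formants are rough adult-male averages of the kind found in textbooks, not the values plotted in the lecture's chart, and the function name and log-frequency distance are assumptions.

```python
import math

# Rough illustrative (F1, F2) averages in Hz; assumptions, not lecture data.
REFERENCE_FORMANTS = {
    "i": (270, 2290),
    "u": (300, 870),
    "æ": (660, 1720),
    "ɑ": (730, 1090),
}

def guess_vowel(f1_hz, f2_hz):
    """Return the reference vowel nearest to (F1, F2) on a log-frequency scale."""
    def distance(ref):
        ref_f1, ref_f2 = ref
        return math.hypot(math.log(f1_hz / ref_f1), math.log(f2_hz / ref_f2))
    return min(REFERENCE_FORMANTS, key=lambda v: distance(REFERENCE_FORMANTS[v]))

print(guess_vowel(280, 2200))  # low F1, high F2
print(guess_vowel(700, 1100))  # high F1, low F2
```

The table also answers the second question in miniature: high vowels such as i and u have low F1, and front vowels such as i and æ have high F2.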
First half: general
2. If a complex wave has three component waves with fundamental frequencies (F0) of 60 Hz, 90 Hz, and 120 Hz, what is the fundamental frequency of the complex wave?
3. True or false: the vowel [a] is always pronounced with the same pitch (F0).
Phonology (Âm vị học)
• Format similar to the midterm exam, but there will also be a question on underlying (deep-level) representations....