Z Transform Model of The Vocal Tract (023-028)


Digital Speech Processing

Dr.Sudharsan P
sudharsan@nitt.edu

October 9, 2020

Vocal cords are excited by air pressure from the lungs.

The vocal tract also modifies the sound produced, acting like a digital filter.

The source-tract model lends itself to a signal-system approach to signal processing.

The objective is to find the transfer function of the vocal tract.

$H(s) = P(s)/Q(s)$

Poles are the roots of $Q(s)$.

At pole frequencies, $H(s) \to \infty$; this is called resonance.

These resonance frequencies are called formant frequencies.

Along with the fundamental frequency, they decide how we sound.

Formant frequencies are inversely proportional to vocal tract length.

Hence male children, whose vocal tracts are shorter, produce higher-frequency sound than male adults.

Uniform tube model

The vocal tract can be modeled as a uniform tube of length $l$ and width $w$.

Initially, rigid walls are considered, so that the cross-sectional area $A$ is constant.

In general, the area can also vary along the tube.

An initial pressure $P_0$ is applied.

Two important parameters that vary across space and time are the particle velocity $v(x,t)$ and the pressure $p(x,t)$.

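$$\frac{\partial p}{\partial x} = -\rho \frac{\partial v}{\partial t} \tag{1}$$
$$\frac{\partial p}{\partial t} = -\rho c^2 \frac{\partial v}{\partial x} \tag{2}$$
Here $\rho$ is the density of air in the tube and $c$ is the velocity of sound in air (340 m/s).

From the first equation, differentiating with respect to $x$,
$$\frac{\partial^2 p}{\partial x^2} = -\rho \frac{\partial^2 v}{\partial t\,\partial x} \tag{3}$$
Substituting for $\partial v/\partial x$ from the second equation,
$$\frac{\partial^2 p}{\partial x^2} = \frac{1}{c^2}\frac{\partial^2 p}{\partial t^2} \tag{4}$$
Similarly,
$$\frac{\partial^2 v}{\partial x^2} = \frac{1}{c^2}\frac{\partial^2 v}{\partial t^2} \tag{5}$$
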
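Volume velocity: $u = vA$.

From the first equation,
$$\frac{\partial p}{\partial x} = -\frac{\rho}{A}\frac{\partial u}{\partial t} \tag{6}$$
From the second equation,
$$\frac{\partial p}{\partial t} = -\frac{\rho c^2}{A}\frac{\partial u}{\partial x} \tag{7}$$
$$\frac{\partial u}{\partial x} = -\frac{A}{\rho c^2}\frac{\partial p}{\partial t} \tag{8}$$
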
Pressure and volume velocity are analogous to voltage $v_o(x,t)$ and current $i(x,t)$ respectively:
$$\frac{\partial v_o}{\partial x} = -L\frac{\partial i}{\partial t} \tag{9}$$
$$\frac{\partial i}{\partial x} = -C\frac{\partial v_o}{\partial t} \tag{10}$$
Mapping equation 9 to equation 6 and equation 10 to equation 8, the acoustical inductance is $L = \rho/A$ and the acoustical capacitance is $C = A/(\rho c^2)$.

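The transmission-line analogy can be sanity-checked numerically. The sketch below is only illustrative: the air density `RHO` and the area `AREA` are assumed values not given in the slides, and the function name is hypothetical. It uses the standard transmission-line results that the characteristic impedance is $\sqrt{L/C}$ and the propagation speed is $1/\sqrt{LC}$, which for the acoustic constants should reduce to $\rho c/A$ and $c$.

```python
import math

RHO = 1.2        # assumed air density in kg/m^3 (not given in the slides)
C_SOUND = 340.0  # speed of sound in m/s, as in the slides
AREA = 5e-4      # assumed cross-sectional area in m^2 (hypothetical, 5 cm^2)


def acoustic_line_constants(rho, c, area):
    """Return acoustic (L, C) per unit length, from equations 6 and 8."""
    inductance = rho / area
    capacitance = area / (rho * c ** 2)
    return inductance, capacitance


L_ac, C_ac = acoustic_line_constants(RHO, C_SOUND, AREA)
z_char = math.sqrt(L_ac / C_ac)       # characteristic impedance, expected rho*c/A
speed = 1.0 / math.sqrt(L_ac * C_ac)  # propagation speed, expected c
print(z_char, RHO * C_SOUND / AREA)
print(speed, C_SOUND)
```

Both printed pairs should agree up to floating-point rounding, whatever positive values are chosen for the assumed constants.
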
By mapping pressure and volume velocity to voltage and current, we are ready to derive the transfer function.
Solving these wave equations gives
$$u(x,t) = u^+(t - x/c) - u^-(t + x/c)$$
and
$$p(x,t) = \frac{\rho c}{A}\left[u^+(t - x/c) + u^-(t + x/c)\right]$$
where the characteristic impedance is $Z_T = \rho c/A$.

In a uniform lossless transmission line that is short-circuited at the end, the voltage $v_o(l,t) = 0$.
Similarly, here the pressure $p(l,t) = 0$, i.e. there is no radiation loss at the lips.
At the glottis, i.e. the source, $u(0,t)$ can be represented as $u_G(t)$:
$$u_G(t) = U_G(\Omega)e^{j\Omega t}$$

As the differential equations are linear,
$$u^+(t - x/c) = K^+ e^{j\Omega(t - x/c)}$$
$$u^-(t + x/c) = K^- e^{j\Omega(t + x/c)}$$
At the source ($x = 0$) and time $t = 0$,
$$u(0,0) = u_G(0) = U_G(\Omega) = K^+ - K^-$$
At the lips, $p(l,t) = 0$. Hence
$$Z_T\left(K^+ e^{j\Omega(t - l/c)} + K^- e^{j\Omega(t + l/c)}\right) = 0$$

Solving these two equations,
$$K^+ = \frac{U_G(\Omega)}{1 + e^{-2j\Omega l/c}}$$
$$K^- = \frac{-U_G(\Omega)}{1 + e^{2j\Omega l/c}}$$

$$u(x,t) = u^+(t - x/c) - u^-(t + x/c)$$
$$u(x,t) = K^+ e^{j\Omega(t - x/c)} - K^- e^{j\Omega(t + x/c)}$$
Substituting for $K^+$, $K^-$,
$$u(x,t) = U_G(\Omega)e^{j\Omega t}\,\frac{\cos(\Omega(l - x)/c)}{\cos(\Omega l/c)}$$

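The substitution of $K^+$ and $K^-$ can be verified numerically. The following is a minimal sketch at $t = 0$; the function names, the 300 Hz test frequency and the sample positions are illustrative assumptions, while $l = 17.5$ cm and $c = 340$ m/s are the values used elsewhere in these slides.

```python
import cmath
import math


def volume_velocity(x, omega, length=0.175, c=340.0, u_g=1.0):
    """Complex amplitude of u(x, t) at t = 0, built from K+ and K-."""
    theta = omega * length / c
    k_plus = u_g / (1 + cmath.exp(-2j * theta))
    k_minus = -u_g / (1 + cmath.exp(2j * theta))
    forward = k_plus * cmath.exp(-1j * omega * x / c)
    backward = k_minus * cmath.exp(1j * omega * x / c)
    return forward - backward


def closed_form(x, omega, length=0.175, c=340.0, u_g=1.0):
    """U_G cos(omega (l - x) / c) / cos(omega l / c), also at t = 0."""
    return u_g * math.cos(omega * (length - x) / c) / math.cos(omega * length / c)


omega = 2 * math.pi * 300.0  # hypothetical test frequency (300 Hz)
for x in (0.0, 0.05, 0.10, 0.175):
    print(x, volume_velocity(x, omega), closed_form(x, omega))
```

The two columns should match to rounding error, with a negligible imaginary part in the first. The check is not meaningful exactly at a resonance, where $\cos(\Omega l/c) = 0$.
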
$$p(x,t) = Z_T\left(u^+(t - x/c) + u^-(t + x/c)\right)$$
$$p(x,t) = Z_T\left(K^+ e^{j\Omega(t - x/c)} + K^- e^{j\Omega(t + x/c)}\right)$$
Substituting for $K^+$, $K^-$,
$$p(x,t) = jZ_T U_G(\Omega)e^{j\Omega t}\,\frac{\sin(\Omega(l - x)/c)}{\cos(\Omega l/c)}$$
As pressure and volume velocity are analogous to voltage and current, the acoustic impedance is
$$Z = \frac{p(x,t)}{u(x,t)} = jZ_T\tan(\Omega(l - x)/c)$$
If $x$ is close to $l$, with $\delta x = l - x$,
$$Z \approx jZ_T\frac{\Omega\,\delta x}{c}$$

The acoustic impedance per unit length is $jZ_T\Omega/c$.
Using the $u(x,t)$ derived above,
$$u(l,t) = U_l(\Omega)e^{j\Omega t} = \frac{U_G(\Omega)e^{j\Omega t}}{\cos(\Omega l/c)}$$
Hence
$$V_a(\Omega) = \frac{u(l,t)}{u(0,t)} = \frac{U_l(\Omega)}{U_G(\Omega)} = \frac{1}{\cos(\Omega l/c)}$$
Therefore the formant frequencies are at $\Omega l/c = (2n + 1)\pi/2$, i.e.
$$f = \frac{(2n + 1)c}{4l}$$
At $n = 0$, $f = c/4l$. With $l = 17.5$ cm and $c = 340$ m/s, $f \approx 486$ Hz, i.e. approximately 500 Hz. The other formant frequencies are at approximately 1500 Hz, 2500 Hz, ...

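The formant formula for the lossless uniform tube is easy to evaluate directly. A small sketch follows; the function name and the `count` parameter are illustrative, not from the slides.

```python
def uniform_tube_formants(length_m, c=340.0, count=3):
    """Resonances f_n = (2n + 1) c / (4 l) of the lossless uniform tube."""
    return [(2 * n + 1) * c / (4.0 * length_m) for n in range(count)]


formants = uniform_tube_formants(0.175)
print([round(f) for f in formants])  # [486, 1457, 2429]
```

With $c = 340$ m/s the values come out slightly below the rounded 500, 1500 and 2500 Hz; those round figures correspond to $c = 350$ m/s for the same tube length.
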
In real life, there is radiation loss at the lips, so the pressure $p(l,t) \neq 0$. It is related to $u(l,t)$ by
$$P(l,\Omega) = Z_L(\Omega)U(l,\Omega)$$
The impedance at the end is a parallel combination of a resistance and an inductance. So
$$\frac{1}{Z_L(\Omega)} = \frac{1}{R_r} + \frac{1}{j\Omega L_r}$$
So
$$Z_L(\Omega) = \frac{j\Omega R_r L_r}{R_r + j\Omega L_r}$$
If $\Omega$ is small, $Z_L(\Omega) \approx j\Omega L_r \approx 0$, so at low frequency there is no radiation loss.
At high frequency, $Z_L(\Omega) \approx R_r$.

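The two limits of the radiation impedance can be illustrated numerically. In this sketch the values of `R_R` and `L_R` are hypothetical normalized numbers chosen only to make the limits visible; they are not physical values from the slides.

```python
import math


def radiation_impedance(omega, r_r, l_r):
    """Parallel combination of a resistance r_r and an inductance l_r."""
    return (1j * omega * r_r * l_r) / (r_r + 1j * omega * l_r)


R_R, L_R = 1.0, 1e-4  # hypothetical normalized values, for illustration only
low = radiation_impedance(2 * math.pi * 1.0, R_R, L_R)   # omega * L_r << R_r
high = radiation_impedance(2 * math.pi * 1e7, R_R, L_R)  # omega * L_r >> R_r
print(abs(low))           # close to 0, dominated by j * omega * L_r
print(abs(high - R_R))    # close to 0, so Z_L is close to R_r
```

The crossover between the two regimes is around $\Omega = R_r / L_r$, where the resistive and inductive branches have equal magnitude.
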
Relation between the pressure at the lips and the volume velocity at the glottis:
$$H(\Omega) = \frac{P(l,\Omega)}{U_G(\Omega)} = \frac{P(l,\Omega)}{U(l,\Omega)}\cdot\frac{U(l,\Omega)}{U_G(\Omega)} = Z_L(\Omega)V_a(\Omega)$$

It is better to model the vocal tract as a series of short lossless tube segments, each of which has constant area.

Figure: Multiple tubes

$$p_k(x,t) = \frac{\rho c}{A_k}\left(u_k^+(t - x/c) + u_k^-(t + x/c)\right)$$
$$u_k(x,t) = u_k^+(t - x/c) - u_k^-(t + x/c)$$
$$u_k(l_k,t) = u_{k+1}(0,t)$$
Similarly,
$$p_k(l_k,t) = p_{k+1}(0,t)$$
The time to traverse a tube is $\tau_k = l_k/c$. Hence
$$u_k(l_k,t) = u_k^+(t - \tau_k) - u_k^-(t + \tau_k)$$
$$u_{k+1}(0,t) = u_{k+1}^+(t) - u_{k+1}^-(t)$$

Equating the above two equations,
$$u_k^+(t - \tau_k) - u_k^-(t + \tau_k) = u_{k+1}^+(t) - u_{k+1}^-(t) \tag{11}$$
$$p_k(l_k,t) = \frac{\rho c}{A_k}\left(u_k^+(t - \tau_k) + u_k^-(t + \tau_k)\right)$$
$$p_{k+1}(0,t) = \frac{\rho c}{A_{k+1}}\left(u_{k+1}^+(t) + u_{k+1}^-(t)\right)$$
Equating these two, with $Z_k = \rho c/A_k$,
$$Z_k\left(u_k^+(t - \tau_k) + u_k^-(t + \tau_k)\right) = Z_{k+1}\left(u_{k+1}^+(t) + u_{k+1}^-(t)\right) \tag{12}$$

The objective is to express $u_{k+1}^+(t)$ in terms of the forward wave $u_k^+(t - \tau_k)$ and the backward wave $u_{k+1}^-(t)$. Using equations 11 and 12,
$$u_{k+1}^+(t) = \frac{2Z_k}{Z_k + Z_{k+1}}\,u_k^+(t - \tau_k) + \frac{Z_k - Z_{k+1}}{Z_k + Z_{k+1}}\,u_{k+1}^-(t)$$
Similarly, express $u_k^-(t + \tau_k)$ in terms of $u_{k+1}^-(t)$ and $u_k^+(t - \tau_k)$:
$$u_k^-(t + \tau_k) = \frac{Z_{k+1} - Z_k}{Z_k + Z_{k+1}}\,u_k^+(t - \tau_k) + \frac{2Z_{k+1}}{Z_k + Z_{k+1}}\,u_{k+1}^-(t)$$
The reflection coefficient is
$$r_k = \frac{Z_k - Z_{k+1}}{Z_k + Z_{k+1}} = \frac{A_{k+1} - A_k}{A_{k+1} + A_k}$$

$$u_k^-(t + \tau_k) = -r_k\,u_k^+(t - \tau_k) + (1 - r_k)\,u_{k+1}^-(t)$$
$$u_{k+1}^+(t) = (1 + r_k)\,u_k^+(t - \tau_k) + r_k\,u_{k+1}^-(t)$$

Figure: Digital filter

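The junction relations can be checked against the continuity conditions they were derived from (equations 11 and 12). The sketch below uses hypothetical areas and incoming wave amplitudes; the function names are illustrative only. Because $Z_k = \rho c/A_k$, the common factor $\rho c$ is dropped from the pressure check.

```python
def reflection_coefficient(area_k, area_next):
    """r_k = (A_{k+1} - A_k) / (A_{k+1} + A_k)."""
    return (area_next - area_k) / (area_next + area_k)


def scatter(r_k, forward_in, backward_in):
    """Junction update.

    forward_in  is u_k^+(t - tau_k), arriving from tube k.
    backward_in is u_{k+1}^-(t), arriving from tube k+1.
    Returns (u_{k+1}^+(t), u_k^-(t + tau_k)).
    """
    forward_out = (1 + r_k) * forward_in + r_k * backward_in
    backward_out = -r_k * forward_in + (1 - r_k) * backward_in
    return forward_out, backward_out


A_K, A_NEXT = 2.0, 6.0   # hypothetical areas (any consistent unit)
FWD_IN, BWD_IN = 1.0, 0.2  # hypothetical incoming wave amplitudes

r = reflection_coefficient(A_K, A_NEXT)
fwd_out, bwd_out = scatter(r, FWD_IN, BWD_IN)

# Equation 11: volume velocity is continuous across the junction.
flow_left, flow_right = FWD_IN - bwd_out, fwd_out - BWD_IN
# Equation 12: pressure is continuous, with Z proportional to 1 / A.
pressure_left = (FWD_IN + bwd_out) / A_K
pressure_right = (fwd_out + BWD_IN) / A_NEXT
print(r, flow_left, flow_right, pressure_left, pressure_right)
```

For these numbers $r_k = 0.5$, and both the flow pair and the pressure pair should agree up to rounding.
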
If $A_k \ll A_{k+1}$, $r_k \approx 1$.
If $A_k \gg A_{k+1}$, $r_k \approx -1$.
In general, $-1 < r_k < 1$.

So when $A_{k+1} \gg A_k$, $u_{k+1}^-(t)$ gets completely reflected back into $u_{k+1}^+(t)$. So the wave does not travel from a huge tube into a very small tube.
If all tubes have the same length, then the propagation time is also the same, i.e. $\tau_k = \tau_{k+1} = \tau$. With $l_k = l_{k+1} = l/N$, $\tau = \frac{l}{Nc}$.
In a vocal tract with $N$ tubes, there are $N - 1$ junctions.

r_N = r_L

Figure: Last Junction

$$P_N(l_N,\Omega) = Z_L\,U_N(l_N,\Omega)$$
$$\frac{\rho c}{A_N}\left(u_N^+(t - \tau_N) + u_N^-(t + \tau_N)\right) = Z_L\left(u_N^+(t - \tau_N) - u_N^-(t + \tau_N)\right)$$
$$u_N^-(t + \tau_N) = -r_L\,u_N^+(t - \tau_N)$$
The reflection coefficient at the lips is
$$r_L = \frac{\rho c/A_N - Z_L}{\rho c/A_N + Z_L}$$
The output volume velocity at the lips is
$$u_N(l_N,t) = u_N^+(t - \tau_N) - u_N^-(t + \tau_N) = (1 + r_L)\,u_N^+(t - \tau_N)$$

Loss at glottis

$$U_1(0,\Omega) = U_G(\Omega) - P_1(0,\Omega)/Z_G$$
If $Z_G$ is real,
$$u_1(0,t) = u_G(t) - p_1(0,t)/Z_G$$
$$u_1^+(t) - u_1^-(t) = u_G(t) - \frac{\rho c}{A_1 Z_G}\left(u_1^+(t) + u_1^-(t)\right)$$
$$u_1^+(t) = \frac{1 + r_G}{2}\,u_G(t) + r_G\,u_1^-(t)$$
If $Z_G$ is complex, $u_1^+(t)$ would be related to $u_1^-(t)$ and $u_G(t)$ by a differential equation.
$$r_G = \frac{Z_G - \rho c/A_1}{Z_G + \rho c/A_1}$$

Figure: Glottis Junction

Figure: Two tube tract

The objective is to find the transfer function
$$V_a(\Omega) = \frac{U_L(\Omega)}{U_G(\Omega)}$$
A delay of $\tau$ is represented by $z^{-1/2}$, since the sampling period is $2\tau$.

Figure: Z transform model

$$u_{k+1}^+(t) = (1 + r_k)\,u_k^+(t - \tau_k) + r_k\,u_{k+1}^-(t)$$
$$U_{k+1}^+(z) = (1 + r_k)\,z^{-1/2}\,U_k^+(z) + r_k\,U_{k+1}^-(z)$$
$$U_k^+(z) = \frac{z^{1/2}}{1 + r_k}\,U_{k+1}^+(z) - \frac{r_k z^{1/2}}{1 + r_k}\,U_{k+1}^-(z)$$
$$u_k^-(t + \tau_k) = -r_k\,u_k^+(t - \tau_k) + (1 - r_k)\,u_{k+1}^-(t)$$
$$U_k^-(z)\,z^{1/2} = -r_k\,z^{-1/2}\,U_k^+(z) + (1 - r_k)\,U_{k+1}^-(z)$$
Substituting for $U_k^+(z)$,
$$U_k^-(z) = \frac{-r_k z^{-1/2}}{1 + r_k}\,U_{k+1}^+(z) + \frac{z^{-1/2}}{1 + r_k}\,U_{k+1}^-(z)$$

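$$\begin{pmatrix} U_k^+(z) \\ U_k^-(z) \end{pmatrix} = \begin{pmatrix} \dfrac{z^{1/2}}{1 + r_k} & \dfrac{-r_k z^{1/2}}{1 + r_k} \\ \dfrac{-r_k z^{-1/2}}{1 + r_k} & \dfrac{z^{-1/2}}{1 + r_k} \end{pmatrix} \begin{pmatrix} U_{k+1}^+(z) \\ U_{k+1}^-(z) \end{pmatrix}$$
$$U_k = R_k U_{k+1}$$
$$U_1 = R_1 R_2 \cdots R_N U_{N+1}$$
$$U_{N+1} = \begin{pmatrix} U_L(z) \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} U_L(z)$$
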
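$$u_1^+(t) = \frac{1 + r_G}{2}\,u_G(t) + r_G\,u_1^-(t)$$
$$U_1^+(z) = \frac{1 + r_G}{2}\,U_G(z) + r_G\,U_1^-(z)$$
$$U_G(z) = \frac{2}{1 + r_G}\begin{pmatrix} 1 & -r_G \end{pmatrix}\begin{pmatrix} U_1^+(z) \\ U_1^-(z) \end{pmatrix}$$
$$U_G(z) = \frac{2}{1 + r_G}\begin{pmatrix} 1 & -r_G \end{pmatrix} U_1$$
$$U_G(z) = \frac{2}{1 + r_G}\begin{pmatrix} 1 & -r_G \end{pmatrix} R_1 R_2 \cdots R_N U_{N+1}$$
$$U_G(z) = \frac{2}{1 + r_G}\begin{pmatrix} 1 & -r_G \end{pmatrix} R_1 R_2 \cdots R_N \begin{pmatrix} 1 \\ 0 \end{pmatrix} U_L(z)$$
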
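$$\frac{U_G(z)}{U_L(z)} = \frac{1}{V(z)} = \frac{2}{1 + r_G}\begin{pmatrix} 1 & -r_G \end{pmatrix} R_1 R_2 \cdots R_N \begin{pmatrix} 1 \\ 0 \end{pmatrix}$$
$V(z)$ has $N$ poles. If the filter coefficients are real, the complex poles occur in conjugate pairs. Hence up to $N/2$ formant frequencies are possible.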
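
The matrix product for $1/V(z)$ can be evaluated numerically on the unit circle. The sketch below is one possible way to do it, not a definitive implementation: the function names are hypothetical, $z^{1/2}$ is taken as $e^{j\Omega\tau}$ (sampling period $2\tau$), and the test case assumes equal areas with $r_G = 1$ and $r_L = 1$ (no loss at the glottis or lips). Under those assumptions the result should reduce to the uniform-tube response $1/\cos(\Omega l/c)$.

```python
import cmath
import math


def junction_matrix(r, z_half):
    """R_k for reflection coefficient r; z_half stands for z^(1/2)."""
    scale = 1.0 / (1.0 + r)  # undefined for r = -1
    return [[scale * z_half, -scale * r * z_half],
            [-scale * r / z_half, scale / z_half]]


def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[a[i][0] * b[0][j] + a[i][1] * b[1][j] for j in range(2)]
            for i in range(2)]


def tract_response(reflections, r_glottis, omega, tau):
    """V = U_L / U_G at z = exp(j * omega * 2 * tau).

    reflections is [r_1, ..., r_N], where r_N = r_L at the lips.
    """
    z_half = cmath.exp(1j * omega * tau)
    chain = [[1.0, 0.0], [0.0, 1.0]]
    for r in reflections:
        chain = matmul2(chain, junction_matrix(r, z_half))
    # [1, -r_G] * chain * [1, 0]^T uses only the first column of the product.
    inverse_v = 2.0 / (1.0 + r_glottis) * (chain[0][0] - r_glottis * chain[1][0])
    return 1.0 / inverse_v


LENGTH, C_SOUND, N_TUBES = 0.175, 340.0, 10
tau = LENGTH / (N_TUBES * C_SOUND)
omega = 2 * math.pi * 200.0              # hypothetical test frequency (200 Hz)
uniform = [0.0] * (N_TUBES - 1) + [1.0]  # equal areas, r_L = 1
v = tract_response(uniform, 1.0, omega, tau)
print(v, 1.0 / math.cos(omega * LENGTH / C_SOUND))
```

Replacing `uniform` with reflection coefficients computed from a measured or assumed area function gives the response of a non-uniform tract under the same lossless-junction model.
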
Problem:
Find $N$ if the maximum signal frequency is $F = 5$ kHz.
Solution: $F_s = 2F = 10$ kHz and $2\tau = 1/F_s$. Using $\tau = l/(Nc)$ with $l = 17.5$ cm and $c = 340$ m/s, $N = 2lF_s/c \approx 10$.
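
The arithmetic in the solution can be reproduced in a few lines. This is a small sketch with a hypothetical function name, following the same steps: Nyquist rate, then $2\tau = 1/F_s$, then $\tau = l/(Nc)$.

```python
def tubes_for_bandwidth(bandwidth_hz, length_m=0.175, c=340.0):
    """Number of equal-length tubes N implied by 2 * tau = 1 / Fs."""
    fs = 2.0 * bandwidth_hz      # sampling rate at the Nyquist rate
    tau = 1.0 / (2.0 * fs)       # one-way delay per tube
    return length_m / (c * tau)  # from tau = l / (N c)


n_exact = tubes_for_bandwidth(5000.0)
print(n_exact, round(n_exact))
```

The exact value is a little above 10, which is rounded to $N = 10$ tubes.
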
