
Part 2.

Introduction to Lombard speech


History and interest
Mechanisms
Implications for experimental paradigms

Noise

Lombard (1911), "Le signe de l'élévation de la voix": deafness detection

PHYSIOLOGICAL AND COGNITIVE MECHANISMS

Mechanisms
(1) Neural reflex to the hyperstimulation of the auditory system
Nonaka et al. (1997) Neuroscience Research (2003) International Congress Series 1240.

Physiological relationship between vocalization and audition
Greater activation of laryngeal and abdominal muscles with increasing loudness of the auditory stimulation

Decerebrate cats

Houde et al. 1998, Eliades et al. 2003, Hashimoto et al. 2003 (neurosciences)

Mechanisms
(1) Neural reflex to the hyperstimulation of the auditory system

Acoustic reflex

Reduces the ear sensitivity to high intensity levels Attenuation of high frequencies
(Margolis 1975, Al-Azazi 200)

Automatically activated before phonating


(Olsen et al. 1976)

Mechanisms
(2) Audio-phonation loop
Regulation of voice parameters from the auditory feedback

Noise

Automatism Servomechanism
Auditory feedback

Egan 1972; Fairbanks 1954; Korn 1954; Lane, Tranel, & Sisson 1970; Tonkinson 1994

Mechanisms
(2) Audio-phonation loop
Very short reaction times
Neural reflex

Noise

Bauer, Mittal, Larson, & Hain (2006) (loudness)
Leydon, Bauer, & Larson (2003) (pitch)

Auditory feedback

Mechanisms
(2) Audio-phonation loop
Regulation that cannot be completely inhibited
Pick, Siegel, Fox, Garber, & Kearney (1989); Siegel et al. (1974)

Noise

Or with another feedback


After effect

Auditory feedback

Mechanisms
(2) Audio-phonation loop
Regulation observed in babies and children
Siegel et al. (1976), Garber et al. (1980), Amazi et al. (1982)
Auditory feedback

Noise

As well as in animals
Sinnott et al. (1975), Manabe et al. (1998)

Auditory feedback of one's own voice

Different paths

1-Air (external)

2-Bone conduction
3-Tendon and muscle conduction

(+ Proprioceptive feedback, Scotto di Carlo (1994))

Auditory feedback of one's own voice

External feedback
Radiation and attenuation of high frequencies

Individual HRTF

Room acoustics and reverberation

Auditory feedback of one's own voice

Internal feedback

Strong attenuation above 2 kHz
Enhancement between 700 and 1200 Hz (~F1)
von Békésy (1960), Hood (1962), Maurer et al. (1990), Pörschmann (2000), Stenfelt et al. (2002)

Occlusion effect

Auditory feedback of one's own voice

Complementarity of these different contributions
Regulation of low frequencies: bone conduction
Regulation of high frequencies: external feedback

Garber et al. (1981) : Sidetone effect + filtering of the external feedback

Auditory feedback of one's own voice

Perception of one's own loudness (i.e. autophonic level)
Lane et al. (1961)

Loudness × 2 for the speaker
Loudness × 1.2 for the listener

Mechanisms
(3) Social regulation (active) communicative adaptation
Noise

Compensation of the perturbed intelligibility

Lane et al. (1961),

Junqua (1993)

Mechanisms
(3) Communicative adaptation
Hyper- and Hypo-articulation (H&H) theory, Lindblom (1990)

Situation perturbed

intelligibility

Adapted level ideal Articulatory effort

Mechanisms
(3) Communicative adaptation
Distance Noise
Lombard speech
Traunmüller et al. (2000), Kim (2005), Junqua (1992-1999), Garnier et al. (2006-2010), Cooke et al. (2006-2010), Patel and Schell (2008), Picheny et al. (1985, 1986)

Hard-of-hearing people
Babies and children

Infant-Directed Speech (IDS)

Lindblom (1992) Dodane et al. (2006) Burnham et al. Uther et al. (2007)

Foreign directed speech

Mechanisms
(3) Communicative adaptation
Distance Noise
Lombard speech

loud / shouted speech

Bond, Rostolland, Schulman, Geumann

hard of hearing people

Babies and children


Infant-Directed Speech (IDS)

Deliberate clear speech, hyperarticulated speech

Bradlow Cutler Krause

Foreign people

Mechanisms
(3) Communicative adaptation
In fact, many other speech modifications (developed in parts 3 and 4):

Global acoustic and articulatory modifications that may be related to voice intensity

Linguo-specific modifications that are not directly related to voice intensity

Mechanisms
(3) Communicative adaptation
Global acoustic and articulatory modifications that may be related to voice intensity
Fundamental frequency (f0) Spectral slope
Correlation established
(Schulman 1989, Geumann 1999, 2001)

Though no physiological relationship

Jaw and mouth aperture First formant (F1)


Controversial idea

Speech rate

Mechanisms
(3) Communicative adaptation
Linguo-specific modifications that are not directly related to voice intensity (developed in Part 4)
Segment recognition / discrimination: F2, F3, lip rounding, spreading
Speech segmentation: pauses, syllable lengthening, intonation patterns
Information highlighting: contrasts in intensity, f0

Mechanisms
(3) Communicative adaptation
Influence of these modifications on
Automatic Speech Recognition (--)
Junqua 1993, Bond

Speech intelligibility (++)


Junqua 1993, Van Summers et al. (1988), Cooke, Dreher et al. 1957

(3) Communicative adaptation


Modeling speech transforms to improve the robustness of ASR systems
(advanced. See Part 3)

Identifying the modifications that contribute to this increased intelligibility


(little is known: Skowronski 2006, Lu & Cooke 2009, see Part 3; Garnier, in prep., see Part 4)

Determining whether this increased intelligibility is (actively) sought by the speaker

(in progress)

(3) Communicative adaptation


Influence of spontaneity and interaction

ΔIntensity: spontaneous task > reading

(Lane et al. 1961, Kryter 1946, Pickett 1958, Gardner 1964, 1966) (Amazi et al. 1982, Junqua 1999)

Speech addressed to someone > reading
ΔTemporal and spectral modifications

interactive task non interactive

(Cooke & Lu, 2010)

(Garnier et al., 2010)

ΔModifications related to voice intensity: interactive task > non-interactive

ΔModifications not related to voice intensity: greater or significant only in interactive task

(3) Communicative adaptation


Do speakers actively adapt to the noise type?
Mokbel (1992): Yes
Cooke & Lu (2010): Yes for temporal modifications, no for spectral modifications

Garnier & Henrich (subm.): Only when compatible with 1st-order adaptation

Further details in Part 3

(3) Communicative adaptation


Do speakers actively adapt to interaction modalities?
Situation: Acoustic modifs

(S1) No interaction

Articulatory modifs

(S2) Audio only interaction

-visible (lips)
-non/less visible (tongue) (S3) Audiovisual interaction

Collaboration with Lucie Ménard, UQAM (Montreal)

IMPLICATION IN THE CHOICE OF EXPERIMENTAL PARADIGMS

Implication on the choice of experimental paradigms


Different interests in the Lombard effect:
Audiology, psychoacoustics
Logopedics
Automatic speech recognition
Communication, speech enhancement

Different mechanisms considered, different protocols:
-Noisy conditions
-Sound immersion methods
-Speech production tasks

Studies:
Amazi et al. 1982 [8]; Bond et al. 1989 [31]; Boril et al. 2005 [32]; Castellanos et al. 1996 [46]; Davis et al. 2006 [62]; Dejonckere et al. 1983 [65]; Dieroff et al. 1966 [71]; Dohalska et al. 2000 [73]; Egan 1972 [80]; Hofler 1984 [149]; Junqua 1992 [173]; Junqua 1993 [171]; Kadiri 1998 [176]; Kim et al. 2005 [179]; Korn 1954 [185]; Lamprecht 1988 [191]; Lane et al. 1970 [194]; Lee et al. 2004 [204]; Mixdorff et al. 2006 [244]; Mokbel 1992 [245]; Papon 2006 [265]; Pick et al. 1989 [271]; Pisoni et al. 1985 [272]; Schultz-Coulon et al. 1976 [305]; Siegel et al. 1974 [315]; Sinnott et al. 1975 [318]; Sodersten et al. 2005 [322]; Stanton 1988 [327]; Ternström et al. 2002 [346]; Ternström et al. 2006 [345]; Tonkinson 1994 [353]; Frank et al. 2003 [97]; Van Heusden et al. 1979 [360]; Van Summers et al. 1988 [361]; Webster 1962 [372]; Welby 2006 [375]

Noise types:
Conversation noise
Pink noise
25 car noises (from CAR2E), 4 white noises between 62-125 Hz, 75-300 Hz, 220-1120 Hz and 840-2500 Hz
White noise
White noise, conversation noise
White noise
Noise of children in a preschool
Railway traffic noise
White noise: broadband, low-frequency, high-frequency and mid-frequency
White noise
White noise and conversation noise
White noise
White noise and conversation noise
White noise and conversation noise
White noise with the same spectral envelope as conversation noise
White noise
White noise
Noise of a car idling, and driving at 35 and 55 miles/h with the window open or closed
Conversation noise
Noise of a car driving at 90 and 130 km/h, 4 white noises: broadband, low-pass filtered at 1 kHz or 1.5 kHz, or high-pass filtered at 1.7 kHz
Conversation noise
White noise
White noise low-pass filtered at 3.5 kHz
White noise
Noise with the same spectral envelope as conversation noise
Noise in the 200-500 Hz band, noise in the 8-16 kHz band
Noise of a quiet classroom, ventilation noise, noise of a noisy classroom, bar with loud music
Pink noise
Stationary white noise, preschool, bar with loud music, ventilation noise
Noise of a quiet classroom, ventilation noise, noise of a noisy classroom, bar with loud music
Choir
Pink noise
White noise with the same spectral envelope as conversation noise
Reverberant room, canteen, computer room, car, boiler room
White noise
Conversation noise
White noise

Noise levels:
90 dB SPL; 95 dB; 90 dB SPL; 85 dB SPL; 80 dB SPL; 30, 50, 70, 90 dB; 85-90 dB; Not specified; 20 to 120 dB in 10 dB steps; 100-105 dB (A); 85 dB SPL; 85 dB SPL; 85 dB SPL; 80 dB SPL; 40, 50, 60, 70, 80, 90 dB; 95 dB (A); 71, 75, 79 and 83 dB; Not specified; 60, 70 and 80 dB SPL; 66 dB (A), 79 dB (A); 80 dB SPL; 90 dB SPL; 80, 90, 100 dB; 60, 80, 100, 110 dB; 60, 70, 80 dB SPL; 70, 80, 90 dB SPL; 30, 70-78, 74, 78-85, 87 dB; 90 dB; 68, 70, 82, 85 dB SPL; 30, 70-78, 74, 78-85, 87 dB; 65 to 95 dB SPL; 60, 70, 80, 90, 100 SPL; 35, 45, 55, 65 dB(A); 31, 39, 61, 65, 75 dB(A); 80, 90, 100 dB; 65, 75, 85 dB (C); 80 dB

Noisy conditions
Noises of interest :
Factories (Borsuk and Klajman 1967, Pruszewicz et al. 1974)
Cars (Egan 1972, Mokbel 1992, Boril and Polack 2005, Lee et al. 2004)
Train stations (Dohalska et al. 2000)
Conversations, crowds (Ternström et al. 2002, 2006, Zeiliger et al. 1994, Sodersten et al. 2005, Davis et al. 2006, Garnier et al. 2007)
Loud music (Ternström et al. 2002, 2006)
Preschool (Sodersten et al. 2002, Dieroff et al. 1966)

Noisy conditions
Spectral envelopes

Broadband noises
White Gaussian noise (Egan 1972, all reference studies)

Noises enhanced in low frequencies
In-car noise / envelope (Mokbel 1992, Boril et al. 2005)
Multitalker noise / envelope (Egan 1972, Siegel et al. 1974, many reference studies)
Pink noise (Frank et al. 2003)
Ventilation noise (Ternström et al. 2006)
Low-pass filtered broadband noise (Cooke & Lu 2010)

Noises enhanced in medium or high frequencies
Band-pass / high-pass filtered broadband noise

Noise levels
From 60 to 105 dB (SPL, A, C / at ear)?
Different limitations depending on the country
Linearity of the Lombard effect: Garnier et al. (2006)
[Figure: voice intensity as a function of noise level]

Noisy conditions
Spectral envelopes and energetic masking

Broadband noises
White Gaussian noise (Egan 1972, all reference studies)
Pink noise (Frank et al. 2003)
Ventilation noise (Ternström et al. 2006)

Noises enhanced in low frequencies
In-car noise / envelope (Mokbel 1992, Boril et al. 2005)
Multitalker noise / envelope (Egan 1972, Siegel et al. 1974, many reference studies)
Gender, number
Low-pass filtered broadband noise (Cooke & Lu 2010)

Noises enhanced in medium or high frequencies
Band-pass / high-pass filtered broadband noise

Types of noise
Temporal fluctuations
Music, small number of talkers: large fluctuations
Broadband noise, large number of talkers: stationary

Informational masking
Number of talkers
Same language as target speech
Reverse

Noisy conditions
Noise levels
From 60 to 105 dB
dB SPL, A, C
Calibration at the speaker's ear
Different limitations depending on the country

Linearity of the Lombard effect? Garnier et al. (2006)
[Figure: voice as a function of noise level; labels: dB A, cocktail-party noise, white noise, average, individual behaviours]
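The linearity question raised on this slide can be explored by fitting a straight line to voice level against noise level and looking at the slope and the goodness of fit. The following is a minimal sketch only: the function name `fit_line` and the four data points are hypothetical illustration values, not measurements from Garnier et al. (2006).

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept, with R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Coefficient of determination: 1 means a perfectly linear relation
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    return slope, intercept, r2


# Hypothetical illustration values (dB), not real data
noise_db = [60.0, 70.0, 80.0, 90.0]
voice_db = [65.0, 70.0, 75.0, 80.0]

slope, intercept, r2 = fit_line(noise_db, voice_db)
print(f"slope = {slope:.2f} dB/dB, intercept = {intercept:.1f} dB, R^2 = {r2:.3f}")
```

Fitting each speaker separately as well as the group average would show whether an apparently linear average hides non-linear individual behaviours.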

Studies:
Amazi et al. 1982 [8]; Bond et al. 1989 [31]; Boril et al. 2005 [32]; Castellanos et al. 1996 [46]; Davis et al. 2006 [62]; Dejonckere et al. 1983 [65]; Dieroff et al. 1966 [71]; Dohalska et al. 2000 [73]; Egan 1972 [80]; Hofler 1984 [149]; Junqua 1992 [173]; Junqua 1993 [171]; Kadiri 1998 [176]; Kim et al. 2005 [179]; Kim 2005 [180]; Korn 1954 [185]; Lamprecht 1988 [191]; Lane et al. 1970 [194]; Lee et al. 2004 [204]; Mixdorff et al. 2006 [244]; Mokbel 1992 [245]; Nonaka et al. 1997 [256]; Papon 2006 [265]; Pick et al. 1989 [271]; Pisoni et al. 1985 [272]; Schultz-Coulon et al. 1976 [305]; Siegel et al. 1974 [315]; Sinnott et al. 1975 [318]; Sodersten et al. 2005 [322]; Stanton 1988 [327]; Ternström et al. 2002 [346]; Ternström et al. 2006 [345]; Tonkinson 1994 [353]; Frank et al. 2003 [97]; Van Heusden et al. 1979 [360]; Van Summers et al. 1988 [361]

Sound immersion methods:
Headphones with feedback of own voice; Headphones with; Headphones; Headphones; Headphones, loudspeakers; Headphones; In situ; In situ; Headphones; Headphones; Headphones; Headphones; Headphones with and without feedback of own voice; 72 dB
Headphones, loudspeakers; Headphones; Headphones; Headphones with feedback of own voice; In situ; Loudspeakers; Headphones; Loudspeakers; Headphones; Headphones; Headphones
Headphones; Headphones with feedback of own voice; Headphones; Loudspeakers; Headphones; Loudspeakers; Loudspeakers; Headphones; Headphones + hearing protection
Special headphones, without attenuation; In situ
Headphones

Sound immersion methods


(1) In situ real noise
Influence of ambient noise on acoustic analysis (Garnier et al. 2010)

~ok

Articulation, duration

ok

Sound immersion methods


(1) In situ real noise
-mouth-microphone distance
-separation wall
-absorbent panels
Meltzener et al. 2003

-directivity : cardioid, autodirective microphone


-correlation : binaural, multi-array microphones
Granqvist 2003

-denoising algorithms

Pabon 2006

Some of these methods affect voice spectrum

Sound immersion methods


(2) Headphones
Calibration of noise level with an artificial ear
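As an illustration of what such a calibration involves, the sketch below converts an RMS sound pressure measured at the artificial ear into dB SPL and derives the playback gain needed to reach a target level. It assumes the measurement chain returns RMS pressure in pascals; the function names and the numerical values are hypothetical, and only the standard 20 µPa reference pressure is taken as given.

```python
import math

P_REF = 20e-6  # reference sound pressure in air, in pascals (20 µPa)


def spl_from_pressure(p_rms_pa):
    """Sound pressure level in dB SPL for an RMS pressure in pascals."""
    return 20.0 * math.log10(p_rms_pa / P_REF)


def gain_to_reach(target_db, measured_db):
    """Gain (in dB and as a linear amplitude factor) to go from measured to target level."""
    gain_db = target_db - measured_db
    return gain_db, 10.0 ** (gain_db / 20.0)


# Hypothetical example: the artificial ear reads 79 dB SPL, the protocol asks for 85 dB SPL
measured_db = 79.0
gain_db, gain_lin = gain_to_reach(85.0, measured_db)
print(f"apply {gain_db:+.1f} dB (amplitude x {gain_lin:.3f})")

# Sanity check of the conversion: 1 Pa RMS is close to 94 dB SPL
print(f"{spl_from_pressure(1.0):.2f} dB SPL")
```

In practice the measurement and gain adjustment would be repeated for each noise type, since noises with different spectra can reach different levels for the same playback gain.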

Sound immersion methods


(2) Headphones
Problem: affects the auditory feedback (+ occlusion effect)
~ Effect of earplugs: Kryter (1946), Tufts & Frank (2003)
Significant effect of headphones on the Lombard effect (Garnier et al. 2010)

+ other parameters (f0, spectral centroid, F1). Not always just an offset

Sound immersion methods


(3) Headphones with compensation for own voice attenuation
= Lombard + Sidetone (+ filtering) effects

Transfer function of the (external) attenuation caused by headphones


Lu & Cooke (2008) noise feedback

No significant effect on acoustic parameters

TF
voice Significant effect on some parameters only Does not compensate for the headphone effect

Attenuation in intensity only


Garnier et al. (2010)

Sound immersion methods


(4) Loudspeakers
Denoising techniques based on channel estimation (noise known)

Mixdorff et al. 2006

Recorded Noise

Source signal

(Noise)

Recorded signal

Denoised signal

(Speech + Noise)

(Speech)

Estimated TF
Ternstrm et al. 2002

Estimation of recorded noise
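The block diagram above (known source noise, estimated transfer function, estimated recorded noise, subtraction) can be sketched as follows. This is an illustrative toy only, not the procedure of Mixdorff et al. (2006) or Ternström et al. (2002): it assumes the loudspeaker-to-microphone path is a short, time-invariant FIR filter, estimates it by least squares from a noise-only recording, and all names and signals are hypothetical.

```python
import math
import random


def convolve(signal, taps):
    """Causal FIR filtering: out[n] = sum_k taps[k] * signal[n - k]."""
    return [sum(h * signal[n - k] for k, h in enumerate(taps) if n - k >= 0)
            for n in range(len(signal))]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def estimate_fir(source, recorded, order):
    """Least-squares FIR taps mapping the source noise to the recorded noise."""
    # Column k is the source delayed by k samples (zero-padded at the start)
    cols = [[source[n - k] if n - k >= 0 else 0.0 for n in range(len(source))]
            for k in range(order)]
    gram = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    rhs = [sum(p * r for p, r in zip(ci, recorded)) for ci in cols]
    return solve(gram, rhs)


def denoise(recorded_mix, source, taps):
    """Subtract the estimated recorded noise from the speech + noise recording."""
    estimated_noise = convolve(source, taps)
    return [m - e for m, e in zip(recorded_mix, estimated_noise)]


# Hypothetical toy signals
rng = random.Random(0)
source_noise = [rng.uniform(-1.0, 1.0) for _ in range(200)]
true_taps = [0.8, 0.3, -0.1]  # assumed loudspeaker-to-microphone channel
recorded_noise = convolve(source_noise, true_taps)      # noise-only recording
taps = estimate_fir(source_noise, recorded_noise, order=3)

speech = [0.5 * math.sin(2 * math.pi * 5 * n / 200) for n in range(200)]
recorded_mix = [s + v for s, v in zip(speech, recorded_noise)]
denoised = denoise(recorded_mix, source_noise, taps)
```

The time-invariance assumption is the fragile part of this sketch: if the channel between loudspeaker and microphone changes between the noise-only recording and the speech recording, the estimated taps no longer match and residual noise remains.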

Sound immersion methods


(4) Loudspeakers
Influence of the denoising technique on acoustic analysis (Garnier et al. 2010)

Ok, under the condition that the speaker does not move too much

Sound immersion methods


Compatibility between the sound immersion method and:
-intended measurements (audio, articulation, physiological signals, pointing, eye movements)

-hypothesis on the adaptation mechanisms involved

Studies:
Amazi et al. 1982 [8]; Bond et al. 1989 [31]; Boril et al. 2005 [32]; Castellanos et al. 1996 [46]; Davis et al. 2006 [62]; Dejonckere et al. 1983 [65]; Dieroff et al. 1966 [71]; Dohalska et al. 2000 [73]; Egan 1972 [80]; Hofler 1984 [149]; Junqua 1992 [173]; Junqua 1993 [171]; Kadiri 1998 [176]; Kim et al. 2005 [179]; Kim 2005 [180]; Korn 1954 [185]; Lamprecht 1988 [191]; Lane et al. 1970 [194]; Lee et al. 2004 [204]; Mixdorff et al. 2006 [244]; Mokbel 1992 [245]; Nonaka et al. 1997 [256]; Papon 2006 [265]; Pick et al. 1989 [271]; Pisoni et al. 1985 [272]; Schultz-Coulon et al. 1976 [305]; Siegel et al. 1974 [315]; Sinnott et al. 1975 [318]; Sodersten et al. 2005 [322]; Stanton 1988 [327]; Ternström et al. 2002 [346]; Ternström et al. 2006 [345]; Tonkinson 1994 [353]; Frank et al. 2003 [97]

Tasks:
Picture naming, then telling a story about these pictures; reading; reading
Reading; reading
Reading; spontaneous speech
Spontaneous speech; reading; reading; reading
Reading; reading; reading
Reading; spontaneous dialogue; utterance of template sentences; reading
Reading; reading; phonation induced by electrical stimulation; reading; spontaneous speech; reading; reading
Spontaneous speech; induced by conditioning; reading
Reading; reading
Reading
Singing; reading

Speech material:
25 cards each showing a picture
10 spondees from Hirsh et al. 1952 [148]
Phonetically balanced sentences, digits and numbers, commands, dates, times, etc.
13 phonetically balanced sentences
10 balanced sentences taken from the 1969 sentences [2]
Text of about 30 s
44 sentences of 7 to 12 syllables
Passage from a text
9 vowels in h-d context, then in d-d and s-s contexts. Nonsense words (logatomes) embedded in template sentences.
49 words (digits, letters, commands)
50 balanced sentences, numbers, 82 nonsense words of type C(C)VC
50 words (commands): 4 monosyllabic, 36 disyllabic, 6 trisyllabic and 4 quadrisyllabic
10 balanced sentences taken from the 1969 sentences [2]
Sustained vowel /a/
Carrier sentence with a varying target word (aeronautical command)
13 digits, 26 letters, 10 telephone numbers, 20 sentences
Phonetically balanced sentences
12 words (digits and commands)
Words from the game Pictionary, constructed and nonexistent words
15 aeronautical words (digits and commands)
Text
Call cry; text
56 sentences; 5 different short texts for each condition
6 texts of 90 s each
12 passages

Listener / feedback:
Yes + reward system
No; no; no
Passive listener, 2.5 m; no; yes; not specified; no; no; no
No; no; passive computer, 1.5, 1, 2 and 3 m
Passive listener, 2.5 m; yes; listener feedback by means of a slate; no; no; no; yes, slate system; no; no; no
No; no; feedback of the comprehension of an (unseen) listener via a VU meter; no; feedback of the comprehension of an (unseen) listener via a VU meter; feedback of the comprehension of an (unseen) listener via a VU meter; no; no

Speech material and tasks


Voiced sounds
  Aaaahh
Words
  Audiometry
    Two syllables with similar accentuation: spondees (Hirsh et al. 1952)
    Fournier's list (French): one and two syllables
  Phonetics / intelligibility tests
    Vocabulary from Miller et al. 1955
  Automatic speech recognition
    Commands (aeronautics, human-machine interfaces), letters, numbers (Van Summers et al. 1988, Junqua 1996, Kim 2005)
Phonetically balanced sentences
  Sentences of Portmann et al. (1959)
  Castellanos 1996 (Spanish)

Speech material and tasks


Reading without feedback
Without speech partner
Addressed to an imaginary partner
Addressed to a computer

Reading with feedback from a VU meter
Interactive games with feedback


Sudoku, map task, board

Speech material and tasks

Speaker

Experimenter

Patel and Schell 2008

Speech material and tasks


Compatibility between speech material and task, and:
Parameters examined (long-term descriptors, segments, prosody)
Hypothesis tested
Control of segmental or supra-segmental context
